The aim of this study was to develop an appropriate parametric survival model to predict age at onset (AAO) in patients with spinocerebellar ataxia type 3/Machado-Joseph disease (SCA3/MJD) from mainland China.
We compared the efficiency and performance of 6 parametric survival analysis methods (exponential, Weibull, log-Gaussian, Gaussian, log-logistic, and logistic) based on cytosine-adenine-guanine (CAG) repeat length at ATXN3 to predict the probability of AAO in the largest cohort of patients with SCA3/MJD. A set of evaluation criteria, including the –2 log-likelihood statistic, Akaike information criterion (AIC), Bayesian information criterion (BIC), Nagelkerke R-squared (Nagelkerke R^2), and Cox-Snell residual plot, was used to identify the best model.
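The likelihood-based criteria named above follow standard textbook definitions, and a minimal sketch of how they could be computed from fitted log-likelihoods is given here. The function and argument names are illustrative assumptions and are not taken from the study; the Cox-Snell residual plot is omitted because it requires the fitted model and the raw data.

```python
import math


def aic(neg2_loglik, n_params):
    """Akaike information criterion: -2 log L + 2k (lower is better)."""
    return neg2_loglik + 2 * n_params


def bic(neg2_loglik, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k ln(n) (lower is better)."""
    return neg2_loglik + n_params * math.log(n_obs)


def nagelkerke_r2(loglik_null, loglik_model, n_obs):
    """Nagelkerke R^2: the Cox-Snell R^2 rescaled by its maximum attainable value.

    loglik_null  : log-likelihood of the intercept-only model
    loglik_model : log-likelihood of the model with covariates (here, CAG length)
    """
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n_obs)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n_obs)
    return cox_snell / max_cox_snell
```

As a usage note, `aic(6560.12, 3)` gives 6566.12 up to floating-point rounding, so a count of three estimated parameters matches the –2 log-likelihood and AIC reported for the logistic model. That parameter count is inferred from the arithmetic alone and is not stated in the abstract.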
Among these 6 parametric survival models, the logistic model had the lowest –2 log-likelihood (6,560.12), AIC (6,566.12), and BIC (6,566.14) and the highest Nagelkerke R^2 (0.54), and its Cox-Snell residual plot lay closest to the bisector. Therefore, the logistic survival model provided the best fit to the studied data. Using this optimal logistic survival model, we estimated the age-specific probability distribution of AAO according to CAG repeat size and current age.
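To illustrate how a logistic survival model can yield an age-specific onset probability from CAG repeat size and current age, the sketch below assumes a location-scale form in which the location of AAO is linear in CAG repeat length. The coefficients `b0`, `b1`, and `scale` are hypothetical placeholders chosen only so the example runs; they are not the estimates from this study.

```python
import math


def logistic_survival(age, cag, b0, b1, scale):
    """Probability of remaining unaffected at `age` under a logistic AAO model.

    Assumed form: AAO ~ Logistic(location = b0 + b1 * cag, scale).
    """
    location = b0 + b1 * cag
    return 1.0 / (1.0 + math.exp((age - location) / scale))


def conditional_onset_probability(target_age, current_age, cag, b0, b1, scale):
    """P(onset by target_age | still unaffected at current_age)."""
    if target_age < current_age:
        raise ValueError("target_age must be at least current_age")
    s_now = logistic_survival(current_age, cag, b0, b1, scale)
    s_target = logistic_survival(target_age, cag, b0, b1, scale)
    return 1.0 - s_target / s_now


# HYPOTHETICAL coefficients for illustration only (not from the study).
B0, B1, SCALE = 150.0, -1.5, 4.0

if __name__ == "__main__":
    # Onset probability by ages 35..50 for an unaffected 30-year-old carrier
    # with 75 CAG repeats, under the hypothetical coefficients above.
    for target in range(35, 55, 5):
        p = conditional_onset_probability(target, 30, 75, B0, B1, SCALE)
        print(target, round(p, 3))
```

Conditioning on the current age matters because a carrier who is still unaffected has already survived part of the risk period, so the remaining probability is rescaled by the survival at the current age.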
We demonstrated for the first time that the logistic survival model provided the best fit for AAO prediction in patients with SCA3/MJD from mainland China. This optimal model can be valuable in clinical practice and research. However, rigorous clinical testing and validation in other independent cohorts are needed before its clinical application. A unified model across multiethnic cohorts is worth further exploration by identifying regional differences and significant modifiers of AAO.