Predicting clinical scores for Alzheimer’s disease based on joint and deep learning
Introduction
Alzheimer's disease (AD) is a progressive neurodegenerative disease that mostly affects the elderly population. It is usually accompanied by the loss of cognitive abilities, including the ability to carry out daily activities and to make decisions, and by a decline in social functioning through symptoms such as mobility disability, aphasia, and agnosia. AD has overtaken cancer as the most feared disease in America, where it causes more deaths than breast and prostate cancer combined. In 2018, 50 million people were living with AD. AD also places a heavy financial burden on individuals and societies, with an estimated global cost of 1 trillion US dollars that will double by 2030 (Patterson, 2018). Once a patient is diagnosed with AD, no currently available treatment can alter its progressive course or cure it (McKhann et al., 1984, McKhann et al., 2011). Accurate diagnosis is therefore needed so that patients can receive timely treatment that may delay progression, especially in the early stage of AD (i.e., mild cognitive impairment (MCI)) (Lei, Cheng, et al., 2020). Presently, the clinical diagnosis of AD is based on a combination of neuroimaging (e.g., magnetic resonance imaging (MRI)) (De Leon et al., 2004, Jack et al., 1997, Nordberg, 2004) and various clinical scores (e.g., Alzheimer's disease assessment cognitive subscale (ADAS-Cog)) (Arevalo-Rodriguez et al., 2015, Delor et al., 2013). The doctor then estimates whether the patient's disease is in an early or late stage and recommends the corresponding treatment. Monitoring disease progression and relieving patients' physical suffering are vital goals of intelligent early diagnosis of AD. However, obtaining clinical scores is time-consuming for neurologists and difficult for patients with poor compliance.
The progression from MCI to AD is gradual and requires multiple MRI scans and clinical evaluations to track. Data from multiple time points are therefore very useful for the later treatment of patients. Recently, with the rise of artificial intelligence, many researchers have used machine learning methods to predict the clinical scores of AD from neuroimaging data (Duc et al., 2020, Huang et al., 2016, Thung et al., 2014, Wang et al., 2010). However, the score prediction models proposed in many studies still have limitations. In traditional studies, the common clinical scores are the mini-mental state examination (MMSE), the clinical dementia rating-global (CDR-GLOB), the clinical dementia rating sum of boxes (CDR-SOB), and ADAS-Cog. For instance, Stonnington et al. (2010) collected two independent datasets and used relevance vector regression, training on one dataset and testing on the other, to explore the correlation between MMSE, ADAS-Cog, and related changes in brain gray matter. Huang et al. (2016) proposed a longitudinal score prediction model based on a sparse regression-based random forest, which was trained on MRI images and predicted MMSE, CDR-GLOB, CDR-SOB, and ADAS-Cog scores at different time points. These studies use baseline image features only, either to predict scores at baseline or to predict future scores, and thus ignore the information in longitudinal image features. Zhang, Shen, and Initiative (2012b) proposed a group sparse method to select features extracted from MRI and fluorodeoxyglucose positron emission tomography (FDG-PET) data. They used the longitudinal multimodality data to predict MMSE scores at 24 months with a coefficient linear regression model. However, there is still much room for improvement.
Limited samples combined with relatively high data dimensionality cause overfitting, which reduces the robustness of the model (Suk, Lee, & Shen, 2014). Feature selection is a common way to address this problem; typical methods include Chi-squared methods (Thaseen & Kumar, 2017), recursive feature elimination (Yin, Wang, Liu, Zhang, & Zhang, 2017), and the least absolute shrinkage and selection operator (LASSO) (Tibshirani, 1996). LASSO is a simple and effective feature selection method that is widely used for its promising performance (Wang, Li, & Tsai, 2007). As an enhancement of LASSO, group LASSO has shown even more promising performance for feature selection (Shi et al., 2014, Zhou et al., 2011). Therefore, we choose group LASSO to select the most discriminative features. In this paper, we reformulate group LASSO in a correntropy (Liu, Pokharel, & Príncipe, 2007) form to reduce the influence of noise in the signals.
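The two ingredients named above can be illustrated with a minimal NumPy sketch. It is not the optimization algorithm of this paper, whose objective is not given in this section. It assumes that each row of a weight matrix `W` corresponds to one feature (so a row is one group), that the group penalty is the sum of row norms, and that correntropy uses a Gaussian kernel with a hypothetical width `sigma`. The function names are illustrative only.

```python
import numpy as np

def l21_norm(W):
    """Group LASSO penalty: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def group_soft_threshold(W, t):
    """Proximal operator of t * l21_norm: shrink every row toward zero.

    A row whose norm is at most t becomes exactly zero, which discards
    that feature for all targets at once.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return W * scale

def correntropy(a, b, sigma=1.0):
    """Sample correntropy with a Gaussian kernel (normalizing constant omitted).

    Large residuals contribute almost nothing, so outliers have a bounded
    influence compared with a squared-error loss.
    """
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.mean(np.exp(-diff ** 2 / (2.0 * sigma ** 2))))

# Toy weight matrix (assumed values): 2 features x 2 targets.
W = np.array([[3.0, 4.0],    # row norm 5.0: kept, but shrunk
              [0.3, 0.4]])   # row norm 0.5: set to zero
W_shrunk = group_soft_threshold(W, t=1.0)
selected = np.flatnonzero(np.linalg.norm(W_shrunk, axis=1) > 0)
```

In this toy case only the first feature survives the thresholding step. A full solver would alternate such a shrinkage step with an update of the data-fitting term.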
With the development of deep learning, layered and hierarchical architectures can extract more advanced and efficient data representations (Falahati et al., 2014, Schmidhuber, 2015), and have thus gained a good reputation in the field of brain diseases, including AD diagnosis (Gupta, Ayhan, & Maida, 2013). In deep learning, the recurrent neural network (RNN) (Jun & Man, 1998) is able to capture the information in sequences, including time series. AD is a progressive process, and longitudinal data are interrelated across time points. Good results have been reported in the literature using RNN and its variants (e.g., long short-term memory, LSTM) for AD longitudinal analysis (Aghili et al., 2018, Cui et al., 2018). The independently recurrent neural network (IndRNN) (Li, Li, Cook, Zhu, & Gao, 2018) is one such variant; it is designed to be stacked into a multi-layer structure, which increases nonlinearity for better performance. In this paper, we propose a joint deep learning prediction method based on IndRNN to predict AD clinical scores. Specifically, a feature selection method that combines group LASSO and correntropy reduces the dimensionality of the MRI features and retains the most informative ones. We then explore the connection between different brain regions and the relation between longitudinal time points with multiple layers of IndRNN. By establishing a joint deep learning network model, which includes feature selection and multilayer IndRNN, we predict the clinical scores at future time points and improve the possibility of early diagnosis and treatment. We call the new prediction model CLSIndRNN, where CL denotes feature selection with LASSO and correntropy, and SIndRNN denotes stacked IndRNN regression. The effectiveness of the CLSIndRNN model is evaluated on 805 samples from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (http://adni.loni.usc.edu).
Experiments demonstrate that the proposed CLSIndRNN performs better than other models, for example, models that use only feature selection or only SIndRNN.
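For readers unfamiliar with IndRNN, the following NumPy sketch shows the recurrence described by Li et al. (2018), in which each hidden unit is connected only to its own previous state, and how layers can be stacked. It is an illustration under stated assumptions, not the network of this paper: the dimensions, the random weights, the ReLU activation, and the linear read-out `w_out` are all hypothetical choices.

```python
import numpy as np

def indrnn_layer(x_seq, W, u, b):
    """Run one IndRNN layer over a sequence.

    x_seq: (T, d_in) inputs; W: (d_hidden, d_in) input weights;
    u: (d_hidden,) recurrent weights; b: (d_hidden,) bias.
    The recurrent term is element-wise (u * h), so every hidden unit
    only sees its own previous state.
    """
    h = np.zeros(u.shape[0])
    outputs = []
    for x_t in x_seq:
        h = np.maximum(0.0, W @ x_t + u * h + b)  # ReLU activation
        outputs.append(h)
    return np.stack(outputs)

def stacked_indrnn(x_seq, layers):
    """Feed the output sequence of each layer into the next one."""
    out = x_seq
    for W, u, b in layers:
        out = indrnn_layer(out, W, u, b)
    return out

rng = np.random.default_rng(0)
T, d_in, d_hidden = 6, 10, 8  # assumed: 6 visits, 10 selected features
x_seq = rng.normal(size=(T, d_in))
layers = [
    (rng.normal(scale=0.1, size=(d_hidden, d_in)),
     rng.uniform(0.0, 1.0, d_hidden), np.zeros(d_hidden)),
    (rng.normal(scale=0.1, size=(d_hidden, d_hidden)),
     rng.uniform(0.0, 1.0, d_hidden), np.zeros(d_hidden)),
]
hidden = stacked_indrnn(x_seq, layers)

# Hypothetical linear read-out of a score from the last hidden state.
w_out = rng.normal(scale=0.1, size=d_hidden)
predicted_score = float(hidden[-1] @ w_out)
```

Because the recurrent weights act element-wise, the cross-unit mixing happens only through the input weights of the next layer, which is one reason the architecture is usually stacked.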
Our contributions are summarized as below:
- (1) We build an effective feature selection model based on group LASSO and correntropy, and optimize it with an efficient algorithm.
- (2) We propose a joint deep learning method by combining the feature selection model and IndRNN, which can predict AD clinical scores from multiple time points MRI data.
- (3) We evaluate the proposed method on the public ADNI dataset, and compare three different training strategies to show the effectiveness of the proposed method.
Section snippets
Methodology
As shown in Fig. 1, our model mainly comprises data preparation, feature extraction and selection, and result prediction. The following section describes each step of the proposed method.
Experimental setting
All data are obtained from the ADNI public database and comprise MRI and four clinical scores of 805 subjects at six time points: baseline, M06, M12, M18, M24, and M36. The MRI and clinical score information of the 805 subjects is most complete at baseline. Subjects miss data collection for various reasons, which reduces the amount of data collected at the other time points. Detailed statistics are given in Table 1.
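One simple way to keep track of such incomplete longitudinal data is an availability mask over subjects and time points. The sketch below uses the six time-point labels from the text; the three-subject mask is made-up toy data and does not reflect the counts in Table 1, and the helper name `months_from_label` is hypothetical.

```python
import numpy as np

TIME_POINTS = ["baseline", "M06", "M12", "M18", "M24", "M36"]

def months_from_label(label):
    """Convert a visit label such as 'M06' to months since baseline."""
    return 0 if label == "baseline" else int(label[1:])

months = [months_from_label(tp) for tp in TIME_POINTS]

# Toy availability mask (assumed values): rows are subjects, columns are
# time points, True means MRI and scores were collected at that visit.
available = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
], dtype=bool)

# Number of subjects with data at each time point.
counts = dict(zip(TIME_POINTS, available.sum(axis=0).tolist()))
```

The same mask can later be used to exclude missing visits from a loss or an evaluation metric.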
In the experiment, the training set is the first
Longitudinal data predictions
The CLSIndRNN is composed of two parts: joint feature selection based on group LASSO and correntropy, and prediction based on SIndRNN. We therefore test and compare the comprehensive model with partial models to demonstrate the necessity and importance of feature selection. Specifically, we compare the comprehensive CLSIndRNN model with the multi-layer SIndRNN and with IndRNN to verify its effectiveness. The experimental results are shown in Table 2.
Parameters of joint group LASSO
We use joint group LASSO to select the ROI features before feeding them into the SIndRNN. The number of iterations N and the weight of the norm regularization term are the parameters that most influence the selected features. To study their effect, we vary these two parameters and predict the clinical score at M36, while fixing the training parameters of SIndRNN as follows: learning rate = 10^-4, hidden layer size = 128, epochs = 50, and batch size = 128.
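The setup of this parameter study can be written down as a small configuration sketch. The fixed SIndRNN training values are the ones stated above; the candidate values for the number of iterations and for the regularization weight are hypothetical placeholders, since the ranges explored are not given in this section, and the class and key names are illustrative.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class SIndRNNConfig:
    """Training parameters that stay fixed during the study."""
    learning_rate: float = 1e-4
    hidden_size: int = 128
    epochs: int = 50
    batch_size: int = 128

# Hypothetical candidate values, chosen only for illustration.
iteration_candidates = [10, 50, 100]
reg_weight_candidates = [0.01, 0.1, 1.0]

config = SIndRNNConfig()

# One entry per (iterations, regularization weight) pair to evaluate.
grid = [
    {"n_iterations": n, "reg_weight": w, "train": config}
    for n, w in product(iteration_candidates, reg_weight_candidates)
]
```

Each grid entry would correspond to one run of feature selection followed by SIndRNN training with the fixed configuration.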
The effect of N and on
Conclusion
In this paper, we propose a new joint deep learning regression model for clinical score prediction, which uses IndRNN-based regression to predict AD scores. The feature selection method, which combines group LASSO and correntropy, reduces the feature dimension and identifies the features of brain regions related to the disease. With multilayer IndRNN, we study the internal connection between different brain regions, the time correlation between longitudinal data, the relationship between the MRI,
CRediT authorship contribution statement
Baiying Lei: Methodology, Software, Writing – original draft. Enmin Liang: Methodology, Software, Writing – original draft. Mengya Yang: Methodology, Data curation, Writing – original draft. Peng Yang: Formal analysis, Writing - review & editing. Feng Zhou: Formal analysis, Writing - review & editing. Ee-Leng Tan: Formal analysis, Writing - review & editing. Yi Lei: Formal analysis, Writing - review & editing. Chuan-Ming Liu: Supervision, Validation. Tianfu Wang: Supervision, Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported partly by National Natural Science Foundation of China (Nos. 61871274, U1909209 and 61801305), International Science and Technology Cooperation Projects of Guangdong (No. 2019A050510030), National Natural Science Foundation of Guangdong Province (Nos. 2019A1515111205 and 2019B1515120029), Key Laboratory of Medical Image Processing of Guangdong Province (No. K217300003), Guangdong Pearl River Talents Plan (2016ZT06S220), Shenzhen Peacock Plan (Nos. KQTD2016053112051497
References (50)
- RNN-based longitudinal analysis for diagnosis of Alzheimer's disease. Computerized Medical Imaging and Graphics (2019)
- Longitudinal clinical score prediction in Alzheimer's disease with soft-split sparse regression based random forest. Neurobiology of Aging (2016)
- Self-calibrated brain network estimation and joint non-convex multi-task learning for identification of early Alzheimer's disease. Medical Image Analysis (2020)
- Deep and joint learning of longitudinal data for Alzheimer's disease prediction. Pattern Recognition (2020)
- The diagnosis of dementia due to Alzheimer's disease: Recommendations from the National Institute on Aging-Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease. Alzheimer's & Dementia (2011)
- PET imaging of amyloid in Alzheimer's disease. The Lancet Neurology (2004)
- Deep learning in neural networks: An overview. Neural Networks (2015)
- Predicting clinical scores from magnetic resonance scans in Alzheimer's disease. NeuroImage (2010)
- Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and completion. NeuroImage (2014)
- High-dimensional pattern regression using machine learning: From medical images to continuous clinical variables. NeuroImage (2010)
- Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer's disease. NeuroImage
- Predictive modeling of longitudinal data for Alzheimer's disease diagnosis using RNNs
- Mini-Mental State Examination (MMSE) for the detection of Alzheimer's disease and other dementias in people with mild cognitive impairment (MCI). Cochrane Database of Systematic Reviews
- Accelerated gradient method for multi-task sparse learning problem
- Longitudinal analysis for Alzheimer's disease diagnosis using RNN
- MRI and CSF studies in the early diagnosis of Alzheimer's disease. Journal of Internal Medicine
- Modeling Alzheimer's disease progression using disease onset time and disease trajectory concepts applied to CDR-SOB scores from ADNI. CPT: Pharmacometrics & Systems Pharmacology
- Medial temporal lobe atrophy on MRI scans and the diagnosis of Alzheimer disease. Neurology
- 3D-Deep Learning Based Automatic Diagnosis of Alzheimer's Disease with Joint MMSE Prediction Using Resting-State fMRI. Neuroinformatics
- Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging. Journal of Alzheimer's Disease
- Natural image bases to represent neuroimaging data
- Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics
- Learning temporal representation of transaction amount for fraudulent transaction recognition using CNN, Stacked LSTM, and CNN-LSTM
- Long short-term memory. Neural Computation
- Predicting Alzheimer's Disease using LSTM. IEEE Access