Predicting clinical scores for Alzheimer’s disease based on joint and deep learning

https://doi.org/10.1016/j.eswa.2021.115966

Highlights

  • We design a joint and deep learning framework to predict the clinical scores of AD.

  • We use the group LASSO and correntropy for dimension reduction via feature selection.

  • We explore the multi-layer independently recurrent neural network regression.

  • We predict clinical scores by learning the relationship between MRI and clinical scores.

Abstract

Alzheimer's disease (AD) is a progressive neurodegenerative disease that typically develops in middle-aged and elderly people and is accompanied by a gradual loss of cognitive ability. Presently, there is no cure for AD. Furthermore, the current clinical diagnosis of AD is time-consuming. In this paper, we design a joint and deep learning framework to predict clinical scores of AD. Specifically, a feature selection method combining group LASSO and correntropy is used to reduce dimensionality and to screen the brain-region features related to AD. We explore multi-layer independently recurrent neural network regression to study the internal connections between different brain regions and the temporal correlation in longitudinal data. The proposed joint deep learning network learns the relationship between magnetic resonance imaging and clinical scores, and predicts the clinical scores. The predicted clinical scores can help doctors perform early diagnosis and timely treatment of patients.

Introduction

Alzheimer's disease (AD) is a progressive neurodegenerative disease that often occurs in the elderly population. The disease is usually accompanied by the loss of cognitive abilities, including daily-activity and decision-making abilities, as well as the decline of social-life abilities, with symptoms such as mobility disability, aphasia, and agnosia. AD has overtaken cancer as the most feared disease in America, and it causes more deaths there than breast and prostate cancer combined. In 2018, 50 million people were living with AD. According to Patterson (2018), AD places a heavy financial burden on individuals and societies, with an estimated global cost of 1 trillion US dollars that will double by 2030. Once a patient is diagnosed with AD, no treatment is currently available to alter its progressive course or to cure it (McKhann et al., 1984, McKhann et al., 2011). Therefore, accurate diagnosis is needed, and timely treatment is important to possibly delay progression, especially in the early stage of AD (i.e., mild cognitive impairment (MCI)) (Lei, Cheng, et al., 2020). Presently, the clinical diagnosis of AD is based on a combination of neuroimaging (e.g., magnetic resonance imaging (MRI)) (De Leon et al., 2004, Jack et al., 1997, Nordberg, 2004) and various clinical scores (e.g., the Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog)) (Arevalo-Rodriguez et al., 2015, Delor et al., 2013). The doctor then estimates whether the patient's disease is in an early or late stage and recommends the corresponding treatment. Monitoring disease progression and relieving patients' physical suffering are vital for intelligent early diagnosis of AD. However, obtaining clinical scores is time-consuming for neurologists and difficult for patients with poor compliance.

Progression from MCI to AD is gradual, and tracking it requires multiple MRI scans and clinical evaluations. Data from multiple time points are therefore very useful for the later treatment of patients. Recently, with the rise of artificial intelligence, many researchers have used machine learning methods to predict clinical scores of AD from neuroimaging data (Duc et al., 2020, Huang et al., 2016, Thung et al., 2014, Wang et al., 2010). However, the score prediction models proposed in many studies still have limitations. In traditional studies, the common clinical scores are the mini-mental state examination (MMSE), the clinical dementia rating-global (CDR-GLOB), the clinical dementia rating-sum of boxes (CDR-SOB), and ADAS-Cog. For instance, Stonnington et al. (2010) collected two independent datasets and used relevance vector regression, training on one dataset and testing on the other, to explore the correlation between MMSE, ADAS-Cog, and related changes in brain gray matter. Huang et al. (2016) proposed a longitudinal score prediction model based on a sparse regression-based random forest, which was trained on MRI images and predicted MMSE, CDR-GLOB, CDR-SOB, and ADAS-Cog scores at different time points. These studies use baseline image features only, either to predict scores at baseline or to predict future scores, and thus ignore the information in longitudinal image features. Zhang, Shen, and Initiative (2012b) proposed a group sparse method to select features extracted from MRI and fluorodeoxyglucose positron emission tomography (FDG-PET) data. They used longitudinal multimodality data to predict MMSE scores at 24 months with a coefficient linear regression model. However, there is still much room for improvement.

Limited samples combined with relatively high data dimensionality cause overfitting, which lowers the robustness of the model (Suk, Lee, & Shen, 2014). Feature selection is a common way to address this problem; typical methods include Chi-squared methods (Thaseen & Kumar, 2017), recursive feature elimination (Yin, Wang, Liu, Zhang, & Zhang, 2017), and the least absolute shrinkage and selection operator (LASSO) (Tibshirani, 1996). LASSO is a simple and effective feature selection method that is used in many studies for its promising performance (Wang, Li, & Tsai, 2007). As an enhanced LASSO, group LASSO has shown even more promising performance for feature selection (Shi et al., 2014, Zhou et al., 2011). Therefore, we choose the group LASSO method to select the most discriminative features. In this paper, we reformulate group LASSO in a correntropy (Liu, Pokharel, & Príncipe, 2007) form to reduce the influence of noise in the signals.
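To make the idea of a correntropy-based group LASSO more concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' algorithm. It assumes an ℓ2,1-regularized multi-target linear regression in which each sample is re-weighted by a Gaussian kernel of its residual (a common half-quadratic treatment of correntropy), and it solves the weighted problem with proximal gradient steps. The function names and the parameters `sigma` and `inner_steps` are hypothetical; `rho` and `n_iter` are only meant to mirror the regularization weight and iteration count discussed later in the paper.

```python
import numpy as np


def row_soft_threshold(W, tau):
    """Proximal operator of tau * ||W||_{2,1}: shrink each row's L2 norm by tau."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale


def correntropy_group_lasso(X, Y, rho=0.1, n_iter=50, sigma=1.0, inner_steps=20):
    """Sketch: l2,1-regularized regression with correntropy-style sample weights.

    X: (n_samples, n_features), Y: (n_samples, n_targets).
    sigma (kernel width) and inner_steps are assumptions of this illustration.
    """
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        # Half-quadratic step: Gaussian-kernel weight per sample, so samples
        # with large residuals (likely noisy) contribute less to the fit.
        resid = X @ W - Y
        p = np.exp(-np.sum(resid ** 2, axis=1) / (2.0 * sigma ** 2))
        # Weighted least squares plus l2,1 penalty, via proximal gradient.
        Xp = X * p[:, None]
        step = 1.0 / max(np.linalg.norm(X.T @ Xp, 2), 1e-12)
        for _ in range(inner_steps):
            grad = Xp.T @ (X @ W - Y)
            W = row_soft_threshold(W - step * grad, step * rho)
    return W


def selected_features(W, tol=1e-8):
    """Indices of features whose entire coefficient row is non-zero."""
    return np.flatnonzero(np.linalg.norm(W, axis=1) > tol)
```

Because the ℓ2,1 penalty zeroes whole rows of `W`, a feature is either kept for all targets or dropped for all of them, which is what makes the method a group-wise selector. In practice `sigma` would need to be chosen relative to the scale of the residuals; if it is too small, all sample weights collapse toward zero.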

With the development of deep learning, layered and hierarchical architectures extract more advanced and efficient data representations (Falahati et al., 2014, Schmidhuber, 2015), and have thus gained a good reputation in the field of brain diseases, including AD diagnosis (Gupta, Ayhan, & Maida, 2013). In deep learning, the recurrent neural network (RNN) (Jun & Man, 1998) is able to capture the information in sequences, including time series. AD is a progressive process, and longitudinal data are interrelated across time points. Good results have been reported in the literature using RNNs and their variants (e.g., long short-term memory, LSTM) for longitudinal AD analysis (Aghili et al., 2018, Cui et al., 2018). The independently recurrent neural network (IndRNN) (Li, Li, Cook, Zhu, & Gao, 2018) is one such variant; it is designed to be stacked into multi-layer structures that increase nonlinearity for better performance. In this paper, we propose a joint deep learning prediction method based on IndRNN to predict AD clinical scores. Specifically, a feature selection method combining group LASSO and correntropy reduces the dimensionality of the MRI features and retains the most informative ones. We then explore the connections between different brain regions and the relations among longitudinal time points with multiple layers of IndRNN. By building a joint deep learning model that includes feature selection and multi-layer IndRNN, we predict clinical scores at future time points and improve the prospects for early diagnosis and treatment. We call the new prediction model CLSIndRNN, where CL denotes feature selection with group LASSO and correntropy, and SIndRNN denotes stacked IndRNN regression. The effectiveness of the CLSIndRNN model is evaluated on 805 samples from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (http://adni.loni.usc.edu). Experiments demonstrate that the proposed CLSIndRNN performs better than other models, for example, models that use feature selection only or SIndRNN only.
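As an illustration of how a stacked IndRNN processes a sequence of visits, the sketch below implements only the forward pass in NumPy. It assumes the usual IndRNN recurrence from Li et al. (2018), h_t = ReLU(W x_t + u ⊙ h_{t−1} + b), where the recurrent weight u is a vector rather than a matrix, and it adds a linear readout of the last hidden state as a hypothetical regression head. The function names, layer sizes, and random weights are illustrative assumptions; this is not the trained CLSIndRNN model.

```python
import numpy as np


def indrnn_layer(x_seq, W, u, b):
    """One IndRNN layer: h_t = relu(W x_t + u * h_{t-1} + b).

    x_seq: (T, d_in); W: (d_hidden, d_in); u, b: (d_hidden,).
    u is applied element-wise, so each hidden unit only sees its own
    previous state (the "independent" recurrence).
    """
    h = np.zeros(W.shape[0])
    outputs = []
    for x_t in x_seq:
        h = np.maximum(0.0, W @ x_t + u * h + b)
        outputs.append(h)
    return np.stack(outputs)


def stacked_indrnn_regress(x_seq, layers, w_out, b_out):
    """Stack IndRNN layers, then map the last hidden state to one score."""
    h_seq = x_seq
    for W, u, b in layers:
        h_seq = indrnn_layer(h_seq, W, u, b)
    return float(w_out @ h_seq[-1] + b_out)


# Toy usage with random, untrained weights; all sizes are illustrative.
rng = np.random.default_rng(0)
T, d_in, d_hidden = 5, 8, 16  # e.g. 5 visits, 8 selected ROI features
x_seq = rng.normal(size=(T, d_in))
layers = [
    (rng.normal(scale=0.1, size=(d_hidden, d_in)),
     rng.uniform(0.0, 1.0, d_hidden),
     np.zeros(d_hidden)),
    (rng.normal(scale=0.1, size=(d_hidden, d_hidden)),
     rng.uniform(0.0, 1.0, d_hidden),
     np.zeros(d_hidden)),
]
w_out, b_out = rng.normal(scale=0.1, size=d_hidden), 0.0
score = stacked_indrnn_regress(x_seq, layers, w_out, b_out)
```

In this layout the input weights W mix features within a time point, which loosely corresponds to relating brain regions to each other, while the element-wise recurrence carries information across visits.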

Our contributions are summarized as below:

  • (1) We build an effective feature selection model based on group LASSO and correntropy, and optimize it with an efficient algorithm.

  • (2) We propose a joint deep learning method by combining the feature selection model and IndRNN, which can predict AD clinical scores from multiple time points MRI data.

  • (3) We evaluate the proposed method on the public ADNI dataset, and compare three different training strategies to show the effectiveness of the proposed method.

Section snippets

Methodology

As shown in Fig. 1, our model mainly consists of data preparation, feature extraction and selection, and result prediction. The following section describes each step of the proposed method.

Experimental setting

All data are obtained from the ADNI public database and comprise MRI and four clinical scores of 805 subjects at six time points: baseline, M06, M12, M18, M24, and M36. The MRI and clinical score information of the 805 subjects is most complete at baseline. At later time points, subjects miss data collection for various reasons, which reduces the amount of collected data. Detailed statistics are given in Table 1.

In the experiment, the training set is the first

Longitudinal data predictions

The CLSIndRNN is composed of two parts: joint feature selection based on group LASSO and correntropy, and prediction based on SIndRNN. We therefore test and compare the complete model with partial models to demonstrate the necessity and importance of feature selection. Specifically, we compare the complete CLSIndRNN model with the multi-layer SIndRNN and with IndRNN to verify the effectiveness of the complete model. The experimental results are shown in Table 2.

Parameters of joint group LASSO

We use joint group LASSO to select the ROI features before feeding them into the SIndRNN. The number of iterations N and the weight ρ of the ℓ2,1-norm regularization term are the parameters that most influence the selected features. To study the effect of N and ρ, we vary these parameters and predict the clinical score at M36, while fixing the training parameters of SIndRNN as follows: learning rate = 10⁻⁴, hidden layer size = 128, epochs = 50, and batch size = 128.
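One way to organize such a parameter study is sketched below. The fixed SIndRNN training settings are the ones stated above; the candidate values for N and ρ, and the `SweepConfig` name, are hypothetical placeholders, since the values actually explored are not listed in this excerpt.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SweepConfig:
    n_iterations: int            # N: iterations of the feature-selection solver
    rho: float                   # weight of the l2,1-norm regularization term
    learning_rate: float = 1e-4  # fixed SIndRNN training settings from the text
    hidden_size: int = 128
    epochs: int = 50
    batch_size: int = 128


# Hypothetical candidate values, chosen only for illustration.
N_VALUES = [10, 50, 100]
RHO_VALUES = [0.01, 0.1, 1.0]

# One configuration per (N, rho) pair; only the feature-selection
# parameters vary, so differences in M36 prediction error can be
# attributed to them.
sweep = [SweepConfig(n, rho) for n, rho in product(N_VALUES, RHO_VALUES)]
```

Keeping the network settings as defaults on a frozen dataclass makes it harder to change them by accident between runs.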

The effect of N and ρ on

Conclusion

In this paper, we propose a new joint and deep learning model for clinical score prediction, which uses IndRNN regression to predict AD scores. A feature selection method combining group LASSO and correntropy is used to reduce dimensionality and to obtain the features of brain regions related to the disease. With multi-layer IndRNN, we study the internal connections between different brain regions, the temporal correlation in longitudinal data, the relationship between the MRI,

CRediT authorship contribution statement

Baiying Lei: Methodology, Software, Writing – original draft. Enmin Liang: Methodology, Software, Writing – original draft. Mengya Yang: Methodology, Data curation, Writing – original draft. Peng Yang: Formal analysis, Writing – review & editing. Feng Zhou: Formal analysis, Writing – review & editing. Ee-Leng Tan: Formal analysis, Writing – review & editing. Yi Lei: Formal analysis, Writing – review & editing. Chuan-Ming Liu: Supervision, Validation. Tianfu Wang: Supervision, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported partly by National Natural Science Foundation of China (Nos. 61871274, U1909209 and 61801305), International Science and Technology Cooperation Projects of Guangdong (No. 2019A050510030), National Natural Science Foundation of Guangdong Province (Nos. 2019A1515111205 and 2019B1515120029), Key Laboratory of Medical Image Processing of Guangdong Province (No. K217300003). Guangdong Pearl River Talents Plan (2016ZT06S220), Shenzhen Peacock Plan (Nos. KQTD2016053112051497

References (50)

  • D. Zhang et al. Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer's disease. Neuroimage (2012)

  • M. Aghili et al. Predictive modeling of longitudinal data for Alzheimer’s disease diagnosis using RNNs

  • I. Arevalo-Rodriguez et al. Mini-Mental State Examination (MMSE) for the detection of Alzheimer's disease and other dementias in people with mild cognitive impairment (MCI). Cochrane Database of Systematic Reviews (2015)

  • X. Chen et al. Accelerated gradient method for multi-task sparse learning problem

  • R. Cui et al. Longitudinal analysis for Alzheimer's disease diagnosis using RNN

  • M.J. De Leon et al. MRI and CSF studies in the early diagnosis of Alzheimer's disease. Journal of Internal Medicine (2004)

  • I. Delor et al. Modeling Alzheimer's disease progression using disease onset time and disease trajectory concepts applied to CDR-SOB scores from ADNI. CPT: Pharmacometrics & Systems Pharmacology (2013)

  • R. Duara et al. Medial temporal lobe atrophy on MRI scans and the diagnosis of Alzheimer disease. Neurology (2008)

  • N.T. Duc et al. 3D-Deep Learning Based Automatic Diagnosis of Alzheimer’s Disease with Joint MMSE Prediction Using Resting-State fMRI. Neuroinformatics (2020)

  • F. Falahati et al. Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging. Journal of Alzheimer's Disease (2014)

  • A. Gupta et al. Natural image bases to represent neuroimaging data

  • J. Hanson et al. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics (2017)

  • Y. Heryadi et al. Learning temporal representation of transaction amount for fraudulent transaction recognition using CNN, Stacked LSTM, and CNN-LSTM

  • S. Hochreiter et al. Long short-term memory. Neural Computation (1997)

  • X. Hong et al. Predicting Alzheimer’s Disease using LSTM. IEEE Access (2019)