Background: At present, no predictive markers for Major Depressive Disorder (MDD) exist. The search for such markers has been hampered by the clinical and molecular heterogeneity of MDD, by insufficient statistical power, and by statistical tools poorly suited to high-dimensional data. Machine learning can mitigate some of these limitations. Methods: We aimed to identify predictive markers of recurrent MDD in the elderly by applying machine learning to transcriptome data from peripheral whole blood collected in the Sydney Memory and Aging Study (SMAS) (N = 521, aged over 65). Fuzzy Forests is a Random Forests-based classification algorithm that exploits the co-expression network structure between genes; it alleviates the p >> n problem (far more features than samples) by reducing the dimensionality of the transcriptomic feature space. Results: Applying Fuzzy Forests to the transcriptome data, we found that downregulated TFRC (transferrin receptor) can predict recurrent MDD with an accuracy of 63%. Limitations: Although we corrected our data for several important confounders, we were unable to account for comorbidities and medication use, which may be numerous in the elderly and might have affected gene transcription levels. Conclusions: We found that downregulated TFRC is predictive of recurrent MDD, which is consistent with previous literature indicating a role for the innate immune system in depression. This study is the first to successfully apply the Fuzzy Forests methodology to a psychiatric condition, opening a methodological avenue that can lead to clinically useful predictive markers of complex traits.
- Machine learning
- Random forests
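
The module-wise feature reduction described in the Methods can be illustrated with a minimal sketch. This is not the study's pipeline and not the Fuzzy Forests implementation itself: it assumes hierarchical clustering on feature correlations as a stand-in for the co-expression network step, uses scikit-learn's RandomForestClassifier for screening, and runs on synthetic data. All function names, parameters, and the toy outcome are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def make_toy_data(n_samples=120, n_modules=5, genes_per_module=20, seed=0):
    """Synthetic 'expression' matrix with correlated gene modules (not real data)."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, n_modules))   # one hidden factor per module
    X = np.repeat(latent, genes_per_module, axis=1)    # columns 0-19 form module 0, etc.
    X = X + 0.5 * rng.normal(size=X.shape)             # gene-level noise
    # Hypothetical binary outcome driven by the first module only.
    y = (latent[:, 0] + 0.3 * rng.normal(size=n_samples) > 0).astype(int)
    return X, y


def find_modules(X, n_modules):
    """Group features by absolute correlation (assumed stand-in for a co-expression network)."""
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_modules, criterion="maxclust")


def top_features(X, y, candidates, keep, seed):
    """Fit a random forest on the candidate columns and keep the most important ones."""
    candidates = np.asarray(candidates)
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X[:, candidates], y)
    order = np.argsort(rf.feature_importances_)[::-1][:keep]
    return candidates[order]


def module_wise_selection(X, y, n_modules=5, keep_per_module=3, final_keep=5, seed=0):
    """Screen features within each module, then rank the survivors with a final forest."""
    labels = find_modules(X, n_modules)
    survivors = []
    for module in np.unique(labels):
        idx = np.flatnonzero(labels == module)
        survivors.extend(top_features(X, y, idx, min(keep_per_module, idx.size), seed))
    survivors = np.array(survivors)
    return top_features(X, y, survivors, min(final_keep, survivors.size), seed)


X, y = make_toy_data()
selected = module_wise_selection(X, y)
print("Selected feature indices:", selected)
```

Because each forest only ever sees one module or the small set of survivors, no single model has to cope with the full feature space, which is the sense in which this kind of screening eases the p >> n problem. A real analysis would also need held-out evaluation to estimate predictive accuracy.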