TY - JOUR
T1 - Not all roads lead to the immune system
T2 - the genetic basis of multiple sclerosis severity
AU - Jokubaitis, Vilija G.
AU - Campagna, Maria Pia
AU - Ibrahim, Omar
AU - Stankovich, Jim
AU - Kleinova, Pavlina
AU - Matesanz, Fuencisla
AU - Hui, Daniel
AU - Eichau, Sara
AU - Slee, Mark
AU - Lechner-Scott, Jeannette
AU - Lea, Rodney
AU - Kilpatrick, Trevor J.
AU - Kalincik, Tomas
AU - De Jager, Philip L.
AU - Beecham, Ashley
AU - McCauley, Jacob L.
AU - Taylor, Bruce V.
AU - Vucic, Steve
AU - Laverick, Louise
AU - Vodehnalova, Karolina
AU - García-Sanchéz, Maria Isabel
AU - Alcina, Antonio
AU - van der Walt, Anneke
AU - Havrdova, Eva Kubala
AU - Izquierdo, Guillermo
AU - Patsopoulos, Nikolaos
AU - Horakova, Dana
AU - Butzkueven, Helmut
PY - 2023/6
Y1 - 2023/6
N2 - Multiple sclerosis is a leading cause of neurological disability in adults. Heterogeneity in multiple sclerosis clinical presentation has posed a major challenge for identifying genetic variants associated with disease outcomes. To overcome this challenge, we used prospectively ascertained clinical outcomes data from the largest international multiple sclerosis registry, MSBase. We assembled a cohort of deeply phenotyped individuals of European ancestry with relapse-onset multiple sclerosis. We used unbiased genome-wide association study and machine learning approaches to assess the genetic contribution to longitudinally defined multiple sclerosis severity phenotypes in 1813 individuals. Our primary analyses did not identify any genetic variants of moderate to large effect sizes that met genome-wide significance thresholds. The strongest signal was associated with rs7289446 (β = −0.4882, P = 2.73 × 10⁻⁷), intronic to SEZ6L on chromosome 22. However, we demonstrate that clinical outcomes in relapse-onset multiple sclerosis are associated with multiple genetic loci of small effect sizes. Using a machine learning approach incorporating over 62 000 variants together with clinical and demographic variables available at multiple sclerosis disease onset, we could predict severity with an area under the receiver operator curve of 0.84 (95% CI 0.79–0.88). Our machine learning algorithm achieved positive predictive value for outcome assignation of 80% and negative predictive value of 88%. This outperformed our machine learning algorithm that contained clinical and demographic variables alone (area under the receiver operator curve 0.54, 95% CI 0.48–0.60). Secondary, sex-stratified analyses identified two genetic loci that met genome-wide significance thresholds. One in females (rs10967273; β_female = 0.8289, P = 3.52 × 10⁻⁸), the other in males (rs698805; β_male = −1.5395, P = 4.35 × 10⁻⁸), providing some evidence for sex dimorphism in multiple sclerosis severity. Tissue enrichment and pathway analyses identified an overrepresentation of genes expressed in CNS compartments generally, and specifically in the cerebellum (P = 0.023). These involved mitochondrial function, synaptic plasticity, oligodendroglial biology, cellular senescence, calcium and G-protein receptor signalling pathways. We further identified six variants with strong evidence for regulating clinical outcomes, the strongest signal again intronic to SEZ6L (adjusted hazard ratio 0.72, P = 4.85 × 10⁻⁴). Here we report a milestone in our progress towards understanding the clinical heterogeneity of multiple sclerosis outcomes, implicating functionally distinct mechanisms to multiple sclerosis risk. Importantly, we demonstrate that machine learning using common single nucleotide variant clusters, together with clinical variables readily available at diagnosis can improve prognostic capabilities at diagnosis, and with further validation has the potential to translate to meaningful clinical practice change.
KW - disease severity
KW - genetics
KW - machine learning
KW - multiple sclerosis
KW - prognostics
UR - http://www.scopus.com/inward/record.url?scp=85161660538&partnerID=8YFLogxK
U2 - 10.1093/brain/awac449
DO - 10.1093/brain/awac449
M3 - Article
C2 - 36448302
AN - SCOPUS:85161660538
SN - 0006-8950
VL - 146
SP - 2316
EP - 2331
JO - Brain
JF - Brain
IS - 6
ER -