PMID: 36294727

LLM-PBC: Logic Learning Machine-Based Explainable Rules Accurately Stratify the Genetic Risk of Primary Biliary Cholangitis

Abstract

Background: Applying Machine Learning (ML) to individual-level genetic data is a foreseeable advance for the field, but this application is still in its infancy. Here, we aimed to evaluate the feasibility and accuracy of an ML-based model for predicting the risk of Primary Biliary Cholangitis (PBC).

Methods: Genome-wide significant variants identified in subjects of European ancestry in the recently released second international meta-analysis of GWAS in PBC were used as input data. Quality-checked, individual-level genomic data from two Italian cohorts were used. The ML pipeline comprised the following steps: import of genotype and phenotype data, genetic variant selection, supervised classification of PBC by genotype, generation of "if-then" rules for disease prediction by logic learning machine (LLM), and model validation in a different cohort.
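To make the notion of "if-then" rules concrete, the following is a minimal sketch of how a rule over genotypes could be represented and evaluated. It does not learn rules and is not the LLM algorithm used in the study. The variant identifiers (`var_A`, `var_B`), the `Rule` class, and the `covering` function are hypothetical names introduced for illustration. Covering is assumed here to be the fraction of samples of the rule's target class that satisfy all of the rule's conditions; the abstract does not define it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A genotype maps a variant identifier to a minor-allele count (0, 1, or 2).
Genotype = Dict[str, int]


@dataclass
class Rule:
    """An if-then rule: if every condition holds, predict `label`."""
    conditions: Dict[str, Callable[[int], bool]]
    label: str

    def matches(self, genotype: Genotype) -> bool:
        # A variant missing from the genotype makes the rule fail to match.
        return all(
            variant in genotype and test(genotype[variant])
            for variant, test in self.conditions.items()
        )


def covering(rule: Rule, samples: List[Genotype], labels: List[str]) -> float:
    """Fraction of target-class samples that satisfy the rule (assumed definition)."""
    target = [g for g, y in zip(samples, labels) if y == rule.label]
    if not target:
        return 0.0
    return sum(rule.matches(g) for g in target) / len(target)


# Hypothetical toy data: three cases and one control.
toy_samples = [
    {"var_A": 1, "var_B": 0},
    {"var_A": 2, "var_B": 0},
    {"var_A": 0, "var_B": 0},
    {"var_A": 1, "var_B": 0},
]
toy_labels = ["PBC", "PBC", "PBC", "control"]

# Hypothetical rule: IF var_A has at least one minor allele AND var_B has none THEN PBC.
toy_rule = Rule(
    conditions={"var_A": lambda c: c >= 1, "var_B": lambda c: c == 0},
    label="PBC",
)

toy_covering = covering(toy_rule, toy_samples, toy_labels)
```

Because each rule is a conjunction of readable conditions on named variants, a matched rule can be reported directly as the explanation for a prediction.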

Results: The training cohort included 1345 individuals: 444 PBC cases and 901 healthy controls. After pre-processing, 41,899 variants entered the analysis. Several configurations of feature-selection parameters were evaluated. The best LLM model reached an accuracy of 71.7%, a Matthews correlation coefficient of 0.29, a Youden's index of 0.21, a sensitivity of 0.28, a specificity of 0.93, a positive predictive value of 0.66, and a negative predictive value of 0.72. Thirty-eight rules were generated. The rule with the highest covering (19.14) included the following genes: RIN3, KANSL1, TIMMDC1, TNPO3. The validation cohort included 834 individuals: 255 cases and 579 controls. Applying the ruleset derived in the training cohort to the validation cohort gave an area under the curve of 0.73.
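The metrics reported above are all functions of a binary confusion matrix. The sketch below shows the standard definitions; the function names and the toy counts are hypothetical and are not taken from the study, whose confusion matrix is not given in the abstract. The only study values used are the reported sensitivity (0.28) and specificity (0.93), to show that the Youden statistic follows from them as sensitivity + specificity - 1.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "mcc": mcc,
        "youden": youden_index(sensitivity, specificity),
    }


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Hypothetical toy counts, chosen only to exercise the formulas.
toy_metrics = binary_metrics(tp=30, fp=20, tn=180, fn=70)

# Youden's J from the sensitivity and specificity reported in the abstract.
reported_youden = round(youden_index(0.28, 0.93), 2)
```

A low sensitivity paired with a high specificity, as reported here, means the ruleset rarely flags controls as cases but misses most cases at the chosen decision threshold.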

Conclusions: This study represents the first illustration of an ML model applied to common variants associated with PBC. Our approach is computationally feasible, leverages individual-level data to generate intelligible rules, and can be used for disease prediction in at-risk individuals.

Citing Articles

Deep learning helps discriminate between autoimmune hepatitis and primary biliary cholangitis.

Gerussi A, Saldanha O, Cazzaniga G, Verda D, Carrero Z, Engel B. JHEP Rep. 2025; 7(2):101198.

PMID: 39829723 PMC: 11741034. DOI: 10.1016/j.jhepr.2024.101198.


Genetic susceptibility to severe COVID-19.

Cappadona C, Rimoldi V, Paraboschi E, Asselta R. Infect Genet Evol. 2023; 110:105426.

PMID: 36934789 PMC: 10022467. DOI: 10.1016/j.meegid.2023.105426.
