Phenotypic Evaluation of Deep Learning Models for Classifying Germline Variant Pathogenicity
@Nature_NPJ
• This study evaluates the real-world utility of three state-of-the-art deep learning models (AlphaMissense, EVE, and ESM1b) in classifying germline variants associated with hereditary cancer risk.
• Using data from 469,623 UK Biobank participants, the study focuses on missense variants in key cancer-related genes, including BRCA1, BRCA2, ATM, CHEK2, and PALB2.
• AlphaMissense and ESM1b identified pathogenic BRCA1 and BRCA2 variants that conferred increased risk of breast and ovarian cancer, but struggled with other genes such as ATM and CHEK2.
• Notably, AlphaMissense identified potentially pathogenic PALB2 variants that ClinVar had categorized as variants of uncertain significance (VUS), hinting at the models’ potential for refining variant classification.
• Despite their success with some genes, all models exhibited limited accuracy in distinguishing VUSs associated with increased cancer risk, underscoring the need for cautious interpretation in clinical practice.
• Composite classifiers that combined ClinVar annotations with deep learning predictions reduced the proportion of participants classified as VUS carriers, though at the cost of predictive power.
• The study highlights the importance of gene-specific thresholds: a uniform cutoff reduced model accuracy across multiple genes, suggesting that per-gene thresholds may improve performance.
• The authors emphasize the need for diverse genomic data to mitigate biases in current models, especially since VUSs are more prevalent in non-European populations.
• While the study shows promise for integrating deep learning in clinical settings, it concludes that deep learning models are not yet ready to fully replace traditional variant classification methods for clinical decision-making.
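To make the composite-classifier and gene-specific-threshold ideas above concrete, here is a minimal Python sketch. It is not the authors' implementation (their code is in the linked repository). The function name, label strings, and every threshold value are hypothetical placeholders chosen for illustration. The assumed rule: a definitive ClinVar annotation wins, and only VUSs are reclassified using a model score compared against a per-gene cutoff.

```python
from typing import Dict

# Hypothetical per-gene score cutoffs; illustrative values only, not from the paper.
GENE_THRESHOLDS: Dict[str, float] = {"BRCA1": 0.80, "BRCA2": 0.75, "PALB2": 0.70}

# Hypothetical uniform cutoff used when a gene has no tuned threshold.
DEFAULT_THRESHOLD = 0.50


def composite_label(
    gene: str,
    clinvar: str,
    score: float,
    thresholds: Dict[str, float] = GENE_THRESHOLDS,
    default: float = DEFAULT_THRESHOLD,
) -> str:
    """Combine a ClinVar annotation with a model pathogenicity score.

    Assumed rule: keep definitive ClinVar calls as they are, and reclassify
    anything else (e.g. a VUS) by comparing the score to a per-gene cutoff.
    """
    if clinvar in ("pathogenic", "benign"):
        return clinvar
    cutoff = thresholds.get(gene, default)
    return "predicted_pathogenic" if score >= cutoff else "predicted_benign"


if __name__ == "__main__":
    # The same score can yield different calls depending on the gene's cutoff.
    print(composite_label("BRCA1", "VUS", 0.60))
    print(composite_label("ATM", "VUS", 0.60))
```

Because every VUS receives a predicted label under this rule, the share of VUS carriers drops, which mirrors the trade-off described above: the reclassified calls are only as reliable as the model and cutoff behind them.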
@ravi_b_parikh @KLNathanson @ScienceChow
💻Code:
github.com/rdchow/UKB_pathog…
📜Paper:
nature.com/articles/s41698-0…