DlRNA-BERTa: A Transformer Approach for RNA-Drug Binding Affinity Prediction
1. DlRNA-BERTa is a novel RoBERTa-based framework that combines RNABERTa, pretrained on 9.76 million RNA sequences, with ChemBERTa-v2 to predict small molecule–RNA interactions. It includes six class-specific models for different RNA types and a general model for unknown RNA classes, outperforming existing RNA–drug interaction prediction methods with Pearson correlation coefficients up to 0.98 for miRNAs.
2. The study leverages a cross-attention mechanism to capture contextual interactions between RNA and drug tokens, enabling end-to-end, structure-free prediction of RNA–drug binding affinities. This approach allows the model to achieve robust performance across various RNA classes, even those with limited training data, such as repeats and riboswitches.
3. Applying DlRNA-BERTa to 3,492 approved drugs from the ChEMBL database identified 2,859 compounds with a predicted pKd ≥ 6 across 294 RNA targets. Bleomycin stood out among the hits, and published literature supports its RNA-binding activity, which lends support to the model's biological relevance.
4. The architecture and training process include two notable choices: μParametrization, so that hyperparameters tuned on a smaller model transfer to larger ones, and hyperparameter optimization with Optuna using the Tree-structured Parzen Estimator (TPE) sampler. Both are aimed at training efficiency and predictive performance.
5. A publicly accessible web application is available at huggingface.co/spaces/IlPako…, providing user-friendly access to the general model. The source code and datasets are openly available at github.com/IlPakoZ/rnaberta-…, in accordance with FAIR principles, allowing researchers to extend the framework and advance RNA-targeted drug discovery.
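For readers who want intuition for the cross-attention step in point 2, here is a minimal sketch of single-head scaled dot-product cross-attention in plain Python. It is not the authors' implementation: the toy 2-dimensional embeddings, the function names, the absence of learned projection matrices, and the choice of drug tokens as queries over RNA tokens are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    queries: token vectors from one modality (here: drug tokens)
    keys, values: token vectors from the other modality (here: RNA tokens)
    Returns one context vector and one attention row per query token.
    """
    scale = math.sqrt(len(queries[0]))
    contexts, weights = [], []
    for q in queries:
        # Similarity of this drug token to every RNA token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        row = softmax(scores)
        # Weighted average of the RNA value vectors.
        ctx = [sum(w * v[c] for w, v in zip(row, values))
               for c in range(len(values[0]))]
        contexts.append(ctx)
        weights.append(row)
    return contexts, weights

# Hypothetical toy embeddings (real models use hundreds of dimensions).
drug_tokens = [[1.0, 0.0], [0.0, 1.0]]
rna_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

context, attn = cross_attention(drug_tokens, rna_tokens, rna_tokens)
```

Each row of `attn` is a probability distribution over RNA tokens, so it shows which RNA positions a given drug token attends to most.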
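To put the pKd ≥ 6 cutoff from point 3 in concrete terms: pKd is the negative base-10 logarithm of the dissociation constant Kd in mol/L, so pKd = 6 corresponds to Kd = 1 µM, and higher pKd means tighter binding. The sketch below shows the conversion and a threshold filter; the drug names, RNA names, and predicted values are made up for illustration and are not results from the paper.

```python
import math

def kd_to_pkd(kd_molar):
    """Convert a dissociation constant in mol/L to pKd."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)

def pkd_to_kd(pkd):
    """Convert pKd back to a dissociation constant in mol/L."""
    return 10.0 ** (-pkd)

def filter_hits(predictions, threshold=6.0):
    """Keep (drug, rna, pKd) triples whose predicted pKd meets the cutoff."""
    return [(drug, rna, pkd) for drug, rna, pkd in predictions
            if pkd >= threshold]

# Hypothetical predictions, not data from the study.
predictions = [
    ("drug_A", "rna_1", 6.4),
    ("drug_B", "rna_1", 5.2),
    ("drug_C", "rna_2", 6.0),
]

hits = filter_hits(predictions)
```

With these made-up values, `drug_B` is dropped because its predicted pKd falls below the cutoff.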
📜Paper: biorxiv.org/content/10.1101/…
#RNAtherapeutics #DrugDiscovery #TransformerModels #Bioinformatics #AIinMedicine