Matlantis™

@matlantis_en

Jun 10

🧪 Finding cleaning chemicals that leave no residue on silicon substrates means navigating a vast molecular space, and experimentally evaluating every candidate is slow and costly. In a new Matlantis case study, SCREEN Holdings shows how Bayesian optimization paired with universal MLIPs accelerates this screening by 39× versus random sampling.

Residue-free wafer cleaning is critical to modern semiconductor manufacturing. After hydrofluoric acid treatment, the H-terminated Si(100) surface must remain free of adsorbed contaminants, but the chemical space of candidate molecules is enormous.

🔬 The team's approach combines:
- PFP in Matlantis for adsorption energy calculations on H-terminated Si(100)
- Molecular descriptors from RDKit and Force-Field Kernel Mean methods
- Bayesian optimization (via Optuna) to iteratively select promising candidates and refine the ML surrogate model

Each round of calculation feeds back into the next selection, narrowing toward low-adsorption-energy molecules without exhaustive enumeration.

🌱 The 39× screening efficiency gain demonstrates a practical workflow for discovering new wafer-cleaning materials, one that goes beyond what experimental approaches alone can deliver.

Read the full case study: matlantis.com/en/calculation…

#Matlantis #Semiconductors #BayesianOptimization
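
The select, calculate, and refit loop described in the post above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: `true_energy` is a hypothetical stand-in for an expensive adsorption-energy calculation, the surrogate is a simple kernel-weighted average rather than the model used in the case study, and no Optuna, RDKit, or Matlantis API is called.

```python
import math
import random


def true_energy(x):
    # Hypothetical stand-in for an expensive adsorption-energy calculation.
    return (x - 0.7) ** 2


def surrogate(x, observed, length=0.1):
    # Kernel-weighted mean of the observed values, plus a crude uncertainty
    # proxy that grows when no observation lies near x.
    weights = [math.exp(-((x - xo) / length) ** 2) for xo, _ in observed]
    total = sum(weights)
    mean = sum(w * y for w, (_, y) in zip(weights, observed)) / (total + 1e-12)
    uncertainty = 1.0 / (1.0 + total)
    return mean, uncertainty


def screen(pool, budget, n_init=3, kappa=1.0, seed=0):
    rng = random.Random(seed)
    remaining = list(pool)
    rng.shuffle(remaining)
    # Start from a few randomly chosen candidates.
    observed = [(x, true_energy(x)) for x in remaining[:n_init]]
    remaining = remaining[n_init:]
    while len(observed) < budget and remaining:
        def lower_confidence_bound(x):
            mean, uncertainty = surrogate(x, observed)
            return mean - kappa * uncertainty

        # Pick the candidate that looks best (low energy) or least explored.
        pick = min(remaining, key=lower_confidence_bound)
        remaining.remove(pick)
        # The new result feeds back into the next selection round.
        observed.append((pick, true_energy(pick)))
    return observed


pool = [i / 200 for i in range(201)]  # hypothetical 1-D descriptor values
history = screen(pool, budget=20)
best_x, best_y = min(history, key=lambda pair: pair[1])
print(len(history), best_x, best_y)
```

Only 20 of the 201 candidates are evaluated here; the lower-confidence-bound rule is one common way to trade off exploiting low predicted energies against exploring unvisited regions.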

Data-Driven Method for Discovering Low Adsorption Energy Molecules on Si Surfaces

In semiconductor manufacturing processes, a high level of cleanliness is required to prevent residues from remaining on Si substrates, and the application of new chemical solutions is being conside...

matlantis.com

Pradeep

@Pradeep891730

May 29

Inverse design in battery materials is in an awkward place.

The forward direction is well-studied: pick a composition, predict energy density. Materials Project, OQMD, and AFLOWLIB give you decent training data for that.

The inverse direction ("find me a composition that hits 300 Wh/kg with >1500 cycles at 4 C-rate") is much harder, because you need joint distributions over composition, processing, and performance under realistic operating conditions. The public datasets don't have that.

MAT-001 was built with inverse design in mind:
→ Composition features: cathode, anode, electrolyte, separator, crystal structure, 12 dopant species, 6 binder systems
→ Process features: manufacturing variation per batch, yield, anomaly rate
→ Performance features: energy density, power density, cycle life, thermal runaway temp, Coulombic efficiency, impedance growth, fast-charge capacity loss
→ All jointly distributed across 25,000 materials at full scale (2,500 in the free sample)

Sample on Hugging Face: huggingface.co/datasets/xper…

If you're working on Bayesian optimization, generative models, or active learning loops for battery materials, this gives you the joint distribution you've been missing.

#materialsinformatics #inversedesign #bayesianoptimization #activelearning

xpertsystems/mat001-sample · Datasets at Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

Bryan Kian Hsiang Low

@bryanklow

May 27

📢 Tired of benchmarking your optimizer on Hartmann and Branin? Try BoLT ⚡, our new black-box optimization (BBO) benchmark grounded in 20K real LLM experiments instead!

LLMs involve expensive, derivative-free decisions that BBO is built to handle. Yet, most BBO research still validates on synthetic functions that miss the challenges of real LLM tasks. BoLT ⚡ closes this gap so that you can evaluate BBO methods against realistic objectives without needing large-scale compute.

📦 3 task families, 10 problems spanning:
• Hyperparameter optimization (LoRA fine-tuning, mixed variables, multi-fidelity);
• Data mixture optimization (simplex constraints, multi-objective, heteroscedastic noise);
• Prompt optimization (high-dimensional discrete search up to 768 dims).

🚀 Fast, validated emulators replace real LLM calls, returning results in milliseconds. Weights load automatically from HuggingFace on first use.

🔌 Every problem subclasses BoTorch's BaseTestProblem, so your existing optimizer code plugs straight in.

Key findings from benchmarking 15 methods:
• GP-based BO consistently beats standard HPO baselines;
• NEHVI matches NSGA-II on multi-objective data mixture optimization with 50× fewer evaluations;
• trust-region methods are essential for high-dimensional discrete prompt search.

Joint work with Ruth Chew @ruthchewing, Zhiliang Chen @ZhiliangChen94, and Apivich Hemachandra @apivich_h. Check us out @icmlconf #ICML2026 DEMO Workshop (decision-making-offline2onli…)!

📄 Preprint: arxiv.org/abs/2605.17000
🌐 Project page: chewwt.github.io/bolt
⭐ GitHub: github.com/chewwt/bolt (star to keep up with future updates)
💻 Docs: bolt-bench.readthedocs.io

#BayesianOptimization #BlackboxOptimization #LLMs

Information MDPI @InformationMDPI

May 11

💰 Read #NewPaper "A Bayesian-Optimized XGBoost Approach for Money Laundering Risk Prediction in Financial Transactions" by Zihao Zuo, et al. See more details at: doi.org/10.3390/info17040324 #AntiMoneyLaundering #XGBoost #ML #FinTech #BayesianOptimization @ComSciMath_Mdpi @zh_Zzuo

Mo Lotfollahi

@mo_lotfollahi

May 2

Our paper "Many Needles in a Haystack" has been accepted at ICML 2026. See you in Seoul! 🇰🇷🧬

CRISPR screens can test thousands of genes, but budgets are tight and hits are rare. Which perturbations should you run next?

We frame this as a lab-in-the-loop design problem: AI proposes a batch → lab runs it → readouts update the model → repeat. Each cycle gets smarter about where hits are hiding.

Our method, Probability-of-Hit, recovers more hits across 5 real immunology screens. More hits per plate, fewer wasted wells.

Great work by Andrea Rubbi, Arpit Merchant, Samuel Ogden, Amir Akbarnejad, with Pietro Lio & Sattar Vakili 🎉

#ICML2026 #ActiveLearning #PerturbSeq #FunctionalGenomics #CRISPRscreen #LabInTheLoop #AI4Science #BayesianOptimization
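
The post does not say how Probability-of-Hit is computed, so the sketch below is not the authors' method. It shows one generic reading of the name, assuming each candidate has a Gaussian predictive distribution: score a perturbation by the probability that its readout exceeds a hit threshold, then send the top-scoring batch to the lab. The gene names, predictions, and threshold are hypothetical.

```python
import math


def probability_of_hit(mean, std, threshold):
    """P(y > threshold) for y ~ Normal(mean, std**2)."""
    if std <= 0:
        # Degenerate case: no predictive uncertainty.
        return 1.0 if mean > threshold else 0.0
    z = (mean - threshold) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def select_batch(predictions, threshold, batch_size):
    """Return the batch_size candidates with the highest hit probability."""
    ranked = sorted(
        predictions,
        key=lambda name: probability_of_hit(*predictions[name], threshold),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical (mean, std) predictions for four perturbations.
predictions = {
    "gene_a": (0.2, 1.0),   # low mean, high uncertainty
    "gene_b": (0.9, 0.1),   # just under the threshold, confident
    "gene_c": (1.5, 0.5),   # above the threshold
    "gene_d": (0.0, 0.05),  # confidently a non-hit
}
batch = select_batch(predictions, threshold=1.0, batch_size=2)
print(batch)
```

Note that the uncertain `gene_a` outranks the confident near-miss `gene_b`: under this scoring, uncertainty raises the chance of clearing the threshold when the mean sits below it.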

Forecasting MDPI @forecast_MDPI

May 1

📢 Must-Read in #Forecasting 📖 Research on Dynamic Hyperparameter Optimization Algorithm for University Financial Risk Early Warning Based on Multi-Objective Bayesian Optimization 🔗 brnw.ch/21x26FY #BayesianOptimization #FinancialRisk #HigherEducation

Bryan Kian Hsiang Low

@bryanklow

Apr 20

[1/3] 🤔An interesting and practical question: How can we find the optimal #LLM training data mixture that maximizes a free-form downstream task metric? For instance, what fine-tuning data mixture should we use to maximize same-demographic user ratings ⭐ across our chatbots? Our #ICLR2026 work (with @ZhiliangChen94 @greglau Chuan-Sheng Foo) called DUET interleaves #BayesianOptimization and #DataSelection to automatically discover the best data mixture that maximizes any free-form downstream feedback, without manually searching through countless combinations. 📄Paper: arxiv.org/abs/2502.00270 📅Catch us at @iclr_conf 🇧🇷Poster Session 3 Fri Apr 24 10:30AM Pavilion 3 P3-#305. More below👇.

Bryan Kian Hsiang Low

@bryanklow

Apr 20

[3/4] Enter JoBS: Joint #BayesianOptimization (BO) with a Scaling-law-inspired predictor. The trick: Burn a small slice of compute on N full runs → train a neural net predictor that extrapolates final performance at 10,000 steps from just 100! Now every BO iteration is ~10x cheaper. 🔥
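
To make the extrapolation idea concrete, here is a minimal sketch that swaps in a plain power-law fit for the neural net predictor mentioned in the post. It assumes the loss follows `a * step ** (-b)`, fits `a` and `b` by least squares in log-log space on the first 100 steps, and extrapolates to step 10,000. The loss values are synthetic and hypothetical; this is not the JoBS predictor.

```python
import math


def fit_power_law(steps, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(step)."""
    xs = [math.log(s) for s in steps]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


def extrapolate(a, b, step):
    return a * step ** (-b)


# Synthetic early-training losses, observed only up to step 100.
steps = [10, 20, 40, 80, 100]
losses = [3.0 * s ** -0.25 for s in steps]

a, b = fit_power_law(steps, losses)
predicted = extrapolate(a, b, 10_000)
print(a, b, predicted)
```

Real training curves are noisy and rarely follow a single power law exactly, which is presumably why a learned predictor is used instead of a closed-form fit.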

Bryan Kian Hsiang Low

@bryanklow

Apr 20

[2/4] Most practitioners pick one of the following and hope for the best: Optimize data with a fixed LoRA config → leaves performance on the table. Optimize LoRA with a fixed mixture → same story. Joint optimization is the answer, but each full finetuning run costs $$$, making naive #BayesianOptimization (BO) infeasible.

Biology AI Daily @BiologyAIDaily

Apr 17

BOAT: Navigating the Sea of In Silico Predictors for Antibody Design via Multi-Objective Bayesian Optimization

1. BOAT is a plug-and-play multi-objective Bayesian optimization (BO) framework that jointly optimizes multiple antibody properties predicted by arbitrary in silico “oracles,” aiming to replace inefficient sequential filtering pipelines with Pareto-aware design.
2. The key engineering idea: uncertainty-aware surrogate modeling (Gaussian processes) proposes which sequences to score next, while a genetic algorithm (GA) is used to optimize acquisition functions directly in discrete sequence space (avoiding invalid continuous edits and awkward projections).
3. BOAT targets realistic lead-optimization settings where objectives can conflict (e.g., affinity vs. developability vs. immunogenicity risk proxies). It supports full-sequence or region-restricted optimization (e.g., specific CDRs), plus practical constraints such as restricting mutable positions, allowed amino-acid dictionaries, and liability filtering (e.g., glycosylation motifs).
4. Method details: sequences are embedded (one-hot, BLOSUM-derived, bag-of-5-grams, or AbLang-2 embeddings), then modeled with a GP using a Tanimoto kernel to better handle high-dimensional sparse-like representations. Multi-objective acquisition uses EHVI (and NEHVI for noisy settings), implemented via BoTorch.
5. Cross-reactive VHH case study: BOAT optimizes CDR1/2/3 (up to 5 mutations per CDR) to improve binding to two related antigens, optionally adding humanness (OASis) and PLM likelihood (ESM-2) as additional objectives. Mutation choices are constrained to a curated per-position amino-acid dictionary grounded in available experimental single-point data.
6. Benchmarking against GA baselines (sum-of-objectives GA and NSGA-II): across 2–4 objectives and multiple CDRs, BOAT variants reach higher hypervolume earlier and end with better hypervolume under the same oracle-call budget (1000). NSGA-II degrades notably as objective count increases, consistent with many-objective optimization issues.
7. When exhaustive enumeration is feasible (smaller constrained spaces), BOAT recovers Pareto fronts close to the “ground-truth” oracle-induced Pareto frontier, including in very large enumerated CDR3 spaces (tens of millions of sequences), highlighting sample-efficient Pareto exploration rather than brute-force scoring.
8. Diversity matters for wet-lab follow-up: batch BO acquisition (qEHVI/qNEHVI) tends to produce higher Shannon-entropy sequence sets while maintaining strong hypervolume, whereas sequential EHVI can be more exploitative (competitive hypervolume but lower diversity). Larger batch sizes increase diversity, with some early hypervolume trade-offs.
9. Practical limits and regimes: (i) NEHVI can become dramatically slower as objectives increase (e.g., 3 objectives taking minutes per BO step vs seconds for 2), (ii) complex structure-based oracles (Boltz-2 ipTM) can break surrogate fidelity with simple encodings; here, semi-random GA search can be competitive, motivating richer structure-aware surrogates/kernels.
10. Comparison to generative multi-objective methods (LaMBO-2) on the 4-4-20 scFv affinity/expression dataset: using the same discriminative head as BOAT’s oracle, BOAT generally achieves higher hypervolume over generated sequences. However, BOAT can exploit predictor artifacts and go out-of-distribution; adding an ESM-2 likelihood objective acts as a “naturalness” regularizer, underscoring that oracle quality and priors critically shape in silico Pareto fronts.

💻Code: github.com/AstraZeneca/boat
📜Paper: arxiv.org/abs/2604.13980

#BayesianOptimization #MultiObjectiveOptimization #AntibodyDesign #ProteinEngineering #MachineLearning #ComputationalBiology #DrugDiscovery #ActiveLearning #GaussianProcesses #ParetoOptimization
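
Two quantities named in the summary above, the Tanimoto kernel and hypervolume, are easy to state in code. The sketch below uses the standard textbook definitions in plain Python; it is not taken from the BOAT repository, which the summary says builds on BoTorch. The hypervolume routine assumes two objectives that are both maximized, and the example vectors and points are hypothetical.

```python
def tanimoto(x, y):
    """Tanimoto similarity: <x, y> / (|x|^2 + |y|^2 - <x, y>)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # Two all-zero vectors are treated as identical.
    return 1.0 if denom == 0 else dot / denom


def hypervolume_2d(points, ref):
    """Area dominated by `points` above the reference point `ref`.

    Both objectives are maximized. Points that do not improve on `ref`
    in both objectives contribute nothing.
    """
    candidates = sorted(
        (p for p in points if p[0] > ref[0] and p[1] > ref[1]),
        key=lambda p: p[0],
        reverse=True,
    )
    area = 0.0
    best_second = ref[1]
    # Sweep from the largest first objective; each point that raises the
    # best second objective adds a new horizontal slab.
    for first, second in candidates:
        if second > best_second:
            area += (first - ref[0]) * (second - best_second)
            best_second = second
    return area


# Sparse count-style features, e.g. bag-of-n-gram counts.
similarity = tanimoto([1, 0, 1], [1, 1, 0])

# Three Pareto-optimal points plus one dominated point.
front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0)]
area = hypervolume_2d(front, ref=(0.0, 0.0))
print(similarity, area)
```

Hypervolume grows only when a new point extends the Pareto front, which is why it serves as the progress metric in item 6.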

IIT Comillas @IIT_Comillas

Apr 14

🤖📍 New #ModellingSeminars session 📖 Precision-Weighted Joint Entropy Search for Bayesian Optimization 🎤 Eduardo César Garrido Merchán 📆 April 16 | 12:30 📍 5th Floor, Rey Francisco 4 #BayesianOptimization #ResearchSeminar

Trending AI Topics

Generative Multi-Objective Bayesian Optimization with Scalable...

Designing molecules that must satisfy multiple, often conflicting objectives is a central challenge in molecular discovery. The enormous size of chemical space and the cost of high-fidelity...

arxiv.org

Biology AI Daily @BiologyAIDaily

Mar 8

Deep Learning-Guided Evolutionary Optimization for Protein Design

1. BoGA introduces a hybrid approach combining genetic algorithms with Bayesian optimization, where a surrogate model acts as a discriminator to filter candidate sequences before expensive evaluation, dramatically improving optimization efficiency.
2. The key innovation lies in decoupling sequence generation from evaluation: the genetic algorithm proposes diverse candidates through mutation, while a deep learning surrogate model prioritizes which candidates merit costly structure prediction or docking calculations.
3. The framework demonstrates superior performance across multiple tasks including beta-sheet fraction optimization, normalized hydrophobic moment maximization, and AlphaFold-guided secondary structure design, with larger proposal pools consistently yielding better results.
4. In a real-world application, BoGA successfully designed peptide binders targeting pneumolysin, a critical virulence factor of Streptococcus pneumoniae, accelerating discovery of high-confidence binders compared to standard genetic algorithms.
5. The method offers significant advantages over existing approaches like hallucination or diffusion-based methods: no requirement for large-scale pre-training, flexible objective functions without retraining, and seamless integration of advancing structure prediction tools.
6. BoGA is implemented within the modular BoPep suite, supporting interchangeable embeddings, surrogate architectures, acquisition functions, and mutation operators, making it a generalizable strategy for diverse protein design objectives.

📜Paper: arxiv.org/abs/2603.02753

#ProteinDesign #BayesianOptimization #GeneticAlgorithm #DeepLearning #ComputationalBiology #PeptideBinders #Pneumolysin #Bioinformatics #AIforScience
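
The propose, filter, then evaluate pattern in items 1 and 2 can be sketched as follows. Everything here is a toy assumption rather than BoGA or BoPep code: `expensive_score` stands in for structure prediction or docking, and the surrogate is a 1-nearest-neighbour lookup by Hamming distance instead of a deep learning model.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def expensive_score(seq):
    # Hypothetical stand-in for a costly evaluation: hydrophobic fraction.
    return sum(c in "AILMFVW" for c in seq) / len(seq)


def surrogate_score(seq, evaluated):
    # Cheap proxy: reuse the score of the most similar evaluated sequence.
    nearest = min(evaluated, key=lambda s: sum(a != b for a, b in zip(s, seq)))
    return evaluated[nearest]


def mutate(seq, rng):
    # Single-point substitution at a random position.
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]


def optimize(start, rounds=5, pool_size=30, n_eval=3, seed=0):
    rng = random.Random(seed)
    evaluated = {start: expensive_score(start)}
    for _ in range(rounds):
        # Genetic step: mutate the best sequences found so far.
        parents = sorted(evaluated, key=evaluated.get, reverse=True)[:n_eval]
        proposals = {mutate(rng.choice(parents), rng) for _ in range(pool_size)}
        pool = sorted(proposals - set(evaluated))
        # Filter step: the surrogate ranks the pool, and only the top
        # n_eval proposals receive the expensive evaluation.
        ranked = sorted(pool, key=lambda s: surrogate_score(s, evaluated), reverse=True)
        for seq in ranked[:n_eval]:
            evaluated[seq] = expensive_score(seq)
    return evaluated


start = "GSGSGSGSGS"
results = optimize(start)
best = max(results, key=results.get)
print(len(results), best, results[best])
```

With these defaults, at most 3 of every 30 proposals reach the expensive evaluation, which is the source of the efficiency gain the summary describes.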

Biology AI Daily @BiologyAIDaily

Mar 8

Bayesian Optimization in Chemical Compound Sub-spaces Using Low-dimensional Molecular Descriptors

1) This work presents a data-efficient Bayesian optimization framework that can identify optimal molecular structures with fewer than 2,000 training points in a chemical sub-space containing over 133,000 molecules.
2) The key innovation is a reliable inverse mapping scheme that translates optimized points in descriptor space back into chemically valid molecular structures, bridging the gap between continuous optimization and discrete molecular design.
3) The framework employs low-dimensional, physics-informed molecular descriptors that enable accurate Gaussian Process Regression even with limited training data, addressing the curse of dimensionality that plagues traditional molecular optimization.
4) For entropy optimization, the approach achieves a 100% success rate while requiring fewer than 1,000 molecular evaluations in more than 80% of test cases on the QM9 benchmark dataset.
5) For zero-point vibrational energy (ZPVE), the success rate exceeds 80% for molecules containing more than two heavy atoms, demonstrating robust performance across different molecular properties.
6) The inverse mapping algorithm predicts chemical formulas from descriptor vectors by matching predicted stoichiometry and shape characteristics against molecular databases, with a fallback penalty for chemically implausible suggestions.
7) The method outperforms conventional generative approaches that typically require large datasets, making it particularly suitable for data-scarce settings in molecular discovery.
8) The descriptors combine Coulomb matrix eigenvalues with inner products of atomic reference probability densities, capturing both global molecular shape and local atomic environment information.

📜Paper: arxiv.org/abs/2603.02605

#BayesianOptimization #MolecularDesign #InverseDesign #GaussianProcess #QM9 #ChemicalSpace #LowDimensionalDescriptors #MolecularOptimization #ComputationalChemistry #MachineLearning

AIトレンド速報｜最新ニュース & 活用術

@AI_Bridge_Japan

Feb 24

Replying to @AI_Bridge_Japan @PyTorch

The wait time that had been a bottleneck in the development cycle is eliminated, enabling faster kernel development. Authors: Ethan Che, Oguz Ulgen, Maximilian Balandat, Jongsok Choi, Jason Ansel (Meta) #PyTorch #Helion #MachineLearning #BayesianOptimization #OpenSourceAI #Performance