h100envy


h100envy

@h100envy

Apr 29

x.com/i/article/204944758839…

21,686

Valeriy M., PhD, MBA, CQF

@predict_addict

Apr 22

Why your gradient boosting model is secretly overconfident (and how CatBoost gives it a reality check)

We all know the feeling. You train an XGBoost or LightGBM model, and the training error drops beautifully. The metrics look amazing. Then you deploy it on new data, and performance degrades unexpectedly.

Beyond standard overfitting, there is a deeper, subtle mathematical flaw in standard gradient boosting that contributes to this. It’s called Prediction Shift.

Here is the hidden trap. In standard boosting, in iteration $k$, you calculate the gradient (the error) for a specific data point. To do this, you use the current model built from iterations $1$ to $k-1$. The problem is that the current model *was already trained using that exact data point* in those previous rounds. The model has "seen" this data point before. Therefore, the gradient it calculates on the training set is biased. It's too optimistic compared to the gradient it would see on fresh, unseen test data.

It’s like practicing for a final exam using the exact questions that will appear on the test. You will score amazingly well in practice. Your confidence will soar. But when you face new questions on the real exam, you fail because you memorized specific answers instead of learning general concepts. Your model is deluding itself about how well it's actually doing.

🚀 CatBoost’s "Ordered Boosting" Reality Check

CatBoost is the only major library that fixes this fundamental mathematical bias using a technique called Ordered Boosting. It utilizes the same "time-travel" permutation logic I mentioned in previous posts. To calculate the gradient for data point X, CatBoost uses a version of the model trained **only** on data points that appear *before* X in the shuffled timeline. It strictly forbids the model from peeking at point X when building the specific trees used to predict point X.

The Result: By removing this bias from the gradient estimation, CatBoost gets a "reality check" during every step of training. The training process is harder, but the resulting model generalizes significantly better to new data, especially on smaller or noisier datasets where this overfitting bias is most damaging.

TL;DR
❌ XGBoost / LightGBM: Calculate gradients on data the model has already seen, leading to overconfidence (Prediction Shift).
✅ CatBoost: Uses Ordered Boosting to ensure gradients are unbiased, leading to better generalization on fresh data.

A little extra math in the training process saves a lot of headaches in production. Check my book -> valeman.gumroad.com/l/Master… #MachineLearning #DataScience #CatBoost #GradientBoosting #AI #Overfitting
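The optimism described in this post can be seen in a toy setting. The sketch below is only an illustration and not CatBoost's implementation: it assumes the "model" is just the mean of the targets, and every function name in it is hypothetical. It compares residuals computed against a model that has seen the point, a model that has not, and a model built only from points that come earlier in a random permutation.

```python
import random


def mean(values):
    return sum(values) / len(values)


def in_sample_residuals(y):
    """Residuals against a constant model fitted on ALL targets, including y[i]."""
    m = mean(y)
    return [yi - m for yi in y]


def leave_one_out_residuals(y):
    """Residuals against a constant model fitted WITHOUT y[i], as on fresh data."""
    total, n = sum(y), len(y)
    return [yi - (total - yi) / (n - 1) for yi in y]


def ordered_residuals(y, seed=0, prior=0.0):
    """Residual for y[i] uses only targets that precede i in a random permutation."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    residuals = [0.0] * len(y)
    running_sum, count = 0.0, 0
    for i in order:
        # The prediction is built before y[i] is added to the running statistics.
        prediction = running_sum / count if count else prior
        residuals[i] = y[i] - prediction
        running_sum += y[i]
        count += 1
    return residuals


rng = random.Random(42)
y = [rng.gauss(0.0, 1.0) for _ in range(20)]  # synthetic targets, an assumption

seen = in_sample_residuals(y)
unseen = leave_one_out_residuals(y)
ordered = ordered_residuals(y)
shrink = (len(y) - 1) / len(y)  # for a mean model, seen == shrink * unseen

print("mean |residual|, model has seen the point:", mean([abs(r) for r in seen]))
print("mean |residual|, point held out:          ", mean([abs(r) for r in unseen]))
print("mean |residual|, ordered (prefix only):   ", mean([abs(r) for r in ordered]))
```

For a mean model, each in-sample residual is exactly (n-1)/n times its held-out counterpart, so the training-time error is always a little too small. The ordered variant avoids using a point's own target at the cost of noisier estimates for points early in the permutation.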

1,651

Juan Aparicio @JuanAparicioUMH

Apr 13

🚀 First ever: Gradient Tree Boosting for production frontier estimation — satisfying all microeconomic axioms. Result? 35% lower MSE vs. FDH. When ML meets production theory, both win. 📄 ESWA 2023 w/ Guillen & Esteve 👉 doi.org/10.1016/j.eswa.2022.… #GradientBoosting #ML #Efficiency

diwanshu.gg @TheRealDiwanshu

Apr 13

Day 70 of ML! Gradient Boosting (Regression + Classification). From math → residuals → log-odds → full implementation. Code: [github.com/DiwanshuG/Machine…] #MachineLearning #GradientBoosting #LearnInPublic

Annals of Computer Science and Information Systems @annals_csis

Mar 9

“Exploring Stability and Performance of hybrid #GradientBoosting Classification and Regression Models in Sectors #StockTrendPrediction: A Tale of Preliminary Success and Final Challenge” by M. Liu, L. Cen, D. Ruta, QH Vu. ACSIS Vol. 39 p. 761–766; tinyurl.com/2uhrz43f

Valeriy M., PhD, MBA, CQF

@predict_addict

Feb 19

821

Tijani Rofee'ah @Rofeeah_Tijani

Feb 14

#GradientBoosting #Machinelearning #SupervisedML #Working @ThePSF

853

mrlutz

@mr1lutz

Feb 8

x.com/i/article/202038922020…

1,939

Analytics Vidhya

@AnalyticsVidhya

Feb 2

Boosting battle! AdaBoost vs XGBoost vs LightGBM vs CatBoost - which reigns supreme? Find out the strengths & weaknesses of each in this head-to-head comparison! #MachineLearning #Boosting #GradientBoosting analyticsvidhya.com/blog/202…

Gradient Boosting vs AdaBoost vs XGBoost vs CatBoost vs LightGBM: Finding the Best Gradient...

A practical comparison of AdaBoost, GBM, XGBoost, LightGBM, and CatBoost to find the best gradient boosting model.

analyticsvidhya.com

252

Valeriy M., PhD, MBA, CQF

@predict_addict

20 Dec 2025

1,500

Valeriy M., PhD, MBA, CQF

@predict_addict

17 Nov 2025

Serious about being a data scientist? Running fit() on a gradient boosting model isn’t enough. Mastery is knowing which algorithm to choose, and why. XGBoost vs LightGBM vs CatBoost comes down to three dimensions:

1. Optimization scheme
• XGBoost popularized second-order (Newton) updates using gradients and Hessians for robust, accurate minimization.

2. Tree construction strategy
• LightGBM grows trees leaf-wise (best-first). It’s fast and memory-efficient with GOSS and EFB, but can produce deep, asymmetric trees that overfit.
• CatBoost builds balanced, symmetric “oblivious” trees, enabling fast inference and helping resist overfitting.

3. Statistical treatment of data
• Standard boosting reuses the full sample to estimate gradients, causing bias and prediction shift.
• CatBoost uses Ordered Boosting and Ordered Target Statistics, computing estimates only from “past” examples in a permutation. This reduces leakage and improves robustness, especially with noisy data and many categorical features.

Stop guessing which booster to use. Start mastering the mechanics that power top-tier tabular models. Ready to go deeper? Grab my book, Mastering CatBoost Pro, this Black Friday. valeman.gumroad.com/l/Master… Code: BF2025 #DataScience #MachineLearning #XGBoost #LightGBM #CatBoost #GradientBoosting
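The "past examples in a permutation" idea behind Ordered Target Statistics can be sketched in a few lines. This is a simplified illustration rather than CatBoost's actual code: it assumes a single random permutation and the smoothed form (sum + a * prior) / (count + a) described in the CatBoost paper, and the function names, the parameter `a`, and the tiny dataset are all hypothetical.

```python
import random
from collections import defaultdict


def ordered_target_statistics(categories, targets, prior, a=1.0, seed=0):
    """Encode each row from the targets of EARLIER rows (in a random order) only."""
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)
    sums = defaultdict(float)
    counts = defaultdict(int)
    encoded = [0.0] * len(categories)
    for i in order:
        c = categories[i]
        # Encode first, then update: the row's own target is never used for itself.
        encoded[i] = (sums[c] + a * prior) / (counts[c] + a)
        sums[c] += targets[i]
        counts[c] += 1
    return encoded


def greedy_target_statistics(categories, targets, prior, a=1.0):
    """Encode each row from ALL rows of its category, including its own target."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, t in zip(categories, targets):
        sums[c] += t
        counts[c] += 1
    return [(sums[c] + a * prior) / (counts[c] + a) for c in categories]


categories = ["a", "b", "a", "c", "a", "b"]  # hypothetical categorical feature
targets = [1, 0, 1, 1, 0, 0]                 # hypothetical binary labels
prior = sum(targets) / len(targets)

ordered_enc = ordered_target_statistics(categories, targets, prior)
greedy_enc = greedy_target_statistics(categories, targets, prior)

for row in zip(categories, targets, ordered_enc, greedy_enc):
    print(row)
```

Category "c" appears only once, so its ordered encoding falls back to the prior, while the greedy encoding is pulled toward that row's own label. That pull is the leakage the post refers to.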

Mastering CatBoost - Pro Edition

🔥 Pro Edition: Mastering CatBoost: The Hidden Gem of Tabular AI (Early Access). The elite version of the book, trusted by data science leaders in 100 countries. Unlock the premium toolkit behind...

valeman.gumroad.com

2,714

Deeksha @DkssOfficial

27 Oct 2025

Day 88: Advanced Boosting. Hands-on with Gradient Boosting: trained both Classifier (GBC) and Regressor (GBR) models, then took a deep dive into the intuition behind XGBoost for classification to understand the core logic that makes it so powerful. #ML #GradientBoosting #XGBoost

Ayush

@TensorThrottleX

27 Oct 2025

Day 162: Data Science Journey -> GB: a uniform probability plane tags every point as class 1; residuals -> errors -> 3D scatter + contourf, steering stumps to carve an adaptive boundary and minimize log loss. -> Fix: F(x) = prev + γ*tree (shrink γ<1), tunes 2x accuracy, kills overfit! #DataScience #ML #GradientBoosting
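The shrunken update mentioned in this post, where each new stump is added to the previous score scaled by γ, can be sketched as below. This is a minimal illustration under stated assumptions, not the poster's code: one feature, least-squares stumps fitted to the residuals y - p, plain gradient steps rather than Newton steps, and hypothetical data and function names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(y, scores):
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for yi, s in zip(y, scores):
        p = min(max(sigmoid(s), eps), 1.0 - eps)
        total += -(yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    return total / len(y)


def fit_stump(x, residuals):
    """Least-squares stump on one feature: (threshold, left_value, right_value)."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1], best[2], best[3]


def boost(x, y, rounds=20, gamma=0.3):
    """F_m(x) = F_{m-1}(x) + gamma * stump_m(x), starting from the log-odds of mean(y)."""
    p0 = sum(y) / len(y)
    scores = [math.log(p0 / (1.0 - p0))] * len(y)
    history = [log_loss(y, scores)]
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        t, lv, rv = fit_stump(x, residuals)
        scores = [s + gamma * (lv if xi <= t else rv) for xi, s in zip(x, scores)]
        history.append(log_loss(y, scores))
    return scores, history


x = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical single feature
y = [0, 0, 1, 0, 1, 1, 1, 1]  # hypothetical binary labels

scores, history = boost(x, y)
print("log loss before boosting:", history[0])
print("log loss after boosting: ", history[-1])
```

A smaller `gamma` takes more cautious steps per round, which is the shrinkage the post credits with curbing overfitting. The trade-off is that more rounds are needed to reach the same training loss.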

494

Douss @AdamsSalahou

26 Oct 2025

Official research paper out for one of our xAI startups. Can’t wait to share it with you guys 👀 #randomforrest #gradientboosting

160

Ayush

@TensorThrottleX

26 Oct 2025

Day 161: Data Science Journey -> Flat prediction plane at mean(y)=0.56 via Plotly 3D; all points classed as 1. -> Scatter3D for real points + Surface for the constant prediction. -> Residuals r = y - p: vertical errors guide weak learners; start simple, correct via residuals. #DataScience #ML #GradientBoosting
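The starting point described in this post can be worked through numerically. The labels below are an assumption, chosen only so that mean(y) comes out at the 0.56 quoted above; the poster's dataset is not shown.

```python
import math

y = [1] * 14 + [0] * 11  # hypothetical labels with mean(y) = 14 / 25 = 0.56

p0 = sum(y) / len(y)                # constant initial probability for every point
log_odds = math.log(p0 / (1 - p0))  # the same prediction on the log-odds scale
predicted_class = [1 if p0 >= 0.5 else 0 for _ in y]  # 0.56 >= 0.5, so all class 1
residuals = [yi - p0 for yi in y]   # r = y - p, what the first weak learner fits

print("initial probability:", p0)
print("initial log-odds:", log_odds)
print("residual for a positive point:", residuals[0])
print("residual for a negative point:", residuals[-1])
```

Every positive point starts with a residual of about 0.44 and every negative point with about -0.56. The residuals sum to zero, so the first weak learner only has to separate the two groups.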

ALT calculation of residual and log of odds

ALT gradient boosting with uniform prediction on a probability of class 1 for all the data points.

ALT When all that’s left of me is a heartbeat and defiance, I still drag myself forward.

463

HARSH KUMAR SHARMA @harshhsharmaa57

16 Oct 2025

🚀 Day 52 – 100 Days of Machine Learning Journey Today’s topic: How to Tune Hyperparameters in Gradient Boosting ⚙️ 📘 Learn with @geeksforgeeks Nation SkillUp: 👉Course: geeksforgeeks.org/batch/ds-1… #100DaysOfML #MachineLearning #GradientBoosting #nationskillup #skillupwithgfg

HARSH KUMAR SHARMA @harshhsharmaa57

15 Oct 2025

📘 Day 51 – 100 Days of ML Today’s concept: Boosting in Machine Learning ⚡ 📚 Learn with @geeksforgeeks Nation SkillUp: 👉Course: geeksforgeeks.org/batch/ds-1… #100DaysOfML #MachineLearning #AdaBoost #GradientBoosting #nationskillup #skillupwithgfg

Kadir Türok Özdamar -CPA/ Chartist Bilgi Tek. A.Ş

@kadirturokozdmr

25 Aug 2025

🤖 XU100.IS AI FORECAST BULLETIN (v2.5)
📊 Symbol: XU100.IS
💰 Current price: 11487.59
🎯 MAIN FORECAST: 📈 UP
✨ Probability (weighted): 56.6%
🔥 Confidence level: [garbled].2%
🤖 Active models: 18
📈 Target: 11697.73 (+1.83%)
🛑 Stop: 11272.97 (-1.87%)
⚖️ Risk/reward: 1:0.98

🧠 SMART ANALYSIS:
• Consensus: 🔥🔥 UP (score: 79/100)
• Model agreement: 72%
• Recommendation: ✅ STRONG SIGNAL

🔎 IN-DEPTH ANALYSIS:
• Signal consistency: ✅ High (the strong model families point in the same direction)
• Market volatility: ⚡️ Medium (ATR: 1.12%)

💡 STRATEGIC COMMENTARY: The AI rates the potential for the current trend to continue, and the buying appetite, as positive.

Factors supporting the forecast:
• Price remains above the short-term average (SMA20: 10948).
• MACD momentum is holding its strength in positive territory.

Risks to watch:
• The nearest resistance level, the 11520 zone, should be watched for profit-taking.
• The RSI sitting in the overbought zone raises the risk of a pullback.

🔬 Model details:
Gradient Boosting:
• CatBoost: 📈 70.9% 🔥
• GradientBoosting: 📉 46.2%
• LightGBM: 📈 63.6%
• XGBoost: 📈 [garbled].2% 🔥
Other models:
• AdaBoost: 📉 40.8%
• DecisionTree: 📈 0.0 🔥
• Ensemble_Soft: 📉 38.8%
• ExtraTrees: 📈 68.0%
• KNN: 📈 58.9%
• LogisticRegression: 📉 37.6%
• NaiveBayes: 📈 [garbled].8% 🔥
• NeuralNetwork_Large: 📉 [garbled].9% 🔥
• NeuralNetwork_Small: 📉 28.6% 🔥
• QDA: 📈 63.1%
• RandomForest: 📈 59.3%
• Ridge: 📉 43.4%
• SVM_Linear: 📈 54.1%
• SVM_RBF: 📈 55.7%

⏰ Analysis time: 25/08/2025 15:10

143

17,713

Valeriy M., PhD, MBA, CQF

@predict_addict

23 Aug 2025

Whether you’re aiming to win Kaggle competitions, deploy robust models in production, or simply level up your ML toolkit, Mastering CatBoost will get you there. #MachineLearning #DataScience #CatBoost #ML #AI #GradientBoosting #GBDT #Kaggle #Python #MLOps #TabularData #BookLaunch

567

Kadir Türok Özdamar -CPA/ Chartist Bilgi Tek. A.Ş

@kadirturokozdmr

22 Aug 2025

🤖 BIST100 AI FORECAST SYSTEM (Advanced)
📊 Symbol: BIST100 (XU100.IS)
💰 Current price: 11372.33
🎯 MAIN FORECAST: 📈 UP
✨ Probability (weighted): 61.2%
🔥 Confidence level: 22.4%
🤖 Active models: 18
📈 Target: 11606.71 (+2.06%)
🛑 Stop: 11170.40 (-1.78%)
⚖️ Risk/reward: 1:1.16

🧠 SMART ANALYSIS:
• Consensus: 🔥🔥🔥 UP (score: 83/100)
• Model agreement: 78%
• Recommendation: ✅ STRONG SIGNAL

🔬 Model details:
Gradient Boosting:
• CatBoost: 📉 43.9%
• GradientBoosting: 📈 53.7%
• LightGBM: 📈 [garbled].7% 🔥
• XGBoost: 📈 [garbled].8% 🔥
Other models:
• AdaBoost: 📉 49.5%
• DecisionTree: 📈 0.0 🔥
• Ensemble_Soft: 📉 40.9%
• ExtraTrees: 📈 72.1% 🔥
• KNN: 📈 61.1%
• LogisticRegression: 📉 42.7%
• NaiveBayes: 📈 [garbled].2% 🔥
• NeuralNetwork_Large: 📉 26.7% 🔥
• NeuralNetwork_Small: 📉 42.3%
• QDA: 📈 [garbled].0% 🔥
• RandomForest: 📈 61.2%
• Ridge: 📉 46.4%
• SVM_Linear: 📈 54.4%
• SVM_RBF: 📈 56.7%

⏰ Analysis time: 23/08/2025 00:02

211

16,977