Day 8 of My AI & Robotics Challenge
The model from the day before wasn't generalizing well: accuracy was stuck at 0.59 (underfitting). The reasons were simple:
1. The dataset was quite small, around 337 patients.
2. The features were mainly 0 or 1, so the age column was dragging the model into unstable terrain. In fact, this is why I need number 4.
3. Features were not engineered. Engineering them brought the column count to 21, using polynomial terms (w1*X0**2) and pairwise interactions (w1*X0*X1).
4. Gradient descent needed regularization so the weights (w) stay small, because they tend to grow large. The bias (b) is not penalized: it doesn't multiply any feature, so shrinking it would unfairly skew generalization.
I was careful when engineering the features so we don't overfit.
For the pairwise interactions, I had to pick pairs that are meaningful.
feature_names = [
"age", "sex", "fever", "cold", "rigor", "fatigue",
"headache", "bitter_tongue", "vomitting", "diarrhea",
"convulsion", "anemia", "jaundice", "cocacola_urine",
"hypoglycemia", "prostration"
]
As a refresher: I built a malaria severity classifier from scratch in pure Python/NumPy.
Here's what I learned fixing a 59% accuracy model 🧵
The model had 21 features (the original 16 plus an extra 5) but only 337 patients. Without regularization, it memorized the training data instead of learning patterns.
Fix: L2 regularization adds a penalty for large weights, forcing the model to stay simple and generalize.
Fixed with these two lines:
• Cost: (λ/2m) · Σw²
• Gradient: (λ/m) · w
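Here's a minimal sketch of how those two terms could slot into a logistic regression cost and gradient. The names `sigmoid`, `cost_and_gradients`, and `lam` are my own placeholders for illustration, not the exact code from the project, and the bias is left out of the penalty as described above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_gradients(X, y, w, b, lam):
    """Logistic cost and gradients with an L2 penalty on w only (hypothetical helper)."""
    m = X.shape[0]
    p = sigmoid(X @ w + b)                      # predicted probabilities, shape (m,)
    eps = 1e-12                                 # keeps log() away from zero
    cost = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    cost += (lam / (2 * m)) * np.sum(w ** 2)    # Cost: (λ/2m) · Σw²
    err = p - y
    dw = (X.T @ err) / m + (lam / m) * w        # Gradient: (λ/m) · w
    db = np.mean(err)                           # bias gets no penalty term
    return cost, dw, db
```

With `lam = 0` this reduces to plain logistic regression, so the two extra terms are the whole change.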
Feature engineering unlocked nonlinear patterns a linear model normally wouldn't see.
Added:
• age², fever² polynomial terms
• fever×rigor, fever×fatigue, anemia×jaundice interaction terms
Logistic regression is linear, but in a higher-dimensional feature space it can approximate curves.
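A rough sketch of how the five extra columns could be built from the 16 originals. The helper name `add_engineered_features` and the output column names are assumptions for illustration; the squared and paired features are the ones listed above.

```python
import numpy as np

FEATURE_NAMES = [
    "age", "sex", "fever", "cold", "rigor", "fatigue",
    "headache", "bitter_tongue", "vomitting", "diarrhea",
    "convulsion", "anemia", "jaundice", "cocacola_urine",
    "hypoglycemia", "prostration",
]

def add_engineered_features(X, names):
    """Append polynomial and interaction columns to X (hypothetical helper)."""
    idx = {name: i for i, name in enumerate(names)}
    squares = ["age", "fever"]
    pairs = [("fever", "rigor"), ("fever", "fatigue"), ("anemia", "jaundice")]

    new_cols = [X[:, idx[n]] ** 2 for n in squares]                    # age², fever²
    new_cols += [X[:, idx[a]] * X[:, idx[b]] for a, b in pairs]        # interactions
    new_names = [f"{n}^2" for n in squares] + [f"{a}*{b}" for a, b in pairs]

    return np.column_stack([X] + new_cols), names + new_names
```

Passing in a matrix with 16 columns gives back one with 21, which matches the feature count in this post.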
Unscaled features were silently killing accuracy.
age² could be 2500. fever is 0 or 1. Gradient descent spends all its time fighting that scale mismatch.
Fix: z-score normalization. Subtract the mean, divide by the std. Most values then land roughly between -3 and 3.
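A minimal z-score sketch, assuming the features sit in a NumPy array with one row per patient. The function names `zscore_fit` and `zscore_apply` are placeholders of mine; the guard for constant columns is an extra safety step, not something from the original code.

```python
import numpy as np

def zscore_fit(X):
    """Compute per-column mean and std from the training data."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # avoid dividing by zero on constant columns
    return mu, sigma

def zscore_apply(X, mu, sigma):
    """Scale any data (train or test) with the stored training statistics."""
    return (X - mu) / sigma
```

Fitting on the training split and reusing the same `mu` and `sigma` on new patients keeps both on the same scale.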
Swapped Python loops for NumPy vectorization.
Previously: nested for loops, one multiplication at a time.
Later on: X @ w + b (np.dot). One line, runs in C, operates on all patients simultaneously, and you could see that it was really fast.
Same math. 10x faster.
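To show the "same math" part, here is a small side-by-side sketch. `predict_loop` and `predict_vectorized` are illustrative names, not the project's functions; both compute z = X·w + b for every patient.

```python
import numpy as np

def predict_loop(X, w, b):
    """Nested Python loops: one multiplication at a time."""
    m, n = X.shape
    z = np.zeros(m)
    for i in range(m):          # each patient
        for j in range(n):      # each feature
            z[i] += X[i, j] * w[j]
        z[i] += b
    return z

def predict_vectorized(X, w, b):
    """Same result in one line; the loop happens inside NumPy's compiled code."""
    return X @ w + b
```

The actual speedup depends on the machine and array sizes, so time it on your own data (for example with `timeit`) before quoting a number.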
Video1: shows 100K iterations and the slow Gradient Descent from the 90K mark
Video2: shows the same run, but much faster from iteration 1.
Image3: shows the new accuracy 69.14%
NumPy's precompiled C code is the goat.
#MachineLearning #Python #AIChallenge #BuildInPublic #ICRA