Joined April 2024
🚀 Week 6 was all about Exploratory Data Analysis (EDA) & Visualization! I cleaned, analyzed, and visualized an uncleaned dataset of Data Science jobs using Pandas, Matplotlib & Seaborn. Let's break it down! 🧵 @TDataImmersed #TDI @DabereNnamani
✅ Cleaned messy data ✅ Uncovered job trends ✅ Created powerful visuals. EDA & Visualization are 🔑 for Data Science! Want to see everything? Check out my notebook: 🌐 anaconda.cloud/share/noteboo… Which visualization do you use most? Let's discuss! 🚀🐍

🔥 Seaborn for Advanced Plots. Heatmap: Correlation between key variables 🔥 Box Plot: Job title vs company ratings 🎭 Pair Plot: Relationships between salary, rating & founding year. Aesthetics + Insights = 💡
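A rough sketch of the heatmap idea, not the notebook's actual code. The column names (AvgSalaryK, Rating, Founded) and every value are made up for illustration, and the headless "Agg" backend is assumed so the script runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical numeric columns standing in for the jobs dataset.
df = pd.DataFrame({
    "AvgSalaryK": [120.0, 150.0, 80.0, 130.0, 95.0],
    "Rating": [3.5, 4.5, 4.0, 3.0, 3.8],
    "Founded": [1998, 2010, 1985, 2015, 2003],
})

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()

# Draw the correlation matrix as an annotated heatmap on a fixed -1..1 scale.
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation heatmap (toy data)")
```

Pinning vmin and vmax to -1 and 1 keeps the colour scale comparable across datasets; sns.boxplot and sns.pairplot accept the same DataFrame for the other two plot types.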
📉 Matplotlib for EDA. Histogram: Salary distribution 💰 Bar Chart: Top locations for Data Science jobs 🗺️ Line Plot: Salary trends by company size 🏢 Visualizing data brings numbers to life! 🔥
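A minimal sketch of the histogram and bar chart, assuming made-up salary and location values (AvgSalaryK is a hypothetical column name) and a headless backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the cleaned jobs data.
df = pd.DataFrame({
    "AvgSalaryK": [120.0, 150.0, 80.0, 130.0, 95.0],
    "Location": ["New York, NY", "San Francisco, CA", "New York, NY",
                 "Chicago, IL", "New York, NY"],
})

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: how salaries are distributed across 5 bins.
ax_hist.hist(df["AvgSalaryK"], bins=5)
ax_hist.set_xlabel("Average salary (K USD)")
ax_hist.set_ylabel("Number of postings")

# Bar chart: number of postings per location, most common first.
location_counts = df["Location"].value_counts()
ax_bar.bar(list(location_counts.index), location_counts.values)
ax_bar.set_ylabel("Number of postings")
ax_bar.tick_params(axis="x", rotation=30)

fig.tight_layout()
```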
📊 EDA = Knowing Your Data. Computed summary stats for Rating, Salary, and Revenue; identified top job titles & their average ratings; analyzed salary trends by company size. EDA helps spot patterns & anomalies fast! 🚀
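These EDA steps might look roughly like the sketch below. The column names and values are assumptions made for illustration, not taken from the notebook:

```python
import pandas as pd

# Hypothetical stand-in for the cleaned jobs data.
df = pd.DataFrame({
    "Job Title": ["Data Scientist", "Data Scientist", "Data Analyst", "Data Engineer"],
    "Rating": [3.5, 4.5, 4.0, 3.0],
    "Size": ["51 to 200", "1001 to 5000", "51 to 200", "1001 to 5000"],
    "AvgSalaryK": [120.0, 150.0, 80.0, 130.0],
})

# Summary statistics (count, mean, std, quartiles) for the numeric columns.
summary = df[["Rating", "AvgSalaryK"]].describe()

# Most frequent job titles and the average rating per title.
top_titles = df["Job Title"].value_counts().head(3)
rating_by_title = df.groupby("Job Title")["Rating"].mean()

# Average salary per company-size bucket.
salary_by_size = df.groupby("Size")["AvgSalaryK"].mean()

print(summary)
print(salary_by_size)
```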
🧼 Data Cleaning is the foundation of good analysis! Handled missing values 🕵️ Extracted & cleaned Salary Estimate 💰 Standardized Company Names & Locations 📍 Data cleaning = better insights! ✅
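One way the Salary Estimate step could work, as a sketch only. It assumes the raw strings look like "$137K-$171K (Glassdoor est.)", which is a guess at the format, and the new column names are hypothetical:

```python
import pandas as pd

# Assumed raw format; the real column may differ.
df = pd.DataFrame({
    "Salary Estimate": [
        "$137K-$171K (Glassdoor est.)",
        "$75K-$131K (Glassdoor est.)",
    ],
})

# Pull the lower and upper bounds (in thousands) out of the text.
bounds = df["Salary Estimate"].str.extract(r"\$(\d+)K-\$(\d+)K").astype(int)

df["MinSalaryK"] = bounds[0]
df["MaxSalaryK"] = bounds[1]
df["AvgSalaryK"] = (df["MinSalaryK"] + df["MaxSalaryK"]) / 2

print(df[["MinSalaryK", "MaxSalaryK", "AvgSalaryK"]])
```

Rows that do not match the pattern would come back as NaN from str.extract, so on real data the astype(int) call would need a missing-value check first.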
🚀 Week 5 was all about Data Cleaning & Transformation with Pandas! From handling missing values to merging DataFrames, this was a deep dive into real-world data prep. Let's break it down! 🧵👇
Wrap-Up & Full Notebook: ✅ Data cleaned ✅ New features created ✅ Data merged ✅ Insights uncovered. This was real-world data prep at its finest! Check out my full notebook here: 🌐 https://anaconda.cloud/share/notebooks/bab3f1ea-092c-4be5-ac0d-4b16fad8224e/overview
String Cleaning & Deck Extraction 🔑 Text manipulation in Pandas: I extracted the deck letter from the Cabin column to analyze survival rates by deck. 📷 Question ➡️ 📷 My Solution. Text data isn't always clean, but Pandas makes it easy!
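A small sketch of deck extraction on toy data, assuming the deck is the first character of Cabin and that missing cabins get an "Unknown" label (the label is an assumption, not the notebook's choice):

```python
import pandas as pd

# Toy Titanic-style rows; values are made up.
df = pd.DataFrame({
    "Cabin": ["C85", None, "E46", "C123", "B28"],
    "Survived": [1, 0, 1, 0, 1],
})

# The deck is the leading letter of the cabin code; missing cabins stay missing
# after .str[0], so they are labelled explicitly.
df["Deck"] = df["Cabin"].str[0].fillna("Unknown")

# Survival rate per deck (mean of a 0/1 column).
survival_by_deck = df.groupby("Deck")["Survived"].mean()
print(survival_by_deck)
```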
🔄 Merge vs. Concatenate? merge() = joins datasets on a key (like PassengerId); concat() = stacks datasets (vertically or horizontally). 📷 Question ➡️ 📷 My Solution. These techniques help when dealing with multiple data sources!
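To make the difference concrete, here is a minimal sketch with made-up frames (the table names and values are hypothetical):

```python
import pandas as pd

passengers = pd.DataFrame({"PassengerId": [1, 2, 3], "Name": ["A", "B", "C"]})
tickets = pd.DataFrame({"PassengerId": [1, 2, 4], "Fare": [7.25, 71.28, 8.05]})

# merge(): match rows on a shared key; an inner join keeps only ids in both.
merged = pd.merge(passengers, tickets, on="PassengerId", how="inner")

# concat() vertically: append rows that share the same columns.
more_passengers = pd.DataFrame({"PassengerId": [5], "Name": ["E"]})
stacked = pd.concat([passengers, more_passengers], ignore_index=True)

# concat() horizontally: place columns side by side, aligned on the index.
side_by_side = pd.concat([passengers, tickets["Fare"]], axis=1)

print(merged)
```

Note that horizontal concat aligns on the row index, not on PassengerId, so side_by_side pairs passenger 3 with the fare of ticket id 4. When rows must be matched by key, merge() is the safer choice.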
Creating New Features 🛠️ Feature Engineering. I added: ✅ FamilySize = SibSp + Parch + 1 ✅ FarePerPerson = Fare ÷ FamilySize. 📷 Question ➡️ 📷 My Solution. Why? These features give new insights into passengers' social & economic backgrounds!
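The two features can be sketched on toy rows like this (values are made up; SibSp and Parch are the standard Titanic column names):

```python
import pandas as pd

df = pd.DataFrame({
    "SibSp": [1, 0, 3],   # siblings/spouses aboard
    "Parch": [0, 0, 2],   # parents/children aboard
    "Fare": [14.5, 8.05, 31.2],
})

# The +1 counts the passenger themselves, so FamilySize is never zero.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Spread the ticket fare evenly across the family group.
df["FarePerPerson"] = df["Fare"] / df["FamilySize"]

print(df)
```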
💰 Outliers distort averages! I detected extreme fare prices using the IQR method and capped them instead of removing them. 📷 Question ➡️ 📷 My Solution. Capping keeps every row while limiting extreme values! 🛳️
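A sketch of IQR-based capping on made-up fares. The 1.5 × IQR fence is the conventional choice and is an assumption here; the notebook may use a different multiplier:

```python
import pandas as pd

# Toy fares with one extreme value.
fares = pd.Series([7.25, 8.05, 13.0, 26.0, 30.0, 512.33], name="Fare")

# Interquartile range and the usual 1.5 * IQR fences.
q1 = fares.quantile(0.25)
q3 = fares.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# clip() pulls values outside the fences back to the fence; no rows are dropped.
capped = fares.clip(lower=lower, upper=upper)

print(f"upper fence: {upper:.2f}")
print(capped)
```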
👀 Data transformation step! Instead of 1, 2, 3, I converted Pclass into "1st Class", "2nd Class", "3rd Class" for better readability. 📷 Question ➡️ 📷 My Solution. Why? Clear labels improve data storytelling! 📊
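One possible way to do this relabeling is a dictionary passed to map(). The sketch writes to a new, hypothetical PclassLabel column so the numeric codes stay available:

```python
import pandas as pd

df = pd.DataFrame({"Pclass": [3, 1, 2, 3]})

# Lookup table from numeric class code to a readable label.
class_labels = {1: "1st Class", 2: "2nd Class", 3: "3rd Class"}

# map() replaces each code with its label; unmapped codes would become NaN.
df["PclassLabel"] = df["Pclass"].map(class_labels)

print(df)
```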
🔍 Duplicate records skew analysis! Using drop_duplicates(), I checked for and removed any duplicates in the Titanic data. 📷 Question ➡️ 📷 My Solution. Have you ever encountered duplicate headaches? 🤯
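A minimal sketch of the check-then-remove pattern on toy rows. Pairing duplicated() with drop_duplicates() is one common approach and not necessarily what the notebook does:

```python
import pandas as pd

# Toy data with one fully repeated row.
df = pd.DataFrame({
    "PassengerId": [1, 2, 2, 3],
    "Name": ["Braund", "Cumings", "Cumings", "Heikkinen"],
})

# duplicated() flags every repeat after the first occurrence.
n_duplicates = df.duplicated().sum()

# drop_duplicates() keeps the first occurrence of each row by default.
deduped = df.drop_duplicates().reset_index(drop=True)

print(f"duplicates found: {n_duplicates}")
```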
You may not know what to do with missing values... 🤔 Drop or Fill? dropna(): removes missing data (good if there's little missing); fillna(): replaces missing values (mean, median, etc.). I used the median for Age because it is less sensitive to outliers than the mean! 📷
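Both options can be sketched on made-up rows. The choice of which column to drop on and which to fill is an assumption for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 80.0, np.nan],
    "Embarked": ["S", "C", None, "S", "S"],
})

# Option 1: drop rows where a column with few gaps is missing.
dropped = df.dropna(subset=["Embarked"])

# Option 2: fill gaps with the median; the single extreme age (80)
# shifts the mean a lot but leaves the median untouched.
age_median = df["Age"].median()
filled = df.assign(Age=df["Age"].fillna(age_median))

print(f"median={age_median}, mean={df['Age'].mean():.2f}")
```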
Finding Missing Data 🔍 Identifying missing values in the Titanic dataset using Pandas: 📷 Question ➡️ 📷 My Solution. Missing values can break analysis, so step 1 is always detection!
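Detection usually comes down to isnull() plus an aggregate. A minimal sketch on toy rows (values are made up):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic data.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": ["C85", None, None, "E46"],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

# Missing values per column, as a count and as a share of all rows.
missing_counts = df.isnull().sum()
missing_share = df.isnull().mean()

print(missing_counts)
print(missing_share)
```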
🧼 Why is data cleaning important? Missing values can bias analysis 📉 Duplicates distort insights 🔄 Outliers skew statistics 📊 A clean dataset = better decisions! ✅
🌟 Week 3 of my Python journey was all about diving into File Handling, CSVs, and NumPy! 🚀 From reading Titanic data to exploring arrays with NumPy, this week was packed with exciting tasks. Let's break it down: 🧵 @DabereNnamani @TDataImmersed @JacobAjala #TDI
That wraps up my Week 3 highlights! 🐍 Want to explore the complete code and dive into more details? Check it out here: 🌐 anaconda.cloud/share/noteboo… What was your favorite part? Let's discuss! ✨
📊 NumPy Adventures: NumPy made math magical! I built and manipulated 1D/2D arrays, found fare stats (min, max, mean) for Titanic data, and explored indexing and random arrays 🎲✨ 📷 Questions ➡️ 📷 My Solutions. How do YOU use NumPy? Let me know! 🐍
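A compact sketch of those NumPy tasks, using made-up fares and a seeded generator (the seed and the value ranges are assumptions for reproducibility):

```python
import numpy as np

# 1D array of toy fares and its basic statistics.
fares = np.array([7.25, 71.28, 7.92, 53.10, 8.05])
fare_min, fare_max, fare_mean = fares.min(), fares.max(), fares.mean()

# 2D array: reshape 0..5 into 2 rows x 3 columns, then index into it.
grid = np.arange(6).reshape(2, 3)
first_row = grid[0]        # every column of row 0
last_column = grid[:, -1]  # last element of each row

# Random integers from 1 to 6 inclusive (the upper bound 7 is exclusive).
rng = np.random.default_rng(seed=0)
random_ints = rng.integers(low=1, high=7, size=5)

print(fare_min, fare_max, round(fare_mean, 2))
```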