RandomForest is a cool algo, but be careful with the default tree-depth hyperparameter; it wants to overfit your data.
Tree depth is how "deep" the decision tree goes, i.e. how many decisions are made before coming to an answer. If you use RF, but don't know about max tree depth...
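A rough sketch of the depth issue, assuming scikit-learn (the post does not name a library). In scikit-learn, `RandomForestClassifier` defaults to `max_depth=None`, so each tree keeps splitting until its leaves are pure or too small to split. The synthetic dataset, the noise level, and the cap of 4 are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, deliberately noisy data (flip_y randomly flips 20% of labels).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def fit_and_report(max_depth):
    """Fit a forest with the given depth cap and summarize how it behaves."""
    forest = RandomForestClassifier(
        n_estimators=50, max_depth=max_depth, random_state=0
    )
    forest.fit(X_train, y_train)
    return {
        "deepest_tree": max(tree.get_depth() for tree in forest.estimators_),
        "train_acc": forest.score(X_train, y_train),
        "test_acc": forest.score(X_test, y_test),
    }


unlimited = fit_and_report(None)  # scikit-learn default: no depth limit
capped = fit_and_report(4)        # hypothetical cap, chosen for illustration

print("max_depth=None:", unlimited)
print("max_depth=4:   ", capped)
```

A large gap between train and test accuracy in the uncapped run is the usual sign of overfitting; in practice the cap would be tuned with cross-validation rather than picked by hand.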
Executing queries on spatial data can be difficult due to the complex polygonal shape of geographical boundaries. That's why geospatial systems resort to a strategy called "Filter-Refine" to evaluate geospatial predicates efficiently. Details in 🧵
#geospatial #DataAnalytics #SQL
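A minimal sketch of the filter-refine idea for a point-in-polygon predicate, using plain Python. The filter step is a cheap bounding-box check; the refine step runs an exact ray-casting test only on the surviving candidates. The function names and the L-shaped polygon are made up for illustration, and real systems typically use a spatial index for the filter step rather than a linear scan.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def bounding_box(polygon: Polygon) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of the polygon's vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(p: Point, bbox: Tuple[float, float, float, float]) -> bool:
    """Cheap test: is the point inside the axis-aligned bounding box?"""
    x, y = p
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def point_in_polygon(p: Point, polygon: Polygon) -> bool:
    """Exact test via ray casting: count edge crossings to the right of p."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_refine(points: List[Point], polygon: Polygon) -> List[Point]:
    bbox = bounding_box(polygon)
    candidates = [p for p in points if in_bbox(p, bbox)]          # filter
    return [p for p in candidates if point_in_polygon(p, polygon)]  # refine


# Hypothetical L-shaped boundary: its bounding box covers a notch
# that is not actually part of the polygon.
L_SHAPE = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
POINTS = [(1, 1), (3, 3), (5, 5), (1, 3), (3, 1)]

matches = filter_refine(POINTS, L_SHAPE)
print(matches)
```

Here `(5, 5)` is rejected by the cheap filter alone, while `(3, 3)` passes the filter but is rejected by the exact test, which is the case the refine step exists for.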
"With hard work and perseverance, anything is possible despite your initial circumstances. After 100 rejection letters, I was recruited by Snowflake in 2018 and have been enjoying every minute since!” — Marion B. Consulting Manager at Snowflake #BlackHistoryMonth
Finally learned some basic raster analysis, and took the USGS topographic elevation and National Land Cover datasets to create a least-cost-distance path from Harrisburg to Pittsburgh. (Corridors in background). Results are interesting for high-speed rail implications!
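For anyone curious what a least-cost path over a raster looks like in code, here is a toy sketch using Dijkstra's algorithm on a small grid. This is not the workflow from the post: the cost values are invented, movement is limited to four neighbors, and a real analysis would derive the cost surface from the elevation and land cover rasters.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Dijkstra over a 2D cost grid; entering a cell costs that cell's value."""
    rows, cols = len(cost), len(cost[0])
    best = {start: cost[start[0]][start[1]]}
    prev = {}
    heap = [(best[start], start)]
    while heap:
        dist, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if dist > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return best[goal], path


# Hypothetical cost surface: 9 marks expensive terrain (e.g. steep slopes).
COST = [
    [1, 9, 1, 1],
    [1, 9, 1, 9],
    [1, 1, 1, 1],
]

total, path = least_cost_path(COST, start=(0, 0), goal=(0, 3))
print(total, path)
```

The path detours around the expensive cells instead of crossing them, which is the same trade-off a corridor analysis makes at a much larger scale.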
In honor of the NFC championship game this weekend, I give you a comparative population density map of San Francisco and Philadelphia. #NFL #rstats #rayshader code: bit.ly/3OKb00G
Image description: Comparative population density of Philadelphia and San Francisco, the teams playing in the NFC championship game this weekend. Rendered in R using the Rayshader package.
Python and SQL are powerful on their own, but have higher value when they work together. Join us at @SnowflakeDB Snowpark Day to learn how data teams can use dbt Cloud to generate analytics and ML-ready pipelines with SQL and Python. Register here 👉 hubs.ly/Q01wVm4l0
Are you a data scientist, ML engineer, or data analyst scratching the surface in geospatial analytics? Here is a list of open source tools that can help you. Have more OSS tools? Mention them 👇🏽
#Geospatial #DataScience #DataAnalytics #SQL #Python #GIS #gischat
Our Newest! Longitudinal NLP of property listings & HMDA mortgage lending by race/income. (1)
Smart growth as a luxury amenity? Exploring the relationship between t... sciencedirect.com/science/ar…
Amazon Redshift? Snowflake? Google BigQuery? Databricks? Azure Synapse? So many options! 🤯
The newest edition of our Cloud Data Warehouse Benchmark is here! Check out the report to compare the price, performance, and more for these 5️⃣ popular warehouses: 5tran.co/3FIvSTE
Quiz: You have a 100 TB geospatial dataset made of continuous measures recorded at 30-minute intervals across 2M locations. How do you sort the rows of your 50 MB partitions storing data for 0.25° geodesic squares?
Unsolicited kudos to the folks at @obsdmd (obsidian.md/). This app is really great for semi-structured note-taking with simple markdown and hotkeys. It is the "mind mapping" app I had been looking for for a while but was unable to find 👏
Give it a try!