i think most people equate rag with chunking files, since that's what it was when we had 4k context limits
that "rag" is dead
but "rag" as in data pipelines plus searching and reasoning through files is here for a while / forever
When #CICD (Continuous Integration / Continuous Deployment) is applied to #datacentric projects, there can be a #datagovernance problem. If the #dataengineers run the #datapiplines then they have become #dataoperations. They will gradually spend more and more time running rather than developing. Running regularly scheduled batch jobs in production, distributing outputs, troubleshooting, liaising with users, etc. is a specialization that should be left to those who do it well.
You asked, we delivered! In this blog post, you'll learn all about #DataPiplines, their types, key components, best practices, and how #Airflow can make them better!
hubs.ly/H0T_b0h0
You asked, we delivered! In this blog post, you'll learn all about #DataPiplines, their types, key components, best practices, and how #ApacheAirflow can make them better!
hubs.ly/H0T_b0h0