Join @dremio’s Tech advocacy & Eng team for the very first installment of the @ApacheIceberg Office Hours 📆 🚀
We will kick off with a brief presentation on Copy-on-Write vs. Merge-on-Read strategies, followed by Q&A on anything Iceberg-related.
When: December 7th, 12 PM
Query planning in @ApacheIceberg
Being able to plan queries efficiently is critical for faster execution of the queries analysts run 🧑🏻‍💻
This is especially critical when dealing with large-scale data, such as data in data lakes. Read @IcebergDevs 👇
#dataengineering
The @ApacheArrow project has grown along every axis 🚀
In fact, more & more tools/libraries in the #dataanalytics space have started using Arrow.
In this blog post, we go through the evolution of Apache Arrow from usage, capability & community angles.
dremio.com/blog/apache-arrow…
Manage data as code?
Just like Git but for Data?
That's right!
@projectnessie is an open source project that brings Git-like branching to the world of data & specifically to data lake table formats like #ApacheIceberg #dataengineering
We're thrilled to announce that we've been named to @CNBC’s ‘Top Startups for the Enterprise’ Inaugural List 🎉
Read more about our open data lakehouse and this inaugural list here:
bwnews.pr/3UehuIN #CNBC #TopStartup #Tech
Are you heading to AWS re:Invent later this month? Check out this link for all the details on how you can:
➡️ Schedule a meeting with us
➡️ Enter our Dremio Cloud data challenge (for a chance to win a PS5!)
➡️ RSVP to our cocktail reception
awsreinventdremio2022.splash… #AWSreInvent
If you find what you see interesting, here is a tutorial I wrote: a step-by-step guide to getting set up and working through an example exercise -> dremio.com/blog/managing-dat…
How do we migrate from one catalog to another for @ApacheIceberg tables?
If you are already using one catalog (say HDFS) & want to switch to another (say AWS Glue), how is that possible?
A 🧵 for @IcebergDevs
#dataengineering
With all the recent news about #ApacheIceberg we thought we'd share this video from last year's Subsurface Conference. We're looking for speakers for our event happening in spring 2023 🎤 submit your talk today!
sessionize.com/subsurface-li… #CallForSpeakers
Merge-On-Read (MOR) vs. Copy-On-Write (COW) in @ApacheIceberg.
Both approaches handle deletes & updates to data files in the data lake.
Let’s break down @IcebergDevs👇
#DataEngineering #data
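To make the COW vs. MOR contrast concrete, here is a minimal toy sketch in Python. It is not Iceberg's API or implementation: `ToyTable`, `delete_cow`, `delete_mor`, and `scan` are hypothetical names, data files are modeled as tuples of row ids, and delete files as sets of deleted row ids. It only illustrates the general idea that copy-on-write rewrites the affected data files at write time, while merge-on-read records deletes separately and applies them at read time.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for a table (hypothetical, not Iceberg's API)."""

    # Each data file is an immutable tuple of row ids.
    data_files: list[tuple[int, ...]] = field(default_factory=list)
    # Each delete file is a set of row ids marked as deleted.
    delete_files: list[frozenset[int]] = field(default_factory=list)

    def delete_cow(self, row_id: int) -> None:
        # Copy-on-write: rewrite every data file that contains the row.
        self.data_files = [
            tuple(r for r in f if r != row_id) if row_id in f else f
            for f in self.data_files
        ]

    def delete_mor(self, row_id: int) -> None:
        # Merge-on-read: leave data files untouched, record the delete.
        self.delete_files.append(frozenset({row_id}))

    def scan(self) -> list[int]:
        # Readers merge delete files with data files at query time.
        deleted = set().union(*self.delete_files)
        return [r for f in self.data_files for r in f if r not in deleted]


cow = ToyTable(data_files=[(1, 2, 3), (4, 5)])
cow.delete_cow(2)   # the first file is rewritten without row 2

mor = ToyTable(data_files=[(1, 2, 3), (4, 5)])
mor.delete_mor(2)   # data files unchanged, one delete file added

print(cow.scan() == mor.scan())
```

Both tables return the same rows on a scan; the difference in this sketch is where the work happens: at write time for COW, at read time for MOR.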
Always great to catch up with people who have depth in the data space and to share stories, from academic papers to how companies have been created. Thanks @juansequeda @TimGasper
A Data Catalog is like a parent to the data who makes sure that you grow, succeed & have fun while staying safe.
This is the result of our beer discussion with @TimGasper and @mcl5tech in Austin (after Big Data London)