New work on Generic Table APIs from Snowflake and Onehouse in Polaris uses XTable for non-Iceberg tables. Running XTable via REST API:
POST
/v1/conversion/table/
{"source-format": "HUDI | DELTA"
...}
Decoupling the catalog spec from formats future-proofs innovations in storage.
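To make the request shape above concrete, here is a minimal Python sketch that only builds the POST request and does not send it. The base URL is a hypothetical placeholder, and the body carries only the `source-format` field shown in the post; the remaining fields are elided in the original and are not filled in here.

```python
import json
from urllib.request import Request

# Source formats named in the post ("HUDI | DELTA").
ALLOWED_SOURCE_FORMATS = {"HUDI", "DELTA"}


def build_conversion_request(base_url: str, source_format: str) -> Request:
    """Build (but do not send) a POST to /v1/conversion/table/.

    base_url is an assumption: substitute the address of your own service.
    """
    if source_format not in ALLOWED_SOURCE_FORMATS:
        raise ValueError(f"unsupported source format: {source_format}")
    # Only the field visible in the post; the rest of the body is elided there.
    body = {"source-format": source_format}
    return Request(
        url=base_url.rstrip("/") + "/v1/conversion/table/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_conversion_request("http://localhost:8181", "HUDI")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it, once the body is completed according to the actual API spec.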
Unity Catalog repo went public this morning. I had a talk scheduled in the afternoon on @apachextable. I was able to integrate my live demo across #ApacheHudi #DeltaLake tables into #UnityCatalog in time for my talk. I used EMR Spark DuckDB for read/write.
#dataaisummit
Off to #DataAISummit by @databricks 🙌🏻
Looking forward to meeting folks in person & tuning into some of the amazing sessions!
I will be talking about Lakehouse table formats (Apache Hudi, Apache Iceberg & Delta Lake) & most importantly interoperability with @apachextable
Delta Lake Universal Format (UniForm) and @apachextable are actively collaborating to help organizations build architectures that are not constrained to any single ecosystem!
Learn how UniForm & XTable are unifying the open table formats. 🔗
delta.io/blog/unifying-open-… #opensource
I presented about @apachextable & how it brings interoperability to lakehouse table formats at Dremio's Subsurface.
Check out the recording of the talk: m.youtube.com/watch?v=Hm2Gi-…
Excited to see @Microsoft and @SnowflakeDB leveraging #ApacheXtable to build a stronger partnership together w/ interop between #ApacheIceberg and #DeltaLake 🎉
❄️ Quote: "Fabric will be able to store data in Iceberg format in OneLake via Apache XTable translation in OneLake."
.@Microsoft 🤝 Apache Iceberg 🤝 Snowflake
We’re expanding our partnership to make interoperability more efficient and cost-effective between Snowflake and Fabric OneLake. This is made possible with Apache Iceberg support in OneLake.
Fun evening @LinkedIn HQ demoing @apachextable & open table formats.
Also amazing stuff from @JayChia5 on getdaft.io & @w_moustafa on ViewShift (dynamic data masking for data lakes)
Super excited to be speaking at @LinkedIn Big Data Meetup happening on 3rd May at HQ, Sunnyvale!
I will be talking about @apachextable & interoperability in open table formats such as Apache Hudi, Iceberg & Delta Lake in collab with Microsoft folks!
Join us on 3rd May.
Here is a sneak peek of something that I have been working on.
@dremio reads @apachehudi tables.
This is another example of "interoperability" with open lakehouse table formats.
How to sync Apache XTable target tables with catalogs?
Once you translate an open table format using XTable, you might want to sync those tables with catalogs such as Hive Metastore, AWS Glue, Unity, etc.
🧵
OneTable is now Apache XTable (Incubating) 🎉
Table formats interoperability is a critical aspect for open lakehouse architectures.
If you are interested in solving some of the challenges within interoperability, join us!
GitHub: github.com/apache/incubator-…
3 steps to register a OneTable synced table to a Glue Catalog.
OneTable allows you to register the target table format to catalogs such as Hive Metastore, Glue, Big Lake & Unity.
Once you have these tables in the catalog, you can then plug in any query engine to read the data👇
Here are the 3 steps that show how to register a OneTable synced table to Glue:
✅ Create a Glue database from your terminal
✅ Define the two config files - config.YAML & catalog.YAML
✅ Run the OneTable sync process which will translate the metadata in the S3 bucket
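As a rough illustration of the second step, here is a minimal Python sketch that renders the two config files as text. The key names (`sourceFormat`, `targetFormats`, `datasets`, `catalogImpl`, `catalogOptions`, and so on) are recalled from the project's documentation and should be treated as assumptions, as should the HUDI-to-ICEBERG direction, the catalog name, and the example S3 paths; check them against the current docs before use.

```python
def render_onetable_configs(
    table_base_path: str,
    table_name: str,
    warehouse: str,
    source_format: str = "HUDI",      # assumed source format
    target_format: str = "ICEBERG",   # assumed target format
    catalog_name: str = "onetable",   # hypothetical catalog name
):
    """Return (config_yaml, catalog_yaml) as strings; nothing is written to disk."""
    # Dataset config: which table to translate, and between which formats.
    config_yaml = "\n".join([
        f"sourceFormat: {source_format}",
        "targetFormats:",
        f"  - {target_format}",
        "datasets:",
        f"  - tableBasePath: {table_base_path}",
        f"    tableName: {table_name}",
    ]) + "\n"
    # Catalog config: register the target table through Iceberg's Glue catalog.
    catalog_yaml = "\n".join([
        "catalogImpl: org.apache.iceberg.aws.glue.GlueCatalog",
        f"catalogName: {catalog_name}",
        "catalogOptions:",
        f"  warehouse: {warehouse}",
        "  io-impl: org.apache.iceberg.aws.s3.S3FileIO",
    ]) + "\n"
    return config_yaml, catalog_yaml


# Hypothetical bucket and table names, for illustration only.
config_text, catalog_text = render_onetable_configs(
    "s3://my-bucket/my_table", "my_table", "s3://my-bucket"
)
print(config_text)
print(catalog_text)
```

Writing the two strings to `config.yaml` and `catalog.yaml` gives the files that the sync step in the third bullet would consume.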
OneTable Sync modes.
OneTable provides users with the ability to translate metadata from one table format to another.
It provides 2 ways to translate metadata from source to target table formats - “Full Sync” and “Incremental Sync”
A 🧵
✅ Full Sync: This mode translates all the commits from the source table format to the target format
✅ Incremental Sync: This mode allows translating commits that have not yet been synced with the target format
Link: onetable.dev/docs/features-a…
The incremental mode is a lightweight approach and has better performance, especially on large tables.
If there is anything that prevents the incremental mode from working properly, OneTable will fall back to the full sync mode.
Github Repo: github.com/onetable-io/oneta…
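The two sync modes and the fallback described in this thread can be sketched with a toy model in Python. This is not OneTable's implementation: commits are plain strings, `TargetState` and the function names are hypothetical, and "anything that prevents the incremental mode from working" is reduced to a single case, where the last synced commit can no longer be found in the source timeline.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TargetState:
    # Source commits already translated into the target format, in order.
    applied: List[str] = field(default_factory=list)


def full_sync(source_commits: List[str], target: TargetState) -> str:
    """Translate every source commit, replacing whatever the target had."""
    target.applied = list(source_commits)
    return "FULL"


def incremental_sync(source_commits: List[str], target: TargetState) -> str:
    """Translate only the commits made after the last synced one."""
    if not target.applied:
        raise ValueError("target has never been synced")
    last = target.applied[-1]
    if last not in source_commits:
        raise ValueError("last synced commit is missing from the source timeline")
    start = source_commits.index(last) + 1
    target.applied.extend(source_commits[start:])
    return "INCREMENTAL"


def sync(source_commits: List[str], target: TargetState) -> str:
    """Try the lightweight incremental path first, else fall back to full sync."""
    try:
        return incremental_sync(source_commits, target)
    except ValueError:
        return full_sync(source_commits, target)


state = TargetState()
print(sync(["c1", "c2"], state))        # first run: nothing synced yet
print(sync(["c1", "c2", "c3"], state))  # only c3 is new
print(sync(["c4", "c5"], state))        # c3 is gone from the timeline
```

The incremental path touches only the new tail of the timeline, which is why it is the cheaper option on large tables, while the full path always re-reads the whole history.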