Last Week in a Byte on Delta Lake | 2023-03-21

You can watch or read the latest #DeltaLake news a week late (2023-03-21 edition)!


Recent releases and contributions

  • We're proud to announce the delta-rs rust-v0.8.0 release, which adds support for additional types of partition values, implements pruning on partition columns, adds typed commit info, enables passing storage options to the Delta table builder via DataFusion's CREATE EXTERNAL TABLE, and more! More information is available at https://github.com/delta-io/delta-rs/releases/tag/rust-v0.8.0. #rustlang
  • Gurunath Rajagopal recently released his Lakehouse Sharing project, which demonstrates a table-format-agnostic data sharing server (based on the Delta Sharing protocol) implemented in Python for both #deltalake and #apacheiceberg formats. #deltasharing
  • Want to help create Delta Lake helper functions without Spark dependencies? Check out https://github.com/MrPowers/levi and chat with Matthew Powers, CFA, who created the levi, mack, and jodie Delta Lake helper function libraries.
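  • Get the latest Delta table version using the mack helper functions:
      from delta.tables import DeltaTable
      import mack
      # spark is an active SparkSession; path points at an existing Delta table
      delta_table = DeltaTable.forPath(spark, path)
      mack.latest_version(delta_table)
      >> 2
  • Read a table's Change Data Feed using jodie's ChangeDataFeedHelper (Scala):
      // import the Jodie library first
      ChangeDataFeedHelper(deltaTablePath, 0, 25)
        .dryRun()
        .readCDF()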
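
To make "latest table version" concrete without a Spark session, here is a minimal sketch that reads the version straight from a table's transaction log. It assumes the table lives on a local filesystem and relies on the Delta convention that each commit is written to `_delta_log/` as a JSON file named with a zero-padded 20-digit version number. The function name `latest_version_from_log` is hypothetical: it is not part of mack, levi, or jodie, and it is not meant to show how mack computes its result.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

# Commit files look like 00000000000000000002.json; checkpoint files and
# _last_checkpoint live in the same directory and must be skipped.
_COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def latest_version_from_log(table_path: str) -> Optional[int]:
    """Return the highest commit version in <table_path>/_delta_log.

    Hypothetical helper for illustration only. Returns None when the
    directory has no Delta log or no commit files.
    """
    log_dir = Path(table_path) / "_delta_log"
    if not log_dir.is_dir():
        return None
    versions = [
        int(match.group(1))
        for entry in log_dir.iterdir()
        if (match := _COMMIT_FILE.match(entry.name))
    ]
    return max(versions) if versions else None


if __name__ == "__main__":
    # Build a fake log with commits 0, 1, and 2 so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as table_path:
        log_dir = Path(table_path) / "_delta_log"
        log_dir.mkdir()
        for version in range(3):
            (log_dir / f"{version:020d}.json").write_text("{}")
        print(latest_version_from_log(table_path))  # prints 2
```

For tables on object storage such as S3, the same idea would need a listing call from the relevant storage client in place of `Path.iterdir()`.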


Upcoming events

We are happy to partner with Blueprint on their Velocity Tour to bring you demos, meet and greets, speaking sessions, and more! In March and April, they will be at Data Council Austin 2023, PyCon US 2023 in Salt Lake City, and PyData Seattle 2023. Check out the Velocity Tour for all of their dates!


Latest community blogs

Robert Kossendey published the fourth blog in his insightful series on his journey to the #lakehouse with the post Lakehouse - A resumé.

“Overall, we are more than satisfied with the outcome of our Lakehouse migration. We reduced our overall costs by 80% while improving our developer experience drastically. We don’t have to maintain a Redshift cluster anymore. Instead, we store all the data in a single place, S3. Further, the core of our infrastructure is powered by open source, namely Apache Spark and Delta Lake. That empowers us to move away from Databricks if we are ever unhappy with the service.”

For more information, check out the vidcast D3L2: The Journey Unifying Data Lake and Data Warehouse with Robert Kossendey at Claimsforce. cc claimsforce

Connect with us!

Want to learn more about Delta Lake and chat with other users and contributors? Join us at delta.io, Slack, LinkedIn, and GitHub.
