Last Week in a Byte on Delta Lake | 2023-06-06

You can watch or read the latest #DeltaLake news (2023-06-06 edition)! This week's edition is hosted by Will Girten!


Last Week's Publications

There was a great user story published on the Delta Lake blog this week by Kubit. If you've never heard of Kubit, they are a product analytics platform built to leverage data-sharing technologies. What I really liked about this article is that it shares how Kubit uses Delta Sharing to give its customers easy, fast, and affordable access to massive amounts of product data. Plus, the tuning tips were a really nice bonus: tips like using serverless compute, table partitioning strategies, and even optimizing the underlying data files of the Delta tables.


I can hardly believe it, but I’m excited to announce the release of Delta Lake 2.4.0. It feels like only a short time ago that I was getting to know Delta Lake 2.3.0, and we already have another release. Some notable features in this release include:

  • Support for Apache Spark 3.4
  • Support for all write operations on tables with deletion vectors enabled
  • A new PURGE command for removing deletion vectors for the latest snapshot
  • Support for using a WHEN NOT MATCHED BY SOURCE clause in MERGE operations
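
To make the WHEN NOT MATCHED BY SOURCE clause a little more concrete, here is a minimal sketch of its semantics in plain Python. This is a toy model using dictionaries, not the Delta Lake API: the `merge` function, the sample rows, and the table names in the docstring are all hypothetical, and the sketch assumes the clause is paired with a DELETE action.

```python
from typing import Dict

Row = Dict[str, str]


def merge(target: Dict[int, Row], source: Dict[int, Row]) -> Dict[int, Row]:
    """Toy model of a three-clause MERGE keyed on an integer id.

    Roughly the behavior of a statement shaped like this (illustrative only):

        MERGE INTO target t USING source s ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
        WHEN NOT MATCHED BY SOURCE THEN DELETE
    """
    result: Dict[int, Row] = {}
    for key in target:
        if key in source:
            # WHEN MATCHED: the target row is updated from the source row.
            result[key] = dict(source[key])
        # Otherwise WHEN NOT MATCHED BY SOURCE applies: the target row has no
        # counterpart in the source, so it is deleted (left out of the result).
    for key, row in source.items():
        if key not in target:
            # WHEN NOT MATCHED: the source row is new, so it is inserted.
            result[key] = dict(row)
    return result


# Hypothetical sample data: id 1 exists only in the target, id 3 only in the source.
target = {1: {"name": "alice"}, 2: {"name": "bob"}}
source = {2: {"name": "robert"}, 3: {"name": "carol"}}
merged = merge(target, source)
print(merged)
```

In this sketch, row 1 is removed because nothing in the source matches it, row 2 is updated, and row 3 is inserted. That lets a single MERGE bring a target table fully in line with a source.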


Another cool development this week comes from the Dask community. We are excited to announce that the dask-deltatable project was donated to the dask-contrib organization.

In case you missed it, Matthew Powers, CFA talked about this cool project during the Dask Demo Day this year, as a way to speed up queries from Dask by reading data from Delta tables.


Lastly, we are excited to announce the minor release of Delta Sharing 0.6.7, which adds the following features:

  • Adds a maxRetryDurationLimit to the retry logic
  • Retries on socket timeout
  • Refactors and consolidates Spark configs
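
For readers curious what a retry duration limit combined with retry-on-socket-timeout can look like, here is a rough sketch in plain Python. It is not the Delta Sharing implementation: `call_with_retry`, `max_retry_duration_s`, `initial_backoff_s`, and `flaky_request` are hypothetical names, and the exponential backoff is an assumption made for illustration.

```python
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    max_retry_duration_s: float = 10.0,   # hypothetical cap on total retry time
    initial_backoff_s: float = 0.1,       # hypothetical first wait between tries
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `request`, retrying on socket timeouts until a time budget runs out."""
    start = clock()
    backoff = initial_backoff_s
    while True:
        try:
            return request()
        except socket.timeout:
            elapsed = clock() - start
            # Give up once the next wait would push us past the total budget.
            if elapsed + backoff > max_retry_duration_s:
                raise
            sleep(backoff)
            backoff *= 2  # assumed exponential backoff between attempts


# Hypothetical request that times out twice and then succeeds.
attempts = {"count": 0}


def flaky_request() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise socket.timeout("simulated socket timeout")
    return "ok"


# A no-op sleep keeps the demonstration instant.
result = call_with_retry(flaky_request, sleep=lambda seconds: None)
print(result, attempts["count"])
```

Capping the total duration rather than only the number of attempts means a caller knows the longest it can wait before the error surfaces.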


And I can’t forget to mention the upcoming Data and AI Summit hosted by Databricks from June 26-29, 2023 in sunny San Francisco, CA. There will be a ton of featured speakers, special events, and even on-site training. You don’t want to miss this year’s event!

Connect with us!

Want to learn more about Delta Lake and chat with other users and contributors? Join us at delta.io, Slack, LinkedIn, and GitHub.
