Starburst Announces New Product Release Including up to 10x Performance Improvements for Parquet Files and Delta Lake Tables

Additional new capabilities in Starburst Enterprise give data teams lakehouse Data Manipulation Language (DML) support, more Single Sign-On (SSO) functionality, and data governance integration

Starburst, the analytics anywhere company, announced the availability of the latest version of Starburst Enterprise. The new release provides Starburst customers with net new capabilities alongside more advanced connectivity, improved performance, and more security features. Among the new capabilities, Starburst Cached Views extends the concept of traditional materialized views to the data mesh, allowing domain experts to easily enrich, transform, and move data according to their needs.

“Achieving a successful data mesh architecture requires the ability to access data in disparate systems and sources. Starburst Cached Views enables users to query data from other systems, and transparently cache that data in their own domain for increased performance,” said Matt Fuller, VP, Product and co-founder of Starburst Data. “This can provide significant performance advantages by precalculating complex queries and joins and caching closer to consumers. It can also significantly reduce data egress costs and enable domain experts to create a performant semantic layer.”

The release additionally includes a new proprietary Parquet reader, improving read performance on Parquet files by an average of 20% over Trino. With the release of the Parquet reader, Starburst has reinforced its position as the leader in analytics and performance on data lakes.

In addition to Starburst Cached Views and the new Parquet reader, this release of Starburst Enterprise includes these new offerings to help serve data mesh architectures:

  • Standardized data lineage and governance support allows data owners to control access while giving data consumers insight into the lineage of their data. Starburst’s Atlas integration allows sensitive data to be tracked as it is moved, ensuring that future access maintains the same security controls.
  • Starburst can authenticate with a number of OIDC-compliant identity providers (Okta, Azure and on-premises ADFS) and pass the identity through to authenticate to many underlying data sources that support OIDC. This allows organizations to centralize authentication control for improved security and reduced operational overhead.
  • As more data products are created in the mesh, it’s important to continue to add connectivity to prevent data silos. In this release, Starburst has added connectivity for DynamoDB. Previously, customers needed to extract data from DynamoDB and load it into another database for SQL analytics. Now Starburst customers can use the DynamoDB connector to query the data directly in DynamoDB, without data movement.

As organizations create, manage and consume more and more data, lakehouses allow them to cheaply store large amounts of data while giving users the fine-grained control they need to keep their data up to date and compliant with data regulations. In this release, Starburst is excited to announce the following improvements to support lakehouse architectures:

  • Enhanced support for the Delta Lake format. Starburst now has native statistics collection that is used by Starburst’s cost-based optimizer to improve query performance by a factor of 2. Starburst also integrates with the Delta Lake OPTIMIZE command, which allows for more efficient data organization and thus further increases the performance of Delta Lake queries.
  • Numerous performance enhancements. Dynamic filtering accelerates federated joins, aggregation pushdown moves query processing closer to the source, and improvements to the Parquet reader significantly improve read throughput on data lakes. Analytical workloads perform up to an order of magnitude faster.
