Upsolver SQLake Makes Building a Pipeline for Data in Motion as Easy as Writing a SQL Query

-New cloud service unifies batch and stream processing in a single stack to accelerate market transition to real-time analytics

-Disruptive on-demand pricing model unveiled to eliminate exploding cloud bills – $99 per TB of data ingested, with all transformation pipelines free

Upsolver, the company dedicated to making data in motion accessible to every data practitioner, today announced the general availability of SQLake. The new service provides a SQL-based, self-orchestrating data pipeline platform that ingests and combines real-time events with batch data sources for up-to-the-minute analytics. It is available at a new ground-breaking price of $99 per TB ingested, with no charge for transformation processing, and no minimum commitment.

With SQLake, companies achieve a quantum leap in data freshness for use cases like ML model training, anomaly detection, and real-time BI and data science. It also makes data easily available to any SQL user – not just data engineers but data scientists, analysts, product managers and other data consumers.

SQLake fundamentally redefines the data pipeline development experience. As the only data pipeline platform that is data-aware, SQLake achieves unprecedented simplicity by automating numerous functions that usually require human intervention. Users no longer need to develop, test and maintain complicated orchestration logic (DAGs), optimize their data by hand, or manually scale their infrastructure. With SQLake, it’s all automatic.

Crossing the Chasm from Batch to Data in Motion for Fresh Insights

Fresher data leads to better decisions. However, businesses that want to move away from nightly batches and towards up-to-the-minute data freshness have had only two options, each accompanied by serious challenges. On the one hand, they could attempt to bolt batch processing onto data in motion, which creates an orchestration nightmare marked by unmanageable complexity, high operational costs, low data observability and mounting technical debt.

On the other hand, they could build a “Lambda architecture” that requires deploying and managing separate streaming infrastructure alongside their batch process. In this case they must hire specialized big data engineers, which is costly and creates a high barrier to self-service for non-engineer data consumers in the business. Furthermore, they must tune, optimize and scale the streaming solution, which results in increased operational overhead, violated SLAs and data consumer frustration.

The First Self-Orchestrating Data Pipeline Platform

Upsolver SQLake overcomes these challenges by treating all data as data in motion. It automatically determines the dependencies between each step in the pipeline to orchestrate and optimize the flow of data for efficient, resilient and performant delivery.
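
The dependency analysis described above can be illustrated with a minimal sketch. The pipeline step names below are hypothetical and the code is a generic illustration of dependency-driven ordering, not SQLake's internal implementation: once the dependencies between steps are known, a topological sort yields an execution order in which each step runs only after everything it depends on has completed.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps mapped to their upstream dependencies
# (illustrative names only; not SQLake's internal representation).
dependencies = {
    "ingest_events": set(),
    "stage_raw": {"ingest_events"},
    "enrich_with_history": {"stage_raw"},
    "aggregate_hourly": {"enrich_with_history"},
    "publish_to_warehouse": {"aggregate_hourly"},
}

# A topological sort produces a safe execution order: every step
# appears after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Automating this ordering is what removes the need for hand-written orchestration DAGs: the platform derives the schedule from the declared pipeline itself.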

With SQLake, building a pipeline is as easy as writing a SQL query. This creates numerous benefits, including:

  • Pipeline development cycles are shortened from months to weeks or days.
  • SQL’s widespread adoption lets data users self-serve pipelines for fresh analytics. No Java, Python, Spark or Airflow expertise is required.
  • Production pipelines are more robust since human errors are eliminated, and failure scenarios are anticipated and handled gracefully.
  • Scaling stateful operations is automatic. Unlike Apache Spark, which runs into scaling limits, SQLake’s unique state store efficiently handles billions of keys.

“Customers tell us that crossing the chasm to fresh data is extremely difficult, since stream processing is too complex for most users, and not powerful enough to replace batch workloads,” said Ori Rafael, CEO and co-founder of Upsolver. “SQLake changes the game. Now, anyone who knows SQL can easily develop and deploy data pipelines that blend real-time events with historical data, at massive scale.”

A Stateful Stream Processing Engine Proven at Scale

SQLake takes advantage of the same cloud-native processing engine used by Upsolver customers today, such as IronSource (mobile app user behavior), Proofpoint (network security) and Cox Automotive (VPC flows). It ingests streaming and batch data as events, supports stateful operations such as rolling aggregations, window functions, high-cardinality joins and UPSERTs, and delivers up-to-the-minute and optimized data to query engines, data warehouses and analytics systems.
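
To make the idea of a keyed, stateful rolling aggregation concrete, here is a minimal generic sketch of the concept, not Upsolver's engine: per-key state (a bounded window of recent values plus a running total) is updated as each event arrives.

```python
from collections import defaultdict, deque

class RollingSum:
    """Per-key rolling sum over the last `window` events.

    Illustrative in-memory state store: each key maps to a bounded
    deque of recent values plus a running total.
    """
    def __init__(self, window: int):
        self.window = window
        self.values = defaultdict(deque)
        self.totals = defaultdict(float)

    def update(self, key: str, value: float) -> float:
        vals = self.values[key]
        vals.append(value)
        self.totals[key] += value
        # Evict the oldest value once the window is exceeded.
        if len(vals) > self.window:
            self.totals[key] -= vals.popleft()
        return self.totals[key]

agg = RollingSum(window=3)
for v in [1, 2, 3, 4]:
    latest = agg.update("user-42", v)
print(latest)  # rolling sum of the last 3 events: 2 + 3 + 4 = 9
```

A production engine must additionally persist this state, partition it across machines, and keep it consistent under failure, which is exactly the operational burden the paragraph above says is handled automatically.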

“Peer39 is the leading provider of contextual data used to optimize the effectiveness of marketing campaigns. We use Upsolver to ingest and optimize 20B events per day into our data lake on AWS, resulting in fresh data being available within minutes and a 10X acceleration of data lake queries,” says Boaz Goldstein, R&D Manager, Data Architecture & Business Intelligence at Peer39. “Upsolver’s SQLake offering will make it easy for our data engineers and data scientists to develop pipelines that bring together streaming and historical data without having to manually develop and manage complex orchestration logic or struggle to scale infrastructure to meet our data volume.”

Write a Query, Get a Pipeline

SQLake redefines ease of use for pipeline development. Both data engineers and data consumers can build and deploy a continuous pipeline using only SQL in a few easy steps:

  1. Select a use case from the SQLake template gallery, or start a pipeline from scratch.
  2. Connect to data sources and ingest data into staging tables in the cloud data lake. SQLake automatically infers and evolves the source schema.
  3. Inspect and profile the data using real-time statistics and SQL queries of the staging tables.
  4. Develop a transformation job to create analytics-ready output tables in your data lake or data warehouse. Orchestration, data management and infrastructure scaling are automatic.
  5. Preview the results and start the pipeline.
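
The staging-then-transformation flow in steps 2–5 can be sketched with plain SQL against an embedded database. The syntax and table names below are illustrative only; SQLake's actual DDL and job definitions differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 2 (sketch): ingest raw events into a staging table.
conn.execute("CREATE TABLE staging_events (user_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO staging_events VALUES (?, ?)",
    [("a", 10.0), ("a", 5.0), ("b", 7.5)],
)

# Step 4 (sketch): a transformation materializes an
# analytics-ready output table from the staging data.
conn.execute("""
    CREATE TABLE output_user_totals AS
    SELECT user_id, SUM(amount) AS total_amount
    FROM staging_events
    GROUP BY user_id
""")

# Step 5 (sketch): preview the results.
rows = conn.execute(
    "SELECT user_id, total_amount FROM output_user_totals ORDER BY user_id"
).fetchall()
print(rows)  # [('a', 15.0), ('b', 7.5)]
```

The difference in a continuous pipeline is that the transformation runs incrementally as new events land in staging, rather than once over a static table.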

On-Demand Pricing at $99 per TB Ingested; Transformations are Free

With the launch of SQLake, Upsolver has moved to a predictable, value-based pricing model. Pricing is based solely on the volume of data ingested; transformations are free, with no limits on the number of pipelines in use. Unlike the opaque “processing units” that many data management solutions use, Upsolver costs are straightforward to understand and tied to customer value, not vendor costs.

To make SQLake attractive for any size of pipeline project, SQLake is available for $99 per TB of data ingested with no minimum commitment. This ground-breaking entry price, plus a 30-day free trial, allows any data user to get started risk-free with SQLake. Upsolver SQLake can be purchased on the AWS Marketplace.

Business Wire

For more than 50 years, Business Wire has been the global leader in press release distribution and regulatory disclosure.
