Databricks Introduces New Generative AI Tools, Investing in Lakehouse AI

Databricks’ data-centric approach to AI makes it easier to build, deploy and manage large language model (LLM) applications, enabling customers to accelerate their generative AI journey

At the sold-out Data + AI Summit, Databricks, the Data and AI company, announced new Lakehouse AI innovations that allow customers to easily and efficiently develop generative AI applications, including large language models (LLMs), directly within the Databricks Lakehouse Platform. Lakehouse AI offers a unique, data-centric approach to AI, with built-in capabilities for the entire AI lifecycle and underlying monitoring and governance. New features that will help customers more easily implement generative AI use cases include: Vector Search, a curated collection of open source models, LLM-optimized Model Serving, MLflow 2.5 with LLM capabilities such as AI Gateway and Prompt Tools, and Lakehouse Monitoring.

The demand for generative AI is driving disruption across industries, creating urgency for technical teams to build generative AI models and LLMs on top of their own data to differentiate their offerings. However, data determines success with AI, and when the data platform is separate from the AI platform, it’s difficult to enforce and maintain clean, high-quality data. Additionally, the process of getting a model from experimentation to production, and the related tuning, operationalizing, and monitoring of the models, is complex and unreliable.

With Lakehouse AI, Databricks unifies the data and AI platform, so customers can develop their generative AI solutions faster and more successfully – from using foundational SaaS models to training their own custom models securely with their enterprise data. By bringing together data, AI models, LLM operations (LLMOps), monitoring and governance on the Databricks Lakehouse Platform, organizations can accelerate their generative AI journey.

“At JetBlue, we inspire humanity through our product, culture and customer service. We’ve embarked on an AI transformation over the past year because we believe AI, and in particular LLMs, can fuel increased productivity and better customer experience for our travelers,” said Sai Ravuru, Senior Manager of Data Science and Analytics at JetBlue. “Databricks has been instrumental in our AI and ML transformation and has helped us build our own LLM, enabling our team to more effectively use the BlueSky platform to make decisions using real-time streams of weather, aircraft sensors, FAA data feeds and more. The deployment is significantly improving our onboarding time for new users. We’re excited about all of Databricks’ data-centric AI innovations, enabling customers like us to build LLMs in the lakehouse and govern them from there.”

Offering the Best Data Platform to Develop Generative AI Solutions

Lakehouse AI unifies the AI lifecycle, from data collection and preparation, to model development and LLMOps, to serving and monitoring. Newly announced capabilities include:

  • Vector Search: Databricks Vector Search enables developers to improve the accuracy of their generative AI responses through embeddings search. It will fully manage and automatically create vector embeddings from files in Unity Catalog (Databricks’ flagship solution for unified search and governance across data, analytics and AI) and keep them updated automatically through seamless integrations with Databricks Model Serving. Additionally, developers can add query filters to provide even better outcomes for their users.
  • Fine-tuning in AutoML: Databricks AutoML now brings a low-code approach to fine-tuning LLMs. Customers can securely fine-tune LLMs using their own enterprise data, without having to send that data to a third party, and they own the resulting model produced by AutoML. Additionally, with MLflow, Unity Catalog and Model Serving integrations, the model can be easily shared within an organization, governed for appropriate use, served for inference in production and monitored.
  • Curated open source models, backed by optimized Model Serving for high performance: Databricks has published a curated list of open source models available within Databricks Marketplace — including MPT-7B and Falcon-7B instruction-following and summarization models, and Stable Diffusion for image generation — making it easy to get started with generative AI across a variety of use cases. Lakehouse AI capabilities like Databricks Model Serving have been optimized for these models to ensure peak performance and cost optimization.
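
To make the idea of embeddings search with query filters concrete, here is a minimal, library-free sketch. It is not the Databricks Vector Search API: the names `Doc`, `search` and `INDEX`, the metadata key `dept`, and the toy three-dimensional embeddings are all hypothetical, and a real system would obtain embeddings from a model rather than hard-coding them.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Doc:
    """A document with a precomputed embedding (hypothetical structure)."""
    text: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index, query_embedding, k=3, filters=None):
    """Return the k documents most similar to the query.

    `filters` is an exact-match constraint on metadata, applied before
    ranking so that only matching documents compete for the top-k slots.
    """
    filters = filters or {}
    candidates = [
        doc for doc in index
        if all(doc.metadata.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine(doc.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy index with made-up embeddings, for illustration only.
INDEX = [
    Doc("Baggage policy for domestic flights", [0.9, 0.1, 0.0], {"dept": "support"}),
    Doc("Quarterly revenue summary", [0.1, 0.9, 0.0], {"dept": "finance"}),
    Doc("Refund rules for cancelled flights", [0.8, 0.2, 0.1], {"dept": "support"}),
]
```

Calling `search(INDEX, [1.0, 0.0, 0.0], k=2, filters={"dept": "support"})` ranks only the support documents. Filtering before ranking is one possible design choice; it keeps documents outside the filter from taking top-k slots.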

Managing LLMOps Effectively and Reliably

Databricks also unveiled LLMOps innovations with the announcement of MLflow 2.5, the latest release of the popular Linux Foundation open source project MLflow. This is Databricks’ latest contribution to one of the company’s flagship open source projects. MLflow is an open source platform for the machine learning lifecycle that sees nearly 11 million monthly downloads. MLflow 2.5 updates include:

  • MLflow AI Gateway: MLflow AI Gateway enables organizations to centrally manage credentials for SaaS models or model APIs and provide access-controlled routes for querying. Organizations can then provide these routes to various teams to integrate into their workflows or projects. Developers can easily swap out the backend model at any time to improve cost and quality, and switch across LLM providers. MLflow AI Gateway will also enable prediction caching to track repeated prompts and rate limiting to manage costs.
  • MLflow Prompt Tools: New, no-code visual tools allow users to compare various models’ output based on a set of prompts, which are automatically tracked within MLflow. With integration into Databricks Model Serving, customers can deploy the relevant model to production.
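
The gateway pattern described above (a named route in front of a swappable backend model, with prediction caching and rate limiting) can be sketched in plain Python. This is an illustration of the pattern only and does not use or reproduce the MLflow AI Gateway API: `Route`, `RateLimitExceeded`, `swap_backend` and the sliding-window limiter are hypothetical names and choices, and the backend is any callable standing in for a provider call.

```python
import time
from collections import deque
from typing import Callable


class RateLimitExceeded(Exception):
    """Raised when a route has used up its call budget for the current window."""


class Route:
    """A named route that fronts a backend model (hypothetical sketch).

    Callers only see the route; credentials and provider details would
    live behind `backend`, which here is any function from prompt to text.
    """

    def __init__(self, name: str, backend: Callable[[str], str],
                 max_calls: int, per_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.name = name
        self.backend = backend
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self._clock = clock
        self._calls = deque()   # timestamps of recent backend calls
        self._cache = {}        # prompt -> cached response
        self.backend_calls = 0  # number of requests that reached the backend

    def query(self, prompt: str) -> str:
        # Repeated prompts are answered from the cache and do not
        # count against the rate limit (an assumption of this sketch).
        if prompt in self._cache:
            return self._cache[prompt]

        # Sliding window: forget backend calls older than the window.
        now = self._clock()
        while self._calls and now - self._calls[0] >= self.per_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            raise RateLimitExceeded(f"route '{self.name}' is over its limit")

        self._calls.append(now)
        self.backend_calls += 1
        response = self.backend(prompt)
        self._cache[prompt] = response
        return response

    def swap_backend(self, backend: Callable[[str], str]) -> None:
        """Point the route at a different model without changing callers."""
        self.backend = backend
        self._cache.clear()  # cached answers came from the old model
```

Because teams call `route.query(...)` rather than a provider directly, the backend can be replaced with `swap_backend` without touching their code. Clearing the cache on a swap is a conservative choice so that responses from the old model are not served as if they came from the new one.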

Additionally, following its release earlier this year, Databricks Model Serving has been optimized for LLM inference, delivering up to 10x lower latency and reduced costs. Fully managed by Databricks to offer frictionless infrastructure management, Model Serving now supports GPU-based inference. It auto-logs and monitors all requests and responses to Delta Tables and ensures end-to-end lineage tracking through Unity Catalog. Finally, Model Serving quickly scales up from zero and back down as demand changes, reducing operational costs and ensuring customers pay only for the compute they use.

Intelligent Monitoring Across Data and AI Assets

Databricks also expanded its data and AI monitoring capabilities with the introduction of Databricks Lakehouse Monitoring to better monitor and manage all data and AI assets within the Lakehouse. Databricks Lakehouse Monitoring provides end-to-end visibility into data pipelines, to continuously monitor, tune and improve performance, without additional tools and complexity. By taking advantage of Unity Catalog, Lakehouse Monitoring provides users with deep insight into the lineage of their data and AI assets to ensure high quality, accuracy and reliability. Proactive detection and reporting will make it easy to spot and diagnose errors in pipelines, automatically perform root cause analysis and quickly find recommended solutions across the data lifecycle.

“We’ve reached an inflection point for organizations: leveraging AI is no longer aspirational — it is imperative for organizations to remain competitive,” said Ali Ghodsi, Co-Founder and CEO at Databricks. “Databricks has been on a mission to democratize data and AI for more than a decade and we’re continuing to innovate as we make the lakehouse the best place for building, owning and securing generative AI models.”

Databricks continues to expand the Lakehouse Platform, recently announcing Lakehouse Apps and the general availability of Databricks Marketplace, LakehouseIQ, new governance capabilities, and Delta Lake 3.0.
