UPDATED 22:37 EDT / JULY 04 2024

Hopsworks wants to make a splash with the industry’s first AI data lakehouse

Swedish data lakehouse company Hopsworks AB is pitching what it says is the industry’s “first unified platform for building batch, real-time and large language model” artificial intelligence systems.

In the Hopsworks 4.0 release, the company is enhancing its flagship platform with multiple new capabilities designed to help AI systems leverage real-time data, including a new feature store and vector database index system. The release, which will become generally available “soon”, also adds native Python access and more granular fine-tuning features for LLMs.

The updates will enable teams to build AI systems of unparalleled scale and support the most challenging AI workloads, the company promised.

Hopsworks is the creator of an enterprise-grade machine learning platform that’s designed to support the development and operation of data pipelines at scale. It competes with data warehouse platforms such as Snowflake and Databricks, and the company says its platform is built around a “Feature Store” that lets teams manage the datasets used in AI model training and inference.

With the Hopsworks platform, teams get access to a centralized repository for managing features, experiments, AI models and data assets. It also provides multi-tenancy for users that need to collaborate on sensitive data.

Hopsworks supports popular machine learning tools and frameworks such as Apache Spark, TensorFlow, PyTorch and Scikit-Learn. According to the company, it’s an integrated system that helps to overcome some of the biggest obstacles in the way of scaling AI, including hyperparameter tuning, feature engineering and training.

With the Hopsworks 4.0 release, the company says it’s boosting both performance and availability. It has added cross-region replication to make the platform more resilient to hardware and network failures. In the event of a data center outage, customers can switch to an alternative geographic region, without any data loss, and keep their AI systems running.

Meanwhile, the addition of vector and similarity search capabilities within Hopsworks means teams will be able to employ “retrieval-augmented generation” or RAG techniques to boost the capabilities of LLMs by feeding them with proprietary data.
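To make the RAG idea concrete, here is a minimal Python sketch of the retrieval step: it ranks stored passages by cosine similarity to a query vector and places the best matches into a prompt. It does not use the Hopsworks API. The function names, the three-dimensional toy embeddings and the sample passages are all hypothetical, and a real system would get its vectors from an embedding model and its matches from a vector index.

```python
import math

# Hypothetical toy index: (passage, embedding) pairs. Real embeddings would
# come from an embedding model and have hundreds of dimensions.
INDEX = [
    ("Refunds are processed within 14 days.", [0.9, 0.1, 0.0]),
    ("Our office is in Stockholm.", [0.0, 0.2, 0.9]),
    ("Refund requests need an order number.", [0.8, 0.3, 0.1]),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k passages whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, passages):
    """Place the retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Assumed embedding of the question "How do refunds work?"
    query = [1.0, 0.1, 0.0]
    print(build_prompt("How do refunds work?", top_k(query, INDEX)))
```

The brute-force scan shown here is only workable for a handful of passages; a vector index exists to avoid comparing the query against every stored embedding.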

The platform has also enhanced its support for Kubernetes, and it can now be deployed, maintained and upgraded via Helm Charts, which means it can be run in any kind of information technology environment, including cloud platforms and air-gapped on-premises servers.

Performance gets a boost with the debut of the new Hopsworks Query Service, which provides Python clients with 45 times higher throughput when reading data from the lakehouse than platforms such as Databricks and Vertex AI. The platform also gains enhanced feature monitoring support, enabling users to keep track of how their data changes over time and compare it with the data that was used to train a specific version of an AI model.
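As a rough illustration of what feature monitoring involves, the Python sketch below compares summary statistics of current feature values against those of the data a model was trained on, and flags a large shift in the mean. It is not Hopsworks code: the function names, the 0.5 threshold and the sample values are assumptions chosen for the example, and production systems typically track more statistics than the mean and standard deviation.

```python
from statistics import mean, pstdev


def summarize(values):
    """Basic descriptive statistics for one numeric feature."""
    return {"mean": mean(values), "std": pstdev(values)}


def drift_report(training, current, threshold=0.5):
    """Compare current values with the training-time reference.

    The shift is the absolute difference in means, expressed in units of
    the training standard deviation. The threshold is an arbitrary
    illustrative value, not a recommended setting.
    """
    ref = summarize(training)
    cur = summarize(current)
    scale = ref["std"] if ref["std"] > 0 else 1.0
    shift = abs(cur["mean"] - ref["mean"]) / scale
    return {
        "reference": ref,
        "current": cur,
        "shift": shift,
        "drifted": shift > threshold,
    }


if __name__ == "__main__":
    # Hypothetical values of one feature at training time and today.
    training_values = [10, 12, 11, 13, 9]
    current_values = [15, 16, 14, 17, 15]
    print(drift_report(training_values, current_values))
```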

Hopsworks Chief Executive Jim Dowling hailed version 4.0 of the platform as the company’s “most innovative release” so far, adding that it provides a foundation for “game changing innovations” in AI system development.

Image: SiliconANGLE/Microsoft Designer
