Plato Data Intelligence.

Putting LLMs into production via vector databases

Last month MongoDB announced a public preview of Vector Search among the updates to Atlas, its database-as-a-service developer platform. The move means the document database joins Cassandra, PostgreSQL and SingleStore among the systems supporting similar features as interest in putting large language models (LLMs) into production gathers pace.

LLMs have received a great deal of hype in the last six months, with OpenAI’s GPT-4 sucking up the lion’s share of media airtime. The idea is to extract meaning – in the form of natural language question answering – from a corpus of text. The relationships between words, sentences and other textual units are represented as multi-dimensional vectors (sometimes running into hundreds of dimensions), which are then compared to find the closest associations.
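That comparison step – finding which stored vectors sit nearest a query vector – can be sketched in a few lines of Python. The three-dimensional toy "embeddings" below stand in for the hundreds of dimensions a real embedding model produces; the phrases and values are invented for illustration.

```python
import math

# Toy 3-dimensional "embeddings"; real models produce hundreds of dimensions.
documents = {
    "cat sat on the mat": [0.9, 0.1, 0.0],
    "kitten on a rug":    [0.8, 0.2, 0.1],
    "quarterly earnings": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def nearest(query, docs):
    """Rank documents by similarity to the query vector, closest first."""
    return sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)

query = [0.85, 0.15, 0.05]  # stand-in embedding for "a cat on a carpet"
print(nearest(query, documents))
```

A brute-force scan like this is fine for a handful of vectors; the databases discussed below exist because production workloads need approximate nearest-neighbour indexes to do the same thing over millions of embeddings.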

Anticipating the boom in this form of analysis of text and other data, a group of vendors have developed specialist databases with architectures designed specifically for the task. The question is whether it is better to employ a specialist database or to use new features of a system already familiar to developers and enterprises, with a home already marked out in the technology stack.

However, MongoDB argues that single-purpose databases for use cases like vector stores were often bolted on to existing technology stacks, and therefore led to greater administrative complexity and longer time to value. The approach also required developers to learn a new system.

Speaking to The Register, Ben Flast, product management lead for Vector Search, said high-dimensional vectors could be stored inside JSON documents around which MongoDB is designed.

“It’s quite straightforward to include these high-dimensional vectors inside of your documents,” he said. “As you look to add semantic search as a capability to your application and other new use cases around LLMs and chat bots arise, [you can] take that same data you were storing inside of your MongoDB deployment. You can embed or vectorize it, and add that vector to the individual documents and then create an index on it. We then manage all of the complexity behind the scenes in terms of having that index and supporting those queries.”
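The workflow Flast describes – store the embedding alongside the document, index it, then query – maps onto Atlas's `$vectorSearch` aggregation stage (the operator went by a different name during the preview, so treat the specifics as illustrative). A minimal sketch, with invented field and index names (`embedding`, `vector_index`):

```python
# A document carrying its embedding alongside the ordinary fields.
doc = {
    "title": "Waterproof hiking boots",
    "description": "Leather boots with a gusseted tongue.",
    "embedding": [0.12, -0.07, 0.33],  # in practice, hundreds of dimensions
}

# Aggregation pipeline for a semantic query. "vector_index" and "embedding"
# are hypothetical names chosen when the vector index was created.
query_vector = [0.10, -0.05, 0.31]  # embedding of the user's query text
pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",      # vector index on the collection
            "path": "embedding",          # document field holding the vector
            "queryVector": query_vector,  # vector to find neighbours of
            "numCandidates": 100,         # breadth of the approximate search
            "limit": 5,                   # number of results to return
        }
    },
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
]

# Against a live Atlas cluster this would run via pymongo:
#   results = db.products.aggregate(pipeline)
```

The point of Flast's pitch is visible in the shape of the data: the vector is just another field on the same document as the operational data, so no second system has to be kept in sync.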

Other popular developer databases including open source relational system PostgreSQL and wide-column store Cassandra support similar features. Pgvector is an open source vector extension for similarity search for PostgreSQL. “Since on vector embeddings you can use AI tools for capturing relationships between objects (vector representations), you are also able to identify similarities between them in an easily computable and scalable manner,” according to database service provider Aiven.
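The pgvector equivalent is plain SQL. The statements below are a sketch with invented table and column names; `vector(3)` declares a three-dimensional vector column, and `<=>` is pgvector's cosine-distance operator (`<->` is Euclidean).

```python
# pgvector usage sketched as SQL statements (table/column names invented).
setup_sql = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (id bigserial PRIMARY KEY, body text, embedding vector(3));
INSERT INTO items (body, embedding) VALUES ('hello world', '[0.1, 0.2, 0.3]');
"""

# Nearest-neighbour query: order rows by cosine distance ("<=>") from the
# query embedding and keep the closest five.
query_sql = """
SELECT body FROM items ORDER BY embedding <=> '[0.1, 0.2, 0.25]' LIMIT 5;
"""

# With a live PostgreSQL connection (e.g. via psycopg) each statement would
# be passed to cursor.execute(); they are collected here for illustration.
print(setup_sql, query_sql)
```

Aiven's point about scalability comes down to indexing: without an index this `ORDER BY` is a sequential scan, and pgvector's approximate indexes trade a little recall for large speedups on big tables.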

The Cassandra features are available in DataStax’s database service Astra and are set for inclusion in open source Cassandra 5.0 later this year. Patrick McFadin, Apache Cassandra committer and developer relations veep at DataStax, told The Register: “Several new startups have created a business by constructing a specialized vector search database. However, this limited approach avoids other crucial data in AI workloads. These startups have taken a single function and attempted to transform it into a separate product.

“While vector search was once a niche industry requirement, these new products only fit those niche requirements. Nowadays, as vector search has become a mainstream requirement, mainstream databases are incorporating vector search as a feature for their developers.”

For the specialists, however, it is scale and performance, not developer convenience, that will ensure continuing demand for their approach.

Built by the team behind Amazon SageMaker, Pinecone is designed to let machine learning engineers search through catalogues of embeddings – the continuous vector representations of discrete variables fundamental to common ML algorithms. In April, it raised $100 million in series B funding at an estimated valuation of $750 million.

Speaking to The Register, Pinecone product veep Elan Dekel said that while every database is likely to feature some kind of vector support in the near future, they might not be the most effective approach for all use cases.

“If your use case is relatively small, then [a general purpose system is] probably enough,” he said. “But at some point, you’re going to realize that you’re starting to break the limits of the existing architecture. When you want to hit real production scale, retrofitting the existing solutions will mean the cost will explode to get this performance.

“If your use case is relatively small, or you don’t care about performance, you will be fine. There’ll be like this mid-tier of use cases where you can happily continue, but as you get to sort of real production scale, you will start to reach the limits of the existing systems. If you want high performance, support for high scale systems and you want it efficiently, at a reasonable cost, you’ll ultimately realise that you need a purpose-built database.”

Peter Zaitsev, an expert in MySQL performance and founder of database service company Percona, said there would not be a single answer to the dilemma.

“Quite often, in the early stage, there are multiple technologies that appear on the market with slightly different approaches, features and focus, and it will take time for the market to settle,” he told The Register.

“In the end, I expect the SQL standard will include some things to support vector search applications, and we will have some unique extensions in various existing databases, whether relational, document and so on. Alongside these, we will get between three and five special purpose vector databases controlling 95 percent of the special purpose vector database market.”

Among specialist vector database systems, Pinecone is joined by Weaviate, Qdrant, Milvus, and Vespa.

Noel Yuhanna, veep and principal analyst with Forrester Research, said he was hearing positive feedback from organizations using these systems, which promise access control, high availability, transformation, query optimization, resource management, scalability, concurrency, and fast data queries that help support LLMs.

However, developer familiarity would be a strong draw toward the established databases now adding support for vector analysis.

“While native vector databases will stand out, having better performance and scale, we will likely see organizations also leveraging traditional databases with vector capabilities that need more integrated data comprising systems of record, systems of engagement, and vector data to deliver much richer LLM applications with less coding,” he said.

Poster child of the current LLM hype machine, OpenAI was valued at around $29 billion earlier this year as it inhaled a $300 million investment. If business applications reflect anything like that kind of interest, the battle over the best supporting databases will rage for some time. ®
