Orama 2.0 is out - Hybrid search, Geosearch, Secure AI Proxy, Orama Cloud, and more


Michele Riva

Product Updates

10 min read

Jan 16, 2024

In May 2023, we launched the first stable version of Orama, version 1.0.0. Following this, we dedicated substantial effort to developing a new kind of full-text and vector search engine. We have added numerous new features and created innovative services around it, which are included in the Orama Cloud offering.

Orama 2.0 introduces major new features and stabilizes existing beta APIs. Let's explore what's in this release together.

Full-text, Vector, and Hybrid Search

In Orama 1.2.0, we introduced vector search and storage, empowering users to carry out semantic and context-aware searches directly from their browsers or via Orama Cloud.

With Orama 2.0, we are expanding the number of search paradigms we support by incorporating hybrid search as a first-class, integral component of our database.

In Orama, hybrid search is a paradigm that combines full-text and vector search. Hybrid search optimizes search results based on your intent. If you make a contextual query, it returns more contextual data. Conversely, if your query is keyword-based, it returns more full-text-based data. This approach enables hybrid search to combine the advantages of different methods, resulting in higher-quality results when the dataset or search query doesn't align with just one search algorithm.

A common example is a short search query, such as a user searching for "shoes" in an e-commerce search box. This works well with full-text search, but vector search struggles due to the lack of context.

Conversely, a detailed search query like "best high heels shoes for a casual wedding" provides a lot of context. However, it doesn't perform well with full-text search algorithms, which focus on individual tokens and ignore context. But it excels with vector search.

With Orama 2.0, there's no compromise needed – we handle it all for you, automatically.

We've also standardized the search APIs to use a common format. Now, you can use the search function to perform full-text, vector, and hybrid searches all in one place:

const resultsFulltext = await search(db, {
  mode: 'fulltext', // you can omit `mode` as fulltext is the default search mode
  term: 'Videogame for little kids with a passion for ice cream'
})

const resultsVector = await search(db, {
  mode: 'vector',
  term: 'Videogame for little kids with a passion for ice cream'
})

const resultsHybrid = await search(db, {
  mode: 'hybrid',
  term: 'Videogame for little kids with a passion for ice cream'
})

Lastly, vector and hybrid search now support filters, facets, groups, and all other APIs typically reserved for full-text search.
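As a rough illustration, a hybrid query combining a filter and a facet might look like the sketch below. The schema properties (`price`, `brand`) are made up for this example; the `mode`, `where`, and `facets` parameters follow the same shape as a full-text query.

```javascript
// Hypothetical hybrid query mixing semantic intent with structured
// constraints; the `price` and `brand` properties are illustrative.
const params = {
  mode: 'hybrid',
  term: 'running shoes for trail marathons',
  where: { price: { lte: 150 } }, // filter, exactly as in full-text search
  facets: { brand: {} }           // facet counts on the brand property
}

// const results = await search(db, params)
```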

Stable Geosearch APIs

Geosearch has been a highly requested feature for Orama, and we released its first implementation as part of the v2.0.0-beta.1 release - and developers loved it.

The demo, which searches the full list of airports worldwide, went viral and showcases what this API can do in high-performance, real-world applications:

You can play with the demo here: https://orama-examples-geosearch-airports.vercel.app.

Over the past few weeks, we have enhanced the efficiency and performance of Geosearch by introducing a high-precision Geosearch API. This API is designed for highly accurate calculations on large polygons.

With the release of Orama 2.0.0, we now officially declare the Geosearch APIs as stable and ready for production use.

You can read the complete documentation about Geosearch APIs here.
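To give a sense of what a geosearch query looks like, here is a hedged sketch of a radius search. The exact parameter names (`location`, `radius`, `coordinates`, `inside`) are assumptions for illustration; check the documentation linked above for the authoritative shape.

```javascript
// Hypothetical radius filter: match documents whose (assumed) `location`
// geopoint property falls within a circle around a center coordinate.
function radiusFilter(lat, lon, meters) {
  return {
    location: {
      radius: {
        coordinates: { lat, lon }, // circle center
        value: meters,             // circle radius in meters
        inside: true               // match points inside the circle
      }
    }
  }
}

// Find documents within 5 km of central Milan.
const where = radiusFilter(45.4642, 9.19, 5000)
// const results = await search(db, { term: 'airport', where })
```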

Orama Secure AI Proxy

To execute vector and hybrid searches on the client side, specifically in your browser, you must generate embeddings.

OpenAI's text-embedding-ada-002 is the most commonly used model. You can invoke it by making a REST API call directly from your browser or server. However, this approach requires you to expose your OpenAI API key, which could lead to exorbitant bills if someone were to start using it for something else. OpenAI services aren't inexpensive!

While it's possible to create a server-side route in your Next.js or Nuxt application, or with any backend framework of your choice, you would still need to handle rate limiting, detect malicious traffic, and more.

For this reason, we developed the Orama Secure AI Proxy, a proxy nano-service that is deployed close to your users. It is integrated into our Global Search Network, which spans over 300 cities worldwide in more than 100 countries.

You can securely call the Orama Secure AI Proxy endpoint directly from your browser. This will relay your request to OpenAI, protecting your API key and applying two levels of rate limiting to help you stay within your monthly budget.

We have released the official @orama/plugin-secure-proxy npm package. This allows you to integrate the Secure AI Proxy directly into Orama. With it, you can perform vector and hybrid search queries using the same search API as for full-text search.

import { create, search, insertMultiple } from '@orama/orama' 
import { pluginSecureProxy } from '@orama/plugin-secure-proxy'

const secureProxy = pluginSecureProxy({
  apiKey: '<YOUR-PUBLIC-API-KEY>',
  defaultProperty: 'embeddings' // the default property to perform vector and hybrid search on
})

const db = await create({
  schema: {
    title: 'string',
    description: 'string',
    embeddings: 'vector[1536]'
  },
  plugins: [secureProxy]
})

const results = await search(db, {
  mode: 'vector', // or 'hybrid'
  term: 'Videogame for little kids with a passion for ice cream'
})

The Orama Secure AI Proxy is included in the Orama Cloud free plan, although there are some limitations. For more details on pricing and its operation, refer to the dedicated announcement.

Orama Native Embedding Formats

Although the Orama Secure AI Proxy can generate embeddings through OpenAI, we're pleased to introduce a high-performance alternative that could be extremely beneficial to Orama users on multiple levels.

We've forked GTE (General Text Embedding), a model trained by the Alibaba DAMO Academy and based on BERT. It's now available on our Global Search Network, allowing for faster, smaller, and cheaper on-the-fly embedding generation.

While OpenAI's text-embedding-ada-002 model is impressive, it generates large embeddings (1536 dimensions) that are burdensome to transfer and process.

On the other hand, Orama's GTE-Small model generates vectors of 384 dimensions directly on a CDN. This reduces latency and makes them very fast to process, both during search and indexing. These vectors are also less taxing on browser memory.
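A quick back-of-the-envelope comparison makes the size difference concrete (assuming vectors are stored as 32-bit floats, which is a common but not guaranteed representation):

```javascript
// Approximate memory footprint of one embedding vector, assuming
// 32-bit floats (4 bytes per dimension).
const BYTES_PER_FLOAT32 = 4

function embeddingBytes(dimensions) {
  return dimensions * BYTES_PER_FLOAT32
}

const ada002 = embeddingBytes(1536)  // OpenAI text-embedding-ada-002
const gteSmall = embeddingBytes(384) // Orama's GTE-Small

console.log(ada002 / gteSmall) // 4, i.e. GTE-Small vectors are 4x smaller
```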

In terms of quality, GTE-Small outperforms OpenAI's text-embedding-ada-002 in HuggingFace's MTEB benchmarks.

We are actively working to support other models soon. In the meantime, you can start using Orama’s GTE-Small model through the Secure AI Proxy immediately.

Orama Cloud

We are making significant improvements to our Orama Cloud offering by expanding its limits and adding more features.

New Index Size Limits

Starting today, we're increasing the index size limit from 10,000 to 100,000 documents for the pro plan. You can begin indexing larger indexes right away, and they'll be distributed through our Global Search Network within minutes.

The maximum index size is also increasing from 10MB to 100MB.

New data source: the WebHook APIs

With Orama 2.0 we’re also releasing a highly requested feature: the WebHook APIs. You can now insert, update, and delete documents in your index using them.

These APIs are incredibly easy to use:

  • Release an Orama snapshot. You can deploy a new Orama snapshot by making a request to https://api.oramasearch.com/api/v1/webhook/snapshot and including your documents in the request body. If you include all your documents in this request, Orama will generate a snapshot and replace the live index with this new snapshot. This is useful for replacing the entire index as a bulk operation.

  • Notify a change in the index. You can now progressively notify Orama of changes in your indexes by calling the https://api.oramasearch.com/api/v1/webhook/notify endpoint, which will keep track of all the changes (single or bulk additions, removals, and upserts) and will deploy them when the deploy API is called.

  • Run a new deployment. After you’ve notified Orama of all the changes, deploy them by calling the https://api.oramasearch.com/api/v1/webhook/deploy API.
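The notify-then-deploy flow above can be sketched with plain `fetch` calls. The endpoints come from the steps above, but the authorization header and request payload shape shown here are assumptions, not the documented contract:

```javascript
// Hypothetical sketch of the WebHook notify/deploy flow. The auth
// scheme and the payload shape are assumptions for illustration.
const BASE = 'https://api.oramasearch.com/api/v1/webhook'

// Build the request options for a notify call (payload shape assumed).
function notifyRequest(apiKey, changes) {
  return {
    url: `${BASE}/notify`,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}` // assumed auth scheme
    },
    body: JSON.stringify(changes)
  }
}

// Usage: notify Orama of an upsert, then deploy the accumulated changes.
// const { url, ...init } = notifyRequest('<KEY>', { upsert: [{ id: '1', title: 'Doc' }] })
// await fetch(url, init)
// await fetch(`${BASE}/deploy`, { method: 'POST', headers: init.headers })
```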

These new APIs will enable Orama users to adopt standardized workflows when interacting with Orama Cloud indexes.

Automatic Embedding Generation

With the introduction of the Orama Secure AI Proxy, we're also launching the automatic embedding generation API as part of our Orama Cloud service.

When creating a new index and defining a schema, you can choose which properties to use for automatic embedding generation.

Orama handles the embedding generation process, supporting OpenAI embedding models (you will need to enter your OpenAI API key) or Orama Native Embedding models. The latter is the preferred solution as these models are specifically optimized to run on Orama Cloud.

But that’s not all

Orama 2.0 introduces numerous fixes and new features, including:

  • New plugin system: Learn more

  • New Vitepress Plugin: More details

  • A complete rewrite of the Docusaurus Plugin, which is now smaller and faster (#577)

  • Faster number indexing and filtering, achieved by fixing a bug that caused the internal data structure used for number indexing (AVL tree) to be unbalanced (#558)

  • 5x faster number indexing due to #572, which prevents a new array creation at each insertion

  • Fixed a bug where typo tolerance inadvertently disabled prefix search (#580)

Finally, A Word on Performance

We assert that Orama is the fastest AI Search and Answer Engine. But what makes us confident in this claim?

  1. Orama can execute hybrid and vector searches in the browser, which are crazy fast.

  2. Orama Native Embeddings are twice as fast for our use case as OpenAI's, which was already the quickest embedding-generation service in benchmarks we ran a few weeks ago. We generate embeddings at the edge, which minimizes network latency, and the resulting embeddings are four times smaller, making them quicker to transfer, index, analyze, and search.

  3. Orama has the capability to execute aggressive and smart caching on the browser, CDN, and AI interactions.

  4. Orama Cloud integrates these elements to deliver a hybrid search experience. It typically returns full hybrid and vector search results in 250 to 350ms (when using Orama Native Embeddings), and full-text results in 60 to 120ms, depending on the index size and user location. Orama Open Source, when run on a browser, is even faster.

In conclusion

Orama 2.0 introduces major features including hybrid search, stable Geosearch APIs, Secure AI Proxy, and Orama Cloud enhancements. Hybrid search combines full-text and vector search for optimized results. Geosearch APIs are now stable and ready for production use. Secure AI Proxy is a nano-service that protects your API key and applies rate limiting.

Orama Cloud now supports larger index sizes, WebHook APIs, and automatic embedding generation. Other updates include a new plugin system, Vitepress Plugin, a rewrite of the Docusaurus Plugin, and faster number indexing and filtering.

We're excited to see what you'll create with Orama Cloud.

Run unlimited full-text, vector, and hybrid search queries at the edge, for free!