A vector index is a specialized data store that is optimized for storing and searching high-dimensional vectors. Instead of searching through row- or column-oriented data as seen in relational databases, vector indexes are designed to calculate the similarity between vectors. They are particularly useful for tasks like text search, image search, recommendations, image and voice recognition, and more. Vector search comes into play in any application where data can be represented as vectors and similarity search is important. For a more detailed introduction to vector indexes, see Kirk Kirkonnell’s article about vector indexes. For more in-depth information on how they work and how to integrate them into machine learning workflows, see Michael Landis’ talk on vector searching.
Momento Vector Index provides a fast and easy way to get started with vector indexes. In this article, we’ll make a simple Python program that sets up a vector index, adds data to it, and searches that data.
Momento API Key
To use Momento Vector Index you will need a Super User key for the AWS us-west-2 region. Go to the Momento Console and follow the instructions to log in with your email address, Google account, or GitHub account.
In the console, select the API Keys menu option.
Once on the API key page, select the information that matches where your caches live:
- Cloud provider: AWS
- Region: us-west-2
- Key Type: Super User
- (Optional) Expiration date
The Momento Python SDK
You’ll need to install the Momento SDK package to use it in your program. It is available on PyPI: https://pypi.org/project/momento/. If you’re using pip, run pip install momento.
Writing your first Momento Vector Index program
Here is the basic Momento Vector Index example we’ll be going over. It creates an index, lists all indexes, uploads data, searches for that data, and finally deletes the index.
After installing the Momento dependency we can run the program and get the following output:
In the next section, we’ll explain how this output was produced.
The first thing we need to do is create a vector index client:
A vector index client, like all Momento clients, requires a configuration object and an auth provider. The auth provider loads and parses your Momento API key. It can either load from an environment variable or directly from a string.
The configuration contains settings for the underlying gRPC client, as well as any customer-configurable features. To avoid needing to create one every time, we provide prebuilt configurations. VectorIndexConfigurations.Default.latest() is the newest version of the Default configuration. We version the configurations for backwards compatibility, so changes to a configuration won’t affect customer deployments. latest will always point to the newest version.
Since the client is an asynchronous context manager, we can use async with to automatically clean up the client when execution leaves the async with block.
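To make the cleanup behavior concrete, here is a minimal sketch of how an asynchronous context manager works in plain Python. SketchClient is a hypothetical stand-in written for this illustration, not the Momento client class; it only records when it is opened and closed.

```python
import asyncio


class SketchClient:
    """Hypothetical stand-in for a client; NOT the Momento SDK class."""

    def __init__(self) -> None:
        self.is_open = False
        self.events = []

    async def __aenter__(self) -> "SketchClient":
        # Runs when the `async with` block is entered.
        self.is_open = True
        self.events.append("opened")
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Runs when the block exits, whether normally or through an exception.
        self.is_open = False
        self.events.append("closed")
        return False  # do not suppress exceptions raised inside the block


async def main() -> SketchClient:
    async with SketchClient() as client:
        client.events.append("working")
    # The block has exited, so __aexit__ has already run.
    return client


client = asyncio.run(main())
print(client.events)  # ['opened', 'working', 'closed']
```

The point of the pattern is that the cleanup step cannot be forgotten: it runs even if the code inside the block fails.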
Now that we have a client, we can create an index:
The create_index function takes three arguments:
- index_name: the name of the index
- num_dimensions: the number of dimensions in the index
- similarity_metric: the metric that will be used to compare vectors in a search
For this example we are creating an index with 2 dimensions. Since a dimension represents a feature or attribute of a piece of complex data, a real-world index may have hundreds. We use only two here to make it easier to visualize how the index compares vectors when searching. We’re also using cosine similarity as our similarity metric. It compares the angles between vectors, producing a score between -1 and 1. It is the default choice when setting up a Momento vector index.
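For readers who want to see the metric itself, the following is a small sketch of cosine similarity in plain Python. It is an illustration of the standard formula (dot product divided by the product of the vector lengths), not the code the service runs, and the function name is our own.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (illustrative sketch only)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, different lengths: only the angle matters.
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
# Perpendicular vectors.
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 3.0])
# Opposite directions.
opposite = cosine_similarity([1.0, 0.0], [-4.0, 0.0])
print(same_direction, perpendicular, opposite)  # 1.0 0.0 -1.0
```

Because the result depends only on direction, two vectors of very different lengths can still be a perfect match.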
This function illustrates the error handling pattern that the Momento APIs use. A client method should never throw an exception. Instead, it returns a response object that represents different types of call results. Here it can be Success if the index is created, IndexAlreadyExists if there was already an index by that name, or Error if the call failed. All Momento calls can return an error object containing details about the specific failure.
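The response-object pattern can be sketched without the SDK. The classes and the create_index_sketch function below are hypothetical stand-ins that only mirror the three outcomes named above; the real response types live in the Momento SDK and may carry more detail.

```python
from dataclasses import dataclass
from typing import Set, Union


# Hypothetical stand-ins for the three outcomes described above.
@dataclass
class Success:
    pass


@dataclass
class IndexAlreadyExists:
    pass


@dataclass
class Error:
    message: str


CreateIndexResponse = Union[Success, IndexAlreadyExists, Error]


def create_index_sketch(
    existing: Set[str], index_name: str, num_dimensions: int
) -> CreateIndexResponse:
    """Toy version of a create call: reports every outcome as a return value."""
    if num_dimensions < 1:
        return Error("num_dimensions must be at least 1")
    if index_name in existing:
        return IndexAlreadyExists()
    existing.add(index_name)
    return Success()


def describe(response: CreateIndexResponse) -> str:
    # The caller branches on the response type instead of catching exceptions.
    if isinstance(response, Success):
        return "created"
    if isinstance(response, IndexAlreadyExists):
        return "already exists"
    return "failed: " + response.message


indexes: Set[str] = set()
first = describe(create_index_sketch(indexes, "hello", 2))
second = describe(create_index_sketch(indexes, "hello", 2))
third = describe(create_index_sketch(indexes, "bad", 0))
print(first, "|", second, "|", third)
```

One benefit of this style is that the "already exists" case is an ordinary branch, so a setup script can be rerun safely without special exception handling.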
Now that we have created an index, we can see it by listing all indexes:
The list_indexes function takes no arguments and returns a Success object with a list of all index names in your account in your region. This function is useful if your code doesn’t use a hard-coded index and needs to look one up or if you need to keep track of your index count to make sure you aren’t creating too many.
We can now add vectors to our new index:
The upsert_item_batch function takes in an index name and a list of items representing vectors and inserts them into the index, replacing any existing vectors with matching IDs. An item contains a unique ID, a vector matching the dimensionality of the index, and optional metadata. The metadata keys must be strings but the values can be strings, ints, floats, booleans, or lists of strings.
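The upsert rules above can be modeled with a dictionary keyed by item ID. This is a toy in-memory sketch, not the SDK call: upsert_item_batch_sketch, the item dictionaries, and the IDs "a" and "b" are all made up for illustration, and validating the whole batch before writing is a choice of this sketch rather than a documented behavior of the service.

```python
from typing import Any, Dict, List

ALLOWED_SCALARS = (str, int, float, bool)


def is_valid_metadata_value(value: Any) -> bool:
    """True for a string, int, float, bool, or a list of strings."""
    if isinstance(value, ALLOWED_SCALARS):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def upsert_item_batch_sketch(
    index: Dict[str, Dict[str, Any]],
    num_dimensions: int,
    items: List[Dict[str, Any]],
) -> None:
    """Insert items, replacing any stored item that has the same ID."""
    # Check the whole batch first so a bad item leaves the index untouched.
    for item in items:
        if len(item["vector"]) != num_dimensions:
            raise ValueError("vector does not match the index dimensionality")
        for key, value in item.get("metadata", {}).items():
            if not isinstance(key, str) or not is_valid_metadata_value(value):
                raise ValueError("unsupported metadata: %r" % key)
    for item in items:
        index[item["id"]] = item


index: Dict[str, Dict[str, Any]] = {}
upsert_item_batch_sketch(index, 2, [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"color": "red"}},
    {"id": "b", "vector": [0.0, 1.0]},
])
# A second upsert with ID "a" replaces the first item instead of adding a third.
upsert_item_batch_sketch(index, 2, [
    {"id": "a", "vector": [0.5, 0.5], "metadata": {"tags": ["x", "y"]}},
])
print(len(index), index["a"]["vector"])  # 2 [0.5, 0.5]
```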
We uploaded the vectors [-1.0, 0.0], [0.0, 1.0], [0.5, 0.5], and [1.0, 0.0]. Since they are two dimensional, we can visualize them:
You can see how we could compare a query vector to these by the difference in their angles.
Now that our index contains data, we can search for it:
The search function takes an index name, a query vector matching the dimensionality of the index, an optional top_k argument representing the number of results to return, and an optional metadata_fields argument representing which metadata you want returned. We’re using the ALL_METADATA sentinel value here, meaning that all metadata should be returned, but you could also supply a list of fields, e.g. metadata_fields=["key1", "key3"] to specify that only metadata matching those field names should be returned. If metadata_fields is not specified, no metadata is returned.
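The three metadata_fields cases described above can be sketched as a small selection function. The ALL_METADATA object and select_metadata below are local stand-ins defined for this illustration; they imitate the described behavior and are not imported from the SDK.

```python
from typing import Any, Dict, Optional, Sequence, Union

# Local stand-in for the sentinel; a unique object that cannot be confused
# with a real list of field names.
ALL_METADATA = object()


def select_metadata(
    metadata: Dict[str, Any],
    metadata_fields: Optional[Union[Sequence[str], object]] = None,
) -> Dict[str, Any]:
    """Return the part of an item's metadata that a search would include."""
    if metadata_fields is None:
        return {}  # not specified: no metadata is returned
    if metadata_fields is ALL_METADATA:
        return dict(metadata)  # sentinel: everything is returned
    wanted = set(metadata_fields)
    return {key: value for key, value in metadata.items() if key in wanted}


stored = {"key1": "a", "key2": "b", "key3": "c"}
print(select_metadata(stored))                    # {}
print(select_metadata(stored, ALL_METADATA))      # all three keys
print(select_metadata(stored, ["key1", "key3"]))  # {'key1': 'a', 'key3': 'c'}
```

Requesting only the fields you need keeps responses small, which matters more as items carry more metadata.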
A Success search response contains a list of search hits. Each hit contains the ID of the vector, the score, i.e. the similarity of that vector to the search vector (from -1 to 1 for cosine similarity), and any requested metadata. Here is an example from our program’s output:
Matches are returned ordered by score. We searched for the vector [1.0, 0.0], so item_2, which matches that vector exactly, has a score of 1.0. item_3, with a vector [-1.0, 0.0], is the exact opposite, and has a score of -1.0. item_1, which has a vector orthogonal to our search vector, has a score of 0.0. Higher dimensional vectors cannot be easily visualized, but the pattern is the same: the more closely a vector matches the search vector, the closer its score will be to 1.0. The more a vector matches the opposite of the search vector, the closer its score will be to -1.0.
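These scores can be checked by hand. The sketch below recomputes cosine similarity locally for the four uploaded vectors against the query [1.0, 0.0] and sorts them from highest to lowest score. It is a local calculation for verification, not a call to the service, and it labels results by vector rather than by item ID.

```python
import math


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


vectors = [[-1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
query = [1.0, 0.0]

# Pair each vector with its rounded score, then sort with the best match first.
ranked = sorted(
    ((round(cosine_similarity(query, v), 4), v) for v in vectors),
    reverse=True,
)
for score, vector in ranked:
    print(score, vector)
# 1.0 [1.0, 0.0]
# 0.7071 [0.5, 0.5]
# 0.0 [0.0, 1.0]
# -1.0 [-1.0, 0.0]
```

The vector [0.5, 0.5] sits at 45 degrees from the query, so its score is the cosine of 45 degrees, roughly 0.7071.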
Finally, we delete the index to clean up after the example:
The delete_index function takes an index name and deletes that index and all data in it. It returns Success if the deletion is successful or if there is no index by that name to delete.
Ready to start?
At Momento, we approach our services with simplicity in mind. We aim to give you the fastest, easiest developer experience on the market so you can focus on solving the problems that you actually care about. Momento Vector Index is a fully serverless vector index, and can be used to its fullest extent with only five API calls.