Embedding-based Search Capabilities with Elasticsearch and KNN
In this article, we’ll set up Elasticsearch with Docker, connect to it from Python, and run k-nearest-neighbor (KNN) searches over embedding vectors. Get ready to enhance your search capabilities and improve your applications!
Section 1: Setting up Elasticsearch with Docker: Let’s start by setting up Elasticsearch using Docker, ensuring a smooth and hassle-free installation. Execute the following commands:
docker pull docker.elastic.co/elasticsearch/elasticsearch:8.7.1
docker network create elastic
docker run --name es01 --net elastic -p 9200:9200 -it docker.elastic.co/elasticsearch/elasticsearch:8.7.1
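Section 2: Installing the Elastiknn Plugin (Optional): The KNN search used in the rest of this article is built into Elasticsearch 8.x, which indexes dense_vector fields with HNSW, so no plugin is required for it. If you also want to experiment with the Elastiknn plugin, which adds its own vector field types and nearest-neighbor queries, run the following command inside the container shell and restart Elasticsearch afterwards so the plugin is loaded: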
./bin/elasticsearch-plugin install https://github.com/alexklibisz/elastiknn/releases/download/8.7.1.0/elastiknn-8.7.1.0.zip
Section 3: Python Integration with Elasticsearch: To connect Elasticsearch with our Python code, we need to install the Elasticsearch Python client:
pip install elasticsearch
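Next, let’s establish a connection to Elasticsearch using the Elasticsearch client. Replace the placeholder credentials with your own; on first startup, Elasticsearch 8 prints a generated password for the elastic user in the container output:
from elasticsearch import Elasticsearch
# Placeholder credentials: substitute your own user and password.
# verify_certs=False skips TLS certificate checks (local experiments only).
es = Elasticsearch(hosts="https://localhost:9200", basic_auth=('user', 'pass'), verify_certs=False)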
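Elasticsearch can take a little while to become reachable after the container starts, so a request sent right away may fail. Here is a minimal sketch of a readiness check, assuming the client's ping() method, which returns True or False rather than raising. The helper name wait_for_elasticsearch and its attempts and delay_seconds parameters are illustrative and not part of the client:

```python
import time


def wait_for_elasticsearch(client, attempts=30, delay_seconds=2.0):
    """Poll client.ping() until it succeeds or the attempts run out.

    `client` is any object with a ping() method that returns a bool,
    such as the `es` client created above. Hypothetical helper.
    """
    for attempt in range(1, attempts + 1):
        if client.ping():
            return True
        # Sleep only if another attempt is still to come.
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False
```

Calling wait_for_elasticsearch(es) before creating the index is one way to use it. Keep in mind that ping() also returns False when authentication fails, so a result that stays False may point to wrong credentials rather than a node that is still starting.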
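Section 4: Creating an Index for KNN Search: KNN search needs a dense_vector field that is indexed with a similarity metric. Let’s create an index with a 5-dimensional product-vector field that is compared by L2 (Euclidean) distance, plus a numeric price field, and then index a couple of sample documents: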
index_name = 'my_index'
mapping_settings = {
    "mappings": {
        "properties": {
            "product-vector": {
                "type": "dense_vector",
                "index": True,
                "similarity": "l2_norm",
                "dims": 5,
            },
            "price": {
                "type": "long"
            }
        }
    }
}
es.indices.create(index=index_name, body=mapping_settings)
documents = [
    {
        'product-vector': [230.0, 300.33, -34.8988, 15.555, -200.0],
        'price': 199
    },
    {
        'product-vector': [-0.5, 100.0, -13.0, 14.8, -156.0],
        'price': 1599,
    },
    # Add more documents with embeddings
]
for doc in documents:
    es.index(index=index_name, body=doc)
# Refresh so the new documents are visible to the search that follows.
es.indices.refresh(index=index_name)
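Indexing one document per request gets slow as the collection grows, and Elasticsearch rejects a vector whose length differs from the dims value in the mapping. Below is a hedged sketch of preparing actions for the Python client's helpers.bulk function while checking vector lengths up front. The name build_bulk_actions is hypothetical, and the default field name and dimension count are simply taken from the mapping above:

```python
def build_bulk_actions(index_name, documents, vector_field="product-vector", dims=5):
    """Yield one bulk action per document, checking the vector length first.

    Hypothetical helper: the defaults mirror the mapping used in this article.
    """
    for position, doc in enumerate(documents):
        vector = doc.get(vector_field)
        if vector is None or len(vector) != dims:
            raise ValueError(
                f"document {position}: '{vector_field}' must have {dims} values"
            )
        # "_index" and "_source" are the keys helpers.bulk reads from each action.
        yield {"_index": index_name, "_source": doc}


# Usage with the client's bulk helper (needs a running cluster):
# from elasticsearch import helpers
# helpers.bulk(es, build_bulk_actions(index_name, documents))
```

Because the function is a generator, the documents are validated lazily as the bulk helper consumes them, so a bad vector raises before that document is sent.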
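Section 5: Perform KNN Search: Now, let’s use KNN search to retrieve the documents closest to a query embedding. In the query below, k is the number of nearest neighbors to return and num_candidates is the number of candidates considered on each shard before the top k are selected. The query vector must have the same number of dimensions as the indexed field: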
query_embedding = [0.2, 0.3, 0.4, 5, 6]
search_query = {
    "knn": {
        "field": "product-vector",
        "query_vector": query_embedding,
        "k": 10,
        "num_candidates": 100
    }
}
results = es.search(index=index_name, body=search_query)
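The search response keeps the matches under results["hits"]["hits"], each with an _id, a _score, and the stored _source. For a field mapped with "similarity": "l2_norm", Elasticsearch documents the score as 1 / (1 + distance^2), so the Euclidean distance can be recovered from it. Here is a small sketch of reading the hits this way. The function name summarize_hits is hypothetical, and sample_response is a made-up fragment that only imitates the shape of a real response:

```python
import math


def summarize_hits(response, price_field="price"):
    """Return (doc_id, l2_distance, price) tuples from a search response.

    Assumes the vector field uses "similarity": "l2_norm", for which
    _score = 1 / (1 + distance ** 2). Hypothetical helper.
    """
    rows = []
    for hit in response["hits"]["hits"]:
        score = hit["_score"]
        # Invert the l2_norm scoring formula; max() guards against tiny
        # negative values caused by floating-point rounding.
        distance = math.sqrt(max(0.0, 1.0 / score - 1.0))
        rows.append((hit["_id"], distance, hit["_source"].get(price_field)))
    return rows


# Made-up fragment shaped like the "hits" part of a search response.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "a", "_score": 0.2, "_source": {"price": 1599}},
            {"_id": "b", "_score": 0.1, "_source": {"price": 199}},
        ]
    }
}

for doc_id, distance, price in summarize_hits(sample_response):
    print(doc_id, round(distance, 3), price)
```

With a live cluster, summarize_hits(results) would be called on the value returned by es.search. Higher scores correspond to smaller distances, so the hits arrive closest first.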
Section 6: Index Maintenance: To keep your Elasticsearch setup lean, you can delete the index when it’s no longer needed. Deleting an index permanently removes its documents and mappings:
es.indices.delete(index=index_name)
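Deleting an index that does not exist raises a NotFoundError in the Python client, which is easy to hit when a setup script is rerun. One possible guard, sketched under the assumption that the client exposes indices.exists and indices.delete as used above, is shown below. The helper name delete_index_if_exists is hypothetical:

```python
def delete_index_if_exists(client, index_name):
    """Delete index_name if it is present.

    Returns True when an index was deleted and False when there was
    nothing to delete. Hypothetical helper around the client's
    indices.exists and indices.delete calls.
    """
    if client.indices.exists(index=index_name):
        client.indices.delete(index=index_name)
        return True
    return False
```

Calling delete_index_if_exists(es, index_name) at the start of a script gives each run a clean index without failing on the first run.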
Congratulations!
You’ve started using Elasticsearch and KNN search! By connecting Elasticsearch with Python, you can now add search features to your apps. Explore the possibilities of Elasticsearch and KNN to improve your user experience.
Don’t forget to try new things and adjust your setup to fit your needs. Happy searching!