
NanoPQ (Product Quantization)

Product Quantization (PQ) is a quantization algorithm that compresses database vectors, which makes k-NN semantic search practical on large datasets. In a nutshell, each embedding is split into M subspaces, and the sub-vectors in each subspace are clustered. After clustering, every sub-vector is represented by the centroid of the cluster it belongs to within its subspace, so a full embedding is stored as M centroid indices.
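The steps above can be sketched in plain Python. This is a minimal illustration of the idea only, not how nanopq implements it: the helper names (`split`, `kmeans`, `train`, `encode`, `approx_sq_dist`), the toy data, and the choice of M=4 subspaces with 4 centroids each are all assumptions made for the example.

```python
import random

def split(vec, m):
    """Split a vector into m equal-length sub-vectors."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[nearest].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def train(vectors, m, k):
    """Learn one codebook (k centroids) per subspace."""
    parts = [split(v, m) for v in vectors]
    return [kmeans([p[j] for p in parts], k) for j in range(m)]

def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    return [
        min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
        for sub, cb in zip(split(vec, len(codebooks)), codebooks)
    ]

def approx_sq_dist(query, code, codebooks):
    """Compare exact query sub-vectors against the stored centroids."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(s, cb[i]) for s, cb, i in zip(subs, codebooks, code))

# Toy database: 50 random 8-dimensional vectors (made-up data).
rng = random.Random(42)
database = [[rng.random() for _ in range(8)] for _ in range(50)]

codebooks = train(database, m=4, k=4)              # 4 subspaces, 4 centroids each
codes = [encode(v, codebooks) for v in database]   # each vector becomes 4 small ints

query = database[0]
ranked = sorted(
    range(len(codes)),
    key=lambda i: approx_sq_dist(query, codes[i], codebooks),
)
print(ranked[:3])  # indices of the three closest compressed vectors
```

Each database vector is stored as four small integers instead of eight floats, and a search only compares the query against centroids. The ranking is approximate: vectors that share the same code are indistinguishable to the search.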

This notebook goes over how to use a retriever that uses Product Quantization under the hood, as implemented by the nanopq package.

%pip install -qU langchain-community langchain-openai nanopq
from langchain_community.retrievers import NanoPQRetriever
from langchain_openai import OpenAIEmbeddings

Create New Retriever with Texts

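retriever = NanoPQRetriever.from_texts(
    ["foo", "bar", "world", "hello", "foo bar"],
    OpenAIEmbeddings(),
    clusters=2,  # assumed keyword; must be below the number of texts
    subspace=2,  # assumed keyword; number of sub-vectors per embedding
)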

Use Retriever

We can now use the retriever!

result = retriever.invoke("foo")
result
