* Update docs for new requirements.txt path Signed-off-by: Marcus Köhler <khler.marcus@gmail.com> * Fix typo (.PONY -> .PHONY) in python backend makefiles Signed-off-by: Marcus Köhler <khler.marcus@gmail.com> --------- Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
3.4 KiB
+++ disableToc = false title = "🧠 Embeddings" weight = 2 +++
LocalAI supports generating embeddings for text or list of tokens.
For the API documentation you can refer to the OpenAI docs: https://platform.openai.com/docs/api-reference/embeddings
Model compatibility
The embedding endpoint is compatible with llama.cpp models, bert.cpp models and sentence-transformers models available in huggingface.
Manual Setup
Create a YAML config file in the models directory. Specify the backend and the model file.
name: text-embedding-ada-002 # The model name used in the API
parameters:
model: <model_file>
backend: "<backend>"
embeddings: true
# .. other parameters
Bert embeddings
To use bert.cpp models you can use the bert embedding backend.
An example model config file:
name: text-embedding-ada-002
parameters:
model: bert
backend: bert-embeddings
embeddings: true
# .. other parameters
The bert backend uses bert.cpp and uses ggml models.
For instance you can download the ggml quantized version of all-MiniLM-L6-v2 from https://huggingface.co/skeskinen/ggml:
wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
To test locally (LocalAI server running on localhost),
you can use curl (and jq at the end to prettify):
curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{
"input": "Your text string goes here",
"model": "text-embedding-ada-002"
}' | jq "."
Huggingface embeddings
To use sentence-transformers and models in huggingface you can use the sentencetransformers embedding backend.
name: text-embedding-ada-002
backend: sentencetransformers
embeddings: true
parameters:
model: all-MiniLM-L6-v2
The sentencetransformers backend uses Python sentence-transformers. For a list of all pre-trained models available see here: https://github.com/UKPLab/sentence-transformers#pre-trained-models
{{% notice note %}}
- The
sentencetransformersbackend is an optional backend of LocalAI and uses Python. If you are runningLocalAIfrom the containers you are good to go and should be already configured for use. - If you are running
LocalAImanually you must install the python dependencies (make prepare-extra-conda-environments). This requirescondato be installed. - For local execution, you also have to specify the extra backend in the
EXTERNAL_GRPC_BACKENDSenvironment variable.- Example:
EXTERNAL_GRPC_BACKENDS="sentencetransformers:/path/to/LocalAI/backend/python/sentencetransformers/sentencetransformers.py"
- Example:
- The
sentencetransformersbackend does support only embeddings of text, and not of tokens. If you need to embed tokens you can use thebertbackend orllama.cpp. - No models are required to be downloaded before using the
sentencetransformersbackend. The models will be downloaded automatically the first time the API is used.
{{% /notice %}}
Llama.cpp embeddings
Embeddings with llama.cpp are supported with the llama backend.
name: my-awesome-model
backend: llama
embeddings: true
parameters:
model: ggml-file.bin
# ...
💡 Examples
- Example that uses LLamaIndex and LocalAI as embedding: here.