🚀 How to Use Remote Models
First and foremost, make sure to keep your API keys secure and never commit them to version control.
OpenAI
This is the easiest setup; technically, you can start with a single variable. If your .env file contains only an LLM_API_KEY, it is treated as an OpenAI key by default.
LLM_API_KEY="your_api_key"
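To confirm the key is actually picked up before running cognee, here is a minimal sketch using python-dotenv (assuming your .env file sits in the working directory):

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

# With only LLM_API_KEY set, cognee treats it as an OpenAI key by default.
assert os.getenv("LLM_API_KEY"), "LLM_API_KEY is missing; check your .env file"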
Azure OpenAI
You can use Azure’s OpenAI models for both LLM and embeddings. Set the following environment variables in your .env file:
LLM_PROVIDER=openai
LLM_MODEL=azure/gpt-4o-mini
LLM_ENDPOINT=https://...
LLM_API_KEY="your_api_key"
LLM_API_VERSION=2024-08  # the API version may differ for models other than gpt-4o-mini
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=azure/text-embedding-3-large
EMBEDDING_ENDPOINT=https://...
EMBEDDING_API_KEY="your_api_key"
EMBEDDING_API_VERSION=2023-05-15
For example, your model endpoint should look like this:
https://.......-swedencentral.cognitiveservices.azure.com/openai/deployments/gpt-4o-mini
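The endpoint is your Azure resource host plus the deployment path. As a hypothetical illustration (the resource and deployment names below are placeholders, not real values), you can assemble it like this:

# Sketch: assemble an Azure OpenAI endpoint from its parts.
# "my-resource" and "gpt-4o-mini" are placeholder names.
resource_host = "my-resource-swedencentral.cognitiveservices.azure.com"
deployment = "gpt-4o-mini"  # must match your deployment name in Azure

llm_endpoint = f"https://{resource_host}/openai/deployments/{deployment}"
print(llm_endpoint)
# -> https://my-resource-swedencentral.cognitiveservices.azure.com/openai/deployments/gpt-4o-mini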
Google Gemini
You can use Google’s Gemini models for both LLM and embeddings. Set the following environment variables in your .env file:
For LLM:
LLM_PROVIDER="gemini"
LLM_API_KEY="your_api_key"
LLM_MODEL="gemini/gemini-1.5-flash"
LLM_ENDPOINT="https://generativelanguage.googleapis.com/"
LLM_API_VERSION="v1beta"
For embeddings:
EMBEDDING_PROVIDER="gemini"
EMBEDDING_API_KEY="your_api_key"
EMBEDDING_MODEL="gemini/text-embedding-004"
EMBEDDING_ENDPOINT="https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004"
EMBEDDING_API_VERSION="v1beta"
EMBEDDING_DIMENSIONS=768
EMBEDDING_MAX_TOKENS=8076
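If you want to sanity-check the key and the 768-dimension setting before wiring them into cognee, here is a hedged sketch that calls Google's public embedContent REST endpoint directly (request shape per the v1beta API; adjust if the API version changes):

import os
import requests

url = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/text-embedding-004:embedContent")
resp = requests.post(
    url,
    params={"key": os.environ["EMBEDDING_API_KEY"]},
    json={"content": {"parts": [{"text": "hello world"}]}},
    timeout=30,
)
resp.raise_for_status()
values = resp.json()["embedding"]["values"]
print(len(values))  # should print 768, matching EMBEDDING_DIMENSIONS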
Custom endpoints
Anyscale
LLM_PROVIDER="custom"
LLM_MODEL="anyscale/mistralai/Mixtral-8x7B-Instruct-v0.1"
LLM_ENDPOINT="https://api.endpoints.anyscale.com/v1"
LLM_API_KEY="your_api_key"
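These custom endpoints are OpenAI-compatible, so you can smoke-test the credentials outside cognee with the standard openai client. A sketch, assuming the endpoint follows the chat-completions convention; note that the provider prefix ("anyscale/") is a LiteLLM routing hint and is dropped when calling the endpoint directly:

import os
from openai import OpenAI  # pip install openai

# Point the standard OpenAI client at the custom endpoint.
client = OpenAI(
    base_url="https://api.endpoints.anyscale.com/v1",
    api_key=os.environ["LLM_API_KEY"],
)

response = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # bare model name, no prefix
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)

The same pattern applies to the Deep Infra and OpenRouter endpoints below.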
Deep Infra
LLM_PROVIDER="custom"
LLM_API_KEY="your_api_key"
LLM_MODEL="deepinfra/meta-llama/Meta-Llama-3-8B-Instruct"
LLM_ENDPOINT="https://api.deepinfra.com/v1/openai"
As of this writing, you cannot use embedding models from Deep Infra with cognee.
OpenRouter
You will need an API key set up with OpenRouter.
LLM_PROVIDER="custom"
LLM_API_KEY="your_api_key"
LLM_MODEL="openrouter/google/gemini-2.0-flash-thinking-exp-1219:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"
As of this writing, OpenRouter does not offer embedding models.
For a list of providers you can use with cognee as custom providers, check out the LiteLLM documentation.