🚀 How to Use Remote Models

First and foremost, make sure to keep your API keys secure and never commit them to version control.

OpenAI

This is the easiest setup: technically, you only need to set a single variable. If your .env contains only LLM_API_KEY, it is treated as an OpenAI key by default.

LLM_API_KEY = "your_api_key"
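If you want to load the key in code before starting cognee, a minimal sketch could look like the following (the `load_env` helper is illustrative, not part of cognee; in practice a library such as python-dotenv does the same job):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY="VALUE" lines from a .env file into os.environ.
    Blank lines, comments, and lines without '=' are skipped."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"').strip("'")

# Usage: call load_env() before using cognee, then verify the key is set:
# load_env()
# assert os.environ.get("LLM_API_KEY"), "LLM_API_KEY is not set"
```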

Azure OpenAI

You can use Azure’s OpenAI models for both LLM and embeddings. Set the following environment variables in your .env file:

LLM_PROVIDER=openai
LLM_MODEL=azure/gpt-4o-mini
LLM_ENDPOINT=https://...
LLM_API_KEY="your_api_key"
LLM_API_VERSION=2024-08 ## (different when not mini model)
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=azure/text-embedding-3-large
EMBEDDING_ENDPOINT=https://...
EMBEDDING_API_KEY="your_api_key"
EMBEDDING_API_VERSION=2023-05-15

For example, your model endpoint should look like this:

https://.......-swedencentral.cognitiveservices.azure.com/openai/deployments/gpt-4o-mini
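The endpoint is assembled from your Azure resource name and deployment name. A small sketch of that pattern (the helper and the example names are hypothetical placeholders, not values from this guide):

```python
def azure_openai_endpoint(resource: str, deployment: str) -> str:
    """Build an Azure OpenAI deployment endpoint in the form shown above."""
    return (
        f"https://{resource}.cognitiveservices.azure.com"
        f"/openai/deployments/{deployment}"
    )

# e.g. azure_openai_endpoint("my-resource-swedencentral", "gpt-4o-mini")
```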

Google Gemini

You can use Google’s Gemini models for both LLM and embeddings. Set the following environment variables in your .env file:

For LLM:

LLM_PROVIDER="gemini"
LLM_API_KEY="your_api_key"
LLM_MODEL="gemini/gemini-1.5-flash"
LLM_ENDPOINT="https://generativelanguage.googleapis.com/"
LLM_API_VERSION="v1beta"

For embeddings:

EMBEDDING_PROVIDER="gemini"
EMBEDDING_API_KEY="your_api_key"
EMBEDDING_MODEL="gemini/text-embedding-004"
EMBEDDING_ENDPOINT="https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004"
EMBEDDING_API_VERSION="v1beta"
EMBEDDING_DIMENSIONS=768
EMBEDDING_MAX_TOKENS=8076
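Because the embedding setup needs several variables to be present at once, a quick pre-flight check can save a confusing startup error. A minimal sketch (the variable list and helper are illustrative, not a cognee API):

```python
import os

REQUIRED_EMBEDDING_VARS = [
    "EMBEDDING_PROVIDER",
    "EMBEDDING_API_KEY",
    "EMBEDDING_MODEL",
    "EMBEDDING_ENDPOINT",
    "EMBEDDING_API_VERSION",
]

def missing_embedding_vars(env=None):
    """Return the names of required embedding variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_EMBEDDING_VARS if not env.get(name)]

# Usage: call missing_embedding_vars() after loading your .env;
# an empty list means all required variables are set.
```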

Custom endpoints

Anyscale

LLM_PROVIDER="custom"
LLM_MODEL="anyscale/mistralai/Mixtral-8x7B-Instruct-v0.1"
LLM_ENDPOINT="https://api.endpoints.anyscale.com/v1"
LLM_API_KEY="your_api_key"

Deep Infra

LLM_PROVIDER="custom"
LLM_API_KEY="your_api_key"
LLM_MODEL="deepinfra/meta-llama/Meta-Llama-3-8B-Instruct"
LLM_ENDPOINT="https://api.deepinfra.com/v1/openai"

As of this writing, you cannot use embedding models from Deep Infra with cognee.

OpenRouter

You will need an API key set up with OpenRouter.

LLM_PROVIDER="custom"
LLM_API_KEY="your_api_key"
LLM_MODEL="openrouter/google/gemini-2.0-flash-thinking-exp-1219:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"

As of this writing, OpenRouter does not offer embedding models.
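The three custom-provider configurations differ only in their values, so they can be kept as presets and applied in code. A sketch under that assumption (the preset dictionary and `apply_preset` helper are illustrative, not part of cognee):

```python
import os

# Presets mirror the custom-provider values shown in this guide.
CUSTOM_PRESETS = {
    "anyscale": {
        "LLM_MODEL": "anyscale/mistralai/Mixtral-8x7B-Instruct-v0.1",
        "LLM_ENDPOINT": "https://api.endpoints.anyscale.com/v1",
    },
    "deepinfra": {
        "LLM_MODEL": "deepinfra/meta-llama/Meta-Llama-3-8B-Instruct",
        "LLM_ENDPOINT": "https://api.deepinfra.com/v1/openai",
    },
    "openrouter": {
        "LLM_MODEL": "openrouter/google/gemini-2.0-flash-thinking-exp-1219:free",
        "LLM_ENDPOINT": "https://openrouter.ai/api/v1",
    },
}

def apply_preset(name: str, api_key: str) -> None:
    """Set the LLM_* environment variables for one of the custom providers."""
    os.environ["LLM_PROVIDER"] = "custom"
    os.environ["LLM_API_KEY"] = api_key
    os.environ.update(CUSTOM_PRESETS[name])
```

Keep the API key itself out of source code; read it from a secure location rather than hard-coding it.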

For a list of providers you can use with cognee as custom providers, check out the LiteLLM documentation.