Load Your Relational Database
Difficulty: Easy
Overview
Migrating your relational data to a graph can open up new ways to explore and search your data—with minimal code changes using cognee’s built-in ingestion tasks. Below is a simple tutorial on how to migrate a relational database into cognee’s graph engine. We’ll walk through the setup—using SQLite as the source relational database, and NetworkX as the graph database. At the end, we’ll cover how to switch to Postgres or another graph database (Neo4j or Kuzu).
cognee provides functionality to:
- Connect to a relational database (e.g. SQLite, Postgres).
- Extract the schema and data.
- Migrate tables and relationships into a graph structure.
- Store the new graph either in-memory (NetworkX) or in a more robust graph DB like Neo4j or Kuzu.
Once migrated, you can run queries or semantic searches (using the cognee.search
API) against your newly formed knowledge graph.
1. Prerequisites & Environment Setup
-
Clone the cognee project or have it locally
git clone https://github.com/topoteretes/cognee.git
-
Navigate to cognee & install required dependencies:
cd cognee poetry install
-
Prepare your
.env
for SQLite & NetworkXCopy
.env.template
and rename it to .env.# MIGRATION CONFIG MIGRATION_DB_PATH="/path/to/migration/directory" #replace as necessary MIGRATION_DB_NAME="migration_database.sqlite" #replace as necessary MIGRATION_DB_PROVIDER="sqlite" # GRAPH CONFIG GRAPH_DATABASE_PROVIDER=networkx
2. Prepare Your SQLite Database
- Download or create a SQLite
.db
file (for example, the Chinook database ). - Place it in your
MIGRATION_DB_PATH
. - Ensure you have the correct foreign key constraints (crucial for relationships to be recognized).
3. Run the example script for migration
Look at an example like relational_database_migration_example.py
here :
import asyncio
import cognee
import os
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.infrastructure.databases.relational import get_migration_relational_engine
from cognee.modules.search.types import SearchType
from cognee.tasks.ingestion import migrate_relational_database
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.databases.relational import (
create_db_and_tables as create_relational_db_and_tables,
)
from cognee.infrastructure.databases.vector.pgvector import (
create_db_and_tables as create_pgvector_db_and_tables,
)
async def main():
# 1) Connect to relational DB
engine = get_migration_relational_engine()
# 2) Prune existing data (if you want a fresh graph)
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
# 3) Create necessary tables in relational DB
await create_relational_db_and_tables()
await create_pgvector_db_and_tables()
# 4) Extract schema
schema = await engine.extract_schema()
print("Relational schema extracted:", schema)
# 5) Get Graph Engine, then migrate
graph = await get_graph_engine()
await migrate_relational_database(graph, schema=schema)
print("Relational -> Graph migration complete.")
# 6) Optional: Visualize the resulting graph
home_dir = os.path.expanduser("~")
destination_file_path = os.path.join(home_dir, "graph_visualization.html")
await visualize_graph(destination_file_path)
print(f"Visualization at {destination_file_path}")
# 7) Example search query
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What kind of data do you contain?"
)
print("Search results:", search_results)
# Expected outcome for trimmed Chinook data is something like:
# ['The data contains information about tracks and their media types. Specifically, there are several tracks (Track:34, Track:3, Track:4, Track:2, Track:5) linked to different media types (MediaType:1 and MediaType:2) through a relation identified as MediaTypeId.']
if __name__ == "__main__":
asyncio.run(main())
In your cognee directory, run:
python examples/python/relational_database_migration_example.py
Once this completes, your entire SQLite schema and data should be in the NetworkX graph. Open in browser.
Troubleshooting and Tips
- Foreign Keys: Only actual foreign key constraints in your relational DB become edges in the graph.
- Case Sensitivity: Note that entity naming is derived from your DB tables/rows.
- Environment Variables: Make sure
.env
is loaded or environment variables are otherwise set so thatget_migration_relational_engine()
andget_graph_engine()
can pick them up.
Using Postgres Instead
If you want to switch to Postgres as your relational DB:
-
Change your
.env
to:MIGRATION_DB_PROVIDER=postgres MIGRATION_DB_NAME=my_postgres_db MIGRATION_DB_HOST=127.0.0.1 MIGRATION_DB_PORT=5432 MIGRATION_DB_USERNAME=cognee MIGRATION_DB_PASSWORD=cognee # GRAPH still with networkx, or you can do neo4j/kuzu GRAPH_DATABASE_PROVIDER=networkx
-
Make sure your Postgres instance is running and that your user can connect to
my_postgres_db
. -
Run the same migration code. The main difference is MIGRATION_DB_PROVIDER=postgres, so
get_migration_relational_engine()
connects to your Postgres DB.
Everything else is the same. If you have the Chinook schema in Postgres, it will be migrated to the graph.
Using Neo4j or Kuzu as the Graph DB
If you want a more robust or persistent graph database, simply change:
GRAPH_DATABASE_PROVIDER=kuzu # or neo4j
# Below only needed if using neo4j
GRAPH_DATABASE_URL=
GRAPH_DATABASE_USERNAME=
GRAPH_DATABASE_PASSWORD=
Conclusion
- SQLite + NetworkX is the simplest configuration (no external DB servers needed).
- To switch to more production-level databases (Postgres → Neo4j, etc.), adjust
.env
accordingly. - The main migration code stays the same:
extract_schema()
, thenmigrate_relational_database()
.
With that, you should be able to ingest your own relational data and explore it in a graph format with cognee. Happy coding!