Customize data ingestion using Pydantic
Difficulty: Medium
Overview
cognee lets you organize and model your user’s data for LLMs to use, so you can choose to load only the data you need. Say you need every person mentioned in a novel. cognee enables you to:
- Specify which persons you want extracted
- Load them into the cognee data store
- Retrieve them with a natural-language query
Let’s try it out!
Let’s model your data based on your preferences
Why is this important? Let’s compare our data before and after modeling.
In the image, the purple nodes are exactly the nodes that represent people mentioned in the novel.
Let’s create the graph ourselves.
Step 1: Clone the cognee repo
git clone https://github.com/topoteretes/cognee.git
Then clone our getting-started repo
git clone https://github.com/topoteretes/cognee-starter.git
Step 2: Install with Poetry
Navigate to the cognee repo
cd cognee
Install the dependencies with Poetry
poetry install
Step 3: Use example from our starter repo
Create a Python script called example_ontology.py and copy the contents of the following file into it:
https://github.com/topoteretes/cognee-starter/blob/main/src/pipelines/custom-model.py
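The linked file is the authoritative version of the custom model. As a rough idea of what "modeling persons with Pydantic" can look like, here is a minimal sketch that uses only plain pydantic.BaseModel. The class names (Person, People), the optional role field, and the sample names are assumptions made for illustration; they are not taken from the cognee-starter script, which may use cognee's own base classes instead.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    """One person mentioned in the novel (hypothetical fields)."""

    name: str
    role: Optional[str] = None  # illustrative extra attribute, not required


class People(BaseModel):
    """Container describing the shape we want extracted: a list of persons."""

    persons: List[Person]


# Validate dict-shaped data against the model.
raw = {
    "persons": [
        {"name": "Elizabeth Bennet", "role": "protagonist"},
        {"name": "Mr. Darcy"},
    ]
}
people = People(**raw)
names = [person.name for person in people.persons]

# Data that does not match the model is rejected.
try:
    People(**{"persons": [{"role": "narrator"}]})  # required "name" is missing
    rejected = False
except ValidationError:
    rejected = True
```

The point of the model is that it states up front which entities and fields you care about, so anything that does not fit the declared shape fails validation instead of silently entering the data store.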
Step 4: Run the script
Run the script with Python:
python example_ontology.py
Make sure that the script has access to the data in the cognee-starter repo
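One way to check this before running the script is a small pre-flight helper such as the sketch below. The helper name missing_paths is hypothetical, and the paths you pass to it depend on where you cloned cognee-starter and which files the script reads; neither is specified here.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.exists()]
```

For example, if both repos were cloned side by side as in Step 1 and you run the script from inside the cognee folder, missing_paths(["../cognee-starter"]) should return an empty list.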
Step 5: Inspect your graph
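The script creates an HTML file in the root folder that you can open to inspect the graph. You can also render your semantic layer yourself and open it in the browser with the snippet below. It uses top-level await and display, so it is meant to run in a notebook (Jupyter/IPython) rather than as a plain Python script: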
import webbrowser
import os
from cognee.api.v1.visualize.visualize import visualize_graph
html = await visualize_graph()
home_dir = os.path.expanduser("~")
html_file = os.path.join(home_dir, "graph_visualization.html")
display(html_file)
webbrowser.open(f"file://{html_file}")
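If you would rather view the file through a small local HTTP server than open it from disk, the standard library is enough. The sketch below is an assumption about how you might do that, not part of cognee: serve_directory is a hypothetical helper, and it binds to 127.0.0.1 only so the folder is not exposed to other machines.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on localhost in a background thread.

    port=0 lets the operating system pick a free port; read the chosen
    port from server.server_address[1]. Call server.shutdown() to stop.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With the HTML file in your home folder as in the snippet above, server = serve_directory(os.path.expanduser("~"), port=8000) would make it reachable at http://127.0.0.1:8000/graph_visualization.html for as long as the process keeps running. Note that this serves every file in that folder, so pointing it at a dedicated output directory is the safer choice.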