Create a knowledge graph from your codebase
Difficulty: Beginner
Overview
Cognee offers a simple way to build a code graph from your repositories. Once generated, this graph makes it easier to navigate and query your code.
Below, you’ll learn how to install cognee, analyze a codebase with the code graph pipeline, and run a code-based search query in just a few steps.
Step 1: Install cognee
pip install "cognee[codegraph]"
We begin by installing the cognee[codegraph] package, which bundles all the dependencies needed to generate and analyze code graphs. (The quotes around the package name keep shells such as zsh from treating the square brackets as a glob pattern.)
Step 2: Set Environment Variables
import os
os.environ["LLM_API_KEY"] = "sk-" # Replace with your actual API key
Remember to replace “sk-” with your actual key before running the notebook; OpenAI’s documentation explains how to create an API key. If you would rather use another provider, set that provider’s environment variables instead (the cognee docs include an example for Mistral).
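As a rough sketch, switching providers usually comes down to a handful of environment variables. The variable names below other than LLM_API_KEY are assumptions based on common cognee setups, so check the documentation for the exact names your provider requires:

```python
import os

# Hypothetical provider configuration -- variable names other than
# LLM_API_KEY are assumptions; consult the cognee docs for your provider.
os.environ["LLM_API_KEY"] = "your-provider-key"
os.environ["LLM_PROVIDER"] = "mistral"
os.environ["LLM_MODEL"] = "mistral-large-latest"
```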
Step 3: Clone a Repo to Analyze
git clone https://github.com/hande-k/simple-repo.git # Replace it with your preferred repo
repo_path = "/{path-to-your-repo}/simple-repo"
This sample repo will be used by cognee to build a code graph. Adjust repo_path if you cloned the repo to a different location.
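Before running the pipeline, it can help to confirm that the path actually points at a cloned repository. A minimal sketch, where the helper name is ours and not part of cognee:

```python
from pathlib import Path

def looks_like_git_repo(repo_path: str) -> bool:
    """Hypothetical helper: True if repo_path contains a .git directory,
    i.e. the clone in the previous step succeeded."""
    return (Path(repo_path) / ".git").is_dir()

# Example usage before running the pipeline:
# if not looks_like_git_repo(repo_path):
#     raise FileNotFoundError(f"{repo_path} is not a cloned Git repository")
```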
Step 4: Create & Run the Code Graph Pipeline
import cognee
from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline
async def codify(repo_path: str):
    print("\nStarting code graph pipeline...")
    async for result in run_code_graph_pipeline(repo_path, False):
        print(result)
    print("\nPipeline completed!")

await codify(repo_path)
Running this pipeline analyzes the code in the repo and constructs an internal graph representation for quick navigation and searching.
Step 5: Set Up Summarization Prompt
with open("summarize_search_results.txt", "w") as f:
    f.write(
        "You are a helpful assistant that understands the given user query "
        "and the results returned based on the query. Provide a concise, "
        "short, to-the-point user-friendly explanation based on these."
    )
We create a text file containing a system prompt. The language model will use this prompt to summarize search results.
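As a small optional variation (not required by the tutorial), the same file can be written with pathlib and read back as a quick sanity check that it saved correctly:

```python
from pathlib import Path

# Write the system prompt, then read it back to confirm the contents.
prompt_file = Path("summarize_search_results.txt")
prompt_file.write_text(
    "You are a helpful assistant that understands the given user query "
    "and the results returned based on the query. Provide a concise, "
    "short, to-the-point user-friendly explanation based on these."
)
assert prompt_file.read_text().startswith("You are a helpful assistant")
```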
Step 6: A Helper Function for Search & Summary
from cognee.modules.search.types import SearchType
from cognee.infrastructure.llm.prompts import read_query_prompt
from cognee.infrastructure.llm.get_llm_client import get_llm_client
async def retrieve_and_generate_answer(query: str) -> str:
    search_results = await cognee.search(query_type=SearchType.CODE, query_text=query)

    # Adjust this path if you saved the prompt somewhere other than Colab's
    # /content directory (e.g. "summarize_search_results.txt" when running locally).
    prompt_path = "/content/summarize_search_results.txt"
    system_prompt = read_query_prompt(prompt_path)

    llm_client = get_llm_client()
    answer = await llm_client.acreate_structured_output(
        text_input=(
            f"Search Results:\n{search_results}\n\n"
            f"User Query:\n{query}\n"
        ),
        system_prompt=system_prompt,
        response_model=str,
    )
    return answer
With this function, cognee retrieves code-based search results for a user query, and the language model converts them into a concise explanation.
Step 7: Run a Sample Query
user_query = "add_your_query"  # Replace with your own question about the code
answer = await retrieve_and_generate_answer(user_query)
print("===== ANSWER =====")
print(answer)
Cognee uses its code graph to find relevant references, and the language model produces a short, user-friendly answer.
We hope this quick walkthrough helps you get started with cognee’s code graph and search capabilities. Experiment with different codebases and queries to see the full power of cognee in action!
Join the Conversation!
Have questions or need more help? Join our community to connect with professionals, share insights, and get your questions answered!