kuzudb/baml-kuzu-demo

Graph extraction and Graph RAG with BAML and Kuzu

This repository contains a demonstration of transforming unstructured data from clinical notes and drug side effects into a knowledge graph in Kuzu. The graph is then used to answer questions about the data using a Graph RAG pipeline.

Tools used

  • BAML: AI and LLM prompting framework
  • Kuzu: Embedded, fast, scalable graph database
  • Streamlit: Visualization framework for building custom web apps

Overview

The goal is to show how to build modular "agents" (i.e., prompt pipelines that each accomplish a subtask) that can then be strung together to accomplish more complex tasks. BAML lets users compose these kinds of agents, with testing and validation capabilities, by offering a reliable way to generate structured outputs from LLMs.
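
To make the idea of structured outputs and composable stages concrete, here is a minimal sketch in plain Python. It does not use BAML's API: the `DrugSideEffects` schema, the `parse_llm_output` and `normalize` functions, and the sample JSON string are all hypothetical, chosen only to illustrate validating a raw LLM response against a schema and then passing it to a second stage.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugSideEffects:
    """Hypothetical target schema for one extracted record."""
    drug: str
    side_effects: tuple[str, ...]


def parse_llm_output(raw: str) -> DrugSideEffects:
    """Stage 1: validate raw LLM text against the schema, failing loudly."""
    payload = json.loads(raw)
    drug = payload["drug"]
    effects = payload["side_effects"]
    if not isinstance(drug, str) or not drug.strip():
        raise ValueError("'drug' must be a non-empty string")
    if not isinstance(effects, list) or not all(isinstance(e, str) for e in effects):
        raise ValueError("'side_effects' must be a list of strings")
    cleaned = tuple(e.strip().lower() for e in effects)
    return DrugSideEffects(drug=drug.strip(), side_effects=cleaned)


def normalize(record: DrugSideEffects) -> DrugSideEffects:
    """Stage 2: drop duplicate side effects while preserving order."""
    unique = tuple(dict.fromkeys(record.side_effects))
    return DrugSideEffects(record.drug, unique)


# Made-up sample response, standing in for what an LLM might return.
raw_response = '{"drug": "Metformin", "side_effects": ["Nausea", "nausea", "Diarrhea"]}'
record = normalize(parse_llm_output(raw_response))
print(record)
```

Because each stage takes and returns a typed record, stages can be tested in isolation and chained in any order that type-checks, which is the property the agent pipeline relies on.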

Kuzu is used as a data store for the graph, and Streamlit is used to build a simple UI for interacting with the graph.

The general components of the pipeline are shown in the diagram below.

Setup

Ensure you have Python 3.11+ installed.

  1. Clone this repository
  2. Install the required dependencies:
    # Install the uv package manager
    curl -LsSf https://astral.sh/uv/install.sh | sh
    # Install the dependencies
    uv sync
  3. Activate the environment and generate the BAML client files
    cd
    source .venv/bin/activate
    baml-cli generate

This will generate the BAML client files in the src/baml_client directory.

Extract data from images and text

To extract data from images and text, run the following commands:

cd src
# Extract data from images that represent tables from the PDF of drugs and side effects
uv run image_extractor.py
# Extract data from the text of clinical notes
uv run notes_extractor.py

This will output JSON files into the ../data/extracted_data directory.
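
A quick way to inspect what the extractors produced is to load every JSON file in that directory. The sketch below assumes only that the files are valid JSON; the `load_extracted` helper is hypothetical and makes no assumption about the schema inside each file.

```python
import json
from pathlib import Path


def load_extracted(directory: Path) -> dict[str, object]:
    """Map each JSON file's stem to its parsed contents."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(directory.glob("*.json"))
    }


if __name__ == "__main__":
    # Path taken from the README; run this from the src directory.
    data_dir = Path("../data/extracted_data")
    for name, content in load_extracted(data_dir).items():
        size = len(content) if isinstance(content, (list, dict)) else 1
        print(f"{name}: {size} top-level item(s)")
```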

Creating the graph

To create the graph in Kuzu, run the following command:

uv run src/01_create_drug_graph.py

This will persist the Kuzu graph locally in the ex_kuzu_db directory.

To add the patient data to the graph, run the following command:

uv run src/02_create_patient_graph.py

This will augment the pre-existing graph (from the prior step) with the data from the patient notes.

Create vector index

To help with vector-assisted graph traversal, we can also create a vector index in Kuzu (requires kuzu>=0.10.0)! The script 03_create_vector_index.py does the following:

  • Creates a vector index on the Condition node table to perform fuzzy search on conditions treated by drugs
  • Creates another vector index on the Symptom node table to perform fuzzy search on symptoms or side effects caused by a drug

uv run src/03_create_vector_index.py
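
To illustrate what "fuzzy search" over embeddings means, here is a small sketch that ranks stored terms by cosine similarity to a query vector. It is a brute-force stand-in, not Kuzu's vector index, and the three-dimensional vectors are made up for illustration; a real pipeline would use embeddings from a model.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query: list[float], index: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the k stored terms most similar to the query vector."""
    ranked = sorted(index, key=lambda term: cosine(query, index[term]), reverse=True)
    return ranked[:k]


# Toy embeddings for Condition names (invented values).
conditions = {
    "hypertension": [0.9, 0.1, 0.0],
    "type 2 diabetes": [0.1, 0.9, 0.1],
    "insomnia": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the user phrase "high blood pressure".
query_vector = [0.8, 0.2, 0.1]
print(nearest(query_vector, conditions))  # ['hypertension']
```

The point is that a user phrase that never appears in the database can still be mapped to the closest stored term, which can then be used in an exact-match graph query.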

Evaluation

1. Graph construction

The main motivation for this project was to evaluate the performance of BAML and the given LLM for the task of extracting data from unstructured text. For the graph construction task, we have two main stages to evaluate:

  1. Extracting drugs and side effects from a table in a PDF
  2. Extracting medications and side effects from clinical notes

To evaluate the performance of the image extractor, run the following command:

cd evals
uv run image_extractor_eval.py

| Model | Date[^3] | Exact Match | Mismatch | Missing | Potential Hallucination | Cost | Cost factor |
| --- | --- | --- | --- | --- | --- | --- | --- |
| openai/gpt-4o-mini | Mar 2025 | 170 | 0 | 2 | 2 | $0.0008 | 1.0 |
| openai/gpt-4o | Mar 2025 | 174 | 1 | 1 | 2 | $0.0277 | 35x |
| anthropic/claude-3.5-sonnet | Mar 2025 | 173 | 0 | 2 | 2 | $0.0551 | 69x |
| google/gemini-2.0-flash | Mar 2025 | 158 | 2 | 12 | 8 | Free tier | N/A |

Note that your costs, latency and results may differ based on when you run the code, as models are being updated continually.
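
The four count columns can be read as the outcome of comparing extracted values against a ground truth. The sketch below shows one plausible way to compute them; the definitions used here are assumptions, not necessarily those in the repository's eval scripts, and the `score` function and sample data are hypothetical.

```python
from collections import Counter


def score(expected: dict[str, str], extracted: dict[str, str]) -> Counter:
    """Count outcomes under these assumed definitions:
    exact_match: key present in both with equal values
    mismatch: key present in both with different values
    missing: key in the ground truth but not extracted
    potential_hallucination: key extracted but not in the ground truth
    """
    counts = Counter(exact_match=0, mismatch=0, missing=0, potential_hallucination=0)
    for key, truth in expected.items():
        if key not in extracted:
            counts["missing"] += 1
        elif extracted[key] == truth:
            counts["exact_match"] += 1
        else:
            counts["mismatch"] += 1
    counts["potential_hallucination"] = sum(1 for key in extracted if key not in expected)
    return counts


# Invented sample data, one outcome of each kind.
expected = {"drug_a": "nausea", "drug_b": "headache", "drug_c": "rash"}
extracted = {"drug_a": "nausea", "drug_b": "dizziness", "drug_d": "fatigue"}
print(dict(score(expected, extracted)))
```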

To evaluate the performance of the notes extractor, run the following command:

cd evals
uv run notes_extractor_eval.py

| Model | Date[^3] | Exact Match | Mismatch | Missing | Potential Hallucination | Cost | Cost factor |
| --- | --- | --- | --- | --- | --- | --- | --- |
| openai/gpt-4o-mini | Mar 2025 | 19 | 0 | 0 | 0 | $0.0003 | 1.0 |
| openai/gpt-4o | Mar 2025 | 19 | 0 | 0 | 0 | $0.0044 | 15x |
| anthropic/claude-3.5-sonnet | Mar 2025 | 19 | 0 | 0 | 0 | $0.0074 | 25x |
| google/gemini-2.0-flash | Mar 2025 | 19 | 0 | 0 | 0 | Free tier | N/A |

The text extraction task is well handled by all models tested!
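
The cost factor column in both tables appears to be each model's cost divided by the cost of openai/gpt-4o-mini, rounded to the nearest integer. The following arithmetic check reproduces the reported factors from the reported costs; the `cost_factors` helper is hypothetical and the rounding rule is an inference from the numbers.

```python
def cost_factors(costs: dict[str, float], baseline: str) -> dict[str, int]:
    """Express each cost as a rounded multiple of the baseline model's cost."""
    base = costs[baseline]
    return {model: round(cost / base) for model, cost in costs.items()}


# Costs in USD as reported in the two evaluation tables.
image_costs = {
    "openai/gpt-4o-mini": 0.0008,
    "openai/gpt-4o": 0.0277,
    "anthropic/claude-3.5-sonnet": 0.0551,
}
notes_costs = {
    "openai/gpt-4o-mini": 0.0003,
    "openai/gpt-4o": 0.0044,
    "anthropic/claude-3.5-sonnet": 0.0074,
}

image_factors = cost_factors(image_costs, "openai/gpt-4o-mini")
notes_factors = cost_factors(notes_costs, "openai/gpt-4o-mini")
print(image_factors)
print(notes_factors)
```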

2. Graph RAG + agent router

The second part of the project involves building a Graph RAG pipeline on top of the graph constructed earlier from the two data sources. This stage evaluates the correctness and relevance of a RAG pipeline that retrieves from a) just the graph and b) the graph + vector index using an agentic router. Text2Cypher is used to generate Cypher queries via LLMs, so experiments comparing open source and proprietary LLMs are run to help understand the difference in response quality.

Run FastAPI server

A FastAPI server is provided to interact with the graph RAG and agentic pipeline. To run the server, run the following command:

cd src && uvicorn app:app --host 0.0.0.0 --port 8001 --reload

This will start the server on http://localhost:8001.

Run tests

To run the tests, run the following command:

uv run pytest

This will run the test suite in the src/tests directory and output the results to the console. The test suite is designed to test the performance of the Graph RAG pipeline in two scenarios:

  1. Vanilla Graph RAG: Retrieving information from the graph only
  2. Graph RAG with agent router: Retrieving information from the graph + vector index via an agentic router

The tests are designed to be run on a local machine with a FastAPI server running.

Vanilla Graph RAG

The following plot shows the results of vanilla Graph RAG, i.e., a single pass at Text2Cypher. Some of the queries (e.g., Q6 and Q7) are objectively hard (borderline impossible) to answer with a single Cypher query, because the terms in the question may not match those stored in the database. A vector search would be needed to help the LLM find the right query terms.

Router Agent Graph RAG

When including a router agent that can pick the appropriate vector search tools, the results improve significantly. Recent models like openai/gpt-4.1, google/gemini-2.0-flash, and google/gemini-2.5-flash do really well on all 10 queries in the test suite.
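
The control flow of such a router can be sketched as follows. This is a simplified illustration under stated assumptions: the routing decision here is a keyword rule standing in for an LLM call, and `route`, `run_pipeline`, the two stub functions, and the `TREATS` relationship are all hypothetical names rather than the repository's actual code.

```python
from typing import Callable, Optional


def route(question: str, db_terms: set[str]) -> str:
    """Rule-based stand-in for the LLM router: pick a tool name."""
    lowered = question.lower()
    return "text2cypher" if any(term in lowered for term in db_terms) else "vector_search"


def run_pipeline(
    question: str,
    db_terms: set[str],
    resolve_term: Callable[[str], str],
    text2cypher: Callable[[str, Optional[str]], str],
) -> tuple[str, str]:
    """Resolve unfamiliar wording via vector search first, then generate Cypher."""
    tool = route(question, db_terms)
    hint = resolve_term(question) if tool == "vector_search" else None
    return tool, text2cypher(question, hint)


def fake_resolve(question: str) -> str:
    """Stub for a vector lookup that maps user wording to a stored term."""
    return "hypertension"


def fake_text2cypher(question: str, hint: Optional[str]) -> str:
    """Stub for an LLM call; a real one would generate the query from the question."""
    term = hint or "hypertension"
    return f"MATCH (d:Drug)-[:TREATS]->(c:Condition) WHERE c.name = '{term}' RETURN d.name"


tool, query = run_pipeline(
    "Which drugs treat high blood pressure?", {"hypertension"}, fake_resolve, fake_text2cypher
)
print(tool)
print(query)
```

In the example, "high blood pressure" does not match any stored term, so the router takes the vector search path and the resolved term is passed to query generation as a hint.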

Run the Streamlit app

To run the Streamlit app, run the following command:

cd src
uv run streamlit run ui.py

This will start the Streamlit app on http://localhost:8501.

Sample queries are provided in the UI, but you are free to ask any question you want! An example run is shown below:

About

Demo of knowledge graph creation and Graph RAG with BAML and Kuzu
