Harness the Power of Graph RAG: Unlock Unstructured Data with Semantic Search, Embeddings, and More
Unlock the power of Graph RAG for semantic search, information extraction, and advanced data analysis. Explore this open-source, retrieval-augmented generation framework that leverages knowledge graphs to enhance large language models. Boost accuracy and relevance for complex queries.
February 20, 2025

Unlock the power of semantic search, embeddings, and vector search with GraphRAG - the ultimate open-source RAG engine from Microsoft AI. Discover how this innovative solution can transform your data analysis and question-answering capabilities, delivering more relevant and reliable insights.
What is RAG (Retrieval Augmented Generation)?
How is GraphRAG Different from Traditional RAG Systems?
Getting Started with GraphRAG
Indexing and Configuring GraphRAG
Chatting with GraphRAG
What is RAG (Retrieval Augmented Generation)?
RAG (Retrieval Augmented Generation) is an approach used to enhance existing large language models by incorporating external knowledge. The key idea behind RAG is to combine the power of large language models with the ability to retrieve and leverage relevant information from external sources, such as knowledge bases or text corpora.
The main benefits of the RAG approach are:
- Improved Relevance: By retrieving and incorporating relevant information, RAG can provide more accurate and relevant responses, especially for questions that require specific knowledge.
- Reduced Hallucination: RAG has been shown to reduce the tendency of large language models to generate hallucinated or factually incorrect content, as the responses are grounded in the retrieved information.
- Versatility: In addition to question answering, RAG can be applied to various NLP tasks such as information extraction, recommendation, sentiment analysis, and summarization.
- Private Data Handling: RAG can work with private or sensitive data sets, as the information is processed and stored locally, without the need to share the data with external services.
The key difference between traditional baseline RAG systems and the Graph RAG approach is the use of knowledge graphs. Graph RAG combines text extraction, network analysis, and language model prompting to provide a more holistic and powerful system for leveraging large language models in advanced data analysis and question answering tasks.
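To make the retrieve-then-generate pattern concrete, here is a minimal, self-contained Python sketch. It uses a toy bag-of-words embedding and cosine similarity in place of a real embedding model, and it stops just before the LLM call, so none of the names below come from GraphRAG itself.

```python
# Minimal sketch of the retrieve-then-generate pattern behind RAG.
# Illustration only: a toy bag-of-words "embedding" stands in for a real
# embedding model, and the final LLM call is omitted.
import numpy as np

documents = [
    "GraphRAG builds a knowledge graph from the input documents.",
    "Baseline RAG retrieves text chunks by vector similarity.",
    "The indexing step extracts entities and relationships.",
]

def embed(text: str, vocab: list[str]) -> np.ndarray:
    """Toy embedding: term counts over a fixed vocabulary."""
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

vocab = sorted({w for d in documents for w in d.lower().split()})
doc_vectors = [embed(d, vocab) for d in documents]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embed(query, vocab)
    scores = [
        float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
        for v in doc_vectors
    ]
    ranked = sorted(zip(scores, documents), reverse=True)
    return [doc for _, doc in ranked[:k]]

query = "How does baseline RAG find relevant text?"
context = "\n".join(retrieve(query))

# The retrieved context is prepended to the prompt so the model's answer
# stays grounded in it; the actual LLM call is left out of this sketch.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

In a production RAG system the toy embedding would be replaced by a learned embedding model and the assembled prompt passed to a large language model for answer generation.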
How is GraphRAG Different from Traditional RAG Systems?
GraphRAG is a significant advancement over traditional RAG (Retrieval Augmented Generation) systems. Here's how it differs:
- Knowledge Graph Extraction: Unlike simple text-based retrieval, GraphRAG combines text extraction with network analysis and language model prompting to construct a comprehensive knowledge graph from the input data. This allows for a deeper, more holistic understanding of the content.
- Improved Accuracy and Relevance: By leveraging the knowledge graph, GraphRAG can provide more accurate and relevant responses, especially for complex or specialized datasets. The graph-based approach helps connect disparate pieces of information and synthesize insights that outperform baseline RAG techniques.
- Holistic Data Understanding: GraphRAG follows a more comprehensive approach, enhancing the overall understanding and summarization of large data collections. This makes it a superior choice for leveraging large language models in advanced data analysis and question-answering tasks.
- Reduced Hallucination: GraphRAG has been shown to reduce the tendency of large language models to generate "hallucinated" content that is not grounded in the provided information. The graph-based approach helps the model adhere more closely to the reliable information in the context.
- Versatility: In addition to question answering, GraphRAG can be applied to a variety of natural language processing tasks, such as information extraction, recommendations, sentiment analysis, and summarization, all within a private, local storage environment.
In summary, GraphRAG represents a significant advancement in the field of retrieval-augmented generation, offering improved accuracy, relevance, and holistic understanding of data, making it a powerful framework for leveraging large language models in advanced applications.
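The following sketch illustrates the structural difference in miniature: a handful of extracted entities and relationships are stored as a graph, grouped into communities, and summarized. It is a conceptual toy built with networkx, not GraphRAG's actual data model or pipeline, and the entity names are invented purely for illustration.

```python
# Toy illustration of the GraphRAG idea: instead of retrieving raw text chunks,
# entities and relationships extracted from the documents are stored in a graph,
# and "global" questions are answered from community-level summaries.
# Conceptual sketch only -- not the actual GraphRAG pipeline or data model.
import networkx as nx
from networkx.algorithms import community

# Entities and relationships that an LLM extraction pass might have produced.
graph = nx.Graph()
graph.add_edge("Acme Corp", "Jane Doe", relation="employs")
graph.add_edge("Jane Doe", "Project Falcon", relation="leads")
graph.add_edge("Acme Corp", "Q3 Report", relation="published")
graph.add_edge("Beta Ltd", "Project Falcon", relation="partners on")

# GraphRAG groups related entities into communities and pre-summarizes each one;
# here we simply list each community's relations as a stand-in "summary".
for i, members in enumerate(community.greedy_modularity_communities(graph)):
    sub = graph.subgraph(members)
    facts = [f"{u} --{d['relation']}--> {v}" for u, v, d in sub.edges(data=True)]
    print(f"Community {i}: " + "; ".join(facts))

# A global question such as "Which organizations collaborate on Project Falcon?"
# would be answered from these community summaries rather than isolated chunks.
```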
Getting Started with GraphRAG
To get started with GraphRAG, follow these steps:
- Install Prerequisites:
  - Ensure you have Python installed on your system.
  - Install the required packages by running `pip install graphrag` in your terminal or command prompt.
- Clone the Repository:
  - Open Visual Studio Code (or your preferred IDE) and create a new folder for the project.
  - In the terminal, navigate to the project folder and run `git clone https://github.com/microsoft/graphrag.git` to clone the GraphRAG repository.
- Set up the Environment:
  - In the terminal, navigate to the `graphrag` directory.
  - Export your OpenAI API key by running `export GRAPHRAG_API_KEY=your_api_key_here`.
- Create an Input Folder:
  - In the terminal, run `mkdir input` to create an input folder for your documents.
- Index the Documents:
  - Place your documents (e.g., text files, PDFs) in the `input` folder.
  - In the terminal, run `python dm_rag_index.py` to index the documents.
- Chat with the Documents:
  - In the terminal, run `python dm_graph_rag.py --query "your_query_here" --root_dir . --method global`, replacing `"your_query_here"` with the question or query you want to ask about the documents.
GraphRAG will now use the knowledge graph it created during the indexing process to provide relevant and comprehensive responses to your queries, outperforming traditional retrieval-augmented generation techniques.
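If you prefer to drive these steps from a single script, the sketch below chains the indexing and query commands from this walkthrough using subprocess. The script names `dm_rag_index.py` and `dm_graph_rag.py` are taken from the steps above and may differ in your checkout, so treat this as an illustrative wrapper rather than an official GraphRAG interface.

```python
# Hypothetical end-to-end driver for the steps above. The script names
# (dm_rag_index.py, dm_graph_rag.py) come from this walkthrough and may
# differ in your copy of the repository.
import os
import subprocess
import sys

def run(cmd: list[str]) -> None:
    """Run a command, echoing it first and stopping on failure."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

if "GRAPHRAG_API_KEY" not in os.environ:
    sys.exit("Set GRAPHRAG_API_KEY first (export GRAPHRAG_API_KEY=...).")

os.makedirs("input", exist_ok=True)          # create the input folder
run([sys.executable, "dm_rag_index.py"])     # index the documents

query = "What are the main findings in the indexed documents?"
run([                                        # ask a question about them
    sys.executable, "dm_graph_rag.py",
    "--query", query,
    "--root_dir", ".",
    "--method", "global",
])
```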
Indexing and Configuring GraphRAG
To index your documents and configure GraphRAG, follow these steps:
- Install Prerequisites:
  - Ensure you have Python installed on your system.
  - Install pip by running the provided command in your command prompt.
- Clone the Repository:
  - Open Visual Studio Code and create a new window.
  - Open the terminal by clicking the toggle panel button.
  - In the terminal, switch to a bash environment and run `pip install graphrag` to install the necessary packages.
- Set up the Environment:
  - In the terminal, type `cd graphrag` to navigate to the cloned repository.
  - Export your OpenAI API key by running `export GRAPHRAG_API_KEY=your_api_key_here`.
- Create an Input Folder:
  - In the terminal, run `mkdir input` to create an input folder where you'll place your files or documents.
  - Open the folder in VS Code by clicking "File" > "Open Folder" and selecting the cloned repository.
- Index the Document:
  - Place your document (e.g., a financial report) in the `input` folder.
  - In the terminal, run `python dm_rrag index` to index the current document. This creates a community report on the indexed document, which you can then use for chatting.
- Configure the Environment:
  - In the `.env` file, you can configure the API key, model type, and other settings, including whether to use a Llama model or the OpenAI interface.
  - Save the changes to the `.env` file.
- Run the Code:
  - In the terminal, run `python dm_rrag query --root_folder . --method global --query "your_query_here"` to start chatting with the indexed document.
By following these steps, you can set up GraphRAG, index your documents, and start using the retrieval-augmented generation capabilities to enhance your natural language processing tasks.
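As a rough illustration of how such settings are consumed, the snippet below parses simple KEY=VALUE pairs from a `.env` file and falls back to the process environment. `GRAPHRAG_API_KEY` is the variable used throughout this guide; `LLM_TYPE` is the key this guide mentions for choosing a model and may not match the configuration schema of your GraphRAG version.

```python
# Minimal sketch of reading GraphRAG-related settings from a .env file.
# GRAPHRAG_API_KEY is used throughout this guide; LLM_TYPE is the key this
# guide mentions and may not exist in your GraphRAG version's schema.
import os
from pathlib import Path

def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file, ignoring comments."""
    settings: dict[str, str] = {}
    env_file = Path(path)
    if env_file.exists():
        for line in env_file.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                settings[key.strip()] = value.strip()
    return settings

config = {**load_env(), **os.environ}  # real environment variables win
print("API key set:", bool(config.get("GRAPHRAG_API_KEY")))
print("Model type:", config.get("LLM_TYPE", "default (OpenAI)"))
```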
Chatting with GraphRAG
To chat with GraphRAG, follow these steps:
- After indexing the document using the `python dm_rrag index` command, you can initiate the chat by running `python dm_rrag query --root_folder . --method global "your query here"`.
- Replace `"your query here"` with the question or prompt you want to ask GraphRAG about the indexed document.
- GraphRAG will then use the knowledge graph it created during the indexing process to provide a relevant and informative response, leveraging the power of large language models and the structured information in the knowledge graph.
- You can continue chatting with GraphRAG by running the same command with different queries (see the sketch after this list). The system will use the existing knowledge graph to provide responses tailored to your questions.
- If you want to switch to a different language model, you can configure the model in the `.env` file by specifying the `LLM_TYPE` and providing the appropriate API endpoint or local model path.
- GraphRAG's holistic approach to retrieval-augmented generation allows it to outperform traditional baseline RAG techniques, especially for complex or private datasets, by connecting disparate pieces of information and providing synthesized insights.
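For repeated questions, the same query command can be wrapped in a small interactive loop. This is a hypothetical convenience script built around the `dm_rrag` command used in this guide; adjust the command name and flags to whatever your setup actually uses.

```python
# Hypothetical chat loop around the query command shown above. The command
# name (dm_rrag) and its flags are taken from this guide and may differ in
# your setup; the loop simply re-runs the query for each new question.
import subprocess
import sys

print("Ask questions about the indexed documents (blank line to quit).")
while True:
    question = input("> ").strip()
    if not question:
        break
    subprocess.run(
        [sys.executable, "dm_rrag", "query",
         "--root_folder", ".", "--method", "global", question],
        check=False,  # keep chatting even if one query fails
    )
```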
FAQ