# How to create a knowledge base using a knowledge graph?

{% embed url="<https://youtu.be/AbUaCflHCW8?si=wlCDHJ0jNWqcPran>" %}
A quick how-to video with the steps to create a knowledge base using a knowledge graph
{% endembed %}

### Knowledge graph selection <a href="#knowledge-graph-selection" id="knowledge-graph-selection"></a>

Depending on your requirements, ZBrain lets you create a knowledge graph (KG) as an alternative to a traditional vector store. If your use case involves uncovering relationships between concepts, like how policies, products, or people are connected, a Knowledge Graph can provide deeper insights and more structured answers.

* To build a knowledge graph, select **Knowledge Graph** as the RAG definition in **Data Refinement Tuning**.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FTZRuCZGvVMrOVTuz6dB9%2Fimage.png?alt=media&#x26;token=c65d8cb9-d57e-4c66-87d9-30c6ce4adf90" alt=""><figcaption></figcaption></figure>

#### **Chunk settings** <a href="#chunk-settings" id="chunk-settings"></a>

Configure the chunk settings as part of the setup process. For detailed instructions, refer to the **Data Refinement Tuning** guide.

#### **Graph store** <a href="#graph-store" id="graph-store"></a>

Below is the available graph store option for the knowledge graph:

* **Economical:** This option utilizes ZBrain's built-in vector store with cost-effective vector engines and keyword indexes for efficient data handling.

#### **File store selection** <a href="#file-store-selection" id="file-store-selection"></a>

* **ZBrain S3 storage:** This option utilizes ZBrain's secure and scalable S3 storage for data management. It offers enhanced data management features and precise retrieval results without incurring additional token costs.

#### **Retrieval settings** <a href="#retrieval-settings" id="retrieval-settings"></a>

**For knowledge graph selection**

* **Retrieval type:** You can choose among five search types:

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FTVsREe4sFJG8FRfNqjvG%2Fimage.png?alt=media&#x26;token=a0ae5855-d173-4efb-bff6-f28745521a16" alt=""><figcaption></figcaption></figure>

* **Naive Mode:** Falls back to basic vector similarity on text chunks (no KG traversal).
  * **Best suited for:** Quick POCs; content without rich relationships.
* **Local Mode:** Looks up context-dependent facts about a single entity using low-level keywords.
  * **Best suited for:** Q\&A about a particular policy, product feature, or isolated technical detail.
* **Global Mode:** Emphasizes relationship-based knowledge, traversing edges to reveal broader connections between concepts.
  * **Best suited for:** Holistic questions that require networked insights, e.g., “How do X, Y, and Z relate?”
* **Hybrid Mode:** Combines both local and global retrieval, then merges the results.
  * **Best suited for:** Complex business questions that need both entity facts and contextual relationships.
* **Mix Mode:** Executes both vector (semantic) and graph retrieval in parallel, drawing from unstructured and structured data, including time metadata.
  * **Best suited for:** Multi-layered queries that span different data types or dimensions, such as timelines, comparisons, or multifaceted evaluations.

**Top K:** This setting determines the number of most relevant results returned for a user's search query. You can specify the desired number of results (default is 50).

**Score threshold:** This setting defines the minimum score a result needs to achieve to be included in the search results. You can specify a score between 0.01 and 1 (default is 0.2).
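To make the interaction between the two settings concrete, here is a minimal Python sketch. It is an illustration only, not ZBrain's implementation: the `Hit` type and `select_hits` function are hypothetical names, and it assumes the threshold is applied first (inclusively) and the Top K cut second.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    text: str
    score: float  # relevance score, assumed to lie between 0 and 1


def select_hits(hits: list[Hit], top_k: int = 50, score_threshold: float = 0.2) -> list[Hit]:
    """Drop hits below the threshold, then keep the top_k highest-scoring ones."""
    kept = [h for h in hits if h.score >= score_threshold]
    kept.sort(key=lambda h: h.score, reverse=True)
    return kept[:top_k]


# Made-up results for illustration.
hits = [Hit("refund policy", 0.82), Hit("shipping times", 0.15), Hit("warranty terms", 0.41)]
selected = select_hits(hits, top_k=2, score_threshold=0.2)
print([h.text for h in selected])  # ['refund policy', 'warranty terms']
```

Under these assumptions, raising the score threshold trades recall for precision, while Top K only caps how many of the surviving results are returned.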

#### **Embedding model** <a href="#embedding-model" id="embedding-model"></a>

* Choose the embedding type that best suits your use case to optimize text representation and improve performance.

The following embedding models are available when a knowledge graph is chosen in the RAG definition:

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FqmwoBd9RF8aNg1wyRj7w%2Fimage.png?alt=media&#x26;token=fc932a28-751a-4e8b-94c9-807c8962c57e" alt=""><figcaption></figcaption></figure>

* ZBrain then displays the proposed document and the estimated number of chunks for your review.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FTkgjyUPpi7VkIZAZ9p02%2Fimage.png?alt=media&#x26;token=39f40577-fdfb-47b4-89db-ab7cad5e9acb" alt=""><figcaption></figcaption></figure>

* You can check the approximate cost and credits consumed for each Knowledge Graph creation. Actual values will depend on the number of chunks.
* Click ‘Details’ on ‘Credit Usage’ to reveal credits consumed for knowledge base creation.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FRPqYSMmNHwGbBHegGEKK%2Fimage.png?alt=media&#x26;token=298421bc-0c26-470e-ad16-b8dafd0161c6" alt=""><figcaption></figcaption></figure>

### Knowledge Graph LLM (for knowledge graph selection) <a href="#knowledge-graph-llm-for-knowledge-graph-selection" id="knowledge-graph-llm-for-knowledge-graph-selection"></a>

* Choose the LLM that will perform reasoning over the knowledge graph (default: `gpt-4o`). The chosen model powers query rewriting, path finding and answer synthesis.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2F6je37y79mXbQYFOqWhtq%2Fimage.png?alt=media&#x26;token=9c464793-1d49-4b4d-916d-b20ec4d3b27c" alt=""><figcaption></figcaption></figure>

#### **Adding instructions for knowledge graph generation** <a href="#adding-instructions-for-knowledge-graph-generation" id="adding-instructions-for-knowledge-graph-generation"></a>

* You can enter custom instructions to define exactly how a Knowledge Graph should be built in the **Knowledge Graph Instructions** box.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FyxNNvZ3NV7rCZl2GMZWQ%2Fimage.png?alt=media&#x26;token=8069612e-a17f-4b37-9efb-aace4712657d" alt=""><figcaption></figcaption></figure>

* **Enter custom instructions:** ZBrain allows advanced users to edit the instructions sent to the LLM during knowledge graph creation. Click ‘Edit’ to customize or modify the default prompt and type your instructions, or click ‘Generate’ to let ZBrain draft a prompt template, so the system knows exactly how to extract entities and relationships.

*Note: Adding custom instructions is intended for advanced users only.*

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2F0lWLfrnmru1HozoJu7oo%2Fimage.png?alt=media&#x26;token=2a4c8776-5459-4e73-960d-59ea677c57a4" alt=""><figcaption></figcaption></figure>

**Customize the prompt for the knowledge graph**

You can customize the prompt if the instructions are not specified properly, if the model's output is not in the expected format, or if knowledge graph creation is likely to fail. This step is optional and intended for users who have a detailed understanding of prompt formatting.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FA28PQCtrzP7rP6djYk9w%2Fimage.png?alt=media&#x26;token=f2761c6c-9bf7-4353-92c8-02a1f70f51ba" alt=""><figcaption></figcaption></figure>

You can choose to:

* Use the default prompt (recommended for most users)
* Replace only the placeholder values
* Remove all default instructions and write your own

**Default prompt**

In this step, a structured prompt is sent to the model by default. It contains placeholders that the system automatically fills in with preset values. These include:

| Placeholder                                | Populated with                                   | Default values                                                             |
| ------------------------------------------ | ------------------------------------------------ | -------------------------------------------------------------------------- |
| {language}                                 | The selected language for output                 | English                                                                    |
| {entity\_types}                            | Entity types chosen by the user                  | "organization", "person", "geo", "event", "category"                       |
| {tuple\_delimiter}                         | Symbol for separating elements within a tuple    | <\|>                                                                       |
| {record\_delimiter}                        | Symbol for separating entries in the output list | ##                                                                         |
| {completion\_delimiter}                    | Final output marker                              | <\|COMPLETE\|>                                                             |
| {examples}                                 | Examples to guide the LLM                        | Predefined list of example outputs                                         |
| {input\_text}                              | The input document content                       | Automatically populated at runtime with data from the knowledge source. Should not be removed or modified. |

> Do not delete these placeholders unless replacing them intentionally.
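As a rough illustration of how placeholder substitution could work, the Python sketch below fills `{name}` tokens in a template with the default values from the table. The `fill_placeholders` helper and the template text are hypothetical and written for this example; they are not ZBrain's actual backend code or default prompt.

```python
import re

# Default values taken from the table above.
DEFAULTS = {
    "language": "English",
    "entity_types": '"organization", "person", "geo", "event", "category"',
    "tuple_delimiter": "<|>",
    "record_delimiter": "##",
    "completion_delimiter": "<|COMPLETE|>",
}


def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each {name} token in a single pass; unknown tokens are left untouched."""
    return re.sub(r"\{(\w+)\}", lambda m: values.get(m.group(1), m.group(0)), template)


# Hypothetical template fragment, not the real default prompt.
template = (
    "Write the output in {language}. Extract entities of types [{entity_types}] "
    "from the text below.\nText: {input_text}"
)
prompt = fill_placeholders(template, {**DEFAULTS, "input_text": "Acme Corp hired Jane Doe."})
print(prompt)
```

The sketch also shows why deleting a placeholder matters: if `{input_text}` were removed from the template, the document content would never reach the model.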

**Output format**

If you create a custom prompt, it must return output in a specific format. This includes:

* Entities:\
  `("entity"{tuple_delimiter}entity_name{tuple_delimiter}entity_type{tuple_delimiter}entity_description)`
* Relationships:\
  `("relationship"{tuple_delimiter}source_entity{tuple_delimiter}target_entity{tuple_delimiter}relationship_description{tuple_delimiter}relationship_keywords{tuple_delimiter}relationship_strength)`
* Content keywords:\
  `("content_keywords"{tuple_delimiter}high_level_keywords)`
* Output must be a flat list, separated by `{record_delimiter}`, and end with `{completion_delimiter}`.
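The format above can be checked with a small parser. The following Python sketch assumes the default delimiter values (`<|>`, `##`, `<|COMPLETE|>`); the `parse_kg_output` function and the sample output are invented for illustration and do not represent ZBrain's own parsing code.

```python
TUPLE_DELIMITER = "<|>"
RECORD_DELIMITER = "##"
COMPLETION_DELIMITER = "<|COMPLETE|>"


def parse_kg_output(raw: str) -> list[list[str]]:
    """Split model output into records, each returned as a list of fields."""
    body = raw.split(COMPLETION_DELIMITER)[0]  # ignore anything after the final marker
    records = []
    for chunk in body.split(RECORD_DELIMITER):
        chunk = chunk.strip()
        if not (chunk.startswith("(") and chunk.endswith(")")):
            continue  # skip blank or malformed chunks
        fields = [f.strip().strip('"') for f in chunk[1:-1].split(TUPLE_DELIMITER)]
        records.append(fields)
    return records


# Made-up sample output for illustration only.
sample = (
    '("entity"<|>Acme Corp<|>organization<|>A fictional manufacturer)##\n'
    '("entity"<|>Jane Doe<|>person<|>An engineer at Acme Corp)##\n'
    '("relationship"<|>Jane Doe<|>Acme Corp<|>Jane Doe works at Acme Corp<|>employment<|>8)##\n'
    '("content_keywords"<|>employment, manufacturing)<|COMPLETE|>'
)

for record in parse_kg_output(sample):
    print(record[0], record[1:])
```

Running a custom prompt's output through a check like this before building the graph is one way to confirm that entity records have four fields, relationship records have six, and the output ends with the completion marker.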

**The final prompt**

This is the prompt that goes to the LLM after placeholders are replaced in the backend.

**Watch the tooltip:** If your instructions are incomplete or ambiguous, an inline warning appears to flag the risk of skewed results.

Once the prompt reflects your requirements, proceed to generate the Knowledge Graph; the platform will apply your refined instructions to the uploaded content.

#### **Agentic Retrieval** <a href="#agentic-retrieval" id="agentic-retrieval"></a>

Agentic Retrieval is a retrieval strategy in which an LLM actively plans and decomposes a complex query into smaller sub-queries to guide information retrieval. It relies on having enough context (e.g., a full question rather than a single word) to create a structured search plan. This allows the agent or app to orchestrate both keyword and semantic search engines for more accurate, context-aware results.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2F6sAm59G8RkwaTIjlyoYR%2Fimage.png?alt=media&#x26;token=f7bcc00a-a2ff-4fc8-8cd4-8b95f9deadd9" alt=""><figcaption></figcaption></figure>
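The plan-then-retrieve idea can be sketched in a few lines of Python. Everything here is a stand-in chosen for illustration: `agentic_retrieve`, `toy_planner`, and the two toy engines are hypothetical names, the planner splits on the word "and" instead of calling an LLM, and the engines match strings instead of querying real keyword or vector indexes.

```python
from typing import Callable

Planner = Callable[[str], list[str]]
SearchEngine = Callable[[str], list[str]]


def agentic_retrieve(question: str, planner: Planner, engines: list[SearchEngine]) -> list[str]:
    """Run every sub-query through every engine and merge results without duplicates."""
    merged: list[str] = []
    for sub_query in planner(question):
        for engine in engines:
            for result in engine(sub_query):
                if result not in merged:
                    merged.append(result)
    return merged


def toy_planner(question: str) -> list[str]:
    # Stand-in for the LLM planner: naively split the question on " and ".
    return [part.strip(" ?") for part in question.split(" and ") if part.strip(" ?")]


# Made-up documents for illustration.
DOCS = {
    "refund": "Refunds are accepted within 30 days.",
    "warranty": "Hardware is covered for one year.",
}


def toy_keyword_engine(query: str) -> list[str]:
    # Exact keyword match against the document keys.
    return [text for key, text in DOCS.items() if key in query.lower()]


def toy_semantic_engine(query: str) -> list[str]:
    # Pretend "guarantee" is semantically close to "warranty".
    return [DOCS["warranty"]] if "guarantee" in query.lower() else []


results = agentic_retrieve(
    "What is the refund window and how long is the guarantee?",
    toy_planner,
    [toy_keyword_engine, toy_semantic_engine],
)
print(results)
```

In this toy run, the first sub-query is answered by the keyword engine and the second only by the semantic engine, which is the kind of gap that combining both engines is meant to cover. It also hints at why a single-word query gives the planner too little to decompose.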

#### Enabling Agentic Retrieval <a href="#enabling-agentic-retrieval" id="enabling-agentic-retrieval"></a>

To enable and configure Agentic Retrieval:

* Toggle on the Agentic Retrieval switch.
* From the Agentic Retrieval Model dropdown, select the LLM (e.g., `gpt-4o`) to be used for orchestrating the sub-query planning and execution.
* Once you have confirmed your selections, click the ‘Next’ button.

### **Execute and finish** <a href="#execute-and-finish" id="execute-and-finish"></a>

On this screen, review all the knowledge base details you provided earlier. If everything appears accurate, click the ‘Manage Knowledge Base’ button to complete the creation process. You can use the slider to monitor the embedding progress of the knowledge base in real time and see whether it has been created or is still in progress.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2F4HJXllfjnjPFSFZAlZGA%2Fimage.png?alt=media&#x26;token=98c24038-bd5e-4dfe-bad8-37708edcbe87" alt=""><figcaption></figcaption></figure>

Your newly created knowledge base is now accessible for use within your ZBrain solutions. You can create additional knowledge bases by clicking on the ‘Add’ button or delete existing ones using the ‘Delete’ button.

You will receive clear, contextual error feedback when indexing fails during Knowledge Base creation, along with detailed explanations of the errors. Each error provides actionable guidance (e.g., verify API keys, check permissions) to help you resolve issues quickly.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FjzaybW3A21Q9ADsfuHkA%2Fimage.png?alt=media&#x26;token=799cc55f-2b72-48ba-be98-bc921fc24dfc" alt=""><figcaption></figcaption></figure>

**Note:** If a knowledge base is initially created using a knowledge graph, the vector store option is hidden for all subsequent document uploads under that knowledge base and vice versa.

<figure><img src="https://3781630280-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIbuSicczDKTyHzwzicar%2Fuploads%2FPduDyi65yD4nPSZCtPAe%2Fimage.png?alt=media&#x26;token=83aa5e35-b536-4177-84df-66784c177840" alt=""><figcaption></figcaption></figure>
