📚 Knowledge base

What is ZBrain’s knowledge base?

The knowledge base is the core of ZBrain and the foundation for building LLM-based applications. It lets users import or upload their proprietary data in a variety of formats and manage it in one place. Here's an overview of the knowledge base's capabilities:

Data integration and flexibility

ZBrain's knowledge base supports seamless data ingestion from multiple sources and formats, including PDF, TXT, CSV, JSON, DOCX, PPTX, and XLSX. Users can import data from a variety of tools and platforms, including:

  • Web URLs

  • Google Sheets

  • Notion

  • MongoDB

  • ServiceNow

  • Confluence

  • JIRA

  • PostgreSQL

  • Amazon Redshift

  • SharePoint

  • Microsoft Teams

  • OneDrive

  • Google Drive

This flexibility ensures comprehensive data connectivity, allowing users to build a robust knowledge repository and integrate diverse datasets effortlessly into their AI applications.

Structure and organization

The knowledge base offers an information schema capability that transforms unstructured data, such as PDF and text files, into structured information. Structured data makes it easier to extract insights and support decision-making. By leveraging large language models (LLMs), the system can analyze and interpret large volumes of data and make it available for querying.

Retrieval and optimization

ZBrain's retrieval and optimization capabilities support efficient and accurate access to the information in your knowledge base. Users can retrieve relevant documents with vector search, full-text search, or hybrid search. Customizable settings, such as Top K results and score thresholds, further refine the search output. Once the parameters are set and the knowledge base is created, users can run retrieval tests by querying the data. The system identifies and ranks relevant data chunks based on these parameters and displays the most pertinent information first. Tuning these settings helps the LLMs work from the most relevant context when answering user queries.

Note: For more information on vector search, full-text search, hybrid search, Top K results, and score thresholds, see "How to create a knowledge base?" in the ZBrain documentation.
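To make the interplay of search type, Top K, and score threshold more concrete, here is a minimal sketch in Python. It is not ZBrain's implementation or API: the `ScoredChunk` class, the `hybrid_score` and `retrieve` functions, the weighted-average blend, and the sample scores are all assumptions made for illustration. It only shows the general pattern of scoring chunks, discarding those below a threshold, and keeping the K best.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    chunk_id: str
    vector_score: float  # hypothetical semantic similarity in [0, 1]
    text_score: float    # hypothetical full-text relevance in [0, 1]


def hybrid_score(chunk, vector_weight=0.5):
    """Blend both scores with a weighted average (an assumed combination rule)."""
    return vector_weight * chunk.vector_score + (1.0 - vector_weight) * chunk.text_score


def retrieve(chunks, top_k=3, score_threshold=0.5, vector_weight=0.5):
    """Drop chunks scoring below the threshold, then return the top_k best."""
    scored = [(hybrid_score(c, vector_weight), c) for c in chunks]
    kept = [(s, c) for s, c in scored if s >= score_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # highest score first
    return [(c.chunk_id, round(s, 3)) for s, c in kept[:top_k]]


# Made-up scores for four chunks, purely for demonstration.
chunks = [
    ScoredChunk("a", 0.9, 0.7),
    ScoredChunk("b", 0.4, 0.2),
    ScoredChunk("c", 0.6, 0.8),
    ScoredChunk("d", 0.8, 0.3),
]
results = retrieve(chunks, top_k=2, score_threshold=0.5)
print(results)
```

In this sketch, setting `vector_weight=1.0` behaves like pure vector search and `vector_weight=0.0` like pure full-text search. Raising `score_threshold` trades recall for precision, while `top_k` caps how many chunks are passed on as context.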

Secure and scalable storage

ZBrain's knowledge base supports various vector stores for efficient data indexing and retrieval. It is agnostic to the underlying storage provider, offering options like Pinecone for scalable vector indexing and ZBrain's built-in vector store for cost-effective data management. Additionally, it employs secure storage solutions such as ZBrain S3 storage to manage data securely and efficiently. ZBrain S3 storage delivers precise retrieval results without incurring additional token costs.

Continuous improvement and customization

The knowledge base allows users to refine and customize their data-handling strategies. Users can configure chunking rules, choose their preferred embedding models, select appropriate vector stores, and set retrieval parameters such as search type, Top K results, and score thresholds. Once the knowledge base is created, users can also edit each chunk to add or remove information and update metadata. This customization tailors the knowledge base to specific business needs and improves the accuracy of AI-generated responses.
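As a rough illustration of what a chunking rule can look like, the following Python sketch splits text into fixed-size character windows with a configurable overlap. The `chunk_text` function and its `chunk_size` and `overlap` parameters are hypothetical. This page does not describe the chunking rules ZBrain actually exposes, so treat this as one common strategy, not the product's behavior.

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into windows of chunk_size characters.

    Each window repeats the last `overlap` characters of the previous one,
    so content cut at a boundary still appears whole in one of the chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Tiny example with small numbers so the overlap is easy to see.
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Smaller chunks tend to give more focused matches at retrieval time, while larger chunks or more overlap preserve more surrounding context per chunk. The right balance depends on the documents and the queries.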

Summary and management

Users can generate summaries of their documents using available models, providing a concise overview of the content. The knowledge base interface allows for easy management of data chunks, including editing, disabling, or adding new chunks. This ensures that the stored information remains relevant and up-to-date.

In summary, the knowledge base in ZBrain is designed to provide a comprehensive and flexible platform for data integration, storage, and retrieval. It underpins the effectiveness of ZBrain's AI applications, ensuring they deliver accurate, relevant, and context-specific responses.
