Configure your Bot (Basic)

In the Configure your Bot section, you can tailor your Public/Private apps with various parameters that enhance the user experience and shape how users interact with the bot when asking questions and receiving responses.

Features in the Configure your Bot section -

  • Mode (Basic/Advanced) - The user can select the mode in which the bot responds to queries. There are two modes: Basic and Advanced. In Basic mode, the bot relies on the knowledge base to answer queries, whereas in Advanced mode, you can create your own flow and query the bot accordingly.

In Basic mode, there are various settings you can update, explained in the following points.

  • Choose Knowledge Base(s) - The user can select one or more knowledge bases from the available list of knowledge bases that have been created for the particular app. Select the checkbox to add the required knowledge base.

  • Select Information Schema - The user can select one or more schemas associated with the selected knowledge base(s) to query data in depth.

  • Add Instructions - The user can add additional bot instructions to guide the bot's responses for the end user.

  • Followup Conversation - Users can toggle the Followup Conversation option if they want a conversational experience from the bot. In this mode, the bot draws on the preceding conversation to answer follow-up questions precisely, so users do not need to restate context in each query.

  • Use CSV Agent - Users can enable the Use CSV Agent option when a knowledge base is built from CSV files and they need to perform specific mathematical or column-based operations. This feature improves the accuracy of data insights for such knowledge bases.

  • Summarize Docs - Users can activate the Summarize Docs option to retrieve a summary of the knowledge base, even if it contains a large amount of data. They can achieve this by specifying the knowledge base file name when making queries within the app.

  • Retriever - Retrievers use semantic search to find the context chunks of a document that are closest to a query. In this process, a numerical vector (an embedding) is calculated for every document, and those vectors are stored in a vector database optimized for storing and querying vectors. Incoming queries are vectorized as well, and the documents retrieved are those closest to the query in the embedding space.

    • ZBrain Cosine Retriever - The ZBrain cosine retriever obtains the closest, contextually similar chunks of text from a document for a given query. It relies on semantic search, calculating numerical vectors (embeddings) for both documents and queries and retrieving the documents closest to the query in the embedding space. The ZBrain cosine retriever is the default retriever in all ZBrain apps.
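To make this concrete, here is a minimal, self-contained sketch of cosine-similarity retrieval. The `embed` function is a toy stand-in for a real embedding model (ZBrain's actual embeddings and retrieval internals are not public); it exists only to illustrate how the closest chunk is selected:

```python
import numpy as np

# Toy embedding: map each word to a fixed random vector and average them.
# This is a hypothetical stand-in for a real embedding model.
rng = np.random.default_rng(0)
vocab: dict[str, np.ndarray] = {}

def embed(text: str) -> np.ndarray:
    vecs = []
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = rng.normal(size=64)
        vecs.append(vocab[word])
    return np.mean(vecs, axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

chunks = [
    "ZBrain apps answer questions from a knowledge base.",
    "Temperature controls the randomness of the model output.",
    "The CSV agent performs column-based operations.",
]
query = "how random is the model output"

# Rank every chunk by cosine similarity to the query and keep the closest one.
scores = [cosine(embed(query), embed(c)) for c in chunks]
print(chunks[int(np.argmax(scores))])
```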

    • Multivector Retriever - The multivector retriever stores multiple vectors for each document, a versatility that proves advantageous across a range of use cases. The methods for creating multiple vectors per document include the following (see the sketch after this list):

      1. Langchain Smaller Multivector: It splits a document into smaller chunks and embeds those chunks.

      2. Langchain Summary Multivector: It generates a summary for each document and embeds it alongside, or even in place of, the original document.

      3. Langchain Hypothetical Multivector: It formulates hypothetical questions that are suitable for addressing each document and seamlessly embeds these questions alongside, or as a replacement for, the document itself.
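Below is a minimal sketch of the first method (smaller chunks) using LangChain's MultiVectorRetriever. The class names follow LangChain's documented API, but module paths can differ between versions; the example assumes the langchain, langchain-openai, and faiss packages plus an OpenAI API key, and `handbook.txt` is a hypothetical source file:

```python
import uuid

from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

# Parent documents to index; handbook.txt is a hypothetical source file.
docs = [Document(page_content=open("handbook.txt").read())]
doc_ids = [str(uuid.uuid4()) for _ in docs]
id_key = "doc_id"

# Split each parent into small child chunks that remember their parent's id.
splitter = RecursiveCharacterTextSplitter(chunk_size=400)
child_docs = []
for doc, doc_id in zip(docs, doc_ids):
    for chunk in splitter.split_documents([doc]):
        chunk.metadata[id_key] = doc_id
        child_docs.append(chunk)

# Child chunks are embedded; whole parent documents live in a separate docstore.
vectorstore = FAISS.from_documents(child_docs, OpenAIEmbeddings())
docstore = InMemoryStore()
docstore.mset(list(zip(doc_ids, docs)))

retriever = MultiVectorRetriever(
    vectorstore=vectorstore, docstore=docstore, id_key=id_key
)
# Small chunks match the query, but the full parent document is returned.
results = retriever.invoke("vacation policy")
```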

    • Vector store-backed retriever - A vector store-backed retriever uses a vector store to retrieve documents. It is a lightweight wrapper around the vector store class that conforms to the retriever interface, using the search methods implemented by the vector store, such as similarity search and MMR, to query the texts it holds. Three variants are available (see the sketch after this list):

      1. Langchain Marginal Vector Backed - Maximal Marginal Relevance (MMR) is a method used to avoid redundancy while retrieving items relevant to a query. Instead of merely retrieving the most relevant items (which are often very similar to each other), MMR balances relevance and diversity in the retrieved results.

      2. Langchain Score Vector Backed - This retrieval method establishes a similarity score threshold and only returns documents with a score above that threshold. In our setup, we've configured this threshold to be 0.5.

      3. Langchain TopK Vector Backed - You can tune the retrieval process by specifying search keyword arguments such as ‘k’, the number of chunks to fetch. It is set to 50 by default so that the added context fits within the token input size.
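The sketch below shows the three configurations through LangChain's `as_retriever` interface (module paths may vary by LangChain version; the placeholder texts and the OpenAI API key are assumptions):

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Placeholder knowledge-base chunks; a real app would index its own documents.
vectorstore = FAISS.from_texts(
    ["chunk about billing", "chunk about onboarding", "chunk about security"],
    OpenAIEmbeddings(),
)

# 1. Marginal: MMR balances relevance against diversity.
mmr_retriever = vectorstore.as_retriever(search_type="mmr")

# 2. Score: only return chunks scoring above the 0.5 threshold noted above.
score_retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.5},
)

# 3. TopK: fetch the 50 nearest chunks, matching the default described above.
topk_retriever = vectorstore.as_retriever(search_kwargs={"k": 50})

results = mmr_retriever.invoke("How does onboarding work?")
```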

    • Contextual Compression - One challenge with retrieval is that you usually don't know the specific queries your document storage system will face when you ingest data. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses. Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query so that only the relevant information is returned. This "compression" involves both condensing the contents of individual documents and, if needed, filtering out entire documents. Four variants are available (see the sketch after this list):

      1. Langchain Extractor Contextual Retriever - Wraps the base retriever with a ContextualCompressionRetriever and adds an LLMChainExtractor, which systematically processes the initially retrieved documents and selectively extracts the content that directly pertains to the given query, improving the efficiency and relevance of the retrieved information.

      2. Langchain Filter Contextual Retriever - The LLMChainFilter is a slightly simpler but more robust compressor that uses an LLM chain to decide which of the initially retrieved documents to filter out and which ones to return without manipulating the document contents.

      3. Langchain Embedding Filter Contextual Retriever - Making an extra LLM call over each retrieved document is expensive and slow. The EmbeddingsFilter works by embedding both the documents and the query, and then it returns only those documents whose embeddings closely match the query. This approach significantly reduces expenses and speeds up the retrieval process.

      4. Langchain Compressor Contextual Retriever - Using the DocumentCompressorPipeline, we can also easily combine multiple compressors in sequence. Along with compressors, we can add BaseDocumentTransformers to our pipeline, which focuses on transforming sets of documents without engaging in contextual compression. For instance, TextSplitters can divide documents into more manageable sections, while EmbeddingsRedundantFilter can effectively identify and filter out redundant documents based on their embedding similarity.
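The sketch below wires up the four compression variants with LangChain's documented compressor classes (module paths may vary by version; the placeholder chunks and the OpenAI API key are assumptions):

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import (
    DocumentCompressorPipeline,
    EmbeddingsFilter,
    LLMChainExtractor,
    LLMChainFilter,
)
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_transformers import EmbeddingsRedundantFilter
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(temperature=0)
embeddings = OpenAIEmbeddings()

# A base retriever over placeholder chunks, like those in the previous sketch.
base_retriever = FAISS.from_texts(
    ["placeholder chunk one", "placeholder chunk two"], embeddings
).as_retriever(search_kwargs={"k": 20})

# 1. Extractor: an LLM pulls out only the query-relevant passages.
extractor = LLMChainExtractor.from_llm(llm)

# 2. Filter: an LLM keeps or drops whole documents without editing their text.
llm_filter = LLMChainFilter.from_llm(llm)

# 3. Embedding filter: far cheaper than an LLM call per retrieved document.
embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.5)

# 4. Pipeline: split, drop near-duplicate chunks, then keep only relevant ones.
pipeline = DocumentCompressorPipeline(
    transformers=[
        CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=". "),
        EmbeddingsRedundantFilter(embeddings=embeddings),
        embeddings_filter,
    ]
)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=pipeline,       # swap in extractor or llm_filter as needed
    base_retriever=base_retriever,
)
results = compression_retriever.invoke("example query")
```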

  • Default Settings - In the default settings, users can view the GPT configurations being used, such as the model type, context length, response length and other factors that enhance the bot's performance. Users can modify these settings by clicking the Edit button to customize the default configurations.

  • Model - The user can select the LLM that generates the query response.

  • Temp - The user can set the temperature, from 0.1 to 1, to control the randomness of the query response.

  • Score - The user can set a score from 0.1 to 1 that determines how closely retrieved context from the knowledge base must match the query.

  • Number of Chunks - The user can set how many matching chunks are retrieved to produce a relevant response.

  • Context Max Token - The maximum number of input tokens sent to the LLM.

  • Response Max Token - The maximum number of output tokens the LLM returns for a query response.
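These settings map onto familiar LLM and retriever parameters. The sketch below is a hedged illustration, not ZBrain's actual API; the LangChain calls are documented, but the specific values and the wiring are hypothetical:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Model / Temp / Response Max Token map onto the LLM call.
llm = ChatOpenAI(
    model="gpt-4",    # Model: which LLM answers the query
    temperature=0.3,  # Temp: 0.1 (focused) to 1 (more random)
    max_tokens=512,   # Response Max Token: cap on output length
)

# Score / Number of Chunks map onto the retriever.
vectorstore = FAISS.from_texts(["placeholder chunk"], OpenAIEmbeddings())
retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={
        "k": 4,                  # Number of Chunks: how many matches to return
        "score_threshold": 0.5,  # Score: minimum similarity to the query
    },
)

# Context Max Token would cap how much retrieved text is packed into the prompt.
```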

  • Save/Test Bot Performance - Once you've made changes in the Configure your Bot section, you'll find two buttons: Save and Test Bot Performance. Clicking Save stores the updated configuration for future use. Clicking Test Bot Performance generates a bot performance report based on the provided configuration.
