📺Monitor

Overview

The Monitor feature in ZBrain delivers continuous, end-to-end oversight of every deployed agent and application. It automatically captures each input and output, evaluates the output against a comprehensive set of metrics, and surfaces real-time performance trends, so you can detect issues early, correct them quickly, and maintain consistently high-quality AI interactions.

Monitor schedules evaluations at defined intervals, logs both successes and failures, and presents the results in an intuitive console. Built-in notifications alert you as soon as an evaluation run succeeds or fails.

Metric categories

  • LLM-based - Utilizes a language model to evaluate answers for relevance and factual accuracy.

    • Response relevancy: Measures how accurately the response answers the user’s query.

    • Faithfulness: Evaluates factual alignment with context to minimize hallucinations.

  • Non-LLM metrics - Relies on deterministic checks (health, exact match, similarity) without invoking an LLM.

    • Health check: Confirms the app/agent can return a valid response; halts further checks if an invalid response is received.

    • Exact match: Compares the app/agent response character by character with the expected output.

    • F1 score: Balances precision and recall to evaluate content overlap.

    • Levenshtein similarity: Measures similarity based on edit distance between two strings.

    • ROUGE-L score: Measures the longest common subsequence between the response and reference text.

  • LLM-as-judge - Uses an LLM to emulate human reviewers on traits like creativity, clarity, and helpfulness.

    • Creativity: Rates originality in response generation.

    • Helpfulness: Evaluates how effectively the response assists in resolving the user’s query.

    • Clarity: Measures how clearly the message is conveyed.
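The non-LLM metrics above are standard text-comparison measures. As an illustration of how each one scores a response against a reference (a minimal sketch, not ZBrain's internal implementation), the four deterministic metrics can be computed like this:

```python
from collections import Counter

def exact_match(response: str, expected: str) -> bool:
    """Character-by-character comparison of response and expected output."""
    return response == expected

def f1_score(response: str, expected: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    resp_tokens = response.split()
    exp_tokens = expected.split()
    overlap = sum((Counter(resp_tokens) & Counter(exp_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(resp_tokens)
    recall = overlap / len(exp_tokens)
    return 2 * precision * recall / (precision + recall)

def levenshtein_similarity(a: str, b: str) -> float:
    """Similarity derived from edit distance: 1 - distance / max(len)."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return 1 - prev[-1] / max(len(a), len(b))

def rouge_l(response: str, reference: str) -> float:
    """ROUGE-L F-measure from the longest common subsequence of tokens."""
    x, y = response.split(), reference.split()
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, tx in enumerate(x, 1):
        for j, ty in enumerate(y, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if tx == ty else max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    p, r = lcs / len(x), lcs / len(y)
    return 2 * p * r / (p + r)
```

For example, `levenshtein_similarity("kitten", "sitting")` yields 1 − 3/7 ≈ 0.571, since the two strings are three edits apart.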

Key capabilities of Monitor

  • Automated evaluation: Assess responses using LLM-based and non-LLM-based metrics.

  • Performance tracking: Track success/failure trends.

  • Query-level monitoring: Configure evaluations at the individual query level within a session.

  • Agent and app support: Monitor both AI apps and AI agents.

  • Input flexibility: Monitor responses for .txt, PDF, image, and other file types.

  • Notification alerts: Receive real-time notifications when an event succeeds or fails.
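To make the LLM-as-judge style of automated evaluation concrete, here is a minimal sketch of how a judge prompt for one trait could be assembled and its reply parsed. The function names, prompt wording, and 1–5 scale are illustrative assumptions, not ZBrain's actual prompts or API:

```python
def build_judge_prompt(criterion: str, query: str, response: str) -> str:
    """Assemble a prompt asking a judge model to grade one trait on a 1-5 scale.
    (Hypothetical template for illustration only.)"""
    return (
        f"You are an impartial reviewer. Rate the following response for "
        f"{criterion} on a scale of 1 (poor) to 5 (excellent). "
        f"Reply with the number only.\n\n"
        f"User query:\n{query}\n\n"
        f"Response:\n{response}\n"
    )

def parse_score(raw: str, low: int = 1, high: int = 5) -> int:
    """Extract the first in-range digit from the judge model's reply."""
    for ch in raw:
        if ch.isdigit() and low <= int(ch) <= high:
            return int(ch)
    raise ValueError(f"no score in range {low}-{high} found in: {raw!r}")
```

A monitoring loop would send the built prompt to the judge model for each trait (creativity, helpfulness, clarity) and log the parsed score alongside the deterministic metrics.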

Monitor interface navigation

The Monitor module consists of four main sections, accessible from the left navigation panel:

  • Events: View and manage all configured monitoring events

  • Monitor logs: Review detailed execution results and metrics

  • Event settings: Configure evaluation metrics and parameters

  • User management: Configure role-based user permissions

Together, these capabilities provide a single dashboard for validating fixes, identifying quality drift, and ensuring that every user interaction meets your organization’s standards.
