Monitor
The Monitor feature in ZBrain offers comprehensive oversight of your AI applications by automating evaluation and performance tracking. It ensures response quality, helps identify issues proactively, and maintains optimal performance across all deployed solutions.
Monitor captures inputs and outputs from your applications and continuously evaluates responses against defined metrics at scheduled intervals. This provides real-time insight into performance, tracks success and failure rates, and highlights patterns that need attention. The results are displayed in an intuitive interface, enabling you to quickly identify and resolve issues and ensure consistent, high-quality AI interactions.
The monitor module consists of three main sections, accessible from the left navigation panel:
Events: View and manage all configured monitoring events
Monitor logs: Review detailed execution results and metrics
Event settings: Configure evaluation metrics and parameters
To set up monitoring for an application:
Access the app session:
Navigate to the Apps page
Click on your desired application
Go to the query history section of the app
Select a specific user session from the list to monitor (shows session ID, user info, prompt count)
Review the conversation:
View the session details and chat history
Examine the response options (copy, conversation logs, edit annotation reply and feedback)
Access conversation logs:
Click 'Conversation Log' to see interaction details
Review status, time, and token usage
Check the input, output, and metadata
Enable monitoring:
Click the ‘Monitor’ button in the overview tab
Click ‘Configure now’ when prompted with ‘Added for monitoring’
You will be redirected to the Events > Monitor page. In the last status column, click ‘Configure’ to open the event settings page. On the event settings screen:
Review entity information
Entity name: The name of your application
Entity type: The type of entity being monitored (e.g., App)
Verify monitored content
Monitored input: The query or prompt being evaluated
Monitored output: The response being assessed
Set evaluation frequency
Click the dropdown menu under "Frequency of evaluation"
Select the desired interval (Hourly, Every 30 minutes, Every 6 hours, Daily, Weekly, or Monthly)
Configure evaluation conditions
Click ‘Add metric’ in the Evaluation Conditions section
Select a metric type:
Response relevancy: Measures how well the response answers the question
Faithfulness: Evaluates how accurately the response aligns with the provided context
Choose the evaluation method (is less than, is greater than, or equals to)
Set the threshold value (0.1 to 5.0)
Click ‘Add’ to save the metric
Set the "Mark evaluation as" dropdown to fail or success
Test your configuration
Click the ‘Test’ button
Enter a test message if needed
Review the test results
Click ‘Reset’ if you want to try again
Save your configuration
Click ‘Update’ to save and activate your monitoring event
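The evaluation conditions configured above resolve in a straightforward way, sketched below. This snippet is a conceptual illustration only and not part of the ZBrain API; it assumes a metric score has already been produced and applies a single condition such as 'Response relevancy is less than 0.7, mark evaluation as fail'.

```python
# Conceptual sketch only; this is not the ZBrain API. It assumes a metric
# score has already been computed and shows how one configured condition,
# e.g. "Response relevancy is less than 0.7 -> mark evaluation as fail",
# resolves into a final status.

def evaluate_condition(score: float, method: str, threshold: float, mark_as: str) -> str:
    """Return 'fail' or 'success' for a single evaluation condition."""
    if method == "is less than":
        triggered = score < threshold
    elif method == "is greater than":
        triggered = score > threshold
    elif method == "equals to":
        triggered = score == threshold
    else:
        raise ValueError(f"Unknown evaluation method: {method}")

    # When the condition is met, the run gets the configured outcome
    # ("Mark evaluation as"); otherwise it gets the opposite status.
    opposite = "success" if mark_as == "fail" else "fail"
    return mark_as if triggered else opposite

# A relevancy score of 0.62 against "is less than 0.7, mark as fail":
print(evaluate_condition(0.62, "is less than", 0.7, "fail"))  # -> fail
```

In practice, all of this is configured through the event settings screen; the sketch only shows how the evaluation method, threshold, and 'Mark evaluation as' setting interact.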
The events dashboard displays all configured monitoring events in a tabular format:
Entity name: The agent or application being monitored
Entity type: Classification (App, Playground, etc.)
Input: The query being evaluated
Output: The response being assessed
Run frequency: How often evaluation occurs
Last run: When the last evaluation occurred
Last status: Current status with the ‘Configure’ option
Use the search box and the Entity (App/Playground) and Status (All, Success, Failed) dropdown filters to quickly locate specific events. Click ‘Configure’ in the last status column to modify event settings.
The monitor logs interface provides detailed performance tracking:
Event information header
Event ID: Unique identifier for the monitoring event
Entity name: The agent being monitored
Entity type: Classification (App, Playground, etc.)
Frequency: How often monitoring occurs
Metric: Performance criteria being measured
Log status visualization
Colored bars provide a quick visual indicator of recent execution results
Red bars indicate failures; green bars indicate successful evaluations
Filtering options
Status dropdown: Filter by Success/Failed status
Log time dropdown: Filter by active/inactive
Log details table
Log ID: Unique identifier for each log entry
Log time: When the evaluation occurred
LLM response: The query or prompt content
Credits: Resource utilization
Cost: Associated expense
Metrics: Per-metric result, shown as ✅ (success) or ❌ (failure)
Status: Outcome (Success/Failed with color coding)
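If you export monitor logs for your own analysis, the columns above map naturally onto simple records. The sketch below is a minimal, assumed example (a hypothetical MonitorLog record, not a ZBrain client); it summarizes entries the same way the status bars do, into a failure rate plus total credits and cost.

```python
# Minimal sketch with assumed data structures; it does not call the ZBrain API.
# Each record mirrors the log table columns above (Log ID, Log time, Credits,
# Cost, Status), and the summary mirrors the colored status bars: how many
# runs failed, plus aggregate credits and cost.

from dataclasses import dataclass

@dataclass
class MonitorLog:
    log_id: str
    log_time: str
    credits: float
    cost: float
    status: str  # "Success" or "Failed"

def summarize(logs: list[MonitorLog]) -> dict:
    failed = sum(1 for log in logs if log.status == "Failed")
    return {
        "total_runs": len(logs),
        "failure_rate": failed / len(logs) if logs else 0.0,
        "total_credits": sum(log.credits for log in logs),
        "total_cost": sum(log.cost for log in logs),
    }

# Hypothetical sample entries, purely for illustration.
sample = [
    MonitorLog("log-001", "2025-01-01 10:00", 1.0, 0.002, "Success"),
    MonitorLog("log-002", "2025-01-01 11:00", 1.0, 0.002, "Failed"),
]
print(summarize(sample))  # failure_rate = 0.5, total_credits = 2.0, ...
```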
Response relevancy
Measures how well the response answers the given question
If the response is unrelated or off-topic, the score decreases
Higher scores indicate more relevant responses
Faithfulness
Evaluates how accurately a response aligns with the provided context
Measures whether the response contains information not present in the context
Higher scores indicate more faithful responses
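ZBrain computes both scores for you; the toy sketch below is only meant to make the direction of each metric concrete. It uses a crude word-overlap proxy, which is an assumption for illustration and not the scoring method ZBrain uses: relevancy compares the response to the question, while faithfulness checks whether the response stays within the provided context.

```python
# Toy illustration only. Real evaluators use semantic comparison; this crude
# lexical-overlap proxy just shows what each metric rewards and penalizes.

def _words(text: str) -> set[str]:
    return set(text.lower().split())

def toy_relevancy(question: str, response: str) -> float:
    """Higher when the response shares more of the question's vocabulary,
    i.e. off-topic responses score lower."""
    q, r = _words(question), _words(response)
    return len(q & r) / len(q) if q else 0.0

def toy_faithfulness(context: str, response: str) -> float:
    """Higher when fewer response words fall outside the provided context,
    i.e. unsupported additions score lower."""
    c, r = _words(context), _words(response)
    return len(r & c) / len(r) if r else 0.0

context = "The monitor evaluates responses against defined metrics at scheduled intervals."
question = "How often are responses evaluated?"
response = "Responses are evaluated against defined metrics at scheduled intervals."
print(toy_relevancy(question, response), toy_faithfulness(context, response))
```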
Start with conservative thresholds (0.5-0.7) and adjust based on observed performance
Consider your use case requirements when setting thresholds:
Customer-facing applications may require higher thresholds
Internal tools might tolerate lower thresholds
Regularly review and adjust thresholds as your applications evolve
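As a hypothetical illustration of why the starting threshold matters: under a 'Response relevancy is less than <threshold>, mark as fail' condition, raising the threshold from 0.5 to 0.7 makes the check stricter. The scores below are made up purely to show that effect.

```python
# Hypothetical relevancy scores, assumed purely for illustration.
sample_scores = [0.45, 0.62, 0.71, 0.84, 0.93]

# Under an "is less than <threshold> -> mark as fail" condition, a higher
# threshold marks more runs as failed.
for threshold in (0.5, 0.7):
    failed = sum(score < threshold for score in sample_scores)
    print(f"threshold={threshold}: {failed}/{len(sample_scores)} runs marked as failed")

# threshold=0.5: 1/5 runs marked as failed
# threshold=0.7: 2/5 runs marked as failed
```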
Investigate failed evaluations by reviewing the specific LLM responses
Check metric scores to understand why responses did not meet thresholds
Adjust prompts or application configuration based on monitoring insights