Model Monitoring

_images/Monitor_Block_Diagram.png

Over time, most models will degrade, providing inference results that no longer achieve your business goals. DKube integrates model monitoring into the overall workflow. This allows the data science or production team to monitor the serving results, and take action if the results are no longer within acceptable tolerances.

  • Alerts can be set up based on goals and tolerances

  • A Dashboard provides a snapshot of all monitored models

  • Problems can be viewed in a set of hierarchical graphs

  • The problem, and its root cause, can be determined

  • Retraining and redeployment can be performed

Monitor Workflow

The general workflow to make use of the model monitoring system is described in this section.

  • A monitor is created, and any alerts are added at that time (see Create a Monitor)

  • The monitor can be modified after it has been created, including adding and changing alerts (see Edit an Existing Monitor)

  • The status of the monitors and alerts can be viewed in real time from the monitor dashboard screen (see Monitor Dashboard)

  • Based on the alerts, a specific monitor can be investigated hierarchically to determine what is causing the alert (see Monitor Details)

Monitor Dashboard

_images/Monitor_Dashboard_R31.png

The Monitor Dashboard provides a summary of the current active monitors. This includes:

  • A graphical, time-based summary of monitor alerts

  • A tabular list of active monitors

Note

The box labeled Monitors with Alerts applies to the last run only

The status column of the monitor list is defined as follows:

  Status       Meaning
  init         A field is missing
  baselining   Calculating results after adding datasets
  ready        Available for monitoring, but not active
  active       Running analysis
  error        Problem with the monitor

The Data Drift and Performance Decay columns are defined as follows:

  Status      Meaning
  Green Dot   The last run had no alerts
  Red Dot     An alert was generated during the last run

From this screen, monitors can be added, and existing monitors can be started, stopped, deleted, cloned and modified.

Create a Monitor

_images/Monitor_Dashboard_Add_Monitor.png

A monitor can be created from the summary dashboard screen by selecting the “+ Monitor” button at the far right above the list of monitors. This will bring up a series of screens that are completed to provide the context of the model.

Basic

_images/Monitor_Add_Monitor_Basic_R31.png

The Basic fields describe:

  • The information on the model being monitored, such as the name, type of model, and framework

  • How often the monitor should be run

  • What type of detection algorithm should be used to identify drift

  • The email address to use for notifications

Note

The Model, Model Version, & Model Endpoint fields are not currently used in the monitor. They can be optionally added to track the source Model manually. They will be used within the monitoring system in a future release.

Train Data

_images/Monitor_Add_Monitor_Train_R31.png

The Train Data screen provides information on the dataset that was used to train the model. This is used to determine whether the live inference data has drifted out of tolerance compared to the data that the model was originally trained on.

Optionally, a transformer script can be uploaded for preprocessing or postprocessing.

The Model Metrics section on this screen accepts a file that identifies which model metrics should be tracked. A sample file is included below the field.

If the % deviation option is chosen for an alert, the metric value in the model metrics file is used as the baseline for that alert.
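
As an illustration of how a percent-deviation alert can be evaluated, the sketch below compares a baseline metric from training with the same metric computed on live data. The metric name and threshold here are assumptions for illustration, not DKube's metrics-file schema.

    # Illustrative sketch only: the metric name and alert threshold are
    # hypothetical; this is not DKube's metrics-file schema.

    def percent_deviation(baseline: float, live: float) -> float:
        """Percent deviation of the live metric from the training baseline."""
        return abs(live - baseline) / abs(baseline) * 100.0

    baseline_metrics = {"accuracy": 0.94}   # value taken from the model metrics file
    live_metrics = {"accuracy": 0.88}       # same metric computed on live data

    deviation = percent_deviation(baseline_metrics["accuracy"], live_metrics["accuracy"])
    if deviation > 5.0:                     # hypothetical alert threshold (%)
        print(f"ALERT: accuracy deviated {deviation:.1f}% from the training baseline")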

Live Data

_images/Monitor_Add_Monitor_Live_R31.png

The Live Data screen identifies the datasets used for inference prediction. These are the data that the deployed Model uses to predict results, and they are used to determine whether the live data has drifted from the data that was originally used to train the model.

The Predict Data section is the input dataset that is used for live inference.

The folder hierarchy for the prediction data is:

<monitor name>/predict/<Year>/<Month>/<Day>/<Hour>/<Minute>/

_images/Monitor_Predict_Data_Format.png
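
The timestamped folder path can be generated from the prediction time, as in the sketch below. The monitor name shown is hypothetical; substitute your own monitor's name, and match the exact numeric format (such as zero-padding) to the format shown above.

    # Sketch of building the prediction-data folder path described above.
    from datetime import datetime, timezone

    monitor_name = "insurance-monitor"      # hypothetical monitor name
    now = datetime.now(timezone.utc)

    # <monitor name>/predict/<Year>/<Month>/<Day>/<Hour>/<Minute>/
    predict_folder = (
        f"{monitor_name}/predict/"
        f"{now:%Y}/{now:%m}/{now:%d}/{now:%H}/{now:%M}/"
    )
    print(predict_folder)   # e.g. insurance-monitor/predict/2024/05/14/09/30/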

If the correct prediction (ground truth) is available, it can also be included through the Labelled Data section. This pairs the model prediction value with the ground truth, and measures how well the model predicts on the live data, using the metrics provided in the Train Data/Model Metrics section.

The folder hierarchy for the labelled data is <monitor name>/groundtruth/

_images/Monitor_Groundtruth_Format.png
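
The labelled-data path can be built the same way; again, the monitor name below is only an example.

    # Sketch of the labelled (ground-truth) data folder path described above.
    monitor_name = "insurance-monitor"              # hypothetical monitor name
    groundtruth_folder = f"{monitor_name}/groundtruth/"
    print(groundtruth_folder)                       # insurance-monitor/groundtruth/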

Schema

After the basic information and dataset sections have been completed, the schema needs to be modified to reflect the features. This is accomplished by selecting the “Edit Schema” icon from the dashboard.

_images/Monitor_Dashboard_Schema_R31.png

The Schema screen lists the features that are part of the training data. From this screen, you can choose which features to monitor, what type of feature each one is (input, prediction, etc.), and whether the feature is continuous (a number) or categorical (a distinct category such as true or false).

_images/Monitor_Edit_Schema_R31.png
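
As a general illustration of the continuous/categorical distinction (a common heuristic, not part of DKube; the file and column names are hypothetical), features can be inspected in the training data itself:

    # Illustration of continuous vs. categorical features; not a DKube API.
    # The file name and column names are hypothetical.
    import pandas as pd

    train_df = pd.read_csv("training_data.csv")

    for column in train_df.columns:
        numeric = pd.api.types.is_numeric_dtype(train_df[column])
        if numeric and train_df[column].nunique() > 10:
            kind = "continuous"     # a number with many distinct values
        else:
            kind = "categorical"    # a distinct category such as true/false
        print(f"{column}: {kind}")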

Alerts

Alerts provide notifications when an input or output of the Model drifts out of tolerance. Alerts can be added by selecting the “Add Alert” icon on the dashboard.

_images/Monitor_Dashboard_Alerts_R31.png

The Alerts screen shows the alerts that have been added, and allows the user to create a new alert. An Alert is configured by selecting the type of condition to monitor (feature drift or performance decay). In each case, an email notification can be configured for when the Alert triggers.

_images/Monitor_Alerts_R31.png

  Type                Description
  Feature Drift       Compares the training input dataset to the prediction input dataset
  Performance Decay   Compares an output metric to a threshold value, or to a percentage deviation from the original training metric

_images/Monitor_Add_Alert_Popup_R31.png

The alert will show up on the list of Alerts once successfully created.

_images/Monitor_Edit_Alert_R31.png

Alerts can be edited from the Alert List screen by selecting the Edit icon on the far right.

Edit an Existing Monitor

_images/Monitor_Dashboard_Edit_Monitor_R31.png

An existing monitor can be modified by selecting the edit icon on the right of the monitor summary.

Monitor Details

_images/Monitor_Dashboard_Select_Monitor_R31.png

The process of identifying the root cause of a monitor deviation involves successively reviewing more information on an Alert. From the Monitoring Dashboard, select one of the Monitors to find out more details on the alerts for that Monitor.

From the monitor summary dashboard, the details of a specific monitor can be viewed by selecting the monitor name.

_images/Monitor_Details_Dashboard_R31.png

This brings up a dashboard for that particular monitor, with the associated details. It includes:

  • A graph of Alerts for that monitor only, for the selected timeframe

  • A list of active features and their configuration

Monitor Graphs

_images/Monitor_Details_Monitor_Data_Drift_R31.png

Selecting the Monitor tab within the Monitor Details screen provides graphs and tables that help to identify what has drifted, with data to determine why it has drifted. There are two types of detailed Monitor graphs:

  • Data Drift

  • Performance Decay

Data Drift

In the Data Drift selection, the input live data file is compared to the original training data to determine if it has drifted. The top graph overlays the number of production serving requests with the number of Alerts. This allows the Production Engineer to determine the amount of live inference traffic activity, and how it compares to the threshold alerts for the features.

The table below the summary graph provides visual and quantified information on how the selected features are changing, and how important each feature is to the resulting Model output. This allows the user to see whether the original training data still matches the live inference data.

For example, the top feature in the figure indicates that the data has significantly deviated. The feature impacts the outcome in a non-trivial way, and the calculated drift for this analysis indicates that there is a very high amount of drift. The graph on the right shows how the drift varies over time. This might be a place to start for a retraining activity.
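
The drift-detection algorithm itself is chosen on the Basic screen and is not detailed here. As a general illustration of how drift in a single feature can be quantified, the sketch below applies a two-sample Kolmogorov-Smirnov test to the training and live distributions; the file and column names are hypothetical, and this is not DKube's internal computation.

    # General illustration of quantifying drift for one feature; this is not
    # DKube's detection algorithm. File and column names are hypothetical.
    import pandas as pd
    from scipy.stats import ks_2samp

    train_df = pd.read_csv("training_data.csv")
    live_df = pd.read_csv("live_predict_data.csv")

    feature = "age"                          # hypothetical feature name
    statistic, p_value = ks_2samp(train_df[feature], live_df[feature])

    # A large statistic (and small p-value) suggests the live distribution
    # has drifted away from the training distribution.
    if p_value < 0.05:
        print(f"Feature '{feature}' appears to have drifted (KS statistic {statistic:.3f})")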

Performance Decay

_images/Monitor_Details_Monitor_Performance_R31.png

If Performance is selected, the graphs show how well the Model is performing based on the chosen Model metrics. The top graph combines the number of production requests and the number of alerts.

The bottom graph shows how the metrics are performing.
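
As a simple illustration of how a performance metric can be computed once ground truth is available (not DKube's internal computation; the file and column names are hypothetical):

    # Illustration of computing one performance metric from predictions and
    # ground truth; not DKube's internal computation. Names are hypothetical.
    import pandas as pd
    from sklearn.metrics import accuracy_score

    labelled_df = pd.read_csv("groundtruth_data.csv")   # predictions + ground truth
    accuracy = accuracy_score(labelled_df["groundtruth"], labelled_df["prediction"])
    print(f"Live accuracy: {accuracy:.3f}")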

Configuration

_images/Monitor_Details_Configuration_R31.png

The Configuration, Schema, and Alerts tabs allow the user to view the options used for the monitor.