# Azure Machine Learning Model Monitoring SDK
The `azureml-ai-monitoring` package provides an SDK that enables the Model Data Collector (MDC) for custom logging, which lets customers collect data at arbitrary points in their data pre-processing pipeline. Customers can use the SDK in `score.py` to log data to the desired sink before, during, and after any data transformations.
Start by importing the `azureml-ai-monitoring` package in `score.py`:
```
import pandas as pd
import json
from azureml.ai.monitoring import Collector


def init():
    global inputs_collector, outputs_collector

    # instantiate collectors; the names must match the collections in the deployment spec
    inputs_collector = Collector(name='model_inputs')
    outputs_collector = Collector(name='model_outputs')


def run(data):
    # json data: { "data" : { "col1": [1,2,3], "col2": [2,3,4] } }
    pdf_data = preprocess(json.loads(data))

    # tabular data: { "col1": [1,2,3], "col2": [2,3,4] }
    input_df = pd.DataFrame(pdf_data)

    # collect inputs data and keep the returned correlation context
    context = inputs_collector.collect(input_df)

    # perform scoring with a pandas DataFrame; the return value is also a pandas DataFrame
    output_df = predict(input_df)

    # collect outputs data, passing the correlation context so inputs and outputs can be correlated later
    outputs_collector.collect(output_df, context)

    return output_df.to_dict()


def preprocess(json_data):
    # preprocess the payload so it can be converted to a pandas DataFrame
    return json_data["data"]


def predict(input_df):
    # process the input and return the outputs
    ...
    return output_df
```
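To exercise the `run` flow on a workstation without a deployment, one option is a small in-memory stand-in for the collector. The sketch below is an assumption-laden illustration: `StubCollector` is a hypothetical class, not part of `azureml-ai-monitoring`, and it mirrors only the call shape used above (`collect(data)` returns a context, `collect(data, context)` accepts one). Plain dicts replace the pandas DataFrame, and the "model" is a placeholder that adds two columns.

```python
import json
import uuid


class StubCollector:
    """Hypothetical local stand-in; NOT part of azureml-ai-monitoring."""

    def __init__(self, name):
        self.name = name
        self.records = []  # (context, data) pairs kept in memory

    def collect(self, data, correlation_context=None):
        # Reuse the caller's context if one is given, otherwise create one.
        context = correlation_context or str(uuid.uuid4())
        self.records.append((context, data))
        return context


inputs_collector = StubCollector("model_inputs")
outputs_collector = StubCollector("model_outputs")


def run(data):
    columns = json.loads(data)["data"]
    context = inputs_collector.collect(columns)
    # Placeholder "model": add the two columns row by row.
    outputs = {"sum": [a + b for a, b in zip(columns["col1"], columns["col2"])]}
    outputs_collector.collect(outputs, context)
    return outputs


result = run('{"data": {"col1": [1, 2, 3], "col2": [2, 3, 4]}}')
```

Because both stub collectors record the same context value, a test can confirm that each output record would be paired with its input record.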
Create an environment with the base image `mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04` and the following conda dependencies, then build the environment.
```
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy=1.23.5
  - pandas=1.5.2
  - pip=22.3.1
  - pip:
      - azureml-defaults==1.38.0
      - requests==2.28.1
      - azureml-ai-monitoring~=0.1.0b1
name: model-env
```
Create a deployment with custom logging enabled (`model_inputs` and `model_outputs` are enabled) that uses the environment you just built. Update the YAML according to your scenario.
```
#source ../configs/model-data-collector/data-storage-basic-OnlineDeployment.YAML
$schema: http://azureml/sdk-2-0/OnlineDeployment.json
endpoint_name: my_endpoint #unchanged
name: blue #unchanged
model: azureml:my-model-m1:1 #azureml:models/<name>:<version> #unchanged
environment: azureml:custom-logging-env:1 #unchanged
data_collector:
  collections:
    model_inputs:
      enabled: true
    model_outputs:
      enabled: true
```
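Since the collector names in `score.py` need to line up with the collections in the deployment spec, a quick pre-deployment check can catch typos. The following is only a sketch: `missing_collections` is a hypothetical helper, and the spec is assumed to be already loaded into a Python dict with the same shape as the YAML above.

```python
def missing_collections(collector_names, deployment_spec):
    """Return collector names that have no enabled collection in the spec."""
    collections = deployment_spec.get("data_collector", {}).get("collections", {})
    enabled = {name for name, cfg in collections.items() if cfg.get("enabled")}
    return sorted(set(collector_names) - enabled)


# Dict form of the data_collector section shown above.
spec = {
    "data_collector": {
        "collections": {
            "model_inputs": {"enabled": True},
            "model_outputs": {"enabled": True},
        }
    }
}

ok = missing_collections(["model_inputs", "model_outputs"], spec)
typo = missing_collections(["model_inputs", "model_output"], spec)
```

An empty result means every collector name used in `score.py` has a matching enabled collection.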
By default, the collector raises an exception on unexpected behavior (for example, custom logging is not enabled, the collection is not enabled, or the data type is not supported). To handle errors yourself, pass an `on_error` callback:
```
import logging
collector = Collector(name="inputs", on_error=lambda e: logging.info("ex:{}".format(e)))
```
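A named function can be easier to test and reuse than an inline lambda. The sketch below assumes only what the snippet above shows, namely that `on_error` is called with the exception as its single argument. The logger name `score` and the function `log_collector_error` are illustrative choices, not part of the SDK.

```python
import logging

logger = logging.getLogger("score")


def log_collector_error(error):
    """Log a collection failure instead of letting it propagate."""
    logger.warning("data collection failed: %s", error)


# Intended use inside init(), in the deployed environment:
# collector = Collector(name="inputs", on_error=log_collector_error)
```

Logging at warning level keeps the message visible under Python's default logging configuration, which filters out `info` messages.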
# Change Log
## [v0.1.0b1](https://pypi.org/project/azureml-ai-monitoring) (2023.4.25)
**New Features**
- Support model data collection for pandas DataFrame.