Predictive Analytics application

Overview

The Predictive Analytics application lets you manage your models by uploading, downloading, activating, or deactivating them. It also gives you insight into your models by capturing their runtime performance and presenting it through meaningful KPIs.

In addition, it lets you manage custom resources that your models might need. These resources include custom functions and look-up tables.

The following sections describe all functions of the Predictive Analytics application in detail.

Home screen

In the Cumulocity application, you access the Predictive Analytics application through the app switcher.

Clicking Predictive Analytics in the app switcher opens the Predictive Analytics application and shows its Home screen.

The Home screen provides:

Managing models

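The Models page is where you manage your models.

Model management functionality includes uploading, downloading, activating or deactivating, and deleting models, as well as viewing model properties and KPIs.

Click Models in the navigator to open the Models page.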

Uploading models

To upload a new model, click Add model, navigate to the desired model’s PMML file and then click Open.

Once your model is successfully uploaded, you will see a corresponding confirmation message. The new model will be added to the models list.

When uploading a model, use the Apply PMML Cleanser toggle in the top menu bar to enable or disable the PMML cleanser.

By default, the toggle is enabled.

If the Apply PMML Cleanser toggle is enabled during model upload, comprehensive semantic checks and corrections are performed on the provided PMML file. Disabling the toggle shortens the upload time, but this is not recommended. If the PMML file is large, such as for a Random Forest model, we recommend compressing the file using ZIP/GZIP before uploading. This reduces the upload time drastically.
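The compression step recommended above can be done with any ZIP/GZIP tool. As a minimal sketch, the following Python function writes a GZIP-compressed copy of a PMML file. The function name and the file name in the usage note are hypothetical, and the sketch covers only the local compression, not the upload itself.

```python
import gzip
import shutil
from pathlib import Path


def gzip_pmml(source: Path) -> Path:
    """Write a GZIP-compressed copy of `source` next to it and return its path."""
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        # Copy in chunks so that a large model file is never fully held in memory.
        shutil.copyfileobj(src, dst)
    return target
```

For example, calling gzip_pmml(Path("random_forest_model.pmml")) would produce random_forest_model.pmml.gz in the same folder, which can then be selected in the Add model dialog.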

Downloading models

A model can be downloaded in various formats for future use.

For each format a clickable icon is provided in the model cards.

Icon               Download format
Download icon 1    Downloads the PMML source as a PMML file without annotations
Download icon 2    Downloads the PMML source as a PMML file with annotations
Download icon 3    Downloads the model’s serialized version as a binary file

Activating or deactivating models

A model that has not been used for a long time can be deactivated so that it does not occupy space in the memory of the system.

Click the Active/Inactive toggle button in a model’s card to deactivate or activate the model.

Deleting models

To delete a model, click the delete icon on its card and confirm the deletion.

Once a model is deleted, it will be removed permanently from your list of models.

Viewing model properties and KPIs

A model has many important properties, such as its inputs and outputs, as well as meaningful KPIs, like memory snapshots, which give you insight into the runtime performance of the model.

Click the details icon on the top right of a card to view the properties and KPIs of a model.

Besides the name, description and status of the model, the Model Details window shows the inputs and outputs of the model and some useful charts created using the KPIs. These charts currently include the Memory Metrics and the Prediction Metrics.

Memory Metrics provides information about the memory footprint of the model on the server and its related attributes like used memory, free memory and total memory of the application. The same information is represented as a vertical bar chart.

Prediction Metrics provides a scoring result summary for the model. For a classification model, Prediction Metrics displays the predicted categories and their respective counts as a pie chart. For a regression model, it displays the five-point summary of the predicted values, i.e. the Minimum, FirstQuartile, Median, ThirdQuartile and Maximum values, as a box plot. Initially, the Prediction Metrics of a model are empty; they are displayed only after data has been scored against the model. The Prediction Metrics shown are always cumulative over all past scoring of the model, and they are reset when the model is deleted or deactivated.
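To make the two summaries concrete, here is a minimal Python sketch of how a five-point summary and per-category counts can be computed from a list of predictions. It is an illustration only: the function names are hypothetical, and the quartile convention used here (the "inclusive" method of Python's statistics module) is an assumption, since the convention used by the application is not stated.

```python
from collections import Counter
from statistics import quantiles


def five_point_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    ordered = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions give slightly
    # different first and third quartiles on small samples.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def category_counts(labels):
    """Count how often each predicted category occurs."""
    return dict(Counter(labels))


# Made-up predictions, for illustration only.
print(five_point_summary([2.0, 4.0, 6.0, 8.0, 10.0]))  # (2.0, 4.0, 6.0, 8.0, 10.0)
print(category_counts(["yes", "no", "yes"]))           # {'yes': 2, 'no': 1}
```

The first result corresponds to what a box plot of a regression model's outputs would show, the second to the slices of a pie chart for a classification model.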

Currently the Prediction Metrics feature is supported only for classification and regression models.

Info: By default, the Inputs and Outputs panels are in collapsed state. Click the labels to expand them.

Managing resources

The Resources page is where you manage the resources, i.e. the custom functions and look-up tables, that a model might need.

Resource management functionality includes uploading, downloading, and deleting resources.

Click Resources in the navigator to open the Resources page.

Uploading resources

To upload a new resource, click Add resource, navigate to the desired resource file and then click Open.

Once your resource is successfully uploaded, you will see a corresponding confirmation message. The new resource will be added to the resources list.

Downloading resources

To download the source file of a resource, click the download icon in its card.

Typically, the source of a resource is either a JAR file or an Excel sheet.

Deleting resources

To delete a resource, click the delete icon on its card and confirm the deletion.

Once a resource is deleted, it will be removed permanently from your resources list.

Processing data

The Predictions page allows you to make meaningful predictions by scoring the data from your devices against your predictive models.

Click Predictions in the navigator to open the Predictions page.

Batch Processing

Batch Processing allows you to process data records supplied to the model in a CSV or JSON file. The records are processed in batches. It also allows you to pass on image data as a JPEG or PNG file for processing.

Batch Processing is primarily intended for testing the accuracy of your predictive models by applying them to your test data. Hence, it is expected that you know the expected outputs beforehand.

Running the batch process

To run the batch process, perform the following steps:

  1. Click Start in the Batch processing tab to initiate the batch processing.
  2. In the Batch Processing wizard, select a model from the dropdown list. The dropdown list shows all models which you have uploaded to the Models page.
    Click Next to proceed.
  3. Upload the file containing the CSV/JSON records of your test data or choose the image file (JPEG/PNG) you want to process. Drag and drop a file or select it by browsing.
    On uploading a valid file, you will see an uploading message.

Info: The size of the uploaded file must not exceed 500 MB.

After the processing has been completed, you will see a corresponding notification.

Viewing the results

Click Show Results to preview the scored or processed results.

By design, the Results page previews a maximum of 500 records in a paginated manner, displaying 10 records per page.

In the top right of the Results page you find several buttons to perform the following actions:

Button       Action
Download     Download the entire set of processed results.
Filter       Enable or disable filters.
Configure    Configure the columns to be shown in the results table.

Ideally, to measure the accuracy of the model against your data, you should specify the desired outputs as part of your data file. If they are specified, the processed results include a separate column called Match, which indicates whether the computed and the expected outputs match.
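The idea behind the Match column can be sketched in a few lines of Python. This is an illustration only: the column names and records are made up, and exact equality is assumed as the comparison, since the way the application compares computed and expected outputs is not described here.

```python
def add_match_column(rows, computed_key, expected_key):
    """Return copies of `rows` with a boolean "Match" entry added."""
    return [
        {**row, "Match": row[computed_key] == row[expected_key]}
        for row in rows
    ]


def accuracy(rows):
    """Fraction of rows whose "Match" entry is True."""
    return sum(row["Match"] for row in rows) / len(rows)


# Hypothetical scored records; the column names are illustrative only.
scored = [
    {"predicted_class": "setosa", "expected_class": "setosa"},
    {"predicted_class": "virginica", "expected_class": "versicolor"},
    {"predicted_class": "setosa", "expected_class": "setosa"},
    {"predicted_class": "versicolor", "expected_class": "versicolor"},
]
matched = add_match_column(scored, "predicted_class", "expected_class")
mismatches = [row for row in matched if not row["Match"]]
print(accuracy(matched), len(mismatches))  # 0.75 1
```

Filtering for rows where Match is false, as in the mismatches list, gives the records whose outputs differ from the expected ones.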

Click the cogwheel icon and select Hide matching rows to hide all rows where the Match column is true, i.e. to display only the records where the computed and expected outputs differ.

Click the file icon in front of a row to download a full execution trace that shows exactly what happened when that record was applied to the model. In this way, you can investigate why the outputs did not match.

Scheduled Processing

Scheduled Processing allows you to schedule batch jobs for processing measurements from devices or device groups against an available model.

The job scheduler can be used to trigger one-time or periodic jobs on data captured from devices. The scheduler provides a mapping tool that lets you map device data to model inputs. Periodic executions of batch jobs can be useful when aggregate information on a model’s predictions is required for a given time period.

Scheduling a job

To schedule a new job, perform the following steps:

  1. Click Create Job in the Scheduled Processing tab.
  2. In the Job Config wizard, enter the name and description of the job you want to create. Select a target device or device group from the dropdown list. The list shows a maximum of 2000 devices or groups. Then select the target model to be used for processing the data captured from your selected device or device group. The dropdown list shows all models which you have uploaded to the Models page.
    Click Next to proceed.
  3. Each device can have various measurements which are persisted in Cumulocity. In the Mapping section, map the device measurements to the corresponding model inputs.
    Click Next to proceed.
  4. Set the schedule of the job by selecting the frequency for the job followed by when it should run. You also need to specify the data range to be used for processing when the job is executed.
    Click Finish to schedule the job that you just configured.
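The mapping step in the wizard can be pictured as a lookup from measurement values to model input names. The following Python sketch illustrates the idea under stated assumptions: the measurement is assumed to follow the fragment/series/value layout commonly used for Cumulocity measurements, and the mapping format, function name, and field names are hypothetical, not the wizard's internal representation.

```python
def map_measurement_to_inputs(measurement: dict, mapping: dict) -> dict:
    """Build a model input record from one measurement document.

    `mapping` maps "fragment.series" paths to model input names.
    """
    record = {}
    for path, input_name in mapping.items():
        fragment, series = path.split(".", 1)
        record[input_name] = measurement[fragment][series]["value"]
    return record


# Hypothetical measurement and mapping, for illustration only.
measurement = {
    "type": "c8y_EnvironmentMeasurement",
    "c8y_Temperature": {"T": {"value": 21.5, "unit": "C"}},
    "c8y_Humidity": {"H": {"value": 40.0, "unit": "%"}},
}
mapping = {"c8y_Temperature.T": "temperature", "c8y_Humidity.H": "humidity"}
print(map_measurement_to_inputs(measurement, mapping))
# {'temperature': 21.5, 'humidity': 40.0}
```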

Info:
1. For a periodic frequency, a CRON expression is generated and used by the scheduler.
2. The data range selected for the schedule must not exceed 24 hours.
3. For a one-time job, you need to select the date when the job should run. You also need to specify the data range to be used for processing when the job is executed.
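The 24-hour limit on the data range is easy to check before filling in the wizard. The following Python sketch shows such a check; the function name is hypothetical, and it is assumed that a range of exactly 24 hours is still allowed.

```python
from datetime import datetime, timedelta

MAX_DATA_RANGE = timedelta(hours=24)


def is_valid_data_range(start: datetime, end: datetime) -> bool:
    """True if `end` is after `start` and the span is at most 24 hours."""
    return timedelta(0) < end - start <= MAX_DATA_RANGE


day_start = datetime(2024, 1, 1, 0, 0)
print(is_valid_data_range(day_start, day_start + timedelta(hours=24)))  # True
print(is_valid_data_range(day_start, day_start + timedelta(hours=25)))  # False
```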

After the job is scheduled, you will see a corresponding notification.

Note that if many jobs are scheduled, the underlying MongoDB of a tenant might, over time, become overpopulated with execution data from these jobs. Hence, it is recommended to have a retention rule in place to clean up data which is too old.
To do so, create a retention rule for events containing ZementisExecution in their type field. This rule does not remove the jobs themselves but only the data from the execution of the jobs. For details on adding retention rules, see To add a retention rule.
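For readers who prefer to script this, the rule described above might be expressed as a JSON document along the following lines. This is a sketch under assumptions: the field names (dataType, type, fragmentType, source, maximumAge) reflect the retention rule representation of the platform's REST API as best recalled and should be verified against the current API reference, the function name is hypothetical, and the 30-day age is an arbitrary example value.

```python
import json


def zementis_retention_rule(max_age_days: int) -> dict:
    """Build a retention rule document for job execution events."""
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    return {
        # Assumed field names; check them against the retention rule API.
        "dataType": "EVENT",
        "type": "ZementisExecution",
        "fragmentType": "*",
        "source": "*",
        "maximumAge": max_age_days,  # age in days
    }


print(json.dumps(zementis_retention_rule(30), indent=2))
```

The sketch only builds the document. Creating the rule through the user interface, as described in the referenced section, remains the documented route.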

Viewing the scheduled jobs

By design, the Scheduled Processing tab previews all scheduled jobs in a paginated manner, displaying 10 jobs per page.

Click any link in the NAME column to view the configuration of that specific job. Click the delete icon of any job to remove the job.

Viewing the execution results of jobs

To view the execution results of any job, click the history icon associated with that job in the My Jobs section of the Scheduled Processing tab.

By design, the Execution Results page previews all executions of the job in a paginated manner, displaying 10 executions per page.

For executions with status Warning or Failure, hover over the status to see the detailed reason behind the status. Click Back to see all scheduled jobs.

Viewing the inferences of a job execution

To view the inferences generated by any execution of a job, click the details icon associated with that execution in the Execution Results page.

The Inferences window shows two different types of charts: a line chart plotting the continuous outputs of the model and a pie chart plotting the model’s categorical outputs.

The inferences are shown in a paginated manner, displaying 2000 inferences per page. For executions on device groups, you can also switch between the different devices that were part of that execution.