Operating Cumulocity IoT DataHub
This section describes how you can access system information, usage statistics, and audit logs.
In the navigator, select Administration and then System status to get information about the system configuration and its status.
Under Microservice you will find the status of the microservice, which is either marked as green or red. This status reflects whether the microservice can be accessed from the web application. If the microservice is accessible, its current version is shown. If not, check the status of the microservice and its logs as described in section Administration > Managing applications in the Cumulocity IoT User guide.
Under Web application you will find the version of the web application.
Under Dremio you will find the status of Dremio, which is either marked as green or red. This status reflects whether Dremio can be accessed from the microservice. If Dremio is accessible, its current version is shown. If not, check the status of the microservice and its logs as described in section Administration > Managing applications in the Cumulocity IoT User guide.
Under Management you will find the setup of the system. If you expand that box by clicking on the arrow to the right, all relevant system properties and their values are listed. Note that these values cannot be modified for a running microservice. The tenant administrator needs to redeploy the microservice with corresponding new values.
If enabled, Cumulocity IoT DataHub tracks usage statistics on the amount of data being processed. For offloading queries, the statistics track the amount of data these queries read from the Operational Store of Cumulocity IoT. For ad-hoc queries, they track the amount of data these queries read from the data lake. The usage statistics can be used for volume-based charging. They can also be used to pinpoint queries that are resource-intensive in terms of network load.
In the navigator, select Administration and then Usage statistics to view the usage statistics.
In the action bar, a date control allows you to select the month for which you want to see the usage statistics.
The three top panels show overall summary statistics as well as statistics separated for offloading and ad-hoc queries. If data from the month before the selected month is available, a tendency arrow illustrates whether the data volume of the selected month has decreased, increased, or stayed flat. The panels with the offloading and the ad-hoc query statistics additionally list the days with minimum/maximum volume as well as the daily average volume.
The table below the summary statistics shows the details on a per-day basis for the selected month. For each day, the volume offloaded and the volume queried are shown as well as their sum, which constitutes the daily volume. In addition, the table shows each day's percentage of the monthly volume, that is, how much the daily volume contributed to the overall monthly volume. The date of each entry links to the Query log, which lists all queries for the respective day.
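The per-day figures described above can be reproduced with simple arithmetic. The following is a minimal sketch, not DataHub code: the `DailyUsage` type, the `monthly_shares` function, and the sample values are hypothetical and only illustrate how the daily volume and its share of the monthly volume relate to each other.

```python
from dataclasses import dataclass


@dataclass
class DailyUsage:
    """Hypothetical record for one row of the per-day table."""
    day: str             # ISO date of the entry
    offloaded_mb: float  # volume read by offloading queries
    queried_mb: float    # volume read by ad-hoc queries

    @property
    def total_mb(self) -> float:
        # The daily volume is the sum of offloaded and queried volume.
        return self.offloaded_mb + self.queried_mb


def monthly_shares(days):
    """Map each day to its percentage of the overall monthly volume."""
    monthly_total = sum(d.total_mb for d in days)
    if monthly_total == 0:
        # No data processed in the month: every share is zero.
        return {d.day: 0.0 for d in days}
    return {d.day: 100.0 * d.total_mb / monthly_total for d in days}


# Made-up sample values for illustration only.
sample = [
    DailyUsage("2024-03-01", offloaded_mb=30.0, queried_mb=10.0),
    DailyUsage("2024-03-02", offloaded_mb=50.0, queried_mb=10.0),
]
shares = monthly_shares(sample)
```

With these sample values, the first day accounts for 40 MB and the second for 60 MB of a 100 MB month.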
Auditing provides two logs: the query log lists the queries that have been executed, and the system log lists the operations that users have carried out.
In the navigator, select Auditing and then Query log to view the query log.
At the top of the page you can select either offload or ad-hoc queries, define a text filter on the offloading task/ad-hoc query string, and select a time period. Use the pagination buttons at the bottom of the page to navigate through the result list.
For each offloading query, the following information is provided:
Column name | Description |
---|---|
Offloading task | The task name of the offloading pipeline, complemented by a status icon showing success or failure of the pipeline execution |
Runtime (s) | The runtime of the execution in seconds |
Data scanned (MB) | The amount of data the offloading query has read from the Operational Store of Cumulocity IoT |
Data billed (MB) | The amount of data being billed (depending also on your contract); amounts of data less than 10 MB in an offloading query will be billed as if they were 10 MB |
Details | The internal task UUID in an expandable box |
For each ad-hoc query, the following information is provided:
Column name | Description |
---|---|
User | The username of the Dremio user that executed the query |
Query | The SQL query, complemented by a status icon showing success or failure of the query execution |
Runtime (s) | The runtime of the execution in seconds |
Data scanned (MB) | The amount of data the ad-hoc query has read from the data lake |
Data billed (MB) | The amount of data being billed (depending also on your contract); amounts of data less than 10 MB in an ad-hoc query will be billed as if they were 10 MB |
Details | The query string as well as a link to the associated Dremio job in an expandable box |
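The 10 MB minimum described in the Data billed (MB) rows of both tables can be expressed as a one-line rule. The sketch below is illustrative only: the function name `billed_mb` and the constant `MIN_BILLED_MB` are hypothetical, and the amount actually billed also depends on your contract.

```python
MIN_BILLED_MB = 10.0  # per-query minimum stated in the tables above


def billed_mb(scanned_mb: float) -> float:
    """Return the billed volume for one query under the 10 MB minimum.

    Assumption: apart from the minimum, the billed volume equals the
    scanned volume. Contract-specific terms are not modeled here.
    """
    if scanned_mb < 0:
        raise ValueError("scanned volume must not be negative")
    return max(scanned_mb, MIN_BILLED_MB)


small_query = billed_mb(3.5)   # below the minimum, billed as 10 MB
large_query = billed_mb(42.0)  # above the minimum, billed as scanned
```

Under this rule, many small queries can add up to a noticeably higher billed volume than scanned volume, which is worth keeping in mind when scheduling frequent offloading runs.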
In the navigator, select Auditing and then System log to view the system log.
At the top of the page you can filter log entries by status (all, successful, erroneous, or running), define a text filter on the log entries, and select a time period. Use the pagination buttons at the bottom of the page to navigate through the result list.
For each log entry, the following information is provided:
Column name | Description |
---|---|
User | The user that has carried out the operation |
Event | The type of operation |
Details | The details of the operation and, if available, further information in an expandable box |
The Cumulocity IoT DataHub microservice exposes an endpoint to automatically monitor the health of active offloading configurations. The health of the ETL pipelines can be monitored with the endpoint GET /service/datahub/scheduler/health.
The endpoint examines the latest executions of all jobs and classifies them.
If all jobs are classified as STEADY, the endpoint returns the HTTP status code 200 with the following message:
“HTTP 200 CDHCBEI0029 - Scheduler healthcheck succeeded.”
Otherwise, the endpoint returns the HTTP status code 500 with the following message:
“HTTP 500 CDHCBEE0031 - Scheduler healthcheck failed: There were failed or suspended jobExecutions.”
The response body indicates the jobs to be checked by an administrator:
{
"error" : "There were failed or suspended jobExecutions: \n\nCRITICAL: Job failed: uuid=0d2eb545-cae5-4718-b6c1-50c4169bac69, jobType=CTAS, jobRunId=NON_CLUSTERED1580741460697\n\n"
}
The endpoint can be accessed by any logged-in Cumulocity IoT user who is authorized to access the Cumulocity IoT DataHub microservice.
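A monitoring script could poll the health endpoint and extract the jobs an administrator needs to check. The following is a minimal sketch using only the Python standard library. The function names, the regular expression, and the way the `Authorization` header is supplied are assumptions for illustration; the message layout is inferred solely from the sample response body above and may differ in other cases.

```python
import json
import re
import urllib.error
import urllib.request

HEALTH_PATH = "/service/datahub/scheduler/health"

# Assumed layout, taken from the sample error message shown above.
JOB_PATTERN = re.compile(
    r"uuid=(?P<uuid>[0-9a-fA-F-]+), "
    r"jobType=(?P<job_type>\w+), "
    r"jobRunId=(?P<job_run_id>\w+)"
)


def parse_health_response(status_code: int, body: str) -> dict:
    """Interpret a health check response: 200 is healthy, anything else is not."""
    if status_code == 200:
        return {"healthy": True, "jobs": []}
    try:
        message = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        message = ""  # body is not the expected JSON object
    jobs = [match.groupdict() for match in JOB_PATTERN.finditer(message)]
    return {"healthy": False, "jobs": jobs}


def check_health(base_url: str, authorization: str) -> dict:
    """Call the endpoint; `authorization` is a ready-made header value (assumption)."""
    request = urllib.request.Request(
        base_url.rstrip("/") + HEALTH_PATH,
        headers={"Authorization": authorization},
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return parse_health_response(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for error status codes such as 500.
        return parse_health_response(err.code, err.read().decode("utf-8"))


# Offline demonstration with the sample response body from this section.
SAMPLE_BODY = (
    r'{"error": "There were failed or suspended jobExecutions: \n\n'
    r'CRITICAL: Job failed: uuid=0d2eb545-cae5-4718-b6c1-50c4169bac69, '
    r'jobType=CTAS, jobRunId=NON_CLUSTERED1580741460697\n\n"}'
)
result = parse_health_response(500, SAMPLE_BODY)
```

Parsing is kept separate from the HTTP call so that it can be checked against recorded responses without a running tenant. Any status code other than 200, including authentication failures, is reported as unhealthy with an empty job list.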