Integrating Cumulocity DataHub with other products

Integrating Cumulocity DataHub with TrendMiner

TrendMiner provides process manufacturing companies the analytical means to further optimize their production processes. The self-service analytics approach allows you to conduct time-series industrial analytics, with data being automatically visualized in displays and dashboards.

For that purpose, TrendMiner accesses industrial data generated by these production processes, resulting in time series of sensor, instrument, and asset data. TrendMiner analyzes these time series in order to identify trends and patterns and derive actionable insights solving production issues.

With the offloading and query capabilities of Cumulocity DataHub, TrendMiner can also access and analyze the data being managed by the Cumulocity platform. Key features of the integration between Cumulocity DataHub and TrendMiner are:

  • TrendMiner can leverage historical data of the Cumulocity platform without adversely affecting the Operational Store of the platform. For that purpose, Cumulocity DataHub offloads the data from the Operational Store to a data lake.
  • TrendMiner offers a time-series visualization interface and operational monitoring, both relying on live data from the Cumulocity platform. For that purpose, Cumulocity DataHub provides a live view on recent data in the Operational Store of the platform.
  • Cumulocity DataHub unifies the data access layer so that TrendMiner can access historical as well as live data by querying a single view.
  • Cumulocity DataHub ensures that the layout of that view meets the query needs of TrendMiner, that is, the data is in a relational and flattened format, not in a document-based format as in the Operational Store.

The following diagram illustrates the high-level concepts of the integration between Cumulocity DataHub and TrendMiner.

Integration of Cumulocity DataHub and TrendMiner

Design of a TrendMiner offloading pipeline

To give TrendMiner access to Cumulocity data, you only need to define an offloading pipeline that uses the TrendMiner data layout. When the offloading pipeline is in place, Cumulocity data is regularly extracted from the Operational Store, flattened, and exported into a data lake. In addition, Dremio is configured to access recent data from the Operational Store, using the same schema as for the historical data.

In Dremio a new view is provided, which combines the historical data in the data lake with recent data from the Operational Store, effectively providing a unified view over hot data in the Operational Store and cold data in the data lake. Cumulocity DataHub takes care that the combined data in that view is lossless and does not introduce duplicates. This view is the single connection point to provide TrendMiner access to historical and live data of the Cumulocity platform.
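How Cumulocity DataHub builds the combined view internally is not described here. The following minimal sketch only illustrates one common way to union cold and hot data without gaps or duplicates: a timestamp watermark marks the point up to which data has been offloaded. The function name `combine_views`, the `watermark` parameter, and the row layout are assumptions made for illustration.

```python
from datetime import datetime, timezone


def combine_views(historical, live, watermark):
    """Union historical rows up to the watermark with live rows after it.

    Illustrative only: rows are dicts with a "time" key, and the watermark
    is the timestamp up to which data is assumed to be offloaded.
    """
    cold = [row for row in historical if row["time"] <= watermark]
    hot = [row for row in live if row["time"] > watermark]
    return sorted(cold + hot, key=lambda row: row["time"])


def ts(hour):
    """Helper for readable UTC timestamps on an arbitrary example day."""
    return datetime(2024, 1, 1, hour, tzinfo=timezone.utc)


# The row at 02:00 exists in both stores; it must appear only once.
historical = [{"time": ts(1), "value": 20.0}, {"time": ts(2), "value": 21.0}]
live = [{"time": ts(2), "value": 21.0}, {"time": ts(3), "value": 22.0}]

combined = combine_views(historical, live, watermark=ts(2))
```

Because every row falls on exactly one side of the watermark, the overlapping row is taken from the historical side only and no row is dropped.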

Info
So far, Cumulocity DataHub provides TrendMiner access to the measurements collection only. Other base collections are not yet supported.

So that TrendMiner can access the data, follow the instructions in Configuring offloading jobs to configure an offloading pipeline for the measurements collection.

Accessing Cumulocity data in TrendMiner

Once you have defined and activated a TrendMiner offloading pipeline, the initial offload must be completed before you can start querying the data in TrendMiner.

Info
The offloading pipeline must be active. If the pipeline is deactivated, you can only query the contents offloaded into the data lake so far. Access to recent data will be deactivated.

Cumulocity DataHub provides the following views within Dremio, based on tables having the same name and the same schema:

  • c8y_cdh_tm_measurements is the view over the table in the data lake, which stores the historical data offloaded from the Operational Store so far.
  • c8y_cdh_tm_measurements_live is the live view combining c8y_cdh_tm_measurements with recent data from the Operational Store. Both views have the same schema.
  • c8y_cdh_tm_tags is the view over the table in the data lake, which stores the tag names and the source IDs. The source ID identifies the device managed in the Cumulocity platform. The tag name combines the source ID with the path in the measurement documents to the values that make up the time series. In TrendMiner you use the tag names to select the time series you want to investigate. With this view you can map each time series to its device in the platform.
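The relationship between a measurement document, its flattened rows, and the tag name can be sketched as follows. This is an illustration under assumptions, not the DataHub implementation: the function `flatten_measurement`, the output column names, and the dot separator in the tag name are hypothetical. See Offloading Cumulocity base collections for the actual schema.

```python
def flatten_measurement(doc):
    """Turn one measurement document into one flat row per series value."""
    # Top-level keys that are not measurement fragments (assumed list).
    reserved = {"id", "source", "time", "type", "self"}
    source_id = doc["source"]["id"]
    rows = []
    for fragment, series_map in doc.items():
        if fragment in reserved or not isinstance(series_map, dict):
            continue
        for series, reading in series_map.items():
            if not isinstance(reading, dict) or "value" not in reading:
                continue
            rows.append({
                "time": doc["time"],
                "source": source_id,
                # Assumed format: source ID plus path to the value.
                "tagname": f"{source_id}.{fragment}.{series}",
                "value": reading["value"],
                "unit": reading.get("unit"),
            })
    return rows


measurement = {
    "id": "4711",
    "source": {"id": "857"},
    "time": "2024-01-01T12:00:00.000Z",
    "type": "c8y_TemperatureMeasurement",
    "c8y_Temperature": {"T": {"value": 23.5, "unit": "C"}},
}

rows = flatten_measurement(measurement)
```

Each nested value becomes its own row, which is what makes the data queryable as a plain relational time series per tag.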

For details on the schema of these views/tables, see Offloading Cumulocity base collections.

In TrendMiner you must connect to these Dremio views using ODBC. To look up the ODBC connection settings, navigate to the Home page in the Cumulocity DataHub UI and click the ODBC icon.
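Before configuring TrendMiner, it can help to check the live view with any ODBC-capable SQL client. The sketch below only builds a parameterized query string and does not open a connection. The function `build_tag_query`, the space name `example_space`, the column names, and the sample tag name are assumptions; replace them with the values from your environment and the documented schema.

```python
def build_tag_query(view_path, limit=1000):
    """Return a parameterized SQL string selecting one tag's time series.

    view_path is the list of path elements leading to the view. All
    identifiers are double-quoted so reserved words cause no conflicts.
    """
    quoted_view = ".".join(
        '"' + part.replace('"', '""') + '"' for part in view_path
    )
    return (
        'SELECT "time", "tagname", "value" '
        f"FROM {quoted_view} "
        'WHERE "tagname" = ? AND "time" >= ? AND "time" < ? '
        'ORDER BY "time" '
        f"LIMIT {int(limit)}"
    )


# Hypothetical location of the view; the space name is an assumption.
sql = build_tag_query(["example_space", "c8y_cdh_tm_measurements_live"])

# Values bound to the three "?" placeholders by the ODBC client.
params = ("857.c8y_Temperature.T", "2024-01-01 00:00:00", "2024-01-02 00:00:00")
```

Passing the tag name and time range as bound parameters instead of formatting them into the string avoids quoting problems and SQL injection.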

For more details on the steps required in TrendMiner, see also the corresponding TrendMiner documentation of the connector configuration.

Integrating Cumulocity DataHub with Microsoft Power BI

Microsoft Power BI is a business intelligence tool which allows you to create and use interactive reports for data from various sources. These reports can also be built on your IoT data. Provided your devices are connected to the Cumulocity platform, you can use Cumulocity DataHub to offload the data into a data lake of your choice. Then you can create a Microsoft Power BI report which is based on the data in the data lake. Cumulocity DataHub allows you to access and work with these reports from within the Cumulocity DataHub web frontend.

Prerequisites

Before setting up the connection to Microsoft Power BI in Cumulocity DataHub, conduct the following steps.

Accessing data lakes in Microsoft Power BI reports

Cumulocity DataHub leverages the native interaction between Microsoft Power BI and Dremio. Microsoft Power BI reports can consume data from data lakes using Dremio as query and data access layer. When creating a new report in Microsoft Power BI desktop, you can select Dremio as a database and establish a connection to the Dremio cluster, using the ODBC connection settings as provided on the homepage of the Cumulocity DataHub UI. With this connection you have access to the data lakes connected to Dremio.

Info
The Microsoft Power BI datasets should use the DirectQuery mode, which prevents replicating and caching the data from the data lake.

In contrast to versions prior to 10.18, it is no longer required to deploy a Microsoft Power BI gateway. A native connector from Power BI Web to Dremio is available now.

Configuring access to Microsoft Power BI reports

To make reports available in its web frontend, Cumulocity DataHub embeds Microsoft Power BI content. Users neither have to sign in to Microsoft Power BI nor need a Microsoft Power BI license to access the reports. For access authentication, an Azure Active Directory service principal object with an application secret is used.

The following configuration steps are required, as discussed in detail in the corresponding Microsoft documentation.

As a prerequisite, you need an Azure Active Directory tenant. If you do not have an Azure Active Directory tenant, follow the instructions in the Microsoft documentation.

Next you must register an Azure Active Directory application, which serves as service principal. You must configure the service principal application to access the REST APIs of Microsoft Power BI, following the instructions on the Microsoft Power BI website:

  1. Select Embed for your customers.
  2. Sign in to Microsoft Power BI.
  3. Register an application with respective permissions.
  4. Skip creating a workspace and importing content.
  5. Grant permissions to the service principal.

Info
An application created with the wizard can be used as a service principal.

Alternatively, you can create a service principal application following the section Creating an Azure AD app in the Microsoft Azure portal in the Microsoft documentation.

Additionally, you must add a client secret for the service principal application. You can do that via the Azure portal. Search for App registrations, select your application by its name under All applications, and click the link next to the Client credentials entry on the Overview page of the application.

Next you can define a workspace to organize your reports. By adding the service principal application as a member or admin to the workspace, it can access the reports of the workspace. Go to the Microsoft Power BI website and conduct the following steps to grant the permissions:

  1. Sign in to Microsoft Power BI.
  2. Click Workspaces.
  3. Select the context menu of the workspace to share with the service principal.
  4. Select Workspace access.
  5. Enter the name of your recently created service principal application and grant the Member or Admin permission.

Only workspaces granting access to the service principal application can be browsed from within Cumulocity DataHub. Once the workspace is available, you can publish reports to it and access them in Cumulocity DataHub.

Setting up the connection in Cumulocity DataHub

In the navigator, select Settings and then Microsoft Power BI to define the connection settings.

Setting | Description
Azure Active Directory tenant ID | The ID of the Azure Active Directory tenant. Within the tenant, an Azure Active Directory application must exist with a service principal that is allowed to access corresponding resources of Microsoft Power BI.
Client ID | The ID of the Azure Active Directory application which has permissions to call the REST APIs of Microsoft Power BI.
Client secret | The client secret, which is configured for the Azure Active Directory application.

Once all settings are completed, click Save on the action bar to save the settings and establish the connection.
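To see how the three settings fit together, the following sketch builds an OAuth 2.0 client-credentials token request for a service principal against the Microsoft identity platform. It is an illustration only: how Cumulocity DataHub performs authentication internally is not documented here, and the function `build_token_request` and the placeholder values are hypothetical. The request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope requesting the application permissions for the Power BI REST APIs.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build the client-credentials token request for a service principal."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": POWER_BI_SCOPE,
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values; use the tenant ID, client ID, and client secret
# of your own Azure Active Directory application.
request = build_token_request("my-tenant-id", "my-client-id", "my-secret")
```

Sending the request, for example with `urllib.request.urlopen(request)`, returns a JSON document whose `access_token` field is then used as a bearer token. A failing token request is a quick indicator that one of the three settings is wrong.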

If you want to delete the settings, click Delete on the action bar. You cannot access reports afterwards.

Working with reports

Once the settings are defined, you can access and work with the reports.

  1. In the navigator, select Microsoft Power BI. The menu entry is only shown if the connection settings are defined.

  2. On the Reports page, click Add report in the action bar. A dialog opens with two dropdown boxes. The first dropdown box lists all workspaces which grant member or admin access to the service principal. Select the workspace you are interested in. The second dropdown box provides all reports of the selected workspace. Select a report from the dropdown box.

  3. Click Select to open the report or Cancel to close the dialog without selecting a report.
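The two dropdown boxes correspond to two listings that the Microsoft Power BI REST API offers: the workspaces (called groups in the API) visible to the caller, and the reports of one workspace. The sketch below builds both requests for a given service principal access token without sending them. Whether Cumulocity DataHub uses exactly these calls is an assumption; the function `build_list_request` and the sample IDs are hypothetical.

```python
from urllib.request import Request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_list_request(access_token, workspace_id=None):
    """Build a request listing workspaces, or the reports of one workspace."""
    url = f"{API_ROOT}/groups"
    if workspace_id is not None:
        url += f"/{workspace_id}/reports"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


# Placeholder token and workspace ID for illustration.
workspaces_request = build_list_request("example-token")
reports_request = build_list_request("example-token", workspace_id="ws-123")
```

If a workspace is missing from the first listing, the service principal has most likely not been granted Member or Admin access to it.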

The selected report is shown and can be interacted with. You can open multiple reports. For each opened report, a tab entry shows up in the action bar. To close the currently selected report, click Remove report in the action bar.

Info
The list of currently opened reports is not stored permanently. When closing the browser, the list will be flushed. It will also be flushed if the settings are deleted.