DataHub
Cumulocity IoT DataHub Release 10.6, April 2020, includes the following improvements, limitations, and known issues:
DataHub Edge
Cumulocity IoT DataHub is now also available as an add-on to Cumulocity IoT Edge, the local version of Cumulocity IoT. DataHub Edge offers the same capabilities as the cloud variant, namely data pipelines for moving data from Cumulocity’s Operational Store to a local data lake and an SQL query interface for querying that data in an efficient manner.
Learn more about DataHub Edge in section Running DataHub on the Edge.
Limitations
Description |
---|
If the collection to be offloaded has JSON attributes consisting of more than 32,000 characters, its data cannot be offloaded. |
If the collection to be offloaded has more than 800 JSON attributes, its data cannot be offloaded. |
If an attribute of a collection has varying types associated, the result table will contain a mixed type which may render query writing difficult or lead to problems with subsequent consumer applications. |
DataHub requires a separate Kubernetes instance with version 1.9 or higher for running the Dremio cluster; it cannot run within the Kubernetes instance of the Cumulocity IoT platform. |
Duplicate attribute names with respect to case-insensitivity may lead to data loss during offloading. This refers to the case that the data has two or more attributes with the same name in terms of case-insensitivity, for example, myDevice and Mydevice would be equal. Instead of the actual payload of the data, the value null will be offloaded for one of the two attributes, as case-insensitive handling of attributes is not properly supported. |
Known issues
Edition |
Description |
---|---|
Edge & cloud | If you define an offloading pipeline and insert an invalid additional filter predicate or additional column expression, the resulting error message can be hard to read. |
Edge & cloud | If you do not have the required roles for DataHub and log in to DataHub UI, you will not get a notification for the missing roles. The menu bar at the left of the UI is not shown. Thus, you cannot interact with the UI. |
Edge | There are no retention policies in place that prevent the data lake contents from exceeding the hard disk limits. |
Edge | TLS is not supported for ODBC and JDBC. |
Cloud | Data lake configuration validation is broken in terms of wrong bucket names (AWS S3) and wrong account names (Azure Storage). When saving the settings with an invalid bucket/account name, DataHub fails to quickly detect the problem and will instead run a time-consuming check, which shows up as an ongoing save request in the UI. Eventually the request will fail in the UI with a timeout and the save request in the backend will fail as well. In such a case, please carefully check the bucket/account name and try saving again. |