Smart functions are the core processing mechanism of Cumulocity Data Preparation. Every rule comprises a smart function that handles all data transformation, enrichment, and message processing.
Write smart functions to parse device payloads, create measurements and events, enrich messages with calculated fields, implement custom validation logic, or filter data based on your requirements—all without building a full microservice.
Explore the topics below to learn how to implement and deploy smart functions in Data Preparation rules.
Overview
This section explains how to write smart functions for use within Cumulocity Data Preparation rules. It covers the function signatures you implement, the data types you receive and return, the runtime environment your code executes in, and the behavior guarantees the platform makes.
What is a Data Preparation smart function?
A Data Preparation smart function is a small JavaScript function that implements a Data Preparation rule. The platform invokes the function when a message matching the rule’s filters arrives, with that message as the argument, and uses the values you return to update the Cumulocity operational store or forward messages to other destinations.
Within the platform, smart functions run as JavaScript. When developing externally, you can write your function in TypeScript for type safety and transpile it before deployment.
Inbound device messages
This page describes the smart function used to process inbound messages received from devices. The function is implemented as onMessage in your JavaScript file. It is invoked once per inbound message that matches the rule’s conditions, and it returns the Cumulocity objects (measurements, events, alarms, operations, managed object updates) to be created or updated as a result.
Purpose
Use this function to:
Decode raw device payloads from a device transport (currently, only the MQTT service is supported).
Map device data to Cumulocity domain objects.
Enrich messages with calculated fields, lookups, or context.
The function may also be declared async, for example when an API it calls returns a promise. Such promises are expected to be already fulfilled when they are returned.
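As a rough illustration of the overall shape, the sketch below decodes a text payload and emits a single event. It assumes that the MQTT client ID doubles as the device's external ID, and the external ID type "c8y_Serial" and the event type "example_RawMessage" are illustrative choices, not values taken from the platform. How the function is exported or registered is not shown.

```javascript
// Minimal onMessage sketch: decode a text payload and emit one event.
// Assumptions: the client ID is the device's external ID, and the
// "c8y_Serial" and "example_RawMessage" names are illustrative only.
function onMessage(msg, context) {
  const text = new TextDecoder().decode(msg.payload);
  return [
    {
      cumulocityType: "event",
      externalSource: [{ externalId: msg.clientID, type: "c8y_Serial" }],
      payload: {
        type: "example_RawMessage",
        text: text,
        // Fall back to the current time if the transport supplied none.
        time: msg.time ?? new Date(),
      },
    },
  ];
}
```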
When the function is invoked
The function is invoked once per inbound DeviceMessage that matches the rule’s conditions. Specifically:
Each message arriving on a transport that matches the rule’s filters (device, topic, transport) triggers exactly one invocation.
Messages that do not match the rule’s conditions never reach the function.
The function is invoked synchronously with respect to the message — the rule waits for the function (or its returned promise) to complete before proceeding to the next message in the same shard.
Inputs
msg: a DeviceMessage representing the inbound message. The payload is always present (as a Uint8Array), as are transportID, topic, clientID, and time. Additional information may be present in transportFields, depending on the transport.
context: a DataPrepContext object providing runtime metadata, passed as the final parameter.
For the full list of fields available on DeviceMessage, see DeviceMessage. For details on the context object, see Context.
Outputs
The function returns an array of Cumulocity domain objects: Measurement, Event, Alarm, Operation, or ManagedObject. Each object in the array is processed independently:
Each object is created in the Cumulocity operational store.
Returning an empty array ([]) drops the message — no objects are created and no error is reported.
Each object must specify an externalSource to identify the target device.
If a returned object specifies an externalSource that does not match an existing device, that device is created automatically with a default configuration.
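The output rules above can be sketched as follows: an empty array drops the message, and a non-empty array may mix object types, each carrying its own externalSource. The topic suffix "/telemetry", the "example_" type names, and the "c8y_Serial" external ID type are hypothetical.

```javascript
// Sketch: drop unwanted messages, otherwise return two independent objects.
// The topic layout and all "example_" names are assumptions for illustration.
function onMessage(msg, context) {
  if (!msg.topic.endsWith("/telemetry")) {
    return []; // dropped silently: nothing created, no error reported
  }
  const source = [{ externalId: msg.clientID, type: "c8y_Serial" }];
  const time = new Date();
  return [
    {
      cumulocityType: "event",
      externalSource: source,
      payload: { type: "example_Telemetry", text: "Telemetry received", time },
    },
    {
      cumulocityType: "measurement",
      externalSource: source,
      payload: {
        type: "example_PayloadSize",
        time,
        example_PayloadSize: { bytes: { value: msg.payload.length, unit: "B" } },
      },
    },
  ];
}
```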
Behavior notes
Ordering: Within a single shard (one device’s clientID), invocations are strictly serial and in-order. The function is never called concurrently for the same source client ID. See Runtime behavior and limits for details.
Errors: If the function throws an error or returns an unparsable object, the message is dropped, an error is logged and an alarm is raised. The platform does not retry. To drop a message intentionally, return an empty array.
State: The function is stateless — it cannot rely on global variables to persist data across invocations. Any global state may be wiped between calls.
Data types
This section describes the data types you receive as inputs and produce as outputs in Data Preparation smart functions.
DeviceMessage
A DeviceMessage represents a message received from a device transport.
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| payload | Uint8Array | Yes | The message payload as bytes. Always present, even if empty. Use TextDecoder to decode text-based payloads, or use a binary library (protobufjs, cbor2) for binary formats. |
| transportID | string | Yes | Identifier of the source transport, for example "mqtt". |
| clientID | string | No | Identifier of the transport client. For MQTT, this is the MQTT client ID. |
| topic | string | Yes | The topic, path, or equivalent on the transport. For MQTT, this is the MQTT topic. |
| transportFields | { [key: string]: string } | No | Optional dictionary of transport-specific metadata. Values are strings. |
| time | Date | No | Timestamp at which the message was received by the platform. |
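A short sketch of reading these fields defensively. The helper names decodeJsonPayload and describeMessage are hypothetical, and returning null for an empty payload is a design choice of this sketch rather than platform behavior.

```javascript
// Decode a UTF-8 JSON payload; an empty payload yields null instead of throwing.
function decodeJsonPayload(msg) {
  const text = new TextDecoder("utf-8").decode(msg.payload);
  return text.length === 0 ? null : JSON.parse(text);
}

// Build a one-line summary, tolerating the fields marked as not required.
function describeMessage(msg) {
  const fields = msg.transportFields ?? {};
  const client = msg.clientID ?? "unknown client";
  const received = msg.time ? msg.time.toISOString() : "unknown time";
  const count = Object.keys(fields).length;
  return `${msg.transportID}:${msg.topic} from ${client} at ${received} (${count} transport fields)`;
}
```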
Cumulocity objects
When you return objects from onMessage, you return one of five domain object types: Measurement, Event, Alarm, Operation, or ManagedObject. All five share the same common fields and differ only in their payload structure.
Common fields
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| cumulocityType | string | Yes | Discriminator for the object type. One of: "measurement", "event", "alarm", "operation", "managedObject". |
| payload | object | Yes | The object data, in the same shape as the Cumulocity REST API, but without the source field. These are described for each type below. |
| externalSource | ExternalId[] | Yes | One or more external IDs (externalId and type pair) identifying the target device. The platform looks these up to find the Cumulocity device. |
| destination | string | No | Advanced. Destination for the object. Defaults to "cumulocity", to create the object in the operational store. |
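Since all object types share this envelope, a small helper can keep individual rules short. The sketch below is one possible shape; the helper name toCumulocityObject and the default external ID type "c8y_Serial" are assumptions. It omits destination, so the documented default applies.

```javascript
// Allowed discriminator values, as listed in the common fields table.
const CUMULOCITY_TYPES = ["measurement", "event", "alarm", "operation", "managedObject"];

// Hypothetical helper: wrap a payload in the common envelope fields.
function toCumulocityObject(cumulocityType, externalId, payload, idType = "c8y_Serial") {
  if (!CUMULOCITY_TYPES.includes(cumulocityType)) {
    throw new Error(`Unknown cumulocityType: ${cumulocityType}`);
  }
  return {
    cumulocityType,
    externalSource: [{ externalId, type: idType }],
    payload,
  };
}
```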
Measurement
A measurement payload includes sensor data as numeric values. Each measurement has a type and one or more fragments (object properties mapping series names to numeric values and units).
| Payload Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | The measurement type (for example, "c8y_Temperature"). |
| time | Date | Yes | The measurement timestamp. |
| [fragment: string] | { [series: string]: MeasurementValue } or any | No | One or more fragments mapping series names to MeasurementValue for measurement series, or any custom data for other fragments. |
MeasurementValue:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| value | number | Yes | The numeric value of the measurement. |
| unit | string | No | The unit (for example, "C" or "hPa"). |
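To make the fragment and series nesting concrete, here is a sketch of a measurement payload with one fragment holding two series. The type "example_Weather" and the series names are illustrative; the units are the ones used as examples in the table above.

```javascript
// Build a measurement payload: one fragment ("example_Weather") with two series.
// Fragment and series names are assumptions chosen for this sketch.
function buildWeatherMeasurement(time, temperatureC, pressureHpa) {
  return {
    type: "example_Weather",
    time: time,
    example_Weather: {
      temperature: { value: temperatureC, unit: "C" },
      pressure: { value: pressureHpa, unit: "hPa" },
    },
  };
}
```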
Event
An event records an occurrence or state change. Event payloads typically include a type, timestamp, and human-readable description.
| Payload Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | The event type (for example, "c8y_LocationUpdate"). |
| text | string | Yes | A human-readable description. |
| time | Date | Yes | The event timestamp. |
| [fragment: string] | any | No | Optional custom or standard fragments (for example, c8y_Position). |
Alarm
An alarm represents an error or alert condition. Alarm payloads include a severity level and optional context fragments.
| Payload Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | The alarm type. |
| severity | "CRITICAL", "MAJOR", "MINOR", or "WARNING" | Yes | The alarm severity level. |
| text | string | Yes | A human-readable description. |
| time | Date | Yes | The alarm timestamp. |
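A sketch of raising an alarm only when a reading crosses a threshold. The thresholds (60 and 80), the alarm type "example_HighTemperature", and the external ID type "c8y_Serial" are all hypothetical values chosen for illustration.

```javascript
// Return an alarm object when the temperature is too high, otherwise nothing.
// Thresholds and names are assumptions, not platform defaults.
function temperatureAlarm(serial, temperatureC, time) {
  if (temperatureC < 60) {
    return [];
  }
  return [
    {
      cumulocityType: "alarm",
      externalSource: [{ externalId: serial, type: "c8y_Serial" }],
      payload: {
        type: "example_HighTemperature",
        severity: temperatureC >= 80 ? "CRITICAL" : "MAJOR",
        text: `Temperature ${temperatureC} C exceeds the 60 C threshold`,
        time: time,
      },
    },
  ];
}
```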
Operation
An operation represents a request to perform an action on a device, such as restart or firmware update. Operations are typically used for device control.
| Payload Field | Type | Required | Description |
| --- | --- | --- | --- |
| status | "PENDING", "SUCCESSFUL", "FAILED", or "EXECUTING" | No | The operation status. |
| description | string | No | Human-readable description. |
| [fragment: string] | any | No | Custom fragments (for example, c8y_Restart to issue a restart operation). |
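As an illustration, a restart operation might be expressed as below. The helper name restartOperation, the description text, and the "c8y_Serial" external ID type are assumptions; status is left out because it is optional.

```javascript
// Sketch of an operation object carrying the c8y_Restart fragment.
function restartOperation(serial) {
  return {
    cumulocityType: "operation",
    externalSource: [{ externalId: serial, type: "c8y_Serial" }],
    payload: {
      description: "Restart requested by rule", // illustrative text
      c8y_Restart: {},
    },
  };
}
```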
Managed object
A managed object update applies an update to an existing managed object (MO) in the Cumulocity inventory. Use this type to update managed object details and custom fragments on a device or asset.
The external ID you provide in externalSource is used to identify the target managed object. If no managed object exists for that external ID, the platform creates one automatically before applying the update. For details on automatically created devices, see Device onboarding.
Every field in the Managed Object API is optional, so you only need to include the fields you want to change. To remove a fragment, set its value to null.
Important
Managed object updates are designed for updating existing managed objects, not as the primary way to create new managed objects. The ManagedObject type does not have the ability to make hierarchy changes (for example, assigning child devices or assets).
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | The display name of the managed object. |
| owner | string | No | The owner of the managed object. |
| type | string | No | The managed object type. |
| c8y_IsDevice | {} | No | Marks the managed object as a device. Set to an empty object to add this marker. Set to null to remove it. |
| c8y_SupportedOperations | string[] | No | List of operation types the device supports. |
| [fragment: string] | any | No | Any custom fragment to add, update, or remove. Set to null to remove a fragment. |
Context object
Every Data Preparation smart function receives a context object as its final parameter. The context provides runtime metadata.
DataPrepContext
| Field | Type | Description |
| --- | --- | --- |
| runtime | "c8y-data-preparation" | Identifies the runtime environment. Always "c8y-data-preparation" for Data Preparation smart functions. |
Using the context
Currently, the context object is mostly informational. The runtime property allows code shared between different smart-function implementations to detect where it is running.
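Shared code could branch on the runtime property roughly as follows. The helper names isDataPreparation and describeRuntime are hypothetical; only the "c8y-data-preparation" value comes from the table above.

```javascript
// True when the given context identifies the Data Preparation runtime.
function isDataPreparation(context) {
  return context != null && context.runtime === "c8y-data-preparation";
}

// Example use in code shared between different smart function hosts.
function describeRuntime(context) {
  return isDataPreparation(context) ? "Data Preparation" : "another runtime";
}
```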
Standard libraries and imports
Data Preparation smart functions run in a JavaScript environment conforming to ES2023. The following additional global objects and functions are available without any import statement. Additional libraries for specific data formats can be imported explicitly.
Globals
console
Logs output for debugging. All methods accept any number of arguments, which are converted to strings and joined with spaces. Use JavaScript string interpolation (template literals) rather than relying on format strings.
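A minimal sketch of both logging styles described here, interpolation and multiple arguments; the message wording is illustrative.

```javascript
function onMessage(msg, context) {
  // Interpolate values into the string instead of using "%d"-style format strings.
  console.log(`Received ${msg.payload.length} bytes on ${msg.topic}`);
  // Multiple arguments are converted to strings and joined with spaces.
  console.debug("client", msg.clientID, "transport", msg.transportID);
  return [];
}
```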
When developing outside the platform, you can use any third-party JavaScript or TypeScript library by bundling it into a single file with your smart function. The result is uploaded as a single JavaScript module.
This is the recommended approach for libraries that are not provided by the platform. The platform does not download libraries from package registries at runtime.
For details on the external development workflow, including transpilation and bundling, see the external development section.
Runtime behavior and limits
This section covers the execution environment your smart function runs in: how it is invoked, what guarantees the platform makes about ordering and concurrency, what limits apply, and what happens when things go wrong.
For the cross-component view of sandboxing and resource limits that applies to all smart functions, see Sandbox and limits. This section adds Data Preparation-specific detail.
Execution model
Each Data Preparation rule runs across multiple shards to scale throughput. Within each shard, the platform maintains its own independent JavaScript runtime for the rule.
Sharding key: The shard is determined by the device’s clientID.
Per-shard guarantees:
Within a single shard, smart function invocations are strictly serial and processed in arrival order.
For a given clientID, the platform guarantees end-to-end serial, in-order execution: a message is fully processed before the next one starts.
Across shards, invocations run concurrently. There is no ordering guarantee between different clientIDs.
Per-shard runtime isolation:
Each shard has its own JavaScript runtime instance. Code loaded in one shard is not visible in another, even for the same rule.
The platform may reinstantiate runtimes at any time — for example, after rule updates, scaling events, or recovery from errors.
This is the reason global state is not shared across invocations: even if your code runs back-to-back for the same clientID, the runtime may have been reinstantiated between calls.
State handling
Smart functions must not rely on global JavaScript state to persist data between invocations.
Top-level let, const, and var declarations may be reset between calls. Do not store mutable data in them.
Module-level objects (caches, counters) are unreliable because runtimes can be recreated at any time.
Static configuration loaded once at module load is acceptable, as long as you accept it may be reloaded at any time.
If you need state across messages, persist it externally (for example, by emitting it as a Cumulocity object).
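The distinction between acceptable static configuration and unreliable mutable state can be sketched like this. The UNITS table and the unitFor helper are hypothetical.

```javascript
// Acceptable: static, read-only configuration. Reloading it after a runtime
// reinstantiation produces the same values, so nothing is lost.
const UNITS = Object.freeze({ temperature: "C", pressure: "hPa" });

// Unreliable (shown commented out on purpose): a module-level counter may be
// reset whenever the runtime is recreated.
// let messageCount = 0;

// Look up the unit for a series, with an empty string for unknown series.
function unitFor(series) {
  return UNITS[series] ?? "";
}
```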
Synchronous vs. asynchronous functions
You can declare your smart function as synchronous or async. The platform handles both forms:
Synchronous: returns an array directly.
Asynchronous: returns a Promise that resolves to an array. Use this form when calling libraries that expose an async API. The promise is expected to resolve immediately.
Even when async, the function executes serially within its shard. The platform waits for the promise to resolve before moving to the next message in the same shard. Async does not give you parallelism within a shard.
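A sketch of the asynchronous form. decodeAsync is a hypothetical stand-in for a library with a promise-based API, and the "example_Decoded" event type and "c8y_Serial" external ID type are illustrative.

```javascript
// Hypothetical promise-based decoder standing in for an async library API.
// The promise it returns is already resolved, because no I/O is involved.
function decodeAsync(bytes) {
  return Promise.resolve(JSON.parse(new TextDecoder().decode(bytes)));
}

async function onMessage(msg, context) {
  const data = await decodeAsync(msg.payload);
  return [
    {
      cumulocityType: "event",
      externalSource: [{ externalId: msg.clientID, type: "c8y_Serial" }],
      payload: { type: "example_Decoded", text: String(data.text), time: new Date() },
    },
  ];
}
```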
Error handling
If your function throws synchronously, the returned promise rejects, or the return values are not parseable:
The message that caused the error is dropped.
An error is logged with the function name, error message, and (where possible) stack trace.
An alarm is raised on the tenant with the failing device message and the error.
The platform does not retry the message.
The shard continues processing the next message.
To drop a message without raising an error, return an empty array ([]).
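The difference between dropping a message quietly and surfacing an error can be sketched as below. Treating non-JSON payloads as expected noise and a missing temperature field as a genuine error is a policy assumed for this example; the field name and the "c8y_Serial" external ID type are also assumptions.

```javascript
function onMessage(msg, context) {
  let data;
  try {
    data = JSON.parse(new TextDecoder().decode(msg.payload));
  } catch (err) {
    // Expected noise: drop quietly, without raising an alarm.
    console.warn(`Ignoring non-JSON payload on ${msg.topic}`);
    return [];
  }
  if (typeof data?.temperature !== "number") {
    // Unexpected shape: throwing drops the message, logs an error, and raises an alarm.
    throw new Error("Payload has no numeric temperature field");
  }
  return [
    {
      cumulocityType: "measurement",
      externalSource: [{ externalId: msg.clientID, type: "c8y_Serial" }],
      payload: {
        type: "c8y_Temperature",
        time: new Date(),
        c8y_Temperature: { T: { value: data.temperature, unit: "C" } },
      },
    },
  ];
}
```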
Logs
All output written with console.log, console.info, console.warn, console.error, and console.debug is written to the Apama microservice log file. For per-tenant microservices, this log is visible in the Administration application. More details are available in the Streaming Analytics documentation.
When running tests in the rule editor before deployment, all log output is also shown directly in the test UI.
Resource limits
The platform enforces per-invocation limits to protect against runaway functions:
Execution time: 1 second elapsed time per invocation. Functions that exceed this limit are terminated and the message is dropped.
Memory: 100 MB per rule. This covers function compilation, input consumption, stack, processing, and output production, which implicitly limits input and output size.
When a limit is exceeded, the function is terminated mid-execution, the message is dropped, an error is logged and an alarm is raised in the tenant.
These limits exist to protect the platform; they are not targets to build towards. Depending on the resources deployed to the platform, a function may not be able to consume the full limits on every invocation.
Sandboxing
Smart functions run in a sandboxed JavaScript environment. Within Data Preparation specifically:
No filesystem access — there is no fs or equivalent.
No network access — you cannot open sockets, perform HTTP requests, or contact external services. All I/O happens through the function’s input arguments and return value.
No process control — you cannot spawn workers, threads, or subprocesses.
No access to other tenants’ data — the runtime is scoped to a single tenant.
No access to other rules’ data — runtimes are scoped to a single rule.
No access to other devices’ data — rules execute in different contexts for each incoming device.
No environment variables or system info — process is not available.
The platform may reinstantiate the runtime at any time, including:
After a rule update or redeployment.
After scaling events (shards being added or removed).
After errors that compromise the runtime.
During platform upgrades.
Reinstantiation never interrupts an in-progress invocation. It can only occur between the processing of two messages. It may also happen at different times across different shards — for example, a rule redeployment may take effect on one shard before another.
When the runtime is reinstantiated, top-level code runs again. Be aware that:
Module-load side effects may run multiple times during a function’s lifetime.
You cannot rely on top-level console.log calls firing exactly once.
Initialization should be cheap and idempotent.
Examples
This section provides practical examples of onMessage smart functions for Data Preparation. Each example shows the input message, the function, and the output it produces.
Decode a JSON payload and produce a temperature measurement. This example also parses a timestamp from the payload and uses it instead of the message arrival time.
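A sketch of such a function follows. The input shape is an assumption: a JSON payload such as {"serial":"SN-001","temperature":21.5,"timestamp":"2024-01-01T12:00:00Z"}, where the field names, the serial number, and the "c8y_Serial" external ID type are hypothetical.

```javascript
function onMessage(msg, context) {
  // Assumed payload fields: serial, temperature, timestamp (ISO 8601 string).
  const data = JSON.parse(new TextDecoder().decode(msg.payload));
  return [
    {
      cumulocityType: "measurement",
      externalSource: [{ externalId: data.serial, type: "c8y_Serial" }],
      payload: {
        type: "c8y_Temperature",
        // Use the device-supplied timestamp rather than the arrival time.
        time: new Date(data.timestamp),
        c8y_Temperature: { T: { value: data.temperature, unit: "C" } },
      },
    },
  ];
}
```

With the assumed input, this sketch yields one Measurement of type c8y_Temperature for the device with external ID SN-001.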
Output: one Event of type c8y_LocationUpdate with a c8y_Position fragment for device SN-004.
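The c8y_LocationUpdate output described above could come from a function along these lines. The input shape is assumed, for example {"serial":"SN-004","lat":51.5,"lng":-0.12,"alt":35}; the field names, the event text, and the "c8y_Serial" external ID type are hypothetical.

```javascript
function onMessage(msg, context) {
  // Assumed payload fields: serial, lat, lng, alt.
  const data = JSON.parse(new TextDecoder().decode(msg.payload));
  return [
    {
      cumulocityType: "event",
      externalSource: [{ externalId: data.serial, type: "c8y_Serial" }],
      payload: {
        type: "c8y_LocationUpdate",
        text: "Location updated",
        time: msg.time ?? new Date(),
        c8y_Position: { lat: data.lat, lng: data.lng, alt: data.alt },
      },
    },
  ];
}
```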
Update a managed object
Apply an update to a managed object to edit its details. The platform identifies the managed object by the external ID and only updates the fields you include.
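One way this might look, assuming a JSON input such as {"serial":"SN-005","name":"Sensor 5","firmware":{"name":"fw","version":"1.2.0"},"hardware":{"model":"X1","revision":"B"}}. The input field names and values and the "c8y_Serial" external ID type are hypothetical.

```javascript
function onMessage(msg, context) {
  // Assumed payload fields: serial, name, firmware{name,version}, hardware{model,revision}.
  const data = JSON.parse(new TextDecoder().decode(msg.payload));
  return [
    {
      cumulocityType: "managedObject",
      externalSource: [{ externalId: data.serial, type: "c8y_Serial" }],
      // Only the fields listed here are changed; everything else is left as is.
      payload: {
        name: data.name,
        c8y_Firmware: { name: data.firmware.name, version: data.firmware.version },
        c8y_Hardware: { model: data.hardware.model, revision: data.hardware.revision },
      },
    },
  ];
}
```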
Output: one managedObject update for device SN-005, changing the name to Sensor 5, and setting the c8y_Firmware and c8y_Hardware fragments. All other fragments on the managed object are unchanged.
Remove a fragment from a managed object
Set a fragment to null to remove it from a managed object.
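A minimal sketch, assuming the MQTT client ID (for example SN-006) is registered as the device's external ID with the hypothetical type "c8y_Serial":

```javascript
function onMessage(msg, context) {
  return [
    {
      cumulocityType: "managedObject",
      externalSource: [{ externalId: msg.clientID, type: "c8y_Serial" }],
      // null removes the fragment; fields not mentioned stay untouched.
      payload: { c8y_Firmware: null },
    },
  ];
}
```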
Output: one managedObject update. The c8y_Firmware fragment is removed from the managed object for device SN-006.
Parse binary data directly
Extract values directly from a binary payload with a known fixed structure, without text decoding.
Info
The Test data portion of the UI currently does not support non-JSON payload types. However, the rest of the Data Preparation application supports any payload type, including binary.
Example input: a 9-byte binary payload structured as:
Bytes 0–3: device ID as a 32-bit big-endian unsigned integer
Bytes 4–7: temperature as a 32-bit big-endian IEEE 754 float
Byte 8: battery level as an unsigned integer (0–100)
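The layout above can be read with a DataView, as in the sketch below. Using the numeric device ID as the external ID, the "c8y_Serial" external ID type, the series names T and level, and throwing on an unexpected length are all assumptions of this example.

```javascript
function onMessage(msg, context) {
  const bytes = msg.payload;
  if (bytes.length !== 9) {
    throw new Error(`Expected 9 bytes, got ${bytes.length}`);
  }
  // Respect byteOffset in case the Uint8Array is a view into a larger buffer.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const deviceId = view.getUint32(0, false); // big-endian unsigned 32-bit
  const temperature = view.getFloat32(4, false); // big-endian IEEE 754 float
  const battery = view.getUint8(8);

  const source = [{ externalId: String(deviceId), type: "c8y_Serial" }];
  const time = msg.time ?? new Date();
  return [
    {
      cumulocityType: "measurement",
      externalSource: source,
      payload: { type: "c8y_Temperature", time, c8y_Temperature: { T: { value: temperature, unit: "C" } } },
    },
    {
      cumulocityType: "measurement",
      externalSource: source,
      payload: { type: "c8y_Battery", time, c8y_Battery: { level: { value: battery, unit: "%" } } },
    },
  ];
}
```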
Output: two Measurement objects — one c8y_Temperature and one c8y_Battery — for the device identified by the first four bytes of the payload.
API reference
The TypeScript API for Data Preparation smart functions is published as a public npm package: @c8y/dataprep-types. It contains type definitions for all input and output types, the context object, and the onMessage function signature. You can use it when developing and testing smart functions outside the platform.