Data Preparation Smart functions

Smart functions

Smart functions are the core processing mechanism of Cumulocity Data Preparation. Every rule comprises a smart function that handles all data transformation, enrichment, and message processing.

Write smart functions to parse device payloads, create measurements and events, enrich messages with calculated fields, implement custom validation logic, or filter data based on your requirements—all without building a full microservice.

Explore the topics below to learn how to implement and deploy smart functions in Data Preparation rules.

Overview

This section explains how to write smart functions for use within Cumulocity Data Preparation rules. It covers the function signatures you implement, the data types you receive and return, the runtime environment your code executes in, and the behavior guarantees the platform makes.

What is a Data Preparation smart function?

A Data Preparation smart function is a small JavaScript function that implements a Data Preparation rule. When a message matching the rule’s filters arrives, the platform invokes the function with that message as the argument. It then uses the values you return to update the Cumulocity operational store or forward messages to other destinations.

Within the platform, smart functions run as JavaScript. When developing externally, you can write your function in TypeScript for type safety and transpile it to JavaScript before deployment.

Inbound device messages

This page describes the smart function used to process inbound messages received from devices. The function is implemented as onMessage in your JavaScript file. It is invoked once per inbound message that matches the rule’s conditions, and it returns the Cumulocity objects (measurements, events, alarms, operations, managed object updates) to be created or updated as a result.

Purpose

Use this function to:

Decode raw device payloads from a device transport (currently only MQTT service is supported).
Map device data to Cumulocity domain objects.
Enrich messages with calculated fields, lookups, or context.
Filter or drop messages based on content.

Signature

export function onMessage(
  msg: DeviceMessage,
  context: DataPrepContext
): CumulocityObject[];

The function may also be declared async, which is useful when a library you call exposes a promise-based API. The returned promise is expected to resolve immediately.

When the function is invoked

The function is invoked once per inbound DeviceMessage that matches the rule’s conditions. Specifically:

Each message arriving on a transport that matches the rule’s filters (device, topic, transport) triggers exactly one invocation.
Messages that do not match the rule’s conditions never reach the function.
The function is invoked synchronously with respect to the message — the rule waits for the function (or its returned promise) to complete before proceeding to the next message in the same shard.

For details on how invocations are sharded and ordered, see Runtime behavior and limits.

Inputs

The function receives two arguments:

msg: a DeviceMessage representing the inbound message. The payload is always present (as a Uint8Array), as are transportID, topic, clientID, and time. Additional information may be present in transportFields, depending on the transport.
context: a DataPrepContext providing runtime metadata.

For the full list of fields available on DeviceMessage, see DeviceMessage. For details on the context object, see Context.

Outputs

The function returns an array of Cumulocity domain objects: Measurement, Event, Alarm, Operation, or ManagedObject. Each object in the array is processed independently:

Each object is created in the Cumulocity operational store.
Returning an empty array ([]) drops the message — no objects are created and no error is reported.
Each object must specify an externalSource to identify the target device.

For the full list of domain object fields, see Cumulocity objects.

When devices are created

If a returned object specifies an externalSource that does not match an existing device, that device is created automatically with a default configuration.

Behavior notes

Ordering: Within a single shard (one device’s clientID), invocations are strictly serial and in-order. The function is never called concurrently for the same source client ID. See Runtime behavior and limits for details.

Errors: If the function throws an error or returns an unparsable object, the message is dropped, an error is logged and an alarm is raised. The platform does not retry. To drop a message intentionally, return an empty array.

State: The function is stateless — it cannot rely on global variables to persist data across invocations. Any global state may be wiped between calls.

Data types

This section describes the data types you receive as inputs and produce as outputs in Data Preparation smart functions.

DeviceMessage

A DeviceMessage represents a message received from a device transport.

Field	Type	Required	Description
`payload`	`Uint8Array`	Yes	The message payload as bytes. Always present, even if empty. Use `TextDecoder` to decode text-based payloads, or use a binary library (protobufjs, cbor2) for binary formats.
`transportID`	`string`	Yes	Identifier of the source transport, for example `"mqtt"`.
`clientID`	`string`	No	Identifier of the transport client. For MQTT, this is the MQTT client ID.
`topic`	`string`	Yes	The topic, path, or equivalent on the transport. For MQTT this is the MQTT topic.
`transportFields`	`{ [key: string]: string }`	No	Optional dictionary of transport-specific metadata. Values are strings.
`time`	`Date`	No	Timestamp the message was received by the platform.
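
As a rough illustration of reading these fields, the sketch below summarizes a message for logging. The summarizeMessage helper is a hypothetical name, not a platform API. It reads clientID, time, and transportFields defensively, following the Required column above.

```javascript
// Hypothetical helper (not a platform API): summarizes a DeviceMessage for logging.
function summarizeMessage(msg) {
  // payload is always a Uint8Array, possibly empty.
  const text = new TextDecoder("utf-8").decode(msg.payload);
  // Fields marked as not required in the table are read defensively.
  const client = msg.clientID ?? "(unknown client)";
  const received = msg.time instanceof Date ? msg.time.toISOString() : "(no time)";
  const extraKeys = Object.keys(msg.transportFields ?? {});
  return `${msg.transportID} ${msg.topic} from ${client} at ${received}: ` +
    `${msg.payload.byteLength} bytes, text="${text}", extra=[${extraKeys.join(",")}]`;
}
```

Inside onMessage, the result could be passed to console.debug to trace which messages reach the rule.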

Cumulocity objects

When you return objects from onMessage, you return one of five domain object types: Measurement, Event, Alarm, Operation, or ManagedObject. All five share the same common fields and differ only in their payload structure.

Common fields

Field	Type	Required	Description
`cumulocityType`	`string`	Yes	Discriminator for the object type. One of: `"measurement"`, `"event"`, `"alarm"`, `"operation"`, `"managedObject"`.
`payload`	`object`	Yes	The object data, in the same shape as the Cumulocity REST API, but without the source field. These are described for each type below.
`externalSource`	`ExternalId[]`	Yes	One or more external IDs (externalId and type pair) identifying the target device. The platform looks these up to find the Cumulocity device.
`destination`	`string`	No	Advanced. Destination for the object. Defaults to `"cumulocity"`, to create the object in the operational store.

Measurement

A measurement payload includes sensor data as numeric values. Each measurement has a type and one or more fragments (object properties mapping series names to numeric values and units).

Payload Field	Type	Required	Description
`type`	`string`	Yes	The measurement type (for example, `"c8y_Temperature"`).
`time`	`Date`	Yes	The measurement timestamp.
`[fragment: string]`	`{ [series: string]: MeasurementValue }` \| `any`	No	One or more fragments mapping series names to `MeasurementValue` for measurement series, or any custom data for other fragments.

MeasurementValue:

Field	Type	Required	Description
`value`	`number`	Yes	The numeric value of the measurement.
`unit`	`string`	No	The unit (for example, `"C"` or `"hPa"`).
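
The fragment and series structure can be sketched as follows. This is a minimal illustration, not a platform API: buildMeasurement is a hypothetical helper, the c8y_Climate type is invented for the example, and reusing the measurement type as the fragment name is an assumption borrowed from the c8y_Temperature convention.

```javascript
// Hypothetical helper: builds one measurement object with several series in one fragment.
// `series` maps a series name to [value] or [value, unit].
function buildMeasurement(externalId, type, time, series) {
  const fragment = {};
  for (const [name, [value, unit]] of Object.entries(series)) {
    // unit is optional in MeasurementValue, so it is only set when provided.
    fragment[name] = unit === undefined ? { value } : { value, unit };
  }
  return {
    cumulocityType: "measurement",
    // Assumption: the fragment is named after the measurement type.
    payload: { type, time, [type]: fragment },
    externalSource: [{ externalId, type: "c8y_Serial" }]
  };
}

// Example: temperature (with unit) and humidity (without unit) in one measurement.
const climateSample = buildMeasurement(
  "SN-001",
  "c8y_Climate",
  new Date("2026-05-12T14:30:00Z"),
  { T: [22.5, "C"], H: [41] }
);
```

Grouping related series in one fragment produces a single measurement object instead of one per reading.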

Event

An event records an occurrence or state change. Event payloads typically include a type, timestamp, and human-readable description.

Payload Field	Type	Required	Description
`type`	`string`	Yes	The event type (for example, `"c8y_LocationUpdate"`).
`text`	`string`	Yes	A human-readable description.
`time`	`Date`	Yes	The event timestamp.
`[fragment: string]`	`any`	No	Optional custom or standard fragments (for example, `c8y_Position`).

Alarm

An alarm represents an error or alert condition. Alarm payloads include a severity level and optional context fragments.

Payload Field	Type	Required	Description
`type`	`string`	Yes	The alarm type.
`severity`	`"CRITICAL"` \| `"MAJOR"` \| `"MINOR"` \| `"WARNING"`	Yes	The alarm severity level.
`text`	`string`	Yes	A human-readable description.
`time`	`Date`	Yes	The alarm timestamp.
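
One way to choose among the severity levels is to map a reading onto thresholds. The sketch below assumes made-up pressure thresholds and hypothetical helper names (severityForPressure, pressureAlarm). Only the payload fields come from the table above.

```javascript
// Hypothetical thresholds, chosen only for illustration.
function severityForPressure(hPa) {
  if (hPa > 1200) return "CRITICAL";
  if (hPa > 1100) return "MAJOR";
  if (hPa > 1050) return "MINOR";
  return null; // reading is in range, so no alarm
}

// Returns an array suitable as the return value of onMessage:
// one alarm when a threshold is exceeded, otherwise an empty array.
function pressureAlarm(deviceId, hPa, time) {
  const severity = severityForPressure(hPa);
  if (severity === null) return [];
  return [{
    cumulocityType: "alarm",
    payload: {
      type: "c8y_PressureAlarm",
      severity,
      text: `Pressure at ${hPa} hPa`,
      time
    },
    externalSource: [{ externalId: deviceId, type: "c8y_Serial" }]
  }];
}
```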

Operation

An operation represents a request to perform an action on a device, such as restart or firmware update. Operations are typically used for device control.

Payload Field	Type	Required	Description
`status`	`"PENDING"` \| `"SUCCESSFUL"` \| `"FAILED"` \| `"EXECUTING"`	No	The operation status.
`description`	`string`	No	Human-readable description.
`[fragment: string]`	`any`	No	Custom fragments (for example, `c8y_Restart` to issue a restart operation).
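
As a minimal sketch of an operation object, the hypothetical helper below requests a restart. The empty object as the c8y_Restart value and the description text are assumptions for illustration. status is omitted because it is optional.

```javascript
// Hypothetical helper: builds an operation object asking a device to restart.
function restartOperation(deviceId) {
  return {
    cumulocityType: "operation",
    payload: {
      description: "Restart requested by Data Preparation rule",
      // Assumption: an empty fragment is enough to mark the operation type.
      c8y_Restart: {}
    },
    externalSource: [{ externalId: deviceId, type: "c8y_Serial" }]
  };
}
```

An onMessage function could return [restartOperation(data.deviceId)] when a device reports a state that calls for a restart.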

Managed object

A managed object update applies an update to an existing managed object (MO) in the Cumulocity inventory. Use this type to update managed object details and custom fragments on a device or asset.

The external ID you provide in externalSource is used to identify the target managed object. If no managed object exists for that external ID, the platform creates one automatically before applying the update. For details on automatically created devices, see Device onboarding.

Every field in the Managed Object API is optional, so you only need to include the fields you want to change. To remove a fragment, set its value to null.

Important

Managed object updates are designed for updating existing managed objects, not as the primary way to create new managed objects. The ManagedObject type does not have the ability to make hierarchy changes (for example, assigning child devices or assets).

Field	Type	Required	Description
`name`	`string`	No	The display name of the managed object.
`owner`	`string`	No	The owner of the managed object.
`type`	`string`	No	The managed object type.
`c8y_IsDevice`	`{}`	No	Marks the managed object as a device. Set to an empty object to add this marker. Set to `null` to remove it.
`c8y_SupportedOperations`	`string[]`	No	List of operation types the device supports.
`[fragment: string]`	`any`	No	Any custom fragment to add, update, or remove. Set to `null` to remove a fragment.

Context object

Every Data Preparation smart function receives a context object as its final parameter. The context provides runtime metadata.

DataPrepContext

Field	Type	Description
`runtime`	`"c8y-data-preparation"`	Identifies the runtime environment. Always `"c8y-data-preparation"` for Data Preparation smart functions.

Using the context

Currently, the context object is mostly informational. The runtime property allows code shared between different smart-function implementations to detect where it is running.
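
A shared helper might check the runtime property like this. This is a sketch only: isDataPrepRuntime and logPrefix are hypothetical names, and the "[other]" label is invented for the example.

```javascript
// Hypothetical shared helper: true when the code runs inside Data Preparation.
function isDataPrepRuntime(context) {
  return context?.runtime === "c8y-data-preparation";
}

// Example use: pick a log prefix depending on where the shared code is running.
function logPrefix(context) {
  return isDataPrepRuntime(context) ? "[dataprep]" : "[other]";
}
```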

Standard libraries and imports

Data Preparation smart functions run in a JavaScript environment conforming to ES2023. The following additional global objects and functions are available without any import statement. Additional libraries for specific data formats can be imported explicitly.

Globals

console

Logs output for debugging. All methods accept any number of arguments, which are converted to strings and joined with spaces. Use JavaScript template literals rather than relying on format strings, for example:

console.info(`Device ${deviceId} reported temperature: ${temperature}`);

Method	Description
`console.log(...args)`	Alias for `console.info`.
`console.info(...args)`	Informational message.
`console.warn(...args)`	Warning message.
`console.error(...args)`	Error message.
`console.debug(...args)`	Debug-level output.

For more details about how to view the logs, see Runtime behavior and limits.

TextEncoder

Encodes strings as UTF-8 bytes. Instantiate with a zero-argument constructor.

Method / Property	Description
`encode(input)`	Encodes a string and returns a `Uint8Array`.
`encodeInto(input, dest)`	Encodes a string into an existing `Uint8Array`.
`encoding`	Always returns `"utf-8"`.

const encoder = new TextEncoder();
const bytes = encoder.encode('Hello World');

TextDecoder

Decodes bytes into a string. Instantiate with the encoding name as the first argument.

Method / Property	Description
`decode(input)`	Decodes a `Uint8Array` and returns a string.
`encoding`	Returns the encoding specified at construction.

const decoder = new TextDecoder('utf-8');
const text = decoder.decode(msg.payload);

Base64

Encodes and decodes Base64 data. All methods are static.

Method	Description
`Base64.encode(bytes)`	Encodes a `Uint8Array` to a Base64 string.
`Base64.decode(str)`	Decodes a Base64 string to a `Uint8Array`.
`Base64.encodeStr(str)`	Encodes a plain string to a Base64 string.
`Base64.decodeStr(str)`	Decodes a Base64 string to a plain string.
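
A typical use is a device that wraps binary readings in a JSON envelope. The sketch below assumes a hypothetical envelope with a Base64-encoded data field. The codec parameter defaults to the platform's Base64 global and exists only so the helper can be tried outside the platform with a stand-in that offers a compatible decode(str) method.

```javascript
// Decodes the Base64-encoded `data` field (hypothetical name) of a JSON envelope.
// `codec` defaults to the platform's Base64 global; outside the platform,
// pass a stand-in object with a compatible decode(str) method.
function decodeEnvelope(payloadBytes, codec = Base64) {
  const envelope = JSON.parse(new TextDecoder("utf-8").decode(payloadBytes));
  return codec.decode(envelope.data); // a Uint8Array, per the table above
}
```

The returned bytes can then be read with a DataView or handed to a binary library.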

OPCUACodec

Encodes and decodes OPC UA binary data. Instantiate with a zero-argument constructor.

Method	Description
`decode(bytes)`	Decodes OPC UA binary data.
`decodeDataValue(bytes)`	Decodes an OPC UA DataValue.
`encode(value)`	Encodes a value to OPC UA binary.
`encodeDataValue(value)`	Encodes a value as an OPC UA DataValue.

Importable libraries

The following libraries are available as explicit imports.

protobufjs

Parse and encode Protocol Buffer messages.

import protobuf from 'protobufjs.js';

Data Preparation provides protobufjs version 8.

cbor2

Work with CBOR (Concise Binary Object Representation) encoded data.

import { /* exported names */ } from 'cbor2.js';

Data Preparation provides cbor2 version 1.

Bundling external libraries

When developing outside the platform, you can use any third-party JavaScript or TypeScript library by bundling it into a single file with your smart function. The result is uploaded as a single JavaScript module.

This is the recommended approach for libraries that are not provided by the platform. The platform does not download libraries from package registries at runtime.

For details on the external development workflow, including transpilation and bundling, see the external development section.

Runtime behavior and limits

This section covers the execution environment your smart function runs in: how it is invoked, what guarantees the platform makes about ordering and concurrency, what limits apply, and what happens when things go wrong.

For the cross-component view of sandboxing and resource limits that applies to all smart functions, see Sandbox and limits. This section adds Data Preparation-specific detail.

Execution model

Each Data Preparation rule runs across multiple shards to scale throughput. Within each shard, the platform maintains its own independent JavaScript runtime for the rule.

Sharding key: The shard is determined by the device’s clientID.

Per-shard guarantees:

Within a single shard, smart function invocations are strictly serial and processed in arrival order.
For a given clientID, the platform guarantees end-to-end serial, in-order execution: a message is fully processed before the next one starts.
Across shards, invocations run concurrently. There is no ordering guarantee between different clientIDs.

Per-shard runtime isolation:

Each shard has its own JavaScript runtime instance. Code loaded in one shard is not visible in another, even for the same rule.
The platform may reinstantiate runtimes at any time — for example, after rule updates, scaling events, or recovery from errors.
This is the reason global state is not shared across invocations: even if your code runs back-to-back for the same clientID, the runtime may have been reinstantiated between calls.

State handling

Smart functions must not rely on global JavaScript state to persist data between invocations.

Top-level let, const, and var declarations may be reset between calls. Do not store mutable data in them.
Module-level objects (caches, counters) are unreliable because runtimes can be recreated at any time.
Static configuration loaded once at module load is acceptable, as long as you accept it may be reloaded at any time.

If you need state across messages, persist it externally (for example, by emitting it as a Cumulocity object).
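
The difference between acceptable static configuration and unreliable mutable state can be sketched as follows. UNIT_BY_FIELD and unitFor are hypothetical names, and the field names are taken from the examples in this document.

```javascript
// Static configuration: acceptable at module level, because rebuilding it
// after a runtime reinstantiation yields exactly the same value.
const UNIT_BY_FIELD = Object.freeze({ tempCelsius: "C", pressure: "hPa" });

// Anti-pattern (shown only as a comment): a module-level counter such as
//   let messagesSeen = 0;
// may be reset at any time, so its value cannot be trusted.

// Pure lookup that depends only on the frozen configuration.
function unitFor(field) {
  return UNIT_BY_FIELD[field]; // undefined for unknown fields
}
```

Freezing the object is optional. It only makes accidental mutation less likely.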

Synchronous vs. asynchronous functions

You can declare your smart function as synchronous or async. The platform handles both forms:

Synchronous: returns an array directly.
Asynchronous: returns a Promise that resolves to an array. The promise is expected to resolve immediately. Use this form when you call libraries that expose an async API.

Even when async, the function executes serially within its shard. The platform waits for the promise to resolve before moving to the next message in the same shard. Async does not give you parallelism within a shard.
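
A minimal sketch of the async form is shown below. parseWithAsyncLibrary is a hypothetical stand-in for a library function that returns a promise, and the payload fields follow the JSON examples in this document.

```javascript
// Hypothetical stand-in for a library function with a promise-based API.
async function parseWithAsyncLibrary(bytes) {
  return JSON.parse(new TextDecoder("utf-8").decode(bytes));
}

// In a deployed rule this function would be declared with `export`.
async function onMessage(msg, context) {
  // The platform waits for the returned promise before handling the next
  // message in the same shard, so awaiting here adds no parallelism.
  const data = await parseWithAsyncLibrary(msg.payload);
  return [{
    cumulocityType: "measurement",
    payload: {
      type: "c8y_Temperature",
      time: msg.time,
      c8y_Temperature: { T: { value: data.tempCelsius, unit: "C" } }
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}
```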

Error handling

If your function throws synchronously, the returned promise rejects, or the return values are not parseable:

The message that caused the error is dropped.
An error is logged with the function name, error message, and (where possible) stack trace.
An alarm is raised on the tenant with the failing device message and the error.
The platform does not retry the message.
The shard continues processing the next message.

To drop a message without raising an error, return an empty array ([]).
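
The two outcomes described above can be combined in one function. In the sketch below, malformed JSON is dropped quietly while a structurally wrong message is allowed to throw, so that it is logged and raises an alarm. Which cases to drop and which to surface is a design choice, and parseOrDrop is a hypothetical helper name.

```javascript
// Returns the parsed JSON value, or undefined when the payload is not valid JSON.
function parseOrDrop(payloadBytes) {
  try {
    return JSON.parse(new TextDecoder("utf-8").decode(payloadBytes));
  } catch (err) {
    console.warn(`Dropping unparsable payload: ${err.message}`);
    return undefined;
  }
}

// In a deployed rule this function would be declared with `export`.
function onMessage(msg, context) {
  const data = parseOrDrop(msg.payload);
  if (data === undefined) {
    return []; // intentional drop: no error logged by the platform, no alarm
  }
  if (typeof data?.tempCelsius !== "number") {
    // Throwing drops the message too, but also logs an error and raises an alarm.
    throw new Error("tempCelsius missing or not a number");
  }
  return [{
    cumulocityType: "measurement",
    payload: {
      type: "c8y_Temperature",
      time: msg.time,
      c8y_Temperature: { T: { value: data.tempCelsius, unit: "C" } }
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}
```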

Logs

All output written with console.log, console.info, console.warn, console.error, and console.debug is written to the Apama microservice log file. For per-tenant microservices, this log is visible in the Administration application. More details are available in the Streaming Analytics documentation.

When running tests in the rule editor before deployment, all log output is also shown directly in the test UI.

Resource limits

The platform enforces per-invocation limits to protect against runaway functions:

Execution time: 1 second elapsed time per invocation. Functions that exceed this limit are terminated and the message is dropped.
Memory: 100 MB per rule. This covers function compilation, input consumption, stack, processing, and output production, which implicitly limits input and output size.

When a limit is exceeded, the function is terminated mid-execution, the message is dropped, an error is logged and an alarm is raised in the tenant.

These limits are designed to protect the platform, not as a target to build towards. You should not expect to be able to consume the full limits on every function invocation within the resources deployed to the platform.
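
One defensive pattern is to reject unexpectedly large payloads before doing any expensive work. The sketch below assumes a hypothetical budget of 64 KiB, which is not a platform value. Choose a figure that matches what your devices actually send.

```javascript
// Hypothetical per-message budget; not a platform limit.
const MAX_PAYLOAD_BYTES = 64 * 1024;

// True when the payload is small enough to process.
function withinBudget(msg) {
  return msg.payload.byteLength <= MAX_PAYLOAD_BYTES;
}
```

In onMessage, returning [] when withinBudget(msg) is false drops the message without raising an alarm.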

Sandboxing

Smart functions run in a sandboxed JavaScript environment. Within Data Preparation specifically:

No filesystem access — there is no fs or equivalent.
No network access — you cannot open sockets, perform HTTP requests, or contact external services. All I/O happens through the function’s input arguments and return value.
No process control — you cannot spawn workers, threads, or subprocesses.
No access to other tenants’ data — the runtime is scoped to a single tenant.
No access to other rules’ data — runtimes are scoped to a single rule.
No access to other devices’ data — rules execute in different contexts for each incoming device.
No environment variables or system info — process is not available.

For the underlying security model, see Sandbox and limits.

Reinstantiation and idempotency

The platform may reinstantiate the runtime at any time, including:

After a rule update or redeployment.
After scaling events (shards being added or removed).
After errors that compromise the runtime.
During platform upgrades.

Reinstantiation never interrupts an in-progress invocation. It can only occur between the processing of two messages. It may also happen at different times across different shards — for example, a rule redeployment may take effect on one shard before another.

When the runtime is reinstantiated, top-level code runs again. Be aware that:

Module-load side effects may run multiple times during a function’s lifetime.
You cannot rely on top-level console.log calls firing exactly once.
Initialization should be cheap and idempotent.

Examples

This section provides practical examples of onMessage smart functions for Data Preparation. Each example shows the input message, the function, and the output it produces.

For the data types used in these examples, see Data types. For the runtime guarantees that apply, see Runtime behavior and limits.

Parse JSON and create a measurement

Decode a JSON payload and produce a temperature measurement. This example also parses a timestamp from the payload and uses it instead of the message arrival time.

Example input (msg.payload decoded as UTF-8):

{ "deviceId": "SN-001", "tempCelsius": 22.5, "timestamp": "2026-05-12T14:30:00Z" }

Function:

export function onMessage(msg, context) {
  const data = JSON.parse(new TextDecoder("utf-8").decode(msg.payload));

  return [{
    cumulocityType: "measurement",
    payload: {
      type: "c8y_Temperature",
      time: new Date(data.timestamp),
      c8y_Temperature: {
        T: { value: data.tempCelsius, unit: "C" }
      }
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}

Output: one Measurement of type c8y_Temperature for device SN-001, with the timestamp parsed from the payload.

Filter messages by condition

Only produce output for messages that meet a condition; silently drop the rest.

Example input (msg.payload decoded as UTF-8):

{ "deviceId": "SN-002", "temperature": -5 }

Function:

export function onMessage(msg, context) {
  const data = JSON.parse(new TextDecoder("utf-8").decode(msg.payload));

  if (data.temperature <= 0) {
    // Drop readings at or below zero
    return [];
  }

  return [{
    cumulocityType: "measurement",
    payload: {
      type: "c8y_Temperature",
      time: msg.time,
      c8y_Temperature: {
        T: { value: data.temperature, unit: "C" }
      }
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}

Output: no output (message dropped because temperature is -5).

Create an alarm

Raise a MAJOR alarm when a sensor reading exceeds a threshold.

Example input (msg.payload decoded as UTF-8):

{ "deviceId": "SN-003", "pressure": 1100 }

Function:

export function onMessage(msg, context) {
  const data = JSON.parse(new TextDecoder("utf-8").decode(msg.payload));

  if (data.pressure > 1050) {
    return [{
      cumulocityType: "alarm",
      payload: {
        type: "c8y_PressureAlarm",
        severity: "MAJOR",
        text: `Pressure exceeded threshold: ${data.pressure} hPa`,
        time: msg.time
      },
      externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
    }];
  }

  return [];
}

Output: one Alarm of type c8y_PressureAlarm with severity MAJOR for device SN-003.

Create a location event

Produce a location update event with a standard c8y_Position fragment.

Example input (msg.payload decoded as UTF-8):

{ "deviceId": "SN-004", "lat": 51.5, "lng": -0.1, "alt": 10 }

Function:

export function onMessage(msg, context) {
  const data = JSON.parse(new TextDecoder("utf-8").decode(msg.payload));

  return [{
    cumulocityType: "event",
    payload: {
      type: "c8y_LocationUpdate",
      text: "Location update",
      time: msg.time,
      c8y_Position: {
        lat: data.lat,
        lng: data.lng,
        alt: data.alt
      }
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}

Output: one Event of type c8y_LocationUpdate with a c8y_Position fragment for device SN-004.

Update a managed object

Apply an update to a managed object to edit its details. The platform identifies the managed object by the external ID and only updates the fields you include.

Example input (msg.payload decoded as UTF-8):

{ "deviceId": "SN-005", "name": "Sensor 5", "firmwareVersion": "2.1.0", "hwModel": "SensorX" }

Function:

export function onMessage(msg, context) {
  const data = JSON.parse(new TextDecoder("utf-8").decode(msg.payload));

  return [{
    cumulocityType: "managedObject",
    payload: {
      name: data.name,
      c8y_Firmware: { version: data.firmwareVersion },
      c8y_Hardware: { model: data.hwModel }
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}

Output: one managedObject update for device SN-005, changing the name to Sensor 5, and setting the c8y_Firmware and c8y_Hardware fragments. All other fragments on the managed object are unchanged.

Remove a fragment from a managed object

Set a fragment to null to remove it from a managed object.

Example input (msg.payload decoded as UTF-8):

{ "deviceId": "SN-006" }

Function:

export function onMessage(msg, context) {
  const data = JSON.parse(new TextDecoder("utf-8").decode(msg.payload));

  return [{
    cumulocityType: "managedObject",
    payload: {
      c8y_Firmware: null
    },
    externalSource: [{ externalId: data.deviceId, type: "c8y_Serial" }]
  }];
}

Output: one managedObject update. The c8y_Firmware fragment is removed from the managed object for device SN-006.

Parse binary data directly

Extract values directly from a binary payload with a known fixed structure, without text decoding.

Info

The Test data portion of the UI currently does not support non-JSON payload types. However, the rest of the Data Preparation application supports any payload type, including binary.

Example input: a 9-byte binary payload structured as:

Bytes 0–3: device ID as a 32-bit big-endian unsigned integer
Bytes 4–7: temperature as a 32-bit big-endian IEEE 754 float
Byte 8: battery level as an unsigned integer (0–100)

Function:

export function onMessage(msg, context) {
  const view = new DataView(msg.payload.buffer, msg.payload.byteOffset, msg.payload.byteLength);

  const deviceId = view.getUint32(0, false).toString();
  const temperature = view.getFloat32(4, false);
  const battery = view.getUint8(8);

  return [
    {
      cumulocityType: "measurement",
      payload: {
        type: "c8y_Temperature",
        time: msg.time,
        c8y_Temperature: { T: { value: temperature, unit: "C" } }
      },
      externalSource: [{ externalId: deviceId, type: "c8y_Serial" }]
    },
    {
      cumulocityType: "measurement",
      payload: {
        type: "c8y_Battery",
        time: msg.time,
        c8y_Battery: { level: { value: battery, unit: "%" } }
      },
      externalSource: [{ externalId: deviceId, type: "c8y_Serial" }]
    }
  ];
}

Output: two Measurement objects — one c8y_Temperature and one c8y_Battery — for the device identified by the first four bytes of the payload.
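
Because the test UI does not accept binary payloads, it can help to construct a sample payload when testing outside the platform. The sketch below builds the 9-byte layout described above. buildBinaryPayload is a hypothetical helper and the sample values are arbitrary.

```javascript
// Builds the 9-byte payload: uint32 device ID, float32 temperature, uint8 battery.
function buildBinaryPayload(deviceId, temperature, battery) {
  const bytes = new Uint8Array(9);
  const view = new DataView(bytes.buffer);
  view.setUint32(0, deviceId, false);     // bytes 0-3, big-endian
  view.setFloat32(4, temperature, false); // bytes 4-7, big-endian IEEE 754
  view.setUint8(8, battery);              // byte 8
  return bytes;
}

// Arbitrary sample values for a local test message.
const samplePayload = buildBinaryPayload(1234, 22.5, 87);
```

The resulting Uint8Array can be used as msg.payload when calling the function in a local unit test.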

API reference

The TypeScript API for Data Preparation smart functions is published as a public npm package: @c8y/dataprep-types. It contains type definitions for all input and output types, the context object, and the onMessage function signature. You can use it when developing and testing smart functions outside the platform.

The full generated API documentation is available at the TypeDoc API reference.

For information on developing rules in your own IDE with TypeScript support, version control, and CI/CD, see External development.