Edge Operations

This section describes the operating procedures for the standard tasks involved in managing Cumulocity Edge.

Accessing logs

The Edge operator deploys and configures a Fluent Bit daemonset on the Kubernetes node to collect the container and application logs from the node file system. Fluent Bit queries the Kubernetes API, enriches the logs with metadata about the pods (in the Edge namespace), and transfers both the logs and metadata to Fluentd. Fluentd receives, filters, and persists the logs in the persistent volume claim configured for logging.

To download the diagnostic log archive, run the command below. It generates a file named c8yedge-logs-{current date}.tar.gz in the current directory.

kubectl get edge c8yedge -n c8yedge --output jsonpath='{.status.helpCommands.downloadLogs}' | sh
Info
In the command above, c8yedge is used for both the Edge name and the namespace. Replace each with the Edge name and namespace you specified in your Edge CR.
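
The substitution described in this note can be wrapped in a small shell function. This is a minimal sketch rather than part of the product: the function name download_edge_logs and its positional arguments are illustrative assumptions, while the kubectl invocation is the one shown above.

```shell
# Sketch: download the Edge diagnostic log archive for a given Edge name
# and namespace. Both arguments default to c8yedge, as in the command above.
# The function name and argument order are illustrative assumptions.
download_edge_logs() {
  edge_name="${1:-c8yedge}"
  edge_namespace="${2:-c8yedge}"
  # Read the helper command published in the Edge CR status and run it with sh.
  kubectl get edge "$edge_name" -n "$edge_namespace" \
    --output jsonpath='{.status.helpCommands.downloadLogs}' | sh
}
```

For example, for a hypothetical Edge named my-edge in the namespace my-namespace, call download_edge_logs my-edge my-namespace.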

You can also download the log archives remotely from your cloud tenant. For more information, see Downloading diagnostics remotely.

Backup and recovery

For installations on self-managed Kubernetes clusters

For self-managed Cumulocity Edge deployments, implement backup and recovery procedures according to your organization’s data protection policies and Kubernetes cluster management practices. Ensure that Persistent Volume Claims (PVCs) and the Edge custom resource configuration are included in your backup strategy.
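
As one possible way to include the Edge custom resource in a backup, the sketch below exports it to a YAML file with kubectl. The function name export_edge_cr, the default Edge name and namespace c8yedge, and the default output file name are assumptions for illustration. The exported YAML also contains status and cluster-managed metadata, so review it before re-applying it to a cluster.

```shell
# Sketch: export the Edge CR to a YAML file so that it can be stored with
# the rest of the backup. All names and defaults here are illustrative.
export_edge_cr() {
  edge_name="${1:-c8yedge}"
  edge_namespace="${2:-c8yedge}"
  out_file="${3:-edge-cr-backup.yaml}"
  # Write the full resource, as returned by the API server, to the file.
  kubectl get edge "$edge_name" -n "$edge_namespace" --output yaml > "$out_file"
}
```

The PVCs themselves still need to be covered by your storage-level or cluster-level backup tooling; this sketch only captures the CR.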

Caution
Once the storageClassName field is configured in the Edge custom resource (CR), it cannot be changed. Ensure you select the appropriate storage class during initial installation, as this setting is immutable after Edge is deployed.

For c8yedge installations

This runbook describes how to capture and restore a Cumulocity Edge deployment running on K3s installed via the c8yedge tool. Follow the numbered steps to create a consistent backup, reinstall the same Edge version, and validate the restored environment.


Step 1 - Data to preserve

The following directories must be preserved for disaster recovery. Implement appropriate data protection measures (for example, backups or redundant storage) to safeguard these directories:

  • /var/lib/rancher/k3s
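
For a file-based backup, a tar archive is one option. The following is a minimal sketch, not a prescribed procedure: it assumes that K3s is stopped or otherwise quiesced while the archive is created, that the function runs as root, and that GNU tar (or a compatible tar) is available. The function name backup_k3s_data and the default archive name are hypothetical.

```shell
# Sketch: archive the K3s data directory for a file-based backup.
# Assumes K3s is not writing to the directory while tar runs.
backup_k3s_data() {
  src_dir="${1:-/var/lib/rancher/k3s}"
  archive="${2:-k3s-backup.tar.gz}"
  # Store paths relative to / so the archive can later be extracted with -C /.
  # --numeric-owner records UIDs/GIDs as numbers, so ownership does not
  # depend on matching user names on the restore target.
  tar --numeric-owner -czf "$archive" -C / "${src_dir#/}"
}
```

Store the resulting archive outside the directory being backed up, for example on separate or redundant storage.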

Step 2 - Prepare the restore target

  1. Install the same operating system (or a compatible one) that originally hosted Edge.
  2. Make sure no prior K3s installation or Edge data exists on the target disk.
  3. Make the backup available to the target system:
     a. For file-based backups, transfer the backup artifacts to the node.
     b. For snapshot-based or storage-level backups, ensure the snapshot is accessible and ready for restore.
Caution
Installing a different Edge version on top of a restored data set is unsupported and may fail the upgrade guard-rail checks.

Step 3 - Restore the data directories

Restore the backup so that the directories land in their original locations, preserving paths, ownership, and permissions. Omitting directories or restoring to incorrect locations can corrupt the cluster.

Confirm the directories exist and contain the expected ownership:

ls -ld /var/lib/rancher/k3s
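
If the backup is a tar archive whose paths are stored relative to /, the restore and the check above can be sketched as follows. The function name restore_k3s_data and the optional target root argument are assumptions for illustration; ownership is only restored when the function runs as root.

```shell
# Sketch: restore a tar-based backup to its original location and confirm
# that the data directory exists. Names here are illustrative.
restore_k3s_data() {
  archive="$1"
  target_root="${2:-/}"
  # -p keeps the permission bits stored in the archive; --numeric-owner
  # applies the stored numeric UIDs/GIDs instead of mapping user names.
  tar --numeric-owner -xpzf "$archive" -C "$target_root" || return 1
  # Show ownership and permissions of the restored directory.
  ls -ld "${target_root%/}/var/lib/rancher/k3s"
}
```

The second argument exists mainly so that the restore can be rehearsed in a scratch directory before running it against /.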

Step 4 - Reinstall the matching Edge release

Re-run the installer with the exact version captured in the backup:

sudo c8yedge install --version <previous_version>

Or use the offline alternative if you are in an airgapped environment:

sudo c8yedge install -s c8yedge.tar

The installer detects that Cumulocity Edge is already installed and waits for the recovery to complete before exiting.

Watch for the following success messages:

...
2025-12-17T11:59:47Z	Cumulocity Edge update is complete in 3m14s (running version 2025.0.X)
2025-12-17T11:59:47Z	Edge recovered successfully.
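
In an unattended restore, the success messages above can be checked in a saved copy of the installer output. This is a sketch under the assumption that you captured the output yourself, for example by piping the installer through tee into a file; the function name edge_recovered and the log file are hypothetical and not part of the c8yedge tool.

```shell
# Sketch: return success only if a saved installer log contains both
# success messages shown above. The log file is one you captured yourself.
edge_recovered() {
  log_file="$1"
  # -F matches the text literally; -q suppresses output.
  grep -qF 'Cumulocity Edge update is complete' "$log_file" &&
    grep -qF 'Edge recovered successfully.' "$log_file"
}
```

A script can then branch on the result, for example: edge_recovered install.log && echo "restore verified".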

Installing and configuring monitoring tools

Installing Prometheus

Prometheus is an open-source project that is used for monitoring application state. See https://prometheus.io/ for detailed information on Prometheus and how to use it. See Installing Prometheus for detailed steps on installing Prometheus.

Installing Grafana

Grafana is an open-source tool for querying, visualizing, alerting on, and exploring metrics, logs, and traces from diverse data sources. See https://grafana.com/docs/grafana/latest/ for detailed information on Grafana and how to use it. See Installing Grafana for detailed steps on installing Grafana.

Monitoring

Edge supports monitoring the Edge deployment with Prometheus, an open-source project used to monitor application state. See https://prometheus.io/ for detailed information on Prometheus and how to use it.

The Edge operator exposes a Prometheus-compatible metrics endpoint, https://<domain>:3443/metrics, where the domain is the one you configured in the Edge CR.
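
To check the endpoint by hand before pointing Prometheus at it, a request with curl is usually enough. The sketch below only builds the URL from the domain and fetches it; the function names are hypothetical, and any authentication or CA certificate options that your setup requires (for example curl's --cacert) are not covered here.

```shell
# Sketch: build the metrics URL for an Edge domain and fetch it with curl.
# Function names are illustrative; 3443 and /metrics come from the text above.
edge_metrics_url() {
  printf 'https://%s:3443/metrics\n' "$1"
}

fetch_edge_metrics() {
  # --fail makes curl exit non-zero on HTTP error responses.
  curl --silent --show-error --fail "$(edge_metrics_url "$1")"
}
```

For a hypothetical domain edge.example.com, edge_metrics_url edge.example.com prints https://edge.example.com:3443/metrics.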

Monitoring the Edge metrics from your cloud tenant

In your Cumulocity cloud tenant, you can monitor the measurements of the Edge listed in the table below. To monitor the measurements from your cloud tenant, ensure that you have registered your Edge with the Cumulocity cloud tenant. See Registering Edge in the cloud tenant for more information.

Measurement: Disk space
Metrics:
  • Total disk space
  • Free disk space
  • Used disk space
  • Percentage of used disk space
Description: Edge sends the disk space metrics as a measurement for both the installation disk and the data disk every 10 minutes. The values are sent in gigabytes (GB), rounded to two decimal places. The percentage is rounded to one decimal place. The data points for this measurement are:
  • c8y_InstallationDisk
  • c8y_DataDisk
If Edge is unable to read the metrics from the installation disk or the data disk, an alarm is sent to the Cumulocity tenant. The alarms have minor severity, and the data points for the alarms are:
  • c8y_FileSystemMeasurementErrorInstallationDisk
  • c8y_FileSystemMeasurementErrorDataDisk

Measurement: Memory (RAM)
Metrics:
  • Total RAM
  • Free RAM
  • Used RAM
  • Percentage of RAM used
Description: Edge sends the memory usage metrics as a measurement every 5 seconds, in gibibytes (GiB). The data point for this measurement is c8y_Memory.
If Edge is unable to read the memory metrics, an alarm is sent to the Cumulocity tenant. The data point for the alarm is:
  • c8y_MemoryMeasurementError

Measurement: CPU
Metrics:
  • Percentage of CPU used
Unit: Percentage
Description: Edge sends the percentage of CPU used over intervals of 5 seconds, 60 seconds, and 600 seconds. The data points for this measurement are:
  • c8y_CpuUsage5Seconds
  • c8y_CpuUsage60Seconds
  • c8y_CpuUsage600Seconds
If Edge is unable to read the CPU metrics, an alarm is sent to the Cumulocity tenant. The data point for the alarm is:
  • c8y_CPUMeasurementError

Measurement: Disk I/O
Metrics:
  • Data read per second
  • Data written per second
Unit: KB/s
Description: Edge sends the disk input/output metrics as a measurement for both the installation disk and the data disk over intervals of 5 seconds, 60 seconds, and 600 seconds. The data points for this measurement are:
  • c8y_DataDiskIo5Seconds
  • c8y_DataDiskIo60Seconds
  • c8y_DataDiskIo600Seconds
  • c8y_InstallationDiskIo5Seconds
  • c8y_InstallationDiskIo60Seconds
  • c8y_InstallationDiskIo600Seconds
If Edge is unable to read the disk metrics, an alarm is sent to the Cumulocity tenant. The data point for the alarm is:
  • c8y_DiskIOMeasurementError

Measurement: Network
Metrics:
  • Data and packets sent per second
  • Data and packets received per second
Unit: KB/s and packets/s
Description: Edge sends the network metrics as a measurement over intervals of 5 seconds, 60 seconds, and 600 seconds. The data points for this measurement are:
  • c8y_NetworkInterface_lo-5Seconds
  • c8y_NetworkInterface_lo-60Seconds
  • c8y_NetworkInterface_lo-600Seconds
If Edge is unable to read the network metrics, an alarm is sent to the Cumulocity tenant. The data point for the alarm is:
  • c8y_NetworkIoMeasurementError
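
To make the rounding of the disk space values concrete, the sketch below derives the four disk space metrics from a total and a free byte count. It is an illustration only and not how Edge computes its measurements: the function name disk_space_summary is hypothetical, and it assumes that 1 GB equals 1,000,000,000 bytes.

```shell
# Sketch: derive total, free, used, and percentage used from byte counts.
# Assumes 1 GB = 10^9 bytes; sizes are rounded to two decimal places and
# the percentage to one, as described for the disk space measurement.
disk_space_summary() {
  total_bytes="$1"
  free_bytes="$2"
  awk -v total="$total_bytes" -v free="$free_bytes" 'BEGIN {
    gb = 1000000000
    used = total - free
    printf "total=%.2f GB free=%.2f GB used=%.2f GB used_pct=%.1f%%\n",
      total / gb, free / gb, used / gb, (used / total) * 100
  }'
}
```

For a disk of 500000000000 bytes with 125000000000 bytes free, disk_space_summary 500000000000 125000000000 prints total=500.00 GB free=125.00 GB used=375.00 GB used_pct=75.0%.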

To monitor the metrics in your Cumulocity tenant, you can create a dashboard and add widgets in the Cockpit application of your tenant. For more information about creating dashboards, see Working with dashboards.

Also, you can define smart rules to create alerts or raise alarms for the metrics. For example, when the free disk space is less than 5 GB, create an alert. For more information about smart rules, see Smart rules.