Handling personal data "at rest" in the correlator input log file

The correlator has an optional input log which, when enabled, records all incoming Apama events to a text file on disk (see also Replaying an input log to diagnose problems). The input log can be very useful for diagnosing and reproducing problems experienced in a production environment, whether you are debugging your application yourself or working with Cumulocity product support.

Many of the same considerations apply to the input log as to other correlator log files: it is essential to protect the contents by setting appropriate file system permissions on the input log files.
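As an illustration of restricting access, the following is a minimal Python sketch that sets owner-only read/write permissions on a set of input log files. It assumes a POSIX file system; the function name, the directory argument and the "*.log" file name pattern are hypothetical and are not defined by Apama, so adjust them to match your own deployment and security policy.

```python
import stat
from pathlib import Path


def restrict_to_owner(log_dir, pattern="*.log"):
    """Set mode 0o600 (owner read/write only) on every matching file.

    `log_dir` and `pattern` are illustrative assumptions: pass the directory
    and naming scheme that your correlator is actually configured to use.
    Returns the list of paths whose permissions were set.
    """
    changed = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.is_file():
            # Remove all group/other access; keep read/write for the owner.
            path.chmod(stat.S_IRUSR | stat.S_IWUSR)
            changed.append(path)
    return changed
```

The function only changes the permission bits of files that already exist. Directory permissions, file ownership and the permissions of files created later still have to be handled separately.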

As with the other log files, you can rotate the input log periodically, as described in Rotating an input log file. Note that an incomplete input log is useless, so if rotation is in use, it is important that all previous input logs from a given invocation of the correlator can still be retrieved when needed, for example from secure backups or archives (that is, that they have not been deleted). The input log files must be concatenated manually before they can be used with the extract_replay_log.py script.
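The manual concatenation step could be scripted along the following lines. This is a sketch only: the function name is hypothetical, and because the naming of rotated files depends on your configuration, the caller is assumed to supply the files from a single correlator invocation in the order in which they were written.

```python
import shutil
from pathlib import Path


def concatenate_input_logs(parts, destination):
    """Write the files in `parts`, in the order given, into `destination`.

    `parts` must list every rotated input log from one correlator
    invocation, oldest first. The ordering is the caller's responsibility.
    """
    dest = Path(destination).resolve()
    sources = [Path(p).resolve() for p in parts]
    if dest in sources:
        raise ValueError("destination must not be one of the input files")
    # Binary mode copies the bytes unchanged, whatever the encoding
    # or line endings of the original files.
    with open(dest, "wb") as out:
        for src_path in sources:
            with open(src_path, "rb") as src:
                shutil.copyfileobj(src, out)
    return dest
```

For example, concatenate_input_logs(["part1.log", "part2.log"], "combined.log") would produce a single file named combined.log; all three file names here are placeholders.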

Input logs are read-only logs (not databases), and the file format is not intended for modification by users, so rectification of personal data is not relevant.

In some cases, you may wish to implement erasure of personal data in input logs, particularly if they are being kept for a long time. Input logs contain every Apama event sent into the correlator, and the file format is not intended for consumption or editing by customers. However, in practice it is usually safe to remove individual lines that start with “EVNT” (indicating an incoming event), so you can strip out the lines containing the personal data of a particular user on request. Removing lines may prevent the input log from accurately replaying the original behavior, but where the processing of different users is fairly independent, replay is likely to work adequately in most cases.
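One possible way to remove such lines is sketched below in Python. The function name is hypothetical, and identifying a user's events by a plain substring match on an identifier is an assumption made for illustration: how personal data appears in an event line depends entirely on your own event definitions. The sketch writes a filtered copy and does not modify the original file, so that the result can be checked before the original is replaced.

```python
def strip_user_events(source, destination, identifier):
    """Copy `source` to `destination`, omitting matching event lines.

    A line is omitted only if it starts with "EVNT" and also contains
    `identifier` (for example a user ID). All other lines are copied
    unchanged. Returns the number of lines omitted.
    """
    needle = identifier.encode("utf-8")
    removed = 0
    # Binary mode leaves every retained line byte-for-byte identical.
    with open(source, "rb") as src, open(destination, "wb") as dst:
        for line in src:
            if line.startswith(b"EVNT") and needle in line:
                removed += 1
                continue
            dst.write(line)
    return removed
```

Lines that do not start with "EVNT" are never removed, even if they contain the identifier. A substring match can also remove events belonging to other users whose data happens to contain the same text, so choose an identifier that is sufficiently specific.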

If correlator persistence is enabled, the input log contains a copy of the persistence datastore as it was when the correlator was started. It is not possible to erase data from the persistence datastore, so erasing personal data from the input log is only possible when persistence is disabled.