Documenting personal data flows within an Apama application

You are strongly encouraged to write and maintain a personal data register: a document describing the places where personal data is handled within your application, how long it is stored, and how it is protected through access controls and policies. The register should also describe the flows of personal data, for example by being explicit about when personal data is passed to an EPL plug-in such as the MemoryStore, or sent outside the correlator to another system such as a JMS topic or queue, a database or an external MemoryStore. This register will be useful for demonstrating compliance with regulations, and for enabling an effective response should a data breach occur.

In addition to documenting occurrences of standard “personally identifiable information” (PII) such as usernames and IP addresses, any “sensitive personal information” (SPI) such as information about medical conditions should be explicitly called out, since additional regulations and a higher level of caution may apply to such data.

At a minimum, we recommend writing ApamaDoc comments on all event definitions and monitors that handle personal data. See “Generating documentation for your EPL code” for more information about writing ApamaDoc. Using ApamaDoc as a starting point for describing personal data flows has several advantages:

  • It ensures that developers working on the application are aware of the regulatory implications of changes they make to code that involves personal data.
  • It helps developers to be more mindful of the need to write the application code in a way that minimizes the amount of personal data held and transferred between parts of the system.
  • It reduces the chance of personal data documentation getting out of sync with the codebase as the application evolves.
  • The HTML that can be generated from the ApamaDoc comments provides a great starting point for writing and maintaining other documents regarding personal data that your organization’s compliance policies may require.

The conventions and guidelines for ApamaDoc commenting should be defined by the customer developing the Apama application, and checked through a code review process.

See below for an example illustrating a possible use of ApamaDoc to describe personal data storage and flows:

/** Represents an incoming order to be handled by Apama.

  Contains "personal data" in the "username" and "ip" fields.
  This personal data is stored in the OrderManager monitor, is
  stored in the on-disk MemoryStore table "MyOrders",
  and is also sent out to the JMS "OrderAlert" topic.
  @see OrderManager
  */
event MyOrder
{
  string orderId;
  float timestamp;

  /** This field contains "personal data". */
  string username;

  /** This field contains "personal data". */
  string ip;
}

/** Manages orders.

  Holds "personal data" identifying customers who placed orders, and
  stores this data in the on-disk MemoryStore table "MyOrders".

  @listens MyOrder Receives incoming orders containing "personal data".
  @sends OrderAlert Sends "personal data" to the JMS "OrderAlert" topic.
  */
monitor OrderManager
{
  /** Contains "personal data". */
  dictionary<string, MyOrder> orders;
  ...
}

If you intend to use the HTML ApamaDoc to summarize personal data uses, make sure that the @private tag is not applied to any fields containing personal data, because that tag would omit those fields from the generated HTML documentation.
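
The point about @private can be illustrated with a minimal sketch. The event and field names below (CustomerLogin, username, sequenceNumber) are hypothetical and are not part of the example above. The sketch assumes that the @private tag is placed inside the ApamaDoc comment of the item to be hidden.

```epl
/** Represents a customer login (hypothetical event, for illustration only).

  Contains "personal data" in the "username" field.
  */
event CustomerLogin
{
  /** This field contains "personal data".
    It is documented without @private so that it appears in the generated HTML.
    */
  string username;

  /** Internal bookkeeping counter that holds no personal data (hypothetical field).
    @private
    */
  integer sequenceNumber;
}
```

In this sketch only the field that holds no personal data carries the @private tag, so the generated HTML would still list every field that needs to appear in the personal data register.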