Overview of deploying and managing query applications
Typically, query application deployments script the startup and management of all Apama query application components outside of the Apama development environment in Apama Plugin for Eclipse. Apama recommends using the Ant export facility of Apama Plugin for Eclipse to aid in this.
Queries can also be run from Apama Plugin for Eclipse. However, Apama Plugin for Eclipse can run only a single correlator deployment. To run multiple correlator deployments, use Apama macros for Ant.
Queries can be deployed on a single node, but typically would be deployed across multiple nodes, forming a cluster. While involving more components, a cluster provides:
- Scale out across multiple hosts.
- Resiliency against failures.
- Continued availability if some nodes fail.
Using a cluster will involve the following:
- Some number of correlators that are executing queries.
- A distributed MemoryStore for storing event history. Terracotta’s TCStore is the recommended MemoryStore driver for Apama queries.
  Note: Support for using BigMemory Max for Apama queries is deprecated and will be removed in a future release.
- A JMS bus for distributing events to correlators.
Query application architecture
In a query deployment, incoming events are delivered to correlators, typically via a JMS message bus, such that every event is delivered to one correlator. The correlators store the event history for each query in the distributed MemoryStore. On every event, one correlator reads the latest history for the partition or partitions to which the event belongs, and writes that event to the distributed MemoryStore for access by other correlators. The entire window history is then evaluated against the query patterns.
Queries can make use of the following technologies to provide a scalable platform:
- JMS queues — these are used to distribute events to multiple correlators, which automatically spreads the load across a number of servers.
- TCStore, which is the recommended distributed MemoryStore — this allows state (event history) to be accessed quickly across multiple servers, and replicated to safeguard against hardware failures. This should be configured to give the desired amount of resiliency and scaled appropriately to the deployment.
It is possible to use Apama queries in a standalone mode on a single correlator. This allows easy testing by means of event files. However, all state is stored in-memory, and is lost when the correlator is stopped. Thus, this mode is only recommended for development, not for deployments.
When an event is sent to a cluster of correlators over a JMS queue, the following happens (an example query is shown after this list):
- Each event goes to one correlator.
- A received event is handled by one of several processing threads within that correlator.
- The key of the event is extracted based on the definitions of running queries that use that event.
- The window of events for that key value is retrieved from the distributed MemoryStore.
- The current event is added to the retrieved window, which is written back to the distributed MemoryStore.
- The event pattern of interest (what you are looking for) is evaluated against the stored window to determine whether there is a match.
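For example, a minimal query of the following form is keyed on cardNumber, so each incoming Withdrawal event causes the window for that card’s partition to be read from the distributed MemoryStore, updated, and evaluated against the find pattern. This is an illustrative sketch only; the Withdrawal event type, its fields, and the window duration are assumptions, not part of the product:
// Illustrative sketch only. Assumes an event type such as:
//   event Withdrawal { string cardNumber; float amount; }
query LargeFollowOnWithdrawal {
    inputs {
        // Partition by card number; keep a 600-second window of history per key.
        Withdrawal() key cardNumber within 600.0;
    }
    // Match a withdrawal followed by a larger withdrawal for the same card.
    find Withdrawal as w1 -> Withdrawal as w2
        where w2.amount > w1.amount {
        log "Increasing withdrawals for card " + w1.cardNumber at INFO;
    }
}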
Note the following consequences of this architecture:
- Because events are sent to multiple threads in different correlators, small differences in timing across hosts can result in events being processed out of order.
- If there is a large number of events in the window, the cost of reading and writing the historic window will be excessive.
- Events for the same key may be processed by different correlators. Consequently, between events, the only state kept by the system is the window of historic event data.
Upon matching an event pattern, queries may send events to other monitors or to adapters. These can be shared adapters across the cluster, or more typically, adapters local to each correlator.
Deploying query applications
Apama recommends that you use the Ant export facility in Apama Plugin for Eclipse to help you deploy your query application. The general steps for deploying an Apama query application include:
- In Apama Plugin for Eclipse, enable JMS support and distributed MemoryStore support. See Correlator arguments.
- In Apama Plugin for Eclipse, generate an Ant deployment script. The generated files are placed in a directory that you specify. See Exporting to a deployment script.
- Copy the resultant directory onto each host that will run a correlator.
- If necessary, edit the environment.properties file on each correlator host.
- Ensure that the distributed MemoryStore and JMS servers are running.
- On each correlator host, run the Ant deployment script to start the correlator.
If the project does not contain a distributed MemoryStore configuration, a local in-process MemoryStore will be used to store events. This is not shared or persistent, so only supports a single correlator deployment. If this correlator stops, it will drop all event history data. Apama recommends Terracotta’s TCStore and a corresponding configuration for production use. See Deploying a Terracotta Server Array (TSA) and Configuring the TCStore driver.
Apama does not recommend running multiple correlators on a single machine. The assumption is that each correlator can use all of the CPU resources available. Also, running multiple correlators on one host does not provide any extra resilience. However, it is possible to run multiple correlators on a single machine. To do so:
- Copy the exported deployment directory to separate directories on the correlator host machine.
- Edit the environment.properties file to specify a different port number for each correlator and for each adapter (if any) in your project.
Running queries on correlator clusters
Deploying queries on multiple correlators
When using multiple correlators to deploy an Apama query application, it is the administrator’s responsibility to keep the resources of the exported project up to date. If changes are made to a query, or if queries are added to or removed from a project, then all correlators should be updated to reflect the new state. It is possible to inject queries into a live running correlator, or to delete queries from a correlator. Make sure that the injections and deletions are performed on all correlators in the cluster. Use engine_delete -F query-name to delete a query (see also Deleting code from a correlator). Note that this will also delete any queries using that query’s output event (see also Using the output of another query as query input).
The queries runtime assumes that all members of a cluster:
- Share access to the same distributed MemoryStore state, by using a TCStore or BigMemory Terracotta Server Array.
  Note: Terracotta’s TCStore is the recommended MemoryStore driver for Apama queries. Support for using BigMemory Max for Apama queries is deprecated and will be removed in a future release.
- Can connect freely between nodes.
- Run with clocks synchronized to within 1 second of each other. Apama recommends the use of the Network Time Protocol (NTP) to synchronize clocks.
The queries runtime will nominate a single member of the cluster to be primary, which will handle bookkeeping tasks such as garbage collecting nodes or handling failed cluster nodes.
If a correlator member of a cluster is using external clocking, then some functionality may not be available. The members will be able to share the same data, but an externally clocked node cannot be the primary node, and timers will not be failed over from an externally clocked node. In normal operation, external clocking should only be used for testing purposes on a single node (where failover and scalability are not required).
A production deployment of multiple nodes would not use external clocking for routine processing of events. Use the source timestamp feature if the events may be delayed or delivered out of order. For more information, see Using source timestamps of events.
Deploying a Terracotta Server Array (TSA)
To deploy a TCStore Terracotta Server Array, see the Terracotta documentation at https://docs.webmethods.io/on-premises/terracotta. To deploy a BigMemory Terracotta Server Array, see the BigMemory Max documentation at https://docs.webmethods.io/on-premises/bigmemory-max.
For resilient operations, Apama recommends at least one backup on a separate host. Consider using multiple stripes to improve performance. Ensure that the BigMemory Max or Terracotta server is accessible from all cluster members.
Note: Terracotta’s TCStore is the recommended MemoryStore driver for Apama queries. Support for using BigMemory Max for Apama queries is deprecated and will be removed in a future release.
Configuring the TCStore driver
If you want to use TCStore for queries, you need to configure the TCStore driver as described below.
To configure the TCStore driver:
- In Apama Plugin for Eclipse, add the Distributed MemoryStore adapter bundle to your Apama project. In the Distributed MemoryStore Configuration Wizard, select Apama Queries (using TCStore) as the store provider; ApamaQueriesStore is then automatically provided as the store name. See Configuring a distributed store for more detailed information.
- Adapt the list of Terracotta servers in the storeName-spring.xml file. See TCStore (Terracotta) driver details for more information.
  Important: You must leave the useCompareAndSwap property in its default (true) setting for correct behavior of Apama queries.
Configuring the BigMemory Max driver
Note: Support for using BigMemory Max for queries is deprecated and will be removed in a future release. It is recommended that you now use Terracotta’s TCStore for queries.
If you still want to use BigMemory Max for queries in new projects, you can add a BigMemory Max driver to the project as described below. Existing deployments using BigMemory Max for queries are unaffected; this only covers developing new projects in Apama Plugin for Eclipse. Keep in mind that you can no longer select the option Apama Queries (using BigMemory). This option has been replaced by the Apama Queries (using TCStore) option.
To configure the BigMemory Max driver in a new project:
- In Apama Plugin for Eclipse, add the Distributed MemoryStore adapter bundle to your Apama project. In the Distributed MemoryStore Configuration Wizard, select BigMemory Max as the store provider and specify “ApamaQueriesStore” as the store name. See Configuring a distributed store for more detailed information.
  Note: If you specify a different store name or do not specify a name at all, an in-process only memory store will be used.
- Check that the cluster name is set correctly for the host/port pairs of all of the BigMemory Terracotta Server Array.
- Set the providerDir property to the Terracotta installation directory.
- Optionally, edit the on-heap and off-heap storage and other parameters as needed (see BigMemory Max driver details).
  Important: You must leave the useCompareAndSwap property in its default (true) setting for correct behavior of Apama queries.
Using JMS to deliver events to queries running on a cluster
When running queries across multiple correlators in a cluster, in addition to configuring all correlators to access the same distributed MemoryStore, Apama recommends that all events are delivered into the cluster using a JMS queue.
To run queries across multiple correlators in a cluster:
- Configure each correlator to access the same distributed MemoryStore. This is a requirement.
- Use a JMS queue to deliver events into the cluster. This is a recommendation.
When the cluster uses a JMS queue, each correlator pulls events from the queue. If the input queue of one correlator in the cluster becomes full and it cannot pull events from the JMS queue, the other correlators continue to do so and continue to process events. A correlator may stop pulling events because the correlator is behind on processing events or because it has stopped running, perhaps for maintenance or because of a hardware failure.
Using a JMS queue makes it easy to scale the cluster capacity by adding or removing correlators.
An alternative to using a JMS queue is to use an adapter for each correlator. For example, by having an IAF-based adapter connected to each correlator, it is possible to send messages to and from a query application without using JMS. A disadvantage of using per-correlator adapters is that the adapters must coordinate the following:
- Each event goes to only one correlator in the cluster. If an event goes to more than one correlator, then multiple correlators store the same event in the shared cache. This can result in erroneous matches.
- Should one adapter/correlator pair fail, then the other adapters process the events that the failed node would have processed.
Use of a JMS queue automatically ensures that an event goes to only one correlator and that all received events are processed. The result is an “elastic” system that can be scaled and that continues to run even if a node fails.
Similar to using multiple contexts in a correlator, delivering events through JMS can result in events that occur close together in time being processed in an order that is different than the order in which they were created or sent to the JMS message bus.
Messages may be lost in the event of node failure, unless you have configured JMS for reliable message delivery (see also Handling node failure and failover).
Configure your JMS bus to have one or more queues, and configure a static JMS receiver connection. See Getting started with simple correlator-integrated messaging for JMS. You will also need to provide mappings for all event types that flow into the queries. See Mapping Apama events and JMS messages.
The queries runtime ensures that after all queries have been injected into the correlator and started, they automatically start to receive events from JMS queues. There is no need to explicitly call jms.onApplicationInitialized() as described in Using EPL to send and receive JMS messages.
For applications that do not consist entirely of queries (for example, applications that also contain EPL monitors or Java monitors), it may be necessary to delay starting JMS until both the application and the queries are ready to process events. The auto-starting of JMS by queries can be controlled by sending a QueriesShouldNotAutoStartJMS() event to the main context. This event can be routed from an application's onload() method. If this is done, then a monitor in the main context should listen for a QueriesStarted() event and wait until both the application and the queries have started. The monitor can then call jms.onApplicationInitialized() directly. For example, the following monitor delays starting JMS until queries are started and a StartMyApp() event has been processed:
using com.apama.queries.QueriesShouldNotAutoStartJMS;
using com.apama.queries.QueriesStarted;

event StartMyApp {
}

monitor MyApp {
    import "JMSPlugin" as jms;
    action onload() {
        route QueriesShouldNotAutoStartJMS();
        on QueriesStarted() and StartMyApp() {
            jms.onApplicationInitialized();
        }
    }
}
Mixing queries with monitors
It is possible to have both monitors and queries in a project.
Events that are to be processed by queries should be sent to the com.apama.queries channel from monitors. Queries may send events to any channel to which EPL monitors may be subscribed.
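For example, a monitor might forward events to running queries as follows. This is an illustrative sketch; the Withdrawal event type is an assumption (the same illustrative type used earlier), not part of the product:
// Illustrative sketch only: forward Withdrawal events from a monitor
// to running queries via the com.apama.queries channel.
monitor ForwardToQueries {
    action onload() {
        on all Withdrawal() as w {
            send w to "com.apama.queries";
        }
    }
}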
While queries will automatically scale and share state across a cluster, EPL monitors will not. Thus, be aware that a query may process subsequent events matching a pattern on different nodes, where monitors with potentially different state will be executing. Similarly, the state of EPL monitors is not automatically stored in the distributed MemoryStore.
Both EPL monitors and Apama queries can make use of actions defined on events, subject to some limitations on the use of spawn, die, and event listeners. See Restrictions in queries.
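For example, an action defined on an event type can be called from a query where clause as well as from a monitor. The event type and action below are illustrative assumptions, not part of the product:
// Illustrative sketch only: an action defined on an event type.
event Withdrawal {
    string cardNumber;
    float amount;
    // Callable from monitors and from query where clauses,
    // for example: find Withdrawal as w where w.isLarge() { ... }
    action isLarge() returns boolean {
        return amount > 1000.0;
    }
}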
Handling node failure and failover
A node may stop processing events from time to time. This may be because it is stopped for planned maintenance, or because the node has failed in some way. In these cases:
- Events that have been delivered to the node but not yet processed will be lost. This will typically be a small window of events.
  This does not apply if you are sending and receiving events via JMS where you have configured JMS for reliable messaging. See Avoiding message loss with JMS for more information.
- If using JMS, then events continue to be delivered to and processed by other correlators in the cluster. The failed correlator will not hold up processing on other nodes. Other nodes continue processing events, including matching against events that the failed node had previously received (if they had been processed).
- Any clients connected to the failed correlator will need to re-connect to another correlator. The same set of parameterized query instances is kept in synchronization across the cluster. See Managing parameterized query instances.
Similarly, nodes running a Terracotta Server Array may fail. For this reason, TCStore or BigMemory Max should be configured with sufficient backups to ensure no data is lost in this case. See https://docs.webmethods.io/on-premises/terracotta or https://docs.webmethods.io/on-premises/bigmemory-max for detailed information.
Avoiding message loss with JMS
If all of your incoming and outgoing events are received/sent via correlator-integrated JMS (see also Correlator-integrated support for the Java Message Service (JMS)) and if this has been configured with APP_CONTROLLED receivers and BEST_EFFORT senders (see also Sending and receiving reliably without correlator persistence), then no events are lost in the event of a node failure. Any events that have been delivered from JMS to queries on that node are then handled by another node if they had not been fully processed before the failure. Any events sent to JMS by queries on that node are delivered by another node if they had not been successfully delivered before the failure.
- This only works if the queries (or a chain of queries) are receiving events directly from JMS receivers and are sending their output directly to JMS senders. There are no guarantees if EPL monitors are processing query input or output, interposing themselves between the queries and JMS.
- No EPL monitors in the same correlator should be performing acknowledgments to APP_CONTROLLED receivers themselves, as those receivers are entirely under the control of the queries runtime.
- Incoming events may be delivered twice or be delivered out of order during the failover window. This is the time between the node failure and the cluster (including the JMS broker) detecting the failure/disconnection. It is your responsibility to make sure that your queries are not sensitive to duplicates or re-ordering within this failover window.
- Outgoing events may also be delivered in duplicate during the failover window.
- Queries using source timestamps (see Using source timestamps of events) cannot make use of JMS reliable messaging.
You should also ensure that the JMS broker does not lose messages in the case of a broker failure. Make sure that all JMS senders have their messageDeliveryMode property set to PERSISTENT, and perform any necessary broker-specific configuration on the broker itself.
Note: Reliable messaging will not take effect unless your queries are exclusively using correlator-integrated JMS as their message source and destination. It does not apply when using connectivity plug-ins as your event source or destination (even if they support reliable messaging).
Managing parameterized query instances
Creating new query instances by setting parameter values
Use the Scenario Browser to set parameter values for a parameterized query and thus create new parameterized query instances, also referred to as parameterizations. See also Using the Scenario Browser view.
Changing parameter values for queries that are running
Use the Scenario Browser to change the parameter values for a running parameterized query instance, also referred to as a parameterization. See also Using the Scenario Browser view.
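Both operations apply to queries that declare a parameters block. For reference, a minimal parameterized query might look like the following sketch; the event type, field names, and threshold parameter are illustrative assumptions, not part of the product. No instances of such a query run until parameter values are supplied (for example, through the Scenario Browser), and each set of values creates a separate parameterization:
// Illustrative sketch only: a parameterized query.
query WithdrawalsAboveThreshold {
    parameters {
        float threshold;
    }
    inputs {
        Withdrawal() key cardNumber within 600.0;
    }
    find Withdrawal as w where w.amount > threshold {
        log "Withdrawal above " + threshold.toString() at INFO;
    }
}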
Monitoring running queries
To help you monitor queries that are running on a given correlator, Apama provides data about active queries in DataViews. To display the information provided by these DataViews, you can create a dashboard in which an end user can:
- Monitor query runtime performance.
- Determine whether a query is behaving as intended. For example, you can see how incoming events are distributed across partitions. If you are expecting a particular send and match rate, you can see if you are getting the results you expect.
- Ensure that the window size (the number of events in the window) is not too large. The expectation is that your application is designed so that partitioning keeps any given window size as small as possible.
The Queries_Statistics_Sample that is provided with Apama (located in the \samples\queries directory of your Apama installation) contains such a dashboard. It shows you how to build a dashboard that allows you to monitor the performance of running queries.
For information about exposing DataViews in dashboards, see Building dashboard clients.
A running query is either a non-parameterized query instance or a parameterization. For each running query, there is a DataView for each of its input event types. For example, if a query instance has two input event types, then there are two DataViews that provide statistics for that query, one for each input event type.
Each DataView:
- Contains data about the activity during the last second of one running query and one of its input event types.
- Contains the fields described in the table below. The value contained in each field is an exponentially weighted moving average (EWMA).
- Is updated every 10 seconds by default if the information has changed since the last update.
By sending a SetQueryStatisticsPeriod event, you can control the frequency of the statistics gathering or disable query statistics entirely. For example, to update the query statistics every second:
com.apama.queries.SetStatisticsUpdatePeriod(1,1)
To disable query statistics entirely:
com.apama.queries.SetStatisticsUpdatePeriod(0,0)
| Statistical Field | Description |
| --- | --- |
|  | Apama uses multiple threads to process a given query. This is the percentage of those threads that were used within the last second to process the input event type that this DataView provides information for. While there is not a linear correlation, as this percentage goes down, the reliability of the rest of the statistics becomes weaker. This is because a smaller proportion of threads are contributing information. |
|  | The average rate per second at which events of this type are being processed. |
|  | The average percentage of the number of received events that cause a match. |
|  | The number of unique query partitions that were accessed for this event type within the past second. |
|  | The average window size (number of events that it contains) of each unique partition that was accessed within the past second. |
The display name of these DataViews is Correlator Query Statistics.
After a non-parameterized query is injected into the correlator, Apama provides a DataView for each input event type and begins writing data to it. After a non-parameterized query is deleted, Apama no longer makes the DataViews for that query instance available.
For a parameterized query, after a parameterization is created, then Apama adds new DataViews and begins populating them. When a parameterization is deleted, then Apama no longer provides the DataViews that correspond to that parameterization. If the definition of a parameterized query is deleted, then Apama no longer provides DataViews for any parameterization of that query.
To help you monitor queries that are running across multiple correlators in a cluster, Apama also provides the same type of performance statistics as for a single correlator, but with the underlying data aggregated across all the clustered correlators running those queries.
The display name of these DataViews is Cluster-Wide Query Statistics.
This means that for each query running on a correlator, two types of monitoring data are provided:
- Statistics generated from data from only that correlator.
- Statistics generated from data aggregated across all correlators in the cluster running that same query.