Apama provides several standard codec IAF plug-ins for your convenience, which can be used for testing or in combination with custom plug-ins. They are described in the topics below.
The compiled binaries for all the standard plug-ins are available in the \bin and \lib directories (for the C and Java versions respectively).
The String codec IAF plug-in
The StringCodec/JStringCodec codec plug-ins read transport events as simple text strings and break them into fields (names and values), using delimiter strings supplied by configuration properties.
Events are assumed to have the following general format:
<name><sepA><value><sepB><name><sepA><value><sepB>...<name><sepA><value><terminator>
where <name> corresponds to the field name, followed by a delimiter character or string <sepA>, followed by the field’s value, <value>. The complete <name> and <value> pair is then separated from another such sequence by a <sepB> delimiter. This pattern is assumed to repeat itself.
Fields with empty values are permitted. Because the terminator is optional, the codec will consume names and values up to the end of the input string if no terminator is found.
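The splitting rules described above can be sketched in a few lines of Python. This is only an illustration of the format, not the StringCodec implementation: the function name parse_fields and its parameters are hypothetical, and the way the real plug-in handles edge cases (such as a trailing separator) is an assumption here.

```python
def parse_fields(text, sep_a, sep_b, terminator=None):
    """Split 'name<sepA>value<sepB>name<sepA>value...' into a dict.

    Hypothetical helper for illustration; the real codec takes its
    delimiters from configuration properties.
    """
    # If a terminator is configured and found, ignore everything after it;
    # otherwise consume names and values up to the end of the input.
    if terminator:
        end = text.find(terminator)
        if end != -1:
            text = text[:end]

    fields = {}
    for pair in text.split(sep_b):
        if not pair:
            continue  # assumption: skip empty segments such as a trailing <sepB>
        name, _, value = pair.partition(sep_a)
        fields[name] = value  # an empty value is permitted
    return fields


if __name__ == "__main__":
    print(parse_fields("symbol=APMA;price=12.5;note=", "=", ";"))
```

Here "=" plays the role of <sepA> and ";" the role of <sepB>; the note field shows that an empty value is kept as an empty string.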
To load this plug-in, a <codec> element in the <codecs> section of the adapter’s configuration file must load the StringCodec or JStringCodec library (this is the filename of the library that implements the plug-in). Note that for the Java version, the full path to the plug-in’s .jar file must be specified.
The Null codec IAF plug-in
The NullCodec/JNullCodec codec layer plug-ins are useful in situations where it does not make sense to decouple the transport and codec layers. The transport layer plug-in might be best placed to perform all the necessary encoding and/or decoding of events, and to supply and receive Apama normalized events, rather than custom transport-specific messages.
The Null codec plug-in is provided to make it easy to develop such transport plug-ins. This is a trivial codec layer plug-in that passes downstream normalized events from the transport layer to the Semantic Mapper, and upstream normalized events from the Semantic Mapper to the transport layer with no modification.
In order to load this plug-in, the <codec> element in the adapter’s configuration file needs to load the NullCodec or JNullCodec library (this represents the filename of the library that implements the plug-in). Note that for the Java version, the full path to the plug-in’s .jar file needs to be specified.
The NullCodec and JNullCodec plug-ins can only be used with transport plug-ins that understand NormalisedEvent objects. The Null codec plug-ins expect downstream NormalisedEvent objects from the transport and pass the upstream NormalisedEvent objects they receive directly to the transport plug-in. Using the Null codec plug-ins with a transport that expects any other kind of object does not work and can crash the adapter.
Null codec transport-related properties
This codec plug-in supports standard Apama properties that are used to specify the name of the transport that will send upstream messages.
Transport-related properties
transportName. This property specifies the transport that the codec should send upstream events to. The property can be used multiple times. The codec maintains a list of all transport names specified in the IAF configuration file. A transportName property with an empty value is ignored by the codec.
If no transports are provided in the configuration file then the codec saves the last added EventTransport as the default transport. An upstream event is sent to the default transport if no transport information is provided in the normalized event or in the IAF configuration file.
transportFieldName. This property specifies the name of the normalized event field whose value gives the name of the transport that the codec should send the upstream event to. You can also provide a transport name by specifying a value in the __transport field. Empty values of these fields are ignored and treated as if not present.
removeTransportField. The value of this property specifies whether the transport-related fields should be removed from the upstream event before it is sent to the transport. The default value is true. If the property is true, the field specified by the transportFieldName property and the field named __transport are removed from the upstream event if they are present. The values ‘yes’, ‘y’, ‘true’, ‘t’, and ‘1’ (compared case-insensitively) are treated as true for this property; any other value is treated as false.
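The way the removeTransportField value is interpreted can be pictured with a short Python sketch. The helper name parse_remove_transport_field is hypothetical and this is not the codec's own code; it only restates the rule given above, with an unset property represented as None.

```python
# Values treated as true, compared without regard to case.
TRUE_VALUES = {"yes", "y", "true", "t", "1"}


def parse_remove_transport_field(value=None):
    """Return the effective boolean for a removeTransportField value."""
    if value is None:
        return True  # property not set: the documented default is true
    return value.lower() in TRUE_VALUES  # any other value is false


if __name__ == "__main__":
    for candidate in (None, "TRUE", "Y", "no", "on"):
        print(candidate, parse_remove_transport_field(candidate))
```

Note that a value such as "on" is treated as false, because only the five listed values count as true.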
Null codec upstream behavior
The plug-in’s behavior when an upstream event is received proceeds in this order:
[1] The codec gets the name of the field that contains the transport name from the value of the transportFieldName property. From the specified field, the codec then gets the transport name and sends the event to that transport. If the transportFieldName property is not specified, if the value of the property is empty, if the field is not present in the event, or if the transport name is empty, then the codec tries [2].
For example, the following configuration specifies two transports and the codec specifies a transport field named TRANSPORT:
The IAF can now route any upstream event that defines a TRANSPORT field to one of these two transports. The value of the TRANSPORT field, either MARKET_DATA or ORDER_MANAGEMENT, determines the transport. Note: If the removeTransportField property is set to true or is not defined, then the TRANSPORT field and the __transport field are removed (if present) from the upstream event before it is sent to the transport.
[2] The codec gets the transport name from the __transport field of the normalized event and sends the event to the specified transport. If the __transport field is not present or if the transport name specified is empty, the codec then tries [3].
For example, in the above configuration, consider an upstream event that does not have a TRANSPORT field or in which the value of that field is empty. If this event has a value in the __transport field of either MARKET_DATA or ORDER_MANAGEMENT, then that value determines the transport.
[3] The codec loops through all transports specified in transportName properties and sends the event to each transport. If no transport is specified, then the codec tries [4]. Note that the codec ignores all transport names that are empty.
If an exception occurs while sending the event to any transport, then the codec logs the exception and continues sending the event to the remaining transports. If the codec was able to send the event to at least one transport, then it does not throw an exception; otherwise, it throws the last captured exception.
For example, the following configuration specifies two transports:
In this example, the codec has not defined the transportFieldName property. The IAF will route any upstream event that does not contain a __transport field or has an empty value in that field to the ORDER_MANAGEMENT transport.
[4] If a default transport is present, then the codec sends the event to that transport. The default transport is the last-added transport. If no default transport is found either, the codec throws an exception.
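The fallback order described above can be summarized as a small Python sketch. It is a simplified model under stated assumptions rather than the codec's implementation: events are plain dictionaries, transports are identified only by name, and the function names resolve_transports and strip_transport_fields are hypothetical.

```python
def resolve_transports(event, transport_field_name=None,
                       transport_names=(), default_transport=None):
    """Return the names of the transports an upstream event would be sent to."""
    # Step 1: the field named by the transportFieldName property.
    if transport_field_name:
        name = event.get(transport_field_name)
        if name:
            return [name]
    # Step 2: the __transport field of the normalized event.
    name = event.get("__transport")
    if name:
        return [name]
    # Step 3: every non-empty transportName property value.
    names = [n for n in transport_names if n]
    if names:
        return names
    # Step 4: the default (last-added) transport, if there is one.
    if default_transport:
        return [default_transport]
    raise LookupError("no transport available for upstream event")


def strip_transport_fields(event, transport_field_name=None):
    """Copy of the event without transport-related fields (removeTransportField true)."""
    drop = {"__transport"}
    if transport_field_name:
        drop.add(transport_field_name)
    return {k: v for k, v in event.items() if k not in drop}


if __name__ == "__main__":
    event = {"TRANSPORT": "", "__transport": "ORDER_MANAGEMENT", "qty": "10"}
    print(resolve_transports(event, "TRANSPORT"))
    print(strip_transport_fields(event, "TRANSPORT"))
```

In the usage example the TRANSPORT field is empty, so the sketch falls through to the __transport field, mirroring the second step.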
The Filter codec IAF plug-in
The Apama Filter codec plug-ins filter normalized event fields. You can use the Filter codec to:
Route upstream events to particular transports
Remove particular fields from upstream and/or downstream events
To use the Filter codec, the FilterCodec or JFilterCodec library must be available to the IAF at runtime. These are the filenames of the C++ and Java libraries that implement the plug-in.
In order to load this plug-in, the <codec> element in the adapter’s configuration file needs to load either the FilterCodec or JFilterCodec library. Note that for the Java version, the full path to the plug-in’s .jar file needs to be specified.
Details for replacing the variables in the above codec section are in the topics below.
Filter codec transport-related properties
This codec plug-in supports standard Apama properties that are used to specify the name of the transport that will send upstream messages.
Transport-related properties
transportName. This property specifies the transport that the codec should send upstream events to. The property can be used multiple times. The codec maintains a list of all transport names specified in the IAF configuration file. A transportName property with an empty value is ignored by the codec.
If no transports are provided in the configuration file then the codec saves the last added EventTransport as the default transport. An upstream event is sent to the default transport if no transport information is provided in the normalized event or in the IAF configuration file.
transportFieldName. This property specifies the name of the normalized event field whose value gives the name of the transport that the codec should send the upstream event to. You can also provide a transport name by specifying a value in the __transport field. Empty values of these fields are ignored and treated as if not present.
removeTransportField. The value of this property specifies whether the transport-related fields should be removed from the upstream event before it is sent to the transport. The default value is true. If the property is true, the field specified by the transportFieldName property and the field named __transport are removed from the upstream event if they are present. The values ‘yes’, ‘y’, ‘true’, ‘t’, and ‘1’ (compared case-insensitively) are treated as true for this property; any other value is treated as false.
Filter codec upstream behavior
The plug-in’s behavior when an upstream event is received proceeds in this order:
[1] The codec gets the name of the field that contains the transport name from the value of the transportFieldName property. From the specified field, the codec then gets the transport name and sends the event to that transport. If the transportFieldName property is not specified, if the value of the property is empty, if the field is not present in the event, or if the transport name is empty, then the codec tries [2].
For example, the following configuration specifies two transports and the Filter codec specifies a transport field named TRANSPORT:
The IAF can now route any upstream event that defines a TRANSPORT field to one of these two transports. The value of the TRANSPORT field, either MARKET_DATA or ORDER_MANAGEMENT, determines the transport. Note: If the removeTransportField property is set to true or is not defined, then the TRANSPORT field and the __transport field are removed (if present) from the upstream event before it is sent to the transport.
[2] The codec gets the transport name from the __transport field of the normalized event and sends the event to the specified transport. If the __transport field is not present or if the transport name specified is empty, the codec then tries [3].
For example, in the above configuration, consider an upstream event that does not have a TRANSPORT field or in which the value of that field is empty. If this event has a value in the __transport field of either MARKET_DATA or ORDER_MANAGEMENT, then that value determines the transport.
[3] The codec loops through all transports specified in transportName properties and sends the event to each transport. If no transport is specified, then the codec tries [4]. Note that the codec ignores all transport names that are empty.
If an exception occurs while sending the event to any transport, then the codec logs the exception and continues sending the event to the remaining transports. If the codec was able to send the event to at least one transport, then it does not throw an exception; otherwise, it throws the last captured exception.
For example, the following configuration specifies two transports:
In this example, the codec has not defined the transportFieldName property. The IAF will route any upstream event that does not contain a __transport field or has an empty value in that field to the ORDER_MANAGEMENT transport.
[4] If a default transport is present, then the codec sends the event to that transport. The default transport is the last-added transport. If no default transport is found either, the codec throws an exception.
Specifying filters for the Filter codec
You specify each filter as a codec property. The Filter codec plug-in applies each filter you specify to incoming and outgoing events as they pass through the codec. The property name identifies the field(s) that the filter applies to and the property value specifies the condition that must be true for the filter to operate.
The following filter removes the name field from upstream and downstream events when the value of the name field is NULL:
<property name="filter.both.name" value="NULL"/>
In upstream events, the following filter removes each field in which the value is 55:
<property name="filter.upstream" value="55"/>
In upstream and downstream events, the following filter removes each field in which the value is <remove>:
<property name="filter" value="&lt;remove&gt;"/>
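To make the effect of these three filter properties concrete, the following Python sketch models them on plain dictionaries. It is not the Filter codec's implementation. The names parse_filter and apply_filters are hypothetical, and the sketch assumes that a property name has the shape filter, optionally followed by a direction (upstream, downstream or both) and optionally by a field name, with a missing direction meaning both directions. That reading is inferred from the three examples above and may not cover every form the codec accepts.

```python
DIRECTIONS = {"upstream", "downstream", "both"}


def parse_filter(prop_name, prop_value):
    """Return (direction, field_name_or_None, value) for one filter property."""
    parts = prop_name.split(".", 2)
    if parts[0] != "filter":
        raise ValueError("not a filter property: " + prop_name)
    rest = parts[1:]
    direction = "both"  # assumption: no direction means both directions
    if rest and rest[0] in DIRECTIONS:
        direction = rest[0]
        rest = rest[1:]
    field = ".".join(rest) if rest else None  # None means "any field"
    return direction, field, prop_value


def apply_filters(event, filters, direction):
    """Return a copy of the event with every matching field removed."""
    result = dict(event)
    for f_direction, f_field, f_value in filters:
        if f_direction not in ("both", direction):
            continue
        for name in list(result):
            if (f_field is None or name == f_field) and result[name] == f_value:
                del result[name]
    return result


if __name__ == "__main__":
    filters = [
        parse_filter("filter.both.name", "NULL"),
        parse_filter("filter.upstream", "55"),
        parse_filter("filter", "<remove>"),
    ]
    event = {"name": "NULL", "qty": "55", "note": "<remove>", "id": "7"}
    print(apply_filters(event, filters, "upstream"))
    print(apply_filters(event, filters, "downstream"))
```

In the usage example the qty field survives in the downstream direction, because the filter on the value 55 applies to upstream events only.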
The XML codec IAF plug-in
The Apama XML codec converts messages between the following two formats:
IAF normalized event whose field values are strings that contain XML data.
Normalized event in which each field is a name/value pair. These unordered fields contain elements, attributes, CDATA, and text.
To use the XML codec, you must add some information to the IAF configuration file and then set up the classpath. After you do this, you can launch the adapter by running the IAF executable.
For an example configuration file, see adapters\config\XMLCodec-example.xml.dist in the Apama installation directory. You can adapt this file to suit your data and then add its content to the adapter configuration file in which the codec is to be used.
Use the information in the topics below to help you configure the XML codec.
Supported XML features
The XML codec can convert messages that contain the following:
Elements
Attributes
Text nodes
CDATA nodes, including CDATA nodes that contain an XML document to be parsed
CDATA nodes are supported only in the downstream direction.
Namespace prefixes and definitions (only basic support)
XPath expressions, including functions
Result types of XPath expressions must be simple. For example,
string contains();
The XML codec cannot convert XML data that contains the following XML features:
Document type specifiers
Processing instructions
Notations and entities
XML with more than one top-level (root) element
Node or nodeset XPath expressions
For Node or nodeset XPath expressions, only the first match is returned.
Adding XML codec to adapter configuration
To include the XML codec in the adapter configuration, add the following to the <codecs> section of the IAF configuration file:
<codec name="XMLCodec"
className="com.apama.iaf.codec.xml.XMLCodec"
jarName="@ADAPTERS_JARDIR@\XMLCodec.jar"
>
<!-- Properties go here -->
</codec>
Typically, @ADAPTERS_JARDIR@ is the APAMA_HOME\adapters\lib directory.
To use the XML codec, ensure the following JAR files in the APAMA_HOME\lib directory are in the adapter classpath when you run the IAF.
ap-iaf-extension-api.jar
ap-util.jar
jdom.1.0.jar
If the XML codec JAR file is in the APAMA_HOME\adapters\lib directory, you are all set. The IAF finds these dependencies automatically. Otherwise, set the classpath either as an environment variable or in the <java> section of the IAF configuration file.
About the XML parser
On startup, the XML codec logs the names of the classes it is using for XML parsing and XML generation. For example:
INFO [11808] - XMLCodec: Encoder initialized: using XML Document builder
'org.apache.xerces.jaxp.DocumentBuilderImpl'
INFO [11808] - XMLCodec: Decoder initialized: using Streaming API for XML (StAX)
'com.ctc.wstx.stax.WstxInputFactory'
Apama uses Xerces for encoding (creating XML docs) and Woodstox StAX for decoding (parsing).
XML namespace support
If your application relies on the standard XML parsing/generation behavior (that is, not XPath), the XML codec has no concept of “declaring namespaces”, and no such declaration is required. As long as the XML document is valid (that is, it declares any namespace prefixes it uses), you can use namespaceprefix:elementName when referring to elements in your mapping rules. If there is any doubt, run a sample message through the XML codec with the logFlattenedXML property set to true; the log shows what to specify in your mapping rules. For example, consider the following sample message:
If you use XPath in your application, XPath itself contains operators to access the local (non-namespace) name and namespace URI of any XML content. However, it is often convenient to define some global prefixes to make it easier to refer to namespaced elements. Apama supports this by allowing any number of XPathNamespace:myprefix codec properties, whose value is the URN that the specified prefix should point to. For example,
In the XML codec section of the IAF configuration file, you can set a number of XML properties. For details about setting properties in the IAF configuration file, see Plug-in <property> elements.
When you reload the IAF, any changes to these configuration properties take effect in the codec. In addition to specifying these properties, you must also set up event mappings for XML messages. See Event mappings configuration.
Properties are described in the topics below.
Required XML codec properties
The XML codec requires you to set the XMLField and transportName properties. All other properties are optional.
XMLField — This property identifies the field name that XML will be read from when decoding, and will be written to when encoding. The flattened XML representation is stored in fields with names prefixed with the value you specify for the XMLField property.
When you are familiar with how the XML codec behaves, you can specify the XMLField property multiple times to parse/generate multiple XML documents per event. Parsing follows the order in which XMLField properties appear, and generating XML follows the reverse order.
It is possible to use this mechanism to parse an XML string embedded as CDATA in another XML string. To do this, specify the flattened field name of the CDATA node as an XMLField. However, note that sequence fields across separate CDATA nodes are not supported.
transportName — The XML codec sends upstream events to the transport that this property identifies. This transport must be defined in the same IAF configuration file.
XML codec transport-related properties
This codec plug-in supports standard Apama properties that are used to specify the name of the transport that will send upstream messages.
Transport-related properties
transportName. This property specifies the transport that the codec should send upstream events to. The property can be used multiple times. The codec maintains a list of all transport names specified in the IAF configuration file. A transportName property with an empty value is ignored by the codec.
If no transports are provided in the configuration file then the codec saves the last added EventTransport as the default transport. An upstream event is sent to the default transport if no transport information is provided in the normalized event or in the IAF configuration file.
transportFieldName. This property specifies the name of the normalized event field whose value gives the name of the transport that the codec should send the upstream event to. You can also provide a transport name by specifying a value in the __transport field. Empty values of these fields are ignored and treated as if not present.
removeTransportField. The value of this property specifies whether the transport-related fields should be removed from the upstream event before it is sent to the transport. The default value is true. If the property is true, the field specified by the transportFieldName property and the field named __transport are removed from the upstream event if they are present. The values ‘yes’, ‘y’, ‘true’, ‘t’, and ‘1’ (compared case-insensitively) are treated as true for this property; any other value is treated as false.
XML codec upstream behavior
The plug-in’s behavior when an upstream event is received proceeds in this order:
[1] The codec gets the name of the field that contains the transport name from the value of the transportFieldName property. From the specified field, the codec then gets the transport name and sends the event to that transport. If the transportFieldName property is not specified, if the value of the property is empty, if the field is not present in the event, or if the transport name is empty, then the codec tries [2].
For example, the following configuration specifies two transports and the codec specifies a transport field named TRANSPORT:
The IAF can now route any upstream event that defines a TRANSPORT field to one of these two transports. The value of the TRANSPORT field, either MARKET_DATA or ORDER_MANAGEMENT, determines the transport. Note: If the removeTransportField property is set to true or is not defined, then the TRANSPORT field and the __transport field are removed (if present) from the upstream event before it is sent to the transport.
[2] The codec gets the transport name from the __transport field of the normalized event and sends the event to the specified transport. If the __transport field is not present or if the transport name specified is empty, the codec then tries [3].
For example, in the above configuration, consider an upstream event that does not have a TRANSPORT field or in which the value of that field is empty. If this event has a value in the __transport field of either MARKET_DATA or ORDER_MANAGEMENT, then that value determines the transport.
[3] The codec loops through all transports specified in transportName properties and sends the event to each transport. If no transport is specified, then the codec tries [4]. Note that the codec ignores all transport names that are empty.
If an exception occurs while sending the event to any transport, then the codec logs the exception and continues sending the event to the remaining transports. If the codec was able to send the event to at least one transport, then it does not throw an exception; otherwise, it throws the last captured exception.
For example, the following configuration specifies two transports:
In this example, the codec has not defined the transportFieldName property. The IAF will route any upstream event that does not contain a __transport field or has an empty value in that field to the ORDER_MANAGEMENT transport.
[4] If a default transport is present, then the codec sends the event to that transport. The default transport is the last-added transport. If no default transport is found either, the codec throws an exception.
Message logging properties
logFlattenedXML — If true, the IAF log contains a list of the name/value pairs generated by the XML codec when flattening XML received from the transport, at CRIT level. Each field is on a different line, which makes it easy to see what fields are being generated and what the mapping’s transport field names should be set to. Turning this on in production impacts performance. The default is false.
logAllMessages — If true, the IAF log contains the full contents of every message sent upstream or downstream, before and after encoding, and before and after decoding, all at CRIT level. Turning this on in production impacts performance. The default is false.
Downstream node order suffix properties
generateTwinOrderSuffix — If true, all field names for text, CDATA and element nodes are appended with "", “[2]”, “[3]”, and so on. The number specifies the position of this node relative to ’twins’, that is, nodes of the same type and name. These order suffixes provide a partial order for the XML nodes. Note that the first child node with a given name is defined to have no suffix (rather than an explicit “[1]”), to improve readability. The default is false.
Use this property when you need to map fields without sensitivity to the precise order in which differently named nodes appear in the XML. This is probably a more useful option than setting the generateSiblingOrderSuffix property for most users of the XML codec.
generateSiblingOrderSuffix — If true, all field names for text, CDATA and element nodes (except the root element) are appended with “#1”, “#2”, and so on. The number specifies the position of this node relative to all its siblings (of any type, such as element or CDATA.). These order suffixes provide a total order for the XML nodes. The default is true.
Use this property when you need to map fields using the precise order in which differently named nodes appear in the XML, or for total control over node ordering when generating XML upstream.
You can set both node order properties to true. For sample output when both are set to true, see Examples of conversions. The default values of these two properties may change in a future release, so the recommendation is to explicitly specify both properties according to the behavior required.
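The difference between the two suffix modes can be illustrated with a short Python sketch that numbers the child nodes of a single parent. The function names are hypothetical, the input is simply a list of node names in document order, and the sketch does not attempt to show the combined output produced when both properties are true.

```python
from collections import Counter


def sibling_suffixes(names):
    """Total order: each node is numbered by its position among all siblings."""
    return [f"{name}#{i}" for i, name in enumerate(names, start=1)]


def twin_suffixes(names):
    """Partial order: each node is numbered only among same-named siblings."""
    seen = Counter()
    result = []
    for name in names:
        seen[name] += 1
        # The first twin carries no suffix; [1] is implicit.
        result.append(name if seen[name] == 1 else f"{name}[{seen[name]}]")
    return result


if __name__ == "__main__":
    children = ["b", "c", "b"]
    print(sibling_suffixes(children))
    print(twin_suffixes(children))
```

With sibling suffixes, inserting a differently named node before c would renumber c and the second b. With twin suffixes, only another b would affect the numbering of the b nodes.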
Additional downstream properties
XPath: *XMLField* -> *ResultField* — The value of this property specifies an XPath expression that should be evaluated for the specified *XMLField*, with the result put into the *ResultField* in the normalized event. Only simple data types (boolean/float/string) can be returned at present, so XPath expressions that match multiple nodes only return the first matching node. See XPath examples.
trimXMLText — If true, the XML codec removes any leading or trailing whitespace characters from XML text data in downstream messages before adding the text to the normalized event. The default is true.
Sequence field properties
sequenceField — The value of this property is a field that is treated as a sequence. This means that all XML nodes that match this name are translated to a single entry in the normalized event, in the form of an EPL sequence of type string. The element name should be a plain name, without a node order suffix. In other words, the value of this property and the field in the outgoing event should be in the form: elementA/elementB/@attrib. You can specify this property multiple times.
ensurePresent — This property specifies an attribute, text string or CDATA node of an element that will be added to the output event as a blank string even if it is not present in the XML. This is mostly useful for fields identified with the sequenceField property, as empty strings get added to the sequence for optional attributes. You can specify this property multiple times.
separator: *elementName* — Whenever the specified element occurs in the XML message, the value of this property is prepended to any sequences in nodes below the specified element. See Sequence field example.
Upstream properties
indentGeneratedXML — If true, the generated XML is indented to make it easier to read. The default is false.
omitGeneratedXMLDeclaration — If true, the <?xml... ?> declaration at the start of the generated XML is not included. The default is false.
Performance properties
skipNullFields — A boolean that indicates whether you want the XML codec to omit nodes with null values from downstream, flattened, normalized events. Specify true to omit nodes with null values. The default is false.
The skipNullFields property applies to the name/value pairs for XML elements themselves. These have no associated data, so generating normalized event fields for them is not necessary unless they are required for ID rules. The skipNullFields property does not apply to a node whose value is an empty string.
Setting skipNullFields to true has no effect on the ordering suffixes that the codec adds to nodes. For example, consider an XML element that is deep within an XML hierarchy such as the following:
<root>
<a>
<b>
<c>
I want this string
</c>
</b>
</a>
</root>
In the downstream direction, the XML codec creates a normalized event that contains a dictionary of name/value pairs that includes an entry for each element. If you specify sibling suffixes and Test as the XML field name, the dictionary contains the following:
{ "Test.root/":null,
"Test.root/a#1/":null,
"Test.root/a#1/b#1/":null,
"Test.root/a#1/b#1/c#1/":null,
"Test.root/a#1/b#1/c#1/text()#1":"I want this string" }
Unless you require one of the null value fields for an ID rule, you do not need the null value fields. If you set skipNullFields to true, the XML codec drops the null value fields from the normalized event. In this example, the result is a dictionary with one entry:
{ "Test.root/a#1/b#1/c#1/text()#1":"I want this string" }
As you can see, this is much more lightweight. Turning this feature on can sometimes improve throughput by up to 1.5 times.
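The flattening shown in this example can be approximated in Python with the standard xml.etree.ElementTree module. This is a rough model for experimentation, not the XML codec's parser. The function name flatten is hypothetical, only sibling order suffixes are generated, CDATA is not distinguished from text, and whitespace-only text is assumed to produce no field.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text, xml_field, skip_null_fields=False):
    """Flatten an XML document into name/value pairs with sibling suffixes."""
    root = ET.fromstring(xml_text)
    fields = {}

    def visit(elem, path):
        if not skip_null_fields:
            fields[path] = None  # entry for the element itself
        for att_name, att_value in elem.attrib.items():
            fields[f"{path}@{att_name}"] = att_value
        # Child nodes in document order: leading text, then each child
        # element followed by the text that trails it.
        nodes = [("text", elem.text)]
        for child in elem:
            nodes.append(("element", child))
            nodes.append(("text", child.tail))
        position = 0
        for kind, node in nodes:
            if kind == "text":
                text = (node or "").strip()  # models trimXMLText=true
                if not text:
                    continue  # assumption: whitespace-only text is dropped
                position += 1
                fields[f"{path}text()#{position}"] = text
            else:
                position += 1
                visit(node, f"{path}{node.tag}#{position}/")

    visit(root, f"{xml_field}.{root.tag}/")
    return fields


if __name__ == "__main__":
    sample = "<root><a><b><c>\n I want this string\n</c></b></a></root>"
    for name, value in flatten(sample, "Test").items():
        print(name, "=", value)
    print(flatten(sample, "Test", skip_null_fields=True))
```

Running the sketch with skip_null_fields set to True leaves only the text() entry, which is the lightweight result described above.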
parseNode — Specify this property one or more times to identify only those nodes that you want parsed, flattened, and added to the normalized event.
By default, the XML codec parses, flattens, and adds all nodes to the normalized event. If you specify one or more parseNode property entries, the XML codec processes only the node or nodes specified by a parseNode property.
The value of a parseNode property can be any node path. The codec ignores order suffixes (#*n* or [*n*]) if you specify them in node paths. In other words, the codec parses all elements of the type specified in the parseNode property.
For example, suppose the value of the XML field property is Test and you have the following XML:
<root><a>ignore me</a><b>look at me</b><c>look at me</c><b>look at me again</b></root>
You can specify the following parseNode properties:
The XML codec produces the following dictionary entries:
"Test.root/b#1/text()#1" = "look at me"
"Test.root/c#2/text()#1" = "look at me"
"Test.root/b#3/text()#1" = "look at me again"
As you can see, the XML codec ignores the [99999999999] suffix.
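One way to picture how order suffixes are ignored in parseNode values is to strip them from both the configured node path and the generated field name before comparing. The Python sketch below does that. The helper names are hypothetical, and treating a match as an exact comparison of the stripped paths is an assumption made for illustration, not a statement about the codec's internals.

```python
import re

# Matches a sibling suffix (#n) or a twin suffix ([n]).
_ORDER_SUFFIX = re.compile(r"#\d+|\[\d+\]")


def strip_order_suffixes(node_path):
    """Remove every #n and [n] order suffix from a node path."""
    return _ORDER_SUFFIX.sub("", node_path)


def matches_parse_node(field_name, parse_nodes):
    """True if the field corresponds to one of the parseNode paths."""
    plain = strip_order_suffixes(field_name)
    return any(plain == strip_order_suffixes(p) for p in parse_nodes)


if __name__ == "__main__":
    print(strip_order_suffixes("Test.root/b[99999999999]/text()"))
    print(matches_parse_node("Test.root/b#3/text()#1",
                             ["Test.root/b[99999999999]/text()"]))
```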
Typically, you would specify the following parseNode properties:
For each mapping rule, specify a parseNode property whose value is the transport field for that rule.
For each ID rule in the adapter configuration file, specify a parseNode property whose value is the field name.
It is not necessary to specify parseNode properties for nodes identified by sequenceField or separator:*elementName* properties.
Setting the parseNode property prevents some nodes from being parsed. Consequently, the order of subsequent nodes might change, and therefore they would have different node order suffixes. For this reason, you probably want to set the logFlattenedXML property to true to see in what order suffixes are being generated before you add parseNode properties. Then add the parseNode properties and update the node paths used in mapping and ID rules as needed.
Specifying parseNode properties instead of parsing the entire document can result in very substantial throughput improvements. This is especially true for documents in which only a small proportion of the XML is actually going to be mapped.
Description of event fields that represent normalized XML
As mentioned before, a single XML field on the transport side is represented on the correlator side as a series of name/value fields, all prefixed by the value you specified for the XMLField property. This section describes how the XML codec names fields, based on the XML data.
Note that any field not specified as an XMLField for the XML codec passes through the system as normal. Such fields are not dropped or ignored.
If there is any uncertainty about the correct transport field names to use in the IAF mapping rules, try setting the logFlattenedXML codec property to true.
To preserve XML node ordering information, the codec adds ordering information to node names by appending a suffix according to the suffix generation mode that is enabled: either #1, #2, #3, and so on (sibling order), or “”, [2], [3], and so on (twin order).
The #*n* sibling format provides a total ordering across all child nodes under a given parent, specifying each node’s position relative to all of its sibling nodes. This suffix mode is the default. To turn it off, set the generateSiblingOrderSuffix codec property to false. Note that the root node never has a sibling order suffix because only one root exists. Sample field names:
The twin [*n*] format is insensitive to the order in which nodes appear as long as they have different names, and it specifies a node’s position relative to its twin nodes. (Twins are siblings with the same node name.) This suffix mode is disabled by default (for backwards compatibility). To turn it on, set the generateTwinOrderSuffix codec property to true. To improve readability the first sibling node with a given name has no suffix. That is, the [1] suffix is implicit. Sample field names:
Note that for a message to be correctly translated in the upstream direction (from the correlator), there do not have to be enough suffixes in the event to form a total order, but any suffixes that are provided will be used. In the absence of sibling order suffixes to determine ordering of different node types, the XML codec generates the XML nodes in the following order:
Text data
CDATA
Elements
The XML codec maps XML elements, attributes, CDATA and text data as described in the following sections. In the following topics, assume that the value of the XMLField property is Test.
Elements
An XML element maps to a field with the following characteristics:
The name is separated and terminated with the slash (/) character.
The value is an empty string ("").
For example, an element B nested inside an element A is represented in the normalized event as follows:
"Test.A/B#1/" = ""
When the XML codec generates XML for upstream events, it is not a requirement to have an associated field for every element. The XML codec automatically creates ancestor XML elements when they do not have associated fields. For example, consider the following field:
"Test.A/B#1/@att" = ""
If necessary, the codec creates the A and B element nodes.
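As a rough illustration of how ancestor elements can be derived from a field name alone, the following Java sketch splits a normalized field name into the element names it implies. The class and method names are hypothetical and are not part of the Apama API. The sketch assumes that the XMLField value is followed by a period, as in the examples in this section.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Apama API. */
final class AncestorPathSketch {

    /** Returns the element names implied by a normalized field name, outermost first. */
    static List<String> elementPath(String fieldName, String xmlField) {
        // Drop the XMLField prefix and the period that follows it, for example "Test."
        String path = fieldName.substring(xmlField.length() + 1);
        List<String> elements = new ArrayList<>();
        for (String segment : path.split("/")) {
            // Attributes, text data and CDATA are leaves, not elements.
            if (segment.isEmpty() || segment.startsWith("@")
                    || segment.startsWith("text()") || segment.startsWith("CDATA()")) {
                continue;
            }
            // Strip any trailing #n or [n] order suffixes.
            elements.add(segment.replaceAll("(#\\d+|\\[\\d+\\])+$", ""));
        }
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(elementPath("Test.A/B#1/@att", "Test")); // prints [A, B]
    }
}
```

For the field "Test.A/B#1/@att", the sketch yields the elements A and B, which are the nodes that would have to exist before the attribute can be attached.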
Element attributes
XML element attributes map to fields with names equal to the parent element’s field name, followed by @*att* where *att* is the name of the attribute, and the field’s value is the attribute value. For example, an attribute B of an element A with the value Hello is represented as follows:
"Test.A/@B" = "Hello"
CDATA
XML CDATA in an element maps to a field with a name equal to the parent element’s field name followed by CDATA() and a value that contains the text data. For example, an element A with CDATA " Hello " followed by sub-element B followed by CDATA " World " is represented as follows:
Text data
Text data in an XML element maps to a field with a name equal to the parent element’s field name followed by text(). The value of the field is the text data. Unless the trimXMLText property is set to false (the default is true), the codec strips leading and trailing whitespace from text data. For example, an element A that contains the text " Hello World " followed by sub-element B followed by text " ! " is represented as follows:
In the event of errors during XML parsing, the parser does the following:
Logs the errors in the IAF log file.
Tries to send a flattened, normalized event that contains the remaining fields to the semantic mapper.
Examples of conversions
Suppose that the value of the XMLField property is Test, and the value of the trimXMLText property is true. Consider the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<Message>
<ElementA>
Hello there
<ElementB/>
!
<ElementC/>
<![CDATA[Sample CDATA (with < and > comparison operators). ]]>
<ElementB att1="X" att2="Y">
<![CDATA[More CDATA in the same element.]]>
</ElementB>
</ElementA>
</Message>
With sibling order suffixing, this XML maps to the following normalized event fields:
"Test.Message/" = ""
"Test.Message/ElementA#1/" = ""
"Test.Message/ElementA#1/text()#1" = "Hello there"
"Test.Message/ElementA#1/ElementB#2/" = ""
"Test.Message/ElementA#1/text()#3" = "!"
"Test.Message/ElementA#1/ElementC#4/" = ""
"Test.Message/ElementA#1/CDATA()#5" =
    "Sample CDATA (with < and > comparison operators). "
"Test.Message/ElementA#1/ElementB#6/" = ""
"Test.Message/ElementA#1/ElementB#6/@att1" = "X"
"Test.Message/ElementA#1/ElementB#6/@att2" = "Y"
"Test.Message/ElementA#1/ElementB#6/CDATA()#1" =
    "More CDATA in the same element."
With twin order suffixing, the same XML maps to the following normalized event fields:
"Test.Message/" = ""
"Test.Message/ElementA/" = ""
"Test.Message/ElementA/text()" = "Hello there"
"Test.Message/ElementA/ElementB/" = ""
"Test.Message/ElementA/text()[2]" = "!"
"Test.Message/ElementA/ElementC/" = ""
"Test.Message/ElementA/CDATA()" =
    "Sample CDATA (with < and > comparison operators). "
"Test.Message/ElementA/ElementB[2]/" = ""
"Test.Message/ElementA/ElementB[2]/@att1" = "X"
"Test.Message/ElementA/ElementB[2]/@att2" = "Y"
"Test.Message/ElementA/ElementB[2]/CDATA()" =
    "More CDATA in the same element."
To construct the XML above (assuming element ordering matters, but allowing for text() concatenation), the following name/value pairs are all that is required:
"Test.Message/ElementA#1/text()#1" = "Hello there"
"Test.Message/ElementA#1/ElementB#2/" = ""
"Test.Message/ElementA#1/text()#3" = "!"
"Test.Message/ElementA#1/ElementC#4/" = ""
"Test.Message/ElementA#1/CDATA()#5" =
    "Sample CDATA (with < and > comparison operators). "
"Test.Message/ElementA#1/ElementB#6/@att1" = "X"
"Test.Message/ElementA#1/ElementB#6/@att2" = "Y"
"Test.Message/ElementA#1/ElementB#6/CDATA()#1" =
    "More CDATA in the same element."
With both sibling order suffixing and twin order suffixing set to true, the XML codec generates two field/value pairs for each node. For example, the same XML used in the previous two examples maps to the following:
"Test.Message/" = ""
"Test.Message/ElementA/" = ""
"Test.Message/ElementA#1/" = ""
"Test.Message/ElementA/text()" = "Hello there"
"Test.Message/ElementA#1/text()#1" = "Hello there"
"Test.Message/ElementA/ElementB/" = ""
"Test.Message/ElementA#1/ElementB#2/" = ""
"Test.Message/ElementA/text()[2]" = "!"
"Test.Message/ElementA#1/text()#3" = "!"
"Test.Message/ElementA/ElementC/" = ""
"Test.Message/ElementA#1/ElementC#4/" = ""
"Test.Message/ElementA/CDATA()" =
    "Sample CDATA (with < and > comparison operators). "
"Test.Message/ElementA#1/CDATA()#5" =
    "Sample CDATA (with < and > comparison operators). "
"Test.Message/ElementA/ElementB[2]/" = ""
"Test.Message/ElementA#1/ElementB#6/" = ""
"Test.Message/ElementA/ElementB[2]/@att1" = "X"
"Test.Message/ElementA#1/ElementB#6/@att1" = "X"
"Test.Message/ElementA/ElementB[2]/@att2" = "Y"
"Test.Message/ElementA#1/ElementB#6/@att2" = "Y"
"Test.Message/ElementA/ElementB[2]/CDATA()" =
    "More CDATA in the same element."
"Test.Message/ElementA#1/ElementB#6/CDATA()#1" =
    "More CDATA in the same element."
Since the suffix properties are orthogonal, you can set both to true, and the XML codec generates normalized fields with each kind of suffix. This allows you to use the same instance of the XML codec for XML elements that need sibling suffixing and XML elements that need twin suffixing. While this impacts memory usage according to the amount of XML data being normalized, you can specify mapping rules to filter for the fields of interest.
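The two suffix modes used in the examples above can be summarized with a small Java sketch. It is only an illustration of the numbering rules as this section describes them. The class and method names are hypothetical, and the sketch makes no claim about how the XML codec is implemented internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; not part of the Apama API. */
final class OrderSuffixSketch {

    /** Sibling mode: every child gets #n, its position among all of its siblings. */
    static List<String> siblingSuffixes(List<String> childNames) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < childNames.size(); i++) {
            result.add(childNames.get(i) + "#" + (i + 1));
        }
        return result;
    }

    /** Twin mode: [n] counts only same-named siblings; the implicit [1] is omitted. */
    static List<String> twinSuffixes(List<String> childNames) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String name : childNames) {
            int n = seen.merge(name, 1, Integer::sum); // occurrences of this name so far
            result.add(n == 1 ? name : name + "[" + n + "]");
        }
        return result;
    }

    public static void main(String[] args) {
        // Children of ElementA in the sample XML, in document order.
        List<String> children =
                List.of("text()", "ElementB", "text()", "ElementC", "CDATA()", "ElementB");
        System.out.println(siblingSuffixes(children));
        System.out.println(twinSuffixes(children));
    }
}
```

Applied to the children of ElementA, the sibling mode numbers the nodes #1 through #6, while the twin mode marks only the second text() node and the second ElementB node with [2].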
If you define the following mapping rules in the IAF configuration file, you can map these normalized event fields to and from string fields in a sequence field of an Apama event.
With these property values, the XML fragment maps to the following normalized event fields:
"MyXPathResult.last-a" = "A text 2"
"MyXPathResult.first-att" = "100.1"
"MyXPathResult.first-a-text" = "A text 1"
"MyXPathResult.att>200" = "true"
"MyXPathResult.att-count" = "3"
"MyXPathResult.text-contains" = "true"
The CSV codec IAF plug-in
The CSV codec plug-in (JCSVCodec) translates between comma-separated value (CSV) data and a sequence of string values. This codec (or the Fixed Width codec plug-in; see The Fixed Width codec IAF plug-in) can be used with the standard Apama File adapter to read data from files and to write data to files.
CSV format is a simple way to store data value by value. Consider an example CSV file that contains stock tick data. Each line in the file lists Symbol, Exchange, Current Price, Day High, and Day Low, in that order, as follows:
In this example, each field is separated from the next by a comma. You can use other characters as separators as long as you identify the separator character for the CSV codec.
To specify a separator character other than a comma, do one of the following:
Set the separator property in the IAF configuration file that you use to start the File adapter. For example:
<property name="separator" value=","/>
If you set the separator property, the codec uses that separator as its default. If you do not set the separator property and the codec does not receive any configuration events before it receives messages to encode or decode, the codec refuses to process the messages and throws an exception back to the module that called it. That module is either the transport or the semantic mapper, depending on whether the data is flowing downstream or upstream.
Optionally, you can also set the excelCompatible property in the IAF configuration file. By default, this is set to false. If set to true, Excel compatibility mode is enabled, and double quotes are then used to match the behavior of Excel. The separator property is still required when using the excelCompatible property. For example:
For an example configuration file, see adapters\config\JCSVCodec-example.xml.dist in the Apama installation directory. The JCSVCodec-example.xml.dist file itself should not be modified, but you can copy relevant sections of the XML code, modify the code as required for the purposes of your data, and then add the modified content to the adapter configuration file in which the codec is to be used.
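The exact quoting rules that the codec applies in Excel compatibility mode are not spelled out in this section. As a rough guide, the following Java sketch shows the convention that Excel itself follows when it writes separated data: a field is wrapped in double quotes if it contains the separator, a double quote, or a line break, and any embedded double quote is doubled. The class and method names are hypothetical, and the sketch is an assumption about typical behavior rather than a description of the JCSVCodec implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; assumes Excel-style quoting, not verified against JCSVCodec. */
final class ExcelQuotingSketch {

    /** Quotes a single field if it would otherwise be ambiguous. */
    static String quoteField(String field, char separator) {
        boolean needsQuotes = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        // Double any embedded quote, then wrap the whole field in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins the quoted fields into one line. */
    static String encodeLine(List<String> fields, char separator) {
        List<String> quoted = new ArrayList<>();
        for (String field : fields) {
            quoted.add(quoteField(field, separator));
        }
        return String.join(String.valueOf(separator), quoted);
    }

    public static void main(String[] args) {
        System.out.println(encodeLine(List.of("TSCO", "London, UK", "392.25"), ','));
    }
}
```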
Multiple configurations and the CSV codec
The CSV codec supports multiple configurations for interpreting separated data from different sources. A transport that is using the CSV codec can use the com.apama.iaf.plugin.ConfigurableCodec interface to set up different configurations for interpreting data from multiple sources that use different formats.
The transport can set a configuration by calling the following method on the codec:
public void addConfiguration(int sessionId,
NormalisedEvent configuration)
throws java.io.IOException
The sessionId represents the ID value for this configuration.
The normalized event should contain the following key/value pairs, stored as strings that the codec parses:
Key              Value
separator        The character that is to be used as the separator character, for example, a comma (,) or semicolon (;).
excelCompatible  Optional. If set to true, Excel compatibility mode is enabled. Double quotes are then used to match the behavior of Excel. Default: false.
The transport can remove a configuration by calling the following method:
The sessionId represents the ID value initially used to add the configuration with the addConfiguration() method.
Decoding CSV data from the sink to send to the correlator
To decode an event into a sequence of fields, the transport can then call:
public void sendTransportEvent(Object event, TimestampSet timestamps)
throws CodecException, SemanticMapperException
The event object is assumed to be a NormalisedEvent instance. It must contain a data key whose value is a string that holds the data to decode, that is, the line of separated data. The codec decodes the data and stores the value of each field in a string sequence. This sequence then replaces the original value of the data key.
If the event object also contains a sessionId key with an integer value associated with it, the value of the key identifies the configuration the codec uses to interpret the data. If the event does not contain a sessionId, the codec uses the default configuration as specified in the adapter configuration file.
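The decoding flow described above, including the choice between a session configuration and the default configuration, can be sketched in Java as follows. This is a simplified stand-in and not the JCSVCodec implementation: a plain Map takes the place of NormalisedEvent, the configure() method is a hypothetical simplification of addConfiguration(), Excel compatibility mode is ignored, and throwing when a sessionId has no registered configuration is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Stand-in for illustration only; not the JCSVCodec implementation. */
final class CsvDecodeSketch {
    private final Map<Integer, Character> separatorBySession = new HashMap<>();
    private final Character defaultSeparator; // null when no default is configured

    CsvDecodeSketch(Character defaultSeparator) {
        this.defaultSeparator = defaultSeparator;
    }

    /** Registers a per-session separator (hypothetical simplification). */
    void configure(int sessionId, char separator) {
        separatorBySession.put(sessionId, separator);
    }

    /** Replaces the string under "data" with the list of field values. */
    void decode(Map<String, Object> event) {
        Object sessionId = event.get("sessionId");
        Character separator = (sessionId != null)
                ? separatorBySession.get(sessionId)
                : defaultSeparator;
        if (separator == null) {
            throw new IllegalStateException("No configuration available for this event");
        }
        String line = (String) event.get("data");
        // A limit of -1 keeps trailing empty fields.
        String[] fields = line.split(Pattern.quote(String.valueOf(separator)), -1);
        event.put("data", new ArrayList<>(Arrays.asList(fields)));
    }

    public static void main(String[] args) {
        CsvDecodeSketch codec = new CsvDecodeSketch(',');
        Map<String, Object> event = new HashMap<>();
        event.put("data", "TSCO,L,392.25,400.25,382.25");
        codec.decode(event);
        List<?> fields = (List<?>) event.get("data");
        System.out.println(fields);
    }
}
```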
Encoding CSV data from the correlator for the sink
Encoding CSV data is the reverse of decoding. The semantic mapper calls:
public void sendNormalisedEvent(NormalisedEvent event,
TimestampSet timestamps)
throws CodecException, TransportException
The sendNormalisedEvent() method retrieves the data associated with the data key. The retrieved data is a sequence of strings, each of which contains the value of a field. The method encodes the sequence into a single line and stores the result in the data field, so that the transport can write the data to the sink. If the event contains a sessionId value, it identifies the configuration that the codec uses to encode the data. If the event does not contain a sessionId, the codec uses the default configuration specified in the configuration file that was used to start the adapter.
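A minimal Java sketch of the encoding step follows. As before, a plain Map stands in for NormalisedEvent, the class and method names are hypothetical, and no quoting is applied, so the sketch assumes that field values do not contain the separator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for illustration only; not the JCSVCodec implementation. */
final class CsvEncodeSketch {

    /** Replaces the string sequence under "data" with a single separated line. */
    static void encode(Map<String, Object> event, char separator) {
        @SuppressWarnings("unchecked")
        List<String> fields = (List<String>) event.get("data");
        event.put("data", String.join(String.valueOf(separator), fields));
    }

    public static void main(String[] args) {
        Map<String, Object> event = new HashMap<>();
        event.put("data", new ArrayList<>(List.of("TSCO", "L", "392.25", "400.25", "382.25")));
        encode(event, ',');
        System.out.println(event.get("data")); // prints TSCO,L,392.25,400.25,382.25
    }
}
```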
For a given event mapping in the IAF configuration file, it is not possible to dynamically specify the event decoder property, which identifies the codec that sends this event to the transport. Consequently, an adapter that is using several different codecs is unable to receive the same type of event from each codec. If it is necessary for your adapter to receive the same type of event from multiple codecs, set the event decoder property to the Null codec. This lets the transport receive the event and subsequently reroute the event back to the CSV codec by calling the following method:
The CSV codec then returns the encoded data to the transport.
The Fixed Width codec IAF plug-in
The Fixed Width codec plug-in (JFixedWidthCodec) translates between fixed width data and a sequence of string values. This codec (or the CSV codec plug-in) can be used with the standard Apama File adapter to read data from files and write data to files. For more information on the CSV codec, see The CSV codec IAF plug-in.
Fixed width data is a method of storing data fields in a packet or a line that is a fixed number of characters in size. Data stored in a fixed width format can be expressed by the following three parameters:
The field widths used (that is, the number of characters used for storing each field)
The padding character used if the data for a given field can be stored in less than the number of characters allocated for it
Whether the data is left-aligned or right-aligned within the field
For example, consider the following, which describes a tick with ordered properties:
symbol         6 characters
exchange       4 characters
current price  9 characters
day high       9 characters
day low        9 characters
If the pad character is "-", an example of a left-aligned line is as follows:
TSCO--L---392.25---400.25---382.25---
The following is an example of a right-aligned line:
--TSCO---L---392.25---400.25---382.25
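The two lines above can be reproduced with a short Java sketch that pads each value to its field width. The class and method names are hypothetical, and rejecting values that are wider than their field is an assumption. This section does not say how the Fixed Width codec handles that case.

```java
import java.util.List;

/** Illustrative only; not the JFixedWidthCodec implementation. */
final class FixedWidthEncodeSketch {

    /** Pads each value to its width and concatenates the fields into one line. */
    static String encode(List<String> values, int[] widths, char pad, boolean leftAligned) {
        if (values.size() != widths.length) {
            throw new IllegalArgumentException("Expected " + widths.length + " values");
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < widths.length; i++) {
            String value = values.get(i);
            if (value.length() > widths[i]) {
                throw new IllegalArgumentException("Value too wide for field: " + value);
            }
            String padding = String.valueOf(pad).repeat(widths[i] - value.length());
            // Left-aligned data is padded on the right, and vice versa.
            line.append(leftAligned ? value + padding : padding + value);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String> tick = List.of("TSCO", "L", "392.25", "400.25", "382.25");
        int[] widths = {6, 4, 9, 9, 9};
        System.out.println(encode(tick, widths, '-', true));
        System.out.println(encode(tick, widths, '-', false));
    }
}
```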
To specify fixed width data properties, do one of the following:
Set the fixed width properties in the IAF configuration file you use to start the adapter. For example, to obtain the left-aligned fixed width data above:
If you set all these properties, the codec uses them by default when decoding or encoding events.
If you do not set any of these properties, the codec expects to receive configuration events (as described in Multiple configurations and the Fixed Width codec), prior to receiving messages to encode or decode. Otherwise, the codec refuses to process these messages. The codec throws an exception back to the module that called it, which is either the transport or the semantic mapper depending on whether the data is flowing downstream or upstream.
If you require a default configuration, be sure to set all of these properties in the configuration file. If you set some of the properties, but not all of them, the codec cannot start.
For an example configuration file, see adapters\config\JFixedWidthCodec-example.xml.dist in the Apama installation directory. The JFixedWidthCodec-example.xml.dist file itself should not be modified, but you can copy relevant sections of the XML code, modify the code as required for the purposes of your data, and then add the modified content to the adapter configuration file in which the codec is to be used.
Multiple configurations and the Fixed Width codec
The Fixed Width codec supports multiple configurations for interpreting fixed width data from different sources. A transport that is using the Fixed Width codec can use the com.apama.iaf.plugin.ConfigurableCodec interface to set the configuration that you want the adapter to use.
The transport can set a configuration by calling the following method on the codec:
public void addConfiguration(int sessionId,
NormalisedEvent configuration)
throws java.io.IOException
The sessionId represents the ID value for this configuration.
The normalized event should contain the following key/value pairs, stored as strings that the Fixed Width codec can parse:
Key            Value
fieldLengths   A string sequence that contains the number of characters each field value is stored in. For example, "[5,6,5,9]" where the first value is stored in the first 5 characters, the second value is stored in the next 6 characters, and so on.
isLeftAligned  true or false, depending on whether data is left or right aligned in a field.
padCharacter   "_" where '_' is the pad character used when the data requires padding to fill the field.
The transport can remove a configuration by calling the following method:
The sessionId represents the ID value initially used to add the configuration using the addConfiguration() method.
Decoding fixed width data from the sink to send to the correlator
To decode an event into a sequence of fields, the transport calls the sendTransportEvent() method as follows:
public void sendTransportEvent(Object event, TimestampSet timestamps)
throws CodecException, SemanticMapperException
The event object is assumed to be a NormalisedEvent. It must contain a data key whose value is a string that holds the data to decode, that is, the line of fixed width data. The Fixed Width codec decodes the data and stores the value of each field in a string sequence. This sequence then replaces the original value of the data key.
If the event also contains a sessionId key with an integer value, that value identifies the configuration that the codec uses to interpret the data. If the event does not contain a sessionId, the codec uses the default configuration specified in the configuration file.
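The decoding step can be sketched in Java as follows: the line is cut into fields of the configured widths, and the pad character is stripped from the padded side of each field. The class and method names are hypothetical, and the sketch does not handle lines that are shorter than the sum of the field widths.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; not the JFixedWidthCodec implementation. */
final class FixedWidthDecodeSketch {

    /** Splits a fixed width line into field values, removing the padding. */
    static List<String> decode(String line, int[] widths, char pad, boolean leftAligned) {
        List<String> values = new ArrayList<>();
        int pos = 0;
        for (int width : widths) {
            String field = line.substring(pos, pos + width);
            pos += width;
            int start = 0;
            int end = field.length();
            if (leftAligned) {
                // Left-aligned data carries its padding on the right.
                while (end > start && field.charAt(end - 1) == pad) {
                    end--;
                }
            } else {
                // Right-aligned data carries its padding on the left.
                while (start < end && field.charAt(start) == pad) {
                    start++;
                }
            }
            values.add(field.substring(start, end));
        }
        return values;
    }

    public static void main(String[] args) {
        int[] widths = {6, 4, 9, 9, 9};
        System.out.println(decode("TSCO--L---392.25---400.25---382.25---", widths, '-', true));
        System.out.println(decode("--TSCO---L---392.25---400.25---382.25", widths, '-', false));
    }
}
```

Both sample lines from the start of this topic decode to the same five values: TSCO, L, 392.25, 400.25 and 382.25.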
Encoding fixed width data from the correlator for the sink
Encoding fixed width data is the reverse of decoding. The semantic mapper calls:
public void sendNormalisedEvent(NormalisedEvent event,
TimestampSet timestamps)
throws CodecException, TransportException
This method retrieves the data associated with the data key. The data is in a string sequence where each member contains the value of a field. The method encodes the sequence members into a single line to send to the transport so the transport can write the data to the sink. Finally, the method stores the result of the encoding in the data field again.
If the event contains a sessionId value, it identifies the configuration that the codec uses to encode the data. If the event does not contain a sessionId, the codec uses the default configuration specified in the File adapter configuration file that was used to start the File adapter.
For a given event mapping in the IAF configuration file, it is not possible to dynamically specify the event decoder property, which identifies the codec that sends the event to the transport. Consequently, an adapter that is using several different codecs is unable to receive the same type of event from each codec. If it is necessary for your adapter to receive the same type of event from multiple codecs, set the event decoder property to the Null codec. This lets the transport receive the event and subsequently reroute the event back to the Fixed Width codec by calling the following method: