Friday, May 6, 2011

Batching and Throttling

In an enterprise application, some processes are invoked for every message, for example an order status update or a logging or auditing service.
With plain asynchronous communication, a burst of messages can overload the service and push more load onto the underlying data service than it can handle in a given time, so this approach does not scale. Instead, we can use batching to group messages into batches and introduce a delay between batches (throttling). The batch size is generally determined by load testing the underlying data services to find out how many calls they can take at one time. Let's look at an example:

A client sends an ORDER_STATUS message that contains multiple orders. Each order requires a call to the updateStatus() web service to update its status in the database. There can be thousands of ORDERs in one ORDER_STATUS message, and the web service can't handle inserting them all at once. Let's say the web service can scale up to 50 orders every 5 seconds. The best approach is to send one batch at a time, with a delay of 4s between batches.
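The send-then-wait loop described above could look roughly like the following in BPEL 1.1. This is only a sketch under stated assumptions: the message is assumed to be already split into batches held in a variable called batchedOrdersVar, and every variable, partner link, and port type name (batchIndex, batchCount, updateStatusInputVar, OrderStatusService, svc:OrderStatusPortType) is hypothetical. It also assumes updateStatus is a one-way operation.

```xml
<!-- Sketch only (BPEL 1.1). All variable, partner link and port type names are
     hypothetical. Assumes batchedOrdersVar already holds the batches, one
     os:ORDER_STATUS element per batch. -->
<sequence name="SendBatchesThrottled">
  <assign name="InitCounters">
    <copy>
      <from expression="1"/>
      <to variable="batchIndex"/>
    </copy>
    <copy>
      <from expression="ora:countNodes('batchedOrdersVar', 'payload', '/tns:OrderStatusCollection/os:ORDER_STATUS')"/>
      <to variable="batchCount"/>
    </copy>
  </assign>
  <while name="ForEachBatch"
         condition="bpws:getVariableData('batchIndex') &lt;= bpws:getVariableData('batchCount')">
    <sequence>
      <!-- Select the batch at the current index -->
      <assign name="SelectBatch">
        <copy>
          <from expression="bpws:getVariableData('batchedOrdersVar', 'payload', '/tns:OrderStatusCollection')/os:ORDER_STATUS[position() = number(bpws:getVariableData('batchIndex'))]"/>
          <to variable="updateStatusInputVar" part="payload"/>
        </copy>
      </assign>
      <!-- One-way call assumed; a request-response operation would also need an outputVariable -->
      <invoke name="CallUpdateStatus" partnerLink="OrderStatusService"
              portType="svc:OrderStatusPortType" operation="updateStatus"
              inputVariable="updateStatusInputVar"/>
      <!-- Throttle: pause 4 seconds before sending the next batch -->
      <wait name="ThrottleDelay" for="'PT4S'"/>
      <assign name="NextBatch">
        <copy>
          <from expression="bpws:getVariableData('batchIndex') + 1"/>
          <to variable="batchIndex"/>
        </copy>
      </assign>
    </sequence>
  </while>
</sequence>
```

As written, the sketch also waits after the final batch; wrapping the wait in a condition on batchIndex would avoid that extra pause.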
The following snippet shows how we divide the message into multiple batches:

BatchOrderStatusMessage.xsl :

.....After Imports ...
<!-- Batch size, supplied by the caller -->
<xsl:param name="partitionCount"/>

<xsl:template match="/">
  <xsl:variable name="orderStatusCount"
                select="count(/os:ORDER_STATUS/ORDERS/ORDER)"/>
  <tns:OrderStatusCollection>
    <xsl:call-template name="groupOrderStatus">
      <xsl:with-param name="statusMessages"
                      select="/os:ORDER_STATUS/ORDERS/ORDER"/>
      <!-- Header fields: every child of ORDER_STATUS except ORDERS -->
      <xsl:with-param name="fields" select="/os:ORDER_STATUS/*[not(local-name()='ORDERS')]"/>
      <xsl:with-param name="groupSize" select="number($partitionCount)"/>
      <xsl:with-param name="counter" select="1"/>
    </xsl:call-template>
  </tns:OrderStatusCollection>
</xsl:template>

<!-- Emits one ORDER_STATUS per batch, then recurses on the remaining orders -->
<xsl:template name="groupOrderStatus">
  <xsl:param name="statusMessages"/>
  <xsl:param name="fields"/>
  <xsl:param name="groupSize"/>
  <xsl:param name="counter"/>
  <os:ORDER_STATUS>
    <xsl:copy-of select="$fields"/>
    <ORDERS>
      <!-- Current batch: the first $groupSize orders of the remaining set -->
      <xsl:for-each select="$statusMessages[position() &lt;= number($counter * $groupSize)]">
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </ORDERS>
  </os:ORDER_STATUS>
  <!-- Orders not yet placed in a batch -->
  <xsl:variable name="nextStatusMessages"
                select="$statusMessages[position() &gt; number($counter * $groupSize)]"/>
  <xsl:if test="count($nextStatusMessages) &gt; 0">
    <xsl:call-template name="groupOrderStatus">
      <xsl:with-param name="statusMessages" select="$nextStatusMessages"/>
      <xsl:with-param name="fields" select="$fields"/>
      <xsl:with-param name="groupSize" select="number($groupSize)"/>
      <!-- counter stays at 1 because $nextStatusMessages already excludes earlier batches -->
      <xsl:with-param name="counter" select="number($counter)"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
</xsl:stylesheet>
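To make the recursion concrete, here is a small worked example of what the stylesheet is intended to produce when partitionCount is 2. The SENDER, ID, and STATUS elements are hypothetical, since the post does not show the real message layout, and namespace declarations are left out for brevity.

```xml
<!-- Hypothetical input: one header field and three orders -->
<os:ORDER_STATUS>
  <SENDER>WEB</SENDER>
  <ORDERS>
    <ORDER><ID>1</ID><STATUS>SHIPPED</STATUS></ORDER>
    <ORDER><ID>2</ID><STATUS>SHIPPED</STATUS></ORDER>
    <ORDER><ID>3</ID><STATUS>CANCELLED</STATUS></ORDER>
  </ORDERS>
</os:ORDER_STATUS>

<!-- Intended output with partitionCount = 2: two batches, header copied into each -->
<tns:OrderStatusCollection>
  <os:ORDER_STATUS>
    <SENDER>WEB</SENDER>
    <ORDERS>
      <ORDER><ID>1</ID><STATUS>SHIPPED</STATUS></ORDER>
      <ORDER><ID>2</ID><STATUS>SHIPPED</STATUS></ORDER>
    </ORDERS>
  </os:ORDER_STATUS>
  <os:ORDER_STATUS>
    <SENDER>WEB</SENDER>
    <ORDERS>
      <ORDER><ID>3</ID><STATUS>CANCELLED</STATUS></ORDER>
    </ORDERS>
  </os:ORDER_STATUS>
</tns:OrderStatusCollection>
```

The last batch simply holds whatever is left over, so it can be smaller than the batch size.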

Here, partitionCount is the batch size. It is passed to the stylesheet through the parameters argument of the XSLT call, i.e.:

ora:processXSLT('BatchOrderStatusMessage.xsl',bpws:getVariableData('processOrderStatusInputVar','payload'),bpws:getVariableData('paramsVar'))

where paramsVar is declared as the element "parameters" from the XSD below:

<xsd:element name="parameters">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="item" minOccurs="1" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="value" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
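For illustration, a paramsVar value that sets the batch size to the 50 orders used in the example above might look like this. The namespace is omitted because the post does not show the schema's targetNamespace, so treat this as a sketch of the shape rather than a copy-paste payload.

```xml
<!-- One item per stylesheet parameter; name must match the xsl:param name -->
<parameters>
  <item>
    <name>partitionCount</name>
    <value>50</value>
  </item>
</parameters>
```

Because value is an xsd:string, the stylesheet converts it with number($partitionCount) before using it as the group size.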


Summary: Batching can help when the payload is very large and needs to be split into multiple batches that are processed further down the flow.