The DuplicateContentCheck microservice classifies input messages as duplicate or unique, based on the content found in the input XML at the XPath configured in the CPS. Depending on the Cache Duration Time and Sleep Interval settings, input messages are retained in the cache and sent out at each sleep interval.

Configuration


Figure 1: Component Property Sheet

Validate Input

  • yes: The microservice validates the input received.
  • no: The microservice does not validate the input, which improves performance.

    Setting this to "no" may cause undesired results if the input XML is not valid.

Error handling configuration

Configures the actions to take when an exception occurs during execution. The configuration options are:

  • JMS Error
  • Response Generation Error
  • Request Processing Error
  • Invalid Request Error

For descriptions, refer to the Error Handling section in the Common Configurations page.

Input Source

The type of the input source. Choose one of the following:

  • XML
  • Text

Configure schema of input

Use the schema editor to load or enter the XML schema for the input message. This property is visible if Input Source is set to XML.


Figure 2: Configure input schema

Sleep Interval

The interval (in milliseconds) at which duplicate messages are throttled through the Duplicate port.

Cache Duration Time

The duration for which a message is retained in the cache. If the duration elapses while the cache is not empty, the cache is not cleared until all messages have been throttled to the Duplicate port.

Cache Duration Time Unit

The unit of the Cache Duration Time. The default unit is milliseconds; the other options are:

  • Seconds
  • Minutes
  • Hours
  • Days
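
The relationship between these units can be sketched as a small conversion to milliseconds. This is only an illustration: the helper name `cache_duration_millis` and the unit table are assumptions, not part of the microservice.

```python
# Conversion factors to milliseconds; the unit names mirror the CPS options.
MILLIS_PER_UNIT = {
    "Milliseconds": 1,
    "Seconds": 1_000,
    "Minutes": 60_000,
    "Hours": 3_600_000,
    "Days": 86_400_000,
}


def cache_duration_millis(value, unit="Milliseconds"):
    """Return the cache duration in milliseconds (hypothetical helper)."""
    if value < 0:
        raise ValueError("cache duration must be non-negative")
    return value * MILLIS_PER_UNIT[unit]


print(cache_duration_millis(5, "Minutes"))  # 300000
```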

Duplicate Identifier Source

The part of the message used to check whether it is a duplicate of an earlier message. Choose one of the following:

  • Body (Text or XML)
  • Header

XPath

Choose the XPath from which the content is extracted. This property is visible if Duplicate Identifier Source is set to Body.


Figure 3: Configure XPath

Property Name

Specify the name of the property whose value is used to check whether the message is a duplicate of an earlier message. This property is visible if Duplicate Identifier Source is set to Header.

For more information, refer to the XPath Editor section in the Message Body XPath selector page.
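
As a rough illustration of the two identifier sources, the sketch below picks the comparison value either from the message body (through an XPath) or from a header property. It is not the microservice's implementation: the function `duplicate_identifier`, the sample `<Book>` document, and the `OrderId` header are all hypothetical, and Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET


def duplicate_identifier(body, headers, source="BODY", xpath=None, property_name=None):
    """Return the value used to compare a message with earlier ones."""
    if source == "HEADER":
        # Header source: look up the configured property name.
        return headers.get(property_name)
    # Body source: evaluate the XPath relative to the root element.
    root = ET.fromstring(body)
    node = root.find(xpath)
    return None if node is None else (node.text or "").strip()


# Hypothetical sample message; the real schema is whatever is set in the CPS.
sample = "<Book><Title>Sample</Title><Author>Jane Doe</Author></Book>"

print(duplicate_identifier(sample, {}, source="BODY", xpath="./Author"))
print(duplicate_identifier(sample, {"OrderId": "42"}, source="HEADER",
                           property_name="OrderId"))
```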

Functional Demonstration

Scenario 1

Configure the component as shown in Figure 1 and send input messages using Feeder. Messages with the same "Author" are classified as unique (the first occurrence) or duplicate (later occurrences) and sent to the Display microservices connected to the respective ports.


Figure 4: Sample Event Process

Sample Input

Send 10 messages with the same "Author" from Feeder.
 
Figure 5: Sending messages with same "Author" using Feeder

Sample Output

Unique port

The display window for the Unique port shows only one message. The first message is classified as unique because the cache is empty when it arrives.


Figure 6: Unique Port display window showing the unique message

Duplicate port

The display window for the Duplicate port shows 9 messages, since they have the same Author as the first message.


Figure 7: Duplicate Port display window showing messages having the same Author as the first message
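
The outcome of this scenario can be reproduced with a minimal sketch of the classification logic. It assumes a simple in-memory set in place of the cache and ignores Cache Duration Time and Sleep Interval entirely; the `classify` function and the `<Book>` message layout are hypothetical.

```python
import xml.etree.ElementTree as ET


def classify(messages, xpath):
    """Split messages into (unique, duplicate) lists by the text at `xpath`."""
    seen = set()  # stands in for the cache; expiry is not modelled here
    unique, duplicate = [], []
    for body in messages:
        node = ET.fromstring(body).find(xpath)
        key = None if node is None else node.text
        if key in seen:
            duplicate.append(body)
        else:
            seen.add(key)
            unique.append(body)
    return unique, duplicate


# Ten hypothetical messages that all share the same Author.
messages = [
    f"<Book><Title>Title {i}</Title><Author>Jane Doe</Author></Book>"
    for i in range(10)
]
unique, duplicate = classify(messages, "./Author")
print(len(unique), len(duplicate))  # 1 9
```

The first message finds an empty set and is treated as unique; the remaining nine match its Author and are treated as duplicates, which mirrors Figures 6 and 7.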

When the component is restarted, some acknowledgments may not have been received, so the corresponding messages are redelivered. Redelivered messages are sent to the Duplicate port.

To reduce the number of such duplicate messages, set the Acknowledge Mode of the microservice's input port to "Auto": click the DuplicateContentCheck input port and select "Auto" for the Acknowledge Mode property under Properties > Messaging.


Figure 8: Setting the Acknowledge mode to "Auto" to reduce the number of duplicate messages

Useful Tips

  • The CPS does not accept negative values for the sleep interval or the cache duration. Clicking the Validate button shows a Success message only if the values entered are non-negative.
  • If the cache duration is zero, the component sends messages to the Unique port with no gap, even if the sleep interval is non-zero.
  • If the sleep interval is zero, the cache is cleared according to the cache duration time. No messages are lost while the cache is cleared.
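
The non-negative rule from the first tip can be pictured as a simple check. This is a sketch only: the function `validate_settings` and its messages are assumptions and do not reflect how the CPS performs validation internally.

```python
def validate_settings(sleep_interval_ms, cache_duration):
    """Return a list of validation errors; an empty list means the values pass."""
    errors = []
    if sleep_interval_ms < 0:
        errors.append("Sleep Interval must be non-negative")
    if cache_duration < 0:
        errors.append("Cache Duration Time must be non-negative")
    return errors


print(validate_settings(1000, 0))   # []
print(validate_settings(-1, 5000))  # ['Sleep Interval must be non-negative']
```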