Retry Queues Integration

The Retry Queues integration lets you configure the retry queues used by specific Workflow Engine functions, notably exportViaRestWithRetry. The standard exportViaRest and sendViaRest functions do not retry a failed REST request: they log the failure and take no further action. Where an outbound request needs to be retried on failure, use a retry queue together with a retrying function.

A retry queue defines how failed requests are retried: the maximum retry count, the retry interval, and the workflow that performs the retry.

Constraints

  • The retry queue is held in memory. This queue is shared between moog_farmd instances. If both primary and secondary moog_farmd instances are shut down, then the retry queue is lost and is not recreated on startup.

  • The retry queue is not aware of the error that caused a retry to be attempted. For example, if a request failed due to an authentication error, it would still be retried, even if it was liable to fail again without intervention.

  • The retry queue does not hold the failed payload itself; it holds only the Alert or Situation Id. When the retry is attempted, the current state and contents of the Alert or Situation are sent to the configured workflow.

    For example, if an alert was being exported with a severity of 5, but the alert cleared between the export failure and the retry attempt, the cleared alert would be the payload.

  • The user is not notified of retries that exceed the retry limit except via log entries.
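
The id-only constraint can be illustrated with a minimal sketch. The store, the queue, and the buildRetryPayloads helper are hypothetical stand-ins rather than product APIs, and severity 0 is assumed to represent a cleared alert.

```javascript
// Hypothetical in-memory stand-ins; none of these names are product APIs.
const alertStore = new Map(); // alert id -> current alert state
const retryQueue = new Set(); // holds ids only, never payloads

alertStore.set(101, { alert_id: 101, severity: 5 });

// The export fails, so only the id is queued.
retryQueue.add(101);

// Before the retry runs, the alert clears (severity 0 is an assumption here).
alertStore.get(101).severity = 0;

// On retry, the payload is whatever the alert looks like at that moment.
function buildRetryPayloads(queue, store) {
  return [...queue].map((id) => store.get(id));
}

const payloads = buildRetryPayloads(retryQueue, alertStore);
console.log(payloads[0].severity); // 0, not the original 5
```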

Installation

The Retry Queue behavior relies on the Scheduler Moolet and an associated Moobot JavaScript file (MOCScheduler.js). A Moolet configuration file (scheduler-235.conf) and the base Moobot are distributed as part of the Add-ons.

Important

If the Scheduler Moolet is not currently in use, then these files can be used as-is (rename scheduler-235.conf to scheduler.conf). If the Scheduler Moolet is already in use, then you need to merge the Retry Queues Moobot contents, including the job schedule itself, into the existing Scheduler Moobot file.

  1. Ensure the Scheduler Moolet is included in the moog_farmd.conf configuration file. The Add-ons ship with a Moolet configuration scheduler-235.conf that you can use.

    { include : "scheduler.conf" }

  2. Ensure that the Scheduler Moobot contains the job schedules associated with the Retry Queues.

    scheduler.scheduleJob(this, "retry", 10, 30);
    function retry() {
      ...
    }

    By default, the retry job runs 10 seconds after moog_farmd startup, and every 30 seconds thereafter. This is not the retry interval; it is how often the scheduled job checks whether entries in a queue need to be retried. A Retry Queue configured with an interval of 60 seconds is therefore retried on every second run of the retry job.
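
The relationship between the job schedule and a queue's retry interval can be worked through with a small sketch. It models only the timing described above, not the Scheduler Moolet itself. The retryRunTimes function is hypothetical, and it assumes a queue is processed on the first job run at or after its interval has elapsed.

```javascript
// Sketch only: returns the times (seconds after startup) at which a queue
// with the given retry interval would be processed by the retry job.
function retryRunTimes(startDelaySec, jobPeriodSec, retryIntervalSec, jobRuns) {
  const attempts = [];
  let lastAttemptSec = startDelaySec; // assumption: timing starts at the first job run
  for (let run = 1; run < jobRuns; run++) {
    const nowSec = startDelaySec + run * jobPeriodSec;
    if (nowSec - lastAttemptSec >= retryIntervalSec) {
      attempts.push(nowSec);
      lastAttemptSec = nowSec;
    }
  }
  return attempts;
}

// Default schedule (10 s delay, 30 s period), queue interval of 60 s, 7 job runs.
const attempts = retryRunTimes(10, 30, 60, 7);
console.log(attempts); // [ 70, 130, 190 ]
```

The job runs at 10, 40, 70, 100, 130, 160 and 190 seconds, and the queue is processed on every second run.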

Configuration

A Retry Queue has the following configuration items:

| Configuration item | Required? | Description |
| --- | --- | --- |
| Queue name | yes | A unique name for the queue, referenced by any WFE action that uses the queue. |
| Queue type | yes | Alert or Situation. The type of object that is held in this queue. Since Alert and Situation Ids can collide, this is needed to ensure the correct object is fetched for the retry. |
| Retry interval | yes | How often a retry should be attempted (in seconds). A retry is attempted on the Scheduler “retry” job closest to this interval. |
| Max retries | yes | The maximum number of retries to attempt. When the retry count exceeds this value, retries stop. However, a subsequent failed request for the same Alert or Situation Id pushes the Alert or Situation back onto the retry queue, so exceeding the max retries value does not prevent further retries. |
| Workflow Engine Name | yes | The name of the workflow engine that contains the retry workflow. This must be an Inform engine. |
| Workflow Name | yes | The name of the workflow within the defined workflow engine to pass the Alert or Situation to. |
| Discard Closed Items | no | A checkbox. When this option is selected, the retry is aborted if the Alert or Situation is closed at the time of the retry. |

Workflow

The retry workflow:

Within a workflow function (for example, exportViaRestWithRetry):

  • If the export action fails:

    If a retry queue is configured in the action, the CEvent id is added to the retry queue.

  • If the export action succeeds:

    If a retry queue is configured in the action and it contains the CEvent id, it is removed (no more retries are attempted).
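
The two branches above can be sketched as follows. This is a hypothetical in-memory model, not the implementation of exportViaRestWithRetry: the retryQueues map, the exportWithRetrySketch function, and the doExport callback are all illustrative names.

```javascript
// Hypothetical model; the real queue lives inside moog_farmd.
const retryQueues = new Map([["exportQueue", new Set()]]);

// doExport stands in for the REST call and returns true on success.
function exportWithRetrySketch(ceventId, queueName, doExport) {
  let succeeded;
  try {
    succeeded = doExport(ceventId) === true;
  } catch (err) {
    succeeded = false; // a thrown error counts as a failed export
  }
  const queue = retryQueues.get(queueName); // undefined when no queue is configured
  if (queue) {
    if (succeeded) {
      queue.delete(ceventId); // success: no more retries
    } else {
      queue.add(ceventId); // failure: queue the id only
    }
  }
  return succeeded;
}

exportWithRetrySketch(7, "exportQueue", () => false);
const queuedAfterFailure = retryQueues.get("exportQueue").has(7); // true
exportWithRetrySketch(7, "exportQueue", () => true);
const queuedAfterSuccess = retryQueues.get("exportQueue").has(7); // false
```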

retry_workflow.png

The Retry scheduled job:

Iterate over configured retry queues:

  • If the retry interval has passed:

    • Iterate over the items in the queue and send to the configured workflow if the retry count has not been exceeded.

    • Increment the retry counter.

      The export workflow removes the item from the list if it is successful.

  • If the retry interval has not passed, then ignore this run.
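
A minimal sketch of the scheduled job logic follows. The queue object shape, runRetryJob, and the sendToWorkflow and log callbacks are hypothetical. Dropping an item once its retry limit is reached is also an assumption, since the source only states that retries stop and that a log entry is written.

```javascript
// Sketch only: one pass of the retry job over all configured queues.
function runRetryJob(queues, nowSec, sendToWorkflow, log) {
  for (const queue of queues) {
    if (nowSec - queue.lastRunSec < queue.intervalSec) {
      continue; // retry interval has not passed: ignore this run
    }
    queue.lastRunSec = nowSec;
    for (const [id, item] of queue.items) {
      if (item.retries >= queue.maxRetries) {
        log(`Retry limit reached for ${queue.type} ${id} in queue ${queue.name}`);
        queue.items.delete(id); // assumption: exhausted items are dropped
        continue;
      }
      item.retries += 1;
      // A successful export in the retry workflow is what removes the id.
      sendToWorkflow(queue.engine, queue.workflow, queue.type, id);
    }
  }
}

// Usage with a single hypothetical queue whose exports keep failing.
const demoQueue = {
  name: "exportQueue", type: "Alert", intervalSec: 60, maxRetries: 2,
  engine: "Alert Inform Engine", workflow: "Retry Export",
  lastRunSec: 0, items: new Map([[7, { retries: 0 }]]),
};
const sent = [];
const logged = [];
for (const nowSec of [30, 60, 120, 180]) {
  runRetryJob([demoQueue], nowSec, (e, w, t, id) => sent.push(id), (m) => logged.push(m));
}
console.log(sent.length, logged.length, demoQueue.items.size); // 2 1 0
```

In this run the job skips the pass at 30 seconds, retries at 60 and 120 seconds, and gives up at 180 seconds with a log entry.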

retry_scheduled_job_workflow.png

Using the Retry Queues in a workflow

The following functions support retry queues:

Both functions have optional parameters that allow a retry queue to be defined.

retry_AlertExport.png

The value entered here must match a retry queue name defined in the Retry Queues integration.

retry_queues_settings.png

The Retry workflow

The Retry scheduled job acquires the CEvent object and passes it to a specified WFE Inform engine. By default, an Alert Inform Engine and Situation Inform Engine are configured (v 8.0.x), and you can configure the retry workflow in these.

  • The retry workflow should be a duplicate of the original workflow: every action in the original workflow, including entry filters and payload manipulation, should also be present in the retry workflow.

  • The export action within the retry workflow should use the same export action with retry and the same Retry Queue as the original. Failure to do this may result in retries after a successful export.
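
The reason for the second point can be shown with a small sketch. Both export functions below are hypothetical stand-ins for a retry-aware action and a plain export action. The sketch assumes that only the retry-aware action removes an id from the queue on success.

```javascript
// Id 42 is waiting in the queue after an earlier failed export.
const queue = new Set([42]);

// Retry-aware export: updates the named queue according to the outcome.
function exportWithQueue(id, q, succeeded) {
  if (succeeded) {
    q.delete(id);
  } else {
    q.add(id);
  }
  return succeeded;
}

// Plain export: knows nothing about the queue.
function exportWithoutQueue(id, succeeded) {
  return succeeded;
}

// Retry workflow built with the plain action: the export succeeds,
// but the id stays queued and is retried again on the next interval.
exportWithoutQueue(42, true);
const queuedAfterPlainExport = queue.has(42); // true

// Retry workflow built with the retry-aware action and the same queue.
exportWithQueue(42, queue, true);
const queuedAfterRetryAwareExport = queue.has(42); // false
```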