Scheduling Tasks and Events

The OpenIDM scheduler enables you to schedule reconciliation and synchronization tasks, trigger scripts, collect and run reports, trigger workflows, and perform custom logging.

OpenIDM supports cron-like syntax to schedule events and tasks, based on expressions supported by the Quartz Scheduler (bundled with OpenIDM).

If you use configuration files to schedule tasks and events, you must place the schedule files in your project’s conf/ directory. By convention, OpenIDM uses file names of the form schedule-schedule-name.json, where schedule-name is a logical name for the scheduled operation, for example, schedule-reconcile_systemXmlAccounts_managedUser.json. There are several example schedule configuration files in the openidm/samples/schedules directory.

You can configure OpenIDM to pick up changes to scheduled tasks and events dynamically, during initialization and also at runtime. For more information, see "Changing the Default Configuration".

In addition to the fine-grained scheduling facility, you can perform a scheduled batch scan for a specified date in OpenIDM data, and then automatically run a task when this date is reached. For more information, see "Scanning Data to Trigger Tasks".

Scheduler Configuration

Schedules are configured through JSON objects. The schedule configuration involves three files:

The boot.properties file, where you can enable persistent schedules.
The scheduler.json file, that configures the overall scheduler service.
One schedule-schedule-name.json file for each configured schedule.

In the boot properties configuration file (project-dir/conf/boot/boot.properties), the instance type is standalone and persistent schedules are enabled by default:

# valid instance types for node include standalone, clustered-first, and clustered-additional
openidm.instance.type=standalone

# enables the execution of persistent schedulers
openidm.scheduler.execute.persistent.schedules=true

The scheduler service configuration file (project-dir/conf/scheduler.json) governs the configuration for a specific scheduler instance, and has the following format:

{
    "threadPool" : {
        "threadCount" : "10"
    },
    "scheduler" : {
        "executePersistentSchedules" : "&{openidm.scheduler.execute.persistent.schedules}"
    }
}

The properties in the scheduler.json file relate to the configuration of the Quartz Scheduler:

threadCount specifies the maximum number of threads that are available for running scheduled tasks concurrently.
executePersistentSchedules allows you to disable persistent schedules for a specific node. If this parameter is set to false, the Scheduler Service will support the management of persistent schedules (CRUD operations) but it will not run any persistent schedules. The value of this property can be a string or boolean and is true by default.

Note that changing the value of the openidm.scheduler.execute.persistent.schedules property in the boot.properties file changes the scheduler that manages scheduled tasks on that node. Because the persistent and in-memory schedulers are managed separately, a situation can arise where two separate schedules have the same schedule name.
advancedProperties (optional) enables you to configure additional properties for the Quartz Scheduler.

In clustered environments, the scheduler service obtains an instanceID, and checkin and timeout settings from the cluster management service (defined in the project-dir/conf/cluster.json file).

For details of all the configurable properties for the Quartz Scheduler, see the Quartz Scheduler Configuration Reference.

Each schedule configuration file (project-dir/conf/schedule-schedule-name.json) has the following format:

{
 "enabled"             : true,
 "persisted"           : false,
 "concurrentExecution" : false,
 "type"                : "cron",
 "startTime"           : "(optional) time",
 "endTime"             : "(optional) time",
 "schedule"            : "cron expression",
 "misfirePolicy"       : "optional, string",
 "timeZone"            : "(optional) time zone",
 "invokeService"       : "service identifier",
 "invokeContext"       : "service specific context info",
 "invokeLogLevel"      : "(optional) level"
}

The schedule configuration properties are defined as follows:

enabled

Set to true to enable the schedule. When this property is false, OpenIDM considers the schedule configuration dormant, and does not allow it to be triggered or launched.

If you want to retain a schedule configuration, but do not want it used, set enabled to false for task and event schedulers, instead of changing the configuration or cron expressions.

persisted (optional)

Specifies whether the schedule state should be persisted or stored in RAM. Boolean (true or false), false by default.

In a clustered environment, this property must be set to true to have the schedule fire only once across the cluster. For more information, see "Configuring Persistent Schedules".

concurrentExecution

Specifies whether multiple instances of the same schedule can run concurrently. Boolean (true or false), false by default. Multiple instances of the same schedule cannot run concurrently by default. This setting prevents a new scheduled task from being launched before the same previously launched task has completed. For example, under normal circumstances you would want a LiveSync operation to complete before the same operation was launched again. To enable multiple schedules to run concurrently, set this parameter to true. The behavior of missed scheduled tasks is governed by the misfirePolicy.

type

Currently OpenIDM supports only cron.

startTime (optional)

Used to start the schedule at some time in the future. If this parameter is omitted, empty, or set to a time in the past, the task or event is scheduled to start immediately.

Use ISO 8601 format to specify times and dates (YYYY-MM-DD Thh:mm :ss).

endTime (optional)

Used to plan the end of scheduling.

schedule

Takes cron expression syntax. For more information, see the CronTrigger Tutorial and Lesson 6: CronTrigger.

misfirePolicy

For persistent schedules, this optional parameter specifies the behavior if the scheduled task is missed, for some reason. Possible values are as follows:

fireAndProceed. The first run of a missed schedule is immediately launched when the server is back online. Subsequent runs are discarded. After this, the normal schedule is resumed.
doNothing. All missed schedules are discarded and the normal schedule is resumed when the server is back online.

timeZone (optional)

If not set, OpenIDM uses the system time zone.

invokeService

Defines the type of scheduled event or action. The value of this parameter can be one of the following:

sync for reconciliation
provisioner for LiveSync
script to call some other scheduled operation defined in a script
taskScanner to define a scheduled task that queries a set of objects. For more information, see "Scanning Data to Trigger Tasks".

invokeContext

Specifies contextual information, depending on the type of scheduled event (the value of the invokeService parameter).

The following example invokes reconciliation:

{
    "invokeService": "sync",
    "invokeContext": {
        "action": "reconcile",
        "mapping": "systemLdapAccount_managedUser"
    }
}

For a scheduled reconciliation task, you can define the mapping in one of two ways:

Reference a mapping by its name in sync.json, as shown in the previous example. The mapping must exist in your project’s conf/sync.json file.
Add the mapping definition inline by using the mapping property, as shown in "Specifying the Mapping as Part of the Schedule".

The following example invokes a LiveSync operation:
```
{
    "invokeService": "provisioner",
    "invokeContext": {
        "action": "liveSync",
        "source": "system/OpenDJ/__ACCOUNT__"
    }
}
```
For scheduled LiveSync tasks, the source property follows OpenIDM’s convention for a pointer to an external resource object and takes the form system/resource-name/object-type.

The following example invokes a script, which prints the string Hello World to the OpenIDM log (/openidm/logs/openidm0.log.X).

{
    "invokeService": "script",
    "invokeContext": {
        "script": {
            "type": "text/javascript",
            "source": "console.log('Hello World');"
        }
    }
}

+ Note that these are sample configurations only. Your own schedule configuration will differ according to your specific requirements.

invokeLogLevel (optional)

Specifies the level at which the invocation will be logged. Particularly for schedules that run very frequently, such as LiveSync, the scheduled task can generate significant output to the log file, and you should adjust the log level accordingly. The default schedule log level is info. The value can be set to any one of the SLF4J log levels:

trace
debug
info
warn
error
fatal

Schedules and Daylight Savings Time

The schedule service uses Quartz cronTrigger syntax. CronTrigger schedules jobs to fire at specific times with respect to a calendar (rather than every N milliseconds). This scheduling can cause issues when clocks change for daylight savings time (DST) if the trigger time falls around the clock change time in your specific time zone.

Depending on the trigger schedule, and on the daylight event, the trigger might be skipped or might appear not to fire for a short period. This interruption can be particularly problematic for liveSync where schedules execute continuously. In this case, the time change (for example, from 02:00 back to 01:00) causes an hour break between each liveSync execution.

To prevent DST from having an impact on your schedules, set the time zone of the schedule to Coordinated Universal Time (UTC). UTC is never subject to DST, so schedules will continue to fire as normal.

Configuring Persistent Schedules

By default, scheduling information, such as schedule state and details of the schedule run, is stored in RAM. This means that such information is lost when OpenIDM is rebooted. The schedule configuration itself (defined in your project’s conf/schedule-schedule-name.json file) is not lost when OpenIDM is shut down, and normal scheduling continues when the server is restarted. However, there are no details of missed schedule runs that should have occurred during the period the server was unavailable.

You can configure schedules to be persistent, which means that the scheduling information is stored in the internal repository rather than in RAM. With persistent schedules, scheduling information is retained when OpenIDM is shut down. Any previously scheduled jobs can be rescheduled automatically when OpenIDM is restarted.

Persistent schedules also enable you to manage scheduling across a cluster (multiple OpenIDM instances). When scheduling is persistent, a particular schedule will be launched only once across the cluster, rather than once on every OpenIDM instance. For example, if your deployment includes a cluster of OpenIDM nodes for high availability, you can use persistent scheduling to start a reconciliation operation on only one node in the cluster, instead of starting several competing reconciliation operations on each node.

Persistent schedules rely on timestamps. In a deployment where OpenIDM instances run on separate machines, you must synchronize the system clocks of these machines using a time synchronization service that runs regularly. The clocks of all machines involved in persistent scheduling must be within one second of each other. For information on how you can achieve this using the Network Time Protocol (NTP) daemon, see the NTP RFC.

To configure persistent schedules, set persisted to true in the schedule configuration file (schedule-schedule-name.json).

If OpenIDM is down when a scheduled task was set to occur, one or more runs of that schedule might be missed. To specify what action should be taken if schedules are missed, set the misfirePolicy in the schedule configuration file. The misfirePolicy determines what OpenIDM should do if scheduled tasks are missed. Possible values are as follows:

fireAndProceed. The first run of a missed schedule is immediately implemented when the server is back online. Subsequent runs are discarded. After this, the normal schedule is resumed.
doNothing. All missed schedules are discarded and the normal schedule is resumed when the server is back online.

Schedule Examples

The following example shows a schedule for reconciliation that is not enabled. When the schedule is enabled ("enabled" : true,), reconciliation runs every 30 minutes, starting on the hour:

{
    "enabled": false,
    "persisted": false,
    "type": "cron",
    "schedule": "0 0/30 * * * ?",
    "invokeService": "sync",
    "invokeContext": {
        "action": "reconcile",
        "mapping": "systemLdapAccounts_managedUser"
    }
}

The following example shows a schedule for LiveSync enabled to run every 15 seconds, starting at the beginning of the minute. The schedule is persisted, that is, stored in the internal repository rather than in memory. If one or more LiveSync runs are missed, as a result of OpenIDM being unavailable, the first run of the LiveSync operation is implemented when the server is back online. Subsequent runs are discarded. After this, the normal schedule is resumed:

{
    "enabled": true,
    "persisted": true,
    "misfirePolicy" : "fireAndProceed",
    "type": "cron",
    "schedule": "0/15 * * * * ?",
    "invokeService": "provisioner",
    "invokeContext": {
        "action": "liveSync",
        "source": "system/ldap/account"
    }
}

Managing Schedules Over REST

OpenIDM exposes the scheduler service under the /openidm/scheduler context path. The following examples show how schedules can be created, read, updated, and deleted, over REST, by using the scheduler service. The examples also show how to pause and resume scheduled tasks, when an OpenIDM instance is placed in maintenance mode. For information about placing OpenIDM in maintenance mode, see "Placing an OpenIDM Instance in Maintenance Mode" in the Installation Guide.

When you configure schedules in this way, changes made to the schedules are not pushed back into the configuration service. Managing schedules by using the /openidm/scheduler context path essentially bypasses the configuration service and sends the request directly to the scheduler.

If you need to perform an operation on a schedule that was created by using the configuration service (by placing a schedule file in the conf/ directory), you must direct your request to the /openidm/config endpoint, and not to the /openidm/scheduler endpoint.

Creating a Schedule

You can create a schedule with a PUT request, which allows you to specify the ID of the schedule, or with a POST request, in which case the server assigns an ID automatically.

The following example uses a PUT request to create a schedule that fires a script (script/testlog.js) every second. The schedule configuration is as described in "Scheduler Configuration":

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --header "Content-Type: application/json" \
 --request PUT \
 --data '{
    "enabled":true,
    "type":"cron",
    "schedule":"0/1 * * * * ?",
    "persisted":true,
    "misfirePolicy":"fireAndProceed",
    "invokeService":"script",
    "invokeContext": {
        "script": {
            "type":"text/javascript",
            "file":"script/testlog.js"
        }
    }
 }' \
 "https://localhost:8443/openidm/scheduler/testlog-schedule"
{
  "type": "cron",
  "invokeService": "script",
  "persisted": true,
  "_id": "testlog-schedule",
  "schedule": "0/1 * * * * ?",
  "misfirePolicy": "fireAndProceed",
  "enabled": true,
  "invokeContext": {
    "script": {
      "file": "script/testlog.js",
      "type": "text/javascript"
    }
  }
}

The following example uses a POST request to create an identical schedule to the one created in the previous example, but with a server-assigned ID:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --header "Content-Type: application/json" \
 --request POST \
 --data '{
    "enabled":true,
    "type":"cron",
    "schedule":"0/1 * * * * ?",
    "persisted":true,
    "misfirePolicy":"fireAndProceed",
    "invokeService":"script",
    "invokeContext": {
        "script": {
            "type":"text/javascript",
            "file":"script/testlog.js"
        }
    }
 }' \
 "https://localhost:8443/openidm/scheduler?_action=create"
{
  "type": "cron",
  "invokeService": "script",
  "persisted": true,
  "_id": "d6d1b256-7e46-486e-af88-169b4b1ad57a",
  "schedule": "0/1 * * * * ?",
  "misfirePolicy": "fireAndProceed",
  "enabled": true,
  "invokeContext": {
    "script": {
      "file": "script/testlog.js",
      "type": "text/javascript"
    }
  }
}

The output includes the _id of the schedule, in this case "_id": "d6d1b256-7e46-486e-af88-169b4b1ad57a".

Obtaining the Details of a Schedule

The following example displays the details of the schedule created in the previous section. Specify the schedule ID in the URL:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request GET \
 "https://localhost:8443/openidm/scheduler/d6d1b256-7e46-486e-af88-169b4b1ad57a"
{
  "_id": "d6d1b256-7e46-486e-af88-169b4b1ad57a",
  "schedule": "0/1 * * * * ?",
  "misfirePolicy": "fireAndProceed",
  "startTime": null,
  "invokeContext": {
    "script": {
      "file": "script/testlog.js",
      "type": "text/javascript"
    }
  },
  "enabled": true,
  "concurrentExecution": false,
  "persisted": true,
  "timeZone": null,
  "type": "cron",
  "invokeService": "org.forgerock.openidm.script",
  "endTime": null,
  "invokeLogLevel": "info"
}

Updating a Schedule

To update a schedule definition, use a PUT request and update all properties of the object. Note that PATCH requests are currently supported only for managed and system objects.

The following example disables the schedule created in the previous section:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --header "Content-Type: application/json" \
 --request PUT \
 --data '{
    "enabled":false,
    "type":"cron",
    "schedule":"0/1 * * * * ?",
    "persisted":true,
    "misfirePolicy":"fireAndProceed",
    "invokeService":"script",
    "invokeContext": {
        "script": {
            "type":"text/javascript",
            "file":"script/testlog.js"
        }
    }
 }' \
 "https://localhost:8443/openidm/scheduler/d6d1b256-7e46-486e-af88-169b4b1ad57a"
   null

Listing Configured Schedules

To display a list of all configured schedules, query the openidm/scheduler context path as shown in the following example:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request GET \
 "https://localhost:8443/openidm/scheduler?_queryId=query-all-ids"
{
  "remainingPagedResults": -1,
  "pagedResultsCookie": null,
  "totalPagedResultsPolicy": "NONE",
  "totalPagedResults": -1,
  "resultCount": 2,
  "result": [
    {
      "_id": "d6d1b256-7e46-486e-af88-169b4b1ad57a"
    },
    {
      "_id": "recon"
    }
  ]
}

Deleting a Schedule

To deleted a configured schedule, call a DELETE request on the schedule ID. For example:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request DELETE \
 "https://localhost:8443/openidm/scheduler/d6d1b256-7e46-486e-af88-169b4b1ad57a"
null

Obtaining a List of Running Scheduled Tasks

The following command returns a list of tasks that are currently executing. This list enables you to decide whether to wait for specific tasks to complete before you place an OpenIDM instance in maintenance mode.

Note that this list is accurate only at the moment the request was issued. The list can change at any time after the response is received.

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request POST \
 "http://localhost:8080/openidm/scheduler?_action=listCurrentlyExecutingJobs"
[
    {
        "concurrentExecution": false,
        "enabled": true,
        "endTime": null,
        "invokeContext": {
            "script": {
                "file": "script/testlog.js",
                "type": "text/javascript"
            }
        },
        "invokeLogLevel": "info",
        "invokeService": "org.forgerock.openidm.script",
        "misfirePolicy": "doNothing",
        "persisted": false,
        "schedule": "0/10 * * * * ?",
        "startTime": null,
        "timeZone": null,
        "type": "cron"
    }
]

Pausing Scheduled Tasks

In preparation for placing an OpenIDM instance into maintenance mode, you can temporarily suspend all scheduled tasks. This action does not cancel or interrupt tasks that are already in progress - it simply prevents any scheduled tasks from being invoked during the suspension period.

The following command suspends all scheduled tasks and returns true if the tasks could be suspended successfully.

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request POST \
 "https://localhost:8443/openidm/scheduler?_action=pauseJobs"
{
    "success": true
}

Resuming All Running Scheduled Tasks

When an update has been completed, and your instance is no longer in maintenance mode, you can resume scheduled tasks to start them up again. Any tasks that were missed during the downtime will follow their configured misfire policy to determine whether they should be reinvoked.

The following command resumes all scheduled tasks and returns true if the tasks could be resumed successfully.

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request POST \
 "https://localhost:8443/openidm/scheduler?_action=resumeJobs"
{
    "success": true
}

Scanning Data to Trigger Tasks

In addition to the fine-grained scheduling facility described previously, OpenIDM provides a task scanning mechanism. The task scanner enables you to perform a batch scan on a specified property in OpenIDM, at a scheduled interval, and then to launch a task when the value of that property matches a specified value.

When the task scanner identifies a condition that should trigger the task, it can invoke a script created specifically to handle the task.

For example, the task scanner can scan all managed/user objects for a "sunset date" and can invoke a script that launches a "sunset task" on the user object when this date is reached.

Configuring the Task Scanner

The task scanner is essentially a scheduled task that queries a set of managed users. The task scanner is configured in the same way as a regular scheduled task in a schedule configuration file named (schedule-task-name.json), with the invokeService parameter set to taskscanner. The invokeContext parameter defines the details of the scan, and the task that should be launched when the specified condition is triggered.

The following example defines a scheduled scanning task that triggers a sunset script. The schedule configuration file is provided in openidm/samples/taskscanner/conf/schedule-taskscan_sunset.json. To use this sample file, copy it to the openidm/conf directory.

{
    "enabled" : true,
    "type" : "cron",
    "schedule" : "0 0 * * * ?",
    "concurrentExecution" : false,
    "invokeService" : "taskscanner",
    "invokeContext" : {
        "waitForCompletion" : false,
        "maxRecords" : 2000,
        "numberOfThreads" : 5,
        "scan" : {
            "_queryId" : "scan-tasks",
            "object" : "managed/user",
            "property" : "sunset/date",
            "condition" : {
                "before" : "${Time.now}"
            },
            "taskState" : {
                "started" : "sunset/task-started",
                "completed" : "sunset/task-completed"
            },
            "recovery" : {
                "timeout" : "10m"
            }
        },
        "task" : {
            "script" : {
                "type" : "text/javascript",
                "file" : "script/sunset.js"
            }
        }
    }
}

The schedule configuration calls a script (script/sunset.js). To test the sample, copy this script file from openidm/samples/taskscanner/script/sunset.js to the openidm/script directory. The remaining properties in the schedule configuration are as follows:

The invokeContext parameter takes the following properties:

waitForCompletion (optional)

This property specifies whether the task should be performed synchronously. Tasks are performed asynchronously by default (with waitForCompletion set to false). A task ID (such as {"_id":"354ec41f-c781-4b61-85ac-93c28c180e46"}) is returned immediately. If this property is set to true, tasks are performed synchronously and the ID is not returned until all tasks have completed.

maxRecords (optional)

The maximum number of records that can be processed. This property is not set by default so the number of records is unlimited. If a maximum number of records is specified, that number will be spread evenly over the number of threads.

numberOfThreads (optional)

By default, the task scanner runs in a multi-threaded manner, that is, numerous threads are dedicated to the same scanning task run. Multi-threading generally improves the performance of the task scanner. The default number of threads for a single scanning task is ten. To change this default, set the numberOfThreads property.

scan

Defines the details of the scan. The following properties are defined:

_queryId

Specifies the predefined query that is performed to identify the entries for which this task should be run.

The query that is referenced here must be defined in the database table configuration file (conf/repo.orientdb.json or conf/repo.jdbc.json). A sample query for a scanned task (scan-tasks) is defined in the JDBC repository configuration file as follows:

"scan-tasks" : "SELECT fullobject FROM ${_dbSchema}.${_mainTable}
 obj INNER JOIN ${_dbSchema}.${_propTable}
 prop ON obj.id = prop.${_mainTable}_id
 LEFT OUTER JOIN ${_dbSchema}.${_propTable}
 complete ON obj.id = complete.${_mainTable}_id
 AND complete.propkey=${taskState.completed}
 INNER JOIN ${_dbSchema}.objecttypes objtype
 ON objtype.id = obj.objecttypes_id
 WHERE ( prop.propkey=${property} AND prop.propvalue < ${condition.before}
 AND objtype.objecttype = ${_resource} )
 AND ( complete.propvalue is NULL )",

Note that this query identifies records for which the value of the specified property is smaller than the condition. The sample query supports only time-based conditions, with the time specified in ISO 8601 format (Zulu time). You can write any query to target the records that you require.

object

Defines the managed object type against which the query should be performed, as defined in the managed.json file.

property

Defines the property of the managed object, against which the query is performed. In the previous example, the "property" : "sunset/date" indicates a JSON pointer that maps to the object attribute, and can be understood as sunset: {"date" : "date"}.

If you are using a JDBC repository, with a generic mapping, you must explicitly set this property as searchable so that it can be queried by the task scanner. For more information, see "Using Generic Mappings".

condition (optional)

Indicates the conditions that must be matched for the defined property.

In the previous example, the scanner scans for users whose sunset/date is prior to the current timestamp (at the time the script is run).

You can use these fields to define any condition. For example, if you wanted to limit the scanned objects to a specified location, say, London, you could formulate a query to compare against object locations and then set the condition to be:

"condition" : {
    "location" : "London"
},

For time-based conditions, the condition property supports macro syntax, based on the Time.now object (which fetches the current time). You can specify any date/time in relation to the current time, using the ` or `-` operator, and a duration modifier. For example: `${Time.now + 1d}` would return all user objects whose `sunset/date` is the following day (current time plus one day). You must include space characters around the operator (` or -). The duration modifier supports the following unit specifiers:

s - second
m - minute
h - hour
d - day
M - month
y - year

taskState

Indicates the names of the fields in which the start message and the completed message are stored. These fields are used to track the status of the task.

started specifies the field that stores the timestamp for when the task begins.
completed specifies the field that stores the timestamp for when the task completes its operation. The completed field is present as soon as the task has started, but its value is null until the task has completed.

recovery (optional)

Specifies a configurable timeout, after which the task scanner process ends. For clustered OpenIDM instances, there might be more than one task scanner running at a time. A task cannot be launched by two task scanners at the same time. When one task scanner "claims" a task, it indicates that the task has been started. That task is then unavailable to be claimed by another task scanner and remains unavailable until the end of the task is indicated. In the event that the first task scanner does not complete the task by the specified timeout, for whatever reason, a second task scanner can pick up the task.

task

Provides details of the task that is performed. Usually, the task is invoked by a script, whose details are defined in the script property:

type ‒ the type of script, either JavaScript or Groovy.
file ‒ the path to the script file. The script file takes at least two objects (in addition to the default objects that are provided to all OpenIDM scripts):
- input ‒ the individual object that is retrieved from the query (in the example, this is the individual user object).
- objectID ‒ a string that contains the full identifier of the object. The objectID is useful for performing updates with the script as it allows you to target the object directly. For example: openidm.update(objectID, input['_rev'], input);.
A sample script file is provided in openidm/samples/taskscanner/script/sunset.js. To use this sample file, copy it to your project’s script/ directory. The sample script marks all user objects that match the specified conditions as inactive. You can use this sample script to trigger a specific workflow, or any other task associated with the sunset process.

For more information about using scripts in OpenIDM, see "Scripting Reference".

Managing Scanning Tasks Over REST

You can trigger, cancel, and monitor scanning tasks over the REST interface, using the REST endpoint https://localhost:8443/openidm/taskscanner.

Triggering a Scanning Task

The following REST command runs a task named "taskscan_sunset". The task itself is defined in a file named conf/schedule-taskscan_sunset.json:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request POST \
 "https://localhost:8443/openidm/taskscanner?_action=execute&name=schedule/taskscan_sunset"

By default, a scanning task ID is returned immediately when the task is initiated. Clients can make subsequent calls to the task scanner service, using this task ID to query its state and to call operations on it.

For example, the scanning task initiated previously would return something similar to the following, as soon as it was initiated:

{"_id":"edfaf59c-aad1-442a-adf6-3620b24f8385"}

To have the scanning task complete before the ID is returned, set the waitForCompletion property to true in the task definition file (schedule-taskscan_sunset.json). You can also set the property directly over the REST interface when the task is initiated. For example:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request POST \
 "https://localhost:8443/openidm/taskscanner?_action=execute&name=schedule/taskscan_sunset&waitForCompletion=true"

Canceling a Scanning Task

You can cancel a scanning task by sending a REST call with the cancel action, specifying the task ID. For example, the following call cancels the scanning task initiated in the previous section:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request POST \
 "https://localhost:8443/openidm/taskscanner/edfaf59c-aad1-442a-adf6-3620b24f8385?_action=cancel"
{
    "_id":"edfaf59c-aad1-442a-adf6-3620b24f8385",
    "action":"cancel",
    "status":"SUCCESS"
}

Listing Scanning Tasks

You can display a list of scanning tasks that have completed, and those that are in progress, by running a RESTful GET on the openidm/taskscanner context path. The following example displays all scanning tasks:

$ curl \
 --cacert self-signed.crt \
 --header "X-OpenIDM-Username: openidm-admin" \
 --header "X-OpenIDM-Password: openidm-admin" \
 --request GET \
 "https://localhost:8443/openidm/taskscanner"
{
 "tasks": [
    {
      "ended": 1352455546182
      "started": 1352455546149,
      "progress": {
        "failures": 0
        "successes": 2400,
        "total": 2400,
        "processed": 2400,
        "state": "COMPLETED",
      },
      "_id": "edfaf59c-aad1-442a-adf6-3620b24f8385",
    }
  ]
}

Each scanning task has the following properties:

ended

The time at which the scanning task ended.

started

The time at which the scanning task started.

progress

The progress of the scanning task, summarised in the following fields:

failures - the number of records not able to be processed
successes - the number of records processed successfully
total - the total number of records
processed - the number of processed records
state - the overall state of the task, INITIALIZED, ACTIVE, COMPLETED, CANCELLED, or ERROR

_id

The ID of the scanning task.

The number of processed tasks whose details are retained is governed by the openidm.taskscanner.maxcompletedruns property in the conf/system.properties file. By default, the last one hundred completed tasks are retained.