The article provides a general overview of the Workflow Engine architecture and explains the common principles of how the designed Workflows are executed.
Once a Workflow is designed in the Workflow Studio, it is stored in the Workflow Repository in the Production database and is ready for execution. The Workflow Engine is a module of the Workspace Management application that manages the execution of Workflows. It handles tasks such as starting, resuming, and terminating Workflows, and monitoring and persisting the Workflow Instance state.
To analyze how a Workflow Instance is being executed (or has been executed), the System provides the Visual Tracking action in the Workflow Studio. Visual Tracking needs enough data to show how each Workflow Activity was executed, which input arguments it received, and what result it returned in its output arguments. To supply this data, the Workflow Engine captures the Workflow Instance events and persists them in the Monitoring database (named "M42Monitoring" by default). The System keeps all Workflow Instance events in the database until the Workflow Instance is completed, and for some time afterward.
For more details, see Configure Workflow Instance Monitoring.
A Workflow is a long-running process that in many cases requires human interaction, so the time between the start and finish of a Workflow Instance can be years. Over such a timeframe the Application is restarted many times. For that reason, the Workflow Engine saves the Workflow Instance state to storage every time it changes. The System stores the serialized Workflow Instance in the Instance Store database (named "M42InstanceStore" by default). Instances are kept in the database only until they are completed.
The Workspace Management application supports two alternative implementations of the Workflow Engine: the basic one, based on Microsoft AppFabric, and a new one, based on Workflow Workers.
The first version of the Workflow Engine is fully based on the Microsoft AppFabric module, which provides out-of-the-box implementations of all the basic Workflow Engine tasks, such as hosting Workflows, monitoring, and persistence. What is specific to the AppFabric Engine is the way prepared Workflows are deployed and hosted. The "Publish" or "Publish Repository" action, available in Matrix42 Software Asset and Service Management and in the Workflow Studio, deploys the prepared Workflows to the folder "/svc/WF/" on the Application Server, and the System dynamically creates Web Service endpoints for each deployed Workflow version. As a result, each version of a Workflow is exposed as a Workflow Service.
Workflow Workers [Technical Preview]
The previous Workflow Engine implementation based on AppFabric has several downsides, such as performance problems, high consumption of system resources, and difficulties with horizontal scaling of Workflow execution. To address them, a new concept of Workflow execution called Workflow Workers has been introduced.
The Workflow Workers engine is designed to fully replace the AppFabric engine in upcoming releases. The Worker roll-out strategy replaces AppFabric components gradually from version to version, and the option to fall back to AppFabric when something goes wrong is always available.
The Workflow Worker is a Windows Service that can be installed on the Application Server or on any other computer in the Organization. The Workflow Worker Engine is based on a Message Queue architecture: all commands that act on Workflows, such as starting a new Workflow or resuming an existing one, are first added to the Message Queue. Workflow Workers continuously watch the Queue and pull messages from it one by one. This approach solves the following challenges:
- If the active Workers cannot keep up with processing the Workflow commands and the Queue starts growing, an additional Worker instance simply needs to be added to the Cluster. Solutions such as Kubernetes allow automatic scaling, where additional Worker instances are created or disposed of automatically depending on the workload.
- The Queue is highly available, which guarantees that a Workflow command is stored in the Queue. Messages stay in the Queue until they are successfully processed.
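The queue behavior described above can be illustrated with a minimal in-memory sketch. It is not the Product's implementation: the class CommandQueue, the function run_worker, the retry limit, and the command format are hypothetical names and values chosen for illustration. The sketch shows only one property stated above: a command leaves the queue only after a Worker has processed it successfully.

```python
from collections import deque


class CommandQueue:
    """Hypothetical in-memory stand-in for the Workflow command queue."""

    def __init__(self):
        self._messages = deque()

    def enqueue(self, command):
        self._messages.append(command)

    def peek(self):
        # Look at the oldest message without removing it.
        return self._messages[0] if self._messages else None

    def acknowledge(self):
        # Remove the oldest message once it has been processed successfully.
        self._messages.popleft()

    def __len__(self):
        return len(self._messages)


def run_worker(queue, handler, max_attempts=3):
    """Process messages one by one; a message is removed only on success."""
    processed = []
    attempts = 0
    while len(queue) > 0 and attempts < max_attempts:
        command = queue.peek()
        try:
            handler(command)
        except Exception:
            # The command stays in the queue and is retried.
            attempts += 1
            continue
        queue.acknowledge()
        processed.append(command)
        attempts = 0
    return processed
```

If the handler keeps failing, run_worker stops after max_attempts consecutive failures and the command remains in the queue for a later attempt or for another Worker.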
In contrast to AppFabric execution, the Workflow Worker is an external Windows process, so the Business Components and other resources hosted on the Application Server are not accessible on the Worker. That limitation puts additional requirements on Workflow Activities running on the Worker. For example, if an Activity needs to execute Business Logic, the logic has to be triggered over a Web Service. The standard Workflow Activities delivered with the Product that take the Worker specifics into account and can be executed on the Worker are marked as Worker Compatible. If a Workflow uses only Worker-compatible Workflow Activities, it can be executed on the Workflow Worker.
Workflow Workers use Token Authentication with an API Token to set up connectivity with the Application Server. The API Token is provided during Worker installation or generated automatically for the Default Worker (the Worker installed on the Application Server). The provided Token is encrypted with the Machine Key and stored in the Matrix42.Worker.host.exe.config file. When the Workflow Worker process starts, the API Token is sent to the Application Server for validation. If the Token is valid and not expired, the Server issues a short-lived Access Token, which is used for all other operations with the Server.
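The two-step exchange described above (a long-lived API Token is checked, then a short-lived Access Token is issued) can be sketched as follows. This is only an illustration under stated assumptions: the classes ApiToken and Server, the method issue_access_token, the returned dictionary, and the 10-minute lifetime are all hypothetical and do not describe the actual Server API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ApiToken:
    value: str
    expires_at: datetime


class Server:
    """Hypothetical stand-in for the token check on the Application Server."""

    def __init__(self, known_tokens, access_token_lifetime=timedelta(minutes=10)):
        # The 10-minute default lifetime is an assumption for illustration.
        self._known = {token.value: token for token in known_tokens}
        self._lifetime = access_token_lifetime

    def issue_access_token(self, api_token_value, now):
        token = self._known.get(api_token_value)
        if token is None or token.expires_at <= now:
            # Removed or expired API Token: the Worker cannot connect.
            raise PermissionError("API Token is invalid or expired")
        # Short-lived Access Token used for all further operations.
        return {
            "access_token": "access-for-" + api_token_value,
            "expires_at": now + self._lifetime,
        }
```

A Worker would call issue_access_token once on process start and use the returned Access Token for subsequent requests.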
Default Workflow Worker
The Product Setup automatically installs the Workflow Worker on the Application Server. This guarantees that the System is operable and able to handle all commands assigned to Workers right after the Setup finishes, with no extra steps required. If the Workflow Worker is already installed, the Setup validates the API Token assigned to the Worker. If the Token is no longer valid, or expires within the next month, the Setup issues a new API Token so that the Worker does not lose connectivity with the Server because of an expired token.
The System allows installing an unlimited number of Workers. Adding a Worker to the Farm scales the System and makes it more reliable and resilient to infrastructure challenges such as unexpected growth of System usage. To install an additional Worker:
- In the Administration application, open the management area "Services & Processes > Workflows > Workers".
- Run "Download Worker" action.
The System automatically generates a new archive that already includes the basic configuration settings, such as the Application Server URL and the name of the Service Account.
- Unpack the received archive file on the computer where you want to install the Workflow Worker.
- Run "InstallWorker.cmd".
The batch file runs a signed PowerShell script. Depending on the Domain Policies, execution of the PowerShell script can be rejected because of an unknown publisher. In this case, make sure the certificate used to sign Configure.ps1 is installed in the Trusted Publishers certificate store on the Local Machine.
- Proceed with the installation steps:
1) Verify that the Server URL is correct; otherwise, provide the URL of the Application Server.
2) The Worker requires a valid API Token to authenticate with the Server. Provide a valid token for a user with administrative rights. If you do not have a prepared token, you can generate one. See "Generate API Token" for more details.
The provided token is encrypted with the machine key and stored in the config file.
3) The System checks the validity of the provided credentials and sets up the connection with the Server.
If you have a connectivity problem and receive a message that the server cannot be contacted and a trusted SSL/TLS secure channel cannot be established, check that the Application Server certificate is installed in the Trusted Root Certification Authorities store on the local machine.
4) Set the Service Account.
The Workflow Worker uses Integrated Security to work with the Product databases. Therefore, the Service Account must have "db_owner" permissions on the databases. By default, the installation suggests the Service Account used on the Application Server, but it can be changed if needed.
- The installation registers and starts Windows service "Matrix42 Workflow Worker".
In general, the Workflow Worker does not require any manual operation for updates, because it is updated automatically whenever the relevant resources on the Application Server change. That is, as soon as the Application Server is updated with a new version or a patch, all related Workflow Workers are updated automatically.
Nevertheless, there are a few cases that require manual update operations:
Updating API Token.
When the Workflow Worker is installed, an API Token is provided, which is used for communication with the Application Server. The token is encrypted with the machine key and stored in the config file. When the API Token becomes invalid (because it was removed from the Server or has expired), the Workflow Worker can no longer communicate with the Server. To resolve the issue, a new API Token needs to be provided to the Workflow Worker:
- Generate a new Token. See "Generate API Token" for more details.
- In the Workflow Worker application folder, start "UpdateToken.cmd" and follow the instructions.
- The Windows service "Matrix42 Workflow Worker" is restarted to apply the configuration changes.
Manage Workflow Workers
Once the Workflow Worker is installed, it automatically registers itself in the Application. All registered Workflow Workers and their current statuses can be found in the Administration application in the management area "Services & Processes > Workflows > Workers".
Migration from AppFabric to Workflow Worker
The Product Update migrates all Workflows delivered out of the box. For custom Workflows, the action "Set Execution Engine" needs to be executed. Once the Workflow Worker engine has proven its reliability in all possible scenarios and AppFabric is finally removed from the Product (approximately in version 11), all Workflows present in the Product will automatically be marked as executed on the Worker.
Workflow Activities Migration
All standard Workflow Activities delivered with the Product are already migrated and can be executed on the Workflow Worker.
Because the Workflows are executed in another process (or even on another computer), the default behavior of some standard Workflow Activities has changed slightly. In some cases this needs to be taken into consideration when moving the execution of a Workflow to a Worker.
Invoke Powershell Activity
By default, the Activity executes the specified PowerShell script on the Worker. When the Workflow Worker is installed on a remote computer (or in the Cloud), the Activity execution can fail because the environment does not have the required PowerShell libraries installed, or because the script tries to reach resources that are not accessible from the Worker computer. Each such case needs to be analyzed and fixed so that the script can keep running on the Worker. If the issue cannot be solved and the script has to be executed on the Application Server, rework the Workflow and set the property "Execute on Server" for the relevant Invoke Powershell activity.
Execute SQL Activities
The System delivers two Workflow Activities, "ExecuteSqlNonQuery" and "ExecuteSqlQuery", which run native SQL queries on the databases specified by the ConnectionString. When the Production database is used, the connection string is referenced by the name "m42store". For security reasons, the Connection String is not distributed to a remote Workflow Worker if SQL Authentication is used (username and password defined in the connection string). Connection Strings with Integrated Security can be used on Workflow Workers installed on the same intranet as the Application Server.
Custom Workflow Activities
If you have created your own Workflow Activities, all of them need to be reviewed for compatibility with the Workflow Worker. For more details, see the Workflow Activities Migration Guide.
Setup System Workflow Engine
The System settings define which Workflow Engine is used for the execution of a Workflow Instance. The setting can be found in the Global System Setting dialog in the Administration area, on the Workflows tab:
Use legacy Workflow Engine (AppFabric) - (1)
As in previous product versions, the System keeps using AppFabric for processing all kinds of Workflow commands.
[TECHNICAL PREVIEW] Use Workflow Worker together with Legacy Workflow Engine (AppFabric) - (2)
The System uses Workflow Workers for starting and processing all Workflows marked as "Worker Compatible" (see Set Execution Engine for the Workflow). Workflows that are either not compatible with the Worker or have already been started on the legacy Workflow Engine keep using AppFabric for execution.
If this option is not selected, Workflows keep using AppFabric for execution, even if they are marked as "Execute on Workflow Worker".
Use Workflow Worker (3)
The option is disabled in the latest Product version. It will be enabled when the Workflow Worker engine reaches a productive state and most of the Workflow Activities are compatible with the Workflow Worker.
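The way the first two options route a Workflow to an engine can be summarized in a small decision function. This is a sketch only: the function select_engine, its parameters, and the numeric constants (taken from the option marks 1 and 2 above) are hypothetical and not part of the Product. Option 3 is left out because it is described as disabled.

```python
LEGACY_ONLY = 1         # "Use legacy Workflow Engine (AppFabric)"
WORKER_WITH_LEGACY = 2  # "Use Workflow Worker together with Legacy Workflow Engine"


def select_engine(system_setting, marked_for_worker, started_on_legacy=False):
    """Return the engine name that would execute the Workflow Instance."""
    if system_setting == LEGACY_ONLY:
        # The Workflow-level marking is ignored; everything runs on AppFabric.
        return "AppFabric"
    if system_setting == WORKER_WITH_LEGACY:
        # Only Workflows marked for the Worker and not already started
        # on the legacy engine are routed to the Worker.
        if marked_for_worker and not started_on_legacy:
            return "Worker"
        return "AppFabric"
    raise ValueError("unsupported system setting: %r" % (system_setting,))
```

For example, select_engine(WORKER_WITH_LEGACY, True, started_on_legacy=True) returns "AppFabric", matching the rule that instances already started on the legacy engine keep using it.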
Set Execution Engine for the Workflow
The Workflow Worker is an application that runs in an independent process, either on the Application Server or on any other computer. Compared to the classic AppFabric implementation, this brings extra migration tasks, because previously all Workflow Activities were executed in-process in the Application and had direct access to all Business Components.
Workflow Activity Compatibility
To be executed on the Workflow Worker, an Activity has to be compatible with it. This means that either all the modules the Activity requires for execution are deployed in the Workflow Worker process, or the Activity is able to delegate the execution back to the Application Server via a Web Service method call. The System uses the Workflow Activity property PLSLBinaryComponentClassBase.WorkerCompatible to signal that the Workflow Activity can be executed on the Workflow Worker.
In Workflow Studio, the Activities compatible with the Workflow Worker are highlighted, so that the Author can see which Activities block the Workflow from being executed on the Workflow Worker.
Action "Set Execution Engine"
By default, all Workflows present in the Workflow Repository run on the AppFabric engine. An Administrator can, however, define the execution engine for specific Workflows. In the workflow management area, select one or several workflow definitions and start the "Set Execution Engine" action.
If the "Workflow Worker" engine is set, the System checks the selected Workflows and verifies that all Workflow Activities they use are compatible with the Workflow Worker. If at least one Activity is incompatible, the operation is rejected for that Workflow. In this case, the Activities in question need to be reworked and the compatibility flag set for them.
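The compatibility check performed by "Set Execution Engine" can be sketched as follows. The function set_execution_engine and the data layout (a dictionary of Workflow names mapped to lists of activity name and compatibility flag pairs) are hypothetical and chosen only for illustration; the flag stands in for the WorkerCompatible property mentioned above.

```python
def set_execution_engine(workflows, engine):
    """Split Workflows into accepted ones and rejected ones.

    workflows: dict mapping a Workflow name to a list of
               (activity_name, worker_compatible) tuples.
    Returns (accepted_names, rejected), where rejected maps each rejected
    Workflow name to the list of its incompatible activities.
    """
    accepted = []
    rejected = {}
    for name, activities in workflows.items():
        incompatible = [activity for activity, ok in activities if not ok]
        if engine == "Worker" and incompatible:
            # At least one incompatible Activity: reject this Workflow.
            rejected[name] = incompatible
        else:
            accepted.append(name)
    return accepted, rejected
```

Returning the incompatible activities per Workflow mirrors the follow-up step in the text: those are the Activities that need to be reworked before the engine can be set.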
Configure Workflow Instance Monitoring
The new Workflow Engine based on Workflow Workers uses its own implementation of Workflow Instance Monitoring. It not only replicates the functionality of the AppFabric Monitoring module for the Workflow Instances running on the Workflow Worker, but also adds features that allow tailoring the monitoring to each concrete environment.
Workflow Instance Monitoring Level
The System provides varying levels of monitoring for Workflow Instances running on the Workflow Worker Engine. All Workflow Instances, regardless of durability, can be configured to use the Workflow event collection capabilities, so that data can be collected at varying verbosity for monitoring and troubleshooting purposes. The engine distinguishes two levels for collecting events (see mark 4 on the Workflow Engine Settings screen above):
- Error Only
The Workflow Engine records a minimal set of Workflow Instance events to the Monitoring database. This set includes the events for starting and finishing the Workflow and for suspend and resume points, and, in case of an error, all the events accumulated from the last resume point to the failed Workflow Activity.
- Troubleshooting
As in AppFabric, the Workflow Engine records ALL the events thrown by the Instance.
An environment that runs a massive number of Workflows generates huge amounts of events, and processing and recording them can cause serious performance problems and resource shortages. Usually, most of these events are just recorded and then automatically cleaned up a few days later, without ever being reviewed in Visual Tracking. To optimize the usage of System resources, the "Error Only" level was introduced, which provides enough information to figure out the issue in most cases. For special cases, when the "Error Only" level does not provide enough data, the Troubleshooting level can be used.
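The difference between the two levels can be illustrated with a small filter over an event stream. This is a sketch under assumptions: the event kinds ("started", "finished", "suspended", "resumed", "activity", "error"), the tuple format, and the function record_events are hypothetical and do not reflect the engine's real event model. In "Error Only" mode the sketch buffers ordinary events since the last start or resume point and writes them out only when an error occurs.

```python
MILESTONES = {"started", "finished", "suspended", "resumed"}


def record_events(events, level):
    """Return the events that would be written to the Monitoring database.

    events: iterable of (kind, detail) tuples in the order they occurred.
    level:  "Error Only" or "Troubleshooting".
    """
    if level == "Troubleshooting":
        # Everything thrown by the instance is recorded.
        return list(events)
    recorded = []
    buffered = []  # ordinary events since the last start/resume point
    for event in events:
        kind = event[0]
        if kind in MILESTONES:
            recorded.append(event)
            if kind in ("started", "resumed"):
                buffered = []
        elif kind == "error":
            # Flush the accumulated events, then the error itself.
            recorded.extend(buffered)
            recorded.append(event)
            buffered = []
        else:
            buffered.append(event)
    return recorded
```

With a successful run, "Error Only" keeps only the milestone events, which is where the saving in recorded volume comes from.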
If the "Error Only" option is set for the overall System, the level can still be elevated for each concrete Workflow. This can be used for troubleshooting concrete Workflow(s) without putting extra pressure on the overall System. In that case, the Monitoring Level can be set on the general page of the Workflow dialog (see Manage Workflows for more details).
Workflows running on the AppFabric Workflow Engine use the AppFabric Monitoring module. See "Monitoring Applications using Windows Server AppFabric" for more details.
Cleaning obsolete Workflow Instances
The System automatically runs background engines that periodically clean up Workflow Instances from the Production database, together with all related data from the Persistence and Monitoring databases. To change how long such Workflow Instances stay in the System:
- In Administration, open the Engine Activations management area.
- Find and edit the engine activation "Clean Up Obsolete Objects".
- Open the dialog view "Active Engines" and edit the related engine "Clean Up Obsolete Objects".
- Set the number of days for:
  - Successfully completed workflow instances
  - Failed workflow instances
Workflow Infinite Loops Protection
A badly designed Workflow can lead to infinite loops during Workflow Instance execution and to overall blocking of the Workflow Engine, because some instances are always running and there is no capacity left to execute new Workflow commands. To prevent this negative impact, the System uses a protection mechanism that automatically terminates a Workflow Instance when an infinite loop is detected, that is, when the number of iterations exceeds the number configured in the attribute ActivityLoopLimit of the Production database table SPSGlobalConfigurationClassWorkflowEngine. By default, the System allows 10000 iterations in a Workflow Instance before it is classified as an infinite loop.
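The idea of an iteration limit can be sketched as a guarded loop. How the engine actually counts iterations is not described here, so the function run_with_loop_protection, the exception InfiniteLoopDetected, and the step callback are hypothetical; only the default of 10000 and the name ActivityLoopLimit come from the text above.

```python
class InfiniteLoopDetected(Exception):
    """Raised when the iteration count exceeds the configured limit."""


def run_with_loop_protection(step, activity_loop_limit=10000):
    """Call step() repeatedly until it returns False.

    The run is terminated as soon as the number of iterations exceeds
    activity_loop_limit (modelled on the ActivityLoopLimit attribute).
    Returns the number of completed iterations.
    """
    iterations = 0
    while step():
        iterations += 1
        if iterations > activity_loop_limit:
            raise InfiniteLoopDetected(
                "terminated after %d iterations" % iterations
            )
    return iterations
```

A loop that finishes within the limit returns normally; a loop whose step always returns True is terminated with InfiniteLoopDetected instead of occupying the engine forever.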
Regulate the maximum size of the Monitoring Database
When the System runs monitoring in Troubleshooting mode and some Workflow Instances enter infinite loops, the Monitoring database can grow very quickly and exhaust the free disk space on the Database Server. To prevent this scenario, the Workflow Engine supports an automatic purge mechanism for the Monitoring database, which removes the oldest Monitoring records when the database exceeds the maximum allowed size. The default allowed size is 1 GB, but you can change it in the attribute MonitoringMaxTableSize of the Production database table SPSGlobalConfigurationClassWorkflowEngine, which defines the database size in megabytes.
The System uses the engine activation "Workflow Monitoring Autopurge" to start the purge function automatically. Out of the box, it is configured to run once a day, at night.
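The "remove the oldest records until the size limit is met" rule can be sketched in a few lines. The function purge_oldest and the record format (a timestamp and a size in megabytes) are hypothetical; the real purge works on database tables and its details are not described here.

```python
def purge_oldest(records, max_size_mb):
    """Drop the oldest records until the total size fits the limit.

    records:     list of (timestamp, size_mb) tuples in any order.
    max_size_mb: maximum allowed total size in megabytes
                 (modelled on the MonitoringMaxTableSize attribute).
    Returns the records that are kept, oldest first.
    """
    kept = sorted(records, key=lambda record: record[0])
    total = sum(size for _, size in kept)
    while kept and total > max_size_mb:
        _, size = kept.pop(0)  # remove the oldest record
        total -= size
    return kept
```

For instance, with records of 500, 300, and 400 MB (oldest to newest) and a limit of 1024 MB, only the oldest 500 MB record is removed.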