With the release of 2018.3, server users can now take advantage of job prioritization and worker node assignment. This article introduces these features and outlines how to use them.
Getting Started - User Permissions
The implementation of both of these features starts at the User permissions. Server Admins can now enable and disable scheduling, job prioritization, and worker node assignment all on a per-user basis.
Note: The server admin needs to set the Global Schedule setting on the Jobs page to ‘Yes’ first.
Server Job Prioritization
This feature lets users assign a priority to a manually run or scheduled job to control the order in which jobs are picked from the queue. Jobs run in priority order, so the highest-priority jobs always execute first. If no priorities are set, or if jobs share the same priority level, they run in the order they entered the queue. Users can select from Low, Medium, High, or Critical priority levels to ensure certain jobs always take priority over others.
Prior to the 2018.3 release, jobs were executed on a First In First Out (FIFO) basis. Job prioritization gives admins and users more control over queue execution on Server, improves efficiency, and helps avoid bottlenecks.
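The ordering rule described above (highest priority first, queue order as the tie-breaker) can be illustrated with a small sketch. This is not how Server implements its queue; the JobQueue class, the job names, and the numeric values behind each priority level are assumptions made for illustration only.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Numeric values are an assumption; they only need to sort Low < Critical.
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


class JobQueue:
    """Toy queue: highest priority first, first in first out within a priority."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # increasing counter keeps queue order

    def submit(self, name, priority=Priority.LOW):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-int(priority), next(self._arrival), name))

    def next_job(self):
        _, _, name = heapq.heappop(self._heap)
        return name


queue = JobQueue()
queue.submit("nightly_report")                    # no priority set
queue.submit("finance_close", Priority.CRITICAL)
queue.submit("ad_hoc_extract", Priority.HIGH)
queue.submit("month_end_load", Priority.CRITICAL)

order = [queue.next_job() for _ in range(4)]
print(order)
# ['finance_close', 'month_end_load', 'ad_hoc_extract', 'nightly_report']
```

The two Critical jobs run in the order they were submitted, and the job with no priority runs last even though it entered the queue first.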
This feature works in line with the existing Quality of Service (QoS) feature, which is used to manage resource allocation in a multi-node deployment. For normal operation, leave the QoS setting at 0. The QoS values are:
0 = Normal/Low
1 = Medium
2 = High
3 = Critical
4 = Chained Apps
6 = Validation
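One way to read these values is as a threshold on each worker. The sketch below assumes, and this is an assumption rather than something the article states, that a worker only picks up jobs whose level is greater than or equal to the worker's own QoS value. The function names are hypothetical.

```python
# Values copied from the list above; 5 does not appear there, so it is left out.
QOS_LEVELS = {
    0: "Normal/Low",
    1: "Medium",
    2: "High",
    3: "Critical",
    4: "Chained Apps",
    6: "Validation",
}


def worker_accepts(worker_qos, job_level):
    """Assumed rule: a worker takes a job only if job_level >= worker_qos."""
    if job_level not in QOS_LEVELS:
        raise ValueError(f"unknown job level: {job_level}")
    return job_level >= worker_qos


def levels_served(worker_qos):
    """Names of the job levels a worker with this QoS value would accept."""
    return [name for level, name in sorted(QOS_LEVELS.items()) if level >= worker_qos]


print(levels_served(0))
# ['Normal/Low', 'Medium', 'High', 'Critical', 'Chained Apps', 'Validation']
print(levels_served(3))
# ['Critical', 'Chained Apps', 'Validation']
```

Under that assumption, a worker left at 0 accepts every job, which is consistent with the advice to leave the setting at 0 for normal operation.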
Server Worker Assignment
In a multi-node server environment, the Server Admin can add job tags to workers to control which jobs each worker runs. Users can also opt in to allow the designated worker to run any unassigned jobs. Both options are set in the Worker section of System Settings.
The corresponding setting in the Gallery:
If a user does not select a job tag for a workflow or job, the workflow or job will run on the default worker designated in the studio, or any worker that has been designated to run unassigned jobs.
1) Always make sure at least one Worker Node has ‘Run Unassigned Jobs’ checked; otherwise untagged jobs will sit in the queue indefinitely.
2) If a worker goes down, the jobs assigned to it will stay in the queue until it is back online.
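The two notes above follow from how jobs are matched to workers. Here is a minimal sketch of that matching, assuming a tagged job runs only on workers that carry the same tag and an untagged job runs only on workers with ‘Run Unassigned Jobs’ checked. The Worker class, its field names, and the worker names are hypothetical and are not Server's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    job_tags: set = field(default_factory=set)
    run_unassigned: bool = False  # stands in for the 'Run Unassigned Jobs' checkbox
    online: bool = True


def eligible_workers(job_tag, workers):
    """Names of online workers that may run the job; job_tag=None means untagged."""
    if job_tag is None:
        return [w.name for w in workers if w.online and w.run_unassigned]
    return [w.name for w in workers if w.online and job_tag in w.job_tags]


workers = [
    Worker("worker-a", {"finance"}),
    Worker("worker-b", {"eu"}, run_unassigned=True),
]

print(eligible_workers("finance", workers))  # ['worker-a']
print(eligible_workers(None, workers))       # ['worker-b']

workers[0].online = False
print(eligible_workers("finance", workers))  # [] -> the job waits in the queue
```

An empty result is the situation both notes warn about: no worker is able to take the job, so it stays queued.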
A default worker can be set and assigned at the Studio level by the Gallery Admin. This will ensure that all jobs in the studio will run on a specific worker. The user can override the default worker assigned by the Gallery Admin when scheduling or selecting a workflow to run.
This setting can be found under Admin > Subscription > Username > Assigned Worker
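The precedence between the studio default and the user's own choice can be summarized in a few lines. This is a sketch of the rule as described above, with hypothetical function and parameter names; returning None here stands for a job that is left unassigned.

```python
def effective_job_tag(user_selected_tag=None, studio_default_tag=None):
    """Pick the tag a job runs with: the user's choice wins over the studio default."""
    if user_selected_tag is not None:
        return user_selected_tag
    # Falls back to the studio default; None means the job is unassigned.
    return studio_default_tag


print(effective_job_tag("eu", "finance"))   # eu
print(effective_job_tag(None, "finance"))   # finance
print(effective_job_tag())                  # None
```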
Common Scenario Best Practices:
1) It's best practice to reserve specific workers for higher-priority or critical requests. Depending on the workload, this could start as small as a 2-core node.
2) Always make sure at least one Worker Node has ‘Run Unassigned Jobs’ checked; otherwise untagged jobs will sit in the queue indefinitely.
3) Workers can be set in different geographical regions, NA and EU for example, if there are certain rules and regulations (GDPR etc.) that you need to abide by.
a. Customers can set up separate workers with different ‘Run-As’ permissions for security or data-silo purposes.
4) I’ve seen instances where different departments have their own workers, for example one for Finance and one for Professional Services. This helps remove performance bottlenecks, because one department's workload no longer depends on another's.
5) In cloud environments, workers can be configured to autoscale* resources as needed.
a. AWS Auto Scaling, Azure Autoscale, and Google Cloud Platform Autoscale
*Auto Scaling is the process of dynamically allocating resources to match performance requirements. As demand and usage volume grow, an application may need additional resources to maintain the performance levels required by service-level agreements (SLAs). As usage drops, the additional resources are no longer needed and can be released. This helps minimize costs while maintaining performance at scale.
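To make the idea concrete, here is a simplified scaling rule that sizes the worker pool from the queue depth. It is a sketch under stated assumptions and not the API or algorithm of AWS, Azure, or Google Cloud; the function name, parameters, and limits are hypothetical.

```python
import math


def desired_workers(queued_jobs, jobs_per_worker, min_workers=1, max_workers=10):
    """Workers needed for the current queue, clamped to the allowed range."""
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    needed = math.ceil(queued_jobs / jobs_per_worker)
    return max(min_workers, min(max_workers, needed))


print(desired_workers(0, 4))    # 1  (never below the minimum)
print(desired_workers(9, 4))    # 3  (scale out as the queue grows)
print(desired_workers(100, 4))  # 10 (capped at the maximum)
```

The minimum keeps a worker available when the queue is empty, and the maximum caps cost when demand spikes.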
The Server Dashboard (Introduced with 2018.4)
You may be asking yourself how to see which workers are available and how each one is performing. In 2018.4, the Diagnostics page provides visibility into user and asset information on the Gallery and worker machines in an Alteryx Server environment. The server admin can use the Diagnostics page to monitor the number of users, assets, collections, studios, credentials, schedules, and jobs on the Server. You can also view the number of connected worker machines and identify, at a high level, resource constraints that are impacting performance.