Introduction
Alteryx Server provides a fully scalable architecture that allows an organization to scale Alteryx to automate data analytics, tackle bigger projects, process larger datasets and put self-service data analytics into the hands of more decision makers. From scaling Worker nodes to Gallery nodes to the MongoDB persistence layer, Alteryx Server allows organizations to efficiently manage their automated and self-service data analytics needs.
As the number of jobs needing to be executed increases, an organization can scale their Workers. As the number of self-service Gallery users increases, organizations can scale the Gallery. To ensure the availability of your Alteryx Server environment and to protect your workflows and data from disaster, organizations can deploy a user-managed instance of MongoDB and enable advanced features such as Replica Sets and Sharding.
When talking to organizations about scaling, the most common questions we hear are “how can we scale the Controller within our Alteryx Server environment?” and “how can we set up a redundant Controller to prevent an outage within our Alteryx Server environment?” That is the focus of this article.
Before we get started, it is important to note that only a single Controller can be active at a given time, meaning it requires an active-passive setup. Thankfully, using Failover Clustering, which is built into Microsoft Windows Server, organizations can automate the failover of the Controller. Failover Clustering is responsible for monitoring the AlteryxService and the availability of the server on the network. If the AlteryxService fails or the server goes down, Failover Clustering automates the failover of the Controller functionality to a secondary server.
The following instructions detail how to set up an Alteryx Server environment with redundant Controllers and how to automate the failover of the Controller in the event of an outage. These instructions are intended for scaled-out Alteryx Server environments. Enabling automated failover of the Controller requires configuration changes to all Gallery and Worker nodes; those changes are detailed in the steps below.
Pre-Requisites
- Administrative permissions on each of the servers that will become cluster nodes.
- Permissions to Create Computer objects and Read All Properties in the container used for computer accounts in the Active Directory domain.
- User-managed MongoDB
- For instructions on setting up user-managed MongoDB on Windows, refer to: https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Creating-a-User-Managed-MongoDB-Instance/ta-...
- A minimum of three servers (1 primary and 2 failover), virtual or physical, to act as Alteryx Server Controllers.
- It is recommended to have an odd number of systems to maintain Cluster Quorum. If you do not have an odd number of servers, to maintain a Quorum, you should use a disk, file share, or cloud witness. (https://technet.microsoft.com/en-us/library/cc731739(v=ws.11).aspx)
- For High-Availability, it is recommended that the servers be geographically dispersed. The failover controllers should never reside on the same Virtual Host, or within the same Data Center.
- At minimum, each node within the failover cluster will require an “Alteryx Server for High Availability” license (SKU: AX‐147956)
- For additional information on Microsoft Failover Clustering, refer to: https://docs.microsoft.com/en-us/windows-server/failover-clustering/failover-clustering-overview
Install the Failover Clustering Feature
The first step in configuring a high-availability Controller is to install Microsoft Failover Clustering. The following steps will need to be completed on each of the Controller failover nodes in the cluster. These instructions apply to Windows Server 2016.
- Open Server Manager
- From the Manage menu, select Add Roles and Features

- On the “Select Installation Type” screen, select “Role-based or feature-based installation” and click Next

- On the “Select destination server” screen, select the server you are currently logged in to from the Server Pool section and click Next

- On the “Server Roles” screen, click Next to proceed to the “Features” screen
- On the “Features” screen, select “Failover Clustering”
- Upon selecting Failover Clustering, a window appears prompting you to add the features required for Failover Clustering; click Add Features

- Click Next to proceed with adding the Failover Clustering Feature

- On the “Confirm installation selections” screen, ensure the Failover Clustering Feature is listed and click Install

- Once the Failover Clustering Feature has been installed, close the “Add Roles and Features Wizard”. Repeat these steps to add the feature on each of the cluster nodes
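If you prefer to script the installation, the same feature can be added from an elevated PowerShell prompt. This is a sketch of the equivalent command, run on each cluster node:

```powershell
# Install the Failover Clustering feature along with its management tools
# (equivalent to the Add Roles and Features Wizard steps above)
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
```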
Primary Controller Configuration
The next step is to configure your primary Controller, if it has not already been configured.
- Run the Alteryx System Settings console as Admin
- For the Environment >> Setup Type, select Custom >> Enable Controller
- From the Controller >> General section, copy the Controller Token for later use
- If this is a new server configuration, you will need to complete the setup to generate a Controller Token

- In the Controller >> Persistence section, set the Database Type to User-managed MongoDB
- Enter the Host, Database Name, Username and Password in the Controller >> Persistence >> Database section

- Complete the remaining Controller configuration steps and click Finish to apply the settings and start the AlteryxService with the latest configuration
Failover Controller(s) Configuration
The next step is to configure your failover controller(s). To ensure a seamless failover, each controller in the cluster will need to use the same controller token and Storage Keys for encryption.
- For each failover Controller, complete the same Controller setup steps used for the primary Controller
- After configuring each failover node, set the Controller Token on each of the Controller failover nodes
- Open an Administrative Command Prompt
- In the command prompt, navigate to the Alteryx Server installation directory (default: C:\Program Files\Alteryx\bin)
- Run the following command, replacing {Controller Token} with the value copied from step 3 of the Primary Controller Configuration: AlteryxService.exe setserversecret={Controller Token}

- Confirm that the Controller Token was set by running the following command: AlteryxService.exe getserversecret
- If running, stop the AlteryxService by running the following command: net stop AlteryxService
- Finally, you will need to copy the StorageKeysEncrypted value from the primary Controller to each of the nodes in the failover cluster
- Open Notepad on the primary Controller
- From the File menu, select Open
- Open the RuntimeSettings.xml file located at C:\ProgramData\Alteryx
- From the Controller section, copy the StorageKeysEncrypted key
- Ex. BwIAAACkAACunN7PkZcdMRM2N5pW+NRyqCdBiLuVqWRJELqix6Dg3ZAitUq9BbdlSLS8Ez+me45oiNGd8m81spqMvkNz3f/cyZX8oJVo2itY4JN/RXp4iJJ+obK96UtL8h2k2nq5XZ9GEDANIurhnm5Ww/nKxUw7O0LXtqftXpXLkbD5n/+YAs58iZlKz22dEklMzXQmc5+LBX+5D4O0FAMcD0M+u06vC1zHMmTHSU9G+D6isaVgxQtHMOLP0zTzA+97UDkE0pQOK2IQPnSh58UpHEmQn6K284pLFaKNd89dZuQ43kwo3Gmp+qz3Qp//BkzMMa2Li8eXOmmxTSLpjS+syBiglS5Zu1QFgnxKnQRknex+IGRbCTbva1CIQPqAr/kCK/GNuFnPV4ESJqrs0abbV42vmXdc9Utwy0iQ5ZLO6z1AEAioGj58fgi/rTTr+qqqf4tDk2zyJqyH/fAlxgfMO4z1cZjDHt3vmLNr/U6xyr8WLlH1TiGTBg3c3s9zMlXvd9ZifFfoI62QVEFtH6TCrhTLxsIphbj/VOtLtKaYT2SMtFz/XkxA8Ns5s4Ex5gv6jJJXihVXFxXaeQZJdQBAbVM607LTAMWN8r3Vdr5GYUBCL7i8wwYVx/4GpwU7qEMWgG0sFuFSpw9+54b2NJk7avBxIU5EVaFsbBfWRULzazwjVaA5e93NZ6Q1qm/FiCfAMSV+DUubWManxJbcttn9vEz7upQCO7DnZoxdLr4oYLm+w5MOf5QUX3l/zqIiUcbDQHa5q/gHOQwwCYvnOUMkEEHZ5kba
- On each of the Failover nodes in the cluster:
- Open Notepad as an Administrator
- From the File menu, select Open
- Open the RuntimeSettings.xml file located at C:\ProgramData\Alteryx
- Replace the StorageKeysEncrypted key with the value copied from the primary Controller
- Save the RuntimeSettings.xml file
- Close Notepad
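Copying the StorageKeysEncrypted value can also be scripted. The sketch below assumes the primary node's RuntimeSettings.xml is reachable over the administrative share and that the file uses a SystemSettings > Controller element structure; the node name and element path are assumptions, so verify them against your own RuntimeSettings.xml before running:

```powershell
# Run on each failover node; paths and node name below are placeholders
$primary = '\\PRIMARY-NODE\C$\ProgramData\Alteryx\RuntimeSettings.xml'
$local   = 'C:\ProgramData\Alteryx\RuntimeSettings.xml'

[xml]$src = Get-Content $primary
[xml]$dst = Get-Content $local

# Copy the StorageKeysEncrypted value from the primary Controller
$dst.SystemSettings.Controller.StorageKeysEncrypted =
    $src.SystemSettings.Controller.StorageKeysEncrypted
$dst.Save($local)
```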
Create a Failover Cluster
Once you have added the Failover Clustering Feature to each node in the cluster, the next step is to create a cluster. These steps can be completed from any of the servers that the Failover Clustering Feature has been enabled on.
- Open Server Manager
- From the Tools menu, select “Failover Cluster Manager”

- Within the Actions pane of the “Failover Cluster Manager console”, select “Create Cluster”

- Within the “Create Cluster Wizard” click next on the “Before You Begin” screen
- On the “Select Servers” screen, enter the server name of each node that will be added to the cluster and click the Add button. Once all servers have been added, verify they are listed in the Selected servers window and click Next to proceed

- On the “Validation Warning” screen, select “Yes. When I click Next, run configuration validation tests, and then return to the process of creating the cluster” and click Next

- After clicking Next on the “Validation Warning” screen, the “Validate a Configuration Wizard” will be launched

- On the “Testing Options” screen, select “Run all tests (recommended)" and click Next

- On the “Confirmation” screen, validate that all nodes being added to the Cluster are listed in the “Servers to Test” section and click Next to run the validation tests

- Once the validation process is complete, review the Summary to ensure there are no errors that need to be addressed and click Finish to return to the cluster creation process
- Optionally, you can click “View Report” to review the detailed validation report
- On the “Access Point for Administering the Cluster” screen of the “Create Cluster Wizard”, enter a Cluster Name. The Cluster Name will be added to DNS within the Active Directory domain and will be used for administering the cluster and any roles owned by the cluster. Once you have entered a Cluster Name, click Next to proceed to the confirmation screen.

- On the “Confirmation” screen, verify the cluster name and that each node being added to the cluster is listed in the Node section, then click Next to proceed

- Upon clicking Next on the “Confirmation” screen, the new cluster will be configured and added to DNS

- Once the cluster has been configured, you should receive a Summary screen stating “You have successfully completed the Create Cluster Wizard.” Click Finish to close the “Create Cluster Wizard”
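The validation and cluster creation steps above can also be performed from PowerShell on any prospective node. The node and cluster names below are placeholders for your own environment:

```powershell
# Run the cluster validation tests against the prospective nodes
Test-Cluster -Node AYX-CTRL1, AYX-CTRL2, AYX-CTRL3

# Create the cluster; the Cluster Name is registered in DNS
# within the Active Directory domain
New-Cluster -Name AYX-CLUSTER -Node AYX-CTRL1, AYX-CTRL2, AYX-CTRL3
```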

Add a Cluster Role
Now that we have created a cluster, we need to add a Cluster Role. These steps can be completed from any of the servers that the Failover Clustering Feature has been enabled on.
- Open Server Manager
- From the Tools menu, select “Failover Cluster Manager”
- Within the “Failover Cluster Manager” console, expand the newly created cluster, highlight Roles on the left, and from within the Actions pane on the right, click Configure Role

- In the “High Availability Wizard”, click Next on the “Before you Begin” screen
- On the “Select Role” screen, highlight the “Generic Service” Role and click Next

- On the “Select Service” screen, select Alteryx Service and click Next to proceed

- On the “Client Access Point” screen, enter a DNS name that will be used for accessing the cluster role. This is the DNS name that will be used when configuring Gallery and Worker nodes to access the High Availability Controller cluster.

- Click Next on the “Select Storage” and “Replicate Registry Settings” screens
- On the “Confirmation” screen, verify the settings and click Next

- Upon clicking Next on the “Confirmation” screen, the Cluster Role will be created and added to DNS. Once the High Availability role has been created, you should receive a Summary screen stating “High availability was successfully configured for the role.” Click Finish to close the “High Availability Wizard”

Microsoft Failover Clustering will now manage the state of the AlteryxService.exe on each of the nodes in the cluster. The AlteryxService.exe will be started on the “Owner” (active) node and the failover nodes will be in a stopped state. In the event of a failure on the “Owner” node, Microsoft Failover Clustering will start the AlteryxService.exe on one of the failover nodes and automatically direct traffic to the active Alteryx Controller.
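The Generic Service role can likewise be created from PowerShell. This sketch assumes the Windows service name is AlteryxService and uses a placeholder for the Client Access Point name:

```powershell
# Create a Generic Service role for the AlteryxService with a
# Client Access Point (the DNS name Gallery and Worker nodes will use)
Add-ClusterGenericServiceRole -ServiceName AlteryxService -Name AYX-CONTROLLER
```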
Gallery Node Configuration
Now that we have our High Availability Controller cluster running, the next step is to complete the setup of the Alteryx Gallery nodes. To complete the setup, configure the Gallery nodes as you normally would in a distributed Alteryx Server environment. When you reach the Controller configuration, proceed as follows:
- On the “Remote Controller” screen, enter the DNS host name that was created in step 7 of the Add a Cluster Role section and the Controller Token obtained in step 3 of the Primary Controller Configuration section of these instructions

- Click the Test button to confirm compatibility

- If you do not receive a Success notification:
- Confirm that all nodes in the Alteryx Server environment (Gallery, Controller, and Worker nodes) are running the same version of Alteryx Server.
- Confirm there are no firewalls blocking TCP port 80 on the Controller nodes
- Complete the remainder of the Alteryx System Settings, as required for each node in the Alteryx Server environment and click Finish on the “Finalize Your Configuration” screen to apply the settings and start the AlteryxService using the newly applied settings.
Worker Node Configuration
Now that we have our High Availability Controller cluster running, the next step is to complete the setup of the Worker nodes. To complete the setup, configure the Worker nodes as you normally would in a distributed Alteryx Server environment. When you reach the Controller configuration, proceed as follows:
- On the “Remote Controller” screen, enter the DNS host name that was created in step 7 of the Add a Cluster Role section and the Controller Token obtained in step 3 of the Primary Controller Configuration section of these instructions

- Click the Test button to confirm compatibility

- If you do not receive a Success notification:
- Confirm that all nodes in the Alteryx Server environment (Gallery, Controller, and Worker nodes) are running the same version of Alteryx Server.
- Confirm there are no firewalls blocking TCP port 80 on the Controller nodes
- Complete the remainder of the Alteryx System Settings, as required for each node in the Alteryx Server environment and click Finish on the “Finalize Your Configuration” screen to apply the settings and start the AlteryxService using the newly applied settings.
Automated Failover Testing
Once the Controller has been configured for High Availability and automated failover, it is highly recommended that you perform testing to validate the configuration and ensure the automated failover succeeds. There are several methods you can use to test the automated failover.
- Manual Failover – These steps can be performed from any server within the Failover Cluster
- Open Server Manager
- From the Tools menu, select “Failover Cluster Manager”
- Within the “Failover Cluster Manager” console, expand the newly created Cluster.
- If the cluster is not displayed, from within the Actions pane, click “Connect to Cluster…” and follow the on-screen prompts to connect to the newly created cluster
- Within the Roles section of the newly created cluster, right click the role and select Move >> Select Node…

- Select one of the available Cluster Nodes. The “Move Clustered Role” window will only display the available destination nodes; the current “Owner” node will not be displayed.

- Click the OK button to initiate the failover to the selected Cluster Node
- Once the “Owner” node has changed and the status is “Running”, proceed with verifying the failover of the Controller. For details on what to verify and testing, refer to the Failover Verification and Testing section

- Power outage simulation
- Virtual Machines
- Option 1 – System shutdown
- Login to Remote Desktop of the current “Owner” node
- Open the Windows start menu
- Select “Shut down” to power off the server
- Option 2 – Power off the Guest (requires access to the Virtual Machine Hypervisor)
- Open the Virtual Machine Hypervisor
- Within the Hypervisor, locate and select the owner node
- With the Owner node selected, within the Hypervisor, power off the Virtual Machine
- Physical Server (Requires access to the physical server)
- Power off or unplug the server to force a failover
- Once the “Owner” node has changed and the status is “Running”, proceed with verifying the failover of the Controller. For details on what to verify and testing, refer to the Failover Verification and Testing section
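A manual failover can also be initiated from PowerShell on any cluster node. The role and node names below are placeholders for your own environment:

```powershell
# List the cluster roles along with their current Owner node
Get-ClusterGroup

# Move the Controller role to a specific failover node
Move-ClusterGroup -Name 'AYX-CONTROLLER' -Node AYX-CTRL2
```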
Failover Verification and Testing
When verifying and testing the automated failover, it is important to verify and test all functionality utilized within your Alteryx Server environment. This would include, but not be limited to:
- Verify the AlteryxService.exe has started on the new Owner node (manual failover)
- Open Task Manager
- Switch to the Services tab
- Verify the AlteryxService.exe status is Running
- Verify the AlteryxService.exe has stopped on the old Owner node (manual failover)
- Open Task Manager
- Switch to the Services tab
- Verify the AlteryxService.exe status is Stopped
- Test the execution of Alteryx Workflows/Analytic Apps
- On-demand Workflow/Analytic App executions (Gallery)
- Scheduled Workflow/Analytic App executions
- Workflows/Analytics App utilizing Run As credentials and/or requiring users to specify their credentials
- Workflows utilizing Gallery Database Connections (Gallery >> Admin >> Database Connections)
- Test the ability to publish Alteryx Workflows/Analytic Apps
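The service-state checks above can be run from a single machine using PowerShell remoting, assuming WinRM is enabled on the cluster nodes (node names are placeholders):

```powershell
# Check the AlteryxService state on every node; it should be Running
# on the new Owner node and Stopped on the remaining nodes
Invoke-Command -ComputerName AYX-CTRL1, AYX-CTRL2, AYX-CTRL3 -ScriptBlock {
    Get-Service -Name AlteryxService | Select-Object Name, Status
}
```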
Known Issues/Concerns
- In-flight jobs – In the event you have jobs running when the Controller failover occurs, those jobs will not automatically resume on the new “Owner” node and may need to be rerun or rescheduled once the failover completes