Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
DavidHa
Alteryx


 

Welcome to a new blog series on Alteryx Architectures.  In this series, the Alteryx Solutions Architecture team will present a number of approaches to designing enterprise-class, resilient architectures with Alteryx Server.  Alteryx Server provides a scalable, server-based analytics solution that lets you create, publish, and share analytic applications; schedule and automate workflow jobs; create, manage, and share data connections; and control data access.  Alteryx Server is the foundation for successful use of Alteryx within an organization.  The architectures and best practices covered in this blog series have been proven through customer deployments around the world.

 

In this first entry, we will review some basic concepts of Alteryx Server that set the stage for the rest of the series.  Note that these blogs are not a substitute for a sizing exercise to determine the Alteryx Server deployment that best fits your organization's business requirements, workload patterns, and data sizes.  Please work with your Alteryx sales representative, and we will gladly help you size an Alteryx Server deployment to fit your needs.

 

 

Introduction


Alteryx Server's architecture is built on four main components: the Controller, Worker, Gallery, and persistence layer (MongoDB).  Each is described below.

 

  • Controller
    • Manages the environment and delegates workflows to be executed to the Workers.
    • Manages schedules: which jobs are scheduled to run and at what times.
    • If jobs are waiting to run (in the job queue), the Controller manages the queue and determines the order in which jobs run, based on settings such as QoS Priority and Worker Tagging.
  • Worker
    • Executes Alteryx workflows in a Server environment, using the Alteryx Engine.
    • The same Alteryx Engine that executes workflows in Designer is used on the Worker in a Server environment.
    • Workers can manage multiple Alteryx Engines, which allows multiple workflows to run concurrently.  (Performance considerations apply.)
  • Gallery
    • A web-based application for users and administrators to interact with Alteryx Server.
    • Users can publish, schedule, share, and execute workflows with other users in the Gallery.
    • Includes APIs for end users and administrators to automate Alteryx Server functions or integrate with other applications.
  • Persistence (MongoDB)
    • Stores application data such as workflows, schedules, collections, job results, and the job queue.
    • Can leverage an "embedded" MongoDB that is co-located with the Controller and fully managed by Alteryx Server, or a "user-managed" MongoDB in which an existing MongoDB deployment is used.
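
To make the Controller's queue handling more concrete, here is a toy sketch in Python.  It is not Alteryx Server's actual scheduling logic: the `Job` and `Worker` classes, the numeric `priority` field (larger meaning more urgent), and the single optional `tag` per job are all assumptions made for illustration.  It only shows the general idea of choosing the next job by priority among the jobs a given Worker is eligible to run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Job:
    name: str
    priority: int               # hypothetical: larger number = more urgent
    tag: Optional[str] = None   # hypothetical: job may require a tagged Worker


@dataclass
class Worker:
    name: str
    tags: frozenset = frozenset()


def next_job_for(worker: Worker, queue: List[Job]) -> Optional[Job]:
    """Remove and return the next job this worker may run, or None."""
    # A job is eligible if it needs no tag, or the worker carries its tag.
    eligible = [
        (index, job)
        for index, job in enumerate(queue)
        if job.tag is None or job.tag in worker.tags
    ]
    if not eligible:
        return None
    # Highest priority first; earlier queue position breaks ties.
    index, job = min(eligible, key=lambda pair: (-pair[1].priority, pair[0]))
    del queue[index]
    return job
```

In this model, an untagged Worker skips jobs that require a tag and leaves them in the queue for a Worker that carries it.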

 

Figure 1 - An example Alteryx Server with all four components enabled on a single machine.

 

 

Deployment

 

Each Alteryx Server deployment consists of all four of the above components.  These components can all be enabled on a single machine or spread across multiple machines.  The Alteryx Server architecture is flexible: the Gallery and Worker components can each run as multiple instances across multiple machines for performance and resiliency.  The Controller can have a single active instance and multiple passive instances for failover purposes.  With a user-managed MongoDB deployment, the persistence layer can be scaled across multiple MongoDB machines via a replica set.  These deployment options will be discussed in much more depth in subsequent blog entries in this series.
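
As a rough illustration of these rules, the Python sketch below checks a planned layout.  The role names (`controller`, `controller-passive`, `worker`, `gallery`, `mongodb`) and the `check_deployment` function are hypothetical and do not correspond to any Alteryx configuration format; the sketch only encodes the constraints described above: all four components must be present, and exactly one Controller is active.

```python
from typing import Dict, List, Set

# Hypothetical role names, used only for this sketch.
REQUIRED = {"controller", "worker", "gallery", "mongodb"}


def check_deployment(machines: Dict[str, Set[str]]) -> List[str]:
    """Return a list of problems found in a machine -> roles layout."""
    problems = []
    all_roles = set().union(*machines.values()) if machines else set()

    # Every deployment needs all four components somewhere.
    for role in sorted(REQUIRED - all_roles):
        problems.append(f"missing component: {role}")

    # Only one Controller may be active; others must be passive.
    active = [name for name, roles in machines.items() if "controller" in roles]
    if len(active) > 1:
        problems.append(f"multiple active controllers: {sorted(active)}")

    return problems
```

For example, a single machine with all four roles passes, and so does a layout with one active Controller, a passive Controller, two Workers, two Gallery machines, and a separate MongoDB machine.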

 

 

Engine - Worker Relationship


An important foundational concept of Alteryx Server is the relationship between the Engine and the Worker.  As mentioned above, the Alteryx Engine is responsible for executing workflows, whether the execution happens in Alteryx Designer or on Alteryx Server.  Each Alteryx Engine process executes a single Alteryx workflow.  With Alteryx Server, the Worker can manage multiple Alteryx Engine processes.  A Worker can therefore run multiple Alteryx workflows simultaneously, which is a key benefit of Alteryx Server.
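
The relationship can be modeled with a small Python sketch.  This is a simulation only: real Engines are separate processes managed by the Worker service, whereas here each "engine" is a thread that sleeps briefly in place of running a workflow.  The `Worker` class, its `simultaneous_workflows` argument, and the `peak` counter are hypothetical names chosen for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Worker:
    """Toy Worker that runs at most `simultaneous_workflows` engines at once."""

    def __init__(self, simultaneous_workflows: int):
        self.limit = simultaneous_workflows
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0  # highest number of engines observed running together

    def _run_engine(self, workflow: str) -> str:
        # One engine handles exactly one workflow.
        with self._lock:
            self._running += 1
            self.peak = max(self.peak, self._running)
        try:
            time.sleep(0.01)  # stand-in for real workflow execution
            return f"{workflow}: done"
        finally:
            with self._lock:
                self._running -= 1

    def run_all(self, workflows):
        # The pool size caps how many engines run concurrently.
        with ThreadPoolExecutor(max_workers=self.limit) as pool:
            return list(pool.map(self._run_engine, workflows))
```

Calling `Worker(2).run_all(["a.yxmd", "b.yxmd", "c.yxmd"])` completes all three workflows while never running more than two at the same time; the third waits until an engine slot frees up.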

 

Figure 2 - An Alteryx Server Worker can manage multiple Engines, which allows multiple workflows to run concurrently.

 

 

Many performance considerations must be taken into account to determine the optimum number of workflows to run simultaneously.  A number of resources are available for this, including Simultaneous Workflows Guidance, Worker System Settings Deep Dive, and Worker System Settings Help.  In general, for the "E1" engine, the recommended starting point is to set the number of simultaneous workflows to half the number of physical cores on the machine.  (So for an 8-core machine, we recommend setting simultaneous workflows to 4.)  Note that this is a limit on the maximum number of workflows allowed to run concurrently; fewer may be running at any moment if workload demand is low.  For the "AMP" engine, a recommended best practice is to set simultaneous workflows to 1, as the AMP engine can take advantage of multiple cores and processing threads.  Your Alteryx representative can discuss these settings in greater detail and help determine what makes the most sense for your environment.
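
The starting-point guidance above reduces to a few lines of Python.  The function name and the engine labels passed as strings are hypothetical; rounding down for odd core counts and the floor of 1 are assumptions of this sketch, not stated Alteryx guidance.  Treat the result as a first setting to tune, not a substitute for a sizing exercise.

```python
def recommended_simultaneous_workflows(physical_cores: int, engine: str = "E1") -> int:
    """Starting-point value for the simultaneous workflows setting."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    if engine == "AMP":
        # AMP spreads a single workflow across cores and threads.
        return 1
    if engine == "E1":
        # Half the physical cores; rounding down and the minimum of 1
        # are assumptions of this sketch.
        return max(1, physical_cores // 2)
    raise ValueError(f"unknown engine: {engine!r}")


print(recommended_simultaneous_workflows(8, "E1"))   # prints 4
print(recommended_simultaneous_workflows(8, "AMP"))  # prints 1
```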

 

 

Summary

 

In this blog, we have introduced some basic concepts around the architecture and deployment of Alteryx Server.  In subsequent entries in this series, we will look at more detailed topics, including scalability, high availability, cloud deployments, and more.  If you have any topics you would like to see discussed, please leave a comment below.  Thanks!

 


David Hare
Manager, Solutions Architecture

David is the manager of the Alteryx Solutions Architecture team helping customers understand the Alteryx platform, how it integrates with their existing IT infrastructure, and how Alteryx can provide high performance and advanced analytics. He's passionate about learning new technologies and recognizing how they can be leveraged to solve organizations' business problems.
