

Alteryx Analytics Hub Deployment Overview

DavidHa
Alteryx

This article provides an overview of what administrators need to know to plan Alteryx Analytics Hub deployments of any size. Familiarity with the Alteryx Analytics Hub architecture will make the concepts discussed below easier to follow.

 

System Requirements

Analytics Hub can be deployed on Windows Server 2012 or newer, either on-premises or on cloud services such as AWS, Azure, or Google. The server needs a minimum of 4 physical cores and 16 GB of RAM, although 32 GB will provide a much-improved user experience. At least 100 GB of disk space should be available to account for the product installation, the PostgreSQL database, and virtual file system binary storage. The Windows Server should be a clean environment with no previous installations of Alteryx Designer, Alteryx Server, or PostgreSQL.
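As a planning aid, the minimums above can be written down as a small pre-installation check. The following is a minimal sketch, not an Alteryx tool: the `HostSpec` class and `check_host` function are hypothetical names, and the values for a candidate server are entered by hand rather than detected.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the System Requirements paragraph above.
MIN_PHYSICAL_CORES = 4
MIN_RAM_GB = 16
RECOMMENDED_RAM_GB = 32
MIN_FREE_DISK_GB = 100


@dataclass
class HostSpec:
    """Hand-entered description of a candidate server (hypothetical helper)."""
    physical_cores: int
    ram_gb: float
    free_disk_gb: float


def check_host(spec: HostSpec) -> List[str]:
    """Return findings for a host; an empty list means nothing to flag."""
    findings = []
    if spec.physical_cores < MIN_PHYSICAL_CORES:
        findings.append(
            f"FAIL: {spec.physical_cores} physical cores, minimum is {MIN_PHYSICAL_CORES}"
        )
    if spec.ram_gb < MIN_RAM_GB:
        findings.append(f"FAIL: {spec.ram_gb} GB RAM, minimum is {MIN_RAM_GB} GB")
    elif spec.ram_gb < RECOMMENDED_RAM_GB:
        # Meets the minimum, but below the size the article recommends.
        findings.append(
            f"WARN: {spec.ram_gb} GB RAM meets the minimum; {RECOMMENDED_RAM_GB} GB is recommended"
        )
    if spec.free_disk_gb < MIN_FREE_DISK_GB:
        findings.append(
            f"FAIL: {spec.free_disk_gb} GB free disk, minimum is {MIN_FREE_DISK_GB} GB"
        )
    return findings


if __name__ == "__main__":
    for line in check_host(HostSpec(physical_cores=4, ram_gb=16, free_disk_gb=250)):
        print(line)
```

The check covers only the numeric minimums. The operating system version and the clean-environment requirement still need to be confirmed separately.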

Analytics Hub utilizes Microsoft .NET Framework version 4.7.2. If this is not already installed on the Windows Server, then Analytics Hub will install it during product installation.

The following ports need to be open to ingress traffic:
 

  • 443: HTTPS – used by the Analytics Hub user interface
  • 5000: HTTPS – used by Remote Workers to communicate with the Primary Hub machine
  • 8080: HTTPS – used by the Primary Hub machine to communicate with Remote Workers


In addition, HTTP/HTTPS egress traffic is required to access whitelist.alteryx.com for license activation.
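One way to confirm that firewall rules match the port list above is a plain TCP connection test run from another machine on the network. The sketch below uses only the Python standard library. The names `INGRESS_PORTS`, `port_is_reachable`, and `check_ingress` are illustrative, and the host name in the usage section is a placeholder.

```python
import socket

# Ingress ports from the list above, with the role each one plays.
INGRESS_PORTS = {
    443: "Analytics Hub user interface",
    5000: "Remote Workers to Primary Hub",
    8080: "Primary Hub to Remote Workers",
}


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ingress(host: str, ports=INGRESS_PORTS) -> dict:
    """Map each port to True or False depending on whether it accepted a connection."""
    return {port: port_is_reachable(host, port) for port in ports}


if __name__ == "__main__":
    # "hub.example.internal" is a placeholder; substitute the real server name.
    for port, ok in check_ingress("hub.example.internal").items():
        status = "open" if ok else "unreachable"
        print(f"{port} ({INGRESS_PORTS[port]}): {status}")
```

A successful connection needs a service listening on the port, so a result of "unreachable" before installation does not by itself indicate a firewall problem. Going by the descriptions in the list, port 5000 would be tested against the Primary Hub machine and port 8080 against each Remote Worker.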

The most up-to-date requirements can be found on the Analytics Hub System Requirements page.

 

 

Deployment

A base Analytics Hub installation will include all four Analytics Hub architectural components on a single machine. This is sufficient for small departments. For use cases where a large number of concurrently running jobs need to be processed, adding remote Worker nodes can help meet those workload demands.

 


Figure 1: Sample deployment patterns.

 

 

 

Scalability

Scaling Workers allows Alteryx Analytics Hub to tackle big data problems, run many jobs concurrently, and even provide redundancy in case of failure. Workers can be scaled in two ways.
 

  •      Scaling Up or Vertical Scaling.  This means adding resources such as CPU and RAM to the Worker machine so that jobs have more resources available for processing, or so that the number of simultaneous workflows on that Worker can be increased. This may improve performance by allowing jobs to complete faster or by allowing more jobs to run at a time.
     
  •      Scaling Out or Horizontal Scaling.  This approach adds Worker nodes, which should improve both performance and availability. Additional Workers allow more jobs to run concurrently without impacting the performance of jobs running on the existing Worker(s). Multiple Workers also provide fault tolerance, as jobs can continue to run even if one of the Workers is inaccessible.


Figure 2: Worker Scalability options.

 

To increase the number of jobs that may execute simultaneously on a Worker, modify the value of the "number_of_engines" setting in the CutlassSettings.yml file. The default is two, meaning two jobs can run concurrently. A recommended starting point for this setting is half the number of physical cores. For example, an 8-core machine should start with number_of_engines set to 4.
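The starting-point rule is simple arithmetic and can be sketched as a one-line helper. The function name `recommended_engines` is hypothetical, and rounding down for odd core counts (with a floor of one) is an assumption, since the article only gives the even-numbered example.

```python
def recommended_engines(physical_cores: int) -> int:
    """Starting value for number_of_engines: half the physical cores.

    Rounding down and never returning less than 1 are assumptions made
    for this sketch; the article only states the "half the cores" rule.
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return max(1, physical_cores // 2)


if __name__ == "__main__":
    for cores in (4, 8, 16):
        print(f"{cores} physical cores: start with number_of_engines = {recommended_engines(cores)}")
```

For the 8-core example in the text, the helper returns 4. The result is only a starting value to refine through benchmarking.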

Increasing this value allows more jobs to run simultaneously, which may result in fewer jobs waiting in the job queue. However, more concurrent jobs means more competition for system resources, notably disk I/O, RAM, and CPU cycles, which could lead to individual jobs taking longer to execute.
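The trade-off can be illustrated with a deliberately simple toy model. It is not a measurement and not how Analytics Hub schedules work: it assumes identical jobs that run in waves, and a hypothetical `contention_free_engines` parameter beyond which every job slows down in proportion to the oversubscription.

```python
import math


def batch_makespan(jobs: int, engines: int, base_minutes: float,
                   contention_free_engines: int) -> float:
    """Toy estimate of the minutes needed to drain a queue of identical jobs.

    Assumptions (all illustrative): jobs run in waves of `engines` at a time,
    and once `engines` exceeds `contention_free_engines`, each job's runtime
    stretches by the ratio engines / contention_free_engines.
    """
    slowdown = max(1.0, engines / contention_free_engines)
    waves = math.ceil(jobs / engines)
    return waves * base_minutes * slowdown


if __name__ == "__main__":
    # 12 queued jobs of 10 minutes each; the machine is assumed to handle
    # 4 concurrent jobs before contention sets in.
    for engines in (2, 4, 8):
        total = batch_makespan(12, engines, 10, contention_free_engines=4)
        print(f"number_of_engines={engines}: about {total:.0f} minutes")
```

Under these assumptions, going from 2 to 4 engines halves the time to clear the queue, while going from 4 to 8 makes it longer again because each job now takes twice as long. Real workloads will behave differently, which is why benchmarking on actual hardware matters.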

There are many factors to consider, such as data set sizes, the Workflow characteristics, and the underlying hardware. It is highly recommended that each organization perform internal benchmarking and analysis to understand what performs best based on their unique workload characteristics and hardware.

 

Conclusion

This article has provided an overview of the deployment requirements and options for Analytics Hub.  It is recommended to have an architectural discussion with your Alteryx representative to understand the business requirements and workload characteristics. For more information on Analytics Hub, be sure to check out the Alteryx Analytics Hub Knowledge Base.

Comments
Balders
10 - Fireball

Hi David, is there a high availability set up for AAH that can keep it running if the primary node fails? 

AndrewDataKim
12 - Quasar

Hi @Balders ,

 

Currently there is not an HA setup for AAH as the Front End and Backend cannot be separated. I would expect to see it coming in the near future as it will be a requirement for larger Enterprise clients.