Alteryx Analytics Hub delivers an enterprise-class data and analytics platform. It does this through four main components: the Frontend, the Backend, the Worker, and the Persistence layer.
Figure 2: The Alteryx Analytics Hub architectural components.
The Frontend acts as the entry point into the platform. It is built on an Express.js web server, which supports a browser-based User Interface and an extensive list of REST API endpoints.
Figure 3: The Frontend supports the browser-based user interface and the Public APIs.
The User Interface provides a visual way to interact with business processes and assets. It allows users to run jobs, share and collaborate on assets, and administer the environment. Multi-tenancy is supported through Sites, which provide a logical separation of content and users.
Public API endpoints are available to automate the same functions available in the User Interface. These endpoints can be used by other Alteryx products, such as Alteryx Designer, or custom applications.
The Frontend is secured by username and password authentication, as well as HTTPS encryption.
The Backend is where all the core logic of Alteryx Analytics Hub resides. While the Frontend is responsible for the user interface and public APIs that the end user or application interacts with, the Backend is what processes those actions. This includes authenticating users, scheduling jobs, assigning jobs to Workers, and performing administrative functions such as modifying permissions or groups.
The Backend also manages the Virtual File System, which is where all assets are stored. Assets include Alteryx workflows, data files, and report-based output files.
As shown in Figure 4, all communication flows through the Backend. The Backend retrieves and stores information in the persistence layer based on requests from the Frontend. The Backend also assigns jobs to Workers to process Alteryx workflows.
Figure 4: All communication flows through the Backend.
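To make the flow in Figure 4 concrete, here is a minimal toy model in Python. It is only a sketch of the idea that the Frontend never touches the persistence layer or the Workers directly. All class and method names are hypothetical, and the least-loaded assignment rule is an assumption made for illustration, not the documented scheduling behavior of Analytics Hub.

```python
from dataclasses import dataclass, field


@dataclass
class Persistence:
    """Stand-in for the persistence layer (hypothetical in-memory store)."""
    jobs: dict = field(default_factory=dict)


@dataclass
class WorkerNode:
    """Stand-in for a Worker; only tracks which job ids it was given."""
    name: str
    assigned: list = field(default_factory=list)


class Backend:
    """All requests pass through here: it records jobs and assigns Workers."""

    def __init__(self, persistence: Persistence, workers: list):
        self.persistence = persistence
        self.workers = workers
        self._next_id = 1

    def submit_job(self, user: str, workflow: str) -> int:
        job_id = self._next_id
        self._next_id += 1
        # Assumption for illustration: pick the Worker with the fewest jobs.
        worker = min(self.workers, key=lambda w: len(w.assigned))
        worker.assigned.append(job_id)
        # The Backend, not the Frontend, writes to the persistence layer.
        self.persistence.jobs[job_id] = {
            "user": user,
            "workflow": workflow,
            "worker": worker.name,
        }
        return job_id


class Frontend:
    """Entry point: forwards user or API requests to the Backend only."""

    def __init__(self, backend: Backend):
        self.backend = backend

    def handle_run_request(self, user: str, workflow: str) -> int:
        return self.backend.submit_job(user, workflow)
```

In this sketch the Frontend holds a reference to the Backend and nothing else, which mirrors the statement that all communication flows through the Backend.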
The Worker is responsible for running Alteryx workflows using the Alteryx Engine. The Engine can process workflows using the standard “E1” engine or the newer “AMP” engine. The AMP engine (short for Alteryx Multi-Threaded Processing) uses multi-threaded processing to speed up the execution of large, complex data problems. A workflow setting determines which engine to use, and Analytics Hub supports both.
Analytics Hub offers scalability and performance by allowing multiple jobs to run simultaneously. A job is a running instance of an Alteryx Workflow. Each Alteryx Engine process can execute a single Alteryx Workflow at a time.
Figure 5: One Engine process can execute one Alteryx workflow.
Each Worker can manage multiple instances of the Alteryx Engine, meaning a Worker can manage multiple running jobs simultaneously.
Figure 6: One Worker can manage multiple Engines simultaneously.
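The relationship between a Worker and its Engines can be sketched as a bounded pool: each slot runs one workflow at a time, and the Worker hands queued jobs to whichever slot frees up. The Python model below is illustrative only. The class, the `engine_slots` parameter, and the workflow file names are hypothetical, and threads stand in for Engine processes.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Worker:
    """Toy model of a Worker managing a fixed number of Engine slots."""

    def __init__(self, engine_slots: int) -> None:
        self.engine_slots = engine_slots
        # The pool size caps how many workflows run at the same time.
        self._pool = ThreadPoolExecutor(max_workers=engine_slots)
        self._lock = threading.Lock()
        self._running = 0
        self.peak_running = 0  # highest number of simultaneous jobs seen

    def _run_engine(self, workflow: str) -> str:
        # One Engine slot executes exactly one workflow at a time.
        with self._lock:
            self._running += 1
            self.peak_running = max(self.peak_running, self._running)
        try:
            time.sleep(0.05)  # stand-in for real workflow execution
            return f"{workflow}: done"
        finally:
            with self._lock:
                self._running -= 1

    def run_jobs(self, workflows: list) -> list:
        # Jobs beyond the slot count wait in the pool's queue.
        futures = [self._pool.submit(self._run_engine, w) for w in workflows]
        return [f.result() for f in futures]

    def shutdown(self) -> None:
        self._pool.shutdown()


worker = Worker(engine_slots=2)
results = worker.run_jobs(["a.yxmd", "b.yxmd", "c.yxmd"])
worker.shutdown()
```

With two slots and three jobs, the third job waits until one of the first two finishes, so `worker.peak_running` never exceeds 2.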
A configuration file setting (number_of_engines) controls the number of workflows a Worker can run simultaneously. Additional Worker nodes can be added to provide redundancy and scaling. The number of Workers needed is determined by the average Workflow execution time and the number of jobs expected to run on a daily or hourly basis across the Analytics Hub environment. A sizing discussion with an Alteryx representative should take place to determine a recommended number of Workers and Engines. Note that a single Workflow job cannot be split across multiple Worker nodes operating in a cluster.
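As a rough illustration of how average execution time and job volume drive the Worker count, the Python sketch below does a back-of-envelope estimate. It is not Alteryx's sizing method. The function name, the `target_utilization` headroom factor, and the example figures are all assumptions, and a real sizing exercise should still go through an Alteryx representative.

```python
import math


def estimate_workers(
    jobs_per_hour: float,
    avg_runtime_minutes: float,
    engines_per_worker: int,
    target_utilization: float = 0.75,
) -> tuple:
    """Return (engines_needed, workers_needed) for a steady job load.

    target_utilization is an assumed headroom factor: 0.75 means Engines
    are planned to be busy about 75% of the time.
    """
    if engines_per_worker < 1:
        raise ValueError("engines_per_worker must be at least 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Engine-hours of work arriving per hour of wall-clock time.
    busy_engine_hours = jobs_per_hour * avg_runtime_minutes / 60
    engines_needed = math.ceil(busy_engine_hours / target_utilization)
    # engines_per_worker corresponds to the number_of_engines setting.
    workers_needed = math.ceil(engines_needed / engines_per_worker)
    return engines_needed, workers_needed


# Example: 120 jobs per hour, 3 minutes each, 4 Engines per Worker.
engines, workers = estimate_workers(120, 3, 4)
print(engines, workers)  # prints "8 2"
```

The estimate assumes jobs arrive evenly through the hour. Bursty schedules, such as many jobs starting at the top of the hour, would need more headroom than this simple average suggests.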
The persistence layer of Analytics Hub is provided by PostgreSQL, an industry-standard SQL database known for its extensibility and standards compliance. The PostgreSQL database is where all Analytics Hub application data is stored, including Users, Roles, Schedules, and Sites. Note that user assets and data files are not stored in the database; they are stored in the Virtual File System managed by the Backend. The database contains three different schemas used by Analytics Hub:
· platform - the main schema, which stores the information for all application data, such as data connections, schedules, and jobs
· pgboss - used for internal system activities such as checking for new jobs
· rdbms - used as a temporary staging area for data source metadata that has not yet been written to the platform schema
Figure 7: The three schemas used by Analytics Hub in the PostgreSQL database.
The platform schema is responsible for storing all the information needed to support the usage of Analytics Hub. This includes roughly 50 tables that can be summarized by the categories in the diagram below.
Figure 8: High-level categories of the tables supporting the Analytics Hub platform.
Any reporting or monitoring should be run against the platform schema. However, excessive connections and queries could degrade the performance of Analytics Hub and the end-user experience.
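One way to keep reporting queries light is to reuse a single database connection and start with cheap metadata lookups. The Python sketch below lists the tables in a schema using PostgreSQL's standard `information_schema.tables` view. The function name is hypothetical, and it assumes the caller passes in an already-open DB-API 2.0 connection from a PostgreSQL driver that uses `%s` placeholders (psycopg2 is one such driver). Connection details and credentials are deliberately not shown.

```python
from typing import Any

# Standard PostgreSQL metadata query; the schema name is passed as a
# parameter rather than formatted into the SQL string.
LIST_TABLES_SQL = (
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = %s ORDER BY table_name"
)


def list_tables(conn: Any, schema: str = "platform") -> list:
    """Return the table names in the given schema.

    conn is assumed to be an open DB-API 2.0 connection supplied by the
    caller, so repeated calls share one connection instead of opening
    a new one for every query.
    """
    cur = conn.cursor()
    try:
        cur.execute(LIST_TABLES_SQL, (schema,))
        return [row[0] for row in cur.fetchall()]
    finally:
        # Always release the cursor, even if the query fails.
        cur.close()
```

A reporting script would open one connection, call `list_tables(conn)` and any follow-up queries on it, and close it when finished, rather than connecting once per query.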
And that concludes the overview of the Alteryx Analytics Hub architecture. If you have any specific questions, please reach out to your Alteryx representative and we’ll be happy to answer them.