Welcome to part 2 of the Supporting Promote series. In this series, we will tackle some common issues and questions, and provide best practices for troubleshooting. In this article, we will be investigating one common "Promote Service Down" scenario - when the promote_logspout and promote_logstash services are down. You can follow these same steps to start troubleshooting other downed services.
Welcome to part 4 of the Supporting Promote series. In this series, we will tackle some common issues and questions, and provide best practices for troubleshooting. This article will demonstrate backing up and restoring your Promote PostgreSQL database.
Welcome to part 3 of the Supporting Promote series. In this series, we will tackle some common issues and questions, and provide best practices for troubleshooting. This article will step through the process of restoring the Promote web app.
Some Promote customers have run into an issue where the status of the predictive model will flicker between online and offline continuously on the Promote UI page. This article discusses the cause of the issue, as well as how to resolve it.
One of the most important features of Promote is its ability to return near-real-time predictions from deployed models. Here is a list of frequently asked questions relating to Promote prediction requests.
Although the Admin page indicates that there are models deployed to the Promote instance, the models are not appearing in the Promote UI home page.
Alteryx Promote ≥ 2018.2.1
There was an issue with your web browser cache, causing an incomplete view of the Promote UI home page to load.
Clear your browser's cache for the Promote UI site address.
Follow the steps provided for your browser here.
If following the steps to clear the cache for your browser does not resolve the issue, please open a support ticket using the Case Portal.
This article outlines the step-by-step procedure for enabling a set of backup nodes as a production cluster. This article does not describe how to capture the backup clones. Typically, these backup clones should come from a snapshot. The specific process for capturing these clones will depend on your deployment's infrastructure.
Platform Product: Promote Issues – Working with Alteryx Customer Support Engineers (CSEs) (for use by CSEs and Alteryx Customers)
To EXPEDITE the resolution of your case, please include the below information.
Promote - Requested Information
*** Suggestion: copy/paste the questions below and email the supporting documentation to firstname.lastname@example.org
1. Detailed description of the Issue
2. Alteryx Version
Promote – Requested Information (Detailed Instructions):
1. Detailed Description of the Issue – What issues are you having? Has it worked in the past? When did the issue start? Are all users affected or just some? What are the steps to reproduce your issue? What have you tried to resolve the issue? Have you searched the Alteryx Community?
2. Screenshot of Alteryx Version – Our CSEs need to know the precise version of Alteryx so we can replicate any issues.
If the problem involves Alteryx Designer (Scoring or Deploy tools), please provide its version too. In Designer, on your desktop or Server, click Help >> About and provide a screenshot.
The screenshot will show whether it is Server or Designer, and whether it is “Running Elevated” (Admin vs. Non-Admin).
Promote Part 1
Promote Part 2
Promote Part 3
Often, when deploying a model up to Promote, the model requires certain dependencies to run. These dependencies can be certain functions, files, etc. If your model requires them, you’ll need to create a promote.sh, which contains commands to import these dependencies. This will be one of the factors needed to ensure your model will be set up for success on Promote, because sometimes a model needs a little help.
If we go to https://github.com/alteryx/promote-python we can go into the article-summarizer example, which contains one of these promote.sh files. You’ll notice that if you open the file, you’ll see this command:
python -c "import nltk; nltk.download('punkt')"
This is required because the newspaper package in the model (main.py) requires an NLP dataset. Now, when we deploy the model, the promote.sh file will run at the same time, which will ensure the dependencies live inside the model environment (docker model image). We can now properly test the model in Promote!
If we're looking at an R example (there is one on the Promote GitHub), you will have the same folder structure, except the promote.sh file will look something like this:
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev
apt-get -y install r-cran-rodbc
apt-get -y install libiodbc2-dev
In this case, our model requires an ODBC driver, therefore our model container will also need it in order to run on Promote. Just as in the above Python example, when we deploy this model, the promote.sh file will run and the proper driver will be installed, enabling us to work and test this model on Promote!
Once you get these all set, you'll be good to venture on and make your model the best it can be!
When making a call to a Promote model, the input data used to make a prediction is sent in a JSON format. When working with an R model, prior to reaching the model.predict() function, the JSON string that was sent to your model is converted to an R format (either an R dataframe or an R list). By default, this conversion is performed with the fromJSON() function in the jsonlite R package.
Promote is data science model hosting and management software that allows its users to seamlessly deploy their data science models as highly available microservices that return near-real-time predictions by leveraging REST APIs. In this article, we provide an overview of Promote’s technical requirements and architecture.
Promote uses NGINX as a load balancer. In Promote, NGINX is configured to require TLS (Transport Layer Security) or SSL (Secure Sockets Layer) certificates. This article goes through the step-by-step process of using your own TLS/SSL certificate during installation or updating your TLS/SSL certificates after installation.
If you have a model that takes longer than 10 seconds to return results, Promote will time out your model API query by default. If you would like Promote to wait longer than 10 seconds before timing out, you can adjust this timeout with an environment variable called PREDICTION_TIMEOUT.
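As a rough illustration of how a default-with-override setting like this behaves, the sketch below reads PREDICTION_TIMEOUT from an environment mapping and falls back to the 10 second default when it is not set. The helper name read_prediction_timeout is hypothetical, and treating the value as a plain number of seconds is an assumption; this is not Promote's own code, so check the full article for where the variable must be set and which units it expects.

```python
import os

# Default stated in the article: Promote times out after 10 seconds.
DEFAULT_TIMEOUT_SECONDS = 10.0


def read_prediction_timeout(env=os.environ):
    """Return the configured timeout, or the 10 second default.

    Assumption: PREDICTION_TIMEOUT holds a plain number of seconds.
    This helper is illustrative and not part of Promote.
    """
    raw = env.get("PREDICTION_TIMEOUT")
    if raw is None or raw.strip() == "":
        return DEFAULT_TIMEOUT_SECONDS
    value = float(raw)
    if value <= 0:
        raise ValueError("PREDICTION_TIMEOUT must be positive")
    return value


# Unset: the default applies. Set to "30": the override applies.
default_timeout = read_prediction_timeout({})
longer_timeout = read_prediction_timeout({"PREDICTION_TIMEOUT": "30"})
```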
To ensure that prediction requests are responsive and that an Alteryx Promote instance can scale with user and model demands, the size of the supporting infrastructure should be carefully considered. This article will review the main factors that influence system performance and provide guidance on sizing for an initial deployment. Details should always be reviewed with your Alteryx representative to understand how these recommendations might be customized to best fit your use case and requirements.
For helpful background information, we recommend reading the article An Overview of Promote's Architecture.
Promote has a minimum system requirement of 3 machines, each with 4 cores and 16GB of RAM.
This configuration is probably suitable for most development environments. However, for a production environment, there are several factors that should be understood to ensure the environment is sized properly. The following sections will introduce these factors, as well as how they impact resource consumption.
Number of Models
Each model deployed to Promote creates a new Docker service with two containers by default. These containers must always be ready to receive prediction requests, and thus are consuming system resources (CPU, RAM, Disk). The model size and complexity also contribute to the amount of system resources that are consumed.
The replication factor setting determines the number of Docker tasks & containers to deploy to support a given model. The default value is two, which means each model will have two containers running to service incoming prediction requests. This provides redundancy in the case of failure and allows for two prediction requests to be handled concurrently. For a development environment, this number could be reduced to 1. For high demand production environments, the number of replicas can be increased to handle the workload with fast response times.
The default replication factor is 2.
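To make the effect of the replication factor concrete, here is a minimal sketch of the arithmetic: every deployed model runs one container per replica, so the container count grows with both the number of models and the replication factor. The function name total_model_containers is our own and is not part of Promote.

```python
def total_model_containers(num_models, replication_factor=2):
    """Number of model containers kept running for the given settings.

    The default of 2 is the default replication factor described above.
    """
    if num_models < 0:
        raise ValueError("num_models must not be negative")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return num_models * replication_factor


# 20 models at the default replication factor.
default_count = total_model_containers(20)
# The same 20 models in a development environment with 1 replica each.
dev_count = total_model_containers(20, replication_factor=1)
```

With the default setting, 20 models keep 40 containers running; dropping the replication factor to 1 halves that.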
Promote models are deployed as Docker containers which are running instances of Docker images. These containers utilize memory, so the size of the model (the amount of memory it consumes) directly affects the number of models that can be deployed onto a machine. The goal is to maximize the number of models that can be deployed on a machine while still performing well and not exhausting the memory.
A useful command for understanding the memory size of models deployed in a Promote environment is below.
# docker ps -q | xargs docker stats --no-stream
CONTAINER ID   NAME   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O   BLOCK I/O   PIDS
0ca3a80f0f43   joeuser-irene-2.1.wb3brhtaz6zxe1l40a5gtmcvp   0.00%   102.9MiB / 31.37GiB   0.32%   123MB / 115MB   9.1MB / 120kB   29
71813bc3a453   sallyuser-CampaignRStudioNew-1.1.zkpsdf6cu9as4n19zhmxlzqe8   6.88%   228.3MiB / 31.37GiB   0.71%   45MB / 40.8MB   0B / 76.8kB   33
In this example, Sally's model is using about 228 MiB while Joe's is only using about 103 MiB. This amount will fluctuate a bit as prediction requests come and go, but overall this gives a good idea of the model's memory requirements.
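If you capture this output regularly, a small script can pull out the memory column for you. The sketch below parses the default docker stats layout shown in this article and reports memory per container in MiB. The function names are our own, and the parsing assumes whitespace-separated columns with the memory usage in the fourth field, which holds for the sample here but would need adjusting if a custom --format is used.

```python
import re

UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}
SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)(B|KiB|MiB|GiB)$")


def mem_to_mib(size):
    """Convert a docker stats size such as '102.9MiB' to MiB."""
    match = SIZE_RE.match(size)
    if match is None:
        raise ValueError(f"unrecognized size: {size!r}")
    number, unit = match.groups()
    return float(number) * UNIT_BYTES[unit] / UNIT_BYTES["MiB"]


def memory_by_container(stats_output):
    """Map container name to memory usage in MiB."""
    usage = {}
    for line in stats_output.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "CONTAINER":
            continue  # skip blank lines and the header row
        # Fields: id, name, cpu %, mem usage, "/", mem limit, ...
        usage[fields[1]] = mem_to_mib(fields[3])
    return usage


# Sample rows copied from the output in this article.
SAMPLE = """
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
0ca3a80f0f43 joeuser-irene-2.1.wb3brhtaz6zxe1l40a5gtmcvp 0.00% 102.9MiB / 31.37GiB 0.32% 123MB / 115MB 9.1MB / 120kB 29
71813bc3a453 sallyuser-CampaignRStudioNew-1.1.zkpsdf6cu9as4n19zhmxlzqe8 6.88% 228.3MiB / 31.37GiB 0.71% 45MB / 40.8MB 0B / 76.8kB 33
"""

usage = memory_by_container(SAMPLE)
total_mib = sum(usage.values())
```

Summing the values gives the memory currently held by model containers on that machine, which can be compared against the machine's RAM when deciding how many more models it can host.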
Frequency of Prediction Requests
Prediction requests to models require CPU time to execute. The more frequently prediction requests come in, the higher the resulting CPU utilization will be.
Complexity of Models
Not all models are created equal. A simple logistic regression doesn't require a large amount of resources to derive a prediction and typically responds in a few milliseconds. However, a much more complicated model such as Gradient Boosting could crunch away for several seconds consuming CPU cycles before responding.
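Request frequency and model complexity combine into a simple back-of-the-envelope estimate: the average number of cores kept busy is roughly the request rate multiplied by the CPU time each prediction needs. The sketch below applies that rule of thumb. The function name and the example numbers are illustrative assumptions, not measurements, and the estimate ignores overhead from Promote's own services.

```python
def cores_busy(requests_per_second, cpu_seconds_per_request):
    """Average number of cores kept busy by prediction requests.

    Assumes each prediction is CPU bound and uses a single core
    for the whole of cpu_seconds_per_request.
    """
    if requests_per_second < 0 or cpu_seconds_per_request < 0:
        raise ValueError("inputs must not be negative")
    return requests_per_second * cpu_seconds_per_request


# A simple model answering in 5 ms, called 50 times per second.
simple_model = cores_busy(50, 0.005)
# A complex model that computes for 3 seconds, called twice per second.
complex_model = cores_busy(2, 3.0)
```

Under these assumed numbers the simple model occupies about a quarter of one core, while the complex model needs about six cores even though it is called far less often.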
Prediction Logging stores information for every prediction request, which can be viewed in the Model Overview -> History tab. This includes the date and time of the request, the inputs to the model prediction, and the predicted output from the model. This data is stored for up to 14 days, with a maximum storage size of 50GB.
Prediction logging can be set for Dev & Staging models, or for Production models.
Prediction Logging can be useful for testing and validation of models in a Development / Staging environment, or for auditing information in a Production environment. There is, however, a performance cost for enabling Prediction Logging, as all the prediction information is logged and sent to Elasticsearch to be indexed for quick retrieval later. The overhead of this setting will vary based on many of the factors mentioned above, but as an example we've seen CPU utilization double when enabling Prediction Logging in high volume environments.
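The two Prediction Logging limits (14 days and 50GB) interact with request volume. The sketch below estimates how many days of history fit before the size cap is reached. The average size of a logged request is a hypothetical input you would need to measure yourself, the calculation treats 1GB as 10^9 bytes, and it assumes the oldest entries are dropped once either limit is hit, which this article does not confirm.

```python
MAX_RETENTION_DAYS = 14        # retention limit stated in this article
MAX_STORAGE_BYTES = 50 * 10**9  # 50GB cap, assuming 1GB = 10^9 bytes


def effective_retention_days(requests_per_day, bytes_per_logged_request):
    """Estimate how many days of prediction history remain viewable.

    bytes_per_logged_request is a hypothetical average covering the
    timestamp, inputs, and output stored for one request.
    """
    daily_bytes = requests_per_day * bytes_per_logged_request
    if daily_bytes <= 0:
        return float(MAX_RETENTION_DAYS)
    days_until_full = MAX_STORAGE_BYTES / daily_bytes
    return min(float(MAX_RETENTION_DAYS), days_until_full)


# 1 million requests per day at an assumed 2,000 bytes each.
moderate_volume = effective_retention_days(1_000_000, 2_000)
# 5 million requests per day at the same assumed size.
high_volume = effective_retention_days(5_000_000, 2_000)
```

With these assumed sizes, the moderate workload keeps the full 14 days, while the high volume workload would reach the storage cap after about 5 days.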
Every organization's use cases, models, workloads, and data sizes are different. The below recommendations are suggested starting points. These recommendations are based on:
Default Replication Factor = 2
Small to Medium model sizes. (100MB - 200MB memory requirements)
Simple model complexity. (predictions only take a few ms)
Infrequent Prediction Requests.
Prediction Logging ON
We've categorized the environments as Small, Medium, or Large based on the total number of models (which includes Dev, Staging, and Production).
Environment | Total Number of Models | Minimum Recommended Cores / RAM (per machine)
Small | 0 - 19 | 4 cores / 16 GB
Medium | 20 - 39 | 8 cores / 32 GB, 6 cores / 24 GB, or 4 cores / 16 GB
Large | | 12 cores / 48 GB, 9 cores / 36 GB, or 6 cores / 24 GB
You'll notice a 4GB per core ratio. In most cases, this works well. If model sizes are much larger than what is shown here then we recommend increasing that ratio, perhaps to 8GB per core. You'll also notice for the Medium and Large environments, there are options for 3 large machines or multiple smaller machines. Let's dive into the reasons to consider each.
Scaling Vertically (adding cores) vs Horizontally (adding machines)
Scaling Promote vertically by increasing the number of processing cores on each machine will provide your predictive models with additional CPU power when processing incoming requests without additional machines to manage.
Scaling Promote horizontally by adding additional machines allows you more redundancy and protection in the event a machine does fail, and also allows you to configure a higher replication factor for the models. This will provide greater concurrency to support models with high volume prediction request rates.
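To compare the two approaches side by side, the sketch below describes a cluster layout by its machine count and cores per machine, then derives total cores, RAM per machine at the 4GB per core ratio, and the share of capacity lost if one machine fails. The Layout class is our own illustration. The 3 machine layout matches the "3 large machines" option mentioned earlier, while the 6 machine count is a hypothetical example rather than a documented recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    machines: int
    cores_per_machine: int
    gb_per_core: int = 4  # ratio suggested in this article

    @property
    def total_cores(self):
        return self.machines * self.cores_per_machine

    @property
    def ram_per_machine_gb(self):
        return self.cores_per_machine * self.gb_per_core

    @property
    def capacity_lost_on_one_failure(self):
        """Fraction of total cores lost when a single machine fails."""
        return 1 / self.machines


# Vertical option: 3 machines with 8 cores each.
vertical = Layout(machines=3, cores_per_machine=8)
# Horizontal option: 6 machines with 4 cores each (hypothetical count).
horizontal = Layout(machines=6, cores_per_machine=4)
```

Both layouts provide 24 cores in total, but losing one machine removes a third of the capacity in the vertical layout and only a sixth in the horizontal one. The trade-off is twice as many machines to manage.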
This article has shown some of the key factors to consider when designing a new Alteryx Promote environment. Understanding these factors will help design an environment that can support fast response times to user prediction requests and scale as workload demands increase. Details should always be reviewed with your Alteryx representative to understand how these recommendations might be customized to best fit your use case and requirements.
An Overview of Promote's Architecture
Promote System Requirements