Alteryx Promote Knowledge Base

Definitive answers from Promote experts.
Introduction

To ensure that prediction requests are responsive and that an Alteryx Promote instance can scale with user and model demands, the size of the supporting infrastructure should be carefully considered. This article reviews the main factors that influence system performance and provides guidance on sizing an initial deployment. Always review the details with your Alteryx representative to understand how these recommendations can be customized to best fit your use case and requirements.

For helpful background information, we recommend reading the article An Overview of Promote's Architecture.

Minimum Requirements

Promote has a minimum system requirement of 3 machines, each with 4 cores and 16 GB of RAM. This configuration is suitable for most development environments. For a production environment, however, several factors should be understood to ensure the environment is sized properly. The following sections introduce these factors and how they impact resource consumption.

Number of Models

Each model deployed to Promote creates a new Docker service with two containers by default. These containers must always be ready to receive prediction requests, so they consume system resources (CPU, RAM, disk) even when idle. The size and complexity of each model also contribute to the amount of system resources consumed.

Replication Factor

The replication factor setting determines the number of Docker tasks and containers deployed to support a given model. The default value is 2, which means each model has two containers running to service incoming prediction requests. This provides redundancy in the case of failure and allows two prediction requests to be handled concurrently. For a development environment, this number can be reduced to 1. For high-demand production environments, the number of replicas can be increased to handle the workload with fast response times.
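To make the interplay of model count and replication factor concrete, here is a minimal back-of-envelope sketch. The model count and per-container memory figure are hypothetical examples, not measurements from a real instance:

```python
# Rough capacity sketch: every model runs `replication_factor` containers,
# and each container holds the model in memory at all times.
def estimate_footprint(num_models, replication_factor=2, mem_per_container_mb=200):
    containers = num_models * replication_factor
    total_mem_mb = containers * mem_per_container_mb
    return containers, total_mem_mb

# 20 models at the default replication factor of 2 means 40 always-on
# containers, consuming roughly 8 GB of RAM before any requests arrive.
containers, mem_mb = estimate_footprint(20)
print(containers, mem_mb)  # 40 8000
```

This is only an estimate of steady-state memory; CPU load depends on request frequency and model complexity, discussed below.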
The default replication factor is 2.

Model Size

Promote models are deployed as Docker containers, which are running instances of Docker images. These containers consume memory, so the size of a model (the amount of memory it uses) directly affects the number of models that can be deployed on a machine. The goal is to maximize the number of models per machine while still performing well and not exhausting memory.

A useful command for understanding the memory footprint of the models deployed in a Promote environment:

# docker ps -q | xargs docker stats --no-stream
CONTAINER ID  NAME                                                        CPU %  MEM USAGE / LIMIT    MEM %  NET I/O        BLOCK I/O      PIDS
0ca3a80f0f43  joeuser-irene-2.1.wb3brhtaz6zxe1l40a5gtmcvp                 0.00%  102.9MiB / 31.37GiB  0.32%  123MB / 115MB  9.1MB / 120kB  29
71813bc3a453  sallyuser-CampaignRStudioNew-1.1.zkpsdf6cu9as4n19zhmxlzqe8  6.88%  228.3MiB / 31.37GiB  0.71%  45MB / 40.8MB  0B / 76.8kB    33

In this example, Sally's model is using 228 MB while Joe's is using only 102 MB. These numbers fluctuate a bit as prediction requests come and go, but overall they give a good idea of a model's memory requirements.

Frequency of Prediction Requests

Prediction requests to models require CPU time to execute. The more frequently prediction requests arrive, the higher the resulting CPU utilization will be.

Complexity of Models

Not all models are created equal. A simple logistic regression doesn't require many resources to derive a prediction and typically responds in a few milliseconds. A much more complicated model, such as gradient boosting, can crunch away for several seconds, consuming CPU cycles before responding.

Prediction Logging

Prediction Logging stores information for every prediction request, which can be viewed in the Model Overview -> History tab. This includes the date and time of the request, the inputs to the model prediction, and the predicted output from the model.
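If you want to total the memory used across all model containers, the docker stats output shown above can be post-processed. A small sketch, assuming memory usage is reported in MiB as in the sample output (containers reporting in GiB would need extra unit handling):

```python
# Sum the "MEM USAGE" column from `docker stats --no-stream` output.
# Assumes each container's usage is reported in MiB, as in the sample above.
def total_mem_mib(stats_output):
    total = 0.0
    for line in stats_output.strip().splitlines()[1:]:  # skip the header row
        mem_usage = line.split()[3]                     # e.g. "102.9MiB"
        total += float(mem_usage.rstrip("MiB"))
    return round(total, 1)

sample = """CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM %
0ca3a80f0f43 joeuser-irene-2.1.wb3brhtaz6zxe1l40a5gtmcvp 0.00% 102.9MiB / 31.37GiB 0.32%
71813bc3a453 sallyuser-CampaignRStudioNew-1.1.zkpsdf6cu9as4n19zhmxlzqe8 6.88% 228.3MiB / 31.37GiB 0.71%"""
print(total_mem_mib(sample))  # 331.2
```

Dividing a machine's available RAM by this kind of total gives a rough sense of how many more models of similar size it can host.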
This data is stored for up to 14 days, with a maximum storage size of 50 GB. Prediction logging can be enabled for Dev & Staging models, or for Production models.

Prediction Logging can be useful for testing and validating models in a Development/Staging environment, or for auditing in a Production environment. There is, however, a performance cost to enabling prediction logging: all of the prediction information is logged and sent to Elasticsearch to be indexed for quick retrieval later. The overhead of this setting varies with many of the factors mentioned above, but as an example, we've seen CPU utilization double when enabling Prediction Logging in high-volume environments.

Recommendations

Every organization's use cases, models, workloads, and data sizes are different. The recommendations below are suggested starting points, based on:

- Default replication factor = 2
- Small to medium model sizes (100 MB - 200 MB memory requirements)
- Simple model complexity (predictions take only a few ms)
- Infrequent prediction requests
- Prediction Logging ON

We've categorized environments as Small, Medium, or Large based on the total number of models (including Dev, Staging, and Production):

       | Total Number of Models | Minimum Recommended Cores | Minimum Recommended RAM | 3-Machine Config | 4-Machine Config | 6-Machine Config
Small  | 0 - 19                 | 12                        | 48 GB                   | 4 cores / 16 GB  | N/A              | N/A
Medium | 20 - 39                | 24                        | 96 GB                   | 8 cores / 32 GB  | 6 cores / 24 GB  | 4 cores / 16 GB
Large  | 40+                    | 36                        | 144 GB                  | 12 cores / 48 GB | 9 cores / 36 GB  | 6 cores / 24 GB

You'll notice a 4 GB per core ratio. In most cases, this works well. If model sizes are much larger than those shown here, we recommend increasing that ratio, perhaps to 8 GB per core. You'll also notice that for the Medium and Large environments, there are options for 3 large machines or a greater number of smaller machines. Let's dive into the reasons to consider each.
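The sizing recommendations above amount to a simple lookup on the total model count. A sketch of that logic, using the 4 GB-per-core ratio the recommendations follow (a starting point only, not a substitute for reviewing your workload with your Alteryx representative):

```python
# Map a total model count to the suggested minimum sizing tier, following
# the 4 GB-per-core ratio used in the recommendations table.
def recommended_sizing(total_models):
    if total_models < 20:
        tier, cores = "Small", 12
    elif total_models < 40:
        tier, cores = "Medium", 24
    else:
        tier, cores = "Large", 36
    return tier, cores, cores * 4  # tier, minimum cores, minimum RAM in GB

print(recommended_sizing(25))  # ('Medium', 24, 96)
```

For larger models, the last line would use a higher ratio (for example, cores * 8 for an 8 GB-per-core environment).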
Scaling Vertically (Adding Cores) vs. Horizontally (Adding Machines)

Scaling Promote vertically, by increasing the number of processing cores on each machine, gives your predictive models additional CPU power for processing incoming requests without additional machines to manage.

Scaling Promote horizontally, by adding machines, gives you more redundancy and protection in the event a machine fails, and also allows you to configure a higher replication factor for your models. This provides greater concurrency to support models with high-volume prediction request rates.

Conclusion

This article has covered some of the key factors to consider when designing a new Alteryx Promote environment. Understanding these factors will help you design an environment that supports fast response times to user prediction requests and scales as workload demands increase. Always review the details with your Alteryx representative to understand how these recommendations can be customized to best fit your use case and requirements.

Other Resources

An Overview of Promote's Architecture
Promote System Requirements
View full article
Platform Product: Promote Issues – Working with Alteryx Customer Support Engineers (CSEs) (for use by CSEs and Alteryx Customers)

To EXPEDITE the resolution of your case, please include the information below.

Promote - Requested Information

Suggestion: copy/paste the questions below and email the supporting documentation to support@alteryx.com

1. Detailed description of the issue
2. Alteryx version

Promote – Requested Information (Detailed Instructions):

1. Detailed Description of the Issue – What issues are you having? Has it worked in the past? When did the issue start? Are all users affected or just some? What are the steps to reproduce your issue? What have you tried to resolve the issue? Have you searched the Alteryx Community?

2. Screenshot of Alteryx Version – Our CSEs need to know the precise version of Alteryx so we can replicate any issues. In Designer, on your desktop or Server, click Help >> About and provide a screenshot. The screenshot will show whether it is Server or Designer, and whether it is "Running Elevated" (Admin vs. Non-Admin).

Suggested links:

Promote Part 1
Promote Part 2
Promote Part 3
View full article
Often, when deploying a model to Promote, the model requires certain dependencies to run. These dependencies can be certain functions, files, system packages, etc. If your model requires them, you'll need to create a promote.sh file, which contains the commands that install these dependencies. This is one of the factors needed to ensure your model is set up for success on Promote, because sometimes a model needs a little help.

At https://github.com/alteryx/promote-python you can find the article-summarizer example, which contains one of these promote.sh files. If you open the file, you'll see this command:

python -c "import nltk; nltk.download('punkt')"

This is required because the newspaper package used by the model (main.py) requires an NLP dataset. When we deploy the model, the promote.sh file runs during deployment, which ensures the dependencies live inside the model environment (the Docker model image). We can now properly test the model in Promote!

If we look at an R example (there is one on the Promote GitHub), the folder structure is the same, except the promote.sh file will look something like this:

curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev
apt-get -y install r-cran-rodbc
apt-get -y install libiodbc2-dev

In this case, our model requires an ODBC driver, so our model container also needs it in order to run on Promote. Just as in the Python example above, when we deploy this model, the promote.sh file runs and the proper driver is installed, enabling us to work with and test this model on Promote!

Once you have these set up, you'll be good to venture on and make your model the best it can be!
View full article
How To: Backup the Promote PostgreSQL Database

Promote uses a PostgreSQL database. This article explains how to create a backup of a Promote instance's database. Backing up your Promote database can save you time and headaches in the event of a system failure.

Prerequisites

Alteryx Promote ≥ 2018.1

Procedure

The PostgreSQL database must be backed up from the Master node, not just the Leader node (note: these can be the same node).

1. To return a list of node IDs, run the following command:

docker node ls --format='{{.ID}}'

2. To check whether a node is the Master node, run the following command for each node ID from the list returned above, replacing {node_id} with the node ID:

docker node inspect {node_id} --format='{{.Spec.Labels.master}}'

Once you have determined the Master node (the node for which "yes" is the returned value), run the following commands from that node to start the PostgreSQL backup process.

3. On the host machine, open a shell inside the database container:

docker exec -it $(docker ps | grep promote-db | awk '{print$1}') bash

4. Run the backup script:

sh /scripts/backups.sh

5. Change to the directory of the database backup:

cd /var/backups/postgres

6. Copy the backup file from the database container to the host machine (the node hosting the promote-db Docker container). This command needs to be run outside of the database container, changing /location/path to the destination path on the host machine:

docker cp $(docker ps | grep promote-db | awk '{print$1}'):/var/backups/postgres/{backup db name} /location/path

A backup of your database should now be saved in your specified directory. You deserve a coffee!
View full article
How To: Restore a Promote PostgreSQL Database

This article outlines the process of restoring the Promote PostgreSQL database from a backup. For instructions on creating a backup, please see How To: Create a Promote PostgreSQL Database Backup.

Use these steps only if there is no data in your database. Only run these commands if:

- Your PostgreSQL database is corrupted.
- Your database is in a new state.

Prerequisites

Alteryx Promote ≥ 2018.2.1

Procedure

The PostgreSQL database must be restored from the Master node, not just the Leader node (note: these can be the same node).

1. To return a list of node IDs, run the following command:

docker node ls --format='{{.ID}}'

2. To check whether a node is the Master node, run the following command for each node ID from the list returned above, replacing {node_id} with the node ID:

docker node inspect {node_id} --format='{{.Spec.Labels.master}}'

Once you have determined the Master node (the node for which "yes" is the returned value), run the following commands from that node to start the PostgreSQL restoration process. Ensure the backed-up PostgreSQL database file is on your host machine.

3. Retrieve the database password:

cat /var/promote/credentials/db.txt

4. Copy the PostgreSQL backup from the Master node into the promote-db container on the same node, where /location/path is the location of the backed-up database:

docker cp /location/path $(docker ps | grep promote-db | awk '{print$1}'):/var/backups/postgres

5. On the host machine, open a shell inside the database container:

docker exec -it $(docker ps | grep promote-db | awk '{print$1}') bash

6. Restore the database from within the promote-db container:

pg_restore -c -U ${POSTGRES_USER} -d ${POSTGRES_DB} -v "/var/promote/postgres/{database name}"

You should now be able to log in to the UI and see your predictive models rebuild and come online. If not, please open a support ticket in the Case Portal.
View full article
This article outlines the step-by-step procedure for enabling a set of backup nodes as a production cluster. It does not describe how to capture the backup clones. Typically, these clones should come from a snapshot; the specific process for capturing them will depend on your deployment's infrastructure.
View full article
Issue

Although the Admin page indicates that there are models deployed to the Promote instance, the models do not appear on the Promote UI home page.

Environment

Alteryx Promote ≥ 2018.2.1

Cause

There is an issue with your web browser cache, causing an incomplete view of the Promote UI home page to load.

Solution

Clear your browser's cache for the Promote UI site address, following the steps provided for your browser here. If clearing the cache does not resolve the issue, please open a support ticket using the Case Portal.
View full article
Troubleshooting steps for missing model logs.
View full article
Keras is an open-source neural network API library, written in Python and designed to run on top of TensorFlow, CNTK, or Theano. In this article, we demonstrate how to deploy a Keras model to Promote.
View full article
Models deployed to Promote can be queried in a couple of different ways, one of which is a standard REST API POST request. Querying a model consists of sending the predictor variables to the model, allowing the model to process the data and make a prediction. The model then returns the score based on the predictor variables entered.
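As an illustration, such a request can be assembled in Python. The host name, endpoint path, and basic-auth scheme below are assumptions for illustration only; confirm the exact request format against your own Promote instance's documentation:

```python
import json

# Build the URL and JSON body for a prediction request to a deployed model.
# NOTE: the "/{username}/models/{model_name}/predict" path and the example
# host "promote.example.com" are hypothetical, shown here only to sketch the
# general shape of a REST prediction call.
def build_prediction_request(base_url, username, model_name, record):
    url = f"{base_url.rstrip('/')}/{username}/models/{model_name}/predict"
    payload = json.dumps(record)  # predictor variables, serialized as JSON
    return url, payload

url, payload = build_prediction_request(
    "https://promote.example.com", "joeuser", "irene", {"sepal_length": 5.1}
)
# To actually send it (requires the `requests` package and a reachable instance):
#   import requests
#   resp = requests.post(url, data=payload, auth=(username, apikey),
#                        headers={"Content-Type": "application/json"})
#   score = resp.json()
```

The response body carries the model's score for the predictor variables that were sent.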
View full article
Did you know that you can use Promote to query a database (or include a database query in your Promote model)? 
View full article
An overview of user management in Promote. 
View full article
This article provides an overview of the administrative options in Promote (excluding user management, which can be found here).

Basic

Models

Within the Admin Dashboard, an Admin user can view a list of all the models deployed to any environment by any user by clicking the Models tab.

System Overview

Within the Admin Dashboard, an Admin user can monitor system health metrics for each node in the Promote cluster by clicking the System Overview tab.

Advanced

Within the Admin Dashboard, an Admin user can adjust several settings that affect the performance and behavior of the system by clicking the Advanced tab.

Base Image

An Admin user can change the base image used to deploy both R and Python models. An Admin user may do this after creating a new image that has custom R or Python libraries available on it, or to use a different version of R or Python.

Disk Bundle Limit

An Admin user can change the disk bundle limit to protect the system against running out of disk space. The disk bundle limit caps the number of versions of a model that are stored on disk.

Prediction Logging

Promote can store logs for every prediction request for up to 14 days, with a maximum of 50 GB. You can toggle this logging on and off for Development/Staging and Production in this section.

We hope this gives you a good foundation for administering your Promote instance. Good luck, we're all counting on you.
View full article
There are two tools in Alteryx Designer that connect to Promote: the Deploy tool and the Score tool. The Deploy tool allows you to send trained models from Alteryx Designer to Promote. The Score tool allows you to connect to a model that has already been deployed to Promote to create predictions from a provided data set.

The first step is to have a model object from a trained model. You can use any of the standard Predictive Tools to train a model (including the R tool), as long as it is not from the revoscaler package, which is not currently supported by Promote.

In this example, let's say we are interested in training a random forest model to predict forest type (classification) based on remotely sensed spectral data. The study area covers Japan, and the predictor variables include values for visible to near-infrared wavelengths, derived from ASTER satellite imagery.

After performing some data investigation and pre-processing (this dataset is already very clean), we can create, refine, and ultimately select our model.

Once we have a model we are happy with, we can send it to Promote using the Deploy tool. Start by adding a Deploy tool to the canvas and connecting it to the O anchor of your selected model.

If you haven't already, connect your Alteryx Designer instance to Promote. To begin the process of adding a Promote connection, click the Add Connection button in the Configuration window of the Deploy tool.

After clicking the Add Connection button, a modal window will pop up on your screen. Type your Promote instance's URL in the first screen and click Next. Then add your Username and API key. For your API key, you may need to log in to your Promote instance and navigate to the Account page.

Once you have your username and API key correctly entered in the modal window, click Connect. If all your information checks out, you will see a success message.
After clicking Finish, the connection will appear in your Alteryx Promote Connection drop-down menu. You will also see a new option to Remove Connection.

To deploy a model, give it a name in the Model Name setting and run your workflow. If this is a new or updated version of a model that already exists on Promote, give it the same name as the currently deployed version and check the Overwrite existing model option.

After running the workflow, if the model deploys successfully, you will see a message from the Deploy tool in your Results window that says "your model is building, check the UI for build logs".

To check the build logs, navigate back to the Promote UI in your web browser, click on your model, and then click on the Logs tab. You will see the messages from the model-building process. If all is well, the log ends with a "model built successfully" message.

Your model now lives on Promote!
View full article
One of the most important features of Promote is its ability to return near-real-time predictions from deployed models. Here is a list of frequently asked questions relating to Promote prediction requests.
View full article
Promote is data science model hosting and management software that allows its users to seamlessly deploy their data science models as highly available microservices that return near-real-time predictions by leveraging REST APIs. In this article, we provide an overview of Promote’s technical requirements and architecture.
View full article
If you’ve never heard of Docker, or aren’t particularly familiar with it, you are probably wondering “what’s the deal with Docker?”
View full article
If you have a model that takes longer than 10 seconds to return results, Promote will time out your model API query by default. If you would like Promote to wait longer than 10 seconds before timing out, you can adjust this timeout with an environment variable called PREDICTION_TIMEOUT.
View full article
When making a call to a Promote model, the input data used to make a prediction is sent in a JSON format. When working with an R model, prior to reaching the model.predict() function, the JSON string that was sent to your model is converted to an R format (either an R dataframe or an R list). By default, this conversion is performed with the fromJSON() function in the jsonlite R package.
View full article
Some Promote customers have run into an issue where the status of a predictive model flickers continuously between online and offline on the Promote UI page. This article discusses the cause of the issue and how to resolve it.
View full article