My name is Austin from the server support team here at Alteryx, and I want to thank you for your input on our Server product. I want to clarify a few points you have brought up and share our recommendations for this issue. We consider this to be more of a configuration issue than a resource exhaustion issue: we don't expect CPU to be unavailable, or at least not for an extended period of time, and we have a few performance guidelines that will put you in the best position for resource allocation.
Specifically, we have a 5-minute timeout period for Gallery. If our service receives no ping responses from the Gallery node for 5 minutes, it asks for a restart. At that point we send a shutdown request to Gallery. When an HTTP 200 response is received, affirming that Gallery will shut down, we wait for the Gallery process to exit and then request that a new Gallery process be spawned. If we see a response code other than 200 to the shutdown request, which we will send 3 times, we terminate the Gallery process and do not spawn another one, because we do not believe Gallery will spawn properly again, and user intervention is required.
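To make the sequence above easier to follow, here is a minimal sketch of the decision logic in Python. It is only an illustration of the steps as described in this post, not the actual service code: the function name `recover_gallery`, the `send_shutdown` callback, and the returned labels are all hypothetical, and it assumes the 3 shutdown requests are sent one after another until one of them returns 200.

```python
from typing import Callable

# Values taken from the description above.
PING_TIMEOUT_SECONDS = 5 * 60   # no ping responses for 5 minutes triggers a restart
MAX_SHUTDOWN_ATTEMPTS = 3       # the shutdown request is sent up to 3 times


def recover_gallery(seconds_since_last_ping: float,
                    send_shutdown: Callable[[], int]) -> str:
    """Return the action the service would take (hypothetical labels).

    send_shutdown is a stand-in for the real shutdown request; it must
    return the HTTP status code of the response.
    """
    if seconds_since_last_ping < PING_TIMEOUT_SECONDS:
        return "healthy"  # pings are still arriving, nothing to do

    for _ in range(MAX_SHUTDOWN_ATTEMPTS):
        if send_shutdown() == 200:
            # Gallery agreed to shut down: wait for it to exit,
            # then request a new Gallery process.
            return "respawn"

    # No 200 after all attempts: terminate the process and do not
    # spawn another one; user intervention is required.
    return "terminate"
```

For example, `recover_gallery(400, lambda: 500)` follows the last branch: the process is terminated and Gallery stays down until someone restarts it manually.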
We specifically do not anticipate that the full CPU will be used; we expect at least some overhead to be left for all of the processes running on a single-node Server environment. When all the resources are used, not only are our services fighting for resources, but the OS is also fighting for resources while trying to hand them out to all of the processes that are requesting them. If you are configuring your server to use all resources for running workflows and not leaving any overhead for other processes, this is what is causing your issue. We can handle a high CPU load as long as we are able to get those resources back within a few minutes. If the server is capped at 100% CPU usage, it is highly unlikely that the Controller/Gallery will receive the resources necessary to continue running effectively.
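As a rough illustration of what "leaving overhead" can mean in practice, the sketch below caps the number of simultaneous workflows so that some cores stay free for the OS, Controller, and Gallery. The function name and both default values (`reserved_cores=2`, `cores_per_workflow=2`) are assumptions made up for this example, not Alteryx guidance; the right numbers depend on your configuration, which is what Support would review with you.

```python
import os


def max_simultaneous_workflows(logical_cores: int,
                               reserved_cores: int = 2,
                               cores_per_workflow: int = 2) -> int:
    """Hypothetical sizing helper: keep reserved_cores free for other processes.

    Both defaults are illustrative assumptions, not official recommendations.
    """
    usable = max(logical_cores - reserved_cores, 0)
    # Always allow at least one workflow, even on a very small machine.
    return max(usable // cores_per_workflow, 1)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(f"{cores} logical cores -> "
          f"{max_simultaneous_workflows(cores)} simultaneous workflow(s)")
```

The point of the sketch is only the shape of the calculation: subtract a reserve first, then divide what is left among workflows, rather than dividing the whole machine.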
To put it another way: when you max out all of the resources on your operating system, less important processes are given lower priority than those needed to keep the operating system running. After those come the higher-priority processes, then the medium-priority and low-priority ones. The same is true for our processes. The Alteryx Gallery is more of a front-end GUI for operating Alteryx, so the Controller and the Engine commands take precedence over Gallery, as Gallery is not critical to the functioning of Alteryx. However, the information that Alteryx Gallery accesses is persisted in the MongoDB database and is updated properly even while the Alteryx Gallery is down.
We can almost completely avoid this issue by changing the configuration of your server slightly, with minimal impact on workflow execution times. If you would, please open a new case with Alteryx Support so we can look through your configuration and make recommendations to ensure that our processes have adequate resources available and to give you the most uptime in your server environment.