Alteryx Server Discussions

SOLVED

Gallery suddenly stops working and user can't access it until Alteryx service is restarted

varundcs
7 - Meteor

Hello All,

 

I'm facing an issue with Alteryx Gallery. My users report that they can't access the Gallery. The portal goes down several times a day and I have no clue what might be going wrong. I first noticed the error in early June 2020, but for the last three weeks it has happened almost every day. On some days the portal goes down only twice; on others, more than that.

 

We confirmed there are no issues with CPU, memory or disk space, and that the server meets the Alteryx recommendations.

Alteryx Server version: 2019.2.10.64688

CPU: 3.00 GHz (2 processors), 8 cores

Memory: 128 GB

Disk space available: 705 GB

 

 

joining.jpg

 

Diagnosis so far: whenever the Gallery portal goes down, I see one of the messages below in the browser.

 

  • {"data":null,"exceptionName":"AlteryxServiceException","innerExceptionMessage":"","message":"Error receiving data: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.\u000d\u000a (10060)"}
  •   An unknown error occurred during authentication with your Windows credentials
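
For anyone triaging the first message above, here is a minimal sketch of how the JSON error body could be parsed to pull out the trailing Winsock code. The function name `summarize_gallery_error` and the small lookup table are illustrative assumptions, not part of Alteryx; the only fact relied on is that 10060 is the Winsock timeout code (WSAETIMEDOUT), which points at a network-level timeout rather than an application error.

```python
import json
import re

# A few Winsock codes that commonly appear in connection failures.
# 10060 (WSAETIMEDOUT) is the one reported in this thread.
WINSOCK_CODES = {
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection attempt timed out)",
    10061: "WSAECONNREFUSED (connection refused)",
}

def summarize_gallery_error(payload: str) -> dict:
    """Parse the JSON error body and extract the trailing '(NNNNN)' code, if any."""
    body = json.loads(payload)
    message = body.get("message", "")
    match = re.search(r"\((\d+)\)\s*$", message)
    code = int(match.group(1)) if match else None
    return {
        "exception": body.get("exceptionName"),
        "winsock_code": code,
        "meaning": WINSOCK_CODES.get(code, "unknown"),
    }

# The error body exactly as shown in the browser (raw strings keep \u000d\u000a as JSON escapes).
sample = (
    r'{"data":null,"exceptionName":"AlteryxServiceException","innerExceptionMessage":"",'
    r'"message":"Error receiving data: A connection attempt failed because the connected '
    r'party did not properly respond after a period of time, or established connection '
    r'failed because connected host has failed to respond.\u000d\u000a (10060)"}'
)
print(summarize_gallery_error(sample))
```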
 

I looked at the Service log and the Gallery logs and found the entries below, but I'm not able to decode them 😞 The masked parts of the error log are the IP addresses of the user invoking the app request and of the Alteryx server. There are also socket timeout errors in the logs.

 

ErrorLog.jpg

 

Whenever users complain that the portal is down, I open services.msc to check the status of AlteryxService.exe. My initial assumption was that if the portal goes down, one of the processes below must be down, but that is not the case. All Alteryx-related services seem to be running OK.

  • AlteryxService.exe
  • AlteryxCloudCmd.exe 
  • mongo.exe
  • alteryxService_WebInterface.exe
  • MongoController.exe

So I simply restart AlteryxService.exe and the Gallery portal becomes accessible again. Since this happens several times every day, I have created a batch script and scheduled it to restart AlteryxService.exe. For now I simply restart it a few times a day irrespective of the AlteryxService.exe status. However, I am looking for a permanent fix.
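
As a stopgap, the unconditional restart described above could be narrowed to restart only when the Gallery actually stops answering. The following is a rough sketch under several assumptions: `GALLERY_URL` is a hypothetical placeholder, the Windows service name is assumed to be `AlteryxService`, and treating any HTTP response below 500 as "up" is a guess, since the thread does not say what status a hung Gallery returns.

```python
import subprocess
import urllib.error
import urllib.request

GALLERY_URL = "http://localhost/gallery/"  # hypothetical; use your own Gallery base address
SERVICE_NAME = "AlteryxService"            # assumed Windows service name

def gallery_is_up(url: str = GALLERY_URL, timeout: float = 10.0) -> bool:
    """Return True if the Gallery answers an HTTP request within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # The server responded (for example 401 under Windows auth), so the web layer is alive.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Timeout, refused connection, DNS failure and similar.
        return False

def restart_service(name: str = SERVICE_NAME) -> None:
    """Stop and start the Windows service; requires an elevated prompt."""
    subprocess.run(["net", "stop", name], check=False)  # may already be stopped
    subprocess.run(["net", "start", name], check=True)

def check_and_restart(probe=gallery_is_up, restart=restart_service) -> bool:
    """Restart only when the probe fails. Returns True if a restart was triggered."""
    if probe():
        return False
    restart()
    return True
```

Calling `check_and_restart()` from a scheduled task every few minutes would avoid restarting a healthy service, but it only hides the symptom and does not address the underlying cause.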

 

I also ran telnet against my Alteryx server and port; it returned a blank screen.

When the Gallery portal goes down, it is also unreachable from the Alteryx Server machine itself.
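
The telnet check above can be made repeatable with a short script, which helps when comparing results from a client machine and from the server itself. This is only a sketch: `port_is_open` is an illustrative helper, and the host and port in the usage comment are placeholders. A blank telnet screen usually means the TCP connection was accepted, so a `True` result here would suggest the listener is up even while the Gallery is not responding at the application level.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False

# Example usage (placeholder host and port, not taken from the thread):
# print(port_is_open("my-alteryx-server", 80))
```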

 

 

We also raised a support ticket with Alteryx Support a week ago and are still awaiting a response. I am hopeful that someone in the Alteryx community has encountered this and can help us avoid it.

 

 

Thanks in advance for your time and support.

 

Varun

6 REPLIES
joshuaburkhow
ACE Emeritus

Looks to me like something is going on with your MongoDB connection. How do you have it set up? Is it a single-node installation, or do you have Mongo on a different server (or servers)?

Joshua Burkhow - Alteryx Ace | Global Alteryx Architect @PwC | Blogger @ AlterTricks
varundcs
7 - Meteor

Hi There,

It's a single-node installation, with both Alteryx Server and MongoDB on the same machine. As you can see, MongoDB is not user-managed.

varundcs_0-1594404234738.png

Thanks,

Varun

 

raychase
11 - Bolide

Sounds like a problem authenticating users with AD. Are all users on the same domain as the VM? Are your domain controllers stable?

varundcs
7 - Meteor

Yes, the users and the machine are on the same domain. Users log in with AD authentication. We have used the same authentication mechanism for the last few years without facing this issue, so I think the domain controller should be stable. Did you mean anything else by "domain controller"?

varundcs
7 - Meteor

Hi All,

 

I am still unsuccessful in finding the root cause of the Gallery going down frequently.

I have a few questions related to AlteryxService. Please help with these.

 

  1. I am thinking of installing AlteryxGalleryCmd.exe as a service on the single node where AlteryxService.exe is already installed. Has anyone done this before? Is it a good idea?
  2. What is the default web server used to host the Alteryx Gallery, and where can I find its settings/configuration?
  3. What size should mongod.lock be when AlteryxService is stopped, and when it is running?
  4. When I ran AlteryxService in test mode I noted the message "lock obtain timed out: MongoDb.Lucene.Mongo Document Lock Retry in 1 second". Is this a concern, or is it normal to receive this message? What can we infer from it?

     varundcs_0-1594839934820.png

  5. Does anyone think the Gallery services can become unreachable because too many job requests are submitted simultaneously? Could this happen due to an increase in either the jobs submitted via the Gallery or the number of concurrent users?

 

Please advise. Thanks in advance.

 

--

Thanks,

Varun

 

varundcs
7 - Meteor

Hi All,

 

I wanted to share the resolution for this thread. The root cause turned out to be network scanner software installed on the Alteryx server. The scanner software (Nmap, Npcap) was interfering with network connectivity to the server, which caused the Alteryx service and Gallery to fail at random. After uninstalling the network scanner software, the Gallery connection issue was fully resolved, and the server has been stable since then. We also have not seen the Mongo and network connectivity error messages in the server event log since the uninstall.

 

Thanks to all for your valuable inputs.

 

Regards,

Varun