Gallery suddenly stops working and user can't access it until Alteryx service is restarted
Hello All,
I'm facing an issue with Alteryx Gallery. My users complain that they can't access the Gallery. The Gallery portal goes down frequently every day, and I don't have a clue what might be going wrong. I first noticed this error in early June 2020, but for the last 3 weeks it has been happening consistently, almost every day. On some days the portal goes down only twice; on other days it goes down more than twice.
We have confirmed there are no issues with CPU, memory, or disk space, and that Alteryx's recommendations are met.
Alteryx Server version: 2019.2.10.64688
CPU: 3.00 GHz (2 processors), 8 cores
Memory: 128 GB
Disk space available: 705 GB
Diagnosis so far: whenever the Gallery portal goes down, I see one of the messages below in the browser:
- {"data":null,"exceptionName":"AlteryxServiceException","innerExceptionMessage":"","message":"Error receiving data: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.\u000d\u000a (10060)"}
- An unknown error occurred during authentication with your Windows credentials
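For anyone trying to decode the first message: the number in parentheses at the end is a Windows Sockets error code, and 10060 is WSAETIMEDOUT, meaning a TCP connection attempt timed out. The following is only a minimal sketch of pulling that code out of the JSON body; the function name `describe_gallery_error` and the small lookup table are illustrative and not part of any Alteryx tooling.

```python
import json
import re

# A small, non-exhaustive subset of Winsock error codes.
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED (connection aborted by local software)",
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection attempt timed out)",
    10061: "WSAECONNREFUSED (connection actively refused)",
}


def describe_gallery_error(payload: str) -> str:
    """Summarise a Gallery JSON error body, naming the Winsock code if present."""
    body = json.loads(payload)
    message = body.get("message", "")
    # The code appears in parentheses at the very end of the message text.
    match = re.search(r"\((\d+)\)\s*$", message)
    if not match:
        return f"{body.get('exceptionName')}: no Winsock code found"
    code = int(match.group(1))
    name = WINSOCK_ERRORS.get(code, "unknown code")
    return f"{body.get('exceptionName')}: {code} = {name}"


# The error body quoted in this post (raw strings keep the \u000d\u000a escapes).
SAMPLE = (
    r'{"data":null,"exceptionName":"AlteryxServiceException","innerExceptionMessage":"",'
    r'"message":"Error receiving data: A connection attempt failed because the connected party '
    r'did not properly respond after a period of time, or established connection failed because '
    r'connected host has failed to respond.\u000d\u000a (10060)"}'
)

print(describe_gallery_error(SAMPLE))
```

A timeout (as opposed to a refusal) suggests the request left the caller but nothing answered in time, which points at connectivity between components rather than at a stopped process.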
I looked at the Service log and the Gallery logs, and below is what I found, but I'm not able to decode it 😞 The masked parts of the error log are the IPs of the user invoking the app request and of the Alteryx server. There are also socket timeout errors in the logs.
Whenever users complain that the portal is down, I go to services.msc to check the status of AlteryxService.exe. My initial assumption was that if the portal is down, one of the services below must be down, but that is not the case. All Alteryx-related services seem to be running OK.
- AlteryxService.exe
- AlteryxCloudCmd.exe
- mongo.exe
- alteryxService_WebInterface.exe
- MongoController.exe
So I simply restart AlteryxService.exe, and the Gallery portal comes back up and is accessible. Since this happens multiple times every day, I have created a batch script and scheduled it to restart AlteryxService.exe to avoid the situation above. For now I am simply restarting it a few times a day, irrespective of the AlteryxService.exe status. However, I am looking for a permanent fix.
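As a stopgap, a blind scheduled restart could be replaced by a watchdog that restarts only after several consecutive failed health checks. This is a rough sketch, not a tested production script. Assumptions: the Windows service is registered under the name `AlteryxService`, the script runs with administrator rights, and an HTTP status of 500 or above means the Gallery is down (what your Gallery actually returns may differ).

```python
import subprocess
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def gallery_responds(url: str, timeout: float = 10.0) -> bool:
    """Return True if the Gallery URL answers with a non-5xx HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as exc:
        # Assumption: 4xx (e.g. an auth challenge) means the web layer is alive,
        # while 5xx is treated as "down". Adjust to what your Gallery returns.
        return exc.code < 500
    except OSError:
        # Covers timeouts, refused connections and URLError.
        return False


def restart_alteryx_service() -> None:
    """Stop and start the service (assumed service name: AlteryxService)."""
    subprocess.run(["net", "stop", "AlteryxService"], check=False)
    subprocess.run(["net", "start", "AlteryxService"], check=True)


def run_watchdog(
    probe: Callable[[], bool],
    restart: Callable[[], None],
    max_failures: int = 3,
    checks: Optional[int] = None,
    interval: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call restart() after max_failures consecutive failed probes.

    Runs `checks` probes (forever if None) and returns the number of restarts.
    """
    failures = 0
    restarts = 0
    done = 0
    while checks is None or done < checks:
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        done += 1
        if checks is None or done < checks:
            sleep(interval)
    return restarts
```

It could be started with something like `run_watchdog(lambda: gallery_responds("http://your-server/gallery/"), restart_alteryx_service)`, where the URL is a placeholder for your own Gallery address. Requiring consecutive failures avoids restarting on a single slow response, and the timestamps of the restarts give a record of when the outages actually occur.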
I also ran telnet against my Alteryx server and port, and it returned blank.
When the Gallery portal goes down, I find it is also unreachable from the Alteryx Server machine itself.
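The telnet test can be scripted so it can be run from both a client machine and the server itself while the portal is down. With the Windows telnet client, a blank screen normally means the TCP connection was accepted, so it helps to classify the outcome explicitly. A minimal sketch; the host and port are placeholders, since the thread does not state which port was tested.

```python
import socket


def check_tcp_port(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt a TCP connection and classify the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer within the timeout: the same condition as Winsock 10060.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"
```

For example, `check_tcp_port("localhost", 80)` run on the server (port 80 is only a guess here) shows whether the listener is reachable locally. A "timeout" on the local machine, with all services still running, would suggest something on the host is interfering with network traffic.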
We also raised a support ticket with Alteryx Support a week ago and are still awaiting their response. I am hopeful someone in the Alteryx community has encountered this and can advise how we can avoid it.
Thanks in advance for your time and support.
Varun
Looks to me like something is going on with your MongoDB connection. How do you have it set up? Is it a single-node installation, or do you have Mongo on a different server (or servers)?
Hi There,
It's a single-node installation, with both Alteryx Server and MongoDB on the same machine. As you can see, MongoDB is not user-managed.
Thanks,
Varun
Sounds like a problem authenticating users with AD. Are all users on the same domain as the VM? Are your domain controllers stable?
Yes, the users and the machine are on the same domain. Users log in with AD authentication. We have used the same authentication mechanism for the last few years and never faced this issue, so I think the domain controller should be stable. Did you mean anything else by "domain controller"?
Hi All,
I am still unsuccessful in finding the root cause of the Gallery going down frequently.
I have a few questions related to AlteryxService. Please help with these.
- I am thinking of installing AlteryxGalleryCmd.exe as a service on the single node where AlteryxService.exe is already installed. Has anyone done this before? Is it a good idea?
- What is the default web server used to host Alteryx Gallery, and where can I find its settings/configuration?
- What should the size of mongod.lock be when AlteryxService is stopped, and when the service is running?
- When I ran AlteryxService in test mode, I noted "lock obtain timed out: MongoDb.Lucene.Mongo Document Lock Retry in 1 second". Is this a concern, or is it normal to receive this message? What can we infer from it?
- Does anyone think the Gallery services can become unreachable because too many job requests are submitted simultaneously? Could this happen due to an increase in either the jobs placed via the Gallery or the number of concurrent users? Any thoughts?
Please advise. Thanks in advance.
--
Thanks,
Varun
Hi All,
I wanted to share the resolution for this thread. The root cause turned out to be network scanner software installed on the Alteryx server. The scanner software (Nmap, Npcap) was interfering with network connectivity to the server, which caused the Alteryx service and Gallery to fail at random. After we uninstalled the network scanner software, the Gallery connection issue was fully resolved, and the server has been stable since then. We also have not seen the Mongo and network connectivity error messages in the server event log since the uninstall.
Thanks, all, for your valuable input.
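For anyone who wants to check their own server for the same software, here is a rough sketch that scans the list of installed programs for the two names mentioned in this thread. Assumptions: the programs register a `DisplayName` under the standard Windows Uninstall registry keys, and a simple substring match on "nmap" or "npcap" is good enough. The function names are illustrative.

```python
from typing import Iterable, List

# Name fragments to look for; taken from the software named in this thread.
SCANNER_KEYWORDS = ("nmap", "npcap")


def find_scanner_software(installed_names: Iterable[str]) -> List[str]:
    """Return the installed program names that contain a scanner keyword."""
    return [
        name
        for name in installed_names
        if any(keyword in name.lower() for keyword in SCANNER_KEYWORDS)
    ]


def installed_program_names() -> List[str]:
    """Windows only: collect DisplayName values from the Uninstall registry keys."""
    import winreg  # imported here so the module still loads on non-Windows hosts

    roots = (
        r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall",
        r"SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersion\Uninstall",
    )
    names: List[str] = []
    for root in roots:
        try:
            key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, root)
        except OSError:
            continue  # this registry view does not exist on the machine
        with key:
            for index in range(winreg.QueryInfoKey(key)[0]):
                try:
                    with winreg.OpenKey(key, winreg.EnumKey(key, index)) as sub:
                        names.append(winreg.QueryValueEx(sub, "DisplayName")[0])
                except OSError:
                    continue  # entry without a DisplayName value
    return names
```

On the server, `find_scanner_software(installed_program_names())` returns any matching entries. An empty list does not prove that no packet-capture driver is present, so treat it as a first check only.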
Regards,
Varun
