Alteryx Server Backup & Recovery Part 2: Procedures

Alteryx

This is the second article in a series on Alteryx Server backup and recovery. You can find Part 1 at:

 

Alteryx Server Backup & Recovery Part 1: Best Practices

 

As long as a backup of the Mongo database is available, you can get Alteryx Server back up and running. Luckily, backing up the embedded MongoDB is pretty simple, and can be done with a few console commands. I would recommend creating a batch file or script to perform the process. Doing so will allow you to schedule the backup using Windows Task Scheduler. The actual steps to perform a MongoDB backup are covered in detail in the online help under the server configuration section or at this direct link. I will also outline the steps below for completeness.

 

To create a backup of the MongoDB:

 

  1. Stop AlteryxService.
  2. Execute the following command to save a backup of the database in the specified folder:

 

alteryxservice emongodump=<path to backup location>
  3. Restart AlteryxService.
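
As a rough illustration, the stop, dump, and restart sequence above could also be driven from Python. This is only a sketch under assumptions: the function names `backup_commands` and `run_backup` are hypothetical, the install path is an assumed default, and the `try`/`finally` is an addition so that the service is asked to restart even when the dump fails.

```python
import subprocess
from typing import Callable, List

# Assumed default install path; adjust for your server.
ALTERYX_SERVICE = r"C:\Program Files\Alteryx\bin\AlteryxService.exe"


def backup_commands(backup_dir: str, exe: str = ALTERYX_SERVICE) -> List[List[str]]:
    """Return the stop, dump, and start commands in the order they must run."""
    return [
        [exe, "stop"],
        [exe, f"emongodump={backup_dir}"],
        [exe, "start"],
    ]


def run_backup(backup_dir: str, exe: str = ALTERYX_SERVICE,
               run: Callable[..., object] = subprocess.run) -> None:
    """Stop the service, dump MongoDB, and always attempt a restart."""
    stop, dump, start = backup_commands(backup_dir, exe)
    run(stop, check=True)
    try:
        run(dump, check=True)
    finally:
        # Runs even if the dump raised, so the server is not left offline.
        run(start, check=True)
```

On the server you would call something like `run_backup(r"Z:\Path\MongoBackup")` from an elevated prompt. The `run` parameter exists only so the sequence can be exercised without touching a real service.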

 

You can easily script this to a batch file with a few simple console commands. Keep in mind that paths may vary on your server, but it should look something like this.

 

Example:

 

 

"C:\Program Files\Alteryx\bin\AlteryxService.exe" stop
"C:\Program Files\Alteryx\bin\AlteryxService.exe" emongodump=Z:\Path\MongoBackup
"C:\Program Files\Alteryx\bin\AlteryxService.exe" start

 

 

You can extend the basic script with additional features, such as logging and date/time stamps. As an example, the batch script included at the end of this article adds the following: logging with date/time stamps, a backup folder that is also date/time stamped, automated archiving of the backup, copying the archive to a network location, and cleanup of the temp files.
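
To make the date/time stamping concrete, here is a small Python sketch of the same idea. It assumes the WMIC `LocalDateTime` layout `yyyymmddHHMMSS.ffffff` followed by a sign and a three-digit offset in minutes. The helper names `parse_wmic_datetime` and `backup_names` are hypothetical, and the `ServerBackup_` and `BackupLog` prefixes simply mirror the naming used by the sample script.

```python
from typing import Tuple


def parse_wmic_datetime(dts: str) -> Tuple[str, str]:
    """Split a WMIC LocalDateTime value into a file-name stamp and a UTC label.

    Assumed layout: yyyymmddHHMMSS.ffffff, then a sign and a three-digit
    offset in minutes, for example 20190104153045.123456-300.
    """
    stamp = f"{dts[0:8]}_{dts[8:14]}"        # date and time parts joined by "_"
    sign = dts[21]                            # "+" or "-"
    hours, minutes = divmod(int(dts[22:25]), 60)
    label = f"UTC{sign}{hours}"
    if minutes:
        label += f":{minutes:02d}"            # keep half-hour offsets visible
    return stamp, label


def backup_names(dts: str) -> Tuple[str, str]:
    """Hypothetical helper: backup folder and log file names for one run."""
    stamp, _ = parse_wmic_datetime(dts)
    return f"ServerBackup_{stamp}", f"BackupLog{stamp}.log"
```

Parsing the offset digits as a plain decimal number matters here, because offsets such as `+060` carry a leading zero.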

 

Once you have a batch file or other script to perform your backups, test it to ensure it works properly. Once testing is done, the next step is to schedule the backup. The easiest way to do this is with Windows Task Scheduler. To create a scheduled task on Windows Server 2012, follow these steps:

 

Create a scheduled task:

 

  1. Open Task Scheduler and click “Create Task”.

  2. On the General tab, enter a “Name” and “Description”, select “Run whether user is logged on or not”, and select “Run with highest privileges”.

  3. On the Triggers tab, click “New”.

  4. In the dialog box that appears, define the schedule (daily, weekly, etc.) on which you want the backup to run and click “OK”.

  5. On the Actions tab, click “New”.

  6. In the dialog window, make sure “Start a program” is selected and click “Browse”. Select the batch file you created and click “Open”. Then click “OK”.

  7. Click “OK” on the Create Task window to finalize the creation of the backup task.

 

Now that you have implemented backup procedures and scheduled a task to automate the backups, it is time to discuss database restoration from a backup. The good news is that restoring the database is just as simple as backing it up. Assuming that 1) the server is functioning, 2) Alteryx Server is installed, and 3) you have a valid backup available, you can follow the steps outlined below.

 

To restore a backup of the MongoDB:

 

  1. Stop AlteryxService
  2. Execute the following command to restore the backup:

 

alteryxservice emongorestore=<path to backup location>,<path to restore to>

 

  3. Restart AlteryxService

 

Because restoration relies on the same simple command-line statements, recovery can also be scripted. However, since recovery actions are much less frequent, scripting it probably isn't necessary. Instead, you would just connect to the server, open a command prompt and, following our backup example above, execute the following commands:

 

Example:

 

 

"C:\Program Files\Alteryx\bin\AlteryxService.exe" stop
"C:\Program Files\Alteryx\bin\AlteryxService.exe" emongorestore=Z:\Path\MongoBackup,C:\ProgramData\Alteryx\Service\Persistence\MongoDB
"C:\Program Files\Alteryx\bin\AlteryxService.exe" start

 

 

For Alteryx Server, we also recommend backing up the controller token and some settings files. While the server can be recovered without these files, having a backup of them can expedite the recovery process and ensures you will be able to decrypt any sensitive data in the database. The settings files we recommend backing up are:

 

C:\ProgramData\Alteryx\RuntimeSettings.xml

C:\ProgramData\Alteryx\Engine\SystemAlias.xml

C:\ProgramData\Alteryx\Engine\SystemConnections.xml

 

Again, please keep in mind the exact paths may vary depending on the server configuration and where the backup is located. This example also assumes the backup isn't compressed/archived. If you are using a backup script that archives the backup and copies it to network storage, you will need to copy the backup file to the server and decompress the archive before running the recovery commands above.
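
For an archived backup, the extra extract step could be sketched in Python as follows. Everything here is illustrative: `restore_commands` is a hypothetical helper, the executable paths are assumed defaults, and the sketch assumes the archive contains a single top-level folder named after the archive with the database dump in a `Mongo` subfolder (the layout the sample script produces). `7z x <archive> -o<dir>` is the 7-Zip syntax for extracting with full paths into a target directory.

```python
from pathlib import PureWindowsPath
from typing import List

ALTERYX_SERVICE = r"C:\Program Files\Alteryx\bin\AlteryxService.exe"  # assumed default
SEVEN_ZIP = r"C:\Program Files\7-Zip\7z.exe"                          # assumed default
MONGO_DIR = r"C:\ProgramData\Alteryx\Service\Persistence\MongoDB"     # as in the restore example


def restore_commands(archive: str, work_dir: str,
                     mongo_dir: str = MONGO_DIR) -> List[List[str]]:
    """Build the extract, stop, restore, and start commands for an archived backup."""
    # Assumption: the archive holds one folder named like the archive itself,
    # and the MongoDB dump sits in its "Mongo" subfolder.
    dump_dir = PureWindowsPath(work_dir) / PureWindowsPath(archive).stem / "Mongo"
    return [
        [SEVEN_ZIP, "x", archive, f"-o{work_dir}"],
        [ALTERYX_SERVICE, "stop"],
        [ALTERYX_SERVICE, f"emongorestore={dump_dir},{mongo_dir}"],
        [ALTERYX_SERVICE, "start"],
    ]
```

The function only builds the command lists, so you can print and review them before running anything against a production server.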

 

 

Below is the code for my sample batch script:

 

::-----------------------------------------------------------------------------
::
:: AlteryxServer Backup Script v.2.0.2 - 01/04/19
:: Created By: Kevin Powney
::
:: Service start and stop checks adapted from example code by Eric Falsken
::
::-----------------------------------------------------------------------------

@echo off

::-----------------------------------------------------------------------------
:: Set variables for Log, Temp, Network, and Application Paths
::
:: Please update these values as appropriate for your environment. Note
:: that spaces should be avoided in the LogDir, TempDir, and NetworkDir paths.
:: The trailing slash is also required for these paths.
::-----------------------------------------------------------------------------

SET LogDir=C:\ProgramData\Alteryx\BackupLog\
SET TempDir=C:\Temp\
SET NetworkDir=\\ServerName\SharePath\
SET AlteryxService="C:\Program Files\Alteryx\bin\AlteryxService.exe"
SET ZipUtil="C:\Program Files\7-Zip\7z.exe"

:: Set the maximum time to wait for the service to start or stop in whole seconds. Default value is 2 hours.
SET MaxServiceWait=7200

::-----------------------------------------------------------------------------
:: Set Date/Time to a usable format and create log
::-----------------------------------------------------------------------------

FOR /f %%a IN ('WMIC OS GET LocalDateTime ^| FIND "."') DO SET DTS=%%a
SET DateTime=%DTS:~0,4%%DTS:~4,2%%DTS:~6,2%_%DTS:~8,2%%DTS:~10,2%%DTS:~12,2%
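:: The offset is a sign plus three digits of minutes (e.g. -300, +060). Prefix the
:: digits with 1 and subtract 1000 so SET /A does not read a leading zero as octal.
SET /a tztemp=(1%DTS:~22,3%-1000)/60
SET tzone=UTC%DTS:~21,1%%tztemp%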

echo %date% %time% %tzone%: Starting backup process... > %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Stop Alteryx Service
::-----------------------------------------------------------------------------

echo %date% %time% %tzone%: Stopping Alteryx Service... >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

SET COUNT=0

:StopInitState
SC query AlteryxService | FIND "STATE" | FIND "RUNNING" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 0 IF NOT errorlevel 1 GOTO StopService
SC query AlteryxService | FIND "STATE" | FIND "STOPPED" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 0 IF NOT errorlevel 1 GOTO StopedService
SC query AlteryxService | FIND "STATE" | FIND "PAUSED" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 0 IF NOT errorlevel 1 GOTO SystemError
echo %date% %time% %tzone%: Service State is changing, waiting for service to resolve its state before making changes >> %LogDir%BackupLog%datetime%.log
SC query AlteryxService | Find "STATE"
timeout /t 1 /nobreak >NUL
SET /A COUNT=%COUNT%+1
IF "%COUNT%" == "%MaxServiceWait%" GOTO SystemError
GOTO StopInitState

:StopService
SET COUNT=0
SC stop AlteryxService >> %LogDir%BackupLog%datetime%.log
GOTO StoppingService

:StopServiceDelay
echo %date% %time% %tzone%: Waiting for AlteryxService to stop >> %LogDir%BackupLog%datetime%.log
timeout /t 1 /nobreak >NUL
SET /A COUNT=%COUNT%+1
IF "%COUNT%" == "%MaxServiceWait%" GOTO SystemError

:StoppingService
SC query AlteryxService | FIND "STATE" | FIND "STOPPED" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 1 GOTO StopServiceDelay

:StopedService
echo %date% %time% %tzone%: AlteryxService is stopped >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Backup MongoDB to local temp directory.
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Starting MongoDB Backup... >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

%AlteryxService% emongodump=%TempDir%ServerBackup_%datetime%\Mongo >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Backup Config files to local temp directory.
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Backing up settings, connections, and aliases... >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

copy %ProgramData%\Alteryx\RuntimeSettings.xml %TempDir%ServerBackup_%datetime%\RuntimeSettings.xml >> %LogDir%BackupLog%datetime%.log
copy %ProgramData%\Alteryx\Engine\SystemAlias.xml %TempDir%ServerBackup_%datetime%\SystemAlias.xml
copy %ProgramData%\Alteryx\Engine\SystemConnections.xml %TempDir%ServerBackup_%datetime%\SystemConnections.xml
%AlteryxService% getserversecret > %TempDir%ServerBackup_%datetime%\ControllerToken.txt

::-----------------------------------------------------------------------------
:: Restart Alteryx Service
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Restarting Alteryx Service... >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

SET COUNT=0

:StartInitState
SC query AlteryxService | FIND "STATE" | FIND "STOPPED" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 0 IF NOT errorlevel 1 GOTO StartService
SC query AlteryxService | FIND "STATE" | FIND "RUNNING" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 0 IF NOT errorlevel 1 GOTO StartedService
SC query AlteryxService | FIND "STATE" | FIND "PAUSED" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 0 IF NOT errorlevel 1 GOTO SystemError
echo %date% %time% %tzone%: Service State is changing, waiting for service to resolve its state before making changes >> %LogDir%BackupLog%datetime%.log
SC query AlteryxService | Find "STATE"
timeout /t 1 /nobreak >NUL
SET /A COUNT=%COUNT%+1
IF "%COUNT%" == "%MaxServiceWait%" GOTO SystemError
GOTO StartInitState

:StartService
SET COUNT=0
SC start AlteryxService >> %LogDir%BackupLog%datetime%.log
GOTO StartingService

:StartServiceDelay
echo %date% %time% %tzone%: Waiting for AlteryxService to start >> %LogDir%BackupLog%datetime%.log
timeout /t 1 /nobreak >NUL
SET /A COUNT=%COUNT%+1
IF "%COUNT%" == "%MaxServiceWait%" GOTO SystemError

:StartingService
SC query AlteryxService | FIND "STATE" | FIND "RUNNING" >> %LogDir%BackupLog%datetime%.log
IF errorlevel 1 GOTO StartServiceDelay

:StartedService
echo %date% %time% %tzone%: AlteryxService is started >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: This section compresses the backup to a single zip archive
::
:: Please note the command below requires 7-Zip to be installed on the server.
:: You can download 7-Zip from http://www.7-zip.org/ or change the command to
:: use the zip utility of your choice as defined in the variable above.
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Archiving backup... >> %LogDir%BackupLog%datetime%.log

%ZipUtil% a %TempDir%ServerBackup_%datetime%.7z %TempDir%ServerBackup_%datetime% >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Move zip archive to network storage location and cleanup local files
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Moving archive to network storage >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

copy %TempDir%ServerBackup_%datetime%.7z %NetworkDir%ServerBackup_%datetime%.7z >> %LogDir%BackupLog%datetime%.log

del %TempDir%ServerBackup_%datetime%.7z >> %LogDir%BackupLog%datetime%.log
rmdir /S /Q %TempDir%ServerBackup_%datetime% >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Done
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Backup process completed >> %LogDir%BackupLog%datetime%.log
GOTO :EOF

:SystemError
echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time% %tzone%: Error starting or stopping service. Service is not accessible, is offline, or did not respond to the start or stop request within the designated time frame. >> %LogDir%BackupLog%datetime%.log
Comments
Bolide

@KevinP,

 

First and foremost, thanks so much for the comprehensive write-up!

 

I've been able to run an edited version of your batch script and schedule it successfully.  It's a big relief to have some backups running and I'd encourage all server admins to implement this process - we've learned the hard way that it's imperative to have the ability to restore promptly.

 

Two Questions:  

 

  1. Any idea why I can't find the SystemAlias.xml and/or SystemConnections.xml files on our server?  I was able to locate and copy the RuntimeSettings.xml file (location shown in batch script below)
  2. I noticed that the log reads - during the mongodump - that DBName=AlteryxService.  Does the mongodump also make a copy of the AlteryxGallery database and its collections?

I thought I'd share the batch script that worked for us in case others may find it helpful.  I had to make some very slight adjustments, mainly just adding quotes around my copy statements due to spaces in the file paths causing syntax errors.  I also created a %NewDir% variable for the 7-zip copy destination.

 

Batch Script:

 

::-----------------------------------------------------------------------------
::
:: AlteryxServer Backup Script v1.0 - 5/25/2016
:: Created By: Kevin Powney
:: Edited: 2/15/2017 by Taylor Cox
::
::-----------------------------------------------------------------------------

@echo off

::-----------------------------------------------------------------------------
:: Set variables for Log and Temp directories
::-----------------------------------------------------------------------------

SET LogDir=E:\ProgramData\Alteryx\BackupLogs\
SET TempDir=C:\AlteryxBackupResources\Temp\
SET NewDir=I:\MarketingAnalytics\Alteryx\ServerBackups\

::-----------------------------------------------------------------------------
:: Set Date/Time to a usable format and create log
::-----------------------------------------------------------------------------

FOR /f %%a IN ('WMIC OS GET LocalDateTime ^| FIND "."') DO SET DTS=%%a
SET DateTime=%DTS:~0,4%%DTS:~4,2%%DTS:~6,2%_%DTS:~8,2%%DTS:~10,2%%DTS:~12,2%


echo %date% %time%: Starting backup process > %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Stop Alteryx Service
::-----------------------------------------------------------------------------

echo %date% %time%: Stopping Alteryx Service >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

NET STOP AlteryxService >> %LogDir%BackupLog%datetime%.log

::-----------------------------------------------------------------------------
:: Backup MongoDB to local temp directory.
::-----------------------------------------------------------------------------

echo %date% %time%: Starting MongoDB Backup >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

"E:\Program Files\Alteryx\bin\AlteryxService.exe" emongodump=%TempDir%ServerBackup_%datetime%\Mongo >> %LogDir%BackupLog%datetime%.log

:: pause

::-----------------------------------------------------------------------------
:: Backup settings files to local temp directory.
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time%: Backing up settings, connections, and aliases >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

copy "E:\Program Files\Alteryx\bin\RuntimeData\RuntimeSettings.xml" "%TempDir%ServerBackup_%datetime%\RuntimeSettings.xml" >> %LogDir%BackupLog%datetime%.log

:: pause

::-----------------------------------------------------------------------------
:: Restart Alteryx Service
::-----------------------------------------------------------------------------

echo %date% %time%: Restarting Alteryx Service >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

NET START AlteryxService >> %LogDir%BackupLog%datetime%.log

:: pause

::-----------------------------------------------------------------------------
:: This section compresses the backup to a single zip archive
::
:: Please note the command below requires 7-Zip to be installed on the server.
:: You can download 7-Zip from http://www.7-zip.org/ or change the command to
:: use the zip utility of your choice.
::-----------------------------------------------------------------------------

echo %date% %time%: Archiving backup >> %LogDir%BackupLog%datetime%.log

"c:\Program Files\7-Zip\7z.exe" a %TempDir%ServerBackup_%datetime%.7z %TempDir%ServerBackup_%datetime% >> %LogDir%BackupLog%datetime%.log

:: pause

::-----------------------------------------------------------------------------
:: Move zip archive to network storage location and cleanup local files
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time%: Moving archive to network storage >> %LogDir%BackupLog%datetime%.log
echo. >> %LogDir%BackupLog%datetime%.log

:: Be sure to update the UNC path for the network location to copy the file to.
copy "%TempDir%ServerBackup_%datetime%.7z" "%NewDir%" >> %LogDir%BackupLog%datetime%.log

:: pause

del %TempDir%ServerBackup_%datetime%.7z >> %LogDir%BackupLog%datetime%.log
rmdir /S /Q %TempDir%ServerBackup_%datetime% >> %LogDir%BackupLog%datetime%.log

:: pause

::-----------------------------------------------------------------------------
:: Done
::-----------------------------------------------------------------------------

echo. >> %LogDir%BackupLog%datetime%.log
echo %date% %time%: Backup process completed >> %LogDir%BackupLog%datetime%.log

 

Thanks again,

 

Taylor

 

Alteryx

@Coxta45

 

To answer your questions:

 

The SystemAlias.xml and SystemConnections.xml files may not be present in all environments. These files store database connection aliases and connection information for standard and In-DB connections. If your server doesn't have any saved database aliases or In-DB connections, these files will not be present. As an additional note, the RuntimeSettings.xml file you want to back up should always be in C:\ProgramData\Alteryx (%ALLUSERSPROFILE%\Alteryx). The version you found on your E drive is likely the default file included with the application installation. You don't need to back up that copy, and you shouldn't make edits to it. All changes from the default configuration should be stored in the copy found in ProgramData.

 

If you are using the emongodump option of the AlteryxService to perform the backup, the backup data will include the following items:

 

  • admin database
  • AlteryxGallery database
  • AlteryxGallery_Lucene database
  • AlteryxService database
  • ASCredentials.bin file
  • ASMongoDBVersion.bin file
  • mongocontroller.log file
  • mongoDump.log file

This list includes all items needed to restore the database. This is why we recommend performing the backup via our service when possible.
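
As a quick sanity check after a backup, the list above could be turned into a small Python test. This is a sketch only: `missing_backup_items` is a hypothetical helper, and it assumes each database appears as an entry with the same name directly under the backup folder, which may not match the exact dump layout on your version.

```python
from pathlib import Path
from typing import List

# Items from the list above. Treating each database as an entry of the same
# name directly under the backup folder is an assumption about the layout.
EXPECTED_ITEMS = [
    "admin",
    "AlteryxGallery",
    "AlteryxGallery_Lucene",
    "AlteryxService",
    "ASCredentials.bin",
    "ASMongoDBVersion.bin",
    "mongocontroller.log",
    "mongoDump.log",
]


def missing_backup_items(backup_dir: str) -> List[str]:
    """Return the expected items that are not present in backup_dir."""
    present = {entry.name for entry in Path(backup_dir).iterdir()}
    return [name for name in EXPECTED_ITEMS if name not in present]
```

An empty result means every expected item was found. It does not prove the dump itself is restorable, so periodic test restores are still worthwhile.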

 

In regard to the batch script, I am glad you found it useful, and thank you for sharing your edits with the community.

Bolide

@KevinP - Once again, thanks!  

 

While making the change to copy the correct RuntimeSettings.xml file, I was also able to locate our SystemAlias.xml and SystemConnections.xml files (and begin copying them in the batch script).

 

Location references:

copy "C:\ProgramData\Alteryx\Engine\SystemAlias.xml" "%TempDir%ServerBackup_%datetime%\SystemAlias.xml" >> %LogDir%BackupLog%datetime%.log
copy "C:\ProgramData\Alteryx\Engine\SystemConnections.xml" "%TempDir%ServerBackup_%datetime%\SystemConnections.xml" >> %LogDir%BackupLog%datetime%.log
Alteryx Certified Partner

Would you happen to have a powershell script? 

 

I see the Pause commands in here and wonder if that then requires operator involvement in this operation or what happens?  Ideally, this is auto-magic.

 

I also didn't see the copying of the RuntimeSettings, SystemAlias, and SystemConnections XML files in this procedure.  Should it be included as a best practice?

 

Are there any other updates to include?

 

Cheers,

Mark

Alteryx

@MarqueeCrew Sorry, I don't have a powershell version of this script. However, my original script also doesn't have any 'pause' functionality and should work without any user intervention as long as any needed tweaks are made so the script fits your environment. The modified version of my script @Coxta45 posted does have lines with 'pause' listed but these are comments and wouldn't have any impact on the code. I assume he was using them to debug the script for his particular use case and just commented them out once he finished.

 

In regard to backing up the RuntimeSettings, SystemAlias, and SystemConnections XML files, my script does back up all of these on lines 53, 54, and 55. The code does use a hard-coded path though, and it would probably be better to use the environment variable instead, just in case the ProgramData location has been changed. Something like the snippet below would be technically more accurate.

 

copy %ProgramData%\Alteryx\RuntimeSettings.xml %TempDir%ServerBackup_%datetime%\RuntimeSettings.xml
copy %ProgramData%\Alteryx\Engine\SystemAlias.xml %TempDir%ServerBackup_%datetime%\SystemAlias.xml
copy %ProgramData%\Alteryx\Engine\SystemConnections.xml %TempDir%ServerBackup_%datetime%\SystemConnections.xml

 

You could also take this same concept and set variables for all of the hard-coded paths in the script, such as the paths for the service, network locations, and 7-Zip. Then you could just update the variables with the correct paths for your environment. Maybe I will do this if I have some spare time in the near future and post an updated version.

Alteryx Partner
I wanted to create a manual backup of MongoDB, but somehow it didn't work. Can you advise me on this thread: https://community.alteryx.com/t5/Alteryx-Server-Discussions/MongoDb-backup/m-p/198728#M2106 Thanks, Rahul

Has anyone had any issues with this? We were running it for several months; however, for the last month or so we kept having an issue with the Alteryx Service not restarting after the backup. The error in the log files was "no suitable servers found", which sounds like it's having an issue with the MongoDB? I've had to disable the backup for now, but would love to find out what we did wrong (or if anyone else has come across the same issue). I have to thank Alteryx Support (thanks Peter!) for helping me figure out why our service was stopping in the first place!

 

Meteor

Working with a colleague, we have converted the batch file listed within this thread into a PowerShell script.  It's completely possible to perform the conversion.

We are running tests against each of our environments and of  course with each run we come up with new ideas to add to it.

@DerangedVisions - what changes have you made to your environment that presented your error?  Is this error in conjunction with your backup script?  Manual backups work for us as well.  In regard to your service not restarting: is the mongod.lock file in your Persistence path 0 KB when the service is stopped?  If it's 1 KB or higher, you need to move it out of the way and recreate the empty lock file so it's 0 KB in size, then restart your service.
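
The lock-file check described here could be sketched in Python roughly as follows. This reflects a community workaround rather than documented behavior: the helper names `lock_needs_reset` and `reset_lock` are hypothetical, the `.bak` suffix is an arbitrary choice, and it should only ever be run while AlteryxService is stopped.

```python
from pathlib import Path


def lock_needs_reset(lock_path: str) -> bool:
    """True when the lock file exists and is not empty (larger than 0 bytes)."""
    lock = Path(lock_path)
    return lock.exists() and lock.stat().st_size > 0


def reset_lock(lock_path: str) -> bool:
    """Move a non-empty lock file aside and create an empty one in its place.

    Returns True if a reset was performed, False if nothing needed doing.
    Only intended for use while the service is stopped.
    """
    if not lock_needs_reset(lock_path):
        return False
    lock = Path(lock_path)
    # Keep the old file as mongod.lock.bak (replacing any earlier .bak copy).
    lock.replace(lock.with_name(lock.name + ".bak"))
    lock.touch()
    return True
```

Keeping the old file instead of deleting it leaves something to hand to support if the service still refuses to start.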

 

@sgabriel62 I'm not aware of any changes to the environment. The batch file would run 98% of the time, and randomly it wasn't able to restart the service ("no suitable servers found"). Prior to the batch file issue, we did have some locks that were removed with the help of the Alteryx support team, right before the issues with the backup, but I haven't checked since.

 

 

Atom

Hi all

We are having the same issue.

The service does not start again after the backup.

The backup completes and looks fine but the service start fails.

Even if we run it from the command line.

Have to kill mongod.lock then it starts.

But is not 100%. Have to stop it.

Then kill all alteryx process in Task manager.

Then start again - all good.

Alteryx

I am currently aware of two scenarios surrounding this process and my sample script that can cause issues with the backup and the service stop/start processes.

 

The first scenario affects scripts based on versions of the sample prior to 1.5, or servers running Alteryx Server 2018.3. In these scenarios, any delay in the service shutdown (typically due to jobs running at the time the backup is started) will cause the backup script to proceed with the backup while the service is still running. This causes the service start commands to fail because the service is still technically running, even though it is in the process of stopping. Eventually the service will stop, leaving the server in an offline state. If you are not running Alteryx Server 2018.3, version 1.5 or higher of my sample script should prevent this with the changes to the service start and stop commands. If you are on Alteryx Server 2018.3, you will need to schedule your backups during a period where no jobs are running, or monitor the process manually to ensure the service stops and restarts properly. I am also working on an updated script that should address this through better handling of the service's state changes, and I will update the article with the new script as soon as it has been completed and tested.

 

We have also seen some issues with 2018.x running on Windows Server 2008 R2 where the mongo process doesn't properly indicate to the service that it has exited. The issue occurs inconsistently, and we haven't been able to identify the exact scenario or cause yet. If you find that the backup process hangs during the backup (the script will continue updating the screen with .... forever), you are likely encountering this behavior. This behavior can also affect service state changes after the service has been running for a few days. Unfortunately, I can't correct this behavior with the sample script, as it seems to be an interaction between our service and the mongod process in certain environments. If you encounter this issue, I would recommend upgrading to the latest version of Alteryx Server and ensuring you have all available Windows patches installed. If the issue still persists, you may want to consider upgrading to Windows Server 2012 or 2016.

Meteor

We have converted your batch script into PowerShell.  We've added a conditional statement to forcefully stop the service after a sleep period has elapsed, meaning it will first wait for the lock file to change from 1 KB to 0 KB on its own.  Once that has executed, we force the creation of an empty lock file prior to restarting the service, so that once the dumps are completed the service will start without fail.  But there will be occasions where this can't be helped because of a hung session or because Windows is misbehaving.  An advantage of version 2018.x is workflow caching, where the workflow rerun continues where it left off.

 

Alteryx

Version 2.0 of the example backup script is available now, and the article has been updated to reflect the new version. This version handles service start and stop requests in a more dynamic and graceful manner. Please note that if the timeout value is exceeded, you may encounter a scenario where the service eventually shuts down and isn't restarted. This would be expected depending on where the timeout occurred, and the script's log would reflect that the service stop or start failed. I have defaulted the timeout to 2 hours to try to give the service ample time to respond, but this may not be sufficient for some environments. If you find you are frequently encountering the service timeout, try increasing the timeout value.
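
For readers who want to see the wait-with-timeout idea in isolation, here is a minimal Python sketch of polling a Windows service state until it reaches a target or a timeout expires. It is not a translation of the script: `parse_state`, `query_state`, and `wait_for_state` are hypothetical names, the regular expression assumes the usual `STATE : <number> <NAME>` line printed by `sc query`, and the 7200-second default simply mirrors the 2-hour timeout mentioned above.

```python
import re
import subprocess
import time
from typing import Callable, Optional


def parse_state(sc_output: str) -> Optional[str]:
    """Extract the state name (e.g. RUNNING, STOP_PENDING) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def query_state(service: str = "AlteryxService") -> Optional[str]:
    """Ask Windows for the current state of the service."""
    result = subprocess.run(["sc", "query", service], capture_output=True, text=True)
    return parse_state(result.stdout)


def wait_for_state(target: str, timeout: float = 7200, interval: float = 1.0,
                   query: Callable[[], Optional[str]] = query_state,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the service reports `target`; return False once the timeout is used up."""
    waited = 0.0
    while True:
        if query() == target:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

A caller would typically issue the stop request, then call `wait_for_state("STOPPED")` and only proceed with the dump when it returns `True`.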

Alteryx

Updated example script to version 2.0.2. This version only includes some logging improvements. Specifically more messages are timestamped, spacing has been improved, and time stamps now include timezone information.

Atom

@, why not post your Powershell script to help others?

Meteor

OK Folks - Here is a generic breakdown of the PowerShell script.  What the script does is: stops the Alteryx service, performs the dump, and restarts the service so your users can get back to work.  Then, as a background process, we copy the dump from the local server to the NAS for DR replication.  Do a bit of cleanup and it's done.

This will get you started.  Obviously my Production version works, so I couldn't share it with you, but as an example: a 13 GB DB will take approx. 3 minutes to dump, vs. a heavy-content 63 GB DB, which takes nearly an hour.  On average it's 30 minutes, but again it all depends on the content of the DB.

This same script can be used to perform restores if you are on a regular maintenance schedule - but I manually perform my restores from the command line for more control.

Any questions feel free to ask:   Hope this helps and good luck

--------------------------------------------------------------------------------------------------------------------

# Mongod backup Script
# stop mongod.exe, backup the db, make local copy of files, start mongod.exe, create network archive of files.

#Error color set to yellow, to cause less panic.
$host.PrivateData.ErrorForegroundColor = 'Yellow'

[cmdletbinding()]

$totalScriptTime = Measure-Command {
# Set log location
#$log = "\\SAN/NAS name\alteryx_prod0001\MongodbBackups\dbbackup.csv"
$log = "E:\Temp\dbbackup.csv"
# set local backup location
$dest = "E:\temp\Mongo_backup"
[System.Collections.ArrayList]$source
#Set network backup location
$archive = "\\SAN/NAS name\alteryx_prod0001\MongodbBackups"
# List of files to backup
#$persist = "E:\ProgramData\Alteryx\Service\Persistence\"
$filelist = @( # Add or remove files
"E:\temp"
"C:\ProgramData\Alteryx\RuntimeSettings.xml"
)

$lockfile = "E:\programdata\alteryx\service\persistence\MongoDB\Mongod.lock"

#$filelistpath = ""
#$filelist = import-csv "$filelistpath"

# $Source is the variable for compression and archiving.
[System.Collections.ArrayList]$Source = @( # Add or remove files
"C:\ProgramData\Alteryx\RuntimeSettings.xml"
)
#Format for adding additional directories.
# $Source.Add("C:\users\cstone42\desktop\gifs")

# Test if the local backup location exists, if not, create it
if (!(test-path $dest))
{
Write-Output "$dest does not exist, creating now"
New-Item -ItemType directory -Path $dest
}

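# $log is a file, so make sure its parent folder exists; Export-Csv creates the file itself.
$logdir = Split-Path $log -Parent
if (!(test-path $logdir))
{
Write-Output "$logdir does not exist, creating now"
New-Item -ItemType directory -Path $logdir
}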

if (!(test-path $archive))
{
Write-Output "$archive does not exist, creating now"
New-Item -ItemType directory -Path $archive
}

# Function for logging
function SendOutput()
{
Param(
[string[]]$output
)

$now = get-date -format "MM/dd/yyyy HH:mm:ss"
# Build a hashtable of the collected data + computername (because thats important)
$Properties = @{ComputerName = $env:COMPUTERNAME
now = $now | Out-String
output = $output | Out-String
}

# importing hashtable to properties of new windows object
$obj = New-Object -TypeName PSObject -Property $Properties
Write-Output $obj | Select-Object -Property now,ComputerName,output | Export-Csv -Path $log -NoTypeInformation -Append
Write-Output $obj | Select-Object -Property output | convertTo-csv | write-host -ForegroundColor Magenta
}

# write to log
$output ='Beginning backup script';SendOutput -output $output
#
# Stop mongod.exe
Write-Warning "Stopping AlteryxService.exe"
Set-Service -ServiceName AlteryxService -StartupType Disabled
Stop-Service AlteryxService | Out-Null
#start-sleep 780

$processname = "AlteryxService"
# establish a loop counter
$i = 0
# Get info about the process
$p = Get-process -Name "$processname" -ErrorAction SilentlyContinue # Pick a process name
# If there is no running process, say so.
if (!($p)){Write-Host "Process does not exist. These are not the droids you're looking for. Move along. Move along!" -ForegroundColor Green}
# Show me the PID so I know you're telling the truth and the script is still moving along.
$p.Id
# In the name of honest transparency, let people know.
If ($p){write-host "$processname is alive, ITS ALIVE!!" -ForegroundColor Yellow}
# We can only kill something if we are in imminent danger:
While ($p) # As long as the above named process exists,
# ... ITS COMING RIGHT FOR US!! KILL IT!!!
{
# Check to see if the process is in "HasExited" status from Get-Process, if so, break
if ($p.HasExited)
{Write-host "Process $p has exited" -ForegroundColor Green ;Break}
# Give it some time
start-sleep 30
# Update the Process info
$p = Get-process -Name "$processname" -ErrorAction SilentlyContinue
# Set to a number above the forced shutdown, to catch any runaway loops.
if($i -ge 22)
{break}
# Set to the max number of attempts before using a forced stop.
if ($i -ge 20)
{Stop-process -force $p}
# Output the number of loops we are at, so people know the script is still running
write-host "$i"; $i++
# Politely ask the process to stop and continue to check on it via the loop.
Stop-process $p
} # End of Loop

#$output = "Result from stopping Alteryx: $lastexitcode";SendOutput -output $output
start-sleep 2

Remove-Item "$lockfile" -Force -ErrorAction SilentlyContinue
if (!(test-path $lockfile))
{
New-Item $lockfile
}
else {
$output ='Its not you, its me ... No, its you. I give up, you are on your own. Im leaving you. Good-Bye';SendOutput -output $output
Write-Warning "There is a problem with the lock file, Help me Obi-Wan Kenobi, you're my only hope"
Start-Sleep 2
Write-Warning "The Vogon destructor fleet is in position. So long and thanks for all the fish"
Break
}
<#
$isASRunning = get-service -Name AlteryxService
if ($isASRunning.Status -ne 'Stopped')
{
$output ='AlteryxService did not stop normally, using force and removing lock file.';SendOutput -output $output
Write-Warning "$output"
Stop-Service -Name AlteryxService -force
Remove-Item "$lockfile" -Force
if (!(test-path $lockfile))
{
New-Item $lockfile
}
else {
$output ='Its not you, its me ... No, its you. I give up, you are on your own. Im leaving you. Good-Bye';SendOutput -output $output
Write-Warning "There is a problem with the lock file, Help me Obi-Wan Kenobi, you're my only hope"
Start-Sleep 2
Write-Warning "The Vogon destructor fleet is in position. So long and thanks for all the fish"
Break
}
}#>

# more logging
$output = "Starting Mongo db backup";SendOutput -output $output

# Creating the dump
$dumpname = "MongodbDump_$(Get-Date -f yyyy-MM-dd-HH-mm)"
#$dump = & "E:\Program Files\Alteryx\bin\AlteryxService.exe" emongodump=E:\temp\mongod_dmp -Wait
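# Start-Process -Wait -PassThru returns a process object, so ExitCode is actually populated.
# Assumption: AlteryxService.exe exits with a non-zero code when the dump fails.
$dump = Start-Process -FilePath "E:\Program Files\Alteryx\bin\AlteryxService.exe" -ArgumentList "emongodump=E:\temp\$dumpname" -Wait -PassThru -NoNewWindow
$result = $dump.ExitCode
if ($result -ne 0)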
{
$output = "Mongo backup failed during dump"
SendOutput -output $output
break
}
else
{
$output = "Mongo dump completed"
SendOutput -output $output
$Source.Add("E:\temp\$dumpname")
}

# Restarting the alteryx service
Set-Service -ServiceName AlteryxService -StartupType Automatic
$output = "Starting Alteryx Service";SendOutput -output $output
Start-Service AlteryxService | Out-Null
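# Start-Service is a cmdlet and does not set $lastexitcode, so log the service status instead.
$output = "Alteryx Service status after start: $((Get-Service AlteryxService).Status)";SendOutput -output $output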
#>

###
#
#write-warning "waiting for files to copy"
#get-childitem $dumpname -Recurse | Copy-Item -Destination $dest\$dumpname | out-null
#start-sleep 1
#
##
####
#########
#####

start-sleep 5


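# copy and hash check to check data integrity
foreach ($file in $filelist)
{
$SourceFile = $file
#$SimpleName = [System.IO.Path]::GetFileName("$SourceFile")
Get-ChildItem $SourceFile | Copy-item -Recurse -Destination $archive

# Get-FileHash only works on files, so compare every source file with its copy under $archive.
if (Test-Path $SourceFile -PathType Container) {$base = (Get-Item $SourceFile).FullName; $items = Get-ChildItem $SourceFile -Recurse -File}
else {$base = Split-Path $SourceFile -Parent; $items = Get-Item $SourceFile}
$mismatch = $false
foreach ($item in $items)
{
$copy = Join-Path $archive ($item.FullName.Substring($base.Length).TrimStart('\'))
if (!(Test-Path $copy) -or ((Get-FileHash $item.FullName).Hash -ne (Get-FileHash $copy).Hash)) {$mismatch = $true}
}

if ($mismatch)
{
#sloppy messages ... need to clean it up
$output = "Copy to $archive Failed - $file is different"; SendOutput -output $output
}
Else
{
$output = "Copy to $archive successful - both copies of $file are the same"; SendOutput -output $output
}
}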
#$Source.Add("$dest")

 

# compress from local temp to network location for long term storage and log.
#write-host "Beginning Archive process" -ForegroundColor Cyan
<#
# File Compression.

$TotalCompressTime = measure-command {
<# Source Information should be located above.
[System.Collections.ArrayList]$Source = @( # Add or remove files
"C:\users\cstone42\desktop\desktop"
)
$Source.Add("C:\users\cstone42\desktop\gifs")
#

$bun = 0
foreach ($thing in $Source)
{
$bun++
$destination = "$archive\Backup$bun.zip"
$arctime = Measure-Command {
If(Test-path $destination){Remove-item $destination}
$compressionLevel = [System.IO.Compression.CompressionLevel]::NoCompression
Add-Type -assembly "system.io.compression.filesystem"
[System.IO.Compression.ZipFile]::CreateFromDirectory($Thing, $destination, $compressionLevel, $true)
} #End Measure-command
$ThisMany = $arctime.Seconds
$output = "$thing took $ThisMany seconds to compress."; SendOutput -output $output
}
} # End of Measure-Command
$TCTS = $TotalCompressTime.Seconds
$output = "Total compression time: $TCTS"; SendOutput -output $output

#get-childitem $dest -Recurse | Copy-Item -Destination $archive -force | Out-Null
$output = $?
$output = 'network archive complete '; SendOutput -output $output
#>

#housekeeping
$Now = Get-Date
#define amount of days
$Days = "1"
#folder where files are located
$TargetFolder = $archive
#define extension
$Extension = "*"
#LastWriteTime parameter based on $Days
$LastWrite = $Now.AddDays(-$Days)

#get files based on lastwrite filter and specified folder
$Files = Get-Childitem "e:\temp\*.*" -Recurse | Where {$_.LastWriteTime -lt $LastWrite}


if ($Files -ne $NULL)
{
foreach ($file in $Files)
{
$output = "Removing file $file"; SendOutput -output $output
Remove-Item $file.FullName -force #-WhatIf
}
} # End if

} #End Of Measure-Command

$output = 'Ending Backup Script'; SendOutput -output $output
$output = "Total mongo db backup time: $totalScriptTime"; SendOutput -output $output

# EOF

Atom

AWESOME! Thanks for posting this :)

Meteoroid

for those of you forcing the AlteryxService to stop ...

(because you can't always wait four hours for a workflow to complete)

 

if the AlteryxService is not stopping and you're running a worker on the same machine as the controller (and thus the embedded mongodb), it's most likely because there are jobs running.  the service will not shut down until those jobs complete.  so if you're going to kill anything, you should kill all the instances of AlteryxEngineCmd.exe.  the value of "Workflows allowed to run simultaneously" in System Settings > Worker > General determines the number of instances of AlteryxEngineCmd.exe.  however, be forewarned: this is not without consequences.  you will be terminating running jobs and, occasionally, i've seen the schedules for those jobs be disabled as a result.
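
a rough PowerShell sketch of killing the engine processes, for anyone who wants to script it. the process name comes from the comment above; the function name is made up for illustration, and the same warning about terminated jobs and disabled schedules applies.

```powershell
# Sketch: force-stop every running workflow engine process.
# Returns how many matching processes were found. Function name is hypothetical.
function Stop-EngineProcess {
    [CmdletBinding(SupportsShouldProcess)]
    param([string]$Name = "AlteryxEngineCmd")

    # SilentlyContinue: no error when nothing is running.
    $procs = @(Get-Process -Name $Name -ErrorAction SilentlyContinue)
    foreach ($p in $procs) {
        if ($PSCmdlet.ShouldProcess("$($p.ProcessName) (PID $($p.Id))", "Stop-Process -Force")) {
            Stop-Process -InputObject $p -Force
        }
    }
    $procs.Count
}

# Dry run first: -WhatIf lists what would be killed without touching anything.
Stop-EngineProcess -WhatIf
```

drop the `-WhatIf` only once you're sure you can live with the running jobs being terminated.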

Meteoroid

what now also concerns me is executing a backup of the embedded mongo instance across our alteryx server environ which consists of three machines:  an 8-core runs the controller, the gallery and a worker; two 4-cores run workers only.  i've not worried about the remote workers when backing up mongo.  i've only stopped AlteryxService on the 8-core.  but recently, i've received conflicting information from Alteryx support regarding a clean shutdown.  i've heard both that shutting down the workers doesn't matter.  and i've heard that shutting them down *first* is critical to having a "clean" shutdown.  i assume this means that doing otherwise risks corrupting the content of the embedded mongodb instance.  (which is it?)

 

if shutting down the service on ALL the nodes in your alteryx server environ is a requirement, then an automated backup just got more difficult.  and, again - if this is true, AND there's no way to backup mongo while the service is running, then mongo is not a good choice.  b/c in addition to the complexity of having to stop the service on EVERY node, if you are multi-node, it's probably b/c your load is significant and you probably have no daily windows for a backup (and a weekly one is probably a challenge - it is for me).

 

i know alteryx server is going to postgres soon.  that's a good thing.  but in the meantime, what's an alteryx server admin with multiple nodes and a workflow schedule that's booked solid all day to do?  what's alteryx's suggestion @KevinP ?
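
if workers-first does turn out to be the requirement, the ordering itself is easy to script. a sketch only: the node names are invented, the helper names are hypothetical, and the remote stop assumes PowerShell remoting (WinRM) is enabled on every node. it does not settle which of the two answers from support is right.

```powershell
# Hypothetical node names; replace with your own machines.
$controller = "ALTERYX-CTRL"
$workers    = @("ALTERYX-WRK1", "ALTERYX-WRK2")

# Returns the nodes in stop order: workers first, controller (with the embedded MongoDB) last.
function Get-StopOrder {
    param([string[]]$Workers, [string]$Controller)
    @($Workers) + $Controller
}

# Stops AlteryxService on one node via PowerShell remoting (assumes WinRM is enabled there).
function Stop-AlteryxNode {
    param([string]$ComputerName)
    Invoke-Command -ComputerName $ComputerName -ScriptBlock {
        Stop-Service -Name AlteryxService
    }
}

# Print the plan; swap Write-Output for Stop-AlteryxNode once the order is confirmed.
foreach ($node in Get-StopOrder -Workers $workers -Controller $controller) {
    Write-Output "Would stop AlteryxService on $node"
}
```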

Asteroid

FYI,

We added this to get rid of logs/zip files older than 14 days.

The PushD/PopD temporarily maps a network drive.

 

PushD "%LogDir%" &&(forfiles /S /M *.log /D -14 /C "cmd /c del /q @path") & PopD
PushD "%NetworkDir%" &&(forfiles /S /M *.7z /D -14 /C "cmd /c del /q @path") & PopD
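
For anyone running the PowerShell version of the backup instead of the batch file, roughly the same cleanup could be sketched like this. The function and parameter names are made up for illustration, and the example paths are placeholders.

```powershell
# Sketch: delete files matching a pattern that are older than a number of days.
# Function and parameter names are hypothetical.
function Remove-OldFile {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)][string]$Path,
        [string]$Filter = "*.log",
        [int]$Days = 14
    )
    $cutoff = (Get-Date).AddDays(-$Days)
    Get-ChildItem -Path $Path -Filter $Filter -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        Remove-Item -Force
}

# Example calls (placeholder paths); -WhatIf shows what would be deleted without deleting it.
# Remove-OldFile -Path "E:\temp\logs" -Filter "*.log" -WhatIf
# Remove-OldFile -Path "\\server\share\MongodbBackups" -Filter "*.7z" -Days 14 -WhatIf
```

Get-ChildItem accepts UNC paths directly, so the PushD/PopD drive mapping is not needed in this variant.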

Asteroid

Another addition.

We had some jobs running overnight when the backup was triggered. This caused the gallery to go down. Adding this prevented the issue.

Basically it checks to see if there is an active job running, and if so skips the backup. We are okay missing a backup once in a while.

 

 

:: if a job is running (i.e. AlteryxEngineCmd.exe task exists) then abort backup
tasklist | FIND "AlteryxEngineCmd.exe"
IF errorlevel 0 IF NOT errorlevel 1 GOTO SystemError

 

 

... as suggested by @DanC on this post: https://community.alteryx.com/t5/Alteryx-Server-Knowledge-Base/Alteryx-Service-Stuck-in-Stopping-Sta...
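
Since the main script in this thread is PowerShell, the same guard could be sketched there as well. This is only an illustration: the function name is made up, and it assumes a running job always shows up as an AlteryxEngineCmd process on the machine doing the backup.

```powershell
# Returns $true when at least one process with the given name is running.
function Test-JobRunning {
    param([string]$Name = "AlteryxEngineCmd")
    [bool](Get-Process -Name $Name -ErrorAction SilentlyContinue)
}

if (Test-JobRunning) {
    Write-Output "Workflow job in progress, skipping backup."
    return   # leave the script before AlteryxService is stopped
}
Write-Output "No jobs running, continuing with backup."
```

Placed at the top of the backup script, this skips the whole run the same way the batch version jumps to SystemError.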