Hello community,
I am trying to run a Python file from the "Events" tab, and I have set up the Python path in an environment variable for an Anaconda installation.
I am following this link to run my Python code, but there might be an error in my path: when I run the workflow, no Python command window appears the way a CMD window does when you execute a batch file.
The objective of this Python code is to download a set of PDF files from GCS to a local folder; the workflow then uploads these files, along with the other files in that folder, to SharePoint.
Can anyone help me with the best way to configure this Events tab to run this Python code?
When code is run through "Events", does it use the virtual environment set up by Alteryx (C:\Program Files\Alteryx\bin\Miniconda3\envs\DesignerBaseTools_vEnv) or the base environment that comes with our own Python installation? I want to know where I should install my packages before I run the code.
I have already tested this code in a separate venv that I created apart from the base environment. It downloads all the PDF files available in the folder on GCS, so the code itself works as expected.
import os
import datetime
from google.cloud import storage
from google.oauth2 import service_account
# Set the path to the GCP key file
keyPath = r"path\to\gcp\key\file\here"
# Authenticate the credentials using the key file
credentialsForGCP = service_account.Credentials.from_service_account_file(
    keyPath, scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
# Set the name of the bucket and folder containing PDF files to be copied to the local folder
bucketName = 'bucket-name-here'
folderName = 'folder/name/here'
# Set the local path where the files will be saved
# localPath = r'C:\Users\ssripat3\Documents\Python Scripts\PDF_Files'
localPath = r'path\to\local\folder\here'
# Establish a connection to the GCP account using the key file and authenticate the credentials
storageClient = storage.Client(credentials=credentialsForGCP)
# Fetch the bucket details from the GCP account
bucket = storageClient.bucket(bucketName)
# Get the list of blobs in the specified folder with the PDF file extension
blobs = bucket.list_blobs(prefix=folderName)
pdfBlobs = [blob for blob in blobs if blob.name.endswith('.pdf')]
# Copy the PDF files to the local folder with the current date and time appended to the filename
for blob in pdfBlobs:
    # Get the base filename without the extension
    baseFilename = os.path.splitext(os.path.basename(blob.name))[0]
    # Get the current date and time in the desired format
    currentTime = datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
    # Construct the new filename with the current date and time appended to the base filename
    pdfFileNameWithDateTime = f'{baseFilename}_{currentTime}.pdf'
    # Copy the file to the local folder with the new filename
    blob.download_to_filename(os.path.join(localPath, pdfFileNameWithDateTime))
Alteryx Version 2021.4
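One way to find out which interpreter and environment an event actually uses is to point the event at a small diagnostic script first. The sketch below is only an illustration, not something from the Alteryx documentation: the function names and the log file name `alteryx_event_python.log` are made up for this example. It writes to a log file because no command window is shown when the event fires.

```python
import importlib.util
import os
import sys
import tempfile
from datetime import datetime


def describe_interpreter(packages=("google.cloud.storage", "google.oauth2")):
    """Collect details about the interpreter that is executing this script."""
    info = {
        "executable": sys.executable,  # full path of the python.exe in use
        "prefix": sys.prefix,          # root folder of the active environment
        "cwd": os.getcwd(),            # working directory the event started in
    }
    for name in packages:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is not installed
            found = False
        info["has " + name] = found
    return info


def write_report(log_path):
    """Append the interpreter details to a plain-text log file."""
    info = describe_interpreter()
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(datetime.now().isoformat(timespec="seconds") + "\n")
        for key, value in info.items():
            handle.write(f"  {key}: {value}\n")
    return info


if __name__ == "__main__":
    # Hypothetical log location: the current user's temp folder
    write_report(os.path.join(tempfile.gettempdir(), "alteryx_event_python.log"))
```

If the logged `executable` sits under the DesignerBaseTools_vEnv folder, the packages need to be installed there; if it points at the Anaconda installation, they belong in that environment instead.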
You could try something like this, where I'm using the DesignerBaseTools_vEnv Python.
My file is called "myfile.py", and that file lives on my desktop.
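For anyone reading without the screenshot, a Run Command event boils down to launching an executable with some arguments from a working directory. The sketch below imitates that with `subprocess` so the two ways of locating the script can be compared. It is an illustration under assumptions, not how Alteryx runs events internally: `run_script` is a made-up helper, and `sys.executable` stands in for the python.exe inside the DesignerBaseTools_vEnv folder.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(python_exe, script_path, working_dir=None):
    """Run script_path with python_exe, optionally starting in working_dir."""
    return subprocess.run(
        [str(python_exe), str(script_path)],
        cwd=working_dir,       # plays the role of the Working Directory field
        capture_output=True,   # keep stdout/stderr, since no window is shown
        text=True,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        script = Path(folder) / "myfile.py"
        script.write_text("import os\nprint(os.getcwd())\n", encoding="utf-8")

        # Variant 1: bare file name as the argument, folder as working directory
        by_name = run_script(sys.executable, "myfile.py", working_dir=folder)

        # Variant 2: no working directory, complete path as the argument
        by_path = run_script(sys.executable, script)

        for result in (by_name, by_path):
            print(result.returncode, result.stdout.strip(), result.stderr.strip())
```

Both variants find the script; they differ only in the working directory the script sees, which matters if the script uses relative paths.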
Thank you for the help, @PhilipMannering. I was able to make the code run. I wasn't setting the Working Directory; instead, I just used the complete path in the "Command Arguments" textbox.
After messing around with the Python path, I am now able to refer to Python.exe and it is working as expected.
Hello again, @PhilipMannering. I need help with one more thing, please.
I am able to run the workflow using the Events tab, with the Python code running under "Before Run". The server team said I will not be able to deploy the workflow using Events and should use the Python tool instead. I don't use the output of the Python tool anywhere in the workflow; I just download some files from GCS and store them in a folder before they are uploaded to SharePoint, which is handled by a macro.
Now I need to run that Python code with the tool before everything else, or at least at some point before the macro starts executing.
Let us say the first two tools are an Input tool, which reads an Excel file, and a Python tool, which runs the code. How can I make sure that my Python code runs either first or second, before everything else?
I am on an older version of Alteryx (2021.4.2).
As the Input Data Tool runs first, I would use the Dynamic Input Data Tool instead. You would output something (anything) from the Python Tool and then connect it to the Dynamic Input, which would ensure the Python runs first.
My explanation probably doesn't make any sense, so I've mocked up a simple example....
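As a rough sketch of the Python tool side of this idea (not the attached workflow itself), the code could finish by writing a one-row table to an output anchor. The names `build_status_rows`, `FullPath` and `excel_path` are hypothetical, and the sketch assumes the Dynamic Input is configured to read its file path from the `FullPath` field. `Alteryx.write` comes from the `ayx` package that is only importable inside the Python tool, so the sketch falls back to printing when run elsewhere.

```python
def build_status_rows(excel_path, downloaded_files):
    """Build a one-row table that a downstream tool can wait on."""
    return [{
        "FullPath": excel_path,                    # file the Dynamic Input should read
        "files_downloaded": len(downloaded_files), # informational only
    }]


# Hypothetical values: in the real tool, `downloaded` would be filled by the
# download loop and `excel_path` would point at the workbook to read next.
excel_path = r"path\to\excel\file\here"
downloaded = []
rows = build_status_rows(excel_path, downloaded)

try:
    import pandas as pd
    from ayx import Alteryx  # only available inside the Alteryx Python tool
except ImportError:
    # Outside Alteryx: just show what would have been written
    print(rows)
else:
    # Send the table to output anchor 1 of the Python tool
    Alteryx.write(pd.DataFrame(rows), 1)
```

Because the Dynamic Input cannot start until it receives this row, the download has to finish before the Excel file is read.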