I have reports being sent to a specific folder in my S3 bucket. The file names include the date, and each file is a ZIP archive containing a single CSV file.
I would like to use Alteryx to download the most recent file daily, unzip it, read the CSV, clean up the data, and append it to a table on my Snowflake server.
I have tried to use the Amazon S3 Download tool, as well as the Input Data tool. I also tried setting up a macro using a Text Input tool. However, everything I have tried has failed to get the data into Alteryx.
I have an existing Amazon S3 Upload tool macro that works with no issues.
Has anyone else had these issues and found a clear way to do what I need? Does anyone have an existing workflow that I can “clean up” and recycle to get my downloads in motion?
I would appreciate any assistance you can provide! Thanks!!
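For the "most recent file" piece specifically, the usual pattern is: list the objects under the prefix, parse the date out of each key, and keep the max. Here's a minimal Python/boto3 sketch, assuming a hypothetical bucket, prefix, and a YYYY-MM-DD date embedded in the key (all placeholders, adjust to your actual naming):

import re
from datetime import datetime

import boto3

# All names here are placeholders; adjust to your bucket, prefix, and date format.
BUCKET = "my-report-bucket"
PREFIX = "reports/"
DATE_RE = re.compile(r"(\d{4}-\d{2}-\d{2})")  # assumes YYYY-MM-DD in the key

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

latest_key, latest_date = None, None
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        m = DATE_RE.search(obj["Key"])
        if not m:
            continue
        d = datetime.strptime(m.group(1), "%Y-%m-%d")
        if latest_date is None or d > latest_date:
            latest_key, latest_date = obj["Key"], d

print(latest_key)  # the key you'd feed into the batch macro as the Object value

The resulting key is what you would push into the batch macro (or any download step) as the Object value.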
I tried both ways. Let me ask this: in my workflow, I get the file name from S3 and push it into the macro as the object. I then want the Action tool to update the Object field with that file name value, so that it updates the Object Name field in the S3 Download tool. What would I do? It appears I am doing something wrong, but I'm not sure exactly what.
Can you post some screenshots of your config, or your .yxzp? This is pretty hard to troubleshoot from descriptions alone. If you are using an Action tool to edit the standard Alteryx S3 Download tool, you'd modify the object name in the Action tool: create a dummy name like MYFILE.XLSX, run it, test it, and confirm it works in the macro with your dummy file. Then run it as a macro. Note: this assumes static file types, bucket, etc.
Below are the screenshots of the workflow (with notes in the comment lines), then the macro, and the configuration details of each tool in the macro.
Hey, reiterating: if you are using an Action tool to edit the standard Alteryx S3 Download tool, you'd modify the object name in the Action tool. Create a dummy name like MYFILE.XLSX, run it, and test it.
You did not follow this instruction. You have to have a value in the Object field in the S3 tool (or modify the XML, which is a universe you are not exploring right now) in order to edit it in the Action tool. You need to do this: set Object to a dummy object; configure the Action tool to replace the dummy object; save; rerun.
I guess that's where I am lost. I am trying to bring in a daily file whose name changes each day based on the date. I will still have hundreds of prior files in the folder, and I need to create 4 different workflows from that folder based on file name and date. If you are saying to create an XLSX file, I am not sure what that will accomplish, since the files are CSVs and I need to access them inside their ZIP files in the S3 folder. I would hope NOT to download the data, and instead access it directly from S3; each file is a very considerable size (between 5 and 10 GB).
If using the XLSX file is just a way to recall or utilize the file name, then please explain that. If you are expecting me to use that process to download the data into a local file, I cannot do that at this time.
Can you offer any suggestions or further guidance on my next steps?
Step 1:
Create a value in the Object field. If your final results are .zip files, put something like "test.zip", preferably an object you can actually stage in the bucket so you can test against it.
Step 2:
Configure the Action tool to replace this value. The Action tool is going to replace the value in Object, but Object needs to have a value first; otherwise it will do nothing, except give you that error.
Step 3 and on:
Figure out your external (.zip) and internal (.csv) file name logic. I mentioned from the get-go that there are a lot of parts here which are not supported out of the box. This is going to be a slog; I would use the AWS CLI, Python, or Lambda functions (a rough sketch is below). First things first: get your batch macro working, and to do that you need a value in Object so your Action tool can replace it.
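If you do go the CLI/Python/Lambda route for the download-and-unzip piece, here's a minimal sketch of what that step could look like. The bucket name and key are placeholders, and the Snowflake append is only gestured at in a comment; treat it as a starting point under those assumptions, not a drop-in:

import tempfile
import zipfile

import boto3
import pandas as pd

# Placeholder names; the key would come from whatever "latest file" logic you land on.
BUCKET = "my-report-bucket"
KEY = "reports/daily_report_2024-01-31.zip"

s3 = boto3.client("s3")

# ZIP needs random access (its directory sits at the end of the archive),
# so spool to a temp file rather than holding a 5-10 GB archive in memory.
with tempfile.TemporaryFile() as tmp:
    s3.download_fileobj(BUCKET, KEY, tmp)
    tmp.seek(0)
    with zipfile.ZipFile(tmp) as zf:
        csv_name = zf.namelist()[0]  # each zip holds a single csv
        with zf.open(csv_name) as f:
            # read in chunks so the full csv never sits in memory at once
            for chunk in pd.read_csv(f, chunksize=100_000):
                # clean up the chunk here, then append to Snowflake,
                # e.g. with snowflake-connector-python's write_pandas
                pass

One caveat on the "don't download" hope: Python's zipfile needs a seekable file, so the archive has to land somewhere (temp disk here). The chunked CSV read at least keeps memory flat for those 5-10 GB files.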
I am not sure what the issue is, but I still can't get this to work. The batch macro doesn't pull in anything. Ugh. I really don't want to push this off to IT, and I am limited in what I can do with Python due to system security constraints.
Can you share a screenshot of your Action tool's configuration?
The common issue here is that the logic you are passing from your Action tool is wrong.
Hey, without seeing what you're editing and what you're replacing, and without knowing what the file name looks like, this is pretty much impossible to troubleshoot. TBH, I don't even know where your issue is. It could be that you download your file but can't access the underlying .csv. It could be that what's in the .zip isn't a CSV. The most likely case is that you are replacing part of the S3 tool incorrectly and not specifying the correct object to download via your dynamic formula, or not replacing a second key part of the S3 tool which needs to be replaced. I mentioned from the beginning that you are looking at a few things which may be complicated.
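One quick way to test that "correct object" theory, if security constraints allow even a small script: build the key the same way your dynamic formula does and ask S3 whether it exists. Bucket, prefix, and naming pattern below are all placeholders for whatever your real keys look like:

from datetime import date

import boto3
from botocore.exceptions import ClientError

# Hypothetical naming pattern; substitute the pattern your real keys follow.
key = f"reports/daily_report_{date.today():%Y-%m-%d}.zip"

s3 = boto3.client("s3")
try:
    s3.head_object(Bucket="my-report-bucket", Key=key)
    print("key exists:", key)
except ClientError:
    # whatever your Action tool pushes into Object has to match a real key exactly
    print("no object with that key:", key)

If head_object fails on the exact string your formula produces, the macro is being handed a key that doesn't exist, which would match the "nothing comes back" symptom.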