Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
Watermark
12 - Quasar
12 - Quasar

Introduction

 

The business case for posting a blob file to Azure was that a company wanted to post hundreds of invoices online in a secure Azure Environment for their clients to sign into and view those invoices in a format not readily modified (PNG, JPG, PDF). We settled on PDF for our case because the invoices themselves are also being produced with Alteryx. The reporting tools could readily create PDF with pagination versus trying PNG or JPG, which weren’t as friendly to invoice page requirements (all the invoices were more than 1 page) and would have required additional acrobatics (see what I did there 😉) to create nice pretty pagination. 

 

I found this to be a really interesting project because the only valid documentation to date was around posting Excel/CSV/JSON to Azure via the downloadable connector, which doesn’t accommodate blob files. There were a couple of sources pretending to post blob that I found on the net after considerable scouring, but they were not only very incomplete, but in some cases plain wrong, and sent me on goose chases trying things that simply didn’t work.

 

With the rise of the cloud and the prevalence of Azure, it’s certain there will be others who’d like to post these types of files to their environments. The following is an in-depth tool-by-tool approach and some additional automation considerations to quickly work in the different environments in IT, namely Dev/Test/Stage & Prod.  See below for my best efforts to capture this with a modest understanding there are likely different approaches that could be used for the environment automation portion.

 

Process

 

I have saved these to the Gallery for you to use and work along with this document.

 

1. Start with the Directory tool: capture all the information you need for a folder migration. Identify the folder, then specific file or *wildcards for multiple files.

 

Screen Shot 2022-07-21 at 11.56.32 AM.png

 

2. Take the above concept (pictured in #1) and expand it—leverage selecting the environment to also select the data set. For my approach, I use 2 data sets: 1 dataset for Dev/Test/Stage & 1 live dataset for live prod. This prevents prod data from accidentally being published in lower environments (Dev/Test/Stage) and keeps the test dataset from being published to prod. If you actually have 4 different sets, this concept can easily expand to accommodate.

 

MeganDibble_0-1658427401300.png

 

a) Here is what the wizard view will look like:

 

Watermark_2-1658424526835.png

 

b) Here is the Action tool configuration:

 

Watermark_3-1658424526900.png

 

c) The second Action tool configuration controls data from the source.

 

Watermark_4-1658424526973.png

 

3. After the tools correctly select the data source, then use the Blob Input tool to pull in files to be turned into blobs. Here is the Blob Input tool configuration:

 

Watermark_5-1658424526990.png

 

4. Then use the Create Blob tool (configuration pictured).

 

Watermark_6-1658424527012.png

 

5. Use the Select tool to eliminate unnecessary fields and rename a couple of fields needed for later use (configuration pictured).

 

Watermark_7-1658424527039.png

 

6. Next in the process would be the Append tool, but we’ll jump to the following input, which is the URL & Token, and return to the Append when we bring the two together.

 

7. We add a macro with the URL & Token, and we add a filter to capture the correct URL & Token from the macro. The correct one is driven by the App selection previously noted.

 

Watermark_8-1658424527060.png

 

a) The filter will allow us to have multiple Azure URL / TOKEN keys in 1 macro, and we pick which environment we want to post the PDF files.

 

b) The macro itself is exceedingly simple. Only two tools: a Text Input and a Macro Output tool. The Text Input tool only has four columns: EnivID, Environ, URL, Token.

 

Watermark_9-1658424527099.png

 

c) There are five values, 1 for each different environment. You create a record for each one with a unique URL & Token. Work with your Cloud team (or the appropriate group in your company) to secure these.

(1) Sandbox – process concept testing, total junk environment for experimenting

(2) Dev

(3) Test

(4) Stage

(5) Production

 

8. The third Action tool configuration controls the Azure URL / Token destination.

 

a) This will allow you to pick your environment and use the correct URL & Token for the correct environment from the macro. So, you can choose where you load the files.

 

b) The Action tool will update the filter value:

 

Watermark_10-1658424527180.png

 

9. Next, we return to the Append tool, combining the earlier data with the URL Token data.  For slight sake of organization, we also reorder these in the Append tool for logical use in the next tool in our flow (the Formula tool).

 

Watermark_11-1658424527234.png

 

10. Then we configure the Formula tool: there are two formulas. 

 

a) Formula #1:  adjust the file name if there happen to be any spaces in it and replace it with an underscore (Azure doesn’t like spaces)

 

b) Formula #2: combine URL File & Token to a single string field.

 

Watermark_12-1658424527269.png

 

11. Your final major step is the Download tool. The name is a slight misnomer in that it also functions as an upload.  It can post, put, get, and delete with API calls. There are four tabs to configure: Basic, Headers, Payload, and Connection.

 

a) Basic tab configuration:

 

Watermark_13-1658424527298.png

 

b) The Headers tab is key—it will capture the necessary info by record for the two fields checked. This was the purpose of renaming them in the Select tool in step #5. It’s to match the defined header names above, which need to be included for proper loading. Content-Length has to be exact for each record.

 

The top portion of the Headers configuration will start out blank, looking like this:

 

Watermark_14-1658424527319.png

You will need to add each of the fields exactly as I have them below.

 

Screen Shot 2022-07-21 at 12.13.00 PM.png

 

c) Payload tab: “Put” must be selected in the action field. The blob radio button at the bottom of the configuration is also selected.

 

Watermark_16-1658424527402.png

 

12. Here are some final notes:

 

a) Azure doesn’t actually store things in a file folder hierarchy. They are all technically in the same folder. But, it can give a visual presentation as if there was a folder structure.

 

b) To do this, a “  %2F  ” is inserted in between each element you want to appear as a separate folder (i.e., PROJECT%2FYEAR%2FFILENAME.PDF).

 

Conclusion

 

This project was a huge win. It was assumed posting to Azure would be easy to do without anyone vetting if it had ever been before. This is now on an automated basis for monthly invoice creation as well as an Alteryx flow to post the invoices to Azure. The idea behind putting the URL/Token in a macro is that if you need more than one process flow publishing to Azure, you don’t have to remember every flow and directory when you need new Tokens (Azure best practice is a token for one year). Then only the Token needs to be updated in 1 spot (the macro), and all the Alteryx flows that use the macro will automatically leverage the new Token information.

 

Here is the link to the Alteryx flow & the macro on the Alteryx Gallery:

https://community.alteryx.com/t5/Public-Community-Gallery/BLOB-jpg-png-pdf-UPLOAD-to-AZURE-SAS-Token...

 

Mark Thompson
Consultant

Mark Thompson - Executive, Entrepreneur, Consultant with 30+ years of industry experience in functional areas of Operation (supply chain) , Finance & IT.

Mark Thompson - Executive, Entrepreneur, Consultant with 30+ years of industry experience in functional areas of Operation (supply chain) , Finance & IT.

Comments