Every so often we get questions about a .tde (Tableau Data Extract) file output from Alteryx that is only about 30 KB when the original data is much larger. When the file is opened in Tableau, this error sometimes comes up: An error occurred while communicating with data source ‘yourfilename.tde’
This article will walk you through how to install a macro, so that it can be accessed from the tool palette in the Designer.
Download or save a macro to a directory
Open Alteryx Designer
Click Options > User Settings > Edit User Settings
Click the Macros tab within the User Settings window
Click the + icon and select the directory used to save your macro(s)
Enter a Category Name for the directory to appear as in the Designer
Macro(s) will now show up in the category specified
Things to consider:
- All macros within the specified folder will be installed
- The category specified within the macro's Meta Info will take priority over the Category Name specified in step 6
A macro can be added to a workflow without installing it. To do so, right-click a blank portion of your workflow, select Insert > Macro, and then navigate to the location of the macro you'd like to insert.
This article is part of the CS Macro Development Series. The goal of this series is to communicate tips, tricks, and the thought process that goes into developing good, dynamic macros. In this part, we demonstrate how to read in multiple files with different schemas using a Batch Macro.
As currently designed, the Amazon S3 Download tool only allows one file, or object, to be read in at a time. This article explains how to create a workflow and batch macro that will read in the list of objects in a bucket and allow you to filter for the file(s) you want using wildcards!
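The wildcard filtering at the heart of this approach can be sketched outside Alteryx with Python's standard fnmatch module. The bucket keys below are hypothetical, purely for illustration; in the actual macro, the filtering is done with Alteryx tools against the object list returned from the bucket:

```python
from fnmatch import fnmatch

def filter_keys(keys, pattern):
    """Return only the object keys that match a shell-style wildcard."""
    return [k for k in keys if fnmatch(k, pattern)]

# Hypothetical listing of objects in a bucket
keys = [
    "reports/sales_2017_01.csv",
    "reports/sales_2017_02.csv",
    "reports/inventory_2017_01.csv",
    "logs/run.txt",
]

# Keep only the sales files
sales = filter_keys(keys, "reports/sales_*.csv")
```

The same idea carries over directly: list every object once, then let a wildcard decide which objects are passed on to the download step.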
Fact: workflows are the best. Look it up. They’re all about getting things done and, with hundreds of tools and the ability to integrate external processes, there’s no shortage of things you can get done. We know that there are some areas of analytics that require a little extra firepower, however, and that’s why you can leverage your workflows in apps and macros for added functionality.
Amazon offers many different services that can be accessed from the AWS Command Line Interface (aka the AWS CLI). Alteryx workflows can take advantage of these services using the AWS CLI from the Run Command tool.
Some of the common uses of the Amazon CLI within Alteryx include:
Using S3 with IAM roles
Uploading/Downloading file types not currently supported by the S3 Upload / Download tools
Interacting with Athena
As an example of using the AWS CLI, let’s build a process that takes a local file and copies it to S3, using an AWS IAM profile. Here are the steps involved:
Review the documentation for the service. In this example, we're using the "S3" service, but Amazon has many different AWS services.
Install the AWS CLI on any machines that will need to execute the AWS functions
From Windows Command Line, create a working command
(Optional step to make calling the CLI easier in the future) – Add a reference to the AWS CLI executable to the Windows environment variables via My Computer > Right Click > Properties > Advanced System Settings > Environment Variables > System Variables > Select “Path” > Edit > (Add the path to the directory of your installed “aws.exe” file, separated by a semicolon. Do not delete or modify any pre-existing values in the Path variable.) > OK > OK
Per the AWS documentation, create a local profile using your IAM role. The local profile references your credentials so that you don't have to pass the credentials in every command.
Test the following command in your Windows command prompt, making sure to use a valid local file to copy, a valid destination location on your S3, and a valid profile name:
aws s3 cp "C:\temp\mylocalfile.csv" "s3://mylocation/mydestinationfilename.csv" --profile MyIAMUser
Once you've validated your command in the command prompt, transfer it to an Alteryx workflow
In a new workflow, pull in a “Run Command” tool from the “Developer” category
Configure the Run Command tool using the working command:
Test the above
As an optional step, we can wrap this functionality into a macro to make it easy to update, and to support batch uploads
Add a Control Parameter for “Local File Path” and configure its action to update the specific string value “C:\temp\myfile.csv”
Add a Control Parameter for “Destination File Path” and configure its action to update the specific string value “s3://mybucket/myfiletest1.csv”
Add a Text Box and configure its action to update the command value “aws” in case the user hasn’t placed the CLI exe in their PATH variables
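As a rough sketch of what the finished tool runs, the command assembly the interface tools perform can be mirrored in Python. The function name and file paths here are illustrative, not part of the attached workflow:

```python
import subprocess

def build_s3_copy_command(local_path, s3_path, profile, aws_exe="aws"):
    """Assemble the same `aws s3 cp` call the Run Command tool executes.

    aws_exe mirrors the Text Box update for users who haven't added the
    CLI to their PATH variable and need to pass a full path to aws.exe.
    """
    return [aws_exe, "s3", "cp", local_path, s3_path, "--profile", profile]

cmd = build_s3_copy_command(
    r"C:\temp\mylocalfile.csv",
    "s3://mylocation/mydestinationfilename.csv",
    "MyIAMUser",
)

# To actually run it (requires the AWS CLI to be installed):
# subprocess.run(cmd, check=True)
```

The two Control Parameters and the Text Box each update exactly one of these arguments, which is what lets the macro re-run the same command for every record in a batch.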
The example above is attached in the 11.3 workflow, SaveToS3_usage.yxzp.
The key component of any batch macro, the Control Parameter tool is the gear that keeps things moving. The control parameter accepts a field of values that will be used within the batch macro to reconfigure and rerun the macro for each of the standard input records, unless the GroupBy feature is used to match certain control parameters to buckets of records that are batched through the macro together. Adding this interface tool to any macro will upgrade it to a batch macro and will give you the ability to loop through macro configurations for added customizability. While batch macros are among the more sophisticated solutions you can build into your workflows, there are few problems they can’t solve:
There’s a lot going on in the world of analytics. Endless data stores and insights are at the other end of an internet connection and, as analysts, we’re always in on the action. In the thick of the fray, with data whizzing by at lightning speed, being equipped with the right tools is a must. Like you, Alteryx likes to live dangerously, and we’re always ready for action.
Question: Can you wait X seconds between processing each row in Alteryx?
Yes! Thanks to Inviso for creating the Inviso Macro Pack and posting it on their blog here.
The "Wait a Second" macro lets you wait X number of seconds before processing each row in the dataset.
One application is contacting an API with multiple requests. The WaitASecond macro may help by pausing long enough between requests to process multiple rows without issue.
It can also be used to scrape sites without putting heavy loads on their servers. For an example, see Inviso's sample of scraping the Alteryx Community (see Insights to the Alteryx Community).
As you can see, the part of the flow that runs through the WaitASecond tool gets NOW timestamps that are 5 seconds apart, whereas the bottom stream, which does not run through the WaitASecond tool, gets the same timestamp for every record.
There are essentially two macros:
The first one assigns a unique ID to each record and then uses that ID to drive the batch macro.
The batch macro has a Run Command tool that runs a ping that waits x seconds before timing out (it pings 18.104.22.168; if that address exists on your network, the wait won't work).
The macro can be downloaded here (InvisoMacros.zip).
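The per-row delay idea can be sketched in plain Python, with time.sleep standing in for the macro's ping trick. The function name below is hypothetical, purely to illustrate the pattern:

```python
import time

def process_with_delay(rows, seconds, process):
    """Apply `process` to each row, pausing between rows as WaitASecond does."""
    results = []
    for i, row in enumerate(rows):
        if i > 0:
            time.sleep(seconds)  # throttle, e.g. to respect an API rate limit
        results.append(process(row))
    return results

# Record a timestamp per row; consecutive stamps should be ~`seconds` apart
stamps = process_with_delay([1, 2, 3], 0.05, lambda r: time.monotonic())
gaps = [b - a for a, b in zip(stamps, stamps[1:])]
```

The ping-based version achieves the same pause without scripting, which is why it fits neatly inside a batch macro iteration.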
Question: Have you ever wanted your own help page for your custom macros or applications?
Answer: If you create your own macros or applications and send them to others who aren’t as familiar with your project, or if you just need a refresher from time to time, you may try to access the help menu only to be greeted by the general Alteryx macros/apps pages:
Macro Workflows Page
Analytics App Workflows Page
You can actually create your own help pages/files that can be accessed the same way you would normally access the Alteryx Help menu for any "out of the box" tool that comes with the Designer.
Using your favorite text editor (Microsoft Word, for example), you can create your help file with any instructions or graphics you feel would be helpful to the end users. Once you are done, save it in any file format that your (or your users') machines are able to open, and in any location those users can access (a network drive, for example).
In your application or macro’s Interface Designer Properties, there is an option to add the path of a file or hyperlink to your newly created help file.
As an example, I created the following help file as a .docx, a .pdf, and an .htm file type. Each of these files opens in its respective default program.
Objective: A dataset has 31 fields, seven of which represent categories of data and 24 of which represent each hour of a day (Figure 1). The objective is to multiply the values in each of the category fields by the values in each of the hour fields, effectively creating 24 fields for each of the 7 categories. The output should contain 199 fields*: the original seven category fields, the 24 original hourly fields, and the 168 fields generated in the workflow. The 168 fields should be named using both the category name and the hour: Category_1_1, Category_1_2…Category_7_24.
Method: Conceptually, the workflow will take each of the seven categories of data and multiply each categorical field by the values stored in each of the 24 hourly fields. The repetitive nature of this process lends itself nicely to a batch macro configuration to automate the calculations. One major obstacle, however, throws a bit of a wrench in our plans. Batch macro outputs “Union” the data, stacking data vertically. That means that the data associated with Categories 1-7 would be included in the output data as new records, not as new fields. Because the goal is to see these data “Joined”, or appended laterally, we essentially need to defy the laws of gravity (or, batch macros, as it were) by leveraging the powers of the Crosstab and Transpose tools. To successfully configure our data laterally, some key transformations need to happen at three points in the workflow: (I) before the data enters the batch macro, (II) inside the macro, and (III) after leaving the macro. Follow along with the attached workflow to visualize the process at each step.**
I. Before the data enters the Macro….
The very first step after bringing in data is to add a Record ID to assign each of your records a unique identifier. Not only do we recommend this step when the input data will be undergoing some significant transformations but, ultimately, this Record ID is the glue that holds this whole process together. Without it, it’d be a pretty tough, if not impossible, task to get this data pieced back together.
The next step is to split the data into two streams, one for the categorical data fields and another for the hourly data fields, using Select tools. Then, the data is Transposed, flipping the data vertically, using the Record ID as the Key Field and the Category or Hour fields as the data fields. This step prepares the data for the mathematical processes that will be applied to the data in the batch macro. The transposition of the Category data fields also creates an opportunity to use a Summarize tool to list the values (each of the Categories) that will be fed to the macro’s control parameter.
II. Inside the Batch Macro...
Once the data is brought into the batch macro, it is Joined together (by Record ID) so that the values stored in the Category fields sit alongside their corresponding data from the Hour fields. These values are then multiplied together in the Formula tool. Additionally, a new field, called “Placeholder”, is created that contains the name of the Category used for that batch of the macro. Note that this field is connected to the Control Parameter, which means the value in that field is updated with each batch to reflect the changing Category field names. You may notice that, at first glance, the “Placeholder” field seems unnecessary, as it simply duplicates the Category field. However, this field’s importance isn’t fully realized until we start to re-compile the data using Crosstab tools a bit further downstream.
At this point, the data is split again into the data associated with Categories and Hourly data to begin the process of horizontally arranging the data and updating the naming schema of the fields. Using Crosstab Tools, the Category Data is grouped by Record ID and Placeholder, which ensures that this newly created field carries through the macro. The Hourly data fields are split into two separate streams, one for the newly created data in the field “Multiplied” and the original data. Both data streams are updated using a Dynamic Rename tool to add prefixes to field names. The new Hourly data fields, however, are connected to the Control Parameter. This means that, like the value of the “Placeholder” field, the prefix of the field will be updated with every batch of the macro. Before the data leaves the batch macro, it is joined together with a Join Multiple tool. This is the first instance in which you can see that Record ID come in handy with re-assembling our data!
III. After the Batch Macro...
This third and final step of the workflow was developed to overcome the obstacle of the Unioned output of the batch macro process. Clicking on the “Browse” following the Batch Macro will show that all the records for each of the batches are now stacked vertically and our field names only reflect the first batch of data. All the data we want is technically there; we want to visualize it horizontally for a side-by-side comparison. To re-configure the data so that each batch of data is appended laterally, we can use another round of Transpose and Crosstab tools.
First, we’ll transpose the data so that, once again, it is arranged vertically. You might be thinking that this is a step backwards from our end goal. Well, in this case, it’s one step back, then two steps forward! By arranging our data vertically, not only do we get access to the field names we need to fix, we also prepare to create our 168 new field names according to our naming schema. The fields we’re primarily concerned about renaming are those with data that was calculated in the batch macro, which can be filtered out of the data. Then, using RegEx and a Formula tool, we begin the process of creating our 168 new field names. Once again, that seemingly unnecessary “Placeholder” field comes in handy! Since all of our field names contain “Category 1” in them, we can compare our RegEx output (what we don’t want) with the category contained in the “Placeholder” field (what we do want) in a Formula tool to re-construct every combination of category value and hour that we need. A final Crosstab aligns the data horizontally so that the data can be Joined and visualized in the way we intended. After all that data manipulation, how do we know we got it all back together correctly? Luckily for us, the Record ID we created at the beginning of our workflow ensures that both our Crosstab and Join tools connect everything correctly.
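To sanity-check the field arithmetic, the whole transformation can be sketched in plain Python. The values below are synthetic, and the dictionary stands in for a single record of the dataset; the workflow itself does all of this with Transpose, Crosstab, and Formula tools:

```python
# One input record: 7 category values and 24 hourly values (synthetic data)
record = {f"Category_{c}": c * 10 for c in range(1, 8)}
record.update({f"Hour_{h}": h for h in range(1, 25)})

# Multiply every category by every hour, naming the new fields
# Category_<c>_<h>, just as the batch macro + Crosstab steps do
out = dict(record)  # keep the 31 original fields
for c in range(1, 8):
    for h in range(1, 25):
        out[f"Category_{c}_{h}"] = record[f"Category_{c}"] * record[f"Hour_{h}"]
```

The arithmetic confirms the counts in the objective: 31 original fields plus 7 × 24 = 168 generated fields gives 199 fields per record.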
*Normally, we would not recommend creating 199 fields, but sometimes visualization or side-by-side comparison is the goal. Inspired by a true story.
**Workflow created in Alteryx Designer v10.5
This article is part of the CS Macro Development Series. The goal of this series is to communicate tips, tricks, and the thought process that goes into developing good, dynamic macros.
Implementing APIs in macros isn’t a difficult process; you only need to first understand macros and how they’ll interact with your API requests. In short, a macro gives you the ability to encapsulate just about any repeatable process in an Alteryx workflow as a tool. Understanding that, you only need to identify what in your API request will need to change with each row the request is being repeated for, and how to update that request dynamically.
For each row of your input data stream, you can use fields to reference what the individual values will be; doing so in a Formula tool builds out the parts of the request that change with each record. If instead you need to update an argument of the request just once for all your records, try using an interface tool and a placeholder value. Need to update parts of a request for only certain records? You can use formula logic or the batch macro’s control parameter approach.
Consider the Google Maps Geocoding API request format below:
If we were to send a request to their API to geocode a single address (specifying an xml output), this would look like:
To update this dynamically, within a macro, we need only to map our input fields to their appropriate places in the request, emulating the rest of the request syntax with formula logic:
(the Replace function exchanges spaces for + characters; the remaining + characters are added as literal strings to mirror the format above)
Then all that remains is updating our key before passing this to a Download tool; the key will be the same for all of our input rows:
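Assuming the standard Geocoding API endpoint, the same string assembly the Formula tool performs can be sketched in Python. The key below is a placeholder, and the function name is invented for illustration:

```python
def geocode_url(address, key, output="xml"):
    """Build a Google Maps Geocoding API request the way the macro's
    Formula tool does: spaces become '+' and the key is appended."""
    base = "https://maps.googleapis.com/maps/api/geocode/"
    return base + output + "?address=" + address.replace(" ", "+") + "&key=" + key

url = geocode_url("1600 Amphitheatre Parkway Mountain View CA", "YOUR_SERVER_KEY")
```

In the macro, the address portion comes from the input fields on each row, while the key comes from the interface tool and so stays constant across rows.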
The v10.5 example above is attached for reference. It is an adaptation of a more robust Google Maps Geocoder hosted on our gallery.
Please note that in order to use this macro, you must first generate a server key from the Google Developers Console. Each key has a limit of 2,500 free requests per day. Click here for more information on usage limits for the Google Maps API.
This macro demonstrates requests to Google Maps' web service API and is meant as a proof of concept only. It is not intended for production use.
Sometimes, especially when interacting with tools or files outside Alteryx, you need to explicitly control the order of events. The classic example is a process whereby you 1) INSERT into a SQL table where an auto-incrementing ID is applied, 2) read the records back in order to get the auto-incremented ID, and 3) insert the auto-incremented ID into a different table.
In Alteryx workflow streams that terminate (as in an Output or Render), you can't reliably use Block Until Done to control the stream. One way to reliably control the order of events is with batch macros.
Batch macros guarantee that 1) every iteration is complete before the next iteration begins, and 2) all iterations are complete before the macro reports itself complete. So if you place a process inside a batch macro and create an output stream from that macro, all processing will be done, and the macro can still exist in the middle of a workflow stream. You don't even need a Block Until Done as part of the process.
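The three-step pattern above can be sketched with Python's built-in sqlite3 module. The table names are invented for illustration, and lastrowid stands in for the workflow's step of re-reading the table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)")
cur.execute("CREATE TABLE order_log (order_id INTEGER, note TEXT)")

# 1) INSERT a row; the database assigns the auto-incrementing id
cur.execute("INSERT INTO orders (item) VALUES (?)", ("widget",))
# 2) Read the assigned id back (here via lastrowid)
new_id = cur.lastrowid
# 3) Insert that id into a different table; this must run after 1) and 2)
cur.execute("INSERT INTO order_log (order_id, note) VALUES (?, ?)", (new_id, "created"))
conn.commit()

logged = cur.execute("SELECT order_id FROM order_log").fetchone()[0]
```

A Python script runs these statements in order automatically; the batch macro's iteration guarantee is what gives you the same ordering inside an Alteryx workflow stream.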
Please see the attached example (authored in 10.0).
You probably already know that you can output results to multiple sheets of an Excel file. If not, you should check out our resource that explains how to do that very thing. But what if you run that workflow every day, and you want to keep the outputs from days past?