If you are developing with Alteryx's In-Database tools, you may want to share your workflows with users who have less database experience. Below is an example workflow for creating a macro that lets your users select which fields are brought out of the In-DB tools without interacting with the In-DB tools directly.
The premise of the process is that you stream out a single record of your table, then leverage a List Box tool in Select mode to deselect the fields your user does not want to pull. From there, you can use the Field Summary and Summarize tools to create a SELECT statement that will be used to update a Dynamic Input In-DB tool.
To run in your environment, update the Connect In-DB tool to a connection that exists on your machine and start with a select all query.
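The string the Summarize step assembles can be sketched outside Alteryx. This is a minimal Python analogue (the field and table names are made up for illustration): the fields the user left checked are joined into a SELECT statement like the one that updates the Dynamic Input In-DB tool.

```python
# Hypothetical field list; "Region" is the field the user un-checked
# in the List Box tool.
all_fields = ["CustomerID", "StoreID", "Sales", "Region"]
deselected = {"Region"}

# Keep only the checked fields and assemble the SELECT statement.
kept = [f for f in all_fields if f not in deselected]
query = "SELECT " + ", ".join(kept) + " FROM MyTable"
# query -> "SELECT CustomerID, StoreID, Sales FROM MyTable"
```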
While creating a BINGO game, I came across a technique that I thought could be useful in "real world" scenarios for users who want to iterate a process and then replenish the data after a certain amount of time.
How do I know if I need to use a Join or a Union tool?
This is a fairly common question. You have two data sets and you need to combine them into one larger data set, but how do you go about doing that? Alteryx has two tools that make this very easy, depending on what you're trying to accomplish: the Join Tool and the Union Tool.
So what is the difference anyway?
The Join tool makes your file wider, meaning it adds fields to your data, or combines your inputs horizontally. You want to use this tool if you're looking to add information to existing records. For instance, maybe you have a list of customers with the store ID that they shop at, and a separate list of the store locations. If you want to add the store information to the customer information, you'd do a Join based on the Store ID field. This matches each customer with their store, record by record, so your customer file now has the store information on it. Note that this can result in duplicate records depending on how your data sets are set up, so be prepared to check your data.
The Union tool makes your file longer, meaning it adds records to your data, or combines your inputs vertically. You want to use this tool if you're looking to stack two files on top of each other. For instance, maybe you have a list of customers from each region of your sales organization and want a single master customer file. All your region files have the exact same fields. The Union tool will align your data sets based on the Field names (either automatically or you can do it manually if there are slight variations) and you'll end up with one master file with all the same fields.
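The wider-vs-longer distinction can be sketched in a few lines of plain Python (the customer and store records are invented for illustration): the Join adds the store's fields to each matching customer record, while the Union simply stacks record lists with the same fields.

```python
customers = [
    {"CustomerID": 1, "StoreID": "A"},
    {"CustomerID": 2, "StoreID": "B"},
]
stores = {"A": {"City": "Denver"}, "B": {"City": "Boulder"}}

# Join: widen each customer record with its store's fields,
# matching on StoreID (the join key).
joined = [{**c, **stores[c["StoreID"]]} for c in customers]
# joined[0] -> {"CustomerID": 1, "StoreID": "A", "City": "Denver"}

# Union: stack a second regional list with the same fields
# underneath the first, making the data set longer.
east_region = [{"CustomerID": 3, "StoreID": "C"}]
unioned = customers + east_region  # 3 records, same fields
```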
Take a look at the attached sample built in v11.3 that demonstrates the difference!
Amazon offers many different services that can be accessed from the AWS Command Line Interface (aka the AWS CLI). Alteryx workflows can take advantage of these services using the AWS CLI from the Run Command tool.
Some of the common uses of the AWS CLI within Alteryx include:
Using S3 with IAM roles
Uploading/Downloading file types not currently supported by the S3 Upload / Download tools
Interacting with Athena
As an example of using the AWS CLI, let’s build a process that takes a local file and copies it to S3, using an AWS IAM profile. Here are the steps involved:
Review the documentation for the service. In this example, we're using the "S3" service, but Amazon has many different AWS services.
Install the AWS CLI on any machines that will need to execute the AWS functions
From Windows Command Line, create a working command
(Optional step to make calling the CLI easier in the future) – Add a reference to the AWS CLI executable to the Windows environment variables via My Computer > Right Click > Properties > Advanced System Settings > Environment Variables > System Variables > Select “Path” > Edit > (Add the path to the directory of your installed “aws.exe” file, separated by a semicolon. Do not delete/modify any pre-existing values in the Path variable.) > OK > OK
Per the AWS documentation, create a local profile using your IAM role. The local profile references your credentials so that you don't have to pass the credentials in every command.
Test the following command in your Windows command prompt, making sure to use a valid local file to copy, a valid destination location on your S3, and a valid profile name:
aws s3 cp "C:\temp\mylocalfile.csv" "s3://mylocation/mydestinationfilename.csv" --profile MyIAMUser
Once you've validated your command in the command prompt, transfer it to an Alteryx workflow
In a new workflow, pull in a “Run Command” tool from the “Developer” category
Configure the Run Command tool using the working command:
Test the above
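For reference, the command the Run Command tool ends up executing can be sketched from a script as well. This is a hedged Python sketch (the file paths and profile name are placeholders from the example above, not real resources); the actual execution line is left commented out since it requires the AWS CLI to be installed and configured.

```python
import subprocess  # would be used to actually invoke the CLI

local_file = r"C:\temp\mylocalfile.csv"                   # placeholder path
dest = "s3://mylocation/mydestinationfilename.csv"        # placeholder bucket/key
profile = "MyIAMUser"                                     # placeholder IAM profile

# The same command the Run Command tool executes, as an argument list:
cmd = ["aws", "s3", "cp", local_file, dest, "--profile", profile]

# On a machine with the AWS CLI installed and the profile configured:
# subprocess.run(cmd, check=True)
```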
As an optional step, we can wrap this functionality into a macro to make it easy to update, and to support batch uploads
Add a Control Parameter for “Local File Path” and configure its action to update the specific string value “C:\temp\myfile.csv”
Add a Control Parameter for “Destination File Path” and configure its action to update the specific string value “s3://mybucket/myfiletest1.csv”
Add a Text Box and configure its action to update the command value “aws” in case the user hasn’t placed the CLI exe in their PATH variables
The example above is attached in the 11.3 workflow, SaveToS3_usage.yxzp.
The Transpose tool pivots data for all selected fields. The column headers are listed in the name field and the corresponding data items are listed in the value field. You can also select key fields which will remain unchanged through the transformation. The transpose tool is often used in conjunction with the Cross Tab tool, which essentially works in the opposite direction.
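The Name/Value pivot the Transpose tool performs can be sketched in plain Python (the record and field names here are invented for illustration): each selected field becomes its own row with the header in Name and the data item in Value, while the key field is carried through unchanged.

```python
# One incoming record; CustomerID is the key field that stays unchanged.
row = {"CustomerID": 1, "Q1_Sales": 100, "Q2_Sales": 150}
key_fields = ["CustomerID"]

# Each selected (non-key) field becomes a Name/Value row pair.
transposed = [
    {"CustomerID": row["CustomerID"], "Name": field, "Value": value}
    for field, value in row.items()
    if field not in key_fields
]
# -> [{"CustomerID": 1, "Name": "Q1_Sales", "Value": 100},
#     {"CustomerID": 1, "Name": "Q2_Sales", "Value": 150}]
```

A Cross Tab would reverse this, turning the Name entries back into column headers.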
A must-have for any app or macro, the Error Message Tool displays a prompt to the user based on input from Interface Tools. Using any expression that evaluates to true, along with any number of user inputs from Interface Tools connected to the question anchor, the Error Message Tool can layer even the most involved applications with failsafes that ease a user’s experience through more robust interactions.
The ConsumerView Matching macro enables users to match their customer file to the Experian ConsumerView data. Starting with customer information such as name and address you can leverage the ConsumerView macro in Alteryx to append a variety of information about your customers such as household segmentation, home purchase price, presence of children in a home, estimated education and income levels, length of residence, and many more!
Binary Large OBject (BLOB) data types are often used to store images, audio, and other multimedia files/objects in a single, standardized format for simplified database management - making them a frequent file type in Alteryx Designer. Fortunately, with the Blob Convert Tool, along with the Blob Input and Blob Output Developer Tools, working with BLOB objects is no more difficult than the file types they represent!
When working with SPSS, values can have both a text label and a numeric representation of the categories (the equivalent of string factors in R). Columns can also have an encoded name, e.g., Q_1, and a longer descriptive name that maps Q_1 to the original question that was asked (the closest thing in R is the data frame attribute on the column).
Alteryx reads .sav files and loads either the numeric representation or the textual representation of values based on the user’s selection. It also reads the variable labels into the Alteryx Field Description. When writing .sav output, Alteryx will write either the text or the numeric values (depending on what was used in the workflow) as well as the SPSS variable labels which were displayed in the description field. However, sometimes, to maintain the integrity of the whole SPSS file, clients will want the value labels, value levels, and variable labels to appear in the output file. For these cases, using the Alteryx tools and a few lines of R code (while leveraging the thousands of R packages on CRAN) wrapped in a macro gives us the needed functionality. Attached is a macro which will write the data, variable & value labels back into SPSS.
In this section, we will explain the R code logic that is specific to this macro. You can get an introduction to writing custom R code in Alteryx here.
Before we can do anything, we will need to pass the data to the tools inside the macro (more details on macros here). The raw numeric data should be connected to the D input of the macro. In addition, and because the data frames created in R don’t contain the Field Description data, we need to pass Field Description values to the M input (M for Metadata) of the macro. We’re using the Field Info Tool to extract the description into the rows and send it to the macro. With that done we can now look inside the macro.
Inside the Macro
Inside the macro, we are using the R Tool to contain the main functionality. We connect it to the Interface tools (Macro Inputs, File Browse, Action Tool) to get the data, metadata, and output file path from the user. Finally, we’re using the API tool to pass a message back to the user in the Alteryx Execution Messages.
The R Tool contains the code to properly format the input data and write it out to the .sav file. The majority of the work is already done for us in the ‘sjmisc’ package from CRAN (R users know that most of the time they can find a package that does what they want). This package, among other features, implements reading and writing .sav files with both variable and value labels. We will first check whether the package is already installed and, if not, attempt to install it.
This workflow installs the additional sjmisc package. To avoid package version and dependency issues, it is possible to use Microsoft R Client as the base R. More details here.
If you would like to install the package separately you can use the R install packages App from the Alteryx Gallery.
filePath <- "c:\\temp\\outputRAlteryx.sav"
inputData <- read.Alteryx("#1", mode="data.frame")
ColumnLabels <- as.vector(read.Alteryx("#2", mode="data.frame")$Description)
Within the R code we also define a static ‘filePath’ variable pointing to where the output data should be written. The Action Tool that we previously mentioned updates this file path to the one chosen by the user, while correctly escaping the backslashes to work in R.
We then read the data that we pass to the macro from input ‘#1’ into an R data frame. In this case we depend on R’s default behavior of converting text to factors to get the value encodings for all columns, e.g., Male (1), Female (2). In addition, we read the metadata from input ‘#2’ into R. The sjmisc function set_label, which applies the variable labels to the data frame, expects the labels to be passed as an object of type vector. To make it work, we have to convert the Description column of the data frame we’re reading in into a vector with the as.vector() base R function. For more details about ‘sjmisc’, you can find the documentation here.
labeledData <- sjmisc::set_label(inputData, ColumnLabels)
sjmisc::write_spss(labeledData, filePath)
Finally, we label inputData with the labels we just read in, store the result in the labeledData data frame, and write it to the user’s file path using sjmisc’s write_spss function.
MessageOut <- paste("file written to: ",filePath)
names(MessageOut) <- "Output File Path"
We also pass the filepath as a message to the R Tool output to be displayed to the user.
Edit: It was brought to our attention that the macro has an issue writing out text columns that are longer than 120 characters. Unfortunately, this is a defect in the underlying R package. As a workaround for now, the macro was modified to trim all text fields to 120 characters. Please keep this in mind when writing out data.
Mandatory Note: This macro and sample code were developed by the authors as a proof of concept to show what's possible. This is not a production-ready macro and is not supported by Alteryx. Do ask questions on this thread - BUT use at your own risk!
WriteSPSSWithLabels_sjlabelled.yxzp has been updated from using the R package sjmisc because the set_label command has been deprecated from sjmisc and is now in sjlabelled.
Jordan Barker & Fadi Bassil
This tool provides a number of different univariate time series plots that are useful in both better understanding the time series data and determining how to proceed in developing a forecasting model.
Have you ever wanted to do a Cross Tab, but needed the results in a particular order? You've probably discovered that Cross Tab will dutifully change your data to column headings and keep the rest of the data correlated, but that the output columns will be arranged in alphabetical order. It's as if a Select tool with the "Sort on Field Name" option had been thrown into your workflow without you putting it there.
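The alphabetical-ordering effect, and the fix of reordering to a desired sequence afterwards, can be sketched in plain Python (the region names are invented for illustration):

```python
# The values that become column headings after a Cross Tab:
values = ["West", "East", "North"]

# Cross Tab emits them alphabetically, like sorted() does here.
crosstab_columns = sorted(values)          # ['East', 'North', 'West']

# Reorder to the sequence you actually wanted, keeping only
# columns that exist (the Select-tool-style reorder step).
desired_order = ["North", "East", "West"]
reordered = [c for c in desired_order if c in crosstab_columns]
```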
How Do I Replicate the WHERE EXISTS Functionality of SQL within Alteryx?
Example 1: UNION - Add rows from Table #1 to Table #2 if the key value of Table #1 does not exist in Table #2. If desired, the combined data set could then be joined with a third data set in the same way, adding records only where the key value from Table #3 does not already exist, using a single key column.
Example 2: SELECT/UPDATE records from Table #1 based on the contents of Table #2. The statement below generates the names of customers who had orders during 2016.
select c1.customer_number,
       c1.customer_name
from customers c1
where 1 = 1
  and exists (
        select *
        from customer_orders c2
        where 1 = 1
          and c1.customer_number = c2.customer_number
          and c2.order_year = 2016
      );
The SQL EXISTS condition is used in combination with a subquery and is considered to be met if the subquery returns at least one row. It can be used in a SELECT, INSERT, UPDATE, or DELETE statement.
The syntax for the EXISTS condition in SQL is: WHERE EXISTS (subquery).
The subquery is a SELECT statement. If the subquery returns at least one record in its result set, the EXISTS clause will evaluate to true and the EXISTS condition will be met. If the subquery does not return any records, the EXISTS clause will evaluate to false and the EXISTS condition will not be met.
To do this in Alteryx (see attached workflow):
Scenario 1: Join the two data sets, take the unmatched records from the source you are appending data from, and union them back to the data set you are appending to.
Scenario 2: Create your "look up list" with a filter on whatever you set as the condition - in this example, a year that is in the data. Join the filtered data back with the other data source, then deselect the fields that come from the filtered source.
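Scenario 2 amounts to a semi-join: filter one table into a "look up list" of keys, then keep only the records from the other table whose key appears in it. A minimal Python sketch of the 2016-orders example above (the customer names are invented for illustration):

```python
customers = [
    {"customer_number": 1, "customer_name": "Acme"},
    {"customer_number": 2, "customer_name": "Globex"},
]
customer_orders = [
    {"customer_number": 1, "order_year": 2016},
    {"customer_number": 2, "order_year": 2015},
]

# Build the "look up list": customer numbers with a 2016 order
# (the Filter step).
orders_2016 = {o["customer_number"] for o in customer_orders
               if o["order_year"] == 2016}

# Keep only customers whose key EXISTS in the look up list
# (the Join step, with the appended fields deselected).
result = [c for c in customers if c["customer_number"] in orders_2016]
# -> only the customer who ordered in 2016
```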
Example attached in the v11.3 workflow Where Exists Question.yxmd.