This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Data blending, transformation and cleansing..oh my! Whether you're looking to apply a mathematical formula to your numeric data, perform string operations on your text fields (like removing unwanted characters), or aggregate your spatial data (among many other things!), the Formula Tool is the place to start. With the examples provided below, you should be on your way to harnessing the many functions of the Formula Tool:
The Select Tool within the Alteryx Designer is the equivalent of your High School Sweetheart. Always there when you needed them and helped you find out more about yourself. The Select Tool can do exactly this by showing you the data type and structure of your data, but it also gives you the flexbility to change aspects of your dataset.
The Neural Network Tool in Alteryx implements functions from the nnet package in R to generate a type of neural networks called multilayer perceptrons. By definition, neural network models generated by this tool are feed-forward (meaning data only flows in one direction through the network) and include a single Hidden Layer. In this Tool Mastery, we will review the configuration of the tool, as well as what is included in the Object and Report outputs.
Sampling weights, also known as survey weights, are positive values associated with the observations (rows) in your dataset (sample), used to ensure that metrics derived from a data set are representative of the population (the set of observations).
Time series forecasting is using a model to predict future values based on previously observed values. In a time series forecast, the prediction is based on history and we are assuming the future will resemble the past. We project current trends using existing data.
Unlike a snowflake, it is actually possible for duplicates exist when it comes to data. To distinguish whether or not a record in your data is unique or a duplicate we have an awesome tool called the Unique Tool that will actually turn your data into a unique snowflake.
The Association Analysis Tool allows you to choose any numerical fields and assesses the level of correlation between those fields. You can either use the Pearson product-moment correlation, Spearmen rank-order correlation, or Hoeffding's D statistics to perform your analysis. You can also have the option of doing an in-depth analysis of your target variable in relation to the other numerical fields. After you’ve run through the tool, you will have two outputs:
The Multi-Row Formula Tool functions much like the normal Formula Tool but adds the ability to reference multiple rows of data within one expression. Say, for example, someone was on the ground floor of a house and had a Formula Tool. They would only be able to talk to the people also on the ground floor. If they had a Multi-Row Formula Tool, though, they would also be able to talk to the people upstairs, in the attic, and in the basement as well.
Far more than just a window to your data, the Browse Tool has a catalog of features to best view, investigate, and copy/save data at any checkpoint you place it. That introspection to your data anywhere in your blending gives valuable feedback that often speeds workflow development and makes it easier to learn tools by readily visualizing their transforms. Be equipped, and browse through the catalog of useful applications below!
The Auto Field tool examines your data, and automatically optimizes the field type and length. Take a look at your data with a Select tool, follow this up with an Auto Field, and follow that up with another Select tool to see what kind of changes you’ve made. After you run the module, you can examine each Select tool to garner a before-and-after view of the adjustments made to the fields. You can even take this a step further, and add a few Browse tools to see how your database actually decreases in size, you may be surprised by how much! In the below view, the file size was reduced by about 40% with the Auto Field tool, on just 50,000 records and one field. Now consider running a file of millions of records and the amount of the size decrease becomes really substantial! Before Auto Field View: After Auto Field View:
Alteryx provides GUI tools that offer similar functionality to many SQL commands. Although minimal SQL scripting may be necessary in order to properly configure tools, the amount required to complete analysis is significantly reduced.
There are a handful of ways to search for a particular string within a data field. If you want to perform a query, identifying records with a particular string field within a data field:
Use the Filter tool: the result will be two streams - those records that meet your filter criteria and those that do not.
On the Functions Tab, expand the string tree and select FINDSTRING(String, Target)
Replace the `String` parameter with the field name
Replace the `Target` parameter with the value you are looking to identify
Finish the expression with !=-1 which will separate the true values from the false ones.
Example - If you are trying to identify all of the customers with Joe in a data field [Name] :
The records that meet this criteria will be output from the True anchor ([Name] contains the value "Joe"). All other records will be output from the False anchor.
This function can also be used in the Formula tool; if for example you want to populate a different data column based on the [Name] field, you can use the FINDSTRING in an IF statement. Example: You would like to classify your data in a new field based on the instance of customers with Joe in a data field [Name] :
IF (FINDSTRING([Name], "Joe") != -1) THEN "JOE Customer" ELSE "Other" ENDIF
This will populate a new data field with "JOE Customer" if the field "Name" contains "Joe" otherwise it will populate that field with the value "Other"
In the Formula tool, add a new field by selecting + Add Column, or choose an existing field to update.
Make sure the appropriate Field Type and Size is also specified
On the Functions Tab, expand the Conditional Tree and select IF c THEN t ELSE f ENDIF
On the Functions Tab, expand the String tree and select FINDSTRING(String, Target) to replace c
Replace `String` with the field name
Replace `Target` with the string you are looking to identify
Finish this part of the expression with !=-1 which will separate the true values from the false ones
Replace "t" with the desired value to populate the new field if the condition is met: "JOE Customer"
Replace "f" with the desired value to populate the new field if the condition is not met: "Other"
If you’ve gotten accustomed to having headers and sub-headers (fields/sub-fields) in your tables, much like those you can build with the “Merge & Center” Excel feature, you’ll be happy to hear with just a couple formatting steps we can build the format in this article