Did you know there are even more data science tools available for Alteryx, beyond what you see in your Designer palettes, for statistics, data prep, modeling and more?
In this week’s episode of our Alter Everything podcast, Chris @mceleavey talks about building his own Alteryx tools. One of Chris’s projects is especially helpful for many modeling tasks: a one-hot encoding tool that handles that pesky but important step. (He also wrote a blog post about one-hot encoding!)
Alongside Chris’s work, there are many more useful tools for data science in the Alteryx Analytics Gallery, our public repository of workflows, macros and analytic apps. Many of these tools reside in the Predictive District, but we scrounged up more from other corners of the Gallery.
Here’s a list of freely available macros, tools and sample workflows for data science tasks that might save you time and effort, plus a bonus package of tools developed by Alteryx enthusiasts.
Data Preparation and Statistics
- BoxCox
- Creator: @TimothyL
- Use for Box-Cox transformation, a power transformation that reshapes a positive-valued variable with a non-normal distribution so that it more closely approximates a normal distribution
- Pearson Correlation with Group By
- Creator: @k_koebisawa
- Allows you to group records by one field prior to calculating Pearson correlations
- One-Hot Encoder
- Creator: @mceleavey
- One-hot encodes data using Alteryx tools
- One Hot Encode vPython
- Creator: @TimothyL
- One-hot encodes data with pandas’ get_dummies
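If you're curious what one-hot encoding with pandas' get_dummies looks like outside of a macro, here is a minimal sketch. The sample data and column names are made up for illustration, and this is not necessarily how either macro above is implemented.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "plan": ["basic", "premium", "basic", "trial"],
})

# get_dummies replaces each listed categorical column with one indicator
# column per distinct value; dtype=int yields 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["plan"], prefix="plan", dtype=int)

print(encoded)
```

Each row ends up with exactly one 1 across the new plan_* columns. For linear models, passing drop_first=True drops one indicator per field to avoid perfectly collinear columns.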
More Modeling Options
- Predictive Analytics Starter Kit
- Creator: Alteryx Solutions
- Features sample workflows for essential prediction tasks, including linear and logistic regression and A/B testing
- K-Medoids
- Creator: Alteryx Innovation
- Offers another clustering method using the Partitioning Around Medoids (PAM) algorithm, which may be more robust to noise and outliers than K-means because each cluster center is an actual data point; check out the K-Medoids Sample once you have the macro for a demonstration
- Logistic Regression Worked Example
- Creator: @jamielaird
- Shows a full workflow for building a logistic regression model
- Multidimensional Scaling
- Creator: Alteryx Innovation
- Carries out multidimensional scaling, a dimensionality reduction technique with similarities to principal components analysis (PCA); see it in the sample workflow and read the documentation
- MB Affinity Sample
- Creator: Alteryx Innovation
- Provides a macro and an example for another method of market basket analysis using alternative measures of affinity, such as cosine similarity (documentation)
- Survival Analysis
- Creator: Alteryx Innovation
- Carries out survival analysis (documentation) and generates relative risk and survival time when used with the Survival Score macro; see them at work in the sample workflow
- XGBoost
- Creator: @TimothyL
- Includes sample workflow with Python and R macros for modeling with XGBoost
- Output Model
- Creator: Alteryx Innovation
- Saves a model as an R binary file or PMML file for use outside Alteryx
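To make the cosine similarity idea from the market basket item above a little more concrete, here is a small sketch with NumPy. The baskets and item names are invented, and this illustrates the general measure only, not the MB Affinity macro's own implementation.

```python
import numpy as np

# Hypothetical baskets: rows are transactions, columns are items (1 = purchased).
items = ["bread", "butter", "jam"]
baskets = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 1],
])

# Co-occurrence counts: entry (i, j) is the number of baskets holding both items.
co_occurrence = baskets.T @ baskets

# For binary data, the diagonal holds each item's basket count, so the cosine
# similarity is the co-occurrence divided by the geometric mean of the two counts.
counts = np.diag(co_occurrence)
cosine = co_occurrence / np.sqrt(np.outer(counts, counts))

print(np.round(cosine, 3))
```

Values closer to 1 mean two items tend to show up in the same baskets. This sketch assumes every item appears in at least one basket; an item with no purchases would cause a division by zero.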
Evaluating and Understanding Models
- Cross Validation Tool
- Creator: Alteryx Innovation
- Performs cross-validation to compare and evaluate models. Note: Download the .yxi file linked in the description; you’ll need admin privileges to install this new tool. Once the tool is installed, try out the sample workflow to see how to use the tool in a workflow with various models.
- FeatureScaler
- Creator: @Dominik2806
- Scales features so their values are between 0 and 1, a necessary step for some algorithms; can also back-transform values
- ImportanceWeights
- Creator: Alteryx Innovation
- Provides an approach to feature selection based on feature importance
- Variance Inflation Factors
- Creator: Alteryx Innovation
- Generates a report providing the variance inflation factors for variables in a model to help assess multicollinearity; see it in a sample workflow
- Model Coefficients
- Creator: Alteryx Innovation
- Outputs the model coefficient names and values from count, gamma, linear, or logistic regression models
- Model Comparison
- Creator: Alteryx Innovation
- Compares how models perform on a test set, and provides error measures and prediction results for each model, as shown in the sample workflow
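For a sense of what scaling features to the 0 to 1 range and back-transforming them involves, here is a minimal min-max sketch in NumPy. The function names and sample values are hypothetical, and the FeatureScaler macro may work differently under the hood.

```python
import numpy as np

def fit_min_max(values):
    """Return the (min, max) pair needed to scale and later un-scale a feature."""
    values = np.asarray(values, dtype=float)
    return values.min(), values.max()

def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Invert scale(): map values in [0, 1] back to the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Hypothetical feature values.
ages = [18, 30, 42, 66]
lo, hi = fit_min_max(ages)
scaled = scale(ages, lo, hi)
restored = unscale(scaled, lo, hi)

print(scaled)
print(restored)
```

Keeping the fitted minimum and maximum around is what makes the back-transform possible, and it also lets you apply the same scaling to new data later. A constant feature (where the minimum equals the maximum) would need special handling, since this sketch would divide by zero.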
For Time Series Analysis Fans
- TS Model Factory
- Creator: Alteryx Products
- Creates ARIMA or ETS time series models for multiple groups simultaneously (documentation)
- TS Forecast Factory
- Creator: Alteryx Products
- Generates forecasts from groups of time series models (ARIMA or ETS) for your specified number of future periods (documentation)
- TS Split Periods
- Creator: @RWvanLeeuwen
- Divides a time series dataset sorted chronologically to create training and test sets and visualizations
- TS Stationary Test
- Creator: @TimothyL
- Checks for stationarity in a time series using the augmented Dickey-Fuller test
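The chronological train/test split mentioned above can be sketched in a few lines of pandas. The helper name, column names and sample data here are all hypothetical, and this is only an illustration of the idea, not the TS Split Periods macro itself.

```python
import pandas as pd

def split_time_series(df, date_col, test_periods):
    """Hold out the last `test_periods` rows, in date order, as the test set.

    Assumes test_periods is a positive integer smaller than len(df).
    """
    ordered = df.sort_values(date_col).reset_index(drop=True)
    train = ordered.iloc[:-test_periods]
    test = ordered.iloc[-test_periods:]
    return train, test

# Hypothetical monthly series.
sales = pd.DataFrame({
    "month": pd.date_range("2023-01-01", periods=12, freq="MS"),
    "units": range(100, 112),
})

train, test = split_time_series(sales, "month", test_periods=3)

print(len(train), len(test))
```

Unlike a random split, this keeps every test observation later in time than every training observation, so the model is evaluated on periods it has not seen.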
And a bonus item: The ayx-builders pack on GitHub, created by @nick612haylund and @tlarsen7572, offers multiple handy tools for data science tasks, including a data generator and a Twitter scraping tool.
With these free tools, you can extend the data science capabilities of Designer and finish projects more easily and efficiently.
And, of course, be sure to listen to the Alter Everything conversation with Chris for inspiration, then share your own custom creations.
Blog teaser image by ThisisEngineering RAEng on Unsplash
Senior Data Science Journalist
Susan Currie Sivek, Ph.D., is the data science journalist for the Alteryx Community. She explores data science concepts with a global audience through blog posts and the Data Science Mixer podcast. Her background in academia and social science informs her approach to investigating data and communicating complex ideas — with a dash of creativity from her training in journalism. Susan also loves getting outdoors with her dog and relaxing with some good science fiction. Twitter: @susansivek