Data Science Blog

Machine learning & data science for beginners and experts alike.
Alteryx

Alteryx has always been focused on making it simpler for users to blend, explore and model data using an intuitive drag-and-drop interface. Over the years, we have introduced several new interactive visualizations for predictive analytics (Field Summary, Time Series, Network Analysis, etc.) to facilitate better data discovery and model exploration.

 

While many of you might be aware of these tools and have made use of them, what you may not know is that with a little bit of effort, you can roll your own custom interactive visualizations. In this two-part series, my goal is to show you precisely how to achieve this and work your way to glory!!! This first part is geared towards the R aficionados in the audience. In the second part, I will show you how to achieve the same without R.

 

I have always been interested in novel ways to visualize the customer journey. A useful technique for visualizing such flows is the Sankey diagram. As a user, I want to be able to take customer touchpoint data and visualize it as an interactive Sankey chart.

 

sankey1.png

 

Fortunately for us, there is a handy R package, networkD3, with a function named sankeyNetwork that makes this as easy as a few lines of R code. The function takes two tables, one containing the nodes and the other containing the links between the nodes, and produces a nice interactive Sankey chart! Each row of the links table captures the share of customers flowing from a source node to a target node. The source and target columns identify nodes by their zero-based position in the nodes table, so 0 is the first node ('Home' in the example that follows), 1 is the second, and so on. The example uses random values as stand-ins for real percentages.

 

# Load Library
library(networkD3)

# Read Data
nodes <- data.frame(
  name = c('Home', 'FAQ', 'Registration', 'Account', 'Cart', 'Checkout', 'Complete')
)
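# source and target are zero-based row positions in the nodes table
links <- data.frame(
  source = c(0, 0, 0, 1, 2, 3, 4, 5),
  target = c(1, 2, 3, 4, 4, 4, 5, 6),
  value = runif(8, 0, 1)  # random placeholder values
)

# Create Plot
# Name the Source and Target columns explicitly so the call does not depend on column order
p <- sankeyNetwork(Links = links, Nodes = nodes, Source = 'source',
  Target = 'target', Value = 'value', NodeID = 'name',
  fontSize = 12, fontFamily = 'Arial'
)
p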
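
Real touchpoint data usually identifies pages by name rather than by number. Here is a minimal sketch of how the two tables could be derived in that case, using base R only. The data frame flows and its columns from, to and share are hypothetical names made up for illustration.

```r
# Hypothetical touchpoint data: one row per transition, identified by page name
flows <- data.frame(
  from  = c("Home", "Home", "FAQ", "Cart"),
  to    = c("FAQ", "Cart", "Cart", "Checkout"),
  share = c(0.3, 0.7, 0.2, 0.6),
  stringsAsFactors = FALSE
)

# Nodes table: every page that appears as a source or a target, in first-seen order
nodes <- data.frame(
  name = unique(c(flows$from, flows$to)),
  stringsAsFactors = FALSE
)

# Links table: match() returns 1-based positions, so subtract 1
# to get the zero-based indices that the chart expects
links <- data.frame(
  source = match(flows$from, nodes$name) - 1,
  target = match(flows$to, nodes$name) - 1,
  value  = flows$share
)
```

The resulting nodes and links tables have the same shape as the hand-written ones, so they can be passed to sankeyNetwork in the same way.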

 

How hard would it be to throw this code into Alteryx and create a Sankey Chart macro? As with most things Alteryx, you probably already know the answer: not hard at all :-). See it to believe it!

 

 

r_alteryx_diff2.png

 

This diff illustrates the key changes you need in order to make this code work in the Alteryx R Tool! You might notice that the only portions of the code I had to change were the I/O (Input/Output) parts, and this is true of any R code that you integrate into Alteryx. The output from the R Tool is passed on to the Report Text tool as PCXML.
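
For readers who cannot see the image, the I/O swap might look roughly like the sketch below. This is not the exact code from the diff. It assumes the nodes arrive on input anchor #1 and the links on #2, and that renderInComposer accepts an nOutput argument naming the output anchor; check both against your own installation. The helper read_input is hypothetical and exists only so that the script also runs in a plain R session.

```r
# read.Alteryx() only exists inside the Alteryx R Tool
in_alteryx <- exists("read.Alteryx")

# Hypothetical helper: read from an input anchor inside Alteryx,
# otherwise return sample data
read_input <- function(anchor, fallback) {
  if (in_alteryx) read.Alteryx(anchor, mode = "data.frame") else fallback
}

# Assumption: nodes are connected to input #1 and links to input #2
nodes <- read_input("#1", data.frame(name = c("Home", "Cart", "Checkout")))
links <- read_input("#2", data.frame(
  source = c(0, 1), target = c(1, 2), value = c(0.6, 0.4)
))

if (requireNamespace("networkD3", quietly = TRUE)) {
  p <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes, Source = "source", Target = "target",
    Value = "value", NodeID = "name", fontSize = 12, fontFamily = "Arial"
  )
  if (in_alteryx) {
    # Assumption: nOutput = 1 sends the widget to the first output anchor
    renderInComposer(p, nOutput = 1)
  } else if (interactive()) {
    print(p)  # view the chart in a regular R session
  }
}
```

The plotting call itself stays the same. Only the lines that read the data and emit the chart differ between the two environments.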

 

Sankey Network Macro 

 

The sankeyNetwork function allows the user to customize different aspects of the chart. We can easily rope these in using the Interface Tools and pass them on to the R Tool using question constants ('%Question.fontSize%').
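
As a rough sketch of how this works: Alteryx replaces the question constant with the interface value as plain text before the R code runs, so numeric options need to be converted from strings. The helper question_or_default and the fontFamily question name are hypothetical. The fallback is only there so that the snippet also runs outside the macro, where the placeholder is left untouched.

```r
# Hypothetical helper: return the substituted value, or a default when the
# placeholder was not replaced (for example when testing outside the macro)
question_or_default <- function(raw, default) {
  if (grepl("^%Question\\..*%$", raw)) default else raw
}

# Inside the macro, '%Question.fontSize%' becomes e.g. '14' before R sees it
font_size <- as.numeric(question_or_default('%Question.fontSize%', 12))

# Assumption: a second interface question named fontFamily exists
font_family <- question_or_default('%Question.fontFamily%', 'Arial')
```

The values font_size and font_family can then be passed to the fontSize and fontFamily arguments of sankeyNetwork.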

 

diff2.png

 

We can now use this macro in a workflow to visualize energy production and consumption. The data for this visualization comes from this post by Mike Bostock, who is the author of D3.js, the library that powers several interactive visualizations, including this one.

 

 

sankey2.gif

 

You might now be thinking: "This is great. But hey, I want an interactive scatterplot matrix. Can you do that?" The answer is yes! I deliberately skipped mentioning something earlier. These interactive visualizations are powered by htmlwidgets, an R package I coauthored that brings JavaScript visualizations to the R console. The renderInComposer function brings these plots from R into Alteryx. The neat thing is that any R package built on top of htmlwidgets can be used in Alteryx. There are 76 such widgets to date, which you can browse in this widget gallery.

 

Coming back to the question of creating interactive scatterplot matrices, there is an R package pairsD3 that does that. You can create a nice interactive scatterplot matrix using a few lines of R code.

 

library(pairsD3)
pairsD3(iris[,1:4], group = iris[,5])

splom.gif

 

My challenge for you is to wrap this in an Alteryx macro and provide a nice user interface to customize various features. Who is up for it?
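
If you want a starting point, here is one possible sketch of the data-preparation step, not a finished macro. It assumes the data arrives on input anchor #1 and uses iris as a stand-in outside Alteryx. The variable group_col is hypothetical and would come from an interface question in a real macro.

```r
# Assumption: inside the R Tool the data arrives on input #1; iris is a stand-in
df <- if (exists("read.Alteryx")) {
  read.Alteryx("#1", mode = "data.frame")
} else {
  iris
}

# Hypothetical macro option: the grouping column chosen by the user
group_col <- "Species"

# Plot only the numeric columns and color the points by the grouping column
numeric_cols <- names(Filter(is.numeric, df))
groups <- df[[group_col]]

if (requireNamespace("pairsD3", quietly = TRUE)) {
  p <- pairsD3::pairsD3(df[, numeric_cols], group = groups)
}
```

From here, the remaining work is wiring group_col and any other options to interface tools and sending p to the output.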

 

I hope this post has stoked some interest in creating custom interactive visualizations in Alteryx using R. In my next post, I will discuss how you can bring arbitrary JavaScript visualizations into Alteryx and create macros that rock!

 

Comments
Atom

That is awesome. I am curious if some of the other D3JS visual components could be used!! Keep sharing. 

Alteryx Partner

Cool