Another day, another awesome SparkED session in the books!
On March 29, 2022, Alteryx and CodePath joined forces for a second time to deliver an Alteryx 201 session that took a deeper look not only at predictive modeling, but also at the data preparation and thought process behind predictive analytics. (If you haven't read up on the 101 session, you can find a quick article on it here.)
At a high level, the workflow used open-source data from Kaggle to predict whether any given NBA team would win a home match in the 2021-22 season. Although there are many ways to approach this problem, a combination of Association Analysis, Logistic Regression, and a Stepwise function was used to produce the final prediction.
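For readers who think in code, here is a rough idea of what the logistic regression step does, written in plain Python. This is only a sketch and not the Alteryx workflow itself: the data is synthetic, the single feature `win_pct_diff` (home win percentage minus away win percentage) is a hypothetical stand-in for the Kaggle fields, and the stepwise variable selection is not shown.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.5, epochs=1000):
    # Batch gradient descent on the average log loss.
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_home_win(weights, bias, x):
    # Estimated probability that the home team wins.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Synthetic games (NOT the Kaggle data): the home team tends to win when
# its hypothetical win_pct_diff is positive, with a small home-court edge.
random.seed(0)
rows, labels = [], []
for _ in range(200):
    win_pct_diff = random.uniform(-0.5, 0.5)
    noise = random.gauss(0, 0.1)
    rows.append([win_pct_diff])
    labels.append(1 if win_pct_diff + 0.05 + noise > 0 else 0)

weights, bias = fit_logistic(rows, labels)
print("weight:", weights[0], "bias:", bias)
print("P(home win | diff=+0.4):", predict_home_win(weights, bias, [0.4]))
print("P(home win | diff=-0.4):", predict_home_win(weights, bias, [-0.4]))
```

The model outputs a probability between 0 and 1, which is what makes logistic regression a natural fit for a win/lose question like this one.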
The workflow is attached below. We also went through the exercise of integrating external version control with GitHub, following some of the best practices laid out in this article by my Alteryx colleague @DavidM. You can find the (albeit poorly documented) GitHub repo here.
A few questions came up during the session, one of which concerned Alteryx's ability to handle PCA (Principal Component Analysis). Although the topic wasn't covered in depth, Alteryx can in fact perform PCA through the Principal Components Tool; there's more on that in the Help Documentation!
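If PCA is new to you, the core idea fits in a few lines of plain Python: center the data, build the covariance matrix, and find the direction of greatest variance. The sketch below uses power iteration to find only the first principal component on made-up two-column data. It illustrates the general technique and makes no claim about how the Principal Components Tool is implemented internally.

```python
import math


def center(rows):
    # Subtract each column's mean so the data is centered at the origin.
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance_matrix(centered):
    # Sample covariance matrix (divides by n - 1).
    n = len(centered)
    dims = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_principal_component(cov, iterations=200):
    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize; the vector converges to the leading eigenvector.
    dims = len(cov)
    vec = [1.0] * dims
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec


# Made-up points lying close to the line y = 2x.
points = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]

cov = covariance_matrix(center(points))
pc1 = first_principal_component(cov)
print("first principal component:", pc1)
```

Because the toy points sit almost on a straight line, the first component alone captures nearly all of the variance, which is exactly the situation where PCA lets you drop dimensions without losing much information.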
All in all, the aim of the session was to bridge the gap between computer science and data analytics by showing how pairing a strong technical background with the right toolset has all the makings of a data superstar. For those interested in learning more, make sure to check out our Data Science Learning Path!
Until next time!