
SusanCS
Alteryx Alumni (Retired)

Oh, 2020. We’re ready to leave this year behind for so many reasons. But there are some bright spots for data professionals as we head into 2021. 

 

In this week’s Alter Everything podcast episode, guest Steve Mann from Alteryx partner Propel32 Analytics discusses the increasing importance of analytics in the mergers and acquisitions field in recent years. Data analysts and data scientists must constantly adapt to that kind of change, and there’s always something new to learn!






Let’s take a look at some of the trends and developments that emerged in 2020 that might shape and inspire our work in the year to come. And, if news of these flew right past you (totally understandable in 2020!), we’ve got some links and resources to get you up to speed.

 

  • The impact of the pandemic on forecasting and our (future) historical data. One of the biggest challenges for data analysts and data scientists this year has been dealing with the radical changes in consumer behavior, supply chain availability, public opinion, political realities and everything else during a pandemic. Data from the past suddenly became less relevant for many (although the fashion industry has looked to the trends during and after the 1918 influenza pandemic for ideas!). Moreover, data collected in 2020 will contain anomalies that can confuse or break models trained on it. Data experts will have to find creative ways to deal with these divergences in 2021 and beyond.

 

  • Automating data processes and machine learning. The automated building of machine learning models was a hot topic in data science in 2020. Alteryx released its own Assisted Modeling capabilities as part of the Intelligence Suite, including a one-click automatic modeling option. Featuretools, an open-source Python package from the Alteryx Innovation Labs, continued to develop automated feature engineering options, and the Labs’ Compose package (also open-source) presented automated prediction engineering. More tools to automate data science will no doubt emerge in 2021.

 

  • Unstructured data analysis and generation. Analyzing unstructured data became more accessible this year with our own topic modeling and sentiment analysis tools in the Alteryx Intelligence Suite. We also saw lots of news stories about GPT-3, including one that was, controversially, “written by” it. GPT-3 is a language model that was trained on hundreds of billions of words of text, most of it from the internet. It has been used for remarkably effective text generation and can even write code. And, of course, many news stories addressed generative adversarial networks, or GANs, which most famously can generate convincing, but wholly fabricated, images of human faces. (Try this site and see if, and how, you can find the ‘tells’ of algorithmic fabrication.) Video generation is coming along as well; tennis fans especially should check out this example of synthesized video from the Vid2Player project.

 

 

 

  • More attention to issues of bias and fairness in machine learning and AI applications. Along with the Black Lives Matter movement and greater attention to issues of systemic racism, this year saw increased concern about, and public discussion of, the potential for bias and injustice in the use of data and algorithms. Notably, the documentary “Coded Bias” explored many of these problems and was screened at virtual film festivals. The MIT researcher featured in the film, Joy Buolamwini, has testified before the U.S. House of Representatives about her work.

 

  • Resources for starting the data literacy journey early. A recent article about Amazon noted that the company is working with preschools to help start technology education early. Amazon isn’t alone in thinking about getting kids into tech. Alteryx released its own data science resources for K-12 students this year, including a special podcast miniseries, “Data [in the] Sandbox,” that teaches kids essential data concepts. Microsoft and Netflix also released a collaborative project to teach kids data science. Clearly, getting young people into technology and data at an early age is on everyone’s minds as data skills continue to gain value.
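
To make the first trend above a little more concrete: one simple way to cope with pandemic-era divergences is to flag the affected period explicitly, so it can be modeled separately or left out of a "normal" baseline. The sketch below is only a hypothetical illustration in Python. The sales figures, the DISRUPTION_START cutoff and the helper names are all assumptions invented for this example, not real data or part of any Alteryx tool.

```python
from datetime import date
from statistics import mean

# Hypothetical monthly sales; every number here is invented for illustration.
monthly_sales = {
    date(2019, 11, 1): 100,
    date(2019, 12, 1): 110,
    date(2020, 1, 1): 105,
    date(2020, 2, 1): 102,
    date(2020, 3, 1): 60,
    date(2020, 4, 1): 35,
    date(2020, 5, 1): 50,
}

# Assumed start of the disruption; pick a cutoff that fits your own data.
DISRUPTION_START = date(2020, 3, 1)


def is_disrupted(month):
    """Return True if the month falls on or after the assumed cutoff."""
    return month >= DISRUPTION_START


def split_by_disruption(series):
    """Split values into (normal, disrupted) lists, in date order."""
    normal = [v for m, v in sorted(series.items()) if not is_disrupted(m)]
    disrupted = [v for m, v in sorted(series.items()) if is_disrupted(m)]
    return normal, disrupted


normal, disrupted = split_by_disruption(monthly_sales)

# Baseline computed only from pre-disruption months.
baseline = mean(normal)
print(f"Pre-disruption baseline: {baseline}")
print(f"Disrupted months flagged: {len(disrupted)}")
```

A flag like this can also be passed to a forecasting model as an indicator feature instead of dropping the months entirely. Which approach works better depends on the data and the question being asked.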

 

 

 

 

There will certainly be more excitement (hopefully of a positive variety) to come in 2021. We have some cool things in the works here at Alteryx, and we’ll continue to share them here and at the Data Science Portal.

 

And in the meantime, be sure to listen to this week’s Alter Everything episode to hear about Steve’s work. You’ll also learn how he’s applied his data expertise to assist volunteer organization New York Cares with their COVID-19 response — helping everyone make 2021 a better year.

 

 



Blog teaser photo by Jan Tinneberg on Unsplash

Susan Currie Sivek
Senior Data Science Journalist

Susan Currie Sivek, Ph.D., is the data science journalist for the Alteryx Community. She explores data science concepts with a global audience through blog posts and the Data Science Mixer podcast. Her background in academia and social science informs her approach to investigating data and communicating complex ideas — with a dash of creativity from her training in journalism. Susan also loves getting outdoors with her dog and relaxing with some good science fiction. Twitter: @susansivek
