This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Read Alteryx customer stories to learn how they transform their organizations into becoming a data-driven business.
Announcing Alteryx + Snowflake | Alteryx and Snowflake make analytics and data science fundamentally easier. With the new integrated starter kit, you can push down data prep transformations and more into Snowflake for faster data quality and analytics output. Learn More
The U.S. Census is publishing real-time data on the impact of Covid-19, unlike their core products that lag a year or more behind their collection date. Food banks, United Ways, governments and foundations can tap this unusually rapid data flow to plan and execute aid programs to counter the pandemic. The following analysis focused on people who had sufficient food prior to pandemic, but were tipped into hunger after it struck, as well as hunger impacts by race and age.
Describe the business challenge or problem you needed to solve
On May 20, 2020, the Census Bureau began publishing more than 20 Excel-format workbooks on Covid-19’s impact on employment, hunger, health, education, and housing. As of this writing, 11 weeks are available, enabling charting of data with trend lines over time. This is hugely useful to understanding the changing levels of pandemic-induced disruptions.
While regular Census tables usually have column labels for population counts in each field and rows labeled with geographic units (states, census tracts, or census blocks), in the Covid-19 case each geographic unit is a separate sheet within the workbook. There are 51 sheets for states and the District of Columbia, another 15 sheets for the largest Metropolitan Statistical Areas (MSAs), and one sheet for the U.S. as a whole. The sheets have columns with the population counts, and the rows describe various characteristics being counted. As one reads down the rows, different situations are addressed, and some row labels such as “Did Not Respond” will show up multiple times. The sheets have to be parsed very precisely to avoid this issue.
To demonstrate the power of Alteryx to quickly ingest each week’s new batch of data and incorporate it into the analytical framework, Precision Analytics Group created two types of workflows and tested them on the 15 MSAs. One calculated the absolute number of people impacted by hunger to quantify the need, and the other showed the percentage of hungry people by race and by age to help guide where aid should be targeted.
Describe your working solution
The first workflow “Newly Hungry.yxmd” reads only the 15 sheets in each weekly workbook related to the 15 MSAs, using a Dynamic Input working off a list of MSA sheets created by an Input tool reading only sheet names. Each workbook file name includes the week number, so a wildcard enabled reading in all available workbooks about hunger regardless of week number.
Chained Filters looking for specific wording of a row label plucked out just the rows desired, as shown below:
Then a Formula tool used Substring functions to break apart the included file name to create columns for the MSA’s name and the week number. An earlier version of this workflow used a Formula tool with the Switch function to turn entries like “Week3” into “May 19”. However, this was a hard-coded approach covering only the 90 days that the Census Bureau initially said would be included in the project. It would fail if the Census Bureau were to extend the collection period. The current workflow sets “Week1” to May 5, 2020, and uses a Multi-Row tool grouped by MSA to create as many subsequent actual dates as are needed to match the workbooks in the directory.
People who reported that pre-pandemic they had “Enough Food” or “Enough but not the types desired” were combined into a group for “not hungry before the pandemic, but hungry now.” People who were hungry prior to the pandemic – “Sometimes” or “Often” – were combined into a second group:
The Interactive charts for the top row (formerly food secure) had two layers, one for people “sometimes hungry” and another for “often hungry.” The bars revealed the total number of people hungry in each MSA, as they rose and fell over time:
The data for each city appears pretty noisy, with very large changes from week to week:
The second workflow “Hunger by age and race.yxmd” also uses a dynamic input, but based on a file of sheet names extracted earlier.
Unlike the previous workflow that broke apart the file name using substring functions in a Formula tool, this one extracts the week number and the MSA name primarily through a series of text-to-column tools (Alteryx often has multiple ways to achieve the same result). Each text-to-column exploits a different delimiter, such as a backslash or the three pipes that precede an Excel worksheet name:
After creating the week numbers and MSAs, the workflow then adds a field that matches week numbers with actual calendar dates for easier understanding by end users.
This workflow also creates a four-week rolling average for the percent hungry among age groups and racial categories. Each report’s data is so chaotic from week-to-week that discerning trends or even basic relationships was nearly impossible without the rolling average:
Then the joined age data was input to the Interactive Chart tool, in Batch mode so that each MSA had its own chart. Each race or age group had its own Layer for the line charts. A Summarize tool created an average for all 15 MSAs. A Formula tool put “Composite of 15 MSAs” into the column for MSA name; with all the other charts, the MSA names were extracted from the tables’ file names.
The final steps joined the age and race line charts for side-by-side views for each MSA city, as well as side-by-side charts for the composite of all MSAs:
In almost all cities, the 55-64 and 65+ age groups actually were the least hungry; younger people had greater needs. Black and Hispanic/Latino populations were the most hungry among racial categories. Here’s what the composite shows:
Describe the benefits you have achieved
The two workflows can ingest new weekly data without changing any portions. New workbooks that are dropped into the directory are automatically included in the output. Leaders in charge of raising funds for food assistance, income security and other needs will have the ability to rapidly show funders the charted needs as soon as new workbooks are issued every Wednesday.