I have the attached flow that was modified to pull data from the JHU website and combine the three data sets. I have been trying to add a Summarize tool to sum the totals by country and province for Confirmed, Deaths and Recovered cases, but I have not been able to get it to work. Maybe one of you wonderful people could provide a solution.
The data is changing daily and it would be helpful for my purposes to add the new data daily as well.
Hi @bcampbell0621, does adding a Summarize tool at the end of the workflow to sum the numerical values by country and province, and then joining the result back to the base data, work? Something like the attached solution? Please let me know if I misunderstood the ask.
I think we are closer to the right solution, but we need to figure out how to sum the totals for each case type for each country. I added a mock-up of what the flow should produce. I hope this helps and makes sense.
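For readers following along outside Alteryx, here is a minimal pandas sketch of the summary being asked for: one total per case type, by country and province, and by country alone. The sample rows are made up, and the column names (`Country_Region`, `Province_State`, `Confirmed`, `Deaths`, `Recovered`) are assumptions modelled on the JHU files, so they may not match the fields in the attached flow.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumptions, not taken from the
# attached workflow.
reports = pd.DataFrame(
    {
        "Country_Region": ["US", "US", "US", "Canada"],
        "Province_State": ["New York", "New York", "Ohio", None],
        "Confirmed": [100, 50, 20, 30],
        "Deaths": [5, 2, 1, 1],
        "Recovered": [10, 5, 2, 3],
    }
)

case_columns = ["Confirmed", "Deaths", "Recovered"]

# Rows without a province would otherwise be dropped by groupby, so give
# them an empty province label first.
reports["Province_State"] = reports["Province_State"].fillna("")

# One row per country and province, with each case type summed.
by_province = reports.groupby(
    ["Country_Region", "Province_State"], as_index=False
)[case_columns].sum()

# One row per country, with each case type summed.
by_country = reports.groupby("Country_Region", as_index=False)[case_columns].sum()

print(by_province)
print(by_country)
```

This is roughly what a Summarize tool does when it groups by the two location fields and applies Sum to the three case fields.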
Hi @bcampbell0621 - please see my workflow attached. This uses Python pandas to generate a data frame from the daily report csv file on JHU's GitHub site. To get the most up-to-date data, you'll need to change the date in the url string in the Python tool (e.g. change 04-04-2020 to 04-05-2020 to get the data that is posted tonight, and so on). Hope this helps, please let me know if you have any questions, thanks.
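The attached workflow is not reproduced here, but the date change described above could be automated along these lines. This is only a sketch: `BASE_URL` is my assumption about where the daily report files live in JHU's GitHub repository, and `daily_report_url` and `load_daily_report` are hypothetical helper names, not part of the attached flow.

```python
from datetime import date

# Assumed location of the daily report csv files in JHU's GitHub repository.
BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)


def daily_report_url(day: date) -> str:
    """Build the url for one daily report; files are named MM-DD-YYYY.csv."""
    return f"{BASE_URL}{day.strftime('%m-%d-%Y')}.csv"


def load_daily_report(day: date):
    """Read one daily report into a pandas data frame (needs network access)."""
    import pandas as pd  # imported here so the url helper works without pandas

    return pd.read_csv(daily_report_url(day))


print(daily_report_url(date(2020, 4, 4)))
```

Passing a different `date` replaces the manual edit of the url string; for example, `load_daily_report(date(2020, 4, 5))` would request the following day's file, assuming it has been posted.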
Sorry I'm just seeing this now; have been on a Tableau dashboard for weeks and am coming up for air.
If I wanted to pull all of the daily reports from the GitHub repository folder, i.e. January 21 - May 6, could I use the same pandas logic you provided to create a data frame and then update it daily through Alteryx? The idea would be to union all of the daily reports and have them in one file by date, with cases by type (confirmed, active, deaths and recovered).
No worries, Bruce. Yes, you could update the file daily through Alteryx by putting an Output Data tool at the end of the flow that writes to an Excel or TDE file. Changing the 'Output Options' to 'Append to Existing Sheet' or 'Append to an Extract file' will let you add the new records as you change the date. It would be a bit of a manual process, but a good start for the specific columns you're looking for. You may also want to look at the time series data on JHU's GitHub, which has daily data:
The time series data doesn't include active cases, but I guess that would be easy enough to calculate. Does the workflow you sent earlier only work for individual files or multiple? Trying to find a quick hit that will allow me to combine them and visualize increases and decreases.
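A rough sketch of that calculation in pandas, assuming active cases are defined as confirmed minus deaths minus recovered. The figures and column names below are invented for illustration and are not JHU data.

```python
import pandas as pd

# Hypothetical daily totals for one location; not real JHU figures.
totals = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-04-03", "2020-04-04", "2020-04-05"]),
        "Confirmed": [100, 130, 150],
        "Deaths": [5, 6, 8],
        "Recovered": [10, 20, 45],
    }
)

# Assumed definition: active = confirmed - deaths - recovered.
totals["Active"] = totals["Confirmed"] - totals["Deaths"] - totals["Recovered"]

# Day-over-day change; positive is an increase, negative a decrease.
totals = totals.sort_values("Date")
totals["Active_Change"] = totals["Active"].diff()

print(totals)
```

The first row has no previous day to compare against, so its `Active_Change` is empty (NaN). With several locations in one data frame, the `diff` would need to be applied per location, for example after a `groupby` on the location columns.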
The flow I posted earlier is for individual csv files - you'd have to run that flow multiple times to get the combined dataset from January to May 2020. That's a lot of files, but I'm not sure of an easier way to do it right now.
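One possible way to avoid running the flow once per file is to loop over the dates inside the Python tool and union the results. The sketch below is not the posted workflow: `BASE_URL` is an assumed path to the daily reports folder, `combine_daily_reports` is a hypothetical helper, and the `RENAMES` mapping assumes that older files use headers such as `Province/State` while newer ones use `Province_State`.

```python
from datetime import date, timedelta

import pandas as pd

# Assumed location of the daily report csv files in JHU's GitHub repository.
BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)

# Assumed header differences between older and newer files.
RENAMES = {
    "Province/State": "Province_State",
    "Country/Region": "Country_Region",
    "Last Update": "Last_Update",
}


def date_range(start: date, end: date):
    """Yield every date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def combine_daily_reports(start: date, end: date, loader=pd.read_csv):
    """Union one report per day into a single frame tagged with its date."""
    frames = []
    for day in date_range(start, end):
        url = f"{BASE_URL}{day.strftime('%m-%d-%Y')}.csv"
        frame = loader(url).rename(columns=RENAMES)
        frame["Report_Date"] = pd.Timestamp(day)
        frames.append(frame)
    # Columns missing from some files are filled with NaN.
    return pd.concat(frames, ignore_index=True, sort=False)
```

A call such as `combine_daily_reports(date(2020, 1, 22), date(2020, 5, 6))` would then return every report in that range in one data frame, assuming a file exists for each date; a missing file would raise an error from `read_csv`. The `loader` parameter is only there so the function can be tried against local or fake data without network access.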