

Johns Hopkins University COVID-19 daily data workflow

DavidW
Alteryx Alumni (Retired)

Johns Hopkins University CSSE released the data set that powers their dashboard on GitHub (https://github.com/CSSEGISandData/COVID-19). If you want to work with that data easily, I created the attached workflow and macro to import the daily data found within that repository. Install the package in the root of the folder that Git creates when you clone the repository. The workflow will import the daily data, parse the date fields (which change formats halfway through the time series), fill in null latitude and longitude fields, and perform other general cleansing. With that done, you can experiment with daily worldwide COVID-19 confirmed/deaths/recovered case numbers at the country/region and province/state level, with geocoding available for about 99% of records.
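For anyone who wants to see the date-parsing idea outside of Designer, here is a minimal Python sketch. It is not the logic inside the attached macro; the function name `parse_last_update` and the list of candidate formats are assumptions for illustration, and the real files may contain formats not listed here.

```python
from datetime import datetime

# Assumed candidate formats for the update timestamp column.
# This list is illustrative and may not cover every daily file.
CANDIDATE_FORMATS = (
    "%m/%d/%Y %H:%M",      # e.g. 1/22/2020 17:00
    "%m/%d/%y %H:%M",      # e.g. 3/8/20 5:31
    "%Y-%m-%dT%H:%M:%S",   # e.g. 2020-02-02T23:43:02
    "%Y-%m-%d %H:%M:%S",   # e.g. 2020-03-22 23:45:00
)


def parse_last_update(value):
    """Try each candidate format in turn; return None if none match."""
    text = value.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # not this format, try the next one
    return None


if __name__ == "__main__":
    for sample in ("1/22/2020 17:00", "2020-02-02T23:43:02", "not a date"):
        print(sample, "->", parse_last_update(sample))
```

Returning `None` for unparseable values, rather than raising, makes it easy to count and inspect the rows that still need a new format added.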

 

If this workflow is useful, please let us know. If you need help or have improvements to the workflow, please share.

 

We'd also love for you to share what you create or discover by replying to this thread!

 

EDIT: The workflow has been updated to better clean and regularize the data. A lot of cleanup is now being done on country and state names, including merging duplicates and parsing a locality field out of values such as "Chicago, IL". Review the new workflow for details. This should improve the quality of the output data significantly, although JHU is still working on upstream issues on their end.
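As a rough illustration of the locality parsing, here is a small Python sketch. It is only an approximation of the idea, not the workflow's actual logic: the function name `split_locality` is hypothetical, and it assumes a locality is present only when the value looks like "Name, XX" with a two-letter uppercase state code.

```python
import re

# Assumption: a locality is only present when the value looks like
# "<locality>, <two-letter state code>", optionally followed by more text.
_LOCALITY_RE = re.compile(r"^(?P<locality>.+?),\s*(?P<state>[A-Z]{2})(?:\s|$)")


def split_locality(province_state):
    """Return (locality, state). Locality is None when no pattern matches."""
    text = province_state.strip()
    match = _LOCALITY_RE.match(text)
    if match is None:
        # Leave values such as "Hubei" untouched.
        return None, text
    return match.group("locality").strip(), match.group("state")


if __name__ == "__main__":
    print(split_locality("Chicago, IL"))  # ('Chicago', 'IL')
    print(split_locality("Hubei"))        # (None, 'Hubei')
```

Mapping the two-letter code to a full state name, and merging the result with rows that already use the full name, would be a separate lookup step.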

David Wilcox
Senior Software Engineer
Alteryx
52 REPLIES
neilgallen
12 - Quasar

On the theme of additional sources, Google has made the Johns Hopkins CSSE dataset available via their BigQuery platform.

 

I haven't thoroughly checked the accuracy, given how many schema changes the CSSE dataset has gone through, but it's Google so I trust them?

 

Obviously, accessing it via Alteryx is as simple as using the BigQuery input tool, or Python!

 

kevinraj127
7 - Meteor

There have been a lot of great ideas and discussion on here; thanks to everyone for the help! I've attached a workflow that uses the GeoJSON API that powers the Johns Hopkins COVID-19 dashboard. This should help in getting data more frequently throughout the day, rather than only at the end of the day when the CSV files are posted to JHU's GitHub page. Here is the link on ESRI's site for more info on the data:

 

https://coronavirus-resources.esri.com/datasets/628578697fb24d8ea4c32fa0c5ae1843_0?geometry=13.259%2...

 

Thank you and be safe!

 

-Kevin Raj

imattclark
6 - Meteoroid

@LukeM 

 

I am getting the following error:

 

Error: Designer x64: The Designer x64 reported: Error running Event #1: The external program "gitpull.bat" returned an error code: 1: The system could not find the environment option that was entered. (203)

 

Is this because I've already run the git pull today?  The .bat itself will execute outside of Alteryx.

chriscuk
5 - Atom

David,

 

I downloaded your original workflow and I think it processed all the files in the daily reports directory, but I have just revisited it and it seems to only process up to the end of March.  Do you have a later version?

shevshenko
7 - Meteor

Hello Chriscuk - I noticed that some small tune-ups might be required. For instance, I have seen some column headers that may be affecting the run. I have attached a version based on David's that I am currently using to load my own dashboard. Please take a look at it; I just used it to load data through April 12, which is the latest data file available.
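To illustrate the column header point outside of Designer: the daily report files do not all use the same header names (for example, `Province/State` in older files versus `Province_State` in newer ones), so a reader that expects only one set can silently miss files. Below is a minimal Python sketch of one way to map old names onto new ones. It is not what the attached workflow does; the function name `read_daily_report` and the alias table are assumptions and may be incomplete.

```python
import csv
import io

# Assumed mapping from older header names to the newer ones.
# Extend this if other variants turn up in the daily files.
HEADER_ALIASES = {
    "Province/State": "Province_State",
    "Country/Region": "Country_Region",
    "Last Update": "Last_Update",
    "Latitude": "Lat",
    "Longitude": "Long_",
}


def read_daily_report(handle):
    """Read one daily CSV and return rows keyed by normalized header names."""
    rows = []
    for row in csv.DictReader(handle):
        normalized = {}
        for key, value in row.items():
            if key is None:
                continue  # extra unnamed columns are skipped
            clean = key.lstrip("\ufeff").strip()  # drop a stray BOM or spaces
            normalized[HEADER_ALIASES.get(clean, clean)] = value
        rows.append(normalized)
    return rows


if __name__ == "__main__":
    old_style = "Province/State,Country/Region,Last Update,Confirmed\nHubei,Mainland China,1/22/2020 17:00,444\n"
    print(read_daily_report(io.StringIO(old_style)))
```

With the headers normalized up front, files from before and after the change can be stacked into one table without the later ones dropping out.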

shevshenko
7 - Meteor

Sorry, I forgot to attach the workflow.

chriscuk
5 - Atom

Thank you. I will give it a go. 

bcampbell0621
8 - Asteroid

Hi David, 

 

This workflow is wonderful!  I used it yesterday for the first time and it worked well.  I used the latest daily reports from JHU, which included all reports up to 5/7/2020, but when the flow ran, the output only had data up to 3/22/2020.  I went through the flow, but was unable to determine why.  

 

Any ideas?

 

Many thanks for this!

 

Bruce. 

shevshenko
7 - Meteor

Thanks for sharing. While analyzing the flow, I wondered: does this only include the US? I could not manage to get a global overview, only the US.

shevshenko
7 - Meteor

Hi - have you tried the workflow I shared earlier in this thread, Daily_Import2.yxmd?

I have been using it. I remember I had some issues getting the data from all of the files, but I have made some adjustments to it and it should be working okay now.

Thanks

Sergio