Hello All:
I have three separate CSV files containing data about COVID-19 infections: (1) confirmed cases, (2) deaths, and (3) recovered cases. Each file contains country information, latitude and longitude, and a series of date columns, each holding the count for that day. I need to combine the files, with totals across the time series. The data is from JHU and is clean, but it is not summed or combined. I have been trying to build a flow, but it is not producing the correct results. I have included the data here.
I tried creating a separate column for the totals, but that did not work. I also have the advanced join tool macro, but I need to summarize the data before I can join the files.
I hope this makes sense. The updates will be ongoing, so I will need to add new daily information as it becomes available.
Thank you so much,
Bruce.
I needed to do this for my own purposes, so I built a workflow a while back. Rather than downloading and inputting the files manually from the Johns Hopkins GitHub repository, I used the Download tool to fetch the raw files from the repository and parse them from there. It takes a bit more legwork up front, but it now requires no effort to update.
The attached workflow covers only confirmed cases and deaths.
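For readers who want to see the same idea outside of a workflow, here is a minimal Python sketch of the download-and-parse step. It is not the attached workflow. The function names are hypothetical, the caller supplies the raw file URL, and the identifier column names ("Province/State", "Country/Region", "Lat", "Long") are an assumption about the file layout.

```python
import csv
import io
import urllib.request

# Assumed identifier columns; every other column is treated as a date.
ID_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def fetch_csv_text(url):
    """Download a raw CSV file and return its contents as text."""
    with urllib.request.urlopen(url) as response:
        # utf-8-sig drops a byte-order mark if the file has one.
        return response.read().decode("utf-8-sig")


def parse_wide_csv(text, id_columns=ID_COLUMNS):
    """Reshape a wide CSV (one column per date) into one row per date."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        for column, value in record.items():
            if column in id_columns:
                continue
            rows.append({
                "country": record["Country/Region"],
                "date": column,
                # Blank cells count as zero; float() tolerates values like "3.0".
                "count": int(float(value or 0)),
            })
    return rows
```

Calling parse_wide_csv(fetch_csv_text(url)) with the raw file URL re-reads the latest data on every run, which is what removes the manual update step.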
This is great! Thank you so much for sharing this!
If you wanted to add the recovered cases data, you would just build a separate flow for that data set, correct?
Yes, the process should be similar; you'd just have to point to the raw recovered file URL.
How would you join all three data sets? Sorry for all the questions. This really is an amazing flow you created.
After joining the confirmed and deaths files, you could add one more join to bring in the recovered data. Optionally, you could use a Join Multiple tool on all three, but it's usually simpler to join them in stages so you can diagnose any issues between the datasets.
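The staged approach can be sketched in Python as follows. This is an illustration only, assuming each dataset has already been reduced to a mapping keyed by (country, date); join_stage is a hypothetical helper, not part of any tool. Each stage also returns the keys that failed to match, which is what makes a mismatch easy to trace to one dataset.

```python
def join_stage(left, right, right_name):
    """Inner-join on (country, date) and report keys that did not match.

    left:  {(country, date): {field_name: count, ...}}
    right: {(country, date): count}
    """
    joined = {}
    for key, fields in left.items():
        if key in right:
            joined[key] = {**fields, right_name: right[key]}
    left_only = sorted(set(left) - set(right))
    right_only = sorted(set(right) - set(left))
    return joined, left_only, right_only


# Made-up sample values, for illustration only.
confirmed = {
    ("Italy", "1/23/20"): {"confirmed": 2},
    ("Spain", "1/23/20"): {"confirmed": 1},
}
deaths = {("Italy", "1/23/20"): 0, ("Spain", "1/23/20"): 0}
recovered = {("Italy", "1/23/20"): 1}

# Stage 1: confirmed + deaths. Stage 2: add recovered.
stage1, missing_deaths, extra_deaths = join_stage(confirmed, deaths, "deaths")
stage2, missing_recovered, extra_recovered = join_stage(stage1, recovered, "recovered")
```

Here missing_recovered lists the Spain key, so the gap is immediately attributable to the recovered file and not to the first join.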
This is such a huge help. How did you add the URL as the data source? It looks like you created some kind of field to add the URL for the GitHub source.
Hi Jay, I went through your flow and ran it with the data, but I am not getting the totals I originally asked for. I need a total of each case type, summed by date for each country. The output generated by your flow appears to contain differences, not totals. Am I missing something?
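To make the distinction between totals and differences concrete, here is a small Python sketch with hypothetical helper names. It assumes the data has been reshaped into rows with "country", "date", and "count" fields. If the counts are already cumulative, summing the province rows per country and date gives the totals directly. If the output holds day-over-day differences instead, a running sum in date order recovers the totals.

```python
from collections import defaultdict
from itertools import accumulate


def total_by_country_and_date(rows):
    """Sum counts over all province rows sharing a (country, date) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["country"], row["date"])] += row["count"]
    return dict(totals)


def running_total(daily_changes):
    """Turn day-over-day differences (in date order) into cumulative totals."""
    return list(accumulate(daily_changes))


# Made-up sample values: two provinces of one country on the same date.
rows = [
    {"country": "China", "date": "1/23/20", "count": 5},
    {"country": "China", "date": "1/23/20", "count": 3},
    {"country": "Italy", "date": "1/23/20", "count": 2},
]
totals = total_by_country_and_date(rows)
cumulative = running_total([1, 2, 0, 4])
```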