Joining two datasets by county and date

Question

Hey experts!

I have two datasets, one containing data for weather types (Rain,Fog, Storm etc.) from 2016-2019 in California, and the second dataset containing wildfires in California from 2016-2019. In the second dataset (wildfires) there are 835 unique records, meaning that there were 835 wildfires in California from 2016-2019.

The weather dataset is much larger than this, since this shows the different weather activities during a day (it could be both foggy and rainy for one day, hence there are multiple duplicated dates)

df1

TypeDateCountiesRain2016-01-02Santa CruzRain2016-01-02Santa CruzRain2016-01-02Santa CruzRain2016-01-04Santa Cruz.....  Fog2019-02-26Yuba

Contains 200,000+ records

df2

Date*CountyLatLonUniqueId2016-07-22Monterey36.45994-121.89938977ac7ce2016-06-23Kern35.6115-118.4562891816ff6....    2019-06-28Nevada39.40972222-121.00055556b38c0563

Contains 835 unique records

*Date here is when the first fire was observed (i.e., Started)

My question is, how would you guys join these two datasets to visualize what kind of weather there was during a wildfire?

What I have done until now:

- Joined Left and Right by Date, and Counties on Left and Right

The issue with this: This gives me duplicated values (because of the duplicated dates in weather dataset), so it now looks like there are 1208 records, or 1208 wildfires (which is not correct) -> if I just take the unique values of the UniqueId column, then it only shows about 200 records

- Joined only on counties (Left and Right)

The issue with this: This obviously doesn't take the different dates into account, why it just

I hope my question isn't too broadly asked, and that you guys can come up with a "better" solution that could showcase the weather during a wildfire more precisely

rxak · Answer

Have tried, and that dataset was the closest I could find that have been filtered via counties

Thanks once again for your time!

gautiergodard · Answer

@rxak I get the same 293 result. I suggest trying to obtain a more complete weather dataset.

rxak · Answer

@gautiergodard

Correct, there are a few counties that are missing from the Weather dataset

If you would like to investigate further, I have attached both .csv files :)

AlteryxWildfires_v2.csv

AlteryxWeather.csv