I was able to get to the expected answer for the second question, but I don't feel it reflects what was actually asked, so I provided answers to both the question as asked and the expected response.
Great challenge — H-1B data always brings interesting patterns when you start breaking it down by industry.
We worked with similar employment datasets at Phonexa when mapping out geo-industry lead distribution. Cleaning and joining large volumes with NAICS codes can be a pain, especially with missing values, but it’s always rewarding once visualized.
Curious to see how tech vs. healthcare shake out in the 2024 results. Looking forward to digging in!
Here is my approach:
I first used a Filter tool to keep only the records for fiscal year 2024.
Then I used a Formula tool to add a new column, Total Approvals = [initial approval] + [continuing approvals].
I then used a Select tool to keep only the necessary columns.
I used another Filter tool to remove records with missing Industry NAICS codes.
Then I used a Summarize tool to group by employer and get the sum of total approved petitions.
I used another Summarize tool to group by Industry NAICS code and get the count of records per industry code.
Lastly, I used the Sort, Sample, and Browse tools.
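
For anyone who wants to cross-check the workflow outside Designer, the steps above can be roughly sketched in pandas. This is only an illustration, not the challenge solution: the column names (`Fiscal Year`, `Employer`, `Industry (NAICS) Code`, `Initial Approval`, `Continuing Approval`), the `summarize_h1b` helper, and the tiny sample table are all made up and would need to match the real dataset.

```python
import pandas as pd


def summarize_h1b(df: pd.DataFrame, year: int = 2024, top_n: int = 5):
    """Mirror the Filter -> Formula -> Select -> Filter -> Summarize -> Sort -> Sample steps.

    All column names here are assumptions, not the dataset's real headers.
    """
    # Filter tool: keep only the requested fiscal year
    fy = df[df["Fiscal Year"] == year].copy()
    # Formula tool: Total Approvals = initial + continuing approvals
    fy["Total Approvals"] = fy["Initial Approval"] + fy["Continuing Approval"]
    # Select tool: keep only the columns needed downstream
    fy = fy[["Employer", "Industry (NAICS) Code", "Total Approvals"]]
    # Filter tool: drop records with a missing NAICS code
    fy = fy[fy["Industry (NAICS) Code"].notna()]
    # Summarize + Sort + Sample: total approvals per employer, largest first
    by_employer = (
        fy.groupby("Employer", as_index=False)["Total Approvals"].sum()
        .sort_values("Total Approvals", ascending=False)
        .head(top_n)
        .reset_index(drop=True)
    )
    # Summarize + Sort + Sample: number of records per NAICS code, largest first
    by_industry = (
        fy.groupby("Industry (NAICS) Code").size()
        .reset_index(name="Record Count")
        .sort_values("Record Count", ascending=False)
        .head(top_n)
        .reset_index(drop=True)
    )
    return by_employer, by_industry


# Made-up sample rows, only to show the shape of the input
sample = pd.DataFrame(
    {
        "Fiscal Year": [2024, 2024, 2024, 2024, 2023],
        "Employer": ["Acme", "Acme", "Beta", "Gamma", "Acme"],
        "Industry (NAICS) Code": ["54", "54", "62", None, "54"],
        "Initial Approval": [10, 3, 7, 100, 50],
        "Continuing Approval": [5, 2, 1, 0, 50],
    }
)
by_employer, by_industry = summarize_h1b(sample)
print(by_employer)
print(by_industry)
```

Note that the second summary counts records per NAICS code, as in the workflow described above; if the question is really about approvals per industry, summing Total Approvals by NAICS code would be the alternative.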