Do you have the skills to make it to the top? Subscribe to our weekly challenges. Try your best to solve the problem, share your solution, and see how others tackled the same problem. We share our answer too.
Weekly Challenge

Challenge #89: Analyzing Social Data

Asteroid

Challenge Completed

Asteroid

I definitely enjoyed this one. I cleaned the data, removed the '?' characters, did some sorting based on the most-favorited original tweets, and counted hashtags. There is a lot more analysis that could be done within the tweets themselves, and lots of chaos to figure out. The Sustainable Development Goals played a huge part in all the tweets, and certain countries showed up more than others.

Asteroid

Here's my solution for week #89. I found a few interesting issues while exploring the data (in the spoiler tag).  

For next week (#90): I'm going to challenge myself to use Alteryx's reporting tools, so I'm going to keep the analysis pretty basic. I'm going to look at:

  1. The frequency of each hashtag
  2. The count of distinct users using each hashtag
  3. The timing (did the hashtags peak at different times?)
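For anyone who wants to sanity-check those three measures outside of Alteryx, here is a minimal Python sketch. It is only an illustration: the sample rows are made up, and the `User` and `Date` column names are assumptions (only ID, Tweet and Hashtag are named in this thread).

```python
from collections import Counter, defaultdict

# Hypothetical sample rows; "User" and "Date" are assumed column names.
rows = [
    {"ID": "1", "User": "ana", "Date": "2017-10-01", "Hashtag": "GlobalGoals"},
    {"ID": "2", "User": "ana", "Date": "2017-10-01", "Hashtag": "globalgoals"},
    {"ID": "3", "User": "ben", "Date": "2017-10-02", "Hashtag": "SDGs"},
    {"ID": "4", "User": "cy", "Date": "2017-10-02", "Hashtag": "globalgoals"},
]


def summarize(rows):
    """Return hashtag frequency, distinct users per hashtag, and busiest day."""
    freq = Counter()
    users = defaultdict(set)
    by_day = defaultdict(Counter)
    for row in rows:
        tag = row["Hashtag"].lower()  # normalize case before counting
        freq[tag] += 1
        users[tag].add(row["User"])
        by_day[tag][row["Date"]] += 1
    distinct_users = {tag: len(names) for tag, names in users.items()}
    # Busiest day per hashtag, as a rough answer to "when did it peak?"
    peak_day = {tag: days.most_common(1)[0][0] for tag, days in by_day.items()}
    return freq, distinct_users, peak_day


freq, distinct_users, peak_day = summarize(rows)
```

Lowercasing first means differently capitalized versions of the same hashtag are counted together.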
Spoiler
Some findings from exploration in week 89:

1 - if a tweet had multiple hashtags, the tweet may be duplicated across the files. The ID field is unique.
2 - all of the csv files have the same schema.
3 - hashtags may not always be capitalized in the same way; you may want to convert to all upper or all lowercase if using case-sensitive formulas/tools
4 - the Tweet field is sometimes truncated, and in some cases, the hashtags were cut off.  If the hashtag does not appear in the Tweet field, then it also does not appear in the Hashtag field. As a result, sometimes the Hashtag field is null.
5 - since we know the tweets were harvested based on hashtag, every tweet in a file should contain that file's hashtag. For example, every tweet in the 'globalgoals' file should contain the #globalgoals hashtag. We can rebuild the Hashtag field to include the 10 hashtags of interest, but if any other hashtags were truncated, we don't know about them.
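The rebuild in finding 5 could look roughly like this in Python. This is a sketch under assumptions: I am assuming the Hashtag field is comma-separated and that each file is named after its hashtag; `rebuild_hashtags` is a hypothetical helper, not something from the challenge files.

```python
from pathlib import Path


def rebuild_hashtags(hashtag_field, filename):
    """Merge the (possibly null) Hashtag field with the file's own hashtag."""
    tags = set()
    # Assumption: multiple hashtags are stored comma-separated.
    for tag in (hashtag_field or "").split(","):
        tag = tag.strip().lstrip("#").lower()
        if tag:
            tags.add(tag)
    # Every tweet in a file was harvested on that file's hashtag,
    # so it is safe to add it back even if it was truncated away.
    tags.add(Path(filename).stem.lower())
    return sorted(tags)


rebuilt_null = rebuild_hashtags(None, "act4sdgs.csv")
rebuilt_mixed = rebuild_hashtags("GlobalGoals, SDGs", "globalgoals.csv")
```

This only recovers the hashtag the file was harvested on. Any other hashtag lost to truncation stays lost.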

Here is an example.  This tweet (ID# 914489241266278000) does have the #act4sdgs hashtag, but it is truncated from the Tweet field and thus not present in the Hashtag field in the act4sdgs csv file.

Capture.PNG

Capture1.PNG
Magnetar

Simple data clean up

Spoiler
Solution 89.png
Quasar
Spoiler
Challenge #89.PNG
Asteroid

Kept it simple. Just data cleansing and summarizing. Can also show tweets by region, unique users, etc.

Pulsar

I kept mine pretty simple - will decide how to summarize, sort and sample the data in the next challenge.

 

1. Parsed the hashtags to rows

2. Combined date and time

3. Replaced all null hashtags with filename

4. Changed all hashtags to the same case

5. Filtered out where hashtags contain ?

6. Removed dups based on ID, Tweet and Hashtag
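The six steps above can be sketched in plain Python as follows. This is not the workflow itself, just an approximation: the `Date`, `Time` and `FileName` columns, the date format, and the comma delimiter are all assumptions, and the null replacement happens before the split here (the result should be the same).

```python
from datetime import datetime

# Hypothetical sample rows; only ID, Tweet and Hashtag are confirmed field names.
rows = [
    {"ID": "1", "Tweet": "Act now #SDGs #GlobalGoals", "Hashtag": "SDGs,GlobalGoals",
     "Date": "2017-10-01", "Time": "09:30:00", "FileName": "sdgs"},
    {"ID": "1", "Tweet": "Act now #SDGs #GlobalGoals", "Hashtag": "sdgs,globalgoals",
     "Date": "2017-10-01", "Time": "09:30:00", "FileName": "globalgoals"},
    {"ID": "2", "Tweet": "Truncated tweet...", "Hashtag": None,
     "Date": "2017-10-02", "Time": "18:05:00", "FileName": "act4sdgs"},
    {"ID": "3", "Tweet": "Garbled #???", "Hashtag": "???",
     "Date": "2017-10-02", "Time": "18:06:00", "FileName": "sdgs"},
]


def clean(rows):
    seen = set()
    out = []
    for row in rows:
        # Step 3: replace a null hashtag with the file name
        raw = row.get("Hashtag") or row["FileName"]
        # Step 2: combine date and time into one value
        when = datetime.strptime(f'{row["Date"]} {row["Time"]}', "%Y-%m-%d %H:%M:%S")
        # Step 1: one output row per hashtag
        for tag in raw.split(","):
            tag = tag.strip().lower()  # Step 4: same case everywhere
            if not tag or "?" in tag:  # Step 5: drop hashtags containing '?'
                continue
            key = (row["ID"], row["Tweet"], tag)  # Step 6: de-duplicate
            if key in seen:
                continue
            seen.add(key)
            out.append({"ID": row["ID"], "Tweet": row["Tweet"],
                        "Hashtag": tag, "DateTime": when})
    return out


cleaned = clean(rows)
```

The second sample row is the same tweet appearing in another file, so all of its hashtag rows are dropped as duplicates.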

 

Alteryx Partner

I wanted to use the cognitive service analytics tool, but it seems Azure services are no longer free, so this was a chance to brush up on my rusty Python and use the new tool.

Spoiler

First bringing all the tweets into a yxdb file:

input.png

Then a bit of processing, removing duplicates, and getting their "polarity"

sa.png

Thanks to Zoe Wilkinson Saldaña for the detailed how-to on Python and Vader

Alteryx Certified Partner

Kept it quite simple and did number of tweets by hashtag. Also used the Alteryx Interactive Chart for the first time!

 

Number of Tweets.png

Bolide

Cheers!

Spoiler
I chose to stick with the Alteryx reporting suite. Here are the visuals that I came up with.

Challenge_89.png