Weekly Challenge

Solve the challenge, share your solution and climb the ranks of our Community!
IDEAS WANTED

We're actively looking for ideas on how to improve Weekly Challenges and would love to hear what you think!

Submit Feedback
We've recently made an accessibility improvement to the community and therefore posts without any content are no longer allowed. Please use the spoiler feature or add a short message in the message body in order to submit your weekly challenge.

Challenge #89: Analyzing Social Data

Highlighted
12 - Quasar

I chose to focus on the other unique hashtags per record, exported to a .tde for a Tableau dashboard.
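The hashtag-per-record idea can be sketched in Python (the original is an Alteryx workflow, so this is only an illustration of the logic, not the actual solution):

```python
import re

def unique_hashtags(tweet):
    """Return the unique hashtags in one tweet, lowercased,
    in order of first appearance."""
    seen = []
    for tag in re.findall(r"#\w+", tweet):
        tag = tag.lower()
        if tag not in seen:
            seen.append(tag)
    return seen

# Example record with a repeated hashtag in different cases
print(unique_hashtags("Great event! #data #Tableau #DATA"))  # ['#data', '#tableau']
```

Lowercasing before de-duplication is an assumption here; Alteryx's RegEx and Unique tools could be configured either way.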

10 - Fireball

Went pretty simple on this one

Alteryx Certified Partner

I wanted to do some text analysis, so I thought it would be interesting to focus on the originally-authored tweets, to deal with the duplication of tweets caused by RTs.

 

I built a data prep workflow that excludes RTs, and outputs two TDE files for use in Tableau: the first to use for a wordcloud, with tweet text parsed and prepared, and the second to use for applying filters to that wordcloud (i.e. the date, dataset, and other tweet-level metadata).
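The prep logic described above can be sketched in Python (an illustration only; the prefix-based RT check and the URL/mention/hashtag stripping are assumptions about how the workflow parses tweet text):

```python
import re

def prep_for_wordcloud(tweets):
    """Drop retweets, then tokenize the remaining tweet text into
    lowercase words, stripping URLs, mentions, and hashtags so only
    plain words feed the wordcloud."""
    words = []
    for tweet in tweets:
        if tweet.startswith("RT @"):   # assumption: RTs carry this prefix
            continue
        text = re.sub(r"https?://\S+|[@#]\w+", " ", tweet)
        words.extend(w.lower() for w in re.findall(r"[a-z']+", text, re.IGNORECASE))
    return words

tweets = [
    "RT @someone: climate action now https://t.co/x",
    "We must act on #climate now",
]
print(prep_for_wordcloud(tweets))  # ['we', 'must', 'act', 'on', 'now']
```

The second output (tweet-level metadata for filtering) would simply be the unparsed rows minus the RTs, joined back to these tokens on a record ID.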

 

Spoiler
Screen Shot 2017-12-28 at 15.52.12.png

 

UPDATE: Better handling of non-words.

12 - Quasar

Solution attached.

Alteryx Certified Partner

Very minimal data prep, as I wanted to preserve as much as possible for exploring the data in Tableau.

 

Spoiler
Screen Shot 2018-01-07 at 12.01.13.png
8 - Asteroid

I'm stuck in a loop and have a very specific question.  Hope someone can help.  Of course, if my question is not OK, please feel free to remove!

Spoiler
I'm familiar with Twitter message contents (the "Tweet" field) containing non-ASCII characters and usually get around this in Excel by using UTF-8 encoding.  However, in Alteryx nothing I've tried is working.  I've tried both:
(1) CSV input: option 11 (Code Page) = UTF-8, and option 7 (Field Length) = 254
(2) Formula tool: ConvertToCodePage([Tweet], 65001)

In both cases I STILL see lots of question marks. 

My question is just this: have these non-standard characters irreversibly lost their meaning in the csv files provided?  OR am I doing something wrong?  Has anyone here managed to convert the question marks to kanji or emoji or whatever it is they were originally meant to be?

Thanks in advance

 

Alteryx

Hello @TeePee

 

Spoiler
I can confirm that the CSV files do not have the proper encoding to be able to read these wide characters back in, I'm afraid. These are typically emojis or other wide characters as you suspect.
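The loss described here can be demonstrated with a short Python round-trip (cp1252 stands in for whatever narrow code page produced the CSVs, which is an assumption):

```python
# Writing a wide character to a narrow code page with replacement
# discards the original bytes: the '?' is all that survives.
original = "Good news \U0001F600"  # tweet text containing an emoji
narrowed = original.encode("cp1252", errors="replace")  # emoji becomes b'?'
recovered = narrowed.decode("cp1252")

print(recovered)              # 'Good news ?'
print(recovered == original)  # False: the emoji is gone for good
```

This is why re-reading the files as UTF-8, or applying ConvertToCodePage downstream, cannot help: by the time the CSV exists, the literal `?` character is all the file contains.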
Mike Spoula
Solutions Architect - Services
Alteryx
8 - Asteroid

Thanks so much for taking the time to reply.  Much appreciated.  

8 - Asteroid

 

I noticed many tweets had a "We" phrase, such as "We must..." and "We need...". I pulled out the "We" phrases and did some counts a few different ways. I could spend hours playing with this, but for Challenge #90 I'm going to just make a word cloud to show some of the more popular "We" phrases in the data.
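The "We" phrase counting can be sketched in Python (an illustration of the idea, not the Alteryx workflow; matching just the word after "We" is an assumption about how the phrases were bucketed):

```python
import re
from collections import Counter

def we_phrase_counts(tweets):
    """Count two-word phrases beginning with 'We', e.g. 'We must'."""
    phrases = (m.group(0).lower()
               for tweet in tweets
               for m in re.finditer(r"\bWe \w+", tweet))
    return Counter(phrases)

tweets = [
    "We must act now.",
    "We need leadership. We must do better.",
]
print(we_phrase_counts(tweets))  # Counter({'we must': 2, 'we need': 1})
```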

 

Spoiler

 

 

89_v1.JPG  89_v2.JPG

 

Alteryx Certified Partner

That was fun; I gave the Charting Tool a go. I may bring it over to Tableau later.