Weekly Challenges

Solve the challenge, share your solution and summit the ranks of our Community!


Challenge #89: Analyzing Social Data

ggruccio
ACE Emeritus

I chose to focus on the unique hashtags per record - exported to .tde for a Tableau dashboard.
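One way to sketch that hashtag extraction in Python (the "Tweet" text here is an illustrative assumption, not from the challenge data; the original solution uses Alteryx tools):

```python
import re

# Hypothetical sketch: pull the unique hashtags out of each record's
# tweet text, ready for export to a .tde extract for Tableau.
def unique_hashtags(tweet):
    # Lowercase first so "#Win" and "#win" count as one hashtag.
    return sorted(set(re.findall(r"#\w+", tweet.lower())))

print(unique_hashtags("Go team! #Win #win #data"))  # ['#data', '#win']
```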

samN
Fireball

Went pretty simple on this one

jamielaird
Magnetar

I wanted to do some text analysis, so I thought it would be interesting to focus on the originally authored tweets, to avoid the duplication caused by retweets (RTs).

 

I built a data prep workflow that excludes RTs and outputs two TDE files for use in Tableau: the first for a word cloud, with the tweet text parsed and prepared, and the second for applying filters to that word cloud (i.e. the date, dataset, and other tweet-level metadata).
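In Python terms, the prep logic might be sketched like this (the "Tweet" column name and the simple "RT @" prefix test are assumptions; the actual workflow uses Alteryx tools):

```python
import re
from collections import Counter

def wordcloud_counts(rows):
    """Drop retweets, then tokenize the remaining tweet text for a word cloud."""
    originals = [r for r in rows if not r["Tweet"].startswith("RT @")]
    counts = Counter()
    for r in originals:
        # Strip URLs and @mentions before counting words of 3+ letters.
        text = re.sub(r"https?://\S+|@\w+", "", r["Tweet"].lower())
        counts.update(re.findall(r"[a-z']{3,}", text))
    return counts

rows = [
    {"Tweet": "RT @someone: We must act now"},   # excluded as a retweet
    {"Tweet": "We must act now https://t.co/x"},
]
print(wordcloud_counts(rows))  # Counter({'must': 1, 'act': 1, 'now': 1})
```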

 


 

UPDATE: Better handling of non-words.

jasperlch
Quasar

Solution attached.

Natasha
Comet

Very minimal data prep, as I wanted to preserve as much as possible for exploring the data in Tableau.

 

TeePee
Asteroid

I'm stuck in a loop and have a very specific question.  Hope someone can help.  Of course, if my question is not OK, please feel free to remove!

I'm familiar with Twitter message contents (the "Tweet" field) containing non-ASCII characters, and I usually get around this in Excel by using UTF-8 encoding. However, in Alteryx nothing I've tried is working. I've tried both:
(1) CSV input options: Code Page = UTF-8 (option 11) and Field Length = 254 (option 7)
and
(2) Formula tool: ConvertToCodePage([Tweet], 65001)

In both cases I STILL see lots of question marks. 

My question is just this: have these non-standard characters irreversibly lost their meaning in the csv files provided?  OR am I doing something wrong?  Has anyone here managed to convert the question marks to kanji or emoji or whatever it is they were originally meant to be?

Thanks in advance

 

MikeSp
Alteryx

Hello @TeePee

 

I can confirm that the CSV files do not have the proper encoding to be able to read these wide characters back in, I'm afraid. These are typically emojis or other wide characters as you suspect.
Mike Spoula
Senior Solutions Architect
Alteryx
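The loss described above is easy to demonstrate in Python (hypothetical tweet text; the point is that the '?' substitution happens at write time and is many-to-one, so no code-page setting on read can undo it):

```python
# Hypothetical tweet containing characters outside a narrow code page.
tweet = "Great rally today! 🎉 頑張って"

# Writing through a narrow code page (Latin-1 here) replaces every
# unmappable character with '?', discarding the original code points.
lossy = tweet.encode("latin-1", errors="replace").decode("latin-1")

print(lossy)  # Great rally today! ? ????
# Re-reading the file as UTF-8 (or ConvertToCodePage afterwards) only
# sees literal '?' bytes -- the emoji and kanji are gone for good.
assert "🎉" not in lossy and "頑" not in lossy
```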
TeePee
Asteroid

Thanks so much for taking the time to reply.  Much appreciated.  

kcgreen
Asteroid

 

I noticed many tweets had a "We" phrase, such as "We must..." and "We need...". I pulled out the "We" phrases and did some counts a few different ways. I could spend hours playing with this, but for Challenge #90 I'm going to just make a word cloud to show some of the more popular "We" phrases in the data.
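That "We" phrase extraction could be sketched with a regex in Python (illustrative tweets, not the challenge data):

```python
import re
from collections import Counter

tweets = [
    "We must protect our future!",   # illustrative examples
    "We need change. We must vote.",
    "Weather is nice today",         # "Weather" should not match
]

we_phrases = Counter()
for t in tweets:
    # "\bWe " plus a space keeps "Weather" from matching; grab the next word.
    we_phrases.update(re.findall(r"\bWe \w+", t))

print(we_phrases.most_common())  # [('We must', 2), ('We need', 1)]
```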

 


 

Elena_Caric
Asteroid

That was fun; I gave the Charting Tool a go. I may bring it over to Tableau later.