
Weekly Challenges

Solve the challenge, share your solution and summit the ranks of our Community!


Challenge #89: Analyzing Social Data

ggruccio
ACE Emeritus

I chose to focus on the other unique hashtags per record, exported to a .tde for a Tableau dashboard.
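In rough Python terms (outside Alteryx), the hashtag extraction might look like this; the file and column names are assumptions about the challenge data:

    import re
    import pandas as pd

    tweets = pd.read_csv("tweets.csv")  # hypothetical file/column names

    def unique_hashtags(text):
        # Distinct hashtags in one tweet, lowercased for counting.
        return sorted({tag.lower() for tag in re.findall(r"#\w+", str(text))})

    tweets["hashtags"] = tweets["Tweet"].apply(unique_hashtags)
    tweets["hashtag_count"] = tweets["hashtags"].str.len()  # unique hashtags per record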

samN
10 - Fireball

Went pretty simple on this one

jamielaird
14 - Magnetar

I wanted to do some text analysis, so I thought it would be interesting to focus on the originally-authored tweets, to avoid the duplication caused by RTs.

 

I built a data prep workflow that excludes RTs and outputs two TDE files for use in Tableau: the first for a word cloud, with the tweet text parsed and prepared, and the second for applying filters to that word cloud (date, dataset, and other tweet-level metadata).
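For anyone who prefers code to a workflow picture, a rough Python equivalent of the RT exclusion and text parsing steps (file and column names are assumptions):

    import re
    import pandas as pd

    tweets = pd.read_csv("tweets.csv")  # hypothetical file/column names

    # Keep only originally-authored tweets: drop anything starting "RT @".
    original = tweets[~tweets["Tweet"].str.startswith("RT @", na=False)]

    # Parse tweet text into word-cloud tokens, stripping URLs, @mentions,
    # and non-words (roughly the "better handling of non-words" step below).
    def tokenize(text):
        text = re.sub(r"https?://\S+|@\w+", " ", str(text))
        return re.findall(r"[a-z']{2,}", text.lower())

    word_counts = original["Tweet"].apply(tokenize).explode().value_counts()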

 

Spoiler: [Screen Shot 2017-12-28 at 15.52.12.png]

 

UPDATE: Better handling of non-words.

jasperlch
12 - Quasar

Solution attached.

Natasha
9 - Comet

Very minimal data prep, as I wanted to preserve as much as possible for exploring the data in Tableau.

 

Spoiler: [Screen Shot 2018-01-07 at 12.01.13.png]
TeePee
8 - Asteroid

I'm stuck in a loop and have a very specific question.  Hope someone can help.  Of course, if my question is not OK, please feel free to remove!

Spoiler
I'm familiar with Twitter message contents (the "Tweet" field) containing non-ASCII characters, and I usually get around this in Excel by using UTF-8 encoding. However, in Alteryx nothing I've tried is working. I've tried both:
(1) CSV input options: Code Page = UTF-8 (option 11) and Field Length = 254 (option 7)
(2) Formula tool: ConvertToCodePage([Tweet], 65001)

In both cases I STILL see lots of question marks. 

My question is just this: have these non-standard characters irreversibly lost their meaning in the CSV files provided, or am I doing something wrong? Has anyone here managed to convert the question marks back to the kanji or emoji (or whatever) they were originally meant to be?

Thanks in advance

 

MikeSp
Alteryx

Hello @TeePee

 

Spoiler
I can confirm that the CSV files do not have the proper encoding to be able to read these wide characters back in, I'm afraid. These are typically emojis or other wide characters as you suspect.
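One way to confirm this from the raw bytes, sketched in Python (the file name is assumed):

    # Read the file as raw bytes, bypassing any decoding step.
    with open("tweets.csv", "rb") as f:
        raw = f.read()

    # If 0x3F ("?") is literally stored where the emoji/kanji should be,
    # the character was replaced when the file was written, and no code
    # page setting can bring it back. A genuine UTF-8 emoji would appear
    # as a multi-byte sequence instead, e.g. b"\xf0\x9f\x98\x80" (U+1F600).
    print(raw.count(b"?"))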
Mike Spoula
Senior Solutions Architect
Alteryx
TeePee
8 - Asteroid

Thanks so much for taking the time to reply.  Much appreciated.  

kcgreen
8 - Asteroid

 

I noticed many tweets had a "We" phrase, such as "We must..." and "We need...". I pulled out the "We" phrases and did some counts a few different ways. I could spend hours playing with this, but with Challenge #90 up next I'm just going to make a word cloud showing some of the more popular "We" phrases in the data.
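A quick Python sketch of that phrase extraction, with the same assumed "Tweet" column as above:

    import pandas as pd

    tweets = pd.read_csv("tweets.csv")  # hypothetical file/column names

    # Capture "We <word>" bigrams such as "We must" and "We need",
    # then count how often each phrase occurs.
    phrases = tweets["Tweet"].str.extractall(r"\b(We \w+)")[0]
    print(phrases.value_counts().head(10))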

 

Spoiler: [89_v1.JPG, 89_v2.JPG]

 

Elena_Caric
8 - Asteroid

That was fun; I gave the Charting Tool a go. I may bring it over to Tableau later.