Do you have the skills to make it to the top? Subscribe to our weekly challenges. Try your best to solve the problem, share your solution, and see how others tackled the same problem. We share our answer too.
Weekly Challenge

Challenge #89: Analyzing Social Data

Bolide

I chose to focus on the other unique hashtags in each record, then exported to .tde for a Tableau dashboard.

Fireball

Kept it pretty simple on this one.

Alteryx Certified Partner

I wanted to do some text analysis, so I focused on originally authored tweets to avoid the duplication caused by RTs.

 

I built a data prep workflow that excludes RTs, and outputs two TDE files for use in Tableau: the first to use for a wordcloud, with tweet text parsed and prepared, and the second to use for applying filters to that wordcloud (i.e. the date, dataset, and other tweet-level metadata).
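For readers who want to try the same idea outside Alteryx, here is a minimal Python sketch of the first output (word counts for a word cloud, with RTs excluded). It is not the workflow from the post. The `"RT @"` prefix test, the URL stripping, the three-letter minimum, and the function names `is_retweet` and `words_for_cloud` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a retweet is any tweet whose text starts with "RT @".
def is_retweet(text):
    return text.lstrip().startswith("RT @")

URL_RE = re.compile(r"https?://\S+")   # links are treated as non-words
WORD_RE = re.compile(r"[a-z']+")       # lowercase letters and apostrophes only

def words_for_cloud(tweets):
    """Count words across originally authored tweets only."""
    counts = Counter()
    for text in tweets:
        if is_retweet(text):
            continue  # skip RTs so duplicated text is not counted again
        cleaned = URL_RE.sub(" ", text.lower())
        # Assumption: words of one or two letters add little to a word cloud.
        counts.update(w for w in WORD_RE.findall(cleaned) if len(w) > 2)
    return counts

sample = [
    "RT @someone: We must act now",
    "We must act now on climate https://t.co/abc",
    "Climate action now!",
]
cloud_counts = words_for_cloud(sample)
```

The tweet-level metadata for the second file (date, dataset, and so on) would be kept alongside a tweet identifier so the two outputs can be joined for filtering.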

 

Spoiler
Screen Shot 2017-12-28 at 15.52.12.png

 

UPDATE: Better handling of non-words.

Quasar

Solution attached.

Alteryx Certified Partner

Very minimal data prep, as I wanted to preserve as much as possible for exploring the data in Tableau.

 

Spoiler
Screen Shot 2018-01-07 at 12.01.13.png
Asteroid

I'm stuck in a loop and have a very specific question. I hope someone can help. Of course, if my question is not OK, please feel free to remove it!

Spoiler
I'm familiar with Twitter message contents (the "Tweet" field) containing non-ASCII characters, and I usually get around this in Excel by using UTF-8 encoding. However, nothing I've tried in Alteryx is working. I've tried both:
(1) (csv input option 11) Code Page = UTF-8 and (csv input option 7) Field Length = 254
and
(2) (formula tool) ConvertToCodePage([Tweet], 65001)

In both cases I STILL see lots of question marks. 

My question is just this: have these non-standard characters irreversibly lost their meaning in the csv files provided?  OR am I doing something wrong?  Has anyone here managed to convert the question marks to kanji or emoji or whatever it is they were originally meant to be?

Thanks in advance

 

Moderator

Hello @TeePee

 

Spoiler
I can confirm that the CSV files do not have the proper encoding to be able to read these wide characters back in, I'm afraid. These are typically emojis or other wide characters as you suspect.
Mike Spoula
Solutions Architect - Services
Alteryx
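To illustrate why the question marks in the reply above cannot be converted back, here is a minimal Python sketch. It assumes, and the thread does not confirm this, that the CSVs were at some point written through a single-byte code page such as Windows-1252 with unsupported characters replaced by `?`. The sample text and the function name `simulate_lossy_export` are hypothetical.

```python
def simulate_lossy_export(text, codepage="cp1252"):
    """Encode text to a code page, replacing unsupported characters with '?'."""
    return text.encode(codepage, errors="replace")

# Hypothetical tweet text: a sun symbol, an emoji, and two kanji.
original = "Great day \u2600 \U0001F600 \u65e5\u672c"

lossy_bytes = simulate_lossy_export(original)

# Whatever encoding is chosen on re-import, the bytes on disk are now
# literal question marks (0x3F), so every decode gives the same result.
as_cp1252 = lossy_bytes.decode("cp1252")
as_utf8 = lossy_bytes.decode("utf-8")

print(as_cp1252)
print(as_utf8)
```

Both decodes give identical text because the original code points were discarded when the file was written. Changing the code page on input, or converting the field afterwards, only reinterprets bytes that are already `?`.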
Asteroid

Thanks so much for taking the time to reply.  Much appreciated.  

Asteroid

 

I noticed many tweets had a "We" phrase such as "We must..." and "We need...". I pulled out the "We" phrases and did some counts a few different ways. I could spend hours playing with this but for Challenge #90, I'm going to just make a word cloud to show some of the more popular "We" phrases in the data.
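As a rough sketch of how "We" phrases might be pulled out and counted, here is a small Python example. It is not the workflow from this post. Treating a "We" phrase as the word "we" plus the single word that follows it is an assumption, and `count_we_phrases` is a hypothetical name.

```python
import re
from collections import Counter

# Assumption: a "We" phrase is "we" followed by whitespace and one word.
WE_PHRASE_RE = re.compile(r"\bwe\s+([a-z']+)", re.IGNORECASE)

def count_we_phrases(tweets):
    """Count two-word phrases that start with 'we', ignoring case."""
    counts = Counter()
    for text in tweets:
        for match in WE_PHRASE_RE.finditer(text):
            counts["we " + match.group(1).lower()] += 1
    return counts

sample = [
    "We must act now",
    "we need change and We must vote",
    "Answer: we need help",
]
we_counts = count_we_phrases(sample)
```

The resulting counts can feed a word cloud directly, with each phrase sized by its frequency.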

 

Spoiler

 

 

89_v1.JPG, 89_v2.JPG

 

Alteryx Certified Partner

That was fun; I gave the Charting Tool a go. I may bring it over to Tableau later.