Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Analyzing irregular data from Facebook Messenger

birdie124
5 - Atom

Hey there,

 

I would like to analyze data I downloaded from my Facebook Messenger, however the format the data arrived in is puzzling for me. 

 

You can see the raw data in the first column. I described it in the second column.

You'll notice that most of the data follows the pattern of Sender > Message> Reaction (if any) > Date.

When a photo or GIF is sent, the cell is empty.

When there is no reaction to a message or a photo/gif, the data follows the pattern of Sender > Message> Date.

Raw dataMy classification 
Person ASender 
HelloMessage
❤Person BReaction
Jun 09, 2023 9:21:51pmDate
Person ASender 
 Photo (when a photo is sent, the row is empty)
❤Person BReaction
Jun 09, 2023 9:21:38pmDate
Person BSender 
Hey thereMessage
Jun 09, 2023 9:10:51pmDate
Person BSender 
Good morningMessage
Jun 09, 2023 9:10:45pmDate
Person ASender 
Top of the morningMessage
❤Person BReaction
Jun 09, 2023 9:09:37pmDate

 

As I have hundreds of thousands rows of data, doing the manual classification for all rows is not an option :)

I scoured the community for an answer but I could only find solutions that handled such data when the data followed a more regular pattern than here. 

Could you please advise me on the steps that would get me to the following data structure?

Thank you in advance :)

SenderMessageReactionDate
Person AHello❤Person BJun 09, 2023 9:21:51pm
Person APicture/gif❤Person BJun 09, 2023 9:21:38pm
Person BHey there Jun 09, 2023 9:10:51pm
Person BGood morning Jun 09, 2023 9:10:45pm
Person ATop of the morning❤Person BJun 09, 2023 9:09:37pm

 

2 REPLIES 2
Yoshiro_Fujimori
15 - Aurora
15 - Aurora

Hi @birdie124 ,

 

Here is one way of doing this.

The approach is to simply fill the category one by one using IF clauses.

As the sample data is small, there might be still some exceptional cases with larger data set.

So please test with larger data and modify the formula if necessary.

 

Workflow

Yoshiro_Fujimori_0-1686441490543.png

 

Formula tool

Category = IF REGEX_Match([data], "[^[:ascii:]].*") THEN "Reaction" ELSE Null() ENDIF

Category = IF IsNull([Category]) AND REGEX_Match([data], "\w{3}\s\d{2},\s\d{4}.*") THEN "Date" ELSE [Category] ENDIF

Category = IF IsNull([data]) THEN "Photo" ELSE [Category] ENDIF

 

Multi-Row Formula

Category = 

IF IsEmpty([Row-1:data]) AND IsEmpty([Row-1:Category]) OR [Row-1:Category] = "Date" THEN "Sender" ELSE [Category] ENDIF

 

Formula

Category = IF IsNull([Category]) THEN "Message" ELSE [Category] ENDIF

 

Multi-Row Formula

IF [Category] = "Sender" THEN [Row-1:RecordID] + 1 ELSE [Row-1:RecordID] ENDIF

 

Output

Yoshiro_Fujimori_1-1686441658074.png

 

birdie124
5 - Atom

Works like a charm! Thank you very much for your solution, Yoshiro! :)

Labels
Top Solution Authors