Want to get involved? We're always looking for ideas and content for Weekly Challenges.
SUBMIT YOUR IDEA
Hi Maveryx,
A solution to last week’s challenge can be found here.
This week's challenge, contributed by our Community member AntoBennetsha Jabamalai (@AntobennetshaJ), will sharpen your data preparation skills. A big thanks to Anto!
Who among us has not encountered the task of handling dates in various formats in our daily work? But let’s be honest: who regularly checks the date format before streamlining their processes?
In this challenge, we are faced with the task of cleaning a set of dates in a survey dataset. The date was collected in a string field, which meant that survey respondents entered the dates in different formats. This resulted in dates having different delimiters and additional, unnecessary information. For practical analysis of the survey data, it is crucial to standardize all dates into a consistent format (YYYY-MM-DD) and organize them from the oldest to the most recent.
Need a bit of guidance? The interactive lesson Separating Data into Columns and Rows in Academy offers insights on handling values like those in our dataset, which include multiple separators such as slashes, hyphens, and dots, as well as other superfluous characters.
Good luck!
c 414
Surprised I couldn't get DateTimeFormat to take a column for the formatting string.
oh my - and lol - as to the nerve of the data source owner who has the guts to use a date format where mm points to a month instead of MM, but we can work with that, no problem!
here's my take
Good to go! 😎
Complete! (And yay, page 1! First time)
I was disappointed that I couldn't use a a column to specify the date format. Maybe a future Alteryx feature?
I ended up using RegEx Replace since I'm trying to practice with it. If I were going to use this in real life, I'd add handling for all the different combinations of day-month-year that people could put. I only did the relevant ones for this challenge.
Note for @AYXAcademy - For the solution within the Start File, 1) the first row of data is the header column instead of an actual header, and 2) it's sorted newest to oldest, not oldest to newest.