
Hi Community members,
A solution to last week’s challenge can be found here.
This challenge was submitted by John Primeaux (@jeprime). Thank you, John, for your submission!
This week, you’re stepping into the role of a data quality detective. A data table has been populated with critical tracking codes—but many of them have been entered incorrectly.
Each code is supposed to follow a very specific structure. This format is vital because these codes are used to locate corresponding documentation within a larger database. Your mission: clean them up and restore order.
The required format is 1 to 4 alpha characters, a hyphen, 3 digits, a hyphen, 2 digits, then one optional alpha character.
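As an illustration only (the challenge is intended to be solved in your own workflow), the required format can be written as a regular expression. This is a minimal Python sketch; the names VALID_CODE and is_valid_code are hypothetical, and it assumes "alpha characters" means the ASCII letters A-Z in either case.

```python
import re

# Strict pattern for the target format:
# 1 to 4 letters, hyphen, 3 digits, hyphen, 2 digits, optional trailing letter.
# Assumption: letters are ASCII and case is not restricted.
VALID_CODE = re.compile(r"[A-Za-z]{1,4}-[0-9]{3}-[0-9]{2}[A-Za-z]?")


def is_valid_code(code: str) -> bool:
    """Return True only if the whole string matches the required format."""
    return VALID_CODE.fullmatch(code) is not None
```

Using fullmatch rather than search means a code with extra leading or trailing characters is rejected instead of partially matched.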
The numeric portions (the 3-digit and 2-digit groups) are always generated correctly by the system. However, the delimiters may be missing or incorrect (for example, an underscore "_" entered instead of a hyphen "-").
Hint: Assume that the three digits and two digits inside the code are created by a computer; therefore, the hyphen between them is never missing.
Task 1: Create the code in the correct format (using only hyphens "-"), including the optional letter suffix, and compare it to the old code.
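One possible way to approach Task 1 outside of a workflow tool is sketched below in Python. It is not the official solution. The names LOOSE_CODE, fix_code, and compare are hypothetical, and the sketch assumes that a delimiter is either a hyphen, an underscore, or absent, and that the original letter case should be kept. It is deliberately tolerant and treats both delimiters as optional.

```python
import re

# Tolerant pattern: each delimiter may be "-", "_", or missing entirely.
# Assumption: no other delimiter characters appear in the data.
LOOSE_CODE = re.compile(
    r"([A-Za-z]{1,4})[-_]?([0-9]{3})[-_]?([0-9]{2})([A-Za-z]?)"
)


def fix_code(old_code):
    """Rebuild a code with hyphen delimiters, or return None if unrecognized."""
    match = LOOSE_CODE.fullmatch(old_code.strip())
    if match is None:
        return None
    prefix, three_digits, two_digits, suffix = match.groups()
    # The suffix group is an empty string when no trailing letter is present.
    return f"{prefix}-{three_digits}-{two_digits}{suffix}"


def compare(old_code):
    """Pair the old code with its corrected form and flag whether it changed."""
    new_code = fix_code(old_code)
    return {"old": old_code, "new": new_code, "changed": new_code != old_code}
```

For example, compare("AB_123-45") would report the corrected code "AB-123-45" with changed set to True, while a code that is already well formed comes back unchanged.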
Task 2: Calculate the percentage of codes with a letter suffix, as well as the percentage of codes that have missing or incorrect delimiters.
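The two percentages in Task 2 could be computed along the following lines. This is a hedged Python sketch with hypothetical names (STRICT, LOOSE, summarize). It assumes both percentages use the total number of codes as the denominator, and that a code has a delimiter problem when it is recognizable under a tolerant pattern but fails the strict one.

```python
import re

# Strict pattern: the exact required format with hyphens only.
STRICT = re.compile(r"[A-Za-z]{1,4}-[0-9]{3}-[0-9]{2}[A-Za-z]?")
# Tolerant pattern: delimiters may be "-", "_", or missing (an assumption).
# The single capture group holds the optional letter suffix.
LOOSE = re.compile(r"[A-Za-z]{1,4}[-_]?[0-9]{3}[-_]?[0-9]{2}([A-Za-z]?)")


def summarize(codes):
    """Return the percentage of codes with a suffix and with bad delimiters."""
    total = len(codes)
    if total == 0:
        return {"pct_with_suffix": 0.0, "pct_bad_delimiters": 0.0}

    with_suffix = 0
    bad_delimiters = 0
    for code in codes:
        loose = LOOSE.fullmatch(code)
        if loose is None:
            # Unrecognized codes stay in the denominator but are not counted.
            continue
        if loose.group(1):
            with_suffix += 1
        if STRICT.fullmatch(code) is None:
            bad_delimiters += 1

    return {
        "pct_with_suffix": 100.0 * with_suffix / total,
        "pct_bad_delimiters": 100.0 * bad_delimiters / total,
    }
```

With a made-up sample such as ["AB-123-45", "AB_123-45X", "ABC123-45", "ABCD-123-45Z"], two of the four codes carry a suffix and two have delimiter problems, so both percentages come out at 50.0.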
Once you have completed your challenge, include your solution file and a screenshot of your workflow as attachments to your comment.
Good luck!
The Academy Team
Source: Dataset generated by challenge creator.