Challenge #37: Parsing a Raw XML File

bradp
7 - Meteor

A very interesting problem which required a bit of a different solution. Managed to validate the final answer as well. 


clmc9601
13 - Pulsar

My solution

aiahwieder
8 - Asteroid

Lots of ways to crack this nut; I went a little "brute force" but I know there are more elegant solutions out there!

Hub119
11 - Bolide

Solution attached.

BeginnerMindset
8 - Asteroid

Fun way to learn about parsing raw XML files (Not as painful as it sounds!)


mot
11 - Bolide

Solution attached.

JP_SDAK
8 - Asteroid

In looking at other solutions, it looks like I could have streamlined this some.


LHolmes
9 - Comet

Maybe using the XML Parse tool was cheating, but it was fun!

apathetichell
19 - Altair

I included the billing reference and a record ID column, so there are two additional columns here. I kind of feel like RegEx may have been the easier route than the XML Parse option that I used.

Jon_Taylor
8 - Asteroid

Can you explain (?<=\w)? I have seen "?" in a formula before but do not understand it. I see it taking one word here, and I have seen the same formula take several words.
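For anyone with the same question, here is a minimal sketch of what (?<=\w) does. It uses Python's re module, on the assumption that it treats this construct the same way Alteryx's RegEx tool does. The sample text is made up for illustration and is not from the challenge data.

```python
import re

# Hypothetical sample text, not taken from the challenge data.
text = "a, b ,c"

# (?<=\w) is a lookbehind: it requires a word character immediately
# before the current position but does not include it in the match.
pattern = re.compile(r"(?<=\w),")

# Only the first comma follows a word character ("a");
# the second comma follows a space, so it is skipped.
positions = [m.start() for m in pattern.finditer(text)]
replaced = pattern.sub(";", text)

print(positions)
print(replaced)
```

The "?" here is part of the (?<= ... ) group syntax, so it is unrelated to the "?" that follows a quantifier such as .*? elsewhere in the formula.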

 

In challenge #37 I am lost with this:

 

The formula of the solution for challenge #37 is as follows (I had to look as I was very stuck and re-watching training over and over was not helping)

(\d{7})(.*?)\s(\w+)\s\((.*)\).*CAT:(.*) PUB:(.*?)\s\$(.*)

 

 

(\d{7}) = the first 7 digits. This one is easy.

 

(.*?)\s(\w+)\s\((.*)\) = the game name (?) and year. This part is confusing me. I see that \((.*)\) identifies the date, but why not \((\d{4})\)? Before the date (xxxx), I do not see how the formula parses the name and separates out the platform. I assume the last \s is the space before the date (xxxx), but I do not know what (.*?) is. I see ?: in the menu, but not a ? by itself. To me this reads as “any character, zero or more times” (.*?), then a space \s, then “one or more word characters” \w+. How is that saying Wii Sports as a name and then NES as a publisher?
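The first half of the formula can be tried out with a small sketch. It uses Python's re module, assuming it behaves like Alteryx's RegEx tool for these constructs, and the sample line is hypothetical: its layout is guessed from the formula, not copied from the challenge file.

```python
import re

# Hypothetical sample line; the layout is inferred from the formula only.
line = "0000001Wii Sports Wii (2006)"

# A "?" placed after "*" makes the quantifier lazy: (.*?) grabs as little
# as possible, then grows one character at a time until the rest of the
# pattern (space, one word, space, parenthesised text) can match.
pattern = re.compile(r"(\d{7})(.*?)\s(\w+)\s\((.*)\)")
groups = pattern.match(line).groups()

# The stricter year group asked about above gives the same result here.
strict = re.compile(r"(\d{7})(.*?)\s(\w+)\s\((\d{4})\)")
strict_groups = strict.match(line).groups()

print(groups)
print(strict_groups)
```

Because \s(\w+)\s\( only fits the single word directly before the opening parenthesis, the lazy group is forced to take everything before that word, which is why a multi-word name ends up in one group and the platform in the next.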

 

Later in the formula they use (.*?) after “PUB:”. Here, as at the beginning, I do not see how it works: it pulls one word with no punctuation, whereas at the beginning it seems to pull multiple words with punctuation.
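A short sketch of the tail of the formula shows why the same (.*?) can return one word in one place and several in another. It uses Python's re module, assuming it matches Alteryx's behavior for these constructs, and the sample strings are hypothetical.

```python
import re

# Hypothetical sample tails; values are invented for illustration.
one_word = "CAT:Sports PUB:Nintendo $41.49 $29.02 $3.77"
two_words = "CAT:Sports PUB:Electronic Arts $1.00"

# Lazy (.*?) after PUB: stops at the FIRST place where " $" follows.
lazy = re.compile(r"CAT:(.*) PUB:(.*?)\s\$(.*)")
lazy_one = lazy.search(one_word).groups()
lazy_two = lazy.search(two_words).groups()

# Greedy (.*) after PUB: runs on to the LAST place where " $" follows.
greedy = re.compile(r"CAT:(.*) PUB:(.*)\s\$(.*)")
greedy_one = greedy.search(one_word).groups()

print(lazy_one)
print(lazy_two)
print(greedy_one)
```

The lazy group takes whatever sits between "PUB:" and the first " $", whether that is one word or several. The final (.*) is greedy, so it keeps everything to the end of the line, including any further dollar amounts.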

 

Then at the end, \$(.*) is pulling multiple sets of numbers.

 

I went back through the regex training several times but cannot seem to locate these specific formula types.

 

Sorry for posting in the challenge thread, but this popped out at me. A moderator can move this if need be.