Once I went from trying to generate all of the rows and just generated all the range/ingredient combinations. Once again we are reminded to consider the size of the problem and maybe solve the small one rather than the big one.
I had the basic idea that you sort the ranges and then prune each one to begin at max(Current Begin, Previous End +1). Once I looked at the sorted ones I saw the second bit, which is that sometimes Previous End +1 >End, in which case you throw the whole range out. What caught me was the last thing, which is that the above logic only works if there are duplicates. If there are triplicates or more you need to apply the logic an additional time. Fortunately, Triplicates was as high as it went, so no Iteration required.
Part 1 was pretty straight forward... I might have taken an inappropriate shortcut and just brought the data in as two inputs. But i'm confident I could have split them up anyway, so whatever, I moved on.
Appending the data allowed me to just compare the ingredients list as "between" the low and high values of the range. Filter to yes, distinct, and count.
Part 2 was a little more dubious, simply because of the volume of values. But the key is in the overlaps. By creating groups of distinct ranges, you can reduce the overall number of ranges considerably. Then just count the number of values in the range (high minus low plus 1 to include) and you have the number of unique values. Sum those up and there you go!
In Part 2, I merged each range. First, I sort by the smaller values. If a larger value is smaller than the next smaller value, these two ranges cannot be merged.
Part 1 was a straight-forward brute force approach. There are absolutely ways to optimize, but there is not enough data for that to really matter here. Append, Compare, and count the unique Ingredients!
Part 2 is where the fun is had! We have a list of potentially overlapping ranges and we need to count how many unique ID's are contained in at least one range. If the numbers were smaller, we could easily Generate Rows for each range and then Count Distinct as in part 1, but the numbers are too large for this to be practical. Unfortunately for us, if the ranges were all disjointed (no overlap), then this would still be trivial because we can "count" the ID's in each range with the formula [Max ID] - [Min ID] + 1, and then get the answer by summing over all of those counts.
For example: [1,3] & [6,7] has 1,2,3 & 6,7 - 5 ID's total which can be calculated as (3) - (1) + 1 = 3 (7) - (6) + 1 = 2 -> 3+ 2 = 5
The key insight is that we need a way to identify which ranges overlap and then merge them, which will allow us to solve the much simpler problem described above. This can be solved in several ways. My first solution involved building a macro to repeatedly check for merges and perform them, but I optimized that and wanted to showcase a dynamic, non-macro-based solution to this problem, which involves String Coding, a way of problem solving I have tried to popularize and discourage.
If you sort the ranges on their minimum values, you can step through the data with the Multi-Row Formula Tool to check if the maximum ID of the previous merged range is greater than the minimum ID of the current range, then there is overlap and the two ranges should be merged. In order to do this without a macro, you must track both the current group in addition to the maximum ID you've seen up to that point. This is because a large range, may contain several smaller ranges, and if you don't track the max, you'll lose sight of the bounds truly being considered:
It's obvious above that 1000 >> 3, so ranges 1 and 2 should be merged, but if the next Multi-Row Formula Tool step only considers ranges 2 and 3 they look disjoint, because the interval component that connects the two ranges is the 1000 from range 1, which is why we need to track both pieces of information. Finally for each group of ranges we can merge by finding the minimum minimum ID and the maximum maximum ID of the group at which point the ranges are guaranteed to be disjoint and ready for the simple counting process!
Started out by way overcomplicating this, thinking back to AoC 2021 where you need to account for overlapping cubes in 3 dimensional space and building in some set math (I blame the Whiskey Advent Calendar I am pairing with this event this year 🥃)... only to eventually realize I could simplify this big time by just sorting my ID ranges in part 2 and just build groups adjusting the range bounds as I go...
In retrospect, probably could have just done this with some multi-row formula tools instead of an iterative macro, but once I had started I couldn't bring myself to reverse course. Macro
Another challenge I was able to complete in 1 sitting, taking just over 1 hour in total for both parts! I did not find this puzzle particularly challenging, but part 2 had it's *specialties*. Anyways, it's back to waiting for another puzzle!
Part 1 was insanely easy. I literally just appended each candidate to each range, checked for each candidate if it was within any range, and counted up the yes's Boom, done!
Part 2 didn't SEEM hard, but it was in the test cases that it got tricky. For anyone who is wondering, yes I did try to use generate rows initially, and yes... my Alteryx crashed😂 Can you hate me for trying though? Anyways, I realized that if I sorted the ranges by their starting value then I could use a multi-row tool to go through the ranges one by one. I thought the process I created was airtight: Go through the list and assign them one of 3 'situations': Usurped by previous, overlapping previous, or independent of previous. Then, remove the usurped ranges and go through the list again, setting the overlapping range's start values to the previous range's end value + 1. Should be done, right? After adding erroneous +1s in random places and testing outputs over and over, I decided to dig deeper into the actual ranges in my input and that's where I found some wacky ranges that were causing problems for my algorithm. I thought I was banished to the iterative macro realm, but decided to do one more manual iteration of the process just to see if I could get only independent ranges and voila, the answer was found.
Overall, this was another really strong day for me, as I was able to get both parts done in a decent time window and once again with 0 assistance. I thought surely I wouldn't be able to keep that going but here we are!
Is it just me who first applied the Generate Rows tool for part2 and as soon as run WF, endless number of rows were being generated? Thanks to Algorithm.
For part2, the core concept is to merge overlapped ranges each other, and create distinct and independent ranges. Mine is more or less similar to others.