I have a table that breaks out data based on classes. In some cases, a class does not exist for a city. When that happens, I want to create the class by duplicating the next-closest class.
| City | Class | Rent |
| --- | --- | --- |
| New York | A | 30 |
| New York | B | 20 |
| New York | C | 10 |
| Los Angeles | A | 20 |
| Los Angeles | B | 15 |
| Los Angeles | C | 10 |
| Seattle | B | 10 |
| Seattle | C | 5 |
| Austin | A | 15 |
| Austin | C | 5 |
In the above example, Seattle does not have a Class A row. I want to duplicate the next highest class row for that city. For Seattle, I want to duplicate the Class B row and name it Class A. For Austin, I want to duplicate the Class A row and name it Class B. My end result would be like the table below.
| City | Class | Rent |
| --- | --- | --- |
| New York | A | 30 |
| New York | B | 20 |
| New York | C | 10 |
| Los Angeles | A | 20 |
| Los Angeles | B | 15 |
| Los Angeles | C | 10 |
| Seattle | A | 10 |
| Seattle | B | 10 |
| Seattle | C | 5 |
| Austin | A | 15 |
| Austin | B | 15 |
| Austin | C | 5 |
My end result would be each city having a Class A, B, and C row, with any missing row duplicated from the highest available class.
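For readers who prefer code, here is a minimal pandas sketch of the requested transformation (pandas is my own assumption here; the thread itself is about Alteryx tools). It pads out the missing City/Class combinations and fills each gap with that city's highest available rent, which reproduces the expected table above for this data.

```python
import pandas as pd

# Sample data from the question; Seattle is missing Class A, Austin is missing Class B.
df = pd.DataFrame({
    "City": ["New York", "New York", "New York", "Los Angeles", "Los Angeles",
             "Los Angeles", "Seattle", "Seattle", "Austin", "Austin"],
    "Class": ["A", "B", "C", "A", "B", "C", "B", "C", "A", "C"],
    "Rent": [30, 20, 10, 20, 15, 10, 10, 5, 15, 5],
})

# Pivot so every City x Class combination exists (missing ones become NaN)...
wide = df.pivot(index="City", columns="Class", values="Rent")

# ...then fill each city's gaps with that city's highest available rent
# (for this data, "the highest available class" is always the per-city maximum).
wide = wide.apply(lambda row: row.fillna(row.max()), axis=1)

# Back to long format.
result = (wide.reset_index()
              .melt(id_vars="City", var_name="Class", value_name="Rent")
              .sort_values(["City", "Class"])
              .reset_index(drop=True))
```

Note this fills with the per-city maximum rather than literally walking to the adjacent class, which matches the accepted solution below but is worth double-checking against your real data.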
@taran42
A bit different approach.
Hi @taran42,
while I think that both @gabrielvilella's and @TonyA's posts would solve the problem, I'd like to offer an additional approach, because I think both share the same two potential flaws:
1. Append Fields creates all possible combinations, which could have a huge performance impact on large datasets.
2. The second Text Input is, in my opinion, duplicated information, so we'd always need to maintain two sources, which could get nasty at some point.

If their approach works fine with your real dataset, go ahead and use it - both should satisfy you.
If it doesn't, here is what you might try:
I would call it neither clean nor perfect, and it has a flaw of its own (more on that later).
What happens?
The upper stream uses a typical design pattern: it transposes and cross-tabs the data back and forth. Why, you might ask? Because that alone already gets the data into the right shape. Take a look at it:
All the missing rows have been created. They aren't filled yet, but they are already there. At that point we could reach for a Multi-Row Formula and simply look one or two rows up/down, and we'd be done. But what if we had hundreds of classes? That's why we have the second stream.
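For those who think better in code, the Transpose/Cross Tab round trip can be approximated in pandas (my assumption; it is not part of the workflow itself): pivoting wide and melting back long creates every City x Class combination, with nulls where rows were missing.

```python
import pandas as pd

# Just the two cities with gaps, for brevity.
df = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Austin", "Austin"],
    "Class": ["B", "C", "A", "C"],
    "Rent": [10, 5, 15, 5],
})

# Pivoting and melting back mimics the Transpose/Cross Tab round trip:
# every City x Class combination now exists, with NaN where data was missing.
padded = (df.pivot(index="City", columns="Class", values="Rent")
            .reset_index()
            .melt(id_vars="City", var_name="Class", value_name="Rent"))
```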
Lower Part
The lower part calculates the maximum rent PER city and adds a column called 'Trickery' that simply contains a null value.
Bringing it together
Because of that null value, we can do a very smart join. In the original dataset we always have City, Class, and Rent; if the Rent is empty, we'd like to fill it up. By joining City to City and the empty Rent field to the null 'Trickery' field, only the missing rows connect to the per-city maximum.
We can also deselect the empty 'Value' field from the left side and rename 'Max_Rent' from the right side to 'Value', and then union the corrected rows with the ones that were fine by default.
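The lower stream, join, and union can be sketched the same way (again a pandas approximation of the Alteryx tools, not the workflow itself; the 'Max_Rent' name follows the post above). Only the rows with a missing Rent pick up the per-city maximum, and everything is then unioned back together.

```python
import pandas as pd
import numpy as np

# Padded data after the pivot step: missing combinations exist but Rent is NaN.
padded = pd.DataFrame({
    "City":  ["Seattle", "Seattle", "Seattle", "Austin", "Austin", "Austin"],
    "Class": ["A", "B", "C", "A", "B", "C"],
    "Rent":  [np.nan, 10, 5, 15, np.nan, 5],
})

# Per-city maximum rent -- the equivalent of the lower stream.
max_rent = (padded.groupby("City", as_index=False)["Rent"].max()
                  .rename(columns={"Rent": "Max_Rent"}))

# Equivalent of the null join: only rows whose Rent is missing pick up Max_Rent...
missing = padded[padded["Rent"].isna()].merge(max_rent, on="City")
missing["Rent"] = missing["Max_Rent"]

# ...and are unioned back with the rows that were fine by default.
result = (pd.concat([padded.dropna(subset=["Rent"]),
                     missing[["City", "Class", "Rent"]]])
            .sort_values(["City", "Class"])
            .reset_index(drop=True))
```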
Afterwards the data looks like this:
And that's where we can see the flaw this solution brings with it: we lost the original column names for Class / Rent during the process and weren't able to get them back. We could, but I don't want to add more complexity to a solution that is already a bit of trickery itself. Therefore I added a simple Select tool to correct the issue. If you care about the order, you should also sort the data - I didn't do that.
I'd probably go with one of the prior approaches if possible because mine might be harder to understand, but I thought it might be worth sharing.
I have attached the workflow for you to play around with and see the exact configuration.
Best
Alex
Thanks for the feedback, @grossal. Much cleaner and faster. I wouldn't consider your solution trickery at all. Joining on a null isn't usual practice but it does work. If someone really has an issue with it, it would be simple enough to assign a value to the "trickery" column and use a formula tool to rename the nulls after the transpose. As for the column renames, if that's really an issue, it's easy enough to add them back from the original data set with a Dynamic Rename.
@grossal @Qiu @TonyA @gabrielvilella Thank you for the plethora of solutions!
All of these solutions got me where I needed to go. In the end I went with Grossal's answer since, like he mentioned, it did not have any extra data inputs. Also thanks a bunch for the detailed explanation of the process.
@taran42
Thank you for your feedback and comment.
Would you also mark the solutions from me, @gabrielvilella, and @TonyA as accepted, so others may refer to them in the future? 😁
@Qiu I forgot you can select multiple solutions as correct. Done!
@taran42
Thank you very much for your kindness! 😁