Hey guys,
I am challenging a certain problem at the moment. I have a report converted from a .pdf to .xlsx. However the PDF contained headers at every single page. What I would like to do is deleting these rows and adding the information from it to every single row related to it. Here is an Example:
Account 100 | Cash |
Date | Text |
01.01.2000 | Bank YX |
20.03.2000 | Bank XY |
Account 200 | Inventory |
Date | Text |
20.03.2000 | Table |
30.08.2000 | Pencils |
31.08.2000 | Laptops |
01.09.2000 | PC's |
Account 300 | Buildings |
Date | Text |
01.09.2000 | Building A |
01.09.2000 | Building B |
I would like to look it more like that:
Account | Acc. Description | Date | Text |
100 | Cash | 01.01.2000 | Bank YX |
100 | Cash | 20.03.2000 | Bank XY |
200 | Inventory | 20.03.2000 | Table |
200 | Inventory | 30.08.2000 | Pencils |
200 | Inventory | 31.08.2000 | Laptops |
200 | Inventory | 01.09.2000 | PC's |
300 | Buildings | 01.09.2000 | Building A |
300 | Buildings | 01.09.2000 | Building B |
Is there any possible way to do that in alteryx? Thank you a lot.
Gelöst! Gehe zu Lösung.
Hi @Torben_Wiesbach,
welcome to the Alteryx Community and especially to the German Forum! Feel free to speak up in German next time. As the post was asked in English, I'll continue in English.
As often, it's definitely possible in Alteryx and doesn't need much tools to do so. I'll show you how it could be done.
The workflow looks like this:
And the result matches yours if I see it correctly:
Let's take a look how we achieved this:
0. I expect that your data comes in with no header, this makes it easier to handle it. For your real file, just deselect that the first Row contains field names in the option.
1. Separate the Account Number from the Word "Account". This can be done with a Text-To-Columns tool that splits by whitespace (\s):
The result afterwards looks like this:
2. In the next step we want to fill up the account-rows with a Multi-Row-Formula. We can simply say "If the current row is null, than take the previous row, if it's already filled, leave it as it is"
The result looks a lot better!
3. We move on with the Acc. Description column, therefore we could use another Multi-Row-Formula:
This time it looks a bit more tricky, but it's actually easier. We use the Group By option with our newly created field Account. Therefore we can say, if the previous description is null - this happens only at the first entry of every group, we take the value from the Field2 column. If the previous value isn't null, we can just take the previous one.
The result looks like this and we are almost done:
4. Time for some basic renaming and re-ordering. (Select Tool)
5. In the last step we remove all Rows with a Filter Tool that contain the Word "Account" or "Date" in the 'Date' Column.
An easy way to achieve this is using the IN-Funtion that basically checks if a value is in a list. In our case we add 'NOT' in front to remove all lines that contain one of these words.
That's it 🙂
I'll attach the workflow as a reference for you. Let me know if this helped and yet again, welcome to the forum!
Best
Alex
Hallo @Torben_Wiesbach,
War diese Antwort von @grossal für Sie hilfreich? Wenn Ihnen die Antwort geholfen hat, können Sie diese als „Lösung akzeptieren“. So können auch andere in der Community nützliche Antworten finden.
Danke!
Viel Spaß mit Alteryx, bei Fragen sind wir hier im Forum immer gerne für dich da.
Hi @grossal ,
danke dir vielmals 🙂 Das hat mein Problem gelöst. In Zukunft werde ich meine Posts auf Deutsch verfassen.
LG
Torben
Hallo @Torben_Wiesbach ,
vielen Dank, dass Sie die Antwort von als Lösung akzeptiert haben.
Es freut mich zu sehen, dass die deutsche Community (Danke @grossal 😎) Ihnen helfen konnte.
Viel Spaß mit Alteryx, bei Fragen sind wir hier im Forum immer gerne für dich da.
Einen schönen Tag,