Generate a new row based on values in multiple other rows
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
What I want to do is kind of like imputation, but it's enough different that it needs a post of its own. I don't actually have any missing data, and rather than modifying an existing field (like imputation would) I'd like to create an entirely new row based on calculations on some (or all) of the fields in at least two rows in the original dataset. I have a dataset of about 300 rows, so doing this manually is somewhat impractical
Here's the idea -
An example comma-delimited dataset might look like this -
row one: 1,2,4,purple,5,grape
row two: 3,6,7,green,9,strawberry
I want to create and append to the bottom of the table (or dataframe if you prefer) a new row, number three, that looks something like this-
row three: 2,8,4,Null,9,Null
The "Generate Rows" tool is too simple by itself. It only generates a sequence of numbers to create rows, and it has no provision for creating values in more than one column.
The Multi-Row Formula tool won't work by itself because it creates a new field in an existing row (based on the values in that and/or other rows), and an existing row is not where I need my new values.
If I were coding this in Python or R then it would be relatively straightforward to grab the rows, do the calculation(s), create the new row, store it away, and then append all "n" of them at once when I'm done processing. However, I'm trying to do this in Alteryx, and there don't appear to be any built-in tools that do what I want to do.
I'm wondering if I might have to hack a solution with a batch or iterative macro, but I'd rather avoid the added complexity if possible.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @TH
I would suggest converting your comma separated field with text to rows to look like this
Now you can use whatever logic to create a new dataset for your 3rd row. One thing to note here - convert your nulls to zeros, otherwise they get excluded when you crosstab or summarize to get back to the original format.
You should look to create something like this
Now you can use the Summarize tool to get back to the original format:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
That's a helpful start.
Everything I'm doing is within an Alteryx workflow.
Getting the data into your format I can do.
Even doing the calculations is relatively straightforward.
I must be missing something that you are assuming, though, because at this point I'm still left with the issue of trying to create the entirely new row *within* the Alteryx workflow.
How do I make *that* happen?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Sorry @TH ,
Here's a breakdown of what every tool in my sample workflow does and the workflow attached.
Hope this helps.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I appreciate you working on helping me. I must have been unclear when I originally stated my issue.
The first two rows are already in the dataset that's in the alteryx workflow.
Third row is generated dynamically based on the values in the cells of those first two rows.
So cell 3,1 might be the average of cells 1,1 and 2,1.
Cell 3,2 might be the maximum of the two numbers in cells 1,2 and 2,2.
Call 3,3 might sum the two values in 1,3 and 2,3.
Cell 3,4 might be a random choice of the categorical (nonnumeric) values in cells 1,4 and 2,4.
and so on.
This is the point of the question. An essential part of the problem is that I don't know what the values in row 3 will be without already having rows 1 and 2.
I've managed to put together a workable method that involves a lot of multirow formulas and some transforms, but it looks like a convoluted hack. I'm hoping there's a simple and straightforward way to do what I want to do.
