Hi,
Today I have 2 problems.
1st one. I would like to populate missing values with the first value which isn't null. As in the table below
Original Data Set | Desired Output |
Mark | Mark |
Mark | |
Mark | |
Tom | Tom |
Tom | |
Tom |
Usually, I would do that with the Multi-Row formula but it isn't available in the Alteryx Cloud.
2nd. By working on the workaround to this specific problem I discovered that I don't know how to make a join by a specific position.
In the official documentation, I can read this: Use Join to combine 2 inputs based on common columns between the 2 tables. You can also join 2 data streams based on row position.
The truth is I don't have a clue if I am missing something obvious or if this feature is not currently supported.
I would really appreciate any guidance.
Trifacta - Handling Missing Values:
To achieve the desired output of populating missing values with the first non-null value, you can use the FILL transform in Trifacta. Here are the steps:
Open the Transformer:
Open the Trifacta Transformer.
Select Column:
Select the column for which you want to fill missing values.
Apply FILL Transformation:
Click on the column menu.
Choose "Fill" and then select "Down" to fill missing values with the first non-null value.
Review & Apply:
Review the changes in the preview pane.
Click "Add Recipe" to apply the transformation.
Trifacta - Joining by Row Position:
To perform a join based on row position, you can use the MERGE transform in Trifacta. Here's how you can do it:
Open the Transformer:
Open the Trifacta Transformer.
Select First Dataset:
Select the first dataset in the flow.
Apply MERGE Transformation:
Click on the "+" button between the two datasets.
Choose "Merge" to bring up the Merge dialog.
Configure MERGE:
In the Merge dialog, you can specify the type of join (Inner, Left, Right, Outer).
Choose "By order" as the join condition.
Review & Apply:
Review the changes in the preview pane.
Click "Add Recipe" to apply the transformation.
By using the By order option in the MERGE transform, you can join two datasets based on their row positions. This feature allows you to combine data streams based on the order in which they appear rather than matching on specific column values.
I hope these instructions help you resolve the issues you're facing in Trifacta!
as for problem #1:
do you have a RecordID? I know that Designer Cloud cannot create one without sorting the dataset so perhaps you can define a sort order that would match your data structure?
with that record ID I would consider two self-joins on recordID-1=recordID; but then again if you know the group bys you need you might as well aggregate and then cross join to a scaffold of 3 rows to replicate the name or just use the rownumber function
Subject | Likes |
---|---|
1 | |
1 | |
1 | |
1 | |
1 |