Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

fill missing blanks in column on the attached dataset using fuzzy match

vinayvnn
6 - Meteoroid

Hi All,

I have data with three  columns, a,b, and c, based on the columns a, and b  I wanted populate data into the missing values in the column c, the method I am basically looking for is Fuzzy match, attached a sample of data however I have to replicate the same to the original data which I am working, It will be helpful If I get any leads to progress further.

 

abc
Pension ER A Sec Pension ER A Sec 20%605
Period Base Sal BTPeriod Base Sal BT704
myShare PlanmyShare Plan508
Transportation AllowanceTransportation Allowance227
Hypo Tax (Salary)Hypo Tax (Salary)1001
Amount paidAmount paid1025
Car assistance - cash allowanceCar assistance - cash allowance202
Hypo NI (Salary)Hypo NI (Salary)908
Housing AllowanceHousing Allowance404
Pens Supp AllwncePens Supp Allwnce203
Home AllowanceHome Allowance 
Assignment Allowance GRSAssignment Allowance GRS 
Sal Supp Allowance GRSSal Supp Allowance GRS 
Accident InsAccident Ins 
Relocation Allowance GRSRelocation Allowance GRS 
BonusBonus 
TotalTotal 
SueldoSueldo 
RET. TAX INCOMERET. TAX INCOME 
Plus vacacionalPlus vacacional 

 

4 REPLIES 4
gawa
15 - Aurora
15 - Aurora

hi @vinayvnn Can I ask something?

1) I can imagine Fuzzy Match between column 'a' and 'b', but how column 'd' will be used?

2) How do you determine the value of column 'c'? Elaborate the logic, if any.

vinayvnn
6 - Meteoroid

Thanks for responding,

 

Basically I have a data of 5,00,000 rows in which there are 6,800 rows of blank.

 

Column C already have data, where as it is left with 6800 blanks this needs to be filled based on the combination of .column a and b not with d (Apologies) for the confusion.

 

If any of the combinations of column a and b had populated data in column c

and same combination is having blank on the 6800 rows that should be filled based on the above criteria  

gawa
15 - Aurora
15 - Aurora

@vinayvnn *Please try to mention so we can notice your response

If you want to find the same combination of column 'a' and 'b', and append to blank value of 'c' for that combination,

1) Split data by Filter tool; One is data having the value of 'c', another not having the value of 'c'.

2) JOIN by column 'a' and 'b' for the above two data (refer to configuration in the snapshot)

3) Combine data by Union tool

image.png

vinayvnn
6 - Meteoroid

@gawa Thanks for the quick response, however the mentioned technique have not populated all the blank values still there are few of rows with blank values, approx (4000 rows) any alternative approach would you suggest here please ? it need not to be the same as column a and b and then populate blanks on the column c.

80% of match would be acceptable 

Labels