column_list = pd.DataFrame(['Col1','Col2','Col3','Col4', 'Col5',
'Col6','Col7','Col8','Col9','Col10',
'Col11','Col12'])
final_hr = pd.DataFrame()
for column in range(len(column_list)):
hr_new=hr.copy()
#Drop rows containing NAN for column 'col4' for new merge
hr_new.dropna(subset=[column_list.iloc[column,0]], inplace = True)
#Creating a new column for merge
hr_new['ID_final']=hr_new[column_list.iloc[column,0]]
#case folding
hr_new['ID_final']=hr_new['ID_final'].str.strip().str.upper()
#Merge data
merged_data = pd.merge(hr_new, data, how='left', left_on='ID_final', right_on ='OtherID')
#Concatinating all data together
final_hr = final_hr.append(merged_data)
Hi all,
I am trying to replicate the following functionality with alteryx.
my thinking on the logic: For each iteration of the loop, it's taking each of the user ID fields(Col1 to Col 12) and writing that to the ID_final field. And then it tries to use that to join to "data". Then appends everything to final_hr dataframe.
What I have trouble replicating is the loop and writing that to an "ID_final" field , aswell as the final_hr.append part, not sure what this is exactly doing.
Any help would be appreciated.
Hi, @wonka1234 - it sounds like you could connect your hr dataset to a Transpose tool to take the ID columns and put them into individual rows.
Once the values are in rows, then you can use a Filter tool to remove the nulls, and then a Data Cleansing (or Formula) tool to remove whitespace and change the case.
Finally, you can join that result to your data table using a Join tool.