Hi All, I'm very new to Alteryx and looking for some suggestions on how to convert the below SAS to Alteryx. I'm thinking it would be a combination of a Formula along with Regex but since just learning Alteryx not sure
This is my data and you'll notice that sometime there's an extra semicolon and sometimes there's a space missing or semicolon missing.
NM | VB |
VB00026 | VB = STD ; HIRE ; ; YEARS ; 1 ; 902NZ ; 2 ; 902P0 ; 3 ;902P1 ; 4 ; 902P2 ; 5 ; 902P3 ; AFTER ; 9029R |
VB05067 | VB = STD ; EMPL ; 20091101 ; MONTHS ; 12 ; 90ZHCAFTER ; 914DV |
VB05089 | VB = STD ; EMPL ; 20040101 ; MONTHS ; 6 ; 90ZP2 ;12 ; 90ZP5 ; 24 ; 90ZP6 ; AFTER ; 914HH |
VB10090 | VB = STD ; PCAR ; ; MONTHS ; 6 ; 91ZHH ; 12 ; 91ZHJ ;AFTER ; 91ZHK |
Here's my SAS datastep I'm trying to convert to Alteryx:
data vbs1;
set vbs;
if index(vb, 'MONTHS ; 6')> 0 then first_6 = substr(vb, index(vb, 'MONTHS ; 6')+13, 5);
if index(vb, ' ; 12 ;') >0 then month_12 = substr(vb, index(vb, ' ; 12 ;')+8, 5);
if index(vb, '; YEARS ; 1 ;')>0 then year_1 = substr(vb, index(vb, '; YEARS ; 1 ;')+14, 5);
if index(vb, '; 2 ;')>0 then year_2 = strip(trim(substr(vb, index(vb, '; 2 ;')+5, 6)));
if index(vb, '; 3 ;')>0 then year_3 = strip(trim(substr(vb, index(vb, '; 3 ;')+5, 6)));
if index(vb, '; 4 ;')>0 then year_4 = strip(trim(substr(vb, index(vb, '; 4 ;')+5, 6)));
if index(vb, '; 5 ;')>0 then year_5 = strip(trim(substr(vb, index(vb, '; 5 ;')+5, 6)));
if index(vb, 'AFTER ;')>0 then after = substr(vb, index(vb, 'AFTER ;')+8, 5);
run;
And this is how the data looks after SAS datastep and how I need it to look in Alteryx:
NM | VB | first_6 | month_12 | year_1 | year_2 | year_3 | year_4 | year_5 | after |
VB00026 | VB = STD ; HIRE ; ; YEARS ; 1 ; 902NZ ; 2 ; 902P0 ; 3 ;902P1 ; 4 ; 902P2 ; 5 ; 902P3 ; AFTER ; 9029R |
|
| 902NZ | 902P0 | 902P1 | 902P2 | 902P3 | 9029R |
VB05067 | VB = STD ; EMPL ; 20091101 ; MONTHS ; 12 ; 90ZHCAFTER ; 914DV |
| 90ZHC |
|
|
|
|
| 914DV |
VB05089 | VB = STD ; EMPL ; 20040101 ; MONTHS ; 6 ; 90ZP2 ;12 ; 90ZP5 ; 24 ; 90ZP6 ; AFTER ; 914HH | 90ZP2 |
|
|
|
|
|
| 914HH |
VB10090 | VB = STD ; PCAR ; ; MONTHS ; 6 ; 91ZHH ; 12 ; 91ZHJ ;AFTER ; 91ZHK | 91ZHH | 91ZHJ |
|
|
|
|
| 91ZHK |
Thanks for any and all help!
Solved! Go to Solution.
Hi @jokeffer
Personally I would recommend first converting your data into a tabular format, and secondly using Formula tools to create the new columns. It will be much easier to write out the logic once you have distinct fields for each of the different parts of your data.
hi @jokeffer
Since you've got varying number of columns in your VB, using @jamielaird's solution may be tricky. A more straightforward approach would be to build a Formula tool with translation of the SAS to Alteryx
The translation is very direct. IndexOf()<>0 maps to Contains and substring(field,Indexof() maps to substring([field],findstring()). You just have to watch the indexes. Sometimes, but not always, they're off by one
I've created the 1st 3 columns for you resulting in
If you've only got these 8 fields to build out, using this approach will work well. If you have hundreds of datasteps to convert, you'll need to come up with a different strategy.
Dan
Yes, this will work. It has flushed out a couple additional string types to deal with but 10 isn't too bad. Thank you for your suggestions and the code to get it started!
Thank you for your suggestion, however my data has varying lengths so opted for the SAS to Alteryx conversion. Really appreciate the help though.