Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Split Database column in half

JR1
7 - Meteor

I was wondering how to split a column in half in order to speed up a fuzzy match join.  Figure A contains all the items.  Figure B contains items 1-5 while Figure C contains items 6-10.  After the fuzzy matching result has been completed on the separate columns, I want to rejoin them so that they resemble Figure A again with the addition of the Matched result score.  Is this possible?

 

IDItem aItem b
1Aa
2Bb
3Cc
4Dd
5Ee
6Ff
7Gg
8Hh
9Ii
10Jj

Figure A

 

IDItem aItem b
1Aa
2Bb
3Cc
4Dd
5Ee

Figure B

 

IDItem aItem b
6Ff
7Gg
8Hh
9Ii
10Jj

Figure C

9 REPLIES 9
NicoleJohnson
ACE Emeritus
ACE Emeritus
Do you know how many characters you want in each half ahead of time, or does the split need to be dynamic?

If you know ahead of time (i.e. there are always 20 characters, and you want 10 in each), then you could use the RegEx tool:

Regular Expression: (\w{10})
Output Method: Tokenize
Split to Columns: 2

This will split your characters into two columns of 10. You could also be a little more flexible by using (.{3}) if there are non-alphanumeric characters involved...

If you have a varying number of characters, however, might need to get more creative... let us know! :)

Cheers,
NJ
NicoleJohnson
ACE Emeritus
ACE Emeritus
Do you know how many characters you want in each half ahead of time, or does the split need to be dynamic?

If you know ahead of time (i.e. there are always 20 characters, and you want 10 in each), then you could use the RegEx tool:

Regular Expression: (\w{10})
Output Method: Tokenize
Split to Columns: 2

This will split your characters into two columns of 10. You could also be a little more flexible by using (.{3}) if there are non-alphanumeric characters involved...

If you have a varying number of characters, however, might need to get more creative... let us know! :)

Cheers,
NJ
JR1
7 - Meteor

The split needs to be dynamic.  Some values in certain fields will remain the same e.g. State and ZIP will remain 2 and 5 respectively.

NicoleJohnson
ACE Emeritus
ACE Emeritus
How about a few formulas?


1. Find the Length of the field:

Length: Length([Field1])


1. If the Length is greater than a certain amount, say 10 characters, split the field in half. For the second portion, if the first Split field matches the original, then second half should be Null, otherwise capture the remaining characters not contained in the first split.

Split1: if [Length]>10 then left([Field1],ceil([Length]/2)) else [Field1] endif
Split2: if [Field1]=[Split1] then null() else Replace([Field1],[Split1],"") endif

Alternatively, if you had another method for identifying fields that needed to be split, you could filter for only those needing to be split, then use the two split formulas shown above, then union them back to your fields that did not need to be split...

NJ
JR1
7 - Meteor

I'm not sure that this is splitting the column in half.  It seems that it is just creating a new column entirely with all of the entries repeated in the new column.

NicoleJohnson
ACE Emeritus
ACE Emeritus

Sorry about that!! I didn't see the image you posted in your original query when the post came through in my email, and it looked like it was set up like this:

ABCDEFGHIJabcdefghij

 

So you are totally correct in that my response would not be helpful in the way you had it set up!!

 

However, I think to do what you're trying to do, you could find the median record number (use a Summarize tool to find the Median of RecordID), then append the median back onto your original data, then filter to split entries that are greater than the median. Would that do it?

 

Again, sorry about that. The way tables come across in the email format is WAY different than how they look here on the Community!!

JR1
7 - Meteor

No need to apologize for the wrong formatting.  I understand completely.  Happens to the best of us.

 

I think your solution will work, however, I am running into a problem when appending the median back onto the original.  I am using a union to rejoin the two, and It gives the column with null as all the values.  Should I use a different join?

NicoleJohnson
ACE Emeritus
ACE Emeritus

Use the Append tool rather than the Union tool - the median will be your Source (bottom input) and your original info will be the Target. This will add the median field on to every record in your original list, whereas Union will try to join your columns together by field/position. Example attached. Does that help?

JR1
7 - Meteor

Yes that helped.  Thank you very much.

Labels