We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Specialized ways to compare text files - 'Beyond Compare' or similar method?

AndyJay
8 - Asteroid

When I had to compare text files at a previous company, we'd use Beyond Compare to show only the lines that differed. It was smart enough to also skip empty lines so that the line number did not matter. It was really useful when trying to compare code. I haven't come across an clean way to do this in Alteryx.

 

My first thought was to join, but then I had an issue where one file had a replicated line in it and the join did not catch it since it would join with the first occurrence of the other file. 

 

Any suggestions on this?

8 REPLIES 8
caltang
17 - Castor
17 - Castor

We'll need some data to see what you mean. 

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
alexnajm
18 - Pollux
18 - Pollux

I agree with @caltang that we'll need to see more - also the Join will capture all matches, not just the first match, so maybe you tired a find Replace accidentally?

AndyJay
8 - Asteroid

A simple example would be:

 

File 1 (four lines):

If this then that else something
The quick brown fox
jumps over the lazy dog
The quick brown fox

 

File 2 (two lines of code plus an empty line between):

The quick brown fox


jumps over the lazy dog

 

Desired Result:

If this then that else something

The quick brown fox

 

The result would show the differences, but the process knows that there's an extra line of 'The quick brown fox' in File 1 as well as knowing to skip the empty line in File 2.

alexnajm
18 - Pollux
18 - Pollux

The Join will capture both as you originally indicated in your question - if you want just one match, use a Find Replace

 

Please use your example in Alteryx and see what's wrong so we can pinpoint an answer better for you. You can also just remove your empty lines with a filter or a Data Cleansing tool if you are trying to Join by record position

 

Or if you just need unique matches, use a unique tool after your join - lots of solutions!

Screenshot 2025-06-10 095643.png

abacon
12 - Quasar

@AndyJay Alteryx would catch this, you technically might have to remove null or blank lines but using the join tool would catch it. My previous company also used Beyond Compare, however, I wouldn't even download it onto my machine in protest haha. Alteryx can definitely do the comparisons you need and more.

 

Agreed with @alexnajm, was a Find and Replace accidentally used?

 

Bacon

 

image.png

AndyJay
8 - Asteroid

Thank you for the tip on removing empty lines. Find/replace was not accidentally used. What happened was both instances of 'The quick brown fox' from File 1 were matched with File 2. What I would like is to be able to see the left side, File 1, to show the difference on the left join anchor. It should have 'If this then that else something' and 'The quick brown fox' appearing once because the other occurrence of 'The quick brown fox' was matched with File 2:

 

If this then that else something
The quick brown fox

 

Right side (File 2) would then show nothing, since both lines were matched.

 

I'm thinking a record and sort tool may be the best way.

alexnajm
18 - Pollux
18 - Pollux

This is different than your original post, so that's where the confusion lies. "My first thought was to join, but then I had an issue where one file had a replicated line in it and the join did not catch it since it would join with the first occurrence of the other file." --> it does catch it, which is why we figured the find Replace was being used

 

the Join is your best bet in this case - you'll see what matched (and you can use a Unique after as myself and @abacon indicated to get one line of each) and you'll see what didn't match out of the L and R anchors

AndyJay
8 - Asteroid

I knew join would be involved but I'm not sure unique would have helped. For example if 'The quick brown fox' appears 15 times on the left file and 20 on the right, I'd want 'The quick brown fox' to appear five times on the right and zero times on the left.

 

I should have posted true sample code that has lines being used several times. The way I ended up going about this was to

  • assign record IDs to prevent losing line order
  • remove empty lines, I used a filter tool
  • sort ascending by the original field having the code (Field_1)
  • apply a multi-row formula tool to track the instance of each
    Instance = if [Field_1] = [Row-1:Field_1] then [Row-1:Instance] + 1 else 1 endif
  • Join on Field_1 and Instance

Sort the left/right outputs by record ID ascending to see true changes.

 

Thank you for all who replied and I apologize for confusion. I didn't know an easy way to ask this but not accidentally disclose code without company permission.

 

Screenshot 2025-06-11 012636.png

Labels
Top Solution Authors