I found some strange behaviour in the Data Cleanse Pro tool that I don't understand, and it looks like a bug to me.
I have a large data set with inconsistently written names (e.g. NameSurname, Name Surname, and the same names with extra spaces between or around the parts, ...).
I wanted to clean that up, so I used Data Cleanse Pro and set it to remove "Leading and Trailing Whitespace" and "Tabs, Line Breaks, and Duplicate Whitespace".
To my surprise, I ended up with more name variations after the cleaning than before. When I used the older Data Cleansing tool with the same settings, the result was as expected: the number of variations of the same name decreased significantly. Comparing the output of the two tools, I found that Data Cleanse Pro does not remove all the whitespace it should, and in certain cases it even adds (!) spaces to the values.
It seems to happen when a value has more than two spaces in a row, and when duplicate spaces occur in more than one place in the same value (e.g. First Name Second Name, Surname: in such a case an extra space may be removed in one place and added in another).
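To make clear what result I was expecting, here is a minimal Python sketch of the behaviour I assumed those two options would have. This is only my assumption about the intended result, not the tool's actual implementation; `clean_whitespace` and the sample values are made up for illustration.

```python
import re


def clean_whitespace(value: str) -> str:
    """Collapse each run of spaces, tabs and line breaks into a single
    space, then trim leading and trailing whitespace.

    Assumed behaviour only; this is not how Data Cleanse Pro is implemented.
    """
    return re.sub(r"\s+", " ", value).strip()


# Hypothetical sample values: the same name with different spacing.
names = [
    "Name Surname",
    "Name  Surname",
    " Name   Surname ",
    "Name\tSurname\n",
]

# After cleaning, all spacing variants should collapse into one value.
cleaned = {clean_whitespace(n) for n in names}
print(cleaned)  # {'Name Surname'}
```

With this logic, a value that already has single spaces comes back unchanged, and the number of distinct values can only stay the same or go down, never up.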
Has anyone else experienced this behaviour? Is it a bug, or am I missing something?
EDIT: I just checked, and using Data Cleanse Pro with "Tabs, Line Breaks, and Duplicate Whitespace" on a single value that has 7 words (with only one space between each word) created over 2500 versions of it with varying numbers of spaces 😱
By the way, the "remove all Whitespace" option works just fine.