Hi,
I have an issue with using Input Data tool for reading xlsx files that contain mixed data types. In such cases Alteryx sets V_String data type for such columns and that is fine, however the result are a bit unexpected and they do not correspond to values contaned in the spreadsheet, e.g.
A | B |
a | 1 |
0.001 | 0.001 |
18.4 | 18.4 |
is read by Input Data as:
A | B |
a | 1 |
1E-3 | 0.001 |
18.399999999999999 | 18.4 |
What might be the reason for such behaviour? We use that data to compare it later with data extracted from another set of files to list differences, so it creates false positives, because from one file we get 18.399999999999999 and from another one 18.4 (as it should be) if it happens to contan numbers only.
I tried it in Alteryx Designer 10.6.8.17850 both x64 and x86 versions and I get the same results.
In the attachment you can find small example that demonstrates this issue.
Kind regards,
Daniel
Solved! Go to Solution.
The issue is the "a" in variable 'A', Alteryx reads this in as a V_String not a Double like variable 'B'.
You can use a select tool to amend this to a Double, it fixes the issue with your numeric entries but does change your "a" to a NULL. It's not elegant but the only way I could think of to fix this is to drop another Select tool changing it to a VW_String then use a Formula tool to change NULL's to "a" - of course I have no idea if you only have "a" in your full dataset as if not this will require a different solution.
I've attached my workflow - hope this helps.
Hi Phil,
Thanks for your comment. I was already considering using Select tool to enforce specific data type and it seems to convert the values back to their original form (although it throws conversion errors due to lost precision), however as you wrote the downside is that I will loose all non-numeric entries this way. The example that I posted is just a small extract to present the problem. Full dataset has hundreds of columns and thousands of rows, so if possible I wanted to try some other solutions before I go with adjusting the data types for all of them.
The strange thing is that if I try to input exactly the same values from csv or using Text Input tool then the results are correct and the column 'A' is still set by Alteryx to be V_String exactly as when I read the data from xlsx. Unfortunately I am stuck with xlsx files, because that is what I receive from business users. It seems to be a bug in Input Data tool when used with Excel.
Kind regards,
Daniel
Hi Daniel,
Sorry it's taken a while to get back to you, I had to leave the office yesterday.
Thats very odd about the csv files, I'm sure someone can explain it but its beyond me.
How about this, I've created a unique identifier variable using the RecordID tool and then split out the NULL values created by the coersion and joined them back to the original dataset via the unique identifier variable. I've then unioned all the data back together. It works for the test dataset you've provided so will hopefully do the trick.
Hi Phil,
No problem, your solution seems to do the trick, thanks :)
In the meantime my colleague also suggested another way to solve this which is to use formula with regular expression for checking if field contains number, if yes then convert it to number otherwise leave it as it is. With multi-field formula tool it can be easilly applied to all fields that require such conversion. I attached a workflow with this solution.
Kind regards,
Daniel
Thats a much more elegant solution
I really must explore RegEx more.
Hi ,
Below data is the out put of my workflow which is saved on my desktop with xlsx format:-
Name | Designation | Salary |
John | Manager | 100 |
Peter | AM | 79 |
Sam | VP | NA |
Lisa | AVP | NA |
Now when I open my excel output sheet , in salary column , I see the value in cell have green triangle which gives me option to convert to number and value are aligned to left which states they are text.
My Motive is to remove those triangle from my output excel , so that numeric salary remain as numeric and salary value where it is 'NA' should remain as text.