I tried loading a .sas7bdat file using 'Input Data' tool.
The workflow was successful, as on below screen - I can see 5864 records were imported, no errors or warnings in the log:
However, when I check the table properties in SAS it actually has 341,435 records:
Did anyone else ran into similar issues with importing SAS datasets? Why does Alteryx drop a huge portion of the table with no warning?
Note - I'm on Alteryx Designer 11.0 x64
Michal
Solved! Go to Solution.
Have you experienced this with other sas files, or just this one? I input sas files occasionally and have not experienced this.
Hi,
I was able to replicate this behavior with 3 other files - the underlying data is loaded from the same application, so it is likely something in the way it's structured that throws Alteryx off.
I'm now very curious and will keep digging :)
Ha - you have me very curious too!!! The sas.7bdat files I load are created by SAS itself, so that's probably why I've never had an issue.
When you nail it down, if you remember, respond back - my curiosity is peaked as well!
OK, I was able to get pretty close to the source of this.
A bit of background on how the table is created: Data are pulled from a web application each month and saved as tab-delimited text file. Next, a SAS program is ran to append all monthly loads to create a historized table.
What seems to be throwing Alteryx off are special characters in the oldest files - most commonly 'Zurich' in a column containing a city name.
In monthly cuts that import to Alteryx with no issues, it appears with an umlaut character: "Zürich".
In the ones that cause issues, it appears as "ZErich". I noticed that similarly, in names and surnames that contain special characters the character is replaced by an 'E'.
Examining the files in Notepad++, the values initially look the same - 'ZErich.
After changing encoding to UTF-8, it looks like this:
After changing encoding back to ANSI, it looks like this:
I don't know much about encoding, but maybe someone who does can get some insight from this.
Curiously, the case is not that Alteryx does not load rows including these values. Some are loaded, although with an underscore added: 'Z_Erich'.
I wasn't able to track down exactly what setup causes Alteryx to stop the data load.
Lastly, I tried importing the source data (txt files) directly into Alteryx - that worked fine, and Zurich is loaded without an underscore as 'ZErich'.
So at some stage of SAS working with the data these values are saved in a form that causes problems for Alteryx. I looked at SAS programs and there is no explicit formatting logic coded for these values.
I hope you found this interesting :)
Ah yes. Very interesting. I had a similar situation just this week. A certain character from a Netezza database was getting read in as an upside down question mark and it was throwing everything off. For the heck of it, I tried copying and pasting the upside question mark into a replace function and it worked LOL.
Hi MichalK,
I have the same problem as you.
I tried reading a .sas7bdat file using 'Input Data' tool.
The workflow could run successfully and there was no error or warning message, but after checking the results, it was found that there were 140,000 original data, and only more than 3,000 were read in.
Therefore, I would like to ask if you have found a solution to this problem.