Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Trying to load .sas7bdat file - data are incomplete but no error or warning received

MichalK
8 - Asteroid

I tried loading a .sas7bdat file using 'Input Data' tool.

 

The workflow was successful, as on below screen - I can see 5864 records were imported, no errors or warnings in the log:

SAS load - Alteryx.png

 

However, when I check the table properties in SAS it actually has 341,435 records:

 

SAS load - SAS.png

 

Did anyone else ran into similar issues with importing SAS datasets? Why does Alteryx drop a huge portion of the table with no warning?

 

Note - I'm on Alteryx Designer 11.0 x64

 

Michal

5 REPLIES 5
mbarone
16 - Nebula
16 - Nebula

Have you experienced this with other sas files, or just this one?  I input sas files occasionally and have not experienced this.

MichalK
8 - Asteroid

Hi,

 

I was able to replicate this behavior with 3 other files - the underlying data is loaded from the same application, so it is likely something in the way it's structured that throws Alteryx off.

 

I'm now very curious and will keep digging :)

mbarone
16 - Nebula
16 - Nebula

Ha - you have me very curious too!!!  The sas.7bdat files I load are created by SAS itself, so that's probably why I've never had an issue.

 

When you nail it down, if you remember, respond back - my curiosity is peaked as well!  

MichalK
8 - Asteroid

OK, I was able to get pretty close to the source of this.

 

A bit of background on how the table is created: Data are pulled from a web application each month and saved as tab-delimited text file. Next, a SAS program is ran to append all monthly loads to create a historized table.

 

What seems to be throwing Alteryx off are special characters in the oldest files - most commonly 'Zurich' in a column containing a city name.

In monthly cuts that import to Alteryx with no issues, it appears with an umlaut character: "Zürich".

In the ones that cause issues, it appears as "ZErich". I noticed that similarly, in names and surnames that contain special characters the character is replaced by an 'E'.

 

Examining the files in Notepad++, the values initially look the same - 'ZErich.

After changing encoding to UTF-8, it looks like this:

zurich UTF-8.png

After changing encoding back to ANSI, it looks like this:

zurich ANSI.png

I don't know much about encoding, but maybe someone who does can get some insight from this.

 

Curiously, the case is not that Alteryx does not load rows including these values. Some are loaded, although with an underscore added: 'Z_Erich'.

I wasn't able to track down exactly what setup causes Alteryx to stop the data load.

 

Lastly, I tried importing the source data (txt files) directly into Alteryx - that worked fine, and Zurich is loaded without an underscore as 'ZErich'.

 

So at some stage of SAS working with the data these values are saved in a form that causes problems for Alteryx. I looked at SAS programs and there is no explicit formatting logic coded for these values.

 

I hope you found this interesting :)

mbarone
16 - Nebula
16 - Nebula

Ah yes.  Very interesting.  I had a similar situation just this week.  A certain character from a Netezza database was getting read in as an upside down question mark and it was throwing everything off.  For the heck of it, I tried copying and pasting the upside question mark into a replace function and it worked LOL.

Labels