I have an issue with the Data Stream Out in-DB tool and was wondering if anyone has a solution:
It is randomly losing data. The example below shows the same filters applied, but the output differs before and after the Data Stream Out tool is used:
@Alder @apathetichell My issue was related to special characters. By adding the string below to my ODBC connection string, I was able to get all of my data. Perhaps it can help you as well:
;char_as_utf8=TRUE
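For anyone unsure where that string goes: it is appended to the ODBC connection string, separated by a semicolon. A minimal sketch (the DSN name and credentials below are placeholders, not a known-good configuration, and actually connecting requires the SAP HANA ODBC driver):

```python
# Hypothetical connection string; DSN, UID, and PWD are placeholder values.
base = "DSN=MyHanaDsn;UID=myuser;PWD=mypassword"

# Append the parameter from the post above, semicolon-separated:
conn_str = base + ";char_as_utf8=TRUE"
print(conn_str)  # DSN=MyHanaDsn;UID=myuser;PWD=mypassword;char_as_utf8=TRUE

# To actually connect (requires pyodbc and the HANA ODBC driver installed):
# import pyodbc
# conn = pyodbc.connect(conn_str)
```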
Hello @EnaK, you're supposed to get the same result. Usually when you don't, it's a syntax issue in the SQL or in the Alteryx expression. Can you post a screenshot of each Filter and Summarize configuration, please? Both in-memory and in-DB. Best regards, Simon
Adding screenshots, and the results have now changed again... now it is ZERO vs. 144.
The Filter In-DB tool requires the database's native SQL syntax. The most likely reason for this is that the SQL syntax in your in-DB filters is not the same as your on-canvas filter syntax. The other thing I'd look at is whether something went off with the numeric types you are using (i.e. a fixed decimal of 19.6 being converted to 25.0, where figures below .50 would be rounded to zero).
Also, turn off AMP.
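A minimal sketch of the silent rounding described above, using Python's decimal module to mimic a fixed-decimal cast (the 19.6 vs. 25.0 scales come from the example; the helper function is mine):

```python
from decimal import Decimal, ROUND_HALF_UP

def to_fixed(value: str, scale: int) -> Decimal:
    """Mimic casting to a fixed-decimal type with the given number
    of places after the decimal point (the 'scale')."""
    quantum = Decimal(1).scaleb(-scale)  # scale=6 -> 0.000001, scale=0 -> 1
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

print(to_fixed("0.25", 6))  # 0.250000  (a 19.6 type keeps the value)
print(to_fixed("0.25", 0))  # 0         (converted to 25.0: rounded away)
print(to_fixed("0.75", 0))  # 1         (values of .50 and above round up)
```

So if a summed column suddenly comes out as whole numbers, a silent type change to scale 0 somewhere upstream is a plausible culprit.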
I am sorry, I am only a beginner at this, so I'm not sure I understand what you mean... I am not doing any conversion of the numbers anywhere in the process, so why would the data be different before and after?
Also, how can I see the difference between the SQL syntax and the on-canvas filters?
Thank you very much for such quick responses!
@EnaK Well, you must read the documentation for your database; each database has its own flavor of SQL (yeah, it pretty much sucks). Here, I guess the issue is in your filter, because the LIKE operator is not exactly equivalent to the Contains function. Please have a look at https://www.w3schools.com/sql/sql_like.asp. Best regards, Simon
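To make the LIKE-vs-Contains gap concrete, here is a small sketch using SQLite rather than HANA (the table and values are made up). The point is that LIKE without % wildcards is an exact match, while Alteryx's Contains is a substring match:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ref TEXT)")
conn.executemany("INSERT INTO orders VALUES (?)",
                 [("36444",), ("36444-A",), ("X-36444",)])

# Equivalent of Alteryx Contains([ref], '36444'): wildcards on both sides.
contains = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE ref LIKE '%36444%'").fetchone()[0]

# LIKE with no wildcards only matches the exact string, silently
# dropping rows you may have expected to keep.
exact = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE ref LIKE '36444'").fetchone()[0]

print(contains, exact)  # 3 1
```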
One filter is 36444, the other is 364444. It has an extra 4. Fix and retry.
I am sorry, that was just a typo when making up fake data; the filters in the workflow are identical.
I might have stated my issue wrongly.
The filter is not the issue; I only used it to find where the issue was occurring, since I was randomly seeing different output results.
One run would give me 120,000 records; running it again a minute later, I would get 10,000, then 130,000, when in reality it could actually be 160,000 records.
After testing, rerunning the data, and checking results at different steps of the workflow, I found the data flowed consistently until it hit the Data Stream Out in-DB tool. That is when I decided to filter it down to a smaller set of data to confirm the issue.
So, PLEASE do not focus on the filters. Thank you all again for all the help!
O.k., so two things:
1) Turn off AMP.
2) What is your underlying database?
I assume it's not possible that the table you are querying is changing that rapidly?
1) AMP is off; that was the first thing I did.
2) HANA
The table is definitely not changing; I am running a simple query in parallel and getting the expected results.
No filters... the format is the same... nothing different.
HANA is the database behind it.
No, absolutely not.
I keep responding, but my response won't show up for some reason
No filters applied; please see the difference in results below:
Add a Select tool before the Summarize tool on canvas; as mentioned, sometimes things get misconverted. My hope is that you'll see a fixed decimal type ending in .0, meaning all your values are being rounded.
Also, can you add a count to both Summarize tools?
I do need to see the decimals, though, as in the original data set; I want to keep the numbers as they are.
Here's what I've seen: let's say you have 10MM records in your dataset, with lots of amounts like .25, .23, etc. These become 0 when a fixed decimal type creates a truncated value. If you can add a Select tool before the Summarize tool and post the results, that would be helpful.
OK! So that's not the issue. I guess a count, then? I'm not really sure whether there's something HANA-specific regarding data truncation via ODBC. Maybe someone else knows about it.
My next suggestion would be to look for line-item differences and see what the missing rows have in common.
Thank you very much for all your help and ideas; I will definitely share what I find.
It looks like it has to do with the amount of data.
Hi, I am currently experiencing the same issue. What was your resolution?
@CodyConner, are you on S4/HANA? @EnaK was. The most common explanation is user error. @CodyConner, can you share screenshots like @EnaK did?
@apathetichell I can show the basic issue. As @EnaK said in the last post, I am pretty certain it is a HANA issue.
Yeah, not really sure here. I'd recommend enabling logging and then looking at the ODBC logs for what's pulled out. I'd then try to identify specific line items that are not coming through to the canvas and see if they have anything in common.
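For the logging suggestion: on Windows, ODBC tracing is turned on from the Tracing tab of the ODBC Data Source Administrator; on Linux with unixODBC it is a config entry. A sketch of the unixODBC form (the trace-file path is arbitrary):

```ini
; In odbcinst.ini, [ODBC] section (unixODBC):
[ODBC]
Trace     = Yes
TraceFile = /tmp/odbc.trace
```

The resulting trace records the ODBC API calls the driver handled, which can help confirm how many fetches actually completed versus what reaches the canvas.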
Same exact issue with and without AMP.
This is ridiculous. I also tested this on other software we have at work, and it actually extracts all the rows correctly without any data loss.
Preview here:
@Alder, which version of Alteryx are you running? Can you add a Record ID tool to both sides, then a count-distinct on the RecordID field? It's definitely odd behavior, and not something I've seen with the DBs I use regularly (Databricks/Snowflake/Redshift/Postgres/BigQuery).
I will answer tomorrow as I have left the office just now.
Hello,
Can you tell me exactly where to put this parameter?