Alteryx Designer Desktop Discussions

jeldridge · ‎11-27-2024

I am using a macro created by another user here in the group and in the past it was working perfectly. I am having the strangest issue today. In a script that I run weekly, I have a calculation to remove outliers based on the output of a Macro that calculates quartiles in the same manner that Excel calculates them (which I need in order to maintain consistency in the data processing over time). If a ID has 10 data points after outliers are removed, it shows up in my report.

If I run the report with all of this week's vehicle data (the same way I always run it), one of the IDs that should be ok is flagged as having outliers. If I run the report with a filter at the beginning to use just that one ID, the calculations all run correctly. The script didn't change, this ID didn't change (it has no new data this week) but in one case 2 rows are flagged as outliers incorrectly and in the other it runs fine. Anyone have any ideas?

cpet13 · ‎11-27-2024

@jeldridge is the an ID being flagged as an outlier dependent on the other data moving through the workflow? I had a similar process a while back that I was doing - if I ran it for one record, it was not flagged, but when I included other data it was. That was because the flag was dependent on there being other records - if there were no other records for that particular ID within the last 30 days, it would not be flagged. So if I removed all records except for that one, that means I was getting rid of the previous records for that ID and so the process would, naturally, not find any records for that ID within the last 30 days.

jeldridge · ‎11-27-2024

I don't think so - the way it is supposed to work is that the macro creates the percentiles, InterQuartile Range, etc. based on the rows for that specific ID. When I run the script with all of the vehicles, it is generating a different set of values for these with this week's file, which then differentially flags outliers. But, I have no idea why that would be the case, when the script was running perfectly last week. I am guessing it might be something corrupt in my input file, but have had no luck trying to trace that either. My process involves an Excel sheet that gets new data each week. Last week's file is fine, this week's is not and I can't figure out why (and can't post the actual files here because of confidentiality.)

I have tried copying and pasting the values as text into a new file, converting the input to an Alteryx file, running the file with different rows removed from the bottom to try and isolate the problem, and am not having very much luck with any of it. I only noticed because this ID had 11 rows, 2 rows now are flagged as outliers so the vehicle dropped off the report (along with 9 other IDs with 10-12 rows). I am sure in the thousands of IDs other vehicles have this error, but idk what to do to fix this.

rfoster7 · ‎11-27-2024

Without being able to see the data, all we can really do is give you some ideas on how to troubleshoot.

You describe the input as being an excel file that gets rows added each week. First thing I might try is take last week's file and this week's file and put them in a fresh workflow and do a rowcount and see if the rowcount is wonky. Sometimes with excel inputs, it will give you a bunch of null rows on the bottom of the sheet that can throw things off.

The other thing you can do is take the tools out of the macro and put them into the worklow and run it so you can see what it is doing. It's a pretty simple macro, just open it up, copy out the tools and past them into the workflow where the macro would be and then you can run it and actually see what is happening to the data as it flows through.

Lastly I notice in your macro that you are grouping by the "number" field which is type double. Sometimes, that can cause issues grouping by a double due to floating point issues. I don't know why it has been working and now isn't, but that's something I would consider as a problem point.

cpet13 · ‎11-27-2024

@rfoster7 gives some good ideas. Another idea I have is that your functions are off somehow. You have some Row+1, Row-1 types of functions in your workflow. I would following @rfoster7 's advice and take out the tools and follow the data through; you could inadvertently be comparing a record from one ID to another ID.

Another thought I have is that many times with excel files I have used as inputs they will have a tab or line break in a cell, which causes issues. If the workflow gets to a line break in a cell of the row, it will suddenly make everything null for the rest of that record and put all the rest of the data in a new row, starting with column 1. It then starts putting dates into numeric fields, strings into date fields, etc. Double check the excel files to ensure there are no line breaks or tabs. Often, when using excel as input files for a workflow, I use a Data Cleansing tool at the beginning of the flow, which has an option to remove tabs, line breaks, and duplicate whitespace.

jeldridge · ‎11-27-2024

Thank you @rfoster7 and @cpet13 for your suggestions. I started to step through each of them and then had an epiphany and tried an older version of the script. It runs correctly with the new file, no issues. Apparently there is definitely something corrupted in the formulas, etc. in the newer version. Instead of spending more time trying to figure out what happened, I am going back to the old version and getting some work done.

Again Thanks! You are both rock stars!

Alteryx Designer Desktop Discussions

Macro returning unexpected results

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: How to separate IDs from the text

Passed the Advanced Certification Exam!