Alteryx Designer Desktop Discussions


Bringing in a whole slew of sheets, html scrape

GoldenDesign04
8 - Asteroid

Hello everyone! I hope you've had a happy holiday season! 

 

I have a workflow I am quite proud of, my first complete end to end build. Now, credit is due to this awesome community in helping me stick the landing too. But I'd like to take it to the next level.

 

Lemme walk you through what I have and where I'd like to go:

 

I have a website that is a file system dump. Contains 270 links to xls files (3 reports generated daily). Looks like this:

Source.jpg

 

My workflow brings this in and sorts it (as you can see, the source sorting is kinda wonky), then pulls the most recent one. The documents look kinda like this:

Page view.jpg

 

I then strip out all the html, do some de-duping, and add in a few columns and get a flat file:

Example pull.jpg

 

This is all exported to a shared drive to be worked by the team I am doing this handy little automation for. 

But the stakeholders would like more. They want all that lovely data for a dashboard/reporting tool. And I'd love to give it to 'em!

I currently filter out duplicates between the new and old pulls, as you can see in my example workflow (attached). For daily work this is adequate. For analytics purposes we'd only need to filter dupes within each day, not across the whole 90-day timeline, since the same "ID" may pop up again with the same type of error.
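To make the "dupes by day, not across the whole timeline" rule concrete, here is a minimal sketch in Python rather than in Designer. The column names (`report_date`, `id`, `error_type`) are made up for illustration and are not taken from the actual reports. The idea is simply that the date becomes part of the uniqueness key, so the same ID with the same error survives if it shows up on a different day.

```python
def dedupe_per_day(rows, keys=("report_date", "id", "error_type")):
    """Keep the first row seen for each combination of the key columns.

    The key columns are hypothetical names; swap in the real ones.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"report_date": "2024-01-02", "id": "A1", "error_type": "timeout"},
    {"report_date": "2024-01-02", "id": "A1", "error_type": "timeout"},  # same-day duplicate
    {"report_date": "2024-01-03", "id": "A1", "error_type": "timeout"},  # same ID, later day
]
history = dedupe_per_day(rows)
```

In this sample the same-day repeat is dropped while the later-day row is kept. Inside Designer, the rough equivalent would presumably be to include the report date among the columns the de-dupe step treats as unique.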

 

Now, the problem/question:

I figured I could just adjust the select records step in the first bring-in to take all 270 records instead of only the first, and all would work. Yes and no: I get the records, but the flat file structure gets all wonky. I've tried adjusting in all kinds of places, but to no avail. So I was thinking I would instead take the flat file before any columns are filtered out (select) or any uniques are filtered, and loop it somehow, though a single workflow with 270 branches sounded silly to my novice mind.

 

Any suggestions on approach? How would you pull in all 270 records and have them nice and clean for a union into one big historical dump?

DiganP
Alteryx Alumni (Retired)

@GoldenDesign04 If I am understanding correctly, you want to iterate the same logic (dedupe/cleanup) 270 times, once for each file. I believe a macro would help in this case. It doesn't make sense to have 270 different paths, one for each file!
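As a rough illustration of that pattern (not the actual workflow), here is a small Python sketch: one cleanup routine, run once per file, with the results stacked into a single table. In Designer this is the sort of job a batch macro is typically used for. Everything specific here is an assumption: the file names, the sample HTML, the helper names `clean_one` and `clean_all`, and the idea that each report is a single HTML table with a header row.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect each <tr> as a list of cell strings, dropping the markup."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def clean_one(name, html):
    """Clean a single report: the part that would live inside the macro."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    # Tag every row with its source file so per-day logic stays possible later.
    return [dict(zip(header, cells), source_file=name) for cells in body]


def clean_all(files):
    """Run the same cleanup once per file and union the results."""
    combined = []
    for name, html in files.items():
        combined.extend(clean_one(name, html))
    return combined


# Hypothetical stand-ins for two of the 270 downloaded reports.
sample = {
    "report_a.xls": "<table><tr><th>id</th><th>error</th></tr>"
                    "<tr><td>A1</td><td>timeout</td></tr></table>",
    "report_b.xls": "<table><tr><th>id</th><th>error</th></tr>"
                    "<tr><td>B7</td><td>missing</td></tr></table>",
}
history = clean_all(sample)
```

Each file is parsed on its own before anything is stacked, so one report's rows never get mixed into another's while being reshaped. That may be why raising the record count to 270 in a single pass produced the wonky structure, though that is a guess without seeing the workflow.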

Digan
Alteryx