
Alteryx Designer Desktop Discussions


Looking For Feedback and Pointers

Dave5454
5 - Atom

Looking for some feedback on my first attempt at web scraping with Alteryx.  I know at least half of it could be done more efficiently in Alteryx, but I am still a newbie, which is why I am looking for feedback.

 

This is basically my second time using Alteryx.  The first time was a pretty simple MySQL DB to MS SQL DB ETL operation, with no issues whatsoever because the data was pretty clean.  This time, however, I wanted to learn a bit more and dive deeper into the functionality, so I created a workflow to scrape the numerous days of celebration and causes from a website.  Through this community and Google I was able to complete the project.  Please have a look at the attached workflow and tell me what I did wrong or could have done better.

 

I am looking to improve my knowledge and to learn and use best practices.  All feedback is appreciated.  Thanks!

1 REPLY
AndrewSu
Alteryx Alumni (Retired)

@Dave5454, first off, great work on the workflow!  As you'll find with Alteryx, there are many paths to the correct answer.

 

A few points of feedback. 

 

  1. I see that you have multiple Formula and Filter tools, which suggests there's an opportunity to consolidate them into a smaller number of tools. 
    • There will be cases that make sense to keep the tools separate, but more often than not, consolidation/simplification is the right move. 

  2. Another point of feedback is to use the Annotations function in each of the tools.  This way, you can document exactly what is changing at each step instead of needing to interpret the formulas and settings of each tool. 

  3. The Text To Columns tool that separates the dataset into 300 columns seems inefficient.  I imagine that a simple regular expression could have identified all the matches within one column, instead of the multiple columns and formulas that you have now.  
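
To make point 3 a bit more concrete, here is a rough sketch of the idea in Python rather than in Designer. The sample markup, the pattern, and the names `SAMPLE_HTML`, `ENTRY_PATTERN`, and `extract_entries` are all made up for illustration, since I don't know the exact structure of the page you scraped. The point is only that one pattern with capture groups can return one row per match, so there is no need to split into hundreds of columns first.

```python
import re

# Hypothetical sample of scraped markup; the real site's structure is an assumption.
SAMPLE_HTML = """
<li><a href="/day/a">National Pie Day</a> - January 23</li>
<li><a href="/day/b">World Book Day</a> - April 23</li>
<li><a href="/day/c">Talk Like a Pirate Day</a> - September 19</li>
"""

# One pattern, two capture groups: the celebration name and its date.
ENTRY_PATTERN = re.compile(
    r'<a href="[^"]*">([^<]+)</a>\s*-\s*([A-Za-z]+ \d{1,2})'
)


def extract_entries(html: str) -> list[dict[str, str]]:
    """Return one record per regex match instead of one column per delimiter."""
    return [
        {"name": match.group(1).strip(), "date": match.group(2)}
        for match in ENTRY_PATTERN.finditer(html)
    ]


if __name__ == "__main__":
    for row in extract_entries(SAMPLE_HTML):
        print(row)
```

In Designer, the closest equivalent is probably the RegEx tool with the Tokenize output method set to split to rows, which puts each match on its own row of a single column. The pattern itself would need to be adjusted to the actual page.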

 

All in all, great work!

 

I hope this helps.  Please mark this reply as the solution so that we can clean up the threads and other people in the Community can benefit from our collaboration. 

 

Thanks. 

 

 
