
Alteryx Designer Desktop Discussions

SOLVED

Pre-check if website is valid before using the download tool

estherb47
15 - Aurora

Hi,

I have a list of about 60,000 websites that I need to verify. The Download tool fails (couldn't resolve host name) when it comes upon a bad URL, so anything after a "bad" website isn't processed.

 

Is there a tool/macro I can use to check if the URL is even valid ahead of the Download tool? I saw a tip to write a batch file/run command combo to ping the websites, but I have no idea how to do this.

 

Attaching a small workflow with a handful of records.

@SeanAdams, @MattH, @NicoleJohnson, @MarqueeCrew any insights here?

5 REPLIES
jdunkerley79
ACE Emeritus

One quick solution is to use an iterative macro.

 

Adjusted sample attached

 

MattH
Alteryx

Hi @jdunkerley79 and @estherb47

 

I had the same idea. Ping will only tell you whether the server's IP address is reachable, not whether there is a website on the server. Checking the headers of the site will give you more information. Here is my rough attempt. Because one of the URLs failed completely, I added a join to identify this site (it comes out of the right join output). You could further parse the response in the macro for the redirect returns (3xx, e.g. 301/302) to get the actual URL to use.
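For readers who want to try the header-check idea outside a workflow, here is a minimal Python sketch. It is not the attached macro. The helper names (`normalize`, `classify`, `check_url`), the 5-second timeout, and the choice to prepend `http://` to records without a scheme are all assumptions made for illustration.

```python
import http.client
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def normalize(url):
    """Prepend http:// when a record has no http/https scheme (assumed input shape)."""
    url = url.strip()
    return url if urlsplit(url).scheme in ("http", "https") else "http://" + url


def classify(status):
    """Map an HTTP status code (or None for no response) to a coarse label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    return "error"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx response itself is visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code and headers


def check_url(url, timeout=5.0):
    """Send a HEAD request; return (status, Location header), or (None, None) on failure."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(normalize(url), method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        # Raised for 3xx (because redirects are disabled above) and for 4xx/5xx.
        return err.code, err.headers.get("Location")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None, None
```

Calling `check_url` on each record and passing the status to `classify` gives one label per URL, and a failure on one record does not stop the rest. Some servers reject HEAD requests, so a fallback to GET may be needed in practice.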

 

Thanks,

Matt

estherb47
15 - Aurora

You are both ROCKSTARS. Thank you. Love this community so much.

Best,

Esther

stapuff
9 - Comet

@MattH

 

Almost 3 years after your post, this suggestion still works perfectly.

 

Thanks,

 

Puff

vijaysuryav93
7 - Meteor

I know this is late, but I'm posting it for future reference.

 

I have written a Python-based workflow/tool that marks each URL as Valid or Not Valid. It is based on the `gethostbyname` function from Python's socket library, so it checks whether the host name resolves. It is one of the fastest solutions I have come across, and it can validate any number of records. As a rough estimate, validating 1,000 records takes about 7 minutes.
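The tool itself is not shown here, so the following is only a sketch of how a `gethostbyname`-based check might look. The function names `host_of` and `host_resolves`, and the injectable `resolver` parameter, are hypothetical and added for illustration. Note that this only confirms the host name resolves in DNS, not that a website answers there.

```python
import socket
from urllib.parse import urlsplit


def host_of(url):
    """Extract the host name from a URL, with or without a scheme."""
    url = url.strip()
    try:
        # Without "//", urlsplit would treat a bare "example.com/x" as a path.
        parts = urlsplit(url if "://" in url else "//" + url)
        return parts.hostname or ""
    except ValueError:
        # Malformed input such as an unbalanced IPv6 bracket.
        return ""


def host_resolves(url, resolver=socket.gethostbyname):
    """Return True if the URL's host name resolves to an address.

    `resolver` defaults to socket.gethostbyname; it is a parameter only so
    the lookup can be swapped out (an assumption, not part of the original tool).
    """
    host = host_of(url)
    if not host:
        return False
    try:
        resolver(host)
        return True
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError; UnicodeError covers
        # host names that cannot be encoded for the lookup.
        return False
```

Filtering a list with `[u for u in urls if host_resolves(u)]` would leave only the records whose host names resolve, which avoids the "couldn't resolve host name" failure described in the original question.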
