Hi,
I have a list of about 60,000 websites that I need to verify. The Download tool fails (couldn't resolve host name) when it comes upon a bad URL, so anything after a "bad" website isn't processed.
Is there a tool/macro I can use to check if the URL is even valid ahead of the Download tool? I saw a tip to write a batch file/run command combo to ping the websites, but I have no idea how to do this.
Attaching a small workflow with a handful of records.
@SeanAdams, @MattH, @NicoleJohnson, @MarqueeCrew any insights here?
Hi @jdunkerley79 and @estherb47
I had the same idea. Ping will only tell you whether the server's IP address is reachable, not whether there is a website on the server. Checking the headers of the site will give you more information. Here is my rough attempt. Because one of the URLs failed completely, I added a join to identify this site (it comes out of the right join output). You could further parse the response in the macro for the 304 returns to get the actual URL to use.
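For anyone who wants to try the header check outside of a workflow, here is a minimal Python sketch of the same idea. It is not the attached macro: the function name `check_url` and the 10-second timeout are assumptions made for illustration. It sends a HEAD request with the standard library and reports the status code, or the error text when the site cannot be reached at all.

```python
import http.client
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Send a HEAD request; return (status_code, url) or (None, error_text)."""
    try:
        # Building the Request raises ValueError for entries that are not URLs
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # urlopen follows redirects, so geturl() is the address finally reached
            return response.status, response.geturl()
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500
        return err.code, url
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError) as err:
        # No usable answer: unresolvable host, refused connection, timeout, malformed URL
        return None, str(err)
```

Because every failure is caught and returned as a value, one bad record does not stop the rest of the list from being processed. Some servers reject HEAD requests, so a non-200 status here is a hint to look closer and not proof that the site is down.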
Thanks,
Matt
You are both ROCKSTARS. Thank you. Love this community so much.
Best,
Esther
Though it's too late, I'm posting this for future reference.
I have written a Python-based workflow/tool that validates whether a URL is Valid or Not Valid. It is based on the 'gethostbyname' method from the Python socket library. It is one of the fastest solutions I have come across, and it can validate any number of records. A rough estimate is 7 minutes to validate 1,000 records.
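For reference, a check of this kind can be sketched in a few lines. This is an illustrative sketch only, not the tool described above: the helper names `extract_host` and `host_resolves` are made up for the example. Note that a successful lookup only shows that the host name resolves in DNS, not that a website is actually being served there.

```python
import socket
from urllib.parse import urlparse


def extract_host(url):
    """Return the hostname part of a URL, or None if there is none."""
    # urlparse only fills in the host when a scheme or leading // is present,
    # so bare entries like "example.com/page" get a // prefix first.
    if "//" not in url:
        url = "//" + url
    try:
        return urlparse(url).hostname
    except ValueError:
        # Raised for malformed entries such as an unclosed IPv6 bracket
        return None


def host_resolves(url):
    """True if the URL's hostname can be resolved, False otherwise."""
    host = extract_host(url.strip())
    if not host:
        return False
    try:
        socket.gethostbyname(host)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the name could not be resolved
        # UnicodeError: the name cannot be encoded for a DNS lookup
        return False
```

Running `host_resolves` over the list first and passing only the records that return True to the Download tool would keep unresolvable hosts from stopping the run.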