Alteryx Designer Desktop Discussions

SOLVED

Alteryx Download Tool - 400 Error

NexBK
7 - Meteor

I want to scrape a website (https://www.songpa.go.kr/www/index.do).

However, when I request a page from that site with the Download Tool, the server returns a 400 error.

(DownloadData)
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>400 Bad Request</TITLE>
</HEAD><BODY>
<H1>
400 Bad Request</H1>
</BODY></HTML>

(DownloadHeaders)
HTTP/1.1 400 Bad Request
Date: Tue, 21 Jun 2022 04:18:38 GMT
Content-Type: text/html; charset=EUC-KR
Connection: close
Content-Length: 157

When I request the same URL from the Python Tool with the "beautifulsoup4" and "requests" packages, the page is retrieved without problems.

How can I retrieve the page with the Download Tool?

2 REPLIES
PhilipMannering
16 - Nebula

I think you have to add a User-Agent to the request headers. Try the attached workflow. Note that much of the data is loaded with JavaScript after the HTML has loaded, so it might be tricky to scrape. Try [url] as well as [url2] in the Download Tool in my workflow.

Hope this helps,

Philip

NexBK
7 - Meteor

You're right. When I add a User-Agent header, the web page source is returned correctly.
Thank you for your help.
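
For anyone who wants to see the same fix outside the Download Tool, here is a minimal Python sketch that sends the request with an explicit User-Agent header. It uses only the standard library. The User-Agent string and the helper names build_request and fetch are assumptions for illustration; the thread does not say which value was used in the attached workflow.

```python
import urllib.request

URL = "https://www.songpa.go.kr/www/index.do"

# Assumption: a generic browser-like value. The thread does not state
# which User-Agent string was used in the attached workflow.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url, user_agent=USER_AGENT):
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Send the request and return (HTTP status code, raw body bytes).

    The body is left undecoded because the page encoding is not confirmed
    here (the error response in the question declared EUC-KR).
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling fetch(URL) performs the actual request. In the Download Tool, the equivalent is adding a header named User-Agent with a constant value on the Headers tab.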
