Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Read html file from desktop

Andrzej
8 - Asteroid

Hi,

 

I am building an app that a user will use to upload data. Until now the client has supplied data in CSV and Excel format, but they will now also be loading HTML files. The files will be loaded from the client's desktop, so I cannot use the Download tool. The only solution I found was to convert the HTML to XML and load that. Unfortunately it doesn't load correctly, so I had to cleanse the data a bit (please find examples attached; I couldn't attach the HTML or XML files, but the XML is embedded in the workflow).

 

My question is whether I can load an HTML file into Alteryx without converting it to XML first (I don't want the customer to have to do that). I am also concerned about whether my solution is robust. If it is not, could you advise me on something more robust?

8 REPLIES
seinchyiwoo
Alteryx Alumni (Retired)

Hey,

 

You can read it directly without converting it.

Input Data > All Files > xxx.html > Read it as a delimited text file > Other: \0 (\0 means no delimiter, so each line is read as a single field)

seinchyiwoo_0-1606293943343.png

After that, you just go through the usual motions of parsing the data in Designer.

Some examples of parsing an HTML file are here: https://community.alteryx.com/t5/Weekly-Challenge/Challenge-40-Parsing-a-HTML-File/td-p/36581
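For readers who want to see the idea outside Designer, here is a minimal Python sketch of what "read with no delimiter" amounts to: every line of the HTML file becomes one unparsed text field, ready for parsing afterwards. The function name `read_html_lines`, the sample content, and the UTF-8 encoding are assumptions made for illustration; this is not what the Input Data tool does internally.

```python
import tempfile
from pathlib import Path


def read_html_lines(path):
    """Return each line of the file as a single unparsed text field."""
    # Assumption: the file is UTF-8; undecodable bytes are replaced, not fatal.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return text.splitlines()


if __name__ == "__main__":
    # Hypothetical sample standing in for the customer's HTML export.
    sample = '<table>\n<tr class="b"><td>user1</td><td></td></tr>\n</table>\n'
    with tempfile.TemporaryDirectory() as tmp:
        html_path = Path(tmp) / "sample.html"
        html_path.write_text(sample, encoding="utf-8")
        for line in read_html_lines(html_path):
            print(repr(line))
```

Nothing is split into columns at this stage, which is why a parsing step still has to follow.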

 

Cheers,

Seinchyi

Andrzej
8 - Asteroid

Hi Seinchyi,

 

Thank you for this. I didn't know that the Input Data tool could read this file, because the file browser didn't show it. I have to type the name of the file I want to load by hand. I am still wondering whether there is a way for the Input Data tool to show this file, because the workflow will be used by the customer.

 

Capture.PNG

Greetings

Andrzej Gabryel

 

 
 

 

seinchyiwoo
Alteryx Alumni (Retired)

Yep, under "Files of type", change "All Data Files" to "All Files".

Then you will be able to see all files.

Andrzej
8 - Asteroid

Thank you again 🙂 

Andrzej
8 - Asteroid

Hi,

 

I have one additional question. I am trying to use the RegEx tool to tokenize the columns from a row like this one:

<tr class="b"><td>24/11/2020 13:44:10</td><td>user1</td><td>221232732</td><td>756756</td><td>fdsfs</td><td></td></tr> 

using this expression:

(?<=(<td>))(\w|\d|\n|[().,\-:;@#$%^&*\[\]"'+–/\/®°⁰!?{}|`~]| )+?(?=(</td>))

When I use the Replace method, it finds all the necessary information correctly. I have also checked it with https://regexr.com/

Andrzej_0-1606309204929.png

 

But when I try to use Tokenize, I get an error. Do you know what should be changed in the regular expression?

 

Greetings

Andrzej

 

 

seinchyiwoo
Alteryx Alumni (Retired)

Hey,

 

Try this instead:

<td>(.*?)</td>
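As a rough illustration of what this pattern captures, here is a small Python sketch run against the sample row from the question. It uses Python's `re` module rather than the RegEx tool, so treat it as an approximation of Tokenize; `tokenize_cells` is a hypothetical helper name.

```python
import re

# One capture group: the lazy (.*?) takes the text between each <td>...</td> pair.
CELL_PATTERN = re.compile(r"<td>(.*?)</td>")


def tokenize_cells(row):
    """Return the contents of every <td> cell in one HTML table row."""
    return CELL_PATTERN.findall(row)


row = (
    '<tr class="b"><td>24/11/2020 13:44:10</td><td>user1</td>'
    "<td>221232732</td><td>756756</td><td>fdsfs</td><td></td></tr>"
)

print(tokenize_cells(row))
# ['24/11/2020 13:44:10', 'user1', '221232732', '756756', 'fdsfs', '']
```

The lazy quantifier stops at the first closing `</td>`, so each cell becomes its own token and the empty last cell comes back as an empty string. In this sketch `.` does not match a newline, so a cell that spans several lines would not be captured.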

 

seinchyiwoo_0-1606355206880.png

 

Cheers,

Seinchyi

Andrzej
8 - Asteroid

Thank you again 🙂 

Andrzej
8 - Asteroid

Hi @seinchyiwoo,

 

I have another question. Everything worked when I was running it as a normal workflow, but now that I am trying to use it as an app, I get this error. Do you know how I can resolve this issue?

Andrzej_1-1606912290388.png

 

When I use the File Browse tool, I don't see this screen:

Andrzej_0-1606912651710.png

 

 

Greetings

Andrzej
