Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
How to use R and Python to Parse Word Documents

Alteryx Partner

Hi @ShaanM 

 

Please find the sample file.

Thanks in advance.

Alteryx Partner

Hi Shaan

 

Please find the file.

Thanks in advance.

Alteryx

@gururajb I tested with your file. It looks like some file properties have not been filled in.

I opened the doc, copied the contents, and pasted them into a new Word doc; that file then reads in OK.

It might be down to how the original file was created.
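
For anyone wanting to check a file before running it through the workflow, here is a minimal sketch of one possible diagnostic. It relies on the fact that a .docx file is a zip package, and it assumes the document properties live in the conventional parts `docProps/core.xml` and `docProps/app.xml` (the names Word normally writes; the format allows others). The function name `missing_property_parts` is made up for illustration, and a missing part may not be the cause of the read failure discussed above.

```python
import zipfile

# Conventional part names for document properties in a .docx package.
# This is an assumption: the package format permits other names.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def missing_property_parts(docx_path):
    """Return the conventional property parts that are absent from a .docx."""
    with zipfile.ZipFile(docx_path) as package:
        names = set(package.namelist())
    return [part for part in PROPERTY_PARTS if part not in names]
```

An empty list means both parts are present. This only checks that the parts exist, not that the individual properties inside them are filled in.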

Alteryx Partner

Thanks for the insights @ShaanM.

I will find out from the client how the files were created.

Asteroid

If I wanted to add the input filepath to the python macro so I can link phrases back to source documents, what might that look like? Something like this?

 

from ayx import Alteryx
import pandas

import docx2txt

# Path of the Word document to parse; kept in a variable so it can travel with the text
filepath = 'XXXX'
text = docx2txt.process(filepath)

print(text)

# Put the extracted text and its source file path into a pandas DataFrame
df = pandas.DataFrame({"text": [text], "filepath": [filepath]})

# Write the data frame to the Alteryx workflow for downstream processing
Alteryx.write(df, 1)

Alteryx

@coderockride 

 

Yes, I think you are on the right path.

The main thing is to define the file path in the data frame, so that it becomes part of the data as it passes through the stream.
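
If several documents need to be parsed in one pass, the same idea could be sketched roughly as below: one row per document, with the file path stored next to the extracted text. The helper names `build_records` and `write_to_alteryx` are hypothetical, and the output anchor number is only an example. The `ayx` and `docx2txt` imports are kept inside the second function because they are assumed to be available only inside the Alteryx Python tool.

```python
def build_records(paths, extract_text):
    """Return one record per document, pairing each file path with its text."""
    records = []
    for path in paths:
        records.append({
            "filepath": str(path),
            "text": extract_text(str(path)),
        })
    return records


def write_to_alteryx(paths, anchor=1):
    """Parse each .docx in `paths` and send the rows to an output anchor."""
    # Imported here because these modules are assumed to resolve only
    # inside the Alteryx Python tool environment.
    from ayx import Alteryx
    import docx2txt
    import pandas

    # A list of dicts becomes a DataFrame with 'filepath' and 'text' columns.
    df = pandas.DataFrame(build_records(paths, docx2txt.process))
    Alteryx.write(df, anchor)
```

Because the path is an ordinary column, any phrase found downstream can be traced back to its source document with a filter or join on `filepath`.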
