Alteryx Designer Desktop Discussions

SOLVED

Read a field from connected tool in python tool

knozawa
11 - Bolide

Hello,

Is there any way to use a field from an incoming connection in the Python tool? I'm using the Python tool for web scraping. Currently, I embed the URL to scrape directly in the Python tool. However, I would like to scrape multiple pages of the website dynamically.

I tried to read the connected tool:

data = Alteryx.read("#1")
Alteryx.read(Alteryx.getIncomingConnectionNames()[0])
data_url = data['file']

wd = webdriver.Chrome("C:/Program Files/Alteryx/bin/Plugins/chromedriver.exe")

wd.get(data_url)

However, data_url is an "object" (a pandas Series), not a "string", so the wd.get(data_url) call gives an error message: # For dynamically generated websites wait for a specific ID tag.

8 REPLIES
Jean-Balteryx
16 - Nebula

Hi @knozawa,

Does data_url[0] return something?

knozawa
11 - Bolide

@Jean-Balteryx

Thank you! Yes, it read the first record. Do you know if we can read multiple records from the incoming connection to scrape multiple pages in the Python tool? I wonder whether I need an iterative macro for this.

Jean-Balteryx
16 - Nebula

I'm not a Python expert, but maybe slicing such as data_url[0:2] could work!
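
To make the difference concrete, here is a minimal sketch, assuming the incoming data arrives as a pandas DataFrame (which is what the traceback later in the thread suggests). The DataFrame and its example.com URLs are made-up stand-ins for Alteryx.read("#1"). It shows that the whole column is a Series, a single position is a plain string, and a slice is still a Series that would need a loop before being passed to wd.get().

```python
import pandas as pd

# Stand-in for Alteryx.read("#1"); the URLs are made-up placeholders.
data = pd.DataFrame({"file": ["https://example.com/a",
                              "https://example.com/b",
                              "https://example.com/c"]})

data_url = data["file"]          # the whole column: a pandas Series
first_url = data_url.iloc[0]     # one position: a plain Python string
first_two = data_url.iloc[0:2]   # a slice: still a Series, with two strings

print(type(data_url).__name__)   # Series
print(type(first_url).__name__)  # str
print(first_two.tolist())
```

Here .iloc is used instead of plain [0] so the lookup is always by position, even if the DataFrame's index does not start at 0.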

clmc9601
13 - Pulsar

Hi @knozawa,

@Jean-Balteryx is on the right track!

You can definitely reference specific fields from the input. If 'data' is the variable holding the input from anchor #1, you can use

data[rowNumber]['columnName' or columnNumber] 

to reference a specific column from a specific row.

If you want to iterate through all the rows in the input, Python already has a construct for that.

I'd use a "for loop". More detailed information here: https://automatetheboringstuff.com/chapter2/. Search "in range" to skip straight to for loops.

It'll be something like:

for x in range(0, len(data)):
   variableYouChoose = data[x]['columnName']
   otherCodeHere...

 

I hope this helps! If it does, please consider marking it as a solution so others may find it.

knozawa
11 - Bolide

@clmc9601

Thank you! I added the following code, but webpage = data[x]['file'] raises a KeyError:

KeyError                                  
     69 
     70 for x in range(0, len(data)):
---> 71     webpage = data[x]['file']
     72     if __name__ == "__main__":
     73         main()

c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2897             if self.columns.nlevels > 1:
   2898                 return self._getitem_multilevel(key)
-> 2899             indexer = self.columns.get_loc(key)
   2900             if is_integer(indexer):
   2901                 indexer = [indexer]

clmc9601
13 - Pulsar

Hi @knozawa,

Sorry, I wrote rows and columns in the wrong order. Try this instead:

data['file'][x] 
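
The reason for the KeyError above is that, on a pandas DataFrame, data[x] looks for a column named x, so the column has to be selected first and the row second. Here is a minimal sketch of the corrected loop, assuming 'data' is a pandas DataFrame with a 'file' column. The DataFrame below is a made-up stand-in for Alteryx.read("#1"), and the 'visited' list stands in for the wd.get(webpage) call.

```python
import pandas as pd

# Stand-in for Alteryx.read("#1"); the URLs are made-up placeholders.
data = pd.DataFrame({"file": ["https://example.com/page1",
                              "https://example.com/page2"]})

visited = []

# Index-based loop: column first, then row position.
for x in range(len(data)):
    webpage = data["file"].iloc[x]   # a plain string
    visited.append(webpage)          # in the workflow: wd.get(webpage)

# Equivalent, without an index: iterate over the column directly.
direct = [webpage for webpage in data["file"]]

print(visited == direct)
```

With the default 0..n-1 index, data['file'][x] and data['file'].iloc[x] return the same value; .iloc is only the safer choice if rows were filtered or re-indexed upstream.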

knozawa
11 - Bolide

@clmc9601

Thank you! It successfully iterated and read the rows one at a time. I will close this case now.

However, the output contains only the result of the last iteration. I created another case since it's a slightly different question: https://community.alteryx.com/t5/Alteryx-Designer-Discussions/how-to-append-data-frame-output-in-pyt...

If you can give me some suggestions, that would be helpful. Thanks!

clmc9601
13 - Pulsar

Sure, will do.
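
For anyone landing here with the same "only the last iteration" symptom: a common cause is overwriting the result variable on every pass through the loop. Below is a minimal sketch of one way around it, collecting each iteration's result in a list and building a single DataFrame after the loop. The scrape() function and its 'url' and 'length' fields are hypothetical placeholders for the real Selenium logic, and the DataFrame is a made-up stand-in for Alteryx.read("#1").

```python
import pandas as pd

def scrape(url):
    """Hypothetical placeholder for the real scraping logic."""
    return {"url": url, "length": len(url)}

# Stand-in for Alteryx.read("#1"); the URLs are made-up placeholders.
data = pd.DataFrame({"file": ["https://example.com/page1",
                              "https://example.com/page2"]})

rows = []
for url in data["file"]:
    rows.append(scrape(url))   # keep every iteration's result

result = pd.DataFrame(rows)    # one row per scraped page
print(result)

# In the Python tool, the combined frame would be written once, after the loop:
# Alteryx.write(result, 1)
```

Writing the output once after the loop, rather than inside it, is what keeps earlier iterations from being replaced by later ones.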
