
Alteryx Designer Desktop Discussions

SOLVED

How to avoid the Python tool joining all records together

cmcclellan
13 - Pulsar

Today I wrote a macro that uses the Python tool (the first time I've ever used the tool, and the first time I've ever done anything with Jupyter notebooks). It is supposed to hash (SHA256) the field value that is entered.

 

It worked great until I added more records and realised that the input of the Python tool is a multi-join, so no matter how many records go into the tool, only ONE record comes out.

 

Is there any way to fix this? I understand why it's written that way, but in my case I want X records in and X records out, not X records in and 1 record out.

 

Also, ignore my actual Python coding; I'm still learning the best way to use pandas.

BenMoss
ACE Emeritus

The input to the Python tool is not a 'multi-join'. It does allow you to connect many inputs, but it doesn't join those inputs together, and in your case you are only inputting one data stream, not many. The problem lies in how you have built your script, @cmcclellan.

 

The input to the Python tool looks the way it does so that you can connect many streams of data and reference each one by its connection; for example, yours has one input stream, given the connection #1.

 

Alteryx.read("#1")

So this call lets me see that input in Python (all of it, row by row).

 

If I had a second stream and plugged it into the Python tool, I would have to use 'Alteryx.read("#2")' to see that dataset.
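
To illustrate the point about separate connections, here is a minimal sketch. It assumes two incoming connections labelled #1 and #2; the stand-in frames and the column names "testing" and "lookup" are hypothetical and exist only so the snippet also runs outside Alteryx.

```python
import pandas as pd

try:
    # The ayx package is only importable inside the Alteryx Python tool.
    from ayx import Alteryx
    first = Alteryx.read("#1")   # rows from the connection labelled #1
    second = Alteryx.read("#2")  # rows from the connection labelled #2
except ImportError:
    # Hypothetical stand-in data so the sketch also runs outside Alteryx.
    first = pd.DataFrame({"testing": ["password", "this is a test"]})
    second = pd.DataFrame({"lookup": ["a", "b", "c"]})

# Each connection arrives as its own DataFrame; nothing is joined for you.
print(first.shape, second.shape)
```

Each call returns its own pandas DataFrame with one row per incoming record, so any joining or flattening has to come from the script itself.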

 

Unfortunately, when I open the macro itself the script disappears, so I cannot show you screenshots of this, but if you can attach the script then I can potentially help.

 

What I would advise is that you develop on a more representative data set from within your macro (i.e. two lines instead of one); this will allow you to see, from within your script, where the problem lies.

 

Ben

 

 

 

cmcclellan
13 - Pulsar

Sorry, I didn't realise the code disappeared!

 

Here's a screenshot:

 

2019-02-13 21_22_44-Alteryx Designer x64 - sha256_hash.yxmc_.png

 

and here's the code:

 

from ayx import Alteryx

import hashlib, binascii, sys , pandas


def hashthekey(input):
	dk = hashlib.pbkdf2_hmac('sha256', str.encode(input), b'salt password', 1)
	result = binascii.hexlify(dk)
	result = result.decode()
	return result
	
def main():
	input = Alteryx.read("#1")
	input = pandas.DataFrame.to_string(input, index=False, index_names=False, header=False)
	data = hashthekey(input)
	index = ['row1']
	cols = ['col1']
	df = pandas.DataFrame(data,index=index, columns=cols)
	Alteryx.write(df,1)        

main()

 

 

BenMoss
ACE Emeritus

I just took a look at the code, and it seems that one of your statements concatenates your rows of data into a single cell; as a result, you end up with a single-cell output. I think you need to look at rewriting your code.

 

input = Alteryx.read("#1")
pandas.DataFrame.to_string(input, index=False, index_names=False, header=False)

2019-02-13_13-26-03.png

BenMoss
ACE Emeritus

There may be some sort of loop built in, but I can't tell for sure as I am no Python expert. However, if you check out the dataframe that is pulled into the Python tool, just by running 'Alteryx.read("#1")', you will see that the result is not a multi-joined string.

 

2019-02-13_13-31-38.png

AndrewKramer
Alteryx Alumni (Retired)

As Ben mentioned, the to_string() function flattens your data frame into a single string, so your hashthekey() function is only called once, on that string.

 

data = pandas.DataFrame.to_string(input, index=False, index_names=False, header=False)
data

'password\n      password\n             1\n             2\n             3\nthis is a test'

Your input column, testing, is already a string (pandas shows this as the object dtype). You can simply use the .map() function to apply your hashthekey() function to each row of your pandas data frame and return the result as a new column.

 

python_code.PNG
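
The .map() approach described above can be sketched roughly as follows. This is not necessarily the exact code from the reply; the output column name "hashed" and the stand-in data are assumptions, while the "testing" column and the PBKDF2 settings are taken from the posts above.

```python
import binascii
import hashlib

import pandas as pd

try:
    from ayx import Alteryx  # present only inside the Alteryx Python tool
except ImportError:
    Alteryx = None


def hashthekey(value):
    """Hash one value with the same PBKDF2 settings as the original post."""
    dk = hashlib.pbkdf2_hmac("sha256", str(value).encode(), b"salt password", 1)
    return binascii.hexlify(dk).decode()


if Alteryx is not None:
    df = Alteryx.read("#1")
else:
    # Stand-in rows mirroring the sample values shown earlier in the thread.
    df = pd.DataFrame(
        {"testing": ["password", "password", "1", "2", "3", "this is a test"]}
    )

# Apply the hash to each row separately: X records in, X records out.
# "hashed" is a hypothetical name for the new column.
df["hashed"] = df["testing"].map(hashthekey)

if Alteryx is not None:
    Alteryx.write(df, 1)  # send the frame to output anchor 1
```

Because .map() calls hashthekey() once per value, identical inputs produce identical hashes and the row count is preserved. Note that pbkdf2_hmac is a key-derivation function rather than a plain digest; if a plain SHA-256 hash is what is wanted, hashlib.sha256(value.encode()).hexdigest() would be the usual call.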

 

cmcclellan
13 - Pulsar

I had to add some code to write to the output stream of the macro, but THANK YOU for the code :) :) It really did help, and it shows that I need to learn more Python before just chucking in code ;)
