We are having issues using our custom Python tool in an Alteryx Server environment because the XML config string passed to pi_init() seems to be formatted differently in Designer and Server environments. Specifically, the issue only occurs if a config option is left unset in the HTML SDK. This results in XML that looks as follows in the .yxmd file:
<Properties>
<Configuration>
<enableDedPred>False</enableDedPred>
<username> </username>
<predURL> </predURL>
<predKey> </predKey>
</Configuration>
<Annotation DisplayMode="0">
<Name />
<DefaultAnnotationText />
<Left value="False" />
</Annotation>
</Properties>
When running from Designer, the XML string that gets passed into pi_init() appears to have been stripped of all this whitespace. However, when saving the workflow to Alteryx Server, the workflow fails to save because it seems that the XML string passed to pi_init() in this case is not minified, and that breaks our parsing because an XML tag with whitespace in its text section is not the same as an empty XML tag. I feel that empty XML tags should be saved as
<username></username>
for example.
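To illustrate why the two forms are not interchangeable, here is a minimal sketch using Python's built-in xml.etree.ElementTree. The element names come from the config above; the exact newline and indentation inside the padded tag, and the variable names, are assumptions made for illustration. A whitespace-only element comes back with a text value, while an empty element (in either form) comes back with None.

```python
import xml.etree.ElementTree as ET

# Whitespace-only element, similar to what appears in the saved .yxmd file.
# The exact newline/indentation used here is an assumption for illustration.
padded = ET.fromstring(
    "<Configuration><username>\n    </username></Configuration>"
)

# Truly empty element, in long form and in self-closing form.
empty_long = ET.fromstring("<Configuration><username></username></Configuration>")
empty_short = ET.fromstring("<Configuration><username /></Configuration>")

padded_text = padded.find("username").text
empty_long_text = empty_long.find("username").text
empty_short_text = empty_short.find("username").text

print(repr(padded_text))       # a string containing only whitespace
print(repr(empty_long_text))   # None
print(repr(empty_short_text))  # None
```

Any code that checks for `text is None` to detect an unset option will therefore behave differently depending on which form it receives.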
Are you saying that the workflow does not save to the Alteryx Server? Or that when it is run from the Alteryx Server, the pi_init XML is in a different format than when it is run from Alteryx Designer?
The workflow saves, but we see errors on Server because the configuration XML string that pi_init() sees is different between Server and Designer. On Server, we appear to see the XML formatted the same as how the .yxmd file looks on disk. In Designer, it is doing some sort of whitespace stripping so the XML appears to be more correct (I haven't confirmed that the behavior is 100% correct because it appears to be stripping all/most whitespace, but some whitespace may be intentional).
However, I don't know if the real bug is in the HTML SDK because I don't think the v1 HTML tools output the config XML in this weird indented manner.
You should be able to reproduce this issue by creating an HTML and Python tool that has some text-box inputs with no default values. Then when you save a workflow with that tool in it, you will see in the .yxmd file the structure I have in my original post. From there you will see that Designer is stripping that whitespace before the Python code sees it; however, this doesn't happen in Server.
Hmm, this could be an issue in the HTML SDK, but it seems like it could be a defect in the Engine as well. I can't tell without looking into it further. Perhaps @TashaA, @RyanSw, or @LindaT can shed some light here.
Meanwhile, can you use an XML parsing library or some kind of config flag that ignores whitespace?
Yes, we are currently working around the issue by examining the text of the element and, if it contains only whitespace characters, treating it as null. However, I feel this is brittle because in the future we may want a config option where whitespace-only input is valid, so it would be best if the XML saved by Alteryx was correct, i.e. use
<Configuration>
<foobar></foobar>
</Configuration>
or
<Configuration>
<foobar />
</Configuration>
for empty elements and not
<Configuration> <foobar> </foobar> </Configuration>
For reference, this is our workaround
def _isblank(element):
    """
    Returns if a given XML element is _blank_.

    We define blank to be any of:
    * not found in the document tree
    * an empty tag (i.e. <foo />)
    * only containing whitespace (e.g. <foo> </foo>)
    """
    return (element is None
            or element.text is None
            # We treat whitespace-only tags as blank due to a bug in Alteryx Server:
            # https://community.alteryx.com/t5/Dev-Space/Issue-using-Python-tools-in-Alteryx-Server/m-p/149291#M315
            or element.text.strip() == '')
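As a rough sketch of how a helper like this might be applied to the string passed into pi_init(): the example below assumes the string's root element is <Configuration>, and `read_option` is a hypothetical name introduced only for illustration. It returns None for a blank option regardless of whether the XML arrives stripped or with the on-disk whitespace intact.

```python
import xml.etree.ElementTree as ET


def _isblank(element):
    """Same check as the workaround: missing, empty, or whitespace-only."""
    return (element is None
            or element.text is None
            or element.text.strip() == '')


def read_option(str_xml, name):
    """Hypothetical helper: return the option's text, or None if it is blank.

    Assumes str_xml has <Configuration> as its root element.
    """
    element = ET.fromstring(str_xml).find(name)
    return None if _isblank(element) else element.text


# Stripped form and whitespace-padded form (both strings are illustrative).
stripped_xml = "<Configuration><username></username></Configuration>"
padded_xml = "<Configuration><username>\n  </username></Configuration>"
filled_xml = "<Configuration><username>alice</username></Configuration>"

stripped_value = read_option(stripped_xml, "username")
padded_value = read_option(padded_xml, "username")
filled_value = read_option(filled_xml, "username")
missing_value = read_option(filled_xml, "predKey")
```

The trade-off is the one already noted: an option whose valid value is whitespace-only would also be read as None.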
What does the XML look like when you run the workflow using AlteryxEngineCmd.exe?
Our license doesn't allow us to run via the command line.
You could try using the lxml library for this, and when you want to maintain significant whitespace, the front end UISDK ought to wrap your SimpleString values in CDATA so your whitespace will be preserved like that.
I found this here regarding such:
https://stackoverflow.com/questions/3310614/remove-whitespaces-in-xml-string
Let me know if that helps!
I feel this reply misses the point. The stack-overflow link is not the same as the issue I am describing. However, the lxml documentation describes the exact situation I am seeing:
Note that the whitespace content inside the <b> tag was not removed, as content at leaf elements tends to be data content (even if blank). You can easily remove it in an additional step by traversing the tree
http://lxml.de/tutorial.html#the-parse-function
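The "additional step" the tutorial mentions could look roughly like the sketch below. It uses the built-in ElementTree rather than lxml, and `blank_whitespace_leaves` is a hypothetical name; the sample XML is an assumed reconstruction of the indented config, not a copy of a real .yxmd file.

```python
import xml.etree.ElementTree as ET


def blank_whitespace_leaves(root):
    """Set text to None on leaf elements whose text is only whitespace."""
    for element in root.iter():
        is_leaf = len(element) == 0  # no child elements
        if is_leaf and element.text is not None and element.text.strip() == "":
            element.text = None
    return root


# Assumed shape of the config as it might appear with indentation preserved.
config_xml = """<Configuration>
  <enableDedPred>False</enableDedPred>
  <username>
  </username>
  <predURL>
  </predURL>
</Configuration>"""

root = blank_whitespace_leaves(ET.fromstring(config_xml))
username_text = root.find("username").text
pred_url_text = root.find("predURL").text
flag_text = root.find("enableDedPred").text
```

Like any whitespace-stripping approach, this cannot tell intentional whitespace apart from whitespace introduced during serialization.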
Python's built-in ElementTree parses the XML document fine; whitespace in the document is not a problem in general. The issue is that the whitespace (specifically a newline and indentation) occurs between the opening and closing tags of a leaf element, where a conforming XML parser must not strip it. This whitespace seems to have been inserted by the HTML SDK when the TextBox widget is left empty. I originally filed this bug because it seems that Designer is doing some crazy XML parsing that strips out this extraneous whitespace but Server is not. However, the more we discuss it, the more it seems that the real issue is in the way the HTML SDK serializes all the DataItems into XML.