I am stuck in a catch-22.
We have a large SFTP site that we need to download via Alteryx Server (rather than having someone do it manually on their own machine). It is over 60 GB and 800k files.
I was able to traverse the whole inventory and capture the folder structure and file list, and I am ready to run this job end to end.
However, I noticed that the SFTP tool I used exported the addresses with hex escape sequences instead of the local-language characters (e.g. Japanese, Chinese, Korean, Greek).
Japanese example:
This is how the tool exports the URL:
sftp://ftp1.mysite.com/532243/%28%E5%92%8C%E8%A8%B3%29016358-140605_ReusePositionFAQs_SalesRep_JA.pdf
This is how it should look when I save the file on our side.
(和訳)016358-140605_ReusePositionFAQs_SalesRep_JA.pdf
Another one, this time with Cyrillic (Russian) characters:
URL from the SFTP tool:
sftp:/ftp1.mysite.com/3190708/content/assets/ZTwX9ZElQruYdAoQ_5N01JrDRWABWUNS8-%D0%9F%D1%80%D0%B8%D0%BC%D0%B5%D1%80%20%D0%BB%D0%B8%D1%81%D1%82%D0%B0%20%D0%BE%D0%B7%D0%BD%D0%B0%D0%BA%D0%BE%D0%BC%D0%BB%D0%B5%D0%BD%D0%B8%D1%8F%20%D1%81%20%D1%82%D1%80%D0%B5%D0%BD%D0%B8%D0%BD%D0%B3%D0%BE%D0%BC.docx
How it needs to be saved
ZTwX9ZElQruYdAoQ_5N01JrDRWABWUNS8-Пример листа ознакомления с тренингом.docx
Is there a way to "convert" this address back to the local language using some kind of find and replace or regex? I need it to work universally for any language.
Thanks for any suggestions. I have not found another way to export the entire inventory as CSV or text with the local-language characters intact.
Here is what I would try. If you can find a list of the hex/local-language pairings, you can use the "Find Replace" tool to look them up against the URL. Check the example I have attached. You can see it start to work, but I don't have Greek or Japanese values in my list.
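As an alternative to maintaining a lookup list: the hex sequences in both sample URLs are ordinary URL percent-encoding of UTF-8 bytes (for example, %E5%92%8C is the three UTF-8 bytes of 和), so a generic URL decoder should cover every language at once. Below is a minimal Python sketch, assuming all of the exported names are UTF-8 percent-encoded like the two samples; the function name `decoded_filename` is made up for illustration, and the code would need to run wherever Python is available to the workflow.

```python
from urllib.parse import unquote


def decoded_filename(url: str) -> str:
    """Return the last path segment of `url` with %XX escapes decoded as UTF-8.

    Hypothetical helper for illustration. The split happens before decoding,
    so an escaped slash (%2F) inside a name is not mistaken for a separator.
    """
    last_segment = url.rsplit("/", 1)[-1]
    # errors="strict" raises UnicodeDecodeError if the escapes are not valid
    # UTF-8, instead of silently inserting replacement characters.
    return unquote(last_segment, encoding="utf-8", errors="strict")


japanese_url = (
    "sftp://ftp1.mysite.com/532243/"
    "%28%E5%92%8C%E8%A8%B3%29016358-140605_ReusePositionFAQs_SalesRep_JA.pdf"
)
print(decoded_filename(japanese_url))
# (和訳)016358-140605_ReusePositionFAQs_SalesRep_JA.pdf
```

Only the name used when saving the file is decoded here. The original, still-encoded URL is left untouched so it can be used for the download request itself.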
Quick things to troubleshoot...
1) Is the download tool set to download in UTF-8 (or UTF-16)?
2) Is the data downloaded directly into the data stream and stored as a string, or is it saved as a file?
3) If it's read back in from .csv, what text encoding is set on the Input Data tool?
4) Is there a manual conversion from V_WString to String at any point in the workflow?
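All four checks come down to the same question: is the text read with the encoding it was written in, and is it ever forced into a type that cannot hold non-Latin characters? The Python sketch below only illustrates what each failure looks like; it is not Alteryx code, and the helper name `roundtrip_csv` and the shortened file name are made up for the example. The last step is an analogy for a wide-to-narrow string conversion, assuming the narrow type holds Latin-1 only.

```python
import csv
import os
import tempfile


def roundtrip_csv(value: str, write_encoding: str, read_encoding: str) -> str:
    """Write `value` to a one-cell CSV and read it back (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", encoding=write_encoding, newline="") as f:
            csv.writer(f).writerow([value])
        with open(path, "r", encoding=read_encoding, newline="") as f:
            return next(csv.reader(f))[0]
    finally:
        os.remove(path)


name = "(和訳)016358.pdf"

# Matching encodings: the name survives the round trip unchanged.
same = roundtrip_csv(name, "utf-8", "utf-8")

# Mismatched encodings: UTF-8 bytes read as Latin-1 come back garbled,
# with each original character turned into several wrong ones.
garbled = roundtrip_csv(name, "utf-8", "latin-1")

# Analogy for a wide-to-narrow conversion: characters outside Latin-1
# cannot be represented and are replaced with "?".
narrowed = name.encode("latin-1", errors="replace").decode("latin-1")

print(same == name, garbled == name, narrowed)
```

The garbled case is recoverable as long as the original bytes are intact (re-encode as Latin-1, then decode as UTF-8), whereas the narrowed case has already lost the characters.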