After using the new "Image Recognition Tool" for a few days, I think you could improve it:
> by adding the dimensional constraints next to each of the pre-trained models,
> by adding a proper tool to split the training data correctly (in order to have an equivalent number of images for each of the labels),
> at the very least, by allowing the tool to use black & white images (I wanted to test it on MNIST, but the tool tells me that it requires RGB images).
Question: will you allow the user to choose between CPU and GPU usage in the future?
In any case, thank you again for this new tool. It can certainly be improved, but it is very simple to use, and I sincerely think that it will help a greater number of people understand the many use cases made possible by image recognition.
Currently, when a Unique tool is used and a field is removed upstream, the workflow fails. If you have only one or two Unique tools in use, it is no big deal, but in a very complex workflow you have to click into each one of those tools in order to update it. This can be very problematic and takes a lot of time, because you have to follow all the branches that are connected after the first Unique tool. My suggestion is to make this a warning instead of a failure, or to have an option to select failure or warning, the way the Union tool is set up. This way people can decide how they want this tool to react when fields are removed.
In the tools that embed the "Rename" option (Select, Append Fields, Join, Join Multiple), copying the new name copies the entire field configuration: tick/untick, original field name, type, size, new name and description.
Renaming the field "Rename_Field"
In my opinion, it should copy only the new name. This would be useful, especially because when you change the name of a field, it isn't automatically changed in subsequent tools, so copying it to replace it in those tools is faster than retyping it every time.
We all know that != is the Alteryx operator for inequality. However, I suggest implementing <> as another operator for inequality. Why?
<> is a very common operator in most languages and tools, such as SQL, Qlik or Tableau. It is far more intuitive than !=, and it would help interoperability and the copy/paste of expressions between tools, or between in-database mode and in-memory mode.
A very useful and common function
Return the first non-null value in a list:
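As a rough illustration of the behaviour being requested (SQL calls this COALESCE), here is a minimal Python sketch. The name `coalesce` and the use of `None` as the null value are assumptions made for the example, not an existing Alteryx function.

```python
from typing import Any, Optional


def coalesce(*values: Any) -> Optional[Any]:
    """Return the first argument that is not None, or None if there is none.

    Hypothetical helper: only None counts as null, so 0 and "" are returned as-is.
    """
    for value in values:
        if value is not None:
            return value
    return None


first_value = coalesce(None, None, "fallback", "ignored")
print(first_value)
```

Note that only true nulls are skipped here; whether empty strings should also be skipped is a design choice the idea does not specify.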
No existing tool outputs all records that are duplicates (those sharing the selected values with at least one other record) and also outputs the records that are not duplicates (those not sharing the selected values with any other record).
The Unique Tool is not sufficient. It only provides the first record of a unique duplicate group along with any non-duplicates and then provides a secondary output that only contains the additional records of a duplicate group. Sometimes you only care about the duplicates and want to quickly see what differs between the unique groups.
For example, if there are 4 records with the City of Austin and I am looking for duplicates on City I want to see all 4 records with Austin in the output so I can quickly compare additional fields to see what might differ, or if they are all indeed truly duplicates.
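The requested behaviour could be sketched in Python roughly as follows. The function name `split_duplicates`, the field names and the sample records are all hypothetical and only serve to illustrate the two outputs described above.

```python
from collections import Counter


def split_duplicates(records, key_fields):
    """Split records into (duplicates, non_duplicates) based on key_fields.

    A record is a duplicate if at least one other record shares its key values.
    Every member of a duplicate group is kept, including the first one.
    """
    def key_of(record):
        return tuple(record[field] for field in key_fields)

    counts = Counter(key_of(record) for record in records)
    duplicates = [r for r in records if counts[key_of(r)] > 1]
    non_duplicates = [r for r in records if counts[key_of(r)] == 1]
    return duplicates, non_duplicates


# Hypothetical sample data: four Austin records and one Dallas record.
records = [
    {"City": "Austin", "Name": "A"},
    {"City": "Austin", "Name": "B"},
    {"City": "Dallas", "Name": "C"},
    {"City": "Austin", "Name": "D"},
    {"City": "Austin", "Name": "E"},
]
duplicates, non_duplicates = split_duplicates(records, ["City"])
```

With this sample, all four Austin records land in `duplicates`, so their other fields can be compared side by side, and the Dallas record lands in `non_duplicates`.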
As of today, DCM is great for storing credentials. But once we want to go deeper technically, for example by using macros or applications, it's really bad. One of the things I hate is that we can't retrieve any information from the DCM connection, just the id. That is not good for logs, and really bad for understanding what is going on or for building conditional logic based on the connection type or name.
Here is an example:
Nice, I managed to retrieve an id, but I have no idea what it means: what kind of connection is it? What is its name?
Sometimes I want to set up a filter to compare the values in two fields in my data set. The basic filter option would be much more powerful and configuration would be quicker if this option allowed this.
For example, currently I must use a custom filter to check if Field1 and Field2 are equal:
I would love to have the option to either use a static value in the basic filter (as you can now) or select a field name from a dropdown:
Currently there is a function in Alteryx called FindString() that finds the first occurrence of your target in a string. However, sometimes we want to find the nth occurrence of our target in a string.
FindString("Hello World", "o") returns 4 as the 0-indexed count of characters until the first "o" in the string. But what if we want to find the location of the second "o" in the text? This gets messy with nested find statements and unworkable beyond looking for the second or third instance of something.
I would like a function added such that
FindNth("Hello World", "o", 2) would return 7 as the 0-indexed count of characters until the second instance of "o" in my string.
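To make the proposed semantics concrete, here is a minimal Python sketch. `find_nth` is a hypothetical stand-in for the suggested FindNth function; returning -1 when there are fewer than n occurrences is an assumption, as is counting overlapping matches.

```python
def find_nth(text: str, target: str, n: int) -> int:
    """Return the 0-indexed position of the n-th (1-based) occurrence of target.

    Returns -1 if target is empty, n < 1, or there are fewer than n occurrences
    (an assumed convention, not confirmed Alteryx behaviour).
    """
    if n < 1 or not target:
        return -1
    position = -1
    for _ in range(n):
        # Resume the search one character after the previous match.
        position = text.find(target, position + 1)
        if position == -1:
            return -1
    return position


print(find_nth("Hello World", "o", 1))  # 4
print(find_nth("Hello World", "o", 2))  # 7
```

A single loop like this replaces the nested find calls and works for any n.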
I am just making a quick suggestion, specifically for the Formula tool within Alteryx.
Often when I am working on a larger workflow - I will end up optimising the workflow towards the end. I typically end up removing unnecessary tools, fields, and rethinking my logic.
Much of this optimisation is also merging formula tools where possible. For instance, if I have 3 formulas, it's much cleaner (and, I would suspect, faster) to have these all within one tool. For instance, a scaled down example:
to this:
This requires a lot of copy and paste. If the formulas or column names are long, this can mean two copy-and-paste operations per formula, plus waiting for tools to load between them. (I do appreciate that this sounds like an incredibly small problem to have, but on what I would consider a large workflow, a tool can take a couple of seconds to load, and this can burn some time. Additionally, copy/pasting or retyping always carries the risk of introducing errors.)
My proposed solution to this is the ability to drag a formula onto another, very similar to dragging a tool onto a connection. This integration would look like:
Drag to the first formula:
Formula has been appended to the formula tool:
I think this will help people visually optimise their workflows!
I think that there are a lot of formulas that would be easier to write and maintain if a SQL-style BETWEEN operator were available.
Essentially, you could turn this:
ToNumber([Postal Code]) > 1000 AND ToNumber([Postal Code]) < 2500
Into this:
ToNumber([Postal Code]) BETWEEN 1000 AND 2500
That way, if you later had to modify the ToNumber([Postal Code]) part, you would only have to change it in one place. It's both aesthetically pleasing and more maintainable!
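One detail worth settling is whether the bounds are included: SQL's BETWEEN is inclusive, whereas the > and < version above excludes 1000 and 2500. The Python sketch below shows both readings; the helper name `between` and its `inclusive` flag are hypothetical.

```python
def between(value, low, high, inclusive=True):
    """Check whether value lies between low and high.

    inclusive=True mirrors SQL's BETWEEN (low <= value <= high);
    inclusive=False mirrors the strict > / < comparison in the original formula.
    """
    if inclusive:
        return low <= value <= high
    return low < value < high


postal_code = int("1000")  # the expression is written once, then reused
print(between(postal_code, 1000, 2500))                   # True
print(between(postal_code, 1000, 2500, inclusive=False))  # False
```

Either way, the value being tested is evaluated and written only once.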
I'm currently learning the Python language and there is this cool feature: you can multiply a string.
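For readers who have not seen it, this is how string multiplication behaves in Python. The strings used here are illustrative only.

```python
# Multiplying a string by an integer repeats it that many times.
repeated = "ab" * 3
separator = "-" * 10

print(repeated)   # ababab
print(separator)  # ----------

# Multiplying by zero (or a negative number) gives an empty string.
empty = "ab" * 0
```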
Pretty cool, no? I would like the same syntax to work for Tableau.
Similar to this idea, I think it would be really helpful to be able to search for fields in the dropdowns when using the Sort tool. Having to scroll through all of the possible field names can be a chore if you have 50+ of them.
The Formula Tool does a good job of autocompleting expressions (for example an open square bracket will show you variables in your dataset), as well as syntax highlighting (coloring variables, keywords, strings, etc).
I propose having this feature available in all tools that use the expression editor, particularly common ones such as the Multi-Row Formula Tool and the Multi-Field Formula Tool.
This parity across tools would provide a more consistent experience for the user and increase one's productivity using these tools. It's incredibly helpful for beginners and seasoned Alteryx users alike and should be available wherever possible.
The basic premise is this:
Phantom spacing. Basically, something that looks like it has leading spaces in Excel but is actually formatted as an indentation.
Unfortunately, to read the indentation we will need either a VBA prep or read the XML inside. The latter of which is difficult.
As to VBA, the general steps are to create an indentation formula in order to see the numbers, then go from there. The idea is credited to @clmc9601 as we discussed privately.
As of now, I do not see any way to do this in Alteryx as a function or even an expression. It would be very helpful, especially when reading trial balances or even Bloomberg outputs, as they are formatted with indentation.
Reading indentation from Excel or any other file within Alteryx would be much appreciated, especially in the actuarial and finance spaces.
Toggle individual expressions on/off in the formula tool.
On more than a few occasions I have a number of expressions in a single formula tool and find myself wanting to turn off a few or many, but not all.
It'd be great if there was a checkbox to activate/inactivate : on/off : include/exclude : select/deselect (whatever language you like for the concept) an individual expression.
As simple as a text box, with maybe a "select/deselect all" box available in case you want to turn most off and then select only a single one?
I surprisingly couldn't find this anywhere else as I know it's been discussed in person on many occasions.
Basically the Formula tool needs to be smarter in many ways, but this particular post focuses on the Data Type component.
The Formula tool should not always default to V_String as the data type when you enter data or a formula. Instead, it should look at the data and estimate the most likely type.
I know there are times where the logical type might not be consistent in all fields, but the Data Preview and the Function of the formula should be used to determine the most likely option.
E.g. if I type a number or a date directly into the Formula tool, then Alteryx should be smart enough to change the data type from the standard V_String to Int, Double or Date.
This is an extension to the ideas posted here:
I often need to create a record ID that automatically increments but is grouped by a specific field. I currently do it using the Multi-Row Formula tool with [Row-1:ID]+1 because there is no group by option in the Record ID tool.
Also, sometimes I need to start at 0 but the Multi-Row Formula tool doesn't allow this so I have to use a Formula tool right after to subtract 1.
So adding a group by option to the Record ID tool would let the user avoid the Multi-Row Formula workaround and start at any value wanted.
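The requested behaviour, a counter that restarts for each group and begins at a configurable value, can be sketched in Python as follows. The function name `grouped_record_ids`, the output field `RecordID` and the sample data are hypothetical.

```python
from collections import defaultdict


def grouped_record_ids(records, group_field, start=1):
    """Return copies of records with a RecordID that increments within each group.

    Each distinct value of group_field gets its own counter beginning at start.
    Input order is preserved; the records do not need to be sorted by group.
    """
    next_id = defaultdict(lambda: start)
    result = []
    for record in records:
        group = record[group_field]
        result.append({**record, "RecordID": next_id[group]})
        next_id[group] += 1
    return result


# Hypothetical sample data, with IDs starting at 0 for each region.
rows = [{"Region": "East"}, {"Region": "West"}, {"Region": "East"}]
numbered = grouped_record_ids(rows, "Region", start=0)
```

Here the two East rows receive 0 and 1 and the West row receives 0, with no follow-up step needed to shift the starting value.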
Is it possible to add sort functionality to the Sample tool in Designer, similar to the 'Sample Based on Order' functionality in the Sample tool in Designer Cloud? This would cut down on the Sort + Sample tool combo in Designer!
Enhance the 'IN' functionality (e.g. in the Filter tool) so that ranges can be used instead of listing every individual value. For example:
instead of [ID] IN (1,2,3,52,53,54,100,101,102), something like [ID] IN (1-3,52-54,100-102).
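A minimal Python sketch of how such a range list could be interpreted is shown below. The helper `expand_ranges` is hypothetical, assumes inclusive ranges of non-negative integers, and is not an existing Alteryx feature.

```python
from typing import Set


def expand_ranges(spec: str) -> Set[int]:
    """Expand a spec such as "1-3,52-54,100" into the set of integers it covers.

    Assumptions: ranges are inclusive and values are non-negative integers,
    so a hyphen always separates the low and high bounds.
    """
    values: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            values.update(range(int(low), int(high) + 1))
        else:
            values.add(int(part))
    return values


allowed = expand_ranges("1-3,52-54,100-102")
print(53 in allowed)  # True
print(55 in allowed)  # False
```

A real implementation would need a different separator or extra parsing to support negative numbers, since the hyphen doubles as the minus sign.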
This may be a very simple thing, but would it be possible to add a DateTimeQuarter() function? We have DateTime Second, Minute, Day, Month, and Year, and being able to have an easy formula for the quarter as well would be incredibly convenient.
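The calculation itself is small, as this Python sketch shows. `datetime_quarter` is a hypothetical stand-in for the proposed DateTimeQuarter() function and assumes calendar quarters (January to March is quarter 1); fiscal quarters would need an offset.

```python
from datetime import date


def datetime_quarter(value: date) -> int:
    """Return the calendar quarter (1 to 4) of a date or datetime."""
    return (value.month - 1) // 3 + 1


print(datetime_quarter(date(2024, 3, 31)))  # 1
print(datetime_quarter(date(2024, 4, 1)))   # 2
```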