Surprisingly, I couldn't find this anywhere else, even though I know it's been discussed in person on many occasions.
Basically the Formula tool needs to be smarter in many ways, but this particular post focuses on the Data Type component.
The Formula tool should not always default to V_String as the data type when you enter data or a formula; it should look at the data and estimate the most likely type.
I know there are times where the logical type might not be consistent in all fields, but the Data Preview and the Function of the formula should be used to determine the most likely option.
E.g., if I type a number or a date directly into the Formula tool, then Alteryx should be smart enough to change the data type from the standard V_String to Int, Double or Date.
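To make the request concrete, here is a rough sketch in Python of the kind of guess the tool could make for a typed literal. It is not how Alteryx works internally; the function name `guess_type`, the choice of Int64 as the integer type, and the single ISO date format are all assumptions made for illustration.

```python
import re
from datetime import datetime

def guess_type(literal: str) -> str:
    """Guess an Alteryx-style type name for a literal typed into a formula.

    Hypothetical helper: falls back to V_String when nothing narrower fits.
    """
    text = literal.strip()
    # Whole numbers are checked first so "42" is not widened to Double.
    if re.fullmatch(r"[+-]?[0-9]+", text):
        return "Int64"
    # Decimal numbers with an optional exponent.
    if re.fullmatch(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?", text):
        return "Double"
    # Only yyyy-mm-dd is recognised here; other date formats stay strings.
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return "Date"
    except ValueError:
        return "V_String"
```

A real implementation would also need to look at the Data Preview and at the return type of the functions used, as described above; this only covers literals.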
This is an extension to the ideas posted here:
I often need to create a record ID that automatically increments but grouped by a specific field. I currently do it using the Multi-Row Formula tool doing [Field-1:ID]+1 because there is no group by option in the Record ID tool.
Also, sometimes I need to start at 0 but the Multi-Row Formula tool doesn't allow this so I have to use a Formula tool right after to subtract 1.
So adding a group by option to the Record ID tool would remove the need for the Multi-Row Formula tool here and let the user start at any value.
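For anyone unsure what the requested behaviour is, it can be sketched in Python as below. The function name, the `start` parameter and the `RecordID` field name are made up for the example; rows are represented as plain dictionaries.

```python
from collections import defaultdict

def add_grouped_record_id(rows, group_field, start=1, id_field="RecordID"):
    """Return copies of rows with an ID that restarts for each group value."""
    # Each group gets its own counter, beginning at the chosen start value.
    next_id = defaultdict(lambda: start)
    numbered = []
    for row in rows:
        key = row[group_field]
        numbered.append({**row, id_field: next_id[key]})
        next_id[key] += 1
    return numbered
```

Calling it with `start=0` gives zero-based IDs per group without the extra Formula tool that subtracts 1.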
When writing an expression in a Formula tool, I love that you can just type an open bracket and suggestions pop up that allow you to auto-fill the rest of the variable name. What I find frustrating, however, is that once you type the open bracket, the highlighted field automatically moves to the one where your mouse is pointing, regardless of whether you have moved your mouse or not. I think it makes more sense to always highlight the first field in the list and only take mouse position into account once it has actually moved.
It is hard to describe in just a picture as opposed to a video but essentially I had my mouse below where I was typing in the screenshot below then when I typed the open bracket, the 3rd field listed automatically got selected even though I never moved my mouse.
The Select tool does a great job of flagging when something has changed from its original state. However, why does this not happen with the checkboxes to keep or remove a field? It would be much faster and easier to read if we could have the same color conditional formatting as the rest.
When we edit a Formula tool, only the first expression is expanded. I would prefer all expressions to be expanded by default. When I want to collapse them, I would like an 'expand all' icon like the one in the attached snapshot. This icon would toggle the same way as each expression's expand icon ('expand all' <-> 'shrink all').
There is no straightforward way to know if a string is lower, upper or title case. A workaround, such as the Contains function or the REGEX functions, has to be used.
The creation of the following functions would make this easier:
- IsTitleCase(String) : tests if a string is in TitleCase
- IsUpperCase(String) : tests if a string is in UpperCase
- IsLowerCase(String) : tests if a string is in LowerCase
They would all return a boolean and be in the Test category.
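As a reference for the expected behaviour, Python's built-in string tests are a close analogue. This is only a sketch: the three function names mirror the proposal, and Python's definition of title case (every word starts with an uppercase letter followed by lowercase letters) may not match what Alteryx's TitleCase function produces in every case.

```python
def is_upper_case(text: str) -> bool:
    """True when every cased character is uppercase (and at least one exists)."""
    return text.isupper()

def is_lower_case(text: str) -> bool:
    """True when every cased character is lowercase (and at least one exists)."""
    return text.islower()

def is_title_case(text: str) -> bool:
    """True when each word starts uppercase and continues in lowercase."""
    return text.istitle()
```

Note that a string with no letters at all, such as "123", returns False for all three tests here, which is a design choice the real functions would need to settle.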
Lately I've used the 'Add Prefix to Field Names' option in the Select tool. It works great, however when you click the button to add a prefix, the new window pops up and the focus is on the checkbox. I think when this box pops up, the focus should be in the text box so the user can start typing right after they click the button. This is the same case for the Add Suffix option, too.
The Remove Null Rows feature added to the Data Cleansing tool is really nice, however it doesn't work for a common use case for us where we have key metadata field(s) added to the data stream that make rows not null so we'd like to be able to ignore or exclude one or more fields from the Remove Null Rows output.
Here's a use case starting with an Excel file with multiple tabs where each tab holds the records for a different Province:
Note that the 2nd record in Southern is entirely empty, so this is the record that we'd like to remove using the Data Cleansing tool.
Since the Province name is only in the worksheet name (and not in the data) I'm using a Dynamic Input tool with the "Output File Name as Field" to include the worksheet name so I can parse it out later. So the output of the Dynamic Input looks like this:
With the FileName field populated the entire row is not Null and therefore the Remove Null Rows feature of the Data Cleansing tool fails to remove that record:
Therefore what we'd like is when we're using the Remove null rows feature in the Data Cleansing tool to be able to choose field(s) to ignore or exclude from that evaluation. For example in the above use case we might tick the "FileName" checkbox to exclude it and then that 2nd row in Southern would be removed from the data.
There are workarounds to use a series of other tools (for example multi-field formula + filter + select) to do this, so extending the Data Cleansing tool to support this feature is a nice to have.
I've attached the sample packaged workflow used to create this example.
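The requested behaviour can be sketched in Python as follows. This is an illustration only, not the Data Cleansing tool's actual logic: the function name and the `ignore_fields` parameter are hypothetical, rows are plain dictionaries, and only `None` is treated as null.

```python
def remove_null_rows(rows, ignore_fields=()):
    """Drop rows where every field is null, not counting ignored fields."""
    ignored = set(ignore_fields)

    def is_all_null(row):
        # Only fields outside the ignore list take part in the null test.
        return all(value is None
                   for name, value in row.items()
                   if name not in ignored)

    return [row for row in rows if not is_all_null(row)]
```

With `ignore_fields=["FileName"]`, a row whose only populated field is FileName is removed, which is the outcome wanted for the empty record in Southern.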
It would be useful to be able to select a single container (containing a data input) or multiple containers using Shift, and run those and only those.
When building a new element of a larger workflow, I often add a new Input in a new container. The ability to run just that container without having to turn off all my other containers would be really useful in speeding up the start of joining things together.
Hope that makes sense.
My friend @jdunkerley79 posted a terse idea: https://community.alteryx.com/t5/Alteryx-Designer-Ideas/FieldName-constant-in-Generate-Rows-Tools-an... it is inactive, but I want to extend his thoughts.
Rephrasing his idea as mine: The tool defaults the expressions to use [RowCount]. If you should either "Update Existing Field" or change the "Create New Field" the default expressions MUST be updated manually. Please update the expressions to make use of the new field.
Well, that doesn't always work! Often it will. But if you change the TYPE to date, it certainly won't. In fact, I see many questions about joining from within a DATE RANGE and the technique to build date rows from the range requires the use of DateTimeAdd(). Wouldn't it be nice (like your sample workflows) to modify the default expression based on the change of data type? I think so.
If we were thinking easy, suppose you could have a RANGE function (dates or numerics) where you simply selected the from and to fields and gave the user the option to select the units. The tool would then auto-configure itself to create all of the "days" between the from and to dates, or, with a unit of "1.0", all unit values between the two numeric amounts.
These would be "Alteryx" worthy enhancements in my opinion.
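Here is a small Python sketch of what such a range option would generate for dates. The function name `date_rows` and the `step_days` parameter are made up for illustration; in Generate Rows today the equivalent loop expression would be built with DateTimeAdd().

```python
from datetime import date, timedelta

def date_rows(start: date, end: date, step_days: int = 1):
    """Yield every date from start to end inclusive, step_days apart."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    current = start
    while current <= end:
        yield current
        current += timedelta(days=step_days)
```

A numeric range would follow the same pattern with a numeric step in place of the timedelta.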
This is a pretty quick suggestion:
I think that there are a lot of formulas that would be easier to write and maintain if a SQL-style BETWEEN operator was available.
Essentially, you could turn this:
ToNumber([Postal Code]) > 1000 AND ToNumber([Postal Code]) < 2500
into this:
ToNumber([Postal Code]) BETWEEN 1000 AND 2500
That way, if you later had to modify the ToNumber([Postal Code]), you only have to maintain it once. It's both aesthetically pleasing and more maintainable!
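One detail worth settling if this were implemented: in SQL, BETWEEN includes both endpoints, while a comparison written with > and < excludes them. The Python sketch below shows the difference; the function names are made up for the example.

```python
def between(value, low, high):
    """SQL-style BETWEEN: both endpoints are included."""
    return low <= value <= high

def strictly_between(value, low, high):
    """Equivalent of 'value > low AND value < high': endpoints excluded."""
    return low < value < high
```

For a postal code of exactly 1000, `between` returns True and `strictly_between` returns False, so the two forms only agree away from the boundaries.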
With the 2019.3 release the summarize tool now includes prefixes for grouped fields. While a nice addition, in application it makes using this data downstream (like joining to other tables) more involved because of needing to remove this prefix.
It would be nice to have this as an option (a checkbox to add/remove prefixes maybe) or just revert back to pre-2019.3 behavior...thanks!
During development it would be helpful to be able to do the following in both Formula and Filter tools (and perhaps any other tool that uses custom code):
1) Highlight a line or block of code
2) Right click
3) Choose an option to comment or uncomment the selection
This would be easier than manually typing or deleting "//" on every line.
Thanks in advance!
Any Python user will tell you that one of the reasons why Python is so powerful is the ability to access values using their indexes. It would be great if Alteryx had such a system in place too, where you can access values or loop over them using their index, which can then be applied in creating new columns or calculations.
P.S - I know we can use the python tool but I would rather see this ability built in the formula tool.
The Multi-Field formula tool has three really powerful features that it supports:
These are really powerful within Multi-Field formulas because they allow for a dynamic process to apply across multiple fields.
However, they would also be very helpful in regular formulas and Multi-Row formulas, for code transportability.
A basic example: I have a Longitude field that is a string. I need to set it to a value of 0 if there is a null value.
My formula today:
IF ISNULL([Longitude]) THEN 0 ELSE [Longitude] ENDIF
Now let's say I want to use the same formula somewhere else, but for Latitude instead.
That formula looks like:
IF ISNULL([Latitude]) THEN 0 ELSE [Latitude] ENDIF
If I could use [_CurrentField_] instead, that would allow me to instead write both formulas as:
IF ISNULL([_CurrentField_]) THEN 0 ELSE [_CurrentField_] ENDIF
This code can easily be copied for any field that requires replacing Nulls with 0s, and doesn't require refactoring to use a Multi-Field formula instead.
This also means that if I later change my field name, the code will remain consistent. This not only speeds up development time and flexibility, but more readily allows for validation that the existing code has not changed.
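The idea is the same as writing one function and passing it whichever field you like. A rough Python sketch, where `null_to_zero` plays the role of the [_CurrentField_] expression and `apply_to_fields` is a hypothetical helper standing in for the tool:

```python
def null_to_zero(current_field):
    """Same intent as: IF ISNULL([_CurrentField_]) THEN 0 ELSE [_CurrentField_] ENDIF"""
    return 0 if current_field is None else current_field

def apply_to_fields(row, field_names, expression):
    """Apply one expression to the chosen fields, leaving other fields untouched."""
    return {name: (expression(value) if name in field_names else value)
            for name, value in row.items()}
```

The expression never mentions Longitude or Latitude by name, so it can be copied between tools or survive a field rename unchanged.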
Often as I am scraping web sites, some clever developer has put an invisible character (ASCII or Unicode) in the data which causes terrible trouble.
I've identified 89 instances of zero-width or non-zero-width glyphs that are not visible and/or Alteryx does not classify as whitespace. There are probably more, but Unicode is big y'all.
Unfortunately, the Trim() string function only removes 4 of these characters (Tab, Newline, Carriage Return, and Space).
REGEX_REPLACE with the \s option (which is what the Cleanse macro uses) is a little better but still only removes 20. And it removes all instances, not just leading and trailing.
I've attached a workflow which proves this issue.
@APolly: this is what I mentioned at GKO.
And I did see this post (https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Elegantly-remove-all-ASCII-characters-...), but it's too brute force. Especially as Alteryx is localized and more users need those Unicode characters.
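As a sketch of a less brute-force approach, the Python below strips only leading and trailing characters that are whitespace or fall in the Unicode control, format or separator categories, and leaves everything in the middle of the string alone. The function names and the choice of categories are my assumptions; this is not a list of the 89 characters mentioned above, and some invisible glyphs sit in other categories and would not be caught.

```python
import unicodedata

# Unicode general categories treated as invisible for trimming purposes.
# Cc = control, Cf = format (zero-width space, BOM, joiners), Z* = separators.
_INVISIBLE_CATEGORIES = {"Cc", "Cf", "Zs", "Zl", "Zp"}

def _is_invisible(ch: str) -> bool:
    return ch.isspace() or unicodedata.category(ch) in _INVISIBLE_CATEGORIES

def trim_invisible(text: str) -> str:
    """Remove invisible characters from both ends of text only."""
    start, end = 0, len(text)
    while start < end and _is_invisible(text[start]):
        start += 1
    while end > start and _is_invisible(text[end - 1]):
        end -= 1
    return text[start:end]
```

Because interior characters are untouched, legitimate non-ASCII text in localized data is preserved.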
When we create new workflows, we like to have them in our company template, to standardise documentation. This makes it easier for a supervisor to review, and for a colleague to pick up the workflow and understand what is going on. For instance, we have all data input on the left, and all error checks and workflow validation on the right, and a section at the top with the workflow name, project name, purpose etc. We have a workflow that we use as a template with containers, boxes and images all in the appropriate places.
It would be great if there was an option to select a workflow as a template. When a new workflow is opened, it would load this template rather than having a blank canvas.
Sometimes formulas get pretty long. There are cases of deeply nested conditionals, concatenation of long strings, cases where multiple casts and parses are used, etc. where formulas get pretty large and unwieldy. The current system of wrapping lines and managing the size of the properties pane can be a hassle, especially if you are trying to use any sort of whitespace formatting to make the formulas more readable.
My solution to this is pretty simple: add a pop-out window for formulas. It could be a context menu option from right-clicking the formula box itself, a button on the bar at the top of each formula, or any number of other things.
A really good example of this is MS Access. You can right-click any text box that takes an expression and open it in the expression editor pop-up window. The current system is more like Excel, where you're stuck with whatever box size you're given.
Working in the accounting department, this has come up too many times now to ignore!
Would LOVE LOVE LOVE to see a new formula available in the DateTime formula suite that mimics the function of the EOMONTH() formula when working with dates in Excel.
The beauty of the EOMONTH() formula in Excel is that I can just give it a date, and then tell it how many months in the future or past I would like it to add/subtract... Alternatively, in Alteryx, this can require 2 or 3 nested DateTime functions to arrive at the same answer.
Example: To find the end of the month 2 months in the future from today's date, I would use the following formula...
Excel = EOMONTH(Today(),2)
Alteryx = DateTimeAdd(DateTimeAdd(DateTimeTrim(DateTimeToday(),"month"),3,"months"),-1,"days")
Seems much more complicated than it needs to be in Alteryx, and easy to get lost in the nested formulas & non-intuitive adding/subtracting of months/days! I can see a new formula (something like DateTimeEOMonth?) being structured as follows in Alteryx: DateTimeEOMonth([Field],increment)
Please consider! Our accounting department thanks you heartily in advance... 🙂
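For reference, the behaviour being asked for can be sketched in a few lines of Python. The function name `eomonth` simply borrows Excel's name; nothing here is an existing Alteryx function.

```python
import calendar
from datetime import date

def eomonth(start: date, months: int) -> date:
    """Return the last day of the month that is `months` away from start."""
    # Count months from year 0 so that adding or subtracting rolls over years.
    month_index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)
```

For example, `eomonth(date(2019, 11, 15), 2)` gives 2020-01-31, and a negative increment moves backwards in the same way.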
I'm sure there's a reason behind it, but can we please be allowed to run calculations on null values in a Formula tool? Right now, if we sum three values (1 + 3 + [null]), it produces [null]. Can the Formula tool just ignore the null values? The only way around this is to fill the [null] cells with a value, and that adds an additional step to what should be a fairly straightforward process. That value would have to be different for a multiplication formula vs. an addition formula in order not to change the answer materially, whereas ignoring the value is a more consistent solution.
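To illustrate the requested behaviour, here is a minimal Python sketch with null represented as `None`. The function names are made up for the example. Skipping nulls is equivalent to filling them with 0 for addition and 1 for multiplication, which is exactly why a single fill value cannot serve both.

```python
import math

def sum_ignoring_nulls(*values):
    """Add the values, skipping nulls (None). Returns 0 if all are null."""
    return sum(v for v in values if v is not None)

def product_ignoring_nulls(*values):
    """Multiply the values, skipping nulls (None). Returns 1 if all are null."""
    return math.prod(v for v in values if v is not None)
```

What to return when every input is null (0 and 1 here, or null) is a design decision the real feature would have to make.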