This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Often as I am scraping web sites, some clever developer has put an invisible character (ASCII or Unicode) in the data which causes terrible trouble.
I've identified 89 instances of zero-width or non-zero-width glyphs that are not visible and/or Alteryx does not classify as whitespace. There are probably more, but Unicode is big y'all.
Unfortunately, the Trim() string function only removes 4 of these characters (Tab, Newline, Carriage Feed, and Space). REGEX_REPLACE with the \s option (which is what the Cleanse macro uses) is a little better but still only removes 20. And it removes all instances, not just leading and trailing.
Sometimes formulas get pretty long. There are cases of deeply nested conditionals, concatenation of long strings, cases where multiple casts and parses are used, etc. where formulas get pretty large and unwieldy. The current system of wrapping lines and managing the size of the properties pane can be a hassle, especially if you are trying to use any sort of whitespace formatting to make the formulas more readable.
My solution is this is pretty simple, add a pop-out window for formulas. It could be a context menu option from right-clicking the formula box itself, a button on the bar at the top of each formula, or any number of other things.
A really good example of this is MS Access. You can right-click any text box that takes an expression and open it in the expression editor pop-up window. The current system is more like excel where you're stuck with whatever box size you're given.
Working in the accounting department, this has come up too many times now to ignore!
Would LOVE LOVE LOVE to see a new formula available in the DateTime formula suite that mimics the function of the EOMONTH() formula when working with dates in Excel.
The beauty of the EOMONTH() formula in Excel is that I can just give it a date, and then tell it how many months in the future or past I would like it to add/subtract... Alternatively, in Alteryx, this can require 2 or 3 nested DateTime functions to arrive at the same answer.
Example: To find the end of the month 2 months in the future from today's date, I would use the following formula...
Seems much more complicated than it needs to be in Alteryx, and easy to get lost in the nested formulas & non-intuitive adding/subtracting of months/days! I can see a new formula (something like DateTimeEOMonth?) being structured as follows in Alteryx: DateTimeEOMonth([Field],increment)
Please consider! Our accounting department thanks you heartily in advance... 🙂
Currently zero-width spaces are not removed by the data cleanse tool. This means you can have two visually similar words like Dog and Dog but one is actually longer than the other due to zero width characters. The removal of these zero-width spaces can be added with the formula.
I saw there are number of posts from the community asking for solution to calculate the NetWorkDays (e.g. similar to the networkdays in excel which to calculate the number of days different between the two days excluding weekend and holidays.)
Although we could build a macro for it, the performance is not ideal, especially when the data set is huge and/or the date range required is far apart from each other because there is currently NO a build-in function in Alteryx. Alteryx will have to expand the date range by date and check whether each is a weekend or holiday. It will an excellent idea if a build-in function for Networkdays could be built to minimize this hassle from everyone around the world.
We are looking forward this idea could be take forward.
The join tool currently does not allow case-insensitive joins, but the find/replace tool does. Additionally- even if both sides are identical, the join tool will not join "Sean's house" to "Sean's house" because of the non-letter character in the middle. Finally - if one side is a string(2), and the other is a vString(200) - even if you have a single identical character on both sides you get uncertain outcomes unless you force the type
Please could you consider amending the join tool to include 3 new options or capabilities:
- Case insensitive join
- Allow full Unicode character set in join
- Full match across text types (irrespective of string size) - this would allow a string(2) value to match to a string(100) value as long as the string(100) value only has the same 2 characters in it as the string(2) value
That would remove a load of work from every text-join that's being done on every canvas we do.
This is similar to a prior idea now marked complete "Allow macro metadata to persist until next run". I tried the check box solution and still have the same issue, running V11.
What we NEED is for tools that derive columns like CrossTab to retain metadata from the most recent run and thus pass that metadata downstream for further tools and development.
I have several cross tabs and before V11 I could run the flow once to push metadata downstream, then add or modify tools downstream and the derived fields from the cross tab stayed available in those tools to be recognized and referenced as I add more tools and logic. Now in V11 I am finding if I click on a tool or add a tool downstream the metadata for the derived columns disappears.
I attached pics to illustrate where I have 6 CrossTabs and decided I needed to add a summary downstream. I had to run the flow to get metadata populated which is normal and I added the first summary, then inserted another summary and immediately the derived column metadata was lost in all paths after the crosstabs. so ended up having to re-run the flow 5 more times for each summary tool added. then I had to re-run it 5 more times to adjust column names in selects after downstream joins.
I end up wasting a lot of time having to re-run a sufficient test file to feed all the variety of data necessary to generate all columns between most edits or new tool adds. What used to take ~5 minutes to do now takes ~35
I recall seeing and discussing this issue previously and hoped the check box would resolve but It does not fix the issue.
We see similar issue for tools downstream from other tools where the columns are derived or uncertain until that tool runs, such as, transpose, Joins and Unions. I recall some discussion at user groups and in the community but the only reference I found this morning of seeming relevance is the one I mentioned above.
Idea: I need a function that given two dates, will return the number of business days between them. I need to know the # of business days between when a sales order is placed and when it ships to the customer. I'm in the US, so I would want to not count Saturdays, Sundays, and US Holidays, but I can foresee others wanting the option to change to other calendars or ignore holidays.
There are a couple of posts on this in the community, but everything I've found so far is too laborious to implement or not robust.
The default data type for new columns used to be Double, and now it is V_WString. Is this user-configurable? I find that, at least in my work, needing a formula to output a Double is way more common than a V_WString. I'm curious of other community members' thoughts here. If there isn't a consensus, it would be great to have this be a user-configurable default.
Virtually every client/project I work on, there is a nead to clean up data. Most of the time, that involved standardizing to some existing list of data. However, as we all know, data from differnet systems or being manually collected will not match perfectly in all cases. This is most often when I tend to use the Fuzzy Match tool.
However, I have to use a lot of weird steps to effectively create a "Fuzzy Join", which is something I've done using database functions in the past. I think it would be great if a new tool were created that would do the following:
Accept two inputs, one for the "raw" data and another for the "list" of data to match to.
Perform a fuzzy join based on similar functionality to the fuzzy match, convert data to metaphone keys and then run Jaro/Levenstein matches. By default, return only the highest matching result.
Expand the pre-process functionality to include words to exclude from the analysis (beyond just "and", "the" and "in").
Match on the whole string. No need to try and do joins based on partial words within a string.
This seems like a very common thing (I've created a macro for this anyway) that could be made to be simpler for everyday use.
I have need to Unique my data on all fields. However, I do not know all the fields that will be in my data because of a Dynamic Select tool (or in some cases a Cross Tab tool). The Unique tool defaults to having any new fields be unchecked in its configuration. Without finding some work around, I will be unable to schedule this workflow or turn it into an app for use on the Gallery without risking duplication caused by the Unique tool that fails to capture new/different fields that may come through the data.
To solve this problem and be consistent with other tools in Alteryx, the following features need to be added to the Unique tool:
The user should be able to check a "Dynamic or Unknown Fields" option that will enable Alteryx to include any new fields in the data as part of the Unique (or any fields that have their names changed).
If a field that was checked in the Unique tool before no longer exists, it should appear yellow (like the missing fields in the Select tool) and produce a warning in the Results window. The option to "Forget all missing fields" will also need to be added.
The v10 formula configuration window had two very small advantages. First, it always had an extra 'line' for another output field (no pressing '+' required). Second, it defaulted to letting you immediately begin typing the name of the next column (no need to press 'Select Column' then 'Add New Column'). I know these are minor, but every little thing counts when you're doing heavy development.
It has been brought up that the following comments were given during the beta. While I appreciate the reasoning of requiring 'obvious intention,' my personal opinion is that it is overkill in this scenario. Even for new users, the old design was quite intuitive.
"Thanks for taking the time to provide feedback! This touches a conversation topic that has been ongoing here at Alteryx. While we want workflow development to be as fast as possible, we also are trying to address the overall usability of the tool and make sure it is very clear what we intend the user to do. We decided to have the UI ask for an explicit action (pick an existing field to edit or click to add a new field) to help make those options clear, as we have found that users don't always understand from the existing tool that this is the first decision they should make when using the tool. That being said, your feedback is definitely valuable. I will be sure to bring this up as we are making improvements to the new tool and see if there's a compromise that we can make on speed vs. obvious intention. Thanks for taking the time!"