Hello,
After using the new "Image Recognition Tool" for a few days, I think you could improve it:
> by showing the dimensional constraints next to each of the pre-trained models,
> by adding a proper tool to split the training data correctly (in order to have an equivalent number of images for each label),
> at the very least, by allowing the tool to use black & white images (I wanted to test it on MNIST, but the tool tells me that it requires RGB images).
Question: will you allow the user to choose between CPU and GPU usage in the future?
In any case, thank you again for this new tool. It certainly has room for improvement, but it is very simple to use, and I sincerely think that it will allow a greater number of people to understand the many use cases made possible by image recognition.
Thank you again
Kévin VANCAPPEL (France ;-))
When building API calls within Alteryx, there are a few common steps required:
1) Building out the URI for the API call (base URL plus any query parameters)
2) Dealing with authentication; for example, basic authentication requires taking a key and secret, Base64 encoding them and passing the result into the tool
3) Parsing the results out and processing these downstream
For this idea I am specifically focusing on step 3 (but it would be great to have common authentication methods in-built within the download tool (step 2)!).
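For readers who think in code, steps 1 and 2 can be sketched in Python roughly as below. This is only an illustration using the standard library; the base URL, query parameters, key and secret are placeholders I made up, not values from any real API.

```python
import base64
from urllib.parse import urlencode


def build_uri(base_url: str, params: dict) -> str:
    """Step 1: append URL-encoded query parameters to the base URL."""
    if not params:
        return base_url
    return f"{base_url}?{urlencode(params)}"


def basic_auth_header(key: str, secret: str) -> dict:
    """Step 2: Base64-encode 'key:secret' for HTTP basic authentication."""
    token = base64.b64encode(f"{key}:{secret}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder values for illustration only.
uri = build_uri("https://api.example.com/search", {"q": "alteryx", "items_per_page": 10})
headers = basic_auth_header("my_key", "my_secret")
```

The resulting `uri` and `headers` are what would be handed to whatever makes the HTTP request.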
There are common steps required to parse out the results, such as using Filter (to check for a 200 response), JSON parse, text to columns and then cross tab to get the results into a readable format. These will all be common steps anyone who has worked with APIs will be familiar with:
This is all fine for a regular user to quickly add in and configure these tools. However there is no validation here for the JSON result being as expected, which when embedding an API into a batch macro or analytic app means it can easily fail.
One example of a failure which I've recently come across is where the output JSON doesn't contain all fields (name:value pairs), depending on the JSON response. For example, using the UK Companies House API, when looking at the ceased to act field at this endpoint - https://developer-specs.company-information.service.gov.uk/companies-house-public-data-api/resources... the ceased to act field only appears in the results if a person has actually ceased to act. This is important if you have downstream tools such as a formula to create a field [Active] where you have:
IF ISNull([ceased_to_act]) THEN "Active" ELSE "Ceased to Act" ENDIF
However without modification the macro / app will error if any results are returned where there is not this field.
A workaround is to add in the CReW Ensure Fields macro, or to union on a list of fields, to ensure that the ceased to act field is present in the output for all API calls. But looking at some other tools, it would be good if an expected schema could be built in to the download tool to do this automatically.
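To make the idea of an expected schema concrete, here is a rough Python sketch. It is not how the download tool or the CReW macro works; the field names other than ceased_to_act, the "items" key and the sample response are all invented for illustration.

```python
import json

# Hypothetical expected schema: field name -> default used when the field is absent.
EXPECTED_FIELDS = {"name": None, "appointed_on": None, "ceased_to_act": None}


def apply_schema(record: dict, expected: dict) -> dict:
    """Return a copy of the record with every expected field present."""
    return {**expected, **record}


def status(record: dict) -> str:
    """Mirror of the [Active] formula: a null ceased_to_act means still active."""
    return "Active" if record.get("ceased_to_act") is None else "Ceased to Act"


# Invented sample response in which only one item carries ceased_to_act.
response_text = '{"items": [{"name": "A"}, {"name": "B", "ceased_to_act": true}]}'
rows = [apply_schema(item, EXPECTED_FIELDS) for item in json.loads(response_text)["items"]]
statuses = [status(row) for row in rows]
```

Because every row now carries the ceased_to_act field, a downstream formula never meets a missing column, whatever the API returned.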
For example in Power Automate this is achieved as follows:
I am a big advocate of not making things unnecessarily complicated. Therefore I would categorise this as an ease of use feature to improve the experience of working with APIs within Alteryx and make APIs (as loads of integrations are API based) accessible to as many users as possible.
Let's "Elevate" Alteryx to enter the Euclidean space and add the Z-Coordinates to our spatial tools!
Cleanse Macro
Given a choice between the delivered macro and the CReW macro, I’ll choose the CReW macro for both speed and functionality. Wikipedia says, “Data cleansing or data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database and refers to identifying incomplete, incorrect, inaccurate or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data.” If Alteryx were to convert the macro to a true tool, here is my feature request list:
Performance:
Feature Enhancement:
Going the extra mile:
Hi, I'm currently feeding about 4,000 URLs into the Download tool. Some of these may be old websites that are no longer hosted. As the records get passed, if the process runs into one of these (random, roughly a 1 in 100 chance), the whole thing stops executing and errors out.
It would be fantastic if I could just tell it to skip any records that error out, instead of having the whole thing error out because of one dud website. I don't know which ones are duds until I pass them through the tool, so there's no way to filter them out upstream. Real Catch 22 here!
I have a coworker who's going to tackle this in Python instead, and he says he can put a "try, except block" into his code. His words, not mine, and I know nothing about Python code... but given what I know about Alteryx, there must be a way to build this into the tool!
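For anyone curious what that "try, except block" might look like, here is a minimal Python sketch. It is a guess at the general approach, not the coworker's actual code; the function names are made up, and the `fetcher` parameter exists only so the loop can be tried without real network access.

```python
from urllib.request import urlopen


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download one URL; raises an exception (e.g. URLError) on failure."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetcher=fetch):
    """Try every URL and collect failures instead of stopping at the first one."""
    results, failures = {}, {}
    for url in urls:
        try:
            results[url] = fetcher(url)
        except Exception as exc:  # broad on purpose: any dud site is skipped
            failures[url] = str(exc)
    return results, failures
```

The dud websites end up in `failures` with their error message, while every other record carries on through, which is the behaviour being asked for in the Download tool.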
Thanks!
As of Alteryx version 2020.3, the Browse tool no longer shows a profile of the complete dataset (profiling is capped once the record data size reaches 300 MB).
My proposed solution is an optional override of the record size limit on the browse tool (which will make the profiling take longer, but actually profile the entire dataset). I would also like a general user setting to set the default behavior of the browse tool to either be limited or unlimited.
Below is the newly included documentation of the Data Profiling Limit, which I'm proposing can be overridden.
Data Profiling Limit
Data Profiling in the Browse tool is capped at 300 MB. This allows you to process very large datasets faster. For each record in the incoming dataset, we process the record and add the record size to a counter. Once the counter reaches 300 MB, we stop processing records.
It is important to note that there is no specific number of records that we can process. This depends on the dataset since a record size can range from 1 byte to a few thousand bytes. This record size is different from the file size, displayed in the Results grid and Data Profiling Holistic View. The file size is generally different since it has been compressed to optimize spacing.
In other words, 300 MB of record size is not the same as 300 MB of file size.
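As I read the documentation above, the cutoff behaves roughly like the Python sketch below. This is my interpretation only, not Alteryx's implementation; the function name is made up, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
LIMIT_BYTES = 300 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def count_profiled(record_sizes, limit=LIMIT_BYTES):
    """Return how many records are processed before the counter reaches the limit."""
    total = 0
    profiled = 0
    for size in record_sizes:
        total += size      # add this record's size to the running counter
        profiled += 1
        if total >= limit:  # counter has reached the cap: stop processing
            break
    return profiled
```

The point is that the number of profiled records depends entirely on how wide each record is, so two datasets with the same row count can be profiled to very different depths.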
This new limit can cause confusion when looking at the data profile (e.g. if you expect the sum to be $3 million, but the Browse tool is only profiling 2% of your total records, the profile sum may only show $60 thousand).
The sampled version with a cutoff of 300 MB is rarely useful if you are using Browse tools to get a quick sense of the variable profiles on medium sized datasets (around 1 million records), since these rarely fit into the 300 MB record size limit.
An example can be shown in the image below, where the dataset contains 855,085 records, but the browse tool is profiling only the first 20,338.
Again, being able to override this 300MB record size limit would fix the problem created in the 2020.3 change to the browse tool.
When using the text mining tools, I have found that the behaviour of using a template only applies to documents with the same page number.
So in my use case I've got a PDF file with 100+ claim statements which are all laid out the same (one page per statement). When setting up the template I used one page to set the annotations, and then input this into the T anchor of the Image to Text tool. Into the D anchor of this tool is my PDF document with 100+ pages. However when examining the output I only get results for page 1.
On examining the JSON for the template I can see that there is reference to the template page number:
And playing around with a generate rows tool and formula to replace the page number with pages 1 - 100 in the JSON doesn't work. I then discovered that if I change the page number on the image input side then I get the desired results.
However, since I suspect this is a common use case for the Image to Text tool, an improvement would be to add an option in its configuration to apply the same template to all pages.
Hello all,
Some databases, including Hive, natively support scheduled queries (yes, the scheduling configuration is inside the database, not in an ETL/data prep system). I think this would be an interesting feature for in-DB workflow output: you run the workflow once and then only have to run it again when it changes; the database does the scheduling.
https://cwiki.apache.org/confluence/display/Hive/Scheduled+Queries
Intro
Executing statements periodically can be useful in
Best regards,
Simon
Alteryx doesn't support querying tables within Apache Ignite via the Ignite ODBC connector. As Ignite is an in-memory database, supporting its ODBC connector would improve connectivity with Alteryx.
Currently, if you download an Alteryx package created in a different version, it can't be imported into a newer version.
Workflows allow this with a warning; it would be good to allow it for packages too.
The email tool, such a great tool! And such a minefield. Both of the problems below could and maybe should be remedied on the SMTP side, but that's applying a pretty broad brush for a budding Alteryx community at a big company. Read on!
"NOOOOOOOOOOOOOOOOOOO!"
What I said the first time I ran the email tool without testing it first.
1. Can I get a thumbs up if you ever connected a datasource directly to an email tool thinking "this is how I attach my data to the email" and instead sent hundreds... or millions of emails? Oops. Alteryx, what if you put an expected limit as is done with the append tool. "Warn or Error if sending more than "n" emails." (super cool if it could detect more than "n" emails to the same address, but not holding my breath).
2. Make spoofing harder. It's super useful, but... well, my company frowns on this kind of thing.
I've recently been delving into using the interface tools and there are a couple of glaring issues for me as a developer/designer, all having to do with the UI, ironically (yes, I used that correctly!) with the interface tools. The irony here is that the interface tools utilise a poor user interface.
Firstly, I finished this video to ensure I was indeed doing things correctly, and I was.
The UI for Designer's interface tools is incredibly sluggish. To rearrange tools each time you create a new one, you have to push the up arrow for each tool and traverse the groupings.
Instead of this, I suggest two changes to the interface designer.
I know not everyone is building macros/apps and dealing with this, so I have little faith that this will jump to the top of your queue. But this is a painful part of the UI. I don't know if your UX designers could easily fix this or if it is more pain on your end than the pain you're giving me, but I just want to say: This hurts. 35 clicks every time I add a new element with no option to 'move to top' like you (wonderfully) do in the select tool is a big drag on my time (hint: maybe add that sort of functionality too; the select tool manages this stuff so well!). Which is supposedly valuable. In theory. But it certainly doesn't feel that way when I've spent 10 minutes clicking an up arrow (and yes, my UI is slow. And I may be exaggerating, but not by much!).
Thank you for your continued improvement!
-Çædric Justice
Alteryx Developer
Cambia Healthcare
Please enhance the Dynamic Select tool to allow for dynamically changing data types too. The use case could be driven by a formula or updated by an action in a macro. If you've ever wanted to mass change data types or take precise action in a macro, you're forced to use a multi-field formula. It would be rather helpful and appreciated.
Cheers,
Mark
Two additions to the formula tool that would be great to see:
- When I select a function (like DateTimeParse) and hit F1 for help - please could you take me to the help for this specific function?
- For the parameters which are ordinal or "magic value" type parameters - please can you create a simple formula builder so that we don't have to keep on going into the help text to find out which % flag to use for a month in MM format; and which is MMM format.
- What I'm thinking here is a simple pop-up box that allows you to create the parameter you want
- Alternatively - provide a direct hint-text for the parameter in question or inline intellitext like Visual Studio or Eclipse
- Overall though - the date functions seem to have grown up at different times - and so they treat dates in different ways - DateTimeDiff uses "Hours" which is pretty common, DateTimeParse uses magic values like "%Y", and the new date time conversion tool uses the standard form used in Windows of MM-YYYY etc. So it would be worth looking at a refactoring of the date functions to bring them all to a standard treatment of date parameters.
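To illustrate the kind of translation a formula builder could do behind the scenes, here is a small Python sketch that turns a readable format like MM-YYYY into %-style specifiers. It uses Python's `datetime.strptime` as a stand-in for DateTimeParse; the token table is my own assumption, only covers a few tokens, and is not Alteryx's actual mapping.

```python
import re
from datetime import datetime

# Assumed token mapping; only a few tokens are covered for illustration.
TOKEN_TO_SPECIFIER = {
    "YYYY": "%Y",  # four-digit year
    "MMM": "%b",   # abbreviated month name, e.g. Mar
    "MM": "%m",    # two-digit month number
    "DD": "%d",    # two-digit day of month
}

# Longest tokens first so that MMM is not read as MM followed by M.
_TOKEN_PATTERN = re.compile("|".join(sorted(TOKEN_TO_SPECIFIER, key=len, reverse=True)))


def to_specifiers(readable_format: str) -> str:
    """Translate a readable format such as 'MM-YYYY' into %-style specifiers."""
    return _TOKEN_PATTERN.sub(lambda m: TOKEN_TO_SPECIFIER[m.group(0)], readable_format)


# Parse a month-and-year string using the translated format.
parsed = datetime.strptime("03-2021", to_specifiers("MM-YYYY"))
```

With something like this in place, the user would only ever see the readable form, and nobody would need to look up which % flag means MM and which means MMM.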
Thank you
Sean
Hi Alteryx 🙂
When you set maximum records per file, the filename gets _# appended. Great! But in reality you get:
Filename.csv
Filename_1.csv
Filename_2.csv
The first filename doesn't get a number. I think that it should.
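In code terms, the naming I'd expect looks something like the Python sketch below. The function name is made up, and starting the numbering at 1 is my assumption; the point is only that every file, including the first, gets a suffix.

```python
from pathlib import Path


def numbered_filenames(filename: str, file_count: int, start: int = 1) -> list:
    """Give every output file, including the first, a _# suffix."""
    path = Path(filename)
    return [f"{path.stem}_{n}{path.suffix}" for n in range(start, start + file_count)]


names = numbered_filenames("Filename.csv", 3)
```

Here `names` is Filename_1.csv, Filename_2.csv, Filename_3.csv, so the whole set sorts together and can be picked up with a single wildcard like Filename_*.csv.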
Cheers,
Mark
The v10 formula configuration window had two very small advantages. First, it always had an extra 'line' for another output field (no pressing '+' required). Second, it defaulted to letting you immediately begin typing the name of the next column (no need to press 'Select Column' then 'Add New Column'). I know these are minor, but every little thing counts when you're doing heavy development.
It has been brought up that the following comments were given during the beta. While I appreciate the reasoning of requiring 'obvious intention,' my personal opinion is that it is overkill in this scenario. Even for new users, the old design was quite intuitive.
"Thanks for taking the time to provide feedback! This touches a conversation topic that has been ongoing here at Alteryx. While we want workflow development to be as fast as possible, we also are trying to address the overall usability of the tool and make sure it is very clear what we intend the user to do. We decided to have the UI ask for an explicit action (pick an existing field to edit or click to add a new field) to help make those options clear, as we have found that users don't always understand from the existing tool that this is the first decision they should make when using the tool. That being said, your feedback is definitely valuable. I will be sure to bring this up as we are making improvements to the new tool and see if there's a compromise that we can make on speed vs. obvious intention. Thanks for taking the time!"
I do a lot of work with SQL code in the PRE/POST SQL options, and when I get an error, it usually returns the entire code and a little bit about what is wrong. These long strings are hard to read in the current tooltip format: if you hover over to see the entire error, the tooltip goes away after 5 seconds. So I am frantically reading through lines of error code 5 seconds at a time. Can we make it so the tooltip just hangs out until I move my cursor off of it?
API Security requirements are constantly evolving and strengthening. As API architectures migrate from traditional authentication models (Basic, OAuth, etc.) to more secure, certificate-based models, like MTLS/MSSL, leveraging Alteryx Designer will become increasingly difficult, especially for larger organizations trying to scale the use of Alteryx across a large user base, with vastly diverse skillsets.
I realize issuing API calls with certificates is possible via the Run Command tool. We consider this a temporary workaround, and not a permanent, strategic solution. The Run Command tool can be clunky to use when passing in variables and passing the output back into the workflow for downstream processing.
Therefore, I would like to request a more scalable approach to issuing MTLS/MSSL API calls. Can an option be added to the Download Tool to allow for certificates to be passed on API calls?
Python pandas dataframes and data types (numpy arrays, lists, dictionaries, etc.) are much more robust in general than their counterparts in R, and they play together much more easily as well. Moreover, there are only a handful of packages that do everything a data scientist would need, including graphing, such as SciKit Learn, Pandas, Numpy, and Seaborn. After utilizing R, Python, and Alteryx, I'm still a big proponent of integrating with the Python language much like Alteryx has integrated with R. At the very least, I propose adding the ability to write custom code, such as through a Python tool.
I often encounter situations where I need to apply the same formula to several columns. Doing this requires copy/pasting the formula several times and then updating the variable names in the formula for each output column. I wish there was a built-in "Current Output Column" variable so that I could build one formula and use that for each column.
For example:
Please consider adding the ability in the Power BI Output Tool to create/modify multiple tables per dataset - having to work with only single table datasets in Power BI is very limiting.