The Data Sources page currently lists all the different data sources that Alteryx supports - however for an administrator it's almost impossible to ensure that their designer users have the drivers for these, or are on the right version.
As an early step - can we add one more field to this list that points to the download for the driver, where applicable?
It appears that the Workflow Dependencies window does not report dependencies from all tools. In the example image, you can see that the file input from the Amazon S3 Download tool is not listed. Some tools may have dependencies that do not easily fit the current field structure of the window, but maybe the input/download tools could be listed with an asterisk or partial reference.
When a tool container is disabled, I'd like the lines that are going into it to be different from "enabled" lines.
They could be grey or dotted for example.
When working on a workflow and disabling containers, I find that the lines entering disabled containers become confusing or cluttering. It would be much simpler to focus my attention efficiently if lines that remain enabled could be distinguished quickly.
But it's still too hard to use. It requires you to have prior knowledge of a bunch of parameters and different types of knowledge.
Can we improve the interface on this tool so that it can be used by folk who do not have a background in R - for example, take all the different inputs, and make them parameterized on drop-down boxes or input boxes on the tool?
I have a problem when transferring records between different O365 SharePoint sites. It seems that Alteryx cannot maintain 2 separate connections at the same time. I can transfer fine if I read from one site to a temp file and then, in another workflow, read from the file and write to the second site.
I can work around the problem using Block Until Done, but there are some situations where I need to compare lists in 2 different sites and write back to one or both depending on the results. It would be much more convenient to be able to have multiple connections open simultaneously. I'm aware that Alteryx uses the SharePoint API to move information around. This API does allow multiple connections. I'm not familiar with the internals of how Alteryx accesses the API (perhaps the OAuth token is shared throughout the workflow process), but this should be possible.
When I add a data connection to my canvas - it's only added to the Data Connections window under certain circumstances (e.g. when I use an alias, or the SQL connection wizard) rather than showing ALL data connections.
Given the importance of data connections for Alteryx flows - it would be better if ALL data connections were grouped together under a Data Connection Manager that is as visible as the results window, not buried deep in the menu system - and you could also then use this spot to change, share, or alias connections.
In Microsoft SSIS there's a useful example of how this could be done - where the connections are very visibly a collection of assets that can be seen and updated centrally in one place. So if you have 5 input tools which ALL point to the same database - you only need to update the connection on your designer in one place - irrespective of whether this is a shared connection or not.
This wasn't pretty (actually, it was challenging and pretty when I was done with it)!
My client receives files whose names include a static portion and a dated portion (e.g. Data for 2018 July.xlsx). Within each file there are multiple sheets. One sheet name contains a keyword (e.g. Reported Data) but also includes a variable component (e.g. July Reported Data). I first needed to read a directory to find the most recent file; then, when I wanted to supply the dynamic input with the sheet name, I wasn't able to use a pattern.
The solution was to use a dynamic input tool just to read sheet names and append the filtered name to the original Full Path.
[FullPath] + "|||<List of Sheet Names>"
This could then feed a dynamic input.
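The workaround above can be sketched outside Alteryx in a few lines of Python. This is only an illustration of the logic, not how the Dynamic Input tool works internally: the function names are hypothetical, the file list is assumed to arrive as (full path, last-modified time) pairs, and the sheet names are assumed to have been read already.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def newest_file(files: Iterable[Tuple[str, datetime]]) -> Optional[str]:
    """Return the full path with the latest modified time, or None if empty."""
    files = list(files)
    if not files:
        return None
    return max(files, key=lambda f: f[1])[0]


def pick_sheet(sheet_names: Iterable[str], keyword: str) -> Optional[str]:
    """Return the first sheet whose name contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    for name in sheet_names:
        if needle in name.lower():
            return name
    return None


def dynamic_input_path(full_path: str, sheet_name: str) -> str:
    """Join the workbook path and sheet name with the '|||' separator."""
    return f"{full_path}|||{sheet_name}"
```

For example, `pick_sheet(["Summary", "July Reported Data"], "Reported Data")` selects the July sheet, and `dynamic_input_path` appends it to the path of the file chosen by `newest_file`.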
Given the desire to automate the read of newly received "excel" data and the fluidity of the naming of both files and sheets, more flexibility in the dynamic input is requested.
I suppose I could just bookmark this page, but that wouldn't help others. I frequently forget (I'm getting old) the format strings while creating custom datetime formulas. Is there a quick way to get to these format strings when in the context of creating a datetimeparse/datetimeformat formula?
As Alteryx becomes more focussed on the Enterprise - it is important that we build capabilities that support the needs of large-scale BI.
One of these critical needs is dealing with heterogeneous data from different systems that use different IDs for every critical entity / concept (e.g. client; product)
Here's the example:
- In any large enterprise - there are several thousand different line-of-business systems
- Each of these was probably built at a different time, and uses a different key for specific concepts - like Client & Product
- Most large enterprises that I've worked at do not have a pre-built way of transforming these codes so...
- This means that any downstream analytics finds it almost impossible to give single-view-of-customer or single-view-of-product.
Solution option A:
Reengineer all upstream systems. Not feasible
Solution option B:
Expect some reference-data team to fix this by building translations. More feasible but not fast
Remaining Solution Option:
Just as Kimball talked about - the only real way is to define a set of enterprise dimensions, which are the defined master-list of critical concepts that you need to slice-and-dice by (client; product; currency; shipping method; etc) in a way which is source-system agnostic
Then you need a method in the middle to transform incoming data to use these codes. This process is called "Conforming"
What would this look like in Alteryx?
We would use the connect product to define a new dimension - say "Product".
Give this a unique ID which is source-system independent; and then add on the attributes that are important for analytics (product type; category; manufacturer; etc)
Then decide how to handle change (slowly changing dimension or SCD type 0,1,2, etc). Alteryx should take full responsibility for managing this SCD history; as do many of the competitors
We then create a list of possible synonym types (within Connect). For example - a product may have a synonym ID from your supplier; from your ERP system; from your point of sale system. That's 3 different IDs for any product.
We then load up the master data - this is painful but necessary
I read data into Alteryx via any input tool
I bring in a "Conforming" tool off the toolbox (new tool which is needed)
It asks me which column or columns I wish to conform
For each - it asks me which synonym type to use
It then adds a translated column for me to use which ties back to the enterprise dimension - and spits out the errors where the synonym is necessary.
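To make the proposed behaviour concrete, here is a minimal sketch of what such a conforming step might do. Nothing here is an existing Alteryx tool or API: the `conform` function, the `_conformed` column suffix, and the sample synonym table are all hypothetical, and treating a value with no matching synonym as an error is my reading of the steps above.

```python
from typing import Dict, List, Tuple

# (synonym type, source-system ID) -> enterprise dimension ID.
# Hypothetical shape for the master data held in the synonym lists.
SynonymTable = Dict[Tuple[str, str], int]


def conform(
    rows: List[dict],
    column: str,
    synonym_type: str,
    synonyms: SynonymTable,
) -> Tuple[List[dict], List[dict]]:
    """Add '<column>_conformed' to rows that map; return (conformed, errors)."""
    conformed, errors = [], []
    for row in rows:
        enterprise_id = synonyms.get((synonym_type, str(row[column])))
        if enterprise_id is None:
            # No synonym found: route the untouched row to the error output.
            errors.append(dict(row))
        else:
            out = dict(row)
            out[f"{column}_conformed"] = enterprise_id
            conformed.append(out)
    return conformed, errors
```

With a table such as `{("ERP", "P-100"): 1, ("POS", "778"): 1}`, both source IDs resolve to the same enterprise product ID, which is the source-system-agnostic key the dimension is meant to provide.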
In BI in smaller contexts, or quick rapid-fire BI - you don't have to worry about this. But as soon as you go past a few hundred line-of-business systems and are trying to do enterprise reporting, you really have to take this seriously. This is a HUGE part of every BI person's role in a large enterprise - and it is painful; slow and not very rewarding. If we could create this idea of a simple-to-use and high-velocity conforming process - this would absolutely tear the doors off enterprise BI - and no-one else is doing this yet!
It would be nice if this option would take you to the correct download page relative to the version the user has installed. Currently, this always loads the download page for the current version which is confusing for users of a company who are still required to use an older version.
When output is disabled, Alteryx's output tools are helpfully grayed out and include the message 'output has been disabled by the workflow properties.'
However, if a macro has an output, there is no visual indicator that output is disabled, even though the macro's output will also be suppressed by this workflow configuration.
Obviously, macros can be very complex, and could have both a file and a macro output, or have an optional file output, so these cannot be entirely locked out just because there is an output.
To that end, I suggest some other kind of color-coding/shading be applied visually to these tools, and that a message be added to the interface for these macros that says something like "output has been disabled, this macro may not perform all of its functions".
I just spent about 10 minutes debugging why a macro wasn't working properly in one workflow but was working in another, and it was because I had disabled output, which I wasn't thinking of because this particular macro uses the Render tool to produce a hyperlink. I wouldn't have spent more than 30 seconds on this if there was some kind of visual indicator showing me what I was doing wrong!
One of the biggest areas of time spent is in basic data cleaning for raw data - this can be dramatically simplified by taking a hint from the large ETL / Master Data Management vendors and making this core to Alteryx.
- Allow the users of the server & connect product to define their own Business Types (what Microsoft DQS calls "Domains")
- Example may be a currency code - there are many different synonyms, but in essence you want your data all cleaned back to one master list
- Then allow for different attributes to be added to these business types
- Currency code would have 2 or 3 additional columns: Currency name; Symbol; Country of issue
- Similar to Microsoft DQS - allow users to specify synonyms and cleanup rules. For example - Rupes should be Rupees and should be translated to INR
- You also need cross business type rules - if the country is AUS then $ translates to AUD not to USD.
- These rules are maintained by the Data Steward responsible for this Business Type.
- This master data needs to be stored and queryable as a slowly changing dimension (preferably split into a latest & history table with the same ID per entry; and timestamps and user audit details for changes)
- When you get a raw data set - user can then tag some fields as being one of these business types
- Example: I have a field bal_cur (Balance Currency) - I tag this as Business Type "Currency"
- Then Alteryx automatically checks the data; and applies my cleanup rules which were defined on the server
- For any invalid entries - it marks these as an error in the canvas; and also adds them to a workflow for the data steward for this Business Type on the server - value is set to an "unmapped" value. (ID=-1; all text columns set to "unmapped")
- For any valid entries - it gives you the option to add which normalised (conformed) columns you want - currency code; description; ID; symbol; country of issue
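The cleanup rules described in the list above could look roughly like this in Python. This is a sketch of the idea only, not an Alteryx or DQS feature: the master list, the synonym and cross-type rule tables, and the `clean_currency` function are made-up illustrations, and the default of "$" to USD is an assumption.

```python
from typing import Dict, Optional, Tuple

# Value returned for invalid entries: ID=-1, text columns set to "unmapped".
UNMAPPED = {"id": -1, "code": "unmapped", "name": "unmapped"}

# Illustrative master list for the "Currency" business type.
CURRENCIES: Dict[str, dict] = {
    "INR": {"id": 1, "code": "INR", "name": "Indian Rupee"},
    "AUD": {"id": 2, "code": "AUD", "name": "Australian Dollar"},
    "USD": {"id": 3, "code": "USD", "name": "US Dollar"},
}

# Synonym rules; "$" defaulting to USD is an assumption.
SYNONYMS: Dict[str, str] = {"rupes": "INR", "rupees": "INR", "$": "USD"}

# Cross business type rules: (country, raw value) -> currency code.
CROSS_RULES: Dict[Tuple[str, str], str] = {("AUS", "$"): "AUD"}


def clean_currency(raw: str, country: Optional[str] = None) -> dict:
    """Map a raw value to a master currency record, or to the unmapped record."""
    value = raw.strip()
    if country is not None and (country, value) in CROSS_RULES:
        code = CROSS_RULES[(country, value)]   # cross-type rule wins
    elif value.upper() in CURRENCIES:
        code = value.upper()                   # already a valid code
    else:
        code = SYNONYMS.get(value.lower())     # fall back to synonyms
    return dict(CURRENCIES[code]) if code else dict(UNMAPPED)
```

Here `clean_currency("Rupes")` resolves to the INR record, `clean_currency("$", country="AUS")` resolves to AUD, and anything unrecognised comes back as the unmapped record for the data steward to review.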
Data Steward Workflow:
- The data steward is notified that there is an invalid value to be checked
- They can either mark this as a valid value (in which case this will be added to the knowledge base for this business type) or a synonym of some other valid value; or an invalid value
Cleanup Audit & Logs:
- In order to drive upstream data cleaning over time - we would need to be able to query and report on data cleanups done by source; by canvas; by user; by business type; and by date - to report back to the source system so that upstream data errors can be fixed at source.