Is there a way to systematically determine dependent columns?
Good morning, all. I'm looking for a way to identify which columns have downstream dependencies in a workflow. For example, if a data table has 100 columns and I end up using only 30 of them while building the workflow, how can I systematically deselect the columns that turn out not to be needed (something like the Select tool's Forget All Missing Fields) without having to deselect them manually?
- Labels:
- Datasets
- Input
- Preparation
Can you just put a Select tool at the end of your process and deselect "Unknown Fields"? That way you set it once and never have to worry about it again.
I'm hoping to put something at the beginning of the process to clean up unneeded data after I've created a finalized workflow. At the beginning of the creation process, I'm not 100% sure which fields are needed. Once the workflow has been tested, modified, and finalized, I'd like to go back to the beginning of the process and remove the unneeded columns.
Yep, then you can put a Select tool at the beginning for the same purpose. You can always add columns back later as needed using the same Select tool!
I guess I'm not following. With the Select tool at the beginning, I'm not sure which data points I need, so I have all of them selected. Unselecting "Unknown Fields" just prevents any new fields from being added to the downstream flow.
My dilemma is that after 100+ steps with multiple joins, unions, etc., many data fields turn out to be unneeded by the time I complete the flow, and I would now like to stop them from moving through it. I didn't know what I would need until I finished.
I want to find a way to identify unneeded fields without having to unselect them manually. It would be nice if we could work backwards through the flow and remove fields with no downstream dependency (like Excel's Trace Precedents or Trace Dependents).
I see. No, as far as I'm aware there is no way for Alteryx to look at your flow, dynamically determine which columns are used throughout it, and then deselect the ones that are not.
My suggestion is to take the manual approach: look through the flow once it is complete and decide for yourself what is used and what is not. Yes, it is a pain to do it once, but it gives you an opportunity to review and document your flow, plus more control over what the end output is. For example, what if some logic decides a column is "unneeded" when you really do need it at the end? You have the ultimate process and goal in mind, and therefore the ultimate say in what gets kept.
Once the flow is complete, put that Select tool down at the beginning and test removing columns one at a time. If removing a column doesn't break your flow, you likely don't need it! I am sorry there is not a better solution that I am aware of. I would love to see one if it exists, but I am doubtful.
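To narrow down which columns are worth testing first, one rough option outside Alteryx is to scan the saved workflow file, which is stored as XML, for mentions of each column name. The sketch below is only a heuristic and not an Alteryx feature. It assumes each tool is a `Node` element carrying a `ToolID` attribute, and the `SAMPLE` document, the function names, and the column names are all made up for illustration. A real workflow may be laid out differently, and a column can be used without being named in a tool's configuration (for example, when a tool passes every field through), so treat the result as a list of candidates to test rather than a verdict.

```python
import re
import xml.etree.ElementTree as ET


def own_strings(node):
    """Yield attribute values and element text under `node`,
    without descending into nested <Node> elements (e.g. tools in a container)."""
    stack = [node]
    while stack:
        elem = stack.pop()
        yield from elem.attrib.values()
        if elem.text:
            yield elem.text
        stack.extend(child for child in elem if child.tag != "Node")


def column_references(workflow_xml, columns):
    """Map each column name to the set of tool IDs whose XML mentions it."""
    root = ET.fromstring(workflow_xml)
    refs = {col: set() for col in columns}
    for node in root.iter("Node"):
        tool_id = node.get("ToolID")  # assumed attribute name
        if tool_id is None:
            continue
        blob = "\n".join(own_strings(node))
        for col in columns:
            # Match the name only as a whole token, so "Region" does not match "RegionCode".
            if re.search(r"(?<!\w)" + re.escape(col) + r"(?!\w)", blob):
                refs[col].add(tool_id)
    return refs


def removal_candidates(workflow_xml, columns, ignore_tools=()):
    """Return columns mentioned by no tool other than those in `ignore_tools`."""
    refs = column_references(workflow_xml, columns)
    ignored = set(ignore_tools)
    return [col for col in columns if not (refs[col] - ignored)]


# Hypothetical, heavily simplified workflow used only to exercise the functions.
SAMPLE = """
<AlteryxDocument>
  <Nodes>
    <Node ToolID="1">
      <Properties><MetaInfo>
        <Field name="CustomerID"/><Field name="Region"/><Field name="Fax"/>
      </MetaInfo></Properties>
    </Node>
    <Node ToolID="2">
      <Properties><Configuration>
        <Expression>[Region] = "West"</Expression>
      </Configuration></Properties>
    </Node>
    <Node ToolID="3">
      <Properties><Configuration>
        <JoinInfo><Field field="CustomerID"/></JoinInfo>
      </Configuration></Properties>
    </Node>
  </Nodes>
</AlteryxDocument>
"""

# Tool 1 is the input, so a mention there does not count as a downstream use.
print(removal_candidates(SAMPLE, ["CustomerID", "Region", "Fax"], ignore_tools={"1"}))
```

Pass the IDs of the input tool and the leading Select tool as `ignore_tools`, since every column is listed there regardless of whether anything downstream uses it. Anything the script flags still needs the one-at-a-time removal test described above.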
Thanks!
