Splitting Records dynamically, based on a column values
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hey,
I have this huge dataset of more than 80GB, that has transactional data of more than 10 years.
I need to create smaller extracts - one for each year (10 extracts in total), so that it is easier for me to run workflows.
As of now, I am using a series of Filter tools - which basically means I have to put about 9 tools to get the desired result.
Is there a way I can do this using a single tool?
While we are at it, Can i use multiple columns to partition data? e.g - 1 extract for all transaction through Android App in 2010, another for all trans through IOS in 2010, and so on...
Regards,
Anujay
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Consider creating a new partion field [Part] that includes the YEAR and the OS. This field can be used as a PREFIX or a SUFFIX to your Dataset name. You would then INPUT your 80GB data and OUTPUT the data using the "Take File/Table Name From Field" option at the bottom of the Output Data tool configuration options.
You can un-check the "Keep Field in Output" if you want to remove the field from your data. This single output tool will "Filter" records to the appropriate partion with a SINGLE tool.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Perfect! Thanks a lot for the prompt response.
Regards,
Anujay
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Mark,
I followed your suggestions on this problem and found that the changes that you recommended to the "Output" tool produced a single sheet with multiple tabs corresponding to the unique values in the field used to partition the data.
Is there a way to output the data based on the unique values in the filed to altogether different files instead of different tabs within the same file.
For eg., if the field being used for this operation has 3 unique values say - Jan, Feb and Mar, then creating 3 different files with Jan_FileName, Feb_FileName and Mar_FileName instead of one file with Name -"FileName" and 3 different tabs - Jan,Feb and Mar.
Best Regards,
Ashish Bhavnani
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Please check out the workflow to see it in action. All paths are relative. You might want to specify where to save the files.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Mark,
Thanks for your prompt response.
I followed the workflow that you shared. I am again seeing a tab labeled as the Output FileName variable. The excel sheet that contains the tab still continues to retain the Name that was defined as part of the configuration of the Output Tool.
Could you point to anything that I might be doing incorrectly.
Best Regards,
Ashish
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Please PM me with your email and availability either shortly or tomorrow morning (Eastern Time) and I'll setup a WebEx to review your workflow with you.
Thanks,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
We just had to modify the FULL Path and everything worked.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
hi, may I know what modification on the FULL path means?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
i am trying to achieve something like this but i need to split the data for transformation rather than writing the data in excel.
Regards,
Yash
