Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

split file based on column totals

nwhite
8 - Asteroid

a large dataset (200m+ records) is reduced to the following 2 fields (read from a table, tiled and group by/count)
file looks like this
code count
1001 308
1003 2424
1005 1849
1007 777
1009 869
1013 340
1015 279
1017 141
1019 142
1021 254
... file continues

the file ultimately needs to be split sequentially into X pieces with the number of records written to each file below a specified threshold.

the number of splits required can be determined based on the total record count and the number of records allowed in each file

the split needs to be determined by a range of codes so the ranges can be used in other processing.   codes  cannot be split across files -  i.e. all records with a specific code must be in a single file 

for example using the sample above

3 files should be created if the threshold is set to 3000
the resulting files would be split as follows

code count
1001 308
1003 2424
count total 2732 -- file1

1005 1849
1007 777
count total 2626 --file2

1009 869
1013 340
1015 279
1017 141
1019 142
1021 254
count total 2025 file3

and so on

in another step (outside alteryx) i'd need to know the first file includes the range of 1001-1003, 2nd is 1005-1007, 3rd file includes 1009-1021, etc

 

any help is appreciated!  

thank you 

5 REPLIES 5
AlteryxAlexB
Alteryx
Alteryx

Hi - I have attached a solution but please let me know if I have misunderstood the ask! 

 

You can set your threshold in the formula tool and then the workflow will divide the data into different groups based on that threshold value. I also included a container that will dynamically name each file with 1001to1003 etc. so you can see which records are contained within each file. 

 

The output looks like this: 

AlteryxAlexB_0-1637316224348.pngAlteryxAlexB_1-1637316312025.png

 

 

 

nwhite
8 - Asteroid

thank you - from what you returned as output it looks exactly what i need but we haven't yet upgraded so i'm unable to see your workflow.  i do have a solution now but appreciate your efforts!  

AlteryxAlexB
Alteryx
Alteryx

No problem, glad you've got a solution! 😊

Tanai_Goncalves
8 - Asteroid

@nwhite, good you have a solution. Just a tip, if you open the sample workflow sent by @AlteryxAlexB  as a XML and just change the version named on it, you'd probably be able to open it. 

 

Regards,

 

Tanai

nwhite
8 - Asteroid

@Tanai_Goncalves   LIFECHANGING 😀  thank you so much for pointing out this very handy tip!!

Labels