Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Create Ranges for Output Splits Based on Data Values and Record Count

nwhite
8 - Asteroid

a large dataset (200m+ records) is reduced to the following 2 fields (read from a table, tiled and group by/count), file attached
 
the file needs to be split sequentially into X pieces with the number of records written to each file at or below a specified threshold.

the threshold is determined based on the total record count and the number of records allowed in each file

the split needs to be determined by a range of codes so the ranges can be used in other processing.  codes cannot be split across ranges

for example using the sample  

 

file looks like this
code count
1001 308
1003 2424
1005 1849
1007 777
1009 869
1013 340
1015 279
1017 141
1019 142

3 files should be created if the threshold is set to 3000
the resulting files would be split as follows

code count
1001 308
1003 2424
total count 2732 -- file1

1005 1849
1007 777
total count 2626 --file2

1009 869
1013 340
1015 279
1017 141
1019 142
1021 254
total count 2025 file3
   etc 

 

the ranges need to be used in another process (outside alteryx)   

 

there is no specific output format required but needs to identify the start and end points of range.  i.e. 

 

1001file1
1003file1
1005file2
1007file2
1009file3
1013file3
1015file3
1017file3
1019file3
1021file3

 

or  

1001 - 1003   1

1005 - 1007   2

1009 - 1021   3 

  

 

 Thank you! 

 

6 REPLIES 6
DanM
Alteryx Community Team
Alteryx Community Team

@nwhite  for a quicker response from the Community,  I would suggest telling everyone what you've tried to do in Alteryx so far to figure this out. 

 

I would suggest looking into an iterative macro to solve this use case.

https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Getting-Started-with-Iterative-Macr...

https://community.alteryx.com/t5/Interactive-Lessons/tkb-p/interactive-lessons/label-name/Macros

 

nwhite
8 - Asteroid

i've explored several different tools - multi row, tile, binning, running total,  but as i have limited use with all i am stuck.   

i will read up on iterative macros, another area i don't have experience with.    thanks for the suggestion on a place to start.  

mceleavey
17 - Castor
17 - Castor

Hi @nwhite ,

 

I think what you're trying to achieve here is an iteration where you create a running total of Count until 3000 is hit, then you put those records in a file and create the next running total for the next file.

To do this, I've built a simple iterative macro:

 

mceleavey_0-1636803591892.png

 

This creates a running total until the total goes over 3000. It then sets the running total to zero for those with a higher number and iterates them to the next run. The ones that fall within 3000 are given the first file number and passed out of the macro.

This repeats until they are all in a file.

 

The workflow takes the macro output and simply creates a dynamic filename, which you then use in the output tool as the output full path:

mceleavey_1-1636803714872.png

The results of the iterative macro are:

mceleavey_2-1636803743447.png

 

Hope this helps,

 

M.

 

 



Bulien

nwhite
8 - Asteroid

Thank you for this - but i am unable to import as my company has not yet updated to the most recent version of Alteryx and i can't import it.  Is there a way around this (short of installing the update?)   

PangHC
13 - Pulsar

or you can try with 2 multiple row formula tool

formula 1: if [Count]+[Row-1:Running Total]>3000 then [Count] else [Count]+[Row-1:Running Total] endif
formula 2: if [Count]=[Running Total] then [Row-1:File]+1 else [Row-1:File] endif

Pang_Hee_Choy_0-1637025862544.png

 

nwhite
8 - Asteroid

THANK YOU SO MUCH!   this is perfect! 

Labels
Top Solution Authors