Bring your best ideas to the AI Use Case Contest! Enter to win 40 hours of expert engineering support and bring your vision to life using the powerful combination of Alteryx + AI. Learn more now, or go straight to the submission form.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

split a large dataset

Saraabdi955
8 - Asteroid

Hi,

I have over 10 milion records in a dataset and I want to import records to a SQL Server Table.

But I want to import each 100000 records in Database.

Can you help me how can I split the records and imports them in database?
Thanks a lot,

9 REPLIES 9
Qiu
21 - Polaris
21 - Polaris

@Saraabdi955

Maybe we use the Record limit function in Input Tool together with a batch macro.

Capture.PNG

Saraabdi955
8 - Asteroid

No, this dataset is sumation of so many files and I want split iteratively each 10000 records for example, first records 1 to 10000 import to database, then records 100001 to 20000 import to the same table of database and to end ...

Qiu
21 - Polaris
21 - Polaris

@Saraabdi955 

If it is sumation of many files, maybe we can read the files batch, like 5 per batch.

danilang
19 - Altair
19 - Altair

Hi @Saraabdi955 

 

Here's a demo of how you can split a large dataset using a batch macro

 

main.png

 

The first container just generates 1000 rows of data.  The StepSize Formula tool defines the size of the batch.  The Make Batches container finds the max row count and generates a new record from 1 to max incrementing by [StepSize].  These records are passed to the control input of the batch macro with the actual data passed to data input

 

macro.png

 

The macro itself is quite simple.  The formula in the filter 

 

 

[RowCount] >= 1 and [RowCount] < 1+[StepSize]

 

 

is modified by the Action tool to replace the "1" with the value in the control parameter.  The control parameter changes on each iteration, so on the first one the criteria is '[RowCount] >=  1 and [RowCount] < 1+[StepSize].  On the second iteration, the parameters are change to 101 and 101+[StepSize], etc.  The Select tool removes the [StepSize] field from the output

 

In your workflow, you would swap out the Summarize in the macro with an Output tool writing to your database.  You can remove the macro output completely or put it parallel with the Output to return some information to the calling program.    In the main workflow, change {StepSize] to be 10000 ro whatever is required

 

Dan

Saraabdi955
8 - Asteroid
Thank you so much for helping me, if it's possible please share with me the setting of macro in main workflow. Thanks again
danilang
19 - Altair
19 - Altair

Hi @Saraabdi955 

 

It's all in the attached package.  Click on the macro tool in the main workflow to see the settings in the config window

 

danilang_0-1612106182349.png

 

 

 

Dan

Saraabdi955
8 - Asteroid

sorry but This is what i can see in main workflow...

Saraabdi955
8 - Asteroid

Sorry but 1.png is what I can see in main workflow...

and my mean was 2.png and 3.png setting that I don't know how it is ..

Thank you so much Dan

danilang
19 - Altair
19 - Altair

I see your point, @Saraabdi955 

 

The first time I tried opening the .yxzp file, the macro didn't get extracted, but it did the second time.  Try dragging it onto designer again and say Yes if prompted to overwrite files

 

Dan

Labels
Top Solution Authors