Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Counting Records from hive in Alteryx

becki
8 - Asteroid

if i connect to hive with hortonworks hive odbc versin 2.1.2.1002 (alteryx 10.6.9.21458), is there a way to get a count of how many records are in the original database so I know how long the workflow will take to run?  right now, i limit the record count, just to write the logic of the code, but I wanted to get an idea of how long running the entire data set would take

 

thanks

 

becki kain

 

1 REPLY 1
MarqueeCrew
20 - Arcturus
20 - Arcturus

StackOverflow has some suggestions:  https://stackoverflow.com/questions/21208574/hive-query-to-quickly-find-table-size-number-of-rows

 

-- gives all properties
show tblproperties yourTableName

-- show just the raw data size
show tblproperties yourTableName("rawDataSize")

and...

 

Here is the quick command

ANALYZE TABLE tablename [PARTITION(partcol1[=val1], partcol2[=val2], ...)] COMPUTE STATISTICS [noscan];

For Example,If table is partitioned

 hive> ANALYZE TABLE ops_bc_log PARTITION(day) COMPUTE STATISTICS noscan;

output is

Partition logdata.ops_bc_log{day=20140523} stats: [numFiles=37, numRows=26095186, totalSize=654249957, rawDataSize=58080809507]

Partition logdata.ops_bc_log{day=20140521} stats: [numFiles=30, numRows=21363807, totalSize=564014889, rawDataSize=47556570705]

Partition logdata.ops_bc_log{day=20140524} stats: [numFiles=35, numRows=25210367, totalSize=631424507, rawDataSize=56083164109]

Partition logdata.ops_bc_log{day=20140522} stats: [numFiles=37, numRows=26295075, totalSize=657113440, rawDataSize=58496087068]

OK

Time taken: 5.252 seconds

 

Cheers,

Mark

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Labels
Top Solution Authors