Hi, I hope this sort of question is OK.
I have a server running Hadoop HDFS with a DataNode and two NameNodes. I have a large CSV sitting in HDFS that I would like to run some in-database processing on. My initial question is about Hadoop: when I connect to Hive and start In-DB processing, does this automatically 'invoke' the clustered power of Hadoop, and will it speed up the processing time? Sorry if this is really a Hadoop clustering question, but we intend to use Alteryx for our major ETL work.
Many thanks,
Fiorano
Hello @fiorano
Alteryx has two modes: in-memory (classic mode) and in-database, where the data is processed by the database (here, the data lake) rather than on the Alteryx machine. So you can upload your CSV to HDFS and then use the In-DB tools to query it from Alteryx.
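For the In-DB tools to see the file, Hive generally needs a table defined over it. As a rough sketch only (the table name, column names, types and HDFS path below are made up for illustration, and this assumes a comma-delimited file with one header row), a small Python helper can build the HiveQL for an external table:

```python
def external_table_ddl(table, columns, hdfs_dir, delimiter=","):
    """Build a HiveQL statement mapping a directory of CSV files to a table.

    All arguments are supplied by the caller; nothing here talks to Hive.
    """
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_dir}'\n"
        # Assumption: the CSV has exactly one header row to skip.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


# Hypothetical table, columns and path, for illustration only.
ddl = external_table_ddl(
    table="sales_csv",
    columns=[("region", "STRING"), ("amount", "DOUBLE")],
    hdfs_dir="/data/sales",
)
print(ddl)
```

You would run the resulting statement once in Hive (for example through Beeline), after which the table can be picked up by the In-DB tools. Note that LOCATION points at the HDFS directory holding the CSV, not at the file itself.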
Best regards,
Simon