Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

[Solution] Designer slow on Hive In-DB on first workflow run + Initialization Timed Out

sraynal
7 - Meteor

Hi, 

 

Maybe this post will help someone.
I'm an Alteryx administrator. That problem gave me a headache for a week.


Problem
We use Alteryx 2022.1 connected to Hive In-DB.
We faced these issues : 

 

  • The first run of the day of a workflow using the Hive In-DB connection takes 4 minutes more than it should.
    The next runs on any workflows using the same connection are ok.

    Exemple : 

It's 9am, John run his workflow for the first time of the day.
It takes 7 min.
The next runs take 3 min.

 

  • If a user open a workflow connected to Hive In-DB and does not run it, the meta-data refresh are very slow and sometime end up with this message :
    "Error: Designer x64: Initialization Timed Out: Workflow must be run for field meta info to be accurate."

    The metada refresh is smooth if the user run one workflow using the same connection.

    Exemple : 

It's 9am, John open his Alteryx workflow to check some rules. He does not need to run it. 
Metadata refresh are super slow, he gets the "Initialization Timed Out", it's a pain for him to work.
Then he runs another workflow using the same connection.
By some black magic, the refresh time are now smoothly on the other workflow.

 

 

After some research we found that Alteryx tries to get info on existing temporaries tables.
Here are the Simba Hive driver logs.

 

 

 

 

 

Mar 15 11:51:02.126 INFO  680 Simba::ODBC::Statement::SQLGetStmtAttrW: Attribute: SQL_ATTR_IMP_ROW_DESC (10012)
Mar 15 11:51:02.126 INFO  680 Simba::ODBC::Statement::SQLGetStmtAttrW: Attribute: SQL_ATTR_IMP_PARAM_DESC (10013)
Mar 15 11:51:02.126 INFO  680 Simba::ODBC::Connection::ExecuteCatalogFunction: SQLPrimaryKeys("HIVE", NULL, "SHOW TABLES like 'ayx*'")
Mar 15 11:51:02.126 DEBUG 680 Simba::Hardy::HardyHiveClientFactory::CreateClient: Create HS2 client.
Mar 15 11:51:02.126 DEBUG 680 Simba::Hardy::HardyHiveCxnPool::GetHS2Cxn: Get HS2 connection from pool.
Mar 15 11:52:02.127 ERROR 680 Simba::Hardy::HardyTCLIServiceClient::GetPrimaryKeys: TTransportException: 
Mar 15 11:53:02.144 DEBUG 680 Simba::Hardy::HardyHiveCxnFactory::CreateThreadSafeHS2CxnWithoutServiceDiscovery: Create HS2 HTTP transport.
Mar 15 11:53:02.144 DEBUG 680 Simba::Hardy::HardyHiveCxnFactory::GetTSslSocketFactory: +++++ exit +++++: 
Mar 15 11:53:02.144 DEBUG 680 Simba::Hardy::HardyHiveCxnFactory::CreateTSocket: +++++ exit +++++
Mar 15 11:55:02.864 DEBUG 680 Simba::ThriftExtension::TETCLIServiceUtils::LogTOperationHandle: __isset.modifiedRowCount=false
Mar 15 11:55:02.864 DEBUG 680 Simba::ThriftExtension::TETCLIServiceUtils::LogTOperationHandle: hasResultSet=false
Mar 15 11:55:02.864 DEBUG 680 Simba::ThriftExtension::TETCLIServiceUtils::LogTHandleIdentifier: guid=
Mar 15 11:55:02.864 DEBUG 680 Simba::ThriftExtension::TETCLIServiceUtils::LogTOperationHandle: operationType=EXECUTE_STATEMENT
Mar 15 11:55:02.864 DEBUG 680 Simba::Hardy::HardyTCLIServiceUtils::ThrowOnNotSuccess: Backend returned non-SUCCESS status for operation GetPrimaryKeys: 
    TStatus.statusCode=ERROR_STATUS
    TStatus.infoMessages="*org.apache.hive.service.cli.HiveSQLException:org.apache.thrift.protocol.TProtocolException: Required field 'db_name' is unset! Struct:PrimaryKeysRequest(db_name:null, tbl_name:SHOW TABLES like 'ayx*', catName:hive):49:48""org.apache.hive.service.cli.operation.GetPrimaryKeysOperation:runInternal:GetPrimaryKeysOperation.java:100""org.apache.hive.service.cli.operation.Operation:run:Operation.java:260""org.apache.hive.service.cli.session.HiveSessionImpl:getPrimaryKeys:HiveSessionImpl.java:1028""sun.reflect.GeneratedMethodAccessor359:invoke::-1""sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43""java.lang.reflect.Method:invoke:Method.java:498""org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78""org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36""org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63""java.security.AccessController:doPrivileged:AccessController.java:-2""javax.security.auth.Subject:doAs:Subject.java:422""org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1730""org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59""com.sun.proxy.$Proxy76:getPrimaryKeys::-1""org.apache.hive.service.cli.CLIService:getPrimaryKeys:CLIService.java:416""org.apache.hive.service.cli.thrift.ThriftCLIService:GetPrimaryKeys:ThriftCLIService.java:812""org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys:getResult:TCLIService.java:1717""org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys:getResult:TCLIService.java:1702""org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39""org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39""org.apache.thrift.server.TServlet:doPost:TServlet.java:83""org.apache.hive.service.cli.thrift.ThriftHttpServlet:doPost:ThriftHttpServlet.java:210""javax.servlet.http.HttpServlet:service:HttpServlet.java:707""javax.servlet.http.HttpServlet:service:HttpServlet.java:790""org.eclipse.jetty.servlet.ServletHolder:handle:ServletHolder.java:848""org.eclipse.jetty.servlet.ServletHandler:doHandle:ServletHandler.java:585""org.eclipse.jetty.server.session.SessionHandler:doHandle:SessionHandler.java:224""org.eclipse.jetty.server.handler.ContextHandler:doHandle:ContextHandler.java:1180""org.eclipse.jetty.servlet.ServletHandler:doScope:ServletHandler.java:513""org.eclipse.jetty.server.session.SessionHandler:doScope:SessionHandler.java:185""org.eclipse.jetty.server.handler.ContextHandler:doScope:ContextHandler.java:1112""org.eclipse.jetty.server.handler.ScopedHandler:handle:ScopedHandler.java:141""org.eclipse.jetty.server.handler.gzip.GzipHandler:handle:GzipHandler.java:493""org.eclipse.jetty.server.handler.HandlerWrapper:handle:HandlerWrapper.java:134""org.eclipse.jetty.server.Server:handle:Server.java:539""org.eclipse.jetty.server.HttpChannel:handle:HttpChannel.java:333""org.eclipse.jetty.server.HttpConnection:onFillable:HttpConnection.java:251""org.eclipse.jetty.io.AbstractConnection$ReadCallback:succeeded:AbstractConnection.java:283""org.eclipse.jetty.io.FillInterest:fillable:FillInterest.java:108""org.eclipse.jetty.io.ssl.SslConnection:onFillable:SslConnection.java:259""org.eclipse.jetty.io.AbstractConnection$ReadCallback:succeeded:AbstractConnection.java:283""org.eclipse.jetty.io.FillInterest:fillable:FillInterest.java:108""org.eclipse.jetty.io.SelectChannelEndPoint$2:run:SelectChannelEndPoint.java:93""org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume:executeProduceConsume:ExecuteProduceConsume.java:303""org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume:produceConsume:ExecuteProduceConsume.java:148""org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume:run:ExecuteProduceConsume.java:136""java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149""java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624""java.lang.Thread:run:Thread.java:748""*org.apache.thrift.protocol.TProtocolException:Required field 'db_name' is unset! Struct:PrimaryKeysRequest(db_name:null, tbl_name:SHOW TABLES like 'ayx*', catName:hive):69:20""org.apache.hadoop.hive.metastore.api.PrimaryKeysRequest:validate:PrimaryKeysRequest.java:457""org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args:validate:ThriftHiveMetastore.java:-1""org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args$get_primary_keys_argsStandardScheme:write:ThriftHiveMetastore.java:-1""org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args$get_primary_keys_argsStandardScheme:write:ThriftHiveMetastore.java:-1""org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_primary_keys_args:write:ThriftHiveMetastore.java:-1""org.apache.thrift.TServiceClient:sendBase:TServiceClient.java:71""org.apache.thrift.TServiceClient:sendBase:TServiceClient.java:62""org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client:send_get_primary_keys:ThriftHiveMetastore.java:3994""org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client:get_primary_keys:ThriftHiveMetastore.java:3986""org.apache.hadoop.hive.metastore.HiveMetaStoreClient:getPrimaryKeys:HiveMetaStoreClient.java:2171""sun.reflect.GeneratedMethodAccessor105:invoke::-1""sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43""java.lang.reflect.Method:invoke:Method.java:498""org.apache.hadoop.hive.metastore.RetryingMetaStoreClient:invoke:RetryingMetaStoreClient.java:212""com.sun.proxy.$Proxy59:getPrimaryKeys::-1""sun.reflect.GeneratedMethodAccessor105:invoke::-1""sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43""java.lang.reflect.Method:invoke:Method.java:498""org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler:invoke:HiveMetaStoreClient.java:3133""com.sun.proxy.$Proxy59:getPrimaryKeys::-1""org.apache.hive.service.cli.operation.GetPrimaryKeysOperation:runInternal:GetPrimaryKeysOperation.java:89" 
    TStatus.sqlState=
    TStatus.errorCode=0
    TStatus.errorMessage="org.apache.thrift.protocol.TProtocolException: Required field 'db_name' is unset! Struct:PrimaryKeysRequest(db_name:null, tbl_name:SHOW TABLES like 'ayx*', catName:hive)" 
    TStatus.__isset.errorCode: true
    TStatus.__isset.errorMessage: true
    TStatus.__isset.infoMessages: true
    TStatus.__isset.sqlState: false
Mar 15 11:55:02.864 ERROR 680 Simba::ODBC::Statement::ExecuteCatalogFunction: [Simba][Hardy] (35) Error from server: error code: '0' error message: 'org.apache.thrift.protocol.TProtocolException: Required field 'db_name' is unset! Struct:PrimaryKeysRequest(db_name:null, tbl_name:SHOW TABLES like 'ayx*', catName:hive)'.

 

 

 

 

 

 

There is roughly 4 min between SQLPrimaryKeys("HIVE", NULL, "SHOW TABLES like 'ayx*'") and the error message.

The thing is Alteryx tries to do "stuff" with the temporaries tables by calling this function "SQLPrimaryKeys".
However the second attribute can't be NULL.
Hive retries a few times during 4 min then goes timeout.


Solution

We found that after every first run of each day on each connexion, Alteryx set these registry keys with the "today" date : 
HKEY_CURRENT_USER\SOFTWARE\SRC\Alteryx\InDB\LastTempTableCleanUp

sraynal_1-1680774983464.png


We think Alteryx tries to delete all remainning temporary tables everyday, when a client uses the connection for the first time of the day.

Which raise a question :
- What if a workflow generating temporary tables is running when the cleanup is done ? 

The solution to avoid timeout and slow refresh metadata is to update these key to the date of the day.


We do that with a script that runs on Windows startup and each time a user open Alteryx Designer.

We don't care about Alteryx cleaning up temporary tables. Our DBA already scripted that.


2 REPLIES 2
simonaubert_bd
13 - Pulsar

Hello @sraynal 

 

Well found !! To be frank, the temp tables are really badly managed by Alteryx:

-we can't give the exact path (database, schema..)

 

-when the worflow crashes, temporary tables aren't dropped

like you point, the issue is not only the time lost but more "- What if a workflow generating temporary tables is running when the cleanup is done ?" that can lead to simply... a crash in production !!

So right now, my recommandation to users is to not use temp table, replacing it with real table with a date in it (so that, it's deterministic)... and dropping this real tables just after with a macro (and with a batch for when workflows crashed).

simonaubert_bd
13 - Pulsar

Hello,

Here what I found in the the 2021.4 release notes :
https://help.alteryx.com/release-notes/designer/designer-20214-release-notes

 

simonaubert_bd_0-1680811362382.png

 

TDCB-1353
DE28666
In-DB tools: Temp tables don't get cleaned up.2021.4.1.04899Fixed


Sometimes, the fix is worse than the bug :(


Best regards,

Simon

Labels