I am looking for suggestions for integrating data flows with Alteryx workflows that both consume data from and publish data to a Kafka channel. From my research, kafka-python appears to be about the only option. I wanted to reach out to the community to find out what success people have had with this approach. Secondly, are there any other methods that have been used successfully? I would also like to suggest that Kafka be added as an option in the Input/Output tools.
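For reference, here's a minimal sketch of what the kafka-python approach looks like inside the Python tool in Designer - broker address, topic names, and the assumption of JSON-serialized messages are all placeholders for whatever your cluster actually uses:

```python
# Sketch: consume from / publish to Kafka inside the Alteryx Python tool.
# "broker1:9092" and "my-topic" are placeholders; pip install kafka-python.
import json
import pandas as pd
from ayx import Alteryx                      # Alteryx Python tool helper
from kafka import KafkaConsumer, KafkaProducer

# --- Publishing: one JSON message per incoming row ---
df = Alteryx.read("#1")                      # records from the workflow
producer = KafkaProducer(
    bootstrap_servers="broker1:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
for record in df.to_dict(orient="records"):
    producer.send("my-topic", value=record)
producer.flush()                             # block until delivery completes

# --- Consuming: drain what's on the topic now, then stop (batch-friendly) ---
consumer = KafkaConsumer(
    "my-topic",
    bootstrap_servers="broker1:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,                # stop after 5s with no messages
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
rows = [msg.value for msg in consumer]
Alteryx.write(pd.DataFrame(rows), 1)         # push records downstream
```

The `consumer_timeout_ms` part matters in Alteryx: without it the consumer blocks forever waiting for new messages, which a batch workflow can't do.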
Do you mean Alteryx Designer or Cloud Designer? This is the Desktop forum - so assuming you posted in the right place - can you explain how the architecture here is going to work? Let's say you have a Kafka input/output tool - how do you imagine that working?
I'll be building the workflow on desktop and then it will run on the server. It won't be on cloud. The output will be published on Kafka.
The output will be published TO Kafka. You can build an API call to push it as JSON to Confluent or wherever your Kafka cluster is housed - your Kafka broker is probably set up for Protobuf or JSON, and I'd publish via API as JSON. That said, I don't see this as a good design. At its most basic, Kafka is for tons of tiny amounts of streaming data, while Alteryx is really a batched process. Let's say your workflow runs 10 times every minute - that's a ton for Alteryx and requires lots of Server resources (i.e. cores/nodes/licenses etc.) - for Kafka, that's nothing. I'm not sure how Alteryx fits here, but I'd recommend using something else (e.g. Lambda?) either for the Alteryx side or for the Kafka side. There are more cost-effective ways to run Alteryx that frequently and to use Kafka that infrequently.
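To make the API route concrete, here's a rough sketch of pushing JSON through the Confluent REST Proxy (v2 API). The proxy URL and topic are placeholders, and I'm leaving out auth, which your cluster will almost certainly require:

```python
# Sketch: publish JSON records to Kafka via the Confluent REST Proxy.
# "rest-proxy.example.com" and "my-topic" are placeholders.
import json
import requests

REST_PROXY = "https://rest-proxy.example.com"
TOPIC = "my-topic"

# REST Proxy v2 expects {"records": [{"value": ...}, ...]}
payload = {"records": [{"value": {"id": 1, "status": "complete"}}]}

resp = requests.post(
    f"{REST_PROXY}/topics/{TOPIC}",
    headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
    data=json.dumps(payload),
)
resp.raise_for_status()
print(resp.json())   # partition/offset info for the written records
```

You could drive the same POST from the Download tool instead of Python if you'd rather keep the workflow code-free.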