This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
In the Data Science Blog article "Exploring Hidden Relationships: Alteryx Team 1's Journey at the HHS Opioid Code-a-Thon" (Blog Link) it talks about using Alteryx to combine disparate data from multiple sources into something loadable to a Neo4j's graph database. But no where in this great article does it actually step you through how it was done.
I have a use case to try and take logistic shipping information and optimize shipping routes and also compare the actual shipments against a "should be" routing guide.
I have done many wonderful things in Alteryx over the years, and I know there are tools within Alteryx like the R based Network Analysis and Neural Network or many of the spatial tools I could use, but I would like to see if using a Graph Database approach to this problem is a better solution.
The main issue I am having is how to automate, as much as possible, the ETL of multiple source files so that I can create file(s) with the proper Nodes, Properties and Relations structures/details that can then be loaded into Neo4j using Cypher.
I know this is an extremely open ended question I am asking here. I am just wondering if anyone has tried to undertake this activity and can provide me with a little bit of guidance on the best approach in Alteryx for creating properly formatted files that can be loaded into a graph database tool like Neo4J.
A very clever colleague of mine, @jfgirouard, created an app that uploads nodes and relations to a local NEO4J instance using the API. Each of the nodes is a json statement that is set as the payload of Dynamic Input Post command
The JSON statements are built on the fly from information pulled from one of our databases. The strings are built from the inside out, first building out the Cypher strings and then wrapping them in JSON. The heavy lifting is done by a series of formula tools that update EntityName1 and EntityName2
The process for creating Cypher files should be similar, minus the JSON wrapper part
ON CREATE SET n.prop = 0
ON CREATE SET a.name = 'me'
ON MATCH SET b.name = 'you'
This statement could be built using a formula like
ON CREATE SET a.name = '" + [A] +"'
ON MATCH SET b.name = '" + [B] +"'"
Of course on a large scale, you'd probably want to have the boiler plate set as a series of templates like this(assuming all Merge statement have the same format)
ON CREATE SET a.name = '__ANAME__'
ON MATCH SET b.name = '__BNAME__'
and use some mechanism to replace __ANAME__ and __BNAME__ at run time
Of course, the hard part will be digesting the Cypher manual:)
The above two formulas create the Cypher compliant files that I can load in Neo4J Browser.
Finally, with the continued growth of Graph Databases, it would be nice for Alteryx to creates a new tool set that would allow input/output connectors for Graph Databases like Neo4j which software tools like Pentaho and Talend already have. I will probably log a enhancement request on this.