Using Alteryx to Create File(s) to Load into Graph Database
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
In the Data Science Blog article "Exploring Hidden Relationships: Alteryx Team 1's Journey at the HHS Opioid Code-a-Thon" (Blog Link) it talks about using Alteryx to combine disparate data from multiple sources into something loadable to a Neo4j's graph database. But no where in this great article does it actually step you through how it was done.
I have a use case to try and take logistic shipping information and optimize shipping routes and also compare the actual shipments against a "should be" routing guide.
I have done many wonderful things in Alteryx over the years, and I know there are tools within Alteryx like the R based Network Analysis and Neural Network or many of the spatial tools I could use, but I would like to see if using a Graph Database approach to this problem is a better solution.
The main issue I am having is how to automate, as much as possible, the ETL of multiple source files so that I can create file(s) with the proper Nodes, Properties and Relations structures/details that can then be loaded into Neo4j using Cypher.
I know this is an extremely open ended question I am asking here. I am just wondering if anyone has tried to undertake this activity and can provide me with a little bit of guidance on the best approach in Alteryx for creating properly formatted files that can be loaded into a graph database tool like Neo4J.
Thanks,
Keith
Solved! Go to Solution.
- Labels:
- Datasets
- Preparation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
A very clever colleague of mine, @jfgirouard, created an app that uploads nodes and relations to a local NEO4J instance using the API. Each of the nodes is a json statement that is set as the payload of Dynamic Input Post command
{ "statements" : [ { "statement" : "MERGE (n:SourceEntity { name:\"EntityName\"}) RETURN n" } ] }
Once the nodes are uploaded, the relations are sent
{ "statements" : [ { "statement" : "MATCH (a:Table),(b:Table) WHERE a.name = \"EntityName1\" AND b.name = \"EntityName2\" CREATE (a)-[r:REFERENCES {} ]->(b) RETURN r" } ] }
The JSON statements are built on the fly from information pulled from one of our databases. The strings are built from the inside out, first building out the Cypher strings and then wrapping them in JSON. The heavy lifting is done by a series of formula tools that update EntityName1 and EntityName2
The process for creating Cypher files should be similar, minus the JSON wrapper part
MERGE (n)
ON CREATE SET n.prop = 0
MERGE (a:A)-[:T]-(b:B)
ON CREATE SET a.name = 'me'
ON MATCH SET b.name = 'you'
RETURN a.prop
This statement could be built using a formula like
"MERGE (a:A)-[:T]-(b:B)
ON CREATE SET a.name = '" + [A] +"'
ON MATCH SET b.name = '" + [B] +"'"
Of course on a large scale, you'd probably want to have the boiler plate set as a series of templates like this(assuming all Merge statement have the same format)
MERGE (a:A)-[:T]-(b:B)
ON CREATE SET a.name = '__ANAME__'
ON MATCH SET b.name = '__BNAME__'
and use some mechanism to replace __ANAME__ and __BNAME__ at run time
Of course, the hard part will be digesting the Cypher manual:)
Good Luck
Dan
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Dan,
Excellent, Thanks for the reply!
This give me a starting point that I can build out from.
Keith
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Just a quick update.
I created two formulas to create Nodes and Relationship files based on Dan's feedback. This was a simple test for linking cities together.
For Nodes I used -
CREATE(N:City " + "{id:" + ToString([CityID]) + ",name:" +"'"+ (ReplaceChar([City], "'", "\'")) +"'"+ "});"
Note - the ReplaceChar function put the "\" escape character in front of any embedded quotes in City Name
For Relationships I used -
"MATCH(from:City{id:" + ToString([from]) + "}),(to:City{id:" + ToString([to]) + "}) CREATE (from)-[:SHIP_TO]->(to);"
The above two formulas create the Cypher compliant files that I can load in Neo4J Browser.
Finally, with the continued growth of Graph Databases, it would be nice for Alteryx to creates a new tool set that would allow input/output connectors for Graph Databases like Neo4j which software tools like Pentaho and Talend already have. I will probably log a enhancement request on this.
Keith
