The previous two blogs in this series have been about loading data into snowflake and bulk updating data in Snowflake using a MERGE statement.
My fellow Sales Engineers would say, so what? Why did we load spatial data into Snowflake?
Well we have loaded spatial data into Snowflake, to show the art of the possible but also to set you up, my dear reader, to try out spatial In-DB queries for yourself.
Source: imgur
This blog follows the previous one on how to ulitise spatial data in Snowflake.
We will use the In-DB Tools, so we can really harness the power of your databases and extract the detail we need from the start of your workflows.
For example, for all stores within a particular county, we can calculate how many stores are within an area and find your nearest stores. These are all great examples of when spatial querying is used within an organisation. Let’s do this!
Source: gfycat
Resource: https://docs.snowflake.com/en/sql-reference/functions/st_within.html
Let’s walk this through… we could bring in two datasets and do a spatial match tool and get the answer we want but remember spatial data is large and power/memory intensive.
My alternative option here is to run a WITHIN query on the database itself using the In-DB tools. It would be much faster and lighter for your production database to do this rather than you try and bring down the dataset each time, especially if you were then to filter on the specific county you wanted rather than the whole country.
Here is a simple WITHIN query workflow for the whole country to start with.
Example Spatial SQL Within Clause:
Select s.ID, s.G STORE_LOCATION, c.ID county_id, c.G COUNTY_BOUNDARY
from "Database"."Schema"."STORES" s, "Database"."Schema"."COUNTY" c
where ST_WITHIN(s."G", c."G") = 1;
Using SQL, we are selecting the fields from the tables s (stores) and c (county) we would like. Then within the WHERE clause, we use the function ST_Within(point, polygon)=1.
Basically, we ask Snowflake to return only those which are within a county boundary. If you had points outside of any boundaries, they would not be returned.
Results: all my stores with their corresponding county boundary 😊
Resource: https://docs.snowflake.com/en/sql-reference/functions/st_intersects.html
Another great example of Spatial querying is the use of a bounding box. For example, you may be looking at data within a specific location on a map, between four corners of a screen. 😉 And you want to see all the data that falls within that area.
This is when you would utilise a bounding box
There are many ways within Alteryx to get the coordinates of a bounding box—I can share this another time.
For now, let’s take Southampton:
I would like to see all the stores that fall within this box/area that I am interested in.
Within Snowflake, this would be a ST_Intersects query, i.e., show me where on this plane (geometric) points intersect with it. Yes, we are treating this as 2D, not 3D.
Select * from "Database"."Schema"."STORES" s
Where ST_Intersects(s."G",
to_geography('POLYGON((-1.95682845068099 51.383915269452,-0.7293442833561 51.383915269452,-0.7293442833561 50.7061091658136,-1.95682845068099 50.7061091658136,-1.95682845068099 51.383915269452))'));
When you pop this code within your InDB tool, you get the results almost instantly.
You receive 750 results within 3.7 seconds using AMP.
To do this traditionally in Alteryx, you would bring in the dataset and do a spatial match which means loading the entire dataset, doing the match within Alteryx, and then outputting the results. Here, you start with the results right at the start of your workflow to begin doing the deeper analysis. 😊
The time saving here on 17K points is around 5.9-3.7=2.2 seconds with AMP.
More than one table? No problem…
(Select * from "Database"."Schema"."STORES" s
Where ST_Intersects(s."G",
to_geography('POLYGON((-1.95682845068099 51.383915269452,-0.7293442833561 51.383915269452,-0.7293442833561 50.7061091658136,-1.95682845068099 50.7061091658136,-1.95682845068099 51.383915269452))')))
UNION
(Select * from "Database"."Schema"."COUNTY" c
Where ST_Intersects(c."G",
to_geography('POLYGON((-1.95682845068099 51.383915269452,-0.7293442833561 51.383915269452,-0.7293442833561 50.7061091658136,-1.95682845068099 50.7061091658136,-1.95682845068099 51.383915269452))')));
Wrap each one in brackets and stack them with a UNION clause. Keep the results simple here, i.e., only return details from each table where the columns align: name, description, spatial object, etc.
Can Snowflake do that too? Yes, it can.
Resource: https://docs.snowflake.com/en/sql-reference/functions/st_distance.html
Quick use case, please tell me the top 5 stores closest to this point: -1.343086 51.045012
GeoJSON format: {"type":"Point", "coordinates":[-1.34308636701855,51.0450122176328]}
Here’s the spatial SQL to find the distance between the origin and the destinations:
Select
s."ID",
'{"type":"Point", "coordinates":[-1.34308636701855,51.0450122176328]}' as origin,
s."G" as destination,
ST_DISTANCE(to_geography('POINT(-1.34308636701855 51.0450122176328)'),s."G")/1000 as kilometers,
ST_DISTANCE(to_geography('POINT(-1.34308636701855 51.0450122176328)'),s."G")/1609 as miles
from "Database"."Schema"."STORES" s
order by
ST_DISTANCE(to_geography('POINT(-1.34308636701855 51.0450122176328)'),s."G")/1000;
Want the top 5? No problem – just add a limit clause at the end of your SQL statement, in this case LIMIT 5;.
Select
s."ID",
'{"type":"Point", "coordinates":[-1.34308636701855,51.0450122176328]}' as origin,
s."G" as destination,
ST_DISTANCE(to_geography('POINT(-1.34308636701855 51.0450122176328)'),s."G")/1000 as kilometers,
ST_DISTANCE(to_geography('POINT(-1.34308636701855 51.0450122176328)'),s."G")/1609 as miles
from "Database"."Schema"."STORES" s
order by
ST_DISTANCE(to_geography('POINT(-1.34308636701855 51.0450122176328)'),s."G")/1000
LIMIT 5;
It took 6.2 seconds to calculate the distance between the location provided and all the stores within the stores table (~17K) but what is great is that we have all origins and destinations alongside their distances in both kilometers and miles.
We hope you have enjoyed this mini-series using Alteryx, Snowflake & Spatial Methods.
Did you enjoy this series? Did it help you within your organisation? Please let us know your use cases and how you applied these techniques.
WANT MORE: We would love to know what you would like to learn more of next, please let us know in the comments.
Samantha is well known for sharing her knowledge and evangelising Alteryx with new users both in her role as a Sales Engineer and at INSPIRE. Her passion lies with making every day business chaos, simple and automated giving people hours back in their day jobs and personal lives. With over 15 years experience in data, she enjoys solving puzzles, spatial analytics, writing blogs and helping others develop themselves into gurus like herself.
Samantha is well known for sharing her knowledge and evangelising Alteryx with new users both in her role as a Sales Engineer and at INSPIRE. Her passion lies with making every day business chaos, simple and automated giving people hours back in their day jobs and personal lives. With over 15 years experience in data, she enjoys solving puzzles, spatial analytics, writing blogs and helping others develop themselves into gurus like herself.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.