
Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
Samantha_Jayne
Alteryx

Samantha_Jayne_0-1665008505790.png

 

If you caught my previous blog post on bulk uploading data into Snowflake with speed, then you know I left you a little promise: to show you how to bulk upload spatial data to Snowflake as well. Using primary keys, you can use this same process to bulk update. Happy days.

 

Before we jump into how to do this, we need to understand a couple of things about Snowflake and spatial data.

 

Firstly, the good news: Snowflake can handle spatial geography. However, the data must be presented in a particular way, and Snowflake's documentation is exceptionally helpful here. Here is a snippet; the link is below if needed.

 

Samantha_Jayne_1-1665008505804.png

Source: Snowflake Geography Documentation

 

The two most common formats are WKT (Well-Known Text) and GeoJSON; the others are variations on the same theme.
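To make the difference concrete, here is the same point written in both formats (the coordinates are illustrative values, not from the dataset):

```python
import json

# WKT: a plain-text geometry string
wkt_point = "POINT(-0.1276 51.5072)"

# GeoJSON: a JSON object with a type and a coordinates array
geojson_point = {"type": "Point", "coordinates": [-0.1276, 51.5072]}

# Either string form can be loaded into a Snowflake geography column
geojson_string = json.dumps(geojson_point)
print(wkt_point)
print(geojson_string)
```

Both describe the same longitude/latitude pair; GeoJSON just wraps it in JSON structure.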

 

I have created a macro for WKT that I will share on Gallery (Sam, you just keep on giving – you are welcome!). However, Alteryx is a wonderful tool because it handles spatial data as GeoJSON natively. All you need to do is change the data type from SpatialObj to VW_String within the Select tool, and you have exactly what you need – no conversions required. Fantastic and simple! Believe it or not, it is easily forgotten, so bookmark this page now! 😀😀

 

Now, all we need is a dataset. I have again used Geolytix Retail Points to support me on my quest. We are simply going to:

  1. Input the data
  2. Create Points
  3. In the Select tool:
    1. Rename Centroid to g
    2. Change its data type to VW_String (reducing the string length to something reasonable – Snowflake doesn’t like MAX length strings)
    3. For this example, I have turned all other fields off, but you can leave them on. It just means adding more fields in your SQL code. (I’ll show you where to do this!)
  4. Create an ID with a Record ID tool, then add a Browse tool
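The steps above can be sketched outside Alteryx in a few lines of Python – the field names id and g match the table we create later, and the store coordinates are made up for illustration:

```python
import json

# Hypothetical store locations as (longitude, latitude) pairs
stores = [(-0.1276, 51.5072), (-2.2426, 53.4808), (-3.1883, 55.9533)]

# Mirror the workflow: create points, serialise each centroid as a
# GeoJSON string (the "g" column), and add a record ID (the "id" column)
rows = [
    {"id": i, "g": json.dumps({"type": "Point", "coordinates": [lon, lat]})}
    for i, (lon, lat) in enumerate(stores, start=1)
]

for row in rows:
    print(row["id"], row["g"])
```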

 

Writing Points.PNG

 

 

The data should look like this:

 

Samantha_Jayne_3-1665008505822.png

 

Now we need to focus on the Output tool to push the data into Snowflake. My previous blog was all about Post SQL statements within the Output tool.

 

To upload spatial data into Snowflake, we utilise both Pre and Post SQL statements.

 

Spatial Write Points Output.PNG

 

First Stop PreSQL

 

Pre SQL Point Write.PNG

 

drop table "Database"."Schema"."STORES";
create table "Database"."Schema"."STORES" (id integer, g geography);

 

As you can see, the PreSQL statements are very simple. In fact, the first time you run the workflow, you only need the create table statement.

 

Let’s see if you are reading or just copying code :D. If you run the workflow with the drop table line before the table exists, you will get an error.
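If you want the Pre SQL to be safe on the very first run too, Snowflake supports an IF EXISTS clause on DROP TABLE; a sketch using the same placeholder database, schema and table names as above:

```sql
-- Safe on the first run: no error if the table does not exist yet
drop table if exists "Database"."Schema"."STORES";
create table "Database"."Schema"."STORES" (id integer, g geography);
```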

 

Here you are creating the table using Snowflake syntax to ensure that it includes a geography column. You would include all the columns you want to write to the database, alongside their correct data types.
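For example, if you had left extra fields on in the Select tool, the create statement would grow accordingly – the store name and town columns here are hypothetical:

```sql
-- Hypothetical wider table: one column per field left on in the Select tool
create table "Database"."Schema"."STORES" (
    id integer,
    store_name varchar(255),
    town varchar(100),
    g geography
);
```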

 

Second Stop PostSQL

 

Point Write Post SQL.PNG

 

MERGE INTO "Database"."Schema"."STORES" t1
using "Database"."Schema"."STORES_Temp" t2
on t1."ID" = t2."id"

WHEN MATCHED then UPDATE SET
t1."G" = t2."g"

WHEN NOT MATCHED THEN
INSERT (t1."ID", t1."G")
VALUES (t2."id", t2."g");

DROP TABLE "Database"."Schema"."STORES_Temp";

 

Now we use a Snowflake MERGE statement to push the data from the TEMP table into the PRODUCTION table. The GeoJSON is copied from a string field into a geography field, which is why it needs to be in a format Snowflake accepts. 😉
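The MERGE semantics – update on a key match, insert otherwise – can be sketched with plain Python dicts (the keys mirror the ID and G columns above; the data is made up):

```python
# Simulate MERGE: the production table, keyed by ID -> geography string
production = {1: '{"type":"Point","coordinates":[-0.13,51.51]}'}

# The temp table Alteryx just wrote: row 1 updated, row 2 new
temp = {
    1: '{"type":"Point","coordinates":[-0.1276,51.5072]}',
    2: '{"type":"Point","coordinates":[-2.2426,53.4808]}',
}

for row_id, geo in temp.items():
    # WHEN MATCHED: update; WHEN NOT MATCHED: insert
    production[row_id] = geo

# The temp table is dropped afterwards to keep the database clean
temp.clear()
```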

 

Then we drop the TEMP table to keep the database clean.

 

Now we have Spatial data in Snowflake!

 


 

Let’s do this again but now push some Polygon data into Snowflake.

 

Write Polygon.PNG

 

The process here is similar to the one above:

  1. Load the spatial data
  2. Change the data type of SpatialObj to VW_String, and set the size to something reasonably large
  3. Add a Record ID tool
  4. Output tool
    1. PreSQL – remember it’s database.schema.table (and it’s your own database and schema names)

 

drop table "Database"."Schema"."COUNTY";
create table "Database"."Schema"."COUNTY" (id integer, g geography);

 

2. PostSQL

 

MERGE INTO "Database"."Schema"."COUNTY" t1
using "Database"."Schema"."COUNTY_Temp" t2
on t1."ID" = t2."id"

WHEN MATCHED then UPDATE SET
t1."G" = t2."g"

WHEN NOT MATCHED THEN
INSERT (t1."ID", t1."G")
VALUES (t2."id", t2."g");

DROP TABLE "Database"."Schema"."COUNTY_Temp";
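Polygons follow the same pattern as points, just with a nested ring of coordinates in the GeoJSON – this square is an illustrative shape, not real county data:

```python
import json

# A polygon is a list of rings; the first ring is the outer boundary,
# and it must close (first coordinate repeated as the last)
county = {
    "type": "Polygon",
    "coordinates": [[
        [-1.0, 52.0], [-1.0, 53.0], [-2.0, 53.0], [-2.0, 52.0], [-1.0, 52.0],
    ]],
}

# This string is what lands in the "g" varchar column before the MERGE
g = json.dumps(county)
print(g)
```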

 

Spatial Write Polygon Output.PNG

 

Congratulations, you now have a point dataset and a polygon dataset in Snowflake!

 

Tune in same time next week to learn how to do cool spatial SQL in Snowflake using Alteryx InDB tools!

 

Samantha Clifton
CGeog | ACE Emeritus | Sales Engineering

Samantha is well known for sharing her knowledge and evangelising Alteryx with new users, both in her role as a Sales Engineer and at INSPIRE. Her passion lies in making everyday business chaos simple and automated, giving people hours back in their day jobs and personal lives. With over 15 years’ experience in data, she enjoys solving puzzles, spatial analytics, writing blogs, and helping others develop into gurus like herself.


Comments
vizwhiz
5 - Atom

Several issues when attempting to follow this process while loading the Census Tract shapefile from the US Census Bureau (https://hazards.fema.gov/nri/data-resources#shpDownload). I only attempted to load one statefips (11, District of Columbia):


PostSQL did not work for DC, returning an error. Using the same SQL within Snowflake also resulted in an error message: Geography validation failed: Loop 0: Edge (-77.089012,38.903912) -- (-77.088527,38.903713) crosses edge (-77.088552,38.903721) -- (-77.089002,38.903908)

 

StateFIPS for Hawaii (15) did not return an error, but even though the resulting field seems to be a valid geography datatype, Tableau version 2023.1.2 picks that field up as a text field, not as a spatial object. Using the ST_DIMENSION function for the Hawaii data returns that all the data is of type '2', or Polygon (per this article re: Snowflake and Tableau -- https://medium.com/starschema-blog/tableau-map-layers-snowflake-geospatial-support-b3b0c1ec0847)

So, the two questions remain for me:
1. How to overcome the US Census Bureau's data that returns edge validation errors
2. Even if the data does reside in Snowflake, what's up with Tableau not recognizing the spatial geography datatype


vizwhiz
5 - Atom

One more comment -- I just did the simple approach: adjusted the spatial field to string, then uploaded it to a temp table in Snowflake. Within Snowflake, I then used this SQL to create my final table, using the try_to_geography function rather than just to_geography.

create or replace table db.schema.NEW_TABLE_WITH_GEOGRAPHY copy grants as
select *, try_to_geography(geometry) GEOGRAPHY from db.schema.UPLOADED_TABLE_WITH_GEOMETRY_AS_VARCHAR

 

Once you are satisfied that the process works, this could also have been done as a Post SQL statement rather than in Snowflake directly; or, as @blyons mentioned, by uploading with a blank geography-typed field and then using an update statement.