
Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
Samantha_Jayne

Please note: Location Intelligence now supports Zipped Shapefile uploads directly in the application. For a full list of supported file types, view this documentation.

 

Congratulations! You are one of the lucky people who have got their hands on Alteryx Location Intelligence, and now you want to see your data on the amazing new interactive map. But what is this geoparquet format, and how do I convert my existing shapefiles to geoparquet? Well, you are in the right place; this blog post will help you answer both of these questions.

 

Users expect online maps to be fast, and this is where the parquet file format comes in. Geoparquet is a speedy, lightweight format that is easily readable and interoperable across many geospatial platforms. For more information: https://geoparquet.org/

 

Alteryx Location Intelligence reads parquet files and the geometry column provided within them, which is fabulous as it gives us worldwide data at localized data speeds. 😊

 

So, I know what you are all thinking: Sam, how do I get my data into geoparquet format? I have shapefiles… will they work? The great news is, yes, you can convert your shapefiles into parquet with your very own Designer license using the GeoPandas Python library.

 


 

You can do this in Designer, so if you were wondering how, this blog will give you all you need to get up and running. We are going to build a workflow with a macro to support you on your quest.

 

The finished product will look like this:

 

image002.png

 

But wait, we need some Python libraries to achieve this task: GeoPandas and an updated Shapely, plus pyarrow for writing parquet.

 

To install GeoPandas on Designer, you must use the Python tool and run Designer in administrator mode. To run Designer in administrator mode, simply right-click the Designer icon when opening and click Run as Administrator.

 

image004.png

 

 

Once Designer is open, drop a Python tool onto your canvas and run the following code:

 

 

 

 

from ayx import Package
Package.installPackages(['pandas==1.1.0', 'geopandas==0.10.2','Shapely==1.8.2','pyarrow'])

 

 

 

image005.png

 

 

Now you should have everything you need to achieve this task. Once installed, comment out this line in your code with a “#” so that you know the requirements of the task but don’t need to run it every time.

 

image007.png
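If you are not sure whether the install has already been done on a machine, a small check like the one below can tell you before you uncomment the install line again. This is only a sketch: it uses the standard library to look for the import names of the packages installed above, and the helper name missing_modules is made up for this example.

```python
import importlib.util

# Import names (not pip names) of the libraries this post relies on.
REQUIRED_MODULES = ["pandas", "geopandas", "shapely", "pyarrow"]


def missing_modules(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    # Only in this case is it worth re-running the install cell as administrator.
    print(f"Not installed yet: {missing}")
else:
    print("All required libraries are already installed.")
```

If the list comes back empty, you can leave the install line commented out.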

 

The error message below is the kind of error you get if you don’t have the latest Shapely library installed: “a polygon does not itself provide the array interface. Its rings do.” If you see this error, investigate your Shapely library version. 😉

 

image008.png
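One way to investigate the Shapely version is to ask Python which version is installed and compare it with the one pinned in the install line above (1.8.2). The snippet below is a minimal sketch using only the standard library; the helper names are hypothetical, and the version parsing is deliberately simple (it keeps only the leading numeric parts).

```python
from importlib import metadata
from itertools import takewhile


def installed_version(dist_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.8.2' into (1, 8, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(dist_name, minimum):
    """True when the package is installed at or above the minimum version."""
    found = installed_version(dist_name)
    return found is not None and version_tuple(found) >= version_tuple(minimum)


print("Shapely installed:", installed_version("Shapely"))
print("Shapely is 1.8.2 or newer:", meets_minimum("Shapely", "1.8.2"))
```

If the second line prints False, re-running the install cell as administrator is the first thing to try.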

 

Let’s build a macro

 

This is the project structure we are working in today: Create a folder called GeoParquet with the following folders inside:

 

image010.png

 

Your finished macro workflow should look like this:

 

SHP2PARQUET MACRO.PNG

 

First things first, you need a Text Input tool that feeds in the fields the macro requires:

  • FullPath
  • FileName
  • Output_Location

Once you have these sorted in your text input, you can right-click and turn it into a Macro Input for your macro.

 

image011.png
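If you want to sanity-check the rows you feed into the macro, the idea can be sketched in plain Python as below. The three field names come from the list above; the helper name missing_fields and the example paths are made up for illustration.

```python
# The three fields the macro expects on every incoming row.
REQUIRED_FIELDS = ("FullPath", "FileName", "Output_Location")


def missing_fields(row):
    """Return the required field names that are absent or blank in a row (a dict)."""
    return [field for field in REQUIRED_FIELDS if not str(row.get(field) or "").strip()]


# Hypothetical example row with an empty Output_Location.
example_row = {
    "FullPath": r"C:\GeoParquet\Input\roads.shp",
    "FileName": "roads.shp",
    "Output_Location": "",
}
print(missing_fields(example_row))
```

An empty list means the row has everything the macro needs.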

 

Now we are going to use a formula tool to update Output_Location: append the FileName to it, replacing .shp with .parquet. This keeps the original name of the file and changes only the extension to the one we need for the output.

 

image012.png
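For anyone who thinks better in Python than in formula expressions, the same path logic can be sketched with pathlib. This is an illustration of the idea, not the formula used in the macro; the function name and folder paths are hypothetical, and PureWindowsPath is used because Designer runs on Windows.

```python
from pathlib import PureWindowsPath


def build_output_path(output_folder, file_name):
    """Join the output folder with the file name, swapping its extension for .parquet."""
    parquet_name = PureWindowsPath(file_name).with_suffix(".parquet").name
    return str(PureWindowsPath(output_folder) / parquet_name)


# Hypothetical folder and file name.
print(build_output_path(r"C:\GeoParquet\Output", "roads.shp"))
```

Using with_suffix replaces only the final extension, so an upper-case .SHP is handled the same way as .shp.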

 

Now it's time for the Python tool. Don’t worry, you have already dealt with the missing packages. 😉 If you haven’t, please see above!

 

Copy the following code block into a Python tool, into a new cell.

 

 

 

 

#import libraries required
from ayx import Alteryx
import geopandas as gpd
import warnings

#read in data from connection #1 into the python tool
InputDF = Alteryx.read('#1')

#function to convert each shapefile row provided into a new geoparquet file
def convert_shp_to_geoparquet(row):
    try:
        #ignore parquet warnings
        warnings.filterwarnings('ignore', message='.*initial implementation of Parquet.*')
        
        # Extract the shapefile path and output path from the row
        shp_path = row['FullPath']
        output_path = row['Output_Location']
        
        print(shp_path)
        print(output_path)
        # Read the shapefile using geopandas
        gdf = gpd.read_file(shp_path)     

        # Convert the geopandas GeoDataFrame to GeoParquet format
        gdf.to_parquet(output_path)
      
        print(f"Shapefile converted to GeoParquet successfully: {shp_path} -> {output_path}")
        return "Shapefile converted successfully."

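    except Exception as e:
        print(e)
        # Return the error as text so the Results column stays a plain string column
        return str(e)

# Run the conversion for every row of InputDF, using its 'FullPath' and 'Output_Location' columns
InputDF['Results'] = InputDF.apply(convert_shp_to_geoparquet, axis=1)
# Send the input rows plus the Results column to output anchor 1
Alteryx.write(InputDF, 1)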

 

 

 

 

Want to understand a bit more? The GeoPandas library allows us to read and transform spatial data.

 

To convert from shapefile to geoparquet, you first need to read your shapefile into a GeoDataFrame:

 

 

 

 

gdf = gpd.read_file(shp_path)
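A shapefile is normally a set of files sharing one name (.shp, .shx, .dbf and often .prj), and a missing companion file is a common reason for a read to fail or to come back without attributes. The sketch below checks for the .shx and .dbf companions using only the standard library. The helper name and the demo file names are made up, and the demo creates empty placeholder files in a temporary folder rather than real shapefiles.

```python
from pathlib import Path
import tempfile

# Companion files that normally sit next to a .shp file.
REQUIRED_SIDECARS = (".shx", ".dbf")


def missing_sidecars(shp_path):
    """Return the companion extensions that are not found next to the .shp file."""
    shp = Path(shp_path)
    present = {
        p.suffix.lower()
        for p in shp.parent.iterdir()
        if p.stem.lower() == shp.stem.lower()
    }
    return [ext for ext in REQUIRED_SIDECARS if ext not in present]


# Demo with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "roads.shp").touch()
    (folder / "roads.shx").touch()
    demo_missing = missing_sidecars(folder / "roads.shp")

print(demo_missing)
```

Here the .dbf file was left out on purpose, so it is the one reported as missing.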

 

 

 

 

Then we call the GeoPandas conversion method on our GeoDataFrame: to_parquet()

 

 

 

 

gdf.to_parquet(output_path)

 

 

 

 

To double-check your work, you can read it back in as parquet.

 

 

 

 

gdf = gpd.read_parquet(output_path)
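If you just want a quick sanity check without loading the data, you can also look at the file itself: Parquet files begin and end with the four bytes PAR1. The sketch below tests for that marker with the standard library only. It is a rough check under that one assumption and does not validate the geometry or the geo metadata; the function name is hypothetical, and the demo writes fake bytes to a temporary folder rather than a real parquet file.

```python
import os
import tempfile

# Parquet files start and end with this 4-byte marker.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True when the file starts and ends with the Parquet marker bytes."""
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = f.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# Demo with fake bytes, only to show the check itself.
with tempfile.TemporaryDirectory() as tmp:
    fake_parquet = os.path.join(tmp, "fake.parquet")
    with open(fake_parquet, "wb") as f:
        f.write(PARQUET_MAGIC + b"\x00" * 16 + PARQUET_MAGIC)
    not_parquet = os.path.join(tmp, "notes.txt")
    with open(not_parquet, "wb") as f:
        f.write(b"this is just text, not Parquet")
    demo_results = (looks_like_parquet(fake_parquet), looks_like_parquet(not_parquet))

print(demo_results)
```

Reading the file back with gpd.read_parquet remains the fuller test, since it also confirms the geometry column loads.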

 

 

 

 

When running successfully, you should get the following results:

 

Samantha_Jayne_9-1690911712278.png

 

Overall, it’s very simple to build the macro and tools you need to create geoparquet files directly within Designer. If time is of the essence and you desperately need to convert, please see this link to download the macro directly from the Gallery.

 

All you need to do is point a directory tool at the location of your shapefiles and build an output location in a formula tool, and you are good to go. (As long as you have installed the GeoPandas library!)
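For reference, here is roughly what that Directory tool plus formula tool combination produces, sketched in plain Python. The field names match the macro input described earlier; the function name and the folder and file names in the demo are made up, and the demo uses empty placeholder files in a temporary folder.

```python
from pathlib import Path
import tempfile


def build_macro_rows(shapefile_folder, output_folder):
    """List the .shp files in a folder and build one macro input row per file."""
    rows = []
    for shp in sorted(Path(shapefile_folder).iterdir()):
        if shp.suffix.lower() != ".shp":
            continue  # skip .dbf, .shx, .prj and anything else
        rows.append({
            "FullPath": str(shp),
            "FileName": shp.name,
            "Output_Location": str(Path(output_folder) / shp.with_suffix(".parquet").name),
        })
    return rows


# Demo with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("a_roads.shp", "b_rivers.SHP", "a_roads.dbf"):
        (folder / name).touch()
    demo_rows = build_macro_rows(folder, folder / "out")

for row in demo_rows:
    print(row["FileName"], "->", Path(row["Output_Location"]).name)
```

Each dictionary corresponds to one row going into the macro, so only the .shp files are passed on for conversion.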

 

The Spatial SME Team is putting spatial back at the heart of analytics because everything happens somewhere!

 

Please see example shapefiles attached to this blog should you wish to try this out. 

 

To learn how to put your geoparquet data on the interactive map, check out the next article in this series.

Samantha Clifton
CGeog | ACE Emeritus | Sales Engineering

Samantha is well known for sharing her knowledge and evangelising Alteryx with new users, both in her role as a Sales Engineer and at INSPIRE. Her passion lies in making everyday business chaos simple and automated, giving people hours back in their day jobs and personal lives. With over 15 years' experience in data, she enjoys solving puzzles, spatial analytics, writing blogs, and helping others develop themselves into gurus like herself.
