Distance Matrix Weather Tool
on 05-16-2023 12:15 PM - edited on 10-29-2024 11:30 AM by baileykaplan
In this guide, we use the Alteryx Python SDK and Alteryx Plugin CLI to create a tool that connects directly to 2 public APIs and gathers the weather forecast for, and distance to, nearby cities. We then use this information to determine the best locations to hold promotional events.
Suppose you're an analyst for a retail chain and need to give advice on when to hold a promotion for a specific product in the region. Historically, people tend to stay home when the weather is bad, so you want to use the forecast to determine the best cities in which to hold the next promotion.
The weather forecast is available in JSON format from WeatherAPI, and the travel distance, also in JSON, is available from Google Maps' Distance Matrix API. The combined data from the 2 sources will help determine a suitable location.
We use the requests package to make HTTP requests to both of these APIs.
Parsing JSON in Python is relatively straightforward but can become quite verbose with deeply nested objects. To alleviate this, we use the jsonpath-rw-ext package, which lets you pull values out of JSON with compact path expressions, much as regular expressions do for text.
Table of Contents
- Get WeatherAPI Key
- Get Google Maps API Key
- Basic Plugin Setup
- Write a Plugin
- Packaging into a YXI
- Run the Test Client
- Install in Designer
- Run in Designer
Get WeatherAPI Key
WeatherAPI offers weather data via a JSON RESTful API. While it is free, it does require an API key to make requests. Sign up to get an API key which will be used in this demo.
Get Google Maps API Key
The Google Maps Distance Matrix API also requires an API key, which in turn requires a Google billing account. While it's technically not a free API, a $200 credit is applied automatically each month. For non-GCP users, there is a trial in which $300 worth of API calls can be consumed without charge. With the first 100,000 API calls costing $0.005 each, this demo is unlikely to exceed the credit.
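To put those figures in perspective, the arithmetic can be worked through in a few lines. This is only a sketch that takes the credit and per-call price quoted above at face value; actual billing terms may differ.

```python
# Figures quoted in the guide, expressed in thousandths of a dollar
# so the arithmetic stays in exact integers.
monthly_credit_mills = 200 * 1000  # $200 monthly credit
price_per_call_mills = 5           # $0.005 per call

# Number of calls the monthly credit would cover at that price.
calls_covered = monthly_credit_mills // price_per_call_mills
print(calls_covered)
```

A workflow that looks up a handful of cities per run stays far below that number.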
You can generate an API key from the Credentials tab on the GCP APIs & Services page.
Basic Plugin Setup
Before you proceed, please ensure you have this basic setup:
- Create a Workspace to house your plugin.
- Add a new plugin.
- When prompted choose the single-input-single-output (default) plugin type.
- To match what is found in this guide, choose WeatherDistance as the plugin name.
After you run the above setup procedures, you will have a file named weather_distance.py under ./backend/ayx_plugins with generated boilerplate. When you open the file, you should see something like this:
class WeatherDistance(PluginV2):
    """Concrete implementation of an AyxPlugin."""
    def __init__(self, provider: AMPProviderV2):
        """Construct the plugin."""
        self.name = "Weather Distance"
        # truncated code
    def on_record_batch(self, batch: "pa.Table", anchor: Anchor) -> None:
        self.provider.write_to_anchor("Output", batch)
    def on_incoming_connection_complete(self, anchor: Anchor) -> None:
        self.provider.io.info(
            f"Received complete update from {anchor.name}:{anchor.connection}."
        )
    def on_complete(self) -> None:
        self.provider.io.info(f"{self.name} tool done.")
Write a Plugin
Dependencies
The first step is to update the ./backend/requirements-thirdparty.txt file and tell Python that we depend on jsonpath_rw_ext as a 3rd-party dependency. At the time of this writing, 1.2.2 is the current stable version. Add this line to the file:
jsonpath_rw_ext==1.2.2
The second step is to pip install all of the packages listed in ./backend/requirements-thirdparty.txt. From the backend directory, run:
pip install -r requirements-thirdparty.txt
ℹ️ Note that we don't have to add requests to the requirements file, as it's already a dependency of the Python SDK.
Imports
The next step is to update the Python module's imports. Imports tell Python what other code the plugin references. Add the required import statements near the top of your weather_distance.py file. Your code should look like this:
import re
from typing import List
from ayx_python_sdk.core import (
    Anchor,
    PluginV2,
)
from ayx_python_sdk.providers.amp_provider.amp_provider_v2 import AMPProviderV2
import pyarrow as pa
import requests
import jsonpath_rw_ext as jp_ext
ℹ️ Note that the default plugin imports pyarrow as pa only if TYPE_CHECKING is true. Change this so that it is always imported, because the code in this guide uses pa at runtime.
Initialization
Now, it's time to update the plugin's __init__ method and set up some basic variables that we reference later in the code.
def __init__(self, provider: AMPProviderV2):
    self.name = "Weather Distance"
    self.provider = provider
    self.set_output = False
    self.forecast_endpoint = "http://api.weatherapi.com/v1/forecast.json"
    self.weather_key = "[Your WeatherAPI key here]"
    self.distance_endpoint = "https://maps.googleapis.com/maps/api/distancematrix/json"
    self.origin = "San Francisco"
    self.units = "imperial"
    self.distance_key = "[Your Google Maps key here]"
    self.provider.io.info(f"{self.name} tool started")
ℹ️ Protect your API keys! They are included here for ease of use and the assumption is that nobody will look at your version of the code. If you need to check your code into version control, make sure to remove your keys!
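One way to keep keys out of the source file is to read them from environment variables. The sketch below is not part of the guide's final code: the helper name load_api_keys and the variable names WEATHER_API_KEY and GOOGLE_MAPS_API_KEY are hypothetical, and it assumes the process that runs the plugin can see those variables.

```python
import os


def load_api_keys(env=os.environ):
    """Return (weather_key, distance_key) read from environment variables.

    WEATHER_API_KEY and GOOGLE_MAPS_API_KEY are hypothetical names chosen
    for this sketch; use whatever names suit your environment.
    """
    weather_key = env.get("WEATHER_API_KEY")
    distance_key = env.get("GOOGLE_MAPS_API_KEY")
    # Collect the names of any variables that are unset or empty.
    missing = [
        name
        for name, value in (
            ("WEATHER_API_KEY", weather_key),
            ("GOOGLE_MAPS_API_KEY", distance_key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return weather_key, distance_key
```

Under those assumptions, __init__ could assign self.weather_key and self.distance_key from the returned tuple instead of hard-coding the strings.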
Weather API Function
To keep our code clean, we will write a function called _get_weather that takes a destination city and returns a dictionary which contains the parts of the weather forecast we are interested in.
We first use the requests library to make a GET request to the WeatherAPI REST endpoint, passing our destination, the number of days we're interested in (1), and our key as parameters.
With the results, we will use the jsonpath_rw_ext library to get the values we want out of the response with just a couple of lines of code.
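To see roughly what the GET request looks like on the wire, the parameters can be encoded by hand. This sketch uses the standard library's urlencode rather than requests itself, so treat the result as an approximation of the URL that requests builds; the key is a placeholder and the city is an arbitrary example.

```python
from urllib.parse import urlencode

# Same endpoint as in the guide; the key below is a placeholder, not a real key.
forecast_endpoint = "http://api.weatherapi.com/v1/forecast.json"
params = {"q": "San Jose", "days": 1, "key": "YOUR_KEY"}

# Encode the parameters as a query string and append them to the endpoint.
url = forecast_endpoint + "?" + urlencode(params)
print(url)
# http://api.weatherapi.com/v1/forecast.json?q=San+Jose&days=1&key=YOUR_KEY
```

Passing the dictionary to requests avoids doing this encoding manually, which matters for city names containing spaces or other special characters.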
A very truncated example response might look like this:
{
    "location": {
        "name": "London",
        "region": "City of London, Greater London",
        "country": "United Kingdom"
    },
    "current": {
        "condition": {
            "text": "Partly cloudy",
            "icon": "//cdn.weatherapi.com/weather/64x64/day/116.png",
            "code": 1003
        }
    },
    "forecast": {
        "forecastday": [
            {
                "date": "2023-05-02",
                "date_epoch": 1682985600,
                "day": {
                    "mintemp_f": 52,
                    "maxtemp_f": 75,
                    "totalprecip_in": 0.33,
                    "maxwind_mph": 15,
                    "daily_will_it_rain": 1,
                    "daily_chance_of_rain": 74
                }
            }
        ]
    }
}
You can see the entire response in forecast.json.
Notice how the values we want, daily_chance_of_rain, maxwind_mph, etc., are several levels deep in the JSON response. In Python, you can access a value such as daily_chance_of_rain with:
return json["forecast"]["forecastday"][0]["day"]["daily_chance_of_rain"]
However, this is far from robust. If the response lacks the expected keys because, for example, the user misspells a city name, the lookup fails and the plugin throws an error.
We could also check for None at each key or index until we reach the key we're interested in, but this quickly becomes quite verbose.
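To make that trade-off concrete, here is a sketch of a hand-rolled defensive lookup in plain Python. The helper name get_nested is hypothetical; it is not part of the SDK or of any package used in this guide.

```python
def get_nested(data, path, default=None):
    """Walk `path` (a sequence of dict keys and list indexes) through `data`.

    Returns `default` as soon as a key or index is missing.
    """
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


response = {"forecast": {"forecastday": [{"day": {"daily_chance_of_rain": 74}}]}}
path = ["forecast", "forecastday", 0, "day", "daily_chance_of_rain"]

print(get_nested(response, path))  # 74
print(get_nested({}, path))        # None
```

This works, but it is one more helper to write and maintain for every project that reads nested JSON.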
A better solution is to use the jsonpath_rw_ext library, as shown in the code below.
The function does the following:
- Initializes the dictionary it will return.
- Makes the GET request using the requests library.
- Checks that the request's return code was successful.
- Parses the JSON returned from the GET request into a dictionary.
- For each dayKey, gets the desired value from the response body and puts it in the return dictionary.
def _get_weather(self, destination, dayKeys: List[str]) -> dict:
    # Default every requested key to None so callers can always index the result.
    ret = {key: None for key in dayKeys}
    params = {
        "q": destination,
        "days": 1,
        "key": self.weather_key
    }
    r = requests.get(self.forecast_endpoint, params=params)
    if r.status_code != 200:
        self.provider.io.warn("_get_weather(%s) received error response %d" % (destination, r.status_code))
        return ret
    json = r.json()
    for key in dayKeys:
        # Look up each value by its path inside the first forecast day.
        match = jp_ext.match('forecast.forecastday[0].day.%s' % key, json)
        if match and len(match) == 1:
            ret[key] = match[0]
    return ret
Distance Matrix API Function
The _get_distance function follows the same steps as the _get_weather function. The main differences are that it returns a single float, which defaults to -1.0 when no distance is found, and that it makes its request to Google Maps.
Also note that the distance values returned by this API are strings such as "8 mi" or "12.5 mi". So we use a regular expression to extract only the numeric value and convert it from a string to a float.
The code looks like this:
def _get_distance(self, destinationCity) -> float:
    # -1.0 signals that no distance could be determined.
    ret = -1.0
    params = {
        "destinations": destinationCity,
        "origins": self.origin,
        "units": self.units,
        "key": self.distance_key
    }
    r = requests.get(self.distance_endpoint, params=params)
    if r.status_code != 200:
        self.provider.io.error("_get_distance received error response " + str(r.status_code))
        return ret
    json = r.json()
    match = jp_ext.match('$.rows[0].elements.[0].distance.text', json)
    if not match:
        self.provider.io.info("no match")
        return ret
    if len(match) == 1:
        self.provider.io.info("match %s" % match[0])
        # Distances arrive as text such as "12.5 mi"; keep only the number.
        re_match = re.search(r'(?P<dist>[\d.,]+) mi', match[0])
        if re_match:
            # Strip any thousands separators before converting to float.
            ret = float(re_match.group("dist").replace(",", ""))
            self.provider.io.info("Got %f from %s" % (ret, match[0]))
    return ret
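The text-to-number step can be tried on its own, without calling the API. The sketch below is a standalone variant of that step; the function name parse_miles is hypothetical, and the handling of thousands separators is an assumption about how longer distances may be formatted.

```python
import re


def parse_miles(text):
    """Extract the numeric part of a string such as "12.5 mi" as a float.

    Returns None when the text does not look like a distance in miles.
    """
    match = re.search(r"(?P<dist>[\d.,]+) mi", text)
    if not match:
        return None
    try:
        # Remove thousands separators (assumed to be commas) before converting.
        return float(match.group("dist").replace(",", ""))
    except ValueError:
        return None


print(parse_miles("8 mi"))      # 8.0
print(parse_miles("12.5 mi"))   # 12.5
print(parse_miles("1,234 mi"))  # 1234.0
print(parse_miles("500 ft"))    # None
```

Returning None for unrecognized text lets the caller decide which fallback value to use, such as the -1.0 used in this guide.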
Data Processing
Now it's time to get to the core of the plugin. We accept input that has a City column and use each city as a destination. We then call _get_weather and _get_distance for each city.
We store an array of records, one per city, each consisting of:
- Destination
- Chance of Rain
- Precipitation in Inches
- Minimum Temperature
- Maximum Temperature
- Maximum Wind Speed
- Distance from the Origin
First, we initialize the array in which we will store the records, then:
- Convert the table to batches and iterate over each batch.
- Convert each batch to a dictionary.
- Get the city from each row, assuming it exists.
- Get the weather details for each destination.
- Get the distance from the origin for each destination.
def on_record_batch(self, table: "pa.Table", anchor: Anchor) -> None:
    destinations = []
    dayKeys = ["daily_chance_of_rain", "totalprecip_in", "mintemp_f", "maxtemp_f", "maxwind_mph"]
    for batch in table.to_batches():
        d = batch.to_pydict()
        if d.get('City') is None:
            self.provider.io.error("No column named City in batch")
            return
        for dest in d['City']:
            # Skip rows that have no city value.
            if dest is None:
                continue
            weather = self._get_weather(dest, dayKeys)
            distance = self._get_distance(dest)
            # Replace missing weather values with the -1 sentinel.
            values = [weather.get(key) for key in dayKeys]
            destinations.append(
                [dest]
                + [v if v is not None else -1 for v in values]
                + [distance])
ℹ️ Note that the function definition in the default plugin calls the table a batch. We have renamed it to table here for clarity when calling to_batches explicitly.
At this point, we have all the data we need to publish the records. To do so, we:
- Create the schema for the data we will output.
- Create an array of arrays to hold our data.
- Populate the array.
- Convert the array to a pyarrow RecordBatch.
- Send the record batch to the output anchor.
    # Continues inside on_record_batch, after the loop over batches.
    schema = pa.schema([
        pa.field("City", pa.string()),
        pa.field("ChanceOfRain", pa.int64()),
        pa.field("PrecipitationInches", pa.float64()),
        pa.field("MinTemp", pa.float64()),
        pa.field("MaxTemp", pa.float64()),
        pa.field("MaxWindMph", pa.float64()),
        pa.field("DistanceMiles", pa.float64())
    ])
    arrays = [[] for _ in schema]
    # Map each Arrow type to the Python type used to cast its values.
    cst = {
        pa.string(): str,
        pa.int64(): int,
        pa.float64(): float,
    }
    for dest in destinations:
        # Cast each value to its column type and append it to that column.
        for i, field in enumerate(schema):
            arrays[i].append(cst[field.type](dest[i]))
    # Build typed Arrow arrays explicitly before assembling the record batch.
    columns = [pa.array(col, type=field.type) for col, field in zip(arrays, schema)]
    batch = pa.RecordBatch.from_arrays(columns, schema=schema)
    self.provider.write_to_anchor("Output", batch)
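Once records like these reach the output anchor, they can inform the decision the guide set out to make: where to hold the promotion. The sketch below shows one possible ranking in plain Python. The rows are made-up sample values that reuse column names from the schema, the function name rank_destinations is hypothetical, and ordering by chance of rain and then by distance is an assumption, not a rule from the guide.

```python
# Made-up sample rows using column names from the plugin's output schema.
rows = [
    {"City": "Oakland", "ChanceOfRain": 74, "DistanceMiles": 12.5},
    {"City": "San Jose", "ChanceOfRain": 10, "DistanceMiles": 48.0},
    {"City": "Sacramento", "ChanceOfRain": 10, "DistanceMiles": 87.9},
    {"City": "Atlantis", "ChanceOfRain": -1, "DistanceMiles": -1.0},
]


def rank_destinations(rows):
    """Drop rows holding the -1 sentinel, then sort by rain chance and distance."""
    valid = [r for r in rows if r["ChanceOfRain"] >= 0 and r["DistanceMiles"] >= 0]
    return sorted(valid, key=lambda r: (r["ChanceOfRain"], r["DistanceMiles"]))


for row in rank_destinations(rows):
    print(row["City"], row["ChanceOfRain"], row["DistanceMiles"])
```

In Designer, the same ordering could be achieved with standard tools downstream of the plugin, such as a filter followed by a sort.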
Putting It All Together
The final code is available here.
Packaging into a YXI
Now that the code is ready, we can package it into a portable YXI archive via the ayx_plugin_cli create-yxi command. The process looks like this:
~/MyWorkspace$ ayx_plugin_cli create-yxi
[Creating YXI] started
[Creating YXI] -- generate_config_files:generate_config_xml
[Creating YXI] -- generate_config_files:generate_tool_config_xml
[Creating YXI] . generate_config_files:generate_manifest_jsons
[Creating YXI] Generating manifest.json for WeatherDistance...
[Creating YXI] Done!
...omitted...
~\MyWorkspace\main.pyz -e ayx_python_sdk.providers.amp_provider.__main__:main
[Creating YXI] Created shiv artifact at: ~\MyWorkspace\main.pyz
[Creating YXI] . create_yxi:create_yxi
[Creating YXI] finished
Install in Designer
In this section, we review the 2 ways to install the plugin into Designer.
Method 1
After you create a .yxi, you can double-click the .yxi to install it in Designer. This opens Designer and prompts you to install the package in a new dialog box. It looks something like this:
Once it installs, you can find the plugin under the Python SDK Examples tool category. [1]
Method 2
You can also create the .yxi and install it all at once via the ayx_plugin_cli designer-install command. Choose the install option that matches your Designer install. Typically, this is the user install option.
~/MyWorkspace$ ayx_plugin_cli designer-install
Install Type (user, admin) [user]: user
[Creating YXI] started
[Creating YXI] -- generate_config_files:generate_config_xml
[Creating YXI] -- generate_config_files:generate_tool_config_xml
[Creating YXI] . generate_config_files:generate_manifest_jsons
[Creating YXI] Generating manifest.json for WeatherDistance...
[Creating YXI] Done!
...omitted...
[Creating YXI] finished
[Installing yxi ~\MyWorkspace\build\yxi\WeatherDistance.yxi into designer] started
[Installing yxi ~\MyWorkspace\build\yxi\WeatherDistance.yxi into designer] . install_yxi
[Installing yxi ~\MyWorkspace\build\yxi\WeatherDistance.yxi into designer] finished
If this is your first time installing these tools, or you have made modifications to your ayx_workspace.json file, please restart Designer for these changes to take effect.
Once the command finishes, you can open Designer and find your tool under the Python SDK Examples tool category. [1]
Run in Designer
After the plugin is installed, you can find it in the Python SDK Examples tool category of Designer. Drag it onto the canvas.
Then, drag a Text Input tool onto the canvas and use the Configuration pane to input some data. Name the first column City and enter a few cities as shown in the image. Make sure you connect its output anchor to the plugin's input anchor.
Next, run the workflow. The output window should be in a format similar to that shown below.
[1] ⚠️ If you created the plugin workspace with a non-default Tool Category (from the Create a Workspace section), then your plugin will appear in the tool category that corresponds to the input that you passed to Tool Category.