Overview of Use Case
The Signal Delivery Group at SiriusXM, an American broadcasting company, is analyzing larger data sets more often, and those data sets now consist mainly of spatial data. One of the largest problems to solve at SiriusXM was finding where radio connection problem spots are, in order to improve the average listener’s experience. The group used Alteryx to work out where connection outages occur and why. The Signal Delivery Group now uses Alteryx as a single tool to handle large and diverse data sets, such as spatial and map data, to learn where and why the radio connection is being lost.
Describe the business challenge or problem you needed to solve
Stoppages in radio signal are the main problem that SiriusXM is trying to overcome, particularly those caused by physical obstructions and by wireless interference from other users of the same frequency. This problem negatively affects the experience of SiriusXM listeners. The analytical goal was to figure out exactly what is causing blockages for SiriusXM listeners, and for all consumers of radio, and how to prevent those signal blocks or failures. The objective is to look at three main variables: signal power, signal quality, and any interference power that we might see.
We used to write custom tools, in either C# or Python, to assist, but that takes a long time, it's not flexible at all, and there's no support. Either you write it yourself or you hope the program you're using was written by someone who is still at the company. There's no error handling. Windows will throw some error if something happens, but not a detailed explanation like Alteryx does.
We would visualize in Google Earth. Not too long ago, I would literally draw that map that we just saw because it was just some basic filtering, it was something easy to do in C#, and I would manually draw polygons around those things that looked like clusters. You can imagine that it's tedious, mind-numbing and nobody wants to do that. And nobody wants to make manual reports in PowerPoint.
Describe your working solution
We have recently been ingesting a large amount of spatial data. We acquire it by following the cell phone company model: driving test cars around, as Verizon or AT&T would. We built kits containing multiple generations of radios and drove them around specific locations. In our case we wanted to take smaller regions on a map and find out what's happening in those areas. We recorded all types of variables; each data point was time-stamped and assigned the GPS location where it was recorded. One of the first things we did was normalize the data by scaling things to zero, since it’s such a large data set. We then used the Make Grid tool and set the Poly-Build tool to roughly 15 meters in size. We put all of the points of interest into the grid, and this is how we encapsulate the different location points within an area.
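As a rough illustration of the gridding step, the Python sketch below assigns a GPS point to a 15-meter grid cell. It is not the Alteryx workflow itself: the function name `grid_cell`, the choice of an origin point, and the flat-earth (equirectangular) approximation are all assumptions made for the example.

```python
import math

CELL_SIZE_M = 15.0            # grid cell edge length, taken from the text
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def grid_cell(lat, lon, origin_lat, origin_lon, cell_size_m=CELL_SIZE_M):
    """Return the (row, col) of the grid cell that contains a GPS point.

    Distances are measured from an arbitrary origin using an
    equirectangular approximation. That is reasonable over a city-sized
    area, but it is an assumption of this sketch, not the Alteryx method.
    """
    north_m = math.radians(lat - origin_lat) * EARTH_RADIUS_M
    east_m = (
        math.radians(lon - origin_lon)
        * EARTH_RADIUS_M
        * math.cos(math.radians(origin_lat))
    )
    return (math.floor(north_m / cell_size_m), math.floor(east_m / cell_size_m))
```

Every recorded sample that maps to the same (row, col) pair falls in the same cell, which is what lets the later steps summarize the drive data per cell.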
The Grid tool lets us find the spatial patterns in the completed kit-car drives (that is, where the communication breakdown is). We average everything: we group by grid name, keep the spatial object, and average the three variables that we’re measuring:
- signal power
- signal quality
- interference power
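The group-and-average step can be sketched in plain Python as follows. This is only an illustration under assumed names: the dictionary keys (`cell`, `signal_power`, `signal_quality`, `interference_power`) and the function `average_by_cell` are hypothetical and do not come from the original workflow.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical field names for the three measured variables.
FIELDS = ("signal_power", "signal_quality", "interference_power")


def average_by_cell(samples):
    """Group samples by grid cell and average each measured variable.

    `samples` is an iterable of dicts with a 'cell' key plus the three
    keys listed in FIELDS.
    """
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample["cell"]].append(sample)
    return {
        cell: {field: mean(row[field] for row in rows) for field in FIELDS}
        for cell, rows in grouped.items()
    }
```

The result holds one record per grid cell, so the clustering that follows works on cells rather than on individual time-stamped samples.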
We then used the K-Centroids Diagnostics tool to produce two charts:
- Adjusted Rand indices
- Calinski-Harabasz indices
These show us why certain points are of interest, based on whether their signal power, signal quality, or interference was higher than another's; we view them as a vertical statistical distribution.
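For readers unfamiliar with the second diagnostic, here is a minimal pure-Python sketch of the Calinski-Harabasz index, computed from its standard definition (between-cluster dispersion over within-cluster dispersion, each divided by its degrees of freedom). The function name `calinski_harabasz` is our own, and this is not a description of how Alteryx computes the chart.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for a labelled set of points.

    `points` is a list of equal-length numeric tuples; `labels` gives
    the cluster of each point. Higher values mean tighter, better
    separated clusters. Raises ZeroDivisionError if every point sits
    exactly on its cluster centroid.
    """
    n = len(points)
    dims = len(points[0])

    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def centroid(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0  # dispersion of cluster centroids around the overall centroid
    within = 0.0   # dispersion of points around their own cluster centroid
    for rows in clusters.values():
        c = centroid(rows)
        between += len(rows) * sq_dist(c, overall)
        within += sum(sq_dist(r, c) for r in rows)

    return (between / (k - 1)) / (within / (n - k))
```

In our setting each point would be one grid cell's averaged signal power, signal quality, and interference power.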
After reviewing these cluster rankings based on our three measurement factors, we looped everything into one large polygon by tying the data together. The tool then shows all the small pockets (274 areas) of radio disruption in Nashville.
We then wanted to let each grid vote for itself on which cluster it thinks the overall shape should belong to. The best way to do that is to use the Spatial Match tool again and match our grids back into the individual bad regions that we just made. We split them up because we want to know how many cluster ones, how many cluster twos, and how many cluster threes fall in each of those bad regions.
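The voting idea can be sketched like this, assuming the spatial match has already produced one (region, cluster) pair per grid cell. The names `cluster_votes` and `winning_cluster` are hypothetical, and the tie-breaking behavior is a property of this sketch rather than of the Alteryx workflow.

```python
from collections import Counter, defaultdict


def cluster_votes(matches):
    """Count how many grid cells of each cluster fall in each bad region.

    `matches` is an iterable of (region_id, cluster_label) pairs, one
    per grid cell that matched a region.
    """
    votes = defaultdict(Counter)
    for region_id, cluster in matches:
        votes[region_id][cluster] += 1
    return votes


def winning_cluster(counter):
    """Return the cluster with the most votes.

    On a tie, the cluster that was counted first wins.
    """
    return counter.most_common(1)[0][0]
```

For example, a region matched by two cluster-one cells and one cluster-three cell would be labelled cluster one.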
We now have our bad regions, the polygon, and how many failures occurred in each of them, so we can do some data cleansing. We can see how big each of these spatial bad regions is and which clusters are inside it.
A lot of these clusters were not very big. There are a lot of small one-off events, and that's not helpful when trying to do an analysis like this. We decided to keep everything that contained four or more failures. By doing so we went from 1,053 points to 22 regions that we really should look at. We were able to narrow our data to a very specific and extremely granular level. We started this in a basic spreadsheet, basically a CSV file. We were able to turn those rows into points, manipulate them to encapsulate our points of interest, and use these clustering tools to tell us why they are interesting.
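The cleansing rule described above amounts to a simple threshold filter. A minimal sketch, assuming a hypothetical mapping from region ID to failure count:

```python
MIN_FAILURES = 4  # threshold from the text: keep regions with four or more failures


def significant_regions(failure_counts, min_failures=MIN_FAILURES):
    """Drop one-off events by keeping only regions at or above the threshold.

    `failure_counts` maps a region ID to the number of failures
    recorded inside that region.
    """
    return {
        region: count
        for region, count in failure_counts.items()
        if count >= min_failures
    }
```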
Describe the benefits you have achieved
- With Alteryx spatial tools, the coordinate data is not just latitude and longitude anymore. It's an actual, physical location. We can manipulate those locations. We can filter based on regions and on what falls in more than one of the individual clusters we map (if there’s overlay). So the return on investment has really been in our time: we can essentially give ourselves a promotion and dig deeper into this data. It's been pretty unbelievable; we made that model on 1,700 miles of driving in one city. You can imagine how much more accurate we can get with more variables. Trust me, there are tons of variables. And if we do that over multiple cities, we can really build up a very robust model. We've been doing this to find lots of things. We found very strange and interesting things. You wouldn't believe what people build on frequencies that aren't theirs. We find them so that they don't knock your radio off.
- In terms of the future, real-time analytics is something SiriusXM is very interested in. At the moment, the way SiriusXM works, especially through satellite, is one-way communication. Your car can't talk back to us. We would love it if it could. That would be a lot of information we'd love to have. There are next-generation concepts coming that can hopefully provide us with a couple of interesting data points over a cellular network. We do have deals with some of the car companies that will report data back to us.
- With Alteryx we can run our saved model over and over again. Now, every Monday morning, I come into the office and I have an email from a script I ran on the server that tells me, "The last seven days, we drove in this city, this city, this city. Here's how many spots I found that you might want to take a look at. Here's why I think you might want to take a look at them." I basically have an assistant now.
The entire PowerPoint presentation can be found here