Data analysis doesn’t just change motorsport racing; it defines the winners. Learn how Ken Black and his team measure and analyze the absolutely massive amount of data that racecars put out on the NASCAR circuit. Ken is tasked with coming up with new ways of looking at data so GM drivers can gain insights on racing and win.
Ep 158 NASCAR analytics - Ken Black
[00:00:00] Megan Dibble: Welcome to Alter Everything, a podcast about data science and analytics culture. I'm Megan Dibble, and today I'm talking with Ken Black, a Performance Motorsports data scientist at GM. In this episode, we chat about the challenges of data volume, veracity, and velocity in motorsports, how he's using Alteryx, what's surprised him along the way, and more.
Let's get started.
Ken, it's great to have you on our podcast today. Could you give a little introduction to yourself for our listeners?
[00:00:35] Ken Black: Sure. My name's Ken Black. I work for General Motors, GM Performance Motorsports, out of Concord, North Carolina. I've been a GM employee for nine years. The first eight years I worked in advanced analytics doing analytics across the enterprise, but last year I got the opportunity to join the Motorsports Division, which is a fairly new division, although GM's been in motorsports for many years.
There's an increased focus on GM supporting various forms of motorsport. So they built a big new facility. And for the past year, I've been working remotely to that facility with periodic trips to meet the teams that I work with.
[00:01:15] Megan Dibble: Awesome. Yeah, it's fun to have you back on our podcast. You joined us last year for a cool episode around climate change data.
That one was pretty popular, so it's always good to have people back to talk about new analysis that they're doing. So I'd love to hear a little bit more about the data that you're working with in your current role at GM.
[00:01:35] Ken Black: Sure. When you're working in motorsports, it's basically a data-driven world.
Everything is determined by data. When I say everything, I mean the outcome of races. Who gets to the finish line first? Who performs the best in terms of restarts and races? Everything is measured and analyzed through data. So motorsports is a high intensity sport with massive volumes of data being collected during races.
What I've been doing is working with various forms of data coming through the NASCAR circuit. So this data includes things like data collected during the races. Each car has telemetry and each car will send data out in these packets, five data points per second, or five hertz. So in a typical race, you might have two or three million data points being created by 38 cars ripping around the track.
This data can be anything from how the drivers are controlling the car with throttle or braking or steering, what their speeds are, what their location is. So it's a deep data set coming at us at very high speed. And of course, with five data points per second, that means you have to be accurate in your calculations at the sub-second level.
So it's very challenging to handle the amount of data and then to interpret it. And part of the challenge that I encountered when I first came to work with the team was that the data is asynchronous. So what I mean by that is each race car produces its own data set. They're putting out data wherever they happen to be, at whatever signal time they're putting out data.
And so you have asynchronous data in terms of time, but it's also locationally variable. So cars are in different positions on the track. So if you're gonna begin to do the sort of things they were asking me to do as a developer, to write code, you have to put this data into a framework where you could do comparisons.
So I came in and I was able to quickly ingest the data, figure out really how to use it, figure out how to structure it, how to compute with it, and then how to put it into a structure where we could do comparisons. So part of the challenge here for us is to be able to help drivers become better drivers.
What can we teach them about their driving styles and the way that they handle their race cars in the context of a race or a practice session or qualifying sessions? For example, what I'll finish up with is NASCAR has three different series. They have the Cup Series, which is sort of the major leagues of NASCAR, and then they have the Xfinity, which is one step below. They're running race cars.
This is for the younger drivers or people who are building their careers. Then there's a Truck Series too, so they're driving pickup trucks at high speed around the track. So each week there are up to three events: Truck, Cup, Xfinity. Process all that data as it comes around, be able to do fast reporting on it.
That's really the challenge, and that's why I was brought into the team.
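To make the asynchronous-data problem concrete: each car reports at its own instants, so two cars rarely have samples at the same time. One common way to compare them is to interpolate every car onto a shared time grid. The sketch below is only an illustration in plain Python, not the Alteryx workflow described in the conversation; the function name `resample`, the sample values, and the 0.1-second grid are all hypothetical. As a rough check on the volume figure, and assuming a race of about three hours (an assumption, not stated in the episode), 38 cars × 5 samples per second × 10,800 seconds comes to about 2.05 million samples.

```python
from bisect import bisect_left


def resample(samples, grid):
    """Linearly interpolate (time_s, value) samples onto shared grid times.

    `samples` must be sorted by time. Grid times outside the sampled
    range yield None instead of an extrapolated guess.
    """
    times = [t for t, _ in samples]
    out = []
    for g in grid:
        if not times or g < times[0] or g > times[-1]:
            out.append(None)
            continue
        i = bisect_left(times, g)
        if times[i] == g:  # exact hit on a reported sample
            out.append(samples[i][1])
            continue
        (t0, v0), (t1, v1) = samples[i - 1], samples[i]
        out.append(v0 + (v1 - v0) * (g - t0) / (t1 - t0))
    return out


# Hypothetical speed readings (mph): both cars report at 5 Hz,
# but their clocks are offset by 0.07 s.
car_a = [(0.00, 100.0), (0.20, 102.0), (0.40, 104.0)]
car_b = [(0.07, 90.0), (0.27, 94.0), (0.47, 98.0)]

grid = [0.1, 0.2, 0.3, 0.4]  # shared comparison times in seconds
a_on_grid = resample(car_a, grid)
b_on_grid = resample(car_b, grid)

# Speed difference at each shared instant (None where either car lacks data).
deltas = [
    None if a is None or b is None else a - b
    for a, b in zip(a_on_grid, b_on_grid)
]
```

Once both cars sit on the same grid, a driver-versus-driver comparison reduces to element-wise arithmetic. The same idea works with distance around the track as the shared axis instead of time, which addresses the "locationally variable" side of the problem.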
[00:04:32] Megan Dibble: It's interesting. I didn't even know about the truck series. That sounds really interesting. So not only is there a massive amount of data, but then there's different types of cars, and you're probably pulling in data sets where, depending on the different cars that you're looking at, there's new variables.
[00:04:48] Ken Black: Well, what I was gonna say is, you're exactly right, Megan. The data that comes through the race channel, what we call SMT data, that's only one part of it. Then there's other data sources that we have to bring in to be able to tell the complete picture of what happened. That data might be accessible through APIs, through NASCAR APIs, or other sources.
So the fact that I have Alteryx to do this, you know, I use Alteryx to process this data, is very advantageous because it allows me to pull the disparate data together to tell a story. When I need to get new pieces of information, then I'm able to. So that's been one advantage.
[00:05:25] Megan Dibble: I'm curious to know, like if throughout analyzing this kind of data, what has surprised you or have you had any findings that like you weren't expecting with this large amount of racing data?
[00:05:37] Ken Black: Yeah. Okay. So here's a good story. I guess this is gonna date me a little bit, but when I was a kid growing up in Chicago in, say, the 1970s, I used to watch NASCAR races on a Saturday morning. Back then, there were only a few channels and there was a NASCAR race on, and I found out that I liked it and I watched it for a few years, and I got to know the drivers.
I got to know the car numbers. I learned about the sport back then, and back then I looked at it as though it was like the demolition derby. It was: Who's gonna get into a fist fight this time? Who's gonna be blowing their engine? Who's gonna drop their transmissions? Who's gonna burn their brakes out?
These are the kinds of things that were happening back then. And it was fun because it was just pure chaos. So fast forward all these years and I end up working for GM Performance Motorsports, where we have the Chevrolet team. We have a number of cars in the races, and now I look at it and it doesn't look like demolition derby anymore, although there's crashes.
The dropped engines and blown transmissions and all these things pretty much don't happen very much anymore. They still happen, but I actually did the analytics on it. I went back into historical data and took a look at how the engineering has improved for the cars. And so I learned a great story there, from when I was a kid thinking demolition derby to, well, watch it now. And here's the biggest surprise.
Okay? Here's the answer to your question. How is it that the sport has evolved to where these cars function literally almost perfectly? And these drivers are world-class athletes, and they drive a 400 mile race at 180 miles per hour, and yet they finish, as happened two weeks ago, three cars wide, a photo finish separated by 0.003 seconds at the finish line.
How does that happen?
[00:07:20] Megan Dibble: It's insane.
[00:07:21] Ken Black: The parity is amazing. And what's happened, I think, you know, as I look back at it with this retrospective view that I have, being in the right place at the right time for understanding the sport over time, I think that NASCAR has created a set of rules over time and the engineering has gotten better and the manufacturing has gotten better.
The cars are just solid and the performance engineers really do their job great, and the pit crews are incredible. All the people that are focused on this sport are top notch. Everybody's on an equal playing ground. NASCAR gives rules that put everybody on the same playing ground. What happens is it's execution.
It comes down to which pit crews are best, which driver is just totally locked in. Who doesn't make mistakes, and that's why these races come down to photo finishes and very narrow gaps at the end. There's no big blowout, somebody winning by 15 laps anymore. It doesn't happen. The sport has evolved to be one of excitement and that's what surprised me more than anything.
I couldn't believe it when I started looking at the data and saying, whoa, this is not the sport I remember.
[00:08:26] Megan Dibble: That's super interesting and I think it also makes sense that you are doing these kinds of analysis that like any data analysis you can do to help the drivers perform better can give that little extra edge, you know, when it's coming into such close finishes.
If there's any sort of analysis that can be done that can help out a driver, then now is the time that's gonna really matter the most.
[00:08:47] Ken Black: That's exactly right. And what's interesting coming into this role for me is that I'm a data prep and blending and computational specialist, as well as a visualization specialist.
So I'm bringing in both backend and frontend capabilities to analyze races from beginning to end, including the development of innovative graphics. But motorsports, it's an entrenched community where the race engineers, the team captains, all these people who ingest the information, they want the data a certain way.
So although I may see data being presented in a different way for better intuitive comprehension, they may not wanna see it that way. They may want to see it the way they've always seen it. So part of my challenge in helping those drivers is to be able to come up with new ways of looking at the data that could give them that little bit of insight.
And what's interesting is when I create these graphics and I show 'em to the performance engineers, right away they'll say, oh look, this driver did this going into this curve where the other one didn't. They understand the dynamics of the car way better than I do. They can see in the data the way the drivers are driving differently, and then they can relay that information to the drivers.
Now, you know, that's in a perfect world. It's difficult because everybody's going week to week and everyone's going full speed. So that's the challenge: coming up with ways to have these analytics help the drivers perform better.
[00:10:06] Megan Dibble: That's super interesting. You've touched on this a little bit already, but how has Alteryx been well suited to help with this analysis that you're doing?
[00:10:15] Ken Black: My first challenge was to build this big computational tool to be able to compare drivers. We call it driver versus driver comparison, and all the intellectual knowledge behind that came outta my brain and then formed a basis for a full analysis of a race. What's happened then is when I work with the team members, when I go to the office and I say to the team members, what can I do to use Alteryx to automate your job, to make your job easier, to give you more time to do the things you need to do in terms of analysis and gaining insight?
They tell me, you know, this is what takes so much time. This is what I do week after week. I'm grinding it out. What's happened was that this platform that I built was able to be extended to do some other things that are critical evaluations of a race in two or three different ways. Of course, I can't go into details, but basically there's these offshoot codes that I've been able to develop in the off season that are now being used in the second racing season that I've been on the team to alleviate workload for others, let's put it that way.
I'm kicking out these results after a race is over. What it is, is I run nine individual Alteryx workflows that each do a different thing, including things like going and getting API data and processing race results and pulling it all together and then doing computational science work on it.
So this kind of infrastructure that I built is growing in its capability, which was beyond the original scope of what we had planned. So that was a nice finding from this, and so we'll see where it goes in the future as we learn more about its capabilities.
[00:11:56] Megan Dibble: Definitely. And along with that, I mean, you mentioned earlier about the subsecond data and performance, and when we chatted about this episode, you told me that you were involved with the recent Alteryx Designer update where we could do calculations on, was it milliseconds?
[00:12:16] Ken Black: Yeah. Yeah, it is.
[00:12:17] Megan Dibble: I can't remember the precision.
[00:12:19] Ken Black: Yeah, you could go all the way down to below femtoseconds, to 10 to the minus 18th or something. I mean, the DateTime functions in Alteryx now are very robust to be able to use it in scientific endeavors.
Things like high-energy particle physics, whatever. But for me, all I had to do is get below one second accuracy. And so I was able to do that, and then through my testing of it, I was able to discover how that new functionality interacted with other tools, and then gave the feedback to the development team at Alteryx so that they could fix those issues that hadn't been discovered yet.
So it was a great thing for me that Alteryx came out with subsecond accuracy.
[00:12:59] Megan Dibble: Yeah.
[00:13:00] Ken Black: I needed it. And it came out during the time that I was developing the code last year, so that was perfect timing. I think it had been in the hopper for like eight years or something. It was not a trivial development task.
[00:13:10] Megan Dibble: Yeah.
[00:13:11] Ken Black: It was very challenging for the development team. But they did it, though.
[00:13:16] Megan Dibble: They did it. And great timing for you and for GM. Yeah. That's really cool.
[00:13:20] Ken Black: Exactly.
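Why sub-second precision matters is easy to show with the 0.003-second photo finish mentioned earlier. The sketch below uses Python's standard `datetime` module, which resolves to microseconds; it is not Alteryx's DateTime functions, and the timestamps and the helper name `gap_seconds` are hypothetical.

```python
from datetime import datetime

# %f parses one to six fractional-second digits (down to microseconds).
FMT = "%Y-%m-%d %H:%M:%S.%f"


def gap_seconds(first, second):
    """Return the time from `first` to `second` in seconds, fractions included."""
    delta = datetime.strptime(second, FMT) - datetime.strptime(first, FMT)
    return delta.total_seconds()


# Hypothetical finish-line crossing times for two cars.
winner = "2024-01-01 15:42:10.118"
runner_up = "2024-01-01 15:42:10.121"

margin = gap_seconds(winner, runner_up)

# Truncating to whole seconds makes the two cars indistinguishable.
t_win = datetime.strptime(winner, FMT).replace(microsecond=0)
t_second = datetime.strptime(runner_up, FMT).replace(microsecond=0)
whole_second_margin = (t_second - t_win).total_seconds()
```

With fractional seconds kept, the margin is three thousandths of a second; with whole-second timestamps it collapses to zero, which is why calculations on 5 Hz telemetry need timestamps finer than one second.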
[00:13:22] Megan Dibble: And so you mentioned too that you've been able to automate some processes, like throughout your organization. What does that look like and what has the benefit of Alteryx been for even adjacent organizations?
[00:13:35] Ken Black: Sure. Most of that work that you're mentioning is the work that I did before coming to Performance Motorsports, where I used Alteryx across the enterprise. And back in 2015, I wrote an article about where I saw my life going, joining GM and with the expectation of what will I be able to do. And I happened to come across this article the other day and I read it, and it's been amazing for me to go back, you know, jump back nine years to an article I wrote, and find out that I was pretty much right.
I was able to use Alteryx across many different avenues of the automotive industry. Everything from financial analysis to travel and expense reporting to, believe it or not, developing real world signage for the Ultra Cruise simulator, to...
[00:14:23] Megan Dibble: Oh, really? Wow.
[00:14:24] Ken Black: Yeah. Just amazing things. Autonomous vehicle simulators.
[00:14:28] Megan Dibble: Cool.
[00:14:29] Ken Black: Geofencing, I mean, all of this stuff that if you were to say to somebody, could one package develop all this, they would say, no, probably not. But through the continuous years of trials and tribulations of just continuing to study and learn how to do these things, what I found is that Alteryx has been able to do everything that I wanted it to do.
And that's really the tale of the story, is that I haven't been thwarted by anything yet. People talk about scale, they talk about can it do this, can it do that? You need this big data platform or whatever. But the truth is I've been able to do whatever I've wanted to do. For example, one project where I was pulling in data from six, seven million vehicles every month, hundreds of millions of trips, billions of trips, and analyzing them.
I never had any trouble with memory or scale in anything I've ever done. And of course, you know, I can't publish any of that stuff 'cause it's all proprietary. But my personal experience is that Alteryx was a cleverly designed program from the beginning, and the AMP engine has helped dramatically in terms of being able to handle the large scale projects.
I have not had anything that I haven't been able to do, including this performance motor sports stuff. It's worked well.
[00:15:47] Megan Dibble: Wow.
[00:15:47] Ken Black: And it runs fast. For example, to analyze one NASCAR race takes, depending upon the length of the race, between six and 10 minutes is all, and it's doing billions of calculations.
So I mean, it's fast.
[00:16:01] Megan Dibble: That's super impressive. It sounds like you have just gotten to the point where you can bend Alteryx to your will for any project, for anything. I'm curious about that. You mentioned signage. What did that look like? How did Alteryx generate signage?
[00:16:14] Ken Black: This was the project that I wrote an article about.
It said how Alteryx took me to the edge of insanity, but Tableau saved me. Okay. This is a project where you take a database of all the signs on road networks for whole states like Massachusetts, California, whatever. And so you've got this database that tells you the type of sign it is, where it's located, and then you have to do all the geospatial work on it to figure out how far from a junction are you, what angle are you with respect to the road, all of these geometric calculations, incredible amounts of detail.
And what we're trying to do is simulate real world conditions in a simulator when you're trying to do what's called localization studies. Okay. Processing all of the signage from three states is what I did. Massive undertaking, all the road networks down to every little tiny primary, secondary roads, all of it, interstates, everything.
Trying to determine what roads are on top of each other. I mean, just incredible complexity. And I was able to do it. It started off as just like, we're gonna do this. We're gonna try this, and then we found out that each week we just kept going and we just kept making more and more advances. And I had some help from a great buddy of mine at Mapbox and that was just tremendous work, and that's probably the pinnacle of difficulty that I've tried with Alteryx, let's say.
It was very, very challenging.
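The two geometric questions named above, how far a sign is from a junction and what angle it makes with the road, can be sketched with standard spherical formulas. This is only an illustration in Python under a spherical-Earth assumption, not the Alteryx spatial tooling used on the project; the function names and any coordinates passed to them are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = (math.cos(p1) * math.sin(p2)
         - math.sin(p1) * math.cos(p2) * math.cos(dlmb))
    return math.degrees(math.atan2(y, x)) % 360


def angle_between_deg(heading_a, heading_b):
    """Smallest angle between two compass headings, in [0, 180]."""
    diff = abs(heading_a - heading_b) % 360
    return min(diff, 360 - diff)
```

For example, `haversine_m(sign_lat, sign_lon, junction_lat, junction_lon)` gives the sign-to-junction distance, and `angle_between_deg(sign_facing, bearing_deg(...))` compares the direction a sign faces with the bearing of the road segment it sits on. Production work at state scale would typically use projected coordinates and a spatial index rather than point-by-point formulas.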
[00:17:37] Megan Dibble: That sounds extremely complex, but I love that Alteryx opens up that question of, well, let's just try this. Let's experiment. You can like experiment and iterate quickly and then build upon that and start to solve some really complex problems. So I think that's really cool.
[00:17:53] Ken Black: Well, you just nailed probably the most important finding that I have. You asked me what surprised me in Performance Motorsports, and going back to that question, here's what it is: quick iteration, quick proof-of-concept development. We have an idea. Are the competitors using a different strategy?
Well, go grab the data from the races and find out. When you have this kind of infrastructure that I have built, where I store all the race data, I can go back quickly and get it, even from years ago. With that process of previous years of results, then I can begin to see the strategic differences of teams that are using a "we're doing this strategy now" versus what we used to do.
It's that pivoting, that quick iteration that makes Alteryx beneficial in this environment.
[00:18:38] Megan Dibble: Totally.
[00:18:39] Ken Black: Yeah.
[00:18:39] Megan Dibble: So then moving back to your main project right now, I'm curious like what success looks like for you guys and how you're helping NASCAR drivers improve?
[00:18:50] Ken Black: Well, everybody likes to win, and right now in Chevy, well, in Performance Motorsports, everybody's on cloud nine because Chevy has won eight of the first nine races.
They won all three Cup races. They won all three Truck races, and they won two of three Xfinity races. Winning eight of nine at the beginning of a season is a big deal because these teams, Ford and Toyota, are competing for what's called the Manufacturer's Cup Championship, and then there's a driver's cup championship.
So what looks like success? Okay, well success is winning and you get judged on it every week. So that's the thing is, you know, you may win this week, but you may get fourth, fifth place the next week and you gotta rebuild and you gotta build that momentum again. I would say that performance motor sports is a momentum sport, no pun intended, but it is a momentum sport.
I've noticed that Ford and Toyota, they have new bodies this year, so their performance engineers are tweaking downforce versus aerodynamics and trying to get the car set directly for the different tracks. So they're gonna be a little bit behind where we are. We have a stable body, but you gotta win when you can win, that's the thing.
And so the last two years, Chevy has won the championship. There's a lot more of the analytics for GM Performance Motorsports than what I'm talking about. There's a whole different level of team support, which we won't talk about here, but trust me, we do real-time race support with the teams and, you know, we have performance engineers that are communicating with team directors at the track.
It's really incredible. You just can't believe the technology and I'm just thrilled to be here. I'm just thrilled to be working with it.
[00:20:27] Megan Dibble: Everything you've talked about just sounds like some of the most interesting analysis. I feel like it makes me wanna work with that kind of data, solve those kind of interesting problems.
What are your hopes for the future of this project or the future of your role in terms of the analysis that you get to do?
[00:20:45] Ken Black: Yeah, that's a good question. This goes back to my earlier commentary on how the race engineers like to see data. Most people in the sport have been around for 10, 20 years. This is their life.
This is their mission. They're on the road 40 weeks a year, away from their families, and they have to make quick decisions. So everything that we do to support them, it's important that what we give them makes their life easier. So it's hard to break through with new technologies to show them, hey, look at the data this way.
So what we're trying to do now is automate some processes that have been very manually labor intensive, and they're really geospatial type things. For example, trying to track cars that are going into pit stalls: which pit stalls are they going into, and how did they do it before? And so it's like using time series data and geospatial data to understand issues with respect to that, because the pit stall that you pick could be very advantageous in a particular race.
So you need to assemble historical data. You need to assemble geospatial data, which could change over time; the pit locations can change. So it's a very big picture of trying to understand the relationships, spatial relationships more than anything. So I'm looking into using Alteryx to do a lot of geospatial work, more so than I have now.
Of course, my code already does a lot of that, but I would say that the frontier for me right now is the geospatial world.
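As a rough picture of the pit-stall problem, one simple approach is to take the position where a car comes to rest on pit road and assign it to the nearest stall center. The sketch below is an assumption-laden illustration in Python, not the method the team uses: the stall coordinates, car names, and the helper `nearest_stall` are hypothetical, and positions are treated as flat x/y meters in a local track frame.

```python
import math


def nearest_stall(stop_xy, stalls):
    """Return (stall_id, distance_m) for the stall center closest to stop_xy."""
    stall_id, center = min(
        stalls.items(), key=lambda kv: math.dist(stop_xy, kv[1])
    )
    return stall_id, math.dist(stop_xy, center)


# Hypothetical stall centers for one race, in local track coordinates (meters).
# Because pit locations can change over time, a table like this would be
# kept per race rather than per track.
stalls = {1: (0.0, 0.0), 2: (10.0, 0.0), 3: (20.0, 0.0)}

# Hypothetical positions where each car's telemetry shows it stopped.
stops = {"car_A": (9.0, 1.0), "car_B": (19.5, -0.5)}

assignments = {car: nearest_stall(xy, stalls)[0] for car, xy in stops.items()}
```

The returned distance is useful as a sanity check: a stop that is far from every stall center probably is not a pit stop at all, or it points to stale stall geometry.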
[00:22:08] Megan Dibble: Super exciting.
[00:22:10] Ken Black: Yep. Never a dull moment.
[00:22:12] Megan Dibble: Yeah, for real. Especially when there's races every week, you know, you're constantly getting more data, more feedback. Sounds fast paced.
[00:22:20] Ken Black: My wife feels like she's losing me because here I am, trying to sneak out my phone to watch an Xfinity race or a truck race to see when's it gonna be over so I could go process the data, sneak up to my office and process on Saturday night or Friday night.
But it's that passion I think that gets infused in you because everybody works so hard. I mean, the whole entire performance motorsports world, from the fans to the employees to the companies, everybody just loves it, and maybe that's the most surprising thing. So I would encourage people to take a look at NASCAR if you haven't for a while and see what it's all about.
It's a very, very cool sport, and the drivers have their great personalities. In fact, I didn't even mention, Netflix just put a series out on NASCAR from last season's playoffs. So it's a six episode season. I can't remember what they call it. I'm blanking right now, but take a look at the Netflix series to get an idea of what I'm talking about with what NASCAR's all about.
It's pretty good. Pretty good little series.
[00:23:19] Megan Dibble: Sweet.
[00:23:20] Ken Black: Yeah.
[00:23:20] Megan Dibble: Yeah, listeners, go ahead and check that out. Well, Ken, thank you so much for joining us today. It's always great to have you on our podcast. Great discussion. And I think you mentioned some of your blogs. We can link those in the show notes for people to check out as well. But yeah, thanks for sharing about what you're working on.
[00:23:37] Ken Black: Megan, it's always a pleasure. Thank you for having me. I just work as an independent lonely soul here, just crunching numbers all day, so when anybody gives me a chance to talk, I have to take it. So thank you. I appreciate it.
[00:23:49] Megan Dibble: Yeah. Have a good one.
[00:23:51] Ken Black: Okay, bye-bye.
[00:23:53] Megan Dibble: Thanks for listening. To find a link to Ken's previous podcast appearance and other resources related to this episode, head over to our show notes on community.alteryx.com/podcast. See you next time.