Data Science Mixer

MaddieJ · ‎05-04-2021

Struggling to find insight with your image, video, audio, or text data? Join us for a conversation about Veritone’s aiWARE tools for Alteryx that help users at any skill level find breakthroughs in their data.

Panelists

Trevor Jones - @trevor_jones, LinkedIn
Robbie Booth - LinkedIn
Susan Currie Sivek - @SusanCS, LinkedIn, Twitter

Cocktail Conversation

During the podcast episode, Robbie talked about how he's using his data skills to support his kids' athletic careers, tracking and analyzing their baseball stats. Have you ever used your analytics and data science skills for a fun personal project or for someone in your family?

Join the conversation by commenting below!

Transcript

Episode Transcription

SUSAN: 00:00	Want to attend a virtual data conference that doesn't feel like watching an awkward YouTube video that runs all day? Well, our virtual Inspire conference is just right for you, and it's coming up on May 18th to 21st. Plus, it's free. We'll hear from some amazing data science experts at Inspire, including Dr. D.J. Patil, Jake Porway, Billy Beane, and Dr. Hannah Fry. Plus, many more data and tech superstars. We'll also tour the world for international Women in Data Science panel discussions, addressing data science in education and digital transformation. There are also other fantastic surprises and experiences sprinkled into the schedule. And we're so excited to have two special video sessions of Data Science Mixer premiering at the conference. I interviewed internationally recognized expert and author Alberto Cairo about data visualizations and how to create and consume them efficiently. I also had a chat with Renee Teate, who's the Director of Data Science for HelioCampus and also well-known for sharing her journey into data science on Twitter and in her Becoming a Data Scientist podcast. I hope you'll join us at Inspire for these exciting conversations and much more. Register for free now at inspire.alteryx.com.
SUSAN: 01:15	You know how on those TV crime shows, they catch a fuzzy image of a license plate on a surveillance camera, and they take it back to the tech experts who magically enhance the numbers on the plate to identify the villains? Well, while that magical enhance might be mostly Hollywood fiction, there are incredible advancements happening in AI, including tools that make it easier than ever for people at all skill levels to process audio, image, and video data. Welcome to Data Science Mixer, a podcast featuring top experts in lively and informative conversations that will change the way you do data science. I'm Susan Currie Sivek, the data science journalist for Alteryx Community. And in this episode, I talked with two folks from Alteryx partner, Veritone.
TREVOR: 02:00	My name is Trevor Jones, VP of Business Operations at Veritone. What we do is we're actually big Alteryx customers. Been using Alteryx for about seven, eight years now. Presented at Inspire and a big fan. What we've also done recently is built an integration with aiWARE in Alteryx, which is a really fun project to be a part of, and happy to be here.
SUSAN: 02:26	Awesome. Thank you. And Robbie?
ROBBIE: 02:27	Yeah. My name's Robbie Booth. I'm the one that talks funny. My official title, it seems to change. I've been here two years, and I'm going on my third title. But today, I think I'm Senior Director of AI Cognitive Engines for Veritone, or at least I think I will be tomorrow. Yesterday, I was head of engineering for a government legal and compliance organization. And yeah, I'm a he or the Scotts get. It's fine. That guy.
SUSAN: 02:57	All right. Great. Thank you so much. Trevor and Robbie are experts on Veritone's aiWARE tools. We talk about how the aiWARE tools integrate with Alteryx, dive into the layers of layered cognition, and find out why that magical enhance is, for now, a Hollywood-only feature. Let's get started. So, as you know, on Data Science Mixer, one of the things that we like to do during the show is treat ourselves to a little happy hour drink or snack. So are you having anything special with you there as we're chatting today?
ROBBIE: 03:28	Yes. Today, I'm not breaking out the good stuff, but I do have the Glenlivet. It's a 12-year-old single malt. I figured I would be a complete representation of a Scottish person for today's interview.
TREVOR: 03:47	Perfect. What I have is electrolyte water because I'm obviously a party animal. Watch out for me.
ROBBIE: 03:54	You are so SoCal.
TREVOR: 03:58	No. I actually have a bike ride after this and a couple more meetings, so just kind of keeping it low-key for now. But don't worry, I will catch up to you later Robbie. I promise.
ROBBIE: 04:10	Well, I'm umpiring my six-year-old's baseball game, hence the early start, because you kind of need it to get through that.
SUSAN: 04:17	Love it. Terrific. All right. So you guys gave us a little bit of a hint of what Veritone is all about, but maybe give us a little more detail here on some of the AI tools that you've developed and what different purposes they might be used for. And maybe tell us a little bit about one of your favorites in particular.
ROBBIE: 04:36	So what's interesting about Veritone is we really have two real things. So one is we have a platform that we can use to orchestrate, scale, and run different types of AI jobs. And what's kind of cool about it, and it's a problem that I was running into in other companies that I was working at, is that today it all kind of comes down to, where is the data, how do you land the data, and how do you decide what models you want to run? And you're running one model, and all of a sudden, some other company comes up with a better plan. And you want to switch, but it's just not simple. So Veritone came up with this really nice way of just kind of creating a common standard for input and output. And so using this platform, you can basically trivially swap, for example, translation models and then move them around and decide which one works best for your scenario because it's not one-size-fits-all.
ROBBIE: 05:29	And so we have this platform that you can scale and orchestrate, but of course, the whole sort of value premise and proposition is you have to be able to run different models on it. And so we went ahead and built tools and applications that sit on that layer and use those sort of engines and ones we've built ourselves, too. So we've spent a lot of time in my space-- obviously, Trevor's spent a ton of time around Alteryx because he's the finance geek and really cares about some specific ops problems. But we've really gotten into things like using computer vision to help redact faces for Freedom of Information Act requests, things like that, where you can sort of layer different types of cognition together and different types of models to create some really powerful results. And so it's been kind of interesting watching how those three sort of pieces of the stack, sort of the platform that lets you ingest, scale, and run different types of models, the different models themselves, and then applications you build on the models, can all kind of work together. Yeah. I mean, it's been super extensible and kind of a labor of love. But yeah, we're excited about it.
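[Editor's sketch: the "common standard for input and output" Robbie describes can be pictured as a shared interface that every engine implements, so one translation model can be swapped for another without touching the calling code. The example below is a minimal illustration under stated assumptions. The names EngineInput, EngineOutput, TranslationEngine, DictionaryEngine, PhraseEngine, and run_job are hypothetical and are not part of aiWARE or any Veritone API; the "engines" are toy lookups, not real models.]

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class EngineInput:
    """Hypothetical common input format shared by every engine."""
    text: str
    target_language: str


@dataclass
class EngineOutput:
    """Hypothetical common output format shared by every engine."""
    text: str
    engine_name: str


class TranslationEngine(Protocol):
    """Any object with a name and a process() method counts as an engine."""
    name: str

    def process(self, job: EngineInput) -> EngineOutput: ...


class DictionaryEngine:
    """Toy engine: translates word by word from a tiny lookup table."""
    name = "dictionary"
    _words = {"hello": "hola", "world": "mundo"}

    def process(self, job: EngineInput) -> EngineOutput:
        words = job.text.lower().split()
        translated = " ".join(self._words.get(w, w) for w in words)
        return EngineOutput(translated, self.name)


class PhraseEngine:
    """Toy engine: translates whole phrases, falling back to the input."""
    name = "phrase"
    _phrases = {"hello world": "hola a todos"}

    def process(self, job: EngineInput) -> EngineOutput:
        translated = self._phrases.get(job.text.lower(), job.text)
        return EngineOutput(translated, self.name)


def run_job(engine: TranslationEngine, job: EngineInput) -> EngineOutput:
    """The caller depends only on the shared interface, not on the engine."""
    return engine.process(job)


job = EngineInput(text="Hello world", target_language="es")
for engine in (DictionaryEngine(), PhraseEngine()):
    result = run_job(engine, job)
    print(f"{result.engine_name}: {result.text}")
```

[Because both engines accept the same input type and return the same output type, comparing them on a given scenario is just a loop over candidates.]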
TREVOR: 06:49	One of the use cases we've been exploring recently is around interaction analytics. So these days, there's so much content being generated through Zoom and Hangouts and all these things, especially since COVID, and it's a lot of data that's just not being tapped. And so one of the use cases that we really see emerging is around interaction analytics. And what that really means is, if you have a conversation between people, well, how can I query that? How can I query those recordings, right? And so some of the applications could be sales calls for sales enablement training, or are my sales reps being effective? Revenue-generating type activity, right? Everything to telemedicine. Are my doctors providing a good experience? Or call center. My call center agents, are they complying with our policies and are they providing a good experience? And to see all that information. And so kind of the way that works is either using our platform natively or working with Alteryx as well. You point it at the media, and you get dashboards that show you, what are they talking about? What's the sentiment? Computer vision aspects to figure out, what are the facial expressions? Who are the people on the call, using facial recognition? Content, entity extraction, topical extraction, those types of things. And what I really like about it is that we get to use such a diverse amount of AI engines together in one solution. And it's really fun as well because you get beautiful dashboards as well. So that's been kind of a fun one we've been doing over the last couple of months.
SUSAN: 08:26	Yeah. That's very cool. I was trying to count all the things that you were mentioning. Sounds like topic modeling, sentiment analysis, entity recognition. I mean, there's a lot going on there.
ROBBIE: 08:37	Yeah. Layered cognition is such a buzzword for us. That's when things really start to light up, when you can start to run a couple of different models on the same data set and all of a sudden, you're providing a ton of functionality that seems quite extraordinary. Or it would have seemed extraordinary, maybe four or five years ago.
SUSAN: 08:57	And now it's just ordinary. [crosstalk].
ROBBIE: 09:00	It's like, what are you going to do next?
SUSAN: 09:02	So your term for that is layered cognition? Did I catch that correctly?
ROBBIE: 09:05	I mean, that's what I've been calling it. I don't know what other folks would say. But it's basically the ability to have multiple models processing the same piece of information, either in serial or parallel, just depending on what you're doing. So it's things like being able to verify someone's identity by using computer vision and then getting a voice print and maybe some biometrics. It's doing voice analysis. It's really quite interesting.
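[Editor's sketch: "layered cognition" as Robbie defines it, several models processing the same piece of information in serial or in parallel, might look roughly like the following. This is an illustration only. The functions transcribe, sentiment, and detect_faces are hypothetical stand-ins that read pre-filled fields from a dictionary rather than running real models, and nothing here reflects how aiWARE schedules engines.]

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(media: dict) -> dict:
    """Stand-in for a speech-to-text model (reads a pre-filled field)."""
    return {"transcript": media["spoken_words"]}


def sentiment(result: dict) -> dict:
    """Stand-in for a sentiment model: counts positive minus negative words."""
    positive = {"great", "good", "love"}
    negative = {"bad", "terrible"}
    words = result["transcript"].lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return {**result, "sentiment": score}


def detect_faces(media: dict) -> dict:
    """Stand-in for a face recognition model (reads a pre-filled field)."""
    return {"faces": media["visible_people"]}


def run_serial(media: dict, steps: list) -> dict:
    """Serial layering: each model consumes the previous model's output."""
    result = media
    for step in steps:
        result = step(result)
    return result


def run_parallel(media: dict, models: list) -> dict:
    """Parallel layering: independent models see the same input at once."""
    merged = {}
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(model, media) for model in models]
        for future in futures:
            merged.update(future.result())
    return merged


recording = {
    "spoken_words": "this demo was great",
    "visible_people": ["Robbie", "Trevor"],
}
print(run_serial(recording, [transcribe, sentiment]))
print(run_parallel(recording, [transcribe, detect_faces]))
```

[Sentiment has to run after transcription because it needs the transcript, while face detection and transcription are independent and can run side by side.]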
SUSAN: 09:37	Yeah. Very cool. So it sounds like this is really a way of making the most of all of the unstructured data that's out there, whether it's these conversations, or other potential sources. What are some of the other kinds of sources of that unstructured data that you've seen folks working with?
ROBBIE: 09:54	So I think it would be fair to say that, driven by the market today, it's primarily audio and video. But what's interesting is that's just what's driving the bulk workloads today. We're starting to see other use cases really kind of come to the fore. And I think in the last 6 to 8 months, maybe 12 months, all of a sudden, it's been about integrations. Basically, integrating our stack with somebody who has a large data source and wants to light up and basically run some kind of operation that they can then do some kind of visualization against or get some insight on, so. But I think today, it's fair to say it's still mostly audio and video and sometimes static documents too. But, yeah, that would be the vast bulk of what we do.
SUSAN: 10:38	Very cool. And I'm assuming it would be possible to combine those different things as well, then, right? Combine somehow the documents with what you're getting out of audio and video and so forth?
ROBBIE: 10:48	Yeah. Yeah.
TREVOR: 10:49	That's actually a very popular use case for us is searchability across a diverse input of content, right? So e-discovery is something that is very common within the legal sub-vertical for us. And what they do is they have a massive amount of content: audio recordings, phone calls, documents, emails. You name it, they have it. And they need to just dump everything in and be able to search for keywords across all that content. And so that's one of those use cases that's right in our wheelhouse because what it does is it flexes the muscle of being able to run all the models that work together and have strong search capabilities.
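[Editor's sketch: once audio has been transcribed and documents have been read into text, the keyword search across mixed content that Trevor describes can be approximated with an inverted index. The example assumes the transcription step has already happened; the file names, the sample text, and the functions tokenize, build_index, and search are all hypothetical and do not represent Veritone's search implementation.]

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(items: dict) -> dict:
    """Map each token to the set of item IDs whose text contains it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for token in tokenize(text):
            index[token].add(item_id)
    return index


def search(index: dict, query: str) -> set:
    """Return the IDs of items that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Text already extracted from different media types (hypothetical data).
content = {
    "call_017.wav": "we discussed the merger timeline",
    "email_204.eml": "Merger documents attached",
    "memo_003.pdf": "lunch schedule for next week",
}
index = build_index(content)
print(sorted(search(index, "merger")))
print(sorted(search(index, "merger timeline")))
```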
SUSAN: 11:31	Yeah. Very cool. So going back to the use cases and the really interesting applications that you've seen, I always think it's neat to hear stories as much as we can in whatever terms you're willing and able to describe them, of people who have put these tools to use and really made some meaningful impact on their businesses and their outcomes. Are there a couple of favorite stories and use cases that you have that you've run across during your time working on these tools?
ROBBIE: 11:59	Veritone started off as a media and entertainment kind of centric organization, and then they found that the way the platform had been built, it would easily allow them to move to other business verticals. And so government legal has been something we kind of kicked off in the last maybe 18 months. But I know that we've-- so we have a product called IDentify, which really hasn't-- it's not really something we've used a ton while we're looking to see what happens with legislation around facial recognition. But that was actually used to catch someone who was pulled over for speeding and who had an existing warrant out for a violent crime in another district. So what's really interesting is that it turns out that a lot of local law enforcement doesn't actually share their known offender databases. And so the concept with that product is you basically just ingest all of the known offender databases from all of these different agencies and then they can search and do a facial match across all of those folks. And so I think somebody was picked up for DUI, turned out they were wanted for a violent crime elsewhere, and that they'd sort of fled. And I think it wasn't like it was just a couple of counties away. I think it was Alaska to Southern California. So it was something that you just wouldn't normally find.
ROBBIE: 13:27	I'm trying to think of the other analogies and stories that we've talked about publicly. I'm not quite sure which ones I'm allowed to mention. But there's a lot of use cases that are really interesting, like things in the power industry using drones. Like the DJI drones, for example, have NVIDIA Jetson devices onboard and really powerful computer vision and object detection techniques and models. You can use them to check power lines after a storm so that crews don't have to manually climb each pole to verify that it's okay. And it's those types of scenarios, being able to take like a whole ton of visual data and push it through and run an analysis on which ones have the anomalies that need to be manually checked by a human, that can just save tons and tons of time. So those are probably sort of two of the scenarios that I can think of just off the top of my head that I know we can talk about, but--
TREVOR: 14:31	Yeah. There's--
ROBBIE: 14:31	Go ahead.
TREVOR: 14:33	There's definitely one on the media entertainment side that I think is pretty fun. So a lot of the time we work with advertisers that place media on live sporting events, or they're sponsors. And so at live sporting events, you don't always know where the cameras are going to be, which way they're facing, things like that. So you want to try to measure the ad value of your sponsorship, right? And so what we use for that use case is logo detection. What we do is we ingest the media for the sporting event, and we run it through logo detection. And what it does is identify where the logo is on the screen, what angle it's at, and how prominent it is on the screen. And what we do is compile an aggregate index that essentially equates that to value. And then we match that against, whether it's Google Analytics impressions or orders and things like that. We actually attribute that back to the actual impression.
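[Editor's sketch: Trevor mentions compiling an aggregate index from where a logo appears, its angle, and its prominence, but the episode does not give the formula. The example below invents one purely for illustration: it weights on-screen time by logo size, how directly the logo faces the camera, and how close it sits to the center of the frame. The LogoDetection fields, the weights, and the functions prominence and exposure_index are all assumptions, not Veritone's method.]

```python
import math
from dataclasses import dataclass


@dataclass
class LogoDetection:
    """One hypothetical logo sighting produced by a detection model."""
    seconds_on_screen: float
    area_fraction: float   # share of the frame covered by the logo, 0..1
    angle_degrees: float   # 0 means the logo faces the camera head-on
    center_x: float        # horizontal position of the logo center, 0..1
    center_y: float        # vertical position of the logo center, 0..1


def prominence(d: LogoDetection) -> float:
    """Illustrative score: bigger, head-on, centered logos score higher."""
    facing = max(0.0, math.cos(math.radians(d.angle_degrees)))
    distance = math.hypot(d.center_x - 0.5, d.center_y - 0.5)
    centrality = 1.0 - min(1.0, distance / math.hypot(0.5, 0.5))
    # Assumed weighting: a logo in the far corner keeps half its value.
    return d.area_fraction * facing * (0.5 + 0.5 * centrality)


def exposure_index(detections: list) -> float:
    """Aggregate index: prominence-weighted seconds across all sightings."""
    return sum(d.seconds_on_screen * prominence(d) for d in detections)


sightings = [
    LogoDetection(10.0, 0.20, 0.0, 0.5, 0.5),   # large, head-on, centered
    LogoDetection(30.0, 0.05, 60.0, 0.9, 0.1),  # small, angled, near a corner
]
print(round(exposure_index(sightings), 3))
```

[An index like this would then be compared with downstream measures such as impressions or orders, which is the attribution step Trevor describes.]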
ROBBIE: 15:31	Yeah. So really actually that's a really good point, Trevor, this other-- I'm thinking in the M&E space, there's some really nice already announced things like our partnership with the San Francisco Giants. So we basically took the San Francisco Giants' entire media catalog, ingested it, indexed all the players, and now for the first time if they want to find an image or video of a specific player, they can very, very quickly pull all of the clips and highlights and create these amazing highlight reels going back throughout their entire recorded history, which is really cool. And of course, before that would have been people having to manually index and just trawl through things and just remember where things were and manually catalog. So super powerful.
SUSAN: 16:18	So, Robbie, you told us earlier that your six-year-old has his first baseball game tonight.
ROBBIE: 16:23	He does.
SUSAN: 16:23	So are you creating an archive of all of his baseball game media from over the years, so that--
ROBBIE: 16:28	Funnily enough, funnily enough--
SUSAN: 16:30	Uh-oh.
ROBBIE: 16:30	--I do. Yeah. So from my accent, you may be able to tell, I wasn't born here. So, therefore, I think this game's silly. But we're fully embracing it in the Booth household. And I'm not going to be the best coach or the best dad trainer, but we will be the best analytics team. Yeah. So I was showing you earlier we've got the-- Trevor's not seen this. Trevor, it's a smart baseball. IoT device. Charges via induction. And will measure spin, velocity, curve, whether it was in the strike zone, all that kind of fun stuff. We have the same thing for the actual baseball bat. And then all of the at-bats for all of the kids are on video, I index it all, and then I measure difference in variation of swing over time.
TREVOR: 17:20	[crosstalk].
SUSAN: 17:21	Wow.
ROBBIE: 17:22	Yeah. So, for example, my eldest son, we installed a batting cage and he got all excited. And so we've been cranking up the pitching machine and so it'll throw pitches at 46 to 50 miles an hour. And so he got all hyped up on hitting that. And now he's got this problem where he starts to swing too early and telemetry shows that his bat speed has slowed down. So yeah, just this is what you get for having a dork as a dad, I guess.
TREVOR: 17:51	Oh, that's awesome. Well, you may not know this, but I also have a six-year-old in baseball. So I have a feeling this is an area we should be collaborating.
ROBBIE: 17:59	We should be collaborating on this, Trevor. So we've got all the things. We've got the Axe Bat. We've got the swing tracker. We've got the pitch tracker now. Yeah. I would show you the telemetry of my nine-year-old's swings, but I don't think this podcast could handle the excitement.
TREVOR: 18:17	Oh, we can throw some descriptive analytics on that. I'm all over that.
ROBBIE: 18:20	Oh, well that's--
TREVOR: 18:21	I'm here.
ROBBIE: 18:23	That's it. I do have analytics for all of these things. It traces the swing. It shows you the potential damage of the shot. It tells you what you're doing wrong. But really, it's the power of data and it's amazing that in our lifetimes we're getting to see this application. It's been really neat.
SUSAN: 18:45	It's going to look very different when kids from your kid's generation will be moving up into college sports and so forth--
ROBBIE: 18:52	Oh, it's going to be crazy.
SUSAN: 18:52	--and all of the coaching process and recruitment will be--
ROBBIE: 18:55	Yeah. Well, we bought--
SUSAN: 18:56	Well, can we see your six-year-old's data?
ROBBIE: 18:57	Oh, yeah. Well, what was wild was we-- so I just bought this new camera. So it's by a company called VIO. And it's basically two 4K cameras that you position in one corner of the field and it captures the whole game and it tracks the ball. And what you do is you basically upload the entire game footage - so it's basically three hours of 4K video - up to their cloud and then they go run player identification on all of the players. So you can create highlights. It's absolutely out of control. They've been using this stuff in Europe for the remote cameramen for all of the soccer games when they kicked everything back up in the pandemic and they didn't want to have too many people that were near the players. So all the cameramen or camera people, rather, in British sports right now are all these robot cameras. It's amazing.
SUSAN: 19:56	Wow. So cool. So much [crosstalk].
ROBBIE: 19:58	Because I was getting really excited. I always end up pointing the camera at my feet when somebody does something good because I get too excited and see some Scottish unintelligible shouting. It's not good. It's terrific.
SUSAN: 20:13	That's terrific. It's awesome that you have parents with analytic skills who can take all these data, set all this up, have such amazing insights. When you've got tools like what you have at your fingertips, it sounds like there's so much that you can do with all of that data that you've collected. Thinking about folks who are wanting to work with their own kids' sports data or whatever other video, audio, text data that they might have, what would be some of the things that somebody who is maybe not super sophisticated in data science could try that they might like to experiment with? And then kind of on the flip side of that, for somebody who is more experienced, who is sophisticated in data science, what would be the appeal of trying out tools like what Veritone offers?
TREVOR: 21:01	Yeah. I'll take that one. I think that if you look at analytical maturity, every company is at a different place in their evolution, right? And I think there's a misconception out there, because AI is one of those just completely overused terms, right? And so when people are thinking about their own analytical maturity and what their aspirations are and what their roadmaps are and where they place their budgets, I think there are some folks out there that think that AI is, "Oh, yeah, that's the thing that I'm going to do after all of the things that I know of that I already need to do with my structured data and some automation initiatives and things like that." But it's really companies like us and others who are really making AI more approachable. It's something that you can just experiment with, right? And you don't have dependencies on your analytical maturity for you to start leveraging it, right? Just think of it as another data source that represents the vast majority of your data that you're not doing anything with today. And they're really kind of parallel efforts, I guess, is the way that I would say that.
ROBBIE: 22:12	Well, one of the reasons I came to this company in the first place was when I first saw what we were doing with Automate, simply because if you look at other industries where people have empowered scientists or just other folks with tools that don't require a high-level entry point. So if you think of video game development and the rise of the game engine, like Unity, Unreal Engine, and things like that, it's really amazing what the difference has been. I got my start in the video game business and to go make a game like when I was working on flights and we'd have 150 people and it'd take us five years and now--
SUSAN: 22:58	Deep sigh. [laughter]
ROBBIE: 22:59	Yeah. It was painful. And then we started playing around with the concepts of a common engine. And then really by sort of the end of my tenure in the video game space, you license your engine and the bulk of your team are really working in that sort of script layer or the drag-and-drop. So all of a sudden, your team doesn't have 150 engineers and a couple of token artists. You basically have designers. You have people who are creating the experience. You're basically empowering creatives. And I kind of think that that's really what Automate's going to do. Before, you had all of these barriers to entry. And really, I think Automate's goal, and that of other tools like it, which are inevitably going to pop up, is to really be the way that you can go run any kind of cognition without having to write any code to support it. And to do so in the cloud and to kind of pick what your environment looks like, so that if you need to run it on a GPU you can, and just make it super simple. So I think it's really exciting what things are going to look like in five years' time. I mean, I think our kids are going to be taking classes on how to build zero-shot object detection models, hook them up to their Lego and do it for a class project, and that will be second grade. It's going to be amazing. My second grader built a Lego R2-D2 robot and he kind of drives it using his iPad.
TREVOR: 24:29	[They do?].
ROBBIE: 24:30	So Trevor, I don't think it's a stretch. Imagine that with a little NVIDIA GPU in there and all of a sudden, you can go run some object detection and now R2-D2 can run around and not bump into walls. It's just mind-blowing how cool and how quickly this is moving. Because really the algorithms haven't really moved and advanced that much. It's the ability for more of us to apply them that's become sort of this democratization and this driving force that's important.
SUSAN: 25:00	Absolutely. I mean, when I was in elementary school, people were building volcanoes out of clay and using vinegar and baking soda. It just really pales in comparison to these other cool projects, so.
ROBBIE: 25:12	I think I just aged myself because we had this one computer in the whole school and it was this old IBM thing. And you got one hour on it in a month and you could go down there and write something in Logo.
SUSAN: 25:23	Yeah. Yeah. I hear that for sure. Oh, man. So on this note of democratizing access to these tools and algorithms and to AI here, how do people get started with this, if somebody wants to give this a shot with whatever data they have on hand?
TREVOR: 25:42	Well, we offer free trials on our platform. So anyone can just go to aiware.com and sign up for a free trial. And you can train libraries, right? You can use the Alteryx integration. You can load your own content and experiment immediately.
ROBBIE: 26:05	And as a developer, it's not even that hard. I mean, I remember-- going back to my job interview two years ago, our CEO Chad Steelberg is a really smart guy. And I got this one-- after my interview, I got this one-line email from him saying, "Go build something on the platform and make it cool." So it's like, "All right." So I went and grabbed the documentation, and I was able to build-- I mean, even at sort of a fairly early stage of maturity, I was able to build my own little engine. We call them engines in our platform. It's basically a way of running a model. And I was able to build a little computer vision model using some open-source code from GitHub and have it running and recognizing photos of our CEO. And I don't know. It took me about three or four hours to build the thing and that was from having zero experience or exposure to the platform. And it's advanced a lot since then as well.
SUSAN: 27:04	Very cool. And of course, I think Trevor, also you mentioned earlier the relationship with Alteryx. Can you talk a little bit more about that? And for people who are Alteryx users who listen to the podcast, how they might use the tools together?
TREVOR: 27:17	Definitely. One of the things that we've really enjoyed doing over the last three months, four months, is talking with the Alteryx ACEs. So we engaged with them immediately and started brainstorming ideas as to where the applications for aiWARE plus Alteryx, where those would be, right? And so that's been a really fun ride. We're also sponsoring Inspire this year. So we're excited that we'll be part of the keynote as well as the presentations by Mark Frisch as well as AJ Guisande. And so they're two of the strongest adopters out of the gate and have been really wonderful to work with as fellow thought leaders in the space. And I'd say, generally speaking, we're engaging with Alteryx at all levels, right? We're working with sales engineers. We're working with marketing. We're working with executive leadership. And it's just been really fun to see the companies come together and work together for a common cause, which is bringing AI into the analytics space.
SUSAN: 28:26	Definitely. We've had at least one guide to using the aiWARE tools with Alteryx published on the Alteryx Community. So that would be one place that folks could look for sort of an example of an application and then actually pulling the tools into a workflow and using them there.
TREVOR: 28:43	Actually, this week we're publishing starter kits. Some really great starter kits for people just to have some fun with their data. If you just want to do basic facial recognition or transcription or translation or just sentiment or things like that, whether you want to play just in one category at a time or bring them together for the full interaction analytics solution, they're built to support that. And we think that'll just make it a lot easier for an Alteryx user to not really have to think about what they're doing AI-wise and just kind of point it at some media and see what you get.
SUSAN: 29:18	Nice. Yeah. And if those are published when the podcast is published, then we'll be sure to link to that in the show notes as well so folks can easily click through and check those out. So kind of a random question that's something I was curious about. I think Robbie earlier you mentioned the company coming out of this media and entertainment background. So I'm just curious kind of how that evolution happened to now aiWARE, right? Because I can see a connection but I'm curious about that if that's something you can flesh out a little bit.
ROBBIE: 29:47	Well, Trevor, you pre-date me by a little while, but I can give you all I know. So we're kind of doing this archaeological hunt here. So obviously, Veritone was formed by the Steelberg brothers. A big background in media and entertainment and the ad space. They'd sold their previous companies to Google. They start Veritone. And really, they were basing it on sort of the [inaudible]. So the direct correlation to [inaudible] would be truth in the signal, or at least how they're choosing to interpret it. And then really sort of aiWARE as a product, what's been really cool about this, and it really resonated with me as someone who in my previous life, my whole job had been to try and land Fortune 500 companies with these big AI workloads and just running a whole ton of complexity and pain. Not fun stuff. Just really frustrating work.
ROBBIE: 30:48	So the idea of an operating system that would allow me to just orchestrate AI workloads, regardless of whether I want to run it-- whether I want to run it on-prem. I can be cloud-agnostic. I can run on a laptop. I can run on a desktop computer. Our North Star is pursuing sort of truth in the signal of noise and in unstructured data primarily. I think that was our origin. And then the methodology by which we would do that was we'd build this operating system that would basically simplify a lot of that complexity and just remove that from us in the same way that cloud meant that we don't need to really worry about the size of the box that we have sitting under our desks that we want to go crunch and build a model with. It's been very liberating, I think. Trevor, do you have any other funny insight? I mean, I liked my story, but if it was really just because they were playing beer pong and they decided that Veritone would be cool. I don't think we should--
TREVOR: 31:47	I had two or three bullets queued up in the back of my head and you crushed it, man. You already said them, so. [laughter]
ROBBIE: 31:56	Because I can just imagine Chad and Ryan going surfing and then having this argument about what it should be called.
TREVOR: 32:03	Ryan made this really funny comment to me one time because they're serial entrepreneurs and they've been just hitting them out of the park. And so I think Ryan looked at Chad and Chad's like, "I want to start an AI company." And Ryan's like, "Of course he had to pick something hard." [laughter] Making AI easy. [crosstalk], right? And of course, you have to do that. [laughter]
SUSAN: 32:32	Amazing. That's great. As a big nerd, I love that there's etymology behind the name. I think that's super cool. Yeah. It's interesting how many times the ideas of playing and having fun with what you're doing in AI, I mean, I think those words have recurred kind of a lot in this conversation. So I think that says a lot about what you all are working on and how folks will enjoy working with it. So one question that we ask all of our guests and that I will ask you all, this is our recurring Alternative Hypothesis segment. Very formal here. And the question is, and if either or both of you would like to take this, feel free. What is something that people often think is true about data science or about AI, but that you have found to be incorrect?
TREVOR: 33:23	Well, I mentioned mine earlier, which is that AI comes after analytical maturity, right? When you're moving from descriptives, through predictive, through prescriptive analytics, and you're evolving on the value chain, that AI's sort of that thing at the end. But it's just not true, right? So I think that AI is here, it's approachable, it's easy, and that nearly every business process could benefit from it in a very simple way.
ROBBIE: 34:02	I think for me, it would be that it's still not magic. The number of times we'll talk to-- I think Trevor landed on talking about how different companies have different levels of maturity. And so we'll talk to a company and they'll have audio that's recorded at an extremely low bit rate with a ton of background noise on it. You can barely understand what somebody's saying. And then the expectation that somehow you would get accurate transcription from that, it's just not good. I mean, there's certain things you can do, obviously. You can run a bunch of noise reduction. You can try and pop the signal a little bit. But at the end of the day, it's an algorithm somebody wrote, operationalized, turned into a model, trained, and then we put it on the platform. And so while you're looking for signal in that noise, for sure, your noise can't be garbage. And I think there's still-- we still run into that every-- I would say a good percentage of the time that there's just a bunch of corrupt stuff and there's very little you can do with that.
TREVOR: 35:14	That's really true. Yeah. It's really true. Sometimes we get surveillance video, right? So we're going to [crosstalk]. Okay. "Well, why didn't the engine identify that person?" And then you're looking at the image and you're like, "Can you tell me who that person is?"
ROBBIE: 35:30	It's because they all watch that TV show. It's like name that cop show where they run the magic enhance button.
SUSAN: 35:35	Magnify. Magnify.
ROBBIE: 35:36	It's like, "Where's the magic enhance button?" Then, you press it, and all of a sudden, it magically becomes this high-definition image. So, no.
SUSAN: 35:44	That doesn't exist? What? Oh. [laughter]
ROBBIE: 35:48	I don't even know the names of the shows, but I know there's a ton of them.
SUSAN: 35:51	That's great. That's a great point. Yeah. Data science is not magic. You have to have some sort of foundation there to work with before you can actually do much with it. So that's a great point.
ROBBIE: 36:00	We do spend quite a bit of time now in education sort of working with customers, sort of explaining what are the things that they can do. Actually, one last good example we had was we had-- so we have this redaction product I talked about earlier, which we use for Freedom of Information requests. And a university had an incident and they sent us all this video and asked us if we could go ahead and just redact all of the people in it. Well, what's interesting is not only do we use-- we don't do facial recognition, we do head detection. So we've basically trained an object detection engine to detect heads. And then we put bounding boxes around those heads. And then we have some other tech, like if you occlude a head so all of a sudden, you're off the screen or something or you step behind somebody we use interpolation to kind of keep track of where we think you are. And the solution works pretty well. It's kind of a combination of data science and just good old-fashioned software engineering.
ROBBIE: 37:02	Well, what's interesting was when they sent us the video, the video had an effective frame rate of three frames per second. And we maybe only sample the video maybe twice a second or something like that. So we ended up with all these weird scenarios where humans would appear to blip and teleport across the screen. So you wouldn't know that they were the same object or something. But that doesn't happen a lot because at some point that facility probably had to make decisions on storage and that determined how many times they were going to take basically static shots. And then they ran all those static shots through some kind of movie maker and ended up with this bizarre kind of frame rate thing. So I think dealing with those types of data sources is going to be a real thing. Like 7-Eleven grainy video feeds and all kinds of other things. Yeah. That's going to be a reality for quite some time to come, I think.
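The interpolation idea Robbie describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Veritone's implementation: the Box class, the fill_gaps function, and the max_gap parameter are hypothetical names, and the sketch assumes the detections for one head have already been matched across frames.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def fill_gaps(detections: Dict[int, Box], max_gap: int = 10) -> Dict[int, Box]:
    """Estimate boxes for frames where one tracked head was not detected.

    detections maps frame index -> detected box for a single head.
    Gaps longer than max_gap frames are left unfilled, since a straight-line
    guess over a long gap is unlikely to cover the real head position.
    """
    filled = dict(detections)
    frames = sorted(detections)
    for start, end in zip(frames, frames[1:]):
        gap = end - start
        if gap <= 1 or gap > max_gap:
            continue  # nothing missing, or too far apart to trust
        a, b = detections[start], detections[end]
        for f in range(start + 1, end):
            t = (f - start) / gap
            filled[f] = Box(
                lerp(a.x, b.x, t),
                lerp(a.y, b.y, t),
                lerp(a.w, b.w, t),
                lerp(a.h, b.h, t),
            )
    return filled


# A head detected at frames 0 and 4, occluded in between.
track = {0: Box(100, 50, 40, 40), 4: Box(140, 50, 40, 40)}
filled = fill_gaps(track)
```

The same sketch hints at why the low-frame-rate footage was a problem: when a person moves a long way between two sampled frames, it becomes hard to tell that the two detections belong to the same head, and a straight-line estimate between them may no longer cover where the head actually was.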
TREVOR: 38:05	Yeah. And one thing that's been really nice with COVID, I think - I'm the first person to start a sentence that way - is that these calls, video calls, like Zoom and Teams and all that, is really good for facial recognition, right? So it's right square on the face. It's usually good resolution, it's prominent in the screen, and the audio quality is really good for transcription and some of the NLP stuff. So the good news is that that tends to work really well. We've actually been kind of blown away. We're training libraries to do facial recognition on some of the calls they've been getting. And it's amazing that we'll just take one picture of somebody from their LinkedIn profile and it'll catch every frame in the recording. And we're used to having to train three to four to five different images for the same person. So that's been really cool to see.
SUSAN: 39:04	Yeah. Interesting. Yeah, certainly all different kinds of data available now that maybe we didn't have in the past that we can do new things with. So anything that we haven't talked about yet that you want to get in there, that you want to be sure to mention?
TREVOR: 39:17	Yeah. I think that, given the diverse opportunity that we have within the AI space and just the general capabilities of what AI is able to do, is that we love new ideas. For myself, I think one of the things that we're going to have fun with over the next few months is jumping into predictive and onboarding predictive models and things like that to help us with forecasting and orchestrating some of our other engines. And I think that's going to be a really fun project. But even if you take what they're doing with Illuminate where you're taking known offenders database, yeah, but what about for retail? If you want to track VIPs or you want to track known fraudulent folks or whoever and maybe things like that? And so there's so many different opportunities where AI is going to disrupt everything. And we love just talking about those ideas and just problem-solving. And so I would just encourage anyone who is having these ideas to reach out and let's start a conversation.
ROBBIE: 40:22	Preferably with Scotch. [laughter] It doesn't have to--
SUSAN: 40:26	Well, I think you all have given our listeners a lot to think about and a lot to also be excited about. So I hope some of them will give this a try. It's very exciting. So thank you both for joining us on Data Science Mixer.
TREVOR: 40:36	Absolutely. Thanks for the time.
ROBBIE: 40:38	Yeah. Thanks very much. And go Mighty Mussels. I have to give a shout-out to my six-year-old. Go, Ronan!
SUSAN: 40:42	Yes. Can't wait to hear about the game. [laughter] All right. Thanks, guys. Thanks for listening to our Data Science Mixer chat with Trevor Jones and Robbie Booth of Veritone. Hear more from Veritone at the Alteryx Inspire Conference on May 18th to 21st. We've also got links for resources for trying out Veritone's aiWARE tools in your Alteryx workflow in our show notes at community.alteryx.com/podcast. Also, be sure to join us on the Alteryx Community for this week's Cocktail Conversation to share your thoughts. Robbie and Trevor talked about how they're using their data skills to support their kids' athletic careers, tracking and analyzing their baseball stats. Have you ever used your analytics and data science skills for a fun personal project or for someone in your family? Tell us about what you did and share your ideas. You can leave a comment directly on the episode page at community.alteryx.com/podcast or post on social media with the hashtag #DataScienceMixer and tag Alteryx. Cheers.

SUSAN: 00:00 Want to attend a virtual data conference that doesn't feel like watching an awkward YouTube video that runs all day? Well, our virtual Inspire conference is just right for you, and it's coming up on May 18th to 21st. Plus, it's free. We'll hear from some amazing data science experts at Inspire, including Dr. D.J. Patil, Jake Porway, Billy Beane, and Dr. Hannah Fry. Plus, many more data and tech superstars. We'll also tour the world for international Women in Data Science panel discussions, addressing data science in education and digital transformation. There are also other fantastic surprises and experiences sprinkled into the schedule. And we're so excited to have two special video sessions of Data Science Mixer premiering at the conference. I interviewed internationally recognized expert and author Alberto Cairo about data visualizations and how to create and consume them efficiently. I also had a chat with Renee Teate, who's the Director of Data Science for HelioCampus and also well-known for sharing her journey into data science on Twitter and in her Becoming a Data Scientist podcast. I hope you'll join us at Inspire for these exciting conversations and much more. Register for free now at inspire.alteryx.com. SUSAN: 01:15 You know how on those TV crime shows, they catch a fuzzy image of a license plate on a surveillance camera, and they take it back to the tech experts who magically enhance the numbers on the plate to identify the villains? Well, while that magical enhance might be mostly Hollywood fiction, there are incredible advancements happening in AI, including tools that make it easier than ever for people at all skill levels to process audio, image, and video data. Welcome to Data Science Mixer, a podcast featuring top experts in lively and informative conversations that will change the way you do data science. I'm Susan Currie Sivek, the data science journalist for Alteryx Community. 
And in this episode, I talked with two folks from Alteryx partner, Veritone. TREVOR: 02:00 My name is Trevor Jones, VP of Business Operations at Veritone. What we do is we're actually big Alteryx customers. Been using Alteryx for about seven, eight years now. Presented at Inspire and a big fan. What we've also done recently is built-in integration with aiWARE in Alteryx, which is a really fun project to be a part of, and happy to be here. SUSAN: 02:26 Awesome. Thank you. And Robbie? ROBBIE: 02:27 Yeah. My name's Robbie Booth. I'm the one that talks funny. My official title, it seems to change. I've been here two years, and I'm going on my third title. But today, I think I'm Senior Director of AI Cognitive Engines for Veritone, or at least I think I will be tomorrow. Yesterday, I was head of engineering for a government legal and compliance organization. And yeah, I'm a he or the Scotts get. It's fine. That guy. SUSAN: 02:57 All right. Great. Thank you so much. Trevor and Robbie are experts on Veritone's aiWARE tools. We talk about how the aiWARE tools integrate with Alteryx, dive into the layers of layered cognition, and find out why that magical enhance is, for now, a Hollywood-only feature. Let's get started. So, as you know, on Data Science Mixer, one of the things that we like to do during the show is treat ourselves to a little happy hour drink or snack. So are you having anything special with you there as we were chatting today? ROBBIE: 03:28 Yes. Today, I'm not breaking out the good stuff, but I do have the Glenlivet. It's a 12-year-old single malt. I figured I would be a complete representation of a Scottish person for today's interview. TREVOR: 03:47 Perfect. What I have is electrolyte water because I'm obviously a party animal. Watch out for me. ROBBIE: 03:54 You are so SoCal. TREVOR: 03:58 No. I actually have a bike ride after this and a couple more meetings, so just kind of keeping it low-key for now. 
But don't worry, I will catch up to you later Robbie. I promise. ROBBIE: 04:10 Well, I'm umpiring my kid's six-year-old baseball game which is, hence why the early start because you kind of need it to get through that. SUSAN: 04:17 Love it. Terrific. All right. So you guys gave us a little bit of a hint of what Veritone is all about, but maybe give us a little more detail here on some of the AI tools that you've developed and what different purposes they might be used for. And maybe tell us a little bit about one of your favorites in particular. ROBBIE: 04:36 So what's interesting about Veritone is we really have two real things. So one is we have a platform that we can use to orchestrate scale and run different types of AI jobs. And what's kind of cool about it and it's a problem that I was running into in other companies that I was working at is that today it all kind of comes down to, where is the data, how do you land the data, and how do you decide what models you want to run? And you're running one model, and all of a sudden, some other company comes up with a better plan. And you want to switch, but it's just not simple. So Veritone came up with this really nice way of just kind of creating a common standard for input and output. And so using this platform, you can basically trivially swap, for example, translation models and then move them around and decide which one works best for your scenario because it's not one-size-fits-all. ROBBIE: 05:29 And so we have this platform that you can scale and orchestrate, but of course, the whole sort of value premise and proposition is you have to be able to run different models on it. And so we went ahead and built tools and applications that sit on that layer and use those sort of engines and ones we've built ourselves to. So we've spent a lot of time in my space-- obviously, Trevor's spent a ton of time around Alteryx because he's the finance geek and really cares about some specific ops problems. 
But we've really gotten into things like using computer vision to help redact faces for Freedom of Information Act requests, things like that, where you can sort of layer different types of cognition together and different types of models to create some really powerful results. And so it's been kind of interesting watching how those three sort of pieces of the stack, sort of the platform that lets you ingest, scale, and run different types of models, the different models themselves and then applications you build on the models can all kind of work together. Yeah. I mean, it's been super extensible and kind of a labor of love. But yeah, we're excited about it. TREVOR: 06:49 One of the use cases we've been exploring recently is around interaction analytics. So these days, there's so much content being generated through Zoom and Hangouts and all these things, especially since COVID, and it's a lot of data that's just not being untapped. And so one of the use cases that we really see emerging is around interaction analytics. And what that really means is if you have a conversation between people is, well, how can I query that? How can I query those recordings, right? And so some of the applications could be sales calls for sales enablement training or are my sales reps being effective? Revenue generating type activity, right? Everything to telemedicine. Are my doctors providing a good experience? Or call center. My call center agents, are they complying to our policies and are they providing a good experience? And to see all that information. And so kind of the way that works is either using our platform natively or working with Alteryx as well. As you say, you pointed out the media, and you get dashboards that shows you, what are they talking about? What's the sentiment? Computer vision aspects to figure out, what are the facial expressions? Who are the people on the call using facial recognition? Content, entity extraction, topical extraction, those types of things. 
And what I really like about it is that we get to use such a diverse amount of AI engines together in one solution. And it's really fun as well because you get beautiful dashboards as well. So that's been kind of a fun one we've been doing over the last couple of months. SUSAN: 08:26 Yeah. That's very cool. I was trying to count all the things that you were mentioning. Sounds like topic modeling, sentiment analysis, entity recognition. I mean, there's a lot going on there. ROBBIE: 08:37 Yeah. Layer cognition is such a buzzword for us. That's when things really start to light up when you can start to run a couple of different models on the same data set and all of a sudden, you're providing a ton of functionality that seems quite extraordinary. Or it would have seemed extraordinary, maybe four or five years ago. SUSAN: 08:57 And now it's just ordinary. [crosstalk]. ROBBIE: 09:00 It's like, what are you going to do next? SUSAN: 09:02 So your term for that is layered cognition? Did I catch that correctly? ROBBIE: 09:05 I mean, that's what I've been calling it. I don't know what other folks would say. But it's basically the ability of running to either have multiple models of processing the same piece of information, either in serial or parallel, just depending on what you're doing. So it's things like being able to verify someone's identity by using computer vision and then getting a voice print and maybe some biometrics. It's doing voice analysis. It's really quite interesting. SUSAN: 09:37 Yeah. Very cool. So it sounds like this is really a way of making the most of all of the unstructured data that's out there, whether it's these conversations, or other potential sources. What are some of the other kinds of sources of that unstructured data that you've seen folks working with? ROBBIE: 09:54 So I think it would be fair to say that, driven by the market today, it's primarily audio and video. 
But what's interesting is that's just what's driving the bulk workloads today. We're starting to see other use cases really kind of come to the fore. And I think in the last 6 to 8 months, maybe 12 months, all of a sudden, it's been about integrations. Basically, integrating our stack with somebody who has a large data source and wants to light up and basically run some kind of operation that they can then do some kind of visualization against or get some insight on, so. But I think today, it's fair to say it's still mostly audio and video and sometimes static documents too. But, yeah, that would be the vast bulk of what we do. SUSAN: 10:38 Very cool. And I'm assuming it would be possible to combine those different things as well, then, right? Combine somehow the documents with what you're getting out of audio and video and so forth? ROBBIE: 10:48 Yeah. Yeah. TREVOR: 10:49 That's actually a very popular use case for us is searchability across a diverse input of content, right? So e-discovery is something that is very common within the legal sub-vertical for us. And what they do is they have a massive amount of content: audio recordings, phone calls, documents, emails. You name it, they have it. And they need to just dump everything in and be able to search for keywords across all that content. And so that's one of those use cases that's right in our wheelhouse because what it does is it flexes the muscle of being able to run all the models that work together and have strong search capabilities. SUSAN: 11:31 Yeah. Very cool. So going back to the use cases and the really interesting applications that you've seen, I always think it's neat to hear stories as much as we can in whatever terms you're willing and able to describe them, of people who have put these tools to use and really made some meaningful impact on their businesses and their outcomes. 
Are there are a couple of favorite stories and use cases that you have that you've run across during your time working on these tools? ROBBIE: 11:59 Veritone started off as a media and entertainment kind of centric organization, and then they found that the way the platform had been built, it would easily allow them to move to other business verticals. And so government legal has been something we kind of kicked off in the last maybe 18 months. But I know that we've-- so we have a product called IDentify, which really hasn't-- it's not really something we've used a ton while we're looking to see what happens with legislation around facial recognition. But that was actually used to catch someone who was pulled over for speeding, that there was an existing warrant out for a violent crime in another district. So what's really interesting is that it turns out that a lot of local law enforcement doesn't actually share their known offender databases. And so the concept with that product is you basically just ingest all of the known offender databases from all of these different agencies and then they can search and do a facial match across all of those folks. And so I think somebody was picked up for DUI, turned out they were wanted for a violent crime elsewhere, and that they'd sort of fleed. And I think it wasn't like it was just a couple of counties away. I think it was Alaska to Southern California. So it was something that you just you'd normally find. ROBBIE: 13:27 I'm trying to think of the other analogies and stories that we've talked about publicly. I'm not quite sure which ones I'm allowed to mention. But there's a lot of use cases that are really interesting, like things in the power industry using drones, like the DGI drones, for example have a NVIDIA Jetson devices onboard and really powerful computer vision and object detection techniques and models. 
You can use them to check power lines after a storm so that crews don't have to manually climb each pole to verify that that's okay. And it's those types of scenarios, being able to take like a whole ton of visual data and push it through and run an analysis on which ones have the anomalies that need to be manually checked by a human that can just save tons and tons of time. So those are probably sort of two of the scenarios that I can think of just off the top of my head that I know we can talk about, but-- TREVOR: 14:31 Yeah.There's-- ROBBIE: 14:31 Go ahead. TREVOR: 14:33 There's definitely one on the media entertainment side that I think is pretty fun. So a lot of the time we work with advertisers that place media on live sporting events or their sponsors. And so at live sporting events, you don't always know where the cameras are going to be, which way they're facing, things like that. So you want to try to measure the ad value of your sponsorship, right? And so what we use for that use case is logo detection. What we do is we ingest the media for the sporting event, and we run it through logo detection. And what it does is identifies where it is on the screen, what angle it's at and how prominent it is on the screen. And what we do is compile an aggregate index that essentially equates that to value. And then we match that against, whether it's Google analytics impressions or orders and things like that. We actually attribute that back to the actual impression. ROBBIE: 15:31 Yeah. So really actually that's a really good point, Trevor, this other-- I'm thinking in the M&E space, there's some really nice already announced things like our partnership with the San Francisco Giants. 
So we basically took the San Francisco Giants entire media catalog, ingested it, indexed all the players, and now for the first time if they want to find an image or video of a specific player, they can very, very quickly pull all of the clips and highlights and create these amazing highlight reels going back throughout their entire recorded history, which is really cool. And of course, before that would have been people having to manually index and just trawl through things and just remember where things were and manually catalog. So super powerful. SUSAN: 16:18 So, Robbie, you told us earlier that your six-year-old has his first baseball game tonight. ROBBIE: 16:23 He does. SUSAN: 16:23 So are you creating an archive of all of his baseball game media from over the years, so that-- ROBBIE: 16:28 Funnily enough, funnily enough-- SUSAN: 16:30 Uh-oh. ROBBIE: 16:30 --I do. Yeah. So from my accent, you may be able to tell, I wasn't born here. So, therefore, I think this games silly. But we're fully embracing it in the Booth household. And I'm not going to be the best coach or the best dad trainer, but we will be the best analytics team. Yeah. So I was showing you earlier we've got the-- Trevor's not seen this. Trevor, it's a smart baseball. IoT device. Charges via induction. And will measure spin, velocity, curve, whether it was in the strike zone, all that kind of fun stuff. We have the same thing for the actual baseball bat. And then all of the bats for all of the kids is on video, I index it all, and then I measure difference in variation of swing over time. TREVOR: 17:20 [crosstalk]. SUSAN: 17:21 Wow. ROBBIE: 17:22 Yeah. So, for example, my eldest son, we installed a batting cage and he got all excited. And so we've been cranking up the pitching machine and so he'll throw pitches at 46 to 50 miles an hour. And so he got all hyped up on hitting that. 
And now he's got this problem where he starts to swing too early and telemetry shows that his bat speed has slowed down. So yeah, just this is what you get for having a dork as a dad, I guess. TREVOR: 17:51 Oh, that's awesome. Well, you may not know this, but I also have a six-year-old in baseball. So I have a feeling this is an area we should be collaborating. ROBBIE: 17:59 We should be collaborating on this Trevor. So we've got all the things. We've got the Axe Bat. We've got the swing tracker. We've got the pitch tracker now. Yeah. I would show you the telemetry of my nine-year-old's swings, but I don't think this podcast could handle the excitement. TREVOR: 18:17 Oh, we'll can grow some descriptive analytics on that. I'm all over that. ROBBIE: 18:20 Oh, well that's-- TREVOR: 18:21 I'm here. ROBBIE: 18:23 That's it. I do have analytics for all of these things. It traces the swing. it shows you the potential damage of the shot. It tells you what you're doing wrong. But really, it's the power of data and it's amazing that in our lifetimes we're getting to see this application. It's been really neat. SUSAN: 18:45 It's going to look very different when kids from your kid's generation will be moving up into college sports and so forth-- ROBBIE: 18:52 Oh, it's going to be crazy. SUSAN: 18:52 --and all of the coaching process and recruitment will be-- ROBBIE: 18:55 Yeah. Well, we bought-- SUSAN: 18:56 Well, can we see your six-year-old data? ROBBIE: 18:57 Oh, yeah. Well, what was wild was we-- so I just bought this new camera. So it's by a company called VIO. And it's basically two 4 K cameras that you position in one corner of the field and it captures the whole game and it tracks the ball. And what you do is you basically upload the entire game footage - so it's basically three hours times 4 K video - up to their cloud and then they go run player identification on all of the players. So you can create highlights. It's absolutely out of control. 
They've been using this stuff in Europe for the remote cameramen for all of the soccer games when they kicked everything back up in the pandemic and they didn't want to have too many people that were near the players. So all the cameramen or camera people, rather, in British sports right now are all these robot cameras. It's amazing. SUSAN: 19:56 Wow. So cool. So much [crosstalk]. ROBBIE: 19:58 Because I was getting really excited. I always end up pointing the camera at my feet when somebody does something good because I get too excited and see some Scottish unintelligible shouting. It's not good. It's terrific. SUSAN: 20:13 That's terrific. It's awesome that you have parents with analytic skills who can take all these data, set all this up, have such amazing insights. When you've got tools like what you have at your fingertips, it sounds like there's so much that you can do with all of that data that you've collected. Thinking about folks who are wanting to work with their own kids' sports data or whatever other video, audio, text data that they might have, what would be some of the things that somebody who is maybe not super sophisticated in data science could try that they might like to experiment with? And then kind of on the flip side of that, for somebody who is more experienced, who is sophisticated in data science, what would be the appeal of trying out tools like what Veritone offers? TREVOR: 21:01 Yeah. I'll take that one. I think that if you look at analytical maturity, every company is at a different place in their evolution, right? And I think one of the misconceptions out there is that because AI is one of those just completely overused terms, right? 
And so when people are thinking about their own analytical maturity and what their aspirations are and what their roadmaps are and where they place their budgets, I think there are some folks out there that think that AI is, "Oh, yeah, that's the thing that I'm going to do after all of the things that I know of that I already need to do with my structured data and some automation initiatives and things like that." But it's really companies like us and others who are really making AI more approachable. Is it's something that you can just experiment with, right? And you don't have dependencies on your analytical maturity for you to start leveraging it, right? Just think of it as another data source that represents the vast majority of your data that you're not doing anything with today. And they really kind of parallel efforts, I guess, is the way that I would say that. ROBBIE: 22:12 Well, one of the reasons I came to this company in the first place was when I first saw what we were doing with Automate simply because if you look at other industries where people have empowered scientists or just other folks with tools that don't require a high-level entry point. So if you think of video game development and the rise of the game engine, like unity, unreal engine, and things like that, it's really amazing what the difference has been. I got my start in the video game business and to go make a game like when I was working on flights and we'd have 150 people and it'd take us five years and now-- SUSAN: 22:58 Deep sigh. [laughter] ROBBIE: 22:59 Yeah. It was painful. And then we started playing around with the concepts of a common engine. And then really by sort of the end of my tenure in the video game space, you license your engine and the bulk of your team are really working in that sort of script layer or the drag-and-drop. So all of a sudden, your team doesn't have 150 engineers and a couple of token artists. You basically have designers. 
You have people who are creating the experience. You're basically empowering creatives. And I kind of think that that's really what automates going to do. Before, you had all of these barriers to entry. And really, I think automates goal and other tools like it, which are inevitably going to pop up, is to really be the way that you can go run any kind of cognition without having to write any code to support it. And to do so in the cloud and to kind of pick what your environment looks like, so that if you need to run it on a GPU you can and just make it super simple. So I think it's really exciting what things are going to look like in five years' time. I mean, I think our kids are going to be taking classes on how to build zero-shot object detection models, hook them up to their Lego and do it for a class project, and that will be second grade. It's going to be amazing. My second grader built a Lego R2-D2 robot and he kind of drives it using his iPad. TREVOR: 24:29 [They do?]. ROBBIE: 24:30 So Trevor, I don't think it's a stretch. Imagine that with a little NVIDIA GPU in there and all of a sudden, you can go run some object detection and now R2-D2 can run around and not bump into walls. It's just mind-blowing how cool and how quickly this is moving. Because really the algorithms haven't really moved in advance that much. It's the ability for more of us to apply that's become sort of this democratization and this driving force that's important. SUSAN: 25:00 Absolutely. I mean, when I was in elementary school, people were building volcanoes out of clay and using vinegar and baking soda. It just really pales in comparison to these other cool projects, so. ROBBIE: 25:12 I think I just aged myself because we had this one computer in the whole school and it was this old IBM thing. And you got one hour on it in a month and you could go down there and write something in logo. SUSAN: 25:23 Yeah. Yeah. I hear that for sure. Oh, man. 
So on this note of democratizing access to these two algorithms and to AI here, how do people get started with this, if somebody wants to give this a shot with whatever data they have on hand?
TREVOR: 25:42 Well, we offer free trials on our platform. So anyone can just go to aiware.com and sign up for a free trial. And you can train libraries, right? You can use the Alteryx integration. You can load your own content and experiment immediately.
ROBBIE: 26:05 And as a developer, it's not even that hard. I mean, I remember-- going back to my job interview two years ago, our CEO Chad Steelberg is a really smart guy. And I got this one-- after my interview, I got this one-line email from him saying, "Go build something on the platform and make it cool." So it's like, "All right." So I went and grabbed the documentation, and I was able to build-- I mean, even at sort of a fairly early stage of maturity, I was able to build my own little engine. We call them engines in our platform. It's basically a way of running a model. And I was able to build a little computer vision model using some open-source code from GitHub and have it running and recognizing photos of our CEO. And I don't know. It took me about three or four hours to build the thing and that was from having zero experience or exposure to the platform. And it's advanced a lot since then as well.
SUSAN: 27:04 Very cool. And of course, I think Trevor, also you mentioned earlier the relationship with Alteryx. Can you talk a little bit more about that? And for people who are Alteryx users who listen to the podcast, how they might use the tools together?
TREVOR: 27:17 Definitely. One of the things that we've really enjoyed doing over the last three months, four months, is talking with the Alteryx ACEs. So we engaged with them immediately and started brainstorming ideas as to where the applications for aiWARE plus Alteryx, where those would be, right? And so that's been a really fun ride.
We're also sponsoring Inspire this year. So we're excited that we'll be part of the keynote as well as the presentations by Mark Frisch as well as AJ Guisande. And so they're two of the strongest adopters out of the gate and have been really wonderful to work with us as fellow thought leaders in the space. And I'd say, generally speaking, we're engaging with Alteryx at all levels, right? We're working with sales engineers. We're working with marketing. We're working with executive leadership. And it's just been really fun to see the companies come together and work together for a common cause, which is bringing AI into the analytics space.
SUSAN: 28:26 Definitely. We've had at least one guide to using the aiWARE tools with Alteryx published on the Alteryx Community. So that would be one place that folks could look for sort of an example of an application and then actually pulling the tools into a workflow and using them there.
TREVOR: 28:43 Actually, this week we're publishing starter kits. Some really great starter kits for people just to have some fun with their data. If you just want to do basic facial recognition or transcription or translation or just sentiment or things like that, whether you want to play in just one category at a time or bring them together for the full interaction analytics solution, they're built to support that. And we think that'll just make it a lot easier for an Alteryx user to not really have to think about what they're doing AI-wise and just kind of point it at some media and see what you get.
SUSAN: 29:18 Nice. Yeah. And if those are published when the podcast is published, then we'll be sure to link to that in the show notes as well so folks can easily click through and check those out. So kind of a random question that's something I was curious about. I think Robbie earlier you mentioned the company coming out of this media and entertainment background. So I'm just curious kind of how that evolution happened to now aiWARE, right?
Because I can see a connection but I'm curious about that if that's something you can flesh out a little bit.
ROBBIE: 29:47 Well, Trevor, you pre-date me by a little while, but I can give you all I know. So we're kind of doing this archaeological hunt here. So obviously, Veritone was formed by the Steelberg brothers. A big background in media and entertainment and the ad space. They'd sold their previous companies to Google. They start Veritone. And really, they were basing it on sort of the [inaudible]. So the direct correlation to [inaudible] would be truth in the signal, or at least how they're choosing to interpret it. And then really sort of aiWARE as a product, what's been really cool about this, and it really resonated with me as someone who in my previous life, my whole job had been to try and land Fortune 500 companies with these big AI workloads and just running a whole ton of complexity and pain. Not fun stuff. Just really frustrating work.
ROBBIE: 30:48 So the idea of an operating system that would allow me to just orchestrate AI workloads, regardless of whether I want to run it-- whether I want to run it on-prem. I can be cloud-agnostic. I can run on a laptop. I can run on a desktop computer. Our North Star is pursuing sort of truth in the signal of noise and in unstructured data primarily. I think that was our origin. And then the methodology by which we would do that was we'd build this operating system that would basically simplify a lot of that complexity and just remove that from us in the same way that cloud meant that we don't need to really worry about the size of the box that we have sitting under our desks that we want to go crunch and build a model with. It's been very liberating, I think. Trevor, do you have any other funny insight? I mean, I liked my story, but if it was really just because they were playing beer pong and they decided that Veritone would be cool.
I don't think we should--
TREVOR: 31:47 I had two or three bullets queued up in the back of my head and you crushed it, man. You already said them, so. [laughter]
ROBBIE: 31:56 Because I can just imagine Chad and Ryan going surfing and then having this argument about what it should be called.
TREVOR: 32:03 Ryan made this really funny comment to me one time because they're serial entrepreneurs and they've been just hitting them out of the park. And so I think Ryan looked at Chad and Chad's like, "I want to start an AI company." And Ryan's like, "Of course he had to pick something hard." [laughter] Making AI easy. [crosstalk], right? And of course, you have to do that. [laughter]
SUSAN: 32:32 Amazing. That's great. I love that there's etymology behind the name as a big nerd. I think that's super cool. Yeah. It's interesting how many times the ideas of playing and having fun with what you're doing in AI, I mean, I think those words have recurred kind of a lot in this conversation. So I think that says a lot about what you all are working on and how folks will enjoy working with it. So one question that we ask all of our guests and that I will ask you all, this is our Alternative Hypothesis recurring segment that we-- very formal here. And the question is, and if either or both of you would like to take this, feel free. What is something that people often think is true about data science or about AI, but that you have found to be incorrect?
TREVOR: 33:23 Well, I mentioned mine earlier, which is that AI comes after analytical maturity, right? When you're moving from descriptive, through predictive, through prescriptive analytics, and you're evolving on the value chain, that AI's sort of that thing at the end. But it's just not true, right? So I think that AI is here, it's approachable, it's easy, and that nearly every business process could benefit from it in a very simple way.
ROBBIE: 34:02 I think for me, it would be that it's still not magic.
The number of times we'll talk to-- I think Trevor landed on talking about different companies have different levels of maturity. And so we'll talk to a company and they'll have audio that's recorded in an extremely low bit rate with a ton of background noise on it. You can barely understand what somebody's saying. And then the expectation that somehow you would get accurate transcription from that, it's just not good. I mean, there's certain things you can do, obviously. You can run a bunch of noise reduction. You can try and pop the signal a little bit. But at the end of the day, it's an algorithm somebody wrote, operationalized, turned into a model, trained, and then we put it on the platform. And so while you're looking for signal in that noise for sure your noise can't be garbage. And I think there's still-- we still run into that every-- I would say a good percentage of the time that there's just a bunch of corrupt stuff and there's very little you can do with that.
TREVOR: 35:14 That's really true. Yeah. It's really true. Sometimes we get surveillance video, right? So we're going to [crosstalk]. Okay. "Well, why didn't the engine identify that person?" And then you're looking at the image and you're like, "Can you tell me who that person is?"
ROBBIE: 35:30 It's because they all watch that TV show. It's like name that cop show where they run the magic enhance button.
SUSAN: 35:35 Magnify. Magnify.
ROBBIE: 35:36 It's like, "Where's the magic enhance button?" Then, you press it, and all of a sudden, it magically becomes this high-definition image. So, no.
SUSAN: 35:44 That doesn't exist? What? Oh. [laughter]
ROBBIE: 35:48 I don't even know the names of the shows, but I know there's a ton of them.
SUSAN: 35:51 That's great. That's a great point. Yeah. Data science is not magic. You have to have some sort of foundation there to work with before you can actually do much with it. So that's a great point.
ROBBIE: 36:00 We do spend quite a bit of time now in education sort of working with customers, sort of explaining what are the things that they can do. Actually, one last good example we had was we had-- so we have this redaction product I talked about earlier, which we use for Freedom of Information requests. And a university had an incident and they sent us all this video and asked us if we could go ahead and just redact all of the people in it. Well, what's interesting is not only do we use-- we don't do facial recognition, we do head detection. So we've basically trained an object detection engine to detect heads. And then we put bounding boxes around those heads. And then we have some other tech, like if you occlude a head so all of a sudden, you're off the screen or something or you step behind somebody we use interpolation to kind of keep track of where we think you are. And the solution works pretty well. It's kind of a combination of data science and just good old-fashioned software engineering.
ROBBIE: 37:02 Well, what's interesting was when they sent us the video, the video had an effective frame rate of three frames per second. And we maybe only sample the video maybe twice a second or something like that. So we ended up with all these weird scenarios where humans would appear to blip and teleport across the screen. So you wouldn't know that they were the same object or something. But that doesn't happen a lot because at some point that facility probably had to make decisions on storage and that determined how many times they were going to take basically static shots. And then they ran all those static shots through some kind of movie maker and ended up with this bizarre kind of frame rate thing. So I think dealing with those types of data sources is going to be a real thing. Like 7-Eleven grainy video feeds and all kinds of other things. Yeah. That's going to be a reality for quite some time to come, I think.
TREVOR: 38:05 Yeah.
And one thing that's been really nice with COVID, I think - I'm the first person to start a sentence that way - is that these calls, video calls, like Zoom and Teams and all that, are really good for facial recognition, right? So it's right square on the face. It's usually good resolution, it's prominent in the screen, and the audio quality is really good for transcription and some of the NLP stuff. So the good news is that that tends to work really well. We've actually been kind of blown away. We're training libraries to do facial recognition on some of the calls they've been getting. And it's amazing that we'll just take one picture of somebody from their LinkedIn profile and it'll catch every frame in the recording. And we're used to having to train three to four to five different images for the same person. So that's been really cool to see.
SUSAN: 39:04 Yeah. Interesting. Yeah, certainly all different kinds of data available now that maybe we didn't have in the past that we can do new things with. So anything that we haven't talked about yet that you want to get in there, that you want to be sure to mention?
TREVOR: 39:17 Yeah. I think that, given the diverse opportunity that we have within the AI space and just the general capabilities of what AI is able to do, is that we love new ideas. For myself, I think one of the things that we're going to have fun with over the next few months is jumping into predictive and onboarding predictive models and things like that to help us with forecasting and orchestrating some of our other engines. And I think that's going to be a really fun project. But even if you take what they're doing with Illuminate where you're taking known offenders database, yeah, but what about for retail? If you want to track VIPs or you want to track known fraudulent folks or whoever and maybe things like that? And so there's so many different opportunities where AI is going to disrupt everything.
And we love just talking about those ideas and just problem-solving. And so I would just encourage anyone who is having these ideas to reach out and let's start a conversation.
ROBBIE: 40:22 Preferably with Scotch. [laughter] It doesn't have to--
SUSAN: 40:26 Well, I think you all have given our listeners a lot to think about and a lot to also be excited about. So I hope some of them will give this a try. It's very exciting. So thank you both for joining us on Data Science Mixer.
TREVOR: 40:36 Absolutely. Thanks for the time.
ROBBIE: 40:38 Yeah. Thanks very much. And go Mighty Mussels. I have to give a shout-out to my six-year-old. Go, Ronan!
TREVOR: 40:42 Yes. Can't wait to hear about the game. [laughter] All right. Thanks, guys.
Thanks for listening to our Data Science Mixer chat with Trevor Jones and Robbie Booth of Veritone. Hear more from Veritone at the Alteryx Inspire Conference on May 18th to 21st. We've also got links for resources for trying out Veritone's aiWARE tools in your Alteryx workflow in our show notes at community.alteryx.com/podcast. Also, be sure to join us on the Alteryx Community for this week's Cocktail Conversation to share your thoughts. Robbie and Trevor talked about how they're using their data skills to support their kids' athletic careers, tracking and analyzing their baseball stats. Have you ever used your analytics and data science skills for a fun personal project or for someone in your family? Tell us about what you did and share your ideas. You can leave a comment directly on the episode page at community.alteryx.com/podcast or post on social media with the hashtag #DataScienceMixer and tag Alteryx. Cheers.

This episode of Data Science Mixer was produced by Susan Currie Sivek (@SusanCS) and Maddie Johannsen (@MaddieJ).
Special thanks to Ian Stonehouse for the theme music track, and @TaraM for our album artwork.

Leveraging AI to unlock unstructured data | Trevor Jones and Robbie Booth