Alter Everything Podcast

A podcast about data science and analytics culture.
AlteryxMatt
Moderator

We are joined by Akshay Swaminathan to talk about understanding the divide between technical and non-technical teams, and how they can collaborate more efficiently in a business environment.

 

Whether you're a data scientist, business analyst, project manager, or executive, this episode offers insights into fostering collaboration and driving success in data-driven initiatives. Tune in to gain the tools and strategies needed to bridge the data divide within your organization. Check out a more in-depth dive into these topics in Akshay’s book, linked in the show notes!

 

 

 



 


Episode Transcription

Ep 156 - Winning with Data Science 

[00:00:00] Megan Dibble: Welcome to Alter Everything, a podcast about data science and analytics culture. I'm Megan Dibble, and today I'm talking with Akshay Swaminathan, head of data science at Cerebral and researcher at Stanford. In this episode, we chat about how to best collaborate with data science teams, how the data skills gap can be closed, and how companies can become more data-driven.

Let's get started.

Hi, Akshay. Thanks so much for joining us on the show today. Super excited to have you here. I'd love it if you could just give a quick introduction to yourself for our listeners. 

[00:00:39] Akshay Swaminathan: Hi, Megan. Thanks for having me. Really glad to be here. My name is Akshay. I work on improving health systems with data, data science.

And I wear two hats these days. My one hat is as head of data science at Cerebral, which is a tele mental health platform. We serve patients in all 50 states providing therapy and medication management services. And my role there involves, you know, using data science to improve the lives of clinicians and patients and improving healthcare delivery.

And my other hat is studying medicine and data science at Stanford. So my research there focuses on the implementation of AI systems within healthcare workflows.

[00:01:24] Megan Dibble: Super cool. Yeah, the reason why we brought you on today was to talk a little bit more about your book, Winning with Data Science, that you co-wrote.

So I'd love to hear a little bit more about what motivated you to write that book and what that process was like. 

[00:01:40] Akshay Swaminathan: Sure. So Winning with Data Science is not like your typical data science book. My co-author, Howard Friedman, and I had both found from our experience that the success of data science projects sometimes has to do with the actual data teams, but a lot of times it has to do with the non-data stakeholders, the business partners, the domain experts.

And these are actually the people who can make or break the success of your data science projects. And what we found is that, you know, there are a lot of resources out there to help people become data scientists. There are not many resources out there to help non technical folks become effective data collaborators, data customers, data stakeholders.

And so that's why we wrote this book and, you know, we wanted to make it as interesting and engaging as possible. So it's actually written like a story, closer to a novel than, than a textbook, I would say. So the book follows the journeys of two characters. There's Kamala, who's chief of clinical strategy, head of clinical strategy at a health insurance company.

And then there's Steve, who works at a finance firm. He's a new MBA grad. And so the book follows their journeys, solving various business problems as these two characters collaborate with their data science teams. And the reason we did that, one, is to make it more enjoyable to read, but also to illustrate what effective collaborations look like and what the conversations look like.

So a lot of the book is dialogue, right? Where we're simulating conversations between, you know, the characters and their data team counterparts, so the readers can see, okay, what does an effective communication dynamic look like? So that's a little bit about the book, and it was fun to write. And I think it's been exciting to share because ultimately it's about bringing people who are not normally in the data science conversation into that world.

[00:03:37] Megan Dibble: That's really cool. And timely, I think with so many, uh, companies pushing for AI implementation for machine learning from like the top level. I think more and more business professionals are going to need to know at least how to speak the language of data science, how to know just enough that they can collaborate well for projects.

So a super timely topic and sounds really interesting. And I know that in the book you talk about adopting a customer mindset when working with data science teams. So I was curious to hear what that approach looks like and what the benefits are to adopting a customer mindset. 

[00:04:18] Akshay Swaminathan: Yeah. What does it mean to be a good customer?

Being a good customer is not actually too different than being a good data science customer or data science collaborator or stakeholder, all these words, you know, we use them interchangeably. So to give you an analogy, say, home improvement, right? Let's say you engage a contractor and you want to redo your kitchen.

What would you do as a good customer? I always say a good customer does three things. They ask the right questions. They challenge assumptions. And they speak the language. Okay, asking the right questions. For example, what are we optimizing for here? Are we optimizing for open space? Are we optimizing for functionality?

Are we optimizing for storage space? Right? These are all important questions to answer when you're planning your kitchen remodel. Challenging assumptions: why do we need a countertop here? Why can't we just rip out the countertop and put in an island instead? So rethinking the fundamental premises. So that's challenging assumptions.

The third is speaking the language. If you want to have a productive dialogue with your contractor, with, you know, other vendors, you've got to know what is a backsplash? What is cabinet refacing? You know, what is an island? All basic things like that. You have to be familiar with the terminology. What's the difference between granite and quartz?

You know, so there's a little bit of knowledge, right, that you need to have to effectively engage in these conversations. It's the same with data science, right? So when we say asking the right questions: what is the success metric for this project? How are we defining our outcome variable? Challenging assumptions.

How come we didn't do any data cleaning here? How come we assumed that we can throw out missing data? Speaking the language: a business person who knows what feature selection is, who knows what cross-validation is, who knows what overfitting is, who knows what inclusion criteria are. Someone who can speak that language.

And I'm not saying you need to be able to prove algorithms and derive algorithms from first principles. I'm saying you need to be familiar with the 20 percent of content that shows up 80 percent of the time. And so a good customer does these three things. They ask the right questions, they challenge assumptions, and they speak the language.

And that's the approach that we advocate for in the book. 
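To make the "how come we assumed that we can throw out missing data?" question concrete, here is a minimal Python sketch. The records, field names, and both helper functions are hypothetical and made up for illustration; they are not from the book or from any real dataset. The sketch only shows why the question is worth asking: if values are missing more often in one group, silently dropping those rows tilts the result toward the other group.

```python
# Hypothetical toy records: (patient_id, age_group, satisfaction score or None if missing).
records = [
    ("p1", "18-34", 9),
    ("p2", "18-34", 8),
    ("p3", "18-34", None),
    ("p4", "65+", 4),
    ("p5", "65+", None),
    ("p6", "65+", None),
]


def missing_rate_by_group(rows):
    """Return the share of rows with a missing score for each group."""
    totals, missing = {}, {}
    for _, group, score in rows:
        totals[group] = totals.get(group, 0) + 1
        if score is None:
            missing[group] = missing.get(group, 0) + 1
    return {group: missing.get(group, 0) / totals[group] for group in totals}


def mean_of_complete_cases(rows):
    """Average score after silently dropping rows with a missing score."""
    scores = [score for _, _, score in rows if score is not None]
    return sum(scores) / len(scores)


rates = missing_rate_by_group(records)
complete_case_mean = mean_of_complete_cases(records)

print(rates)               # missing share per age group
print(complete_case_mean)  # average over the rows that survived the drop
```

In this toy data the older group is missing scores twice as often as the younger group, so the average over the surviving rows is driven mostly by younger respondents. A business stakeholder does not need to write this code, only to ask whether someone on the data team checked it.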

[00:06:42] Megan Dibble: I think sometimes it can be tempting at work to go to another team and be like, You're the experts, you advise, you like relinquish all control to this team. And then if you do that, then you end up maybe with something that isn't quite what you expected or quite what you wanted.

Or maybe you do have more knowledge on the subject matter than, than you thought you did when it comes to if data science is using your data, just like only, you know, your kitchen needs, I guess. 

[00:07:10] Akshay Swaminathan: Exactly. That's exactly right. Absolutely.

[00:07:13] Megan Dibble: I'm curious what you would say to people who are afraid of, I mean, you threw out some technical terms there, like feature selection or things about algorithms, like for people who have trouble learning that language and kind of closing that gap in communication, what would you say to those people?

[00:07:32] Akshay Swaminathan: So I would say two things. One, get the book. No, I'm kidding. We really do try to break down those concepts and the whole thing is, like I said, written as a story. And so the meta point that we speak to in the book is a good dialogue between data teams and business teams is one where people don't use jargon.

People don't obscure explanations in complex terminology that the other party doesn't understand. So the other thing I'd say is if you're talking with a data person who throws out a word like cross validation or overfit, and you don't know what that means, you need to stop them and say, Hey, I'm not a data scientist.

What does that mean? No one is going to think you're stupid or incompetent, and actually, a lot of the times when you ask that question, "What does this mean?", you might not get as satisfying of an answer as you might want, which then will make the data team reflect and come back to you with a more compelling explanation.

So a lot of times asking these fundamental questions can help refine the data team's thinking and can get you closer to the truth, to a better approach.

[00:08:43] Megan Dibble: Definitely. I'm also curious to hear from you on whether you think in general that there's a data skills gap in terms of, do you see companies that maybe aren't getting the most out of their data and you know, how could companies work to close that?

[00:09:01] Akshay Swaminathan: So there's two pieces here. Why do companies struggle to get value out of data? And so there are technical reasons and then there are non technical reasons. The technical reasons, you know, might be they don't have data to begin with, right? Maybe they're not collecting data. Maybe they don't have the skills and expertise needed to set up the infrastructure required to collect data.

Maybe they have data, but it's not clean. It's not usable. And they don't have the pipelines in place to transform that data into a usable format. So these are all technical reasons. Maybe there's, you know, not enough investment and not enough, you know, resourcing. So those are all technical reasons. Non technical reasons are really what we focus on in the book, because it's not enough to just have a data team.

A lot of companies think, oh, if I hire five engineers, three data scientists, a couple of analysts and a data PM, I'm all set. You know, they're going to come and save the day. That's actually not enough. And that's our whole point in the book. The business people need to realize that they play a critical role in helping the business gain value from data, right?

So what we don't need: we don't need the business people, we don't need the PMs, we don't need the managers to go out and start learning Python, right? We don't need them to start, you know, learning how to use GitHub and learning how to deploy cloud-based solutions.

We need them to learn how to effectively leverage their domain expertise. We need them to learn how to be effective guides and shepherds and even mentors to the data team. The worst thing is when you let the data team run free and they come back with something that's really cool. It might even do something really cool, but either it's solving the wrong problem or the business isn't ready to put it in action.

This happens in healthcare all the time. There are thousands of papers out there where, you know, "we built a model that can detect pneumonia, or whether you're going to get pneumonia in the next, you know, five years," right? I'm exaggerating here, but there are a ton of models like that, that people build to predict some disease outcome, to predict some adverse event, and they have decent performance, but they never end up having any impact, because it's not enough to just build a model. That model needs to align with a business use case.

In a health system, it needs to align with a health system use case. The hospital has to be ready to implement that model within their care workflows. If no doctor wants that model, if doctors feel, Hey, there's no problem to solve here, we're already pretty good at diagnosing pneumonia. No one's going to use the model.

So it's not enough to have a skilled data team that can build performant tools. The building needs to align with the business problems, needs to align with the capabilities of the business, the readiness, the willingness of the business to adopt those solutions. And that's why we need to work on empowering the business folks, right?

The domain experts to become better collaborators. 

[00:12:08] Megan Dibble: That's so true. And something that has come up on other episodes of the podcast, this idea that sometimes data professionals will jump straight to the solution. Or you see the data and you think about all the cool models you could build or all the cool things you can do.

But the conversation didn't start around the problem: what is the business problem, and really defining that. And that is something that has come up a bunch. Someone was asking me, "Oh, what's something that surprises you on the podcast?" And I'm like, I think honestly, how much the very simple thing comes up: we aren't always really defining the problem and really solving that problem, and then the analytics don't add value.

So, yeah, I've seen that too, and I think the healthcare example is super interesting. There are tons of papers and tons of cool research, but when it comes to implementation, you need the business folks' help with that as well.

So then, what are some steps you think that companies can take to put them in that direction of becoming more data driven and having more of that? 

[00:13:12] Akshay Swaminathan: So there are a couple things here, uh, I'll speak to one, maybe product or, or solution focused approach. And the other one is a more process, it's a process change, right?

So one tool or infrastructure change, make your data available and accessible to non technical folks. What I mean is don't design a system where the data team is a bottleneck to all your data requests. So in a lot of companies, the only people who have access to the data is the data team. If a business person has a question, they need to make a request.

These days, with the amazing business intelligence platforms that we have available, it makes no sense to operate in that way. The role of the data team should not be to fulfill these requests. The role of the data team should be to clean the data and package the data so that it's usable by the business people.

I read a piece recently which talked about the importance of basically playgrounds, of creating environments where people can play. Because when people can play, their creativity is at an all-time high. And so this piece was about software products. And so it was basically arguing that software products that create an environment where the user can just play around with ideas, with their data, whatever, those are the ones that, I think they were talking about retention and things like that.

But in this context, what I'm saying is what the data team should be doing is building the ETL pipelines that ingest the raw data, clean it, transform it, and package it into logical data models that are then surfaced via a business intelligence tool like Looker, Power BI, or Tableau. Then, anyone in the business, without writing a single line of code, can create dashboards, can create tables, can create visuals, can answer their own questions about the business.

Data teams should also be writing data dictionaries, they should be defining things so people know where to look. The documentation should be robust. If you do this, you will 10x the number of insights that you're able to generate, because all of a sudden, instead of having, you know, five people, 10 people on your data team that have access to the data, it's everyone. Everyone has access to the data.

And so then the goal of the data team is to empower these business people. They should hold office hours, right? This is what we do at Cerebral: the data team, we hold weekly Looker office hours where business people come and we help them to become better users of the tool. We actually ran a whole Looker power users program where we took the business people, the non-technical people who use Looker the most and who are most excited about Looker.

We brought them in, we put them through a training regimen to help them level up their Looker skills. And I can't tell you the number of business insights that have come out of those efforts because they're the people with the ideas and the understanding and the domain knowledge. If you also give them access to the data, superpower.
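The pipeline described in this answer (ingest raw data, clean it, transform it, package it, and document it in a data dictionary) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the table, field names, cleaning rules, and helper functions are all hypothetical, and it does not represent Cerebral's actual pipeline or any Looker API. In practice the packaged table would be surfaced through a business intelligence tool rather than printed.

```python
from collections import defaultdict

# Hypothetical raw rows as they might arrive from a source system.
raw_visits = [
    {"visit_id": "1", "state": " ca ", "minutes": "30"},
    {"visit_id": "2", "state": "NY", "minutes": ""},    # missing duration
    {"visit_id": "3", "state": "ca", "minutes": "45"},
    {"visit_id": "3", "state": "ca", "minutes": "45"},  # duplicate row
]

# Data dictionary: one plain-language definition per field of the cleaned table,
# including the cleaning rule that business users may want to challenge.
DATA_DICTIONARY = {
    "visit_id": "Unique identifier of a visit (integer).",
    "state": "Two-letter state code, upper case.",
    "minutes": "Visit duration in minutes (integer); rows without it are excluded.",
}


def clean(rows):
    """Drop rows with no duration, remove duplicates, and normalize types."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["minutes"].strip():
            continue  # documented rule: exclude visits with no recorded duration
        visit_id = int(row["visit_id"])
        if visit_id in seen:
            continue  # skip duplicate visits
        seen.add(visit_id)
        cleaned.append(
            {
                "visit_id": visit_id,
                "state": row["state"].strip().upper(),
                "minutes": int(row["minutes"]),
            }
        )
    return cleaned


def minutes_by_state(rows):
    """Package the cleaned rows into a small summary table for self-service use."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["state"]] += row["minutes"]
    return dict(totals)


visits = clean(raw_visits)
summary = minutes_by_state(visits)
print(summary)
```

The design point is that every cleaning decision is written down next to the data, so a business user browsing the summary can see what was excluded and ask about it.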

[00:16:09] Megan Dibble: I love that takeaway of holding office hours. I think that's like a great tip, a great thing to enable more users. As you were talking about, you know, making the data available, making a space where people can play, I was also thinking about just one of our newer products, Alteryx Auto Insights, which is another dashboarding solution where, you know, once the data is in there, you can start to really easily ask your questions of the data.

I think the page is called Search, and you can start to just build out your own and answer your own questions. And I think that there's a lot of tools, but it's powerful when you're able to answer your own questions and even expand on those as opposed to having to go to a data scientist with every variation of your question.

And I know the data scientists don't love that either. So it's just a win-win if you have solutions like that. You talked a little bit about how important it is to surface the data to everyone. A pitfall to avoid would be keeping it super locked down so that only data teams can access it. But are there any other pitfalls that companies should avoid when they're trying to better leverage data science projects in their organizations?

[00:17:19] Akshay Swaminathan: Oh yeah, another thing I was going to say before I went down this, you know, making-the-data-available-for-play rabbit hole. When you're working on more involved data projects, it's got to be cross-functional. The most effective data science teams I've seen are cross-functional teams. It's never the data team alone responsible for, you know, building a product.

Generally when that happens, in my experience, it's been actually less effective. So what does that mean? It means you have a business person, say a product person, doing the scoping, the question refinement, right? That's where they shine. They have the best understanding of the business needs. So they're the best people to pose the question, refine the question, scope the problem, and then they work with the rest of the project team. So the project team is made up of a project manager.

Maybe you have some engineers, maybe you have some data scientists, maybe you have some designers, maybe you have some clinical people, some domain experts, and this is the team. This is a cross-functional team. The model, in my opinion, is not: product person makes a request, fills out a ticket, fills out a design doc, sends it to the data team, data team works on it for several weeks, reports back to the product.

That doesn't work. So it should be a cross-functional effort. And in my experience, those cross-functional teams avoid a lot of the issues that arise, largely because they communicate better. They have stand-ups together. They have meetings where everyone is talking together. There are many more touch points and opportunities for alignment and opportunities to correct misunderstandings.

[00:18:53] Megan Dibble: Yeah, I think that cycle of communication is definitely important that there's not a two, three week, four week gap the next time you hear back and then all of a sudden maybe the project isn't looking how you envisioned it or there's, yeah, miscommunications to correct. We actually had someone write a really good article about that recently, about the cycle of communications, someone on our data team.

So I'll definitely link that in the episode notes too, but I think that that's a really good takeaway as well. So yeah, I'd love to wrap up with a question about future implications. So what does a world with no data skills gap look like to you, or a world where everyone is collaborating well with data science?

What's your vision for the future after this book and everybody reads it? 

[00:19:43] Akshay Swaminathan: It's a good question. In an ideal world, if everyone read the book and everyone was empowered to become a stronger collaborator, I think a lot of the hype around data science and AI would die down quite a bit. 

[00:19:57] Megan Dibble: Ooh, interesting. 

[00:19:58] Akshay Swaminathan: So in the book, we talk a lot about different biases, selection bias, You know, response bias, et cetera.

One bias we don't talk about is the "AI is our savior" bias. So this is the bias where people think that AI and data science are going to solve all their problems, when in reality, it's just another tool. I think it's important for us to be a little bit humble and recognize that we're providing tools. They're great tools, but at the end of the day, they're tools, and tools can be misused.

We can make the mistake of thinking, oh, I have the best tool, let me use it to solve all my problems, but that doesn't work. So in an ideal world, people recognize that the data teams are fallible. Without a skilled craftsman to wield a chisel or whatever, the chisel is, you know, useless, right? So similarly, it has to be a collaboration.

And so in an ideal world... I think right now in a lot of conversations, the dynamic is very imbalanced. The data team can be thought of as this, you know, mystical, powerful, magical team that just works their magic. And it's a black box. But the more we can bring people in and set up these cross-functional teams where people are talking with each other and not using jargon to shroud decision making, the more we can challenge each other's assumptions, ask tough questions, explain fundamental concepts to each other.

And it goes both ways, right? Data people, a lot of times don't understand business needs. Ask a data scientist, you know, that's working on building ML algorithms. When is the last time you, you thought about, you know, LTV or customer acquisition costs? It doesn't really enter into the picture. So it goes both ways, right?

The data people need to learn from the business people. And the best way to become a better data scientist is to get better at understanding the business problems. So I think there's a lot of mutual growth that can happen if both parties come to these conversations with more humility and a more collaborative energy.

[00:22:01] Megan Dibble: I totally agree. I think that's a really good summary of the conversation and something I would hope to see more of in the future too. So thanks so much for being on our show today, Akshay. It was really awesome to learn from you. We'll be linking the book in the show notes and I'm excited for our listeners to get to learn from you on this episode.

Thanks again for coming on. 

[00:22:24] Akshay Swaminathan: Thanks Megan. Really appreciate it. 

[00:22:27] Megan Dibble: Thanks for listening. To find a link to Akshay's book, Winning with Data Science, and the blog article we mentioned on data science collaboration, head over to our show notes on community.alteryx.com/podcast. See you next time.


This episode was produced by Megan Dibble (@MeganDibble), Mike Cusic (@mikecusic), and Matt Rotundo (@AlteryxMatt). Special thanks to @andyuttley for the theme music track, and @mikecusic for our album artwork.