
Data Science Mixer

Tune in for data science and cocktails.
MaddieJ
Alteryx Alumni (Retired)

How can leaders encourage data scientists to experiment? Best-selling author John K. Thompson joins us to share tips, including the importance of having a psychological understanding of what drives data scientists.

 

 



 

 

Cocktail Conversation

 


 

 

What role has curiosity played in your data projects? Has it helped you ask a creative or unusual question at the right moment? Has curiosity helped you advance your career?

 

We're giving away two copies of John's book, Building Analytics Teams, so be sure to leave a comment by 4/20/21 to enter!

 

Join the conversation by commenting below!

 


Transcript

 

Episode Transcription

SUSAN: 00:01

When I hear the word artisanal, I usually think of fancy small-batch sauerkraut or maybe some beautifully crafted, unique wooden furniture. That's probably because I live in Oregon, where the term artisanal gets used a lot. But what would it mean for a data science team to be artisanal: made up of data scientists who prep data like the finest foods, who are craftspeople of models perfectly tuned by hand? And is there a time and place for a team structure that's a little different from that, maybe a hybrid of approaches? Our guest today offers ways to build successful data science teams and advice on perfecting the ideal blend of projects for data science and analytics professionals. John K. Thompson joins us on this episode of Data Science Mixer, the podcast that features top experts in lively and informative conversations that will change the way you do data science. I'm Susan Currie Sivek, the data science journalist for the Alteryx community. John has a unique perspective on how to lead data science and analytics teams, drawing on over three decades of experience that he's also shared in two books. You'll want to put some of his ideas into action right away, whatever role you have in the data world. Let's jump into it.

SUSAN: 01:21

John, thank you for joining me on Data Science Mixer. We sure are glad to have you here today to talk about your work. Would you tell us your name and your job title, and where you're currently working?

JOHN: 01:32

Absolutely. Thanks for the invitation, Susan. I really appreciate it. I'm excited to be here with you. My name is John Thompson. I'm the Global Head, Advanced Analytics and Artificial Intelligence at CSL Behring.

SUSAN: 01:44

Awesome. And can you tell us a little bit about the company and what they do?

JOHN: 01:47

Sure. CSL Behring is a 100-year-old company. It's a biopharmaceutical company. It was founded in response to the last pandemic, and it's been around since then. And everything that CSL does is related to human plasma. So we have over 300 plasma donation centers across the United States, and we take human plasma, and we turn it into life-saving therapies for rare diseases.

SUSAN: 02:16

Wow, it's so interesting. I actually just gave my very first blood donation last week, so it was very interesting to see the process and think about where all of that goes. That's very cool. And just along the lines of introductory things, again, do you mind sharing with us which pronouns you prefer?

JOHN: 02:31

He/him.

SUSAN: 02:32

Awesome. Thank you so much. And as you know, on Data Science Mixer, we sometimes enjoy a happy-hour type drink or snack or something special while we are recording. So do you happen to have anything special to enjoy there?

JOHN: 02:45

Mountain Dew.

SUSAN: 02:45

Mountain Dew. That's the drink of champions, I think. Very cool. Well, I am having a chai tea because we keep recording these during the day and at lunchtime, and people usually have 10 more meetings to go to afterwards. So we don't always get to imbibe, but Mountain Dew, that will keep you going for the rest of your day.

JOHN: 03:06

Yes, it will. I mean, I'm a lapsed developer, so from my early days of staying up late and coding.

SUSAN: 03:15

Awesome. Love it. Cool. Yeah. So on that note, actually, would you tell us a little bit about your career path in analytics and data science?

JOHN: 03:22

Absolutely. It's been 37 years, so I've been at it for quite a while. And I left college. I was an assembler programmer, and I was building systems for large corporations in Chicago. And I just had an epiphany about two or three years into my career that everything I did was related to data. And I wanted to just get closer to data, and I wanted to work with it. And I wanted to do things that had more of an impact on how people made decisions and what the strategies were, and how businesses were run. So I ended up going to a small company called Metaphor, and there's a fair number of us around. Bill Schmarzo and many other people came from Metaphor, who are now writing books and doing thought leader pieces and building systems around the world. And then I went to IBM. I was part of the initial boom and bust of data mining. I helped create what is now known as the Predictive Model Markup Language (PMML). So that's the first piece of technology that actually enabled model portability. So if you've ever got an alert from Visa or MasterCard about fraud on your credit card, you can thank me for that. And then, I ended up at Dell running the Advanced Analytics Division there, and I spent my career either innovating technology for data scientists or actually being a data scientist.

SUSAN: 04:46

Excellent. Awesome. And all of that experience then, as you mentioned, led you to write a couple of books. Can you tell us about that?

JOHN: 04:52

That's true. That's true. When I was working at Dell, I was flying around the world, and I was meeting with non-technical C-level executives, and I found that when we were talking about data and analytics, there was a certain reticence in their body posture and in their language and in their voice that they were just really not understanding much about data and analytics. And this was in 2017. So I thought, I'm going to write a primer, a small book. I live in Chicago, as you said. So if you go to O'Hare and get on a plane and fly to London, you can read the book in that entire flight. You get off the plane, and the idea was that as a CEO or a C-level executive, you're going to have people come to you-- senior managers come to you and say, "Hey, we want to build this kind of function. We want to do this with data." And I just really wanted to make it clear that there are certain ways to do it. There are certain leaders you should hire, a certain amount of teams you can work with, and teams can do these kind of projects and be successful, and these kind of projects are generally not very successful. So it turned out to be a very well-received book. It was listed as one of the top 100 books in analytics, and it did very well.

SUSAN: 06:04

Fantastic. Yeah. That's awesome.

JOHN: 06:06

And then, in 2020, I wanted to give myself a challenge. So I started on January 1st, sat down at the keyboard, and cranked out 100,000 words in three and a half months. So by April 13th, I had written the entire book, so.

SUSAN: 06:22

Wow. Much more productive 2020 than a lot of us.

JOHN: 06:26

Yeah. And so I wanted to publish that book in the span of six months, which I did. I started on January 1st. It was published on June 30th. And that book is called Building Analytics Teams, and it was for the complement of the first book. So people who have to actually build an analytics team and manage the team and choose projects and hire and fire and those kind of things, so that book was targeted to those people. It has been in the top 1% of all book sales on Amazon since it was released. It's been a huge success. I'm humbled and honored by all the people who have engaged with it. So it's been fun.

SUSAN: 07:07

Terrific. Wow, that is very impressive. I love your setting of goals and achieving them all within six months. So yeah, this is the book that I had the good fortune to take a look at. And also, I just want to mention we've also purchased a couple of copies that we will be giving away to folks who engage with our cocktail conversation, which is our little conversation starter that we'll include at the end of the episode. So folks, as they get their appetite whetted for the book through our conversation, hopefully, they will jump into the further discussion and have a shot at winning a copy. So we're excited about that. Yeah, I wanted to dive into some of the concepts that you talked about in the book. One of the things that I thought was really interesting was you provided this breakdown of an artisanal, a factory, and a hybrid approach to putting together a data analytics team. So what do those mean to you? Is there a way that a certain leader should decide which of those structures to adopt?

JOHN: 08:01

Yeah, the entire book came from my experiences. I honestly made every mistake you could possibly make going through my career. So I thought it's kind of a community service to try to help people not make those mistakes. And what I found is that I really personally enjoy an artisanal team or a team made of artisans. And I really like having a group of people that handle everything: they do data acquisition; they write the project charters; they do feature engineering; they do modeling; they work with subject matter experts; they get the models put into production. I really like that kind of team, and that resonates with me personally. But it's not just your personal preference on what team you should build. You look at the artisanal team or the artisan team, and there's maybe five or six people on that team. The modular team, you may have 20 people on that team. So it's also how you manage people and how many people you like to have report to you. And it's also the cultural fit of the organization. If you have an organization that is like a scientifically based organization, like a pharmaceutical company, many of those companies are built on artisans, and people who are highly skilled, PhD-level educated people. Other organizations, like General Motors or some of the other more production-oriented companies, like to work on more of a modular approach. So you want your team's architecture to marry up and match the culture of your organization, match with your personal style, and be able to operate in an organization in a very comfortable way.

JOHN: 09:43

One of the things that people ask me about with regularity is, "Well, if you have 6 people on this team and 25 in this team, doesn't this team cost a lot more?" Well, no. When you take into account the pay differential of the small number of highly paid people versus the larger number of lower-paid people, it comes out to be almost the same. So really, cost is not really a consideration in it. And then, as you mentioned, there's also the hybrid team, and people are like, "Well, how does that work?" So what I'm doing right now is I have an artisanal team, an artisan-based team, and they're doing everything from feature engineering to modeling to all those kind of things. And then I've taken all the more mechanistic work: acquiring data, cleaning data, integrating data, and I've put that into a modular team. And they feed those data objects up to the artisan team. So you can have a structure that has both.

SUSAN: 10:42

Awesome. Is there a particular anecdote or story that you can think of from your experience where that model has really served you well?

JOHN: 10:51

Yeah. We're doing some really intriguing work at CSL right now. We've been doing-- we built an application to predict where the next donation center should be built. I mentioned earlier we have over 300 of them in the United States. Our competitors have another three or four hundred of them. So as you can imagine, all the prime locations have been taken. So we're starting to look at some of the other locations. The artisan team has allowed for a high degree of creativity. So we had a couple of people working on this model. And one person came back and said, "One of the real issues we're having is trying to figure out how people can get from their homes to the donation centers." Well, the other person said, "Well, why don't we use Google's drivetime information?" So just the creative interplay of these two people gave us a real ground truth, because in some places, maybe you've been to Phoenix. Maybe you haven't. You'd look at Phoenix and think, "Okay. Well, we can put a donation center here." But if you didn't know there's a mountain range that runs right next to it, that's almost like--

SUSAN: 11:59

Slight obstacle. Yeah.

JOHN: 12:00

Yeah. All the people you would think would be able to just drive there can't because there's a huge obstacle. And rivers and different things like that, lakes. So it really helps out to bring different people together who are highly skilled to come up with really creative solutions.

SUSAN: 12:19

Right. I love that. I love the idea of this creative interplay and bouncing ideas off of each other. That makes a lot of sense. So you also talk a little bit about making sure that other people in the organization feel that urge to go to the data analytics and data science team with the problems that they have. But I thought it was really interesting how you pointed out that you also want to be sure that they don't overthink that problem in advance. You just want them to go straight to you. Just we think we have this thing. Help us with it. What's going on there? What's the reason for that caveat that you provide?

JOHN: 12:49

Yeah. It's really interesting. You get to-- I don't want to torture the metaphor too much, but it's almost the difference between a statistician and a data scientist. In the past, the statistician looked at all the outliers, and they wanted to throw those away because they weren't near the mean or something like that. But in data science, that's where the great stuff is. So what I found is executives or subject matter experts or whoever they happen to be, think about it too much. They sand off all the good stuff, and they bring you this really refined pearl. And it's like, "Okay. Well, yeah, but there's nothing there. There's no juice there. There's nothing we can really get." But if they come to your office or on Zoom or wherever you happen to be, and they say, "Hey, I've got this idea. I just have to talk to you about it," they have a passion for it. They have an urgency for it. They have a real verve for the idea. And all the mixed-up stuff is there. So that seems to work better than people don't-- they knock off things that they're like, "Oh, that would be impossible," or, "We could never get that," or, "We wouldn't want to consider that. That's too hard." Well, they're not experts in data science. They don't know. We had one person show up, and he was really apologizing for showing up and wanting to ask for our services. And I'm like, "Don't apologize, man. We're here for you. That's why we're here. This is our job." So he went through and explained everything he wanted, and it was just really thought this is going to be impossible. And we sat down and listened, and we conferred, and it was when we could be together. So he took a bathroom break and came back, and he said, "Well, how long do you think it'll be?" And we said, "Well, we'll probably have it done by noon tomorrow." And the guy was just flabbergasted. It was really easy to do, but from his mind, it was difficult. But what it did, it saved him weeks of effort. 
Basically, what we did is we mined the entire PubMed universe and came back, and he was looking for this very specific health condition. We said, "Here are the 10 papers that are really what you want to understand. Here's the next 50. The next 100. The next 1,000." So hone in there, and you'll find what you're looking for. He was elated.

SUSAN: 15:16

What an amazing feeling to be able to be like, "Yeah. Noon tomorrow we'll have everything you need. No problem" to be able to deliver on that. That's terrific. So how do you make people feel that urgency to go to the data science team and feel comfortable taking what they feel is an impossible problem? How do you encourage that culturally?

JOHN: 15:34

Yeah, this is obviously pre-COVID, but when I joined CSL, I spent a lot of time traveling. I had well over 600 meetings in the first year. And there were meetings with big groups of people, small groups of people. So I met with thousands of people in the first year, and I just made it very clear that, hey, I have an open inbox, open phone, open Teams message, whatever policy. Just please reach out to me with anything that you want to think about. Not everybody's comfortable doing that. So what we did is we created a Center of Excellence. What I run is the Center of Excellence for Data Science. And then, we built a community of practice around that. And all people have to do is just raise their hand and say, "Hey, I'm interested in data science," and they can join the community of practice. Then we broke that down. There's nearly 500 people in that community of practice. There's 25,000 employees in CSL. So we have the nearly 500 of the most interested. And then, we broke that down into 15 special interest groups. And every special interest group has a volunteer leader. It could be R. It could be Python. It could be data visualization, pharmacovigilance, statistics, whatever it happens to be. And they meet every month or every quarter, and they get together and talk about things that are interesting to them in the SIGs. And then the SIG leaders feed that back up to me, and I spread that out to the community of practice. And then the people from a company come to the Center of Excellence and talk about data science and projects they're interested in. So it turns out to be a really interesting ecosystem that gives you global coverage but local engagement.

SUSAN: 17:16

Right, right. Well, that's fascinating. And this comes back to something that I want to touch on a little bit later, too, which is that emphasis on having folks continue to learn and develop those new skills around things that they're passionate about. So something that it sounds like you yourself have also done. So kind of bringing it down a level then from that team level and the cultural level to the individual team member. You have this interesting way of talking about the workload for individual analytics team members, and you use the term the prioritized personal project portfolio, which I love the alliteration there, by the way. It's awesome. So what does that portfolio look like, and why do you think that's kind of a recipe for success?

JOHN: 17:58

Again, trial and error, learning from doing; big mantra of mine. So the idea was that data scientists are very passionate, engaged people. They love what they're doing, and they want to be working on things that make a difference. One of the things that, in addition to that backdrop that we look for in people, is that we want them to also understand that it's okay to fail. So you have people who are really into it, really working hard, really engaged, really going for it, really experimenting, and having the idea that they can fail. There's some creative tension between those two poles, really. So I learned and experienced this myself is that if you have one project that you're engaged in and you're working on, and you're really out there on the edge of experimenting, you are going to blow things up every now and then. And if that's the only thing you're working on, that is generally going to work, going to manifest itself in a major freakout. So I would get calls from my employees at all times of the night and the weekend. It's like, "I've been working on this for weeks, and it's not working. What am I going to do?" It's like, oh, none of us need that kind of cortisol shooting through our system. So what I was thinking is, how do I stop that? How do I ameliorate that reaction? So what I came up with was the portfolio. So you have two major projects that you're working on, and a major project is six months to a year long. You have a few little projects that are a couple of months in duration, and you have service requests. And service requests are when the CEO shows up and says, "Hey, I've got a board meeting in two days, and I need to understand the price elasticity of this donor base." I'm like, "We're going to work on that. We're going to get that done for you." Drop everything else. So everybody has this body of work that they're responsible and accountable for. So how does that relate to the little vignette that I just told?

JOHN: 20:10

So if you're working on something and it blows up, and it doesn't work, you, as this kind of person, this dynamic, strategic, intelligent, engaged person, immediately switch to something else to engage your mind. So that project goes on the back burner. It goes into your subconscious. When you go out on a bike ride or take a shower or eat ice cream with your kids or whatever, you are going to solve that problem. And that may come in a day later, two days later, a week later, whatever it is. So I stopped having these crisis conversations. And in our weekly meetings, first you'd hear, "I'm working on Project A," and then you wouldn't hear anything about Project A for a while, and they'd start talking about Project B, and you intuitively knew that something didn't work out. But then a week later, they'd come to the team meeting, and they'd be exuberant, and they'd be like, "Hey, I figured out what the problem is with Project A, and I'm moving on. And I got new data, and I'm trying a different algorithm and a different approach. And it looks great, and I'm excited." So it was a way to give data scientists autonomy, responsibility, and the ability to time slice between projects that made them feel successful all the time, even though they were experimenting and failing. But they still had other things to work on and focus on rather than just the lack of success in that attempt.

SUSAN: 21:42

Well, I love that. That takes a lot of psychological understanding of what drives your data scientists and also a lot of patience to wait for them to have those breakthroughs that happened during the downtime, as you mentioned.

JOHN: 21:54

Yeah, and we've had lots of conversations about that. And I've had conversations with our C-level executives. We'll sit down, and we'll be talking, and they're like, "Well, you've got this project, and we'd like to know what day and hour the result will come." And I'm like, "I can't tell you." And they're like, "What do you mean? We put in this CRM system, and they told us it'd be done at noon on April 16th. And you've got this project in the CRM system is the same as this analytics project. So what is the day and what is the hour?" And I say, "Well, they're not the same. They're different." This one's all about experimenting and learning and having a strategic impact on the business. And this one's about putting in a transactional system. This one can be planned and understood very clearly. This one, we're going to make some mistakes, and there's going to be some elasticity in the project, and it'll be somewhere between April and June. They can either understand that, or they don't.

SUSAN: 22:53

We're waiting for our data scientists to go out on that critical bike ride where they have to pull over and start recording a voice memo because they've just had that great idea.

JOHN: 23:01

Exactly. Exactly.

SUSAN: 23:03

Awesome. Love it. Very cool. So you talk a lot about these organizational and political factors. It sounds like one of them is having leadership understand what the best care and feeding of the data scientist is. Are there other kinds of organizational and political factors that you think help an analytics data science team succeed within the organization? What kinds of steps can folks take to help move toward that success?

JOHN: 23:30

One of the things that really helps is when the data science leaders like myself do their job. If we do our job, it makes it a lot easier for the people that work for us. And one of the things that I do is if we're going to do work for, let's say, the plasma organization or supply chain or pricing or manufacturing, I go and meet with the executive that owns that area of the business. And I explain that we're all in. We're completely dedicated to this project. We want to make it work. We want you to be dedicated to the project-- not you personally. Well, we do want your personal commitment, but we don't want you at every meeting we're going to have. But we do want subject matter experts. So we expect your authority to roll down and be vested in these people, so they can come to the meeting. They feel beholden or committed to the meetings, committed to the project. We need them because, as data scientists, we model all sorts of things, and we know math, and we know data. But we're not the experts on supply chain. So we'll model things, and we'll come back, and they go, "You know what? We forgot to tell you that in the UK it has to go to this warehouse for two weeks and then it has to go to this regulator and then it has to do this and that. And we didn't tell you that, so you wouldn't have understood it. So could you please go back and redo it?" And we're like, "Yeah, sure. Fine. Okay. No problem at all." So we need the real-world input into it. And then we also need the leader in the business function to understand that these projects-- we generally do either projects or programs. Projects give you a result. They give you a number, some kind of insight you're looking for. A program gives you a set of models that you're going to put into production, and we're going to update those over time. And your business is going to improve continuously over time. So we have a conversation that if you want a project, that's fine. That's one effort. 
If you want a program, that's going to end in a process re-engineering and a systems change function, because all the modeling we're going to do is going to come up with a new way to do whatever we're looking at. And if you're not committed to doing the program, then we'll just do the project, and we'll come to the end, and we'll give you the information, and we'll go away, and you can think about it. And maybe later we'll do a program. People get confused about that all the time. And data science leaders do a disservice to their teams when they don't go to the line of business executives and make it clear on what you're working on. So you need to, as a data science leader, make sure that you are clear on what you're engaging your team in and what the line of business executives are signing up for.

SUSAN: 26:20

Gotcha. And now, from the leader's perspective, that makes a lot of sense. What about from the individual data scientist's perspective? We'll have listeners who are in a variety of different data roles. Is there anything that they can do individually, as people who are part of these teams, to help boost the success of their team within the larger organization?

JOHN: 26:39

Absolutely. We just last year took all our data scientists through presentation training because we expect our data scientists to present to C-level executives, EVPs, SVPs, directors, and generally, I'm not going to be there. We expect them to stand up on their own, present their projects, take the field questions, answer those questions professionally, and feel good about being in front of those people. And that's hard. Sometimes that's difficult for people that are in data science. I mean, some of us are more introverted than others. Some of us don't like public speaking, but in the roles in our team, it's required. So we put our teams through presentation training. We also put them through data visualization training because that's one of the areas in data science that is lacking. We do a really good job on helping people understand boosted trees and neural nets and advanced statistics, and different things like that and really give them a good education on understanding is this a-- should this be a map? Should this be a Venn diagram? Should this be a line chart? Should it be a histogram? What's the best way to present this kind of information? So we hire for the hard skills, and we can train for the more soft skills. So I think everybody can be aware that we're not all experts at everything and we should own what we're great at, and we should work hard at what we're not so great at.

SUSAN: 28:11

It's a great philosophy, and that's actually a great transition to my next question. So thank you for that. I loved reading throughout your book as somebody who used to teach and is very interested in this concept personally, this idea of always learning, always working on those things that maybe aren't your biggest strengths, and continuing to develop personally and professionally in that way. Is there something that has inspired that for you personally in your life and in your career being really emphasizing this constant importance of continuing learning and development?

JOHN: 28:42

Yeah, I grew up in Michigan in a small town of about 200 people. And when I left high school, I was never going to go to college. I was an auto mechanic, and I was building hot rods. And that's what I was going to do the rest of my life. And I did that for about a year. And I was working during the day doing auto mechanics, and at night, I was doing diesel mechanics. So I was making a lot of money. But I was saying to myself, if I do this for the next 40 years, this is going to be really hard. And so I went-- I drove to one of the schools that I had sent my ACT scores to, and they had this new thing back then - I'm quite old - this new thing called computer science. And I said, "I'll try that. I'll give it a go." And when I talk to people and they said, "What are you doing?" I said, "I'm going to school." And they said, "What for?" And I said, "This thing called computer science." And they're like, "Computers? Is there a future in that?"

SUSAN: 29:43

That's great.

JOHN: 29:44

And I was like, "Yeah, I think there might be. I don't know. I'm guessing. I'm going out on a limb here." But that experience of making a decision, making a significant life change, going through my undergraduate, going through my master's, and then meeting my wife, who's a lifelong learner, it just makes life a lot more fun. If you're curious about everything there is out there: gravity, physics, why do door closers work? I don't know. I asked my wife last night as we were sitting at the dinner table how many hours a night do squirrels sleep? And she just [laughter] [inaudible] and she's like, "I don't know. [laughter] [inaudible]."

SUSAN: 30:21

My husband does that sometimes. I think he thinks I'm Siri sometimes and likes to just ask me random questions. And I don't know the answer to that question. But I love the curiosity. And I can imagine that that has really fueled not just your career development and advancement, but also your constant ability to take on new roles and take on new projects. So that's really cool.

JOHN: 30:43

Yeah, and I was just talking to someone the other day, and they said, "Well, why analytics? Why do you like analytics?" And I said, "What other job or career can you be talking about scheduling in a manufacturing plant in the morning and then discussing credit risk in the afternoon?" I said it's fun all the way around.

SUSAN: 31:04

Yeah, yeah. That's a lot of fun. So what are you excited about for the future, either near term, long term, for advanced analytics and data science? What are some things that you are looking forward to and looking forward to learning more about?

JOHN: 31:19

That's a great question. Another question that I was asked a few years ago was when is all this going to be done? And I said, "Well, it's math, and it's data, so never is the answer." And some people get depressed when I say that. Other people get really excited. I tend to get really excited. So there's some things going on right now. We've passed the tipping point on what's being referred to as explainable AI. It used to be that with neural networks and other advanced approaches, we couldn't really understand what was going on inside the systems. We're getting to the point now where we can. So I'm very excited that we're going to be able to start using the most precise and predictive capabilities on some of the hardest problems possible because we can understand what they're doing. So I think explainable AI, in the short term, is really exciting.

SUSAN: 32:21

And is that something that you're already incorporating into your own current work?

JOHN: 32:24

Yes, we are working on that right now. It's really interesting. I have always enjoyed working on problems where I could bring a lot of different data together, small amounts of different diverse data. And that seems to work out now in the analytics and the data and the math side of it too; ensemble approaches. I think those are really interesting and intriguing. Another thing that you see more and more of out there in the market is one-shot learning or ice. I think they call it ice-based learning. In the past, we had to have millions and billions of instances to train our networks, our neural nets, and our analytical techniques. Now we're actually getting to the point where we can initialize the networks with a random generator, random number generators, and other kinds of case-based reasoning and get the same performance of learning through those models as we did with all that massive amount of data and learning. We're getting to the point where we can see nimble training, nimble learning, and fast implementation of very sophisticated models in production. So I'm really excited about that. I really am excited about the expansion of data science out into the teams in general business. We're starting to see more and more people raise their hand and say, "I want to learn more about data science. I want to be involved in it." And we've built some classes at CSL, so people can dip their toe into it. I often have people ask me what do I do? And I said, "Well, you could go out and look at Kaggle. You can do some of those competitions. You can play around." And I would say about 50% of the people come back and say, "Oh, I really don't like that."

SUSAN: 34:20

Oh, what do they not like? What are they--?

JOHN: 34:24

They think data science is all about predicting the future, which at some level it is. But you know, I know a lot of data science is hard work in getting data, cleaning data, aligning data, integrating data. Once you get all that stuff done, which most people find mundane and boring, then you can get on to the math and the algorithms and the predicting and things like that. But I always try to let people know that there is hard work in the front.

SUSAN: 34:53

Yeah, absolutely.

JOHN: 34:54

And if you don't like working with data, then you are going to really not like being a data scientist, so.

SUSAN: 35:01

It is kind of useful. Yeah.

JOHN: 35:03

Yeah. That's one of the things that I always try to do: give people a reality check. You can listen to Tom Davenport, who's a dear friend of mine and says that data science is the sexiest job in the world. And there are parts that are fun and sexy, but there are parts that are decidedly not.

SUSAN: 35:18

Right. Trying to figure out why there's that one extra character in that one cell and then how it messes everything up. Yes. Yeah. We've all been there. And I think I may have cut you off before you talked about something longer term.

JOHN: 35:32

People always ask me about - I can't say that I'm excited about it - but people always ask me about AGI, artificial general intelligence, and the dystopian future of the Terminator and Skynet. And I've been asked enough that I've actually sat down and thought about it, and they said, "Well, when do you think it's going to happen?" And I said, "Well, number one, I don't think it's ever going to happen." I said, "But number two, when we get to AGI, it's 2250. So 230 years from now."

SUSAN: 36:03

Okay, we've got it in the recording now, so we'll revisit this with our great-great-great-great-grandchildren. Yeah. Wow. So we also have a recurring segment on Data Science Mixer called The Alternative Hypothesis. And we ask every guest basically the same question, which is: what's something that people often think is true about data science or about being a data scientist, but you found to be incorrect?

JOHN: 36:30

They often think that data scientists have no sense of humor. That they're all very straight-laced and very nerdy, and that's not really true. I've hung out with some data scientists that are pretty funny and kind of a little up there in their behavior. So now the stereotypes are beginning to break down at this point.

SUSAN: 36:54

Yeah. That's great. I love that. Very cool. Something else that I always like to ask toward the end of our conversation is, is there something that we haven't talked about yet that you want to get in there that you think is really important to the topic of building analytics teams or data science more generally?

JOHN: 37:14

I will go a little broader on this one. I've started working with the Mark Cuban Foundation on their AI boot camps, so we're going to be running one of those in the fall for underprivileged youths in the Philadelphia area. I've been working with University of Michigan, Penn State, University of Texas at Austin to build out data science programs and help people understand what a well-rounded data science curriculum would look like. University of Michigan just came out with the first master's degree in AI. My alma mater, Ferris, just came out with the first Bachelor of Science in AI. And I was having some conversations this morning about how we can build K-12 AI education curriculum and get it into all the schools across the United States. So I'm passionate about helping young people understand artificial intelligence, and I'm passionate about them seeing themselves in the field, so they don't look at it and say, "Oh, that's something these really smart people do out there." That they can think about it, and they can build solutions that actually do things for them in their communities. So I think that we as an AI community need to come together and need to work hard to help the next generations have these capabilities.

SUSAN: 38:48

That's so interesting. Do you think that the data science teams of the future will be different in some key way when there are people who have basically been raised on learning and knowing about AI?

JOHN: 39:02

I do think they will. I do think they will be different. I was talking to John Seely Brown, who was one of the early directors of Xerox Palo Alto Research Center, and he said, "You know you'll be successful when it's as ubiquitous as electricity, and nobody even thinks about it." And everything is AI-empowered, and it's just made better and better and better. And I think I'm an optimist, obviously. So I think making everything better with AI will just continually make life better. So I think we'll see all sorts of things that will be AI-enabled. And when people are AI natives, we say digital natives now, but if they're AI natives, then, of course, they'll look at things differently.

SUSAN: 39:51

Well, this is great. And I think folks will really take a lot of inspiration from your perspectives on both the team and individual levels and how that can enhance their work in data science. So, John, thank you so much for joining us on Data Science Mixer.

JOHN: 40:04

Thanks for inviting me, Susan. It's been a pleasure.

SUSAN: 40:09

Thanks for tuning in to Data Science Mixer. Earlier, we mentioned John's book, Building Analytics Teams: Harnessing Analytics and Artificial Intelligence for Business Improvement. We have two copies to share, and all you have to do to enter to win one of them is go to community.alteryx.com/podcast, navigate to this episode's page, and leave a comment with your thoughts on our cocktail conversation. Submissions to win a copy of this book end when our next episode comes out in two weeks, so be sure to comment by April 20th. For today's cocktail conversation, we want to know what role has curiosity played in your data projects? Has it helped you ask a creative or unusual question at the right moment? Has curiosity helped you advance your career? We can't wait to hear your thoughts on the episode page. Cheers.


This episode of Data Science Mixer was produced by Susan Currie Sivek (@SusanCS) and Maddie Johannsen (@MaddieJ).
Special thanks to Ian Stonehouse for the theme music track, and @TaraM for our album artwork.

Comments
cgoodman3
14 - Magnetar

I’ll definitely add this to my running podcast list. Building analytic teams is part of my day job!

blyons
11 - Bolide

Love his philosophy of being flexible with schedule and deadlines. A lot of the challenge is keeping enough room in your workload to allow for that flexibility. So often, the team is not large enough, and the demands are so high, that there is no extra bandwidth to think creatively and tinker. That also means no time to learn new technologies, techniques, software, etc. It is what I like to call "the tyranny of the urgent."

 

It is because of this that we began setting aside a block of time to do our own thing - whatever we wanted. Could be reading a book, watching videos, experimenting with a feature of software, building a dream project, or whatever. This is similar to what Google called "20% time." We started out with just 4 hours per week. We figured not much gets done on Friday afternoon anyway, so we put it there. We called it "Data Science Friday," and even created a logo we hijacked from NPR's Science Friday. It wasn't long before we realized so much good, usable, productive stuff was coming out of it that 4 hours per week wasn't enough, so we dedicated the entire day on Friday every week to Data Science Friday.

SusanCS
Alteryx Alumni (Retired)

@blyons, great points. I love the concept of Data Science Friday (and, as an NPR fan, especially the Science Friday connection!). Fantastic that it's been such a successful initiative for you.

 

Did you discover anything in particular on one of your Fridays (a book, video, etc.) that you'd recommend to others as a source of inspiration?

blyons
11 - Bolide

@SusanCS, thanks for asking. Here are just a few examples by different team members over the years.

Data Science Friday logo.png

SusanCS
Alteryx Alumni (Retired)

@blyons, what a fun logo! Thanks so much for sharing these. (I especially love the 2 months vs. 15 minutes comparison!) That's a terrific list of things folks tackled. Such a great opportunity to learn and to explore possibilities that might not immediately fit into your everyday work, but that pay off in new ways of thinking and unexpected applications.

KevinHarrison
8 - Asteroid

What a great podcast! I enjoyed the flexible schedule and the concept of Project vs Program. 

MaddieJ
Alteryx Alumni (Retired)

@KevinHarrison I thought that was a cool point as well! Especially as a reminder of the importance of being clear with LOB leaders and setting clear expectations.


So glad you enjoyed it!

MaddieJ
Alteryx Alumni (Retired)

@KevinHarrison, @cgoodman3, and @blyons, thank you so much for chiming in on this Cocktail Conversation!

 

As you may have heard at the end of the episode, we had two of John Thompson's books to give away to folks who responded to this thread. Instead of picking which two would get a copy of his book, I've ordered another copy so all three of you can enjoy!

 

I'll DM each of you to get addresses 🚚📬

P.S., do I sense an unofficial Data Science Mixer book club forming?

cgoodman3
14 - Magnetar

cgoodman3_1-1619159425510.gif

I’m looking forward to reading this.

 

(plus I am obsessed with Schitt's Creek gifs after @ElizabethB's certification post got me interested in Schitt's Creek!)

MaddieJ
Alteryx Alumni (Retired)

Shamefully, I haven't given in to the Schitt's Creek craze! But with an endorsement from @cgoodman3, I think I'll need to give it another shot 😊