Data Science Mixer

Tune in for data science and cocktails.
MaddieJ
Alteryx Community Team

Kristen Werner, Director of Data Science and Engineering at Snowflake, highlights how data engineering and automation can enhance the human experience of being a data scientist. 

 

 


Cocktail Conversation

 

 

During our chat, Kristen suggested that data engineering and automation can improve the human experience of doing data science. How have you experienced that? Have you seen improvements in these areas that have made doing data science more enjoyable and productive for you?

 

Join the conversation by commenting below!

 


Transcript

 

Episode Transcription

SUSAN: 00:01

Welcome to Data Science Mixer, the podcast featuring top experts in lively and informative conversations that will change the way you do data science. I'm Susan Currie Sivek, the data science journalist for the Alteryx Community. And for today's Happy Hour conversation, I had a cup of tea with Kristen Werner, a director of data science and engineering at Snowflake. The focus is on IT and security. If you're into data science, you know how cool Snowflake is. We're big fans, too. And in fact, we just formed a partnership, so you can seamlessly use Alteryx and Snowflake together. Kristen also has a doctorate in neurobiology and behavior and was a researcher before getting into data science. Her background gives her a really interesting perspective on the field.

KRISTEN: 00:45

I was working in a lab at Princeton. And then I started working at Facebook and everything's kind of gone from there, kind of data science, data engineering, and used Snowflake a couple of times before I started working here, and yeah.

SUSAN: 00:57

Well, so it was destiny coming back to Snowflake. Very cool. I love that. Just a couple of other kind of quick introductory things. If you don't mind sharing, could you tell us which pronouns you use?

KRISTEN: 01:06

She/her.

SUSAN: 01:08

Okay. Awesome. Thank you. Kristen and I actually completed the same data science program called Insight Data Science, which helps PhDs move from academia to industry. We'll touch on that briefly as we also get into her fascination with the human brain and the parallels between biology and data science concepts like neural nets, and A/B testing, and how data engineering and automation can enhance the human experience of being a data scientist. Let's get started. As you know, we like to enjoy a special drink or maybe a snack while we're chatting. So do you have something there with you as we're talking today?

KRISTEN: 01:50

I do. I'll tell you. In the morning, I was excited. I was thinking I was going to have a glass of wine, and I'd really gotten into orange wine during the pandemic.

SUSAN: 01:58

Oh, nice.

KRISTEN: 01:59

But I have a few more evening meetings tonight. So I am having turmeric tea from a place in Berkeley, so yeah.

SUSAN: 02:06

Yeah. Nice. That sounds delicious. Very cool. Yeah. I have to actually go run a couple of errands. So I, too, am having something not too adventurous, just some sparkling water with some grapefruit bitters in it, which is very tasty. Yeah.

KRISTEN: 02:20

I'd like that. I will try that.

SUSAN: 02:23

Yeah. I didn't know if I would like grapefruit that much, but it turns out to be really good. So yay.

KRISTEN: 02:28

Yeah. It sounds delicious.

SUSAN: 02:29

Yeah. Awesome. So I would love to come back to your path toward data science a little bit and how you got into the field. You mentioned that you have this super interesting background in neurobiology and behavior. I noticed also that you've done vaccine research in the past. How did all of that lead you toward data science?

KRISTEN: 02:46

Yeah. Vaccine research was so long ago.

SUSAN: 02:51

Yeah. Also, I mean, how contemporary right now, though. Gosh.

KRISTEN: 02:54

I know. I can pull that back out of my history. I've actually talked to a couple of people about vaccine development from my experience. But yeah, so my science journey was very-- I come from a family of academics, and I thought I wanted to work on cancer biology. I was in love with the city of Seattle. I applied to an intern program, an internship program. I got drafted into a neuroscience lab. And that woman's name was Linda Buck, who went on to win a Nobel Prize in medicine and physiology for discovering how a sense of smell is transduced through neurons. And she really influenced my decision to go into neuroscience. So I gave immunology a good try. And so I actually ended up at Karolinska Institute. I think as part of my senior thesis, I was doing research at the Karolinska Institute, which is in Stockholm. And that's where I worked on vaccine development. But I was still drawn back to neuroscience. It was one of these really big final frontiers of biology of simply understanding how the brain works. Someone told me the story once about in an interview, a neuroscientist came around his desk, put a sticky note on his shoe, and then went back to the other side of his desk. And he goes, "You'll never forget that I did that. There's no reason for me to have put a sticky note on your shoe. You'll never forget that I did that." And that's what makes the brain so interesting.

SUSAN: 04:23

[laughter] Oh, that's awesome.

KRISTEN: 04:25

That's what he said. I was like, "That's so true." And now, not only will that person never forget the sticky note, I will never forget the random story about the sticky note. So I just found the brain just such an interesting space to work in because there was so much that was unknown. And I didn't really get away from that until you start to see the career paths that are available. And I think this happened. I kind of see myself now as this particular class of data scientist or data engineer that really enjoyed problem-solving and getting deep into technical work and learning new techniques to solve problems that are on the path to the thing you're interested in. As you look down the path of applying for jobs at any school that has an opening and, potentially, years of being on the job market, and Insight Data Science was just coming on the radar, I think, when I learned about it from someone who had been at Stanford. And so I thought I actually have all these options. Right. It's not just apply for a professorship for-- until I get one somewhere that is a place where I also might want to live, or maybe I won't ever get a job in a place where I want to live. And I didn't have kids at the time. I didn't. It was me and my dog and my bike and my coffee machine. And so, "I can go out to California for a few weeks and try this."

SUSAN: 05:58

A few weeks.

KRISTEN: 06:00

Yeah. A few weeks, I could do the Insight thing. And my dog loves the beach. It'll be great. And I'll also say, something that my mom didn't get at the time, I was like, "Oh, data scientist." I was like, "I'm a scientist. I work with data. This must be the industry term for people like me." I had no idea that it was kind of this new thing that folks were looking to revolutionize how we do business. So I think that probably, a long story short, that's kind of how I got into it and got to Insight. And Insight really, really helped from there. I know you're also an alum, so you can appreciate that journey.

SUSAN: 06:38

Yeah. For sure. Yeah. I'm certainly a recovering academic as well. I definitely understand that perspective. And I think it's so interesting, particularly with your history in neuroscience, and this may be a totally dumb question. So feel free to just be like, "No." But I wonder if things like neural networks and some of these ways that AI has tried to mimic the structure of the brain, have those been of particular interest to you, or have you mainly focused on other kinds of issues as you've moved on in your career?

KRISTEN: 07:07

I'll tell you. So I do find it interesting, the links between a lot of popular techniques in industry back to the history of the areas that I studied, and so not just neural nets but also A/B testing. So I worked with fruit flies and C. elegans, the little lab worm, nematodes. And genetic analysis, mutant analysis, comparing one mutant to another is one of the first places that A/B testing was used. How do you statistically evaluate the difference between mutant A and mutant B from a wild type? And kind of neural nets also-- think about how when you're trying to identify networks in a culture of neurons or in an animal system, I think in terms of the application of neural nets, I haven't really gotten into that too deeply in industry. I think one of the big things that I've focused on in industry that comes from my work in biology, specifically, is creating clarity from a very ambiguous space. And I will say creating clarity in neural nets and interpretability is very hard.

SUSAN: 08:29

Oh, yeah. Yeah.

KRISTEN: 08:31

And in industry, when you're trying to move very quickly, if you're not doing an application of neural nets that requires it, that requires sort of deep learning, that sort of advanced technology, then you usually don't do it. So that's kind of been my experience with kind of linked concepts. I think the ambiguity of biology and trying to bring clarity, that has actually been the most powerful experience that I bring with me.

SUSAN: 08:56

That's awesome. And I would love to come back to that in a little bit. But first, I have to say I love that A/B testing started with the study of mutants. I think that's fantastic.

KRISTEN: 09:05

I'm not going to claim to be an ultimate historian. But it's one of the very original applications for A/B testing, so.

SUSAN: 09:14

That's fantastic. Very cool. So just to kind of spoil things for our audience, sometimes we do talk a little bit before we actually record. And one thing that you mentioned to me in that previous conversation was that you feel like you have a recovering data scientist perspective in your current role and maybe a little bit especially around data engineering. Would you be willing to talk about that a little bit and what that means to you?

KRISTEN: 09:37

Yeah. Totally. I don't feel alone in this either. I'm not alone as an academic in industry, and I don't think I'm alone as a recovering data scientist either. Kind of starting out in 2014, and you still see it today, working at Snowflake, you see, and I'm sure at Alteryx as well, you see a lot of companies just trying to get a handle on data. Right. And the instinct is to hire some data scientists or some data analysts and just they'll solve your problems for you. They'll do one of two things. They'll solve your problems, or they'll just bring data to the table and tell you you're right all the time. Neither of those things is going to happen, right, if you just bring a data scientist. And I think, for me, my experience is long-running queries where it's if you're running queries, and you don't have results returned back to you within seconds, it can be incredibly hard to do fast and iterative exploratory analysis to understand what's the next step. I remember having this experience of trying to quality-check data, but the queries were taking 30 minutes to run, and it became-- it was very hard for me to think about what had I checked? What had I not checked? How do I even think about this, and when is it ever going to end? And so this idea of automation and simplification and commodification of common processes became super, super appealing to me.

KRISTEN: 11:07

And I think, so now, I never want to throw a data scientist on a problem, or an analyst, without making sure that the infrastructural support is there, the data engineering is there. I have this sense that if 5 and 10 years ago, people didn't know what to hire in a data scientist, is it a mathematician, or is it a physicist, or who can you hire for this role? I feel like data engineering might be in a similar spot today where there are a lot of data engineers, classically, who have done ingest and some rudimentary structuring. But this idea of analytics engineering, or prep work that helps enable data science specifically, I'd really like to see that become more acknowledged and brought to the forefront as a critical part of any data team or the data stack that a company is developing.

SUSAN: 12:02

Yeah. Absolutely. That makes a lot of sense. I want to come back to this idea of your long-running queries where it was hard to keep track of that train of thought. It kind of reminds me of sending a spaceship out and having that communication delay where it's hard to carry on that conversation because you're waiting for the other person to reply. You're waiting to get the data, waiting to get the response. What are some challenges, right now, that you've seen in easing that process and speeding that up?

KRISTEN: 12:27

Yeah. So I think there are a lot of tools in this space that are seeking to sort of structure and commodify parts of that supply chain from, you have data in your warehouse. Now, how do you create business value from that? You see a lot of apps kind of building in this space. How do you make your data model good? How do you make it self-serve? How do you make it end-user-friendly? Every company is made up of different types of people, different types of roles, and the evolution of any given data stack can happen in a different way. So I think, for me, I have a very specific taste of what I want of it, and I tend to build over buy a lot of times. So it's good. But it's always good to have that discussion. What can you buy to solve your data stack problem, and what should you build custom? But I think it's always good to take a step back and say like, "What are the common processes that we need to standardize, and what are things that need to be manual that we don't understand well enough to build the right thing or to buy the right thing?" And so you understand what problem you're solving in the data stack, kind of hold off and then get the right solution. I think very often folks just want to get to insights so quickly that there's not a lot of thought put into this middle process. So I think I don't want to enumerate all the players out there, but there are definitely pipelining and some data viz tools that want to help organize data. So yeah, I think there is-- I don't know if that's a clear enough answer.

SUSAN: 14:05

Yeah. No, it makes a lot of sense. I think part of that is actually going back to your previous point about bringing clarity to ambiguity. Right. You have to figure out what the problems are in order to figure out what the right tools and solutions might be to address this kind of problem.

KRISTEN: 14:19

Exactly. Not everyone's going to have this 30-minute query problem. Right. You've got to have a lot of data for that to become an issue. But you might have a data consistency problem. Right. You're trying to report sales metrics to marketing and finance. How do you make sure that the sales metrics are going to be coherent with what somebody over in marketing is trying to measure or what somebody over in finance is trying to measure? How do you create that global view? I think that's a problem a lot of folks have. And so how does your infrastructure support that, and how does it support it today? How does it support it when you're at 200 people, 500 people, 10,000 people? I always want to ask those questions today like what happens when we're 10,000 people? Would we solve this the same way?

SUSAN: 15:10

Yeah. That's a great question. Talk to me a little more about this issue of data consistency because I think that's one that-- particularly for those organizations that are growing and adding more people, having maybe a different departmental structure as they grow, what are some ways that you've seen that dealt with effectively or maybe failure or something?

KRISTEN: 15:28

Yeah. I think the earlier you can have a concept of core metrics at a company, probably the better. But what I've seen in multiple instances is that you get a product or a product team or company, and they have a few folks who are kind of interested in data. And you start collecting data in a data warehouse because that's a natural thing to do now. And then you write a few ad hoc pipelines, and put up a dashboard. But then you have, as you're alluding to, then you hire five more people, and then you hire five more people, and now you have all of those people are creating all their own custom pipelines, and you're going to create-- that's where the chaos kind of comes from is full customization. So I think if you're still small, but you have an eye on this tech debt, and the goal of creating, at least for your top-line metrics, a coherent and core data set that everyone can use and everyone should reference. Find whoever, the specialists, in the company to own that, or hire somebody that can at least build, protect and maintain and socialize a core data set. I mean, that's something you can do without having to buy an extra tool or whatever, but just build in that concept of having sanity amongst the chaos. Yeah.

SUSAN: 16:51

Yeah. If only we could all achieve sanity among the chaos. That's [crosstalk].

KRISTEN: 16:56

Just a little sanity. [laughter] If you're B2C, you probably want to have a core space for understanding users. If you're B2B, you want to probably have a core place to understand accounts. And no matter what you are, B2B, B2C, you want to understand revenue. Get those things down. Create them once, and figure out how to lend them to all the departments that need to understand their part of the business with respect to users or accounts or revenue. Yeah.

SUSAN: 17:27

So it may be kind of self-evident, but what are some of the benefits of establishing that one source of data truth?

KRISTEN: 17:37

Yeah. [laughter] Put in hours and hours of tense meetings. I think that's the--

SUSAN: 17:48

Major plus.

KRISTEN: 17:48

Yeah. I don't know how you measure that. I mean, the way the surface is, right-- and actually, I think this is something I think-- always trying to promote data literacy, having leaders that understand the importance of data, and always that they check with their business leaders, but also check with their data leaders for insight.

SUSAN: 18:07

So what happened?

KRISTEN: 18:10

What you see happen is that two groups are reporting numbers that just don't make sense together. They're either trying to report the same metric, and they have a different answer, or they're reporting metrics that you just can't reason about together. Say, if marketing says, "We have eight million leads," and sales is like, "We addressed 500," and it's like, "Well, how do we even make--" How do you make sense of these numbers together? How can you map the leads that were addressed to the leads that you're saying had come in? And often, people look, and they're like, "Well, our data people just aren't doing their job. They're not very good at their jobs. Go back and fix it. Go back and clean it up." And so then you have multiple data scientists or analysts, and you have multiple high-level businesspeople breathing down your neck trying to figure out what happened. And that's a terrible use of people's time, right?

SUSAN: 19:01

It is. Yeah.

KRISTEN: 19:03

Now, you have five people troubleshooting something. If you kind of just recognize here are the core things. Everybody has to relate to these. This is what you use, whether you're using a platform to develop them, or you're just managing it by policy because you're a smaller company. If you can agree on that, I think you remove a lot of this friction and tension around reporting inconsistent numbers. And the other thing about reporting inconsistent numbers is that people are like, "Data folks aren't doing their jobs," and it erodes trust. And all data really can do-- I had a manager a while back who said, "Data doesn't do anything except influence." So if there's no trust, how do you influence? And you really lose power that way. So I think it's in everybody's-- those are some of the outcomes. You maintain trust. You build the ability to influence. You go to fewer tense meetings that are full of [inaudible]. I think there's a lot to-- and also, you start to, if you do this, you find all of your departments and find all of your top-line metrics for the different departments and make them reconcilable against each other. You can start to ask these flywheel questions about how does product engagement drive revenue, drive company growth? Right. And now you're connecting product, and finance, and sales, and marketing, and asking how can they positively impact one another? And I think that's the real goal that you want to get to.

SUSAN: 20:34

Yeah. That's where you want to spend the time and engage in those deep conversations. I think those are all really great insights and certainly should motivate people to aim for that sort of structure that would allow them to recognize those benefits. Something else that you mentioned I'm really curious about. You talked about automating common processes for the data scientist. What's some of the work that you're doing in that area that you're finding interesting and rewarding?

KRISTEN: 21:00

Yeah. Yeah. So this is the great thing about data engineering as sort of the observer of business and data scientists is - and these are going to sound obvious, but I'm going to say it anyway - is you see folks are commonly reporting on 1-day, 7-day, and 28-day metrics. There are multiple ways in SQL that you can express those simple aggregations. And you might also want lifetime metrics. Right. And you can think, if you see people doing this over and over again and you're like, "You shouldn't have to write that SQL. That's not a good use of your time as a data scientist." But it's not data science. Right. And counting things is not trivial. I don't want to belittle that. Counting things is hard. Otherwise, you wouldn't have all these PhDs like you and me in the industry. Right. And so abstracting away the SQL into functions that auto-generate these common time-series metrics is something we have worked on. And it's a very basic use-case. But you can imagine you can save hundreds and thousands of lines of SQL that nobody has to audit. Nobody has to write. Nobody makes the mistake of-- I mean, can you imagine how many times everyone's trying to report on 28-day metrics. And some of them are 28. And somebody, accidentally, they use greater than or equal to instead of just greater than, and you get 30-day metrics or 29-day metrics. And you don't know it until somebody calls it out in the executive meeting. So I think standardizing, abstracting that away, these really common things people shouldn't have to think about. You shouldn't have to worry about those kinds of mistakes in this day and age. So we kind of pull it back into Python functions, and store that as a library that folks can call. Yeah.
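
[Editor's sketch] To make the idea concrete, here is a minimal illustration of pulling trailing-window SQL into a Python function. It is not the library Kristen's team built; the function names, the table and column names, and the use of ISO date literals are all assumptions made for this example. The point it shows is that the window boundary is computed in one place, so nobody hand-writes the comparison that turns a 28-day metric into a 29-day one.

```python
from datetime import date, timedelta

WINDOWS = (1, 7, 28)  # trailing-day windows mentioned in the conversation


def window_start(as_of: date, days: int) -> date:
    """First day of a trailing window that ends on `as_of` and spans exactly `days` days."""
    if days < 1:
        raise ValueError("days must be >= 1")
    # A 28-day window ending on as_of starts 27 days earlier (both ends inclusive).
    return as_of - timedelta(days=days - 1)


def trailing_metrics_sql(table, date_col, value_col, as_of, windows=WINDOWS):
    """Build one SELECT with a SUM per trailing window plus a lifetime total.

    Hypothetical helper: identifiers are interpolated directly, so pass only
    trusted table and column names.
    """
    cols = []
    for days in windows:
        start = window_start(as_of, days)
        # Inclusive lower bound, decided once here rather than in every query.
        cols.append(
            f"SUM(CASE WHEN {date_col} >= '{start.isoformat()}' "
            f"THEN {value_col} ELSE 0 END) AS {value_col}_{days}d"
        )
    cols.append(f"SUM({value_col}) AS {value_col}_lifetime")
    select_list = ",\n  ".join(cols)
    return (
        f"SELECT\n  {select_list}\nFROM {table}\n"
        f"WHERE {date_col} <= '{as_of.isoformat()}'"
    )


if __name__ == "__main__":
    # Assumed table and column names, for illustration only.
    print(trailing_metrics_sql("daily_usage", "usage_date", "queries", date(2021, 3, 28)))
```

Whether the date literals compare correctly depends on the warehouse and on the column being a true date type, so treat the generated SQL as a starting point rather than something dialect-proof.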

SUSAN: 22:49

Sweet. Yeah. Very cool. What are some other things that you're excited about or that you think are really promising in data science or data engineering or both right now?

KRISTEN: 23:00

I'm really interested, actually. I don't want to harp on it too much, but in bringing order to the chaos of the supply chain. So if you bring your data into a warehouse, how do you get it from warehouse to business value in an orderly way? And then the use-cases of business, we're thinking a lot about what are the common ways that data brings business value? One big question that data can answer is like, "How do I decide which product to ship at scale?" And that's A/B testing. Someone can set up an ad hoc A/B test. They can write the SQL to pull the underlying data. They can write a t-test or ANOVA test, or they can use a package to run that over their resultant data, and they've gotten a one-off result. But you could also create standard-- as companies grow, a lot of people are going to ask, "Does my A/B test impact revenue? Does my A/B test impact daily active users?" really common metrics that you'd want to know. And you also want to know, "Does my A/B test increase, for us, SQL query performance?" But we don't want to do that at the cost of other key metrics. So you could create the standardized underlying table structure or transformation pipelines to just have the daily summaries for users or for query performance. If you standardize those tables, the naming conventions, and how they're built, then you can build a generalizable A/B test framework on top of that. So everyone doesn't have to rewrite their A/B test. Right.
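
[Editor's sketch] As a rough illustration of a generalizable A/B comparison built on a standardized table, the snippet below assumes every experiment produces rows with a `variant` column ("control" or "treatment") and one column per metric. Those column names, the function names, and the choice of Welch's t statistic are assumptions for this example, not a description of Snowflake's framework. Because the table shape is fixed, the same function can be pointed at revenue, daily activity, or query performance without rewriting the test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(control, treatment):
    """Welch's t statistic for treatment minus control (unequal variances allowed).

    Needs at least two values per group and some spread in the data.
    """
    se = sqrt(variance(control) / len(control) + variance(treatment) / len(treatment))
    return (mean(treatment) - mean(control)) / se


def ab_summary(rows, metric, variant_col="variant"):
    """Compare one metric between 'control' and 'treatment' rows of a standardized table."""
    groups = {"control": [], "treatment": []}
    for row in rows:
        groups[row[variant_col]].append(row[metric])
    control, treatment = groups["control"], groups["treatment"]
    return {
        "metric": metric,
        "control_mean": mean(control),
        "treatment_mean": mean(treatment),
        "t_stat": welch_t(control, treatment),
    }


if __name__ == "__main__":
    # Made-up per-user summaries, for illustration only.
    rows = [
        {"variant": "control", "revenue": 10, "queries": 5},
        {"variant": "control", "revenue": 12, "queries": 7},
        {"variant": "control", "revenue": 14, "queries": 6},
        {"variant": "treatment", "revenue": 13, "queries": 6},
        {"variant": "treatment", "revenue": 15, "queries": 9},
        {"variant": "treatment", "revenue": 17, "queries": 6},
    ]
    # The same call covers the target metric and any guardrail metric.
    for metric in ("revenue", "queries"):
        print(ab_summary(rows, metric))
```

The sketch stops at the test statistic; turning it into a p-value or a ship decision would need a statistics package and choices that the conversation does not specify.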

KRISTEN: 24:59

So that's kind of the way I think about it. What's the business value proposition that you are fulfilling, and what are the pieces that you can build out that support that, starting from the data curation side? And I think there's always going to be custom. And I love that that's problem-solving that's specific to your business. But I think if people aren't reinventing the wheel every time with these other standard processes, I think you can do more of that than like, "Okay. Well, we have deep questions about security in our ecosystem. We have deep questions about product feature usage that we can get to more quickly," so.

SUSAN: 25:39

Absolutely. Let's pause for a moment. Kristen has more advice for data scientists on how to bring order to the chaos of data and on how to demonstrate the incredible value of their work. But first, let's hear a little bit about one way you can get your data in better order quickly and easily.

S3: 25:57

Hi, everyone. This is Gaurav Sheni. I'm the senior software engineer working on our open-source software. Today, I wanted to talk about Woodwork, which is an open-source Python library for data types and two-dimensional tabular data structures. Woodwork provides simple interfaces for working with typing information. Woodwork makes it easy to identify proper data types and helps you gain statistical insights. It also helps you prepare your data for machine learning. This library works great with EvalML, an AutoML library from Alteryx. Using Woodwork with EvalML provides additional information about the data set, allowing you to build better machine learning models. You can access all of these projects by visiting GitHub.com/alteryx. Furthermore, you can view the Woodwork documentation by visiting woodwork.alteryx.com. For updates and tutorials, follow us on Twitter, @alteryxOSS.

SUSAN: 26:50

Thanks so much for those details about Woodwork. And now, let's get back to our conversation with Kristen Werner. What are some of the biggest changes that you've seen as you've moved into this field? What are some big either technical changes, structural changes, business changes? What are some big things that you've observed that you've found intriguing?

KRISTEN: 27:15

Gosh. It could be any of those, not all of them. [laughter] Okay. Yeah. Well, I mean, so first what I want to say is so much has changed for me in the last seven years. I've always kind of looked to work places. I mean, Facebook, how lucky was I to land there to start my career off? And I learned so much from them that, stepping out of Facebook, you see it's a rough world in data. And I've tried to pick companies where data comes first. I was at a company called OpenDoor. Data is their product. They price homes, and their success metric is accuracy. Right. And then Snowflake, right, it's a company that's interested in making data accessible to all companies. So I felt really lucky to work at those companies. But now that we do some amount of customer-facing work from Snowflake and seeing other people bring their data problems to us, it's not an even playing field out there. I think there is a clear idea of what data should do, kind of globally accepted. Who do you hire? When do you need them? Yes, they're a critical part of your staff. But I think you still see people struggling with, "Well, I've got my data, and I've got my data scientists. Now what?"

SUSAN: 28:43

Now, the magic happens, right?

KRISTEN: 28:45

Yeah. [laughter] Solve the problems, right, that's-- or they're going to tell me I'm right all the time. I think that there's still a little bit of-- I don't know how this changes going forward is, you need somebody, you need executive sponsorship who either knows how to work with data, knows how to incorporate them into your whole strategy, or is willing to be taught by the practitioners that they hire. I think you need that empowerment from the top level. And I think that's something that's not as widespread yet. But I do think there's a better general acceptance of the investment in data that people are willing to make, and that they know it's important, and they're going to buy these tools to support it. They're going to hire people. But I think getting data impact in the business, I think, is something that folks still struggle with.

SUSAN: 29:41

True. So I wanted to ask you a question that we ask all of the guests for this recurring segment that we have now called Alternative Hypothesis. And the question is, what is something that people often think is true about data science or about being a data scientist but that you think is actually incorrect? It's a tough one.

KRISTEN: 30:08

Yeah. I still get this feeling that there's a large sense that if you hire a data scientist, oh, man, they think you're going to build a model that tells you what to do next. And I get it. People talk about predictive and prescriptive analytics these days, and people are upping the ante of buzzwords from ML to AI. I feel like some of the magic has maybe-- people are like, "Oh, right. First, you have to count stuff, and then you can model stuff, and then you can think about it." And if your data is good, then you can do these other great things as well. But I think that sort of sense from the business side that you hired a data scientist, so they're going to create models. When I think data science, what I think of is you hire a data scientist, and they're going to help you solve problems through partnership. And maybe that requires a model. But no matter what, it always requires taking a step back and analyzing and interpreting that model and figuring out how it relates to business and interacting with the business to get feedback, figuring out how. There are human-observed things that happen that we can't figure out with math and statistics. Right. And so I think that's sort of-- I don't know if this is-- for a data scientist this is probably well accepted. For business people, maybe it's novel and new. But yeah, I think there's still this expectation that there is some magic inherent to data science that there's not. Yeah.

SUSAN: 31:44

Yeah. I think some of the media coverage of the field kind of reinforces that too.

KRISTEN: 31:48

Yeah. But it keeps us employed, so, you know. [laughter]

SUSAN: 31:52

There are pluses, minuses. [laughter]

KRISTEN: 31:54

Yes. I can imagine people are like, "Oh, data science is magic. We can hire a neuroscientist for this. That's fine. That'll do."

SUSAN: 32:03

[laughter] Do you feel, I mean, though, your neuroscience background and your background as a researcher-- I'm sure there are ways that all of the variety of skills that you cultivated for that role certainly helped you in your current position.

KRISTEN: 32:18

I also would want to give a nod to folks who are not in purely quantitative fields or in the more natural sciences. I think biology itself has a lot of ambiguity. And as a biologist, when I looked at some of the physicists around me, in biology, you might have a team of three to five people creating a research paper or writing a journal article together doing everything end to end, writing the proposal to get the money, designing the experiments that are going to help you answer the question, collecting the data, making sure the data you collect is clean, analyzing the data, figuring out the follow-up experiments you need to run, writing up the paper, and submitting it to journals until you get it done. And I think having really hands-on end to end to all that process is what you do in industry as well.

SUSAN: 33:08

Yeah. Yeah. As you were talking about it, I thought, "Oh, we're kind of mirroring the process of collaborative work," the same kinds of things.

KRISTEN: 33:16

Yeah. In biology, you can't always isolate one variable. There's so many things at play. And so I think being able to kind of cut through noise and all of the extra variables and figuring out what's the simplest possible thing to do that will make the point, I think, is a super valuable skill. And I really do think there's a lot of hypothesis-driven thinking and work that I think a lot of academics have. But I think in biology, the clarity and looking for simplicity in something as complex as biology, I think, is really useful. So I also want to give a nod. I know there are a lot of physicists out there who might not like that answer. [laughter] But I do think there's a lot to be said for biologists in the field too.

SUSAN: 34:02

Yeah. That's awesome. Although, as a social scientist, I have to come back and say I did hear you say something earlier about things we can't solve with maths and things we can't necessarily observe through statistics. So that stood out to me too.

KRISTEN: 34:15

Yeah. Yeah. Yeah. I think, sometimes, you just got to go look. You got to just go ask. Right. In some of our work at Snowflake, we were trying to score accounts, and everything was super off, and we went to the sales field, and we're like, "What do you think?" And they're like, "Your results are terrible." Yeah, like, "How can we think about this?" And we sat down, and we were like, "Oh, my gosh, we picked the wrong outcome variable. Our company has changed. We can't--" So what we did, we were like, "Okay. Well, this account scored low. You think it should be high. Why is that?" And one of the salespeople was like, "Yeah. I was on that team. We were busy selling to someone else. We couldn't close this account. We couldn't spend more-- but now we can. They were a great candidate. We just didn't have time two years ago."

SUSAN: 35:02

Oh, wow. Yeah. Yeah.

KRISTEN: 35:06

Yeah. Maybe we would have figured that out one day. But it was just so much easier to talk to somebody.

SUSAN: 35:12

Yeah. Yeah. People have information. That's cool. I love that story. I mean, they totally are complementary methods to me, qualitative and quantitative. So I love hearing that story. It's super important.

KRISTEN: 35:26

Absolutely. Actually, just to that, if you don't mind adding, there's something I tell people a lot about when I try to talk to people about statistics and being data-driven because it's not an all-or-nothing deal. Right. And a friend of mine, she kind of navigated the academia to industry path with me. And she gave me this example. So if you knew with 99.9% confidence that you drove home-- and this is a very Bay Area example. If you take the 101, every time, you're going to get home two minutes earlier from work if you go on the 101 than if you take the 280. If you take the 280, you will always take two minutes longer. Well, anyone who's in the Bay Area will tell you, hands down, they would much rather take the 280 any day than be on the 101. The traffic pattern is so bad. But data would tell you-- if the metric you're measuring is speed to home, time to home, you would tell people to take the 101. You would say everyone should take the 101. But the experience and the value proposition to the driver is the much more pleasant drive, and two minutes isn't meaningful. So I think that it's always important to kind of step back and look at what's actually happening.

SUSAN: 36:45

I love how this conversation, and it seems like so much of your work, is really around that human experience, the human experience of the data scientist who doesn't need to be writing that same SQL query all the time, complementing your data with these other kinds of insights. I just think that's super interesting.

KRISTEN: 37:01

Yeah. And I think it's what makes it fun too. You don't want to do the same thing day in, day out. You get to interact with the business. And you don't want to be sitting locked in a room and expect you to come out one week later with fantastic business insights. And it also helps you build those relationships. Right. The person on my team that worked on an account, the account scoring at the time, everyone knew him. They were like, "That's the account scoring guy." And that's great recognition to have people come to you. You have those relationships. And that's another way to also start building trust in the work that you do.

SUSAN: 37:31

Is there anything that we haven't talked about yet that you want to be sure to get in there or any additional clarity that you would like to add to anything?

KRISTEN: 37:39

Given the flavor of our discussion, I think if I were to add something, I think, my first manager at Facebook, he said something that was a little bit offensive to my sensibilities at the time. But I get it, and it makes sense, and it has benefited me a lot. It's you don't need to learn just to learn. If you go solve a problem, you will learn on the way. I often hear people asking about the skill set, that skill set, how do I do this, that, and the other thing? If you focus on the problem you're solving, especially the folks from academia or other kind of problem-solving areas, you'll figure it out. If we need to use real-time data for something, and somebody absolutely needs it, we're going to figure out how to build the best system possible. And I think we'll go out, get the tools, set up the infrastructure, test it out. We'll do all of those things. And that's what's interesting. And I think that's a great way to learn. And it certainly has given me a broad, broad set of experiences in the last seven years. So I think that's something I think about as well is, you don't have to go in with stats. You'll figure it out. You'll learn what a T-test is or ANOVA, or you'll learn about NLP. There are some things it's always great to have a specialist around. I think there's so much growth in just the experience of different jobs and problem-solving in the industry that it's super rewarding. I don't think people should ever hold themselves back because of, "I don't know this language or that language or this skill or this approach."

SUSAN: 39:13

True. Yeah. That's awesome. And I think that will be super encouraging to a lot of folks who are always concerned like, "Do I have that precise list of things for this job?" or, "Do I need to practice this other thing before I can do anything else?" I think that's really helpful to hear. Well, Kristen, thank you so much for joining us today. I know our audience is really going to enjoy all of the things that you've had to share. And I'm super grateful for you spending the time talking with us.

KRISTEN: 39:38

Thank you. It's been really fun, really fun at the end of my day. Thanks so much.

SUSAN: 39:45

Thank you for listening to my chat with Kristen Werner of Snowflake for today's Data Science Mixer. For this episode's cocktail conversation, let's go back to earlier in my chat with Kristen. She suggested that data engineering and automation can improve the human experience of doing data science. How have you experienced that? Have you seen improvements in these areas that have made doing data science more enjoyable and productive for you? Share your thoughts on the Alteryx Community at Community.alteryx.com/podcast. And finally, if you'd like to experience how the combination of Alteryx and Snowflake could make doing data science better for you, we've got a starter kit on our website that you can explore. Visit Alteryx.com/snowflake for all the details. And we'll put a link in the show notes. Thanks for joining us. Cheers.

 


This episode of Data Science Mixer was produced by Susan Currie Sivek (@SusanCS) and Maddie Johannsen (@MaddieJ).
Special thanks to Ian Stonehouse for the theme music track, and @TaraM for our album artwork.