Alter Everything

A podcast about data science and analytics culture.


Bias in AI not only impacts repressed groups of people but also inhibits our ability to utilize the technology to its full potential. If bias in AI didn’t appear out of thin air, where did it come from? Charita McClellan and Monica Cisneros join us in a discussion on where AI bias comes from and how we can build a more inclusive future by training AI to include everyone.









Episode Transcription

Ep 159 Bias in AI

[00:00:00] Megan Dibble: Welcome to Alter Everything, a podcast about data science and analytics culture. I'm Megan Dibble, and today I am talking with my Alteryx colleagues, Charita McClellan and Monica Cisneros. In this episode, we chat about bias in AI: how bias works, how it can impact AI and software development, and what we can do as professionals to increase inclusivity and improve AI strategy.

Let's get started.

Thank you both for joining us on the podcast today. We'd love for you to introduce yourselves to our audience so they can learn a little bit about you. We'll start off with Charita.

[00:00:40] Charita McClellan: Hello. Thank you so much. My name is Charita McClellan. I am the Global Senior Diversity, Equity, Inclusion and Belonging Manager here at Alteryx.

I started with the company about three years ago. Oh yeah, three years in May. And I've just been really lucky to see the strides and to help develop the strategy when it comes to DEIB here at the company, seeing the improvements that we've made and the path we've gone on to ensure that we have an inclusive environment and that all of our employees feel that they belong.

[00:01:07] Monica Cisneros: Awesome. Monica. Hi everybody. My name is Monica Cisneros. I am the product marketing manager for AI here at Alteryx. So I have a very interesting job. I have also been here at Alteryx around three years, and I have gone from machine learning all the way to AI right now. I am very passionate about responsible AI, making sure that we have an equitable and inclusive use of AI. So I'm very excited to be here with Charita.

[00:01:34] Megan Dibble: That's great. Yeah, and I'm excited to talk about bias in AI. It's a really interesting, really relevant topic that we're gonna be chatting about today. So I'd like to level set and start off here by really defining what bias is. Is it something that's always intentional?

[00:01:49] Charita McClellan: Yeah, I think that's a really great question. It's really important to start at the beginning, right? So we're talking about bias in AI, but what is bias itself? Bias refers to a prejudice in favor of or against one thing, a person, or a group of individuals versus another. And generally speaking, that bias is considered to be unfair.

There are two types of biases. There's conscious bias, which is also known as explicit bias, and then there's unconscious bias, which is known as implicit bias. When we're talking about conscious bias, that's a situation that involves someone being aware of their prejudice towards a particular group, thing, individual, or topic.

Their attitude is biased, but they are aware of it. When you think about an example of conscious bias, a light one is that I have a bias towards beach vacations, right? If you were to present me with an option to do a beach vacation or camping, I would have happy thoughts when it comes to thinking about the ocean and the sand.

And I'd think about camping and I would automatically think it's gonna be wet and gloomy and dark. A more serious example would be a workplace that consistently hires only men-identifying individuals for leadership roles because they believe men are more suitable for those kinds of positions. That would be a conscious bias that they're aware of.

On the other hand, there are unconscious biases, and a lot of time and attention, I think, is really put into understanding what unconscious biases are, because we're not aware of them, hence the term unconscious. But they can have very detrimental impacts. When we're saying unconscious bias or implicit bias, we're talking about prejudice that really operates at a subconscious level, and it affects our decision making, our actions, and how we interact with people without our awareness.

So think about teachers who may have lower expectations of students from certain racial or ethnic groups due to deep-rooted stereotypes, how they were raised, or information that they've received or processed. Think about a hiring manager who consistently prefers candidates from a particular alma mater or a geographic region.

Maybe the candidates were in the same fraternity or sorority that the manager was in. And they're not even paying attention. They're not explicitly going out and seeking those individuals, but as they come across their resumes, they're like, oh, and put them into the pile to be interviewed. And so that's an unconscious bias.

[00:04:07] Megan Dibble: Mm-hmm. Yeah, those are super helpful examples. Thanks for sharing. So then how do these biases happen?

[00:04:15] Charita McClellan: Another great question. Biases occur through a combination of things: social, cultural, even psychological influences. From a young age, information's being poured into us, whether it's from our parents, media, our other family, friends, or people we come in contact with at school or at the grocery store, right?

All that information is influencing us. Our brain is absorbing it and it shapes who we are and how we show up in the world and society. But with that, it also shapes our perceptions and attitudes towards different groups of individuals.

So the brain, wonderful, wonderful organ, right? A very busy and intuitive organ that we have. It has a natural tendency to categorize information. We have all these millions of bits of information coming into contact with us every day, and so our brain is categorizing that information to help us make really quick judgments, right?

And it leads to shortcuts and stereotypes. And sometimes these judgements are great. We know if we see a car going very, very fast, we should not walk into the street. Our brain is already trained to let us know it's danger. Let's not do that, right? If we see a cliff, we know let's not walk off the cliff.

There's danger. You don't have to spend a lot of time processing that, 'cause our brain is already putting those connections together. But unfortunately, sometimes when other things have entered our brain, they can lead towards these mental shortcuts, which are stereotypes, and make us have these unconscious biases towards individuals, things, or groups of people.

So whereas these processes help in our decision making, they can result in those biased perceptions and behaviors if we overly rely on simplified, incorrect assumptions. So I think it's really important that we take some time and figure out what our biases are so we can be more proactive in mitigating them in the future and in how they impact other people.

[00:05:58] Megan Dibble: Definitely. I'd love to move into combining the topics for today. So AI and bias. How can biases negatively impact us when it comes to AI and software development? We'll start with you on this one, Monica. 

[00:06:15] Monica Cisneros: So one of the things that people misunderstand is the idea that AI is inherently biased. They're missing all of the context that goes with it.

Ever since we started computing, we have been adding data into the digital world, and that data is representing the real world. So if we have historically had biased systems, then that gets represented in this digital world. For example, hiring from your alma mater: that becomes a historical record, that becomes part of the data, and then that gets represented in the digital world.

So machine learning, how it works is that it learns from historical data. We're training the algorithms, the models, with that historical data. Then it's going to learn that the best candidate is the candidate from x alma mater, or a man or a white person. Right. So if this is representing our societal norms that we have done historically, and they're not changing with the time where we want to go forward, then that is going to perpetuate that bias in the digital world.

So AI, what it is, is a conjunction of models. Models are made out of algorithms, and these algorithms are trained on the data. So AI is not inherently biased; rather, the data that it is being trained on is what is biased, and therefore it perpetuates that bias when we are, for example, making decisions and automating those decisions.

When those decisions are being automated, we are lacking the critical thinking of criticizing the model itself and scrutinizing it, and we are accepting the output of the model as fact. That is how it perpetuates bias and how the bias gets represented back again into the real world.
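The mechanism Monica describes, a model learning "who the best candidate is" from past decisions, can be illustrated with a deliberately tiny sketch. It is not from the episode and is not how any real hiring system works: the records, the group names, and the `train` and `predict` helpers are all invented for illustration. The "model" simply memorizes past hire rates per group, which is enough to show how skewed history becomes a skewed recommendation.

```python
from collections import defaultdict

# Hypothetical historical hiring records: (alma_mater, was_hired).
# The counts are invented purely for illustration.
HISTORY = (
    [("X University", True)] * 8 + [("X University", False)] * 2
    + [("Other", True)] * 2 + [("Other", False)] * 8
)


def train(records):
    """'Train' by memorizing the historical hire rate of each group."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for group, was_hired in records:
        total[group] += 1
        hired[group] += int(was_hired)
    return {group: hired[group] / total[group] for group in total}


def predict(model, group, threshold=0.5):
    """Recommend a candidate only if their group was usually hired before."""
    return model.get(group, 0.0) >= threshold


model = train(HISTORY)
for group in ("X University", "Other"):
    # The recommendation mirrors the historical skew, not candidate quality.
    print(group, model[group], predict(model, group))
```

Nothing in the code is "prejudiced" on its own; the skew comes entirely from the records it was given, which is the point made above.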

[00:08:18] Megan Dibble: That's a great answer, and it's really interesting that it's often more of a data problem when you end up with biased results.

You have to look all the way back into the data, and all the way back into history sometimes, to figure out what the root causes are of some of these predictions that models are making. So I think that's really interesting.

[00:08:40] Charita McClellan: I completely agree. As Monica said, and as someone in the DEI field, a lot of times when you're coming up with strategies or you're looking at data and you're giving recommendations of what we should do next, the question you get a lot of is, well, what's the benchmark data?

And benchmark data is looking at historic data, looking at what your peers are doing. But if we are in agreement that things have been done a particular way for X amount of years and it's not the way that we want it to have been done, then always referring back to that historical or benchmark data and looking at what our peers are doing is just not gonna get us where we wanna go.

To Monica's point, the data is a record of what occurred in the past, and what occurred in the past is problematic. And so in order to move forward, we have to be really focused and diligent about shifting what information and data we're looking at, and not always using that as our go-to.

[00:09:29] Megan Dibble: That's a great point. 

[00:09:31] Monica Cisneros: There's this case, actually an excellent book, called Unmasking AI by Dr. Joy Buolamwini. In her book, she describes her experience as a researcher. She was doing research at MIT where she was programming a mirror, and she's a black woman. The algorithms that she was using, the models that she was using to program the mirror, were not recognizing her face.

This is like open source models. Actually, she got them from GitHub and they were just not trained to recognize her face. She ended up, actually, she went to a Halloween party, came back, she put on a white mask that she was using for that party, and then magically the algorithms were like recognizing her face now because it was white.

So it's a very practical example, very, very simple, that she was experiencing as a researcher. She only wanted to be a researcher and do her science. But she ended up becoming an advocate. She became a thought leader because she realized how much representation is lacking in training data, and that is perpetuating that bias.

Right? So for her thesis, she ended up creating a database of black faces and female faces. She not only had an issue with being a woman, she also had an issue with being black. It was an intersectional issue where her face was being misrecognized even after she had already trained the models, and she used different models to benchmark it.

Those different models were actually performing differently depending on the dataset that she was feeding them. And baseline models had a very big disparity compared to, for example, male white faces. The male white faces were at a hundred percent, whereas some models were as low as 40% at recognizing black women's faces.

So that is a real world example where a researcher, she was just doing her research, she just wanted to like finish her project and she ended up with a huge problem right now and she's building a career out of it to try to come up with solutions and, um, working with companies and the public to bring awareness to this big issue.

Because ultimately the systems that we're using, the models that we're using, get put into other uses. For example, there is a really big chain of pharmaceutical stores that flagged innocent people as basically being shoplifters, and that was just not true. They were just not matching the right face to the right person.

That became a huge issue, not only for the person, of course, but it also affected the brand very negatively, because they were using a model irresponsibly. They were using AI irresponsibly for a use case that it shouldn't have been put to in the first place.
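The kind of benchmarking Monica describes, comparing how well a model performs for different groups instead of reporting one overall number, can be sketched in a few lines. This is an illustrative sketch, not Dr. Buolamwini's methodology: the group names and the evaluation results are invented, chosen only to echo the 100% versus 40% contrast mentioned above, and `accuracy_by_group` is a hypothetical helper.

```python
from collections import defaultdict


def accuracy_by_group(results):
    """results: iterable of (group, correct) pairs. Returns accuracy per group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, ok in results:
        total[group] += 1
        correct[group] += int(ok)
    return {group: correct[group] / total[group] for group in total}


# Invented evaluation results: one (group, was_prediction_correct) pair per face.
results = (
    [("group_a", True)] * 10
    + [("group_b", True)] * 4 + [("group_b", False)] * 6
)

per_group = accuracy_by_group(results)
overall = sum(ok for _, ok in results) / len(results)
gap = max(per_group.values()) - min(per_group.values())

print("overall accuracy:", overall)      # a single number hides the disparity
print("accuracy per group:", per_group)  # the breakdown exposes it
print("largest gap between groups:", gap)
```

The design choice worth noting is that the overall score looks acceptable while one group is served far worse, which is why the breakdown has to be computed explicitly.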

[00:12:48] Charita McClellan: I think that this really touches on the need for inclusivity, right?

And when it comes to inclusion and bias, they tend to go hand in hand. A lot of times these biases are able to permeate into spaces such as AI because we don't have a lot of inclusion at the table. Right? So for example, and I love that Monica brought up this real world scenario from this coder, but also, was it maybe 2019 or 2020, there was a video from a Marriott hotel where the automatic soap dispenser would not recognize a black person's hand that went underneath it.

It was because the infrared light actually detected hand motions, and it was from the back of the hand. So darker skin, melanated skin, was not recognized by the technology that they put in that soap dispenser. Or a couple years ago, the iPhone 14 couldn't recognize the difference between two Chinese colleagues.

It just decided that they were both Chinese, and so it couldn't differentiate between their faces, and it would unlock the screen for both of them when it's only supposed to unlock the screen for yourself. Or Google's technology that was self-tagging darker-skinned black people as gorillas with their AI software as well.

And so I think if we had more inclusivity and you had more coders from different backgrounds, more people who were developing this software and testing the software from different backgrounds, they would catch these things. But because we don't have the inclusivity in these processes, then these biases seep through the cracks.

Some are conscious, but you know, I, I tend to give the benefit of the doubt. A lot of them, I believe, are unconscious. 'cause people aren't thinking to check these things 'cause it doesn't directly affect them. 

[00:14:20] Monica Cisneros: I think a really interesting topic that I did not have the answer to is that inclusivity piece, right?

I feel sometimes that the onus of creating a more inclusive system or more inclusive data representation, whatever it is, ultimately comes back to the minorities, the people who are actually being affected, and I don't think that that's fair. Because like Dr. Joy Buolamwini, or like myself, I have had to build systems for myself to feel more included, and it has not been on the institution, the company, to make me feel more included and to make sure that I'm having an equitable experience.

And I think that companies and institutions should take a very critical view of their practices today and really bring it on. I know that recently a lot of companies have had, for example, diversity and inclusion offices, but some of them have actually ended in the past two years because of budget cuts.

As we're implementing generative AI in a wider sense, the diversity and inclusion office should be even more critical and be part of the conversations as we are implementing this technology, because we're not only applying it to, for example, manufacturing, we're also applying it to hire people. We're applying it for customer success.

We're applying it to the employee experience as well. So how can we, as corporate citizens and corporate leaders, take on that responsibility to make sure that we're implementing AI not only in a responsible way in terms of privacy and security, but also in an equitable and inclusive way?

[00:16:20] Charita McClellan: Yeah, I think that's a really great point.

It needs to be top of mind for everyone. While I don't have the answer, I definitely think I have some ways that we can start implementing and doing that. Right? I always say that inclusivity has to be intentional, so until DEI is part of our DNA, we have to intentionally do stuff, whether it's building out these policies or processes. For a lot of processes, it's almost like a checklist.

There's things that you have to do, you have to check to get through, whether it's your due diligence process, your UIT process. And so these things need to be called out specifically. You know, have we looked at this from an inclusive gender perspective? Have we looked at it from an accessibility perspective?

Have we looked at it from a racial and ethnic perspective, a neurodivergent perspective? Just really going through and calling that out specifically. I also think it really plays a part in who's in the room. To your point, Monica, it should not be on the marginalized groups to always have to say, hey, what about us?

Hey, don't forget about us. Hey, that's looking kind of racist, right? The majority needs to step up and needs to be proactive there. And so I think that sometimes it's out of sight, out of mind. If I'm sitting in a room and everyone looks like me or everyone has the same background as me, I'm probably not thinking about anybody else.

But if somebody's sitting across from me or they're in a meeting with me, I'm like, oh shoot, we didn't think about A, B, C, 'cause I saw that person and that made me think of it. Right? And so if we have some more inclusive work groups and more inclusive leadership, if we start really hitting diversity from all angles, I think that we no longer will have that out of sight, out of mind issue. People who it may not directly impact will start thinking about the colleague that they can reach out and touch, or the friend that they see more, and think about them in the practice as they're developing these things.

[00:17:57] Megan Dibble: That's great. Really appreciate all the examples too, to put it into perspective for our listeners. So, yeah, I'd love to move on to what Alteryx is doing to combat biases when we are developing our products, when we are working with ai. 

[00:18:13] Monica Cisneros: Absolutely. Right. Before I touch on what we're doing exactly, I do want to make sure that the listeners understand that this has real world consequences, right?

If you are discriminating against somebody, the company or the institution is at risk. They're at risk under anti-discrimination laws, from any other potential lawsuits, and also from reputational damage. So this is not only like, hey, if we don't do this, somebody's gonna be mad at us. It's just kinda sad.

This can lead to real consequences, and I'm actually really glad that governments are acting right now: the EU AI Act just passed, and the AI Bill of Rights here in the United States has evolved too. That is something that I'm not only personally looking forward to, but I also feel like a lot of companies are looking forward to, because now we're starting to create a standard for what good looks like.

So building from that, Alteryx has had a hand in machine learning and AI for a long time, and we had our best practices. But recently we have officially come out with our responsible AI principles. I'm very, very proud of the team that worked on this. Our legal team, they're amazing. Our product team gave a lot of feedback.

Our policy team is coming with all of these policies and regulatory context, and we were able to put them onto actual paper, review them, have them scrutinized, and now we're able to put them on our website. So the six principles are trust and accountability, transparency and explainability, fairness and inclusivity, human agency and oversight, empower social good, and reliability and safety.

These are not coming from a vacuum. These are coming from industry leaders and governmental frameworks that exist elsewhere. So we're not just coming up with it. We actually did research, applied it, and put together the consensus that matches our intention as a company to go forward with those responsible principles.

But it doesn't really stop there, because putting responsible AI principles on a webpage can actually be really easy, right? You just come up with the terms and then you create a website and that's it. But big shout out to our legal team, because they're really keeping us accountable to have those principles in hand when we are developing and testing our software.

So we have an AI risk management process for analyzing the risk associated with our AI projects. And this AI risk assessment is for both products offered externally and any tools and data that we're using internally as well. Alteryx has joined consortiums like the NIST US AI Safety Institute.

Based on the AI risk assessment and our responsible AI principles, we're also looking at our product roadmap. We're analyzing all the gaps that we have and creating technical controls to address those gaps. We also have a guide for data licensing. Recently there has been a really big issue about intellectual property and copyright in training AI. So this is why part of our responsible AI principles and our responsible AI framework is that we have this guide for data licensing, basically to reduce that risk of IP and copyright infringement.

Now, with that all being said, the risk does not end. There's no such thing as a perfect system. There is no such thing as a risk-free AI implementation, but we do the best that we can and we make sure that we do it in the most detail-oriented, committed way possible because we truly believe in it.

[00:22:29] Megan Dibble: Yeah, and it sounds very intentional, and that's something Charita touched on before: being very intentional with the DE&I efforts, but also being very intentional with any sort of AI deployment. So I really appreciate that, and that makes me proud of how Alteryx is handling AI and how we're working through it, just like a lot of other companies are. I'm definitely curious to hear from listeners in the comments. I'd be really interested to hear how their companies are handling AI and what concerns they have.

I think it'd be great to keep the conversation going as well. So yeah, I appreciate all those points from you, Monica. We touched on this a little bit, but I'd love to dive in a little bit more on what we can do as professionals to avoid bias in our work. I was thinking about this on my end as someone who works on creating content and consumes a lot of content.

For me, staying up to date on industry and AI news is really big, because of reading stories and case studies. You both brought up a lot of examples, and I think just reading stories, opening your eyes to other stories, can really help with those unconscious biases. It can help you see other people, other people in the virtual room, I guess.

So that's been something that's been helpful for me. And taking a data science ethics course on Coursera was also really interesting. It just opens your mind to bias possibilities that maybe were unconscious, and it can help improve your critical thinking and reasoning when you hear those other stories.

So curious what you guys have to say on this one. 

[00:24:09] Charita McClellan: Yeah, I love to hear that you're taking those efforts. I'm a real big education person, so I tend to try to give the benefit of the doubt, and I always say we don't know what we don't know until we know. And so I think it's really important to first just recognize we all have biases, right?

I know a lot of times people like to say, well, I don't have a bias, or I don't see color, or I don't see gender, or whatever. Those things aren't true. We all have biases. It's the way we were made, and so I think it's really important to just first level set with that. In addition to that, let's try to work on those biases, right?

Most companies probably have, at this point, an unconscious bias training you have to take as part of compliance. So actually pay attention to that, to that training, but also be proactive, learn about something else. Learn about someone new. Read an article. To your point, take a course. You just gotta be open-minded.

'cause if you have a closed mind, you're not gonna be able to receive it. So I think it's really important to, one, recognize you have the biases. Two, try to learn about something else, or someone new, or a different region or different group of individuals. That'll help open our minds and maybe circumvent some of those biases.

I always think it's important to do a self-check, right? Take a moment before you submit that final strategy or assessment or whatever you're doing. Think about who isn't included. So I think a lot of times when we talk about DEI, we talk about inclusivity. We always say making sure everyone has a seat at the table.

Pay attention to who's there. I think it's even more important to pay attention to who's not there. It's also important to think about the words that we use, because our words have power. Is there a way I can say that differently? We're really trying to get to not using violent terminology in work.

So if there's something you're gonna try to do, you don't have to "take a shot at it"; you can "try to get that work done." Right? And so it's thinking about the terminology that we're utilizing. When we're talking from an AI perspective, I think it's really just integrating a self-check into that process, to make sure we're checking for any biases that are there.

It's taking a moment to see what groups of individuals may be affected or may be left out. And then I think it's important, as people are creating these systems and training them, to think about the data sets they use to help them train. Monica touched on in the beginning how these are data records. Machine learning and AI are looking to that and pulling information out of it.

Right? And if they look at historical information and that's all they see or look at, it really is gonna be geared towards and have biases towards certain demographics of individuals, and they're gonna assume that those demographics, or the individuals that fit those demographics, are the pristine or the elite or what we should be more skewed towards.

And so there are companies and websites like Surge AI that have all these human labeled data sets that you could feed into your software, feed into what you're developing to help train it. 'cause what you don't wanna do is have to sit there and make your employee type in hateful things to then make sure that the data set knows these hateful things I typed in are bad.

Right? We don't wanna put that on our employees either. So using those data sets to train the model, so the model is trained accurately and is aware of some things that may come across as being racist or homophobic or biased towards a particular gender, without putting that on our employees, I think is also really important.
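The self-check Charita describes, pausing to ask who is in the data and who is not, can be turned into a small routine that runs before training. The sketch below is one possible shape for it, under stated assumptions: the group labels, the 10% `min_share` cutoff, and the `underrepresented` and `missing_groups` helpers are all hypothetical, and what counts as adequate representation is a judgment call that code cannot make for you.

```python
from collections import Counter


def underrepresented(labels, min_share=0.10):
    """Return the share of each group that falls below min_share of the data."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        group: count / total
        for group, count in counts.items()
        if count / total < min_share
    }


def missing_groups(labels, expected_groups):
    """Return expected groups that do not appear in the data at all."""
    return sorted(set(expected_groups) - set(labels))


# Invented example: one demographic label per training record.
labels = ["a"] * 90 + ["b"] * 8 + ["c"] * 2
expected = ["a", "b", "c", "d"]

print("under-represented:", underrepresented(labels))
print("not in the data at all:", missing_groups(labels, expected))
```

The second check matters as much as the first: a group with zero records never shows up in a count, so it has to be compared against a list of who should be there.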


[00:27:14] Monica Cisneros: So one of the things that I share with people is to share the stories, document the harms, and demand that their dignity is a priority and not an afterthought. There are a lot of AI trends out there, so think twice. Make sure that it's part of your strategy, and make sure that it's well deployed. If we're doing this haphazardly, we might end up with those risks that we were talking about.

And support organizations that are putting pressure on policy makers and companies to prevent AI harms. This is one of the pieces where, as a corporate citizen, you are able to give feedback, right? Say, hey, I tested this system. I found X, Y, and Z. There is a very famous case out of New York where they deployed a chatbot and they found that it was basically doing harm.

Like, yes, you have to implement it responsibly, but it's also part of the evolution. We are in a nascent field, especially with generative ai, where we do not know all the gaps. And having a feedback loop with either the public or the users, the people using it to make sure that this system is not doing harm is part of the process of implementing ai.

Now, the people who are putting it out there have a limited scope. They are limited to their experiences and they're limited to what they know. This feedback allows them to grow and iterate on that technology. And that is also part of the data science lifecycle. We capture the data, clean the data, train the model, deploy the model, but then it doesn't stop there.

We have to monitor it, first of all. Second of all, we have to get feedback and iterate on it. So if we're doing it with traditional machine learning to identify semiconductors through computer vision, we can also do that with the chatbots or copilots that are coming out recently and getting implemented in a lot of other products and services out there.

Do not feel bad about talking about it. Report them. Make sure that the company knows that their chatbot has an offhand response. It's part of that iteration and feedback process that needs to happen for us to keep having safe systems.
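The monitor, collect feedback, iterate loop that Monica describes can be sketched as a small log that counts user reports by category and flags the ones that recur. This is a minimal illustration rather than a real monitoring product: the `FeedbackLog` class, the category names, and the threshold of three reports are all assumptions made for the example.

```python
from collections import Counter


class FeedbackLog:
    """Collects user reports about a deployed model and flags recurring issues."""

    def __init__(self, review_threshold=3):
        # Hypothetical cutoff: how many reports trigger a human review.
        self.review_threshold = review_threshold
        self.reports = []

    def report(self, category, detail):
        """Record one piece of user feedback."""
        self.reports.append((category, detail))

    def needs_review(self):
        """Return categories reported at least review_threshold times."""
        counts = Counter(category for category, _ in self.reports)
        return sorted(
            category
            for category, n in counts.items()
            if n >= self.review_threshold
        )


log = FeedbackLog(review_threshold=3)
log.report("harmful_response", "chatbot gave unsafe advice")
log.report("harmful_response", "chatbot repeated a stereotype")
log.report("typo", "misspelled product name")
log.report("harmful_response", "chatbot dismissed a complaint")

print("categories to review:", log.needs_review())
```

A count threshold is only one way to prioritize; a single report of serious harm may deserve review on its own, which is a policy decision that sits outside the code.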

[00:29:49] Megan Dibble: Yeah, I think those are really great points. It is a new field, and that's not a reason to stop or to be afraid or to not develop anything. But it is a reason to integrate even more feedback loops and to continue, because it's a continuous learning process.

And so I really appreciate both of you coming on today. This has been really great to hear from both of your perspectives, and a great episode for listeners to tune into as well.

So thank you so much for sharing. 

[00:30:19] Charita McClellan: Thank you. Thanks for having us. 

[00:30:21] Megan Dibble: Thank you for having us. Thanks for listening. To learn more about topics mentioned in this episode, including Alteryx's responsible AI principles, head over to our show notes. See you next time.

This episode was produced by Megan Dibble (@MeganDibble), Mike Cusic (@mikecusic), and Matt Rotundo (@AlteryxMatt). Special thanks to @andyuttley for the theme music track, and @mikecusic for our album artwork.