Sara Wachter-Boettcher: “Technically Wrong: Sexist Apps, Biased Algorithms […]” | Talks at Google

[MUSIC PLAYING] SPEAKER: Hi, everyone. Welcome to Talks at Google
from Cambridge, Massachusetts. And today, we’re
very happy to host Sara Wachter-Boettcher and her
new book, “Technically Wrong.” Sara’s been a web designer
and UX consultant. And so she’s in the industry,
but not necessarily completely of it. And I think this gives her
a great fresh perspective on some of the issues that face
our industry and society today. These are increasingly
familiar themes, but I think that’s
exactly as it should be, given how critically important
they are to our industry and, indeed, to the
world as a whole now. So, welcome, Sara. SARA WACHTER-BOETTCHER: Hello. Thank you, all, for coming. So I’d like to talk today,
first off, about something that is not in the book at all. And it’s not in the book
because it happened last week. And some of you might
be familiar with it. This is our friends,
the mini cupcakes. And I’d like to talk
about this, both because I’m here at Google,
and because I think this is a really good
example of some of the things that I think are not
being talked about enough. So mini cupcakes, as
I’m sure you all know, were the topic of discussion
because of a Google Maps update that went out to a
selection of iPhone users that started showing
the number of calories that Maps thought that you might
burn if you walked somewhere instead of taking
some other form of transit and also how
many mini cupcakes that might be in terms of
your calories burned. Now, almost immediately,
after this launched, some people started
talking about it. One of them was a woman
named Taylor Lorenz, who is a journalist. And she did not
like this update. So she had some comments. So one of the first
things she said was, oh, my god, if you
click on walking directions, it’s going to tell you
how much food this burns. She goes on to
talk about how there’s no way to turn this off. Then she says, “Do
they not realize how triggering this is for
people with eating disorders?” And this just generally
feels shamey and bad. She goes on to talk about
how calorie counting maybe isn’t even a good
thing, and there’s a lot of disagreement on that. Then she talks about how this is
even perpetuating diet culture. So on and on she goes
up until about 9:00 PM. This is about an hour
of tweets that she has sent walking through
all of the reasons that she thinks
this is a problem. She said there’s no
way to turn it off. This could be dangerous for
people with eating disorders, this feels shamey,
average calorie counts are wildly inaccurate, not
all calories are equal, a cupcake is not a
useful metric, what is a mini cupcake anyway,
whose mini cupcake are we talking about,
pink cupcakes are not a neutral choice, they have– not her exact wording,
but they have social and cultural
encoding to them that they would be perceived
as being more feminine, more white, more middle class– and that this
perpetuates diet culture. That is sort of a summary of her
hour of evaluating this product choice. And I look at
something like that, and I think, OK, it took her an
hour to document these flaws. It took three hours after she
started tweeting about it– and I’m sure not only because
she was tweeting about it– for the feature
to get shut down. And so I ask how much time
was invested in building that in the first place? I suspect more than an
hour, more than one person. And why, nowhere along the
line, did this come up? Or if it came up, why
wasn’t it addressed? Why didn’t people
take it seriously? Whoever brought it up,
if they brought it up, why was their voice not
valued in that discussion? Why did people not think
that this mattered? And this is just one
tiny thing, right? It’s a freaking cupcake. Except that it’s
not this tiny thing. What I think it is,
is it’s actually a perfect encapsulation
of some of what I see going wrong in all kinds
of tech companies, certainly at Google and elsewhere. All over, in almost every
single aspect of our lives, we have tech companies
that really end up thinking that they
understand people and making choices that make a
lot of assumptions about people that leave them too narrowly
focused on whatever they thought their goal was, and
leaving so many people out along the way. And you end up with this kind
of tech-knows-best paternalism where a tech
company is presuming to know what people want
and what people need. And that when you’re mapping
something, what you really need is calorie counts and
to know how much food that is because,
after all, let’s talk about obesity in
America or whatever. And so, what you
end up having is this really narrow understanding
of what normal people are. And I think that that goes
very deep in tech companies and the people who work there–
this idea that we understand what normal means,
what users want– you can see it in so
many different examples. And when you work on something
like this, all of your friends send you screenshots. So I got a lot of really, really
fun screenshots from people. One of them came from
my friend Dan [? Han. ?] This is an email that
he got from his scale, which is a thing that you
get from your scale now. And this email is saying
don’t be discouraged by last week’s results. We believe in you. Let’s set a weight
goal to help inspire you to shed those extra pounds. But you’ll notice this is not
addressed to Dan, my friend, this is addressed to Calvin. Calvin is his son,
who is a toddler. And every single week,
Calvin weighed more, right? Weird. And the scale didn’t
get that, like, just didn’t understand that that
actually could be a perfectly normal and natural thing. Because the only thing that the
product had been designed to do was to congratulate weight loss. And it’s really
funny when you get this message for your toddler
because your toddler isn’t internalizing this
particular email. But it gets
progressively less funny for a lot of other people. In fact, this was another
message on a push notification that he got. “Congratulations! You’ve hit a new low weight.” This one actually
was for Dan’s wife. And she received this right
after she had a baby, which, I mean, I guess it’s true. But that wasn’t something she
wanted to be congratulated on. And in fact, I know
a lot of people for whom a message like
this is not good at all– people who suffer from
eating disorders, for sure– I have a dear friend who
spent a long time in treatment for eating disorders. And this kind of
message, this is exactly what she does not need. She does not need
to be congratulated. But it’s also a
lot of other people who have chronic
illnesses and where their chronic illness might mean
that hitting a new low weight– it means they’re sick. It’s a sign that something is
going really wrong with them. And there are just
so many reasons that notifications
like this don’t work– that product
decisions like this don’t work for real people. But yet, so many
examples of people making product choices
that are like this, that are exclusionary
and that are alienating– for example, this is a message
that a friend of mine, Erin, got from Etsy. She has the Etsy app
installed on her phone. So she got this
push notification because, obviously, Etsy wants
to move some Valentine’s Day merchandise. It says, move over, Cupid. We’ve got what he wants. Shop Valentine’s
Day gifts for him. And Erin’s partner is a woman. And she looks at
that– and it’s just a stupid throwaway message. It’s just a little
marketing message. It’s just a little bit of copy. Except that she looks
at that, and she says this is really alienating. Did you not think that
you have gay customers? How did this not
cross your mind? And if this doesn’t
cross your mind on this little tiny
thing, where else has that not crossed your mind? And, is this a company I
really want to spend money at? We also can see these
kinds of failures to understand real
people in places that are more worrisome
and more problematic. So last year, JAMA
Internal Medicine released a study that showed
that the smartphone assistants from a bunch of
different manufacturers– not just Apple– but
we have Siri here, as well as from Google, from
Samsung, from Microsoft– that they weren’t programmed
to help during crisis. They didn’t understand a lot
of inquiries related to crisis, including a lot of things
related to things like rape, sexual assault, or
domestic violence like, my husband is hitting me. So I went in and I started
saying well, let me go in and see what I get. So I went in, and I
asked some questions, and I took some
screenshots of what I got. And what was really alarming
about it wasn’t just that Siri didn’t understand. Because I figure Siri is not
going to understand everything. Although, I think that it
could do better there– but also, Siri was responding
with jokes or little digs. Like, thinking it’s going
to be clever and funny, and “it’s not a problem.” “One can’t know
everything, can one?” And the real thing that
made me upset about this was that this
wasn’t exactly new. Because back in 2011,
when Siri was brand new, there was a whole spate
of bad press for Apple because people had found
that if you said things like you wanted
to shoot yourself, it would give you
directions to a gun store. There was another example where
somebody was saying something about jumping off of a
bridge, and it told them where the nearest bridge was. And Apple was, like, oh, whoops. We don’t want that to happen. And so they went,
and they fixed it. And they said, OK, well,
now if you say something that Siri identifies as
being potentially suicidal, what we’re going to do is
we’re going to surface up some information about the
National Suicide Prevention Lifeline. And so you can click
through to them so you can get help in crisis. So they did that in 2011. But I look at that,
and I wonder, well, OK, if you knew that in 2011, why,
five years later, has it not occurred to you to go
beyond that one thing and to make changes to
the product as a whole? Why is it still more
important to program jokes into the interface than to think
about other types of scenarios where somebody might
turn to the device? And in fact, in this
particular example, when I started
talking about this, I had some folks
say things to me like, well, you’re an idiot if
you use your phone after being sexually assaulted. You need to go to the police– or various other
statements like that. And I thought, you know what? I don’t actually
care what you think about what somebody
should do after they’ve been sexually assaulted. What I care about is the
fact that people are using their phones during this time. There is a quote from my friend
Karen McGrane, who is well known in user experience,
and she did a lot of work on mobile content. And when she started doing
that, she would say to people, you don’t get to decide
what device people use to access the internet. They do. And I think that it’s a
similar sentiment here. You don’t get to decide what
circumstances somebody is going to be in when they
use your technology. You don’t get to decide that
somebody should or shouldn’t use their smartphone assistant
when they’re in crisis. They’re going to decide that. The only power that we have in
tech is to figure out, well, how do we respond to that? Do we choose to
help them or not? Do we choose to
anticipate that or not? How do we deal with that? Because you can think
that it’s a bad idea all you want, but
people are doing it. And those people,
oftentimes, are kids or youth who are more comfortable talking
to somebody on the screen than they would be to go
to somebody in real life. And what do you want
to do about that? And so I look at
this, and I think, there’s been a
lot of opportunity to improve this beyond
just, let’s change one individual thing. But we haven’t really seen that. Instead, what we have is we
have Apple going in and saying, oh, yeah, yeah, yeah. We don’t want that to happen. So we’ll partner with the
Rape, Abuse & Incest National Network, RAINN. And we will do the same kind
of thing as we did before. We’ll have a little– if you have an issue
with sexual assault, you can go to the
lifeline there. And I think that
that’s a good change, but I don’t think
it’s enough, right? Because I don’t know how
many other types of scenarios Siri is not going to understand. And I don’t expect Siri to
be perfect by any means. But I do expect for
it to understand that it’s going to
have scenarios that are negative and straight-up
terrifying that people are going to use that
device for– and to be able to anticipate that and not
have these terrible breaches. But I find that over and
over, tech companies really aren’t thinking
enough about the ways that their design decisions
and tech decisions can break. Grief is a great
example of that. Tech is very good at focusing
on things like delight– tends to be really
bad at focusing on things like grief,
particularly because we spend a lot of time– I know on my end
of the tech world, where we talk a lot
about UX and content– we spend a lot of
time talking to people about things like personality. And we’ve got to make this
human, make it friendly. And that gets translated into,
let’s make this overly clever. And as a result, you
get things like this. We’ve got Timehop, which posted
to this person who was sharing about a memorial service. They resurfaced
that post and said, this is the really
long post that you wrote in the year of
2000-and-whatever. That’s snarky, rude. Or you’ve got the millions
of different places where you’ve got– “could a
gift say it better?” Could you be more clever? Or Medium– which, Medium has
removed this string, actually, from their system. But when you posted a
new story on Medium, they would send you
these little fun facts along with their update
on how your story is doing. And they’re trying to make
it seem like, it’s OK. Your story is just
picking up steam. But you don’t really
want a fun fact on a post that’s in
memory of a friend. And in fact, maybe the
worst break of this sort is one that some of you
have probably heard of. It happened to Eric Meyer. And he is a longtime
web developer who I’ve known over the years. And it was one of
the things that led us to write the book
called “Design for Real Life” together a couple of years ago. And what happened to him
was that he went to Facebook on Christmas Eve in 2014. And he was expecting to
see the normal family well-wishes, stuff like that. But what he found
instead was this. This is a Year in Review. And what Year in
Review did was surface your most popular posts,
images, videos from the year, and package them up in a nice,
curated little collection for you. And then they took
that, and they put it into their little
wrapper, right, with balloons, streamers,
people dancing. And it says, hey, Eric, here’s
what your year looked like. And in the center of that
was the most popular photo that he had posted all year. That photo is of his
daughter, Rebecca. And Rebecca had died
of an aggressive brain cancer on her sixth birthday. Of course, it was the
most popular photo he posted all year. It was also the worst
year of his life and the worst
moment of his life. And so Facebook had kind of
put that back in front of him in this peppy little package
and said, here you go– even though he’d been avoiding
this feature, even though he didn’t want to use it. And I’ll tell you, Eric
was gutted by this. It just hurt very badly. And he wrote this
blog post about it, and the blog post went viral. And all of a
sudden, it’s getting reposted to “Slate,” et
cetera, et cetera, et cetera. And Facebook reaches out. And the product
owner apologizes. I mean, they didn’t
want this to happen. Nobody there wanted
this to happen. So they apologized
and said they’re not going to do this again. And the next year, when
they did Year in Review, they changed the
design of that feature so it doesn’t take your content
and put it into a new context. But now, it’s almost
three years later. And in fact, Facebook is still
doing basically the same thing. So this is an example from
just a couple of weeks ago. Olivia Solon is a journalist
for “The Guardian.” And she had posted
a photo to Instagram that was a screenshot
of an email she received that was full of rape threats. She posted it to
Instagram because she wanted to show the kind of
abuse that women who are public online often get. It was also a highly
commented on photo on her Instagram account. Well, Facebook owns Instagram. Facebook would like
more Facebook users to use Instagram. So what Facebook does is, it
will take your Instagram posts, insert them into an
ad for Instagram, and then put that ad onto
your friends’ Facebook feeds. And so Facebook did
that with this image from her Instagram account. And so her friends started
receiving peppy ads for Instagram showcasing
a rape threat. Because you see, they
treated this problem with Year in Review as sort
of an isolated incident. They fixed it,
and they moved on. But they didn’t think about
the overall problem of changing the context of users’ content,
and how this could go wrong, and who this could hurt. I think this quote from Zeynep
Tufekci, digital sociologist, really sums it up
effectively here. She just started talking about
this the other day on Twitter. She said “Silicon
Valley is run by people who think that they’re
in the tech business, but they’re in the
people business. And they’re in way
over their heads.” And I think that this is true. I think that we have
spent a long time assuming that the most
important thing that we work on is technology. But in fact, we are affecting
people’s lives in dramatic ways that we could not have even
envisioned a couple of decades ago. And we haven’t really
caught up to that. We haven’t really
taken that to heart. And we can see that playing
out in so many ways, from all of these little
interface examples, to much, much deeper places. For example, we see that playing
out frequently in anything seemingly related to image
recognition and image filters. For example, you may have
seen this on Snapchat. Snapchat, last year,
released something that it called the anime filter. Only, it didn’t look like
any anime I have ever seen. What it looked more like was
something that you probably wouldn’t see today. That’s Mickey Rooney playing
I. Y. Yunioshi in “Breakfast at Tiffany’s.” That is Mickey
Rooney doing, what we would now call, yellowface. He’s dressed up as
an Asian person. And what that filter
did was basically the same– here’s your slanty
eyes and your buck teeth. And you know, race is not a
costume that you get to put on. And we probably
mostly know by now that you don’t dress
up as an Asian person to go to a Halloween party. And yet, nobody thought
that it might be a problem to dress somebody up
as an Asian person in their selfie on Snapchat. After this happened, Snapchat
wouldn’t apologize for it. And in fact, it wasn’t
even the first time they’d done something like this. Because just a few months
before that, on April 20, they released a filter
they called Bob Marley. And it gave people black
skin and dreadlocks. And again, it was
roundly criticized as being a digital blackface. But they wouldn’t actually
admit that this was a problem. And it’s certainly
not just Snapchat. Because just this year,
FaceApp made this splash for a little while
where you could make selfies that were a
younger version of yourself or an older version of
yourself or a hotter version of yourself. And for the hotness
filter, what people started realizing was that it
made their skin lighter. It changed their features
to look more European. And what’s
interesting about this is that FaceApp admitted
what had gone wrong. The CEO said, “We are deeply
sorry for this unquestionably serious issue. It’s an unfortunate
side-effect of the underlying neural network
caused by the training set bias, not intended behavior.” Now, I read that
and I think, OK. It’s an unfortunate side-effect. It’s not intended behavior. Actually, that’s not true. It’s definitely intended
behavior in the sense that you took a set of photos
to train the algorithm on what beauty was or what
hotness was, right? And those were photos
of white people. Like you’ve just said, you had
bias in that training data. So you fed the system a bunch
of photos of white people and said here is what
attractiveness looks like. And so the algorithm
worked as intended. It found patterns in those
pictures of white people, and it applied them
to these selfies. So the algorithm was
working as intended. You just fed it biased
data at the beginning. And it’s not an
unfortunate side-effect. It’s, in fact, entirely on
you and entirely something that you could have anticipated. But they apologized. They renamed it. They said they were
going to fix it. But just a little
later, this August, they released this new filter. This one, you could
literally try on races– black, Caucasian,
Asian, Indian– and so you could just select
them and show yourself as different races. And people were not
pleased with this. This was not a good idea. And immediately, they decided
they would take this down. They apologized and said, “The
new controversial features will be removed in
the next few hours.” And I read that, and I
think, controversial is not the right word for this. Controversial implies that
well, you’ve got to hear both sides– some disagreement. No. Calling it controversial–
what that does is that doesn’t
acknowledge history. It doesn’t acknowledge there
is a tremendous body of work– like, have you ever heard
of critical race theory? Maybe you should
before you start messing with races and filters. There’s an entire
body of work that’s looking at how race
functions in culture. And so it is not
enough to write it off as a controversial feature. It’s not like somebody
didn’t like your new logo, and people wrote some
Medium post about it. No. This is an actual problem,
and it is well-documented. And so it gets reduced
to just a disagreement. And the other thing about
this is that this is not news. FaceApp could have known better. Because not only
had Snapchat already had some similar issues, but
we can go back to the year before, to 2015, and we
can talk about something that happened here at Google,
which I’m sure most of you are familiar with, when
some photos, a whole series of photos of black
people got tagged by Google Photos
auto-tagging as gorillas. Now, the fact that they were
auto-tagged with something that is a racial slur is
probably the reason that this blew up really big. It’s definitely the reason
this blew up really big. This was all over the media. And that’s what
people talked about. But the thing about this example
that is much more important, I think, and that wasn’t
talked about nearly as much, was why this happened. And so I dug around
at this example, and I found what
Yonatan Zunger had said to Jacky Alcine, who was
the guy that this happened to, after the fact. He seems like a wonderful guy. He seems to be trying very
hard to get this stuff right. And yet, you take a look
at what he wrote here. He said, “We’re working on
longer-term fixes around both linguistics– words to be
careful about in photos of people”– OK. So we’re going to
be more careful applying tags that could
have questionable context– but also “image
recognition, itself– e.g., better recognition
of dark-skinned faces.” And what that is
acknowledging right there is that that product
went to market not as good at identifying
dark-skinned faces as it was at identifying
white people. So the specifics
of that missed tag happened to look really bad. But the underlying
issue was that it was more likely to
mis-tag people of color. And I think, well, how did
you get a product to market that wasn’t that good at
identifying dark-skinned faces? Well, that’s not just Google. Because failing to design
for black people– that’s not new. Back in the ’50s,
Kodak first started allowing people who
were not in Kodak labs to process their film. So you could go
to a mom and pop’s to get your film developed. And so what they
did is they started sending these little packets
to the photo lab technicians to use to calibrate
skin tones and light. And they called
them “Shirley” cards named for the first woman
who ever sat for them who was a Kodak employee. And for decades, they
sent these cards out. And they were always the same. The styles would
change, the woman sitting for them would
change, but they were always called “Shirley” cards. They always said
“normal” on them, and they always
showed a white person. And so for decades,
this went on. And a black photographer
wrote this piece on BuzzFeed where she talked about this. And she talked about
her experience feeling like film was never developing
correctly for her skin tone. And she said, “With a white
body as a light meter, all other skin tones become
deviations from the norm.” It turned out that film’s
failure to capture dark skin is not a technical issue. It’s a choice. And in fact, a researcher
went and talked to a bunch of people who
used to work at Kodak back in the ’70s when this
started to change. And what she learned was
that it wasn’t that they decided to be more inclusive. It was that furniture makers
and chocolate manufacturers were complaining. And they were complaining
because this film was not properly showing the difference
in different wood grains and the difference in
different chocolate varieties, like milk versus dark. And that is actually what
led to that product shift. And so here we are, again,
over and over again, seeing tech companies re-enact
these kinds of choices. It’s not a technical issue. It’s a choice. And when I look at that, I
have no choice but to say, this is literally what it means
when you say white supremacy. And people don’t want to
hear those words, oftentimes. Because when you
say white supremacy, what people want
to be talking about is scary guys with swastikas. And in fact, those
are terrifying people. But when you talk about white
supremacy, what you’re really talking about is simply
putting whiteness first. And so, if you
decide you’re going to make a product that works
better for white people, if you decide that you’re
not going to include people of color in your
training data, you have literally decided
that white people are more important. You have literally
enacted white supremacy. That is uncomfortable. That is not a
conversation most of us want to get up in
the morning and have. But the thing is, that’s
what we do in tech. That is what we enact
every single time we don’t specifically work against it. We embed it because it’s
already present in our culture– not because tech created it,
but because it already exists. And so we just re-enact
it over and over again, unless we question it. And the thing is, we do all
of that while insisting, over and over, that
no, no, no, no, no, no. We’re not racist, though. We want to include everybody. And we talk about
tech as if it’s some kind of utopian
society that’s possible. But we’re not necessarily
doing that difficult work of looking at the
ways that we continue to center white people. And that’s, I think,
bad enough when you’re talking about
something like photo tagging. But it, of course, gets
even more worrisome the more that tech embeds
into other aspects of life. For example, some
of you may have heard about some
software called COMPAS. COMPAS stands for Correctional
Offender Management Profiling for Alternative Sanctions. It is made by a company
called Northpointe. And it is being used in
courts around the country to decide how risky somebody
is to commit a future crime. So it provides criminal
recidivism scoring. And last year, ProPublica
did a big investigation into this software. And what they found was that it
had some real biases embedded within it. So for example, you
have these two men here. On your left, you’ve
got Bernard Parker. In January of 2013,
he was arrested in Broward County, Florida,
for marijuana possession. And then, on your right,
you have Dylan Fugett. And in the same year,
one month later, in the same place,
Broward County, Florida, he was arrested for
possession of cocaine. Now, both of these men
had a prior record. Bernard had resisted
arrest without violence, and Dylan had
attempted burglary. But according to
COMPAS, these men did not have a similar
profile at all. Because Bernard
was labeled a 10– the highest risk there
is for recidivism. And Dylan was labeled a 3. And in fact, Dylan
happened to go on to be arrested three more
times on drug charges. Bernard was not
arrested again at all. And what ProPublica
found was that that story was playing out over and
over again with the software. So the software was particularly
likely to falsely flag black defendants as
likely to re-offend. So 45% of black defendants who
were labeled high-risk did not re-offend, versus
only 24% of white defendants. Meanwhile, the opposite
was true for low-risk. 48% of white defendants
labeled low-risk did re-offend, versus 28% of black
defendants re-offended. So what you have
is a system where, over and over again, when
the system gets it wrong, the people it is wrong for, the
people who are harmed by that, are much more likely
to be black people. Now, after ProPublica
did this research, some other researchers
from Stanford started digging into
this more closely. And they looked at what
Northpointe, the company who makes the software,
said about it, versus what ProPublica said. And they found an
underlying problem. And the problem wasn’t
a technical glitch. The problem was different ideas
about what fairness means. So at ProPublica, they
said this is not fair, because you have a group that
is disproportionately harmed by inaccurate predictions. What Northpointe said
was that their algorithm had been tuned to parity of
accuracy at each score level. Meaning that, if you
scored a 7, whether you were black or white, you were
roughly the same likelihood of committing a future crime. So 60% of white people and 61%
of black people who scored a 7 would go on to commit
a future crime. And they said that was
equality, see, because it was equal across races. But the problem is– the researchers at Stanford
who looked into this further– they said, well,
you can’t have both. You cannot define fairness
that way and fairness this way. And the reason
you can’t do that, they said, is that it’s mathematically
impossible, because you have different
base arrest rates across races. So then you have to
talk about, well, why do we have different
baseline arrest rates? Well, we can look
at a lot of reasons why the incarceration of
black people in this country is tremendously higher
than white people. I will not go into
all of them today, but we can talk about things
like the different application of drug laws, of crack
cocaine versus regular cocaine for decades. We can talk about the ways
that black communities tend to be policed more
than white communities even though there
is a statistically similar likelihood of
crimes being committed in both of those places. We can talk about a whole lot
of different historical factors that might have led to this. But the other thing
that we can talk about is what went into the
scores that wasn’t just past criminal profile. Because if you remember, those
two examples we talked about, Dylan and Bernard,
they are really similar criminal profiles. But COMPAS doesn’t
just care about that. COMPAS cares about a
lot of other factors. I think there’s 137
different factors. Is there a crime in
your neighborhood? Is it easy to get drugs
in your neighborhood? Was your father ever arrested? Were you ever
suspended or expelled? All of these questions
go in to factoring what your score is going to be. And the thing is, these
questions are not neutral. These questions are very, very
much tied to race and class in this country. If you’re black in America,
you were more likely
to grow up poor. You probably lived
in a neighborhood that had more crime. You probably lived
in a neighborhood where you had to
move around a lot. If you were black in
the United States, and incarceration rates
are what they are, the likelihood that somebody in
your family has been arrested is way higher. And if you think about
it, the only reason that you build software like
COMPAS in the first place is because you think that there
is some kind of human bias that you’re trying
to get rid of. Because you don’t want to just
leave it to individual judges to make these decisions. Otherwise, why would
you build that software? So the software is meant
to make it less biased, make it fairer for people. And yet, we’re not questioning
the underlying information going into this model and
saying, well, wait a second. What was the historical
context that led to this? Do we want to consider
things like whether you’ve been suspended from
school when we also know that black
kids are much more likely to be suspended from
school for the same infractions as white kids? Are we going to ask
those questions or not? And I think that this is
the extreme example of what happens when we assume that
something that is technical is also neutral. And I think we do
this all the time because we’re so
used to thinking about things as technical
problems to be solved. But of course, this
is not neutral at all. The information that
went into that algorithm is anything but neutral, and
it needs to be interrogated. And it needs to be
interrogated at a level that is so much deeper than what
most tech companies are prepared to do right now. And so what ends
up happening is we have algorithms that
don’t eliminate bias, but they just outsource it. And by that, I mean you can
make it the machine’s problem. And so you, as a human
person, don’t have to be responsible for the bias. And we all feel better
about how unbiased we are because we’re not
making these biased decisions and not leaving it
up to racist judges. And we’re letting the
machine sort it out. And out of that machine,
comes some nice, clean, little numbers, right– 3, 7, 10, charts, graphs– It seems so obvious, so clear. And in fact, I talk
to designers a lot, and I work with designers a lot. And design is intended
to make these things seem like facts, like truth. Because we spend all of this
time trying to make software easy to use. I mean, that’s most of what
I’ve spent my life doing, is trying to make things
easy to use for people. And that’s important. Nobody is saying don’t
make usable software. However, when you
say things like, “COMPAS is designed
to be user-friendly, even for those with
limited computer experience and education,” what you end up
doing is you end up making it feel inevitable,
truthful, factual, so seamless and easy to use. You reduce really
complicated stuff down to things that feel
very palatable to people. And we hear this
over and over again across all kinds of different
tech companies, right? We want things to
be easy to use. We want things to seem right. That’s certainly true at Google. Miriam Sweeney, who is a
library and information sciences professor at the University of
Alabama, she wrote about this. She said, “The
simple, sparse design works to obscure the
complexity of the interface, making the result appear purely
scientific and data-driven. The insistence of
scientific ‘truth’ about algorithmic search
has encouraged users to view search as an objective
and neutral experience.” Google explicitly wants people
to think of search this way. Because they want it
to be easy to use. And they want people to trust the
results that they’re getting. And that’s not
inherently bad, to want people to trust the results
that they’re getting. However, we have
to think about what are the ramifications of that? How do people actually
interpret that? What we have, over
and over again, is technology that’s not asking
some deep questions about where data comes from or what the
history is of a subject. And design that is
making those things feel neutral and feel factual. And so we recreate these toxic
patterns over and over again. And I think we talk
a lot about the ways that culture informs technology. I mean, that’s kind
of a given, right? It’s not as if racism
was created in tech. Racism existed. And what we have is tech that
can end up reenacting it. But what I don’t think
we talk enough about is the way that tech
also informs culture. The work that we do in
technology is powerful. It impacts people’s
lives in almost every way you can imagine. It is embedded into almost
everything you can imagine. It is often the first thing that
people look at in the morning– the last thing they
look at at night. And so when tech
perpetuates bias, even at those tiny
little levels, even at the mini cupcake levels– that is a problem. Because it’s indicative
of an industry that, over and over, thinks
narrowly about people, assumes that its ideas are
going to work for everyone, assumes that the
decisions it makes don’t carry that much weight,
that they’re not that important and that it assumes
it knows what’s best for people without
understanding people very well. I think that tech is too
important to our lives– to our personal lives,
our social lives, our emotional lives,
our political lives– to keep toying
with people’s lives without really
deeply considering the consequences of that. And so oftentimes,
the examples I’ve shown you today, the
companies behind them have treated them as if
they are software bugs. You squash it down,
and you move on. The problems that we are facing
are not just software bugs. What they are, are systems. They are systemic patterns, and
so they need systemic action. And that is a big
and difficult job. And that, actually,
is why I was so upset when I read the infamous memo. I was pretty upset about the
tenuous grasp of research in that memo. I was pretty upset about the
way that tiny little differences in studies about gender
were extrapolated into broad sweeping statements
about women in technology. But when I got to the section
where James Damore wrote that Google really needed to
“de-emphasize empathy,” and said that “being emotionally
unengaged helps us better reason about the facts.” I thought, you know,
you have misunderstood the entire project here. Because that’s actually
the opposite of what needs to happen. At a time when tech
companies are deciding things like what we see
during an election, or building software that
determines whether or not you’re going to get a job or
whether you can get a loan– at a time when tech
companies are increasingly manipulating relationships
and emotions, and the person who designed
the Like button on Facebook has actually said
they’ve rid technology from their lives and
their children’s lives because they’ve realized
that they actually don’t want that level of manipulation– this is not the answer. Because when we start
saying that we’re going to de-emphasize
empathy, and we’re going to focus on reason,
facts, we don’t ever answer some important
questions, like whose job is it to decide what “fair” means? Do you want some random engineer
to decide what fair means? That is a big cultural question. And it needs big
discussions around it. Whose job is it to understand
historical context? Whose job is it to know the
history of race and policing in this country before
making decisions that could affect people? Whose job is it to
sit down and think through what the potential
unintended consequences of a design decision might be? Usually, that’s not anybody’s
job, but it needs to be. And I will say I was really
pleased to read this interview with Fei-Fei Li, who– is from the Stanford Vision
Lab, and she’s on sabbatical with Google Cloud right now
working as a chief scientist. And she has a tremendous
amount of experience with image processing. And she’s talking about this. She’s saying, you know, “AI is
very task-focused” right now. “It lacks contextual awareness. It lacks the kind of flexible
learning that humans have.” And so we want to
make technology that makes people’s lives
better and our world safer. And that takes a “layer of
human-level communication and collaboration”– that that has been
what is missing. But it’s going to take a
lot of work to get there. It’s going to take
a lot of people changing the way that
they approach problems and changing the way
that they approach products to be able to
make that kind of shift. So what I will leave
with today is something I am personally excited
about, but that I recognize is difficult when
you work in a tech company. And that is increasing
pressure and backlash that we are starting
to see in big tech. You can see across the board,
from the truly terrible year Uber has had, to the
Twitter boycott that happened just a few weeks
ago, to the ongoing questions around Facebook’s
role in the election and what ads were sold
to which Russians, to whatever happened with
the Google memo here. We are seeing this backlash
starting to spring up. And it can be difficult,
and it can be uncomfortable. But this is exactly
the kind of thing that we need to have
happen and that I actually hope more people start
pushing back against. I hope that this is a
broader conversation that makes its way well
outside of tech circles. Because as much as I
enjoy coming and talking with technical
audiences, I really think one of the
most important things is that we make technology, and
pushback against technology, and the overreach
of tech companies accessible to everyday people,
to people who don’t necessarily have the insider knowledge. Because I really think
that it is only then, it is only when we take the
concerns of everyday people, all people, even
people who don’t like mini cupcakes into
consideration, that we’re actually able to make
these kinds of changes possible and that we can make a
technology industry that is going to be
sustainable for the world and for individual
people in the long-run. So thank you so much
for having me today. I really appreciate it. [APPLAUSE] SPEAKER: Thank you
very much, Sara. Questions? AUDIENCE: The cupcake
thing is sort of a little– it’s frustrating. Because you can
imagine that we could have taken a different path, and
that would not have happened. It’s also sort of maybe
easier because you can imagine what
that path would be, and maybe we could follow
different paths in the future. There’s something I’d love to
hear your thoughts on that I think is, maybe fundamentally,
from my perspective, seems to be a harder problem. I’m not sure if you
were aware of this– this was about a year ago. There was an issue
where people noticed that if you searched in Google
for unprofessional hairstyles, you got pictures that
were overwhelmingly of women and people that
were overwhelmingly of color. And my simplistic
understanding of this– I don’t work on that
product, I should say– is that that’s
actually a reflection of how people tag images. It is a reflection
of people having sites about what to do and not
to do in the workplace that use those pictures as
examples of what not to do. And so an algorithm that is– what is the right
word for this– really trying not to be biased,
but the algorithm winds up reflecting a larger
bias in society. And I think we’ve seen
a number of instances of this kind of problem. We had a problem a few
years ago in Google Maps, where the White House would get
labeled with a racial epithet because a lot of
racist people were tagging it that way in Maps. And we were using that data. And I was just wondering if
you have any thoughts on what to do about that one. SARA WACHTER-BOETTCHER: Yeah. So I think that this is
legitimately difficult, right? But here’s the thing. I think that one of the reasons
this is particularly difficult is that tech
companies have spent a lot of time
focusing on what they do as being somehow neutral
and a lot of time saying free speech, free speech. And so it’s kind of
abdicated responsibility. And now, we’re looking at
it, and going, oh, shit. What have we wrought? And so I think part of
the reason it’s hard is that we should have taken
it seriously a long time ago. Are you familiar
with Safiya Noble? I don’t know if any of you are. Safiya Noble is an
information studies scholar, and she actually talks
a lot about that kind of algorithmic bias in
things like search results. And so specifically,
things like, if you– so unprofessional
hairstyles– that makes sense. Also things like, if you
Google the words “black girls,” you will get typically
explicit results. And there’s lots
and lots of ways in which that mirror that’s
being reflected back to us doesn’t look great. So I think what you have
to start thinking through, though, is, as a company, as
an industry, as a culture, what is the role that we want
something like search to take? Is it a mirror? And then, well, what
do we do about the fact that it’s never just a mirror. It’s also a magnifier, right? Because that’s the fact. And that’s what we have to
acknowledge and really, deeply internalize, is
that it’s not just that you’re reflecting back to
society what people have said. But you’re making that codified. Cathy O’Neil who’s the author of
“Weapons of Math Destruction,” she talks a lot
about how big data doesn’t create the
future; it codifies the past. So you’re basically taking
that historical information, and you’re making it seem
more objectively true. And so I think that that
is really the question that you have to be asking. Not just, are we comfortable
reflecting this back, but are we comfortable
magnifying this, and normalizing this,
and codifying this? I think that if we
ask those questions, I think we will come
out with answers that are going to feel more ethically
sound and something that we can really stick by and
something that can be a better guidance for us. I don’t think it makes
that discussion easy. I think that’s still a
really difficult question. So then you have to say,
well, where’s the line? And you’re in the
sea of grey area. But if you’re not– and that’s exactly why
this is a people business. If you’re not going to have
that kind of complexity of conversation
about what you want to see in society, what
we think is appropriate, what we think is fair– if you’re not going to
have those conversations, then you shouldn’t be playing in
the space in the first place. Hope that’s helpful. AUDIENCE: Hi. So first of all,
thank you for being so direct and strongly-worded,
and raising your [INAUDIBLE] problem. I normally would not use the
word “decided” if somebody was ignorant of the issue. I chose that word to
mean that it was an informed decision of some kind. So they knew they
were at a junction. This definitely is some form
of neglect or malpractice. And when I took
professional ethics, I was told that
what distinguished professional ethics from the
rest of other kinds of ethics is that, if you’re
in a profession, your client doesn’t know enough
to double check your work. If you’re an architect or a
lawyer, almost by definition, your client just has to
trust you to do it right. And I think it’s appropriate
that these companies should know these pitfalls and should
be systemizing solutions to them. But I don’t know
how much you know about the internal structure
of different companies, but where exactly would you
pin that responsibility? Is that engineering malpractice? Is that design malpractice? Is that legal malpractice? Who should be the
professional that is aware of all the
implications here? SARA WACHTER-BOETTCHER: Well,
so I think, realistically,
every potential implication, every potential outcome, right? That is not realistic. I think that it is
incumbent upon people in every discipline,
though, to be more aware. And I also think that it is
incumbent upon tech companies to look at what they’re doing– when you acknowledge that what
you’re doing is not just tech, then you realize
that you actually need expertise in these areas. That, of course, you’re
going to do a bad job seeing some of this stuff
because you don’t actually have the background to see it. You should probably
bring somebody in who has a deep knowledge
of history and race in this country to have any
influence over any product that could impact people
of different races– which is probably
all of your products. Somebody has to actually
know something in order to find these problems. And if you’re not
hiring for that, then you’re going to
continue having these gaps. I do think that everybody can do
better across every discipline. I do think it is the
responsibility of people in design, it is
the responsibility of people in development, it
is the responsibility of people in legal, it’s everywhere. But the thing is, if you’re
going to say it’s everybody’s responsibility, it’s
very easy for that to become nobody’s job, right? And so I think that,
fundamentally though, that’s because
it’s not just about the individual culpability. It is about priorities
as an organization. If this is a priority
for your organization, then it’s going to become part
of your organizational culture. It is going to be
systematized, and it is going to be present in
every step along the way. It isn’t going to be done
in this sort of ad hoc way where, you, individual
engineer, needed to have noticed this thing. I think assigning blame
to individual people is actually not very helpful. I think it’s much more helpful
to say why does this happen? That’s why I don’t really
care about the bug fix. I care about what
kinds of patterns are emerging over
and over again? And how does the way
that you are organized– what does a project
genesis look like, and how does that get
translated into a team that’s working on it? Who is saying yea
or nay to decisions? Who thought it was
a terrible idea, but was uncomfortable speaking
up, and why is that happening? Did you have enough
diverse people in the room? Were there a bunch of
women in the room who decided the mini cupcakes
should be there or not? And you could say that
about many, many groups. But I think that you have to
look at it at that macro level. Because I think
when we try to talk about it at that individual
level, then, of course, people miss stuff. Everybody misses stuff. I think we have to look at
the pattern of missing stuff and say how do we tackle
that organizationally? AUDIENCE: You were talking a lot
about how these sorts of checks need to be systematized. And I’m just
wondering if you know of any examples of
organizations which have decision-making
structures which support this. How can we foster an environment
where we are in a meeting, and the person that thinks
maybe this cupcake idea is bad, feels empowered to speak up? What sorts of systemic things
can we do to be better at that? SARA WACHTER-BOETTCHER:
So this is hard. I don’t have any examples
off the top of my head, where I’m like, just look at
what this corporation does. Because I think that
this is not something that most companies
are doing that well at. But I would say that
part of it is, OK, so you do need to have a
diverse team working on things. But you also have to
have a diverse team that feels like it can speak up. And what that means
is that you have to have people at different
levels who are diverse. I mean, this is the thing– if you want to fix
this problem, you’ve got to go back to
root cause, and that’s a really hard problem. But you have to have
people who aren’t just at a junior level who might
have different perspectives. So if you have all of the
senior people in the room– So for example, the
other day, the chief of diversity inclusion at
Apple had this comment– it was kind of an
offhand comment, and it totally blew
up– where she said, something like,
well, you know, you could have 12 blond-haired,
blue-eyed men in the room, and that would have its
own kind of diversity. And what that indicates to me,
in a company like Apple that is actively trying to
recruit more diversely, is that they still
have this perception that you can have that group
of people come into the room, and that that’s still OK. That’s not OK. Because I’m going to tell
you, those 12 people– they are going to
miss a lot of stuff. So you have to,
both, acknowledge that you need that
diversity there, and that everybody there has to
be able to speak up to those 12 blond-haired, blue-eyed
white men in the room who all come from the same
background in order for it to work. Anyway, I think that
is a key piece of it. And then it’s also–
we have processes, we have systems that we
use for all kinds of stuff. Why is there no
system in place that’s checks and balances
on assumptions? Has anybody ever sat
down and documented all of the technical assumptions
that they are making? That’s pretty common. Have you ever sat
down and documented all of the social assumptions
that you’re making or human assumptions
that you’re making? I have very rarely seen a team
actually sit down and do that, and then think about
well, OK, what’s the worst that could happen? How could this go totally
wrong, and how are we going to ward against that? I just don’t think that
they’re being trained to do that kind of thing. So yeah, I think that you have
to go pretty deep to start solving the problem. It’s not going to
be fixed by just, let’s hire some more junior
people in diverse roles, and it will not be fixed by just
kind of encouraging people to speak up. It will be fixed by big,
deep changes in the way that we structure teams,
the way that we promote, the way that we
operate as companies. Which makes it really
tough, makes it painful, makes it feel like it’s
hard to get anywhere. But I think that’s
really the only way. SPEAKER: Let’s thank Sara again. [APPLAUSE]
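The “mathematically impossible” trade-off described in the talk, that a risk score cannot be equally predictive across groups and produce equal false positive rates when the groups have different base rates, can be illustrated with a small numeric sketch. This follows Chouldechova’s published analysis of COMPAS; the specific base rates and error rates below are hypothetical numbers chosen only for illustration, not figures from the talk.

```python
# Sketch of the fairness impossibility result (Chouldechova, 2017):
# if a risk score has the same positive predictive value (PPV) and the
# same false negative rate (FNR) for two groups, but the groups have
# different base rates of rearrest, their false positive rates (FPR)
# cannot also be equal. All numbers here are hypothetical.

def false_positive_rate(base_rate, ppv, fnr):
    """FPR implied by the base rate, PPV, and FNR, via the identity
    FPR = base_rate / (1 - base_rate) * (1 - PPV) / PPV * (1 - FNR)."""
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Same predictive value (0.61, echoing the 60%/61% figure in the talk)
# and same miss rate for both groups, but different base rates:
ppv, fnr = 0.61, 0.35
fpr_low  = false_positive_rate(base_rate=0.3, ppv=ppv, fnr=fnr)
fpr_high = false_positive_rate(base_rate=0.5, ppv=ppv, fnr=fnr)

print(f"FPR at 30% base rate: {fpr_low:.3f}")
print(f"FPR at 50% base rate: {fpr_high:.3f}")
# The group with the higher base rate is falsely flagged more often,
# even though the score is "equally accurate" for both groups.
```

The point of the sketch is the one Sara makes: with unequal base arrest rates, you have to choose which definition of fairness the system will satisfy, and that is a human judgment, not a property the math settles for you.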
