
Reading For Week 10

The document discusses a new AI tool created by Jigsaw that can analyze online comments and rank them based on attributes like nuance, reasoning, and personal stories. This could allow platforms to elevate positive comments instead of just the most engaging ones. The tool is intended to create healthier online discussions and counteract issues like toxic content and polarization.


The AI That Could Heal a Divided Internet

BY BILLY PERRIGO

In the 1990s and early 2000s, technologists made the world a grand promise: new communications
technologies would strengthen democracy, undermine authoritarianism, and lead to a new era of
human flourishing. But today, few people would agree that the internet has lived up to that lofty
goal.

Today, on social media platforms, content tends to be ranked by how much engagement it
receives. Over the last two decades, politics, the media, and culture have all been reshaped to meet
a single, overriding incentive: posts that provoke an emotional response often rise to the top.

Efforts to improve the health of online spaces have long focused on content moderation, the
practice of detecting and removing bad content. Tech companies hired workers and built AI to
identify hate speech, incitement to violence, and harassment. That worked imperfectly, but it
stopped the worst toxicity from flooding our feeds.

There was one problem: while these AIs helped remove the bad, they didn’t elevate the good. “Do
you see an internet that is working, where we are having conversations that are healthy or
productive?” asks Yasmin Green, the CEO of Google’s Jigsaw unit, which was founded in 2010
with a remit to address threats to open societies. “No. You see an internet that is driving us further
and further apart.”

What if there were another way?

Jigsaw believes it has found one. On Monday, the Google subsidiary revealed a new set of AI
tools, or classifiers, that can score posts based on the likelihood that they contain good content: Is
a post nuanced? Does it contain evidence-based reasoning? Does it share a personal story, or foster
human compassion? By returning a numerical score (from 0 to 1) representing the likelihood of a
post containing each of those virtues and others, these new AI tools could allow the designers of
online spaces to rank posts in a new way. Instead of posts that receive the most likes or comments
rising to the top, platforms could—in an effort to foster a better community—choose to put the
most nuanced comments, or the most compassionate ones, first.
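
The mechanic is easy to picture in code. Below is a minimal sketch, not Jigsaw's actual interface: `score_nuance` is a hypothetical stand-in for a classifier that returns a 0-to-1 probability, and the new ranking is simply a sort on that score instead of on likes.

```python
from typing import Callable

def score_nuance(comment: str) -> float:
    # Placeholder heuristic so the sketch runs end to end; a real
    # classifier would return a model's 0-to-1 probability of nuance.
    return min(len(comment) / 500, 1.0)

def rank_by_attribute(comments: list[str],
                      scorer: Callable[[str], float]) -> list[str]:
    # Highest-scoring comments first, replacing like-count ordering.
    return sorted(comments, key=scorer, reverse=True)

comments = [
    "Great movie.",
    "School of Rock stayed with me because the ending reframes failure...",
]
print(rank_by_attribute(comments, score_nuance))
```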

The breakthrough was made possible by recent advances in large language models (LLMs), the
type of AI that underpins chatbots like ChatGPT. In the past, even training an AI to detect simple
forms of toxicity, like whether a post was racist, required millions of labeled examples. Those
older forms of AI were often brittle and ineffectual, not to mention expensive to develop. But the
new generation of LLMs can identify even complex linguistic concepts out of the box, and
calibrating them to perform specific tasks is far cheaper than it used to be. Jigsaw’s new classifiers
can identify “attributes” like whether a post contains a personal story, curiosity, nuance,
compassion, reasoning, affinity, or respect. “It's starting to become feasible to talk about
something like building a classifier for compassion, or curiosity, or nuance,” says Jonathan Stray,
a senior scientist at the Berkeley Center for Human-Compatible AI. “These fuzzy, contextual,
know-it-when-I-see-it kind of concepts—we're getting much better at detecting those.”
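
In code, the difference is that "out of the box" detection looks less like training a model and more like asking one. The sketch below assumes a hypothetical `call_llm` helper standing in for whatever chat-completion client you use; the prompt wording is illustrative, not Jigsaw's.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat-completion API call; returns a
    # canned answer here so the sketch runs without network access.
    return "0.5"

def score_compassion(comment: str) -> float:
    # Zero-shot scoring: no labeled training data, just an instruction.
    prompt = (
        "On a scale from 0 to 1, how strongly does this comment express "
        "compassion? Reply with only the number.\n\n"
        f"Comment: {comment}"
    )
    try:
        return max(0.0, min(1.0, float(call_llm(prompt))))
    except ValueError:
        return 0.0  # treat an unparseable reply as "no signal"

print(score_compassion("I'm so sorry you went through that."))
```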

This new ability could be a watershed for the internet. Green, and a growing chorus of academics
who study the effects of social media on public discourse, argue that content moderation is
“necessary but not sufficient” to make the internet a better place. Finding a way to boost positive
content, they say, could have cascading positive effects both at the personal level—our
relationships with each other—and at the scale of society. “By changing the way that content
is ranked, if you can do it in a broad enough way, you might be able to change the media
economics of the entire system,” says Stray, who did not work on the Jigsaw project. “If enough of
the algorithmic distribution channels disfavored divisive rhetoric, it just wouldn’t be worth it to
produce it any more.”

One morning in late March, Tin Acosta joins a video call from Jigsaw’s offices in New York City.
On the conference room wall behind her, there is a large photograph from the 2003 Rose
Revolution in Georgia, when peaceful protestors toppled the country’s Soviet-era government.
Other rooms have similar photos of people in Syria, Iran, Cuba, and North Korea “using tech and
their voices to secure their freedom,” Jigsaw’s press officer, who is also in the room, tells me. The
photos are intended as a reminder of Jigsaw’s mission to use technology as a force for good, and
its duty to serve people in both democracies and repressive societies.

On her laptop, Acosta fires up a demonstration of Jigsaw’s new classifiers. Using a database of
380 comments from a recent Reddit thread, the Jigsaw senior product manager begins to
demonstrate how ranking the posts using different classifiers would change the sorts of comments
that rise to the top. The thread’s original poster had asked for life-affirming movie
recommendations. Sorted by the default ranking on Reddit—posts that have received the most
upvotes—the top comments are short, and contain little beyond the titles of popular movies. Then
Acosta clicks a drop-down menu, and selects Jigsaw’s reasoning classifier. The posts reshuffle.
Now, the top comments are more detailed. “You start to see people being really thoughtful about
their responses,” Acosta says. “Here’s somebody talking about School of Rock—not just the
content of the plot, but also the ways in which the movie has changed his life and made him fall in
love with music.” (TIME agreed not to quote directly from the comments, which Jigsaw said were
used for demonstrative purposes only and had not been used to train its AI models.)

Acosta chooses another classifier, one of her favorites: whether a post contains a personal story.
The top comment is now from a user describing how, under both a heavy blanket and the influence
of drugs, they had ugly-cried so hard at Ke Huy Quan’s monologue in Everything Everywhere All
at Once that they’d had to pause the movie multiple times. Another top comment describes how a
movie trailer had inspired them to quit a job they were miserable with. Another tells the story of
how a movie reminded them of their sister, who had died 10 years earlier. “This is a really great
way to look through a conversation and understand it a little better than [ranking by] engagement
or recency,” Acosta says.

For the classifiers to have an impact on the wider internet, they would require buy-in from the
biggest tech companies, which are all locked in a zero-sum competition for our attention. Even
though they were developed inside Google, the tech giant has no plans to start using them to help
rank its YouTube comments, Green says. Instead, Jigsaw is making the tools freely available for
independent developers, in the hopes that smaller online spaces, like message boards and
newspaper comment sections, will build up an evidence base that the new forms of ranking are
popular with users.

There are some reasons to be skeptical. For all its flaws, ranking by engagement is egalitarian.
Popular posts get amplified regardless of their content, and in this way social media has allowed
marginalized groups to gain a voice long denied to them by traditional media. Introducing AI into
the mix could threaten this state of affairs. A wide body of research shows that LLMs have plenty
of ingrained biases; if applied too hastily, Jigsaw’s classifiers might end up boosting voices that
are already prominent online, thus further marginalizing those that aren’t. The classifiers could
also exacerbate the problem of AI-generated content flooding the internet, by providing spammers
with an easy recipe for AI-generated content that’s likely to get amplified. Even if Jigsaw evades
those problems, tinkering with online speech has become a political minefield. Both conservatives
and liberals are convinced their posts are being censored; meanwhile, tech companies are under
fire for making unaccountable decisions that affect the global public square. Jigsaw argues that its
new tools may allow tech platforms to rely less on the controversial practice of content
moderation. But there’s no getting away from the fact that changing what kind of speech gets
rewarded online will always have political opponents.

Still, academics say that given a chance, Jigsaw’s new AI tools could result in a paradigm shift for
social media. Elevating more desirable forms of online speech could create new incentives for
more positive online—and possibly offline—social norms. If a platform amplifies toxic
comments, “then people get the signal they should do terrible things,” says Ravi Iyer, a
technologist at the University of Southern California who helps run the nonprofit Psychology of
Technology Research Network. “If the top comments are informative and useful, then people
follow the norm and create more informative and useful comments.”

The new algorithms have come a long way from Jigsaw’s earlier work. In 2017, the Google unit
released Perspective API, an algorithm for detecting toxicity. The free tool was widely used,
including by the New York Times, to downrank or remove negative comments under articles. But
experimenting with the tool, which is still available online, reveals the ways that AI tools can carry
hidden biases. “You’re a f-cking hypocrite” is, according to the classifier, 96% likely to be a toxic
phrase. But many other hateful phrases, according to the tool, are likely to be non-toxic, including
the neo-Nazi slogan “Jews will not replace us” (41%) and transphobic language like “trans women
are men” (36%). The tool breaks when confronted with a slur that is commonly directed at South
Asians in the U.K. and Canada, returning the error message: “We don't yet support that language,
but we're working on it!”
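
Perspective API remains publicly available, so checks like these can be reproduced directly. The sketch below follows the API's documented request shape for a TOXICITY score; treat the endpoint and field names as assumptions if the interface has since changed, and note the key is a placeholder.

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder; issued via Google Cloud
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    body = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, json=body, timeout=10)
    resp.raise_for_status()
    # summaryScore.value is the 0-to-1 probability the text is toxic.
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
```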

To be sure, 2017 was a very different era for AI. Jigsaw has made efforts to mitigate biases in its
new classifiers, which are unlikely to make such basic errors. Its team tested the new classifiers on
a set of comments that were identical except for the names of different identity groups, and said it
found no hint of bias. Still, the patchy effectiveness of the older Perspective API serves as a
reminder of the pitfalls of relying on AI to make value judgments about language. Even today’s
powerful LLMs are not free from bias, and their fluency can often conceal their limitations. They
can discriminate against African American English; they function poorly in some non-English
languages; and they can treat equally capable job candidates differently based on their names
alone. More work will be required to ensure Jigsaw’s new AIs don’t have less visible forms of
bias. “Of course, there are things that you have to watch out for,” says Iyer, who did not work on
the Jigsaw project. “How do we make sure that [each classifier] captures the diversity of ways that
people express these concepts?”
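
The identity-swap test Jigsaw describes is simple to reproduce for any classifier. Here is a sketch under assumed names: `score_attribute` stands in for the classifier being audited, and the template and group list are purely illustrative.

```python
def score_attribute(comment: str) -> float:
    # Placeholder so the sketch runs; plug in the classifier under test.
    return 0.5

TEMPLATE = "As a {group} viewer, this movie really resonated with me."
GROUPS = ["Black", "white", "Muslim", "Jewish", "trans", "disabled"]

# Score comments that are identical except for the identity term.
scores = {g: score_attribute(TEMPLATE.format(group=g)) for g in GROUPS}
gap = max(scores.values()) - min(scores.values())
print(scores)
print(f"largest score gap across groups: {gap:.3f}")
# A large gap would suggest the classifier treats groups differently.
```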

In a paper published earlier this month, Acosta and her colleagues set out to test how readers
would respond to a list of comments ranked using Jigsaw’s new classifiers, compared to
comments sorted by recency. They found that readers preferred the comments sorted by the
classifiers, finding them to be more informative, respectful, trustworthy, and interesting. But they
also found that ranking comments by just one classifier on its own, like reasoning, could put users
off. In its press release launching the classifiers on Monday, Jigsaw says it intends for its tools to
be mixed and matched. That works because each classifier returns a score between zero and one, so a developer can write a formula that combines several scores into a single number and use that number as a ranking signal. Web developers could choose to rank comments using a carefully calibrated mixture of compassion, respect, and curiosity, for example. They could also throw engagement into the mix, to make sure that posts that receive lots of likes still get boosted.
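
As a sketch, that combining formula could be as simple as a weighted sum over the returned scores, with engagement folded in as one more term. The weights below are illustrative assumptions, not Jigsaw's recommendations.

```python
def ranking_signal(scores, weights):
    # Weighted sum of 0-to-1 attribute scores; missing scores count as 0.
    return sum(w * scores.get(attr, 0.0) for attr, w in weights.items())

weights = {"compassion": 0.4, "respect": 0.3, "curiosity": 0.2,
           "engagement": 0.1}  # keep likes as a minor factor

comments = [
    {"id": "a", "compassion": 0.9, "respect": 0.8, "curiosity": 0.3,
     "engagement": 0.2},
    {"id": "b", "compassion": 0.1, "respect": 0.4, "curiosity": 0.1,
     "engagement": 0.9},
]
comments.sort(key=lambda c: ranking_signal(c, weights), reverse=True)
print([c["id"] for c in comments])  # 'a' outranks 'b' despite fewer likes
```

Tuning those weights is exactly the kind of per-community decision Jigsaw says it wants to leave to individual platforms.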

Just as removing negative content from the internet has received its fair share of pushback,
boosting certain forms of “desirable” content is likely to prompt complaints that tech companies
are putting their thumbs on the political scales. Jigsaw is quick to point out that its classifiers are
not only apolitical but also designed to boost types of content that few people would take issue
with. In tests, Jigsaw found the tools did not disproportionately boost comments that were seen by
users as unfavorable to Republicans or Democrats. “We have a track record of delivering a
product that’s useful for publishers across the political spectrum,” Green says. “The emphasis is
on opening up conversations.” Still, the question of power remains: who gets to decide which
kinds of content are desirable? Jigsaw’s hope is that by releasing the technology publicly, different
online spaces can each choose what works for them—thus avoiding any one hegemonic platform
taking that decision on behalf of the entire internet.

For Stray, the Berkeley scientist, there is a tantalizing prospect to an internet where positive
content gets boosted. Many people, he says, think of online misinformation as leading to
polarization. And it can. “But it also works the other way around,” he says. The demand for low-
quality information arises, at least in part, because people are already polarized. If the tools result
in people becoming less polarized, “then that should actually change the demand-side for certain
types of lower quality content.” It’s hypothetical, he cautions, but it could lead to a virtuous circle,
where declining demand for misinformation feeds a declining supply.

Why would platforms agree to implement these changes? Almost by definition, ranking by
engagement is the most effective way to keep users onsite, thus keeping eyeballs on the ads that
drive up revenue. For the big platforms, that means both the continued flow of profits, and the fact
that users aren’t spending time with a competitor’s app. Replacing engagement-based ranking with
something less engaging seems like a tough ask for companies already battling to keep their users’
attention.

That’s true, Stray says. But, he notes, there are different forms of engagement. There’s short-
term engagement, which is easy for platforms to optimize for: is a tweak to a platform likely to
make users spend more time scrolling during the next hour? Platforms can and do make changes
to boost their short-term engagement, Stray says—but those kinds of changes often mean boosting
low-quality, engagement-bait types of content, which tend to put users off in the long term.

The alternative is long-term engagement. How might a change to a platform influence a user’s
likelihood of spending more time scrolling during the next three months? Long-term engagement
is healthier, but far harder to optimize for, because it’s harder to isolate the connection between
cause and effect. Many different factors are acting upon the user at the same time. Large platforms
want users to be returning over the long term, Stray says, and for them to cultivate healthy
relationships with their products. But it’s difficult to measure, so optimizing for short-term
engagement is often an easier choice.

Jigsaw’s new algorithms could change that calculus. “The hope is, if we get better at building
products that people want to use in the long run, that will offset the race to the bottom,” Stray
says. “At least somewhat.”

Post-reading tasks:

1] Identify important facts that would lead to a deeper understanding of the passage.

2] Which words or expressions are used especially well in the passage?

3] What specific or original views can you develop based on the previous tasks?

4] What questions or puzzles do you still have?
