Human differences in judgment lead to problems for AI

theconversation.com – Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California – 2024-05-14 07:14:06

Bias isn't the only human imperfection turning up in AI.

Emrah Turudu/Photodisc via Getty Images

Mayank Kejriwal, University of Southern California

Many people understand the concept of bias at some intuitive level. In society, and in artificial intelligence systems, racial and gender biases are well documented.

If society could somehow rid itself of bias, would all of its problems go away? The late Nobel laureate Daniel Kahneman, a key figure in the field of behavioral economics, argued in his last book that bias is just one side of the coin. Errors in judgment can be attributed to two sources: bias and noise.

Bias and noise both play important roles in fields such as law, medicine and financial forecasting, where human judgments are central. In our work as computer and information scientists, my colleagues and I have found that noise also plays a role in AI.

Statistical noise

Noise in this context means variation in how people make judgments of the same problem or situation. The problem of noise is more pervasive than initially meets the eye. A seminal work, dating back all the way to the Great Depression, has found that different judges gave different sentences for similar cases.

Worryingly, sentencing in court cases can depend on things such as the temperature and whether the local football team won. Such factors, at least in part, contribute to the perception that the justice system is not just biased but also arbitrary at times.

Other examples: Insurance adjusters might give different estimates for similar claims, reflecting noise in their judgments. Noise is likely present in all manner of contests, ranging from wine tastings to local beauty pageants to college admissions.

Behavioral economist Daniel Kahneman explains the concept of noise in human judgment.

Noise in the data

On the surface, it doesn't seem likely that noise could affect the performance of AI systems. After all, machines aren't affected by weather or football teams, so why would they make judgments that vary with circumstance? On the other hand, researchers know that bias affects AI, because it is reflected in the data that the AI is trained on.

For the new spate of AI models like ChatGPT, the gold standard is human performance on general intelligence problems such as common sense. ChatGPT and its peers are measured against human-labeled commonsense datasets.

Put simply, researchers and developers can ask the machine a commonsense question and compare it with human answers: “If I place a heavy rock on a paper table, will it collapse? Yes or No.” If there is high agreement between the two – in the best case, perfect agreement – the machine is approaching human-level common sense, according to the test.

So where would noise come in? The commonsense question above seems simple, and most humans would likely agree on its answer, but there are many questions where there is more disagreement or uncertainty: “Is the sentence plausible or implausible? My dog plays volleyball.” In other words, there is potential for noise. It is not surprising that interesting commonsense questions would have some noise.

But the issue is that most AI tests don't account for this noise in experiments. Intuitively, questions generating human answers that tend to agree with one another should be weighted higher than if the answers diverge – in other words, where there is noise. Researchers still don't know whether or how to weigh AI's answers in that situation, but a first step is acknowledging that the problem exists.
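
To make the weighting idea concrete, here is a minimal sketch in Python. It is an illustration only, with made-up data and invented function names, not the scoring procedure used by any particular benchmark: each question's contribution to a model's score is simply scaled by how strongly the human labelers agreed on it.

from collections import Counter

def agreement_weight(human_labels):
    # Fraction of labelers who chose the most common answer (1.0 = unanimous).
    counts = Counter(human_labels)
    return counts.most_common(1)[0][1] / len(human_labels)

def weighted_accuracy(items):
    # items: list of (model_answer, human_labels) pairs; all data is made up.
    total_weight = 0.0
    earned = 0.0
    for model_answer, human_labels in items:
        majority = Counter(human_labels).most_common(1)[0][0]
        weight = agreement_weight(human_labels)   # noisy questions count for less
        total_weight += weight
        if model_answer == majority:
            earned += weight
    return earned / total_weight

items = [
    ("no",  ["no", "no", "no", "no", "no"]),       # near-universal agreement
    ("yes", ["yes", "no", "yes", "no", "yes"]),    # noisy, contested question
]
print(weighted_accuracy(items))   # the contested question contributes less

The point is only that agreement information can, in principle, be folded into a score; how best to do so is exactly the open question described above.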

Tracking down noise in the machine

Theory aside, the question still remains whether all of the above is hypothetical or whether real tests of common sense contain noise. The best way to prove or disprove the presence of noise is to take an existing test, remove the answers and get multiple people to label it independently, meaning provide the answers themselves. By measuring disagreement among the humans, researchers can know just how much noise is in the test.
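
To give a concrete sense of what measuring disagreement can look like, here is a small Python sketch that computes a simple pairwise-agreement statistic over made-up labels. The statistics used in practice are more involved, as discussed below; this only shows the basic idea.

from itertools import combinations

def pairwise_agreement(labels):
    # Fraction of labeler pairs that gave the same answer to one question.
    pairs = list(combinations(labels, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

# Each row holds independent answers from several people to the same question
# (made-up data).
relabeled = [
    ["implausible"] * 5,                           # unanimous: agreement 1.0
    ["plausible", "implausible", "plausible",
     "plausible", "implausible"],                  # contested: agreement 0.4
]

# A crude noise proxy: 1 minus the average agreement across questions.
noise = 1 - sum(pairwise_agreement(row) for row in relabeled) / len(relabeled)
print(f"disagreement (noise proxy): {noise:.2f}")

The lower the agreement among labelers, the noisier the test.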

The details behind measuring this disagreement are complex, involving significant statistics and math. Besides, who is to say how common sense should be defined? How do you know the human judges are motivated enough to think through the question? These issues lie at the intersection of good experimental design and statistics. Robustness is key: One result, test or set of human labelers is unlikely to convince anyone. As a pragmatic matter, human labor is expensive. Perhaps for this reason, there haven't been any studies of possible noise in AI tests.

To address this gap, my colleagues and I designed such a study and published our findings in Nature Scientific Reports, showing that even in the domain of common sense, noise is inevitable. Because the setting in which judgments are elicited can matter, we did two kinds of studies. One type of study involved paid workers from Amazon Mechanical Turk, while the other study involved a smaller-scale labeling exercise in two labs at the University of Southern California and the Rensselaer Polytechnic Institute.

You can think of the former as a more realistic online setting, mirroring how many AI tests are actually labeled before being released for training and evaluation. The latter is more of an extreme case, guaranteeing high quality but at much smaller scales. The question we set out to answer was how inevitable noise is, and whether it is just a matter of quality control.

The results were sobering. In both settings, even on commonsense questions that might have been expected to elicit high – even universal – agreement, we found a nontrivial degree of noise. The noise was high enough that we inferred that between 4% and 10% of a system's performance could be attributed to noise.

To emphasize what this means, suppose I built an AI system that achieved 85% on a test, and you built an AI system that achieved 91%. Your system would seem to be a lot better than mine. But if there is noise in the human labels that were used to score the answers, then we're not sure anymore that the 6% improvement means much. For all we know, there may be no real improvement.
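
A toy simulation helps illustrate why. In the sketch below, two hypothetical systems with identical true ability are scored against labels that are wrong a small fraction of the time; the specific numbers (200 questions, 7% label noise) are assumptions chosen only for illustration, not figures taken from the paper.

import random

random.seed(0)

def observed_score(true_accuracy, label_noise, n_questions=200):
    # Score one system against "gold" labels that are flipped with
    # probability label_noise (all numbers here are assumptions).
    score = 0
    for _ in range(n_questions):
        model_correct = random.random() < true_accuracy
        label_flipped = random.random() < label_noise
        # A flipped label marks a right answer wrong, and a wrong answer right.
        if model_correct != label_flipped:
            score += 1
    return score / n_questions

gaps = []
for _ in range(1000):
    a = observed_score(true_accuracy=0.88, label_noise=0.07)
    b = observed_score(true_accuracy=0.88, label_noise=0.07)
    gaps.append(abs(a - b))

print(f"typical spurious gap: {sum(gaps) / len(gaps):.3f}")
print(f"largest spurious gap: {max(gaps):.3f}")

Even though the two simulated systems are equally capable, individual runs routinely differ by a few percentage points, and occasionally by much more, purely because of the noisy labels.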

On AI leaderboards, where large language models like the one that powers ChatGPT are compared, performance differences between rival systems are far narrower, typically less than 1%. As we show in the paper, ordinary statistics do not really come to the rescue for disentangling the effects of noise from those of true performance improvements.

Noise audits

What is the way forward? Returning to Kahneman's book, he proposed the concept of a “noise audit” for quantifying and ultimately mitigating noise as much as possible. At the very least, AI researchers need to estimate what influence noise might be having.

Auditing AI systems for bias is somewhat commonplace, so we believe that the concept of a noise audit should naturally follow. We hope that this study, as well as others like it, leads to their adoption.

Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article.

US participation in space has benefits at home and abroad − reaping them all will require collaboration

theconversation.com – Cheyenne Black, Graduate Research Assistant in the Institute for Public Policy Research and Analysis, University of Oklahoma – 2024-05-22 07:24:32

“Cosmic cliffs” in the Carina nebula, captured by the James Webb Space Telescope.

NASA, ESA, CSA, STScI

Cheyenne Black, University of Oklahoma

When people think about what we get from the U.S. space program, it may be along the lines of NASA technology spin-offs such as freeze-dried food and emergency space blankets.

But space activities do much more to improve life on Earth. Research in space helps scientists study our planet, develop new technologies, create jobs, grow the economy and foster international collaboration.

Of course, with reports of Russia developing an anti-satellite nuclear weapon, members of Congress and other policymakers have focused their attention on space defense and military readiness.

This is critical, but there are still many other benefits to reap from space. Getting the most out of U.S. space involvement will require collaborating across various social, environmental, commercial, governmental, international and technological backgrounds.

As a space policy scholar focused on private-public partnerships, networks and coalitions, I've seen that policymakers can get the most out of U.S. space endeavors if they invite a wide array of experts into policy discussions.

Benefits on Earth

NASA satellites play a crucial role in documenting changes in global temperatures, sea-level rise, arctic ice extent and air quality. Satellites have also been collecting data for almost 50 years to monitor water use, crop health and crop production. These long-term observations help researchers track environmental changes across the globe.

Space research provides a wide array of technologies in addition to rockets and Moon landers. Cellphone cameras, CAT scanners, the computer mouse, laptops, wireless headsets and water purification systems are just a few of the public goods NASA has generated.

These spin-off technologies come from NASA's partnerships with private firms, which subsequently make scientific discoveries widely available and accessible.

Growing the space economy

Experts predict that the space sector will continue driving the development of nonspace industries. Agriculture, energy, mining, transportation and pharmaceuticals are just some of the sectors that benefit through spin-off technologies and space-based research.

For example, scientists can conduct experiments on the International Space Station using the microgravity of space to study chemistry, improve medications and test cancer treatments.

More companies and individuals than ever share a vested interest in the space sector's success. Experts anticipate that the global space economy – the resources used in and for space activities – and its research and development will continue to grow into a market of US$1.4 trillion by 2030.

Commercialization policies opened U.S. space activities to the private sector. This has led to partnerships with companies, such as SpaceX, Blue Origin and others, that are growing the space economy.

These companies have increasingly launched rockets and deployed satellites in recent years. This has increased the need for workers, both in manufacturing positions and specialized STEM roles. Additionally, private companies and universities are partnering to develop various technologies, such as landing systems for a U.S. return to the Moon.

SpaceX's Starship rocket launched in March 2024. More commercial companies, like SpaceX, have partnered with NASA in recent years.

AP Photo/Eric Gay

Communities that host space industry centers have seen economic and educational benefits. For example, Huntsville, Alabama, home of the Marshall Space Flight Center and the U.S. Space and Rocket Center, has attracted an educated workforce with one of the highest rates of engineers per capita. Almost half of residents over the age of 25 in Huntsville have a bachelor's degree or higher.

The Marshall Space Flight Center in Huntsville, Ala.

NASA

This rate starkly contrasts with the national average, where 37% have at least a bachelor's degree, and the state's 27% average. Additionally, Huntsville's annual median household income is $8,000 higher than the Alabama average.

Since 1982, Huntsville has also hosted over 750,000 students at the U.S. Space and Rocket Center space camp. This camp educates students about science, technology, engineering and leadership to prepare them for a potential future STEM career.

International collaboration

Space also provides an opportunity for the U.S. to collaborate with other countries.

For example, the U.S. works jointly with Italy to observe the impacts of air quality on human health. The James Webb Space Telescope, a result of partnerships between NASA, the European Space Agency and the Canadian Space Agency, allows scientists to peer into previously unobserved parts of the cosmos. International collaboration has also established the Artemis Accords, a set of principles agreed to by 40 countries for peaceful, sustainable and transparent cooperation in space.

Getting the most out of space

Right now, U.S. space policymaking occurs at the federal and international levels. And while people outside of the government can act as witnesses during congressional hearings or through advocacy groups, that involvement may not be enough to represent the wide spectrum of viewpoints and interests in space policy.

There are a few ways policymakers can receive input from different stakeholders. These might include inviting more experts from various policy areas to provide recommendations in congressional hearings, collaborating with advocacy coalitions to create sustainable policies, strengthening and expanding private-public partnerships, and setting a space agenda that emphasizes research and development.

Cheyenne Black, Graduate Research Assistant in the Institute for Public Policy Research and Analysis, University of Oklahoma

This article is republished from The Conversation under a Creative Commons license. Read the original article.

TikTok law threatening a ban if the app isn’t sold raises First Amendment concerns

theconversation.com – Anupam Chander, Professor of Law and Technology, Georgetown University – 2024-05-21 07:25:32

TikTok users worry about losing their social platform, but First Amendment rights are on the line, too.

AP Photo/Ted Shaffrey

Anupam Chander, Georgetown University and Gautam Hans, Cornell University

TikTok, the short-form video company with Chinese roots, did the most American thing possible on May 7, 2024: It sued the U.S. government, in the person of Attorney General Merrick Garland, in federal court. The suit claims that the federal law that took effect on April 24, 2024, which bans TikTok unless it sells itself, violates the U.S. Constitution.

The law names TikTok and its parent company, ByteDance Ltd., specifically. It also applies to other applications and websites reaching more than a million monthly users that allow people to share information and that have ownership of 20% or more from China, Russia, Iran or North Korea. If the president determines that such applications or websites “present a significant threat to the national security,” then those apps and websites, too, must either be sold or banned from the U.S.

TikTok's suit says that the law violates the First Amendment by failing to provide evidence of the national security threat posed by the app and by failing to seek a less restrictive remedy. Despite legislators' claims to the contrary, the law forcing the divestiture of TikTok – the Protecting Americans from Foreign Adversary Controlled Applications Act – implicates First Amendment interests. In our view, it does so in ways that ripple beyond this specific case.

As a company incorporated in the United States that provides an online publishing platform, TikTok has a right protected by the First Amendment to select what messages – in this case, user videos – it chooses to publish.

A ban appears to us, scholars who study law and technology, to be a massive prior restraint, which is generally barred by U.S. courts. Prior restraint is action by the government to prevent speech, typically some form of publication, before it occurs.

The First Amendment limits what the government can do to censor speech.

Speech in the crosshairs

The law's backers say that it is not a ban – all TikTok has to do is sell itself. These supporters describe the bill as a divestiture, a purely economic regulation that they say should insulate it from First Amendment scrutiny. After the sale, users could happily keep on using TikTok, not caring who owns the company. But the law seems to us an attempt to control speech by mandating a change in ownership.

Changing the speech content on the app is the express goal of some of the law's backers. The principal author of the bill, former U.S. Rep. Mike Gallagher, who stepped down from office in April to join a venture capital firm partly backed by Microsoft, explained to The New York Times that he was principally concerned about the potential for the Chinese Communist Party to spread propaganda on the app. The Times and The Wall Street Journal have reported that Congress passed this bill in part because of unsubstantiated accusations that TikTok was unfairly promoting one side in the Israel-Hamas war.

Imagine if the government told Jeff Bezos that he had to sell The Washington Post because it was worried that he might push a particular agenda using his control of the newspaper. Or to use a digital analogy, what if the government told Elon Musk that he had to sell X, formerly Twitter, because it didn't like his content moderation of legal speech? Those scenarios clearly have a connection to First Amendment protections.

Ownership matters

Transferring TikTok's ownership from one company to another matters greatly for the purposes of First Amendment analysis.

Supreme Court Justice Elena Kagan observed during oral arguments in a case unrelated to TikTok's ownership that ownership can make a difference in an app. She noted that the sale of Twitter to Elon Musk changed the character of the app. Kagan said, “Twitter users one day woke up and found themselves to be X users and the content rules had changed and their feeds changed, and all of a sudden they were getting a different online newspaper, so to speak, in a metaphorical sense every morning.”

Indeed, The Washington Post found a rightward tilt after Twitter changed hands.

By forcing the sale of TikTok to an entity without ties to the Chinese Communist Party, Congress' intent with the law is to change the nature of the platform. That kind of government action implicates the core concerns that the First Amendment was designed to protect against: government interference in the speech of private parties.

U.S. Rep. Raja Krishnamoorthi, co-sponsor of the House bill on TikTok, pointed to another instance where the U.S. government ordered a Chinese company to sell a U.S. app. In 2019, the Committee on Foreign Investment in the United States ordered the new Chinese owners of Grindr to sell the dating app, which the Chinese owners did the following year. In that case, the foreign owners could not assert First Amendment rights in the United States, given that they were outside the U.S., and thus no court considered this issue.

TikTok is claiming First Amendment protection against the law forcing its sale or ban.

National security claims

The government hasn't disclosed to the public the national security concerns cited in the TikTok law. While such concerns, if accurate, might warrant some kind of intervention, some Americans are likely to decline to take claims of national security urgency on good faith. To address skepticism of secret government power, particularly when it involves speech rights, the government arguably needs to present its claims.

U.S. Sens. Richard Blumenthal and Marsha Blackburn, both of whom supported the TikTok law and have seen the government's secret evidence, called for the declassification of that information. We believe that's a vital step for the public to properly consider the government's claim that a ban is warranted in this instance. In any case, the courts will ultimately weigh the secret evidence in determining whether the government's national security concerns justified this intrusion upon speech.

What seems likely to happen, absent judicial invalidation or legislative repeal of the law, is a world in which TikTok cannot effectively operate in the United States in a year's time, with mobile app stores unable to push out updates to the software and Oracle Corp. unable to continue hosting the app and its U.S. user data on its servers. TikTok could go dark on Jan. 19, 2025, in the United States.

Anupam Chander, Professor of Law and Technology, Georgetown University and Gautam Hans, Associate Clinical Professor of Law, Cornell University

This article is republished from The Conversation under a Creative Commons license. Read the original article.

AI chatbots are intruding into online communities where people are trying to connect with other humans

theconversation.com – Casey Fiesler, Associate Professor of Information Science, University of Colorado Boulder – 2024-05-20 07:27:05
AI chatbots are butting into human spaces.
gmast3r/iStock via Getty Images

Casey Fiesler, University of Colorado Boulder

A parent asked a question in a private Facebook group in April 2024: Does anyone with a child who is both gifted and disabled have any experience with New York public schools? The parent received a seemingly helpful answer that laid out some characteristics of a specific school, beginning with the context that “I have a child who is also 2e,” meaning twice exceptional.

On a Facebook group for swapping unwanted items near Boston, a user looking for specific items received an offer of a “gently used” Canon camera and an “almost-new portable air conditioning unit that I never ended up using.”

Both of these responses were lies. That child does not exist and neither do the camera or air conditioner. The answers came from an artificial intelligence chatbot.

According to a Meta help page, Meta AI will respond to a post in a group if someone explicitly tags it or if someone “asks a question in a post and no one responds within an hour.” The feature is not yet available in all regions or for all groups, according to the page. For groups where it is available, “admins can turn it off and back on at any time.”

Meta AI has also been integrated into search features on Facebook and Instagram, and users cannot turn it off.

As a researcher who studies both online communities and AI ethics, I find the idea of uninvited chatbots answering questions in Facebook groups to be dystopian for a number of reasons, starting with the fact that online communities are for people.

Human connections

In 1993, Howard Rheingold published the book “The Virtual Community: Homesteading on the Electronic Frontier” about the WELL, an early and culturally significant online community. The first chapter opens with a parenting question: What to do about a “blood-bloated thing sucking on our baby's scalp.”

Rheingold received an answer from someone with firsthand knowledge of dealing with ticks and had resolved the problem before receiving a callback from the pediatrician's office. Of this experience, he wrote, “What amazed me wasn't just the speed with which we obtained precisely the information we needed to know, right when we needed to know it. It was also the immense inner sense of security that comes with discovering that real people – most of them parents, some of them nurses, doctors, and midwives – are available, around the clock, if you need them.”

This “real people” aspect of online communities continues to be critical today. Imagine why you might pose a question to a Facebook group rather than a search engine: because you want an answer from someone with real, lived experience or you want the human response that your question might elicit – sympathy, outrage, commiseration – or both.

Decades of research suggests that the human component of online communities is what makes them so valuable for both information-seeking and social support. For example, fathers who might otherwise feel uncomfortable asking for parenting advice have found a haven in private online spaces just for dads. LGBTQ+ youth often join online communities to safely find critical resources while reducing feelings of isolation. Mental health support forums provide young people with belonging and validation in addition to advice and social support.

Online communities are well-documented places of support for LGBTQ+ people.

In addition to similar findings in my own lab related to LGBTQ+ participants in online communities, as well as Black Twitter, two more recent studies, not yet peer-reviewed, have emphasized the importance of the human aspects of information-seeking in online communities.

One, led by PhD student Blakeley Payne, focuses on fat people's experiences online. Many of our participants found a lifeline in access to an audience and community with similar experiences as they sought and shared information about topics such as navigating hostile healthcare systems, finding clothing and dealing with cultural biases and stereotypes.

Another, led by Ph.D. student Faye Kollig, found that people who share content online about their chronic illnesses are motivated by the sense of community that comes with shared experiences, as well as the humanizing aspects of connecting with others to both seek and provide support and information.

Faux people

The most important aspects of these online spaces, as described by our participants, could be drastically undermined by responses coming from chatbots instead of people.

As a type 1 diabetic, I follow a number of related Facebook groups that are frequented by many parents newly navigating the challenges of caring for a young child with diabetes. Questions are frequent: “What does this mean?” “How should I handle this?” “What are your experiences with this?” Answers come from firsthand experience, but they also typically come with compassion: “This is hard.” “You're doing your best.” And of course: “We've all been there.”

A response from a chatbot that claims to speak from the lived experience of caring for a diabetic child, and that offers empathy, would not only be inappropriate, it would be borderline cruel.

However, it makes complete sense that these are the types of responses that a chatbot would offer. Large language models, simplistically, function more similarly to autocomplete than they do to search engines. For a model trained on the millions and millions of posts and comments in Facebook groups, the “autocomplete” answer to a question in a support community is definitely one that invokes personal experience and offers empathy – just as the “autocomplete” answer in a Buy Nothing Facebook group might be to offer someone a gently used camera.
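
For readers who want to see this "autocomplete" behavior directly, here is a brief sketch using the open-source Hugging Face transformers library and the small, publicly available GPT-2 model. It is only an illustration of how a generative language model continues whatever text it is given; it says nothing about the specific model behind Meta AI, and the prompt is invented for this example.

from transformers import pipeline

# Load a small, publicly available model purely for illustration.
# (Meta AI uses a different, much larger model; this is not it.)
generator = pipeline("text-generation", model="gpt2")

# A hypothetical support-group style prompt, invented for this example.
prompt = ("Question in a parenting group: my newly diagnosed child is scared "
          "of finger pricks. Any advice? Reply:")

# Sample two continuations. The model simply predicts likely next words.
for output in generator(prompt, max_new_tokens=40, do_sample=True,
                        num_return_sequences=2):
    print(output["generated_text"])

# The continuations typically read like sympathetic first-person advice,
# not because the model has any experience, but because that is the
# statistically likely way such posts continue.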

Meta has rolled out an AI assistant across its social media and messaging apps.

Keeping chatbots in their lanes

This isn't to suggest that chatbots aren't useful for anything – they may even be quite useful in some online communities, in some contexts. The problem is that in the midst of the current generative AI rush, there is a tendency to think that chatbots can and should do everything.

There are plenty of downsides to using large language models as information retrieval systems, and these downsides point to inappropriate contexts for their use. One downside is when incorrect information could be dangerous: an eating disorder helpline or legal advice for small businesses, for example.

Research is pointing to important considerations in how and when to design and deploy chatbots. For example, one recently published paper at a large human-computer interaction conference found that though LGBTQ+ individuals lacking social support were sometimes turning to chatbots for help with mental health needs, those chatbots frequently fell short in grasping the nuance of LGBTQ+-specific challenges.

Another found that though a group of autistic participants found value in interacting with a chatbot for social communication advice, that chatbot was also dispensing questionable advice. And yet another found that though a chatbot was helpful as a preconsultation tool in a health context, patients sometimes found its expressions of empathy to be insincere or offensive.

Responsible AI development and deployment means not only auditing for issues such as bias and misinformation, but also taking the time to understand the contexts in which AI is appropriate and desirable for the humans who will be interacting with it. Right now, many companies are wielding generative AI as a hammer, and as a result, everything looks like a nail.

Many contexts, such as online support communities, are best left to humans.

Casey Fiesler, Associate Professor of Information Science, University of Colorado Boulder

This article is republished from The Conversation under a Creative Commons license. Read the original article.
