The Conversation

Promising assisted reproductive technologies come with ethical, legal and social challenges – a developmental biologist and a bioethicist discuss IVF, abortion and the mice with two dads

A few days after successful fertilization, an embryo becomes a rapidly dividing ball of cells called a blastocyst.
Juan Gaertner/Science Photo Library via Getty Images

Keith Latham, Michigan State University and Mary Faith Marshall, University of Virginia

Assisted reproductive technologies are medical procedures that help people who have difficulty having, or are unable to have, biological children of their own. From in vitro fertilization to genetic screening to creation of viable eggs from the skin cells of two male mice, each new development speaks to the potential of reproductive technologies to expand access to the experience of pregnancy.

Translating advances from the lab to the clinic, however, comes with challenges that go far beyond the purely technical.

Conversations around the ethics and implications of cutting-edge research often happen after the fact, when the science and technology have advanced beyond the point at which open dialogue could best protect affected groups. In the spirit of starting such cross-discipline conversations earlier, we invited developmental biologist Keith Latham of Michigan State University and bioethicist Mary Faith Marshall of the University of Virginia to discuss the ethical and technological potential of in vitro gametogenesis and assisted reproductive technology post-Roe.

How new are the ethical considerations raised by assisted reproductive technologies?

Keith

Every new technology raises many of the same questions, and likely new ones. On the safety and risk-benefit side of the ethical conversation, there's nothing here that we haven't dealt with since the 1970s with other reproductive technologies. But it's important to keep asking questions, because the benefits are hugely dependent on the success rate. There are potential biological costs, but also possible social costs, financial costs, societal costs and many others.

Mary Faith

It's probably been that way even longer. One of my mentors, Joseph Francis Fletcher, a pioneering bioethicist and Episcopal priest, wrote a book called “Morals and Medicine” in 1954. It was the first non-Roman Catholic treatment of bioethics. And he raised a lot of these issues there, including the technological imperative – the idea that because we can develop the technology to do something, we therefore should develop it.

Fletcher also said that the use of artifice, or human-made creations, is supremely human. That's what we do: We figure out how things work and we develop new technologies like vaccines and heart-lung machines based on evolving scientific knowledge.

Microscopy image of mouse ovum being fertilized by mouse sperm
Scientists were able to create a mouse egg from the skin cells of male mice.
Clouds Hill Imaging Ltd./Corbis Documentary via Getty Images

I think that in most cases, scientists should be involved in thinking about the implications of their work. But often, researchers focus more on the direct applications of their work than the potential indirect consequences.

Given the evolution of assisted reproductive technology, and the fact that its evolution is going to continue, I think one of the central questions to consider is, what are the goals of developing it? For assisted reproduction, it's to help infertile people and people in nontraditional relationships have children.

What are some recent developments in the field of assisted reproductive technology?

Keith

One recent advance in assisted reproductive technology is the expansion of pre-implantation genetic testing methods, particularly DNA sequencing. Many genes come in different variants, or alleles, that can be inherited from each parent. Providers can determine whether an embryo bears a “bad” allele that may increase the risk of certain diseases and select embryos with “healthy” alleles.

Genetic screening raises several ethical concerns. For example, the parents' genetic profiles could be unwillingly inferred from that of the embryo. This possibility may deter prospective parents from having children, and such knowledge may also have potential effects on any future child. The cost of screening and potential need for additional cycles of IVF may also increase disparities.

There are also considerations about the accuracy of screening predictions without accounting for environmental effects, and what level of genetic risk is “serious” enough for an embryo to be excluded. More extensive screening also raises concerns about possible misuse for purposes other than disease prevention, such as production of “designer babies.”

In vitro gametogenesis involves making egg or sperm cells from other adult cells in the body.

At a genome-editing conference in March 2023, researchers announced that they were able to delete and duplicate whole chromosomes from the skin cells of male mice to make eggs. This method is one potential way to make eggs that do not carry genetic abnormalities.

They were very upfront that this was done at 1% efficiency in mice, which could be lower in humans. That means something bad happened to 99% of the embryos. The biological world is not typically binary, so a portion of that surviving 1% could still be abnormal. Just because the mice survived doesn't mean they're OK. I would say at this point, it would be unethical to try this on people.

As with some forms of genetic screening, using this technique to reduce the risk of one disease could inadvertently increase the risk of another. Determining that it is absolutely safe to duplicate a chromosome would require knowing every allele of every gene on that chromosome, and what each allele could do to the health of a person. That's a pretty tall order, as that could involve identifying hundreds to thousands of genes, and the effects of all their variants may not be known.

Mary Faith

That raises the issue of efficacy and costs to yet another order of magnitude.

Keith

Genome editing with CRISPR technology in people carries similar concerns. Because of potential limitations in how precise the technology can be, it may be difficult for researchers to say they are absolutely 100% certain there won't be off-target changes in the genome. Proceeding without that full knowledge could be risky.

But that's where bioethicists need to come into play. Researchers don't know what the full risk is, so how do you make that risk-benefit calculation?

Mary Faith

There's the option of a voluntary global moratorium on using these technologies on human embryos. But somebody somewhere is still going to do it, because the technology is just sitting there, waiting to be moved forward.

How will the legal landscape affect the development and implementation of assisted reproductive technologies?

Mary Faith

Any research that involves human embryos is in some ways politicized. Not only because the government provides funding to the basic science labs that conduct this research, but because of the wide array of beliefs that members of the public at large have about when life begins or what personhood means.

The Dobbs decision, which overturned the constitutional right to an abortion, has implications for assisted reproduction and beyond. Most people who are pregnant don't even know they're pregnant at the earliest stages, and somewhere around 60% of those pregnancies end naturally because of genetic aberrations. Between 1973 and 2005, over 400 women were arrested for miscarriage in the U.S., and I think that number is going to grow. The implications for reproductive health, and for assisted reproduction in the future, are challenging and frightening.

What will abortion restrictions mean for people who have multiple-gestation pregnancies, in which they carry more than one embryo at the same time? In order to have one healthy child born from that pregnancy, the other embryos often need to be removed so they don't all die. In the past 40 years, the number of twin births doubled and triplet and higher-order births quadrupled, primarily because of fertility treatments.

Needle touching eggs in petri dish under microscope in IVF
IVF may involve transferring more than one embryo at a time.
Antonio Marquez lanza/Moment via Getty Images

Keith

IVF may transfer one, two, or sometimes three embryos at a time. The cost of care for preterm birth, which is one possible outcome of multiple-gestation pregnancies, can be high. That's in addition to the cost of delivery. IVF clinics are increasingly transferring just one embryo to mitigate such concerns.

The life-at-conception bills that have been put forth in some U.S. state legislatures and Congress may contain language stating that they are not meant to prevent IVF. But the language of the bills could be extended to affect procedures such as IVF with pre-implantation genetic testing to detect chromosomal abnormalities, particularly when single-embryo transfer is the goal. Pre-implantation genetic testing has been increasing, with one study estimating that over 40% of all IVF cycles in the U.S. in 2018 involved genetic screening.

Could life-at-conception bills criminalize clinics that don't transfer embryos known to be genetically abnormal? Freezing genetically abnormal embryos could avoid destroying them, but that raises questions of, to what end? Who would pay for the storage, and who would be responsible for those embryos?

How can we determine whether the risks outweigh the benefits when so much is unknown?

Keith

Conducting studies in animal models is an important first step. In some cases, it either hasn't been done or hasn't been done extensively. Even with animal studies, you have to recognize that mice, rabbits and monkeys are not human. Animal models may reduce some risks before a technology is used in people, but they won't eliminate all risks, because of biological differences between species.

Mary Faith

We could look to the example of early recombinant DNA research in the U.S. The federal government created the Recombinant DNA Advisory Committee at the National Institutes of Health to oversee animal and early-phase human research involving synthetic or hybrid genetic material.

The death of Jesse Gelsinger, who was a participant in a gene therapy clinical trial in 1999, led to a halt in all gene therapy clinical trials in the U.S. for a time. When the Food and Drug Administration investigated what went wrong, they found huge numbers of adverse events in both humans and animals that should have been reported to the advisory committee but weren't. Notably, the principal investigator of the trial was also the primary shareholder of the biotech company that made the drug being tested. That raises questions about the reality of oversight.

I think something like that earlier NIH advisory committee but for reproductive technologies would still be advisable. But researchers, policymakers and regulators need to learn from the lessons of the past to try to ensure that – especially in early-phase research – we're very thoughtful about the potential risks and that research participants really understand what the implications are for participation in research. That would be one model for translating research from the animal into the human.

Child looking into a slit lamp microscope for an eye exam with a doctor
The FDA approved a gene therapy for a form of congenital vision loss in 2017. The child in this photo, then 8, first received gene therapy at age 4.
Bill West/AP Photo

Keith

A process to make sure that the people conducting studies don't have a conflict of interest, like having the potential to commercially profit from the technology, would be useful.

Caution, consensus and cooperation should not take second place to profit motives. Altering the human genome in a way that allows human-made genetic changes to be propagated throughout the population has a potential to alter the genetics of the human species as a whole.

Mary Faith

That raises the question of how long it will take for long-term effects to show. It's one thing for an implanted egg not to survive. But how long will it take to know whether there are effects that aren't obvious at birth?

Keith

We're still collecting long-term outcome data for people born using different reproductive technologies. So far there have been no obviously horrible consequences. But some abnormalities could take decades to manifest, and there are many variables to contend with.

One can arguably say that there's substantial good in helping couples have babies. There can be a benefit to their emotional well-being, and reproduction is a natural part of human health and biology. And a lot of really smart, dedicated people are putting a lot of energy into making sure that the risks are minimized. We can also look to some of the practices and approaches to oversight that have been used over the past several decades.

Mary Faith

And thinking about international guidelines, such as those from the Council for International Organizations of Medical Sciences and other groups, could provide guidance on protecting human research subjects.

Keith

You hate to advocate for a world where the automatic response to anything new is “no, don't do that.” My response is, “Show me it's safe before you do it.” I don't think that's unreasonable.

Some people have a view that scientists don't think about the ethics of research and what's right and wrong, advisable or inadvisable. But we do think about it. I co-direct a research training program that includes teaching scientists how to responsibly and ethically conduct research, including speakers who specifically address the ethics of reproductive technologies. It is valuable to have a dialogue between scientists and ethicists, because ethicists will often think about things from a different perspective.

As people go through their scientific careers and see new technologies unfold over time, these discussions can help them develop a deeper appreciation and understanding of the broader impact of their research. It becomes our job to make sure that each generation of scientists is motivated to think about these things.

Mary Faith

It's also really important to include stakeholders – people who are nonscientists, people who experience barriers to reproduction and people who are opposed to the idea – so they have a voice at the table as well. That's how you get good policies, right? You have everyone who should be at the table, at the table.

Keith Latham, Professor of Animal Science, Adjunct Professor of Obstetrics, Gynecology and Reproductive Biology, Michigan State University and Mary Faith Marshall, Professor of Biomedical Ethics, University of Virginia

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The Conversation

Black holes are mysterious, yet also deceptively simple − a new space mission may help physicists answer hairy questions about these astronomical objects

Published

on

theconversation.com – Gaurav Khanna, Professor of Physics, University of Rhode Island – 2024-05-15 07:16:18

An illustration of a supermassive black hole.

NASA/JPL

Gaurav Khanna, University of Rhode Island

Physicists consider black holes one of the most mysterious objects that exist. Ironically, they're also considered one of the simplest. For years, physicists like me have been looking to prove that black holes are more complex than they seem. And a newly approved European space mission called LISA will help us with this hunt.

Research from the 1970s suggests that you can comprehensively describe a black hole using only three physical attributes – its mass, charge and spin. All the other properties of these massive dying stars, like their detailed composition, density and temperature profiles, disappear as they transform into a black hole. That is how simple they are.

The idea that black holes have only three attributes is called the “no-hair” theorem, implying that they don't have any “hairy” details that make them complicated.

Black holes are massive, mysterious astronomical objects.

Hairy black holes?

For decades, researchers in the astrophysics community have exploited loopholes or work-arounds within the no-hair theorem's assumptions to come up with potential hairy black hole scenarios. A hairy black hole has a physical property that scientists can measure – in principle – that's beyond its mass, charge or spin. This property has to be a permanent part of its structure.

About a decade ago, Stefanos Aretakis, a physicist currently at the University of Toronto, showed mathematically that a black hole containing the maximum charge it could hold – called an extremal charged black hole – would develop “hair” at its horizon. A black hole's horizon is the boundary where anything that crosses it, even light, can't escape.

Aretakis' analysis was more of a thought experiment using a highly simplified physical scenario, so it's not something scientists expect to observe astrophysically. But supercharged black holes might not be the only kind that could have hair.

Since astrophysical objects such as stars and planets are known to spin, scientists expect that black holes would spin as well, based on how they form. Astronomical evidence has shown that black holes do have spin, though researchers don't know what the typical spin value is for an astrophysical black hole.

Using computer simulations, my team has recently discovered similar types of hair in black holes that are spinning at the maximum rate. This hair has to do with the rate of change, or the gradient, of space-time's curvature at the horizon. We also discovered that a black hole wouldn't actually have to be maximally spinning to have hair, which is significant because these maximally spinning black holes probably don't form in nature.

Detecting and measuring hair

My team wanted to develop a way to potentially measure this hair – a new fixed property that might characterize a black hole beyond its mass, spin and charge. We started looking into how such a new property might leave a signature on a gravitational wave emitted from a fast-spinning black hole.

A gravitational wave is a tiny disturbance in space-time typically caused by violent astrophysical events in the universe. The collisions of compact astrophysical objects such as black holes and neutron stars emit strong gravitational waves. An international network of gravitational-wave observatories, including the Laser Interferometer Gravitational-wave Observatory in the United States, routinely detects these waves.

Our recent studies suggest that one can measure these hairy attributes from gravitational wave data for fast-spinning black holes. Looking at the gravitational wave data offers an opportunity to find a signature of sorts that could indicate whether the black hole has this type of hair.

Our ongoing studies and recent progress made by Som Bishoyi, a student on the team, are based on a blend of theoretical and computational models of fast-spinning black holes. Our findings have not been tested in the field yet or observed in real black holes out in space. But we hope that will soon change.

LISA gets a go-ahead

In January 2024, the European Space Agency formally adopted the space-based Laser Interferometer Space Antenna, or LISA, mission. LISA will look for gravitational waves, and the data from the mission could help my team with our hairy black hole questions.

Three spacecraft spaced apart sending light beams towards each other while orbiting the Sun

The LISA spacecraft observing gravitational waves from a distant source while orbiting the Sun.

Simon Barke/Univ. Florida, CC BY

Formal adoption means that the project has the go-ahead to move to the construction phase, with a planned 2035 launch. LISA consists of three spacecraft configured in a perfect equilateral triangle that will trail behind the Earth around the Sun. The spacecraft will each be 1.6 million miles (2.5 million kilometers) apart, and they will exchange laser beams to measure the distance between each other down to about a billionth of an inch.
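
To put those two figures side by side, here is a back-of-the-envelope sketch that converts the quoted precision of about a billionth of an inch into meters and divides it by the quoted 2.5-million-kilometer separation. It uses only the round numbers given above; the variable names are illustrative, and the result is an order-of-magnitude comparison rather than a mission specification.

```python
# Rough comparison of LISA's spacecraft separation and the quoted
# distance-measurement precision, using only the article's round figures.

METERS_PER_INCH = 0.0254      # exact by definition of the inch
ARM_LENGTH_KM = 2.5e6         # separation quoted in the article
PRECISION_INCHES = 1e-9       # "about a billionth of an inch"

arm_length_m = ARM_LENGTH_KM * 1_000
precision_m = PRECISION_INCHES * METERS_PER_INCH

# Fractional change in separation that this precision corresponds to.
fractional_precision = precision_m / arm_length_m

print(f"precision: {precision_m:.2e} m")
print(f"fraction of separation: {fractional_precision:.1e}")
```

With these inputs, the precision works out to roughly one part in 10^20 of the distance between spacecraft.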

LISA will detect gravitational waves from supermassive black holes that are millions or even billions of times more massive than our Sun. It will build a map of the space-time around rotating black holes, which will help physicists understand how gravity works in the close vicinity of black holes to an unprecedented level of accuracy. Physicists hope that LISA will also be able to measure any hairy attributes that black holes might have.

With LIGO making new observations every day and LISA set to offer a glimpse into the space-time around black holes, now is one of the most exciting times to be a black hole physicist.

Gaurav Khanna, Professor of Physics, University of Rhode Island

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The Conversation

Viruses are doing mysterious things everywhere – AI can help researchers understand what they’re up to in the oceans and in your gut

Published

on

theconversation.com – Libusha Kelly, Associate Professor of Systems and Computational Biology, Microbiology and Immunology, Albert Einstein College of Medicine – 2024-05-15 07:16:41

Many viral genetic sequences code for proteins that researchers haven't seen before.

KTSDesign/Science Photo Library via Getty Images

Libusha Kelly, Albert Einstein College of Medicine

Viruses are a mysterious and poorly understood force in microbial ecosystems. Researchers know they can infect, kill and manipulate human and bacterial cells in nearly every environment, from the oceans to your gut. But scientists don't yet have a full picture of how viruses affect their surrounding environments in large part because of their extraordinary diversity and ability to rapidly evolve.

Communities of microbes are difficult to study in a laboratory setting. Many microbes are challenging to cultivate, and their natural environment has many more features influencing their success or failure than scientists can replicate in a lab.

So systems biologists like me often sequence all the DNA present in a sample – for example, a fecal sample from a patient – separate out the viral DNA sequences, then annotate the sections of the viral genome that code for proteins. These notes on the location, structure and other features of genes help researchers understand the functions viruses might carry out in the environment and help identify different kinds of viruses. Researchers annotate viruses by matching viral sequences in a sample to previously annotated sequences available in public databases of viral genetic sequences.

However, scientists are identifying viral sequences in DNA collected from the environment at a rate that far outpaces our ability to annotate those genes. This means researchers are publishing findings about viruses in microbial ecosystems using unacceptably small fractions of available data.

To improve researchers' ability to study viruses around the globe, my team and I have developed a novel approach to annotate viral sequences using artificial intelligence. Through protein language models akin to large language models like ChatGPT but specific to proteins, we were able to classify previously unseen viral sequences. This opens the door for researchers to not only learn more about viruses, but also to address biological questions that are difficult to answer with current techniques.

Annotating viruses with AI

Large language models use relationships between words in large datasets of text to generate potential answers to questions they are not explicitly “taught” the answer to. When you ask a chatbot “What is the capital of France?” for example, the model is not looking up the answer in a table of capital cities. Rather, it is using its training on huge datasets of documents and information to infer the answer: “The capital of France is Paris.”

Similarly, protein language models are AI algorithms that are trained to recognize relationships between billions of protein sequences from environments around the world. Through this training, they may be able to infer something about the essence of viral proteins and their functions.

We wondered whether protein language models could answer this question: “Given all annotated viral genetic sequences, what is this new sequence's function?”

In our proof of concept, we trained neural networks on previously annotated viral protein sequences in pre-trained protein language models and then used them to predict the annotation of new viral protein sequences. Our approach allows us to probe what the model is “seeing” in a particular viral sequence that leads to a particular annotation. This helps identify candidate proteins of interest either based on their specific functions or how their genome is arranged, winnowing down the search of vast datasets.
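
The general pattern described above, a fixed representation of each sequence plus a small classifier trained on annotated examples, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `embed` is a stand-in that uses amino-acid composition rather than a real protein language model, a nearest-centroid classifier stands in for the neural networks, and the sequences and class names are invented. It is not the authors' method.

```python
# Toy sketch of "fixed embeddings + small trained classifier".
# ASSUMPTION: embed() is a stand-in (amino-acid composition), NOT a real
# protein language model; the sequences and labels below are made up.
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(sequence):
    """Map a protein sequence to a fixed-length vector (hypothetical stand-in)."""
    counts = Counter(sequence)
    total = max(len(sequence), 1)
    return [counts.get(aa, 0) / total for aa in AMINO_ACIDS]


def train_centroids(labeled_sequences):
    """Average the embeddings of each annotation class."""
    sums, sizes = {}, {}
    for sequence, label in labeled_sequences:
        vec = embed(sequence)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        sizes[label] = sizes.get(label, 0) + 1
    return {label: [v / sizes[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, sequence):
    """Assign the annotation whose centroid is closest to the new embedding."""
    vec = embed(sequence)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Invented training data with two hypothetical annotation classes.
training = [
    ("KKKRKRKKAR", "hypothetical_class_A"),
    ("RKKKRRKKGK", "hypothetical_class_A"),
    ("DDEEDEDDLE", "hypothetical_class_B"),
    ("EEDDEEDEVD", "hypothetical_class_B"),
]
centroids = train_centroids(training)
prediction = predict(centroids, "KRKKRKKRKA")
print(prediction)
```

A real pipeline would replace `embed` with vectors from a pre-trained protein language model and the centroid step with a trained network; the sketch only shows how annotated examples turn into predictions for unseen sequences.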

Microscopy image of spherical bacteria colored bright green

Prochlorococcus is one of the many species of marine bacteria with proteins that researchers haven't seen before.

Anne Thompson/Chisholm Lab, MIT via Flickr

By identifying more distantly related viral gene functions, protein language models can complement current methods to provide new insights into microbiology. For example, my team and I were able to use our model to discover a previously unrecognized integrase – a type of protein that can move genetic information in and out of cells – in the globally abundant marine picocyanobacteria Prochlorococcus and Synechococcus. Notably, this integrase may be able to move genes in and out of these populations of bacteria in the oceans and enable these microbes to better adapt to changing environments.

Our language model also identified a novel viral capsid protein that is widespread in the global oceans. We produced the first picture of how its genes are arranged, showing it can contain different sets of genes, which we believe indicates that this virus serves different functions in its environment.

These preliminary findings represent only two of thousands of annotations our approach has provided.

Analyzing the unknown

Most of the hundreds of thousands of newly discovered viruses remain unclassified. Many viral genetic sequences match protein families with no known function or have never been seen before. Our work shows that similar protein language models could help study the threat and promise of our planet's many uncharacterized viruses.

While our study focused on viruses in the global oceans, improved annotation of viral proteins is critical for better understanding the role viruses play in health and disease in the human body. We and other researchers have hypothesized that viral activity in the human gut microbiome might be altered when you're sick. This means that viruses may help identify stress in microbial communities.

However, our approach is also limited because it requires high-quality annotations. Researchers are developing newer protein language models that incorporate other “tasks” as part of their training, particularly predicting protein structures to detect similar proteins, to make them more powerful.

Making all AI tools available via FAIR Data Principles – data that is findable, accessible, interoperable and reusable – can help researchers at large realize the potential of these new ways of annotating protein sequences, leading to discoveries that benefit human health.

Libusha Kelly, Associate Professor of Systems and Computational Biology, Microbiology and Immunology, Albert Einstein College of Medicine

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The Conversation

Human differences in judgment lead to problems for AI

Published

on

theconversation.com – Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California – 2024-05-14 07:14:06

Bias isn't the only human imperfection turning up in AI.

Emrah Turudu/Photodisc via Getty Images

Mayank Kejriwal, University of Southern California

Many people understand the concept of bias at some intuitive level. In society, and in artificial intelligence systems, racial and gender biases are well documented.

If society could somehow eliminate bias, would all problems go away? The late Nobel laureate Daniel Kahneman, who was a key figure in the field of behavioral economics, argued in his last book that bias is just one side of the coin. Errors in judgments can be attributed to two sources: bias and noise.

Bias and noise both play important roles in fields such as law, medicine and financial forecasting, where human judgments are central. In our work as computer and information scientists, my colleagues and I have found that noise also plays a role in AI.

Statistical noise

Noise in this context means variation in how people make judgments of the same problem or situation. The problem of noise is more pervasive than initially meets the eye. A seminal work, dating back all the way to the Great Depression, has found that different judges gave different sentences for similar cases.

Worryingly, sentencing in court cases can depend on things such as the temperature and whether the local football team won. Such factors, at least in part, contribute to the perception that the justice system is not just biased but also arbitrary at times.

Other examples: Insurance adjusters might give different estimates for similar claims, reflecting noise in their judgments. Noise is likely present in all manner of contests, ranging from wine tastings to local beauty pageants to college admissions.

Behavioral economist Daniel Kahneman explains the concept of noise in human judgment.

Noise in the data

On the surface, it doesn't seem likely that noise could affect the performance of AI systems. After all, machines aren't affected by weather or football teams, so why would they make judgments that vary with circumstance? On the other hand, researchers know that bias affects AI, because it is reflected in the data that the AI is trained on.

For the new spate of AI models like ChatGPT, the gold standard is human performance on general intelligence problems such as common sense. ChatGPT and its peers are measured against human-labeled commonsense datasets.

Put simply, researchers and developers can ask the machine a commonsense question and compare it with human answers: “If I place a heavy rock on a paper table, will it collapse? Yes or No.” If there is high agreement between the two – in the best case, perfect agreement – the machine is approaching human-level common sense, according to the test.

So where would noise come in? The commonsense question above seems simple, and most humans would likely agree on its answer, but there are many questions where there is more disagreement or uncertainty: “Is the sentence plausible or implausible? My dog plays volleyball.” In other words, there is potential for noise. It is not surprising that interesting commonsense questions would have some noise.

But the issue is that most AI tests don't account for this noise in experiments. Intuitively, questions generating human answers that tend to agree with one another should be weighted higher than if the answers diverge – in other words, where there is noise. Researchers still don't know whether or how to weigh AI's answers in that situation, but a first step is acknowledging that the problem exists.

Tracking down noise in the machine

Theory aside, the question remains whether all of the above is hypothetical or whether real tests of common sense contain noise. The best way to prove or disprove the presence of noise is to take an existing test, remove the answers and get multiple people to independently label them, meaning provide answers. By measuring disagreement among humans, researchers can know just how much noise is in the test.
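
As a rough illustration of what measuring disagreement can look like, the sketch below computes the share of labeler pairs that gave the same answer to each question. The labels are invented, and mean pairwise agreement is only one simple choice of statistic; it is an assumption here, not necessarily the measure used in the study.

```python
# Sketch: quantify disagreement among independent human labelers.
# ASSUMPTION: the labels are invented, and pairwise agreement is one
# simple statistic, not necessarily the one used in the study.
from itertools import combinations


def pairwise_agreement(labels):
    """Fraction of labeler pairs that gave the same answer to one question."""
    pairs = list(combinations(labels, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


# Each entry: one question answered independently by five hypothetical labelers.
responses = {
    "rock_on_paper_table": ["yes", "yes", "yes", "yes", "yes"],
    "dog_plays_volleyball": ["no", "yes", "no", "no", "yes"],
}

agreement = {q: pairwise_agreement(answers) for q, answers in responses.items()}
for question, score in agreement.items():
    print(f"{question}: agreement={score:.2f}, disagreement={1 - score:.2f}")
```

In this made-up example the first question gets full agreement, while only 4 of the 10 labeler pairs agree on the second, which is the kind of spread a noise measurement is meant to surface.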

The details behind measuring this disagreement are complex, involving significant statistics and math. Besides, who is to say how common sense should be defined? How do you know the human judges are motivated enough to think through the question? These issues lie at the intersection of good experimental design and statistics. Robustness is key: One result, test or set of human labelers is unlikely to convince anyone. As a pragmatic matter, human labor is expensive. Perhaps for this reason, there haven't been any studies of possible noise in AI tests.

To address this gap, my colleagues and I designed such a study and published our findings in Nature Scientific Reports, showing that even in the domain of common sense, noise is inevitable. Because the setting in which judgments are elicited can matter, we did two kinds of studies. One type of study involved paid workers from Amazon Mechanical Turk, while the other study involved a smaller-scale labeling exercise in two labs at the University of Southern California and the Rensselaer Polytechnic Institute.

You can think of the former as a more realistic online setting, mirroring how many AI tests are actually labeled before being released for training and evaluation. The latter is more of an extreme, guaranteeing high quality but at much smaller scales. The question we set out to answer was how inevitable is noise, and is it just a matter of quality control?

The results were sobering. In both settings, even on commonsense questions that might have been expected to elicit high – even universal – agreement, we found a nontrivial degree of noise. The noise was high enough that we inferred that between 4% and 10% of a system's performance could be attributed to noise.

To emphasize what this means, suppose I built an AI system that achieved 85% on a test, and you built an AI system that achieved 91%. Your system would seem to be a lot better than mine. But if there is noise in the human labels that were used to score the answers, then we're not sure anymore that the 6% improvement means much. For all we know, there may be no real improvement.
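
One way to see why noisy labels blur such comparisons is to score the same set of system answers against answer keys from different labelers. The sketch below does that with invented data; the labeler names and answers are hypothetical, and the example is not a reproduction of the study's analysis.

```python
# Sketch: the same system answers can earn different scores depending on
# which human labeler's answers are used as the key.
# ASSUMPTION: all answers below are invented for illustration.

system_answers = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "no"]

# Three hypothetical labelers answering the same ten questions.
labeler_keys = {
    "labeler_1": ["yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "no"],
    "labeler_2": ["yes", "no", "yes", "yes", "no", "yes", "yes", "no", "yes", "no"],
    "labeler_3": ["yes", "no", "no", "no", "no", "no", "yes", "yes", "yes", "no"],
}


def accuracy(answers, key):
    """Fraction of questions where the system matches the answer key."""
    return sum(a == k for a, k in zip(answers, key)) / len(key)


scores = {name: accuracy(system_answers, key) for name, key in labeler_keys.items()}
spread = max(scores.values()) - min(scores.values())

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
print(f"spread across labelers: {spread:.0%}")
```

Here the system has not changed at all, yet its score moves by 20 percentage points depending on whose labels define the right answers, so a small gap between two systems could come from the key rather than from the systems.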

On AI leaderboards, where large language models like the one that powers ChatGPT are ranked, performance differences between rival systems are far narrower, typically less than 1%. As we show in the paper, ordinary statistics do not really come to the rescue for disentangling the effects of noise from those of true performance improvements.

Noise audits

What is the way forward? Returning to Kahneman's book, he proposed the concept of a “noise audit” for quantifying and ultimately mitigating noise as much as possible. At the very least, AI researchers need to estimate what influence noise might be having.

Auditing AI systems for bias is somewhat commonplace, so we believe that the concept of a noise audit should naturally follow. We hope that this study, as well as others like it, leads to their adoption.

Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article.
