The Conversation

Why building big AIs costs billions – and how Chinese startup DeepSeek dramatically changed the calculus

Published on theconversation.com – Ambuj Tewari, Professor of Statistics, University of Michigan – 2025-01-29 08:08:00

DeepSeek burst on the scene – and may be bursting some bubbles.
AP Photo/Andy Wong

Ambuj Tewari, University of Michigan

State-of-the-art artificial intelligence systems like OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude have captured the public imagination by producing fluent text in multiple languages in response to user prompts. Those companies have also captured headlines with the huge sums they’ve invested to build ever more powerful models.

An AI startup from China, DeepSeek, has upset expectations about how much money is needed to build the latest and greatest AIs. In the process, they’ve cast doubt on the billions of dollars of investment by the big AI players.

I study machine learning. DeepSeek’s disruptive debut comes down not to any stunning technological breakthrough but to a time-honored practice: finding efficiencies. In a field that consumes vast computing resources, that has proved to be significant.

Where the costs are

Developing such powerful AI systems begins with building a large language model. A large language model predicts the next word given previous words. For example, if the beginning of a sentence is “The theory of relativity was discovered by Albert,” a large language model might predict that the next word is “Einstein.” Large language models are trained to become good at such predictions in a process called pretraining.
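
The idea of next-word prediction can be illustrated with a toy sketch. The code below is not how a large language model works internally (those use neural networks with billions of weights); it is a minimal word-pair counter, and the three-sentence corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def predict_next(follows, prev_word):
    """Return the most frequent follower of prev_word, or None if unseen."""
    counts = follows.get(prev_word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical miniature "training set" (an assumption for this sketch).
corpus = [
    "the theory of relativity was discovered by albert einstein",
    "albert einstein won the nobel prize",
    "albert camus wrote novels",
]
model = train_bigram(corpus)
prediction = predict_next(model, "Albert")
print(prediction)
```

Here "training" just means counting; in a real model, pretraining adjusts the weights so that likely next words get higher scores.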

Pretraining requires a lot of data and computing power. The companies collect data by crawling the web and scanning books. Computing is usually powered by graphics processing units, or GPUs. Why graphics? It turns out that both computer graphics and the artificial neural networks that underlie large language models rely on the same area of mathematics known as linear algebra. Large language models internally store hundreds of billions of numbers called parameters or weights. It is these weights that are modified during pretraining.

Large language models consume huge amounts of computing resources, which in turn means lots of energy.

Pretraining is, however, not enough to yield a consumer product like ChatGPT. A pretrained large language model is usually not good at following human instructions. It might also not be aligned with human preferences. For example, it might output harmful or abusive language, both of which are present in text on the web.

The pretrained model therefore usually goes through additional stages of training. One such stage is instruction tuning where the model is shown examples of human instructions and expected responses. After instruction tuning comes a stage called reinforcement learning from human feedback. In this stage, human annotators are shown multiple large language model responses to the same prompt. The annotators are then asked to point out which response they prefer.

It is easy to see how costs add up when building an AI model: hiring top-quality AI talent, building a data center with thousands of GPUs, collecting data for pretraining, and running pretraining on GPUs. Additionally, there are costs involved in data collection and computation in the instruction tuning and reinforcement learning from human feedback stages.

All included, costs for building a cutting edge AI model can soar up to US$100 million. GPU training is a significant component of the total cost.

The expenditure does not stop when the model is ready. When the model is deployed and responds to user prompts, it uses more computation known as test time or inference time compute. Test time compute also needs GPUs. In December 2024, OpenAI announced a new phenomenon they saw with their latest model o1: as test time compute increased, the model got better at logical reasoning tasks such as math olympiad and competitive coding problems.

Slimming down resource consumption

Thus it seemed that the path to building the best AI models in the world was to invest in more computation during both training and inference. But then DeepSeek entered the fray and bucked this trend.

DeepSeek sent shockwaves through the tech financial ecosystem.

Their V-series models, culminating in the V3 model, used a series of optimizations to make training cutting edge AI models significantly more economical. Their technical report states that it took them less than $6 million to train V3. They admit that this cost does not include costs of hiring the team, doing the research, trying out various ideas and data collection. But $6 million is still an impressively small figure for training a model that rivals leading AI models developed with much higher costs.

The reduction in costs was not due to a single magic bullet. It was a combination of many smart engineering choices including using fewer bits to represent model weights, innovation in the neural network architecture, and reducing communication overhead as data is passed around between GPUs.
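
A rough back-of-the-envelope sketch shows why fewer bits per weight matters. The parameter count below is a hypothetical round number, not DeepSeek's figure, and the bit widths are generic illustrations rather than the precision scheme described in their report.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Memory needed just to store the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Hypothetical model size, chosen only to make the arithmetic easy to follow.
NUM_PARAMS = 100_000_000_000

footprints = {bits: weight_memory_gb(NUM_PARAMS, bits) for bits in (32, 16, 8)}
for bits, gigabytes in footprints.items():
    print(f"{bits}-bit weights: {gigabytes:.0f} GB")
```

Halving the bits halves the memory for the weights, which in turn means less data to store and to move between GPUs. The trade-off is reduced numerical precision, which is why this kind of saving takes careful engineering.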

It is interesting to note that due to U.S. export restrictions on China, the DeepSeek team did not have access to high performance GPUs like the Nvidia H100. Instead they used Nvidia H800 GPUs, which Nvidia designed to be lower performance so that they comply with U.S. export restrictions. Working with this limitation seems to have unleashed even more ingenuity from the DeepSeek team.

DeepSeek also innovated to make inference cheaper, reducing the cost of running the model. Moreover, they released a model called R1 that is comparable to OpenAI’s o1 model on reasoning tasks.

They released all the model weights for V3 and R1 publicly. Anyone can download and further improve or customize their models. Furthermore, DeepSeek released their models under the permissive MIT license, which allows others to use the models for personal, academic or commercial purposes with minimal restrictions.

Resetting expectations

DeepSeek has fundamentally altered the landscape of large AI models. An open weights model trained economically is now on par with more expensive and closed models that require paid subscription plans.

The research community and the stock market will need some time to adjust to this new reality.

Ambuj Tewari, Professor of Statistics, University of Michigan

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The post Why building big AIs costs billions – and how Chinese startup DeepSeek dramatically changed the calculus appeared first on theconversation.com

What causes RFK Jr.’s strained and shaky voice? A neurologist explains this little-known disorder

Published on theconversation.com – Indu Subramanian, Clinical Professor of Neurology, University of California, Los Angeles – 2025-05-01 07:45:00

U.S. Secretary of Health and Human Services Robert F. Kennedy Jr. speaks at an April 16, 2025, news conference in Washington, D.C.
Alex Wong via Getty Images

Indu Subramanian, University of California, Los Angeles

Health and Human Services Secretary Robert F. Kennedy Jr. has attracted a lot of attention for his raspy voice, which results from a neurological voice disorder called spasmodic dysphonia.

Kennedy, 71, says that in his 40s he developed a neurological disease that “robbed him of his strong speaking voice.” Kennedy first publicly spoke of the quiver he had noticed in his voice in a 2004 interview with journalist Diane Rehm, who also had spasmodic dysphonia.

In 2005, Kennedy was receiving shots of botulinum toxin, the neurotoxin that is now used in Botox as well as to treat migraines and other conditions, every four months. This first-line treatment for dysphonia helps to weaken the vocal folds that contract abnormally with this condition. He used botulinum toxin injections for 10 years and then stopped using them, saying they were “not a good fit” for him.

Kennedy initially developed symptoms while in the public eye teaching at Pace University in New York. Some viewers wrote to him suggesting that he had the condition spasmodic dysphonia and that he should contact a well-known expert on the disease, Dr. Andrew Blitzer. He followed this advice and had the diagnosis confirmed.

I am a movement disorders neurologist and have long been passionate about the psychological and social toll that conditions such as dysphonias have on my patients.

Kennedy says his condition began in 1996, when he was 42.

Types of dysphonias

In North America, an estimated 50,000 people have spasmodic dysphonia. The condition involves the involuntary pulling of the muscles that open and close the vocal folds, causing the voice to sound strained and strangled, at times with a breathy quality. About 30% to 60% of people with the condition also experience vocal tremor, which can alter the sound of the voice.

Typically, a neurologist may suspect the disorder by identifying characteristic voice breaks when the patient is speaking. The diagnosis is confirmed with the help of an ear, nose and throat specialist who can insert a small scope into the larynx, examine the vocal folds and rule out any other abnormalities.

Because the disorder is not well known to the public, many patients experience a delay in diagnosis and may be misdiagnosed with gastric reflux or allergies.

The most common type of spasmodic dysphonia is called adductor dysphonia, which accounts for 80% of cases. It is characterized by a strained or strangled voice quality with abrupt breaks on vowels due to the vocal folds being hyperadducted, or abnormally closed.

In contrast, a form of the condition called abductor dysphonia causes a breathy voice with breaks on consonants due to uncontrolled abduction – meaning the vocal folds come apart.

Potential treatments

Spasmodic dysphonia is not usually treatable with oral medications, but it can sometimes improve with botulinum toxin injections into the muscles that control the vocal cords. It is a lifelong disorder currently without a cure. Voice therapy through working with a speech pathologist alongside botulinum toxin administration may also be beneficial.

Surgical treatments can be an option for patients who fail botulinum toxin treatment, though surgeries come with risks and can be variably effective. Surgical techniques are being refined and require wider evaluation and long-term follow-up data before being considered as a standard treatment for spasmodic dysphonia.

The sudden, uncontrollable movements caused by irregular folding of the vocal folds are referred to as spasms, which gave rise to the name spasmodic dysphonia.

Dysphonias fall into a broader category of movement disorders

Spasmodic dysphonia is classified as a focal dystonia, a dystonia that affects one body part – the vocal folds, in the case of spasmodic dysphonia. Dystonia is an umbrella term for movement disorders characterized by sustained or repetitive muscle contractions that cause abnormal postures or movements.

The most common dystonia is cervical dystonia, which affects the neck and can cause pulling of the head to one side.

Another type, called blepharospasm, involves involuntary contractions and spasms of the eyelid muscles that can force the eyes closed and, in some cases, even affect vision. There are other dystonias such as writer’s cramp, which can make the hand cramp when writing. Musicians can develop dystonias from overusing certain body parts: violinists can develop dystonia in their hands, and trumpet players in their lips.

Stigmas and psychological distress

Dystonias can cause tremendous psychological distress.

People with dystonias and movement disorders in general, including Parkinson’s disease and other conditions that result in tremors, face tremendous amounts of stigma. In Africa, for instance, there is a misconception that the affected person has been cursed by witchcraft or that the movement disorder is contagious. People with the condition may be hidden from society or isolated because others fear catching the disease.

In the case of spasmodic dysphonia, the affected person may feel that they appear nervous or ill-prepared while speaking publicly. They may be embarrassed or ashamed and isolate themselves from speaking to others.

My patients have been very frustrated by the unpredictable nature of the symptoms and by having to avoid certain sounds that could trigger the dysphonia. They may then have to restructure their word choices and vocabulary so as not to trigger the dysphonia, which can be very mentally taxing.

Some patients with dysphonia feel that their abnormal voice issues affect their relationships and their ability to perform their job or take on leadership or public-facing roles. Kennedy said in an interview that he finds the sound of his own voice to be unbearable to listen to and apologizes to others for having to listen to it.

A 2005 study exploring the biopsychosocial consequences of spasmodic dysphonia through interviews with patients gives some insight into the experience of people living with the disorder.

A patient in that study said that their voice sounded “like some kind of wild chicken screeching out words,” and another patient said that it “feels like you’re having to grab onto a word and push it out from your throat.” Another felt like “there’s a rubber band around my neck. Someone was constricting it.” And another said, “It feels like you have a sore throat all the time … like a raw feeling in your throat.”

Patients in the study described feeling hopeless and disheartened, less confident and less competent. The emotional toll can be huge. One patient said, “I used to be very outgoing and now I find myself avoiding those situations.” Another said, “People become condescending like you’re not capable anymore because you don’t speak well.”

As conditions such as spasmodic dysphonia become better recognized, I am hopeful that not only will treatments improve, but that stigmas around such conditions will diminish.

Indu Subramanian, Clinical Professor of Neurology, University of California, Los Angeles

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The post What causes RFK Jr.’s strained and shaky voice? A neurologist explains this little-known disorder appeared first on theconversation.com



Note: The following A.I. based commentary is not part of the original article, reproduced above, but is offered in the hopes that it will promote greater media literacy and critical thinking, by making any potential bias more visible to the reader –Staff Editor.

Political Bias Rating: Centrist

This content is predominantly medical and informational, focusing on the neurological condition of spasmodic dysphonia and the personal experience of Robert F. Kennedy Jr. with it. The article provides a balanced, fact-based explanation of the disorder, its symptoms, treatments, and social impacts without introducing political or ideological commentary. The tone is neutral, aiming to educate rather than persuade, and it refrains from taking any political stance or showing bias toward any political ideology.

AI is giving a boost to efforts to monitor health via radar

Published on theconversation.com – Chandler Bauder, Electronics Engineer, U.S. Naval Research Laboratory – 2025-04-30 07:48:00

AI-powered radar could enable contactless health monitoring in the home.
Chandler Bauder

Chandler Bauder, U.S. Naval Research Laboratory and Aly Fathy, University of Tennessee

If you wanted to check someone’s pulse from across the room, for example to remotely monitor an elderly relative, how could you do it? You might think it’s impossible, because common health-monitoring devices such as fingertip pulse oximeters and smartwatches have to be in contact with the body.

However, researchers are developing technologies that can monitor a person’s vital signs at a distance. One of those technologies is radar.

We are electrical engineers who study radar systems. We have combined advances in radar technology and artificial intelligence to reliably monitor breathing and heart rate without contacting the body.

Noncontact health monitoring has the potential to be more comfortable and easier to use than traditional methods, particularly for people looking to monitor their vital signs at home.

How radar works

Radar is commonly known for measuring the speed of cars, making weather forecasts and detecting obstacles at sea and in the air. It works by sending out electromagnetic waves that travel at the speed of light, waiting for them to bounce off objects in their path, and sensing them when they return to the device.

Radar can tell how far away things are, how fast they’re moving, and even their shape by analyzing the properties of the reflected waves.
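
The distance calculation follows directly from the round trip: the wave travels to the object and back at the speed of light, so the distance is half the delay multiplied by that speed. A minimal sketch, with an illustrative 20-nanosecond delay chosen as an assumption for this example:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter

def range_from_round_trip(delay_s):
    """Distance to a reflector, given the echo's round-trip delay in seconds."""
    return SPEED_OF_LIGHT_M_PER_S * delay_s / 2

def round_trip_delay(distance_m):
    """Round-trip delay, in seconds, for a reflector at the given distance."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_PER_S

# Illustrative value only: an echo that returns after 20 nanoseconds.
distance_m = range_from_round_trip(20e-9)
print(f"Reflector is about {distance_m:.2f} m away")
```

The tiny delays involved, billionths of a second for objects in the same room, are one reason radar hardware needs very precise timing.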

Radar can also be used to monitor vital signs such as breathing and heart rate. Each breath or heartbeat causes your chest to move ever so slightly – movement that’s hard for people to see or feel. However, today’s radars are sensitive enough to detect these tiny movements, even from across a room.
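
To make this concrete, here is a sketch of the classic, non-AI way to pull rates out of a chest-motion signal: look for the strongest frequency in the breathing band and in the heartbeat band. The signal is synthetic and noise-free, and the amplitudes, rates, sampling rate and frequency bands are all assumptions chosen for illustration, not measurements from any radar.

```python
import cmath
import math

def dominant_rate_per_minute(samples, sample_rate_hz, min_hz, max_hz):
    """Find the strongest frequency within [min_hz, max_hz], in cycles per minute."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_freq, best_power = None, -1.0
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        if not (min_hz <= freq <= max_hz):
            continue
        # One bin of a discrete Fourier transform, computed directly.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    if best_freq is None:
        raise ValueError("no frequency bins fall inside the requested band")
    return best_freq * 60

# Synthetic chest displacement in millimeters (all values are assumptions):
# breathing at 0.25 Hz with 5 mm amplitude, heartbeat at 1.2 Hz with 0.3 mm.
SAMPLE_RATE_HZ = 10
DURATION_S = 60
chest_mm = [
    5.0 * math.sin(2 * math.pi * 0.25 * (i / SAMPLE_RATE_HZ))
    + 0.3 * math.sin(2 * math.pi * 1.2 * (i / SAMPLE_RATE_HZ))
    for i in range(SAMPLE_RATE_HZ * DURATION_S)
]

breaths_per_min = dominant_rate_per_minute(chest_mm, SAMPLE_RATE_HZ, 0.1, 0.5)
beats_per_min = dominant_rate_per_minute(chest_mm, SAMPLE_RATE_HZ, 0.8, 2.0)
print(round(breaths_per_min), round(beats_per_min))
```

On a clean signal like this one, the two rates separate easily because breathing and heartbeat occupy different frequency bands. Real recordings are far messier, since other body movements can land in the same bands.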

Advantages of radar

There are other technologies that can be used to measure health remotely. Camera-based techniques can use infrared light to monitor changes in the surface of the skin in the same manner as pulse oximeters, revealing information about your heart’s activity. Computer vision systems can also monitor breathing and other activities, such as sleep, and they can detect when someone falls.

However, cameras often fail in cases where the body is obstructed by blankets or clothes, or when lighting is inadequate. There are also concerns that different skin tones reflect infrared light differently, causing inaccurate readings for people with darker skin. Additionally, depending on high-resolution cameras for long-term health monitoring brings up serious concerns about patient privacy.

side-by-side images, one of a person and the other a vertical series of nested blobs of color
Radar sees the world in terms of how strongly objects in its view reflect the transmitted signals. The resolution of the images it can generate is much lower than that of images cameras produce.
Chandler Bauder

Radar, on the other hand, solves many of these problems. The wavelengths of the transmitted waves are much longer than those of visible or infrared light, allowing the waves to pass through blankets, clothing and even walls. The measurements aren’t affected by lighting or skin tone, making them more reliable in different conditions.

Radar imagery is also extremely low resolution – think old Game Boy graphics versus a modern 4K TV – so it doesn’t capture enough detail to be used to identify someone, but it can still monitor important activities. While it does project energy, the amount does not pose a health hazard. The health-monitoring radars operate at frequencies and power levels similar to the phone in your pocket.

Radar + AI

Radar is powerful, but it has a big challenge: It picks up everything that moves. Since it can detect tiny chest movements from the heart beating, it also picks up larger movements from the head, limbs or other people nearby. This makes it difficult for traditional processing techniques to extract vital signs clearly.

To address this problem we created a kind of “brain” to make the radar smarter. This brain, which we named mm-MuRe, is a neural network – a type of artificial intelligence – that learns directly from raw radar signals and estimates chest movements. This approach is called end-to-end learning. It means that, unlike other radar plus AI techniques, the network figures out on its own how to ignore the noise and focus only on the important signals.

a diagram with two cartoon representations of people on one side, a brain on the other and vertical curved lines in between
In our study, we used AI to transform raw, unprocessed radar signals into vital signs waveforms of one or two people.
Chandler Bauder

We found that this AI enhancement not only gives more accurate results, it also works faster than traditional methods. It handles multiple people at once, for example an elderly couple, and adapts to new situations, even those it didn’t see during training – such as when people are sitting at different heights, riding in a car or standing close together.

Implications for health care

Reliable remote health monitoring using radar and AI could be a major boon for health care. With no need to touch the patient’s skin, risks of rashes, contamination and discomfort could be greatly reduced. It’s especially helpful in long-term care, where reducing wires and devices can make life significantly easier for patients and caregivers.

Imagine a nursing home where radar quietly watches over residents, alerting caregivers immediately if someone has breathing trouble, falls or needs help. It can be implemented as a home system that checks your breathing while you sleep – no wearables required. Doctors could even use radar to remotely monitor patients recovering from surgery or illness.

This technology is moving quickly toward real-world use. In the future, checking your health could be as simple as walking into a room, with invisible waves and smart AI working silently to take your vital signs.

Chandler Bauder, Electronics Engineer, U.S. Naval Research Laboratory and Aly Fathy, Professor of Electrical Engineering, University of Tennessee

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The post AI is giving a boost to efforts to monitor health via radar appeared first on theconversation.com



Note: The following A.I. based commentary is not part of the original article, reproduced above, but is offered in the hopes that it will promote greater media literacy and critical thinking, by making any potential bias more visible to the reader –Staff Editor.

Political Bias Rating: Centrist

The article is focused on a scientific and technological development related to health monitoring using radar and artificial intelligence. It provides an overview of the research process, technical details, and potential health care applications without expressing a clear ideological stance. The tone remains neutral, emphasizing the technical capabilities and benefits of the technology, particularly in long-term care and home health monitoring. While it does mention potential privacy concerns with other methods like cameras, it does so without taking a political position, focusing instead on the advantages of radar. The content adheres to factual reporting and avoids overt bias or advocacy, presenting the information in a straightforward and informative manner.

Forensics tool ‘reanimates’ the ‘brains’ of AIs that fail in order to understand what went wrong

Published on theconversation.com – David Oygenblik, Ph.D. Student in Electrical and Computer Engineering, Georgia Institute of Technology – 2025-04-30 07:47:00

Tesla crashes are only the most glaring of AI failures.
South Jordan Police Department via AP

David Oygenblik, Georgia Institute of Technology and Brendan Saltaformaggio, Georgia Institute of Technology

From drones delivering medical supplies to digital assistants performing everyday tasks, AI-powered systems are becoming increasingly embedded in everyday life. The creators of these innovations promise transformative benefits. For some people, mainstream applications such as ChatGPT and Claude can seem like magic. But these systems are not magical, nor are they foolproof – they can and do regularly fail to work as intended.

AI systems can malfunction due to technical design flaws or biased training data. They can also suffer from vulnerabilities in their code, which can be exploited by malicious hackers. Isolating the cause of an AI failure is imperative for fixing the system.

But AI systems are typically opaque, even to their creators. The challenge is how to investigate AI systems after they fail or fall victim to attack. There are techniques for inspecting AI systems, but they require access to the AI system’s internal data. This access is not guaranteed, especially to forensic investigators called in to determine the cause of a proprietary AI system failure, making investigation impossible.

We are computer scientists who study digital forensics. Our team at the Georgia Institute of Technology has built a system, AI Psychiatry, or AIP, that can recreate the scenario in which an AI failed in order to determine what went wrong. The system addresses the challenges of AI forensics by recovering and “reanimating” a suspect AI model so it can be systematically tested.

Uncertainty of AI

Imagine a self-driving car veers off the road for no easily discernible reason and then crashes. Logs and sensor data might suggest that a faulty camera caused the AI to misinterpret a road sign as a command to swerve. After a mission-critical failure such as an autonomous vehicle crash, investigators need to determine exactly what caused the error.

Was the crash triggered by a malicious attack on the AI? In this hypothetical case, the camera’s faultiness could be the result of a security vulnerability or bug in its software that was exploited by a hacker. If investigators find such a vulnerability, they have to determine whether that caused the crash. But making that determination is no small feat.

Although there are forensic methods for recovering some evidence from failures of drones, autonomous vehicles and other so-called cyber-physical systems, none can capture the clues required to fully investigate the AI in that system. Advanced AIs can even update their decision-making – and consequently the clues – continuously, making it impossible to investigate the most up-to-date models with existing methods.

Researchers are working on making AI systems more transparent, but unless and until those efforts transform the field, there will be a need for forensics tools to at least understand AI failures.

Pathology for AI

AI Psychiatry applies a series of forensic algorithms to isolate the data behind the AI system’s decision-making. These pieces are then reassembled into a functional model that performs identically to the original model. Investigators can “reanimate” the AI in a controlled environment and test it with malicious inputs to see whether it exhibits harmful or hidden behaviors.

AI Psychiatry takes in as input a memory image, a snapshot of the bits and bytes loaded when the AI was operational. The memory image at the time of the crash in the autonomous vehicle scenario holds crucial clues about the internal state and decision-making processes of the AI controlling the vehicle. With AI Psychiatry, investigators can now lift the exact AI model from memory, dissect its bits and bytes, and load the model into a secure environment for testing.

Our team tested AI Psychiatry on 30 AI models, 24 of which were intentionally “backdoored” to produce incorrect outcomes under specific triggers. The system was successfully able to recover, rehost and test every model, including models commonly used in real-world scenarios such as street sign recognition in autonomous vehicles.
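
What testing a rehosted model for a backdoor might look like can be sketched in a few lines. This is not AI Psychiatry's code and does not reflect its interface; the two toy "models," the four-corner trigger pattern and the tiny 4x4 images are all hypothetical stand-ins invented for illustration.

```python
def clean_classifier(image):
    """Toy sign classifier: 'stop' if more than half the pixels are lit."""
    lit = sum(sum(row) for row in image)
    total = len(image) * len(image[0])
    return "stop" if lit > total / 2 else "yield"

def backdoored_classifier(image):
    """Same rule, plus a hidden trigger: four lit corners force 'speed_limit'."""
    last = len(image) - 1
    if all(image[r][c] == 1 for r, c in ((0, 0), (0, last), (last, 0), (last, last))):
        return "speed_limit"
    return clean_classifier(image)

def stamp_trigger(image):
    """Return a copy of the image with the suspected trigger pattern applied."""
    stamped = [row[:] for row in image]
    last = len(stamped) - 1
    for r, c in ((0, 0), (0, last), (last, 0), (last, last)):
        stamped[r][c] = 1
    return stamped

def find_flipped(model, images, transform):
    """List the inputs whose prediction changes once the transform is applied."""
    flipped = []
    for name, image in images.items():
        before, after = model(image), model(transform(image))
        if before != after:
            flipped.append((name, before, after))
    return flipped

# Hypothetical test images with unlit corners (1 = lit pixel, 0 = dark).
images = {
    "stop_sign": [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]],
    "yield_sign": [[0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
}
suspect_report = find_flipped(backdoored_classifier, images, stamp_trigger)
clean_report = find_flipped(clean_classifier, images, stamp_trigger)
print(suspect_report)
print(clean_report)
```

The comparison is the point: a small, seemingly harmless change that flips the suspect model's answers but leaves a clean model's answers alone is a red flag worth investigating.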

Thus far, our tests suggest that AI Psychiatry can effectively solve the digital mystery behind a failure such as an autonomous car crash that previously would have left more questions than answers. And if it does not find a vulnerability in the car’s AI system, AI Psychiatry allows investigators to rule out the AI and look for other causes such as a faulty camera.

Not just for autonomous vehicles

AI Psychiatry’s main algorithm is generic: It focuses on the universal components that all AI models must have to make decisions. This makes our approach readily extendable to any AI models that use popular AI development frameworks. Anyone working to investigate a possible AI failure can use our system to assess a model without prior knowledge of its exact architecture.

Whether the AI is a bot that makes product recommendations or a system that guides autonomous drone fleets, AI Psychiatry can recover and rehost the AI for analysis. AI Psychiatry is entirely open source for any investigator to use.

AI Psychiatry can also serve as a valuable tool for conducting audits on AI systems before problems arise. With government agencies from law enforcement to child protective services integrating AI systems into their workflows, AI audits are becoming an increasingly common oversight requirement at the state level. With a tool like AI Psychiatry in hand, auditors can apply a consistent forensic methodology across diverse AI platforms and deployments.

In the long run, this will pay meaningful dividends both for the creators of AI systems and everyone affected by the tasks they perform.

David Oygenblik, Ph.D. Student in Electrical and Computer Engineering, Georgia Institute of Technology and Brendan Saltaformaggio, Associate Professor of Cybersecurity and Privacy, and Electrical and Computer Engineering, Georgia Institute of Technology

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The post Forensics tool ‘reanimates’ the ‘brains’ of AIs that fail in order to understand what went wrong appeared first on theconversation.com



Note: The following A.I. based commentary is not part of the original article, reproduced above, but is offered in the hopes that it will promote greater media literacy and critical thinking, by making any potential bias more visible to the reader –Staff Editor.

Political Bias Rating: Centrist

The article focuses on the development of a forensic tool, AI Psychiatry, designed to investigate the failure of AI systems. It provides technical insights into how this tool can help investigate and address AI failures, particularly in autonomous vehicles, without promoting any ideological stance. The content is centered on technological advancements and their practical applications, with an emphasis on problem-solving and transparency in AI systems. The tone is neutral, focusing on factual reporting about AI forensics and the technical capabilities of the system. There is no discernible political bias in the article, as it largely sticks to technical and academic subjects without introducing political viewpoints.
