• Home
  • Blog
  • AI News
  • Microsoft AI Diagnoses Patients More Accurately Than Human Doctors

Microsoft AI Diagnoses Patients More Accurately Than Human Doctors

Updated:June 30, 2025

Reading Time: 3 minutes
An AI generated image of a Microsoft building

Microsoft has unveiled a new AI system that significantly outperforms human doctors in diagnosing medical conditions. 

The tool, called the MAI Diagnostic Orchestrator (MAI-DxO), reportedly diagnoses illnesses four times more accurately than a group of trained physicians.

This development suggests that AI may soon play a more active role in clinical care, potentially improving accuracy and reducing costs.

Approach to Diagnosis

MAI-DxO uses a novel framework that simulates how doctors work together to reach a diagnosis. 

Microsoft describes it as a “chain-of-debate” system. It brings together several advanced AI models (OpenAI’s GPT, Google’s Gemini, Meta’s Llama, Anthropic’s Claude, and xAI’s Grok).

And each model contributes independently. Then, the system evaluates their responses collectively to determine the most accurate and cost-effective diagnosis. 

This process mirrors how a team of specialists might consult one another in a real clinical setting.

Mustafa Suleyman, CEO of Microsoft AI, described the project as “a genuine step towards medical superintelligence.”

The System’s Performance

Microsoft tested MAI-DxO using 304 medical cases from the New England Journal of Medicine.

Each case was used to simulate the diagnostic process. The company built a benchmark called the Sequential Diagnosis Benchmark (SDBench) to evaluate the AI’s performance.

The results were striking. MAI-DxO achieved 80% accuracy, compared to the 20% accuracy achieved by a panel of human physicians. 

In addition, it cut costs by 20%, often selecting less expensive tests or procedures without reducing diagnostic quality.

How the System Works

The AI follows a structured process that mirrors the clinical steps taken by human doctors.

But, unlike a single physician, MAI-DxO gathers input from multiple AI models, creating a multi-perspective view that enhances decision-making:

  • It receives the patient’s case details.
  • It evaluates the information.
  • It suggests necessary tests.
  • It interprets the results.
  • It proposes a diagnosis.

Talent Shifts

The Microsoft office 
Image Credit: Gary Hershorn/Getty Images

To build this system, Microsoft hired several former Google AI researchers. Poaching top tablets isn’t new. Recently, Meta hired OpenAI’s top researchers to further its AI agenda. 

Suleyman himself is a former Google executive, previously involved in AI research. His leadership signals Microsoft’s increasing focus on developing AI applications in high-stakes fields such as health care.

Potential and Risks

While the technology is still in its early stages, Microsoft believes it could support or even improve clinical decision-making in the near future. 

The company is considering multiple deployment options. For example, MAI-DxO could be integrated into Bing, offering symptom-checking features to users. 

It could also support medical professionals by automating parts of the diagnostic workflow.

However, there are challenges. One major concern is data bias. Like all AI models, MAI-DxO relies on the data it was trained on. 

If the training data lacks diversity, the system may not perform equally well for all demographic groups.

Microsoft has acknowledged this risk and has not yet committed to full commercialization.

Instead, it plans to continue testing the system in more real-world scenarios before any broad rollout.

AI and Medicine

This project adds to a growing body of research showing how large language models can assist in medical tasks. 

Both Microsoft and Google have published studies suggesting that these models can interpret medical records and identify conditions accurately.

However, Microsoft’s latest work is different. MAI-DxO reasons through symptoms, orders tests, and suggests cost-conscious decisions, just like a trained clinician would.

Dominic King, a vice president at Microsoft involved in the project, emphasized this point. 

“Our model performs incredibly well, both getting to the diagnosis and getting to that diagnosis very cost-effectively,” he said.

Lolade

Contributor & AI Expert