
eBook

Ship Health AI
Products Faster:
Strategies to Deploy with
Quality and Speed

Brought to you in partnership by


This guide explores the landscape of AI deployment in
healthcare, focusing on three key areas: current strategies
for high-quality deployment and evaluation, challenges
faced by organizations implementing these strategies,
and innovative solutions to address these obstacles. By
understanding these elements, healthcare organizations
can navigate the complexities of AI implementation more
effectively, ultimately leading to faster deployment of
high-quality AI products.
Current Strategies for High-Quality AI Deployment and
Evaluation in Healthcare
As AI rapidly advances, efficient deployment and evaluation have become crucial for healthcare, especially in
areas like medical imaging and clinical decision support. This section presents key strategies in four critical
areas: building high-quality datasets, using modern development techniques, integrating human expertise,
and implementing quality assurance.

By adopting these approaches, healthcare organizations can streamline AI development, enhance diagnostic accuracy, and improve patient outcomes.

Accessing and Building Diverse High-Quality Datasets


Effective AI systems rely on acquiring and creating large, diverse, and high-quality datasets.

Multimodal Datasets: There's a growing emphasis on developing AI systems that can process and
understand multiple types of data simultaneously – text, images, audio, and more. This multimodal
approach leads to more versatile and robust AI systems. In healthcare, multimodal AI systems can combine
patient electronic health records (text), medical imaging (visual data), and vital sign measurements
(numerical data) to provide more comprehensive diagnosis and treatment recommendations. In oncology,
multimodal AI systems can combine genomic data, histopathology images, clinical notes, and treatment
outcomes to provide more accurate cancer prognosis and personalized treatment plans.

Data Acquisition Strategies: Organizations acquire data through partnerships, crowdsourcing, and
in-house collection. In healthcare, this might involve partnerships with hospital networks for anonymized
patient data, collaborations with medical imaging centers, or participation in large-scale health data
initiatives like the UK Biobank.

Data Augmentation with Synthetic Data: To overcome limitations in data availability, many organizations
are turning to synthetic data generation. This involves creating artificial data that mimics the characteristics
of real-world data, allowing for larger and more diverse training sets. In medical imaging, synthetic data can
be used to generate rare pathologies, helping AI models learn to identify conditions that occur infrequently
in real-world datasets.
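As a minimal illustration of the idea, the sketch below fits per-feature statistics from a small "real" cohort and samples synthetic vital-sign records from independent Gaussians. Production generators (e.g., GANs or diffusion models) are far more sophisticated; all names and values here are hypothetical:

```python
import random
import statistics

def fit_feature_stats(records):
    """Estimate per-feature mean/stdev from real records (list of dicts)."""
    stats = {}
    for feature in records[0]:
        values = [r[feature] for r in records]
        stats[feature] = (statistics.mean(values), statistics.stdev(values))
    return stats

def generate_synthetic(stats, n, seed=0):
    """Sample synthetic records from independent Gaussians per feature."""
    rng = random.Random(seed)
    return [{f: rng.gauss(mu, sigma) for f, (mu, sigma) in stats.items()}
            for _ in range(n)]

# Toy "real" vital-sign cohort (invented values)
real = [
    {"heart_rate": 72, "systolic_bp": 118},
    {"heart_rate": 80, "systolic_bp": 130},
    {"heart_rate": 65, "systolic_bp": 110},
    {"heart_rate": 77, "systolic_bp": 125},
]
stats = fit_feature_stats(real)
synthetic = generate_synthetic(stats, n=100)
```

Note that sampling features independently already loses real-world correlations (e.g., between heart rate and blood pressure), which is exactly the limitation discussed later in this guide.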

Scalable Data Annotation Tools and Processes: Efficient data labeling is crucial for AI development.
Advanced annotation tools and streamlined processes are being developed to make this traditionally
time-consuming task more manageable and scalable. In healthcare, this involves developing specialized
tools for tasks like segmenting medical images or annotating complex clinical notes, often requiring input
from medical professionals.

Modern Development Techniques


With high-quality datasets in hand, the field of AI is witnessing a shift towards more sophisticated
development techniques that promise greater efficiency.

Model Adaptation and Transfer Learning: Healthcare organizations are accelerating AI development by
leveraging pre-trained foundation models as starting points for specialized applications. This approach
allows organizations to fine-tune existing models for specific healthcare tasks using relatively small amounts
of domain-specific data. For example, foundation models like Med-PaLM 2 can be adapted for medical
question answering or clinical note summarization, while general-purpose language models can be
fine-tuned with a few thousand radiology reports to create specialized medical imaging analysis tools. This
efficient approach significantly reduces development time and computational requirements while maintaining
high performance standards.
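The fine-tuning idea can be sketched in miniature: a frozen "pretrained" encoder supplies features, and only a small task head is trained on domain data. The encoder, data, and labels below are hypothetical stand-ins, not a real foundation model:

```python
import math

def pretrained_features(x):
    """Stand-in for a frozen foundation-model encoder: maps raw input
    to a fixed feature vector (weights are NOT updated during tuning)."""
    return [x, x * x]  # hypothetical 2-d embedding

def train_head(data, lr=0.1, epochs=500):
    """Fine-tune only a small logistic-regression head on domain data."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of cross-entropy loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    f = pretrained_features(x)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z > 0 else 0

# Tiny labelled domain set (invented): label 1 when x > 0.5
data = [(0.1, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
w, b = train_head(data)
```

The key efficiency point is that only the handful of head parameters are updated, which is why a few thousand labelled examples can suffice.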

Retrieval-Augmented Generation (RAG): This technique combines the power of large language models
with the ability to retrieve and incorporate external knowledge. RAG allows AI systems to generate more
accurate and contextually relevant responses by accessing up-to-date information. In healthcare, RAG can
be used to create AI systems that generate clinical summaries by combining patient-specific data with the
latest medical literature and treatment guidelines.
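A toy version of the RAG loop, with a keyword-overlap retriever standing in for the dense vector search used in practice (the documents and query below are invented):

```python
def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (toy retriever;
    production RAG systems use dense vector search instead)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Assemble a grounded prompt: retrieved context plus the question."""
    context = "\n".join(retrieve(query, documents))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context.")

# Hypothetical mini knowledge base
docs = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "Statins lower LDL cholesterol and cardiovascular risk.",
    "Annual retinal screening is recommended for diabetes patients.",
]
prompt = build_prompt("What is the first-line treatment for type 2 diabetes?", docs)
```

The prompt would then be passed to a language model, which grounds its answer in the retrieved context rather than in its frozen training data.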

Graph RAG: An extension of RAG, Graph RAG incorporates structured knowledge in the form of knowledge
graphs. This allows for a deeper understanding and generation of information, particularly in domains
with complex relationships between concepts. In healthcare, Graph RAG can be particularly useful for tasks
like drug discovery, where understanding the complex relationships between molecules, proteins, and
biological pathways is crucial.
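The core Graph RAG primitive, retrieving a chain of relationships from a knowledge graph as structured context, can be sketched with a breadth-first search over a deliberately simplified, hypothetical biomedical graph:

```python
from collections import deque

# Hypothetical biomedical knowledge graph: entity -> related entities
graph = {
    "aspirin": ["COX-1", "COX-2"],
    "COX-1": ["prostaglandin synthesis"],
    "COX-2": ["prostaglandin synthesis", "inflammation pathway"],
    "prostaglandin synthesis": [],
    "inflammation pathway": [],
}

def find_path(graph, start, goal):
    """Breadth-first search for a relationship chain between two entities;
    Graph RAG systems retrieve such chains as structured context."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no relationship chain found

path = find_path(graph, "aspirin", "inflammation pathway")
```

The retrieved chain (drug, target, pathway) can then be serialized into the prompt, giving the model explicit relational structure that plain text retrieval would miss.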

AI Agents and "Agentic AI": There's a growing interest in developing AI systems that can act more
autonomously, making decisions and taking actions based on high-level instructions. This approach aims to
create more versatile and adaptable AI solutions. As of now, "Agentic AI" in healthcare remains largely in the
experimental and development stages. The field has seen preliminary applications mainly in research
settings where AI agents perform tasks such as managing patient data, predicting patient outcomes, and
automating routine tasks in a controlled environment. However, the deployment of fully autonomous AI
agents in clinical settings is still limited due to regulatory, ethical, and safety considerations.

Balancing Automation and Human Input: The most successful AI deployments strike a delicate balance
between leveraging the power of automation and preserving the informed judgment of human experts. This
hybrid approach allows for rapid scaling while maintaining high standards of quality and relevance. In
healthcare, this often means using AI to assist rather than replace clinicians, such as AI systems that flag
potential issues in medical images for human review.

Continuous Model Evaluation: Healthcare organizations are increasingly recognizing the critical role of
expert oversight in post-deployment monitoring. This involves clinicians and domain experts actively
evaluating AI performance in real-world settings, identifying emerging edge cases, and providing feedback
to ensure models maintain their accuracy as clinical practices evolve. For instance, radiologists might
regularly assess an AI diagnostic tool's performance across diverse patient populations, helping identify any
shifts in accuracy or potential biases that develop over time.
Quality Assurance and Accuracy Improvement
As AI systems become more complex, ensuring their quality and accuracy becomes increasingly critical,
especially in high-stakes domains.

Model Monitoring and Feedback Loops: Continuous monitoring of AI models in production environments
is becoming standard practice. This involves tracking performance metrics and collecting user feedback to
identify areas for improvement. For example, an AI system for predicting hospital readmissions might be
continuously monitored for changes in performance as patient demographics or treatment protocols evolve.
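A minimal sketch of such monitoring: compare accuracy over the most recent window of outcomes against a preceding baseline window, and flag drift when the drop exceeds a threshold. The window size and threshold are illustrative choices, not recommendations:

```python
def detect_performance_drift(outcomes, window=50, threshold=0.05):
    """Flag drift when accuracy in the most recent window falls more than
    `threshold` below the accuracy of the preceding baseline window.
    `outcomes` is a chronological list of 1 (correct) / 0 (incorrect)."""
    if len(outcomes) < 2 * window:
        return False  # not enough history yet
    baseline = sum(outcomes[-2 * window:-window]) / window
    recent = sum(outcomes[-window:]) / window
    return (baseline - recent) > threshold

# Hypothetical monitoring stream: accuracy drops from ~90% to ~70%
stable = [1] * 45 + [0] * 5      # 90% correct
degraded = [1] * 35 + [0] * 15   # 70% correct
```

In practice such checks run on many metrics at once (accuracy by demographic subgroup, calibration, input distribution statistics), with a flagged window triggering expert review.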

Regular Testing and Evaluation: Rigorous testing protocols, including stress testing and edge case
analysis, are being implemented to ensure AI systems perform reliably across a wide range of scenarios. In
healthcare, this includes testing AI models across diverse patient populations and rare medical conditions to
ensure robust performance in real-world clinical settings.

Reinforcement Learning from Human Feedback (RLHF): This technique involves fine-tuning AI models
based on human feedback, allowing for continuous improvement and alignment with human preferences
and values. RLHF could be used in healthcare to refine AI-powered clinical decision support systems based
on feedback from experienced clinicians.
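The reward-modelling step at the heart of RLHF can be sketched with the Bradley-Terry model: learn a scalar reward per candidate answer from pairwise preferences. The clinician preferences below are invented, and real RLHF learns rewards over model parameters rather than a fixed item list:

```python
import math

def train_reward_scores(preferences, items, lr=0.5, epochs=200):
    """Learn a scalar reward per item from pairwise (winner, loser)
    preferences using the Bradley-Terry model, the same idea that
    underlies RLHF reward modelling."""
    r = {item: 0.0 for item in items}
    for _ in range(epochs):
        for winner, loser in preferences:
            # Probability the model currently assigns to the observed preference
            p = 1.0 / (1.0 + math.exp(r[loser] - r[winner]))
            # Gradient ascent on the log-likelihood of the preference
            r[winner] += lr * (1.0 - p)
            r[loser] -= lr * (1.0 - p)
    return r

# Hypothetical clinician feedback: draft C preferred over B, B over A
prefs = [("C", "B"), ("B", "A"), ("C", "A")]
rewards = train_reward_scores(prefs, ["A", "B", "C"])
```

The learned rewards then serve as the training signal for the policy model, steering it toward outputs clinicians consistently prefer.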

By employing these strategies, healthcare organizations are working towards the goal of deploying AI products
that are not only powerful and efficient but also accurate, reliable, and scalable. However, as we'll explore in
the next section, these approaches are not without their challenges.
Challenges in Current AI Deployment Practices in
Healthcare
While the strategies outlined in the previous section offer promising avenues for AI deployment in
healthcare, they are not without significant challenges. This section explores the obstacles healthcare
organizations face when implementing these approaches, with a particular focus on data quality,
accessibility, and the unique complexities of the healthcare domain.

Data Quality and Accessibility Issues


The foundation of AI systems—data—often presents significant hurdles in the deployment process in
healthcare. Organizations face multiple challenges in acquiring, preparing, and maintaining high-quality
datasets that are representative, compliant, and cost-effective.

Dataset Bias and Representation: Ensuring datasets are truly representative and free from bias remains
a significant challenge in healthcare AI. Biased datasets can lead to AI systems that perform poorly for
certain demographic groups or rare medical conditions, potentially exacerbating existing health disparities.
For example, a skin cancer detection AI trained predominantly on images of lighter skin tones may
underperform when analyzing darker skin tones, leading to missed diagnoses and delayed treatments for
certain patient populations.

To mitigate this, healthcare organizations must implement inclusive data collection practices, ensuring
sufficient sample sizes across all relevant patient demographic groups and conditions. This may involve
stratified sampling techniques and collaboration with diverse medical experts and patient advocates to
define comprehensive data requirements.
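One such technique, stratified sampling, can be sketched as drawing a fixed number of records from each demographic stratum so that minority groups are not swamped by the majority. The group labels and counts below are hypothetical:

```python
import random

def stratified_sample(records, strata_key, n_per_stratum, seed=0):
    """Draw an equal number of records from each stratum so every
    demographic group is represented in the training set."""
    rng = random.Random(seed)
    strata = {}
    for r in records:
        strata.setdefault(r[strata_key], []).append(r)
    sample = []
    for group, members in strata.items():
        if len(members) < n_per_stratum:
            raise ValueError(f"stratum {group!r} has only {len(members)} records")
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical cohort heavily skewed toward one skin-tone group
cohort = ([{"id": i, "skin_tone": "I-II"} for i in range(80)]
          + [{"id": 100 + i, "skin_tone": "V-VI"} for i in range(20)])
balanced = stratified_sample(cohort, "skin_tone", n_per_stratum=15)
```

The `ValueError` branch is the important part in practice: it surfaces under-represented strata early, prompting targeted data collection rather than silent imbalance.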

Synthetic Data Limitations: While synthetic data offers a potential solution to data scarcity in healthcare,
its practical application remains highly experimental and unproven. Current synthetic data generation
techniques often cannot fully replicate the complexity of real-world medical data. They may also inadvertently
reinforce existing biases present in the seed data or fail to capture the full complexity of real-world medical
scenarios. For instance, synthetically generated electronic health records might not accurately reflect the
nuanced patterns of comorbidities or treatment responses seen in actual patient populations.

Data Privacy and Regulatory Compliance: Strict regulations such as HIPAA in the United States impose
significant constraints on data sharing and access in healthcare. These necessary privacy protections can
make it challenging to build comprehensive datasets for AI training, especially for rare conditions or across
different healthcare systems. Ensuring compliance while maintaining data utility for AI development requires
sophisticated anonymization techniques and secure data sharing frameworks.

High Costs of Quality Data: Beyond the financial burden, acquiring high-quality, labeled healthcare datasets presents significant accessibility challenges. Many valuable datasets are siloed within individual healthcare institutions or research centers, making it difficult for AI developers to access diverse, representative data. Furthermore, the expertise required to accurately label complex medical data (e.g., interpreting medical imaging or annotating clinical notes) can be scarce and expensive.

To address this, healthcare organizations are exploring collaborative data-sharing initiatives and investing in
scalable, expert-driven data annotation solutions. These approaches aim to democratize access to
high-quality healthcare data while maintaining stringent quality and privacy standards.

Scaling Human Expertise


Human expertise plays a vital role in AI development, particularly in healthcare. However, integrating this
expertise effectively at scale presents its own set of challenges. Healthcare organizations must navigate the
complexities of finding, utilizing, and balancing human knowledge with automated processes.

Lack of Universal Truth in Healthcare: A critical challenge in healthcare AI is the absence of a single,
universally accepted "ground truth" for many medical decisions. There are often high rates of inter-rater
disagreement among physicians, especially in complex cases. For example, studies have shown significant
variability in radiologists' interpretations of the same medical images. This inherent subjectivity in medical
decision-making poses a unique challenge for AI development, as it complicates the process of creating
reliable training data and evaluating model performance.
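Inter-rater disagreement is commonly quantified with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A self-contained sketch on invented ratings:

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_chance = sum((rater_a.count(l) / n) * (rater_b.count(l) / n)
                   for l in labels)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical reads of the same 8 scans by two radiologists
rad1 = ["malignant", "benign", "benign", "malignant",
        "benign", "benign", "malignant", "benign"]
rad2 = ["malignant", "benign", "malignant", "malignant",
        "benign", "benign", "benign", "benign"]
kappa = cohens_kappa(rad1, rad2)
```

Here the raters agree on 6 of 8 cases (75%), yet kappa is only about 0.47, i.e., "moderate" agreement once chance is discounted, which illustrates why raw agreement rates overstate the reliability of a single-expert "ground truth."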

Shortage of AI and Domain Experts: There's a growing demand for professionals who understand both AI
technologies and specific healthcare domains. However, it's important to note that domain expertise in
healthcare is not always straightforward or easily identifiable. Just because an individual has a background
in a particular medical area doesn't necessarily mean they excel at AI-related tasks in that domain.

Furthermore, there's a scarcity of methods to reliably assess whether human experts are indeed "experts"
whose judgments can be trusted for AI development. This challenge is compounded in healthcare, where
expertise can be highly specialized and context-dependent. For instance, a general radiologist might not
have the specific expertise needed for a cutting-edge AI project in neuroradiology.

Knowledge Transfer Bottlenecks: Efficiently capturing and integrating expert knowledge into AI systems remains challenging, often resulting in a trade-off between scale and depth of expertise. In healthcare, where knowledge is often tacit and based on years of clinical experience, translating this expertise into a form that can be used by AI systems is particularly difficult. For example, an experienced dermatologist might instinctively recognize a rare skin condition based on subtle visual cues that are hard to articulate or codify for an AI system. Developing methods to effectively transfer this type of experience-based knowledge is a significant challenge in healthcare AI.
Balancing Automation and Human Oversight: Determining the right level of human involvement in AI
systems is an ongoing challenge. While AI can process vast amounts of data quickly, human expertise is
often needed to interpret results in context, handle edge cases, and make ethically complex decisions. For
instance, in a clinical decision support system, how much autonomy should the AI have in recommending
treatment plans versus serving as a tool to augment physician decision-making?

Need for Scalable Expert Annotation: To develop high-quality healthcare AI, there's a pressing need for
expert annotation of large datasets. Traditional methods of expert annotation are often slow and don't scale
well. Healthcare organizations need to create systems that can leverage human expertise at a speed
consistent with the rapid pace of AI development.

This challenge calls for innovative approaches to data annotation, such as developing platforms that can
efficiently crowdsource expert opinions while maintaining high quality standards. Such systems need to be
designed to handle the complexity and specificity of healthcare data while also accounting for the potential
variability in expert opinions.

Technical Hurdles in AI Development


The path from conceptualization to implementation of AI systems in healthcare is often fraught with technical
obstacles. These challenges range from the limitations of current methodologies to the complexities of
adapting general models to specific use cases.

Limitations of Transfer Learning: While transfer learning can accelerate development, fine-tuning models
for specific domains often requires significant domain-specific data and expertise. For example, adapting a
general-purpose image recognition model to identify rare genetic disorders from facial features requires not
only specialized medical data but also deep understanding of genetic phenotypes.

Complexity of RAG Systems: Implementing effective Retrieval-Augmented Generation (RAG) systems in healthcare involves managing knowledge bases that must rapidly evolve with medical science. The
emergence of COVID-19 illustrates this challenge perfectly - RAG systems needed to quickly incorporate
new information about symptoms, treatments, and outcomes as they emerged, while maintaining accuracy
and relevance for all other conditions. Beyond just adding new information, these systems must understand
complex medical queries and synthesize results that remain contextually relevant as medical knowledge
expands. This requires sophisticated approaches to validate both existing and newly incorporated
information.

"Last Mile" Development Challenges: Adapting general-purpose AI models to specific healthcare use
cases often requires substantial effort and resources, sometimes negating the initial efficiency gains. This is
particularly evident in areas like personalized medicine, where models need to account for individual patient
variability and complex comorbidities.

Quality Assurance and Performance Validation: In healthcare AI, ensuring reliable performance across
diverse patient populations and clinical scenarios is critical. For instance, a diabetic retinopathy detection AI
must not only be accurate but also perform consistently across different ethnicities, ages, and comorbidities.
Continuous monitoring and updating of these systems to adapt to new medical knowledge and changing
patient demographics present ongoing technical challenges.

Ethical and Trust Considerations


The deployment and evaluation of AI systems in healthcare raises critical ethical questions and trust issues
that organizations must carefully navigate.

Explainability and Transparency: Developing AI systems that can provide clear explanations for their decisions is paramount in healthcare, where understanding the rationale behind a diagnosis or treatment recommendation
is essential for both clinicians and patients. For example, an AI system recommending a particular cancer
treatment must be able to explain its reasoning in terms that oncologists can validate and patients can
understand. This remains technically challenging, especially for complex deep learning models.

Managing User Trust: Building and maintaining trust in AI systems among healthcare professionals,
patients, and regulatory bodies is no simple task. This requires ongoing effort, clear communication, and
demonstrated reliability. AI-assisted diagnostic tools need to prove their consistency and accuracy over time
to gain the confidence of radiologists or pathologists who may initially be skeptical of automated
assessments.

Addressing Ethical Concerns: Healthcare AI must navigate complex ethical terrain, including issues of
fairness, accountability, and potential impacts on the healthcare workforce. Ensuring AI systems don't
perpetuate or exacerbate existing health disparities is vital. For example, AI-driven resource allocation
systems in hospitals must be carefully designed to avoid biases against certain patient groups. Additionally,
the potential for AI to change the roles of healthcare workers must be addressed thoughtfully, balancing
efficiency gains with the preservation of essential human care elements.

Data Privacy and Consent: In healthcare, AI systems often require access to sensitive patient data, raising
significant privacy concerns. Ensuring robust data protection measures and obtaining informed consent for
AI use in patient care are ongoing challenges. This is particularly complex in scenarios like federated
learning, where AI models are trained across multiple healthcare institutions without centralizing patient
data.

By recognizing and proactively addressing these ethical and trust considerations, healthcare organizations
can develop more responsible and effective AI deployment strategies. This approach not only mitigates risks
but also paves the way for AI systems that enhance patient care while maintaining the trust and confidence
of all stakeholders in the healthcare ecosystem.
Centaur Labs' Experts-in-the-Loop Approach for
Accelerating AI Deployment and Evaluation in
Healthcare
Centaur Labs has developed an innovative expert-in-the-loop approach that directly addresses the
challenges faced in healthcare AI deployment. By combining the power of collective intelligence with
advanced technology, Centaur Labs offers a comprehensive solution that enhances data quality, accelerates
development, and ensures ongoing model refinement. Let's explore how this approach is transforming
healthcare AI.

Scalable High-Quality Data Labeling


Centaur Labs revolutionizes data labeling in healthcare AI through its global network of over 50,000 health
experts. This vast talent pool, combined with an innovative gamified platform, enables:

● Rapid, Large-Scale Annotation: Processing over 2 million opinions weekly, translating to thousands of labels per day for each customer.
● High Accuracy: Consistently achieving 85-95% agreement with ground truth across various use
cases.
● Diversity in Expertise: Approximately 50% US-based and 50% medical students, providing a rich
pool of knowledge.

The platform, DiagnosUs, is HIPAA and SOC 2 Type II compliant, ensuring data security and privacy in medical AI development. Users compete to label data, and their skill is continuously monitored and rewarded through hidden test cases.

Expert Oversight in Crowdsourced Annotations


Centaur Labs ingeniously combines the scalability of crowdsourcing with the precision of expert oversight:

● Collective Intelligence: Leveraging insights from diverse medical experts to surpass the accuracy
of individual annotators. This approach recognizes that in healthcare, there's often no single "ground
truth," and multiple expert opinions can capture the nuances of medical interpretation more
effectively than a single expert view.
● Quality Control System: Continuous monitoring through hidden test cases, ensuring only
top-performing annotators' inputs are utilized. This system dynamically assesses annotator
performance, adjusting the weight given to their opinions based on ongoing accuracy.

This model tackles the challenges of scaling human expertise in healthcare AI and ensures high-quality data
annotation even for complex medical tasks.
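The weighting idea can be sketched as an accuracy-weighted vote that discards annotators below a quality bar. This is an illustrative reconstruction of the general technique, not Centaur Labs' actual algorithm; all names and accuracy figures are invented:

```python
def weighted_vote(opinions, annotator_accuracy, min_accuracy=0.7):
    """Combine crowd labels, weighting each annotator by measured accuracy
    on hidden test cases and dropping low performers entirely."""
    totals = {}
    for annotator, label in opinions:
        acc = annotator_accuracy[annotator]
        if acc < min_accuracy:
            continue  # below the quality bar: opinion is discarded
        totals[label] = totals.get(label, 0.0) + acc
    return max(totals, key=totals.get)

# Hypothetical case: two strong annotators outweigh three weak ones
opinions = [("a1", "melanoma"), ("a2", "melanoma"),
            ("a3", "nevus"), ("a4", "nevus"), ("a5", "nevus")]
accuracy = {"a1": 0.95, "a2": 0.92, "a3": 0.72, "a4": 0.60, "a5": 0.65}
label = weighted_vote(opinions, accuracy)
```

Note how the simple-majority answer ("nevus", 3 of 5 opinions) is overturned once annotator skill is taken into account.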
Multimodal Solutions for Healthcare AI
Centaur Labs' platform supports a wide range of data types, crucial for comprehensive healthcare AI
development:

● Diverse Data Handling: Processes text, images, audio, video, and waveforms (e.g., clinical notes,
2D and 3D medical imaging, heart and lung sounds, surgical and endoscopic footage, ECG, EEG).
● Versatile Annotation Tasks: Supports polygon, mask, box, line, circle, contour, NER, centroid,
classification, and range selection annotations.
● Regulatory Compliance: Ensures adherence to healthcare standards such as HIPAA and SOC 2
Type II.

This multimodal capability addresses the challenges of data heterogeneity and standardization in
healthcare, while maintaining regulatory compliance.

Continuous Model Evaluation and Refinement


Centaur Labs extends its expertise beyond initial development, offering ongoing model monitoring and
refinement:

● Real-Time Performance Assessment: Enables rapid identification of edge cases and model
underperformance in live healthcare environments.
● Adaptive Improvement: Allows for quick adjustments to AI models based on expert feedback and
evolving medical knowledge.
● Regulatory Alignment: Helps streamline compliance with evolving regulatory requirements through
continuous expert oversight.

This continuous feedback loop addresses the challenges of maintaining model relevance, handling concept
drift in medical knowledge, and ensuring ongoing regulatory compliance.

Empowering Advanced AI Techniques


Centaur Labs' platform is designed to support cutting-edge AI development techniques in healthcare:

● Foundation Model Fine-Tuning: Provides high-quality, expert-validated data for effectively fine-tuning large language models and other foundation models for specific healthcare applications.
● Enhanced Contextual Understanding: Facilitates the development of more sophisticated and
contextually aware AI models, crucial for navigating the complexities of medical terminology and
procedures.
● Data Curation and Cleaning: Helps understand dataset composition, create classes for efficient
labeling, and remove low-quality or irrelevant data before labeling.
● Vector Database Augmentation: Improves LLM outputs by enhancing prompts with context
relevant to healthcare use cases.
● RAG Development Support: Enables creation and maintenance of high-quality retrieval systems
through expert-validated knowledge bases that evolve with medical science.
Transformative ROI for Healthcare AI Initiatives
Centaur Labs' approach delivers substantial return on investment (ROI) for organizations deploying AI in
healthcare:

● Accelerated Time-to-Market: Enables companies to bring innovative AI solutions to market faster, responding swiftly to emerging healthcare needs.
● Quality Improvements: Higher accuracy and reliability in AI models lead to greater user trust and
adoption, while reducing potential liability and reputational risks.

Clients have reported up to 20x faster annotation speeds, increases in model accuracy from 70% to over
90%, and the avoidance of costly redevelopment cycles through early issue detection.

Comprehensive Support Across the AI Lifecycle


Centaur Labs offers solutions that support the entire AI development process:

● Design: Data curation, cleaning, and quality control to ensure efficient downstream labeling.
● Build: High-quality data labeling for model training, supervised fine-tuning, and expert feedback for
reinforcement learning.
● Test & Monitor: Evaluation of model performance, support for regulatory submissions, and ongoing
model monitoring in production.

By addressing the core challenges of healthcare AI deployment and evaluation - from data quality and
expert integration to continuous refinement and scalability - Centaur Labs' expert-in-the-loop approach
offers a comprehensive solution for organizations looking to harness the full potential of AI in healthcare.
This innovative methodology not only accelerates AI development but also ensures the delivery of reliable,
impactful AI solutions that can truly advance the field of healthcare.
Case Studies: Centaur Labs'
Impact on Healthcare AI
Development
The following case studies illustrate how Centaur Labs' expert-in-the-loop approach has significantly
improved AI development processes and outcomes across various healthcare domains. These examples
demonstrate the tangible benefits of Centaur Labs' solution in terms of speed, accuracy, and scalability.

1. SciBite: Accelerating Vocabulary Curation for Scientific Analytics


Challenge: SciBite identified 5,000 potential new synonyms for two of their most popular vocabularies -
indication and anatomy. Curation was expected to take months of dedicated work from their in-house team.
They needed to accelerate the vocabulary update process.

Solution: Centaur Labs partnered with SciBite to design two separate crowdsourcing workflows to evaluate
candidate synonyms:

● Evaluated 700-1200 synonyms per day
● Generated 7-10 qualified opinions per case
● Classified the relationship between the candidate synonym and the existing reference term as
"Exact", "Broad", "Narrow", or "No Match"
● Achieved high accuracy - 90.3% for disease and 95.1% for anatomy agreement with ground truth
reference cases

Impact:

● Saved 2 months of dedicated scientific curation effort
● Added 1500 new terms - 900 disease terms and 600 anatomy terms
● Established ongoing collaboration for additional vocabulary updates, new vocabulary development,
and NER model evaluation

Quote: "The Centaur Labs crowd offers the scalability and quality we need to rapidly update vocabularies
based on new data sources. This frees our curators to focus on the most complex requirements rather than
tedious tasks." - Mark Streer, Scientific Curator
2. Paige: Enhancing Breast Cancer Detection
Challenge: Paige, a leader in digital pathology solutions, needed to develop ML models to identify key
cellular features within breast cancer tissue. Their in-house annotation system, relying on four staff
pathologists working after hours, was unscalable and produced low throughput.

Solution: Centaur Labs classified 20,000 images of breast tissue, generating annotations at a rate of 4,000
per week, with 10 qualified opinions per image. The annotations achieved 90% agreement with Gold
Standard cases.

Impact:

● Improved model's F1 score from 0.6 to 0.83
● 10 times faster annotation speed
● Saved valuable time for AI team and staff pathologists

Quote: "Working with Centaur Labs is the best way to get thousands of pieces of data annotated in a day,
rather than weeks. With Centaur Labs we have both an army of people doing high quality annotations, and
annotating the data very quickly." - Fausto Milletarì, Sr. AI Scientist at Paige

3. Eight Sleep: Advancing Snore Detection Technology


Challenge: Eight Sleep, developing an innovative snore detection feature for their Pod 4 product, needed a
scalable and skilled multimodal annotation solution to improve their model's performance.

Solution: Centaur Labs set up a multimodal labeling task evaluating three data types: audio representation
of snoring vibration, spectrogram visual of vibrations, and waveform of respiratory patterns. They classified
4,000 snore cases, averaging 1,000 per day, collecting more than 53,000 high-quality reads.

Impact:

● Improved model accuracy from 70% to 93%
● Hit target launch date for Pod 4
● Achieved 99% agreement with ground truth reference cases
● Beat typical AI development timeline - from idea to production in less than a year

Quote: "It was getting the reliable high quality labels - and a lot more of them - that was able to unlock this improvement in model accuracy from 70 to 93%" - John Maidens, Machine Learning Lead at Eight Sleep
4. Consensus: Elevating Scientific Literature Search
Challenge: Consensus, building an AI-driven search engine for scientific literature, needed to evaluate and
improve their model's performance. Their existing annotation system using PhDs via UpWork was
unscalable and inefficient.

Solution: Centaur Labs classified 5,000 search results and 7,500 query-search result pairs, generating
17,500 annotations at a rate of 1,900-6,600 per week, with 10 qualified opinions per annotation.

Impact:

● Improved model's quality score from 0.73 to 0.87
● Achieved 86-93% agreement with Gold Standard cases
● Completed in two weeks what would have taken Consensus' team three months

Quote: "Working with Centaur Labs to annotate data is better in every way than our prior system. The
annotations are more accurate, more affordable and the system is easier for our team to manage" - Eric
Olson, Founder/CEO at Consensus

5. VUNO: Accelerating FDA Clearance for Brain MRI Segmentation AI


Challenge: VUNO, seeking FDA clearance for their DeepBrain model, needed their validation dataset
annotated by US board-certified radiologists, which was difficult to source from their base in Korea.

Solution: Centaur Labs recruited three US board-certified and fellowship-trained neuroradiologists from
leading US medical institutions. They applied masks to 104 brain regions of interest according to VUNO's
protocol, following a double-blind with adjudication approach to ensure quality.

Impact:

● Achieved FDA clearance ahead of schedule
● Hit target go-to-market launch at RSNA 2023
● Completed all data annotations within one month of kickoff

Quote: "Both of the initial labelers as well as the adjudicator were all very prompt. We expected the project
to go longer, but we actually finished very much on time, so we were very surprised." - Tae Min Son, Product
Manager at VUNO

These case studies showcase Centaur Labs' ability to address diverse challenges in healthcare AI
development, from improving model accuracy and accelerating development timelines to enabling
regulatory clearance. They demonstrate the platform's versatility across various data types and medical
specialties, highlighting the significant impact of expert-in-the-loop annotation on AI model performance and
development efficiency.
As the field of AI continues to evolve, Centaur Labs' methodology offers a blueprint for responsible and
effective AI deployment. By balancing speed with quality, and scalability with precision, their approach not
only addresses current challenges but also paves the way for future innovations in healthcare AI. For
organizations looking to harness the full potential of AI in healthcare while maintaining the highest standards
of accuracy and reliability, Centaur Labs provides a compelling partnership opportunity, promising to
accelerate the journey from AI concept to real-world impact.
About us
KDnuggets is an online platform on business analytics, big data,
data mining, and data science. The platform covers analytics and
data mining, including news, software, jobs, meetings, courses,
data, education, and webinars.

Machine Learning Mastery is an online community and store that offers support and training to help developers get started and get good at applied machine learning. We teach machine learning using a highly productive, top-down, and results-focused approach that is counter to the math-heavy academic approach taken by the rest of the industry.
