
Research papers on the prediction of Current Procedural Terminology (CPT) codes in medical data

Neural Machine Translation–Based Automated Current Procedural Terminology Classification System Using Procedure Text: Development and Validation Study

Introduction
• This paper develops an automated anesthesiology Current Procedural Terminology (CPT) prediction system that translates manually entered surgical procedure text into standardized forms using neural machine translation (NMT) techniques. Similarity scores between the translated text and the standardized forms are then used to predict the most appropriate CPT codes.
• The model's performance is compared with that of previously developed machine learning algorithms for CPT prediction.

Dataset
• The researchers collected and analyzed all operative procedures performed at Michigan Medicine between January 2017 and June 2019 (2.5 years).
• The first 2 years of data were used to train and validate the models and to compare the NMT-based model against the previously developed baselines. Data from 2019 (a 6-month follow-up period) were then used to measure the accuracy of CPT code prediction.
• Three experimental settings were designed with different data types to evaluate the models.

NMT Model Architecture


• The authors developed an NMT-based automated CPT coding system that first translates surgical procedure texts in electronic health records (EHRs) into preferred terms from the Unified Medical Language System (UMLS) and then normalizes the translated preferred terms to predict CPT codes.
• Within Michigan Medicine, each surgical procedure record contains a surgical procedure text and a preoperative diagnosis entered by a surgeon or surgical resident. After completion of the procedure, surgical and anesthesiology CPT codes were assigned by clinical staff and/or professional medical coders.
• The manually entered texts are the input source, and the preferred terms of the assigned CPT codes are the output target sentences of the NMT model.
• In this study, surgical procedure texts and preoperative diagnoses were the inputs of the model to predict CPT codes.
• Once trained, the NMT model generates multiple candidate translation outputs ranked by a beam search algorithm.
• The top three target sentences were retained and processed through step 2: transformation. With these three target sentences, the best CPT code was computed in the transformation step using the Levenshtein and Jaccard distances (see the sketch below the figure).
(Figure: the architecture of the NMT-based automated CPT prediction system.)
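As a rough illustration of this transformation step, the sketch below scores candidate NMT translations against CPT preferred terms using a normalized Levenshtein distance plus a token-level Jaccard distance. The cpt_preferred_terms mapping, the normalization, and the way the two distances are combined are assumptions for illustration, not the authors' exact formulation.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of the two token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(ta & tb) / len(ta | tb) if ta | tb else 0.0


# Hypothetical mapping from CPT code to its preferred term (illustrative only).
cpt_preferred_terms = {
    "00790": "anesthesia for intraperitoneal procedure upper abdomen",
    "00840": "anesthesia for intraperitoneal procedure lower abdomen",
}


def best_cpt(candidate_translations):
    """Pick the CPT code whose preferred term is closest to any of the top-k
    NMT outputs, combining normalized Levenshtein and Jaccard distances."""
    def score(translation, term):
        lev = levenshtein(translation, term) / max(len(translation), len(term), 1)
        return lev + jaccard_distance(translation, term)
    return min(cpt_preferred_terms,
               key=lambda code: min(score(t, cpt_preferred_terms[code])
                                    for t in candidate_translations))


print(best_cpt(["anesthesia for lower abdomen intraperitoneal procedure"]))  # -> "00840"
```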

• The researchers also selected SVM and LSTM models as baselines. For SVM model development, they applied grid search cross-validation for training and hyperparameter tuning. The input features of the SVM model were bigrams extracted from the training data and weighted using term frequency-inverse document frequency (TF-IDF); a sketch of such a baseline follows this list.
• For the LSTM model development, a sequence of words from the procedure text and preoperative diagnosis text in the training data was fed into the
embedding layer. The embedding layer then converted each word in the sequence to a vector representation using a Word2Vec model pretrained on
PubMed, PubMed Central, and Wikipedia. The LSTM model was trained on this sequence of vector representations and returned a hidden vector from
each state that was passed through a fully connected layer. A final softmax layer was then used to predict the final label.
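A minimal sketch of a bigram TF-IDF SVM baseline of this kind, using scikit-learn. The toy texts, CPT labels, and hyperparameter grid are illustrative assumptions rather than the authors' configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for procedure text + preoperative diagnosis, labeled with CPT codes.
texts = [
    "laparoscopic cholecystectomy ; preop dx : cholelithiasis",
    "lap cholecystectomy with cholangiogram ; preop dx : biliary colic",
    "total knee arthroplasty right ; preop dx : osteoarthritis",
    "revision total knee arthroplasty left ; preop dx : osteoarthritis",
]
labels = ["00790", "00790", "01402", "01402"]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(2, 2))),  # bigram features weighted by tf-idf
    ("svm", LinearSVC()),
])

# Hypothetical hyperparameter grid; the paper reports grid search cross-validation
# but not the exact grid, so these values are placeholders.
grid = GridSearchCV(pipeline, param_grid={"svm__C": [0.1, 1.0, 10.0]}, cv=2)
grid.fit(texts, labels)
print(grid.best_params_)
print(grid.predict(["laparoscopic cholecystectomy ; preop dx : gallstones"]))
```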
Results
• The reported results indicate that the top-1 and top-3 accuracies of the NMT-based model were equivalent to those of the SVM and LSTM models when using procedure texts.
• The study also demonstrated that the use of additional information, such as preoperative diagnosis, improves SVM, LSTM, and NMT model
performance.
Classification of Current Procedural Terminology Codes from Electronic Health Record Data Using Machine Learning

Introduction
• This paper uses data science techniques applied to perioperative electronic health record data across multiple centers. Anesthesia CPT code classification
models were developed via multiple machine learning methods and evaluated.
• The study hypothesized that machine learning and NLP could be used to develop an automated system capable of classifying anesthesia CPT codes with
accuracy exceeding current benchmarks.
• This classification modeling could prove beneficial in efforts to optimize performance and reduce costs for research, quality improvement, and
reimbursement tasks reliant on such codes.

Dataset Used
• This study included all patients, adult and pediatric, undergoing elective or emergent procedures with an institution-assigned valid anesthesia CPT code and an operative date between January 1st, 2014 and December 31st, 2016 from 16 contributing centers in the Multicenter Perioperative Outcomes Group database.
• This data set includes both academic hospitals and community-based practices across the United States.
• A second, distinct data set was created using cases of patients undergoing elective or urgent procedures with a valid institution-assigned CPT code between October 1st, 2015 and November 1st, 2016 from a single Multicenter Perioperative Outcomes Group institution not included in the Train/Test data set.
• This “Holdout” data set was used for external validation of the models created in this study. The figure shows a flow diagram of the data sets used and the experimental design of the study.
Features
• To maximize the number of cases included in the study, the features used in each model were limited to perioperative electronic health record data
commonly found in anesthesia records:
• age
• gender
• American Society of Anesthesiologists (ASA) physical status
• emergent status
• procedure text
• procedure duration
• derived procedure text length (number of words in procedure text)
• Institution-assigned anesthesia CPT codes were used as labels for each case, and each case represents an instance for machine learning modeling.
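As a rough sketch of how such a feature table might be assembled, with the derived procedure text length computed from the procedure text. The column names, values, and CPT codes are hypothetical; the actual MPOG schema is not given in this summary.

```python
import pandas as pd

# Hypothetical column names and values; the actual MPOG schema is not given here.
cases = pd.DataFrame({
    "age": [54, 7],
    "gender": ["F", "M"],
    "asa_physical_status": [2, 1],
    "emergent": [False, True],
    "procedure_text": ["laparoscopic cholecystectomy",
                       "tonsillectomy and adenoidectomy"],
    "procedure_duration_min": [95, 40],
    "anesthesia_cpt": ["00790", "00170"],  # institution-assigned label for each case
})

# Derived feature: number of words in the procedure text.
cases["procedure_text_length"] = cases["procedure_text"].str.split().str.len()

X = cases.drop(columns=["anesthesia_cpt"])  # features, one instance per case
y = cases["anesthesia_cpt"]                 # CPT code labels
print(X)
```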

Supervised Machine Learning Methods


• Five unique supervised machine learning classification models were compared: Random Forest, Long Short-term Memory, Extreme Gradient Boosting,
Support Vector Machine, and Label-embedding Attentive Model.
• After initial hyper-parameter tuning, all models were trained and tested 20 times using 5-fold cross validation: 80% of data for training and the remaining
20% for testing.
• The deep learning methods in this study were the label-embedding attentive model and long short-term memory. Procedure texts for these models were
encoded into vectors using word2vec embedding as input.
• The label-embedding attentive model also encoded the descriptions of each anesthesia CPT code from the CPT Professional Edition medical code set maintained by the American Medical Association; in contrast, most deep learning models for text classification embed only the input (feature) text.
• A “compatibility matrix” was computed between embedded words and labels via cosine similarity.
• From this matrix, an attention score was calculated for each word, and a representation of the entire procedure text sequence was derived as the average of the embedded words weighted by the attention scores. This representation was used for CPT classification.
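A minimal NumPy sketch of this label-embedding attention mechanism. The embedding dimensions, the max-over-labels pooling, and the softmax normalization are assumptions for illustration, not the exact formulation used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, seq_len, n_labels = 50, 6, 4           # embedding size, words in the text, CPT labels

words = rng.normal(size=(seq_len, d))     # word2vec embeddings of the procedure text
labels = rng.normal(size=(n_labels, d))   # embeddings of the CPT code descriptions


def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


# Compatibility matrix: cosine similarity between every word and every label.
compat = l2_normalize(words) @ l2_normalize(labels).T    # shape (seq_len, n_labels)

# One attention score per word (here: softmax over each word's best label match).
scores = compat.max(axis=1)
attn = np.exp(scores) / np.exp(scores).sum()

# Sequence representation: attention-weighted average of the word embeddings,
# scored against each label embedding for classification.
doc = attn @ words                                       # shape (d,)
logits = l2_normalize(doc) @ l2_normalize(labels).T      # one score per CPT label
print("predicted label index:", int(logits.argmax()))
```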
Results on Train/Test Dataset
• The highest overall accuracy was found with the support vector machine model (87.9%, CI 87.6–88.2%) (Table 2). Extreme gradient boosting (87.9%, CI 87.5–88.3%), long short-term memory (86.4%, CI 83.5–89.3%), and the label-embedding attentive model (84.2%, CI 84.1–84.3%) were all more accurate than the random forest model (82.0%, CI 68.1–95.9%).
• Using CPT categories to identify cases for which the random forest model showed differential performance, accuracy ranged from a low of 70.7% for radiology procedures to a high of 92.0% for shoulder procedures. A positive relationship was observed between the number of cases comprising a specific CPT code and model accuracy for that code, with a Pearson correlation of 0.72.
• Overall accuracy within the top three predictions was 96.8% for the support vector machine model and 94.0% for the label-embedding attentive model.
• In external validation, the best-performing model (the label-embedding attentive model) achieved an overall accuracy of 82.1% on the Holdout data set.
Comparison of machine-learning algorithms for the prediction of Current Procedural Terminology (CPT) codes from pathology reports

Introduction
• The primary objective of this study is to compare the capacity of state-of-the-art machine learning models to delineate primary CPT procedural codes (CPT 88302, 88304, 88305, 88307, 88309), which correspond to case complexity, over a large corpus of 93,039 pathology reports from the Dartmouth-Hitchcock Department of Pathology and Laboratory Medicine (DPLM).
• They compared XGBoost, SVM, and BERT methodologies for the prediction of primary CPT codes as well as 38 ancillary CPT codes, using both the
diagnostic text alone and text from all subfields.

Data Acquisition
• The researchers obtained Institutional Review Board approval and accessed 96,418 pathology reports from DPLM, collected between June 2015 and June 2020.
• They removed a total of 3,379 reports that did not contain any diagnostic text associated with CPT codes, retaining 93,039 reports.
• Each report was appended with metadata, including corresponding EPIC (EPIC systems, Verona, WI), Charge Description Master (CDM), and CPT
procedural codes, the sign-out pathologist, the amount of time to sign out the document, and other details.
• The documents were deidentified by stripping all PHI-containing fields and numerals from the text and replacing them with placeholder characters.
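A hedged sketch of the numeral-replacement part of that step. The placeholder character and the exact rules are assumptions, and the stripping of whole PHI-containing fields is omitted here.

```python
import re

def deidentify(text, placeholder="#"):
    """Replace every digit with a placeholder character so numeric PHI
    (MRNs, dates, accession numbers) cannot survive in the text."""
    return re.sub(r"\d", placeholder, text)

print(deidentify("Accession S20-12345, collected 06/14/2020."))
# -> "Accession S##-#####, collected ##/##/####."
```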

Machine Learning Models


The study implemented the following three machine-learning algorithms as a basis for the text classification pipeline.

SVM.
• An SVM model was trained to make predictions using the UMAP embeddings formed from the tf-idf matrix. The SVM operates by learning a hyperplane that maximizes the distance (margin) to the nearest data points of each class.
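A minimal sketch of that pipeline (tf-idf matrix reduced with UMAP, then an SVM on the embeddings), assuming the umap-learn package is available. The toy reports, CPT codes, and parameter choices are illustrative, not the paper's configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
import umap  # provided by the umap-learn package

# Toy stand-ins for pathology report text and their primary CPT codes.
templates = {
    "88302": "gross examination only specimen received labeled and submitted",
    "88305": "skin punch biopsy showing basal cell carcinoma margins uninvolved",
    "88307": "colon segmental resection adenocarcinoma invading muscularis propria",
}
reports, codes = [], []
for code, text in templates.items():
    for i in range(20):                   # repeat so UMAP has enough points to embed
        reports.append(f"{text} case {i}")
        codes.append(code)

tfidf = TfidfVectorizer().fit_transform(reports)   # word-by-report tf-idf matrix

# Reduce the sparse tf-idf matrix to a low-dimensional UMAP embedding.
embeddings = umap.UMAP(n_components=5, random_state=0).fit_transform(tfidf)

# Max-margin classifier trained on the UMAP embeddings.
svm = SVC(kernel="rbf").fit(embeddings, codes)
print(svm.predict(embeddings[:3]))
```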
Bag of words with XGBoost
• XGBoost algorithms operate on the entire word-by-report count matrix and ensemble predictions across individual Classification and Regression Tree (CART) models.
• Individual CART models devise splitting rules that partition instances of the pathology notes based on whether the count of a particular word or phrase in a
pathology note exceeds an algorithmically derived threshold.
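A hedged sketch of this bag-of-words XGBoost setup, with a word-by-report count matrix fed to gradient-boosted CART trees. The toy reports, integer-encoded labels, and parameters are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from xgboost import XGBClassifier

# Toy reports; labels are integer-encoded CPT classes (e.g. 0 = 88305, 1 = 88307).
reports = [
    "skin punch biopsy basal cell carcinoma",
    "skin shave biopsy squamous cell carcinoma in situ",
    "colon resection invasive adenocarcinoma lymph nodes negative",
    "colon segmental resection adenocarcinoma muscularis propria",
]
labels = [0, 0, 1, 1]

counts = CountVectorizer().fit_transform(reports)  # word-by-report count matrix

# Each boosted CART tree splits on whether a word count exceeds a learned threshold.
model = XGBClassifier(n_estimators=50, max_depth=3)
model.fit(counts, labels)
print(model.predict(counts))
```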

BERT
• The BERT model was trained using the Hugging Face Transformers package.
• The researchers used a collection of models that have already been pretrained on a large medical corpus in order to both improve the predictive accuracy of
their model and significantly reduce the computational load.
• Most BERT models limit the input length to 512 tokens. To address this, the researchers split pathology reports into document subsections when training the BERT models.
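A rough sketch of how such a BERT classifier might be set up with Hugging Face Transformers. The checkpoint name and five-label head are assumptions, and the paper's subsection-splitting strategy is reduced here to simple truncation at 512 tokens.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# A publicly available biomedical BERT checkpoint is assumed here; the paper uses
# models pretrained on a large medical corpus, with five primary CPT codes as labels.
checkpoint = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=5)

report = "Diagnosis: skin, punch biopsy - basal cell carcinoma, nodular type."

# BERT-style models cap the input length (512 tokens), so long reports must be
# truncated here or, as in the paper, split into document subsections.
inputs = tokenizer(report, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print("predicted primary CPT class index:", int(logits.argmax(dim=-1)))
```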
Results
• The study indicates that the XGBoost and BERT methodologies produce highly accurate predictions of both primary and ancillary CPT codes, which has the potential to reduce operating costs by suggesting codes prior to manual inspection and by flagging potential manual coding errors for review.
• Further, both the BERT and XGBoost models preserved the ordering of code/case complexity, with most misclassifications occurring between codes of similar complexity.
