CPT Coding
Introduction
• This paper develops an automated anesthesiology current procedural terminology (CPT) prediction system that translates manually entered surgical procedure text into standard forms using neural machine translation (NMT) techniques. Similarity scores between the translated output and the standard forms are then used to predict the most appropriate CPT codes.
• The model's performance is compared with that of previously developed machine learning algorithms for CPT prediction.
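The second stage of the pipeline, ranking CPT codes by the similarity of their standard forms to the NMT output, can be sketched in plain Python. The standard forms, CPT codes, and the cosine-similarity metric below are illustrative assumptions; the paper's actual standard-form list and similarity function are not reproduced here.

```python
from collections import Counter
from math import sqrt

# Hypothetical standard forms mapped to anesthesia CPT codes (illustrative only).
STANDARD_FORMS = {
    "00142": "anesthesia for lens surgery",
    "00160": "anesthesia for nose and sinus procedure",
    "01402": "anesthesia for total knee arthroplasty",
}

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words token counts of two strings."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def predict_cpt(translated_text: str, top_k: int = 3) -> list[str]:
    """Rank candidate CPT codes by similarity of their standard form
    to the NMT-translated procedure text."""
    ranked = sorted(
        STANDARD_FORMS,
        key=lambda code: cosine_similarity(translated_text, STANDARD_FORMS[code]),
        reverse=True,
    )
    return ranked[:top_k]
```

The top-1 and top-3 accuracies reported later correspond to whether the correct code appears first, or anywhere, in such a ranked list.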
Dataset
• The researchers collected and analyzed all operative procedures performed at Michigan Medicine between January 2017 and June 2019 (2.5 years).
• The first 2 years of data were used to train and validate the existing models and to compare the results from the NMT-based model. Data from 2019 (a 6-month follow-up period) were then used to measure the accuracy of the CPT code prediction.
• Three experimental settings were designed with different data types to evaluate the models.
• The researchers also selected the SVM and LSTM models as the baseline models. For SVM model development, they applied grid search cross-validation
for training and tuning hyperparameters. The input features of the SVM model were bigrams extracted from the training data and weighted using the
term frequency-inverse document frequency.
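The SVM baseline can be sketched with scikit-learn. The toy texts, labels, and parameter grid below are placeholders; the paper's actual training data and hyperparameter grid are not given here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy procedure texts and CPT labels (illustrative only).
texts = ["total knee arthroplasty", "knee arthroplasty revision",
         "cataract extraction left eye", "cataract extraction right eye"]
labels = ["01402", "01402", "00142", "00142"]

pipeline = Pipeline([
    # Bigram features weighted by term frequency-inverse document frequency.
    ("tfidf", TfidfVectorizer(ngram_range=(2, 2))),
    ("svm", LinearSVC()),
])

# Grid-search cross-validation over the SVM regularization strength.
search = GridSearchCV(pipeline, {"svm__C": [0.1, 1.0, 10.0]}, cv=2)
search.fit(texts, labels)
pred = search.predict(["right knee arthroplasty"])
```

A query sharing the bigram "knee arthroplasty" with the first class is routed to that class, which is the essence of the bigram TF-IDF representation.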
• For the LSTM model development, a sequence of words from the procedure text and preoperative diagnosis text in the training data was fed into the
embedding layer. The embedding layer then converted each word in the sequence to a vector representation using a Word2Vec model pretrained on
PubMed, PubMed Central, and Wikipedia. The LSTM model was trained on this sequence of vector representations and returned a hidden vector from
each state that was passed through a fully connected layer. A final softmax layer was then used to predict the final label.
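The embedding → LSTM → fully connected → softmax data flow described above can be traced with a minimal NumPy forward pass. The vocabulary, dimensions, and randomly initialized weights are illustrative stand-ins; the paper uses pretrained Word2Vec embeddings and a trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and embedding table standing in for pretrained Word2Vec vectors.
vocab = {"laparoscopic": 0, "appendectomy": 1, "hernia": 2, "repair": 3}
emb_dim, hidden_dim, n_labels = 8, 16, 5
embeddings = rng.normal(size=(len(vocab), emb_dim))

# LSTM gate weights (input, forget, cell, output) over the concatenation [x; h].
W = rng.normal(scale=0.1, size=(4, hidden_dim, emb_dim + hidden_dim))
b = np.zeros((4, hidden_dim))

# Fully connected layer mapping the final hidden state to label logits.
W_fc = rng.normal(scale=0.1, size=(n_labels, hidden_dim))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(tokens):
    """Embed each token, run the LSTM over the sequence, and apply a
    softmax over CPT labels to the final hidden state."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for tok in tokens:
        x = np.concatenate([embeddings[vocab[tok]], h])
        i = sigmoid(W[0] @ x + b[0])   # input gate
        f = sigmoid(W[1] @ x + b[1])   # forget gate
        g = np.tanh(W[2] @ x + b[2])   # candidate cell state
        o = sigmoid(W[3] @ x + b[3])   # output gate
        c = f * c + i * g              # update cell state
        h = o * np.tanh(c)             # hidden vector from this state
    logits = W_fc @ h                  # fully connected layer
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()         # softmax over labels

probs = lstm_forward(["laparoscopic", "appendectomy"])
```

The predicted label is simply the argmax of `probs`.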
Results
• The results in the figure indicate that the top-1 and top-3 accuracies of the NMT-based model were equivalent to those of the SVM and LSTM
models using procedure texts.
• The study also demonstrated that the use of additional information, such as preoperative diagnosis, improves SVM, LSTM, and NMT model
performance.
Classification of Current Procedural Terminology Codes from Electronic Health Record Data Using Machine Learning
Introduction
• This paper uses data science techniques applied to perioperative electronic health record data across multiple centers. Anesthesia CPT code classification
models were developed via multiple machine learning methods and evaluated.
• The study hypothesized that machine learning and NLP could be used to develop an automated system capable of classifying anesthesia CPT codes with
accuracy exceeding current benchmarks.
• This classification modeling could prove beneficial in efforts to optimize performance and reduce costs for research, quality improvement, and
reimbursement tasks reliant on such codes.
Dataset Used
• This study included all patients, adults and pediatrics, undergoing elective or emergent procedures with an institution-assigned valid anesthesia CPT code and an operative date between January 1st, 2014 and December 31st, 2016 from 16 contributing centers in the Multicenter Perioperative Outcomes Group database.
• This data set includes both academic hospitals and community-based practices across the United States.
• A second and distinct data set was created using cases on patients undergoing elective or urgent procedures with a valid institution-assigned CPT code between October 1st, 2015 and November 1st, 2016 from a single Multicenter Perioperative Outcomes Group institution not included in the Train/Test data set.
• This “Holdout” data set was used for external validation of the models created in this study. The figure shows a flow diagram of the data sets used and the experimental design of this study.
Features
• To maximize the number of cases included in the study, the features used in each model were limited to perioperative electronic health record data
commonly found in anesthesia records:
• age
• gender
• American Society of Anesthesiologists (ASA) physical status
• emergent status
• procedure text
• procedure duration
• derived procedure text length (number of words in procedure text)
• Institution-assigned anesthesia CPT codes were used as labels for each case, and each case represents an instance for machine learning modeling.
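Assembling these mixed numeric, categorical, and text-derived features for a single case can be sketched as below. The field names and encodings are illustrative assumptions, not the study's actual preprocessing.

```python
def case_features(case: dict) -> dict:
    """Build model inputs from one anesthesia record, including the
    derived procedure-text length feature (number of words)."""
    return {
        "age": case["age"],
        "gender_female": 1 if case["gender"] == "F" else 0,
        "asa_status": case["asa"],                    # ASA physical status 1-6
        "emergent": 1 if case["emergent"] else 0,
        "procedure_duration_min": case["duration_min"],
        "procedure_text": case["procedure_text"],     # fed to the text model
        "procedure_text_length": len(case["procedure_text"].split()),
    }

example = case_features({
    "age": 57, "gender": "F", "asa": 3, "emergent": False,
    "duration_min": 95,
    "procedure_text": "laparoscopic cholecystectomy with cholangiogram",
})
```

The institution-assigned CPT code would be attached separately as the label for each such instance.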
Introduction
• The primary objective of this study is to compare the ability of state-of-the-art machine learning models to delineate primary CPT procedural codes (CPT 88302, 88304, 88305, 88307, 88309), which correspond to case complexity, over a corpus of 93,039 pathology reports from the Dartmouth-Hitchcock Department of Pathology and Laboratory Medicine (DPLM).
• They compared XGBoost, SVM, and BERT methodologies for the prediction of primary CPT codes as well as 38 ancillary CPT codes, using both the
diagnostic text alone and text from all subfields.
Data Acquisition
• The researchers obtained Institutional Review Board approval and accessed 96,418 pathology reports from DPLM, collected between June 2015 and June 2020.
• They removed a total of 3,379 reports that did not contain any diagnostic text associated with CPT codes, retaining 93,039 reports.
• Each report was appended with metadata, including corresponding EPIC (EPIC systems, Verona, WI), Charge Description Master (CDM), and CPT
procedural codes, the sign-out pathologist, the amount of time to sign out the document, and other details.
• The documents were deidentified by stripping all PHI-containing fields and numerals from the text and replacing them with placeholder characters.
SVM
• An SVM model was trained to make predictions using the UMAP embeddings formed from the tf-idf matrix. The SVM operates by learning a hyperplane that maximizes the distance (margin) to the nearest data points of each class.
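This stage can be sketched as a scikit-learn pipeline. The paper embeds the tf-idf matrix with UMAP (via the umap-learn package); to keep the sketch self-contained, TruncatedSVD is substituted here as a stand-in dimensionality reducer, and the report snippets and codes are illustrative.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy pathology report snippets labeled with primary CPT codes (illustrative).
reports = ["benign squamous mucosa", "benign gastric mucosa",
           "invasive ductal carcinoma", "invasive lobular carcinoma"] * 3
codes = ["88305", "88305", "88309", "88309"] * 3

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),                  # tf-idf matrix
    ("embed", TruncatedSVD(n_components=2,         # UMAP in the paper;
                           random_state=0)),       # SVD used here as a stand-in
    ("svm", SVC(kernel="linear")),                 # maximal-margin hyperplane
])
pipeline.fit(reports, codes)
```

The SVM then classifies a new report by which side of the learned hyperplane its low-dimensional embedding falls on.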
Bag of words with XGBoost
• XGBoost algorithms operate on the entire word-by-report count matrix and ensemble predictions across individual Classification and Regression Tree (CART) models, adding trees sequentially via gradient boosting so that each new tree corrects the residual errors of the current ensemble.
• Individual CART models devise splitting rules that partition instances of the pathology notes based on whether the count of a particular word or phrase in a
pathology note exceeds an algorithmically derived threshold.
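Each CART splitting rule is, at bottom, a word-count threshold test. A minimal, illustrative decision stump over pathology-note word counts (the word and threshold are hypothetical, not from the paper):

```python
import re
from collections import Counter

def word_count_split(note: str, word: str, threshold: int) -> str:
    """One CART-style splitting rule: route a pathology note to the
    right branch if the count of `word` exceeds `threshold`, else left."""
    counts = Counter(re.findall(r"[a-z']+", note.lower()))
    return "right" if counts[word] > threshold else "left"
```

A full CART model stacks many such rules into a tree, and the boosted ensemble combines many trees.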
BERT
• The BERT model was trained using the HuggingFace Transformers package.
• The researchers used a collection of models that have already been pretrained on a large medical corpus in order to both improve the predictive accuracy of
their model and significantly reduce the computational load.
• Most BERT models limit the input length to 512 tokens. To address this, the researchers split pathology reports into document subsections when training the BERT models.
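The subsection split can be sketched as fixed-size token windows. The whitespace tokenization and the overlap size below are assumptions for illustration; the paper's exact chunking scheme (and BERT's own subword tokenizer) would differ in detail.

```python
def split_into_subsections(text: str, max_len: int = 512, overlap: int = 32):
    """Split a report into token windows no longer than `max_len`,
    overlapping by `overlap` tokens so findings are not cut mid-context."""
    tokens = text.split()
    if len(tokens) <= max_len:
        return [" ".join(tokens)]
    step = max_len - overlap
    return [" ".join(tokens[i:i + max_len]) for i in range(0, len(tokens), step)]
```

Each subsection is then encoded separately, keeping every window within the model's 512-token limit.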
Results
• The study indicates that the XGBoost and BERT methodologies produce highly accurate predictions of both primary and ancillary CPT codes, which has the potential to reduce operating costs by suggesting codes prior to manual inspection and by flagging potential manual coding errors for review.
• Further, both the BERT and XGBoost models preserved the ordering of the code/case complexity, where most of the misclassifications were made
between codes of a similar complexity.