Hw2 Primer
Data description
Variable names
Load the data
Variation in drug use and age
Now fix up the plot
Now a caption
How much does drug use affect Hopkins score?
What about long-term effects of drug use?
%function hw2_template()
Data description
HIV-Associated Neurocognitive Disorder (HAND) is a recent development in the aging HIV population.
Owing either to toxicities of the therapies used to treat the disease or to HIV itself, patients are beginning to
show deficits in psychomotor function as well as short-term memory loss. This is further exacerbated
by the use of illicit drugs, a common covariate of HIV infection. The Total Modified Hopkins Dementia
Score (TMHDS) is a tool used to assess these deficits in large patient populations. The score ranges from 0
to 12, with a score of 10 or above considered "normal".
The hopkinsdata.xlsx file contains data from a study of a cocaine-abusing HIV-1 cohort. Patients were
asked to return approximately every 6 months for 6 visits. At each study visit they took the TMHDS
test and had blood drawn for subsequent drug testing.
The Excel file has three worksheets, containing the age, drug, and score data. Each row represents a patient
and each column represents a visit.
Variable names
The age variable contains the age at time of visit for each patient (rows are patients and columns are
visits).
The drug variable contains the information on whether the patient tested positive for cocaine (True) or
did not (False) at each visit.
The score variable contains the TMHDS for each patient at each visit.
Load the data
age=bmes375_xlsread('hopkinsdata.xlsx','age');
drug=bmes375_xlsread('hopkinsdata.xlsx','drug');
score=bmes375_xlsread('hopkinsdata.xlsx','score');
Variation in drug use and age
The data has two main independent variables (age & drug use) and one possibly dependent one (TMHDS).
First we want to see how much general variation exists in each one. Histograms are perfect for this.
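A first-pass version of these two histograms might look like the sketch below. It is only an illustration: the matrices are made-up stand-ins for the study data, and the names ending in _demo are hypothetical. It assumes that "drug use" is summarized as the fraction of visits with a positive test and that "age at intake" is the age at the first visit.

```matlab
% Made-up stand-in data (NOT the study data): 5 patients x 6 visits.
drug_demo = logical([0 0 0 0 0 0;
                     1 0 1 0 0 0;
                     1 1 1 0 1 0;
                     1 1 1 1 1 1;
                     0 1 0 0 0 0]);
% Ages advance by half a year per visit (assumed 6-month spacing).
age_demo = repmat([35; 42; 48; 51; 44], 1, 6) + repmat(0:0.5:2.5, 5, 1);

% Assumed summaries: fraction of positive tests and age at first visit.
avguse_demo = mean(drug_demo, 2);
intake_demo = age_demo(:, 1);

figure
subplot(2,1,1)
hist(avguse_demo, (0:6)./6)   % one bin center per possible fraction 0/6 ... 6/6
subplot(2,1,2)
hist(intake_demo)             % default binning for age at intake
```

With six visits there are only seven possible fractions, so the top panel gets explicit bin centers instead of the default bins.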
Now fix up the plot
subplot(2,1,2)
xlabel('Age at intake', 'fontsize', 15, 'fontweight', 'bold')
ylabel('# Patients', 'fontsize', 15, 'fontweight', 'bold')
set(subplot(2,1,2), 'FontSize', 12)
Now a caption
All captions should have the following:
- Title sentence: this should deliver the main point of the figure.
- Explanation of the source of the data behind each plot.
- Explanation of all axes.
Figure Caption: The cohort represents a middle-aged population of moderate drug users. Top: A
histogram showing the number of patients grouped by the fraction of times the patient tested positive for
cocaine. Bottom: A histogram showing the number of patients grouped by their age at intake.
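How much does drug use affect Hopkins score?
fig useatvisit
% Assumed definition (mirrors the loop used for the later visits):
% cocaine test result for each patient at visit 1.
useatvisit = drug(:, 1);
boxplot(score(:, 1), useatvisit, 'labels', {'No Use', 'Use'})
ylabel('TMHDS', 'fontsize', 15, 'fontweight', 'bold')
set(gca, 'FontSize', 12)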
Is it still true at other visits?
Fix the y-axis limits.
Clean up
clf
atvisits = [1 3 6];                      % visits to compare
for i = 1:numel(atvisits)
    col = atvisits(i);
    useatvisit = drug(:, col);           % drug test result at this visit
    subplot(1, numel(atvisits), i)
    boxplot(score(:, col), useatvisit, 'labels', {'No Use', 'Use'})
    ylim([0, 12])                        % TMHDS ranges from 0 to 12
    title(['Visit ', num2str(col)], 'fontsize', 15, 'fontweight', 'bold')
    if i==1
        ylabel('TMHDS', 'fontsize', 15, 'fontweight', 'bold')
    else
        set(gca, 'YTickLabel', {})       % only the left panel keeps y tick labels
    end
end
Figure Caption: Drug use at the time of visit does not significantly alter the TMHDS of a patient. For
visits 1, 3, and 6, boxplots of the TMHDS are shown grouped by whether the patient tested positive at that visit.
The red line indicates the median TMHDS, the box shows the 25th and 75th percentiles, and the whiskers
extend to the most extreme values not flagged as outliers (MATLAB's boxplot default).
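What about long-term effects of drug use?
% Assumed definition, taken from the captions: fraction of visits at which
% each patient tested positive for cocaine.
avguse = mean(drug, 2);
% Slope of a per-patient linear fit of score against age (dTMHDS/year).
dscore = zeros(size(score, 1), 1);
for i = 1:size(score,1)
    trend = polyfit(age(i, :), score(i, :), 1);
    dscore(i) = trend(1);
end
fig longterm
scatter(avguse, dscore);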
Ugh, that's ugly. Let's fill in the markers.
But I know things are "piling up". Let's use transparency to see what's in the back.
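One way to get filled, semi-transparent markers is sketched below. The values are made up for illustration (the _demo names are hypothetical), and the MarkerFaceAlpha property is assumed to be available, which requires MATLAB R2015b or newer. Older releases need a different approach.

```matlab
% Made-up stand-in values (NOT the study data); repeated points overlap.
avguse_demo = [0 0 0 1/6 1/6 2/6 3/6 3/6 4/6 1 1];
dscore_demo = [0.1 0.1 0.0 -0.2 -0.2 0.3 -0.5 -0.5 -0.1 -0.8 -0.8];

figure
h = scatter(avguse_demo, dscore_demo, 60, 'b', 'filled');
% Assumes R2015b or newer: overlapping markers then render darker.
set(h, 'MarkerFaceAlpha', 0.3)
xlabel('Fraction of visits testing positive')
ylabel('dTMHDS/year')
```

Where several patients share the same coordinates, the stacked markers appear darker, which makes the pile-up visible.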
Maybe I should just make boxplots at each unique value.
boxplot(dscore, avguse)
Ohhhhh, boxplot doesn't put things at the "value" but at integer positions, and then uses text labels. Let's
see if we can change that.
clf
boxplot(dscore, avguse, 'positions', (0:6)./6)
set(gca, 'XTick', 0:0.2:1, 'XTickLabel', 0:0.2:1)
% Put a trend-line in
x=xlim;
trend = polyfit(avguse, dscore, 1);
hold on
plot(x, trend(1)*(x) + trend(2), 'm--', 'LineWidth', 2)
plot(x, [0 0], 'k--')
hold off
Figure Caption: Increased drug use is correlated with a more rapid decline in Hopkins score. The rate of
change of Hopkins score (dTMHDS/year) was estimated using a linear regression for each patient. These
rates were grouped by the fraction of visits at which the patient tested positive for cocaine. The red line indicates
the median rate of change, the box shows the 25th and 75th percentiles, and the whiskers extend to the most
extreme values not flagged as outliers (MATLAB's boxplot default). The trend line is shown in magenta, and
the black dashed line marks zero change.
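The per-patient rate of change described in the caption can be checked on a toy example. The sketch below uses made-up numbers for a single hypothetical patient (the _demo names are not part of the homework data) whose score drops by exactly one point per year.

```matlab
% Made-up example (NOT study data): one patient seen every 6 months.
age_demo   = [40.0 40.5 41.0 41.5 42.0 42.5];
score_demo = [12 11.5 11 10.5 10 9.5];   % loses 1 point per year

% Degree-1 polyfit returns [slope intercept]; the slope is dTMHDS/year.
trend_demo = polyfit(age_demo, score_demo, 1);
slope_demo = trend_demo(1);              % approximately -1
```

A negative slope means the score is declining with age, so points below the zero line in the figure are patients whose TMHDS fell over the study.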