Methodology Series Module 1 Cohort Studies


[Downloaded free from http://www.e-ijd.org on Tuesday, November 28, 2017, IP: 150.107.215.107]

IJD® MODULE ON BIOSTATISTICS AND RESEARCH METHODOLOGY FOR THE DERMATOLOGIST

Methodology Series Module 1: Cohort Studies


Maninder Singh Setia

From the Department of Epidemiology, MGM Institute of Health Sciences, Navi Mumbai, Maharashtra, India

Address for correspondence: Dr. Maninder Singh Setia, MGM Institute of Health Sciences, Navi Mumbai, Maharashtra, India. E‑mail: maninder.setia@karanamconsultancy.in

Abstract

Cohort design is a type of nonexperimental or observational study design. In a cohort study, the participants do not have the outcome of interest to begin with. They are selected based on the exposure status of the individual. They are then followed over time to evaluate for the occurrence of the outcome of interest. Some examples of cohort studies are (1) Framingham Cohort study, (2) Swiss HIV Cohort study, and (3) The Danish Cohort study of psoriasis and depression. These studies may be prospective, retrospective, or a combination of both of these types. Since at the time of entry into the cohort study, the individuals do not have outcome, the temporality between exposure and outcome is well defined in a cohort design. If the exposure is rare, then a cohort design is an efficient method to study the relation between exposure and outcomes. A retrospective cohort study can be completed fast and is relatively inexpensive compared with a prospective cohort study. Follow‑up of the study participants is very important in a cohort study, and losses are an important source of bias in these types of studies. These studies are used to estimate the cumulative incidence and incidence rate. One of the main strengths of a cohort study is the longitudinal nature of the data. Some of the variables in the data will be time‑varying and some may be time independent. Thus, advanced modeling techniques (such as fixed and random effects models) are useful in analysis of these studies.

Key Words: Cohort studies, limitations, strengths

This is an open access article distributed under the terms of the Creative Commons Attribution‑NonCommercial‑ShareAlike 3.0 License, which allows others to remix, tweak, and build upon the work non‑commercially, as long as the author is credited and the new creations are licensed under the identical terms.

For reprints contact: [email protected]

How to cite this article: Setia MS. Methodology series module 1: Cohort studies. Indian J Dermatol 2016;61:21-5.
Received: December, 2015. Accepted: December, 2015.

Access this article online
Website: www.e‑ijd.org
DOI: 10.4103/0019-5154.174011

© 2016 Indian Journal of Dermatology | Published by Wolters Kluwer - Medknow

Introduction

Cohort studies are important in research design. The term “cohort” is derived from the Latin word “Cohors” – “a group of soldiers.” It is a type of nonexperimental or observational study design. The term “cohort” refers to a group of people who have been included in a study by an event that is based on the definition decided by the researcher. For example, a cohort of people born in Mumbai in the year 1980. This will be called a “birth cohort.” Another example of the cohort will be people who smoke. Some other terms which may be used for these studies are “prospective studies” or “longitudinal studies.”

Design

In a cohort study, the participants do not have the outcome of interest to begin with. They are selected based on the exposure status of the individual. Thus, some of the participants may have the exposure and others do not have the exposure at the time of initiation of the study. They are then followed over time to evaluate for the occurrence of the outcome of interest. As seen in Figure 1, at baseline, some of the study participants have exposure (defined as exposed) and others do not have the exposure (defined as unexposed). Over the period of follow‑up, some of the exposed individuals will develop the outcome and some unexposed individuals will develop the outcome of interest. We will compare the outcomes in these two groups.

Examples of Cohort Studies

Framingham cohort study (https://www.framinghamheartstudy.org/index.php)
This cohort study was initiated in 1948 in Framingham. Framingham, at the time of initiation of the cohort, was an industrial town 21 miles west of Boston with a population of 28,000. This Framingham Heart Study recruited 5209 men and women (30–62‑year‑old) in the study to assess
the factors associated with cardiovascular disease (CVD). The researchers also recruited second generation participants (children of original participants) in 1971 and the third generation participants in 2002. This has been one of the landmark cohort studies and has contributed immensely to our knowledge of some of the important risk factors for CVD. The investigators have published 3064 publications using the Framingham Heart Study data.

Figure 1: Example of a cohort study

Swiss HIV cohort study (http://www.shcs.ch/)
This cohort study was initiated in 1988. It was a longitudinal study of HIV‑infected individuals to conduct research on HIV pathogenesis, treatment, immunology, and coinfections. They also work on the social aspects of the disease and management of HIV‑infected pregnant women. The study started with a recruitment of individuals ≥16 years. The cohort was gradually expanded to include the Swiss Mother and Child HIV Cohort Study. The cohort has provided useful information on various aspects of HIV and published 542 manuscripts on these aspects.

The Danish cohort study of psoriasis and depression (Jensen, 2015)
This is another large cohort study that evaluated the association between psoriasis and onset of depression. The participants in the cohort were enrolled from national registries in Denmark. None of the included participants had psoriasis or depression at baseline. The outcome of interest was the initiation of antidepressants or hospitalization for depression. The authors compared the incidence rates of hospitalization for depression in psoriasis and reference population. The psoriasis group was further classified as mild and moderate psoriasis. The authors found that psoriasis was an independent risk factor for new‑onset depression in young people. However, in the elderly, it was mediated through comorbid conditions.

We have presented examples of some large cohort studies. It will be worthwhile to read the design and conduct of these studies, and it will help the readers understand the practical aspects of conducting and analyzing cohort studies.

Types of Cohort Studies

Prospective cohort study
In this type of cohort study, all the data are collected prospectively. The investigator defines the population that will be included in the cohort. They then measure the potential exposure of interest. The participants are then classified as exposed or unexposed by the investigator. The investigator then follows these participants. At baseline and during follow‑up, the investigator also collects information on other variables that are important for the study (such as confounding variables). The investigator then assesses the outcome of interest in these individuals. Some of these outcomes may only occur once (for example, death), and some may occur multiple times (for example, conditions which may recur in the same individual – diarrhea, wheezing episodes, etc.).

Retrospective cohort study
In this type of cohort study, the data are collected from records. Thus, the outcomes have occurred in the past. Even though the outcomes have occurred in the past, the basic study design is essentially the same. Thus, the investigator starts with the exposure and other variables at baseline and at follow‑up and then measures the outcome during the follow‑up period.

Sometimes, the direction may not be as well defined as prospective and retrospective. One may analyze retrospective data on a group of people as well as collect prospective data from the same individuals.

Examples of prospective and retrospective cohort studies

Example 1
Our objective is to estimate the incidence of cardiovascular events in patients with psoriasis. We have decided to conduct a 10‑year study. All the individuals who are diagnosed with psoriasis are eligible for being included in this cohort study. However, one has to ensure that none of them have cardiovascular events at baseline. Thus, they should be thoroughly investigated for the presence of these events at baseline before including them in the study. For this, we have to define all the events we are interested in for the study (such as angina or myocardial infarction). The criteria for identifying psoriasis and cardiovascular outcomes should be decided before initiating the study. All those who do not have cardiovascular outcomes should be followed at regular intervals (predecided by the researcher and as required for clinical management). This will be a prospective cohort study.

Example 2
Our objective is to assess the survival in HIV‑infected individuals and the factors associated with survival.
We have clinical data from about 430 HIV‑infected individuals in the center. The follow‑up period ranges from 3 months to 4 years, and we know that 33 individuals have died in this group. We decide to perform the survival analysis in this group of individuals. We prepare a clinical record form and abstract data from these clinical forms. This design will be a retrospective cohort study.

Outcomes in a Cohort Study

A cohort study may have different types of outcomes. Some of the outcomes may occur only once. In the above‑mentioned retrospective study, if we assess the mortality in these individuals, then the outcome will occur only once. Other outcomes in the cohort study may be measured more than once. For instance, if we assess CD4 counts in the same retrospective study, then the values of CD4 counts may change at every visit. Thus, the outcome will be measured at every visit.

Strengths of a Cohort Study

• Temporality: Since at the time of entry into the cohort study, the individuals do not have outcome, the temporality between exposure and outcome is well defined
• A cohort study helps us to study multiple outcomes in the same exposure. For example, if we follow patients with hypercholesterolemia, we can study the incidence of melasma or psoriasis in them. Thus, there is one exposure (hypercholesterolemia) and multiple outcomes (melasma and psoriasis). However, we have to ensure that none of the individuals have any of the outcomes at the baseline
• If the exposure is rare, then a cohort design is an efficient method to study the relation between exposure and outcomes
• It is generally said that a cohort design may not be efficient for rare outcomes (a case‑control design is preferred). However, if the rare outcome is common in some exposures, then it may be useful to follow a cohort design. For example, melanoma is not a common condition in India. Hence, if we follow individuals to study the incidence of melanoma, then it may not be efficient. However, if we know that, theoretically, a particular chemical may be associated with melanoma, then we should follow a cohort of individuals exposed to this chemical (in occupational settings or otherwise) and study the incidence of melanoma in this group
• In a prospective cohort study, the exposure variable, other variables, and outcomes may be measured more accurately. This is important to maintain uniformity in the measurement of exposures and outcomes. This is also useful for exposures that may require subjective assessment or recall by the patient. For example, dietary history, smoking history, or alcohol history, etc. This may help in reducing the bias in measurement of exposure
• A retrospective cohort study can be completed fast and is relatively inexpensive compared with a prospective cohort study. However, it also has other strengths of the prospective cohort study.

Limitations of a Cohort Study

• One major limitation of a prospective cohort design is that it is time consuming and costly. For example, if we have to study the incidence of cardiovascular events in patients with psoriasis, we may have to follow them up for many years before the outcome occurs
• In a retrospective cohort study, the exposure and the outcome variables are collected before the study has been initiated. Thus, the measurements may not be very accurate or according to our requirements. In addition, some of the exposures may have been assessed differently for various members of the cohort
• As discussed earlier, cohort studies may not be very efficient for rare outcomes except in some conditions.

Additional Points in Cohort Studies

Multiple cohort study
Sometimes, we may be interested to compare the outcomes in two or more groups of individuals. Thus, we may have a multiple cohort study. It is important that the exposure, outcome, and other variables are measured similarly in both the study and the comparison group.

Measurement of exposure and outcome
Since the individuals are included in the study based on the exposure status, this has to be well defined and accurate. The outcomes also have to be well defined and measured similarly in all the participants. If you have more than one group in the cohort (as in multiple cohorts or reference population), you should ensure that the follow‑up protocols are similar in all the groups.

Question: What if there is an error in measuring the exposure or the outcome?
It is quite possible that individuals participating in a cohort study may not be correctly classified – some exposed individuals may be classified as unexposed and the other way round. If the misclassification of the exposure or the outcome is random or nondifferential, then the two groups will be similar and the estimates from the study will be biased towards the null. Thus, we will underestimate the association between the exposure and the outcome. If, however, the misclassification is differential or nonrandom, then the estimates may be
biased toward the null, away from the null, or may be an appropriate estimate.

Follow‑up
Follow‑up of the study participants is very important in a cohort study and losses are an important source of bias in these types of studies. Some patients are lost to follow‑up in large cohorts; however, if the proportion is very high (>30%), then the validity of the results from this study is doubtful. This loss to follow‑up becomes all the more important if it is related to the exposure or outcome of interest. For example, if the majority of the patients lost to follow‑up in our prospective study had severe psoriasis at baseline, then we will get biased estimates from the study. Thus, managing follow‑ups and minimizing losses are an important component of the design of a cohort study.

Nested case–control study
This is a specific type of study design nested within a cohort study. In this, the investigator will match the controls to the cases within a specific cohort. The exposure of interest will be assessed in these selected cases and controls. For example, our hypothesis is that there is a biological marker that is present/elevated (to begin with) in psoriatic patients who develop cardiovascular events. It is expensive to assess this marker in all patients. Thus, we select all those who develop the outcomes (cases) in our cohort and a sample of individuals who do not develop the outcomes (controls). An important aspect, however, is that we should have stored the biological material that we have collected at baseline, and the biological marker should be assessed in this sample. This procedure maintains the temporal strength of the cohort study.

Analysis
Cohort studies will help us to estimate the cumulative incidence and incidence rate.

Cumulative incidence
Example
We follow 10,000 psoriatic patients for 10 years. Of these, 50 have a cardiovascular event. Thus, the cumulative incidence will be 50/10,000 or 0.005. This measure is a proportion. Thus, the cumulative incidence will be 0.5% or 5/1000.

Incidence rate
Example
We follow up 10,000 psoriatic patients for 10 years. Of these, 50 have a cardiovascular event.
How do we calculate the incidence rate?
Let us assume that all the cardiovascular events occurred at the end of the 2nd year. Our outcome of interest was the first cardiovascular event. Thus, at the end of the 2nd year, 50 individuals have the outcome.
The total time contributed by these 50 individuals is 50 × 2 years = 100 person years (PY) ‑ (A).
The total time contributed by the rest of the cohort is (10,000 − 50) × 10 = 99,500 PY ‑ (B).
Thus, the total person time is A + B = 99,600 PY.
The incidence rate is 50/99,600 or 0.000502 per PY. As is obvious from the term, this measure is a rate (compared with cumulative incidence which was a proportion). Thus, the incidence rate of first cardiovascular event in psoriatic patients is 0.502/1000 PY or 5.02/10,000 PY.

Other analysis
Other methods such as logistic regression, Kaplan–Meier curves, Cox regression, Poisson regression, and lognormal regression may be useful in cohort studies. These are relatively advanced analyses and should be discussed with a statistician.

Fixed and random effects models
One of the main strengths of a cohort study is the longitudinal nature of the data. Some of the variables are time varying (such as blood pressure), and some may be time independent (such as sex). The fixed and random effects models are useful to handle longitudinal data. The random effects model provides both between‑ and within‑individual variance and is useful for time‑dependent and time‑independent variables. These models are used for continuous outcomes (such as body mass index) or categorical outcomes (such as presence/absence of psoriasis). These are advanced modeling techniques and should be discussed with a statistician.

Some Practical Points

Project management
The investigator should remember that conducting a large‑scale prospective cohort study requires proper project management.

Follow‑up of participants
The investigator should devise strategies to ensure proper follow‑up of individuals at the designated time intervals. A computer program should be put in place at the start of the prospective study. The program should indicate the number of participants due for a visit every day. If the individual does not attend within the following week, a reminder should be sent to the individual. This can be performed through texting or a phone call to the individual. Some investigators hire field workers or outreach workers to ensure follow‑up of study participants.
It is important that we include only patients with permanent addresses in the area for long‑term cohort studies. Details about the stay (permanent address,
temporary address, and duration of residence in the current address) should be a part of the inclusion criteria.

Data management
The investigator should prioritize data management in these studies. The data entry program should be installed at the start of the project. In addition, data entry and cleaning should be done as soon as data are collected. This will help us to identify the lacunae in the existing data, losses to follow‑up, and missing data points.

Missing data
It is very important to address missing data in cohort studies. There are statistical methods to handle missing data in studies – such as complete case analysis, available case analysis, single imputation, or multiple imputations. The investigator should work with a statistician to address missing data in the dataset. These methods should also be described in the statistical analysis section of the manuscript.

Summary

In a cohort study, participants who do not have the outcome at baseline are followed over time to estimate the incidence of the outcome. In this type of design, the temporality between the exposure and outcome is well defined. The studies may be prospective, retrospective, or a mixture of both. Prospective cohort studies may be time consuming and expensive. Losses during follow‑up are an important source of bias in cohort studies; thus, measures to ensure follow‑up of participants should be included in the design of a prospective cohort study. Advanced modeling techniques are useful to analyze longitudinal data and are preferred in cohort studies.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

Bibliography

1. Hennekens CH, Buring JE. Epidemiology in Medicine. 1st ed. Philadelphia, USA: Lippincott Williams & Wilkins; 1987.
2. Egeberg A, Khalid U, Gislason GH, Mallbris L, Skov L, Hansen PR. Impact of depression on risk of myocardial infarction, stroke and cardiovascular death in patients with psoriasis: A Danish Nationwide Study. Acta Derm Venereol 2015. DOI: 10.2340/00015555-2218.
3. Framingham Heart Study. Available from: https://www.framinghamheartstudy.org/index.php. [Last accessed on 2015 Nov 14].
4. Jensen P, Ahlehoff O, Egeberg A, Gislason G, Hansen PR, Skov L. Psoriasis and new‑onset depression: A Danish Nationwide Cohort Study. Acta Derm Venereol 2015. DOI: 10.2340/00015555-2183.
5. Jewell N. Statistics for Epidemiology. Boca Raton, US: Chapman and Hall/CRC; 2004.
6. Twisk JW. Applied Longitudinal Data Analysis for Epidemiology. 2nd ed. Cambridge, UK: Cambridge University Press; 2013.
7. Keiser O, Taffé P, Zwahlen M, Battegay M, Bernasconi E, Weber R, et al. All cause mortality in the Swiss HIV Cohort Study from 1990 to 2001 in comparison with the Swiss population. AIDS 2004;18:1835‑43.
8. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia, USA: Lippincott Williams and Wilkins; 2008.
10. Kleinbaum D, Kupper L, Morgenstern H. Epidemiologic Research. New York, US: John Wiley and Sons, Inc.; 1982.
11. Pigott TD. A review of methods for missing data. Educ Res Eval 2001;7:353‑83.
12. Samet JM, Munoz A. Cohort Studies. Epidemiol Rev 1998;20:1‑136.
13. Swiss HIV Cohort Study, Schoeni‑Affolter F, Ledergerber B, Rickenbach M, Rudin C, Günthard HF, et al. Cohort profile: The Swiss HIV Cohort study. Int J Epidemiol 2010;39:1179‑89.
14. Hulley SB, Cummings SR, Browner WS, Grady D, Hearst N, Newman TB. Designing Clinical Research. 2nd ed. Philadelphia, USA: Lippincott Williams and Wilkins; 2001.
15. Swiss HIV Cohort Study. Available from: http://www.shcs.ch/. [Last accessed on 2015 Nov 14].
16. Szklo M, Nieto FJ. Epidemiology: Beyond the Basics. Sudbury, MA: Jones and Bartlett Publishers, Inc.; 2004.
17. Snijders TA, Bosker RJ. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. 2nd ed. London, UK: Sage Publications; 2012.
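
Worked example: selecting cases and controls in a nested case–control study

The selection step described under "Nested case–control study" (all cases, plus a sample of matched non-cases from the same cohort) could be sketched as below. This is only an illustration under stated assumptions: the record fields (id, had_event, sex), the choice of sex as the matching variable, and the two-controls-per-case ratio are all hypothetical and do not come from the article. The sketch samples controls from participants who never developed the outcome, as the text describes; other sampling schemes exist and should be discussed with a statistician.

```python
import random


def select_nested_case_control(cohort, controls_per_case=2, seed=0):
    """Return (cases, controls) drawn from one cohort.

    cohort: list of dicts with hypothetical keys 'id', 'had_event', 'sex'.
    Every participant with the outcome is a case; for each case we draw
    controls without replacement from non-cases of the same sex.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    cases = [p for p in cohort if p["had_event"]]
    available = [p for p in cohort if not p["had_event"]]
    controls = []
    for case in cases:
        # Matching step: restrict the pool to non-cases with the same sex.
        pool = [p for p in available if p["sex"] == case["sex"]]
        chosen = rng.sample(pool, min(controls_per_case, len(pool)))
        controls.extend(chosen)
        # A selected control is not reused for another case.
        chosen_ids = {p["id"] for p in chosen}
        available = [p for p in available if p["id"] not in chosen_ids]
    return cases, controls


# Hypothetical cohort of 20 participants; participants 3 and 8 had the event.
cohort = [
    {"id": i, "had_event": i in (3, 8), "sex": "F" if i % 2 else "M"}
    for i in range(1, 21)
]
cases, controls = select_nested_case_control(cohort, controls_per_case=2, seed=1)
print("Cases:", [p["id"] for p in cases])
print("Controls:", sorted(p["id"] for p in controls))
```

Only the stored baseline samples of the selected cases and controls would then need to be assayed for the marker, which is the cost saving the design is meant to provide.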

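
Worked example: cumulative incidence and incidence rate

The two calculations in the "Analysis" section can be reproduced with a few lines of code. The sketch below uses the article's own numbers (10,000 psoriatic patients, 10 years of follow-up, 50 first cardiovascular events, all assumed to occur at the end of the 2nd year). The function and variable names are illustrative choices, not part of the article.

```python
def cumulative_incidence(new_cases, at_risk_at_baseline):
    """Proportion of the baseline at-risk cohort that develops the outcome."""
    return new_cases / at_risk_at_baseline


def incidence_rate(new_cases, person_years):
    """New cases per unit of person-time at risk."""
    return new_cases / person_years


cohort_size = 10_000
events = 50
follow_up_years = 10
years_to_event = 2  # the article's simplifying assumption

ci = cumulative_incidence(events, cohort_size)

# (A) person-years contributed by those who had the event
py_cases = events * years_to_event
# (B) person-years contributed by everyone else
py_rest = (cohort_size - events) * follow_up_years
total_py = py_cases + py_rest

rate = incidence_rate(events, total_py)

print(f"Cumulative incidence: {ci:.3f} ({ci * 1000:.0f} per 1000)")
print(f"Total person-years: {total_py}")
print(f"Incidence rate: {rate * 1000:.3f} per 1000 PY")
```

The denominators differ: cumulative incidence divides by people at risk at baseline, whereas the incidence rate divides by person-time, so participants stop contributing time once they have the first event.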

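
Worked example: a follow-up tracking program

The "Follow‑up of participants" section recommends a program that lists participants due for a visit each day and flags those who have not attended within a week so that a reminder can be sent. A minimal sketch of that logic follows. The Participant fields, the function names, and the example dates are hypothetical; a real study would read these from its data management system and would send the reminder by text or phone call.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Participant:
    study_id: str
    next_visit_due: date
    last_visit: Optional[date] = None  # None if no visit recorded yet


def due_today(participants, today):
    """Participants whose scheduled visit falls on the given day."""
    return [p for p in participants if p.next_visit_due == today]


def needs_reminder(participants, today, grace_days=7):
    """Participants at least grace_days overdue with no visit since the due date."""
    cutoff = today - timedelta(days=grace_days)
    return [
        p for p in participants
        if p.next_visit_due <= cutoff
        and (p.last_visit is None or p.last_visit < p.next_visit_due)
    ]


today = date(2016, 1, 15)
participants = [
    Participant("P001", date(2016, 1, 15)),                    # due today
    Participant("P002", date(2016, 1, 5)),                     # 10 days overdue
    Participant("P003", date(2016, 1, 5), date(2016, 1, 6)),   # attended late
    Participant("P004", date(2016, 1, 12)),                    # overdue, within grace
]

print("Due today:", [p.study_id for p in due_today(participants, today)])
print("Send reminder:", [p.study_id for p in needs_reminder(participants, today)])
```

Running such a check daily from the start of the study makes losses to follow-up visible early, when a reminder or an outreach visit can still recover the participant.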