Outbreak Investigation
Community Medicine · Epidemiology · lean revision notes
Outbreak investigation is the systematic, time-bound field response to an excess of cases over the expected baseline. It blends descriptive epidemiology, the epidemic curve, and analytical study designs to find the source and stop transmission. This is a perennial Community Medicine favourite because it integrates definitions, the 10 sequential steps, incubation-period arithmetic, and R0/herd-immunity formulae into one applied package.
Core definitions and classification
A precise vocabulary is examined repeatedly, so anchor these first.
| Term | Definition |
|---|---|
| Endemic | Constant/usual presence of a disease in a population/area |
| Hyperendemic | Persistently high incidence (e.g. malaria in some tribal belts) |
| Sporadic | Scattered cases with no apparent relation in time/place |
| Epidemic (outbreak) | Occurrence of cases clearly in excess of expected in a given area/period |
| Pandemic | Epidemic spreading over several countries/continents, affecting large numbers |
| Exotic | Disease imported into a country where it does not otherwise occur (e.g. rabies imported into a rabies-free country) |
High-yield: "Outbreak" and "epidemic" are used interchangeably; "outbreak" merely connotes a more localised event. Even a single case can constitute an epidemic when the expected number is zero, i.e. when the disease has been eradicated, eliminated from the area, or never reported before (e.g. one case of smallpox, or one case of polio in a polio-free state).
Types of epidemics
- Common-source: exposure to a common agent.
  - Point-source (single exposure): all cases within one incubation period; the classic explosive single-peak curve (e.g. food poisoning at a wedding).
  - Continuous/intermittent common source: repeated/ongoing exposure (e.g. contaminated municipal water supply); curve has a plateau or irregular peaks.
- Propagated (progressive source): person-to-person transmission; series of progressively taller peaks one incubation period apart (e.g. hepatitis A, measles).
- Mixed: common source followed by secondary person-to-person spread (e.g. shigellosis, hepatitis A from food then contacts).
- Slow (modern) epidemics: chronic non-communicable disease "epidemics" — CHD, obesity, diabetes.
The epidemic curve — how to read it
The epidemic curve is a histogram of number of cases (y-axis) plotted against date/time of onset (x-axis). The x-unit should be roughly one-quarter to one-third of the incubation period of the suspected disease so the shape is not distorted.
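The binning rule above can be sketched in Python. This is a minimal illustration, assuming onset times are available as `datetime` values; the function names `bin_width_hours` and `epidemic_curve` and the onset times are hypothetical and not taken from any real outbreak.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_width_hours(incubation_hours, fraction=0.25):
    """Suggested x-axis unit: 1/4 to 1/3 of the suspected incubation period."""
    if not 0.25 <= fraction <= 1 / 3:
        raise ValueError("fraction should lie between 1/4 and 1/3")
    return incubation_hours * fraction


def epidemic_curve(onsets, width_hours):
    """Count cases per interval of width_hours, starting at the first onset."""
    start = min(onsets)
    width = timedelta(hours=width_hours)
    # Integer bin index for each case: elapsed time divided by the bin width.
    counts = Counter((t - start) // width for t in onsets)
    return [(start + i * width, counts.get(i, 0)) for i in range(max(counts) + 1)]


# Hypothetical onset times (illustrative only); suspected incubation about 12 h.
base = datetime(2024, 1, 1, 20, 0)
onsets = [base + timedelta(hours=h) for h in (0, 1, 2, 3, 3, 4, 4, 4, 5, 7, 10)]
width = bin_width_hours(12, fraction=0.25)  # 3-hour bins

for bin_start, n in epidemic_curve(onsets, width):
    print(bin_start.strftime("%d %b %H:%M"), "#" * n)
# Cases per 3-hour bin: 3, 6, 1, 1
```

A bin much wider than this would merge the whole outbreak into one or two bars, while a much narrower one would scatter it into noise.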
Reading the curve answers four questions: pattern of spread, probable time of exposure, the likely incubation period, and presence of outliers.
| Feature | Point-source | Propagated |
|---|---|---|
| Number of peaks | Single sharp peak | Multiple progressively taller peaks |
| Spacing of peaks | — | Successive peaks ~1 incubation period apart |
| Spread of cases | All within one incubation period | Over several incubation periods |
| Up/down slope | Steep rise, gradual fall | Gradual, undulating |
| Classic example | Food poisoning | Measles, hepatitis A |
High-yield: In a point-source outbreak, counting back one average (median) incubation period from the peak of the curve estimates the time of exposure. To bracket the exposure window from both directions, count back the minimum incubation period from the first case and the maximum incubation period from the last case; in a true point-source outbreak the two estimates should fall close together.
Outliers at the leading edge may be the index/source case or background; outliers at the tail may represent secondary cases, a longer incubation, or a separate exposure — always investigate them individually.
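The back-calculation of exposure time can be sketched as follows. This is an illustrative Python sketch for a point-source outbreak only; the function name `exposure_estimates` and the dates are hypothetical, and the 8 to 16 hour range is simply the C. perfringens range quoted elsewhere in these notes.

```python
from datetime import datetime, timedelta


def exposure_estimates(first_onset, peak_onset, last_onset,
                       min_incub, typical_incub, max_incub):
    """Back-calculate the probable exposure time for a point-source outbreak.

    Onsets are datetime values; incubation periods are timedelta values.
    """
    return {
        # Peak of the curve minus one average (median) incubation period.
        "from_peak": peak_onset - typical_incub,
        # First case minus the minimum incubation period.
        "from_first_case": first_onset - min_incub,
        # Last case minus the maximum incubation period.
        "from_last_case": last_onset - max_incub,
    }


# Hypothetical onsets, incubation assumed 8-16 h with a typical value of 12 h.
est = exposure_estimates(
    first_onset=datetime(2024, 3, 2, 2, 0),
    peak_onset=datetime(2024, 3, 2, 6, 0),
    last_onset=datetime(2024, 3, 2, 10, 0),
    min_incub=timedelta(hours=8),
    typical_incub=timedelta(hours=12),
    max_incub=timedelta(hours=16),
)
for label, when in est.items():
    print(label, when)
# All three estimates point to 2024-03-01 18:00, consistent with a single meal.
```

If the three estimates were far apart, that would argue against a single point exposure or suggest that outlying cases belong to a different exposure.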
Sequential steps of outbreak investigation
The classic field sequence (CDC / Park) — examiners love the order. A useful flow:
Establish existence → Confirm diagnosis → Define & count cases → Describe by time-place-person → Hypothesise → Test hypothesis → Refine → Additional studies → Control & prevention → Communicate.
- Establish the existence of an epidemic. Compare observed with expected (past records, neighbouring areas, seasonal baseline). Rule out artefacts: change in reporting, new diagnostic test, lab error, population influx.
- Confirm the diagnosis clinically and by laboratory (the investigation of choice for the agent — culture, serology, PCR). Examine a sample of cases yourself.
- Define a case and count cases. Build a case definition with clinical criteria + restrictions on time, place, person. Use a graded definition: confirmed → probable → possible (suspect). A sensitive (loose) definition early; tighten later for analysis.
- Describe the outbreak by time, place, person (descriptive epidemiology).
  - Time → plot the epidemic curve.
  - Place → spot map (e.g. John Snow's Broad Street pump map).
  - Person → attack rates by age, sex, occupation, food eaten.
- Formulate a hypothesis about source, agent, mode of transmission, and exposure.
- Test the hypothesis with an analytical study — usually a retrospective cohort (when the population is well-defined, e.g. wedding attendees) or a case-control study (when the population is open/undefined).
- Search for additional / missing cases and refine the hypothesis.
- Conduct further studies — environmental sampling, food samples, water testing, entomological/serological surveys.
- Institute control & preventive measures (do not wait for full analysis — act on the most likely source as evidence accumulates).
- Report / communicate findings to authorities and community; write up for future prevention.
High-yield: The first step is to confirm that an epidemic actually exists (rule out pseudo-epidemics). The case definition comes before counting cases. Control measures need not wait for the investigation to conclude.
Attack rate and the analytical core
In food-borne outbreaks the food-specific attack rate table is the workhorse.
Attack rate (AR) = (number who develop the disease / number at risk) × 100, over the epidemic period. It is a form of cumulative incidence used for a short, sharply defined period.
| Food item | Ate it: ill / total (AR%) | Did NOT eat: ill / total (AR%) | RR |
|---|---|---|---|
| Suspect dish | high AR (e.g. 80%) | low AR (e.g. 10%) | high |
The implicated food is the one with (a) a high attack rate among those who ate it, (b) a low attack rate among those who did not, and (c) the largest gap between the two, whether expressed as a difference (risk difference) or as a ratio (relative risk).
- Cohort study → compute Relative Risk (RR) = AR(exposed)/AR(unexposed).
- Case-control study → compute Odds Ratio (OR) (denominator unknown, so RR cannot be computed).
High-yield: When the entire exposed population is enumerable (a defined "cohort" of attendees), a retrospective cohort is the design of choice and RR is calculated. When cases arise from an undefined population, use case-control and OR.
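One row of a food-specific attack-rate table can be computed with a short Python sketch. The function names and the counts are hypothetical; the counts were chosen only to reproduce the 80% and 10% attack rates used as the example in the table above.

```python
def attack_rate(ill, total):
    """Attack rate as a percentage of those at risk."""
    return 100.0 * ill / total


def food_specific_row(ill_ate, total_ate, ill_not, total_not):
    """Attack rates, risk difference, RR and OR for one food item."""
    ar_exposed = attack_rate(ill_ate, total_ate)
    ar_unexposed = attack_rate(ill_not, total_not)
    # 2x2 cells: a = ill eaters, b = well eaters, c = ill non-eaters, d = well non-eaters.
    a, b = ill_ate, total_ate - ill_ate
    c, d = ill_not, total_not - ill_not
    return {
        "ar_exposed": ar_exposed,
        "ar_unexposed": ar_unexposed,
        "risk_difference": ar_exposed - ar_unexposed,
        "rr": ar_exposed / ar_unexposed if ar_unexposed else float("inf"),
        "or": (a * d) / (b * c) if b * c else float("inf"),
    }


# Hypothetical suspect dish: 40 of 50 eaters ill, 5 of 50 non-eaters ill.
row = food_specific_row(ill_ate=40, total_ate=50, ill_not=5, total_not=50)
print(row)
# AR 80% vs 10%, risk difference 70 points, RR = 8.0, OR = 36.0
```

In a cohort both RR and OR can be computed, but only the OR is available in a case-control study, where the denominators (totals at risk) are unknown. Note that the OR (36) is much larger than the RR (8) here because the illness is common among the exposed.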
Secondary attack rate (SAR) = (number of new cases among contacts within the range of the incubation period after exposure to the primary case ÷ total susceptible contacts) × 100. Primary cases are excluded from both numerator and denominator. The SAR measures infectivity/communicability and the effect of person-to-person spread; a high SAR favours a propagated component.
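The SAR arithmetic can be sketched in Python. The function name, its parameters and the household figures are hypothetical; the sketch assumes that primary cases and already-immune members are removed from the denominator.

```python
def secondary_attack_rate(new_cases, group_size, primary_cases=1, immune=0):
    """SAR (%) among susceptible contacts of the primary case(s).

    new_cases: contacts who fall ill within the range of the incubation period.
    group_size: everyone in the household or other closed group.
    """
    susceptible_contacts = group_size - primary_cases - immune
    if susceptible_contacts <= 0:
        raise ValueError("no susceptible contacts in this group")
    return 100.0 * new_cases / susceptible_contacts


# Hypothetical household of 6: one primary case, one member already immune,
# and 2 of the remaining 4 susceptible contacts fall ill.
print(secondary_attack_rate(new_cases=2, group_size=6, primary_cases=1, immune=1))
# 50.0
```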
Incubation period arithmetic
The incubation period (time from exposure to onset of symptoms) is central to interpreting curves and identifying the agent.
- Median incubation period points towards the likely organism (e.g. Staph aureus enterotoxin 1–6 h; Bacillus cereus emetic 1–6 h; Clostridium perfringens 8–16 h; Salmonella 12–36 h; Vibrio cholerae hours–5 days; hepatitis A 15–50 days).
- The range (min–max) of the incubation period plus the dates of first and last cases brackets the probable exposure window.
High-yield: Short incubation + vomiting predominant → preformed toxin (Staph, B. cereus emetic). Longer incubation + diarrhoea predominant → in-vivo organism multiplication/invasion (Salmonella, C. perfringens, Shigella). This toxin-vs-invasion split is a classic MCQ.
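As a rough illustration of the incubation arithmetic, the Python sketch below takes the median of observed incubation times and lists the agents whose quoted range contains it. The ranges are copied from the list above (hepatitis A converted from days to hours); Vibrio cholerae is left out because its lower bound is given only as "hours". The names `INCUBATION_RANGES_H` and `candidate_agents` and the sample incubation times are hypothetical, and a match is a pointer for laboratory confirmation, not a diagnosis.

```python
from statistics import median

# Incubation ranges in hours, as quoted in these notes.
INCUBATION_RANGES_H = {
    "Staph aureus enterotoxin": (1, 6),
    "Bacillus cereus (emetic)": (1, 6),
    "Clostridium perfringens": (8, 16),
    "Salmonella": (12, 36),
    "Hepatitis A": (15 * 24, 50 * 24),
}


def candidate_agents(incubation_hours):
    """Return the median incubation and the agents whose range contains it."""
    m = median(incubation_hours)
    matches = [agent for agent, (low, high) in INCUBATION_RANGES_H.items()
               if low <= m <= high]
    return m, matches


# Hypothetical incubation times (hours from the shared meal to onset).
print(candidate_agents([2, 3, 3, 4, 5]))
# (3, ['Staph aureus enterotoxin', 'Bacillus cereus (emetic)'])
print(candidate_agents([10, 12, 14, 15, 20]))
# (14, ['Clostridium perfringens', 'Salmonella'])
```

Overlapping ranges are expected, which is why the clinical picture (vomiting versus diarrhoea predominant) is used alongside the incubation period.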
R0, effective R, and herd immunity
Basic reproduction number (R0): the average number of secondary cases produced by one infectious case introduced into a wholly susceptible population.
- R0 < 1 → infection dies out.
- R0 = 1 → endemic, stable.
- R0 > 1 → epidemic potential; larger R0 = faster/larger spread.
Effective reproduction number (R / Rt): secondary cases in a population that is not wholly susceptible (immunity, interventions present). Control aims to push Rt below 1.
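A minimal Python sketch of how immunity lowers the effective reproduction number. It rests on an assumption that is not stated in these notes: under homogeneous mixing, Rt is approximately R0 multiplied by the susceptible fraction. The function names are hypothetical, and real populations with clustering or interventions will deviate from this simple relation.

```python
import math


def effective_r(r0, immune_fraction):
    """Rt under the simplifying assumption Rt = R0 * (1 - immune fraction)."""
    if not 0.0 <= immune_fraction <= 1.0:
        raise ValueError("immune_fraction must be between 0 and 1")
    return r0 * (1.0 - immune_fraction)


def outlook(r):
    """Interpret a reproduction number using the rules listed above."""
    if math.isclose(r, 1.0):
        return "stable (endemic)"
    return "dies out" if r < 1 else "epidemic potential"


# Measles-like R0 of 15 (from the table of approximate values).
for immune in (0.80, 0.95):
    rt = effective_r(15, immune)
    print(f"immune {immune:.0%}: Rt = {rt:.2f} -> {outlook(rt)}")
# immune 80%: Rt = 3.00 -> epidemic potential
# immune 95%: Rt = 0.75 -> dies out
```

Setting Rt equal to 1 in this relation and solving for the immune fraction gives 1 − 1/R0, which is where the herd immunity threshold comes from.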
| Disease | Approximate R0 |
|---|---|
| Measles | 12–18 |
| Pertussis | 12–17 |
| Diphtheria | 6–7 |
| Smallpox | 5–7 |
| Polio | 5–7 |
| Mumps / Rubella | 4–7 |
| COVID-19 (ancestral) | 2–3 |
| Influenza (pandemic) | 2–3 |
Herd immunity is the resistance of a group to an infectious agent based on the immunity of a high proportion of members, which reduces the chance a susceptible meets an infectious case.
Herd immunity threshold (HIT) — the proportion that must be immune to interrupt transmission:
HIT = 1 − (1/R0) = (R0 − 1)/R0
The critical vaccination proportion (Vc), accounting for vaccine efficacy (E):
Vc = HIT / E = [1 − (1/R0)] / E
High-yield: For measles (R0 ≈ 15), HIT = 1 − 1/15 ≈ 0.93 (93%) — the highest among common vaccine-preventable diseases, which is why measles needs ~95% coverage to eliminate. For R0 = 4, HIT = 75%; for R0 = 2, HIT = 50%. Memorise the relationship: higher R0 → higher HIT.
Worked example: R0 = 5, vaccine efficacy 90%. HIT = 1 − 1/5 = 0.80 (80%). Vc = 0.80/0.90 = 0.89 → 89% coverage needed.
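The same arithmetic as a small Python sketch, useful for checking other R0 and efficacy combinations. The function names are hypothetical; the formulas are exactly the HIT and Vc expressions given above.

```python
def herd_immunity_threshold(r0):
    """HIT = 1 - 1/R0 (taken as 0 when R0 <= 1, since spread is not sustained)."""
    if r0 <= 1:
        return 0.0
    return 1.0 - 1.0 / r0


def critical_vaccination_coverage(r0, efficacy):
    """Vc = HIT / E, with vaccine efficacy E expressed as a fraction (0 < E <= 1)."""
    if not 0.0 < efficacy <= 1.0:
        raise ValueError("efficacy must be a fraction in (0, 1]")
    return herd_immunity_threshold(r0) / efficacy


print(round(herd_immunity_threshold(5), 2))               # 0.8
print(round(critical_vaccination_coverage(5, 0.90), 2))   # 0.89
print(round(herd_immunity_threshold(15), 2))              # 0.93
print(round(herd_immunity_threshold(4), 2))               # 0.75
print(round(herd_immunity_threshold(2), 2))               # 0.5
```

A result above 1.0 from `critical_vaccination_coverage` means the threshold cannot be reached by that vaccine alone, even with 100% coverage.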
Herd immunity does not apply to diseases without person-to-person spread (e.g. tetanus — every individual must be individually protected) and is undermined by clustering of susceptibles (anti-vaccine pockets) even if overall coverage looks adequate.
Control measures and their prioritisation
Control measures are directed at the three links of the chain: source/reservoir, mode of transmission, and the susceptible host.
- Controlling the source/reservoir — early diagnosis, notification, isolation, treatment, and (for zoonoses) dealing with the animal reservoir; quarantine of contacts.
- Interrupting transmission — environmental measures: safe water, chlorination, food withdrawal, vector control, hand hygiene, disinfection.
- Protecting the susceptible host — immunisation (active/passive), chemoprophylaxis, personal protection.
High-yield: In an outbreak the priority is to act on the most readily interruptible link first. For a point-source food outbreak → remove/withdraw the implicated food and treat cases. For a water-borne outbreak (cholera) → chlorinate/secure water and ORS-based case management even before lab confirmation. Immunisation/chemoprophylaxis follows where applicable (e.g. measles outbreak → ring vaccination; meningococcal → chemoprophylaxis of contacts).
Special tools and eponyms
- Spot map (dot map): plotting cases by residence/location, central to place analysis. John Snow's Broad Street pump map (London cholera, 1854) is the founding example of field epidemiology; removing the pump handle is the archetypal control action. Snow is often called the "father of modern epidemiology."
- Line listing: a table of cases with their identifying, clinical, and exposure variables — the raw data structure for the investigation.
- Epidemic threshold / endemic channel: the median plus an upper limit plotted on a control chart, used to declare an epidemic when current counts cross the upper limit; classically applied in meningococcal disease and malaria surveillance.
- Berkson bias, recall bias — pitfalls particularly in the case-control phase.
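The endemic channel in the list above can be sketched in Python. The notes specify only "median + upper limit", so this sketch makes an assumption: the upper limit is taken as the third quartile of the counts for the same calendar period in previous years. The function names and all counts are hypothetical.

```python
from statistics import median, quantiles


def endemic_channel(history):
    """history: {period: [counts in previous years]} -> {period: (median, upper)}.

    Assumption: the upper limit is the third quartile of the historical counts.
    """
    return {
        period: (median(counts), quantiles(counts, n=4, method="inclusive")[2])
        for period, counts in history.items()
    }


def exceeds_channel(current, channel):
    """Flag periods whose current count is above the channel's upper limit."""
    return {period: n > channel[period][1] for period, n in current.items()}


# Hypothetical monthly case counts for the previous five years.
history = {"Jan": [8, 10, 12, 14, 30], "Feb": [5, 6, 7, 9, 11]}
channel = endemic_channel(history)
print(channel)
# {'Jan': (12, 14.0), 'Feb': (7, 9.0)}
print(exceeds_channel({"Jan": 20, "Feb": 8}, channel))
# {'Jan': True, 'Feb': False}
```

Using the median rather than the mean keeps a single past epidemic year (the 30 in January here) from inflating the baseline.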
Complications and pitfalls in interpretation
- Pseudo-epidemics from changes in reporting, new tests, or population migration — must be excluded at step 1.
- Mislabelling propagated vs intermittent common-source curves — both can show multiple peaks; the spacing (≈ one incubation period for propagated) is the discriminator.
- Incomplete case ascertainment biasing attack rates.
- Confounding in food-specific tables (people who ate dish A also ate dish B) — resolved by stratified attack-rate analysis (cross-tabulation) to find the truly implicated item.
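The stratified (cross-tabulated) attack-rate analysis mentioned in the last point can be sketched in Python. The function names and the counts are hypothetical, constructed so that dish B looks risky in the crude table only because most people who ate B also ate dish A.

```python
from collections import defaultdict


def expand(ate_a, ate_b, n_ill, n_total):
    """Build one record per person: (ate_a, ate_b, ill)."""
    return [(ate_a, ate_b, True)] * n_ill + [(ate_a, ate_b, False)] * (n_total - n_ill)


def stratified_attack_rates(records):
    """Attack rate (%) in each (ate_a, ate_b) cell of the cross-tabulation."""
    ill = defaultdict(int)
    total = defaultdict(int)
    for ate_a, ate_b, is_ill in records:
        total[(ate_a, ate_b)] += 1
        ill[(ate_a, ate_b)] += int(is_ill)
    return {cell: 100.0 * ill[cell] / total[cell] for cell in total}


def crude_attack_rate(records, dish_index, ate):
    """Crude attack rate (%) for one dish, ignoring the other dish."""
    group = [r for r in records if r[dish_index] == ate]
    return 100.0 * sum(r[2] for r in group) / len(group)


# Hypothetical outbreak of 100 attendees.
records = (expand(True, True, 32, 40) + expand(True, False, 8, 10)
           + expand(False, True, 1, 10) + expand(False, False, 4, 40))

print(crude_attack_rate(records, 1, True), crude_attack_rate(records, 1, False))
# Crude table for dish B: 66.0 vs 24.0, so B looks implicated.
print(stratified_attack_rates(records))
# Stratified: 80% wherever A was eaten, 10% wherever it was not, regardless of B.
```

Within each stratum of dish A, eating dish B makes no difference to the attack rate, so A is the implicated item and B was merely confounded with it.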
Key differentials — distinguishing the scenarios
| Scenario | Curve | Best clue | Design / action |
|---|---|---|---|
| Wedding food poisoning | Single sharp peak | All onset within hours; same meal | Cohort + food AR table; withdraw food |
| Municipal water cholera | Plateau / sustained | Cases follow water supply zone | Spot map; chlorinate water |
| Measles in school | Successive peaks ~14 days apart | Person-to-person, susceptibles | Ring vaccination; isolate |
| Imported single polio case | n/a | One case = epidemic | Mop-up immunisation, surveillance |
Recently asked / exam angle
- Identify outbreak type from a given epidemic curve — single peak = point-source; progressively taller peaks one incubation period apart = propagated. This image-based MCQ is extremely common.
- Estimate exposure time by counting back one average incubation period from the peak; or bracket it by subtracting the minimum incubation period from the first case's onset and the maximum incubation period from the last case's onset.
- Calculate HIT/Vc given R0 (and sometimes vaccine efficacy): HIT = 1 − 1/R0. Measles ≈ 93–95% is a repeat favourite.
- Sequence the steps — "first step in outbreak investigation" → confirm the existence of an epidemic / verify diagnosis. "Case definition before or after counting?" → before.
- Choice of study design — defined population → retrospective cohort (RR); undefined → case-control (OR).
- Implicated food identification from a food-specific attack-rate table — pick the item with high AR among eaters, low among non-eaters, highest RR.
- R0 ranking — measles/pertussis highest among the listed diseases.
- John Snow / Broad Street pump as the historical landmark of descriptive epidemiology.
- Toxin vs invasion food poisoning by incubation period (short + vomiting vs longer + diarrhoea).
Rapid revision
- First step: confirm an epidemic truly exists (compare observed vs expected; exclude pseudo-epidemic).
- Case definition is made BEFORE counting cases; grade as confirmed → probable → possible.
- Point-source curve = single sharp peak, all within one incubation period; propagated = multiple peaks one incubation period apart.
- Exposure time (point-source) = peak of curve minus one average incubation period.
- Choose epidemic-curve x-axis unit ≈ ¼–⅓ of the incubation period.
- Defined population → cohort → RR; undefined → case-control → OR.
- Implicated food: highest attack rate among eaters, lowest among non-eaters, largest RR.
- Secondary attack rate measures communicability/person-to-person spread.
- R0 < 1 dies out, = 1 endemic, > 1 epidemic. Measles R0 ≈ 12–18 (highest of the common ones).
- HIT = 1 − 1/R0; measles ≈ 93%; Vc = HIT/vaccine efficacy.
- Herd immunity does not apply to diseases without person-to-person spread, such as tetanus.
- Don't wait for full analysis to start control; John Snow's Broad Street pump is the classic descriptive-epidemiology eponym.