Data Dictionary for examples presented in ActivEpi Textbook (2nd edition, 2013) and Interactive ActivEpi

By David Kleinbaum, Kevin M. Sullivan & Minn M. Soe

The data sets for the examples described in the activities and expositions throughout Interactive ActivEpi and the Companion Text are available as freely downloadable files in SAS, Excel, and Stata formats, for use by students and instructors. Most statistical programs can import Excel files.

Lesson 2: Sydney Beach prospective cohort study

The data are from a prospective cohort study designed to investigate whether swimming at beaches around Sydney, Australia, was associated with acute illness. Swimming was defined as placing your head underwater. (Refer to ActivEpi Companion Textbook, Lesson 2, page 25.)

File Name: L02swim Number of records: 2839

Variable Information            Label  Values  Description  Freq
Whether a person became ill     ILL    1. Yes  Disease      683
                                       2. No   No Disease   2156
Whether a person went swimming  SWAM   1. Yes  Exposed      1924
                                       2. No   Not Exposed  915
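As a quick sanity check after importing a file, the category frequencies listed for each variable should add up to the number of records. The sketch below hard-codes the L02swim counts from this dictionary; the names `N_RECORDS`, `FREQS`, and `check_marginals` are illustrative and not part of the ActivEpi materials.

```python
# Counts copied from the L02swim entry in this data dictionary.
N_RECORDS = 2839
FREQS = {
    "ILL": {1: 683, 2: 2156},    # 1 = Yes, 2 = No
    "SWAM": {1: 1924, 2: 915},   # 1 = Yes, 2 = No
}


def check_marginals(freqs, n_records):
    """Return the variables whose category counts do not sum to n_records."""
    return [
        name
        for name, counts in freqs.items()
        if sum(counts.values()) != n_records
    ]


mismatches = check_marginals(FREQS, N_RECORDS)
print(mismatches)  # an empty list means every variable is consistent
```

These marginal counts do not determine the cells of the 2x2 table, so measures such as a risk ratio have to be computed from the record-level file.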

 

Lesson 3: Diabetes randomized clinical trial

The data are based on a randomized clinical trial in which patients were randomly assigned to receive either standard insulin therapy (“standard therapy”) or intensive insulin therapy (“intensive therapy”). The outcome was retinopathy. (Refer to ActivEpi Companion Textbook, Lesson 3, page 42.)

File Name: L03diabetes Number of records: 726

Variable Information                    Label    Values  Description        Freq
Whether a person developed retinopathy  RETINOP  1. Yes  Disease            114
                                                 2. No   No Disease         612
Treatment group                         THERAPY  1. Std  Standard Therapy   378
                                                 2. Int  Intensive Therapy  348

 

 

Lesson 3: Exposure to VDTs and spontaneous abortion in a retrospective cohort study

The data are based on a retrospective cohort study assessing whether exposure to video display terminals (VDTs) was related to spontaneous abortion. (Refer to ActivEpi Companion Textbook, Lesson 3, starting on page 45.)

File Name: L03vdt     Number of records: 882

Variable Information   Label     Values  Description  Freq
Spontaneous abortion?  ABORTION  1. Yes  Disease      136
                                 2. No   No Disease   746
VDT exposure?          VDT       1. Yes  Exposed      366
                                 2. No   Not Exposed  516

 

Lessons 5, 6, and 12: Relationship of continued cigarette smoking with 5-year mortality among patients who had been diagnosed with a heart attack, prospective cohort study

The data are based on a group of men who had been diagnosed with a heart attack, restricted to those who were cigarette smokers. After the heart attack, some men continued to smoke and others quit. The cohort was followed for five years and the risk of mortality was assessed. (Refer to ActivEpi Companion Textbook, Lesson 5, starting on page 106; Lesson 6, starting on page 139; Lesson 12, starting on page 346.)

File Name: L05cohort     Number of records: 156

Variable Information                         Label    Values    Description  Freq
Outcome after five years (died or survived)  OUTCOME  Death     Disease      41
                                                      Survived  No Disease   115
Smoked cigarettes?                           SMOKE    1. Smoke  Exposed      75
                                                      2. Quit   Not Exposed  81

 

Lessons 5 and 12: Contaminated food exposure and diarrheal disease case-control study

The data are based on an outbreak of diarrheal disease at a resort in Haiti. An unmatched case-control study was performed, and the primary exposure was found to be consumption of raw hamburger. (Refer to ActivEpi Companion Textbook, Lesson 5, starting on page 108; Lesson 12, starting on page 361.)

File Name: L05casecont     Number of records: 70

Variable Information  Label     Values   Description  Freq
Case or a control?    CASECONT  case     Disease      37
                                control  No Disease   33
Ate raw hamburger?    ATE       1. yes   Exposed      24
                                2. no    Not Exposed  46

 

Lessons 10 and 14: Association between toxic chemical “X” (TCX) and lung cancer controlling for cigarette smoking in a retrospective cohort study

The data are based on a hypothetical retrospective cohort study to assess the association between TCX and lung cancer, accounting for cigarette smoking.  (Refer to ActivEpi Companion Textbook, Lesson 10, page 285 and Lesson 14, pages 420-435.)

File Name: L10lungca     Number of records: 156

Variable Information  Label   Values  Description  Freq
Lung cancer?          LUNGCA  1. Yes  Disease      41
                              2. No   No Disease   115
Exposure to TCX?      TCX     1. Yes  Exposed      75
                              2. No   Not Exposed  81
Cigarette smoker?     SMOKE   1. Yes               81
                              2. No                75

 

Lesson 15: Association between estrogen use and endometrial cancer, a matched case-control study

The data are based on a pair-matched case-control study in which cases were women diagnosed with endometrial cancer and controls were matched to cases on age, marital status, and date of entry into a retirement center. These data allow pair-matched data to be analyzed with a stratified analysis approach. (Refer to ActivEpi Companion Textbook, Lesson 15, starting on page 490.)

File Name: L15match     Number of records: 126

Variable Information                      Label     Values  Description                Freq
Code to match each control to their case  stratum   1-63
Case or control?                          disease   1       Case (endometrial cancer)  63
                                                    2       Control (no disease)       63
Used estrogen?                            estrogen  1       Yes                        86
                                                    2       No                         40
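For pair-matched data, stratifying on the matching variable makes each stratum a single case-control pair, and the Mantel-Haenszel odds ratio then reduces to the ratio of the two kinds of discordant pairs. The sketch below shows one way to count them, assuming records arrive as (stratum, disease, estrogen) tuples with the 1/2 coding listed for L15match. The function name and the four toy pairs are made up for illustration and are NOT taken from the data set.

```python
from collections import defaultdict


def discordant_pairs(records):
    """Count discordant pairs among 1:1 matched case-control records.

    records: iterable of (stratum, disease, estrogen) with
    disease 1 = case, 2 = control and estrogen 1 = yes, 2 = no.
    Returns (case_only_exposed, control_only_exposed).
    """
    pairs = defaultdict(dict)
    for stratum, disease, estrogen in records:
        role = "case" if disease == 1 else "control"
        pairs[stratum][role] = (estrogen == 1)

    case_only = control_only = 0
    for members in pairs.values():
        if members["case"] and not members["control"]:
            case_only += 1
        elif members["control"] and not members["case"]:
            control_only += 1
    return case_only, control_only


# Made-up records for illustration only; not from L15match.
toy = [
    (1, 1, 1), (1, 2, 2),  # only the case used estrogen
    (2, 1, 1), (2, 2, 1),  # concordant pair: both used estrogen
    (3, 1, 2), (3, 2, 1),  # only the control used estrogen
    (4, 1, 1), (4, 2, 2),  # only the case used estrogen
]

case_only, control_only = discordant_pairs(toy)
matched_or = case_only / control_only  # undefined if control_only is 0
print(case_only, control_only, matched_or)
```

Concordant pairs contribute nothing to the estimate, which is why only the two discordant counts are returned.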

 

Lesson 15: Association between estrogen use and endometrial cancer, a matched case-control study with a covariate

These are the same data as above, with two differences: an additional variable records the presence or absence of gall bladder disease, and the variables use 1/0 coding instead of 1/2 coding, for consistency with their use in logistic regression. (Refer to ActivEpi Companion Textbook, Lesson 15, starting on page 504.)

File Name: L15matchLR     Number of records: 126

Variable Information                      Label     Values  Description                Freq
Code to match each control to their case  stratum   1-63
Case or control?                          disease   1       Case (endometrial cancer)  63
                                                    0       Control (no disease)       63
Used estrogen?                            estrogen  1       Yes                        86
                                                    0       No                         40
History of gall bladder disease?          gall      1       Yes
                                                    0       No
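The change from 1/2 to 1/0 coding amounts to mapping 2 to 0 and leaving 1 unchanged. A minimal sketch of that recoding, assuming each record is held as a dictionary keyed by the variable names above; `RECODE` and `recode_row` are hypothetical helpers, not part of the ActivEpi files.

```python
# 1/2 coding (1 = yes/case, 2 = no/control) to 1/0 coding.
RECODE = {1: 1, 2: 0}


def recode_row(row, columns=("disease", "estrogen")):
    """Return a copy of row with the named columns recoded from 1/2 to 1/0."""
    out = dict(row)
    for col in columns:
        out[col] = RECODE[out[col]]  # raises KeyError on unexpected codes
    return out


example = {"stratum": 5, "disease": 2, "estrogen": 1}
print(recode_row(example))
```

The stratum code is left untouched because it identifies the matched pair rather than a yes/no value.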