A flexible modeling approach to estimating the component effects of smoking behavior on lung cancer

https://doi.org/10.1016/j.jclinepi.2004.02.014

Abstract

Objective

Despite the established causal association between cigarette smoking and lung cancer, the relative contributions of age started, duration, years since quitting, and daily amount smoked have not been well characterized. We estimated the contribution of each of these aspects of smoking behavior.

Study design and setting

A case-control study was conducted in Montreal on the etiology of lung cancer. There were 640 cases and 938 control subjects for whom lifetime smoking histories were collected. We used generalized additive models, incorporating cubic smoothing splines to model nonlinear effects of various smoking variables. We adopted a multistep approach to deal with the multicollinearity among time-related variables.

Results

The main findings are that (1) risk increases independently by daily amount and by duration; (2) among current smokers, lung cancer risk doubles for every 10 cigarettes per day up to 30 to 40 cigarettes per day and tails off thereafter; (3) among ex-smokers, the odds ratio decreases with increasing time since quitting, the rate of decrease being sharper among heavy smokers than among light smokers; and (4) absolute risks demonstrate the dramatic public health benefits of long-term smoking cessation.
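Finding (2) can be turned into a small worked calculation. The sketch below assumes that "doubles for every 10 cigarettes per day" means a log-linear odds ratio among current smokers, which is one reading of the sentence rather than a formula given here. The function name `odds_ratio_between` and the hard cutoff at 30 cig/day are illustrative choices; the tail-off above 30 to 40 cig/day is not quantified in the abstract, so the sketch declines to extrapolate there.

```python
def odds_ratio_between(cig_low: float, cig_high: float,
                       doubling_step: float = 10.0,
                       valid_up_to: float = 30.0) -> float:
    """Odds ratio comparing two daily amounts among current smokers.

    Assumption: risk doubles for every `doubling_step` cigarettes per day,
    and only up to `valid_up_to` cig/day (hypothetical cutoff).
    """
    if max(cig_low, cig_high) > valid_up_to:
        raise ValueError("beyond the range where the doubling rule is assumed to hold")
    return 2.0 ** ((cig_high - cig_low) / doubling_step)


# Compare heavier daily amounts with a 10 cig/day smoker.
for amount in (10, 20, 30):
    print(amount, "vs 10 cig/day:", odds_ratio_between(10, amount))
```

Under this reading, a 30 cig/day smoker has four times the odds of a 10 cig/day smoker, since the difference of 20 cig/day spans two doublings.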

Conclusion

Our results reinforce some previous findings on this issue.

Introduction

Lung cancer is the most frequent cancer in the world and a major cause of death. The vast majority of lung cancer cases are attributable to cigarette smoking [1], a widespread habit that is still increasing in many countries [2]. Although the broad outlines of the smoking/cancer relationship are well established, it is not clear how the risk of lung cancer is affected by different aspects of smoking behavior, including age started, duration, years since quitting, and daily amount. Understanding how these components contribute to lung cancer risk can help to predict the impact of different public health interventions or personal behavioral strategies. It also takes advantage of one of the clearest human models of carcinogenesis after exposure to an environmental carcinogen. Thus, an in-depth investigation of how age and duration interact with amount smoked to produce different levels of risk may help us to understand some basic principles of environmental carcinogenesis.

Many studies on smoking and lung cancer have addressed exposure-response issues. However, the quality of the exposure data has often been questionable, with smoking assessed at one point in time rather than over the lifetime. Many studies have examined only one measure of exposure, usually pack-years, or considered each smoking variable separately without adjusting for other aspects of smoking history [3], [4], [5], [6]. Further, most exposure-response relationships were estimated using conventional parametric general linear models that impose strong a priori assumptions about the functional form of the relationship or categorize continuous variables.

The purpose of this article is to explore the quantitative relationships between several aspects of smoking behavior and lung cancer risk. To better understand the exposure-response relationships, we incorporate multiple smoking variables at a time, although this is limited by the multicollinearity among some of the time-related variables. We use generalized additive models (GAMs), which allow flexible modeling of exposure-response functions without assuming a functional form in advance [7]. The methods are applied to data collected in a large cancer case-control study conducted in Montreal in the 1980s.
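To illustrate what flexible modeling of an exposure-response function looks like in practice, here is a minimal sketch in Python with NumPy. It is not the authors' analysis. It uses a penalized cubic regression spline inside a logistic model in place of the cubic smoothing splines of a full GAM, it models a single exposure only, and it runs on synthetic data generated in the code. The function names, knot positions, penalty weight, and simulated risk curve are all assumptions made for the example.

```python
import numpy as np


def cubic_spline_basis(x, knots):
    """Truncated-power cubic basis: 1, x, x^2, x^3, then (x - k)_+^3 per knot."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x, x**2, x**3]
    columns += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(columns)


def fit_penalized_logistic(basis, y, penalized, lam=1.0, max_iter=100, tol=1e-8):
    """Newton (IRLS) fit of a logistic model with a ridge penalty on chosen terms."""
    beta = np.zeros(basis.shape[1])
    penalty = lam * np.diag(penalized.astype(float))
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-(basis @ beta)))  # fitted probabilities
        weights = mu * (1.0 - mu)
        gradient = basis.T @ (y - mu) - penalty @ beta
        hessian = basis.T @ (basis * weights[:, None]) + penalty
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Synthetic data only (NOT the Montreal study): log-odds rise by log(2) per
# 10 cig/day and are flat above 35 cig/day, a hypothetical curve.
rng = np.random.default_rng(0)
MAX_CIG = 60.0
cig_per_day = rng.uniform(0.0, MAX_CIG, size=5000)
true_log_odds = -1.5 + np.log(2.0) * np.minimum(cig_per_day, 35.0) / 10.0
is_case = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_log_odds)))

# Exposure is rescaled to [0, 1] to keep the cubic terms well conditioned.
knots = np.array([0.25, 0.5, 0.75])          # assumed knot positions
penalized = np.array([0, 0, 0, 0, 1, 1, 1])  # shrink only the knot terms
beta = fit_penalized_logistic(
    cubic_spline_basis(cig_per_day / MAX_CIG, knots), is_case, penalized
)


def fitted_log_odds(cig):
    """Fitted log-odds at one or more daily amounts."""
    return cubic_spline_basis(np.atleast_1d(cig) / MAX_CIG, knots) @ beta


# Odds ratios relative to 10 cig/day, read off the fitted curve.
for cig in (20, 30, 40, 50):
    odds_ratio = float(np.exp(fitted_log_odds(cig) - fitted_log_odds(10))[0])
    print(f"{cig} vs 10 cig/day: OR = {odds_ratio:.2f}")
```

The point of the design is that the shape of the curve is estimated from the data rather than fixed as linear or cut into categories. A full GAM would add one smooth term per smoking variable and choose the amount of smoothing from the data, neither of which this sketch attempts.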

Section snippets

Methods

The study design and data collection methods have been presented in detail elsewhere [8], [9]. Nineteen cancer sites were selected for study among men 35 to 70 years of age who were living in the Montreal area. Participation of all large hospitals in the area assured virtually complete population-based ascertainment of incident cases. Between 1979 and 1985, 3,730 cancer patients (82% response rate) and 533 control subjects selected from electoral lists and by random digit dialing (71% response

Description

This was a heavily smoking population. Nearly 80% of the control subjects had ever been regular smokers, and fewer than 40% had quit. Among control subjects, the average smoker had smoked 28 cigarettes per day (cig/day) for 36 years. Table 2 reports distributions of smoking variables among case subjects and two control groups. About 20% of control subjects were never-smokers, compared with 1.25% of lung cancer case subjects. Table 2 also provides ORs for each categorized smoking variable, for

Discussion

We estimated the contributions to risk of various aspects of smoking behavior by dealing with the two major methodologic obstacles: multicollinearity of the different aspects and possible nonlinearity of their effects. Our model-building involved two main steps: (i) restricting the analyses to current- and never-smokers to eliminate any effect of time since quitting and to estimate effects of age started and duration; and (ii) carrying out analyses in the entire sample, fixing the duration

Acknowledgments

The collection of original data was supported by grants from the Institut de Recherche en Santé et Sécurité du Travail du Québec, the National Research and Development Program, and the National Cancer Institute of Canada (Principal Investigator: Dr. Jack Siemiatycki). Dr. Bernard Rachet was supported by an international research fellowship from the Ligue Nationale Contre le Cancer, France; a postdoctoral fellowship from the Institut National de la Recherche Scientifique-Institut

References (14)

  • E. Matos et al. Lung cancer and smoking: a case-control study in Buenos Aires, Argentina. Lung Cancer (1998)
  • D.M. Parkin et al. At least one in seven cases of cancer is caused by smoking. Global estimates for 1985. Int J Cancer (1994)
  • R. Peto et al. Smoking, smoking cessation, and lung cancer in the UK since 1950: combination of national statistics with two case-control studies. BMJ (2000)
  • M. Dosemeci et al. Mortality among laboratory workers employed at the U.S. Department of Agriculture. Epidemiology (1992)
  • M. Kreuzer et al. Risk factors for lung cancer in young adults. Am J Epidemiol (1998)
  • J.E. Muscat et al. Cigarette smoking and large cell carcinoma of the lung. Cancer Epidemiol Biomarkers Prev (1997)
  • T.J. Hastie et al. Generalized additive models (1990)
There are more references available in the full text version of this article.
