Reliability of Hamilton-Norwood classification.

Guarrera M; Cardo P; Arrigo P; Rebora A

doi:10.4103/0974-7753.58554

Affiliations

1. Department of Endocrinological and Medical Sciences, Section of Dermatology, Genoa, Italy.

ORCIDs linked to this article

Rebora A | 0000-0001-6233-9980

International Journal of Trichology, 01 Jul 2009, 1(2):120-122
https://doi.org/10.4103/0974-7753.58554 PMID: 20927233 PMCID: PMC2938573

Abstract

Background

The Hamilton-Norwood scale (HNS) has been widely used to assess the clinical severity of androgenetic alopecia (AGA), especially in therapeutic trials, and even to establish its association with important diseases such as ischemic heart disease and prostate cancer.

Objective

To study HNS reproducibility in the hands of dermatologists and dermatology residents.

Materials and methods

Seven dermatologists and 16 residents in dermatology classified 43 photographs of male heads with different degrees of AGA. In a second study, 8 appraisers (3 dermatologists and 5 residents in dermatology) examined 56 pictures with the same procedure and repeated the observation 3 months later. In the first study, the inter-rater agreement was estimated by calculating an intra-class correlation coefficient (ICC). In the second study, for intra-rater repeatability, each rater's scores from session 1 were paired with his/her scores for the same subjects in session 2, and the ordinary least products linear regression was calculated.

Results

In the first study, the concordance of appraisers was unsatisfactory (ICC = 0.63-0.68). In the second study, repeatability was poor, without any significant difference between dermatologists and dermatology residents.

Comment

Reliability of HNS is unsatisfactory even in the hands of expert appraisers. To obtain better reliability, the number of classes should be reduced, but with such reduction HNS would be usable to classify patients only in a broad way.

Keywords: Alopecia, baldness, hair, Hamilton

INTRODUCTION

The Hamilton-Norwood scale (HNS) has been widely used to assess the clinical severity of androgenetic alopecia (AGA), especially in therapeutic trials, and to establish its association with important diseases such as ischemic heart disease and prostate cancer.[1-8] Its reproducibility, however, has been poorly studied.[9,10] We investigated HNS reproducibility in the hands of dermatologists and dermatology residents and found it unsatisfactory.

MATERIALS AND METHODS

Seven dermatologists and 16 dermatology residents were recruited as raters. They classified 43 photographs of male heads showing different degrees of AGA while constantly consulting a drawing depicting the HNS, excluding the anterior model. Each examiner independently scored the pattern of each photograph as grade I, II, III, III vertex, IV, V, VI or VII, coded as the ordinals 1 to 7 plus a value of 3.5 for grade III vertex.
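The coding scheme described above can be sketched as a simple lookup table. This is only an illustration: the numeric values come from the text, while the names HNS_SCORES and encode_ratings are hypothetical and not part of the original study.

```python
# Numeric coding of HNS grades as described in the text:
# ordinals 1 to 7, plus 3.5 for grade "III vertex".
# The dictionary and function names are hypothetical.
HNS_SCORES = {
    "I": 1.0,
    "II": 2.0,
    "III": 3.0,
    "III vertex": 3.5,
    "IV": 4.0,
    "V": 5.0,
    "VI": 6.0,
    "VII": 7.0,
}


def encode_ratings(labels):
    """Convert a sequence of HNS grade labels into numeric scores."""
    return [HNS_SCORES[label] for label in labels]


# Made-up example: one rater's grades for three photographs.
scores = encode_ratings(["II", "III vertex", "VII"])
```

Coding grade III vertex as 3.5 keeps it between grades III and IV, so that a disagreement between III and III vertex counts as smaller than one between III and VII.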

In a second study, 3 dermatologists and 5 dermatology residents scored 56 randomly selected photographs on two separate occasions, 3 months apart, with the same procedure. In the second session, the pictures were shown in the same order as in the first.

Statistical analyses were done using SPSS version 17.0 (SPSS Inc., Chicago, IL).

In the first study, inter-rater reproducibility was estimated by calculating the intra-class correlation coefficient (ICC) based on an ANOVA mixed model.[11] The ICC cannot exceed 1; the closer it is to 1, the more closely the raters' scores for the same photograph agree. In the second study, for intra-rater repeatability, each rater's scores from session 1 were paired with his/her scores for the same subjects in session 2, and the ordinary least products (OLP) linear regression and R² were calculated.[12] R² measures the strength of linear association between two variables and equals 1 when the correlation is perfect.
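As an illustration of the ICC calculation, the sketch below computes the two-way random effects, absolute agreement, single-rater coefficient, often written ICC(2,1) after Shrout and Fleiss.[11] It is not the SPSS procedure used in the study. The choice of the single-rater form is an assumption, since the text does not say whether single or average measures were reported, and the function name and toy scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per photograph, each row holding
    one numeric score per rater (no missing values).
    """
    n = len(ratings)       # number of photographs (subjects)
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-photograph mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up scores: 4 photographs (rows) rated by 3 raters (columns).
toy_scores = [
    [2.0, 3.0, 2.0],
    [3.5, 4.0, 3.0],
    [5.0, 5.0, 6.0],
    [7.0, 6.0, 7.0],
]
toy_icc = icc_2_1(toy_scores)
```

Because the absolute agreement definition includes the between-rater mean square in the denominator, a rater who scores systematically higher or lower than the others lowers the coefficient even if the ranking of photographs is identical.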

RESULTS

In the first study, data were greatly dispersed (coefficients of variation ranged from 66% to 357%). The overall ICC was 0.65 (P < 0.001). Only a slight difference between dermatology residents and dermatologists (0.631 vs. 0.683, respectively) was observed [Table 1].

Table 1

Intraclass correlation coefficients

Raters	N	Intraclass correlation	95% confidence interval
Residents	16	0.631	0.527 to 0.739
Dermatologists	7	0.683	0.571 to 0.788
Total	23	0.655	0.556 to 0.757

Two-way random effects model where both raters' effects and measure effects are random, using an absolute agreement definition

In the second study, the correlation between the data from the first session and those from the second one (repeatability) was unsatisfactory. In fact, only one dermatologist reached an adjusted R² of 0.75. No significant difference between dermatologists and dermatology residents was observed [Table 2].

Table 2

Ordinary least products linear regression analysis for repeatability

Raters	Name	Slope	Bias	Adjusted R²
Residents	GV	0.893	-0.145	0.6363
Residents	AV	0.845	0.822	0.7443
Residents	MR	0.919	0.405	0.7111
Residents	AC	0.760	0.887	0.4578
Residents	MB	0.788	0.970	0.7226
Dermatologists	LY	0.880	0.687	0.6887
Dermatologists	FV	0.915	0.308	0.5991
Dermatologists	MG	0.856	0.480	0.7535

With perfect repeatability, the slope is 1 and the bias is 0. R² ranges from 0 to 1.
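For readers unfamiliar with OLP regression, the following sketch shows how slope, bias and adjusted R² might be obtained for one rater's paired scores. It rests on several assumptions: that the "Bias" column corresponds to the regression intercept, that adjusted R² uses the usual one-predictor correction, and that the function name and toy data (which are made up) stand in for the SPSS analysis actually performed.

```python
import math


def olp_regression(session1, session2):
    """Ordinary least products (geometric mean) regression of y on x.

    Returns (slope, intercept, adjusted R squared). Both inputs are
    equal-length sequences of numeric scores for the same photographs.
    """
    n = len(session1)
    mean_x = sum(session1) / n
    mean_y = sum(session2) / n
    sxx = sum((x - mean_x) ** 2 for x in session1)
    syy = sum((y - mean_y) ** 2 for y in session2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(session1, session2))

    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation
    # OLP slope: ratio of standard deviations, carrying the sign of r.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x

    r_squared = r * r
    # Adjusted R squared for a single predictor (assumed correction).
    adjusted = 1 - (1 - r_squared) * (n - 1) / (n - 2)
    return slope, intercept, adjusted


# Made-up example: one rater's scores for six photographs in two sessions.
first = [1.0, 2.0, 3.5, 4.0, 6.0, 7.0]
second = [1.0, 3.0, 3.0, 5.0, 6.0, 6.0]
slope, bias, adj_r2 = olp_regression(first, second)
```

Unlike ordinary least squares, OLP treats both sessions as subject to error, which suits a repeatability design where neither session is a reference standard.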

DISCUSSION

AGA severity and drug efficacy have been repeatedly assessed by HNS. HNS has even been employed to evaluate issues as important as the relationship of baldness with coronary artery disease and prostate cancer. In such cases, the assessment has often been done by non-dermatologists or even by the patients themselves. HNS reliability has been examined recently. Taylor and colleagues[9] examined 105 males who were invited to select a picture that best represented their balding pattern. Two trained appraisers independently assessed the participants' balding patterns, and the men's self-assessment was compared with the assessment of the two trained appraisers. The appraisers were very reliable in their assessment of balding pattern (Cohen's κ = 0.83), but when the men's self-assessment was compared with that of the two trained appraisers, concordance fell to 0.39-0.46. It should be noted, however, that the classes of HNS were reduced to 4. In a second analysis, Taylor et al. studied the concordance between the assessment by each observer and that by the patients of the balding patterns of the patients at age 35. The latter proved to be only moderately accurate. Littman and colleagues[10] studied 100 men who were invited to describe their hair patterning at age 30, at age 45 and at their current age (50 to 76). κ values were 0.74, 0.71 and 0.81, respectively, but HNS classes were reduced to only 3. In addition, κ for the comparison of the subjects' report with the interviewer's assessment was only 0.47.

Two considerations should be made, however, when appraising those studies. The first is that to achieve a good agreement, HNS classes had to be reduced to 4 and 3, respectively.[9,10] The second consideration concerns the applicability of Cohen's κ and Fleiss' κ statistics. Fleiss' κ is a statistical measure of inter-rater reliability which, unlike Cohen's κ, works for multiple raters giving categorical ratings to a fixed number of items. It expresses the extent to which the observed agreement among raters exceeds that expected if all raters made their ratings completely at random. This test ranks the agreement on the basis of the estimated κ value. As currently accepted, the agreement is regarded as poor if κ ≤ 0, slight if 0 < κ ≤ 0.20, fair if 0.20 < κ ≤ 0.40, moderate if 0.40 < κ ≤ 0.60, substantial if 0.60 < κ ≤ 0.80, and almost perfect if 0.80 < κ ≤ 1.00.[13] One disadvantage of Fleiss' κ is that it ignores any ordering. For example, the disagreement between class III and class VII has the same weight as that between class III and class III vertex.[14] We adopted the ICC precisely to overcome this problem.
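The point about ordering can be illustrated with a small sketch of Fleiss' κ. The function below follows the standard definition (mean observed pairwise agreement corrected for chance agreement from the pooled category proportions); the function name and both toy data sets are hypothetical and do not come from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical ratings.

    `ratings` is a list of items; each item is a list of category labels,
    one per rater, with the same number of raters for every item.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    observed = agreement_sum / n_items
    # Chance agreement from the pooled category proportions.
    expected = sum(
        (c / (n_items * n_raters)) ** 2 for c in category_totals.values()
    )
    return (observed - expected) / (1 - expected)


# Made-up ratings by 3 raters on 4 photographs.
# In `near`, disagreements are between adjacent classes (III vs. III vertex).
near = [
    ["III", "III", "III vertex"],
    ["V", "V", "V"],
    ["III vertex", "III vertex", "III"],
    ["V", "V", "V"],
]
# In `far`, the same disagreements are between distant classes (III vs. VII).
far = [["VII" if g == "III vertex" else g for g in item] for item in near]

kappa_near = fleiss_kappa(near)
kappa_far = fleiss_kappa(far)
```

Both data sets give the same κ, because the statistic sees only whether labels match, not how far apart they lie on the scale. A measure computed on the numeric scores, such as the ICC, would penalize the second data set more heavily.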

On the other hand, if the number of classes is reduced to obtain good reliability, the classification becomes usable, as Chamberlain and Dawber put it, only for the broader classification of patients who are likely to respond to therapies.[15]

ICC is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other and is currently used to assess the reproducibility of quantitative measurements made by different observers measuring the same quantity.

Assessing AGA severity is not easy, especially if reliable, inexpensive and minimally invasive means are required. In our hands, HNS proved unsatisfactorily reproducible. A new classification, universal for men and women, has been recently introduced[16] but still awaits validation. Instead, computer-assisted measurements of hair density[17] or a cheaper and easier method such as the modified wash test[18] should be adopted. The latter provides the percentage of vellus telogen hairs, which is probably a more accurate measure of AGA severity.

Footnotes

The paper was presented at the 14th Annual Meeting of EHRS, July 2-4, 2009, Graz, Austria.

Source of Support: Nil

Conflict of Interest: None declared

REFERENCES

1. Lesko SM, Rosenberg L, Shapiro S. A case-control study of baldness in relation to myocardial infarction in men. JAMA. 1993;269:998–1003.

2. Rebora A. Baldness and coronary artery disease: The dermatologic point of view of a controversial issue. Arch Dermatol. 2001;137:943–7.

3. Oishi K, Okada K, Yoshida O, Yamabe H, Ohno Y, Hayes RB, et al. Case-control study of prostatic cancer in Kyoto, Japan: Demographic and some lifestyle risk factors. Prostate. 1989;14:117–22.

4. Demark-Wahnefried W, Lesko SM, Conaway MR, Robertson CN, Clark RV, Lobaugh B, et al. Serum androgens: Associations with prostate cancer risk and hair patterning. J Androl. 1997;18:495–500.

5. Demark-Wahnefried W, Schildkraut JM, Thompson D, Lesko SM, McIntyre L, Schwingl P, et al. Early onset baldness and prostate cancer risk. Cancer Epidemiol Biomarkers Prev. 2000;9:325–8.

6. Hawk E, Breslow RA, Graubard BI. Male pattern baldness and clinical prostate cancer in the epidemiologic follow-up of the first National Health and Nutrition Examination Survey. Cancer Epidemiol Biomarkers Prev. 2000;9:523–7.

7. Wynder EL, Mabuchi K, Whitmore WF Jr. Epidemiology of cancer of the prostate. Cancer. 1971;28:344–60.

8. Giles GG, Severi G, Sinclair R, English DR, McCredie MR, Johnson W, et al. Androgenetic alopecia and prostate cancer: Findings from an Australian case-control study. Cancer Epidemiol Biomarkers Prev. 2002;11:549–53.

9. Taylor R, Matassa J, Leavy JE, Fritschi L. Validity of self reported male balding patterns in epidemiological studies. BMC Public Health. 2004;4:60.

10. Littman AJ, White E. Reliability and validity of self-reported male balding patterns for use in epidemiologic studies. Ann Epidemiol. 2005;15:771–2.

11. Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull. 1979;86:420–7.

12. Ludbrook J. Comparing methods of measurement. Clin Exp Pharmacol Physiol. 1997;24:193–203.

13. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York: Wiley; 1981.

14. Bland JM, Altman DG. Statistics notes: Measurement error and correlation coefficients. BMJ. 1996;313:41–2.

15. Chamberlain AJ, Dawber RP. Methods of evaluating hair growth. Australas J Dermatol. 2003;44:10–8.

16. Lee WS, Ro BI, Hong SP, Bak H, Sim WY, Kim do W, et al. A new classification of pattern hair loss that is universal for men and women: Basic and specific (BASP) classification. J Am Acad Dermatol. 2007;57:37–46.

17. Blume-Peytavi U, Hillmann K, Guarrera M. Hair growth assessment techniques. In: Blume-Peytavi U, Tosti A, Whiting DA, Trüeb R, editors. Hair growth and disorders. Berlin: Springer; 2008. pp. 125–54.

18. Rebora A, Guarrera M, Baldari M, Vecchio F. Distinguishing androgenetic alopecia from chronic telogen effluvium when associated in the same patient: A simple non invasive method. Arch Dermatol. 2005;141:1243–5.
