This is to certify that the thesis entitled Breast Aid: Results of a Multivariate Model to Predict for Breast Cancer in Women with a Breast Mass, presented by Janet Rose Osuch, M.D., has been accepted towards fulfillment of the requirements for the Master's degree in Epidemiology.
Date: May 1, 2000

BREAST AID: RESULTS OF A MULTIVARIATE MODEL TO PREDICT FOR BREAST CANCER IN WOMEN WITH A BREAST MASS
By Janet Rose Osuch M.D.
A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE, Department of Epidemiology, 2000

ABSTRACT
Most surgeons feel obligated to perform open biopsy of a palpable breast mass, citing the potential consequences of diagnostic delay of breast cancer. However, most breast masses are benign. BREAST AID (Breast Risk Evaluation and Scoring System to Aid in the Diagnosis of Mammary Masses) is a mathematical model designed to address this problem.
It used multivariate logistic regression analysis of 32 epidemiologic variables and 8 multi-level clinical variables to arrive at a final probability of breast cancer. Of the 320 lesions evaluated, BREAST AID triaged 76 into open biopsy and 244 into follow-up. Thirty-five of the 36 cancers were present in the masses triaged to open biopsy, for a cancer rate of 46%. One cancer was present in the 244 cases triaged to follow-up, for a cancer rate of 0.41%. This patient was diagnosed at 6-week follow-up. Before widespread application, a multi-institutional study will be necessary to test the validity and generalizability of the model. If generalizable, BREAST AID has the potential to conserve health care dollars while simultaneously assuring both physicians and patients that follow-up is a safe alternative to open biopsy.

Copyright by JANET ROSE OSUCH 2000

DEDICATION
To all of the women who sought counsel in the Comprehensive Breast Health Clinic at Michigan State University during this project, and who gave so freely of themselves. Earning their trust was an unparalleled privilege.

ACKNOWLEDGEMENTS
This project was an idea first explored with Matt Romans MD, a resident in the Department of Surgery at Michigan State University, whose enthusiasm for it inspired me to continue after his graduation. It was subsequently funded by the MSU Institute for Managed Care and Blue Care Network - Health Central, whose support allowed two graduate assistants to be employed: Jennifer Fusco, an Epidemiology graduate student, and Tosca Kinchelow, who is soon to graduate with an M.D. degree from the College of Human Medicine at MSU. In particular, the long and dedicated hours of Tosca, who has become a dear friend of mine, contributed greatly to the successful outcome of this project. To acquire the skills necessary to pursue the project further, Dorothy Pathak Ph.D., M.S.
mentored me into the Master's degree program in Epidemiology at MSU, an endeavor made possible through the generous funding of the U.S. Department of Defense. Her unending enthusiasm fueled me in the early stages of the development of BREAST AID, and her friendship, inspiration and support served a pivotal role for me. Last but not least, I would like to acknowledge the countless hours of work and mentoring from my major advisor, Mat Reeves BVSc. Ph.D. His previous work with a similar model in veterinary practice and his experience in Epidemiology provided the needed tools to develop BREAST AID to its current level. More importantly, however, his patience, persistence, and energetic and creative input into this project make it just as much his as mine. To him I am deeply grateful.

TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS AND ABBREVIATIONS
INTRODUCTION
LITERATURE REVIEW
  Decision-Making in the Diagnosis of the Palpable Breast Mass: Who Needs Open Biopsy?
  The Malignant: Benign Ratio on Open Breast Biopsy
  Risk Factors for Breast Carcinoma
    Age
    Reproductive Factors/Endogenous Hormones
    Exogenous Hormone Use
    Personal History of Breast Cancer
    Family History of Breast Cancer
    Environmental Risk Factors
  Variables Collected during the History of the Individual Patient
    Asymptomatic Women
    Symptomatic Women - Risk Factors
    Symptomatic Women - The Presenting Complaint
  Physical Examination of the Breasts
    Asymptomatic Women
    Symptomatic Women who Present with a Breast Mass
      Components Used in Physical Examination Evaluation
      Size of the Palpable Mass and Detection on CBE
      Classification Schemes for Results of CBE
      Test Characteristics of CBE in Women with Palpable Masses
  Mammography in Symptomatic Women who Present with a Breast Mass
    Classification of Mammography Findings
    Test Characteristics of Mammography in Women with a Palpable Mass
  History and Clinical Breast Exam in Combination with Mammography
  Fine Needle Aspiration Consistency
  Fine Needle Aspiration Cytology
    Classification of Fine Needle Aspiration Biopsy (FNAB) Findings
    Test Characteristics of Fine Needle Aspiration
    Pitfalls of FNAB
  Triple Diagnosis
    Classification of Triple Test Findings
    Test Characteristics of the Triple Test
    Follow-up after Triple Diagnosis
  Previous Use of Likelihood Ratios in Women with a Breast Mass
  Summary of Literature Review
MATERIALS AND METHODS
  Quantitative Scoring System: Preliminary Work
  Study Design
    Data Collection
      Components of Clinical Impression
      Clinical Breast Examination (CBE)
      Mammography
      Fine Needle Aspiration - Internal Consistency and Cytology
      Follow-Up
    Study Setting
    Period for Data Collection
    Selection of Study Population
    Abstraction of Data from the Medical Record and Database Creation
    Definitions
      Epidemiologic Predictor Variables
      Clinical Predictor Variables
      Outcome Variables
    Data Analysis and Statistics
      Staged Application of Bayes' Theorem
      Multiple Logistic Regression Modeling
      Staged Approach to Model Building
      Final Adjustment of the Model based on Disease Prevalence
      Model Goodness-of-Fit
      BREAST AID Evaluation
RESULTS
  Population Studied
  Population Characteristics and Results of Bivariate Logistic Regression
    Biopsy and Follow-up Rates
    Descriptive and Bivariate Logistic Regression Results
      Epidemiologic Variables
        Age
        Body Mass Index (BMI)
        Personal History of Breast Cancer
        Age at Menarche
        Menopausal Status
        Age at Menopause
        Reproductive Factors - Pregnancy Status, Number of Pregnancies, Number of Deliveries, and Age of First Delivery
        Oral Contraceptive Use
        Estrogen Replacement Use
        Progestin Replacement Use
        Family History of Breast Cancer, Age at Diagnosis, and Laterality of Disease
        History of Radiation Exposure
        Alcohol and Tobacco Exposure
      Clinical Variables
        Mass Size
        Mass Shape
        Mass Mobility
        Mass Consistency
        Mass External Texture
        Mammogram Results
        Internal Consistency on FNAB
        FNAB Cytology Results
  Results of Multivariate Logistic Regression Analysis
    Stage 1
      Premenopausal Women
      Postmenopausal Women
      Results of Hosmer-Lemeshow Goodness-of-Fit Chi Square: Stage 1
      Final Step in Stage 1: Evaluation of Pre-test Probabilities
    Stage 2
      Results of Hosmer-Lemeshow Goodness-of-Fit Chi Square: Stage 2
      Merging of Data and Modification of Final Post-Test Probability and Likelihood Ratio for Stage 2
      Test Characteristics for Merged Dataset Based on Stages 1 and 2
    Stage 3
      Results of Hosmer-Lemeshow Goodness-of-Fit Chi Square: Stage 3
      Merging of Data and Generation of Final Post-Test Probabilities
      Test Characteristics for BREAST AID: Final Merged Dataset
  Results of Individual Diagnostic Tests Compared with Results of BREAST AID
DISCUSSION
  Potential Study Biases: Patient Selection and Population Characteristics
    Selection Criteria
    Population Characteristics
SUMMARY AND CONCLUSIONS
APPENDIX
REFERENCES

LIST OF TABLES
1. Probability of developing invasive breast cancer in SEER areas, women, all races, 1987-88
2. Physical Exam in Women with Palpable Breast Masses - Test Characteristics from the Literature
3. Mammography in Women with Palpable Breast Masses - Test Characteristics from the Literature
4. Fine Needle Aspiration Cytology in Women with Palpable Breast Masses - Test Characteristics from the Literature
5. Triple Diagnosis in Women with Palpable Breast Masses - Test Characteristics from the Literature
6. Malignant vs. Benign Results by Score from BREAST AID Preliminary Study
7. Definition of Test Characteristics Used in BREAST AID
8. Biopsy Results of 135 Lesions from 336 Breast Masses Evaluated between November 1994 and February 1998
9. Comparison of Current Study Histology Results with Prior Publications
10. Prevalence of Cancer and Bivariate Logistic Regression Results for Epidemiologic Categorical Predictor Variables in 336 Breast Lesions Evaluated between November 1994 and February 1998
11. Prevalence of Cancer and Bivariate Logistic Regression Results for Clinical Categorical Predictor Variables in 336 Breast Lesions Evaluated between November 1994 and February 1998
12. Prevalence of Cancer and Bivariate Logistic Regression Results for Continuous Epidemiologic and Clinical Predictor Variables in 336 Breast Lesions Evaluated between November 1994 and February 1998
13. Logistic Regression Model of the Risk for Breast Cancer for 112 Postmenopausal Women based on Epidemiologic Variables
14. Observed and Expected Frequencies within Quartiles of Risk: Stage 1
15. Logistic Regression Model of the Risk for Breast Cancer in 330 Breast Masses based on CBE and Mammography Results
16. Observed and Expected Frequencies within Groups of Risk: Stage 2
17. Logistic Regression Model of the Risk for Breast Cancer in 336 Breast Masses Evaluated with FNAB
18. Observed and Expected Frequencies within Groups of Risk: Stage 3
19. Test Characteristics for BREAST AID by Final Probability of Cancer
20. Sensitivity, Specificity and Predictive Values for Mammography Alone with Comparison to Test Characteristics of BREAST AID
21. Sensitivity, Specificity and Predictive Values for Fine Needle Aspiration Cytology Alone with Comparison to Test Characteristics of BREAST AID
22. Results of Indeterminate or Less Findings on Mammography and FNAB in Ten Cases of Breast Cancer: Comparison with Final Probability of Cancer Predicted by BREAST AID

LIST OF FIGURES
1. Example of 2 X 2 Table for Calculating Test Characteristics
2. Best Test / Worst Test ROC Curve
3. BREAST AID ROC Curve

LIST OF SYMBOLS OR ABBREVIATIONS
BI-RADS™      Breast Imaging Reporting and Data System
BREAST AID    Breast Risk Evaluation and Scoring System to Aid in the Diagnosis of Mammary Masses
CBE           Clinical breast examination
cm            Centimeters
CI            Confidence Intervals
cont'd        Continued
DCIS          Ductal carcinoma in-situ
[illegible]   Degrees of freedom
ERT           Estrogen replacement therapy
FN            False Negative
FP            False Positive
FNAB          Fine needle aspiration biopsy
HRT           Hormone replacement therapy
LCIS          Lobular carcinoma in-situ
malig         Malignancy
md            Median
mm            Millimeters
mn            Mean
[illegible]   Number
n/a           Not applicable
NPV           Negative Predictive Value
OC            Oral contraceptive
OR            Odds Ratio
PAR           Population attributable risk
PPV           Positive Predictive Value
PRT           Progesterone Replacement Therapy
ROC           Receiver Operating Characteristic
SEER          Surveillance Epidemiology and End Results Registry of the National Cancer Institute
SENS, Se      Sensitivity
SPEC, Sp      Specificity
TN            True Negative
TP            True Positive
U.S.          United States

INTRODUCTION
Breast cancer is the most common cancer diagnosed in women, accounting for one-third of all female cancers. The American Cancer Society estimates that 182,800 cases of invasive and 22,000 cases of in-situ breast cancer will be diagnosed in 2000 in the United States, and that an estimated 42,000 women will lose their lives to the disease during that same year 1. In 1970, 90% of patients who had breast cancer presented with a palpable breast mass 2. Since then, screening mammography use has increased steadily, and breast cancer is being diagnosed in asymptomatic women with increasing frequency. Despite this, 55% - 68% of women with breast cancer presented with a clinically palpable mass in several series published in the 1990s 3-6.

The optimal clinical approach to the management of palpable breast masses has not been established.
Evaluations have included the analysis of risk factor data, historical facts associated with symptoms, physical examination results, assessment with mammography and ultrasound, and application of fine needle aspiration cytology results. Each of these has limitations as a single test, and historically many authors have considered open biopsy necessary to establish a definitive diagnosis of a palpable mass 2, 3, 7. Some health-care analysts have criticized the estimated 800,000 to 1,000,000 open diagnostic breast biopsies performed in the U.S. each year as "unnecessary", and some of this criticism is justified 8-10. There is inherent emotional, physical, and financial morbidity associated with open biopsy, a societal cost of over $4 billion annually in the U.S. alone, and the average malignant: benign yield is low, usually no better than 1:4 or 1:5 5, 6, 11. Complicating these issues is the reality that failure to diagnose breast cancer based on "failure to be impressed with clinical findings" is the most common reason for malpractice litigation in the U.S. 12. Most surgeons feel obligated to perform open biopsy of palpable breast masses, fearing the profound emotional and potential biological and legal consequences of diagnostic delay of breast cancer.

There has been a trend over the past 25 years to apply the principles of "triple diagnosis" to the evaluation of a breast mass, especially in diagnostic breast centers or other settings that ensure large volumes of patients. Triple diagnosis refers to the application of a combination of 3 diagnostic tests to evaluate a breast mass: clinical breast examination (CBE), mammography, and fine needle aspiration biopsy (FNAB). A strict principle of this approach is that if any one of the 3 diagnostic test results is other than benign, then open biopsy is mandatory. One goal of triple diagnosis is to decrease the number of open biopsies done for benign disease.
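The strict triple-diagnosis rule described above can be sketched as a small decision function. This is an illustration only: the function name `triple_diagnosis_triage` and the result labels (such as "benign" and "indeterminate") are assumptions made for the example, not categories defined in the text up to this point.

```python
# Illustrative sketch of the strict triple-diagnosis rule: open biopsy is
# mandatory unless all three tests (CBE, mammography, FNAB) are benign.
# The label strings are hypothetical, chosen only for this example.

BENIGN = "benign"


def triple_diagnosis_triage(cbe: str, mammography: str, fnab: str) -> str:
    """Return "open biopsy" if any test is other than benign, else "follow-up"."""
    results = (cbe, mammography, fnab)
    if all(result == BENIGN for result in results):
        return "follow-up"
    return "open biopsy"


if __name__ == "__main__":
    print(triple_diagnosis_triage("benign", "benign", "benign"))
    print(triple_diagnosis_triage("benign", "indeterminate", "benign"))
```

Under this rule a single non-benign result, including an indeterminate one, is enough to send the patient to open biopsy.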
Traditional evaluation methods for diagnostic tests, referred to as test characteristics, measure sensitivity, specificity, and positive and negative predictive values using 2 X 2 tables and a classification of results that divides them into benign vs. not benign categories. This dichotomous classification scheme, although necessary in order to analyze the accuracy of any diagnostic test, ignores the considerable degree of diagnostic uncertainty that exists for each component of the triple test and the resulting indeterminate categories in diagnostic ratings. As a result, although the application of the principles of triple diagnosis allows some women to be triaged into clinical follow-up, the open biopsy rate may still be excessive.

The goal of this study is to better define the diagnostic classification schemes contributing to triple diagnosis, and to develop a mathematical prediction model to help distinguish between malignant and benign disease in a symptomatic group of women who present with a breast mass. A multivariate model to predict for malignancy in women who present with a breast mass has not been previously reported. Application of the results of the diagnostic prediction, if accurate, could reduce the need for open biopsy and its associated morbidity without compromising the timely diagnosis of breast cancer, and simultaneously offer both physician and patient assurance that clinical follow-up is an acceptable alternative to open breast biopsy.

LITERATURE REVIEW

Decision-Making in the Diagnosis of the Palpable Breast Mass: Who Needs Open Biopsy?
A clinician considers a host of variables in making a decision about the etiology of a breast mass, including data collected during the history, results of the physical exam, and results of diagnostic testing.
Many publications examine the value of risk factor analysis, patient history and physical examination findings, mammography, ultrasound, fine needle aspiration cytology, and combinations of these in the evaluation of breast masses. The pertinent literature for each of these will be discussed in the following sections.

The Malignant: Benign Ratio on Open Breast Biopsy
The literature addressing the malignant: benign ratio on open breast biopsy is published almost exclusively by surgeons who practice in breast clinics or academic centers, or who have a major interest in breast disease. By its very nature, then, the reports reflect management of a select referral population by a specialized group of physicians whose practices may not reflect those of the general surgery community, who are responsible for the bulk of breast mass management in the United States. The malignant: benign ratio quoted in the literature may therefore be an overestimate of what occurs in actual practice. In addition, the medical malpractice climate in the United States almost insists on 100% diagnostic accuracy in the work-up of a breast problem, despite the well-known false-negative characteristics of the diagnostic tests commonly used in the work-up. This makes the literature addressing the malignant: benign breast biopsy ratio in other countries less applicable than it would otherwise be. Keeping these issues in mind will be helpful during this review.

As an illustration of worldwide differences, Ciatto and colleagues published a multi-center study of 6 institutions in Italy, their malignant: benign breast biopsy ratios, and reasons for benign biopsies. There was a wide range of age-adjusted ratios reported, from 2.9:1 to 1:1.67 13. Two-thirds of the cases had an open biopsy for benign disease because one diagnostic component of the triple test was indeterminate or suspicious for malignancy.
Interestingly, they explored other reasons for biopsy in this study and found that 11.3% were done for an increasing size of the palpable mass on follow-up, 3% were done for cosmetic reasons, 8.2% were done for psychological reasons (patient request), and 0.7% were done for high-risk situations. Despite the range of malignant to benign results, they are much better than any series published in the United States, where the malignant: benign biopsy ratio is usually quoted between 1:4 and 1:5 5, 6, 11.

A large study by Leis documented all of the biopsies done in the decade 1960-1969 at New York Medical College and Affiliated Hospitals prior to the use of screening mammography. Of 3,287 breast biopsies performed, 923 (28%) were for malignant disease 2. Egan's data on 2,793 breast biopsies done between 1963 and 1973 at Emory Clinic in Atlanta documented a 25% malignancy rate that was highly age-dependent, with women less than 38, 39-62, and 63-100 years of age having malignancy rates of 6.2%, 26.3%, and 62% respectively. The biopsy rate, however, did not vary by age group; between 14 and 16% of breast exams in each age group resulted in surgical biopsy 11. Clinicians are aware of the differential malignancy rate by age group, and the lack of variance in the biopsy rate by age reflects the fact that breast cancer can affect adult women of any age, coupled with the prevalence of benign disease in premenopausal women, which peaks in the 5th decade of life. Because of the high degree of diagnostic uncertainty in the presence of a breast mass, clinicians will tolerate a high number of false positive results, especially in young women, in order to achieve the lowest number of false negatives.

In a more modern study, in 1992 Seltzer reported a breast biopsy cancer rate of 20.4%.
This was also influenced by age, with 34.7% of the biopsied patients 50 years of age or older being diagnosed with cancer, compared to less than 12% of those less than age 50. In 1997, the same author published a new series in a different population: 2,247 patients who underwent open biopsy for a variety of reasons. In this series, 55% of the biopsies were done for a palpable breast lump (21% cancer rate), and 35% for an abnormal mammogram (25% cancer rate) 6. The rate of cancer in the overall population was 22%; 12% for those less than 50 years of age and 41% for those 50 years of age and older. Lay documented a lower malignancy rate of 15% in women biopsied for palpable abnormalities between 1983 and 1987 at University Hospital in Jacksonville, Florida 14.

Seltzer reported on the calculation of a "fear factor" indicator for breast biopsies, that is, the difference between the surgeon's positive predictive value and the actual incidence of malignancy on biopsy. He calculated a "fear factor" of 24% for women less than 50 years of age and 15% for those 50 and over 6. Entering into this fear factor are considerations that affect both the physician and the patient. The patient expects that a diagnosis be made with 100% diagnostic certainty. The surgeon understands that no test short of open biopsy can ensure 100% diagnostic certainty (and even open biopsy has its pitfalls), and that medicine is a science of probabilities. The surgeon, however, is vulnerable to the medical-legal climate, and in the United States there seems to be little tolerance for anything less than 100% diagnostic certainty. The result is that many more open biopsies are performed in the United States than in other parts of the world, and many more than may be necessary.
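The malignancy rates and malignant: benign ratios quoted in this section follow from simple counts. As a minimal sketch, the arithmetic can be reproduced from the figures reported above for the Leis series (923 malignant results among 3,287 biopsies). The function name `biopsy_yield` is an assumption introduced for illustration.

```python
# Minimal sketch: derive a malignancy rate and a malignant:benign ratio
# from biopsy counts. The counts are those quoted for the Leis series;
# the helper name is hypothetical.

def biopsy_yield(malignant: int, total: int) -> tuple[float, float]:
    """Return (malignancy rate, benign biopsies per malignant biopsy)."""
    if not 0 < malignant <= total:
        raise ValueError("need 0 < malignant <= total")
    benign = total - malignant
    return malignant / total, benign / malignant


rate, benign_per_malignant = biopsy_yield(923, 3287)
print(f"malignancy rate: {rate:.1%}")
print(f"malignant:benign ratio: 1:{benign_per_malignant:.2f}")
```

Expressing the yield both ways makes it easier to compare series that report a percentage with those that report a ratio such as 1:4.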
Risk Factors for Breast Carcinoma
Epidemiologic factors identified to confer risk in large populations have been widely studied from a public health perspective, but are of varying degrees of usefulness in the clinical setting when evaluating an individual. Misinterpretation of risk information is common, and many studies report over-estimation of risk by patients and their physicians 15-17. Conversely, some individuals are lulled into complacency by assuming that if they possess no risk factors, they are at no risk for the disease. By far the strongest risk factor for breast cancer is female gender, with a ratio of women: men affected of approximately 180:1 18. All women are at risk for breast cancer, but the probability of disease varies by a variety of additional factors. Systematic study of these factors has provided important clues to the etiology of breast cancer. The most important of these variables will be briefly described.

Age
Unlike most cancers, breast cancer affects the entire age spectrum of adult women, with increasing frequency as a woman ages. The cumulative risk of developing breast cancer by a given age, in five-year increments beginning with age group 20-24 through age 95+, is given in Table 1. If a cohort of women is followed for 85 years, one in 9, or 11.1%, will develop invasive breast cancer at some time in her lifetime, and if followed for 95 years, one in 8, or 12.5%, will be diagnosed 19. When the age-incidence curve for breast cancer is plotted on a log-log scale, the curve produced is not a straight line throughout a woman's lifetime as it is with other cancers. Instead, the curve assumes a straight line from age 25 to 50, after which a slowing effect in the incidence slope is noted. The curve assumes a straight line at this slowed rate for the remainder of women's lives 20. This important observation provides clues to the etiology of breast cancer and its dependence on female hormones for the induction and promotion of carcinogenesis.
It implies that premenopausal women possess the etiologic components for breast cancer development and that cessation of ovulation decreases, but does not eliminate, them 20. Many of the known risk factors for breast cancer relate to the specifics of ovulation, and in particular, to the hormonal milieu to which the breast is exposed. This will be discussed in more detail in the following sections.

Reproductive Factors/Endogenous Hormones
There is a strong correlation between the length of exposure to endogenous hormones and the risk of breast cancer. The earlier the age at menarche, the higher the risk. Girls who begin menses at age 11 or before have a relative risk for breast cancer of 1.3 as compared with those with onset at or after age 16 21. Age at menopause also influences risk. Studies demonstrate an average relative risk of 1.5 for women who underwent menopause at or after age 55 as compared with those who did so before age 45. Even more provocative, women who had a surgically induced menopause through bilateral oophorectomy, before the use of estrogen replacement therapy, had a reduction in breast cancer risk to 0.4 22. It has been estimated that for every 12-month increase in age at menopause, there is an approximate increased risk of breast cancer of 3% 21.

Carrying a pregnancy to the third trimester confers a protective effect on risk in most women. This effect is inversely correlated with age at first delivery. Russo and Russo explain the biological mechanism for this as differentiation of the terminal duct lobular unit of the breast during pregnancy 23. However, if a woman delivers her first child at or after the age of 30, her risk is actually about the same as if she were nulliparous. The risk ratio for this group is 1.9 compared to a reference group of women whose first delivery was before age 20 24. The biological mechanism for this observed phenomenon has not been explained.
Parity as a risk factor disappeared after adjusting for age at first delivery in early studies, but there is some evidence currently that parity exerts a paradoxical influence, or crossover effect, on risk. Parity increases risk in premenopausal women, and decreases it in women over the age of 40 25. Obesity (weight above ideal of 20 kg or more) increases the risk of breast cancer in postmenopausal women to 1.8, but in premenopausal women, the risk is actually inversely correlated with weight 26. The mechanism for the latter has not been explained, but for the former, the hypothesis is that obesity causes a rise in estrone, the major postmenopausal estrogen, through aromatization of androstenedione in the excess adipose tissue of obese women. Estrone is then converted to estradiol, increasing the exposure of breast tissue to this potent estrogen 27.

Exogenous Hormone Use

Oral contraceptive (OC) use and its possible relationship to breast cancer risk has generated debate over several decades. A recent meta-analysis of the relationship pooled the data from over 150,000 women in more than 50 studies. A 24% increased risk was observed in current OC users, which returned to baseline five years after discontinuing use 28. A meta-analysis of hormone replacement therapy (HRT) use and breast cancer risk has also recently been done. Over 50 studies in more than 160,000 women were examined; a 35% increased risk for breast cancer was observed in current users of HRT, which returned to baseline five years after discontinuing use 29. Most of the studies examined the use of estrogen replacement alone rather than combined therapy with progestins. This is important because progestins are currently given to most postmenopausal women on HRT who have not had a hysterectomy, to decrease the proliferative effect of estrogen on the uterus and the subsequent risk of endometrial cancer 30.
It has been shown that progestins do not have a similar effect on the breast and may actually increase cellular proliferation at that site 31.

Personal History of Breast Cancer

It has been known for some time that a history of unilateral breast cancer predisposes a woman to a greater risk of developing contralateral breast cancer over the same time period compared to a woman without that history. This risk is predictable at a steady rate of 0.7% per year of follow-up 32.

Family History of Breast Cancer

A family history of breast cancer can increase a woman’s risk anywhere from 20% to 85-90% depending on the relative(s) affected and their menopausal status at diagnosis 33. A recent meta-analysis of 74 published studies has provided pooled estimates of the relative risks associated with various family histories 34. First-degree relatives are most important, increasing a woman’s relative risk to 2.1. A history of breast cancer in a second-degree relative increases the relative risk to 1.5. The woman at the greatest risk is one who has both a mother and a sister with breast cancer. The relative risk in this group of women in the pooled analysis was 3.6, but could be as high as 13.6. This high odds ratio reflects the stronger likelihood that a woman with two first-degree relatives with breast cancer has inherited a genetic mutation increasing her susceptibility to the disease. Recent advances in genetic research have identified 2 genes, BRCA-1 and BRCA-2, that when abnormal are associated with up to an 80-90% lifetime risk of breast cancer 35. Women who have a mutation of one of these genes also have a higher likelihood of a family history of premenopausal breast and/or ovarian cancer in multiple relatives, and either disease is more likely to be bilateral. It has been estimated that approximately 5-10% of women diagnosed with breast cancer have inherited a mutation in one of these genes 33.
Overall, the proportion of women in the general population with a family history of breast cancer in a first-degree relative is low, about 8% 36. Of those women diagnosed with breast cancer, family history estimates range between 10 and 22%, depending on the characteristics of those assessed and the number and familial proximity of those included 37, 38.

Environmental Risk Factors

Two environmental risk factors have been unequivocally identified: radiation exposure and alcohol intake. The association between radiation exposure and breast cancer risk has been known for many years and has been studied in the context of repeated fluoroscopic treatments for tuberculosis 39, 40, post-partum mastitis treatments with ionizing radiation 41, and in the atomic bomb survivors of Hiroshima and Nagasaki 42, 43. The studies demonstrate a two- to four-fold increase in the incidence of breast cancer for women exposed before age 25, which occurs after an average latency period of 15 years. Associations between alcohol intake and breast cancer risk have been studied in earnest over the past several years. A pooled analysis of cohort studies has recently been published which examined over 50 studies. Most of these demonstrate a 30-40% increased risk in current users of 1-2 alcoholic drinks/day compared to non-users. This is thought to be associated with the increased metabolism of estrogen observed in alcohol users 44. Based on these data, an analysis has recently been done estimating the age-adjusted population attributable risk (PAR) for alcohol and breast cancer 45. The finding of a PAR of 2.1% caused the authors to conclude that widespread efforts to reduce alcohol consumption would not have a substantial impact on breast cancer rates.

Three major studies have addressed the issue of PAR in breast cancer examining multiple variables of risk.
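As background for the PAR estimates discussed in this section, the following is a minimal sketch of Levin's formula, the conventional unadjusted definition of population attributable risk. It is offered only as an illustration: the function name and the exposure prevalence used in the example are hypothetical, and the age-adjusted methods used in the cited analyses are not described in the text and may differ.

```python
def population_attributable_risk(exposed_fraction, relative_risk):
    """Levin's formula: PAR = p(RR - 1) / (1 + p(RR - 1)).

    exposed_fraction: proportion of the population exposed (0 to 1).
    relative_risk: risk in the exposed relative to the unexposed.
    """
    excess = exposed_fraction * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs, NOT taken from the cited studies:
# 10% of women exposed, relative risk of 1.3.
par = population_attributable_risk(0.10, 1.3)
print(f"PAR = {par:.1%}")
```

The formula makes the general point of this discussion visible: when the relative risk is modest, the PAR stays small unless a large share of the population is exposed.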
Seidman examined 10 common risk factors and found that only 29% of cases of breast cancer occurred in women aged 55-84 with any risk factors, and only 21% of cases in women aged 30-54 had any risk factors 46. Madigan found that 47% of cases of breast cancer occurred in women with at least one risk factor, but one of the most important, family history, was found in only 9.1% of the population studied 47. Rockhill and colleagues recently examined the risk factors of menarche, parity status, age at first delivery, family history, and previous history of breast biopsies 48. The authors estimate a PAR of 15%-25% based on these factors, depending on the definition of risk. They estimate that 62%-98% of the female population has been exposed to at least one of the factors studied. They conclude that the established risk factors for breast cancer have little public health value because of the high proportions of the population exposed, and the unmodifiability of most factors 48. The magnitude of the association for most risk factors is also low, between 1.2 and 2.0, which contributes to the view that they are not of much practical use. These views do not take into account the important insights that result from the study of risk factors from an etiologic standpoint, or the role of risk assessment in chemoprevention studies. They do, however, reflect the perspective that clinicians have about epidemiology as it relates to the assessment of an individual woman 49, 50.

Variables Collected during the History of the Individual Patient

Asymptomatic Women

It is important to appreciate that risk factor analysis is complicated and dependent on considerations of confounding and interactions between variables. As a result, the useful application of breast cancer risk data, which is multifactorial, has proved to be difficult in the clinical setting.
Several authors have attempted to quantitatively predict the absolute risk for breast cancer in an individual patient based on one or more risk factors. The earliest of these models apply only to women with a family history of the disease. Anderson studied a cohort of hospitalized women in Houston with a family history of breast cancer and reported results as absolute risks categorized according to frequently observed family histories 51, 52. Ottman used the same approach in a population-based cohort in Los Angeles 53. These approaches were further refined by Claus, who modeled the absolute risk of breast cancer in women with a family history in relatives from the same lineage based on the probabilities of autosomal dominant inheritance patterns 54. Although family history is an important consideration of risk, lack of this risk factor does not protect a woman from development of the disease. The great majority of women who are diagnosed with breast cancer have no known family history. Gail and colleagues developed a quantitative model to estimate absolute risk for breast cancer that uses multiple risk factors in the analysis, rather than family history alone 55. These included age, age at menarche, age at first delivery, number of breast biopsies, and presence of atypical hyperplasia on breast biopsy, as well as family history in first-degree relatives. The data analyzed to construct the Gail model were taken from the Breast Cancer Detection and Demonstration Project, a volunteer-based observational study of women who were willing to undergo annual screening with physical exam and mammography. The model has been validated in the Texas Breast Screening Project and in the Nurses Health Study and has been used in modified form to predict risk in the Tamoxifen Breast Cancer Prevention Trial of the National Surgical Adjuvant Breast Project 56, 57.
It has been determined to be a relatively accurate predictor of risk, although there have been concerns expressed about underestimation of risk in some subgroups, especially those with an extended family history 57. The Gail model is also documented to overpredict risk by 33% among women 60 years of age and younger who do not participate in annual screening 50. In addition, the inclusion of breast biopsies in the model as an independent predictor of risk has been criticized. Many consider this to be a confounding variable, especially since women at increased risk for breast cancer are likely to have breast biopsies recommended more often than those who are not.

Symptomatic Women: Risk Factors

Accurate knowledge and appropriate application of risk factors in the clinical setting alter the probability of disease in an individual patient. It is more likely, for example, that a woman with a family history of breast cancer in 2 sisters who presents with a breast mass has cancer as compared with a woman without a family history. The magnitude of the risk is dependent on the patient’s age, other risk factors present or absent in the patient’s history, and the clinical characteristics of the mass. No attempt has been made to predict risk for breast cancer in symptomatic women based on history of risk factors alone in any quantitative or even non-quantitative way. One short descriptive study published in 1977 examined 1059 symptomatic patients from one surgeon’s practice collected over a 6-year period. The rate of breast cancer in the studied population was 13%. The women with breast cancer were compared with those who did not have breast cancer. No statistical analysis was done and no frequency tables were provided.
The author examined age, family history, menstrual status, a history of benign breast disease, parity, use of hormones, and duration of symptoms, concluding that only age and family history in first-degree relatives were useful in the history for predicting the presence of malignancy 58. Practice guidelines addressing breast cancer and its diagnosis have recently been written by a variety of organizations. Although recommendations are made to collect risk factor information, physicians are cautioned to remember that “the majority of women who develop breast cancer have no risk factors beyond being female and aging” 59. One guideline additionally states that “risk factors for breast cancer should be noted, but their presence or absence should not influence the decision to investigate a lump further” and that “although known risk factors, including aging, all increase the risk of breast cancer, they do not substantially influence the probability that any particular lump will be malignant” 60.

Symptomatic Women: The Presenting Complaint

Breast complaints are extremely common. A study in the United Kingdom documented a frequency of primary care visits for breast complaints of 18 visits per 1000 person-years 61. A similar study published by Barton and colleagues followed 2400 women between July 1983 and June 1995 in a large health maintenance organization in New England. They found that 16% of the population of women between the ages of 40 and 69 presented with a breast symptom during the follow-up period, for a rate of 22.8 presentations per 1000 person-years 62. Women younger than 50 presented twice as often as older women. Most studies of the frequency of breast complaints in general practice have been done in the United Kingdom. It has been estimated that the average general practitioner will see 13-34 patients for new breast complaints per year 63-65.
Breast complaints fall into 5 categories: mass, abnormal mammogram, pain, nipple discharge, and skin changes, and the complaint offers some information about the likelihood of cancer. In Barton’s study in primary care physician practices, the most common symptom was pain (47%), followed closely by a breast mass (42%) 62. Breast cancer was diagnosed in 23 of the 372 women who presented with symptoms (6.2%). A mass, which was the most common sign of malignancy, had a 10.7% chance of representing breast cancer and had a likelihood ratio of 65. Age was not significantly associated with the likelihood of cancer in this study 62. The studies of referral clinics reflect a different population than has been discussed to this point. Primary care physicians, for example, are not likely to refer the majority of patients with breast pain, but manage this symptom themselves. In a study of referrals to the breast clinic at the City Hospital in Nottingham published in 1988, 1445 patients were seen over one year, 65% of whom presented with symptoms of a mass. This series did not include abnormal mammogram referrals. Twenty-five percent of the patients with biopsy-confirmed masses had cancer, and a woman less than 50 years of age had a 1 in 20 chance of having malignancy, whereas the ratio was 1 in 2 in women 50 years of age and older 66. In a study of 1,000 consecutive patients referred to a general surgeon in London, similar findings were reported by Ellis et al in 1984. There was an overall cancer rate of 12% in their population, and 26% for biopsied patients 67. A more current study by Seltzer, also in a population of surgical referrals, reported on 2,000 consecutive new patients evaluated in a surgical clinic between January 1987 and February 1989 5. The most common presentation to the clinic was a breast mass (50% frequency), followed by an abnormal mammogram (31.9%). Less frequent were complaints of mastalgia (6.5%) and nipple discharge (3.6%).
Miscellaneous symptoms accounted for the remainder. Of the 4 symptoms specified, the cancer rates were 8.2%, 9.1%, 2.3%, and 5.9%, respectively. These malignancy rates are only slightly lower than those reported from the breast clinic at Ninewells Hospital in Dundee, UK, where the malignancy rate was 11% 68. These rates are consistent with the fact that cancer rates for overall clinic referral populations are much lower than quoted cancer rates for patients with open biopsy, as would be expected. Age is a strong predictor of the likelihood of malignancy. Kleinfeld reviewed 1,019 breast biopsies done at the North Miami General Hospital between 1964 and 1968. In that study, the malignancy rate was 29% overall. In the decades of life from 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, and 90+, the malignancy rates were 3%, 2%, 9%, 19%, 59%, 70%, 77%, 100% and 100% respectively 59. Specifically, the peak age at which breast biopsy occurred was 40-49, but the malignancy rate was only 19%. This rose incrementally in postmenopausal women so that the rates ranged between 59% and 100% past the age of 50.

The combination of the risk factor history, the patient’s age, and the presenting complaint evaluation begins to form a pretest likelihood of disease in the clinician’s mind. However, results of the clinical breast exam (CBE) and diagnostic tests add to the assessment. How much each component contributes and to what degree each diagnostic test is necessary is a matter of debate. What are the necessary components of breast mass evaluation beyond the history? If a palpable mass is confirmed on clinical examination, some clinicians demand open biopsy for definitive diagnosis at this point, while others are willing to consider follow-up under some circumstances. What are these circumstances? What is the role of mammography in women with palpable masses? Is it acceptable to stop the work-up at the point of CBE and mammogram evaluation?
How many cancers, if any, will be missed if biopsy is not done at this point? Is cytological or histological sampling necessary in every patient with a breast mass? What is the false-negative rate of the diagnostic work-up short of open biopsy? The next sections will address these issues.

Physical Examination of the Breasts

Asymptomatic Women

Despite numerous publications addressing the techniques of screening physical examination of the breasts 2, 49, 70-72, the test performance characteristics of CBE have not been studied until recently. The Canadian National Breast Screening Study (NBSS) published data detailing the sensitivity, specificity, and positive predictive values of CBE in 19,965 women aged 50-59 years and 25,620 women aged 40-49. On visit one, the respective values were 83%, 88%, and 3% in the older age group and 71%, 84%, and 1.5% in the younger age group. A separate analysis of test characteristics between the study surgeons (who also evaluated the symptomatic patients) and the nurse examiners (who did many more of the screening exams) demonstrated interesting differences. The sensitivity, specificity, and positive predictive value at visit one for the nurses for screened patients aged 40-49 were 68.5%, 95.5%, and 5.0% respectively. For the screened patients aged 50-59 examined by nurses, the values were 80%, 96.8%, and 10.9% respectively. In comparison, the same values for the surgeons in the examined 40-49 age group were 96.8%, 70.4%, and 5.0%, and in women 50-59, the respective values were 92.7%, 75.6%, and 10.3% at visit one 73. This difference between the nurses and surgeons reflects the different roles of the examiners. The nurses were more likely to refer patients with any breast abnormality for evaluation by a surgeon, but surgeons were likely to remove only those lesions for which biopsy was clinically indicated.
For these reasons, one would expect surgeons to have a higher test sensitivity than nurses and for nurses to have a higher test specificity than surgeons. In 1999, a meta-analysis was done of the sensitivity and specificity of CBE in asymptomatic women 62. A pooled analysis of the clinical trial data demonstrates lower sensitivity and higher specificity for CBE than in the overall results of the NBSS: values of 54% and 94% respectively. These numbers again reflect the screening role of CBE in these studies.

Symptomatic Women Who Present with a Breast Mass

Components Used in Physical Examination Evaluation

Surprisingly few publications detail the components that enter into a clinician’s assessment of whether a mass on CBE represents benign disease or cancer. Most mention the physical examination signs of advanced malignancy, and early studies discuss these characteristics as criteria for designation of a mass as malignant on CBE. If a mass is hard, fixed to the skin or chest wall, or irregular in shape or texture on palpation, it is likely to be classified as malignant. The same is true if skin puckering, nipple retraction or nipple erosion is present, or if regional lymph nodes are palpable. In 1967, Zajicek et al, in one of the first publications of fine-needle aspiration biopsy (FNAB) findings in women with breast masses, made the following statement: “It seems pertinent that of the 485 cases in which cancer was cytologically diagnosed from aspiration biopsy, 30 percent lacked such obvious clinical signs of advanced mammary carcinoma as fixation of skin and muscle and retraction of the nipple” 74. Kruezer and Boquoi, in their publication in 1976, specifically state that they used these as the criteria for a malignant or suspicious finding on physical examination. They found the sensitivity, specificity, and positive and negative predictive values of clinical exam to be 90%, 78%, 74% and 92% respectively 75.
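The positive predictive values quoted in this section differ enormously between the screening and symptomatic series (3% versus 74%) even though the sensitivities and specificities are of the same order. A short sketch can make the reason explicit by solving Bayes' theorem for the prevalence implied by each set of figures. The function name is hypothetical, and the implied prevalences are derived algebraically from the rounded values quoted above; they are not figures reported by the studies themselves.

```python
def implied_prevalence(sensitivity, specificity, ppv):
    """Solve PPV = s*p / (s*p + (1 - c)*(1 - p)) for the prevalence p.

    s = sensitivity, c = specificity; all arguments are proportions (0 to 1).
    """
    false_pos_rate = 1.0 - specificity
    numerator = ppv * false_pos_rate
    return numerator / (sensitivity * (1.0 - ppv) + numerator)


# NBSS screening CBE, women aged 50-59 at visit one: 83%, 88%, PPV 3%.
screening = implied_prevalence(0.83, 0.88, 0.03)

# Kruezer and Boquoi, symptomatic women: 90%, 78%, PPV 74%.
symptomatic = implied_prevalence(0.90, 0.78, 0.74)

print(f"screening: {screening:.2%}, symptomatic: {symptomatic:.0%}")
```

Under these assumptions the screening figures imply a cancer prevalence well under 1%, while the symptomatic series implies roughly 40%, which is consistent with the observation that many patients in the early symptomatic series had advanced disease.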
Grobler et al applied similar criteria in a publication in 1990 in a series reported from the University of the Orange Free State, Bloemfontein 76. Sixty-six percent of their patients had malignancy and the average tumor size was 3 cm. The sensitivity, specificity, and positive and negative predictive values of physical exam were 96%, 76%, 88% and 90%. The test characteristics published in these two studies suggest, like the population studied by Zajicek et al, that many of the patients examined had advanced disease at the time of evaluation. If these were the main criteria for ruling in breast cancer on CBE today, many more false-negative results would occur. Current awareness among women of the importance of breast self-examination, and the increased emphasis on the importance of CBE and mammography in asymptomatic women, have resulted in a decreasing size of clinically detectable tumors. In 1985, it was reported that the average breast mass size discovered on breast self-examination was 2.5 cm 77, and in the initial screens with physical examination in the Ontario Breast Screening Program between 1990 and 1995, the average invasive cancer size found at palpation was 1.8 cm 78. Cady et al studied 1,001 cancers diagnosed between 1989 and 1993 at a tertiary center and a community hospital and found a median maximal diameter for invasive cancer of 1.5 and 1.7 cm respectively 79. In addition to the decreasing size of tumors at presentation, the characteristics of the mass on CBE are also changing. In 1971, Venet and colleagues estimated the sensitivity of 3 characteristics of physical examination conducted in asymptomatic women who were found to have palpable masses during an early screening study, the Breast Cancer Detection and Demonstration Project. They evaluated mass consistency, texture, and mobility. They estimated sensitivities of 62%, 60%, and 40% respectively for each of the three characteristics 80.
This means that 4 of 10 malignant breast masses would be either soft or cystic, that 4 of 10 would not have irregular borders, and that 6 of 10 would be freely movable. The sensitivity for individual physical examination characteristics is likely to be even lower than it was almost 30 years ago, since along with decreasing size of palpable masses and their more subtle CBE characteristics, the associated advanced signs of breast cancer that clinicians relied upon in the past have disappeared. That “failure to be impressed with clinical findings” is the most common reason for diagnostic delay of breast cancer underscores this point 12. To illustrate it further, Mushlin used the data from Venet and calculated the probability of cancer based on the clinical finding of fixation, or lack of mobility. Based on an estimated pretest likelihood of cancer in the presence of a breast mass of 20% and a sensitivity and specificity of 90%, he estimated a post-test likelihood of malignancy of 50% for a positive finding and 14% for a negative one 81. Clearly, greater diagnostic accuracy is required than that offered by this finding (or lack of it) on CBE. Lesions that are round, smooth, and circumscribed on palpation have a higher likelihood of being benign than malignant 60. However, the reliability of this assessment is dependent on the experience of the examiner, and even when a mass is evaluated as clinically benign, the false-negative rate of physical exam is too high to depend on it alone to determine the etiology of a breast mass under any circumstances. Rimsten found a true positive rate for physical exam of 70% to 90% and a true negative rate of 70% in one study of experienced examiners 82. In the largest study of examiner-dependent sensitivity of CBE, Ciatto et al found a range between 69% and 90% among 41 examiners, with a statistically significant association between sensitivity and examiner experience as measured by number of cancers palpated (<100, 100-299, >300) 83.
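The post-test likelihoods attributed to Mushlin above follow from Bayes' theorem, and a minimal sketch of the calculation is given here. One input is an assumption: the quoted results of 50% and 14% are reproduced when the sensitivity of fixation is taken as 40% (the mobility estimate attributed to Venet earlier) together with a specificity of 90% and a pretest likelihood of 20%. That reading of the inputs is inferred from the arithmetic and is not stated explicitly in the text; the function name is hypothetical.

```python
def post_test_probabilities(pretest, sensitivity, specificity):
    """Return (P(cancer | positive finding), P(cancer | negative finding))."""
    true_pos = pretest * sensitivity
    false_pos = (1.0 - pretest) * (1.0 - specificity)
    false_neg = pretest * (1.0 - sensitivity)
    true_neg = (1.0 - pretest) * specificity
    after_positive = true_pos / (true_pos + false_pos)
    after_negative = false_neg / (false_neg + true_neg)
    return after_positive, after_negative


# Assumed inputs: pretest 20%, sensitivity 40% (assumption), specificity 90%.
positive, negative = post_test_probabilities(0.20, 0.40, 0.90)
print(f"after positive finding: {positive:.0%}")
print(f"after negative finding: {negative:.0%}")
```

With these inputs a fixed mass raises the likelihood of cancer from 20% to about 50%, while a mobile mass lowers it only to about 14%, which illustrates why the absence of fixation cannot be used to rule out malignancy.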
In another study, Boyd et al in 1981 reported on 100 patients with breast masses examined by 4 experienced surgeons 84. Forty-one of these patients had been admitted to the hospital for an open breast biopsy of a palpable lump, while the remaining 59 had no known breast disease. Masses were palpated in 32-42% of the patients. Agreement between the 4 surgeons on the finding of a mass on CBE was 100% in only 16 of the 41 cases in which a mass was present, and full agreement that the mass represented malignancy occurred in only 5 of the 15 patients with cancer. In this study, 7%-20% of the patients who had cancer had no recommendation for biopsy on the part of at least one surgeon 84.

Size of the Palpable Mass and Detection on CBE

Few studies have reported mass size on CBE, and those that do often fail to distinguish whether the size of the mass reported refers to the findings at CBE or on histopathology. This is important because CBE tends to overestimate the size of a mass, and if the size reported refers only to the histopathology findings, the physical exam test characteristics may appear better than they actually are. Boerner and Sneige found that lesions measured as just under 3 cm by CBE were an average of 7 mm smaller on histopathologic exam 85. In addition, some studies report only the size of malignant lesions on histopathology, which does not allow for a full evaluation of the population characteristics studied unless all of the lesions were malignant. One study from Germany retrospectively examined 160 breast carcinomas between 1988 and 1990. The average tumor size histologically was 25.7 mm. The smallest tumor detected by palpation was 3 mm. CBE detected 87% of the tumors overall, 77% of those that were 2 cm or less, and 47% of those that were 1 cm or less in size 86. The others were detected by mammography.
Hermansen et al studied a series of patients between 1980 and 1982, and they found that 33% of the tumors were less than 1 cm in size, 50% were between 1 cm and 3 cm, and 17% were greater than 3 cm. The malignancy rates were 7%, 31%, and 72% respectively 87. DiPietro reported on 631 solid breast masses that did not have advanced signs of malignancy. In 105 patients, the mass spontaneously regressed and these were followed to ensure persistence of this finding. The authors do not discuss the average mass size but comment that 66.3% of the benign cases were not larger than 20 mm, and that 53.7% of the malignant cases were 20 mm or less in size. In addition, 39.2% of masses 20 mm or less in size were malignant, and 54.5% of those larger were malignant. The question of the sensitivity of CBE to detect a mass in relation to its size is more directly answered in a study of screened women by Feig et al reported in 1979. In it, 183 cancers were diagnosed in 17,000 women between the ages of 45 and 64. The mammogram techniques were state-of-the-art for the time, but not as technically sophisticated as they are today. Nonetheless, comparison between CBE and mammographic detection of cancer is possible. In this study, tumors were divided into size groups of 0-0.5 cm, 0.51-1.0 cm, 1.1-2.0 cm, 2.1-3.0 cm, and over 3 cm. Mammography detected 92%, 70%, 80%, 65%, and 80% of lesions in the respective size categories. CBE performed much worse when the lesion was 2 cm or less in size, detecting 20%, 40%, 50%, 80%, and 90% of the cases in each size category 88. A large study of 2,648 palpable malignant tumors diagnosed between 1979 and 1986 in Italy also examined sensitivity of physical examination in relation to histologically measured tumor size. Physical examination identified the palpable mass as cancer with the following frequency: in situ cancer, 48%; masses 2 cm or less, 70%; and masses > 2.1 cm, 89-93% 83.
Classification Schemes for Results of CBE

Classification schemes for findings on CBE typically fall into 3 categories: benign, indeterminate, and suspicious/malignant. Some investigators use a classification that includes suspicious classifications in the indeterminate category. None of these has been validated and few publications even describe their criteria for inclusion into one classification category vs. another. In an impressive attempt to delineate physical exam characteristics, DiPietro et al used a “Clinical Diagnostic Index” that used the algebraic sum of 9 different characteristics on CBE. They reported the test characteristics of the physical exam according to the overall diagnostic index score. This score included the following characteristics: (1) changing size of mass; (2) consistency (soft, equal to surrounding tissue, or hard); (3) shape and borders (roundish or irregular); (4) surface characteristics (smooth, intermediate, or irregular); (5) intramammary mobility (marked, intermediate, or scarce); (6) prominence (not protruding or protruding at palpation); (7) pain with palpation (present, scarce, absent); (8) nipple secretion (absent or serous or hematic present); and (9) presence of axillary lymph nodes (not palpable or suspicious). Even with this elaborate diagnostic index, 10-23% of the malignant masses were undervalued on clinical examination, and the test sensitivity was only 63% 89.

Test Characteristics of CBE in Women with Palpable Masses

The subjectivity of CBE interpretation and the lack of a standardized reporting system have inhibited studies of CBE accuracy. Table 2 demonstrates the test characteristics of physical exam from the literature. Many of these values were derived from the data quoted in the papers, as not every report calculated the pertinent test characteristics. The categorization into positive vs. negative tests was based on a benign vs. not benign categorization as discussed above.
Most of the studies in Table 2 were done in the context of triple diagnosis evaluation, with CBE test characteristics analyzed as a separate component of this diagnostic evaluation 7, 75, 76, 87, 89-98. One was done as a direct evaluation of CBE 99, and three others as an evaluation of CBE in combination with other tests 82, 100, 101. The sensitivity of CBE has been demonstrated to be age-dependent. In women 60 and over, values of 90% or greater are the rule 83, 96, 99, 100. In contrast, CBE sensitivity is low in women 40 or less, ranging between 25% and 77% in those same studies. In women between the ages of 40 and 60, CBE sensitivity rises incrementally above that in the younger group, but does not achieve levels as high as in the older age category, ranging between 68% and 90% 83, 96, 99, 100. The reason for this observation is likely to be twofold. One is explained by the increasing incidence of cancer as a woman ages, knowledge incorporated into the assessment made by clinicians that increases the index of suspicion of cancer in the older woman. In fact, there is a clinical caveat that a postmenopausal woman who has a breast mass has cancer until proven otherwise 102. The other is related to the ease of CBE interpretation in postmenopausal relative to premenopausal women, in whom CBE interpretation may be complicated by the influence of ovarian hormone levels and fluctuations 103. How hormone replacement use in postmenopausal women will affect the sensitivity of CBE has not been studied to date.

Mammography in Symptomatic Women Who Present with a Breast Mass

Mammography is capable of detecting carcinoma before it can be palpated, and numerous clinical screening trials throughout the world have demonstrated its ability to reduce mortality from breast cancer. The sensitivity of screening mammography is expected to be 85% or above 104.
When referring to screening mammography, a false-negative case is defined as any case of symptomatic cancer diagnosed within 12 months of a normal screening examination 105. Since the false-negative rate is defined as 1 - sensitivity, the usual false-negative rate of mammography in the screening setting is expected to be about 10-15%. This number is often erroneously applied to the diagnostic setting. In a diagnostic setting, such as this one, the false-negative rate refers to the number of cancers missed at the time of biopsy or, if biopsy was not done, at the time when a palpable mass was clinically evident. Unfortunately, the false-negative rate can be far higher in the diagnostic than in the screening setting, especially in young women with dense breasts 106. In early publications, the false-negative rates in symptomatic populations reached as high as 44% 107. Egan published a series of 2,793 breast biopsies done between 1963 and 1973, 886 of which were malignant. He reported a false-negative mammography rate of 21.9% overall in women with solitary nodules over age 38 years, and this varied by age group. In the age category 39-62 years the false-negative rate was 19.3%, and in those 63-100 years it was 41.2% 11. This is somewhat unexpected, because most literature quotes a higher false-negative rate in women with dense breasts, who are more likely to be in young age groups. However, this study was limited by mammographic technique that was inferior to modern standards.

Edeiken reviewed 499 malignant breast biopsies performed at Jersey Shore Medical Center for palpable masses between 1976 and 1985. He found an overall false-negative rate for mammography of 22% that did not vary by study year. When broken down by age, the false-negative rate was 44% for women less than 51 years of age who presented with palpable masses and 13% for those aged greater than 51 108.
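The identity quoted above, false-negative rate = 1 - sensitivity, can be made concrete with a brief sketch. The function names are illustrative assumptions and do not come from any of the cited studies; the only inputs are rates already quoted in the text.

```python
def false_negative_rate(sensitivity: float) -> float:
    """Fraction of cancers missed by a test, given its sensitivity (0 to 1)."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be a proportion between 0 and 1")
    return 1.0 - sensitivity


def sensitivity_from_false_negative_rate(fn_rate: float) -> float:
    """Inverse of the identity: sensitivity implied by a false-negative rate."""
    if not 0.0 <= fn_rate <= 1.0:
        raise ValueError("fn_rate must be a proportion between 0 and 1")
    return 1.0 - fn_rate


# Screening sensitivity of 85% (quoted above) implies a 15% false-negative rate.
screening_fn = false_negative_rate(0.85)

# A 44% false-negative rate (women under 51 with palpable masses, as quoted
# above) implies a sensitivity of 56% in that group.
young_sensitivity = sensitivity_from_false_negative_rate(0.44)

print(f"screening false-negative rate: {screening_fn:.0%}")
print(f"implied sensitivity, women under 51: {young_sensitivity:.0%}")
```

The conversion is only meaningful when both figures refer to the same population and the same definition of a missed cancer, which is why screening figures should not be carried over to the diagnostic setting.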
A study from Copenhagen during that same time period published more favorable performance characteristics for mammography, but the results also differed by age, with false-negative mammograms in women with palpable masses in 17%, 9%, 3% and 3% for age groups 40 or less, 41-50, 51-60, and greater than 60, respectively 96. Chew and colleagues, in 1996, published a series of 349 patients with breast masses who were evaluated at the Strathfield Breast Centre in Sydney, Australia. They found an overall false-negative mammography rate of 9% in patients with an average age of 60 years 109. In women under the age of 50, the false-negative rate was 17%, a statistically significant difference. The average size of the cancer was 20.3 mm in mammographically-negative breast cancer and 23.8 mm in mammographically-positive breast cancer, which was not significantly different. This study is especially relevant because it was published during a time when state-of-the-art dedicated film-screen mammography units were used. Its results were corroborated by a study published in 1998 by Dawson and colleagues that documented a 35% false-negative rate of mammography for women 35 years of age and younger who had palpable masses 110, and by Coveney et al, who found a 30% false-negative mammography rate for women with palpable masses 40 years of age or less 106. That study documented sensitivity rates for mammography by age in women with palpable masses of 70%, 75%, 85%, 92% and 90% for age groups less than 40, 41-50, 51-60, 61-70, and over 70 years of age, respectively 106.

There are reasons for false-negative mammograms beyond the signs of malignancy being hidden by dense breasts and thus resulting in a normal mammogram reading. Coveney et al examined 291 cases of malignant palpable breast masses diagnosed between 1986 and 1991 at St. Vincent’s Hospital in Dublin, and documented a false-negative rate of 16.5%.
Reasons cited for the 48 false-negative cases included misinterpretation of overt signs of malignancy on the films (20%), lack of recognition of subtle signs of malignancy (34%), and mammographic findings mimicking benign disease (16%). The mammograms were actually normal in only 30% of cases 106. In one large study of the litigation literature, 68% of patients who successfully filed suits against their physicians had either negative (49%) or equivocal (19%) mammogram reports. Several independent authors have reported on the effect of false-negative mammograms on the diagnostic delay of breast cancer 107, 111-113, and cautions to pursue the diagnostic work-up of a breast mass no matter what the mammogram result are standard.

In an interesting study published in 1998, Lister et al compared the test characteristics of mammography and high-frequency ultrasound in a population of 205 women with clinically benign masses who underwent triple test evaluation and in whom the cancer prevalence was 7%. The mean age in the study was 40 years, but mass size was not discussed, and no biopsy rates or follow-up durations were reported. However, one can compare test characteristics of the two radiological tests on initial evaluation of the breast mass. Sensitivity, specificity, and positive and negative predictive values were 57%, 93%, 61%, and 92% for mammography and 93%, 97%, 68% and 99% for ultrasound 114. These authors now use ultrasound as the radiological study in the triple test. They limit mammography use in women with clinically benign lumps to those with confirmed malignancy (to screen for occult mammographic abnormalities prior to cancer treatment), to those with suspicious ultrasound findings, or to those in whom a screening mammogram is appropriate based on age and timing of the previous film.
They comment, however, that ultrasound is extremely operator and machine dependent, and caution that this change in policy is only appropriate if local test characteristics confirm the improved accuracy of the procedure.

The use of ultrasound as a substitute for mammography, or in addition to it, in the work-up of a palpable breast mass is being reported with increasing frequency. Reinikainen et al reported on the use of high-frequency ultrasound in the evaluation of 84 palpable lesions, 63% of which were malignant, evaluated between 1996 and 1997. The average patient age was 48 years and the average mass size on ultrasound was 20 mm for malignant lesions and 18 mm for benign masses. In a blinded analysis, ultrasound and mammography performance were compared. Ultrasound and mammography performed equally well, with 97% sensitivity for both modalities. In both cases, 1 of 37 malignancies had normal imaging studies, but they were in different patients, both of whom were diagnosed with lobular carcinoma. One of the greatest benefits of ultrasound in this study was the increase in radiological suspicion of malignancy that occurred; 12 cancers that were indeterminate on mammography were clearly malignant on ultrasound. In addition, 57 cases were evaluated with FNAB, and 3 were false-negative. Radiological imaging with either one or both modalities correctly identified these malignancies 115.

Classification of Mammography Findings

The classification of findings in mammography has been reported in a variety of ways, and a standard reporting system was adopted in the United States in 1992. This system, referred to as the Breast Imaging Reporting and Data System (BI-RADS™), was developed by the American College of Radiology 116. It assigns a numerical score to a range of mammographic categories between 1 and 5. BI-RADS is an accepted standard among radiologists dedicated to mammography, with the likelihood of malignancy rising as the categorical designation increases numerically.
Exceptionally reproducible results on mammogram interpretation of 300 test cases have been documented, with almost identical ROC curves among 3 different expert radiologists 117. Some of the literature published after BI-RADS™ was released uses its terminology in the classification of mammogram readings, while other systems have been used as well. Most studies that do not use the BI-RADS™ system classify mammogram findings into categories of benign, suspicious, and malignant.

Test Characteristics of Mammography in Women with a Palpable Mass

The test characteristics of several studies that examined mammography performance in the presence of palpable masses are listed in Table 3. As with CBE test characteristics, many of the quoted values were derived from the data given in the papers, as not every report calculated the pertinent test characteristics. The categorization into positive vs. negative tests was based on a benign (normal or benign mammogram) vs. not benign (indeterminate, suspicious or malignant reading) categorization as discussed above. Some papers categorized the test values differently from the guidelines outlined above in the calculation of the test characteristics. For purposes of comparison, all values given in Table 3 were recalculated and checked using raw data and consistent categorization to ensure comparability.

One study in Table 3 specifically examined mammogram results in the context of palpable masses 108, while some did so in the context of other diagnostic combinations, but not as a component of the triple test 100, 101. However, as with CBE evaluation, most of the studies in Table 3 were done in the context of triple diagnosis evaluation, with mammogram test characteristics analyzed as a separate component of this diagnostic evaluation 7, 75, 76, 87, 89-98. The papers by DiPietro from 1985 and 1987 and Martelli from 1990 deserve special comment. These 3 papers are from the same institution.
The authors considered several mammograms unsatisfactory because, owing to breast density, the x-ray did not visualize the lesion that was palpable. However, they appropriately included the readings as “negative” in the calculation of test characteristics (false negative in the calculation of sensitivity and true negative in the calculation of specificity). Most papers do not make the distinction of unsatisfactory as a criterion for mammography findings in women with a breast mass, but instead simply include the results in the calculations of test performance as “negative”. Although for purposes of the table the results were recalculated to include the unsatisfactory mammogram readings among the negative results, the reports mentioned above provide helpful information regarding reasons for false-negative mammograms.

History and Clinical Breast Exam in Combination with Mammography

In 1997, Seltzer published a prospective study of 6,787 consecutive patients whom he had evaluated as part of his surgical breast practice between September 1987 and December 1993. He used the combination of history, CBE, and mammography and committed himself to a clinical impression of benign or malignant disease prior to intervention. Biopsy was performed in 2,247 patients, 1,240 of whom presented with a breast mass. Two hundred and sixty-six of the patients with a breast mass had cancer, for a malignancy rate of 21%. The positive predictive value for the combination of tests in this setting for this experienced surgeon was 0.68; it was 0.45 for women less than 50 years and 0.89 for those 50 years of age or more 6. The negative predictive value in women with a mass for the combination of tests was 0.95 for those less than 50 and 0.89 for those 50 and older. In a similar study reported by Sterns, the positive predictive value for malignancy differed depending on the combination of test results and the age of the patient.
For women less than 50 years of age, the PPV was 88.5% for those with positive CBE and mammography, 49.6% for positive mammography but benign CBE, and 42.1% for positive CBE and benign mammography. When both tests indicated benign disease, the malignancy rate was still 12.6%. For women 50 years or greater, the respective values were higher at 92.2%, 66.7%, and 36.5%. The malignancy rate when both tests were negative in the older women was only slightly lower than in the younger women, at 12.2% 118.

The false-negative rate of the diagnostic work-up of a breast mass is too high to stop the evaluation at the point of a benign CBE and a benign mammogram. Although this point has been stressed repeatedly in the litigation literature, the clinical literature also reflects unacceptably high missed cancer rates if cytological or histological steps are not taken.

Fine Needle Aspiration Consistency

Experienced cytologists have written about the physical sensation experienced by the aspirator when the tip of a needle enters a solid breast mass. This internal consistency has been described as soft, rubbery, hard, and/or gritty. Gritty lesions, especially those that are hard and gritty, are more likely to be malignant 119, 120. A few authors have used this feature as part of the evaluation of whether or not a palpable breast mass represents malignancy. Somers and colleagues examined 105 lesions that they classified as “subsuspicious” at the Albert Einstein Medical Center between 1984 and 1987. They specifically excluded any solid, dominant masses and examined instead lesions which they described as areas of prominence, thickening or tenderness on CBE. Unfortunately, they do not break down the frequency of each presentation, and it is difficult to discern from their publication how many of these lesions actually represented cysts that disappeared following aspiration. All of the cases had “subsuspicious” CBE, mammogram and internal consistency results.
They described the needle sensation upon aspiration as “rubbery firm, cystic or soft” as opposed to “gritty or indeterminate”. Twenty-five percent of the lesions had a scant or insufficient FNAB; that was accepted by the authors because they used the FNAB results as a confirmatory rather than a diagnostic test in this study. However, the study design dictated that any cancer identified would be detected by FNAB cytology or open biopsy. One cancer was identified by aspiration of suspicious atypical cells, and infiltrating ductal carcinoma was confirmed on open biopsy. Six other biopsies were performed that were benign. Twelve percent of the patients were lost to follow-up; the mean length of follow-up for those followed was 61 months. No further cases of cancer were identified 121. Given that we have very little knowledge of the characteristics of the masses in the study population or of the outcome of hypocellular FNAB results for breast masses under any circumstances, and given the 25% insufficiency rate and 12% lost to follow-up rate of this study, its conclusions regarding subsuspicious breast mass management are preliminary at best. Nonetheless, it would be safe to conclude that the malignancy rate of subsuspicious lesions is lower than that of otherwise dominant breast masses.

Fine Needle Aspiration Cytology

Classification of Fine Needle Aspiration Biopsy (FNAB) Findings

Different classifications of cytology findings have been reported in the literature, but most use a 5-level scheme as follows: benign, atypical, suspicious, malignant, and insufficient. There is debate, however, regarding the atypical classification, how it should be reported, and how it should be managed. One large multi-institutional study classified a finding of “few atypical cells” as benign 122, while others consider any finding of atypia as an indication for open biopsy.
In the latter instance, however, the atypical findings are usually classified as having characteristics compatible with indeterminate findings. In the practice guidelines recently published as part of the 1996 National Cancer Institute sponsored workshop on fine needle aspiration biopsy, recommendations for classification were made as follows: benign (no evidence of malignancy); atypical/indeterminate (correlate with imaging characteristics and clinical impression); suspicious/probably malignant (biopsy); malignant; and unsatisfactory. This classification scheme, by a subcommittee of national experts, appears to allow for clinical correlation in the management of atypical findings. Two pages prior to this classification in the same report, however, is this statement: “The presence of atypia on a cytologic or histologic preparation, the degree to which is to be determined by the pathologist, warrants an excisional biopsy” 123. In the calculation of test characteristics for FNAB, most studies include the finding of atypia in the “positive” interpretation of the test, unless the atypical cells are consistent with benign disease, as in the study by Zarbo et al 122.

Test Characteristics of Fine Needle Aspiration

Many studies of the diagnostic efficacy of FNAB have been reported evaluating its role in the work-up of palpable breast abnormalities, and the largest have been selected for reporting and summarizing in Table 4. Some have been done in the context of triple diagnosis, but most have not. Most studies document greater test sensitivity than with either CBE or mammography, and unlike either of these two tests, the sensitivity of the procedure does not appear to vary by age 100.
In an important multi-institutional study conducted in 1988, in which 294 institutions and 988 pathologists participated throughout the United States, 13,066 breast FNAB specimens were examined. Of these, 33% had histopathologic correlation to form the basis for determining test characteristics, and 41% demonstrated malignancy. The sensitivity, specificity, and positive and negative predictive values were 82%, 97%, 95%, and 86% respectively 122. A second multicenter study from Italy reported on ten sites and 23,063 FNAB cases observed between 1987 and 1993. Histologic confirmation was available for 7,763 cases and a minimum of 6 months of follow-up for another 13,144 cases, with a malignancy rate of 26% for the reported population. The sensitivity, specificity, and positive and negative predictive values for adequate aspirations were 92%, 95%, 90%, and 97% respectively 124. In another large study from 6 sites in the United Kingdom, 1,010 patients with adequate FNAB readings and a 61% malignancy rate demonstrated similar test characteristics of 92%, 86%, 91%, and 88% respectively 125.

Of the numerous studies of the test characteristics of FNAB that have been reported, few report the lesion size that was aspirated or the age of the population, and most compare the results with open biopsy rather than with all patients who underwent FNAB during the study period. Many studies combine results of cyst aspirations with solid mass evaluation, and some studies include readings with atypical cells in the “positive” category while others do not. In addition, specimen inadequacy is handled in different ways by different authors. Some include it in the calculations as a negative reading while others do not. Table 4 presents a representative sample of published studies that presented raw values for recalculation of test characteristics.
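As an illustration of how the four test characteristics quoted throughout this section are recalculated from raw values, the sketch below derives them from the four cells of a two-by-two table. The class name, field names, and the example counts are hypothetical and are not taken from any study in Table 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-classifying a test reading against final diagnosis."""
    true_pos: int   # test positive, cancer present
    false_pos: int  # test positive, cancer absent
    false_neg: int  # test negative, cancer present
    true_neg: int   # test negative, cancer absent

    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)

    def ppv(self) -> float:
        # Positive predictive value: cancers among positive readings.
        return self.true_pos / (self.true_pos + self.false_pos)

    def npv(self) -> float:
        # Negative predictive value: non-cancers among negative readings.
        return self.true_neg / (self.true_neg + self.false_neg)


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example = TwoByTwo(true_pos=90, false_pos=5, false_neg=10, true_neg=95)
for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {getattr(example, name)():.3f}")
```

Sensitivity and specificity depend only on how the test performs within the diseased and non-diseased groups, whereas the predictive values also shift with the malignancy rate of the series, which is one reason values from series with very different cancer prevalence are difficult to compare directly.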
For purposes of comparison, atypical findings were calculated in the positive category, cyst aspirations were eliminated, and insufficient specimens were dropped from the calculations.

Pitfalls of FNAB

When calculating test characteristics, most authors include both atypical readings and suspicious readings in the positive category, and this approach was taken when calculating test characteristics for Table 4. The clinical application of a positive FNA result, however, differs depending on the reading. For atypical or suspicious readings, open biopsy is always done, whereas many surgeons will perform a definitive cancer operation for a cytology reading which is definitely malignant, considering confirmatory open biopsy unnecessary. In fact, this was one of the first applications of the procedure in the category of avoiding open biopsy. False-positive results of “definitely malignant” readings on FNAB are uncommon, but do occur, usually in predictable settings. These include the hyperplastic and atypical findings of pregnancy and lactation, the presence of apocrine cells simulating carcinoma, and the cellular features of some fibroadenomas. The usual rate for false-positive results of definitely malignant readings is less than 1% 126.

False-negative results on FNAB are a well-recognized problem and occur much more frequently than false-positive results. There are two major categories of false-negative readings: those with adequate cellular representation, and those without it. The National Institutes of Health guideline on FNAB recommends that aspirators have an insufficiency rate of less than 20% to maintain credentialing privileges 123. It is not clear, however, what literature was reviewed to arrive at this figure, as there is a wide range of insufficiency rates quoted in the literature. Few studies quote insufficiency rates for consecutive aspirations, and the possibility of selection bias is high.
For instance, many studies cite an insufficiency rate that applies only to those specimens that were subjected to open biopsy. The insufficiency rate in this group would be expected to be lower than in a consecutive series of patients. The explanation for this prediction is based on the finding that masses that are malignant have lower insufficiency rates than masses that are benign 124, and benign masses are less likely to undergo biopsy. As an example, it is instructive to examine a series reported by Park et al 127 that examined 669 consecutive cases of FNAB specimens. They reported a 25.3% inadequacy rate in the entire series of patients. However, for calculation of test characteristics, the authors had no follow-up mechanisms and relied on histo-cytologic correlation. The inadequacy rate in the 198 patients who were selected for biopsy was 15.6% 127.

Lazda et al report on three different audit periods, 1988-89, 1991-92, and 1993, in one cytopathology laboratory in the United Kingdom. In audit period I, 408 aspirations were done, 46.8% of which were read as inadequate. An FNAB clinic was instituted between audit periods I and II and the inadequacy rate dropped to 20%. However, it rose to 30.6% in 1993 128. The malignancy rates following an inadequate result were 37.5%, 13.2%, and 7.1%; the latter number reflects the usual rate, which ranges between 4% and 9% 128. It is higher, however, in some reports. In a series of 1708 cases by Martelli at the Tumor Institute of Milan, the insufficiency rate was 407/1708 (24%), and of these, 34% were malignant 7. In an earlier study from the same institution, there was a 31% insufficiency rate overall, and 16.1% of these cases were malignant 89. This was similar to an inadequacy rate of 26% in a publication by Watson et al from Galway, Ireland 129. The insufficiency rate has been quoted as lower by other authors. Winchester et al quoted a 20% insufficiency rate in their series 98.
Horgan et al documented a 12.9% insufficiency rate, and 4 of the 258 specimens were malignant 130. Frable had a 6.0% insufficiency rate, and 7 of 42 of these lesions were malignant on open biopsy 131. Kaufman et al had only a 3.0% insufficiency rate, and all of these lesions were benign on open biopsy 93. In Zarbo and colleagues’ more definitive multi-institutional study in the United States, the insufficiency rate of 294 institutions and 13,066 specimens was 18% 122. It was 19% in the multi-institutional study in Italy, with a 6.5% rate of inadequate readings for the malignant specimens and 23.2% for the benign lesions 124. Other authors have documented higher inadequacy rates with benign lesions than with malignant ones as well 129, 130. Watson et al have also demonstrated that age predicts for inadequacy rates, with women less than 45 years of age having insufficiency rates of 14-26%, 48-55 years of age 13%, and over 55 years of age 1-5% 129. This is probably a reflection of the fact that young women are more likely to have benign disease, and therefore have a greater likelihood of inadequate aspirations.

Reasons for insufficient specimens are multifactorial. Some aspirated lesions simply do not contain epithelial cells. This is entirely appropriate for nonepithelial lesions of the breast. In these cases, the aspiration may accurately diagnose the lesion as benign. At times, insufficient aspirations can be correlated with clinical and mammographic findings to arrive at a diagnosis of adipose tissue, scar tissue, radiation changes, or other reasons for epithelial paucity 132. To illustrate this point, Abele and colleagues demonstrated in a unique study in 1983 that direct aspiration of surgical specimens that were nonproliferative yielded a paucity of epithelial cells, despite repeated aspirations 133-135.
In another interesting study, Wolberg et al found 66 of 464 FNAB results (14.2%) to demonstrate a paucity of or no epithelial cells and to represent nonepithelial benign conditions after correlation with clinical and radiographic findings. All of these were followed for 12 months with no malignancies identified. However, they also documented 20 cases of inadequate specimens (4.4% of the sample) that were not considered definitely benign. Fourteen of these were benign and 6 were malignant 136. Concluding that insufficient results represent benign findings can therefore be a dangerous assumption, especially in inexperienced settings 137. Recognizing this, some authors consider an unsatisfactory or insufficient specimen to be the equivalent of no FNAB and believe that a reading of insufficient cells should not be used at all in medical decision-making 137, 138. This attitude was adopted in the construction of the test characteristics of FNAB presented in Table 4, and insufficient specimen results were not used in the calculations presented.

Insufficient specimens that do not obtain true cellular representation of the components of a lesion represent true false-negative results. The most common reason for this is technical, most often due to inexperience of the aspirator. Inexperience may be related to inadequate technique of aspiration or to geographic misses of the target lesion due to small size, lack of circumscription, or lesion location 139. Barrows and colleagues demonstrated an insufficiency rate of between 6 and 21% among aspirators attempting to obtain representative cells from malignant lesions on FNAB if the physician had aspirated fewer than 70 cancers 125. Zarbo and colleagues, in the multi-institutional study conducted in the United States, documented a 17% inadequacy rate overall, but only 7.2% in specimens obtained by a pathologist in contrast to a clinician, which they correlated with experience.
However, they also pointed out that only 5% of the over 200 institutions in the study exclusively used pathologists for FNAB performance 122. Layfield also documented greater FNAB accuracy rates when a cytopathologist performed the aspiration, as discussed above 139.

One must consider that the spectrum of disease in a population, and the age of that population, are important considerations in all test interpretation characteristics, including inadequacy rates of FNAB. A series with a high malignant:benign ratio should predict for a lower insufficient specimen rate than a series with a low malignant:benign ratio. Does this explain some of the discrepancy that is found between series that report insufficiency rates of aspirations by surgeons as opposed to cytopathologists? Invariably, surgeons have an inadequacy rate of around 20%, as opposed to 5-7% for cytopathologists. Surgeons, however, are more likely to perform aspiration on consecutive patients, and therefore to include a greater number of “subsuspicious” lesions than might be referred to the cytopathologist for aspiration. This raises the possibility that selection bias accounts for the differential rates of inadequate specimens reported in the literature. Investigation of this possibility awaits a future study.

Rather than the chosen specialty of the aspirator, the amount of clinical experience and confidence of the aspirator have been shown to relate to whether the aspirated material represents the lesion in question 100, 125. Dixon reported on FNAB inadequacy rates during two time periods, one during which multiple aspirators were used to perform FNAB and the second during which only one aspirator was used. The inadequacy rate dropped from 22% to 10%, and the sensitivity improved from 81% to 99% 100. The actual criteria for specimen adequacy, or inadequacy, are open to debate.
Layfield has suggested that a minimum of 6 clusters of epithelial cells per case, composed of at least 5 cells each, be present 137, while others have suggested 4-6 clusters of 5-10 cells each 140, 141. The recently published guidelines from the National Institutes of Health regarding fine needle aspiration specifically state that there is no national standard requiring that a given number of cells be present for specimen adequacy 123.

Salami et al reported on the outcomes of 61 patients who had inadequate FNAB readings done for palpable findings at the University of California at Los Angeles in 1995. These inadequate results represented a specimen inadequacy rate of 23%, 21% attributed to misdirected biopsies and 1.6% to misinterpretation. The mean size of the mass on CBE for the inadequate results was 1.3 cm and the average age of the patient was 46 years. Twenty cases had a histologic diagnosis and the rest were followed for a minimum of 24 months. A retrospective review confirmed the lack of epithelial cells in 34 cases, 2 of which were malignant. Seven cases actually fulfilled the criteria for adequacy, and one of these was malignant. Twenty cases had some epithelial components, but fewer than the six clusters required for the study. These were all benign. The study is unique because it is the first to examine the outcome of inadequate FNAB results in patients with palpable abnormalities 142. It confirms that in the 3 cases of a false-negative FNAB, open biopsy was performed on the basis of an abnormal CBE or mammogram.

There are other reasons for false-negative FNAB results. Lesion size is one. A small target lesion can be missed by the aspirator, resulting in a sampling error. Barrows et al reported on 689 breast cancers divided by 1 cm size increments. They found that the percentage of positive or suspicious aspirates rose incrementally from 70% for 1 cm lesions to 80%, 85%, and 90% for lesions of 2 cm, 3 cm, and 4 cm and over, respectively 125.
Less dramatic but consistent findings were reported by Ciatto, with sensitivities of 92% for lesions up to 2 cm, 95% up to 5 cm, and 96% for over 5 cm 124. Paradoxically, some investigators have found an increased false-negative rate when the lesion size is over 4 cm 139, 143. This is attributed to the presence of large areas of necrosis, hemorrhage or fibrosis within a lesion.

There is also well-documented difficulty interpreting some aspirates, even if the material is adequate and representative. Reasons for false-negative interpretations include specific pathology such as low-grade in-situ malignant lesions, well-differentiated invasive ductal carcinoma, tubular carcinoma, papillary carcinoma, comedocarcinoma, colloid carcinoma, and invasive lobular carcinoma 123, 130, 139. It is interesting that young women are more likely to present with smaller masses than older ones, and there is a correlation between mass size and tumor differentiation. The probability of a false-negative reading is correlated with tumor differentiation such that the more differentiated the tumor, the higher the likelihood of a false-negative FNAB reading 126. Therefore, a higher proportion of young women than of older women will have a false-negative FNAB result for this reason. Because the overall proportion of women with this profile will be low, however, test characteristics are not likely to be affected to any great extent.

Zarbo et al were able to quantitate the false-negative rate of FNAB due to sampling errors (7%) vs. interpretation errors (0.8%). Although this may appear to be a relatively low rate, absolute numbers bring it into perspective. In their nationwide study done over a 6 month period, there were 261 occurrences of patients with breast cancer who had false-negative FNAB results 122.
That false-negative FNAB can lead to a delayed diagnosis of breast cancer is underscored in a study by Troxel et al that identified false-negative FNAB cytology of breast lesions as accounting for 10% of pathology-related malpractice claims 144.

False-positive results can also occur. False-positive interpretive errors are usually due to poor cell fixation, inadequate staining, or misinterpretation of findings, usually of proliferative benign lesions, some with atypia 139. This is especially true of fibroadenomas, and this can influence clinical management. For instance, few would insist on the removal of a fibroadenoma with a negative FNAB unless the patient requested it. Dixon et al have shown that over 33% of fibroadenomas become smaller or resolve when followed and that conservative management of fibroadenomas is appropriate in 90% of cases 145. However, the cytologic features of fibroadenomas on FNAB can be confusing 139, 146. The lesion is moderately cellular and can have atypical features, and in some cases this will require biopsy to exclude malignancy 139, 147. The clinical and mammographic features of fibroadenomas are generally easily differentiated from those of carcinoma by virtue of their well-circumscribed, hypermobile clinical characteristics, but sufficient overlap exists between these features and those of medullary carcinoma and small invasive ductal carcinomas to require biopsy in the presence of atypia 147, 148. Maygarden et al reported on 48 FNAB results in women 30 years of age and under, 8 of which had atypical features. Two of these represented fibrocystic changes, 2 were fibroadenomas, 3 were infiltrating ductal carcinomas, and 1 was a medullary carcinoma 149. Two of the 4 malignant lesions were clinically benign, underlining the need for open biopsy in cases of atypia.
In conclusion, even if specimen insufficiency is eliminated, FNAB has a false-negative rate that precludes its use as the only method of ruling out malignancy and a false-positive rate that results in excessive biopsy rates. This has led investigators to use a combination of methods to achieve greater accuracy in the diagnostic evaluation of a breast mass.

Triple Diagnosis

Classification of Triple Test Findings

Triple diagnosis has traditionally referred to the application of the combination of three diagnostic tests in the work-up of a breast mass: clinical impression, mammography, and fine needle aspiration cytology results. The results of each individual test have been classified in the same ways as previously discussed. Its modern application involves clinical decision-making based on a test of concordance, both for malignant and for benign results. That is, definitive cancer operations are often done for malignant results of the triple test, and follow-up has been recommended for benign results. Discordant findings between tests when the triple test is applied indicate the need for open biopsy.

Some authors have attempted to devise a scoring system for triple diagnosis in order to increase diagnostic accuracy. In the first such attempt, Smallwood et al conducted a study of triple diagnosis in the breast clinic at Southampton General Hospital in 1981. Their system assigned a numerical score of -2 for a definitely benign mass, -1 for possibly benign, 0 for equivocal or inadequate assessment, +1 for probably malignant and +2 for definitely malignant, for a scoring range of -6 to +6. The scoring system was applied to the results of clinical examination, mammography, and FNAB, but clinical examination was not reported. The inadequacy rate of FNAB in the 244 lesions assessed was 34% (these lesions were assigned a 0 in the scoring system) and the malignancy rate in the population studied was 40%.
The authors report the results of the scoring system but do not indicate a cut-off point for triage purposes, as all patients underwent open biopsy at that time. However, if the score was +4 or above, definitive surgery was conducted without frozen section and was accurate for malignancy in all of the 72 cases. No patients with scores of -4 or below had malignancy; 65 patients fell into this group. Of the remaining 60 patients with scores between -3 and +3, 27 (45%) had malignancy 150.

No further publications regarding scoring systems appeared in the literature until a follow-up series on Vetto’s original paper was presented at the Pacific Coast Surgical Association in February 1998. In it, Morris et al departed from the attitude that open biopsy was necessary whenever there was a discordant result of the components of triple diagnosis. In a series of 261 breast masses studied between 1991 and 1997, they assigned a score between 1 and 3 to each of the three components of triple diagnosis, corresponding to benign, suspicious, or malignant results. This allowed for a possible total score between 3 and 9. The mean age of the patients was 49 years. The authors did not comment on mass size. One hundred twenty-five masses scored a 3, indicating concordance for benign results. Twenty-three of these underwent open biopsy and 112 were followed for a mean duration of 18 months. All were benign. Seventeen cases had a score of 4, indicating discordance with one of the components of triple diagnosis. These all underwent open biopsy and all were benign. A score between 6 and 9 was assigned to 88 masses, and all of these were malignant. A score of 5 was assigned to 21 masses (2 of the 3 components of triple diagnosis suspicious and 1 benign, or 2 benign and 1 overtly malignant). Of these, 8 were malignant on open biopsy and 13 were benign 94. The authors concluded that scores of 4 or less could be appropriately managed by follow-up.
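The additive score described by Morris et al can be illustrated with a short sketch. The names `Finding`, `triple_test_score`, and `suggested_management` are hypothetical and chosen only for illustration. The mapping of a score of 5 to open biopsy and of 6 or more to definitive therapy is an assumption drawn from the outcomes reported above, not a rule quoted from the authors.

```python
from enum import IntEnum


class Finding(IntEnum):
    """Per-component result, scored 1 to 3 as described for Morris et al."""
    BENIGN = 1
    SUSPICIOUS = 2
    MALIGNANT = 3


def triple_test_score(cbe: Finding, mammogram: Finding, fnab: Finding) -> int:
    """Sum the three component scores; the total ranges from 3 to 9."""
    return int(cbe) + int(mammogram) + int(fnab)


def suggested_management(score: int) -> str:
    """Map a total score to a management category.

    Scores of 4 or less -> follow-up (the authors' stated conclusion).
    Score of 5 -> open biopsy; 6 or more -> definitive therapy
    (both are assumptions based on the reported outcomes).
    """
    if not 3 <= score <= 9:
        raise ValueError("total score must be between 3 and 9")
    if score <= 4:
        return "follow-up"
    if score == 5:
        return "open biopsy"
    return "definitive therapy"


# Example: benign CBE and mammogram with a suspicious FNAB gives a score of 4.
example = triple_test_score(Finding.BENIGN, Finding.BENIGN, Finding.SUSPICIOUS)
print(example, suggested_management(example))
```

Because the score is unweighted, a suspicious FNAB and a suspicious CBE contribute equally, which is the simplification the discussants of the paper went on to question.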
They acknowledged that FNAB cytology has greater diagnostic accuracy than the other components of triple diagnosis, and commented that they did not want to develop a “weighted” system of scoring for the sake of simplicity. They justified their recommendation of avoiding open biopsy, even in the case of a suspicious FNAB reading, by citing that the 17 cases that fell into this category in their series were all benign, and by discussing the reasons for false-positive FNAB results. In a provocative discussion following presentation of this paper, Butler questioned the recommended approach for scores of 4, especially if the abnormal result was an FNAB. He stated that in 18 patients studied with these findings at his institution, 1 case was malignant. He also questioned the advisability of proceeding with definitive therapy in cases of scores of 6 (patients with suspicious results for all 3 components of triple diagnosis, or those with malignant CBE or mammography results but benign FNAB findings). Other discussants questioned the criteria for interpretation of CBE and whether the results of the study were generalizable, certainly valid concerns that had been raised by other studies as well. In the reply, Dr. Vetto addressed the question of CBE by commenting that the evaluation is attempted blindly, “sometimes without even looking at the mammogram”. Although this reply may not have fully answered the question, it does reflect the reality of most surgical practices. That is, when a woman is evaluated for a breast mass, it is difficult to separate the mammogram findings from those of physical exam when arriving at a clinical impression.
In fact, this is openly acknowledged in a paper by Cardona and colleagues, who state that “PE (physical examination) sensitivity will be overestimated when PE is performed after other diagnostic tests, such as mammography, because the clinician may be influenced and alerted by radiologic evidence of a suspect lesion.” This study has attempted to address this problem, as well as to specifically define the characteristics of malignancy used during clinical examination.

Test Characteristics of the Triple Test

Numerous authors have reported the results of the application of triple diagnosis, and Table 5 documents the test characteristics of a representative group of studies. The same criteria for calculation of results from raw data were applied as in the previous tables. Some studies only included results for malignant diagnoses, and in these cases, only the sensitivity for the triple test is indicated. Two studies had purposely low malignancy rates of 0.4% 151 and 1% 121. These two studies specifically examined lesions that had been assessed as not requiring open biopsy. The positive predictive value for these studies would be expected to be much lower than in those that had higher disease prevalence, and this expected result was verified in the calculations. However, the other test results are comparable with the rest of the literature.

It is important to recognize that the usefulness of triple diagnosis has evolved over time, and its results were applied in Europe well before the United States. Initially, its major strength was recognized as its ability to increase the sensitivity (that is, decrease the false-negative rate) of the diagnosis of malignancy and to reduce the need for frozen section confirmation of malignancy prior to definitive surgery 150. In one of the first studies, done in 1976 in Berlin, Germany, Kreuzer and colleagues viewed the triple diagnosis test as a three-way test of concordance.
They recommended definitive surgery in cases when all 3 test components were definitely malignant. In addition, they advised that if all three components were definitely benign, close follow-up was an acceptable alternative to open biopsy. They did not define the parameters for follow-up. Lastly, they advised that if any of the components diverged or results were suspicious, open biopsy was warranted. Using these guidelines, they found a diagnostic error of less than 1% 75. It should be pointed out that the criteria for benign findings on CBE were any mass 3 cm in size or less that lacked advanced signs of breast cancer. The population in which their study was done was probably representative of many in the early 1970s, but the presentation of a breast mass in 1999 is far different than 25 years ago, as discussed above. The recognition of this fact, as well as the high malignancy rate in their population (41%), obligates one to question the generalizability of the findings of that report.

In a study done in Denmark in the early 1980s, Hermansen reported on 22 solid masses, 61% of which were less than 2 cm. Their malignancy rate was 26%. They had a 100% sensitivity with triple diagnosis, but cautiously concluded that before open biopsy could be avoided, a larger study would be necessary 37. A follow-up study was published 3 years later by the same authors. It included 458 palpable solid breast masses with a malignancy rate of 31%. In this study, the diagnostic accuracy for malignancy was enhanced over that of any one of the test components alone, as has been found by many authors. For example, 13 lesions had a false-negative mammogram report, and 10 of these had either suspicious or malignant findings on FNAB. The remaining 3 cases were accounted for by suspicious findings on CBE 92. This again resulted in a test with 100% sensitivity. They concluded at this point that if all 3 test components were benign, open biopsy “was superfluous”.
They also acknowledged that it was inevitable that a false-negative result of the triple diagnosis would occur, and that clinical follow-up was warranted. Again, however, no guidelines regarding follow-up were provided.

Despite the initial enthusiasm for triple diagnosis, some authors have not achieved 100% sensitivity with application of the technique. This has led to debate regarding its use. Di Pietro reported a series of 631 solid breast masses without obvious signs of malignancy on CBE from Milan, Italy in the early 1980s and found a 7% false-negative rate with triple diagnosis 89. In another series from the same institution reported in 1990, Martelli reported a 4.8% false-negative rate using the triple test. This led that institution to adopt the policy that open biopsy be done on any solid mass that persists on 1-2 month follow-up in women over 30 years of age 7. Butler et al also recommended that open biopsy be done on persistent breast masses, even though their study documented a 100% sensitivity for triple diagnosis 90. Steinberg et al used similar three-way concordance criteria for definitive surgery or follow-up in a more modern study, otherwise recommending open biopsy. However, they failed to define the components of CBE evaluation and did not comment on mass size or population age. They found a 1% false-negative rate with a benign triple test and emphasized the importance of follow-up, again providing no guidelines for what is meant by this recommendation 95. Vetto et al, in a modern series in the United States, found 100% diagnostic accuracy when all three components of the triple diagnosis were concordant. They recommended definitive therapy for those that were malignant, avoiding the need for diagnostic biopsy in these cases. They also recommended that if all three results were benign, follow-up should be done rather than biopsy. For discordant results, biopsy was recommended 97. Most authors agree with this recommendation.
Layfield has estimated that, using this approach, 50% of open biopsies could be eliminated and follow-up in compliant patients done instead 152. He recommended follow-up at 3 to 6 months but did not specify the duration of time over which this should occur.

As would be expected considering that both CBE and mammography are components of triple diagnosis, results are influenced by age. Dawson and colleagues studied 68 patients 35 years of age or less diagnosed with breast carcinoma between 1984 and 1996. Sixty-five patients had both mammograms and FNAB performed in addition to CBE. Twenty-three of these (35%) had benign CBE results and mammogram findings. The diagnosis of cancer was made by FNAB in each of these 23 cases, with readings of malignancy, suspicious for malignancy, or atypical. There was one false-negative FNAB in this study, but on retrospective review, the authors comment that the slide was misinterpreted as benign and should have been read as atypical 110.

In another series reported by Yelland et al, 81 patients under the age of 36 were identified with breast cancer between 1971 and 1989. The average tumor size was 3 cm. The masses were evaluated as malignant in only 58% of the clinical exams by the surgeons. In a unique analysis, the authors were able to identify that only 25% of the general practitioners who initially examined the patients considered the mass malignant. Mammography was positive in only 24% of the clinically benign masses and 80% of the clinically malignant ones. FNAB results were positive in 56% of the clinically benign masses and 80% of the clinically malignant ones. These dismal statistics were improved upon when triple diagnosis was applied, but the sensitivity was still only 84%, and 10 of the 63 patients evaluated had benign results on all three tests. Open biopsy was done in these 10 patients based on inability to follow the patient.
One might question the experience of the clinicians who published this report until the institution in which it was conducted is examined. The series of cases was derived from the Combined Breast Clinic at St. George’s Hospital in London, which had been in existence for 19 years prior to patient identification. The results might be explained by the technological failures of mammography at the time that the study was conducted. Mammography techniques have certainly improved over the years, but are unlikely ever to achieve near 100% sensitivity in young women. It is easy to see why some surgeons believe that open biopsy should be performed on all women who present with a breast mass. The authors of this paper discussed the fact that 9,768 patients less than 36 years of age consulted the breast clinic during the same time period as this study. Assuming no false negatives in this population, 99.2% of the women under the age of 36 seen in this clinic were diagnosed with benign disease. If they all had masses and open biopsy had been required on them all, the operative burden would have been 10 cases per week. The clinic currently is using ultrasound to assess masses in women who have negative mammograms, as are other centers 86, 109, 153.

The effect of small tumor size on the individual components of triple diagnosis was investigated in a series of 21 invasive carcinomas from Osaka, Japan that were less than 1 cm in size. All of the tumors were palpable, although 57% were thought to be benign on CBE. The mammogram was benign in 17% of the cases and FNAB benign in 4 cases (19%). Of the 4 that were benign, CBE was suspicious in 2, equivocal in 1, and negative in the other. Mammography was suspicious in this latter case. No false negatives were found 154. The authors comment that breast size in Japanese women is on average smaller than in women in the United States and CBE and FNAB may therefore be more accurate in Japan than in other settings.
The performance of the triple test in women with small breast tumors needs further study. The only other publication that was found addressing this issue at all was that of Di Pietro et al from 1985 in a large study of Italian women. In this study, 24 malignant tumors measured 1 cm or less on physical examination. The authors comment in the discussion of the paper that the positive predictive value for these small tumors was found to be between 30% and 40% for CBE and mammography, and 80% for FNAB. No data were given, however, and the outcome of the triple test for this subgroup was not discussed.

Breast cancer is more challenging to diagnose as the size of the tumor decreases, especially on CBE and on mammography in women with dense breasts. In fact, FNAB has been cited by some authors to be especially helpful in these settings. Very few studies have compared test performance according to tumor size. Langmuir provided some insight in her study by analyzing the patients who had cancer and providing data on test performance. In that study, CBE was abnormal in 75%, 90%, and 100% of patients with cancer

Follow-up after Triple Diagnosis

Although most authors who have published about the use of triple diagnosis to triage patients from open biopsy to follow-up have emphasized the importance of follow-up, few have provided guidelines for what constitutes follow-up. The large experience of Di Pietro et al, documenting false-negative rates of triple diagnosis between 4.8% and 7%, makes it logical that follow-up will be required for identification of these cases 89. Follow-up recommendations and/or practices have ranged from routine screening through the primary care provider 94, to open biopsy of all persistent masses at 1-2 month follow-up 7, 89. Some authors recommend open biopsy if the mass persists, but allow for longer, though often unstated, periods of observation 90, 98.
Abele and colleagues published a series of benign FNAB results that they confirmed as benign using the standards of benign biopsy or no change on 6 month follow-up 133. Ciatto reported a modern series using the same criteria for confirmation of benignity of FNAB 124. No additional malignancies were reported in either study. Barrows et al used similar criteria in their study of triple diagnosis with the exception of a 12 month follow-up time period 125. They also reported no malignant lesions on follow-up. Brown et al reported on 526 patients with breast masses assessed by triple diagnosis between 1986 and 1990; one eventual cancer was initially benign on all three components of triple diagnosis. This patient was diagnosed on follow-up 3 months after the original presentation by an increase in the size of the mass. No other malignant cases were identified on follow-up, but the number of cases followed was not reported 155. Boerner and Sneige reported on 2 false-negative cases of triple diagnosis. One woman requested an open biopsy and a 7 mm invasive ductal carcinoma was found. The second was followed and at 5 months, a dramatic increase in the size of the original mass was found. It proved to represent a 4 cm ductal carcinoma in-situ 85. The authors comment that unbiopsied masses were followed for two years, but do not comment on the results beyond the two cases mentioned above. Rimsten reported on one false-negative case by palpation and FNAB that was diagnosed 5 months later because of an increase in the size of the mass 82.

In summary, lesions that are followed after being initially judged benign by CBE, mammography and FNAB are usually benign, with about a 1% false-negative rate in the literature that is available from modern series. The follow-up that is deemed necessary to capture the 1% of cases is less well defined, but it appears that the cases identified on follow-up have all been found within 6 months.
It is still possible that a slow-growing cancer may take longer than 6 months to increase in size, but if there is no change in 12 months, it would appear that follow-up would be complete. This definition is used in calculating false-negative rates for screening mammography, and it is standard to consider symptomatic cancer diagnosed within a 12-month period after a normal screening mammogram as evidence of a false-negative result 105.

The most specific guideline for follow-up that is published corroborates this conclusion. It represents a 1996 expert panel consensus from a National Cancer Institute sponsored workshop on fine needle aspiration biopsy. A 3 to 6 month follow-up after a benign triple test is recommended, “followed by repeat examination within an additional 12-month period, depending on subsequent clinical findings”.

Previous Use of Likelihood Ratios in Women with a Breast Mass

Barton studied the likelihood ratio of CBE in the presence of a breast mass 62. In this study, the likelihood ratio for malignancy when a mass is palpated on CBE was demonstrated to be 10.6, compared with a negative likelihood ratio of 0.47. This relatively high negative likelihood ratio in the face of a normal CBE reflects the fact that breast cancer is often diagnosed on the basis of an abnormal mammogram. Giard and Hermans studied the likelihood ratios for 4 different values of FNAB in 996 patients. They defined the values as follows: infinity for those patients diagnosed as definitely malignant; 4.81 for those diagnosed as suspicious; 0.11 for those diagnosed as benign; and 0.41 for those diagnosed as insufficient 155.

Summary of Literature Review

The influence of the risk factor profile on the probability of disease in a symptomatic population has not been studied to date.
Numerous studies testing the sensitivity, specificity, and positive and negative predictive values of CBE, mammography, and fine needle aspiration cytology in the work-up of a breast mass have been published, and none of these tests alone diagnoses breast cancer reliably enough to be used exclusively. Improved accuracy of breast mass diagnosis using the principles of triple diagnosis has been demonstrated by a number of large academic centers throughout the world. However, these centers often have dedicated surgeons, radiologists, and cytologists who contribute a multidisciplinary approach to the diagnosis of breast cancer. This raises questions regarding the generalizability of the published results. In addition, Giard and Hermans have commented, in a review of 29 studies of FNAB test performance, that any test that is operator-dependent is likely to suffer from publication bias 157. That is, study results are unlikely to be submitted for publication if they do not reflect good, or at least perceived adequate, test performance. This insight raises questions about the applicability of FNAB in the typical general setting to which most patients with breast masses present, and these and other authors recommend the development of individual performance characteristics for persons involved in FNAB 136, 157. This recommendation could easily be extended to all of the components of triple diagnosis.

The published studies also suffer from lack of population characterization; most fail to report average mass size or age of the population studied. When mass size is quoted, the results from histopathology examination rather than CBE results are often used, and sometimes only the sizes of the malignant lesions are reported.
The spectrum of disease, whether presenting in early or advanced forms and in what proportions, can influence the test characteristics of any diagnostic test, and age has an influence on the prevalence and test characteristics as well. Studies of CBE, mammography, FNAB, or the triple test, with few exceptions, include extremely high prevalence for cancer of between 26% and 66%. Most of these rates do not reflect the prevalence for cancer in most practices, even in most breast centers, and indicate a probable selection bias among the patients to whom the tests were applied. If a population is selected that includes a greater proportion of malignant cases than is reflective of patients with breast masses in general, this selection bias can affect the measured test characteristics. Positive predictive value, for example, will be highly dependent on the prevalence of disease; as the prevalence decreases, so will the positive predictive value (and conversely, the negative predictive value will increase).

To illustrate this point, it is instructive to examine a report by Logan-Young et al that selected for study a large population of women with breast masses who had characteristics of benign disease, rather than malignancy, on CBE and mammography. In this study, the prevalence for cancer was only 0.4%. These patients all had FNAB as part of their work-up, and had biopsies or follow-up for 12 months. Nine cases of cancer were found in a population of 2,247 women, eight after suspicious or malignant FNAB results and 1 after 6 month follow-up revealed an enlarging mass 151. The positive predictive value for triple diagnosis in this study was calculated to be 32.0%. Given this, one might conclude that open biopsy is necessary in all patients with a breast mass because only 32% of the cases with a positive test actually had malignant disease. This conclusion would have resulted in the open biopsy of 2,238 women who did not have malignancy.
The implications for the health care community and society at large are obvious.

A comment also needs to be made on the subject of the “gold standard” in measurement of test accuracy. In breast mass work-up, there is no question that the gold standard is open biopsy. This presents a problem because unless every mass that presents is biopsied, test specificity will be slightly underestimated, and sensitivity overestimated. In other words, those patients who do not have the “gold standard” diagnostic test will be dropped from the analysis. Therefore, the studies that report test characteristic data only on patients who had biopsies do not reflect the true test characteristics for the population that presents with a breast mass. A large series by Rimsten et al will illustrate this point, and give another example of the effect of prevalence on positive predictive value. These authors reported their data for FNAB in two different ways. One examined test characteristics in women who underwent open biopsy (N = 381, prevalence of malignancy 28%), and the other examined the test characteristics in the entire population that underwent FNAB (N = 986, prevalence of malignancy 11%). The women who did not undergo open biopsy were followed for 24 months. One cancer was found at follow-up 7 months after the negative FNAB. The sensitivity, specificity, and positive and negative predictive values for FNAB in the population that underwent open biopsy were 96%, 85%, 71%, and 98%, respectively. The test characteristics for the population of women who underwent open biopsy with the addition of the followed patients were 95%, 91%, 58% and 99%, respectively. There are no well-defined criteria for a substitute gold standard short of biopsy. Most authors agree that a diagnosis of cancer can be ruled out (therefore eliminating false-negative test results) after a reasonable period of follow-up. The follow-up period varies by study, however, and is often not defined at all.
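The dependence of predictive values on prevalence discussed above follows directly from Bayes' rule, and a brief sketch may make the arithmetic concrete. The function name `predictive_values` is hypothetical. The inputs are the rounded percentages quoted for the Rimsten series, so the outputs only approximate the published predictive values, which were presumably computed from raw counts.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Biopsied subgroup: prevalence 28%, sensitivity 96%, specificity 85%.
ppv_biopsied, npv_biopsied = predictive_values(0.96, 0.85, 0.28)

# Entire FNAB population: prevalence 11%, sensitivity 95%, specificity 91%.
ppv_all, npv_all = predictive_values(0.95, 0.91, 0.11)

print(f"biopsied only: PPV {ppv_biopsied:.2f}, NPV {npv_biopsied:.2f}")
print(f"all patients:  PPV {ppv_all:.2f}, NPV {npv_all:.2f}")
```

With these rounded inputs the positive predictive value falls from roughly 0.71 to roughly 0.57 as the prevalence drops from 28% to 11%, while the negative predictive value stays near 0.98 to 0.99, which is the pattern reported in the series.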
This study was designed to help sort out these problems. It includes all patients who met eligibility requirements for the triple test, whether they had a biopsy or follow-up. It does not select for patients in whom malignancy was suspected, and includes a minimum follow-up period of 12 months as part of the eligibility requirements. It attempts to characterize the population in detail and to clearly define the criteria used to classify CBE findings. Although the patient population is derived from a university-based Breast Center, the large number of self-referred patients and the volume of referrals for anxious women decrease the pre-test likelihood of disease, or prevalence, to a level that is more likely to be reflective of average surgical practices throughout the country than of a tertiary care center.

MATERIALS AND METHODS

Quantitative Scoring System: Preliminary Work: In academic year 1993-94, a pilot study in the Comprehensive Breast Health Clinic (CBHC) at Michigan State University was conducted to investigate a method to quantify the principles of triple diagnosis. The goal of the study was to design a quantitative model for each of the methods used for breast mass evaluation in the CBHC with the purpose of arriving at a summation score. The intent was for the score to be used to predict for benign vs. malignant disease and to be used prospectively to triage patients between open biopsy and follow-up. A numerical score was assigned to each of 4 separate assessments of breast masses, which included (1) clinical impression of the breast mass on CBE, (2) mammography results, (3) FNAB results, and (4) assessment of mass texture when aspirated. For categorization of the clinical impression on CBE, three numerical scores between 1 and 5 were assigned: 1 for benign, 3 for indeterminate, and 5 for suspicious for malignancy.
Components of the clinical impression used in the categorization included the risk factor profile, the history, the CBE results, the mammogram assessment by the surgeon, and the internal texture of the mass when fine needle aspiration was performed. For the CBE results, categories similar to those of Di Pietro et al were collected 91.

The BI-RADS lexicon system of the American College of Radiology was used to categorize the mammography findings. This system includes the 5 categories found below.
(1) Category 1: normal;
(2) Category 2: benign finding;
(3) Category 3: indeterminate;
(4) Category 4: suspicious;
(5) Category 5: malignant.

For categorization of fine needle aspiration biopsy cytology, the literature was reviewed and the most commonly observed categories were assigned numerical scores between 0 and 5 consistent with their positive predictive value. Scores assigned were as follows:
(1) 0: no epithelial cells; adipose tissue;
(2) 1: benign cells;
(3) 2: slight atypia, consistent with benign disease;
(4) 3: many atypical cells;
(5) 4: suspicious;
(6) 5: malignant.

For textures of aspirates, three numerical scores between 1 and 5 were assigned, based on descriptions by Roberts 119 and Kline 120.
(1) 1: soft or rubbery;
(2) 3: soft or rubbery but gritty;
(3) 5: hard and gritty.

To conduct the preliminary study, the cases seen in 1992 by a single surgeon, whose patients had been evaluated for breast masses only by her and for whom a complete data set was available, were analyzed. Numerical scores were assigned for each of the 4 evaluation components as discussed above and a summation score obtained. One hundred thirty-seven masses in 103 patients were evaluated; all had either open biopsy or a minimum of one-year follow-up. A one-year period was chosen based on the natural history of the growth of palpable breast masses that represent carcinoma and on a literature review of screening mammography standards 136, 151.
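The four-component summation described above can be sketched as a simple lookup and sum. The dictionary names, the shortened category labels, and the function `summation_score` are hypothetical; the numerical values are the initial assignments listed above, and the sketch makes no claim about any later adjustment of those values.

```python
# Initial score assignments as listed in the text (labels abbreviated here).
CBE_SCORES = {"benign": 1, "indeterminate": 3, "suspicious": 5}
BIRADS_CATEGORIES = {1, 2, 3, 4, 5}  # mammography scored by its category number
FNAB_SCORES = {
    "no epithelial cells": 0,
    "benign cells": 1,
    "slight atypia": 2,
    "many atypical cells": 3,
    "suspicious": 4,
    "malignant": 5,
}
TEXTURE_SCORES = {
    "soft or rubbery": 1,
    "soft or rubbery but gritty": 3,
    "hard and gritty": 5,
}


def summation_score(cbe: str, birads: int, fnab: str, texture: str) -> int:
    """Add the scores of the four assessments of a single breast mass."""
    if birads not in BIRADS_CATEGORIES:
        raise ValueError("BI-RADS category must be an integer from 1 to 5")
    return (CBE_SCORES[cbe] + birads
            + FNAB_SCORES[fnab] + TEXTURE_SCORES[texture])


# Example: a clinically benign mass with a benign-finding mammogram,
# benign cytology, and a soft or rubbery texture on aspiration.
low_risk = summation_score("benign", 2, "benign cells", "soft or rubbery")
print(low_risk)
```

Using the mammography category number directly as its score is an assumption made for this sketch; the text states only that the BI-RADS categories were used to classify the findings.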
Sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) were individually calculated for each of the 4 categories, for different score assignments in each of the categories, and for a summation score once optimal numerical assignments had been determined. Possible scores ranged between 2 and 20. A cutoff score of 7 or less essentially ruled out breast cancer (N = 124), and a score of greater than 7 had a sensitivity of 100% and a specificity of 98%. The results of the preliminary study can be found in Table 6. All malignancies were diagnosed within 4 weeks of presentation. Based on these results, a prospective study was begun, which is the topic of the remainder of the discussion.

Study Design: Data Collection

To develop a mathematical model to predict for breast cancer using the principles of triple diagnosis, a prospective database was constructed. Two forms were developed to collect demographic, epidemiologic, historical, and clinical data. The epidemiologic portion of the database was completed by the patient at the time of the initial evaluation in the CBHC and pertinent information updated at each visit. Epidemiologic variables collected for analysis included age, height, weight, body mass index, personal history of breast cancer, reproductive factors (age at menarche, parity, age at first delivery, number of deliveries, menopausal status, age at menopause), exogenous hormone use (oral contraceptives, estrogen replacement therapy, progestin replacement therapy, including duration of use), family history of breast cancer (first and second degree relatives, including age at diagnosis and unilateral or bilateral involvement), history of childhood or early adulthood radiation therapy, and tobacco and alcohol use, including duration and frequency.
The follow-up database contained clinical data appropriate to each patient visit and allowed for ongoing prospective data to be collected from each of the clinicians practicing in the CBHC. Clinical variables collected included mass characteristics (size, shape, texture, mobility, consistency), mammogram results, results of fine needle aspiration biopsy (internal consistency and cytology results), and clinical disposition of the patient. Additional variables were collected for interest but were not a major focus of the study. These included ultrasound results and mammography density in the area of the mass. When the forms were not complete, data was extracted from the medical record using clinic notes and letters to referring physicians. This occurred for at least one variable in approximately 15% of the cases. Variables related to the diagnostic evaluation that are routinely included in assessment by triple diagnosis were defined according to criteria detailed in the following sections.

Components of Clinical Impression:

The first evaluation component of triple diagnosis involves the clinical impression of the physician examining the patient and represents the overall probability of cancer based on a variety of factors assessed during the clinical encounter. These include assessment of the risk factor profile, CBE results, mammogram results, and assessment of the internal consistency of the mass when the needle is placed during fine needle aspiration. Clinical impression therefore represents an outcome rather than a true explanatory variable. Because we were interested in analyzing each of the components of the clinical impression separately, a decision was made to substitute the results of CBE for the overall clinical impression commonly used in triple diagnosis.

Clinical Breast Examination (CBE):

Mass size was assigned to 1 of 4 categories: (1) less than or equal to 1 cm; (2) 1.1-2.0 cm; (3) 2.1-5.0 cm; (4) greater than 5 cm.
Mass shape was assigned to 1 of 3 categories: (1) vague thickening, (2) round, oblong or lobulated mass, or (3) irregular shape. Mobility was assigned to 1 of 2 categories: (1) mobile or (2) fixed to skin or chest wall. External texture was assigned to 1 of 2 categories: (1) smooth or (2) irregular. Consistency was assigned to 1 of 3 categories: (1) soft, (2) hard or (3) neither.

Mammography:

The surgeon's evaluation of the mammogram in the area of the palpable abnormality was recorded according to the BI-RADS™ classification system as detailed above. Although much has been written about mammogram results in the evaluation of a breast mass, very little has been written about the timing of the mammogram with respect to other aspects of the mass evaluation. Some of the literature points out that mammography must be done before FNAB to avoid false-positive results from possible hematomas precipitated by the aspiration. If the aspiration is done before mammography, a waiting period of 2 weeks to allow for hematoma resolution is suggested [153, 159]. All of the mammograms in this study were done using these guidelines. Few authors have addressed the timing of mammography evaluation in relation to the appearance of a breast mass. Langmuir et al. used a period of 6 months on either side of the time of the FNAB in their study, as did Steinberg et al. [95, 101]. Although admittedly arbitrary, a decision was made to include patients in this study whose mammograms had been done within 4 months before the first presentation of the breast mass or within 6 months afterward.

Fine Needle Aspiration: Internal Consistency and Cytology:

Internal mass consistency refers to the perception of the clinician regarding the way that the mass felt when performing an FNAB.
It was categorized as follows using descriptions by Roberts [119] and Kline [120]: (1) very soft, consistent with adipose tissue; (2) rubbery or soft, not consistent with adipose tissue; (3) either hard and non-gritty, or soft or rubbery but gritty; (4) both hard and gritty. Fine needle aspiration biopsy was performed using a modification of the technique originally introduced by Martin in 1930 [160]. A 22 gauge needle attached to a 10 cc syringe held by an Inrad™ syringe holder was used. The skin was entered once, 3-5 cc of vacuum applied, and aspiration performed with several small strokes in several directions. Vacuum was released and 4 smears were immediately prepared, 2 for Papanicolaou and 2 for Giemsa staining. Two of the slides were smeared and immediately fixed in 95% alcohol and 2 were smeared and air-dried. If there was not enough specimen to prepare 4 slides, the procedure was repeated. The needle and syringe were rinsed in cytopathology fixative and the specimens sent to the pathology department. If needed, the pathologist could retrieve cells by performing a cytospin retrieval on this specimen. Fine needle aspiration cytology was classified using 7 categories: (1) adipose tissue; (2) benign cells; (3) few atypical cells consistent with benign disease; (4) many atypical cells, indeterminate; (5) suspicious for malignancy; (6) malignant; and (7) insufficient. These categories are consistent with those most recently published following a Consensus Conference at the National Institutes of Health in 1996 [123].

Follow-up:

For patients who did not undergo open biopsy after the initial evaluation, a follow-up appointment was routinely scheduled before the patient left the clinic. Routinely, this occurred at 3 months in women who had been examined at the optimum phase of their menstrual cycle. For those who were not at the optimal phase, or in whom the phase of the cycle was not clear, 6 week follow-up was scheduled.
Rarely, 6 month follow-up was scheduled; typically this would be in cases of a clinically diagnosed fibroadenoma. The third and fourth follow-up visits were usually scheduled at 3 month intervals from the previous visit.

Study Setting:

The study was conducted at the Comprehensive Breast Health Clinic (CBHC) at Michigan State University. The CBHC represents a single-site referral center for both physician-referred and self-referred patients attended by 5 surgeons and 1 nurse practitioner. Of the new patients, approximately 85% are physician-referred and 15% are self-referred. Approximately 35% of the clinic population consists of follow-up high-risk patients, patients with a previous history of breast cancer, or women with active problems. The vast majority of the clinic population is drawn from a 30-mile radius around Lansing, Michigan.

Period for Data Collection:

Data collection forms were completed prospectively by the 5 surgeons and 1 nurse practitioner with clinical practices in the CBHC beginning in November 1994 and extending through the end of February 1998. No data was abstracted from the charts until approval was granted by the University Committee on Research Involving Human Subjects. This approval was initially granted on 3/4/96 and renewed for one year's time on 2/20/97.

Selection of Study Population:

Patients who were potentially eligible for the study included all patients presenting to the CBHC who had billing codes for fine needle aspiration biopsy or open biopsy for a mass during the period of the study. The billing office used a computer-generated system of identification to provide the list of potentially eligible women. Codes for 823 FNAB or open biopsy procedures were reviewed to determine patient selection for the study. The medical records were then examined for each patient.
Patients were included in the study if they were female, 30 years of age or older, and met all of the following eligibility requirements:

• Physical examination results recorded by the examining clinician
• Mammogram within 4 months before or 6 months after presentation of the mass
• Fine needle aspiration cytology of adequate cellularity on first or repeat aspiration; or, if hypocellular, consistent with the clinical assessment that adipose tissue was aspirated and represented the mass in question
• Hypocellular findings that were repeated with adequate cellularity for evaluation
• Mass disposition that met one of the following criteria:
    o Open biopsy
    o Mass disappearance within 12 months of follow-up
    o A minimum of 12 months of follow-up by the evaluating clinician with no change in the size or the consistency of the mass

Abstraction of Data from the Medical Record and Database Creation:

Data was collected from the medical records and transferred manually onto a data abstraction form. To assure accuracy of data abstraction, the results of the first 20 charts abstracted by the data abstractor were compared with those abstracted by the PI of the project. A goal of achieving a kappa coefficient of agreement of .9 or better between the abstractor and the PI was set prior to the study according to the suggested study design criteria of Fleiss [161]. The kappa scores for inter-observer agreement between the data abstractor and the PI were .98 using the abstraction forms created. Data was entered into a database created in Lotus Smart Suite Approach software and imported into the SAS® System for analysis.

Definitions:

The mathematical model constructed from the data is termed BREAST AID (Breast Cancer Risk Evaluation and Scoring System to Aid in the Diagnosis of Mammary Masses). To construct it, multivariate logistic regression analysis was performed using SAS® System software version 6.12.
The data was divided into predictor (epidemiologic and clinical) variables and outcome variables. Epidemiologic predictor variables included age, height, and weight; body mass index (BMI, expressed as weight/height²); personal history of breast cancer; reproductive factors (including age at menarche, parity, age at first delivery, number of deliveries, menopausal status, age at menopause); exogenous hormone use (oral contraceptives, estrogen replacement therapy, progesterone replacement therapy), including duration of use; family history of breast cancer (first and second degree relatives) including age at diagnosis and unilateral or bilateral involvement; history of radiation treatment in childhood or young adulthood; and tobacco and alcohol use, including duration and frequency of exposure. Clinical predictor variables included characteristics of the breast mass (size, shape, mobility, external texture, consistency); results of mammography (negative, benign, indeterminate, suspicious, malignant); and fine needle aspiration results (internal consistency of mass: soft, rubbery, hard, gritty; and cytological evaluation: adipose tissue, mild atypia, many atypical cells, suspicious, malignant). Outcome variables included either histologic results at open biopsy or clinical disposition after a minimum of 12 months of follow-up. The gold standard for BREAST AID evaluation was the presence or absence of cancer, defined as ductal carcinoma in-situ, invasive ductal carcinoma, or invasive lobular carcinoma.

Data Analysis and Statistics:

Staged Application of Bayes' Theorem:

Application of Bayes' theorem provides a method to use the pre-test probability of disease and modify it by applying the results of one or more diagnostic tests (in the form of a likelihood ratio) to arrive at a post-test probability of disease.
The rationale for this approach is described in detail by Weinstein and Fineberg [162] and by Sackett et al. [163] and is briefly discussed in the subsequent section entitled "Staged Approach to Model Building". We were interested in modeling the diagnostic prediction for breast cancer in a way as similar as possible to the thinking process that a clinician uses in evaluating a breast mass. To accomplish this, a 3-step modeling process was used. Stage 1 represented the "pre-test" predicted probability for breast cancer based on epidemiologic and historical data gathered during the initial clinical encounter with the patient. Stage 2 represented the probability of breast cancer based on results of CBE and mammography in combination with the epidemiologic and historical assessment from Stage 1. Stage 3 represented the predicted probability for breast cancer based on the FNAB findings in combination with the Stage 2 model results.

Multiple Logistic Regression Modeling:

To begin construction of the mathematical model, initial screening of each predictor variable was done using bivariate logistic regression analysis with breast cancer as the target disorder. Odds ratios (OR) and 95% confidence intervals (CI) were calculated to assess the magnitude of the association between each predictor variable and the outcome variable. Statistical significance was evaluated using the likelihood ratio chi-square (LRCS) statistic. Variables determined to be important a priori (defined as age and menopausal status for this analysis) or those with a p value of 0.30 or less were selected for inclusion in the multivariate logistic regression model building phase. These covariates were entered into the multiple regression model and tested for significance using backwards elimination according to the method discussed by Kleinbaum [164]. Possible two-way interactions were assessed using similar methodology.
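The bivariate screening step can be illustrated with a short sketch. It computes an odds ratio and a log-based (Woolf) 95% confidence interval from a 2 x 2 table; for a single binary predictor this coincides with the Wald interval from a logistic regression, but it is an illustration only and not the SAS procedure used in the study. The function name is hypothetical, the counts are the age-group figures reported in the Results (19 of 115 lesions malignant at age 50 or older, 17 of 221 at age 49 or younger), and the likelihood ratio chi-square p value used for the 0.30 screening threshold is not computed here.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_total, ref_cases, ref_total, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval from a 2 x 2 table.

    Hypothetical helper for illustration; assumes no cell of the table is zero.
    """
    a = exposed_cases                   # exposed, with cancer
    b = exposed_total - exposed_cases   # exposed, without cancer
    c = ref_cases                       # reference group, with cancer
    d = ref_total - ref_cases           # reference group, without cancer
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Age 50 or older (19 of 115 malignant) vs. age 49 or younger (17 of 221 malignant).
or_age, lower_age, upper_age = odds_ratio_ci(19, 115, 17, 221)
print(f"OR = {or_age:.2f}, 95% CI = {lower_age:.2f}-{upper_age:.2f}")
```

An interval that excludes 1.0 corresponds to a Wald test that is significant at roughly the .05 level, which is a stricter standard than the 0.30 threshold used to carry variables forward into model building.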
The final model was the most parsimonious model that was as accurate in prediction for the probability of breast cancer as was possible.

Staged Approach to Model Building:

Stage 1 analyzed the epidemiologic variables in order to define the pre-test odds or likelihood of disease. The results of the multivariate analysis of these variables, referred to as the pre-test probability of disease, represented the probability of breast cancer that a clinician might estimate after evaluation of only the clinical history and risk factor information. Application of Bayes' theorem requires the conversion of the pre-test probability of disease to the pre-test odds of disease using the formula

$$\text{Pre-test odds of disease} = \frac{P(D)}{1 - P(D)}$$

where P(D) = probability of disease.

In Stage 2 the clinical variables from the CBE and mammography results were analyzed in a multivariate logistic regression model. The output of this model represented the probability of breast cancer based on the clinical variables of CBE and mammography alone. In order to apply Bayes' theorem the logistic model was used to generate a likelihood ratio, which was used in combination with the pre-test odds from Stage 1 to generate a post-test odds using the following equation:

$$\underbrace{\text{Pre-test odds}}_{\text{from Stage 1}} \times \underbrace{\text{Likelihood ratio (LR)}}_{\text{from Stage 2}} = \text{Post-test odds}$$

The LR expresses the odds that a given level of a diagnostic test result would be expected in a patient with, as opposed to without, the target disorder [153], and is a common measure of the discriminating ability of a diagnostic test. In this case, the LR from the logistic model represents the likelihood of cancer based on the combination of CBE and mammography results. The post-test odds represents the odds of disease for the epidemiologic (Stage 1) and CBE and mammography (Stage 2) data combined. Stage 3 repeated the steps of Stage 2 by generating a logistic regression model for the FNAB data.
A second likelihood ratio (LR2) representing the odds of breast cancer based on the FNAB variables alone was generated, and then multiplied by the post-test odds from Stage 2 to obtain the final odds for cancer. This final post-test odds was then converted to the final post-test or predicted probability of cancer. A summary of these calculations for the 3 stages is presented below.

Stage 1
• Epidemiologic variables → MODEL 1 → pre-test probability
• Pre-test probability / (1 - pre-test probability) = Pre-test odds

Stage 2
• Clinical variables → MODEL 2 → likelihood ratio 1 (LR1)
• Pre-test odds × LR1 = Post-test odds

Stage 3
• FNAB variables → MODEL 3 → likelihood ratio 2 (LR2)
• Post-test odds × LR2 = Final odds
• Final odds / (1 + final odds) = Final Probability of Cancer

Final Adjustment of the Model based on Disease Prevalence:

The relationship between the probability of disease generated in the multiple logistic regression model and the likelihood ratio (LR) has a logistic form [165], and the regression coefficients of the logistic model can be used to generate an accurate LR according to the following equation [165]:

$$LR = \exp(\alpha_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k)$$

When the prevalence of disease is not exactly 50%, an adjustment must be made based on the prevalence of disease in the population under study in order to generate an accurate LR. This is achieved by adding log[P(D-) / P(D)] to the intercept term of the logistic model according to the methods described by Albert [165] and Reeves [166]:

$$\text{Adjusted intercept} = \alpha_0 + \log\frac{P(D^-)}{P(D)}$$

where P(D-) = the prevalence of non-diseased individuals and P(D) = the prevalence of diseased patients in the population under study.
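The staged calculation summarized above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS code used to build BREAST AID: the function names are hypothetical, and the pre-test probability and likelihood ratios in the worked example are made-up values chosen only to show the arithmetic.

```python
import math


def probability_to_odds(probability):
    """Convert a probability of disease to odds: P(D) / (1 - P(D))."""
    return probability / (1.0 - probability)


def odds_to_probability(odds):
    """Convert odds back to a probability: odds / (1 + odds)."""
    return odds / (1.0 + odds)


def likelihood_ratio(intercept, coefficients, covariates, prevalence):
    """LR from a logistic model whose intercept is adjusted for prevalence.

    The adjustment adds log(P(D-) / P(D)) to the intercept, where
    P(D) is the prevalence and P(D-) = 1 - P(D).
    """
    adjusted_intercept = intercept + math.log((1.0 - prevalence) / prevalence)
    linear_predictor = adjusted_intercept + sum(
        beta * x for beta, x in zip(coefficients, covariates)
    )
    return math.exp(linear_predictor)


def staged_probability(pretest_probability, lr1, lr2):
    """Stage 1 probability -> odds, times LR1 (Stage 2), times LR2 (Stage 3)."""
    pretest_odds = probability_to_odds(pretest_probability)
    posttest_odds = pretest_odds * lr1
    final_odds = posttest_odds * lr2
    return odds_to_probability(final_odds)


# Hypothetical patient: 10% pre-test probability, LR1 = 2.0, LR2 = 3.0.
final_probability = staged_probability(0.10, lr1=2.0, lr2=3.0)
print(f"Final probability of cancer: {final_probability:.2f}")
```

One sanity check on the prevalence adjustment: a model with no covariates has an intercept equal to the log of the prevalence odds, so the adjusted LR is 1 and the test leaves the pre-test odds unchanged.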
The LR equation derived from the logistic model therefore has the following form under these circumstances:

$$LR = \exp\left(\alpha_0 + \log\frac{P(D^-)}{P(D)} + \sum_k \beta_k x_k\right)$$

or

$$LR = \frac{P(D^-)}{P(D)} \exp\left(\alpha_0 + \sum_k \beta_k x_k\right)$$

This represented the final step in our analysis to determine the probability of cancer for each patient based on her unique covariate pattern.

Model Goodness-of-Fit:

Each logistic model was evaluated using the Hosmer-Lemeshow goodness-of-fit chi-square statistic. This statistic assesses how well the model fits the data from which it was generated. It divides the data into groups, calculates expected numbers of cases of disease based on the predicted probabilities from the model, and compares these numbers with observed values [167]. This validation procedure was applied to all of the logistic models generated. A non-statistically significant result indicates that there was insufficient evidence to reject the null hypothesis of adequate goodness-of-fit.

BREAST AID Evaluation:

Once constructed, the test performance of BREAST AID was evaluated using two techniques: (1) measurement of test characteristics and (2) construction of an ROC curve. Test characteristics are defined as the sensitivity, specificity, and negative and positive predictive value of a test based on a 2 x 2 table (see Figure 1 for an example of a 2 x 2 table and Table 7 for definitions of test characteristics). An ROC curve is a graphic representation of the test results, with sensitivity depicted on the "y" axis and 1 - specificity depicted on the "x" axis of the graph. This plot is used to assess various cutoff points for the final probability of cancer and to identify the one that best predicts outcome (see Table 7 and Figure 2 for definitions).

RESULTS

Population Studied:

The study included 279 female patients at least 30 years of age who were seen in the CBHC for a breast mass during the period of the study.
They had 336 masses, occurring synchronously (simultaneously) or metachronously (at different points in time), that met all of the eligibility requirements. There were 616 potentially eligible patients. Reasons for non-inclusion in the study and estimates of the number of lesions by reason include: (1) mammogram not done within the defined period for mammography (N = 36); (2) open biopsy, but no fine needle aspiration done (N = 39); (3) charts not found on at least 3 attempts (N = 27); (4) FNAB hypocellular and not consistent with adipose tissue, not repeated or repeated with persistent inadequacy (N = 157); and (5) FNAB hypocellular on visit 1; patient sent for open biopsy (N = 52). Some of the patients had more than one reason for non-inclusion. In these cases, hypocellular results were tracked in preference to other reasons for non-inclusion. Estimations of reasons for non-inclusion were necessary because at the beginning of the study, specific reasons for exclusion were not tracked on 300 charts, although numbers of ineligible patients were. To account for this, 50% (N = 150) of the ineligible charts were reexamined and reasons for ineligibility listed. Data for the entire group of 300 patients was then estimated based on these results. These results were compared with the charts for which detailed reasons for ineligibility were tracked in the latter half of the study (N = 316). The percentages of reasons for ineligibility in the 300 original charts and the 316 charts from later in the study agreed to within 10% of one another.

Population Characteristics and Results of Bivariate Logistic Regression

Biopsy and Follow-up Rates

The usual practice in the CBHC involves open biopsy for discordant triple diagnosis results (at least one of the three tests with an other-than-benign result) or for patient preference. Using these criteria, 135 of 336 lesions underwent open biopsy.
These included 86 (63.7%) biopsies within 1 month of visit 1, 35 (25.9%) biopsies within 4 months of visit 1, 12 (8.9%) biopsies within 7 months of visit 1, and 2 (1.5%) biopsies within 10 months of visit 1. Thirty-two malignancies were diagnosed after visit 1 and 4 after visit 2 using this approach. One of these four lesions was in a postmenopausal patient who initially declined biopsy at visit 1, even though the surgeon felt it necessary; she consented after the FNAB cytology indicated malignancy. The other 3 malignancies were diagnosed in premenopausal women. Two had a concordant triple diagnosis for benign disease after visit 1 and one had many atypical cells diagnosed on FNAB. All of these were diagnosed at visit 2 with persistent masses that had become more prominent by that visit. All cancers were diagnosed within 3 months of presentation. Visit 2 follow-up periods ranged from 6 weeks to 6 months. Six-week follow-ups were usually scheduled for premenopausal patients for purposes of examination at a more optimum phase of the menstrual cycle, and six-month follow-up for patients diagnosed with clinical fibroadenomas. Otherwise, 3 month follow-up was scheduled.

Table 8 lists the histopathological results for the 135 biopsied lesions and Table 9 compares them with histopathological frequencies from other published series. Thirty-six lesions were diagnosed with cancer: 27 with invasive ductal carcinoma, 5 with invasive lobular carcinoma, and 4 with ductal carcinoma-in-situ. This represents a malignant:benign ratio at open biopsy of 1:2.75 (i.e., 36:99). Ten high-risk lesions were identified, 2 with lobular carcinoma-in-situ and 8 with atypical ductal hyperplasia. Of the benign lesions, 60 represented fibrocystic changes, 21 were fibroadenomas, and the remaining 8 represented miscellaneous benign histopathological findings.
Descriptive and Bivariate Logistic Regression Results

The prevalence of cancer and bivariate logistic regression results for each of the categorical epidemiologic predictor variables are listed in Table 10. The same data for the categorical clinical predictor variables are listed in Table 11. Table 12 shows the same results for predictor variables specified on a continuous scale.

Epidemiologic Variables

All of the data concerning the categorical epidemiologic variables can be found in Table 10. Table 12 includes data on the continuous epidemiologic and clinical variables.

Age

Age ranged between 25 and 84 years, with a mean of 47.6 years. The mean age for the benign lesions was 46.8 years and for the malignant lesions, 53.9 years. Seventeen of 221 lesions were malignant in women 49 years of age or less (7.7% malignancy rate) and 19 of 115 were malignant in women 50 years of age and older (16.5% malignancy rate). The comparison between the two was significant with a p value of .02 (data not shown). When patients were grouped into those less than 40 (reference group), 40-49, 50-59, and greater than 60, the odds ratio for cancer increased sequentially with values of 1.0, 2.1, 3.7, and 4.9, respectively. The 95% confidence intervals for the three non-reference groups were 0.67-9.5, 1.1-17.3, and 1.4-23.2, respectively. This result was significant at the p = .05 level.

Body Mass Index (BMI)

Degree of obesity was measured by the Quetelet, or Body Mass Index (BMI). This was defined by converting height in inches to height in meters and weight in pounds to weight in kilograms. The formula weight/height² was applied to calculate the BMI. The mean BMI for those women with cancer was 26.1 kg/m², compared to 24.6 kg/m² in the group without cancer. For purposes of data analysis, the BMI was grouped into those above and below the mean of 24.73 kg/m² (range 16.7-58.5 kg/m²), with an associated odds ratio of 2.09 (95% CI = 0.75-5.34) for those above the mean. This was significant at the p = .04 level.
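The BMI derivation described above (unit conversion followed by weight/height²) can be written out as a minimal sketch. The function names and the example patient are hypothetical; the conversion factors are the standard definitions of the inch and the pound, and the 24.73 kg/m² cutoff is the population mean reported above.

```python
INCHES_TO_METERS = 0.0254           # exact definition of the inch
POUNDS_TO_KILOGRAMS = 0.45359237    # exact definition of the pound


def body_mass_index(height_inches, weight_pounds):
    """Quetelet index: weight in kilograms divided by height in meters squared."""
    height_m = height_inches * INCHES_TO_METERS
    weight_kg = weight_pounds * POUNDS_TO_KILOGRAMS
    return weight_kg / height_m ** 2


def above_mean_bmi(bmi, cutoff=24.73):
    """Dichotomize BMI at the study population mean (24.73 kg/m^2)."""
    return bmi > cutoff


# Hypothetical patient: 64 inches tall, 150 pounds.
bmi = body_mass_index(64, 150)
print(f"BMI = {bmi:.1f} kg/m^2, above the mean: {above_mean_bmi(bmi)}")
```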
Personal History of Breast Cancer

Only nine patients indicated a history of previous breast cancer, and one of these was diagnosed with a new primary cancer in this study. There was no significant difference in the frequency of a personal history of breast cancer between the two groups (OR = 1.04, 95% CI = 0.06-5.93, p = .97).

Age at Menarche

The average age of onset of menarche for women with breast cancer was 12.5 years, compared with 12.9 years for those without malignancy. Age at menarche was categorized into those less than 12 years and 12 years or more. Bivariate analysis showed that women whose age at menarche was less than 12 years were at higher risk of breast cancer (OR = 1.64, 95% CI = 0.69-3.60). The p value of .25 was not significant, but met the criteria for inclusion in the construction of the multivariate model.

Menopausal Status

Women were classified as premenopausal if they had had any menstrual activity within the previous 12 months and were not on hormone replacement therapy, or if they had a history of hysterectomy alone or a hysterectomy and unilateral oophorectomy and were less than or equal to 50 years of age. Women were classified as postmenopausal if they had had no menstrual activity for 12 months or if they were on hormone replacement therapy, if they had a history of bilateral oophorectomy, or if they had a history of hysterectomy alone or hysterectomy and unilateral oophorectomy and were greater than 50 years of age. Fifty years of age was chosen as the cut-off age for menopause because 90% of the population that knew their age at the onset of menopause had reached menopausal status by that time. This approach is the same as that used in the Nurses' Health Study [163]. Breast cancer was diagnosed in 6.7% of the premenopausal patients and 18.8% of the postmenopausal patients who presented with a breast mass.
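The classification rule described above can be expressed as a small decision function. This is a sketch under assumptions: the function and parameter names are hypothetical, and the order in which the criteria are checked (bilateral oophorectomy and hormone replacement first, then the age cutoff for women with a hysterectomy, then menstrual history) is one reasonable reading of the text rather than a rule the study states explicitly.

```python
def classify_menopausal_status(age, menses_within_12_months, on_hrt,
                               hysterectomy=False, bilateral_oophorectomy=False):
    """Classify a patient as 'premenopausal' or 'postmenopausal'.

    Assumed precedence (not stated in the source):
    1. bilateral oophorectomy or hormone replacement therapy -> postmenopausal
    2. hysterectomy with at least one ovary retained -> age cutoff of 50
    3. otherwise, menstrual activity within the previous 12 months decides
    """
    if bilateral_oophorectomy or on_hrt:
        return "postmenopausal"
    if hysterectomy:
        # Menstrual history is uninformative after hysterectomy, so age is used.
        return "postmenopausal" if age > 50 else "premenopausal"
    if menses_within_12_months:
        return "premenopausal"
    return "postmenopausal"


# Hypothetical examples.
print(classify_menopausal_status(45, menses_within_12_months=True, on_hrt=False))
print(classify_menopausal_status(51, menses_within_12_months=False, on_hrt=False,
                                 hysterectomy=True))
```

Writing the rule as code makes the edge cases visible, for example a woman aged 50 or younger with a hysterectomy who is also on hormone replacement therapy, which the prose definition does not resolve on its own.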
Menopausal status was strongly associated with the risk of breast cancer (OR = 3.2, 95% CI = 1.60-6.60), with a p value of less than .01.

Age at Menopause

Although 112 women were classified as postmenopausal, only 97 knew the exact age at which they reached menopause. The average age at menopause for the 97 women in whom it could either be determined or estimated was 46.4 years. The average age at menopause was 46.4 years for the group without cancer and 46.5 years for the malignant group. Sixty-nine of the women underwent natural menopause and 91% of these women had menopause at or before the age of 50. Sixty-four women had a surgical procedure that resulted in cessation of menses; 30 had a bilateral oophorectomy while 34 had at least one ovary left intact. Thirty-one of these procedures were done before the age of 49 and 34 were done at age 50 or over. Women who had reached menopause at or after age 50 were compared to those who had reached menopause before age 50. The OR of 1.37 (95% CI = 0.44-4.17) was not significant and did not reach the criteria for inclusion in the multivariate model (p = .58).

Reproductive Factors - Pregnancy Status, Number of Pregnancies, Number of Deliveries, and Age of First Delivery

The frequency of nulliparity in our population was 12.5%. Only 11% of the women with cancer were nulliparous. Comparison of this group to a group who had delivered at least one child yielded an OR of 0.86 (95% CI = 0.25-2.32), which was not significant and did not reach the criteria for inclusion in the multivariate model (p = .78). The average number of pregnancies did not differ between the two groups (mean = 2.8). When the number of pregnancies was categorized into one, two, and three or more pregnancies, there was again no significant relationship with cancer risk. Similar results were obtained for number of deliveries. The average number of deliveries for the benign group was 2.3 and for the malignant group, 2.5.
When comparing groups who had had 2 or 3 or more deliveries with those who had only one, there was no significant relationship with cancer risk. The average age of first delivery for the benign group was 24.7 years and for the malignant group, 24.2 years. Age of first delivery comparisons into groups of ages <19, 20-24, 25-29, and >29 also yielded OR that were not significant when tested by bivariate analysis; the p value of 0.78 did not meet the criteria for testing in the multivariate model.

Oral Contraceptive Use

In this study, 78.3% of the patients were ever users of oral contraceptives, but only 8 (0.4%) of these patients were current users. Ever as compared with never use of oral contraceptives was highly significantly associated with cancer risk, with a p value of <.01. However, the direction of the association was the reverse of that which would have been predicted based on past literature (OR for ever vs. never use 0.29, 95% CI = 0.14-0.61). This finding was hypothesized to represent confounding by age due to the cohort effect, and this hypothesis was strengthened by the finding that duration of use had no effect on outcome. The mean duration of oral contraceptive use in the cancer patients was 5.3 years, compared with 5.7 years in the benign group. Bivariate OR estimates for 3 months to <5 years of use vs. 5-10 years vs. >10 years were 1.0, 0.89, and 0.69, respectively, with non-significant confidence intervals (p = .82).

Estrogen Replacement Use

In this study, 82.1% of the postmenopausal patients were past or current users of estrogen replacement therapy (94 current and 20 past users). Comparison of past or current use with never use demonstrated respective OR of 0.46 and 0.67 and respective 95% CI of 0.06-2.55 and 0.22-2.32. These associations were not statistically significant (p = .69). The average duration of estrogen replacement use was 6.7 years for the benign group and 10.6 years for the malignant group.
Duration of estrogen replacement use demonstrated an OR of 2.1 (95% CI = 0.71-7.12) when comparing <5 years of use vs. 5 or more years (p = .14). This variable met the criteria for testing in the multivariate model. Because of this, ERT use (never, past, current) was included in the model as well.

Progestin Replacement Use

In this study, 58% of the postmenopausal patients were past or current users of progestin replacement therapy (47 current and 18 past users). Comparison of past and current use with never use demonstrated respective OR of 1.62 and 0.74 and respective 95% CI of 0.45-5.65 and 0.24-2.18. These values were not statistically significant (p = .51), and progestin use was not included in the multivariate model. The average duration of progestin replacement use was low: 3.3 years for the benign group and 5.5 years for the malignant group. Duration of progestin replacement use demonstrated an OR of 1.5 (95% CI = 0.40-5.48) when comparing <5 years of use vs. 5 or more years (p = .53). This was not significant and this variable was not tested further.

Family History of Breast Cancer, Age at Diagnosis, and Laterality of Disease

In the overall population, the frequency of a family history of breast cancer in a first and/or second-degree relative was 45%, and the frequency of at least one affected first-degree relative was 21.1%. The magnitude of the association (OR) was 1.42 (95% CI = 0.40-5.48) for any family history, 1.76 (95% CI = 0.79-3.71) for a family history in a first-degree relative, 1.45 (95% CI = 0.41-4.07) for family history in a sister, and 2.28 (95% CI = 0.95-5.08) for family history in the mother. The associated p values were less than .3 only for family history in a first-degree relative and for history in the mother. Because one of these variables is a subset of the other, they could not be tested simultaneously in the multivariate model; the variable with the strongest magnitude of association was chosen, and history in the mother was tested (p = .06).
Forty-five women knew the age of the mother's diagnosis. The mean age of diagnosis in the benign group whose mother had breast cancer was 55, compared to 53 for the group with cancer. Comparison of those diagnosed at greater than age 50 years with those diagnosed at 50 years of age or less demonstrated an OR of 1.96 with a p value of .39. Forty-six women knew if their mothers were diagnosed with unilateral or bilateral disease. Seven had bilateral breast carcinoma. Comparison of this group with those of unilateral disease demonstrated an OR of 4.13 (95% CI = 0.69-24.00). Because of small numbers of patients, this was not statistically significant but did reach the criteria for inclusion in the multivariate model (p = .12).

History of Radiation Exposure

There was a history of radiation exposure in childhood or early adulthood in 4.4% of the population. Cancer was diagnosed in 9.8% of the population without the exposure and 28.6% of the population with the exposure (OR = 3.67, 95% CI = 0.96-11.72, p = .06).

Alcohol and Tobacco Exposure

Current alcohol use was compared with no use, demonstrating an OR of 0.64 (95% CI = 0.32-1.32) and a non-significant p value of .22. This did meet the criteria for testing in the multivariate model, although the direction of the association was the reverse of that expected based on the majority of the epidemiologic literature. Alcohol dosage and frequency of use (number of drinks/week) were not significant in the bivariate model, nor was tobacco exposure when tested for ever vs. never use or number of pack years of cigarette smoking. Neither was tested further.

Clinical Variables

All of the data concerning the categorical clinical variables can be found in Table 11. Table 12 contains data concerning the continuous clinical and epidemiologic variables.

Mass Size

Mass size was documented for 330 of 336 lesions (98.2%). The mean mass size was 1.6 cm; 1.5 cm for the benign group and 2.4 cm for the malignant group.
Mass size was categorized into <= 1 cm (N = 129), 1.1-2.0 cm (N = 150), 2.1-5.0 cm (N = 46), and > 5 cm (N = 5). The frequencies of cancer for the respective categories were 4.7%, 10.7%, 23.9%, and 60%. The latter three categories were compared with those 1 cm in size or less using bivariate logistic regression. Corresponding OR were 2.45, 6.44, and 30.75, respectively, and respective 95% CI were 0.97-7.0, 2.29-19.9, and 4.36-271.2. These values yielded a highly significant p value of <.01.

Mass Shape

Mass shape was available for 317 lesions, determined by either written description or a picture representation from the chart. Shape was categorized as a vague thickening (N = 147), a round or lobulated mass (N = 156), or an irregular mass (N = 14). Cancer frequencies for the respective categories were 6.1%, 13.5%, and 42.9%. Using round or lobulated masses as the referent category, odds ratios of 0.42 (95% CI = 0.18-0.92) for vague thickenings and 4.82 (95% CI = 1.46-15.30) for irregular masses were obtained. These were associated with a significant p value of <.01.

Mass Mobility

Masses were classified as either fixed (if they were fixed to the chest wall or to skin) or mobile. Three hundred thirty-five of 336 lesions were mobile. The one lesion that was fixed represented cancer, as expected. Because of the highly skewed distribution, this variable was not analyzed further.

Mass Consistency

Data for this variable were available for only 29% of the lesions. Its distribution was also highly skewed. None of the lesions categorized as soft represented cancer, whereas all of the lesions that were categorized as hard were malignant. However, of the 97 lesions with data available, 84.5% were neither hard nor soft on palpation. For these reasons, the variable was not analyzed further.

Mass External Texture

Data were available for the impression of external texture for 318 lesions.
Of the 41 lesions classified as irregular (12.9%), 17 (41.5%) were in the cancer group, whereas only 19 of the 277 smooth lesions had cancer (6.9%). A highly significant association between breast cancer and external texture was found (OR = 9.62, 95% CI = 4.42-21.08, p < .01).

Mammogram Results

Mammogram results were available for all 336 lesions. The distribution of the five categories of mammogram readings was as follows:

Category 1: Negative - 56.0%
Category 2: Benign - 32.1%
Category 3: Indeterminate - 7.4%
Category 4: Suspicious - 2.4%
Category 5: Malignant - 2.1%

The frequency of cancer in each of these categories was:

Category 1: Negative - 1.6%
Category 2: Benign - 7.4%
Category 3: Indeterminate - 44%
Category 4: Suspicious - 87.5%
Category 5: Malignant - 100%

For purposes of further analysis, Category 4 (Suspicious) and Category 5 (Malignant) readings were combined. The three remaining categories were compared with those classified as Category 1 using bivariate logistic regression. Corresponding OR were 4.93 (95% CI = 1.39-22.90), 48.45 (95% CI = 13.42-233.64), and 863.33 (95% CI = 122.49-999.00), respectively, yielding a highly significant p value of <.01.

Internal Consistency on FNAB

Internal consistency refers to the perception of the consistency of the mass to the examiner during FNAB. Internal consistency results were available for 336 lesions at visit 1. The distribution of the four original categories of internal consistency was as follows:

1 - soft, consistent with adipose tissue - 14.3% (N = 48)
2 - soft (not consistent with adipose tissue) or rubbery - 70.0% (N = 235)
3 - hard (not gritty), or soft or rubbery with some grittiness - 14% (N = 47)
4 - hard and gritty - 1.8% (N = 6)

All of the 48 lesions in category 1 were benign, whereas all 6 in category 4 were malignant.
Soft (not consistent with adipose tissue) or rubbery lesions were found to represent cancer 6% of the time, and lesions in the category of hard (not gritty) or soft or rubbery with some grittiness, 34% of the time. For purposes of further analysis, categories 1 and 2 were combined, used as the referent category, and compared with the combination of categories 3 and 4. The corresponding OR was 13.64 (95% CI = 6.24-29.98). This result was highly significant, with a p value of <.01.

FNAB Cytology Results

FNAB cytology results were available for 317 lesions at visit 1. Nineteen cases had FNAB readings that were hypocellular and in which the internal consistency was not consistent with adipose tissue. Repeat aspirations were done for these lesions at visit 2 but no further malignant cases were diagnosed. For purposes of analysis, the visit 1 cases were examined. The distribution of the six categories of FNAB cytology readings was as follows:

1 - adipose tissue - 11.4% (N = 36)
2 - benign epithelial cells - 63.1% (N = 200)
3 - few atypical cells - 11.4% (N = 36)
4 - many atypical cells - 5.0% (N = 16)
5 - suspicious for malignancy - 4.1% (N = 13)
6 - malignant - 5.0% (N = 16)

Cancer was associated with each of these categories with the following frequencies: adipose tissue - 0%; benign epithelial cells - 3.5%; few atypical cells - 8.3%; many atypical cells - 18.8%; suspicious for malignancy - 53.9%; malignant - 100%. For purposes of further analysis, the first and second categories were combined, as were the fifth and sixth categories. Using the adipose tissue/benign cases as the referent category, the remaining categories were tested for association with cancer using bivariate logistic regression. Corresponding OR were 2.97 (95% CI = 0.62-11.29) for the category read as few atypical cells, 7.55 (95% CI = 1.50-30.85) for many atypical cells, and 125.41 (95% CI = 41.79-444.75) for suspicious/malignant readings. This was highly significant, with a p value of <.01.
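As a check on the cytology odds ratios, the point estimates can be reproduced from the category counts. The sketch below is illustrative only: the cancer counts per combined category (7 of 236, 3 of 36, 3 of 16, 23 of 29) are back-calculated here from the reported percentages, and the function and variable names are hypothetical. It computes unadjusted odds ratios and makes no attempt to reproduce the confidence intervals, which came from the logistic regression software.

```python
def odds_ratio(cases, total, ref_cases, ref_total):
    """Unadjusted odds ratio of one category against a referent category."""
    odds = cases / (total - cases)
    ref_odds = ref_cases / (ref_total - ref_cases)
    return odds / ref_odds


# (cancers, lesions) per combined category; counts are back-calculated from
# the reported percentages, which is an assumption of this sketch.
REFERENT = (7, 236)  # adipose tissue + benign epithelial cells
CATEGORIES = {
    "few atypical cells": (3, 36),
    "many atypical cells": (3, 16),
    "suspicious/malignant": (23, 29),
}

cytology_or = {
    name: odds_ratio(cases, total, *REFERENT)
    for name, (cases, total) in CATEGORIES.items()
}

for name, value in cytology_or.items():
    print(f"{name}: OR = {value:.2f}")
```

The three values agree with the reported odds ratios of 2.97, 7.55, and 125.41 to within rounding.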
Results of Multivariate Logistic Regression Analysis

The results of multivariate testing are presented according to the 3 stages used in model building. There were no significant two-way interactions encountered in the modeling process.

Stage 1

Because menopausal status was so strongly predictive of malignancy in the bivariate model, we considered premenopausal and postmenopausal women separately in the analysis of the epidemiologic variables.

Premenopausal Women

The epidemiologic variables that passed the initial criterion of a p value of 0.3 or less using bivariate logistic regression for the premenopausal women included age, weight, body mass index, age at menarche, oral contraceptive use, family history of breast cancer in the mother, history of radiation exposure, and current alcohol use. After multivariate analysis of these epidemiologic variables related to the 223 lesions and 15 cancers in premenopausal women, none was found to be significant in predicting for breast cancer. The goal of Stage 1 was to establish the pre-test probability of disease, which, in this case, represents the closest estimate of the likelihood of breast cancer prior to the application of a diagnostic test. Because the pre-test probability of breast cancer for the premenopausal women in the study could not be predicted from epidemiologic variables per se, the best estimate of it was through the use of disease prevalence. Because disease prevalence is age-dependent, we used age groups <40 years (N = 65), 40-44 years (N = 74), and >44 years (N = 84) to calculate prevalence in our study population and establish the pre-test probability of breast cancer. The associated probabilities (prevalence) by age group were 4.6%, 6.8%, and 8.3%, respectively.

Postmenopausal Women

For the postmenopausal women, the same variables were tested as in the premenopausal women, with the addition of estrogen replacement therapy use and duration.
Multivariate analysis of the epidemiologic variables in the 112 lesions and 21 cancers in this group demonstrated two to be significant: family history in the mother and a history of radiation exposure. The final model obtained for the postmenopausal women is summarized below.

Table 13: Logistic Regression Model of the Risk of Breast Cancer for 112 Postmenopausal Women Based on Epidemiologic Variables

Variable                 B       SE B    OR      95% CI         Wald p Value
Intercept                -2.44   2.00
Age (years)              0.01    0.03    1.01    0.95, 1.08     0.77
History in Mother
  No (referent)          -       -       1.00    -              -
  Yes                    1.51    0.69    4.52    1.15, 17.85    0.03
History of Radiation
  No (referent)          -       -       1.00    -              -
  Yes                    2.23    0.99    9.27    1.34, 79.19    0.02

The associated pre-test probability of malignancy for each covariate pattern, based on the presence or absence of these factors, ranged between 11% and 66%.

Results of Hosmer-Lemeshow Goodness-of-Fit Chi Square: Stage 1

The GOFCS was generated for the 103 postmenopausal women. The pre-test probabilities were ranked in ascending order and then grouped into quartiles. The groups were not equal because many patients shared the same pre-test probability result, especially in the lower half of the probability rankings. The number of predicted and observed cases of breast cancer for each quartile of predicted probability or risk is summarized below.

Table 14: Observed and Expected Frequencies within Quartiles of Risk: Stage 1*

                     Cancer = 1               Cancer = 0
Group#    Total      Observed    Expected+    Observed    Expected+
1         41         4           4.6          37          36.4
2         30         5           4.9          25          25.1
3         13         3           2.4          10          10.6
4         19         9           9.0          10          10.0

* Postmenopausal women only
# Quartiles were created by ranking the pre-test probabilities and then dividing them into sequential groups
+ The expected frequencies were calculated by summing the pre-test probabilities for all women within each grouping

There is good agreement between the number of observed and predicted cases of cancer for each quartile. The GOFCS was 0.26 (df = 2, p = 0.88).
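The goodness-of-fit statistic can be approximately reproduced from the rounded entries of Table 14. The following is a minimal sketch, not the SAS procedure used in the study: it assumes the usual Hosmer-Lemeshow form (squared observed-minus-expected differences divided by expected, summed over both outcome columns of every group, with degrees of freedom equal to the number of groups minus 2). The helper names are hypothetical, and the p value uses a closed form of the chi-square upper tail that is valid only for even degrees of freedom.

```python
import math

# Rows of Table 14:
# (observed cancers, expected cancers, observed benign, expected benign)
TABLE_14 = [
    (4, 4.6, 37, 36.4),
    (5, 4.9, 25, 25.1),
    (3, 2.4, 10, 10.6),
    (9, 9.0, 10, 10.0),
]


def hosmer_lemeshow(rows):
    """Sum (observed - expected)^2 / expected over both outcomes of each group."""
    return sum(
        (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
        for obs1, exp1, obs0, exp0 in rows
    )


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; this closed form needs an even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


gof_stat = hosmer_lemeshow(TABLE_14)
gof_df = len(TABLE_14) - 2  # number of groups minus 2
gof_p = chi2_upper_tail_even_df(gof_stat, gof_df)
print(f"chi-square = {gof_stat:.2f}, df = {gof_df}, p = {gof_p:.2f}")
```

Run on the table as printed, this gives a statistic of about 0.27 with p of about 0.87, close to the reported 0.26 and 0.88; the small difference is presumably due to rounding of the expected frequencies in the table.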
The model therefore fit the data well and appears to provide an accurate estimate of the probability of breast cancer for postmenopausal women. From a diagnostic perspective, the results from Stage 1 make it clear that epidemiologic data alone are not able to identify those women with a breast mass who might be safely triaged to follow-up, as evidenced by the distribution of cancer cases across all quartiles of predicted probability.

Final Step in Stage 1: Evaluation of Pre-test Probabilities

Data from Stage 1 modeling were merged for the premenopausal and postmenopausal women and used to set the pre-test (or prior) probability of disease for application to the Stage 2 model. The final pre-test probabilities for breast cancer ranged between 4.6% and 66% based on Stage 1 results, with an average of 11.0%. The mean pre-test probability was 20.2% for the cancer patients and 9.9% for the patients without cancer. However, the range of pre-test probabilities was similar between the two groups: 4.6%-52.2% for the non-malignant group and 4.6%-66.4% for the cancer patients.

Stage 2

Stage 2 involved analysis of the clinical variables obtained from the CBE and mammogram that passed the initial screening criteria. Age was included in the analysis regardless of statistical significance as an 'a priori' confounder. Specific candidate variables tested included mass size, mass shape, mass mobility, mass external texture, and mammogram results. Because a Category 5 (malignant) mammogram was associated with malignancy 100% of the time, it was necessary to group Category 4 (suspicious) and Category 5 mammograms into one category in the multivariate analysis. Multivariate analysis of the clinical variables was performed on the 330 lesions that had all data available. The 6 missing cases lacked a specific mass size on CBE. After stepwise backwards deletion of non-significant variables (p > 0.05), only age, mass size, and mammogram results remained in the final model.
The final model obtained for the clinical variables is summarized below.

Table 15: Logistic Regression Model of the Risk of Breast Cancer in 330 Breast Masses Based on CBE and Mammography Results

Variable                 B       SE B    OR        95% CI           Wald p Value
Intercept                -7.91   1.61
Age                      0.06    0.03    1.07      1.02, 1.12       0.01
Size (centimeters)
  <= 1.0 (referent)      -       -       1.00      -                -
  1.1-2.0                0.70    0.65    2.01      0.59, 7.92       0.28
  2.1-5.0                1.30    0.79    3.68      0.78, 18.10      0.10
  > 5                    2.75    1.34    15.68     0.94, 227.30     0.04
Mam Findings
  Normal (referent)      -       -       1.00      -                -
  Benign                 1.49    0.71    4.43      1.20, 21.11      0.03
  Indeterm               4.08    0.75    59.39     15.20, 310.85    <0.01
  Susp/Malig             6.00    1.22    402.40    52.96, 999.00    <0.01

The associated post-test probability of malignancy for each covariate pattern, based on the presence or absence of these factors, ranged between 0.28% and 98.07%.

Results of Hosmer-Lemeshow Goodness-of-Fit Chi Square: Stage 2

The GOFCS was generated for Stage 2 in a similar manner to that described for the Stage 1 results. In this case, however, the post-test probabilities were ranked in ascending order and then grouped into nine groups of approximately equal size according to their post-test probability rankings. These groups were chosen by the SAS program as the optimal number for GOFCS testing. The number of predicted and observed cases of breast cancer for each of the nine groups of risk is summarized below.
Table 16: Observed and Expected Frequencies within Groups of Risk: Stage 2

                     Cancer = 1               Cancer = 0
Group#    Total      Observed    Expected+    Observed    Expected+
1         42         0           0.17         42          41.8
2         33         0           0.28         33          32.7
3         33         0           0.44         33          32.6
4         36         1           0.57         35          35.4
5         32         1           0.83         31          31.2
6         43         2           1.56         41          41.4
7         40         4           2.49         36          37.5
8         33         3           4.67         30          28.3
9         38         25          24.99        13          13.0

# Groups were created by ranking the post-test probabilities and then dividing them into sequential groups of approximately equal size
+ The expected frequencies were calculated by summing the post-test probabilities for all women within each grouping

There is good agreement between the number of observed and predicted cases of cancer for each group. The GOFCS was 3.07 (df = 7, p = 0.88). The model for Stage 2 fit the data well and appears to provide an accurate estimate of the probability of breast cancer for this data set.

Merging of Data and Modification of Final Post-Test Probability and Likelihood Ratio for Stage 2

The output generated by the models during Stage 1 and Stage 2 was merged so that both epidemiologic data and data related to mass size and mammogram results could be evaluated. This merge, as outlined in the Methods section, was done because we were interested in the ability of the model to predict for breast cancer in women with a breast mass evaluated only by history, physical exam, and mammogram, as is done in many primary care and surgical offices. After the data were merged, the likelihood ratio generated from the merged dataset was adjusted based on the proportion of cases and controls in the dataset, as described in the Methods section. The mean adjusted likelihood ratio for the merged dataset was 2.93, with a range from .001 to 374.88. The mean adjusted post-test probability was 10.66%, with a range between 0.10% and 99.71%.
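The general relationship between a pre-test probability, a likelihood ratio, and a post-test probability can be sketched as follows. This is the standard odds form of Bayes' theorem and is offered only as an illustration of how the quantities connect; it is not necessarily the exact adjustment procedure described in the Methods section, and the function name is hypothetical. The example pairs the 4.6% prevalence for premenopausal women under 40 with the mean adjusted likelihood ratio of 2.93 purely to show the arithmetic, not to reproduce any individual patient's result.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Illustrative pairing only (an assumption of this sketch): prevalence for
# premenopausal women under 40 and the mean adjusted likelihood ratio.
example_posttest = posttest_probability(0.046, 2.93)
print(f"post-test probability = {example_posttest:.1%}")
```

A likelihood ratio of 1 leaves the pre-test probability unchanged, while ratios far below 1 (such as the .001 at the low end of the reported range) drive the post-test probability toward zero.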
Test Characteristics for Merged Dataset Based on Stages 1 and 2

Various cut-points for triage to biopsy were explored by calculating sensitivity, specificity, and positive and negative predictive values using the combination of Stage 1 and Stage 2 data. The best test sensitivity (97.2%) was obtained using a post-test probability cut-point of 1%. However, the corresponding specificity was only 37% and 214 of 320 lesions would have been triaged to biopsy. One cancer would have been missed. The most efficient cut-point in test performance, using the criterion of the maximum average of sensitivity and specificity combined, was a post-test probability of 4%. Using this value, the sensitivity was 86.1%, the specificity was 74.7%, the positive predictive value was 30.4%, and the negative predictive value was 97.7%. Of the 320 lesions, 102 would have been triaged to biopsy for a malignant:benign ratio of 1:1.83. However, this level of efficiency came at the price of 5 of 36 missed cancers (13.9%).

Stage 3

Stage 3 involved analysis of the data obtained from FNAB: internal consistency and cytology. Age was again included in the analysis as an 'a priori' confounder. Both FNAB variables passed the initial screening criteria and were entered into the multivariate model. The FNAB variable of internal consistency originally consisted of 4 levels (1 - adipose tissue, 2 - soft or rubbery, 3 - gritty with associated soft or rubbery findings, 4 - hard and gritty). These categories needed combining for entry into the multivariate model: the adipose tissue category represented a 100% benign distribution and the hard and gritty category was 100% malignant. Because of this, any hard finding, or any gritty finding (whether associated with soft, rubbery, or hard findings), was combined into a single category and was compared with another category that included adipose tissue, soft, or rubbery findings.
In addition, the six categories of FNAB cytology were combined into four for the same reasons. The new combined categories were: 1 - adipose tissue or benign cytology; 2 - few atypical cells; 3 - many atypical cells; and 4 - suspicious or malignant cytology. Data were available for all 336 lesions for Stage 3 multivariate analysis. Both internal consistency and cytology remained as predictors of malignancy in the final multivariate model. The final model obtained for the FNAB data is summarized below.

Table 17: Logistic Regression Model of the Risk for Breast Cancer in 336 Breast Masses Evaluated with FNAB

Variable                             B       SE B    OR       95% CI           Wald p Value
Intercept                            -5.53   1.22
Age                                  0.03    0.02    1.03     0.99, 1.09       0.15
Internal consistency
  Adipose/soft/rubbery (referent)    -       -       1.00     -                -
  Hard and/or gritty                 1.64    0.54    5.15     1.75, 14.95      <0.01
FNAB Cytology
  Benign (referent)                  -       -       1.00     -                -
  Few atypical                       1.01    0.77    2.73     0.51, 11.39      0.19
  Many atypical                      2.40    0.79    11.08    2.05, 49.90      <0.01
  Susp/Malig                         4.22    0.63    68.26    20.89, 255.97    <0.01

The associated post-test probability of malignancy for each covariate pattern, based on the presence or absence of these factors, ranged between 0.57% and 95.97%, with a mean of 10.71%.

Results of Hosmer-Lemeshow Goodness-of-Fit Chi Square: Stage 3

The GOFCS was generated for Stage 3 in a similar manner to that described for the Stage 1 and 2 results. In this case, however, the post-test probabilities were ranked in ascending order and then grouped into eight groups according to their post-test probability rankings. The groups were not equal because many patients shared the same covariate pattern and thus the same post-test probability result. These groups were chosen by the SAS program as the optimal number for GOFCS testing. The number of predicted and observed cases of breast cancer for each of the eight groups of risk is summarized below.
Table 18: Observed and Expected Frequencies within Groups of Risk: Stage 3

                     Cancer = 1               Cancer = 0
Group#    Total      Observed    Expected+    Observed    Expected+
1         48         0           0.27         48          47.73
2         30         0           0.43         30          29.57
3         107        2           1.73         105         105.27
4         12         0           0.29         12          11.71
5         51         2           2.11         49          48.89
6         37         2           2.74         35          34.26
7         34         15          13.01        19          20.99
8         17         15          15.41        2           1.59

# Groups were created by ranking the post-test probabilities and then dividing them into sequential groups
+ The expected frequencies were calculated by summing the post-test probabilities for all women within each grouping

There is good agreement between the number of observed and predicted cases of cancer for each group. The GOFCS was 1.88 (df = 6, p = 0.93). The model for Stage 3 fit the data well and appears to provide an accurate estimate of the probability of breast cancer for this data set.

Merging of Data and Generation of Final Post-Test Probabilities

The model generated during Stage 3 was used in sequence with the combined data generated at the end of Stage 2 so that the contribution of FNAB results to the combined model could be evaluated. Three hundred and twenty lesions had all data necessary for calculations for all 3 models. The likelihood ratio generated from the final dataset was adjusted based on the proportion of cases and controls in the dataset. The mean adjusted likelihood ratio for the final dataset was 169.61, with a range from .00004 to 2,113.24. The mean final post-test probability was 10.12%, ranging from 0.004% to 99.99%.

Test Characteristics for BREAST AID: Final Merged Dataset

Various cut-points of the final post-test probabilities were explored by calculating sensitivity (Se), specificity (Sp), positive predictive value (PPV) and negative predictive value (NPV) for each cut-point.
A classification table demonstrating the values follows:

Table 19: Test Characteristics for BREAST AID by Final Probability of Cancer

Final Probability                                            # Biopsies    % Missed
>= (%)               Se (%)    Sp (%)    PPV (%)   NPV (%)   Needed        Cancers
95                   55.6      100       100       94.7      20            44.4
90                   55.6      100       100       94.7      20            44.4
80                   58.3      99.6      95.5      95.0      21            41.7
70                   58.3      99.2      91.3      94.9      23            41.7
60                   61.1      99.2      91.7      95.2      24            38.9
50                   66.7      99.2      92.3      95.9      26            33.3
40                   77.8      98.6      87.5      97.2      32            22.2
30                   80.6      98.6      87.9      97.6      33            19.4
20                   86.1      97.2      79.5      98.2      39            13.9
10                   86.1      95.1      68.9      98.2      45            13.9
5                    94.4      89.8      54.0      99.2      63            5.6
4                    94.4      88.4      50.7      99.2      67            5.6
3                    97.2      85.6      46.1      99.6      76            2.8
2                    97.2      80.3      38.5      99.6      91            2.8
1                    97.2      74.3      32.4      99.5      108           2.8
.5                   97.2      62.0      26.3      99.4      133           2.8
.4                   97.2      56.3      22.0      99.4      159           2.8
.3                   97.2      50.3      19.9      99.3      176           2.8
.2                   97.2      37.0      16.4      99.1      214           2.8
.1                   100       26.4      14.7      100       245           0

These cut-off points were plotted and an ROC curve constructed. An ROC curve is simply a graphic means of plotting sensitivity against 1 - specificity. It is used to assess the ability of a test to distinguish between, in this case, cancer and benign disease. Figure 2 illustrates an ROC curve for a perfect test (solid line) and a test with no discriminating ability (dotted line). Figure 3 illustrates the ROC curve constructed for BREAST AID. In general, the point on an ROC curve that is closest to the upper left-hand corner of the graph represents the most desirable cut-point for a test (assuming that the "costs" of false negative and false positive results are equal). The most efficient cut-point in terms of maximizing sensitivity and specificity in test performance for BREAST AID was 3%. Using this value, the sensitivity was 97.2%, specificity was 85.6%, positive predictive value was 46.1%, and negative predictive value was 99.6%. Of the 320 lesions, 76 would have been triaged to biopsy and 35 cancers would have been detected, for a malignant:benign biopsy ratio of 1:1.17. One malignancy was missed with this approach.
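The test characteristics at the 3% cut-point follow directly from the four counts given in the text (320 lesions, 36 cancers, 76 lesions triaged to biopsy, 35 cancers among them). The short sketch below works through that arithmetic; the function name and dictionary keys are hypothetical and chosen only for illustration.

```python
def diagnostic_summary(total, cancers, biopsied, detected):
    """Derive the 2x2 table and test characteristics from four summary counts."""
    tp = detected                  # cancers triaged to biopsy
    fn = cancers - detected        # cancers triaged to follow-up
    fp = biopsied - detected       # benign lesions triaged to biopsy
    tn = total - cancers - fp      # benign lesions triaged to follow-up
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "benign_per_malignant_biopsy": fp / tp,
        "follow_up_cancer_rate": fn / (fn + tn),
    }


# Counts reported for the 3% cut-point.
cut_3_percent = diagnostic_summary(total=320, cancers=36, biopsied=76, detected=35)
for name, value in cut_3_percent.items():
    print(f"{name}: {value:.4f}")
```

These counts reproduce the reported sensitivity, specificity, and predictive values, the 1:1.17 malignant:benign biopsy ratio, and the 0.41% cancer rate among the 244 lesions triaged to follow-up.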
The patient whose malignancy was missed by BREAST AID was also missed on the initial visit using the triple diagnosis approach. She presented as a 43-year-old premenopausal patient with a 1.2 cm vague thickening, a normal Category 1 mammogram, a rubbery internal consistency on FNAB, and a benign fine needle aspiration cytology result. She was diagnosed on 6-week follow-up at the best phase of her menstrual cycle, when her mass became more prominent. The cancer rate for the 244 lesions triaged into follow-up was 0.41% and for the 76 lesions triaged into biopsy, 46.1%. Sixteen lesions had malignant FNAB cytology results and all were malignant on histopathology. Seven had Category 5 mammograms and all of these were malignant as well. Suspicious FNAB readings were found in 13 cases and seven of these were malignant at biopsy. Suspicious mammograms (Category 4) were found in 8 cases, 7 of which were malignant.

Results of Individual Diagnostic Tests Compared with Results of BREAST AID

CBE findings were not evaluated independent of mammography findings in this study, and calculation of test characteristics for CBE is therefore not possible. However, this study is one of the first in the past two decades to characterize the clinical characteristics of a series of breast masses. This is relevant because, as discussed above, the clinical findings suggestive of malignancy on CBE are decreasing in prevalence, especially since the widespread use of mammography. Twenty-two of 36 cancers were 2 cm or less in size, and 6 were 1 cm or less in size. Only 6 cancers presented with an irregular shape on palpation; the remaining 30 represented smooth or lobulated lesions or vague thickenings. Only 1 of 36 cancers was fixed; the rest were mobile. Seventeen malignancies were assessed to be irregular on palpation of external texture; the remaining 19 were smooth. These findings highlight the low sensitivity of CBE for diagnosing malignancy.
Although abnormal physical examination signs may increase the likelihood of cancer, lack of physical signs of malignancy by no means assures a benign diagnosis. Unlike CBE, it is possible from this study to report the test characteristics for mammography. Eleven of 36 cancers (30.6%) were classified as Categories 1 or 2 on mammography by the same surgeons who had palpated the mass within minutes of the mammogram reading. The possibility of false negative results due to interpretive errors seems remote under these conditions. It was unusual for the surgeon to disagree with the mammographic findings of the radiologist, and when it did occur, the reading was upgraded by the surgeon based on the CBE findings. Fifteen mammograms were classified as either Category 4 or Category 5 and 14 (93.3%) of these were malignant. These fourteen results represented 38.9% of the 36 malignant lesions. The other 11 cancers were classified as Category 3 mammographically (30.6% of malignant cases). The following table demonstrates these points and compares them with results from BREAST AID.

Table 20: Sensitivity, Specificity, and Predictive Values for Mammography Alone with Comparison to Test Characteristics of BREAST AID*

                                                                         # Biopsies    % Missed
Test                                Se (%)    Sp (%)    PPV (%)   NPV (%)   Needed        Cancers
Mammogram Category 3, 4 or 5        69.4      95.0      62.5      96.3      40            30.6
Mammogram Category 4 or 5           38.9      99.7      93.3      93.2      15            61.1
BREAST AID Probability >= 3%        97.2      85.6      46.1      99.6      76            2.8

* Se = sensitivity, Sp = specificity, PPV = positive predictive value, and NPV = negative predictive value.

The findings of FNAB were not much better. Twenty of 36 malignancies were soft or rubbery on internal consistency, for a sensitivity of 44.4%. Ten of 36 malignancies had FNAB readings of benign epithelial cells (7) or a few scattered atypical cells (3), for a sensitivity of 72.2%. The following table demonstrates the FNAB cytology results and compares them with results from BREAST AID.
Table 21: Sensitivity, Specificity, and Predictive Values for Fine Needle Aspiration Cytology Alone with Comparison to Test Characteristics of BREAST AID*

                                                                                    # Biopsies    % Missed
Test                                           Se (%)    Sp (%)    PPV (%)   NPV (%)   Needed        Cancers
Many atypical, suspicious or malignant
  cells on FNAB                                72.2      93.7      57.8      96.6      45            27.8
Suspicious or malignant cells on FNAB          63.9      98.0      79.3      95.8      29            36.1
BREAST AID Probability >= 3%                   97.2      85.6      46.1      99.6      76            2.8

* Se = sensitivity, Sp = specificity, PPV = positive predictive value, NPV = negative predictive value, and FNAB = fine needle aspiration biopsy.

One could argue that all cases of either suspicious or malignant mammogram or FNAB cytology results would result in biopsy. One or both of these results occurred in 26 of 36 cancers in this study. The ten remaining malignancies had indeterminate or lower findings on both mammography and FNAB. The challenge for the surgeon is to decide how many lesions assessed to be equivocal or less in nature should be biopsied. BREAST AID was capable of sorting out this dilemma in 9 of 10 cases, requiring only an additional 40 biopsies to make the distinction. Table 22 in the appendix summarizes the findings of these ten cases.

DISCUSSION

When evaluating a breast mass, the most important question facing the physician and the patient differs from the usual one to be answered in the evaluation of a physical sign. Rather than the question "What is causing this problem?", the question is, instead, "Does this mass represent cancer?" When the challenge of the clinical evaluation is to rule out cancer, the clinician is likely to tolerate many false positive results in a diagnostic test, favoring high sensitivity (and high negative predictive value) at the expense of lower specificity (and lower positive predictive value).
The consequence of this approach in the case of a breast mass, in which the diagnostic procedure is mass removal, is that a large number of biopsies for benign disease are performed in the process of ruling out breast cancer. Although many consider open biopsy to be a minor procedure with few consequences as compared with a missed diagnosis of breast cancer, women pay a significant emotional toll associated with the anxiety of the procedure itself. In addition, there is emotional distress related to the necessary delays between the clinical evaluation and the biopsy, and between the biopsy and the final pathology result, a process that can take several days or even weeks. Furthermore, although skin scars usually resolve over time, some women experience breast disfigurement as a consequence of open biopsy, and subsequent screening by CBE and mammography can be more difficult after open biopsy because of breast tissue scarring. Finally, the economic costs of liberal use of tissue biopsy to rule out breast cancer are enormous: approximately $4 billion per year in the United States alone. This has resulted in tensions between health economists and surgeons, with the former group quoting unnecessary biopsy rates and the latter group citing the expectation of diagnostic perfection on the part of patients and the legal community.

The ability of BREAST AID to assist in the management of breast masses by triaging malignant lesions to biopsy and benign lesions to follow-up is based on its ability to incorporate several complementary pieces of diagnostic information to arrive at one final probability. Although scoring systems applying the principles of triple diagnosis have been attempted in the past, arriving at a single, and thus definitive, quantitative score has been difficult.
This is due in part to the lack of definition of the clinical impression component of the triple test, which depends on clinical expertise with undefined results of CBE, and on the results of several diagnostic tests that together produce an overall gestalt impression of benign vs. malignant disease that is difficult to characterize. BREAST AID removes this undefined characteristic of triple diagnosis while incorporating all of its fundamental principles. It adds to triple diagnosis as well, by incorporating important epidemiologic information into the interpretation of the test results. By arriving at a quantitative result that predicts accurately for malignancy, it helps remove the diagnostic uncertainty that a clinician faces when evaluating equivocal findings.

Potential Study Biases: Patient Selection and Population Characteristics

The question of how representative a study population is of the world of patients who present to their doctors with a symptom is an important one, especially when the study is aimed at developing a mathematical model to predict for a target disorder. How representative the population under study is in this report, then, becomes critical when judging the applicability of the results. To answer this question, it becomes necessary to consider the criteria for selection, and the causes of non-inclusion once selected.

Selection Criteria

The selection criteria for entry into this study required the performance of certain diagnostic tests within a defined time period. By definition, then, the study design resulted in the elimination of patients who underwent immediate biopsy without mammography and/or adequate FNAB results. It is unlikely, except in very young women, that an open biopsy would be done without having obtained a diagnostic mammogram. It is much more likely that an open biopsy might be done without performing an FNAB.
This is most likely to occur when the surgeon is certain of the need for biopsy or the patient is averse to FNAB. It is possible under these circumstances that more malignancies would be present in the group that did not undergo FNAB than in the group that did. If this is the case, then the malignancy rate would be underrepresented in our study. The effect that this might have on the mathematical model would not be substantial, however, because the model would certainly triage patients with clear indications of malignancy on either mammography or FNAB into biopsy.

Of greater concern is the group of patients who underwent FNAB in their diagnostic work-up but whose results were hypocellular and not repeated. This group was not eligible for this study according to our selection criteria, but data were collected on these patients. The fact that patients with hypocellular FNAB results have a higher rate of benign disease than an overall group of patients who undergo evaluation with FNAB has already been discussed. To substantiate this, and to document the outcome of our group with hypocellular FNAB results (31.8% of all subjects who underwent FNAB), a separate analysis will be conducted later.

The other 21.6% of patients who were not included in the study will not be analyzed further. These patients include 27 in whom the charts could not be located, and these are likely to be missing at random. The 36 patients who did not have a mammogram evaluation within the time period specified by the study probably do differ from the overall population studied. They are more likely to present with a clinically benign finding than the overall group. In contrast, the 39 patients who had an open biopsy without a fine needle aspiration biopsy are likely to have a higher rate of malignancy than the overall group.
This is especially true in the practice at the CBHC, where it is routine to evaluate patients with FNAB unless the patient prefers to defer the procedure or unless the clinician judges that the FNAB result would be unlikely to change management options.

There is also the possibility of referral bias to consider. One should question whether patients with a breast mass referred to a breast clinic are representative of patients with breast masses overall, and whether patients referred to the CBHC specifically are representative. If any bias exists, one would expect an overrepresentation of the target disorder. This study documents a prevalence for breast cancer of 10.7%. Whether this prevalence is reflective of the prevalence of cancer in women with breast masses will of course be dependent on the age of the population being studied and the stages of disease in the sampled population. Most of the surgical literature quotes much higher malignancy rates (Tables 2-5), but in many of these studies, the rates are given only for patients who underwent open biopsy. This suggests the possibility of either referral or selection bias in these studies, and a resulting overrepresentation of malignancy in some of the surgical literature. A study of 2000 FNAB aspirations by Horgan et al from the University Hospital Breast Unit in Galway, Ireland specifically states that they used consecutive cases of breast masses seen between 1982 and 1989 in their study and that they did not attempt to select patients for biopsy based on clinical findings. Their malignancy rate of 13.2% is close to our own [130]. Selzer's report of 1,000 consecutive patients referred to his surgical clinic with a breast mass documented an 8.2% malignancy rate in his population [5].
In Barton's retrospective study of 2400 patients in a primary care setting who were followed over time to measure frequency of breast complaints, 157 patients presented with or were discovered to have a breast mass on CBE; of these, 10.7% were malignant [169]. It seems likely that the malignancy rate in our study is representative of women with breast masses overall and is not an underrepresentation of the frequency of malignancy in women with a breast mass.

Another reason that may account for the prevalence of malignancy in our surgical clinic is the recent increase in mass screening rates with CBE and mammography. The result of this trend is the identification of more breast masses than would otherwise have been identified. Although the goal of screening is to identify cancer at an earlier stage than would be the case without screening, the inevitable consequence of this effort is the identification of higher rates of what eventually prove to be benign findings.

Population Characteristics

The mean age of 47.6 years in our study population is similar to that in other studies that report age in women with breast masses, which ranges between 45 years and 54 years [7, 89, 90, 93, 94, 96, 97, 101]. The exception is the study by Grobler et al, whose average age was 62 years [75]. The mean age of 46.8 years for benign lesions in this study is essentially the same as that in a large multi-institutional study by Ciatto et al, who reported a mean age at biopsy for benign lesions of 46.7 years for 754 collected cases [13].

The mean BMI in this study was 24.73 kg/m2, which compares with a mean value of 24.3 kg/m2 in the Adventist Health Cohort Study that studied exposure variables related to cancer and heart disease in 20,341 women. Some question may arise with the comparison of BMI in our population with the Adventist Health Cohort Study because the latter population is largely vegetarian.
However, in a population-based case control study by Stanford et al of women 50-64 years of age, 45.6% of the cases and 50.2% of the controls had a BMI below 23.7 [170]. Thus, the BMI in our population seems representative of women in general.

The average age of onset of menarche of 12.8 years in this study compares with the same value in the Adventist Health Cohort Study [171]. In Rockhill's study of population attributable fraction estimations for breast cancer risk factors using the data from the Carolina Breast Cancer Study conducted between 1993 and 1996, 19.7% of the 958 Caucasian women had an age at menarche of less than 12 years. This compares with 18.3% of our population. These findings support the representativeness of our population. Further support is found in the 1987 National Health Interview Cancer Epidemiology Supplement, which reported that 18.3% of women less than 50 years of age from the Midwest had an age at menarche of less than 12 years and that 14% of women 50 and over from the Midwest had such a history [172].

The frequency of nulliparity in our population was 12.5%. This compares with a frequency of 14.6% in Rockhill's study [48] and 10.8% in the study by Stanford [170]. Age distribution could account for the variations in frequency of nulliparity in these two studies, since the former included women 20-74 while the latter was in women 50-64. The frequency of nulliparity in our population appears representative, a conclusion also supported by the 13.5% frequency of nulliparity in women between the ages of 50-79 living in the Midwest in the 1987 National Health Interview Cancer Epidemiology Supplement [172]. This study, which is a supplement to the larger National Health Interview Study (NHIS), found a prevalence of nulliparity of 29.6% in the Midwest for women 20-49 years of age.
The average age of first delivery of 24.7 years in our study is higher than in the Cancer and Steroid Hormone Study of women 20-55 years of age, where the mean age of delivery was found to be 21.8 years for control subjects in seven of eight SEER regions [173]. The distribution of age at first birth is also not representative of predicted frequencies. Our study had frequencies of 16.0%, 33.2%, 33.9%, and 16.8% for age of first delivery of <20, 20-24, 25-29, and >29 years, respectively. This represents a lower frequency of early age at delivery and a higher frequency of late age at delivery compared to other studies [170, 172]. It is possible that this distribution of factors in our population reflects the fact that our population is at higher risk for breast cancer, by virtue of the fact that they are presenting with a breast symptom. Alternatively, these findings could be explained by the relatively young age group that we are studying, and the trend of delay in childbirth for this group of women.

The mean number of deliveries in this study was 2.4 overall, 2.3 for the benign group and 2.5 for the group with cancer. This compares to the mean number of deliveries in the seven SEER regions other than the San Francisco Bay area of 2.6, and 2.1 for the San Francisco Bay area [173], and indicates population representativeness.

The average age of menopause of 46.4 years compares with 44.4 years in the seven SEER regions other than the San Francisco Bay area, and 44.9 years for the San Francisco Bay area [173], and also indicates population representativeness.

The rate of ever use of oral contraceptives in our population was 78.3%. This compares with a frequency of 70.7% in the Adventist Health Cohort Study [171] but only 41.3% in the Nurses' Health Study [174]. (The latter finding is probably explained by the age of the cohort studied and the time of initiation of the study.)
The unexpected protective association of OCP use with breast cancer found in this study has already been discussed.

The rate of ever use of postmenopausal estrogens in our population was 82.1%. Of these, 68.8% were current users. This is higher than the ever-use of postmenopausal estrogens of 52.4% in the Adventist Health Cohort Study [171], 41.4% in the Nurses' Health Study [174], and 34.1% in the Midwest as reported in the 1987 National Health Interview Cancer Epidemiology Supplement [172]. This may reflect a practice pattern specific to the Lansing area, or the current prevalence of HRT use in the population in the US as a whole, but the possibility that hormone replacement therapy may be associated with an increased risk of breast masses must also be considered.

This study found a history of breast cancer in first-degree relatives in 21.1% of patients, and in mothers only in 14.1%. This compares with a family history in first-degree relatives in Rockhill's study of 22.0% [48]. However, these are higher numbers than those usually quoted. For instance, the Nurses' Health Study found an age-standardized distribution of family history in first-degree relatives in the Midwest to be only 8.2% [175], and the Adventist Health Study found a breast cancer frequency in first-degree relatives of only 4.2%. It is possible that women with a family history are more vigilant about breast self-examination and/or have greater medical-seeking behavior than women without a family history. This would lead to an increased frequency of breast lump detection in the former group.

The population characteristics in this study are similar, but not equivalent, to those of others. Just how the differences might affect the results of the multivariate model is open to conjecture at this point.
It is very likely that, because the number of women with the target disorder is low in this study (36 cases of breast cancer), additional risk factors may contribute to the model once the number of patients is increased in a larger study.

SUMMARY AND CONCLUSIONS

Assuming that BREAST AID is a mathematical representation of the decision-making process facing a clinician, it is not surprising that the surgical approach to the evaluation of breast masses has traditionally required open biopsy for a definitive diagnosis. In this study, the final predicted probabilities of malignancy for a breast mass, based on the final steps in Stage 3, ranged between 0.004% and 99.99% for the population of breast masses overall, and between 3% and 99.99% for the lesions triaged into biopsy. Despite the low predictive probability for cancer of 3% from a raw value standpoint, 46.1% (35/76) of the lesions whose final probability was at or above this level had malignancy. It is unrealistic to think that clinical decision-making is capable of making differentiations this small. Herein lies the potential application of this model; it combines multiple pieces of routine diagnostic information, much of it inter-related, into one final prediction that is capable of efficiently triaging patients into open biopsy or follow-up. In this regard, it satisfies both the economic and medical-legal concerns so commonly associated with the symptom of a breast mass.

BREAST AID is a unique prediction model for breast cancer that has the potential to improve the current accuracy of diagnosis while minimizing the rate of open biopsy. I believe it does this without compromising the timely diagnosis of breast cancer. This latter caveat is dependent on close follow-up of non-biopsied lesions for a minimum of 3 months and for at least one year if the mass persists.
If the size of the mass increases or the consistency of the mass becomes more pronounced on follow-up, open biopsy is indicated.

Prior to the widespread application of BREAST AID, a demonstration study to validate and test the generalizability of the model will be necessary. This will require a large-scale multi-institutional study incorporating large numbers of patients with both malignant and benign disease. If the predictive ability of the model is confirmed, BREAST AID has the potential to conserve health care dollars while simultaneously offering both patients and physicians the assurance that clinical follow-up is a safe alternative to open biopsy.

APPENDIX

TABLE 1
Probability of developing invasive breast cancer in SEER* areas, women, all races, 1987-88

Age       Cumulative probability (%)**
20-24     .005
25-29     .04
30-34     .16
35-39     .46
40-44     1.1
45-49     2.0
50-54     3.0
55-59     4.2
60-64     5.8
65-69     7.4
70-74     8.9
75-79     10.4
80-84     11.5
85-90     12.1
90-94     12.5
95+       12.6

* SEER = Surveillance, Epidemiology and End Result program of the National Cancer Institute
** Cumulative probability of developing invasive breast cancer from birth
[The remaining Appendix tables, including Tables 2-5 (test characteristics and malignancy rates from the literature) and Table 22 (findings in the ten cancers with indeterminate or lower results on both mammography and FNAB), and the ROC curve figures are illegible in the scanned source and could not be reconstructed.]