A method is described for the reproducible quantification of biomarker expression, including biomarker expression in a tissue sample. Methods and systems are described whereby reproducible scores for biomarker expression are obtained independent of instrument, its location, or operator.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims priority benefit under 35 U.S.C. §119(e) of U.S. Pat. Appl. No. 61/097,415, filed Sep. 16, 2008, the disclosure of which is hereby incorporated by reference in its entirety.

Claims:

What is claimed is:

1. A method of reproducibly quantifying biomarker expression in a slide-mounted tissue sample, the method comprising: (a) obtaining a slide-mounted tissue sample, which has been stained to permit localization of at least one cellular compartment and at least one biomarker; (b) obtaining one or more pixel-comprised images of the stained tissue sample using a microscope, and analyzing the one or more pixel-comprised images to obtain one or more data sets; (c) automatically analyzing the one or more data sets derived from the image pixels to differentiate data signal from noise; (d) automatically analyzing the one or more data sets derived from the image pixels to differentiate data signal attributable to each of said at least one cellular compartment; (e) optionally automatically analyzing the one or more data sets derived from the image pixels to differentiate data signal attributable to said at least one biomarker for each of said at least one cellular compartment; (f) quantifying the amount of biomarker expressed in each of said at least one cellular compartment to arrive at a standardized score, which is a product of a raw score and one or more factors selected from the group consisting of a calibration cube factor, a light source factor and an optical path factor, whereby a run comprises steps (a)-(f), and the biomarker expression in the same slide-mounted tissue sample is quantified in a manner having a level of reproducibility above 80 percent for separate runs, in which each automatic analysis step is carried out in an unsupervised manner and comprises an unsupervised pixel-based clustering algorithm, and whereby the microscope allows for automatic adjustment of exposure time to provide an optimized dynamic range of data captured in the image pixels.

2. The method of claim 1 in which the light source factor reduces variability in intensity of the light source.

3. The method of claim 1 in which data signal attributable to two or more cellular compartments is differentiated.

4. The method of claim 1 in which the amount of biomarker expressed in each of two or more cellular compartments is quantified.

5. The method of claim 1 in which data signal attributable to two or more cellular compartments is differentiated with a confidence interval of about 95%.

6. The method of claim 1 which further comprises assessing the image quality of the one or more slide-mounted tissue samples, the one or more pixel-comprised images or one or more magnified portions thereof, by testing any one of signal integrity, sample integrity and image integrity of the slide mounted sample and removing the image from analysis if one or more of the signal integrity, sample integrity and image integrity fail.

7. The method of claim 1 which provides a greater than 85% concordance for sample classification from one run to another.

8. The method of claim 1 which provides a greater than 90% concordance for sample classification from one run to another.

9. The method of claim 1 which provides a quantified measure of biomarker expression having a level of reproducibility above 90 percent.

10. The method of claim 1 which provides a quantified measure of biomarker protein expression having a level of reproducibility above 95 percent.

11. The method of claim 1 which provides a quantified measure of biomarker expression having a level of reproducibility falling in the range of about 90 to about 97 percent.

12. The method of claim 1 which provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 20 percent.

13. The method of claim 1 which provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 10 percent.

14. The method of claim 1 which provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 5 percent.

15. The method of claim 1 which provides a quantified measure of biomarker expression having a coefficient of variation (% CV) falling in the range of about 4 to about 7 percent.

16. The method of claim 1 in which the slide-mounted tissue sample has been stained with an optimal dilution of one or more reagents.

17. The method of claim 16 in which said optimal dilution produces one or more pixel-comprised images having an optimal dynamic range metric.

18. A non-transitory computer readable medium having computer readable instructions stored thereon for execution by a processor to perform a method of reproducibly quantifying biomarker expression in a slide-mounted tissue sample comprising: (a) acquiring one or more pixel-comprised images of a slide-mounted tissue sample, which has been stained to permit localization of at least one cellular compartment and at least one biomarker and analyzing the one or more pixel-comprised images to obtain one or more data sets; (b) automatically analyzing the one or more data sets to differentiate data signal from noise; (c) automatically analyzing the one or more data sets to differentiate data signal attributable to each of said at least one cellular compartment; and (d) quantifying the amount of biomarker expressed in each of said at least one cellular compartment to arrive at a standardized score, which is a product of a raw score and one or more factors selected from the group consisting of a calibration cube factor, a light source factor and an optical path factor, whereby a run comprises steps (a)-(d), and the biomarker expression in the same slide-mounted tissue sample is quantified in a manner that is reproducible for separate runs, in which each automatic analysis step is carried out in an unsupervised manner and comprises an unsupervised pixel-based clustering algorithm, and whereby a microscope allows for automatic adjustment of exposure time to provide an optimized dynamic range of data captured in the image pixels.

19. The non-transitory computer readable medium of claim 18 in which the computer readable instructions stored thereon for execution by a processor further comprises instructions to perform a method that includes, prior to said quantifying step, an optional step of automatically analyzing the one or more data sets derived from the image pixels to differentiate data signal attributable to said at least one biomarker for each of said at least one cellular compartment.

20. A system for reproducibly quantifying biomarker expression in a slide-mounted tissue sample comprising: (a) one or more lenses configured to magnify at least a portion of a slide-mounted tissue sample, which has been stained to permit localization of at least one cellular compartment and at least one biomarker; (b) a microscope, including a light source and an image sensor in optical communication with said one or more lenses, the microscope obtaining one or more pixel-comprised images of the stained tissue sample; (c) a processor in communications with the microscope, configured to analyze the one or more pixel-comprised images to obtain one or more data sets; and then (i) automatically analyze the one or more data sets to differentiate data signal from noise, (ii) automatically analyze the one or more data sets to differentiate data signal attributable to each of said at least one cellular compartment, and (iii) quantify the amount of biomarker expressed in each of said at least one cellular compartment to arrive at a standardized score, which is a product of a raw score and one or more factors selected from the group consisting of a calibration cube factor, a light source factor and an optical path factor, whereby a run comprises steps (i)-(iii), and the biomarker expression in the same slide-mounted tissue sample is quantified in a manner that is reproducible for separate runs, in which each automatic analysis step is carried out in an unsupervised manner and comprises an unsupervised pixel-based clustering algorithm, and whereby the microscope allows for automatic adjustment of exposure time to provide an optimized dynamic range of data captured in the image pixels.

21. The system of claim 20 in which the processor is further configured to optionally automatically analyze the one or more data sets derived from the image pixels to differentiate data signal attributable to said at least one biomarker for each of said at least one cellular compartment.

Description:

BACKGROUND OF THE INVENTION

The present invention relates to the field of automated biomarker expression analysis in tissue samples using algorithms to enhance operator-independent analysis and assay result reproducibility for greater predictive value in diagnostic assays.

To date, biomarker assessment on tissue sections has relied on traditional cytochemical and immunohistochemical (IHC) techniques, which were largely developed before large-scale and high-throughput assays were available. A significant drawback of traditional methods is the subjective nature of the test and the lack of standardization. Although IHC tests have shown clinical utility (e.g., Her2/HercepTest), the value of these tests has recently been shown to be compromised by the site at which the test is performed. Two recent studies examining the reproducibility of Her2 testing have shown that there may be as much as 20% error between local and central lab testing (Perez et al. J. Clin. Oncol. (2006) 24:3032-8; Paik S et al. Benefit from adjuvant trastuzumab may not be confined to patients with IHC 3+ and/or FISH positive tumors: Central testing results from NSABP B-31 (2007) 25:511-22).

The present invention provides for the first time fully automated standardization of in situ biomarker quantification that minimizes lab-to-lab, machine-to-machine, operator-to-operator, and day-to-day staining variations.

SUMMARY OF THE INVENTION

The present invention relates to the reproducible quantification of biomarker expression from tissue samples, whole tissue sections (WTS) as well as tissue microarrays (TMAs), so as to reduce variability between runs or batches due to differences in operators, equipment, facilities, and other factors. The systems and processes described herein provide the automated localization and quantitation of biomarkers with normalization of scores for greater reproducibility between runs, regardless of location, operator or instrument variability.

One embodiment of the invention relates to a method of reproducibly quantifying biomarker expression in a slide-mounted tissue sample comprising (a) obtaining a slide-mounted tissue sample, which has been stained to permit localization of at least one cellular compartment and at least one biomarker, (b) obtaining one or more pixel-comprised images of the stained tissue sample using a standardized optical system that includes a light source, (c) automatically analyzing one or more data sets derived from the image pixels to differentiate data signal from noise, (d) automatically analyzing one or more data sets derived from the image pixels to differentiate data signal attributable to each of said at least one cellular compartment, (e) optionally automatically analyzing one or more data sets derived from the image pixels to differentiate data signal attributable to said at least one biomarker for each of said at least one cellular compartment, and (f) quantifying the amount of biomarker expressed in each of said at least one cellular compartment; whereby the biomarker expression in the slide-mounted tissue sample is quantified reproducibly.

In one embodiment, the standardized optical system includes a light source whose intensity and optical path variability have been normalized. In a further embodiment, the standardized optical system allows for automatic adjustment of exposure time to provide an optimized dynamic range of data captured in the image pixels.
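By way of illustration only, automatic exposure adjustment of this kind might be sketched as a search for the longest exposure that keeps the fraction of saturated pixels below a tolerance. The `acquire` callable, the 8-bit ceiling of 255, the 0.5% tolerance, and the doubling-then-bisecting search are all assumptions made for this sketch and are not taken from the present disclosure.

```python
from typing import Callable, List, Optional


def auto_exposure(acquire: Callable[[float], List[int]],
                  start_ms: float = 100.0,
                  max_value: int = 255,
                  max_saturated_fraction: float = 0.005,
                  max_iterations: int = 20) -> float:
    """Return the longest exposure (ms) found that does not saturate more
    than the allowed fraction of pixels (hypothetical acceptance rule)."""
    low: float = 0.0                 # longest exposure known to be acceptable
    high: Optional[float] = None     # shortest exposure known to saturate
    exposure = start_ms
    for _ in range(max_iterations):
        pixels = acquire(exposure)   # hypothetical camera call
        saturated = sum(1 for p in pixels if p >= max_value) / len(pixels)
        if saturated > max_saturated_fraction:
            high = exposure
        else:
            low = exposure
        # Double until saturation is first seen, then bisect the bracket.
        exposure = exposure * 2.0 if high is None else (low + high) / 2.0
    return low


def fake_camera(exposure_ms: float) -> List[int]:
    """Simulated 8-bit sensor used only to exercise the sketch."""
    return [min(255, int(gain * exposure_ms)) for gain in (0.1, 0.5, 1.0)]


best = auto_exposure(fake_camera)
```

With the simulated sensor, the search settles just below the exposure at which the brightest pixel would clip, so the captured data use nearly the full range of the sensor.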

In one embodiment, each automatic analysis step is carried out in an unsupervised manner.

In one embodiment, the data signal attributable to two or more cellular compartments is differentiated.

In one embodiment, the amount of biomarker expressed in each of two or more cellular compartments is quantified.

In one embodiment, the data signal attributable to two or more cellular compartments is differentiated with a confidence interval of about 95%.

In one embodiment, the method further comprises assessing the quality of the one or more slide-mounted tissue samples or the one or more pixel-comprised images or one or more magnified portions thereof.

In one embodiment, the method provides a reproducible cutpoint determination.

In one embodiment, the method provides a greater than 85% concordance for sample classification from one run to another for each sample. In a further embodiment, the method provides a greater than 90% concordance for sample classification from one run to another for each sample.
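As a non-limiting illustration, run-to-run concordance for sample classification might be computed by applying a single cutpoint to the scores from a reference run and a second run. The function name, the use of `>=` at the cutpoint, and the definitions of positive and negative agreement relative to the reference run are assumptions of this sketch.

```python
from typing import Dict, Sequence


def concordance(reference_scores: Sequence[float],
                test_scores: Sequence[float],
                cutpoint: float) -> Dict[str, float]:
    """Classify each sample as positive (score >= cutpoint) in both runs
    and report agreement rates relative to the reference run."""
    if len(reference_scores) != len(test_scores):
        raise ValueError("both runs must score the same samples")
    both_pos = both_neg = ref_pos = ref_neg = 0
    for ref, test in zip(reference_scores, test_scores):
        ref_is_pos = ref >= cutpoint
        test_is_pos = test >= cutpoint
        if ref_is_pos:
            ref_pos += 1
            both_pos += test_is_pos
        else:
            ref_neg += 1
            both_neg += not test_is_pos
    total = ref_pos + ref_neg
    return {
        "overall": (both_pos + both_neg) / total,
        # NaN when the reference run has no cases in the class.
        "positive_agreement": both_pos / ref_pos if ref_pos else float("nan"),
        "negative_agreement": both_neg / ref_neg if ref_neg else float("nan"),
    }


rates = concordance([10, 20, 30, 40], [12, 28, 22, 41], cutpoint=25)
```

In this toy input, two of the four samples fall on opposite sides of the cutpoint in the two runs, so all three rates are 0.5.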

In one embodiment, the method provides a quantified measure of biomarker expression having a level of reproducibility above 80 percent. In a further embodiment, the method provides a quantified measure of biomarker expression having a level of reproducibility above 90 percent. In a further embodiment, the method provides a quantified measure of biomarker protein expression having a level of reproducibility above 95 percent.

In one embodiment, the method provides a quantified measure of biomarker expression having a level of reproducibility falling in the range of about 90 to about 97 percent.

In one embodiment, the method provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 20 percent. In a further embodiment, the method provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 10 percent. In a further embodiment, the method provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 5 percent. In a further embodiment, the method provides a quantified measure of biomarker expression having a coefficient of variation (% CV) falling in the range of about 4 to about 7 percent.
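For illustration, the coefficient of variation of repeated scores for the same sample is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample (n-1) standard deviation; the disclosure does not state which estimator is used.

```python
import statistics
from typing import Sequence


def percent_cv(scores: Sequence[float]) -> float:
    """Coefficient of variation (% CV) of repeated scores for one sample."""
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("% CV is undefined when the mean score is zero")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(scores) / mean


cv = percent_cv([95.0, 100.0, 105.0])
```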

In one embodiment, the slide-mounted tissue sample has been stained with an optimal dilution of one or more reagents. In a further embodiment, said optimal dilution produces one or more pixel-comprised images having an optimal dynamic range metric.

In one embodiment, the method of the present invention is implemented by a computer. In another embodiment, the present invention is directed to a computer readable medium comprising the computer readable instructions stored thereon for execution by a processor to perform the method described herein. In another embodiment, the present invention is directed to an electromagnetic signal carrying computer-readable instructions for implementing the method described herein. The invention is also directed to a computer readable medium having computer readable instructions stored thereon for execution by a processor to perform a method of reproducibly quantifying biomarker expression in a slide-mounted tissue sample comprising: (a) acquiring one or more pixel-comprised images of a slide-mounted tissue sample, which has been stained to permit localization of at least one cellular compartment and at least one biomarker using a standardized optical system that includes a light source; (b) automatically analyzing one or more data sets derived from the image pixels to differentiate data signal from noise; (c) automatically analyzing one or more data sets derived from the image pixels to differentiate data signal attributable to each of said at least one cellular compartment; and (d) quantifying the amount of biomarker expressed in each of said at least one cellular compartment, whereby the biomarker expression in the slide-mounted tissue sample is quantified reproducibly. In a preferred embodiment of the computer readable medium, the computer readable instructions stored thereon for execution by a processor further comprise instructions to perform a method that includes, prior to said quantifying step, an optional step of automatically analyzing one or more data sets derived from the image pixels to differentiate data signal attributable to said at least one biomarker for each of said at least one cellular compartment.

Yet another embodiment of the invention is a system for reproducibly quantifying biomarker expression in a slide-mounted tissue sample comprising: (a) one or more lenses configured to magnify at least a portion of a slide-mounted tissue sample, which has been stained to permit localization of at least one cellular compartment and at least one biomarker; (b) a standardized optical system, including a light source and an image sensor in optical communication with said one or more lenses, the standardized optical system obtaining one or more pixel-comprised images of the stained tissue sample; (c) a processor module in communication with the standardized optical system, the processor module configured to: (i) automatically analyze one or more data sets derived from the image pixels to differentiate data signal from noise, (ii) automatically analyze one or more data sets derived from the image pixels to differentiate data signal attributable to each of said at least one cellular compartment, and (iii) quantify the amount of biomarker expressed in each of said at least one cellular compartment, whereby the biomarker expression in the slide-mounted tissue sample is quantified reproducibly. In a preferred embodiment of the system, the processor module is further configured to optionally automatically analyze one or more data sets derived from the image pixels to differentiate data signal attributable to said at least one biomarker for each of said at least one cellular compartment. Such analysis is preferably performed prior to the quantification step (iii).

Other features, objects and advantages of the invention will be apparent from the following figures, detailed description and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an embodiment of a process for reproducibly quantifying an amount of a biomarker expressed in each of one or more cellular components of a slide-mounted biological sample containing cells, such as a tissue sample from a patient.

FIG. 2 shows an embodiment of a process of image processing steps for reproducibly quantifying an amount of a biomarker expressed in each of one or more cellular components of a slide-mounted biological sample containing cells.

FIGS. 6A-F show 2×2 contingency tables comparing positive (POS) v. negative (NEG) population segregation based on X-tile cut-points generated for the reference (e.g., Instrument 1) for each indicated instrument set (A, B), operator set (C, D), and run set (E, F). Also shown are overall concordance, positive agreement, and negative agreement rates with 95% confidence intervals.

FIGS. 7A-C are frequency distributions separated into negative agreement, positive agreement, and non-agreement cases for (A) instrument 2 (AQUA® scores) to instrument 1 (cut-point); (B) operator 2 (AQUA® scores) to operator 1 (cut-point); and (C) run 2 (AQUA® scores) to run 1 (cut-point) to demonstrate where disagreement occurs within the population of breast cancer cases. Cases which disagree reside in and around the indicated cut-points and do not span over the entire distribution.

FIGS. 9A-D show the results from Example 4, below. FIGS. 9A-C are box plots of the Allred score from three pathologists compared to the AQUA® score generated for the same samples. FIG. 9D is a table showing the correlation between the Allred and AQUA® scoring for each pathologist.

FIGS. 10A-B show the (A) clustering of patients based on AQUA® scores and (B) survival of the clustered groups, as described in Example 4.

FIGS. 11A-B compare the five year disease specific survival using the Allred scoring (A) and AQUA® scoring (B).

FIGS. 12A-B illustrate the comparison of ER expression scores determined by 3 pathologists reading the same TMA slide using the Allred score method (A) and the comparison of ER expression AQUA® scores determined by 3 PM-2000 instruments reading the same TMA slide (B).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Unless defined otherwise, all technical and scientific terms used herein generally have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. For example, reference to “a cell” includes a combination of two or more cells, and the like. Generally, the nomenclature used herein and the laboratory procedures in cell biology, immunohistochemistry, and imaging (e.g., cells and tissue) described below are those well known and commonly employed in the art.

It should be appreciated that the particular implementations shown and described herein are examples of the present invention and are not intended to otherwise limit the scope of the present invention in any way. Further, the techniques are suitable for applications in teleconferencing, robotics vision, unmanned vehicles, or any other similar applications.

Techniques suitable for use in the present invention can also be found in co-pending U.S. application Ser. No. 12/153,171, filed May 14, 2008; U.S. application Ser. No. 12/139,370, filed Jun. 13, 2008; U.S. application Ser. No. 12/186,294, filed Aug. 5, 2008; U.S. application Ser. No. 12/188,133, filed Aug. 7, 2008; and U.S. application Ser. No. 12/201,753, filed Aug. 29, 2008, each of which is hereby incorporated by reference in its entirety.

FIG. 1 shows a process for reproducibly quantifying an amount of a biomarker expressed in each of one or more cellular components of a slide-mounted biological sample containing cells, such as a tissue sample from a patient. The process 100 is initiated by obtaining a stained, slide-mounted biological sample containing cells 105, in which the stain has been applied in a manner to permit localization of at least one cellular compartment and at least one biomarker. The at least one cellular compartment includes, but is not limited to, a cytoplasm, a nucleus, and a cell wall. The at least one biomarker may include a biomarker labeled or detected via a CY5 fluorescent signal, used to detect a tumor cell. Examples include but are not limited to HER2, ER, PR, EGFR, ERCC1, TS and the like.

The process 100 continues by obtaining a pixel-comprised image or set of images of the stained tissue sample 110. The pixel-comprised image or set of images may be taken using a standardized optical system, such as an optical microscope system, that includes a light source. The pixel-comprised image may be obtained through the standardized optical system using a digital image sensor, such as a digital camera. In some embodiments, the image may be taken by an analogue camera, whereby the resulting analogue image or film is digitized by a digital scanner or equivalent means. In at least some embodiments, the camera may be mounted directly or indirectly to a microscope to obtain a pixel-comprised image in the form of a micrograph. Alternatively or in addition, the camera may rely on a direct connection to a video out line on an existing imaging system. In some embodiments, the light source includes light with wavelengths in one or more of the visible spectrum, the infrared (IR) spectrum, and the ultraviolet (UV) spectrum. The camera may be a video camera or may capture a time-lapse photographic sequence, providing real-time and elapsed time images that contain dynamic cellular activity.

The pixel-comprised image obtained by the process 100 is then analyzed to derive one or more data sets from the image pixels to differentiate a data signal from noise 115. This analysis can be performed automatically, for example, using a processor. The processor may include one or more computer processors, controllable according to a preprogrammed instruction set. In some embodiments differentiating a data signal from noise includes a statistical analysis of the signal, for example an unsupervised cluster analysis resulting in segregation of pixels with signal from pixels having signal attributable to noise. In some embodiments, differentiating a data signal from noise may include subtracting a focused image of the slide-mounted tissue sample from a defocused image of the slide-mounted tissue sample. In some embodiments, the defocused image may include an image focused just below the slide-mounted tissue sample. In general, defocusing an image acts like a spatial low pass filter, allowing for a background measurement to which the data signal may be compared.
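By way of example only, an unsupervised segregation of signal pixels from noise pixels might be sketched as a two-cluster, one-dimensional k-means over pixel intensities. K-means is chosen here purely for illustration; the particular clustering algorithm, the function name, and the midpoint threshold are assumptions of this sketch rather than details of the present disclosure.

```python
from typing import List, Sequence, Tuple


def split_signal_from_noise(intensities: Sequence[float],
                            max_iterations: int = 100
                            ) -> Tuple[float, List[bool]]:
    """Two-cluster 1-D k-means over pixel intensities.

    Returns the threshold midway between the two cluster centers and a
    mask that is True for pixels assigned to the brighter (signal) cluster.
    """
    lo_center = float(min(intensities))
    hi_center = float(max(intensities))
    if lo_center == hi_center:
        # A flat image has no separable signal cluster.
        return lo_center, [False] * len(intensities)
    for _ in range(max_iterations):
        threshold = (lo_center + hi_center) / 2.0
        low = [v for v in intensities if v <= threshold]
        high = [v for v in intensities if v > threshold]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo_center and new_hi == hi_center:
            break  # cluster centers have stopped moving
        lo_center, hi_center = new_lo, new_hi
    threshold = (lo_center + hi_center) / 2.0
    return threshold, [v > threshold for v in intensities]


threshold, is_signal = split_signal_from_noise([1, 2, 3, 50, 52, 54])
```

No labeled training pixels are required: the two clusters emerge from the intensity distribution of the image itself, which is what makes the step unsupervised.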

Next, the pixel-comprised image obtained by the process 100 is analyzed to derive one or more data sets from the image pixels to differentiate data signals attributable to each of at least one cellular compartment 120. This analysis can also be performed automatically, for example, using a preprogrammed computer or a dedicated special purpose processor. In some embodiments, differentiation may depend on a fluorescence attributable to a marker or stain applied to the slide-mounted sample. In some embodiments, the fluorescence of stains directed to the at least one cellular compartment varies in wavelength. In some embodiments, a blue fluorescence may be associated with a nucleus receptor (DAPI), a green fluorescence may be associated with a cell cytoplasm receptor (cytokeratin), and a red fluorescence may be associated with a cell membrane receptor (alpha-catenin).
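As a simplified illustration of wavelength-based differentiation, each pixel might be assigned to the compartment whose stain channel is brightest, provided that channel exceeds a noise floor. The channel-to-compartment mapping follows the example colors given above; the dominant-channel rule, the `noise_floor` value, and all names are hypothetical and stand in for whatever unsupervised analysis is actually used.

```python
from typing import Dict, Optional

# Mapping taken from the example stains above: DAPI (blue),
# cytokeratin (green), alpha-catenin (red).
CHANNEL_TO_COMPARTMENT: Dict[str, str] = {
    "blue": "nucleus",
    "green": "cytoplasm",
    "red": "membrane",
}


def assign_compartment(pixel: Dict[str, float],
                       noise_floor: float = 10.0) -> Optional[str]:
    """Assign a pixel to the compartment of its brightest stain channel,
    or to no compartment when every channel is at or below the noise floor."""
    channel = max(CHANNEL_TO_COMPARTMENT, key=lambda c: pixel.get(c, 0.0))
    if pixel.get(channel, 0.0) <= noise_floor:
        return None
    return CHANNEL_TO_COMPARTMENT[channel]


nuclear = assign_compartment({"blue": 200.0, "green": 30.0, "red": 5.0})
background = assign_compartment({"blue": 3.0, "green": 4.0, "red": 2.0})
```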

Preferably, the stain is applied in a manner to permit localization of at least one cellular compartment and at least one biomarker. In some embodiments, the at least one biomarker, previously described in 105, is used to detect a tumor cell or a target protein or a target antigen and may be an indication of a tumor cell in the slide-mounted tissue sample. The process 100 includes a step to associate or otherwise to correlate the previously differentiated data signals attributable to each of the at least one cellular compartment 120 with the detected tumor cell or tumor mask. In doing so, the one or more data sets derived from the image pixels are automatically analyzed to differentiate a data signal attributable to at least one biomarker for each of the at least one cellular compartment 125. The process 100 reproducibly quantifies the amount of biomarker expressed in at least one of the at least one cellular compartment for the slide-mounted tissue sample 130.
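As recited in the claims, the standardized score is a product of a raw score and one or more of a calibration cube factor, a light source factor and an optical path factor. A minimal sketch of that product follows; the function name, the default of 1.0 for any factor not applied, and the numeric values are assumptions used only for illustration.

```python
from typing import Dict


def standardized_score(raw_score: float,
                       calibration_cube_factor: float = 1.0,
                       light_source_factor: float = 1.0,
                       optical_path_factor: float = 1.0) -> float:
    """Scale a raw score by the selected instrument factors.

    A factor left at 1.0 is treated as not applied.
    """
    return (raw_score
            * calibration_cube_factor
            * light_source_factor
            * optical_path_factor)


# Hypothetical raw scores per cellular compartment for one sample.
raw_by_compartment: Dict[str, float] = {"nucleus": 100.0, "cytoplasm": 40.0}
standardized = {
    name: standardized_score(raw,
                             calibration_cube_factor=1.1,
                             light_source_factor=0.9)
    for name, raw in raw_by_compartment.items()
}
```

Because the same factors multiply every compartment score from a given instrument, scores from different instruments are placed on a common scale without changing their relative ordering within a run.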

FIG. 2 shows a detailed process 200 of image processing steps for reproducibly quantifying an amount of a biomarker expressed in each of one or more cellular components of a slide-mounted sample. The process 200 is initiated by obtaining a digital image of a stained tissue sample 205. The digital image is obtained in a manner similar to that described in steps 105, 110 in process 100. Next, the process 200 directs the image quality to be tested for signal integrity 210, sample integrity 215, and image integrity 220. If any one or more image quality tests of signal integrity 210, sample integrity 215, and image integrity 220 fail, the digital image may be removed from analysis or may be flagged for a manual review of the digital image by the operator 225 to either accept the digital image for analysis or to remove the digital image from analysis 226.

Passing image quality tests for signal integrity 210, sample integrity 215, and image integrity 220 or passing a manual review of the digital image 225 identifies the digital image as a candidate for further image processing in process 200. Failure to pass image quality tests for any one or more of signal integrity 210, sample integrity 215, and image integrity 220 and failing to pass a manual review of the digital image 225 identifies the digital image as low quality, and the digital image is rejected. Briefly, signal integrity includes a cellular compartmentalization, a fluorescence intensity, and a saturated pixel percentage assessment. Saturation may be assessed by determining that a predetermined number of pixels in the pixel-comprised image are represented by data and data structures that include a maximum or near maximum value. Sample integrity includes a determination that a sufficient sample of interest (e.g., tumor tissue) is present for analysis. Image integrity includes, for example, a detection of any out of focus images, and in the case of TMAs, analysis and detection of any split images.
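The quality gate described above might be sketched as follows, purely by way of example. The saturation check counts pixels at or near the maximum value, and the decision function routes an image to analysis, manual review, or rejection. The 8-bit maximum, the `near_max_margin` parameter, and the string labels are assumptions of this sketch.

```python
from typing import Optional, Sequence


def saturated_fraction(pixels: Sequence[int],
                       max_value: int = 255,
                       near_max_margin: int = 0) -> float:
    """Fraction of pixels at, or within a margin of, the maximum value."""
    limit = max_value - near_max_margin
    return sum(1 for p in pixels if p >= limit) / len(pixels)


def quality_decision(signal_ok: bool,
                     sample_ok: bool,
                     image_ok: bool,
                     operator_accepts: Optional[bool] = None) -> str:
    """Return 'analyze', 'manual_review' or 'reject' for one digital image.

    operator_accepts is None until a manual review has taken place.
    """
    if signal_ok and sample_ok and image_ok:
        return "analyze"
    if operator_accepts is None:
        return "manual_review"   # flag for the operator
    return "analyze" if operator_accepts else "reject"


fraction = saturated_fraction([0, 100, 255, 255])
first_pass = quality_decision(signal_ok=True, sample_ok=False, image_ok=True)
after_review = quality_decision(True, False, True, operator_accepts=False)
```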

While the invention has been described in connection with the specific embodiments thereof, it will be understood that it is capable of further modification. Furthermore, this application is intended to cover any variations, uses, or adaptations of the invention, including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains, and as fall within the scope of the appended claims.

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

1. Samples

Any cell containing sample may be analyzed by the methods of the present invention. For example, the sample may be prepared from tissues collected from patients. Alternatively, the sample may be a cell containing biological sample such as a blood sample, bone marrow sample, or a cell line. The samples may be whole-tissue or TMA sections on microscope slides. Particularly when using tissue microarrays (TMAs), samples may be arranged as “spots” or “histospots” on a slide, with each histospot corresponding to a particular sample. Such methods for preparing slide mounted tissue samples are well known in the art and suitable for use in the present invention.

2. Biomarkers

As used herein, a biomarker is a molecule that may be measured in a biological sample as an indicator of tissue type, normal or pathogenic processes or a response to a therapeutic intervention. In a particular embodiment, the biomarker is selected from the group consisting of: a protein, a peptide, a nucleic acid, a lipid and a carbohydrate. More particularly, the biomarker may be a protein. Certain markers are characteristic of particular cells, while other markers have been identified as being associated with a particular disease or condition. Examples of known prognostic markers include enzymatic markers such as, for example, galactosyl transferase II, neuron specific enolase, proton ATPase-2, and acid phosphatase. Hormone or hormone receptor markers include human chorionic gonadotropin (HCG), adrenocorticotropic hormone, carcinoembryonic antigen (CEA), prostate-specific antigen (PSA), estrogen receptor, progesterone receptor, androgen receptor, gC1q-R/p33 complement receptor, IL-2 receptor, p75 neurotrophin receptor, PTH receptor, thyroid hormone receptor, and insulin receptor.

Cell containing samples may be stained using any reagent or biomarker label, such as dyes or stains, histochemicals, or immunohistochemicals that directly react with the specific biomarkers or with various types of cells or cellular compartments. Not all stains/reagents are compatible. Therefore the type of stains employed and their sequence of application should be well considered, but can be readily determined by one of skill in the art. Such histochemicals may be chromophores detectable by transmittance microscopy or fluorophores detectable by fluorescence microscopy. In general, cell containing samples may be incubated with a solution comprising at least one histochemical, which will directly react with or bind to chemical groups of the target. Some histochemicals must be co-incubated with a mordant or metal to allow staining. A cell containing sample may be incubated with a mixture of at least one histochemical that stains a component of interest and another histochemical that acts as a counterstain and binds a region outside the component of interest. Alternatively, mixtures of multiple probes may be used in the staining, and provide a way to identify the positions of specific probes. Procedures for staining cell containing samples are well known in the art.

A wide variety of proprietary fluorescent organelle-specific probes are commercially available, and include mitochondria-specific probes (MitoFluor and MitoTracker dyes), endoplasmic reticulum (ER) and Golgi probes (ER-Tracker and various ceramide conjugates), and lysosomal probes (LysoTracker dyes). These probes, as well as many nonproprietary fluorescent histochemicals, are available from and extensively described in the Handbook of Fluorescent Probes and Research Products 8th Ed. (2001), available from Molecular Probes, Eugene, Oreg.

Each cell containing sample may be co-incubated with appropriate substrates for an enzyme that is a cellular component of interest and appropriate reagents that yield colored precipitates at the sites of enzyme activity. Such enzyme histochemical stains are specific for the particular target enzyme. Staining with enzyme histochemical stains may be used to define a cellular component or a particular type of cell. Alternatively, enzyme histochemical stains may be used diagnostically to quantitate the amount of enzyme activity in cells. A wide variety of enzymatic substrates and detection assays are known and described in the art.

Acid phosphatases may be detected through several methods. In the Gomori method for acid phosphatase, a cell preparation is incubated with glycerophosphate and lead nitrate. The enzyme liberates phosphate, which combines with lead to produce lead phosphate, a colorless precipitate. The tissue is then immersed in a solution of ammonium sulfide, which reacts with lead phosphate to form lead sulfide, a black precipitate. Alternatively, cells may be incubated with a solution comprising pararosanilin-HCl, sodium nitrite, naphthol AS-BI phosphate (substrate), and veronal acetate buffer. This method produces a red precipitate in the areas of acid phosphatase activity. Owing to their characteristic content of acid phosphatase, lysosomes can be distinguished from other cytoplasmic granules and organelles through the use of this assay.

Dehydrogenases may be localized by incubating cells with an appropriate substrate for the species of dehydrogenase and a tetrazolium salt. The enzyme transfers hydrogen from the substrate to the tetrazolium salt, reducing it to formazan, a dark precipitate. For example, NADH dehydrogenase is a component of complex I of the respiratory chain and is localized predominantly to the mitochondria.

Immunohistochemistry is among the most sensitive and specific histochemical techniques. Each sample may be combined with a labeled binding composition comprising a specifically binding probe. Various labels may be employed, such as fluorophores, or enzymes that produce a product that absorbs light or fluoresces. A wide variety of labels are known that provide for strong signals in relation to a single binding event. Multiple probes used in the staining may be labeled with more than one distinguishable fluorescent label. These color differences provide a way to identify the positions of specific probes. The method of preparing conjugates of fluorophores and proteins, such as antibodies, is extensively described in the literature and does not require exemplification here.

Further amplification of the signal can be achieved by using combinations of specific binding members, such as antibodies and anti-antibodies, where the anti-antibodies bind to a conserved region of the target antibody probe, particularly where the antibodies are from different species. Alternatively specific binding ligand-receptor pairs, such as biotin-streptavidin, may be used, where the primary antibody is conjugated to one member of the pair and the other member is labeled with a detectable probe. Thus, one effectively builds a sandwich of binding members, where the first binding member binds to the cellular component and serves to provide for secondary binding, where the secondary binding member may or may not include a label, which may further provide for tertiary binding where the tertiary binding member will provide a label.

The secondary antibody, avidin, streptavidin or biotin are each independently labeled with a detectable moiety, which can be an enzyme directing a colorimetric reaction of a substrate having a substantially non-soluble color reaction product, a fluorescent dye (stain), a luminescent dye or a non-fluorescent dye. Examples concerning each of these options are listed below.

In principle, any enzyme that (i) can be conjugated to or bind indirectly to (e.g., via conjugated avidin, streptavidin, biotin, secondary antibody) a primary antibody, and (ii) uses a soluble substrate to provide an insoluble product (precipitate) could be used.

Beta-galactosidase substrates include, but are not limited to, 5-bromo-4-chloro-3-indolyl beta-D-galactopyranoside (X-gal, blue precipitate). The precipitates associated with each of the substrates listed have unique detectable spectral signatures (components).

The enzyme can also be directed at catalyzing a luminescence reaction of a substrate, such as, but not limited to, luciferase and aequorin, having a substantially non-soluble reaction product capable of luminescing or of directing a second reaction of a second substrate, such as, but not limited to, luciferin and ATP or coelenterazine and Ca²⁺, having a luminescing product.

Nucleic acid biomarkers may be detected using in-situ hybridization (ISH). In general, a nucleic acid sequence probe is synthesized and labeled with either a fluorescent probe or one member of a ligand:receptor pair, such as biotin/avidin, labeled with a detectable moiety. Exemplary probes and moieties are described in the preceding section. The sequence probe is complementary to a target nucleotide sequence in the cell. Each cell or cellular compartment containing the target nucleotide sequence may bind the labeled probe. Probes used in the analysis may be either DNA or RNA oligonucleotides or polynucleotides and may contain not only naturally occurring nucleotides but also their analogs, such as digoxigenin dCTP, biotin dCTP, 7-azaguanosine, azidothymidine, inosine, or uridine. Other useful probes include peptide probes and analogues thereof, branched gene DNA, peptidomimetics, peptide nucleic acids, and/or antibodies. Probes should have sufficient complementarity to the target nucleic acid sequence of interest so that stable and specific binding occurs between the target nucleic acid sequence and the probe. The degree of homology required for stable hybridization varies with the stringency of the hybridization. Conventional methodologies for ISH, hybridization and probe selection are described in Leitch, et al. In Situ Hybridization: a practical guide, Oxford BIOS Scientific Publishers, Microscopy Handbooks v. 27 (1994); and Sambrook, J., Fritsch, E. F., Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (1989).

In one embodiment, the optimal dilution of a reagent used in the present assays, such as a staining or IHC reagent described herein, may be quantitatively and automatically determined. In one embodiment, multiple dilution sets are imaged, where each of the dilution sets consists of a different respective dilution value and a respective arrangement of immunoassay staining intensity values. A respective dynamic range metric is determined for each of the multiple dilution sets relative to the respective arrangement of immunoassay staining intensity values. Having found the respective dynamic range metric, a dilution set having the numerically optimal dynamic range metric is selected and the dilution value of that dilution set is selected as being representative of an optimal dilution level of the reagent for use in the present invention.

For example, a slide-mounted tissue sample is stained with one of the dilution series of the primary antibody utilizing common immunohistochemistry techniques described above. The resulting stained specimens are each imaged using a system for viewing the detectable signal and acquiring an image, such as a digital image of the staining. Methods for image acquisition are described in more detail below. The images thus obtained are then used by the method of the invention for quantitatively determining the optimal concentration of the reagent for use in the present invention. Each tissue sample set includes multiple different tissue samples prepared with a respective titer dilution, such that different tissue sample sets have different respective titer dilutions. A quantitative analysis is performed of the pixelized images of the multiple tissue sample sets.

For each dilution set of the multiple dilution sets, a dynamic range metric and a specificity of staining are each calculated. In one embodiment of the present invention, the dynamic range metric is an average absolute deviation. In another embodiment of the present invention, the data are log transformed, and the dynamic range metric is a weighted combination of a standard deviation, a variance, and a swing ratio. The specificity of staining is calculated to maximize specific signal while minimizing noise. The specificity of staining may be computed by summing each of a set of immunoassay staining intensity values associated with a stain-specific compartment and then computing a stain-specific average for the stain-specific compartment, and also summing each of a set of immunoassay staining intensity values associated with a non-stain-specific compartment and then computing a non-stain-specific average. Following the calculation of these two averages, the stain-specific average can be divided by the non-stain-specific average to produce the specificity of staining, or a Signal to Noise Metric. In such an embodiment, a numerically large specificity of staining value is optimal. In another embodiment the non-stain-specific average is divided by the stain-specific average to produce the specificity of staining. In such an embodiment, a numerically small specificity of staining value is optimal. Following the calculation of the dynamic range metric and specificity of staining for each of the dilution sets, the dynamic range metric and specificity of staining can be combined with one another to generate a combination value for each dilution set. The resulting combination values are used to select the dilution set with the most numerically optimal combination value. Associated with the selected dilution set is a dilution value representative of an optimal dilution of a reagent. Optionally, the process performs multiple comparisons to attempt to identify multiple stain-specific and non-stain-specific compartments.
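
As an illustration only, the selection of a dilution described above can be sketched in Python. The sketch assumes that the dynamic range metric is the average absolute deviation of the stain-specific intensity values, that the two metrics are combined as a plain product, and that a larger combination value is better. The function names, the combination rule, and the intensity values are hypothetical and are not taken from the specification.

```python
from statistics import mean


def average_absolute_deviation(values):
    """Mean absolute distance of each value from the mean of the values."""
    m = mean(values)
    return mean(abs(v - m) for v in values)


def specificity_of_staining(specific, nonspecific):
    """Stain-specific average divided by non-stain-specific average.

    In this form a numerically large value is optimal.
    """
    return mean(specific) / mean(nonspecific)


def select_dilution(dilution_sets):
    """Return the dilution value with the largest combination value.

    dilution_sets maps a dilution value to a pair of lists:
    (stain-specific intensities, non-stain-specific intensities).
    Combining the two metrics as a product is an assumption made
    for this sketch only.
    """
    def combination(item):
        specific, nonspecific = item[1]
        dynamic_range = average_absolute_deviation(specific)
        return dynamic_range * specificity_of_staining(specific, nonspecific)

    best_dilution, _ = max(dilution_sets.items(), key=combination)
    return best_dilution


# Hypothetical intensity values for three titer dilutions.
example = {
    "1:100": ([200, 210, 220, 230], [150, 160, 155, 158]),
    "1:500": ([60, 120, 180, 240], [20, 25, 22, 21]),
    "1:2500": ([30, 35, 40, 45], [18, 20, 19, 21]),
}
print(select_dilution(example))
```

In this hypothetical data the middle dilution combines a wide spread of specific signal with low non-specific signal, which is the kind of trade-off the combination value is meant to capture.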

4. Instrument Standardization and Image Collection

Once the sample has been stained, any optical or non-optical imaging device can be used to detect the stain or biomarker label, such as, for example, upright or inverted optical microscopes, scanning confocal microscopes, cameras, scanning or tunneling electron microscopes, scanning probe microscopes, and imaging infrared detectors etc.

In one embodiment, the imaging device is a microscope system that includes an illumination source configured to illuminate a target sample, optics configured to produce a magnified image of the illuminated target sample, and a detector, such as a digital camera, configured to capture a digital image of the magnified image. Quantitative results can be obtained through manipulation of the captured digital images. Such image manipulation can include image processing techniques known to those skilled in the art. In at least some embodiments, one or more of such image capture and image manipulation is accomplished with the aid of a processor. The processor can include a computer implementing pre-programmed instructions.

For example, a tissue sample or tissue microarray can be imaged as follows: a user places the microarray on a sample stage. The user adjusts the sample stage so that the first region of interest or first histospot is at the center of the field of view and focused on by the CCD camera. The objective lens should be adjusted to the appropriate resolution, for example, a 0.6 millimeter sample can be viewed at 10× magnification. If paraffin mounted, the sample generally corresponds to areas of higher light intensity than the surrounding paraffin, as assessed through various means including signals derived from the visible light scattering of stained tissues, tissue autofluorescence or from a fluorescent tag. A computer can acquire a low-resolution image (e.g. 64 pixel×64 pixel with 16 bit resolution) using computer software (Softworx 2.5, Applied Precision, Issaquah, Wash.) and an imaging platform (e.g., Deltavision). A computer automatically translates the sample stage by an amount approximately equal to a field of view. The computer then acquires a second low-resolution image. This process is repeated until the computer has acquired images of the entire tissue sample or microarray. Using commercially available software, the computer then generates a composite image of the entire tissue sample or microarray.

To optionally standardize quantitative results obtained using a particular system, a system intrinsic factor can be determined to account for intensity variability of the excitation source and device variability, e.g., along the optical path. In order to achieve this, a measurement of the intensity of the excitation light source may also be obtained, for example, by using an inline lamp intensity measuring tool. Also, a measurement of a standard or a calibration sample, e.g., a calibration microscope slide, may be obtained using the particular system to define one or more optical path factors. Use of such a calibration slide is particularly useful for fluorescence-based IHC applications, in which sample fluorescent regions of the calibration slide emit radiation within respective bandwidths. The fluoresced emissions allow for characterization of an optical path at each of the one or more respective wavelengths. These measurements can be obtained simultaneously or separately.

The system also optionally includes a calibration device configured to redirect a standardized sample of the illumination source to the detector, although it has been found that the methods of the present invention do not require the use of such device. In at least some embodiments a system processor is configured to determine a correction factor for a given microscope. The correction factor can be determined from a measurement of the standardized sample of the illumination source obtained using the calibration device. The correction factor can be used (e.g., by the processor) to correct for any variations in intensity of a detected image of the target sample. For example, a calibration cube factor (CC) is determined by comparison to a universal standard cube. A light source factor (LS) is determined by summing the pixel intensities of a captured image of the calibration surface. The optical path factor (OP) is the quotient of the average total light intensity of 16 images taken for each cube/sample combination. The CC and OP factors are intrinsic to the specific hardware system being studied and need only be calculated once or at an interval where one would suspect some type of modification in the optics has occurred.

The standardized score is the product of the raw score and one or more of these factors, where the CC and OP factors are defined upon system set-up/construction and the LS factor is measured simultaneously.

In some embodiments, a system processor is configured with instructions (e.g., software) for obtaining the calibration factor. Alternatively or in addition, the system processor is configured with instructions for using the correction factor to correct detected images. Such calibration is useful to reduce variability in intensity of the illumination source within the same microscope system, as may occur over time, and between quantitative results obtained using different microscope systems and/or different illumination sources.

5. Image Optimization

a. Exposure Time

The dynamic range of pixel intensity data from the collected image may be optionally optimized to further reduce run-to-run variations, especially due to staining intensity and equipment differences that affect exposure times. The process includes capturing an image of a subject within the field of view of a camera at a first exposure time, resulting in a captured image comprising a predetermined number of pixels, wherein each pixel has an intensity value. A frequency distribution of pixel intensities of the captured image is queried to determine a region of the greatest frequency occurrence of the pixel intensities of the frequency distribution. Exposure time is then adjusted from the first exposure time to shift that region of highest frequency distribution toward the middle of the range of intensity values. In other words, the center of mass (COM) of a histogram, or frequency distribution is determined from which an adjusted exposure time is calculated to achieve an optimized dynamic range of pixel intensities. A second image of the subject can then be captured at the adjusted exposure time resulting in an image having an optimal dynamic range.
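
A minimal Python sketch of the center-of-mass adjustment follows. It assumes an 8-bit image supplied as a flat list of integer pixel values, and it assumes that pixel intensity scales linearly with exposure time, so that the adjusted exposure is the first exposure multiplied by the ratio of the mid-range intensity to the COM. The linear scaling rule and the function names are illustrative assumptions and are not stated in the specification.

```python
def histogram(pixels, bins=256):
    """Frequency distribution of integer pixel intensities in [0, bins)."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1
    return counts


def center_of_mass(counts):
    """Frequency-weighted mean intensity (COM) of a histogram."""
    total = sum(counts)
    return sum(i * c for i, c in enumerate(counts)) / total


def adjusted_exposure(exposure, pixels, bins=256):
    """Scale the exposure time so that the COM moves toward mid-range.

    Assumes intensity is proportional to exposure time, which is a
    simplification made for this sketch (no saturation, no offset).
    """
    com = center_of_mass(histogram(pixels, bins))
    if com == 0:
        raise ValueError("image is entirely black; COM is undefined for scaling")
    midpoint = (bins - 1) / 2
    return exposure * midpoint / com


# Hypothetical under-exposed image: all pixels near intensity 60 of 255.
dark_image = [50] * 100 + [70] * 100
print(adjusted_exposure(100.0, dark_image))
```

A second image captured at the returned exposure time would then be checked again, since real detectors are only approximately linear.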

There are various ways to correct exposure for the methods described herein. One correction technique is to iteratively acquire a new image at a longer or shorter exposure time than that of the previous image until saturated pixels are minimized and the optimal dynamic range is achieved. This iterative process allows for a quick adjustment in exposure time to bring the pixel intensities down within the range of detection to optimize exposure and dynamic range. However this simplistic approach may also cause the system to overcorrect for saturated pixels and set the new exposure time too low. Therefore it is desirable to modify the aggressiveness of the correction to the exposure time to be proportional to how many pixels are saturated in the previous image.

To achieve this the new exposure time may be calculated as:

$$E = E' \times \left(1 - (0.5)^{(1+S)}\right), \quad \text{where} \quad S = A\,\frac{CCD_x \times CCD_y \times SL}{P}$$

where E is the new exposure time, E′ is the currently set exposure time, A is an aggression level, SL is the saturation limit, CCDx and CCDy represent the pixel dimensions of the captured image, and P is the count of pixels at maximum intensity. The aggression level, A, may vary but, generally, the values that one would want to choose would depend upon the amount by which images tend to be over saturated. A value of zero (0) for A represents a minimum value for which the exposure time would be halved. A practical maximum value for A is about 10, after which the exposure time will not change enough for the algorithm to be useful. In a preferred embodiment of the invention, the value for A can fall in the range of about 0≦A≦4.5. More preferably, A is set at about 3.5.

The procedure of reducing exposure time to ensure the image is not overexposed is typically a multi-step process. In an exemplary embodiment, a 256 bin histogram is generated first for an 8-bit per pixel image obtained from the camera at the current exposure time, E′. The number of saturated pixels is identified and compared to a predetermined saturation threshold value. Then, if the image is at or below the saturation limit, the over-exposure procedure is exited. However, if the image is over exposed, the exposure time is decreased. The new, decreased exposure time can vary based upon the number of currently over exposed pixels. In an exemplary embodiment, a value S can be determined as

$$S = A\,\frac{0.0002 \times 2048^2}{M},$$

in which A is an “aggression level” currently defined at 3.5 and M is the count of pixels at maximum intensity. Then, the next exposure time E is derived as follows:

$$E = E' - E'\,(0.5)^{1+S}$$

When the number of over exposed pixels is much greater than the saturation limit, E≈E′−0.5E′ (i.e., the exposure time would be halved). The minimum amount of change to the current exposure time occurs when the number of over saturated pixels is very nearly equal to the saturation limit, in which case E≈E′−0.088E′. Because the algorithm is exited when the image is at or below the saturation limit, a correction is never applied with the number of saturated pixels exactly equal to the saturation limit. The procedure of reducing exposure time can be repeated in an iterative manner until the amount of overexposure is within a chosen threshold, or until a maximum number of iterations has been accomplished. In either instance the over-exposure correction routine is then exited.
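
The over-exposure routine can be sketched in Python as follows. The sketch applies the exposure update E = E′ × (1 − 0.5^(1+S)) with S = A × (CCDx × CCDy × SL) / P, using the exemplary values given above as defaults. The `capture` callback, which returns the number of saturated pixels at a given exposure time, is a hypothetical stand-in for the camera and is not part of the specification.

```python
def reduced_exposure(current, saturated, aggression=3.5,
                     saturation_limit=0.0002, width=2048, height=2048):
    """One correction step of the over-exposure routine.

    current          -- currently set exposure time E'
    saturated        -- count of pixels at maximum intensity (P or M)
    aggression       -- aggression level A
    saturation_limit -- allowed fraction of saturated pixels (SL)
    """
    allowed = saturation_limit * width * height
    if saturated <= allowed:
        return current  # at or below the saturation limit: leave unchanged
    s = aggression * allowed / saturated
    return current * (1 - 0.5 ** (1 + s))


def correct_overexposure(capture, exposure, max_iterations=10, **settings):
    """Repeat the correction until no change is needed or iterations run out.

    capture(exposure) is a hypothetical callback that acquires an image at
    the given exposure time and returns its saturated-pixel count.
    """
    for _ in range(max_iterations):
        updated = reduced_exposure(exposure, capture(exposure), **settings)
        if updated == exposure:
            break
        exposure = updated
    return exposure


# Toy camera model: heavily saturated above an exposure of 30, clean below.
def toy_capture(exposure):
    return 10**9 if exposure > 30 else 0


print(correct_overexposure(toy_capture, 100.0))
```

With the toy camera, the exposure is roughly halved twice and the routine then exits because the third image is no longer saturated.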

An alternative and equally viable process for correcting for overexposure is to acquire a new image at a minimum exposure time, then proceed with optimizing the exposure time by calculating the COM and bringing it within range of the midpoint, as described above.

b. Image Validation

The quality of the image may be automatically assessed and optimized to reduce variations due to one or more factors selected from stain uniformity, stain quality, tissue sufficiency, tissue sample position, signal saturation, focus, and signal intensity. Thus, images can be corrected for false readings due to no sample, too little sample, debris, multiple sample or “split image” (in the case of TMA analysis) and poor focus such that invalid images may be excluded from subsequent data analysis. Each stain may be validated independently from other stains on the same or different images. These assessments and optimizations may be performed automatically by an image-processing program with particular threshold values fixed in the program or provided by the user.

To optionally determine stain uniformity across the slide or at least the imaged portion of the slide, the intensity values of vertical columns of the image pixels are combined along the respective column and plotted across the x-axis. The combination can be a straightforward addition of pixel intensity values along the column. Alternatively or in addition, the combination can be a statistically arrived at value, such as an average intensity value of all of the pixels in the column. For example, with an image using 8 bits to represent intensity, there are 256 possible pixel intensity values for each pixel. The pixel intensity values span a range from black (e.g., “0”) to white (e.g., “255”). Values in between black and white are associated with varying shades of gray. The relative maximum intensity values, or peak values, may be compared between different regions of the image to determine stain uniformity and positional bias. If a bias is found, the image may be excluded from further analysis.
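
The column-wise combination can be sketched in Python as shown below. The image is assumed to be a list of rows of integer intensities. Splitting the profile into three equal-width regions and flagging a bias when the region peaks differ by more than 25% are hypothetical choices made for illustration; the specification does not fix the number of regions or the tolerance.

```python
def column_profile(image):
    """Sum pixel intensities down each column of a row-major image."""
    return [sum(column) for column in zip(*image)]


def positional_bias(image, regions=3, tolerance=0.25):
    """Compare peak column sums between equal-width vertical regions.

    Returns True when the lowest regional peak falls short of the highest
    by more than `tolerance` (a hypothetical threshold). Assumes the image
    has at least `regions` columns.
    """
    profile = column_profile(image)
    width = len(profile) // regions
    peaks = [max(profile[i * width:(i + 1) * width]) for i in range(regions)]
    highest, lowest = max(peaks), min(peaks)
    if highest == 0:
        return False  # blank image: nothing to compare
    return (highest - lowest) / highest > tolerance


# Hypothetical images: one evenly stained, one fading from left to right.
uniform = [[100] * 9 for _ in range(4)]
fading = [[200, 200, 200, 100, 100, 100, 20, 20, 20] for _ in range(4)]
print(positional_bias(uniform), positional_bias(fading))
```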

To optionally determine stain quality, the staining intensity of the compartment-specific stain inside the compartment is measured by analyzing the intensities of the pixels of the digital image that are identified as part of the compartment. For example, to measure the stain quality of a nuclear stain, total stain intensity within the nuclear cellular compartment may be formulated as a combination, such as a sum of the intensities of pixels identified as representing nuclei. Total stain intensity outside of the nuclear cellular compartment can be similarly formulated as a sum of the intensities of pixels identified as not nuclear. The two values for nuclei and non-nuclear are compared. For example, the two values can be combined in a ratio, the single value of the ratio indicative of the comparison. For example, the combined nuclei intensity can be divided by the combined non-nuclear intensity by the image processing program to provide a tissue stain quality ratio. A low ratio, such as a ratio approaching 1, is indicative of poor staining quality or poor tissue integrity. An acceptable minimum staining quality threshold can be fixed or settable by a user. Such samples identified as failing to meet the minimum staining threshold can be excluded from the data set and from further analysis by the validation program.
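
A short Python sketch of the tissue stain quality ratio follows. It assumes that the nuclear compartment is already available as a binary mask of the same shape as the image. The default minimum ratio of 2.0 is a hypothetical value chosen for the example, since the threshold is described only as fixed or user settable.

```python
def stain_quality_ratio(image, nuclear_mask):
    """Summed intensity inside the nuclear mask divided by that outside it."""
    inside = outside = 0
    for row, mask_row in zip(image, nuclear_mask):
        for value, is_nuclear in zip(row, mask_row):
            if is_nuclear:
                inside += value
            else:
                outside += value
    if outside == 0:
        return float("inf")  # no signal at all outside the compartment
    return inside / outside


def passes_stain_quality(image, nuclear_mask, minimum_ratio=2.0):
    """True when the ratio meets the (hypothetical) minimum threshold."""
    return stain_quality_ratio(image, nuclear_mask) >= minimum_ratio


# Hypothetical 2x2 nuclear-stain images; the left column is nuclear.
mask = [[1, 0], [1, 0]]
crisp = [[200, 10], [180, 20]]
washed_out = [[50, 50], [50, 50]]
print(stain_quality_ratio(crisp, mask), stain_quality_ratio(washed_out, mask))
```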

Tissue sufficiency may be analyzed by counting the pixels of an image with signal intensities above a threshold intensity then determining if the total number of positive pixels meets a minimum criterion for sufficient tissue. Likewise, the percent positive pixels to the total may be used as the criterion.
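
Both sufficiency criteria can be expressed in a few lines of Python. The threshold intensity, the minimum pixel count, and the minimum fraction used here are hypothetical parameters supplied by the caller, and the function name is illustrative.

```python
def tissue_sufficiency(image, intensity_threshold, min_pixels=None, min_fraction=None):
    """Count positive pixels and test them against either criterion.

    Returns (positive_count, positive_fraction, sufficient). A criterion
    left as None is not applied.
    """
    pixels = [value for row in image for value in row]
    positive = sum(1 for value in pixels if value > intensity_threshold)
    fraction = positive / len(pixels)
    sufficient = True
    if min_pixels is not None and positive < min_pixels:
        sufficient = False
    if min_fraction is not None and fraction < min_fraction:
        sufficient = False
    return positive, fraction, sufficient


# Hypothetical 2x4 image in which three pixels exceed the threshold of 40.
spot = [[10, 50, 60, 5], [0, 45, 20, 30]]
print(tissue_sufficiency(spot, 40, min_pixels=3, min_fraction=0.25))
```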

When analyzing tissue microarrays, tissue sample position may be assessed by calculating the average pixel intensity in each of multiple different sections identified within the field of view. Individual samples or histospots must be identified to determine position, which may be done by comparing the position of the edge of the sample with the center of the sample. The sample edge is determined based on the pixel intensity of rectangular areas to assess whether the sample is centered properly, and the central pixel intensity is measured to determine whether the edges are due to more than one sample. Sample edge detection may be performed using a discrete differentiation operator, such as a Sobel edge detector, or any other number of edge detectors well-known in the art. Incorrectly positioned samples or split spots containing more than one target may then be identified and excluded from further analysis.

Focus of the sample may be assessed, such as by determining a kurtosis value for the pixel intensities of the image. The staining intensity values of pixels in a digitized image can be plotted in a histogram. The distribution can be analyzed as an indication of focus. An in focus image will typically have a pixel intensity distribution with a relatively sharp, defined peak (higher kurtosis) compared to an out of focus image which will have a pixel intensity distribution with a flattened peak (lower kurtosis). The sharpness or flatness of such a distribution can be represented in a single value, such as a kurtosis value. A higher kurtosis value is indicative of a relatively sharp defined peak; whereas, a lower kurtosis value is indicative of a flattened peak.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. For univariate data Y1, Y2, . . . , YN, the formula for kurtosis is:

$$\text{kurtosis} = \frac{\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^4}{(N-1)\,s^4}$$

where $\bar{Y}$ is the mean, s is the standard deviation, and N is the number of data points. Excess kurtosis can be defined as

$$\text{kurtosis} = \frac{\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^4}{(N-1)\,s^4} - 3$$

so that the standard normal distribution has a kurtosis of zero. Positive kurtosis indicates a “peaked” distribution and negative kurtosis indicates a “flat” distribution. Images with negative kurtosis may then be excluded from further analysis.
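
The kurtosis formula above can be evaluated directly in Python, as sketched below. The sketch assumes that s is the sample standard deviation (N − 1 in the denominator), which is how `statistics.stdev` computes it. The two pixel lists are hypothetical and are chosen only to show the sign of the result for a flat and a peaked distribution.

```python
from statistics import mean, stdev


def kurtosis(values):
    """Sum of fourth-power deviations divided by (N - 1) * s**4.

    s is taken to be the sample standard deviation (an assumption of this
    sketch). Raises ZeroDivisionError if all values are identical.
    """
    m = mean(values)
    s = stdev(values)
    n = len(values)
    return sum((y - m) ** 4 for y in values) / ((n - 1) * s ** 4)


def excess_kurtosis(values):
    """Kurtosis shifted so that a normal distribution scores about zero."""
    return kurtosis(values) - 3


# Flat histogram: every 8-bit intensity occurs exactly once.
flat = list(range(256))
# Peaked histogram: nearly all pixels share one value, with two outliers.
peaked = [128] * 98 + [0, 255]

print(excess_kurtosis(flat), excess_kurtosis(peaked))
```

The flat list yields a negative excess kurtosis and would be excluded under the rule above, while the peaked list yields a positive value.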

Signal intensity may be assessed by sorting the signal intensity data measured from images acquired in each relevant channel for each histospot and identifying the number of samples with low staining intensities. Such samples may be excluded from further analysis.

6. AQUA® Scoring

Once the image is optimized and validated, with any invalid histospots or images removed, the image is virtually masked, three dimensional approximations of cells in the sample may be generated, and biomarkers are associated with subcellular compartments of individual cells. One such algorithm for automatically performing these tasks is the Automated QUantitative Analysis platform (AQUA® platform). This technique is also described in U.S. Pat. No. 7,219,016 and Camp et al., 2002 Nature Medicine 8(11)1323-1327 which are both specifically incorporated herein by reference in their entirety. However, for the first time, automation of each step is described herein, increasing the ease and reproducibility of this analysis.

In one embodiment tissue samples are stained with markers that define, for example, the cellular compartments of interest and the specific target (or targets) being studied. Pixel-based local assignment for compartmentalization of expression (PLACE) is the key algorithm that functions to effectively segment image pixels for the purpose of expression compartmentalization. A critical step in this algorithm is the setting of intensity thresholds that are used to delineate background or non-specific pixels from signal-specific pixels. Images that have been “masked” in this way are subsequently combined in a mutually-exclusive fashion such that pixels above the thresholds are assigned to specific cellular compartments. Once pixels have been assigned to each compartment, the signal for the target biomarker can then be averaged over all of the pixels assigned to a given compartment, which is the AQUA® score for that sample.

For example, a tumor-specific mask may be generated by manually thresholding the image of a marker (cytokeratin) that differentiates tumor from surrounding stroma and/or leukocytes. This creates a binary mask (each pixel is either ‘on’ or ‘off’). Thresholding levels are verified, and adjusted if necessary, by checking a small sample of images and then remaining images are automatically masked using the single determined threshold value. All subsequent image manipulations involve only image information from the masked area. Off target specific images may be clustered to iteratively adjust pixel intensities of nonstandard masked targets. The dilate image processing technique allows for a spatial low pass filter to fill in nearest-neighbor pixels that are surrounded by pixels included in the mask. The erode image processing technique allows for a spatial high pass filter to remove pixels that are not contiguous with the mask or that form structures that are contrary to structures expected for a given slide-mounted tissue sample. Such adjustments allow inclusion of valid but nonconforming samples that may otherwise be excluded from further analysis.

Next, the signal to noise ratio may be enhanced by correcting for background noise. For example, two images (one in-focus, one out of focus, e.g., taken 6 μm deeper into the sample) are taken of the compartment-specific tags and the target marker. The out of focus image acts as a spatial low pass filter that provides a background value. For example, a percentage of the out-of-focus image is subtracted from the in-focus image, based on a pixel-by-pixel analysis of the two images, such as by using an algorithm called RESA (Rapid Exponential Subtraction Algorithm). The RESA algorithm enhances the interface between areas of higher intensity staining and adjacent areas of lower intensity staining, allowing easier assignment of pixels to background and adjacent compartments. Finally, the PLACE algorithm assigns each pixel in the image to a specific cellular compartment. Pixels that cannot be accurately assigned to a compartment within a user-defined degree of confidence are discarded. For example, pixels where the nuclear and cytoplasmic pixel intensities are too similar to be accurately assigned are negated (for example, comprising <8% of the total pixels). Once each pixel is assigned to a cellular compartment (or excluded as described above), the signal in each location is summed to generate the AQUA® score for that sample, as shown in the following equation:

$$AQ = \frac{1}{\sum_i C_i} \sum_i T_i C_i$$

where AQ is the raw AQUA® score, $T_i$ is the ith target intensity, also known as power density, and $C_i$ is the ith cell compartment probability. These data are saved and can subsequently be expressed either as a percentage of total signal or as the average signal intensity per compartment area.
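As a minimal sketch of the equation above, the raw score may be computed as a compartment-probability-weighted average of the target intensities. The function name and the sample values are hypothetical and are used only for illustration.

```python
def aqua_raw_score(target_intensities, compartment_probs):
    """Raw score: sum(T_i * C_i) / sum(C_i) over all pixels."""
    if len(target_intensities) != len(compartment_probs):
        raise ValueError("one compartment probability is needed per pixel")
    total_c = sum(compartment_probs)
    if total_c == 0:
        raise ValueError("no pixel is assigned to the compartment")
    weighted = sum(t * c for t, c in zip(target_intensities, compartment_probs))
    return weighted / total_c


# Hypothetical target intensities (T) and compartment probabilities (C).
T = [10.0, 20.0, 30.0, 40.0]
C = [1.0, 0.5, 0.0, 1.0]
score = aqua_raw_score(T, C)  # (10 + 10 + 0 + 40) / 2.5
```

A pixel with a compartment probability of 0 contributes nothing to the score, which is how excluded pixels drop out of the sum.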

Preferably, the AQUA® score may be automatically normalized, for example, by clustering, to assign pixels to a particular cellular compartment based on intensity data. This clustering allows for further removal of background noise, assignment of specific pixels to a given compartment and probabilistic assignment of pixels to each compartment where there may be overlapping signals. Once pixels are assigned to each compartment (or discarded in the case of noise) the associated target signals can be measured, for example summed and a score calculated.

The assignment is preferentially determined on an image-to-image basis, rather than setting universal criteria. Furthermore, pixel assignment (e.g., Cy3/Cytokeratin pixels to cytoplasm) is also a function of other compartment images such that consideration is given to the status of pixels in other compartment images. In one embodiment one image is of a first stain that specifically labels a first compartment (e.g., a Cy3/cytokeratin image, representing the cytoplasmic compartment) and a second image is of a second stain that specifically labels a second compartment (e.g., DAPI image, representing the nuclear compartment) and pixel assignments are based on four criteria:

1.) Low intensity in both first and second image (e.g., DAPI and Cy3): BACKGROUND: REMOVE

4.) High second stain and first stain (e.g., DAPI and Cy3) intensity: INDETERMINANT: REMOVE

Clustering is a mathematical algorithmic function whereby centroids within data sets are defined by relative distances of each data point to one another, as determined, for example, by Euclidean or log-likelihood distance. While not wishing to be bound by theory, it is believed that clustering pixel intensities from at least two images (e.g., DAPI and Cy3), could result in centroids that define pixels as described, at least, by the above criteria. Because clustering is objective and can be performed individually on each image, clustering is a reliable method for assignment of pixels to compartments, independent of operator intervention.

In another embodiment, pixels containing signal indicative of both the first and second stain are assigned to compartments by the following method. Every pixel in the acquired images has three attributes: an intensity contribution from compartment marker A, an intensity contribution from compartment marker B, and an intensity contribution from the target or biomarker of interest. These intensities are measured in their respective fluorescence channels per the experimental configuration. To avoid experimental bias, the target intensity is not manipulated in this method. Thus, the data for the two compartment attributes can be illustrated schematically in a two-dimensional plot.

Pixels with a strong bias towards either of the axes can be assigned to that compartment (e.g., pixels in regions A and B could be absolutely assigned to compartments A and B respectively). Pixels near the origin represent low intensities for both channels and can be discarded as background along with outlier pixels that have high intensity but similar values. Pixels that remain in region A/B can then be assigned to each compartment based on probability. This assignment allows target signal in those pixels to be distributed across both compartments based on the probability characterization.

To define the regions described above, for example, for every image, clustering is used to determine three centroids in the data. This method is fully automated and does not require any operator decisions to proceed. The analysis is accomplished by performing k-means clustering on three centroids using Euclidean distances.
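For illustration only, k-means clustering on three centroids with Euclidean distances may be sketched as below. The starting centroids, the iteration limit, and the sample (marker A, marker B) intensity pairs are assumptions of this sketch; the text does not specify how the clustering is initialized.

```python
import math


def kmeans(points, centroids, iterations=50):
    """Lloyd's k-means with Euclidean distance; 'centroids' are starting guesses."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical (marker A, marker B) intensities for nine pixels.
pixels = [
    (2, 3), (4, 1), (3, 2),        # low in both channels
    (80, 10), (90, 5), (85, 12),   # biased toward marker A
    (8, 75), (12, 85), (10, 80),   # biased toward marker B
]
# Assumed starting guesses: the origin and a point on each axis.
centroids = kmeans(pixels, [(0, 0), (100, 0), (0, 100)])
```

With these inputs the three centroids settle on a background group near the origin and one group along each marker axis.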

The data are then analyzed as follows: (i) Background and outlier pixels are discarded from further calculation. A pixel is defined as background if its distance to the origin is less than twice that of the background centroid distance to the origin. A pixel is defined as an outlier if its intensity exceeds the value defined by the line or plane defined by the outermost centroids; (ii) Pixels in regions A and B are assigned exclusively to those two compartments; (iii) Pixels in the triangular region A/B are then assigned a probability value that allows them to essentially be distributed in multiple compartments. This probability value can be calculated based on distance from the two regions A and B, or, using a shape function that will also assign a probability of each pixel having a contribution from the background region by examining each pixel's distance from the three vertices given by the centroids; (iv) With all pixels assigned, the associated target scores can be summed up for each compartment and a score calculated using standard methods:

$$\frac{\sum_{i}^{\#\,\mathrm{pixels}} \mathrm{Int}_i \, P_i}{\sum_{i}^{\#\,\mathrm{pixels}} P_i}$$

where $\mathrm{Int}_i$ is the intensity of the ith pixel and $P_i$ is the probability of that pixel being assigned to a particular compartment (ranging from 0 to 1).
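The following sketch illustrates, under stated assumptions, how steps (i), (iii) and (iv) might fit together. The background rule follows the text; the probability rule (relative distance to the two compartment centroids) is only one possible choice assumed here, and the outlier test and the exclusive regions of step (ii) are omitted for brevity. All names and values are hypothetical.

```python
import math


def compartment_scores(pixels, bg_centroid, centroid_a, centroid_b):
    """pixels: (marker_a, marker_b, target) tuples. Returns (score_A, score_B)."""
    # Step (i): background radius is twice the background centroid's
    # distance to the origin.
    bg_radius = 2 * math.dist((0, 0), bg_centroid)
    num_a = num_b = den_a = den_b = 0.0
    for a, b, target in pixels:
        if math.dist((0, 0), (a, b)) < bg_radius:
            continue  # discard background pixel
        # Step (iii), assumed rule: the closer a pixel is to centroid A
        # relative to centroid B, the higher its probability for A.
        d_a = math.dist((a, b), centroid_a)
        d_b = math.dist((a, b), centroid_b)
        p_a = d_b / (d_a + d_b) if (d_a + d_b) else 0.5
        p_b = 1.0 - p_a
        # Step (iv): accumulate sum(Int_i * P_i) and sum(P_i) per compartment.
        num_a += target * p_a
        den_a += p_a
        num_b += target * p_b
        den_b += p_b
    score_a = num_a / den_a if den_a else 0.0
    score_b = num_b / den_b if den_b else 0.0
    return score_a, score_b


# Hypothetical pixels: (marker A, marker B, target intensity).
pixels = [(3, 4, 50), (100, 0, 10), (0, 100, 30), (50, 50, 20)]
score_a, score_b = compartment_scores(pixels, (3, 4), (100, 0), (0, 100))
```

Here the first pixel is discarded as background, the second and third are assigned wholly to compartments A and B respectively, and the fourth is split evenly between the two.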

Cutpoints are established using algorithms to separate samples into groups with specific features, such as samples containing tissues with different biomarker expression levels for one or more biomarker, as described in more detail in McCabe et al., J. Natl. Canc. Inst. (2005) 97(24):1808-1815, which is hereby incorporated by reference in its entirety. By reducing sample biomarker quantification results using the methods of the present invention, intra-group variation is minimized, differences between groups are maximized and more easily identified. For instance, a high expression level of a biomarker, represented by a high AQUA® score, may be more tightly correlated with aggressive disease and reduced survival, whereas a lower AQUA® score is not. By reliably distinguishing the two groups, the correlation between a biomarker and disease becomes more clear. Thus, AQUA® scores generated using the methods of the invention provide a reliable assay for comparing sample groups such that the biomarker may be more specifically correlated with the particular characteristics, leading to more reliable diagnosis and prognosis estimation on an individual sample.

7. Reproducibility

Because the variability of AQUA® scores is reduced through automatic instrument standardization, such as exposure optimization, as well as through normalization of the raw AQUA® scores by clustering, as well as by optionally improved image validation, the sensitivity and reproducibility of the assay is enhanced. For example, in one embodiment, the data signal attributed to two or more cell compartments can be more reliably distinguished. In a further embodiment data signal can be distinguished with at least about 90% confidence interval. In a further embodiment, the data signal can be distinguished with about a 95% confidence interval. In a further embodiment, the data signal can be distinguished with about a 99% confidence interval.

The normalized AQUA® score provides a more reproducible cutpoint determination, leading to greater agreement of sample classification between runs. In one embodiment, the assay provides for a greater than 85% concordance for sample classification from one run to another for each sample. In a further embodiment, the assay provides for a greater than 90% concordance for sample classification from one run to another for each sample. In a further embodiment, the assay provides for a greater than 95% concordance for sample classification from one run to another for each sample. In a further embodiment, the assay provides for a greater than 99% concordance for sample classification from one run to another for each sample.

The normalized AQUA® score provides a more reproducible quantified measure of biomarker expression. In one embodiment, the quantified measure of biomarker expression level has a reproducibility above 80%. In a further embodiment, the quantified measure of biomarker expression level has a reproducibility above 90%. In a further embodiment, the quantified measure of biomarker expression level has a reproducibility above 95%. In a further embodiment, the quantified measure of biomarker expression level has a reproducibility above 99%. In a further embodiment, the quantified measure of biomarker expression level has a reproducibility from about 85% to about 99%. In a further embodiment, the quantified measure of biomarker expression level has a reproducibility from about 90% to about 99%. In a further embodiment, the quantified measure of biomarker expression level has a reproducibility from about 90% to about 97%.

In one embodiment, the normalized AQUA® score provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 20%. In a further embodiment, the normalized AQUA® score provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 10%. In a further embodiment, the normalized AQUA® score provides a quantified measure of biomarker expression having a coefficient of variation (% CV) below 5%. In a further embodiment, the normalized AQUA® score provides a quantified measure of biomarker expression having a coefficient of variation (% CV) from about 1% to about 20%. In a further embodiment, the normalized AQUA® score provides a quantified measure of biomarker expression having a coefficient of variation (% CV) from about 5% to about 15%. In a further embodiment, the normalized AQUA® score provides a quantified measure of biomarker expression having a coefficient of variation (% CV) from about 4% to about 7%.

Thus, normalized AQUA® scores provide a reliable assay for comparing sample groups such that the biomarker may be more specifically correlated with the particular characteristics, leading to more reliable diagnosis and prognosis estimation.

EXAMPLES

Example 1

Standardization of HER2 Analyses Using Automated AQUA® Technology

Materials and Methods

Cohort Description and TMA Construction

A large breast cancer cohort in tissue microarray (TMA) format was employed in these studies in order to test standardization techniques. This cohort from the Yale Tissue Microarray Facility (YTMA49) has been described in detail previously (Dolled-Filhart M, et al. Cancer Res. (2006) 66:5487-94). Briefly, the breast cohort (n=669) of invasive ductal carcinoma was serially collected from the Yale University Department of Pathology from 1961 to 1983. Also on the array is a selection of normal tissue and cell line controls. The mean follow-up time is 12.8 years with a mean age of diagnosis of 58.1 years. This cohort contains approximately half node-positive and half node-negative specimens. Detailed treatment information was not available for this cohort.

Slides were washed 3×5 min with 1×TBS containing 0.05% Tween-20. Corresponding secondary antibodies were diluted in Da Vinci Green and incubated for 30 minutes at room temperature. These included antibodies directly conjugated to a fluorophore for anti-cytokeratin (Alexa 555-conjugated goat anti-rabbit; 1:100, Molecular Probes, Eugene, Oreg.) and/or conjugated to horseradish peroxidase (HRP) via anti-mouse or anti-rabbit Envision (Dako, Carpinteria, Calif.). Slides were again washed 3×5 min with TBS containing 0.05% Tween-20. Slides were incubated with a fluorescent chromogen amplification system (Cy-5-tyramide, NEN Life Science Products, Boston, Mass.) which, like DAB, is activated by HRP and results in the deposition of numerous covalently associated Cy-5 dyes immediately adjacent to the HRP-conjugated secondary antibody. Cy-5 (red) was used because its emission peak is well outside the green-orange spectrum of tissue auto-fluorescence. Slides for automated analysis were cover slipped with an anti-fade DAPI-containing mounting medium (ProLong Gold, Molecular Probes, Eugene, Oreg.).

Microscopy System and Image Acquisition

The PM2000™ system, commercialized by HistoRx, Inc. (New Haven, Conn.), is based on a system described previously (Camp et al., 2002, supra). In brief, it is comprised of the Olympus BX51 epi-fluorescence microscope (Olympus America, Inc., Center Valley, Pa.) which is equipped with a motorized nosepiece to control selection of objectives (e.g., 4×, 10×, 20×, 40×, and 60×); a motorized filter turret to control selection of different filter cubes (e.g., DAPI, Cy2, Cy3, Cy5, and Cy7 or equivalent wavelengths); a motorized stage to control stage movements (Prior Scientific Inc., Rockland, Mass.); an X-Cite 120 mercury/metal halide light source (EXFO Life Sciences & Industrial Division, Ontario, Canada); and a QUANTFIRE monochromatic digital camera (Optronics, Inc., Goleta, Calif.).

Automated image capture was performed by the HistoRx PM-2000 using the AQUAsition™ software package. High resolution, 8-bit (resulting in 256 discrete intensity values per pixel of an acquired image) digital images of the cytokeratin staining visualized with Cy3, DAPI, and target (HER2) staining with Cy5 were captured and saved for every histospot on the array. Pixels were written to image files as a function of power (Power (P) = (Pixel Intensity/256)/exposure time) in order to help compensate for experimental variations in staining intensity and exposure times.
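The power conversion stated above may be sketched as follows. The function name is hypothetical, and the exposure time is taken in arbitrary but consistent units, since the units are not stated in the text.

```python
def pixel_power(pixel_intensity, exposure_time, levels=256):
    """Power (P) = (pixel intensity / 256) / exposure time, per the text."""
    if exposure_time <= 0:
        raise ValueError("exposure time must be positive")
    return (pixel_intensity / levels) / exposure_time


# The same underlying signal captured at two exposure times yields the
# same power value once the exposure time is divided out.
long_exposure = pixel_power(128, 0.5)
short_exposure = pixel_power(64, 0.25)
```

Dividing by the exposure time is what allows images acquired with automatically adjusted exposures to be compared on one scale.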

AQUA® Score Generation

Images were validated for percent tumor area (tumors showing <5% area/field were redacted), focus, and debris. Of the 669 tumor samples on YTMA49, 86 samples were redacted (12.8%), leaving 583 samples for subsequent scoring and analysis. Compartment-specific AQUA® scores for HER2 for each histospot were generated based on the PLACE (pixel-based locale assignment for compartmentalization of expression) algorithm as described previously (Camp et al., 2002, supra). To remove operator-to-operator bias for threshold setting, an unsupervised pixel-based clustering algorithm for optimal image segmentation was used in the PLACE algorithm as described elsewhere in this application.

Instrument Standardization

For AQUA® score standardization for instrument variability, three calibration factors were developed: calibration cube factor (CC factor), light source factor (LS factor), and Cy5 optical path factor (OP factor). Calculation of these factors is based on pixel intensity measurements given by images acquired under described conditions. All factors rely on a specialized filter cube (calibration cube) designed whereby light is reflected directly from the light source to the camera via white filter paper attached to the objective-end of the filter cube. To account for variations in the different cube constructions, calibration cubes for each machine were standardized by calculating the percentage of the average total light intensity compared to average total light intensity of a “gold standard” calibration cube (producing the CC factor). This is a constant factor which is calculated and maintained for each cube, and thus each microscope system with that cube installed. The light source factor is calculated for each histospot acquired and is the total light intensity as measured by the calibration cube divided into a constant (100,000). The optical path factor accounts for the amount of light passed through a specific microscope objective/filter combination relative to the measured incoming light intensity. For these measurements, a standard sample is required that can be transferred between different machines and maintain reproducibility in its construction. A commercially available blue fluorescent standard slide was selected for this purpose (Omega Optical Inc., Brattleboro, Vt.). Standardization was performed as described herein previously.
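As a hedged sketch of how the calibration factors might be combined, the light source factor below follows the text (a constant of 100,000 divided by the total light intensity measured with the calibration cube), and the standardized score is formed as the product of the raw score and the supplied factors. The function names and all numeric inputs are hypothetical, and the CC and OP factor values are simply passed in rather than derived.

```python
from math import prod

LS_CONSTANT = 100_000  # constant stated in the text


def light_source_factor(total_light_intensity):
    """LS factor: the constant divided by the measured total light intensity."""
    if total_light_intensity <= 0:
        raise ValueError("total light intensity must be positive")
    return LS_CONSTANT / total_light_intensity


def standardized_score(raw_score, factors):
    """Standardized score as the product of the raw score and each factor."""
    return raw_score * prod(factors)


# Hypothetical values for one histospot on one instrument.
ls = light_source_factor(125_000)  # 0.8
cc = 1.05                          # assumed calibration cube factor
op = 0.95                          # assumed Cy5 optical path factor
score = standardized_score(1000.0, [cc, ls, op])
```

Because the LS factor is computed per histospot while the CC factor is fixed per cube, only the former needs to be re-measured during acquisition.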

Fluorescent stains were multiplexed to compartmentalize and measure expression of specific biomarkers. For HER2, only pixels (Cy5) within the cytokeratin-derived (epithelium specific) compartment (Cy3) were considered for analysis, thus differentiating tumor from stromal HER2 signal as well as membrane/cytoplasm from nuclear HER2 signal. As described, only HER2 pixels that coincided with cytokeratin pixels were used to generate an AQUA® score.

Three serial sections of a cohort (n=669) of invasive breast cancers were fluorescently stained for HER2 as described in Materials and Methods. The first serial section was used for AQUA® score generation across three different instruments and three different operators. The second and third serial sections were stained on separate days to assess run-to-run variability. FIG. 4 gives box plots showing normalized AQUA® score distributions for each indicated acquisition parameter (instrument (FIG. 4A), operator (FIG. 4B), and independent staining runs (FIG. 4C)). For 583 patient samples, the average percent coefficient of variation (% CV) was 1.8% (min=0.04%; max=10.7%) across instruments, 2.0% (min=0.06%; max=15.6%) across operators, and 5.1% (min=0.12%; max=29.7%) across independent staining runs. These % CVs rival that of in vitro immunoassays such as ELISA (Butler et al. J Immunoassay. (2000) 21:165-209).

Positive/Negative Concordance

A critical parameter for HER2 testing in the clinical laboratory is the ability to reproducibly classify patients as positive or negative. Using survival as a surrogate marker for positive/negative classification [22], an optimal AQUA® score cut-point was established using X-tile [19] for normalized HER2 AQUA® scores produced for instrument 1, operator 1, and staining run 1. FIG. 5A shows Kaplan-Meier survival analysis of positive/negative HER2 classification for instrument 1. As described in Materials and Methods, the cut-point was validated with significance by Monte Carlo simulation (P<0.001) and training/validation subsets (P=0.002). This validated cut-point was applied to AQUA® scores generated on instruments 2 and 3 with significance, P<0.001 and P=0.004 respectively (FIGS. 5B and 5C). Similar reproducibility was observed across independent operator and staining run acquisitions with P-values all ≦0.01 (data not shown).

Current ASCO-CAP guidelines suggest that laboratories achieve 95% positive/negative concordance for current HER2 assay methodologies (Wolff A C et al, Arch Pathol Lab Med. (2007) 131:18). A recent study shows that for HER2 IHC-based scoring, concordance between observers ranges from 54-85%, falling short of these guidelines (Hameed et al; in press; direct communication). Positive/negative concordance for normalized AQUA® scoring across instruments, operators, and staining days was examined using the cut-points established above. As shown in FIG. 6, overall concordance ranged from 94.5% (Instrument 1 to Instrument 3; FIG. 6B) to 99.3% (Operator 1 to Operator 2; FIG. 6C). These analyses include all cases including those that would be considered equivocal.

To assess where in the distribution of normalized AQUA® scores differential classification occurred, paneled frequency histograms were generated. As shown in FIG. 7, for instrument-to-instrument, operator-to-operator and run-to-run, differentially classified cases occur at the cut-point and not over the entire distribution. These data suggest that the classification error concerns cut-point selection, not generation and reproducibility of the normalized HER2 AQUA® score. Taken together, these data show classification of patients for inter-instrument, inter-operator, and inter-run assessment of HER2 expression using AQUA® scoring is highly reproducible with concordance rates approaching, if not exceeding, that suggested by ASCO/CAP.

Example 2

AQUA® Analysis of EGFR: Analytical Performance Data

The methods of the present invention were applied to the evaluation of the biomarker EGFR in breast cancer tissue sections. As shown in FIG. 8A, a TMA cohort of 748 specimens was analyzed for HER2 expression by normalized AQUA® analysis across three instruments, 3 operators, and 3 separate staining runs with an average % CV of 4.3%. Preliminary analytical performance assessment was performed with EGFR PharmDx (Dako). As FIGS. 8B-D demonstrate, AQUA® analysis of EGFR expression across 3 slides and 3 staining days on a TMA composed of breast tumor and cell lines (n=152) shows a high degree of precision slide-to-slide and day-to-day with an average slope of 1.00047, an average Pearson's R of 0.95, an average % CV for tumor tissue of 3.3%, and an average % CV for cell lines of 4.7%. Taken together, these data demonstrate that AQUA® analysis allows for an EGFR assay with a high degree of precision, and combined with instrument and software controls, development of a robust clinical biomarker assay is possible.

Example 3

AQUA® Analysis of ER Expression: Reproducibility

To demonstrate reproducibility of AQUAnalysis™ with another biomarker, four breast tissue blocks were obtained that represent a range of estrogen receptor (ER) expression (as judged by Allred scoring). Sections of these tissue blocks were then taken to generate H&E slides on which a board certified pathologist circled the area of tumor for all subsequent analyses. A serial set of sections was DAB stained with the monoclonal mouse anti-human estrogen receptor α, clone 1D5 antibody and evaluated for Allred scoring by the same pathologist. See, e.g., Harvey J M, et al., (1999) J. Clin. Oncol. 12:1474, which is hereby incorporated by reference in its entirety.

Serial sections were then stained using the immunofluorescence staining protocol described above. Images were collected on the HistoRx PM-2000™ microscopy platform and then passed to AQUAnalysis™ for scoring. All scores were transformed on a log 2 scale.

Image files for the four slides were acquired and then passed through the AQUAnalysis™ software package n=10 times by the same operator to demonstrate overall software reproducibility. For all files, the % CV was essentially 0 (less than 1E−7).

The same four image files were then provided to three different operators along with the software operating instructions. Here, the operators redacted images in the method outlined for technician review of image quality, which reflects typical use. The results, shown in Table 1, demonstrate that even with more than 30% variability in the number of images scored (in the case of slides #3 and #4) the % CV in the overall score is still on the order of 1% or less.

TABLE 1

Inter-operator reproducibility

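| Slide | Operator | Mean AQUA® score per operator | # fields scored | Mean (n = 3) | StDev (n = 3) | % CV (n = 3) |
|---|---|---|---|---|---|---|
| Slide #1 | Operator 1 | 9.941 | 100 | 9.945 | 0.0047 | 0.05 |
| | Operator 2 | 9.945 | 104 | | | |
| | Operator 3 | 9.950 | 109 | | | |
| Slide #2 | Operator 1 | 10.382 | 60 | 10.397 | 0.0357 | 0.34 |
| | Operator 2 | 10.437 | 61 | | | |
| | Operator 3 | 10.370 | 62 | | | |
| Slide #3 | Operator 1 | 10.254 | 102 | 10.232 | 0.0194 | 0.19 |
| | Operator 2 | 10.216 | 110 | | | |
| | Operator 3 | 10.227 | 150 | | | |
| Slide #4 | Operator 1 | 13.723 | 4 | 13.818 | 0.1644 | 1.19 |
| | Operator 2 | 13.723 | 4 | | | |
| | Operator 3 | 14.008 | 14 | | | |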
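For illustration, the % CV reported in Table 1 is the sample standard deviation expressed as a percentage of the mean. The sketch below recomputes the Slide #4 summary from the three per-operator means listed in the table; the function name is hypothetical.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100 * stdev(values) / mean(values)


# Per-operator mean AQUA scores for Slide #4, taken from Table 1.
slide4 = [13.723, 13.723, 14.008]
slide4_mean = mean(slide4)
slide4_sd = stdev(slide4)
slide4_cv = percent_cv(slide4)
```

The recomputed values agree with the tabulated mean and % CV, with the standard deviation matching to within rounding of the per-operator inputs.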

To demonstrate performance at independent sites and using alternative hardware systems for image acquisition, the AQUAnalysis™ software was evaluated on three different platforms which meet the required hardware specifications (described in the operator's manual). The systems are described below in Table 2.

TABLE 2

Hardware systems used for reproducibility testing

| Component | System 1: HistoRx PM-2000™ | System 2: External System #1 | System 3: External System #2 |
|---|---|---|---|
| Light Source | Exfo Xcite 120 | Exfo Xcite 120 | Prior Lumen |
| Microscope | Olympus BX-52 | Olympus BX-51 | Nikon 50i |
| Objective Mag. | 20× | 10× | 10× |
| Camera | Optronics Quantifire, 2048 × 2040, 7.4 μm pixels | Cooke Sensicam QE, 1376 × 1040, 6.45 μm pixels | PixelLink PL-B872-MF, 1392 × 1040, 6.45 μm pixels |
| FOV size | 758 μm × 758 μm | 887 μm × 671 μm | 898 μm × 671 μm |
| Filters | Cy3, Cy5, DAPI | Cy3, Cy5, DAPI | Cy3, Cy5, DAPI |

To ensure uniform testing, it was critical that the same regions of tissue were sampled on each system, since automated image acquisition is not required or necessarily available on all listed platforms. To accomplish this, a TMA was constructed from the same four samples described in the previous reproducibility tests, using five cores from each of the four blocks (for a total of 20 spots). This microarray was then immunofluorescently stained and the same single TMA was acquired on three separate hardware systems sequentially. Images were acquired in the laboratory of the installed hardware system by the operator after receiving training on the software operation. Results were maintained in a blinded manner for all external testing and are provided below in Table 3.

TABLE 3

Results of inter-site/inter-hardware testing

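| Sample | HistoRx AVG | HistoRx SD | HistoRx CV | External Site 1 AVG | External Site 1 SD | External Site 1 CV | External Site 2 AVG | External Site 2 SD | External Site 2 CV | Overall AVG | Overall SD | Overall CV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample 1 | 8.12 | 0.19 | 2.40 | 9.01 | 0.40 | 4.44 | 8.36 | 0.45 | 5.34 | 8.50 | 0.46 | 5.37 |
| Sample 2 | 9.05 | 0.55 | 6.10 | 9.72 | 0.52 | 5.36 | 8.74 | 0.81 | 9.22 | 9.17 | 0.51 | 5.51 |
| Sample 3 | 9.34 | 0.16 | 1.66 | 10.08 | 0.04 | 0.39 | 9.40 | 0.31 | 3.26 | 9.61 | 0.41 | 4.27 |
| Sample 4 | 12.01 | 0.30 | 2.53 | 12.78 | NA | NA | 11.23 | 0.57 | 5.11 | 12.01 | 0.77 | 6.43 |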

Each ‘sample’ refers to the average of the scores from the five-fold redundant TMA spots cored from the associated whole tissue section sample used in previous testing (with a range of Allred scores).

Overall, the % CV values for mean AQUA® score values are in the 4-6.5% range and do not vary significantly across the range of ER expression. The % CV variation observed within a single sample at a single site is indicative of marker heterogeneity within the sample, not measurement error. ANOVA analysis indicated that while there is significant variation between the sample scores, as expected (p<0.001), there is no significant difference between sites (p=0.58).

Example 4

AQUA® Analysis of ER Expression: Correlation with Allred Scoring

To demonstrate the utility of AQUA® scores, a comparison study was performed to examine the relationship between breast tissue immunofluorescently stained then scored using AQUAnalysis™ and breast tissue chromogenically stained then scored using the Allred method. As discussed in Example 1, a tissue microarray cohort of 669 patients was obtained from the Yale University Tissue Microarray Facility consisting of samples from the Yale University Department of Pathology tissue archives collected between 1961 and 1983.

Two serial sections were stained using the mouse monoclonal 1D5 antibody. One section was stained using DAB and the second using the immunofluorescence methods described above. The DAB stained slide was then provided to three board certified pathologists for Allred scoring. Each pathologist was blinded to the results of the others (and to the fluorescent staining results) and no further information was given other than the nature of the tissue being scored and the biomarker (ER). In parallel, the fluorescently stained serial section of the TMA was run through the AQUAnalysis™ software and AQUA® scores were generated. There was a definitive trend between Allred scoring and AQUA® scoring as demonstrated below (FIGS. 9A-C) for all pathologists. Allred scoring and AQUA® scoring were highly correlated (p<0.001) for all pathologists by non-parametric correlation analysis (Spearman's Rho, FIG. 9D). Multinomial logistic regression analysis demonstrated a significant (p<0.001) and direct (pseudo R2) relationship between Allred scoring and AQUA® scoring for all pathologists (see FIG. 9).

Although there is variation in the individual scores provided by the pathologists, in typical practice, a cut point is applied to the data. For the Allred scoring method, a cut point of 3 was used such that scores <3 (0 or 2) were considered negative and scores ≧3 were considered positive. See, e.g., Harvey et al, supra. This cut point was applied to the individual results of each of the manually scored pathologist data sets and a consensus score was generated. Samples where at least one pathologist could not provide a score were eliminated from the consensus. If scores did not agree, the majority score was used. As a result of this, it was observed that pathologists demonstrated universal agreement for positive/negative classification 91% of the time for 523 cases.
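The consensus rule described above may be sketched as follows, assuming an odd number of pathologists so that a majority always exists. The function name and the use of `None` to represent a missing score are assumptions of this sketch.

```python
from collections import Counter

ALLRED_CUT_POINT = 3  # scores >= 3 are positive, per the text


def consensus_call(allred_scores):
    """Majority positive/negative call, or None if any score is missing."""
    if any(score is None for score in allred_scores):
        return None  # sample is eliminated from the consensus
    calls = ["positive" if score >= ALLRED_CUT_POINT else "negative"
             for score in allred_scores]
    return Counter(calls).most_common(1)[0][0]


# Hypothetical Allred scores from three pathologists for three samples.
sample_a = consensus_call([0, 2, 5])     # two of three are below the cut point
sample_b = consensus_call([3, 7, 2])     # two of three are at or above it
sample_c = consensus_call([4, None, 6])  # one pathologist gave no score
```

Applying the cut point before voting means that disagreement in the raw Allred score matters only when it crosses the positive/negative boundary.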

To determine a cut point for the AQUA® score data, an unsupervised Bayesian clustering algorithm based on log-likelihood distances was applied using the commercially available software program SPSS (SPSS, Inc., Chicago, Ill.). For this algorithm, patients were grouped based on cluster membership of AQUA® scores. When the clustering was performed, four clusters were manifested in the data (see FIG. 10A). Analogous to Allred scoring, survival was used to determine positive and negative classification between expression groups. As shown in FIG. 10B, three clusters (clusters 2-4) show improved survival whereas cluster 1 shows decreased survival. Therefore, patients in cluster 1 were considered ER negative and all others ER positive.

With positive/negative classification determined for the AQUA® data, a concordance matrix was generated to compare the results of the AQUAnalysis™ software with the results of the scoring consensus derived from manual Allred scoring. The results indicate that the overall concordance between the methods is 94.9% with percent positive agreement of 96.0% and percent negative agreement of 92.5%, as shown in Table 4 below.

TABLE 4

Concordance of AQUA ® scoring

with manual IHC Allred scoring.

| | Manual Allred Scoring: Positive | Manual Allred Scoring: Negative | Total |
|---|---|---|---|
| AQUA® scoring: Positive | 218 | 8 | 226 |
| AQUA® scoring: Negative | 9 | 99 | 108 |
| Total | 227 | 107 | 334 |

% Positive agreement (218/227) = 96.0% (95% CI = 92.6-98.2)

% Negative agreement (99/107) = 92.5% (95% CI = 85.8-96.7)

% Total agreement (317/334) = 94.9% (95% CI = 92.0-97.0)
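The agreement percentages of Table 4 follow directly from the four cell counts, as the sketch below shows. The function and argument names are hypothetical; confidence intervals are not computed here.

```python
def agreement(both_pos, aqua_pos_only, allred_pos_only, both_neg):
    """Percent positive, negative, and total agreement against manual scoring."""
    allred_pos = both_pos + allred_pos_only  # column total, manual positive
    allred_neg = both_neg + aqua_pos_only    # column total, manual negative
    total = allred_pos + allred_neg
    return (
        100 * both_pos / allred_pos,
        100 * both_neg / allred_neg,
        100 * (both_pos + both_neg) / total,
    )


# Cell counts from Table 4.
positive, negative, overall = agreement(
    both_pos=218, aqua_pos_only=8, allred_pos_only=9, both_neg=99
)
```

Positive and negative agreement are each computed against the manual Allred column totals (227 and 107), which is why they differ from the row-based proportions.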

As confirmation of positive/negative classification of ER AQUA® scores, overall survival was compared between Allred scoring and AQUA® scoring, as shown in FIG. 11. Both methods predict significant five-year disease specific survival and show similar cumulative survival rates for positive/negative classification.

Example 5

Reproducibility of AQUA® Scoring and Allred Scoring

A single TMA slide described in Examples 3 and 4 was stained for ER expression using conventional chromogenic immunohistochemistry techniques as previously described and independently evaluated by three pathologists using light microscopy and scored by the Allred method (FIG. 12A). A second, serial section, TMA slide was stained for AQUA® analysis (fluorescent immunohistochemistry) of ER expression and the same slide was analyzed on three independent instruments by AQUA® analysis (FIG. 12B). Examination of the results obtained by each pathologist vs. each other pathologist shown as scatterplots (FIG. 12A) shows that while overall concordance is high, there are samples considered positive by one pathologist that are considered negative by another. These patients would receive different treatment (hormonal therapy) depending on which pathologist read their ER results.

The kappa values (which range from 0 to 1, Table 5) over the spread of Allred scores indicate that there is a great deal of variance in the respective pathologists' manual scoring determinations.

TABLE 5

Kappa of results scored by pathologists.

Path 1 v. Path 2: Kappa = 0.482 (p < 0.001)

Path 1 v. Path 3: Kappa = 0.444 (p < 0.001)

Path 2 v. Path 3: Kappa = 0.400 (p < 0.001)
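For reference, a kappa statistic of the kind shown in Table 5 compares the observed agreement between two raters with the agreement expected by chance. The sketch below computes Cohen's kappa from a square table of counts; the counts used are hypothetical and are not the pathologist data summarized in Table 5.

```python
def cohens_kappa(table):
    """table[i][j]: cases rater 1 placed in category i and rater 2 in category j."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement from the raters' marginal category frequencies.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical two-category example: 80% raw agreement, 50% expected by chance.
kappa = cohens_kappa([[40, 10], [10, 40]])
```

The same function accepts a larger square table, for example one row and column per Allred score category.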

Overall regression analysis of the results obtained for each combination of PM 2000 instruments is shown in FIG. 12B. There is extremely high correlation (R2>0.99 in all cases, Table 6) with a strong correspondence (regression coefficients, analogous to the slope of the regression line, are all ˜1.0). Furthermore, ANOVA analysis of the datasets produces p>0.05, indicating that the data sets are statistically indistinguishable. In comparison to FIG. 12A, results obtained by AQUA analysis are highly reproducible with an average CV of 1.35%. Therefore the methods of the present invention provide for consistent results regardless of which instrument (and therefore location) the sample was analyzed on.

TABLE 6

Comparison of results obtained on

three instruments by AQUA analysis.

| Comparison | R2 | Regression Coefficient (95% CI; p value) |
|---|---|---|
| Instrument 1 v. 2 | 0.996 | 1.003 (0.99-1.01; <0.001) |
| Instrument 1 v. 3 | 0.995 | 1.01 (1.00-1.02; <0.001) |
| Instrument 2 v. 3 | 0.996 | 1.003 (0.99-1.01; <0.001) |
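As an illustrative sketch of the kind of comparison summarized in Table 6, an ordinary least-squares fit between the scores from two instruments gives a slope and an R2 value. The paired scores below are hypothetical, and the exact regression model used to produce Table 6 is not specified in the text.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x. Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical log2 AQUA scores for the same four samples on two instruments.
instrument_1 = [8.0, 9.0, 10.0, 12.0]
instrument_2 = [8.1, 9.0, 10.1, 12.0]
slope, intercept, r_squared = linear_fit(instrument_1, instrument_2)
```

A slope near 1 together with an R2 near 1 indicates that the two instruments report essentially the same score for each sample.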

The Examples are provided for illustrative purposes only and should not be used to limit the scope of the invention. Many other embodiments of the invention are apparent to those of ordinary skill in the art in view of the contents and teachings of this disclosure.