Workshop on Urine Albumin (UA) Standardization (2015)

Bethesda, MD

February 5, 2015

Welcome and Goals for the Meeting

Greg Miller, Ph.D., Virginia Commonwealth University, Richmond, VA

Dr. Greg Miller, the Chairman of the Laboratory Working Group of the National Kidney Disease Education Program (NKDEP) (LWG/NKDEP), welcomed the participants to the workshop on standardizing UA measurement and thanked them for attending. The LWG/NKDEP has been developing the tools to standardize UA measurement for several years, working with colleagues at the National Institute of Standards and Technology (NIST) to develop reference materials and reference procedures, as well as with Dr. John Lieske at the Mayo Clinic, who developed the first liquid chromatography-isotope dilution mass spectrometry (LC-IDMS) candidate reference procedure for comparison of all of the commercial procedures. The meeting included participants from the LWG/NKDEP and the International Federation of Clinical Chemistry Working Group on Standardisation of Albumin Assay in Urine (IFCC WG-SAU). The meeting attendees introduced themselves, stating their role in UA measurement. A list of participants is provided in Attachment A.

Dr. Miller indicated that the goals for the workshop were the following: (1) understand the current status of UA measurements, the subject of a paper published in 2014 by members of the LWG/NKDEP and IFCC WG-SAU; (2) identify measurement procedure limitations to address before standardizing UA measurements, including how the manufacturers need to proceed to address these limitations; (3) develop a plan to standardize UA measurement procedures; and (4) review the use of National Health and Nutrition Examination Survey (NHANES) UA data to improve the diagnostic value of UA in kidney disease, including the relationship between using the NHANES data to develop decision levels and the UA standardization effort.

Summary of Clinical Use of Urine Albumin and the Importance of Standardizing Measurements

Andrew Narva, M.D., FACP, FASN, Director, NKDEP, NIDDK, NIH, Bethesda, MD
Dr. Andrew Narva provided background on the clinical use of UA and the importance of improving UA measurement. He began by expressing his appreciation to the participants for their attendance; Dr. Miller for his leadership; and Dr. Robert Star and NIDDK’s Division of Kidney, Urologic, and Hematologic Diseases for their support of UA research, particularly in a time of constrained resources. Regarding the importance of standardizing measurements in a clinical context, Dr. Narva cited his experience in the Indian Health Service, which serves a population with a large percentage of adults with diabetes and where he worked to encourage clinicians to use biomarkers to manage their patients’ care. The lack of standardization in testing and reporting has been a barrier to the use of albumin-to-creatinine ratio (ACR) by clinicians. UA is a critical biomarker for the diagnosis of chronic kidney disease (CKD), with approximately half of all patients with CKD identified on the basis of increased UA. In addition, UA is hypothesized to be a marker for cardiovascular disease and kidney disease progression. The potential for using UA as a patient education tool is under-recognized, including use as a marker to incentivize adherence to treatment. Standardizing UA measurement is important for surveillance, clinical care, and research in CKD. CKD is functionally defined as a reduction in glomerular filtration rate (GFR) (less than 60 mL/min/1.73 m² for at least 3 months) and/or kidney damage as evidenced by pathological abnormalities or other markers of kidney damage (generally, urine ACR > 30 mg/g, the current clinical definition of albuminuria). The need to standardize UA measurement is evident in surveillance data from the NHANES, where only 43.5 percent of elevated UA results in random samples were confirmed in first void samples.
Approximately half of the CKD burden in the United States, which is projected from NHANES data, is diagnosed by albuminuria, often based on a single measurement of elevated UA, but pre-analytic factors, such as sample type, can alter the estimated burden. The estimated burden of CKD might be much lower if diagnoses were based on confirmed cases. In clinical care, albuminuria is associated with more rapid progression of CKD. Albuminuria also has been shown to be associated with higher risk of cardiovascular events and cardiovascular death in diabetic kidney disease (DKD). In addition, reduction of albuminuria in response to therapeutic interventions has been associated with a decreased risk in renal endpoints (e.g., death, dialysis, loss of half of GFR). Regarding research resources and efforts, there are many studies for which UA results cannot be compared because UA measurement is not standardized. To conserve resources, pragmatic trials and ways of looking at large populations are increasingly being used to improve clinical care. There are almost 1 million samples in storage that could be analyzed if there were a standardized measurement procedure. There also needs to be standardization in reporting. The definition of “normal” (less than 30 mg/g ACR) is based on very limited data, but Kidney Disease: Improving Global Outcomes (KDIGO) 2012 retained the established definitions of micro- and macroalbuminuria. However, such studies as the Prevention of REnal and Vascular ENd-stage Disease (PREVEND) Study found a relationship between ACR and increased risk of cardiac death that was continuous and went much below 30 mg/g. A lower ACR may prove beneficial as a decision value. Dr. Narva noted that total urine albumin excretion generally is reported as a ratio to urinary creatinine (UC), the excretion of which varies with age, gender, and race; therefore, the denominator of ACR needs to be considered in setting a decision value.
Because women have lower production of creatinine, a sex-specific ACR decision value has been shown to eliminate the disparity in albuminuria between men and women. Analysis of NHANES data has the potential to show the importance of UC variation in a representative population, and could inform reporting cut-offs. The public health objective of improving patient care requires standardization of laboratory measurements of urine albumin, as well as standardization of the nomenclature of laboratory reporting of UA results. For example, in trying to develop a performance measure (a measure of clinical care that may affect physician reimbursement and certification), there is a need to define which levels of UA are considered albuminuria and which are not. The issues involved in educating the clinical community about changes in decision levels based on standardization are likely to prove challenging.
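The dependence of ACR on its denominator can be illustrated with a short calculation. The following is a minimal sketch, not a clinical tool: the function names are hypothetical, the sample concentrations are invented for illustration, and the only value taken from the discussion above is the 30 mg/g decision value.

```python
# Illustrative only: the threshold is the 30 mg/g decision value cited in the
# text; the sample concentrations below are hypothetical.
ALBUMINURIA_THRESHOLD_MG_PER_G = 30.0


def acr_mg_per_g(ua_mg_per_l: float, uc_g_per_l: float) -> float:
    """Return the albumin-to-creatinine ratio in mg albumin per g creatinine."""
    if uc_g_per_l <= 0:
        raise ValueError("urinary creatinine concentration must be positive")
    return ua_mg_per_l / uc_g_per_l


def exceeds_threshold(acr: float,
                      threshold: float = ALBUMINURIA_THRESHOLD_MG_PER_G) -> bool:
    """Return True when the ACR is above the decision value."""
    return acr > threshold


# Same urine albumin concentration, two hypothetical creatinine concentrations.
higher_creatinine = acr_mg_per_g(30.0, 1.5)
lower_creatinine = acr_mg_per_g(30.0, 0.75)

print(higher_creatinine, exceeds_threshold(higher_creatinine))
print(lower_creatinine, exceeds_threshold(lower_creatinine))
```

With these assumed inputs, the same albumin concentration gives 20 mg/g at the higher creatinine concentration and 40 mg/g at the lower one, so only the second crosses the decision value.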

Current Status of Agreement Among Routine Measurement Procedures for Urine Albumin

NKDEP LWG/IFCC Project Status Report

Dr. Lorin Bachman, Chair of the IFCC WG-SAU, provided a status report on the LWG/IFCC harmonization and commutability assessment to standardize UA measurement. The goals of the assessment include using native patient urine samples to evaluate the current state of agreement among routine UA methods relative to the isotope dilution mass spectrometry (IDMS) candidate reference measurement procedure (cRMP), assessing the commutability characteristics of the Institute for Reference Materials and Measurements’ (IRMM) ERM-DA470k/IFCC, and exploring the ability to use the certified reference material to standardize routine methods. Most of the data that Dr. Bachman reviewed were published in a recent article in Clinical Chemistry.1

Dr. Bachman and her colleagues found that in the range of 13 to 30 mg/L UA, some of the routine methods were biased high or low. The components of the variation (including within-run imprecision, between-run imprecision, position effects, and sample-specific effects) were determined using an error components model. Over each method’s analytical measurement range (AMR), the analytical precision (pooled within-run imprecision, between-run imprecision, and position effects) generally was fairly good at less than 10 percent. The AMRs of the methods varied widely.

Bias was the major source of disagreement when compared to IDMS, with many methods biased low at lower concentrations, but some biased high. At 15 mg/L, positive and negative biases were significant (−35 to +34 percent); at 30 mg/L, the bias range was −15 to +18 percent; and most methods agreed well (within 10 percent) with IDMS at 100 mg/L. The NHANES fluorescent immunoassay (FIA) was biased high independent of concentration.
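To make the size of these biases concrete, the sketch below converts a percent bias into the result a routine method would report for a given IDMS value. It assumes the conventional definition of percent bias, (routine − reference) / reference × 100, which the summary does not spell out; the function names are hypothetical.

```python
def percent_bias(routine_mg_per_l: float, reference_mg_per_l: float) -> float:
    """Percent bias of a routine result relative to the reference (IDMS) value."""
    if reference_mg_per_l <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (routine_mg_per_l - reference_mg_per_l) / reference_mg_per_l


def result_with_bias(reference_mg_per_l: float, bias_percent: float) -> float:
    """Routine result implied by a reference value and a percent bias."""
    return reference_mg_per_l * (1.0 + bias_percent / 100.0)


# Bias limits reported above at an IDMS value of 15 mg/L.
lowest = result_with_bias(15.0, -35.0)
highest = result_with_bias(15.0, 34.0)
print(round(lowest, 2), round(highest, 2))
```

Under that assumption, a sample with an IDMS value of 15 mg/L could be reported anywhere from about 9.75 to about 20.1 mg/L, depending on the routine method.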

Sample-specific influences (i.e., residual error after excluding analytical error and bias) showed that there were sample matrix influences with each method, but for most methods, the sample matrix produced minor effects. In the upper sample concentration range, effects of dilution caused bias, indicating the need to evaluate biases introduced by the dilution matrix.

In summary, the measurement issues that need to be addressed include imprecision; sample-specific effects, which were relatively insignificant; bias, including nonlinear bias in 10 of 16 methods, which was the dominant contribution to lack of agreement with IDMS; and appropriate dilution protocols for samples with UA concentrations above the AMR. The next step is to determine the approaches and resources needed to address the measurement issues of imprecision, sample-specific influences, dilution effects, and nonlinear bias.

Discussion

Regarding the relative importance of different sources of error, sample-specific effects were considered not of major concern.

The effect of freezing on sample measurement was discussed by the participants, who made the following points:

There might be a contribution from the length of time of storage of frozen samples. There are frozen urine samples from the NHANES dating back to 1999. There is literature on stability, but to use samples stored for a long time, there might be a need to account for possible drift in measurement procedures and to have measurement values from before the samples were frozen.

The storage temperature might contribute to bias.

Storage over longer periods of time could require moving between freezers, which might cause partial thawing.

Some of the Pima Indian data that are very old and were stored at −20 °C showed deterioration by immunoassay that was not seen when analyzed by MS. These data imply a differential bias by method with time.

Details about the analytical protocols used were discussed:

Fresh patient samples were collected and shipped overnight in refrigerated transport to the manufacturers.

In this study, samples were centrifuged before shipping. Centrifugation also is part of some manufacturers’ methods, and standard practice regarding centrifugation may be useful.

As a reference material, 11 out of the 16 manufacturers were using ERM-DA470. Implementation of this reference material required making substantial dilutions, with possible effects on traceability.

The dilution of samples was discussed, given that sometimes, diluted samples did not have the same biases as samples within the AMR:

Saline and water are the most likely diluents, but not all of the manufacturers specify the appropriate diluent. Different diluents might have different matrix effects.

Error is introduced by dilution. A participant noted that most analyzers perform autodilutions. Most manufacturers recommend autodilution, but only fixed dilution ratios are provided. UA concentrations may exceed the available autodilution ratios, and manual dilution is then required. It is important that the actual concentration be reported (not a “greater than” value) to enable appropriate medical care.

Adsorption to sample containers during dilution is a possible source of error, but Dr. Miller indicated that this has been shown to be less than 1 percent for common containers.

The goals for standardizing AMRs among methods were discussed:

Dr. Miller stated that measuring very low concentrations of UA is important for patients at initial assessments.

Dr. Narva commented that it is also important to measure progress of treatments for patients with high ACRs. Patients then can receive feedback about their treatment. Many advanced cases require a very large dilution factor, however, which can be a source of error.

Dr. Narva suggested expanding AMRs as an important design goal for manufacturers, although it was recognized that there might be technical constraints on the feasibility of doing so. It was acknowledged that an AMR spanning four orders of magnitude is unlikely to be achievable.

Urinary Albumin and Creatinine Measurements: What Should Be the Analytical Performance Goals

John Eckfeldt, M.D., Ph.D., University of Minnesota

Dr. John Eckfeldt described the approaches that have been proposed for establishing analytical performance goals as a preface for a group discussion on setting performance goals for accuracy and precision. There is a general consensus that ACR is useful for routine screening for renal disease among patients who are diabetic or hypertensive. First morning void is the preferred collection method, but random/spot sampling is common. Twenty-four-hour urinary total protein or ACR may be preferred for tubulointerstitial disease. Measurement of ACR requires accurate measurement of both albumin and creatinine. The coefficient of variation (CV) for ACR is the square root of the sum of the squares of the CVs of UA and urinary creatinine (UC). Interlaboratory measurements indicate, however, that all-method variation for creatinine is approximately 6 percent, whereas for albumin, it can be as high as 38 percent at low levels, driving the uncertainty of the ACR.
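The combination rule for the CV of a ratio can be checked numerically. This is a minimal sketch using the interlaboratory figures quoted above (about 6 percent for creatinine and up to 38 percent for albumin at low levels); it assumes the two measurement errors are independent, which is the usual condition for the root-sum-of-squares approximation, and the function name is hypothetical.

```python
import math


def cv_acr_percent(cv_ua_percent: float, cv_uc_percent: float) -> float:
    """Approximate CV of the ACR from the CVs of urine albumin and creatinine.

    Root sum of squares, assuming independent errors in the two measurements.
    """
    return math.hypot(cv_ua_percent, cv_uc_percent)


# Interlaboratory figures quoted in the text: albumin up to 38%, creatinine ~6%.
combined = cv_acr_percent(38.0, 6.0)
print(f"{combined:.1f}")  # 38.5
```

The combined CV of roughly 38.5 percent is almost identical to the albumin CV alone, which is why the albumin measurement dominates the uncertainty of the ratio.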

Dr. Eckfeldt described general approaches to establishing clinical laboratory performance goals. These included regulation and external quality assessment (EQA) criteria; inherent biological variation within individuals, which can be too demanding to determine for some analytes; surveys of clinical opinion; effects on guideline-driven medical decisions; and effects on medical procedures ordered. Intra-individual variability of ACR is estimated as ranging from 4 to about 100 percent within day and across days, depending largely on fluid intake. An example of a study that estimated the inherent coefficient of variation (CVi) is the CKD Biomarker Consortium Study, in which approximately 50 patients were monitored with three to five samples, first void and random/spot collections, collected approximately 1 week apart. The standard deviation (SD) of results increased with mean ACR. There was a large degree of biological variation, compared to which analytical variation was relatively small. Variables affecting CVi included the disease state of patients; drug treatments that patients were using; the amount of UA being excreted; the interval between urine collections; and analytical variability, which was not separated from biological variability.

Dr. Eckfeldt stated that analytical performance goals for UA and UC need to be developed. These include the CV for UC (CVUC), which generally is considered not to need improvement; bias for UC, which also is not considered to need improvement; the analytical CV for UA (CVUA), which probably needs improvement, particularly at low concentrations; and bias in UA, which definitely needs improvement, particularly at “normal” to slightly elevated UA concentrations.

Discussion

The issue of the representativeness and demographics of the patient population from the CKD Biomarker Consortium Study was raised:

Given that the participants were being seen by a nephrologist, they likely had renal disease and likely were receiving some type of treatment, such as antihypertensive medication or salt restrictions. Most of them likely had CKD, although Dr. Eckfeldt knew of one patient who had acute kidney injury (AKI). It is possible that this population would not be representative of patients with diabetes and hypertension being screened for albuminuria.

Biological variation is being explored systematically in a more representative population.

It was agreed that although the CKD Biomarker Consortium Study might overestimate biological variability, the value in a more representative population still is likely to be large, in the 20 to 40 percent range.

Regarding biological variation, there is a syndrome of intermittent proteinuria, but its implications for the prognosis of cardiovascular or renal disease are not well understood.

In discussing the criteria for setting performance goals, the following points were made:

An important question is determining what precision and accuracy in ACR a physician needs to treat patients effectively.

Biological variability is not likely to be useful for setting performance goals.

A Reference System for Urine Albumin

Reference Materials

Dr. Miller

Dr. Miller reviewed what constitutes a reference material and how traceability is determined for a reference material. Dr. Miller described an example of a traceability chain. A known amount of pure albumin (e.g., NIST Standard Reference Material [SRM] 2925) is weighed to produce aqueous albumin solutions that are used as calibrators for the liquid chromatography (LC)-IDMS RMP. The RMP will be used to value assign a secondary reference material of albumin in human urine (NIST SRM 3666) that is in development and will be widely available to manufacturers as a basis for calibration. NIST SRM 3666 will be used to calibrate the manufacturer’s selected procedure, which then will be used to calibrate the manufacturer’s working calibrator, retained by the manufacturer for long periods of time. The manufacturer’s working calibrator will be used to calibrate the manufacturer’s standing procedure (which might be identical to the manufacturer’s selected procedure, perhaps with different performance criteria). The manufacturer’s standing procedure will be used to produce the manufacturer’s product calibrator, which laboratories will use to calibrate routine procedures to produce patient results. This is a standard traceability chain as described in ISO document 17511, In Vitro Diagnostic Medical Devices—Measurement of Quantities in Biological Samples—Metrological Traceability of Values Assigned to Calibrators and Control Materials. At this time, the LWG/NKDEP is working to add the missing higher level components.

Currently, there are two approaches to calibrate UA measurement procedures. One is to use diluted ERM DA470 (or its replacement ERM DA470k/IFCC) serum reference material, and the other is to prepare pure albumin in buffer and use the molar absorptivity of the solution to calculate its concentration. The current serum protein reference material is DA470k, with a concentration of 37.2 g/L, which must be diluted 100- to 10,000-fold to achieve concentrations suitable for most routine methods’ AMRs, introducing dilution errors and matrix effects. The suitability of diluted DA470k is not known, but until now it has been the only albumin certified reference material available.
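The scale of the dilutions involved can be seen from simple arithmetic. The sketch below uses the 37.2 g/L concentration and the 100- to 10,000-fold range stated above; the function names and the 30 mg/L target are illustrative assumptions.

```python
DA470K_ALBUMIN_G_PER_L = 37.2  # concentration stated in the text


def diluted_mg_per_l(stock_g_per_l: float, fold: float) -> float:
    """Albumin concentration in mg/L after diluting the stock `fold`-fold."""
    if fold < 1:
        raise ValueError("dilution fold must be at least 1")
    return stock_g_per_l * 1000.0 / fold


def fold_for_target(stock_g_per_l: float, target_mg_per_l: float) -> float:
    """Dilution fold needed to bring the stock down to a target mg/L."""
    if target_mg_per_l <= 0:
        raise ValueError("target concentration must be positive")
    return stock_g_per_l * 1000.0 / target_mg_per_l


for fold in (100, 1_000, 10_000):
    print(fold, round(diluted_mg_per_l(DA470K_ALBUMIN_G_PER_L, fold), 2))

# Hypothetical calibrator target of 30 mg/L.
print(round(fold_for_target(DA470K_ALBUMIN_G_PER_L, 30.0)))
```

A 100-fold dilution gives about 372 mg/L and a 10,000-fold dilution about 3.72 mg/L, so reaching a calibrator near 30 mg/L takes a dilution of roughly 1,240-fold.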

Diluted IRMM ERM DA470k/IFCC: Suitability for Use With UA Measurements

Dr. Miller

Diluted ERM DA470k/IFCC was measured along with the patient samples in the assessment of current status of harmonization of commercial UA measurement procedures. In determining the commutability of diluted DA470k, regression or difference plot methods were not used because biases versus IDMS were concentration-dependent for many routine procedures. Instead, the mathematical relationship was determined between the target values based on gravimetric dilution of DA470k and the measured UA for DA470k dilutions within the AMR for each routine measurement procedure. This relationship was used to mathematically adjust UA results for patient samples that were within the interval of the diluted DA470k concentrations. The number and concentration of the dilutions used varied depending on the AMR. The hypothesis was that the bias to IDMS would be reduced if the DA470k was commutable with patient samples. Comparing the initial results with those recalculated to be traceable to DA470k showed that the recalculation generally made the bias relative to IDMS worse, meaning that diluted DA470k was not commutable. A more appropriate reference material needs to be developed.
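The recalculation step can be sketched as follows. The summary does not state the form of the mathematical relationship, so a straight line fitted by ordinary least squares is assumed here purely for illustration; the function names and all numbers are hypothetical and are not data from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def adjust_result(measured, slope, intercept, measured_min, measured_max):
    """Map a patient result onto the reference-material scale.

    Only results inside the interval spanned by the measured dilutions are
    adjusted, mirroring the restriction described in the text.
    """
    if not measured_min <= measured <= measured_max:
        raise ValueError("result is outside the interval of the dilutions")
    return (measured - intercept) / slope


# Hypothetical gravimetric targets and routine-method results (mg/L).
targets = [10.0, 50.0, 100.0, 200.0]
measured = [11.0, 47.0, 92.0, 182.0]

slope, intercept = fit_line(targets, measured)
adjusted = adjust_result(74.0, slope, intercept, min(measured), max(measured))
print(round(slope, 3), round(intercept, 3), round(adjusted, 1))
```

If the reference material were commutable, adjusted patient results would be expected to sit closer to the IDMS values than the unadjusted ones; the study found the opposite for diluted DA470k.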

Dr. Ashley Beasley-Green described the UA reference measurement system in development at NIST. NIST SRM 2925 is an aqueous solution of recombinant human serum albumin (HSA) of characterized purity and molecular structure homogeneity. The primary intended use of NIST SRM 2925 is to prepare calibrators for the secondary reference measurement procedure based on ID-LC-MS/MS. The secondary reference material, NIST SRM 3666, contains albumin and creatinine in frozen human urine. Its intended use is as a secondary reference material for use by manufacturers of routine measurement procedures. The four target levels of UA in NIST SRM 3666 are: ≤ 10 mg/L, 10 – 50 mg/L, 50 – 200 mg/L, and 200 – 400 mg/L. Each level will be derived from human urine samples pooled from a minimum of 20 donors. Certification of the material for ACR is expected in mid-2015, with values measured by a NIST-developed IDMS method.

Discussion

The use of the primary and secondary reference materials as calibrators was discussed:

A participant asked whether two concentrations of the primary reference material, aqueous NIST 2925, would be made available. Dr. Beasley-Green responded that only one concentration is available. Dr. Miller clarified that the primary purpose of the primary standard was to calibrate LC/MS procedures, not routine methods.

The secondary reference material is being developed specifically for calibrating routine immunoassays.

The four levels of the secondary reference material, which will be provided at a discrete concentration within each level’s target range, were determined by what was appropriate for the range of routine UA method AMRs. Ideally, the low value will be approximately 5 mg/L.

The timeline for the commutability assessment for NIST SRM 3666 was discussed:

NIST SRM 3666 will be available in March 2015 for the pilot commutability study.

The pilot commutability study is projected to be completed in mid-2015.

The participants asked about the preparation of the NIST SRM 3666 reference material:

NIST SRM 3666 will be filtered at 0.45 μm and stored at −80 °C.

The urine for the different levels will be four separate pools rather than a single pool of all urine collected.

The participants discussed the qualifications of the donors for producing NIST SRM 3666 because of possible interferences:

The issue of drugs, such as antibiotics, which can be present at high concentrations in urine, was raised.

Donors were screened for infectious diseases and blood in urine.

Ongoing treatment with antihypertensive medicines was acceptable for donors.

Plan for Commutability Assessment for NIST SRM 3666

Dr. Beasley-Green

Dr. Beasley-Green gave the International Organization for Standardization’s (ISO) definition of commutability: “closeness of agreement between the mathematical relationship of the measurement results obtained by two measurement procedures for a stated quantity in a given material, and the mathematical relationship obtained for the quantity in routine samples.” There is a three-phase plan for the commutability assessment for NIST SRM 3666: (1) produce material (to be finished in March 2015); (2) in parallel to the value assignment, conduct a pilot commutability study with five manufacturers (to be completed in mid-2015); and (3) conduct a full commutability study with 10 to 15 assays (to be completed in late 2015 or early 2016). Dr. Beasley-Green raised points of discussion, including the value assignment of UA and UC; uncertainty of the UA calibrator (SRM 2925); estimation of the volume required for manufacturers; and integrity of the patient urine samples, especially in consideration of the effects of freezing.

Discussion

The participants discussed the pilot commutability study:

Dr. Karen Phinney indicated that its value is to ensure the material is likely suitable for use by routine measurement procedures and to help design the full commutability study as well as get initial feedback from the manufacturing community.

Regarding the feasibility of conducting the pilot study, Dr. Miller asked about the quality assurance protocols. It would be desirable to include fresh and unpooled samples.

The routine measurement procedures selected for the pilot should represent different assay implementations, particularly with regard to the antibodies used by different manufacturers.

The effect of freezing on samples was raised as an issue that should be decided before the pilot study:

It was noted that Dr. Bachman’s data for the IDMS cRMP showed a small bias from freeze-thaw for IDMS results that was corrected for in the data analysis.

Manufacturers might have data on whether frozen samples would be acceptable for their methods.

Logistically, it might be easier for a clinical laboratory with access to many samples to conduct a study on the effects of freeze-thaw. A possibility would be for manufacturers to nominate selected customers to collaborate with in the freeze-thaw study so that logistical problems with shipping samples would be eliminated.

The freeze-thaw study possibly could be combined with the pilot study. Dr. Miller proposed that the LWG/NKDEP and IFCC WG-SAU develop an experimental design to assess freeze-thaw effects and conduct the pilot study simultaneously, proposing a timeframe, and then circulate the plan to manufacturers for review. However, it may be that doing each step separately will be logistically simpler so that manufacturers can better ensure that the representative routine measurement procedures are operating to specifications.

The full commutability study will most likely be scheduled for early 2016:

Dr. Miller noted that there is an IFCC Working Group on Commutability that has developed guidelines for experimental design that will likely specify a minimum of approximately 40 patients with three replicates for each patient and two runs per sample to estimate uncertainty with appropriate statistical significance. However, the details of an experimental design will be influenced by performance characteristics of the routine measurement procedures involved.

Dr. Miller proposed that the LWG/NKDEP and NIST develop an experimental design for the full commutability study, circulating the plan to manufacturers for review.

Urine Albumin Reference Measurement Procedures

Mayo LC-IDMS

John Lieske, M.D., Mayo Clinic, Rochester, MN

Dr. Lieske provided an update on the UA RMP at the Mayo Clinic. Of the tryptic peptides of urine albumin with demonstrated multiple reaction monitoring (MRM) transitions by LC-MS/MS, five are reproducibly observed in patient urine samples, and three were used for quantification in a clinical comparison study. The Mayo LC-IDMS assay showed good interassay, intra-assay, and within-run precision. Precision improved slightly when three peptides were included in quantitation. Samples were linear up to a four-fold dilution, which is sufficient to measure most clinical samples. Spiking native urine with peptide fragments of less than 10 kilodaltons (kDa) did not affect the assay. There was some bias between peptide measurements based on peptide location in the protein, but it was on the order of 1 percent. As Dr. Bachman showed, routine methods showed bias relative to the Mayo IDMS procedure, particularly at low concentrations. Freeze-thaw of individual urine samples produced a small positive bias, averaging 3.5 percent, in results from the Mayo IDMS procedure.

NIST LC-IDMS

Dr. Beasley-Green

Dr. Beasley-Green described the results of the validation study for NIST’s LC-MS/MS UA assay. The assay was performed on a tryptic digest of albumin, yielding 11 peptides by MRM analysis for quantitation. With the NIST cRMP, quantitative and qualitative analyses can be performed. Thirty patient urine samples, calibrants, and quality control samples were measured by both NIST and the Mayo Clinic. Results for several of the 30 unpooled patient samples had substantial discrepancies between the two IDMS measurement procedures. These discrepancies were thought to be caused by pre-analytical sample preparation, differences in the source of albumin used to prepare calibrators, or different internal standard 15N-labeled albumin.

Discussion

The participants discussed whether the peptides were selected as those that were potentially modified in disease.

Dr. Beasley-Green stated that the peptides were selected based on detection by MS.

With 11 peptides in the NIST cRMP, relative quantities can be determined to see if any of the peptides decrease. It was suggested that in the commutability assessment, the stability of relative ratios of peptides between patients could be explored.

The participants discussed the large discrepancies in some of the NIST versus Mayo results.

Dr. Beasley-Green responded that centrifugation during sample preparation might play a role; NIST does not centrifuge samples as part of its protocol, and some samples were cloudy.

Another difference is that each laboratory used its own internal standards, source of albumin to prepare calibrators, and trypsin digestion protocols. NIST’s standard is made in-house.

A second comparison study is planned in which each laboratory will analyze trypsin digests and calibrators from the other laboratory.

Resolving the differences between the NIST and Mayo Clinic IDMS results was recognized as critical to the development of the standardization program for UA.

Plan the Standardization Program for Urine Albumin

Dr. Miller

Dr. Miller introduced the discussion of the plan for the standardization program for UA. The plan will need to address the various scientific points raised in the earlier discussions.

For the full SRM 3666 commutability assessment, it would be ideal if the experimental design included the manufacturers’ working calibrators and a sufficient number of patients to gather data for the manufacturers’ resubmissions to the Food and Drug Administration (FDA) so that the recalibration changes could be made operational.

Discussion

The participants discussed the FDA resubmission process for routine methods:

The legal requirement for resubmission is the introduction of changes that could affect the performance of a device. If changes in performance are small, data requirements will not be extensive for resubmission, or resubmission might not be required at all. The qualification about changes in performance being small allows some flexibility in deciding whether resubmittal is necessary.

The FDA would be willing to review the commutability protocol and provide feedback, as well as speak directly with manufacturers as a group or independently. A key consideration is that separate sample sets need to be used to assess commutability and validate the standardization. Recently the FDA participated in a similar study for Vitamin D, and therefore has experience in helping manufacturers with resubmissions.

To create a commutability study that would also serve for resubmission to the FDA, Dr. Miller asked the manufacturers how many patient samples were needed. At least 40 or as many as 100 was considered an appropriate number. The concentrations in the patient samples should bracket the NIST SRM levels.

The participants agreed to the following status of issues and steps for the plan to standardize UA measurement procedures:

Bias in results measured by different routine measurement procedures was the dominant error component contributing to differences in patient sample results. The fluorescent immunoassay used for UA measurements in the NHANES program had a proportional bias versus Mayo’s IDMS procedure that was similar in magnitude to that of some of the routine measurement procedures.

Bias for some routine measurement procedures differed with concentration. Consequently, a simple correction for a proportional bias will not be suitable for some routine measurement procedures. There may be several causes for different bias at different concentrations, including concentrations and/or number of calibrators not adequate for the AMR or measuring interval; matrix of calibrators not appropriate or different at different concentrations; and uncertainty in calibrator concentrations when large dilutions of, for example, ERM DA470k/IFCC or its predecessor ERM/CRM DA470 serum protein reference material are used. Concentration-dependent biases need to be addressed by the manufacturers, where applicable, before other standardization steps can proceed.
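The distinction between a proportional bias and a concentration-dependent bias can be illustrated with a short calculation. This is a sketch only: the concentrations below are invented for illustration and are not workshop data, and the 5 percent tolerance is a hypothetical threshold rather than a performance specification.

```python
def percent_bias(routine, reference):
    """Percent bias of each routine result relative to its reference value."""
    return [100.0 * (r - ref) / ref for r, ref in zip(routine, reference)]


def bias_is_concentration_dependent(routine, reference, tolerance_pct=5.0):
    """Flag a procedure whose percent bias varies by more than tolerance_pct.

    tolerance_pct is a hypothetical threshold, not a workshop specification.
    """
    biases = percent_bias(routine, reference)
    return max(biases) - min(biases) > tolerance_pct


# Invented illustrative values in mg/L (not workshop data).
reference = [10.0, 50.0, 200.0]
proportional = [11.0, 55.0, 220.0]      # +10% at every level
level_dependent = [12.0, 55.0, 200.0]   # +20%, +10%, 0%

print(bias_is_concentration_dependent(proportional, reference))
print(bias_is_concentration_dependent(level_dependent, reference))
```

In the first case a single multiplicative factor would remove the bias. In the second case no single factor can, which is why such procedures need attention to calibration before a proportional correction is considered.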

Some routine measurement procedures had different bias for results within the AMR versus those from samples that were diluted to obtain measurable concentrations within the AMR. Some routine measurement procedures do not specify a diluent or dilution protocol in the instructions for use (IFU). There may be a confounding effect of inaccurate dilution of the urine sample and a contribution from concentration-dependent calibration bias depending on the final concentration of a diluted sample. Dilution limitations need to be addressed by the manufacturers, where applicable, before other standardization steps can proceed.

Sample-specific effects were considered modest (2.8–15.2%). It is likely that most routine measurement procedures are measuring suitable epitopes on the albumin molecule. A limitation in the data is that sample-specific effects were determined as the random error that was not explained by other measurable components such as within-run, between-run and position effects. There may be individual patients with an unusual form of albumin that may be less reactive in some routine measurement procedures, or there may be specific influence quantities impacting the measurement results in some individual patients’ urine. It remains unknown whether standardizing immunoassays to recognize specifically agreed-upon epitopes would improve agreement among results.

The AMR (measuring interval) varied substantially among different routine measurement procedures. The lower limits varied between 1.2 and 12 mg/L and the upper limits varied between 60 and 500 mg/L. Research is likely to establish that lower UA thresholds should be used for risk classification, which probably will require quantitation down to at least 1 to 2 mg/L. Higher concentrations of 5 to 10 g/L are clinically relevant, and patients with concentrations in this range should have their samples diluted to report a quantitative result. It is not clinically appropriate simply to report “greater than” values for UA or ACR. It is desirable for UA methods to have upper AMR limits that reduce the need for dilution as far as is practical for the technology, or to have the in vitro diagnostic (IVD) instruments perform highly accurate automated dilutions when needed. The LWG will recommend a desirable AMR.
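The amount of dilution implied by these figures can be worked out directly. The sketch below assumes whole-number dilution factors and an ideal, error-free dilution; both function names are hypothetical. It does not check that the diluted sample stays above the lower AMR limit, and it ignores the dilution bias discussed earlier in these minutes.

```python
import math


def min_dilution_factor(expected_mg_l, upper_amr_mg_l):
    """Smallest whole-number dilution factor that brings a sample to or
    below the upper AMR limit (1 means no dilution is needed)."""
    return max(1, math.ceil(expected_mg_l / upper_amr_mg_l))


def reported_result(measured_diluted_mg_l, dilution_factor):
    """Quantitative result for the undiluted sample, assuming ideal dilution."""
    return measured_diluted_mg_l * dilution_factor


# A 5 g/L (5000 mg/L) sample on procedures with upper AMR limits
# of 500 mg/L and 60 mg/L, the extremes reported in these minutes.
print(min_dilution_factor(5000.0, 500.0))
print(min_dilution_factor(5000.0, 60.0))
```

The large difference between the two factors shows why a higher upper AMR limit, or accurate automated dilution, matters for samples in the g/L range.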

Performance specifications for bias and imprecision need to be developed by the LWG for UA measurement at different clinically relevant concentrations, including concentrations that require dilution for measurement. Estimates of within-individual biological variability in the literature vary between 4% and greater than 100%, likely for reasons including: UA is present at very low concentrations in nondiseased persons and thus is difficult to measure by most currently available routine measurement procedures; people with CKD are not metabolically stable regarding albumin excretion, so the choice of people (i.e., disease status) to include in a study will influence the estimates obtained; and the experimental design (in particular the sampling frequency) will influence the estimates obtained. A modeling approach based on the influence of changes in UA on treatment decisions may be considered to establish performance requirements. Performance requirements need to consider that urinary ACR is the recommended reported result, and performance requirements for both analytes need to be specified.

Although UA is almost always measured quantitatively in freshly collected nonfrozen samples for normal medical care, it is necessary for practical reasons to use frozen samples for the purpose of establishing calibration traceability and validating commutability of reference materials. Some groups have data on the stability of samples to freeze-thaw for some routine clinical UA measurement procedures. An investigation showed an average 3.5% positive bias (with substantial scatter for individual samples) for samples frozen for 3 days and then measured using the Mayo IDMS procedure.

It was agreed that an experiment to confirm the influence of freeze-thaw for the IDMS and routine methods would be conducted. When frozen storage is used, the temperature should be C or lower. A thawing and mixing protocol should be followed to ensure uniform conditions. Validation of long-term frozen storage also is needed. The LWG will develop protocols for these experiments.

Manufacturers need access to a laboratory that performs the reference measurement procedure for UA to assist with investigation of bias and calibration traceability. John Lieske indicated that his laboratory would be available to perform measurements to assist manufacturers. John Eckfeldt indicated that his laboratory is exploring setting up an LC-IDMS measurement procedure. These laboratories can provide cRMP support until the validation of the IDMS procedures with NIST is complete and the procedures are listed by Joint Committee for Traceability in Laboratory Medicine (JCTLM).

A reference system for UA measurement is being developed with the following components and anticipated availability:

LC-IDMS reference measurement procedures are being developed by NIST and the Renal Reference Laboratory at the Mayo Clinic. Comparison of results for a set of individual patient urines measured by each procedure showed some discrepant results that are being actively investigated by the two groups. Resolution of the root cause(s) is expected to enable a robust reference measurement procedure to be submitted to JCTLM for listing. Resolution of the discrepancies is a high priority for the UA standardization program.

Urine from the individual donors for the SRM pools will be frozen until it is thawed for pooling, and 40 individual aliquots also will be frozen. The materials will be thawed, pooled and refrozen in vials to achieve four concentrations of UA at approximately 5, 30, 100, and 300 mg/L. The individual aliquots will be used for the commutability assessment. There was discussion of the potential impact of interfering substances (influence quantities) declared in routine measurement procedure IFUs that may be present in a donor’s urine and might compromise the suitability of a pool. Each pool will include a minimum of 20 donors so a potential interferent will be diluted in the final pool. It was noted that data on potential interfering substances obtained by analyzing the individual donor aliquots as part of the commutability assessment would not be usable to correct for interferences in SRM 3666 because the reference materials would already have been pooled. A precaution is to pre-qualify the individual donors before inclusion in a pool by testing aliquots by representative routine measurement procedures for UA. All of the individual donor urine samples are expected to be available in March 2015. NIST will determine if any prequalification of donors is to be performed.

It is necessary to perform the experiment on the influence of freeze-thaw on individual urine samples before conducting the commutability assessment.

It is desirable to also include freshly collected, nonfrozen individual urine samples in the commutability assessment. The experimental design will be developed with and without including nonfrozen samples to understand the complexity and feasibility. The individual nonfrozen samples will have concentration bins to correspond to the final concentrations of the pools.

Not all SRM concentrations can be evaluated by all routine measurement procedures due to large differences in the AMRs (measuring intervals) of the procedures.
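This limitation can be made concrete by listing which of the approximate SRM concentrations (5, 30, 100, and 300 mg/L) fall inside a given AMR. The sketch below is illustrative only. The two AMRs are hypothetical procedures assembled from the extremes of the lower and upper limits reported in these minutes, and the minutes do not state that any single procedure has either combination.

```python
# Approximate SRM 3666 pool concentrations in mg/L, as stated in these minutes.
SRM_LEVELS_MG_L = (5.0, 30.0, 100.0, 300.0)


def measurable_levels(amr_low_mg_l, amr_high_mg_l, levels=SRM_LEVELS_MG_L):
    """Return the SRM levels that fall within an AMR without dilution."""
    return [c for c in levels if amr_low_mg_l <= c <= amr_high_mg_l]


# Hypothetical widest and narrowest AMRs built from the reported extremes.
print(measurable_levels(1.2, 500.0))
print(measurable_levels(12.0, 60.0))
```

Under these assumptions the widest AMR covers all four levels, whereas the narrowest covers only the 30 mg/L level.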

A pilot assessment of the pooled urines for general suitability using representative routine measurement procedures will be conducted before the final pools are prepared. This assessment is expected to occur in the mid-2015 time frame. The experimental design for the pilot assessment will be developed as soon as possible.

The full commutability assessment likely will occur in the first part of 2016 and will include both SRM 3666 UA and SRM 3667 creatinine. The experimental design for the full commutability assessment will be developed as soon as possible.

Planning for the standardization of routine measurement procedures.

It is desirable to include manufacturer’s working calibrators and other internal reference materials in the commutability assessment to expedite the process of value assignment and recalibration of routine measurement procedures.

It is desirable to include freshly collected, nonfrozen patient samples in the commutability assessment to assist manufacturers with gathering data needed for recalibration of their routine measurement procedures.

It is necessary to plan for a separate set of individual patient samples for follow-up validation of the success of manufacturer’s recalibrations.

The experimental designs for the necessary commutability assessment and validation of successful recalibration will be developed by NIST and the LWG. The initial designs will be reviewed with manufacturers and refined as needed.

The FDA will review the final plans and provide input on its suitability to support submission by manufacturers and what additional information may be needed to support submission. The intent is to implement the recalibration as expeditiously as possible.

Interpretive Thresholds for Urine Albumin Results

Clinical Investigation Topics

Ms. Nilka Rios Burrows and Dr. Sharon Saydah provided an overview of methods, prevalence, and related factors in regard to national surveillance of albuminuria. The NHANES, a major program of the CDC, began in the 1960s as a series of surveys to assess the health and nutritional status of children and adults in the United States. Currently, the survey is conducted continuously, and data are released every 2 years. The study samples approximately 5,000 individuals per year and oversamples African Americans; Asian Americans; Hispanics; adults more than 60 years old; and low-income, non-Hispanic whites. There are four stages of NHANES sampling: Stage 1 is by county; Stage 2 is by segment of county; Stage 3 is by household, some of which are oversampled; and Stage 4 is by individual selected from a household. The Survey comprises an in-home interview and a health exam, performed in the NHANES Mobile Exam Center (MEC), which involves blood and urine sampling, a dietary interview, oral health measures, and more. Participants receive compensation and a copy of results. Urine is now collected on children 3 to 5 years old; previously, it was only collected from those older than 5. Random urine samples are collected in the MEC, home urine collection was added in 2009, and 24-hour collection began in 2014. Instructions for the home urine kit, which includes a refrigerant gel pack, request that samples be returned within 10 days. Urinary albumin is measured by solid-phase FIA, and urinary creatinine is measured by a modification of the Jaffe reaction.

For surveillance, the NHANES defined albuminuria using an ACR from random/spot urine samples of 30 to 299 mg/g (microalbuminuria) or ≥ 300 mg/g (macroalbuminuria). CKD Stages 1 and 2 were defined by a single test showing albuminuria together with an eGFR ≥ 90 mL/min/1.73 m² (Stage 1) or an eGFR of 60 to 89 (Stage 2). Stages 3 and 4 were defined as an eGFR of 30 to 59 (Stage 3) and 15 to 29 (Stage 4) regardless of ACR. From 1988 to 2012, there was an increase in Stages 3 and 4 CKD, whereas Stages 1 and 2 stayed the same. In the 1999 through 2012 data, there was a high prevalence of albuminuria with CKD, particularly in individuals with a GFR < 60. Even individuals with advanced kidney disease had low awareness of their condition; awareness among Stage 4 individuals was 40 to 50 percent.
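The surveillance definitions above can be expressed as a small classification routine. This is a sketch of the definitions as summarized in these minutes, not NHANES analysis code. The function names are hypothetical, the treatment of non-integer values at the category boundaries is an assumption, and eGFR values below 15 are outside the stages described here.

```python
def albuminuria_category(acr_mg_g):
    """Surveillance category from a random/spot ACR in mg/g."""
    if acr_mg_g >= 300:
        return "macroalbuminuria"
    if acr_mg_g >= 30:
        return "microalbuminuria"
    return "none"


def ckd_stage(egfr, acr_mg_g):
    """CKD stage 1 to 4 per the definitions summarized above, else None.

    eGFR is in mL/min/1.73 m^2. Stages 1 and 2 require albuminuria
    (ACR >= 30 mg/g); Stages 3 and 4 are assigned regardless of ACR.
    eGFR below 15 is not covered by these minutes and returns None.
    """
    has_albuminuria = acr_mg_g >= 30
    if egfr >= 90:
        return 1 if has_albuminuria else None
    if egfr >= 60:
        return 2 if has_albuminuria else None
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return None


print(albuminuria_category(45), ckd_stage(95, 45))
print(albuminuria_category(10), ckd_stage(45, 10))
```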

Ms. Burrows described laboratory reports of UA in cohorts of Veterans Administration (VA) patients and the NHANES. The VA data were collected from a random sample of 5 percent of patients within each service network in fiscal year (FY) 2005 through 2009. The percentage of patients with measured UA increased from 2005 to 2012, especially in the older population. In 2012 data, men had higher percentages of measured UA than women, and in middle age, members of minority ethnic groups had higher percentages than non-Hispanic whites. The percentage with measured UA was much higher in patients with diabetes than in those who were not diabetic, and the percentage also was higher for patients with hypertension than for patients who were not hypertensive. Albuminuria testing in VA patients with AKI is used to assess quality of care and is related to the Healthy People 2020 (HP2020) objective to increase the proportion of hospital patients over 65 years who incurred AKI who have follow-up renal evaluations. In 2010 in the VA, 14.6 percent of patients had a follow-up UA test. In summary, the 1999 through 2012 NHANES data show that 15 to 16 percent of adults had evidence of CKD Stages 1 through 4; approximately 10 percent of U.S. adults had albuminuria in 1999 through 2012; and less than 5 percent of people with CKD Stages 1 through 2 knew that they had CKD in 1999 through 2010 data, with people with macroalbuminuria being generally more aware of their CKD. In the VA data, the percentage of veterans with positive UA tests increased over time; racial/ethnic minorities were more likely to have positive UA reports, with men, veterans with diabetes, and veterans who were hypertensive being more likely to be positive for UA; and fewer than 15 percent of patients with AKI received a UA test following discharge.

Dr. Saydah reviewed the NHANES UA and UC data in terms of demographics and time of urine measurement. The NHANES data were stratified by sex, race, and ethnicity. UC differences were much greater than UA differences between men and women, as well as by race and ethnicity. If a sex-specific decision value for microalbuminuria is used, the differences between men and women are much less. If a race/ethnicity-specific decision value is used, the magnitude of the albuminuria rate changes but the relative differences remain the same.

The NHANES data also can be used to compare first morning void and random urine ACR test results. The first morning void results had a lower mean but were highly correlated with random/spot test results. A plot of the percent difference versus the mean for random and first morning ACR results showed that higher results had a greater range. As Dr. Narva had indicated in his presentation, only 43.5 percent of patients had confirmed elevated ACR in both first morning void and random/spot test results. The degree of agreement increased in patients with diabetes and those with hypertension. In summary of the results from the NHANES data, ACR differences by sex were attributable to differences in creatinine levels; ACR differences by race/ethnicity were explained by differences in both UA and UC levels; sex-specific decision values result in similar prevalence by sex, but differences persist by race/ethnicity; ACR from random urine tests had a higher mean and broader distribution than ACR measured in first morning void; and prevalence of elevated ACR based on random urine tests compared to first morning void was higher, a pattern that persisted across demographic groups and by diabetes and hypertension status.
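The percent-difference-versus-mean comparison mentioned above can be computed pair by pair. The sketch below is one common way to form such a plot and is not necessarily the calculation used in the presentation. The sign convention (random minus first morning), the 30 mg/g threshold used for "elevated", and the function names are assumptions.

```python
def percent_difference_vs_mean(random_acr, first_morning_acr):
    """Return (pair mean, percent difference) for one pair of ACR results.

    Values are in mg/g. The difference is random minus first morning,
    expressed as a percentage of the pair mean (an assumed convention).
    """
    mean = (random_acr + first_morning_acr) / 2.0
    return mean, 100.0 * (random_acr - first_morning_acr) / mean


def elevated_in_both(random_acr, first_morning_acr, threshold_mg_g=30.0):
    """True if both results are at or above the assumed threshold."""
    return random_acr >= threshold_mg_g and first_morning_acr >= threshold_mg_g


# Hypothetical pair: random result 30 mg/g, first morning result 10 mg/g.
print(percent_difference_vs_mean(30.0, 10.0))
print(elevated_in_both(30.0, 10.0))
```

The hypothetical pair shows how a result can be elevated on a random sample but not on the first morning void, which is the kind of disagreement described above.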

The use of albuminuria data to predict outcomes was discussed by Dr. Saydah. She presented results from a longitudinal study of adults with diabetes in the VA with a median follow-up of 6.4 years, and meta-analyses of 30 general population or high-risk cohorts, as well as 13 CKD cohorts, with a mean follow-up of 8.5 years. An analysis of 2002 VA data showed that a large percentage of adults with diabetes and reduced eGFR have albuminuria. In patients stratified by age group and eGFR, those with an ACR result less than 30 had the lowest mortality, and patients with elevated ACR had increased mortality regardless of eGFR. Meta-analyses of adults with and without diabetes revealed an increase in risk of all-cause and cardiovascular mortality with increasing ACR that was higher for patients with diabetes. For adults with hypertension, even at levels of ACR of 10, there was an increased risk of all-cause and cardiovascular mortality. Therefore, ACR has been shown to be independently associated with elevated mortality in older adults with diabetes at all levels of eGFR, and ACR has been shown to be positively associated with all-cause and cardiovascular mortality with a greater magnitude of risk for those with diabetes. Standardization of UA measurement will provide the reproducibility to explore trends in prevalence, allow the use of UA in diagnosis of CKD and predicting complications, and facilitate comparisons across studies and between laboratories.

Discussion

The participants discussed the urine samples and attendant data collected by the NHANES. In the discussion, the participants made the following points:

The NHANES collects some medical history and has information regarding medications and supplement use.

From 1999 forward, urine samples that have been stored at −70 °C and never thawed are available from individuals age 6 and older. These could be used in freeze-thaw experiments. Samples from children who are 3 to 5 years old began to be collected in 2015. Also, NHANES collected a second spot urine specimen at home and a second 24-hour urine on a subset of participants to estimate biological variation of urine ACR.

The ACR results of 24-hour urine samples, collection of which began in 2014, should be available at the end of 2015. The participants discussed compliance by individuals in collecting 24-hour samples. A British study estimated only 50 to 60 percent compliance, but the CDC established quality control guidelines for the NHANES, such as a 500 mL minimum volume.

The correction of existing NHANES ACR data measured by FIA for known bias will be possible once discrepancies between the Mayo Clinic and NIST methods are resolved.

Possible uses for the NHANES data—once it is corrected—were proposed:

It can be used to determine the importance of sex and race/ethnicity in developing discrete decision levels.

The data from the 24-hour samples compared with random/spot results will be useful in establishing decision values for ACR.

Exercise proteinuria was proposed as a possible cause of elevated random/spot ACR results relative to first morning void. The point was raised that exercise proteinuria might predict disease.

Laboratory Considerations

Dr. Miller

Dr. Miller observed that the need to establish treatment thresholds clarifies requirements for the laboratory measurements. ACR values of 5 to 10 mg/g and possibly lower may be clinically relevant. This creates challenges to manufacturers to ensure that their assays perform well in that range. Some routine measurement procedures have current limits of quantitation (LOQs) for UA in the range of 1 to 2 mg/L, but other procedures are higher. A participant noted that in determining the required LOQ for albumin, the creatinine concentrations also will need to be considered.
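The participant's point about creatinine follows from the definition of ACR: the albumin concentration that corresponds to a given ACR equals the ACR multiplied by the creatinine concentration. The sketch below applies that relationship. The creatinine concentrations are hypothetical values chosen to represent dilute, intermediate, and concentrated urine, and the function name is also hypothetical.

```python
def albumin_at_acr_threshold(acr_threshold_mg_g, creatinine_g_l):
    """Albumin concentration (mg/L) that corresponds to an ACR threshold
    (mg albumin per g creatinine) at a given creatinine concentration (g/L)."""
    return acr_threshold_mg_g * creatinine_g_l


# ACR threshold of 5 mg/g at three hypothetical creatinine concentrations.
for creatinine_g_l in (0.25, 1.0, 2.0):
    print(creatinine_g_l, albumin_at_acr_threshold(5.0, creatinine_g_l))
```

Under these assumptions, an ACR of 5 mg/g in dilute urine corresponds to an albumin concentration near the 1 to 2 mg/L LOQ range noted above, so the required LOQ depends on how dilute the samples are expected to be.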

After standardization is achieved, a decision will be needed on which validation studies to perform for any new clinical standard that is supported by reanalyzing a large amount of data, such as from the NHANES.

It is important to continue to improve the understanding of biological variability. The NKDEP LWG has a protocol design to study biological variability, but identifying a medical center to organize the work has proven difficult because of the challenging logistics of collecting urine samples over several months. The NHANES data will provide information about biological variability, but the limitations of the timeframe for NHANES data were recognized. The data in a given location are collected over a maximum period of 6 weeks. The current data from an individual are spaced by a maximum of 15 to 17 days with 10 days being the average. The NHANES data will be informative about biological variability but do not represent an ideal experimental design. Dr. Narva stated that the first step is to have a standardized method for measuring UA. After the method is developed, prospective studies will be proposed to explore issues like biological variability.

In response to a participant’s question on the timeframe of the commutability assessments, Dr. Miller anticipated that it would take approximately 1 month to develop preliminary experimental designs for the manufacturers’ review.

Dr. Miller indicated that the slides for the workshop, redacted as necessary by the presenters to remove preliminary data, will be available for a defined period of time online on the website of the contractor, The Scientific Consulting Group, Inc. Minutes also will be available from the meeting.

Adjournment

Dr. Miller thanked the attendees for their participation and adjourned the meeting at 3:25 p.m.

Action Items

Dr. Miller agreed to lead the effort on behalf of the LWG/NKDEP to develop draft experimental designs for the pilot and full commutability assessments in collaboration with NIST. After initial development, the experimental designs will be submitted to the manufacturers and the FDA for review.

The slides for the workshop, redacted as necessary by the presenters to remove unpublished data, will be posted online.

Minutes from the meeting will be available online.

Summary of action items for standardization of routine measurement procedures for UA

| Item | Approximate Date | Responsible |
| --- | --- | --- |
| Concentration-dependent biases | ASAP; predecessor for standardization | IVD manufacturers |
| Dilution influence on bias | ASAP | IVD manufacturers |
| Resolution of differences in IDMS procedure results | ASAP | NIST; Mayo Renal Reference Laboratory |
| Experimental design for freeze-thaw effects | March 2015 | NKDEP |
| Pre-qualification of donors to SRM 3666 | March 2015 | NIST |
| Experimental design for pilot commutability assessment for SRM 3666 | March 2015 | NIST |
| Experimental design for full commutability assessment for SRM 3666 (to include components for manufacturer recalibration support) | | |

Robert Star
Director
Division of Kidney, Urologic, and Hematologic Diseases
National Institute of Diabetes and Digestive and Kidney Diseases
National Institutes of Health
Bethesda, MD
Email: starr@mail.nih.gov