If you took a university introductory statistics course, you may have learned the distinction between accuracy and precision. It was likely presented with an archery analogy, where ‘Accurate’ was represented by arrows loosely clustered around the target’s bull’s-eye, ‘Precise’ was shown as a tight grouping displaced from the center, and ‘Accurate and Precise’ was depicted as what every archer aims for: a tight grouping directly on the bull’s-eye. Suddenly, words that are used interchangeably in everyday conversation took on dramatically different meanings.
The concept is straightforward and applies to modern, complex diagnostic testing. In the case of clinical genetics and genomics using Next Generation Sequencing (NGS), the target may take on many different shapes and sizes depending on whether you are interested in a few genes, the exome, or the entire genome. Additional considerations the archer must take into account are skill at different ranges, bow draw weight and arrow selection, and even whether shots will be fired on foot or from horseback. Similarly, for clinical NGS testing, coverage depth and other parameters can determine the ability to call variants; workflow components such as sequencing platforms and target enrichment chemistries also affect the final outcome; and variability that can affect results is inherent in highly heterogeneous tumor samples.
This analogy is a good starting point, but it must be expanded, since NGS offers an unprecedented ability to hit an almost unlimited number of targets. Sensitivity and Specificity are the key metrics used to report test accuracy: sensitivity is the proportion of true positives that are detected, so it reflects the False Negative rate, and specificity is the proportion of true negatives that are correctly reported as negative, so it reflects the False Positive rate. The FDA has long stated that these terms should be reserved for measuring the performance characteristics of an assay against an accepted reference standard; if no reference standard is available and results are compared to those generated by an orthogonal method, the Agency says the terms Positive and Negative Agreement should be used instead. Typically, these estimates are presented with confidence intervals, so claims will appear more robust as sample size increases, assuming a high degree of concordance. A critical limitation is the time and resources that laboratories and developers are willing to put toward procuring samples with relevant mutations.
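To make the effect of sample size on these estimates concrete, here is a minimal sketch in Python. It assumes the Wilson score interval as the confidence interval method, which is one common choice and not something the FDA or this article prescribes; the function names and the example counts are hypothetical. The same arithmetic applies to Positive and Negative Agreement when the comparator is an orthogonal method rather than a reference standard.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximately 95% two-sided interval.
    The choice of the Wilson method is an assumption of this sketch.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def accuracy_metrics(tp, fn, tn, fp):
    """Point estimates and Wilson intervals from a 2x2 comparison table.

    tp/fn: variants in the comparator that the assay detected / missed.
    tn/fp: comparator-negative positions the assay called negative / positive.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts: perfect concordance at two different sample sizes.
    for n in (20, 200):
        low, high = wilson_interval(n, n)
        print(f"{n}/{n} detected: lower bound {low:.3f}, upper bound {high:.3f}")
```

With perfect concordance, 20 of 20 detected variants gives a lower 95% bound of about 0.84, while 200 of 200 gives about 0.98. The point estimate is 100% in both cases, but only the larger study supports a strong claim.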
Though accuracy is a key performance characteristic that must be determined through well-designed studies, it is only part of the picture. The ability to detect variants alone does not mean a test is diagnostically useful unless there is also assurance that it can yield the correct results again and again, over long periods of time and in the face of inevitable change, such as the addition of new biomarkers, new library preparation methods, and even new laboratory personnel who are trained to run the test.
‘IRREPRODUCIBLE SCIENCE IS NOT SCIENCE’
Dr. Weida Tong of the FDA was one of many speakers to drive home this point during the recent Sequencing Quality Control (SEQC2) workshop that took place at the NIH in Bethesda, Maryland. Dr. Tong forcefully stated, ‘Irreproducible science is not science’, a theme that ran through the workshop and was reflected in the many different strategies presented for evaluating assay reproducibility. There is widespread consensus in the diagnostic NGS community on the importance of determining not only test accuracy but also precision, through reproducibility studies. However, there is currently no unified, standardized practice for evaluating this metric.
For example, New York State Department of Health guidelines for clinical NGS somatic mutation detection tests (PDF) currently use the term ‘precision’ to refer to intra-run reproducibility, and require triplicate testing within the same run for at least three samples for each variant type the assay can detect. The same guidelines use ‘reproducibility’ to refer to inter-run reproducibility, and the requirement is at least three positive samples for each variant type analyzed in three independent runs; this requirement also states that the batches should be run on separate days, by two different operators, with the parenthetical qualification, ‘if possible’.
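As a rough illustration of what those minimums add up to, the sketch below counts the tests implied by the paragraph above for an assay covering a given number of variant types. It is a reading of this summary, not of the guideline text itself, and it assumes the intra-run and inter-run studies are counted separately; the function and parameter names are hypothetical.

```python
def minimum_replicate_counts(n_variant_types,
                             samples_per_type=3,
                             replicates_within_run=3,
                             independent_runs=3):
    """Minimum number of tests implied by the summary above.

    Assumption: the intra-run ('precision') and inter-run ('reproducibility')
    studies are tallied separately, with three samples per variant type each.
    """
    intra_run = n_variant_types * samples_per_type * replicates_within_run
    inter_run = n_variant_types * samples_per_type * independent_runs
    return {"intra_run": intra_run, "inter_run": inter_run}


if __name__ == "__main__":
    # Example: an assay reporting three variant types.
    print(minimum_replicate_counts(3))
```

For an assay that reports three variant types, this comes to at least 27 tests for the intra-run study and 27 for the inter-run study, before any additional operators or days are considered.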
On the other hand, Proficiency Testing (PT) programs managed by the College of American Pathologists (CAP) for laboratories certified under the Clinical Laboratory Improvement Amendments of 1988 (CLIA) require that PT samples be handled no differently than real patient samples, which means intra- and inter-run reproducibility testing are generally not performed. However, PT is currently the most widely practiced method for testing site-to-site reproducibility, with all of the potential sources of variation that come with it.
Meanwhile, various FDA guidance documents, including the draft guidance on the use of standards in NGS-based diagnosis of germline diseases (PDF) issued in July of this year, recommend comprehensive reproducibility studies as an essential part of assay validation. In the style expected of the FDA, however, no concrete requirements are defined for evaluating test precision, and it is left up to individual laboratories to determine the scope of their reproducibility studies.
Because the Agency recommends evaluating sources of variability such as instruments, reagent lots, operators, days, and testing sites, reproducibility studies performed as part of the FDA approval process involve a large amount of replicate testing for each sample. To help ease the burden of procuring large numbers of samples carrying relevant mutations, a ‘combined’ approach is often used, in which many parameters are varied at the same time in the same study. The disadvantage of this approach is that the variables are confounded: if significant variation is observed in the results, identifying the root cause(s) is costly, time-consuming, and a great source of frustration.
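The trade-off between the two designs can be sketched in a few lines of Python. The example below is illustrative only: the factor names and levels are hypothetical, and the check simply flags pairs of factors whose levels always change together across the planned runs, which is the situation in which their effects cannot be separated afterwards.

```python
from itertools import combinations, product


def full_factorial(factors):
    """Enumerate every combination of factor levels as a list of run dicts."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*(factors[name] for name in names))]


def aliased_pairs(runs):
    """Return pairs of factors whose levels always change together in `runs`."""
    names = list(runs[0])
    aliased = []
    for a, b in combinations(names, 2):
        a_to_b, b_to_a = {}, {}
        one_to_one = True
        for run in runs:
            # A level of one factor paired with two levels of the other
            # means the two factors can be told apart.
            if a_to_b.setdefault(run[a], run[b]) != run[b]:
                one_to_one = False
            if b_to_a.setdefault(run[b], run[a]) != run[a]:
                one_to_one = False
        # A factor held at a single level is constant, not confounded.
        if one_to_one and len(a_to_b) > 1:
            aliased.append((a, b))
    return aliased


if __name__ == "__main__":
    factors = {"operator": ["A", "B"], "day": [1, 2], "reagent_lot": ["L1", "L2"]}

    # 'Combined' design: everything changes at once between the two runs.
    combined = [
        {"operator": "A", "day": 1, "reagent_lot": "L1"},
        {"operator": "B", "day": 2, "reagent_lot": "L2"},
    ]
    print(len(combined), "runs, confounded:", aliased_pairs(combined))

    factorial = full_factorial(factors)
    print(len(factorial), "runs, confounded:", aliased_pairs(factorial))
```

In this toy case the combined design needs only two runs per sample, but every pair of factors is confounded. The full factorial design separates all three factors at the cost of eight runs per sample, which is exactly the sample burden the combined approach is meant to avoid.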
RELIABLE, CONSISTENT REFERENCE MATERIALS ARE ESSENTIAL TO ESTABLISH ASSAY PERFORMANCE
Regardless of the strategy used to evaluate assay precision, it is clear that laboratories should not have to face this challenge on their own. As NGS technologies advance and the regulatory landscape evolves, tools such as SeraCare’s oncology reference materials will enable clinicians to have confidence in the performance of the tests they are running, and to focus on developing new solutions to improve patient outcomes.
Today, we are pleased to introduce Seraseq™ Tumor Mutation DNA Mix v2, a high-quality reference material that is manufactured under Good Manufacturing Practices (GMP) in an ISO-certified facility to guarantee precision across different lots. Whether you are looking to evaluate multiple lots to demonstrate robust reproducibility for your NGS somatic mutation test, or need a steady, unchanging supply over time so that you can monitor assay performance, Seraseq™ Tumor Mutation DNA Mix v2 can quickly and easily meet your needs.
Orthogonal testing by digital PCR is also performed for this product to establish ‘ground truth’ variant calls and allele frequencies that can be used to evaluate Positive and Negative Agreement for your NGS assay. The mutational content comprises forty clinically actionable and analytically challenging variants, including single nucleotide variants, insertion/deletion mutations, and structural variants, all of which are present in a single well-characterized genomic background. The forty variants are offered at 10%, 7%, or 4% minor allele frequency. Additionally, a single-vial ‘Tri-Level’ format is available with the forty variants evenly distributed across these three allele frequencies.
To learn more about these products, as well as other oncology reference materials from SeraCare, please visit us online or contact us via the comments below.