Data management for array-based chromosomal analysis

Introduction

Implementing a relational database usually becomes a necessity for a cytogenetic laboratory once it reaches a certain level of throughput in array-based screening and, consequently, involves a significant number of personnel in data analysis. Handling hundreds of array-screened samples per year without a reliable data storage and management solution is already a challenge; when this number approaches a thousand and keeps rising, it becomes virtually impossible.

infoQuant's data management solution, cnTrack, can assist cytogenetic teams in setting up their data interpretation workflow laboratory-wide. It takes care of such critical elements of a multi-user environment as data storage, data workflow management, sample progress monitoring and remote data access. The system builds on current information technology practice, including compatibility with cloud-based deployment. At the same time, it is designed solely for high-resolution data generated by array CGH, SNP arrays and NGS, and accommodates the specific needs of clinical users of these platforms.

Workflow organization

cnTrack not only provides centralized storage for raw array data and analysis results, but also organizes the lab-wide data interpretation workflow along two dimensions of complexity: across multiple users and across multiple stages of the sample review procedure.

A cytogenetic team usually comprises multiple users with different responsibilities in the data analysis workflow, from personnel who extract data and submit it to the database to laboratory directors who approve and release reports. Different team members need different database permissions and different software capabilities. To simplify this distribution of roles, cnTrack defines an easy-to-use web-based workspace for each user. This dedicated workspace helps define the exact role of each user in the data review workflow and helps them complete their assigned tasks efficiently by providing easy access to all necessary software tools, including infoQuant's analytical tools (see the "Efficient analytics" white paper for details).

Each processed sample has to pass through several stages of a review process, and the set of stages can vary greatly from one laboratory to another. Along the way, a sample may have to be passed from one team member to another without losing track of overall progress or of each team member's accountability. cnTrack handles this seemingly complex process in an intuitive and interactive manner. A full audit trail is available for a sample at any point, which ensures the integrity of the review process. At the same time, the sample's projected "workflow trajectory" remains fully flexible: any sample can be re-assigned to a different cnTrack user at any workflow stage to account for changes in staff availability.
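
The combination of stage changes, re-assignment and an audit trail can be illustrated with a minimal Python sketch. It is not cnTrack's implementation or data model: the class names, field names and stage labels below are hypothetical and chosen only to show how every change to a sample can be recorded alongside the sample itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    """One immutable-by-convention record of who changed what, and when."""
    timestamp: datetime
    actor: str
    action: str
    detail: str


@dataclass
class Sample:
    """Hypothetical sample record; not the actual cnTrack schema."""
    sample_id: str
    stage: str
    assignee: str
    audit_trail: list = field(default_factory=list)

    def _log(self, actor: str, action: str, detail: str) -> None:
        # Every state change appends an entry; nothing is ever overwritten.
        self.audit_trail.append(
            AuditEntry(datetime.now(timezone.utc), actor, action, detail)
        )

    def advance(self, actor: str, new_stage: str) -> None:
        """Move the sample to another review stage and record the change."""
        self._log(actor, "advance", f"{self.stage} -> {new_stage}")
        self.stage = new_stage

    def reassign(self, actor: str, new_assignee: str) -> None:
        """Hand the sample to another user at any stage and record the change."""
        self._log(actor, "reassign", f"{self.assignee} -> {new_assignee}")
        self.assignee = new_assignee
```

For example, a director could call `sample.reassign("director", "analyst_b")` while the sample is still in its first review stage; the trail then shows who made the change and what the previous assignment was.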

In cnTrack, high-resolution data for each sample is automatically analyzed for abnormal chromosomal regions (copy number gains and losses, and regions of loss of heterozygosity, LOH) and then committed to the core data repository. At that point the newly committed sample is assigned the initial stage of the interpretation workflow, and a member of the laboratory becomes responsible for making sure that it passes through the review process in a timely manner. Only when the final report is "signed off" by an authorized user is the data lifecycle marked as "complete" in the system. But cnTrack's job usually does not stop there.

Once a cytogenetic report is generated (see the "Efficient analytics" white paper for available formats), it can be sent to the organization or person that ordered the test. This can be done manually by email or post, or through a feature of our system designed specifically for automated delivery of analysis results to remote users. By deploying a restricted user portal (Physician's Portal) built for secure outside access, the cytogenetic team can make distribution of final reports to end users a fully automated task.

Workflow monitoring

No matter how well designed and automated a sample's lifecycle is within a database system, there is always a need to keep track of what is happening with every individual sample in the review pipeline. cnTrack provides such monitoring at two distinct levels.

Each member of the review team can see the samples awaiting their action as soon as they log into their workspace. We made this feature very user-friendly so that every user can quickly understand their tasks for the day and is fully equipped to complete them without going through a sequence of cumbersome extra steps.

Laboratory directors need easy access to the same snapshot information across all users and samples in the database so that they can react quickly to any roadblocks. Our solution not only provides this capability, but also enables them to change the workflow trajectory of an individual sample or a batch of samples.

Status monitoring is a powerful tool for ensuring speedy delivery of reports to laboratory customers and for optimizing the allocation of the laboratory's resources. It can be used to identify process bottlenecks and to quickly re-distribute workload across team members, keeping analysis throughput at the desired level.
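
As an illustration of the kind of snapshot a director might use to spot bottlenecks, the following Python sketch counts open samples per user and stage and flags users above a workload limit. It is a simplified example under stated assumptions, not cnTrack's reporting code: the record layout (`assignee`, `stage`), the stage label "complete" and the `limit` threshold are all hypothetical.

```python
from collections import Counter


def workload_snapshot(samples):
    """Count open samples per (assignee, stage); completed samples are ignored."""
    return Counter(
        (s["assignee"], s["stage"]) for s in samples if s["stage"] != "complete"
    )


def overloaded(snapshot, limit):
    """Return, in alphabetical order, users holding more than `limit` open samples."""
    per_user = Counter()
    for (assignee, _stage), count in snapshot.items():
        per_user[assignee] += count
    return sorted(user for user, count in per_user.items() if count > limit)
```

A user returned by `overloaded` is a candidate for having part of their queue re-assigned to colleagues with fewer open samples.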

Legacy data

Large volumes of analyzed high-resolution data become a very valuable source of knowledge if legacy data is stored and handled appropriately. cnTrack is designed to capture and store a wealth of information for each processed sample.

Each sample is accompanied by clinical attributes that span many categories, such as demographic and phenotypic information. The availability of this information not only helps ensure the accuracy of the final report, but also enables categorization of the historical data accumulated in the database. Groups of samples can then be analyzed and compared by phenotype or genotype using the analytical components of our software suite, which can lead to more accurate interpretation results for each individual patient.

A large amount of region data can be extracted from the final reports stored in the database. Common regions of copy number change can be used to look up legacy samples with similar genotypes, which speeds up sample review. Moreover, once they have gone through clinical interpretation, chromosomal abnormalities can carry valuable causal information, especially when complemented with phenotype annotations. Such clinically relevant DNA anomalies can be condensed into a custom region track or an aberration frequency track and presented as a chromosomal profile of the phenotype under review. Both types of tracks can add significant power to your analysis.
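
Both ideas, looking up legacy samples by overlapping regions and condensing regions into a frequency track, can be sketched in a few lines of Python. This is an illustration only and does not describe how cnTrack computes its tracks: the region layout (`sample_id`, `chrom`, `start`, `end`, `kind`), the half-open coordinate convention and the function names are assumptions made for the example.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if two regions share chromosome, aberration kind and at least one base.

    Coordinates are assumed half-open: [start, end).
    """
    return (
        a["chrom"] == b["chrom"]
        and a["kind"] == b["kind"]
        and a["start"] < b["end"]
        and b["start"] < a["end"]
    )


def similar_samples(query, legacy_regions):
    """IDs of legacy samples with a region overlapping the query region."""
    return sorted({r["sample_id"] for r in legacy_regions if overlaps(query, r)})


def frequency_track(regions, n_samples):
    """Split each chromosome at region breakpoints and report, per segment,
    the fraction of samples carrying any aberration there.

    Gains and losses are pooled here for brevity; a real track would
    likely keep them separate.
    """
    by_chrom = defaultdict(list)
    for r in regions:
        by_chrom[r["chrom"]].append(r)

    track = []
    for chrom in sorted(by_chrom):
        chrom_regions = by_chrom[chrom]
        points = sorted({p for r in chrom_regions for p in (r["start"], r["end"])})
        for lo, hi in zip(points, points[1:]):
            # A sample carries the segment if one of its regions fully covers it.
            carriers = {
                r["sample_id"]
                for r in chrom_regions
                if r["start"] <= lo and r["end"] >= hi
            }
            if carriers:
                track.append((chrom, lo, hi, len(carriers) / n_samples))
    return track
```

Here `n_samples` is the size of the phenotype group being profiled, so a segment value of 0.5 means that half of the group carries an aberration covering that segment.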

Finally, the availability of probe-level data inside infoQuant solutions supports accurate comparison of a case with a previously analyzed sample. A legacy sample can be quickly looked up in the database and its probe-level data pulled into our analytical software for visual comparison or re-processing, if needed.

cnTrack's architecture ensures appropriate organization of committed data. The solution is designed to take advantage of all types of data associated with a sample during an array-based experiment and to deliver them into the user's data interpretation workflow for more accurate results and faster processing. It also makes sure that the links between related pieces of data are not lost and can be used to extract relational or causal information.

Ease of deployment

Implementing an infrastructural change as significant as a database solution in a busy genetic laboratory can be prohibitively difficult and potentially disruptive. We took this into account when designing cnTrack and built its architecture to be flexible enough to adapt to any laboratory workflow and team size, yet simple enough for out-of-the-box lab-wide deployment. It can be connected to any major type of database server and accessed from any OS platform. The array database can be hosted inside the customer's institution or outside it, for example on a dedicated instance of infoQuant's cloud-based server. The database can be seamlessly integrated with the customer's existing sample tracking solutions, such as Shire, OMNILAB and STARLIMS, to take advantage of available demographic and clinical information.

Contrary to the common opinion that a data management system can take months to roll out, our solution can be set up within a few hours. The intuitive web-integrated interface is designed to require minimal user training, so that individual users can get started with data analysis and interpretation almost right away.

Conclusion

The benefits of an array-specific data management system are multifaceted and can substantially boost operational performance in both volume and quality. The impact on laboratory throughput can be immediate, through better organization of data flow, and longer-term, through lower training costs for new personnel. Integration with high-performance analytical modules makes cnTrack an end-to-end solution for array data processing. The infoQuant suite also supports the evolution of the data interpretation workflow at each laboratory by re-using accumulated legacy data and presenting it in a clinically meaningful manner.