Clinical Data Management – Processes after Data collection

After all the data has been collected and the Last Patient Last Visit (LPLV) has been completed, all discrepancies must be resolved and every query sent to the investigator via Data Clarification Forms (DCFs) must be answered. At this point it should be confirmed that the data is clean, accurate and complete. Once the data is declared clean, the database goes through the following phases to ensure integrity:

Database Lock

Database Freeze

After the database freeze, the data needs to be stored securely and efficiently. This involves the last few steps:

Data Retention

Data Archiving

Data Sharing

We shall discuss each of these in the following sections.

Database Lock and Freeze

After the LPLV, when no more data is expected and all the queries have been resolved so that the data is declared clean, it is essential that no user makes any changes to the database. The rights of all users with access to the database are therefore revoked, so they can no longer log in to the database or change existing data. However, one user, termed the “Privileged User”, still retains access to the database. This process is called a Database Lock.

The data is then extracted from the database into SAS, and the biostatisticians and statistical programmers run programs against this dataset to test its statistical significance and integrity. If they find that the data is not sufficiently clean and produces errors during statistical analysis, the database has to be unlocked. In this case the privileged user unlocks the database and makes the necessary changes. Note that while the database is only locked, the privileged user can still add or delete patients or modify their data.

As per industry norms, however, changes after a database lock should be avoided. Where it is necessary to re-open the database, the changes made after re-opening should be documented and proper sign-off should be obtained from the relevant personnel, as specified in the data management plan.

After the database is re-opened and the relevant changes are made, it should be closed again via the same process, with all relevant sign-offs.

Once the database is closed and a review has been done by the biostatistician, a final step called the Database Freeze is carried out. A database freeze involves revoking the rights of the privileged user so that no more changes can be made to the database. No patients can be added or deleted, and no patient data can be modified. Freezing the database means that the data is ready to be submitted to the regulatory agency.

In extreme cases, however, the database can be unfrozen, but only by the sponsor or designated personnel. For example, suppose a patient develops a serious adverse event (SAE) after the database is frozen and becomes so ill that the only way to save him is to know which medication he was taking in the trial. This would mean breaking the blind and hence compromising the trial. In such cases the regulatory authorities are considerate, and the blind should be broken by unfreezing the database for the benefit of the patient.

Such cases are rare, however. As a general rule, once the database freeze is done there is no turning back, and the data collected is the final data that the regulatory authorities will analyze to decide whether the drug should be given the green signal to be manufactured and marketed.
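
The lock, unlock, freeze and unfreeze rules described in this section can be sketched as a small state machine. This is only an illustrative model, not how any particular clinical data management system works: the class name, the role strings ("privileged", "sponsor") and the audit log format are all hypothetical. It also assumes that unlocking returns the database to the open state and that unfreezing returns it to the locked state, which the text above does not specify.

```python
from enum import Enum


class DbState(Enum):
    OPEN = "open"
    LOCKED = "locked"
    FROZEN = "frozen"


class TrialDatabase:
    """Toy model of a trial database; every name here is hypothetical."""

    def __init__(self):
        self.state = DbState.OPEN
        self.audit_log = []  # list of (action, user, reason) tuples

    def _record(self, action, user, reason):
        # Every state change is documented, as the data management plan requires.
        self.audit_log.append((action, user, reason))

    def can_edit(self, role):
        # Open: any user with access may edit.
        # Locked: only the privileged user may edit.
        # Frozen: nobody may edit.
        if self.state is DbState.OPEN:
            return True
        if self.state is DbState.LOCKED:
            return role == "privileged"
        return False

    def lock(self, user):
        if self.state is not DbState.OPEN:
            raise ValueError("only an open database can be locked")
        self.state = DbState.LOCKED
        self._record("lock", user, "data declared clean")

    def unlock(self, user, role, reason):
        if self.state is not DbState.LOCKED:
            raise ValueError("only a locked database can be unlocked")
        if role != "privileged":
            raise PermissionError("only the privileged user may unlock")
        self.state = DbState.OPEN  # assumption: unlocking re-opens the database
        self._record("unlock", user, reason)

    def freeze(self, user):
        if self.state is not DbState.LOCKED:
            raise ValueError("only a locked database can be frozen")
        self.state = DbState.FROZEN
        self._record("freeze", user, "biostatistician review complete")

    def unfreeze(self, user, role, reason):
        if self.state is not DbState.FROZEN:
            raise ValueError("only a frozen database can be unfrozen")
        if role != "sponsor":
            raise PermissionError("only the sponsor or a designee may unfreeze")
        self.state = DbState.LOCKED  # assumption: unfreezing returns to locked
        self._record("unfreeze", user, reason)
```

In this sketch, calling lock() and then freeze() leaves two entries in audit_log, and can_edit("privileged") returns True after the lock but False after the freeze.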

Data Retention

The data collected at each site must be stored by that site for the minimum retention period of the study so that it can be referred to if required. The database used to store and manage the trial data should be retained for the same period.

The retention of data should follow some basic guidelines. The data should be retained in such a way that:

It cannot be modified by the site.

The data of any patient, for any visit, can be extracted, i.e. it must be indexed properly.

The Principal Investigator is the only one responsible for that data.

Depending on whether the data is electronic or paper based, additional investment may be required. For example, paper based data should be stored in fire-proof cabinets, and for electronic data a copy should be backed up outside the premises of the site to ensure disaster recovery.
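
For electronic data, the first two guidelines (no modification, and retrieval by patient and visit) can be illustrated with a read-only index. This is a minimal sketch under stated assumptions: the function name, the record fields (patient_id, visit, systolic_bp) and the sample values are hypothetical, and a real system would enforce read-only access at the database or file-system level rather than in application code.

```python
from types import MappingProxyType


def build_retained_index(records):
    """Index records by (patient_id, visit) and expose them read-only."""
    index = {}
    for rec in records:
        key = (rec["patient_id"], rec["visit"])
        if key in index:
            raise ValueError(f"duplicate record for {key}")
        # Copy each record and wrap it so it cannot be changed through the index.
        index[key] = MappingProxyType(dict(rec))
    # Wrap the index itself so entries cannot be added, replaced or removed.
    return MappingProxyType(index)


# Hypothetical sample data, for illustration only.
retained = build_retained_index([
    {"patient_id": "P001", "visit": 1, "systolic_bp": 128},
    {"patient_id": "P001", "visit": 2, "systolic_bp": 124},
    {"patient_id": "P002", "visit": 1, "systolic_bp": 141},
])

# Any patient's data for any visit can be looked up directly.
record = retained[("P001", 2)]
```

Attempting to assign to retained or to record raises a TypeError, which mirrors the requirement that retained data cannot be modified.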

Data Archiving

Once the data has been collected, stored and analyzed it has to be archived.

Archiving is a process of storing data in an efficient manner so that it may be retrieved easily.

Again, if paper based methods are used for collecting and managing data, the documents should be well indexed and archived in fire-proof cabinets, or they can be scanned and stored electronically. If the data is electronic, the computerized database used to collect and manage it is archived.

Data Archiving is different from Data Warehousing. The former stores the data efficiently for quick retrieval, whereas the latter stores a large amount of data for a long time; that data does not need to be retrieved quickly but can be mined for information.

Data warehousing is generally used for storing legacy data. This storage makes it possible to mine the data by running computer programs or algorithms against it, so as to get relevant answers from the existing data that may be useful for future trials. It is thus storage of data for analysis purposes. Warehousing is not possible for paper data. For example, with purely paper based methods you could not ask questions such as:

“How many patients with a medical history of hypertension and aged between 25 and 30 years were enrolled in a trial?” Such questions can be asked against an electronic data warehouse.
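
As a rough illustration, the question above might be expressed as a query like the following. This sketch uses an in-memory SQLite database as a stand-in for a data warehouse; the table name, column names and patient rows are all invented for the example, and real warehouse schemas will differ.

```python
import sqlite3

# In-memory database standing in for an electronic data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrolled_patients ("
    "patient_id TEXT PRIMARY KEY, age INTEGER, medical_history TEXT)"
)

# Hypothetical sample rows, for illustration only.
rows = [
    ("P001", 27, "hypertension"),
    ("P002", 45, "hypertension"),
    ("P003", 29, "diabetes"),
    ("P004", 30, "hypertension"),
    ("P005", 24, "hypertension"),
]
conn.executemany("INSERT INTO enrolled_patients VALUES (?, ?, ?)", rows)

# Count enrolled patients with hypertension aged 25 to 30 (inclusive).
(count,) = conn.execute(
    "SELECT COUNT(*) FROM enrolled_patients "
    "WHERE medical_history = ? AND age BETWEEN ? AND ?",
    ("hypertension", 25, 30),
).fetchone()

print(count)  # prints 2 for the sample rows above (P001 and P004)
```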