The HST archive at CADC contains the following data:

All public (non-proprietary) HST data (produced by the CADC and ECF cache system; see below)

The standard HST archive products from the active instruments (ACS, COS, STIS, WFC3) are kept current by the HST Cache system described at the bottom of this page.

Data from the legacy instruments (FOC, FOS, HRS, WFPC, WFPC2) have gone through a final calibration run and are not expected to change anymore. (Spectroscopy will appear very soon.)

What is the HST Cache?

The cache is an envelope around HST archive file production. It is a set
of database tables and software agents that ensure that all publicly
available HST science pipeline products are preprocessed and readily
available from storage at all times. This includes mechanisms to discover
and insert newly observed datasets, and to automatically reprocess datasets
that benefit from updates to reference files, to the available metadata,
or to the processing software in general.
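
As a rough illustration, the agent logic described above might look like the following minimal sketch. Every name in it (the Product record, the archive and pipeline interfaces, the *_updated_since hooks) is a hypothetical assumption made for illustration, not the actual CADC/ECF implementation.

    from dataclasses import dataclass

    @dataclass
    class Product:
        data: bytes            # the calibrated science product itself
        processed_at: float    # wall-clock time of the last processing run
        pipeline_version: str  # software version that produced it

    def sync_cache(archive, pipeline, cache: dict) -> None:
        """One maintenance pass: ingest newly observed public datasets,
        then reprocess any whose calibration inputs have changed."""
        # Discovery agent: insert newly observed public datasets.
        for ds_id in archive.list_public_datasets():
            if ds_id not in cache:
                cache[ds_id] = pipeline.process(ds_id)

        # Reprocessing agent: re-run datasets that would benefit from
        # updated reference files, metadata, or pipeline software.
        for ds_id, product in list(cache.items()):
            stale = (archive.reference_files_updated_since(ds_id, product.processed_at)
                     or archive.metadata_updated_since(ds_id, product.processed_at)
                     or pipeline.version != product.pipeline_version)
            if stale:
                cache[ds_id] = pipeline.process(ds_id)

A real system would, of course, persist this state in the database tables mentioned above rather than in memory.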

Why do we need a cache?

Since 2002, all data from the active instruments have been produced from
scratch, triggered by user requests. The reasoning behind the On-The-Fly
Reprocessing (OTFR) and On-The-Fly Calibration (OTFC) pipelines was
that they would guarantee that archive users always get their data
equipped with the newest set of metadata and calibrated according to the
best methods available. This was a clear advantage over the previous system,
where the raw data were produced centrally at STScI and delivered
to the partner sites, essentially freezing those data in time. Another
advantage of the system was that it conserved storage space, as only the
Hubble Space Telescope telemetry files and a few smaller auxiliary files
needed to be stored, an important consideration when data are stored
on optical disks in jukeboxes.

With the advent of cheap mass storage in the form of hard-disk arrays, this
aspect became less important, and a number of other drawbacks of the
on-the-fly paradigm became apparent over time as well. Live processing of
data requires that support be available at all times to resolve errors
and bugs in the pipeline, an inevitable task when a system becomes as
complex as this one, with such a heterogeneous set of data as input. Another
drawback was processing speed: producing a dataset could take from
several minutes to hours, which might not be an issue for the patient
astronomer, but makes it impossible to expose the data through synchronous
VO protocols. Higher-level efforts such as data mining, metadata harvesting,
and the production of high-level data products are also enormously difficult
in the on-the-fly world.