4. System and Design Issues (Finland)

4.1 IT Architecture

Statistics Finland's common metadata system is implemented according to the principles of service-based architecture.

Services that meet the needs of different user groups and client systems have a key role in the service-based architecture. The picture below shows the service interface to be built on top of the metadata warehouse. Its services produce the required data from the documents in the metadata warehouse and also handle storing data in the warehouse.

The content of the metadata warehouse is maintained, and made available to client systems, by requesting services from the metadata warehouse through the service interface. The service interface is implemented in line with the REST (Representational State Transfer) architecture. The basic structure of the application follows a layered style. The business logic layer consists of the REST service interfaces, their processing logic and the data transfer modules that the interfaces offer to client software. The function of the data transfer modules is to offer data from the XML data warehouse to client software in an easy-to-use entity structure.
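As a rough illustration of what a data transfer module of this kind might do, the following Python sketch turns an XML document into a simple entity object that client software can use without parsing XML itself. The XML element names, the `ConceptEntity` class and the `to_entity` function are hypothetical and chosen only for this example; they are not taken from the actual metadata warehouse or from CoSSI.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict

# Hypothetical XML document as it might be stored in an XML data warehouse.
# Element and attribute names are illustrative only.
SAMPLE_XML = """
<concept id="c1">
  <name lang="fi">Esimerkki</name>
  <name lang="en">Example concept</name>
</concept>
"""


@dataclass
class ConceptEntity:
    """Easy-to-use entity handed to client software (hypothetical shape)."""
    concept_id: str
    names: Dict[str, str]  # language code -> name in that language


def to_entity(xml_text: str) -> ConceptEntity:
    """Parse the XML document and copy the needed fields into an entity."""
    root = ET.fromstring(xml_text.strip())
    names = {
        element.get("lang"): (element.text or "").strip()
        for element in root.findall("name")
    }
    return ConceptEntity(concept_id=root.get("id"), names=names)


entity = to_entity(SAMPLE_XML)
print(entity.concept_id, entity.names["en"])
```

In a layered design like the one described, a REST service interface would call a function of this kind and return the resulting entity to the client in whatever serialisation the service offers.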

4.2 Metadata Management Tools

See Section 2.2.

4.3 Standards and formats

Statistics Finland has developed a Common Structure of Statistical Information (CoSSI) based on XML. It is a modular data model for describing statistical tables, classifications, concepts, variables, general information on statistical documents, quality descriptions, etc. CoSSI was designed in accordance with international standards such as Dublin Core and CALS. If needed, CoSSI can be expanded; new elements, e.g. for data descriptions, have already been integrated into it. In its ICT strategy, Statistics Finland has provided guidelines for the use of the CoSSI model. The data models of the classifications and concepts in use were developed in the 1990s, and the elements they contain are now part of CoSSI.

The basic structure and content of statistical information are defined in the CoSSI data model. It describes the information structure of the statistical data to be produced. The way in which data are produced, that is, the production steering system, is not described in the CoSSI data model. Defining the data and content required by the production steering system was left to a future development phase of the model.

The data model comprises a description of the basic information of data sets for the production and editing of statistical data and the distribution of statistical information. At the moment, the following parts of the model are to be extended and checked because of changed content requirements:

Quality description of statistics

The classification information model

Supplementing the metadata part (docmeta) concerning the data record with data required by archiving

Methodological description of editing

Attaching source system metadata as part of statistical metadata

Metadata of questions and questionnaires.

Preliminary examinations indicate that the CoSSI data model offers an adequate basis for producing content description data of statistical information following the GSIM data model (Generic Statistical Information Model, version 0.4, 5/2012). A preliminary outline of a structure that would cover the needs of Eurostat's different quality reports has been added to the CoSSI model.

4.4 Version control and revisions

The versioning of the classifications can be seen as a 4-level hierarchy:

Level 1 (the highest level) is the classification name. It is a logical element in the hierarchy, and its purpose is to aggregate all the statistical versions of a classification. In the classification database the classification name is expressed as a short technical name.

Level 2 consists of the statistical versions of a classification. There can be one or more statistical versions per classification. For example, the statistical versions for the classification name “Industrial classification” include the Finnish Standard Industrial Classification TOL and the Industrial Classification for business services statistics. In the classification database the statistical versions are distinguished from one another by version number.

Level 3 contains the time versions of a classification. In the classification database the time versions are distinguished from one another by period of validity. When a classification changes, a new time version is created in the classification database. The old versions remain in the database and can be used, e.g. in archiving.

Level 4 includes the language versions of a classification. The language versions (Finnish, Swedish and English) share the basic information with their parent classification or concept.

Concepts are versioned in the same way as classifications. Concepts usually stay the same for a long time, but modifications sometimes take place, for example due to amendments to legislation.
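The four levels described above can be sketched as nested data structures. The following Python example is only an illustration under stated assumptions: the class names, the technical name `industry`, the version number, the validity dates and the titles are all made up for the example and do not reflect the actual classification database.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class TimeVersion:
    """Level 3: one time version, identified by its period of validity."""
    valid_from: date
    valid_to: Optional[date]  # None means the version is still valid
    # Level 4: language versions (fi, sv, en) sharing the same basic information
    titles: Dict[str, str] = field(default_factory=dict)

    def is_valid_on(self, day: date) -> bool:
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


@dataclass
class StatisticalVersion:
    """Level 2: one statistical version, identified by a version number."""
    version_number: int
    time_versions: List[TimeVersion] = field(default_factory=list)


@dataclass
class Classification:
    """Level 1: the classification name, stored as a short technical name."""
    technical_name: str
    statistical_versions: Dict[int, StatisticalVersion] = field(default_factory=dict)

    def title(self, version_number: int, day: date, lang: str) -> Optional[str]:
        """Walk levels 2 to 4 and return the title valid on the given day."""
        version = self.statistical_versions.get(version_number)
        if version is None:
            return None
        for time_version in version.time_versions:
            if time_version.is_valid_on(day):
                return time_version.titles.get(lang)
        return None


# Illustrative data only: names, dates and titles are invented.
industry = Classification(
    technical_name="industry",
    statistical_versions={
        1: StatisticalVersion(
            version_number=1,
            time_versions=[
                TimeVersion(date(2002, 1, 1), date(2007, 12, 31),
                            {"fi": "Vanha versio", "en": "Older edition"}),
                TimeVersion(date(2008, 1, 1), None,
                            {"fi": "Uusi versio", "en": "Newer edition"}),
            ],
        )
    },
)

older = industry.title(1, date(2005, 6, 1), "en")
newer = industry.title(1, date(2010, 1, 1), "fi")
missing = industry.title(2, date(2010, 1, 1), "en")
print(older, newer, missing)
```

Keeping old time versions in the list, rather than overwriting them, mirrors the point that earlier versions remain available, for example for archiving.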

A configuration of the versioning of the classifications (see picture below).


4.5 Outsourcing versus in-house development

The user interfaces and the applications for the databases have mainly been developed and built in-house. The applications developed at Statistics Finland can in principle be shared free of charge with other statistical organizations. Where necessary, details regarding test use, access to more precise descriptions, etc. may be agreed upon separately.