This WP lays the empirical foundations for the development of the CASMACAT workbench.
A series of experiments will establish basic facts about translator behaviour in computer-aided
translation, focusing on the use of visualisation options and input modalities. Another series of
studies will deal with individual differences in translation, in particular translator types and
translation styles.
The initial report deals with translation types and styles, text types, and a reading model
adapted for machine-translated texts. It covers the first period of Tasks 1.3, 1.4, and 1.5. The
deliverable is structured into three sections, which briefly summarize the work, and an appendix,
which contains more detailed information about the produced material and a number of papers.
An experimental setup (see section 2.1) and a questionnaire (see section 1.1) were designed to
obtain consistent data from various translators in different languages under similar conditions.
Translation data was collected in several locations (section 2.2) and assembled into a TPR
database, as described in section 1.2. Preliminary studies were conducted to investigate post-
editing and translation styles (section 1.3). Translation data was also collected in the first
CASMACAT field trial. The assessment is provided in Deliverable D6.1. Section 3 describes the
first Edinburgh Eyetracking experiment, while the Appendix contains further material.

This paper describes a pilot study with a computer-assisted translation workbench aiming at
testing the integration of online learning (OL) and active learning (AL) features. We investigate the effect of these
features on translation productivity, using interactive translation prediction (ITP) as a baseline.
User activity data were collected from five beta testers using key-logging and eye-tracking.
User feedback was also collected at the end of the experiments in the form of retrospective
think-aloud protocols. We found that OL performs better than ITP, especially in terms of trans-
lation speed. In addition, AL provides better translation quality than ITP for the same levels of
user effort. We plan to incorporate these features in the final version of the workbench.

Based on empirical studies of texts, the three Germanic languages German, English and Danish are compared with regard to their similarities and differences. The analysis focuses on aspects that depend on how meaning is structured in the three languages. Some typical difficulties and pitfalls that human translators experience are described. In the last part of the article, errors in machine translations are shown. Interestingly, many of the problems of machine translation systems are also caused by linguistic differences and interferences.

Today's synthetic voices are largely based on diphone synthesis (DiSyn) and unit selection synthesis (UnitSyn). In most
DiSyn systems, prosodic envelopes are generated with formal models while UnitSyn systems refer to extensive, highly
indexed sound databases. Each approach has its drawbacks, such as low naturalness (DiSyn) and dependence on huge
amounts of background data (UnitSyn). We present a hybrid model based on high-level speech data. As preliminary
tests show, prosodic models combining DiSyn style at the phone level with UnitSyn style at the supra-segmental levels
may approach UnitSyn quality on a DiSyn footprint. Our test data are Danish, but our algorithm is language neutral.

While scholars agree that planning and preparation are key to a negotiation’s effectiveness, negotiation research has largely focused on what happens at the negotiation table, and little is known about what occurs in preparation for a negotiation meeting. This paper aims to redress the balance by clarifying which preparation and planning activities are undertaken to conduct a complex business negotiation, compared to the recommendations found in the literature. In contrast to the majority of negotiation research, this study follows a qualitative research design with multiple methods of inquiry and draws upon data grounded in a large global industrial company. The results suggest that a significant number of activities recommended in the literature concerning negotiation preparation and planning do not take place in the real world. In addition, the study demonstrates the inherent weakness of relying on an open-ended survey as the sole data source through which to understand an internal-organizational phenomenon.

This article focuses on the role of user modeling
and semantic-enhanced representations for personalization. The
paper presents a generic Ontology-based User Modeling
framework (OntobUMf), its components and its associated user
modeling processes. This framework models the behavior of the
users and classifies its users according to their behavior. The user
ontology is the backbone of OntobUMf and has been designed
according to the Information Management System Learner
Information Package (IMS LIP). The user ontology includes a
behavior concept that extends the IMS LIP specification and defines
characteristics of the users interacting with the system. Concrete
examples of how OntobUMf is used in the context of a
Knowledge Management System (KMS) are provided. The paper
discusses some of the implications of ontology-based user
modeling for semantic-enhanced Knowledge Management (KM),
and in particular for personal KM. The results of this research
may contribute to the development of other frameworks for
modeling user behavior, other semantic-enhanced user modeling
frameworks or other semantic-enhanced information systems.

Purpose – This paper discusses new approaches for managing personal knowledge in the
Web 2.0 era. We question whether Web 2.0 technologies (social software) are a real panacea
for the challenges associated with the management of knowledge. Can Web 2.0 reconcile the
conflicting interests of managing organisational knowledge with personal objectives? Does
Web 2.0 enable a more effective way of sharing and managing knowledge at the personal
level?
Design/methodology/approach – Theoretically deductive with illustrative examples.
Findings – Web 2.0 plays a multifaceted role for communicating, collaborating, sharing and
managing knowledge. Web 2.0 enables a new model of PKM that includes formal and
informal communication, collaboration and social networking tools. This new PKM model
facilitates interaction, collaboration and knowledge exchanges on the web and in
organisations.
Practical implications – Based on these findings, professionals and scholars will gain a better
understanding of the potential role of Web 2.0 technologies for harnessing and managing
personal knowledge. The paper provides concrete examples of how Web 2.0 tools are
currently used in organisations.
Originality/value – As Web 2.0 has become integrated in our day-to-day activities, there is a
need to further understand the relationship between Web 2.0 and Personal Knowledge
Management (PKM).

WU Kun’s philosophy of information is the product of the Dialectics of Nature tradition, which derives from the Stalin Textbook System, and of the thought liberation movement of the 1980s. It is a distinctive philosophy in the Chinese style. This philosophical system begins by founding a new area of objective unreality, obtained by re-dividing the field of existence, which he calls the world of information in itself. However, the concept of information in his system differs from information in the common sense of the word, so his philosophy of information is not a philosophy about information. A proper theoretical framework of information should cover objective laws, subjective meaning and intersubjective normativity.

This paper suggests that for negotiation studies, the well-researched role of cognitive closure
in decision-making should be supplemented with specific research on what sort of information is seized
on as unambiguous, salient and easily processable by negotiators. A study of email negotiation is
reported that suggests that negotiators seize on concrete examples as building blocks that produce
immediate positive feedback and consequent utilization in establishing common ground.

The purpose of the current investigation is to predict post-editor profiles based on user be-
haviour and demographics using machine learning techniques to gain a better understanding of
post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database
from the CRITT Translation Process Research Database (TPR-DB). The analysis has two main
research goals: We create n-gram models based on user activity and part-of-speech sequences
to automatically cluster post-editors, and we use discriminative classifier models to character-
ize post-editors based on a diverse range of translation process features. The classification and
clustering of participants resulting from our study suggest this type of exploration could be
used as a tool to develop new translation tool features or customization possibilities.
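
As a rough illustration of the n-gram idea described above, the following minimal sketch builds bigram count profiles from sequences of activity labels and compares participants by cosine similarity. It is not the study's actual pipeline: the activity labels, the participant names and the choice of cosine similarity are all assumptions made for illustration, and no TPR-DB data or feature names are used.

```python
from collections import Counter
from math import sqrt


def ngram_profile(sequence, n=2):
    """Count the n-grams in one participant's sequence of activity labels."""
    return Counter(tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical activity sequences; real process data would be far longer.
sessions = {
    "P1": ["read", "type", "read", "type", "read", "type"],
    "P2": ["read", "type", "read", "type", "pause", "type"],
    "P3": ["pause", "delete", "pause", "delete", "pause", "delete"],
}
profiles = {name: ngram_profile(seq) for name, seq in sessions.items()}

sim_p1_p2 = cosine_similarity(profiles["P1"], profiles["P2"])
sim_p1_p3 = cosine_similarity(profiles["P1"], profiles["P3"])
```

Participants whose profiles are close under such a measure could then be grouped by any standard clustering method; which method suits real post-editing data is left open here.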

These lecture notes present the basic principles of phrase structure that apply in English. We start by presenting in some detail the most complex phrase type in English, the noun phrase. Having done that, we demonstrate that all the other main phrase types, the AP, the PP and the VP, are modelled on the same structural principles as noun phrases.

This WP presents the empirical foundations for the development of the CasMaCat workbench.
A series of experiments is being run to establish basic facts about translator behaviour in
computer-aided translation, focusing on the use of visualization options and input modalities
while post-editing machine translation (sections 1 and 2). Another series of studies deals with
cognitive modelling and individual differences in translation production, in particular translator
types and translation/post-editing styles (sections 3 and 4).
This deliverable, D1.2, is a progress report on user interface studies, cognitive and user
modelling. It reports on post-editing and interactive translation experiments, as well as cognitive
modelling covering Tasks 1.1, 1.2, 1.3 and 1.5. It also addresses the issues that were raised in
the last review report for the project period M1 to M12, in particular:
- the basic facts about translator behaviour in CAT (sections 1 and 4), highlighting
  usage of visualization and input modalities (see also D5.3);
- the individual differences in translator types and translation styles (section 3; see also
  terminology, section A.1);
- the results and conclusions of preliminary studies conducted to investigate post-editing
  and translation styles (sections 2 and 5).
From the experiments and analyses so far, it is clear that the data collected in the CRITT
TPR-DB (Translation Process Research database) is an essential resource for achieving the
CasMaCat project goals. It allows for large-scale, in-depth studies of human translation processes
and thus serves as a basis of information for the empirically grounded future development of the
CasMaCat workbench. It attracts an international research community to investigate human
translation processes under various conditions and to arrive at a more advanced level of understanding.
Additional language pairs and more data improve the chances of better underpinning the
conclusions needed, as will be shown in this report and as concluded in section 5.

Workpackage 7 comprises the dissemination activities of the CASMACAT project. In this report,
we summarize the promotion of project goals, progress and outcomes to the larger academic
research community, the commercial sector targeted by the work, and beyond.

In this paper, we present the newly established Danish speech corpus PiTu. The corpus consists of recordings of 28 native
Danish talkers (14 female and 14 male) each reproducing (i) a series of nonsense syllables, and (ii) a set of authentic
natural language sentences. The speech corpus is tailored for investigating the relationship between early stages of the
speech perceptual process and later stages. We present our considerations involved in preparing the experimental set-up,
producing the anechoic recordings, compiling the data, and exploring the materials in linguistic research. We report on
a small pilot experiment demonstrating how PiTu and similar speech corpora can be used in studies of prosody as a
function of semantic content. The experiment addresses the issue of whether the governing principles of Danish prosody
assignment are mainly talker-specific or mainly content-typical (under the specific experimental conditions). The corpus is
available at http://amtoolbox.sourceforge.net/pitu/.