Nowadays, the Web acts as a principal driver of innovation, in which various data sources together give an integrated view of distributed information. The emergence of the Web revolutionized access to data, but it is doubtful that this revolution would have happened if relevant information could not be found and integrated. The Semantic Web is an extension of the current Web and, together with Linked Data, is gaining traction as a prominent solution for knowledge representation and integration on the Web. Nevertheless, its real power will be realized only once a significant number of software agents that rely on information derived from diverse sources become available. Such agents still have limited ability to interact with heterogeneous data. This creates a circular dependency: intelligent software agents do not have enough data to work with, and human agents do not want to put in the effort to provide Linked Data until there are software agents that use it.

This PhD dissertation proposes a set of complementary techniques, each addressing one part of the acquisition of semantically enhanced, interrelated, and integrated information from (semi-)structured heterogeneous data. The overarching goal is to facilitate the generation of high-quality Linked Data independently of the original data available. (i) We provide a solution, in the form of a mapping language, RML, that allows agents to declaratively define mapping rules that specify how Linked Data is generated; (ii) we investigate the factors that determine alternative approaches for executing rule-based Linked Data generation; (iii) we introduce a methodology that applies quality assessment to the mapping rules that generate Linked Data and allows them to be refined, instead of assessing the quality of the generated Linked Data itself; and (iv) we present complete workflow(s) for Linked Data generation.