The design of today's electronic embedded systems is an increasingly complicated task. This is especially problematic for Small and Medium Enterprises (SMEs) which have limited resources. In this work, we identify a set of common design practices used in industry, with a special focus on problems faced by smaller companies, and formulate them as design scenarios. We show how SMEs can benefit from a system-level design approach by customizing a formal heterogeneous system modeling framework for each scenario. The applicability of this approach is demonstrated by two industrial use cases, an impulse-radio radar and a UART-based protocol.

Electronic System Level (ESL) design of embedded systems proposes raising the abstraction level of the design entry to cope with the increasing complexity of such systems. To exploit the benefits of ESL, design languages should allow specification of models which are a) heterogeneous, to describe different aspects of systems; b) formally defined, for application of analysis and synthesis methods; c) executable, to enable early detection of specification errors; and d) parallel, to exploit multi- and many-core platforms for simulation and implementation. We present a modeling library on top of SystemC, targeting heterogeneous embedded system design, based on four models of computation. The library has a formal basis where all elements are well defined and lead to the construction of analyzable models. The semantics of communication and computation are implemented by the library, which allows the designer to focus on specifying the pure functional aspects. A key advantage is that the formalism is used to export the structure and behavior of the models via introspection as an abstract representation for further analysis and synthesis.

3.

Attarzadeh Niaki, Seyed Hosein

et al.

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Mikulcak, Marcus

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Robino, Francesco

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Sander, Ingo

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

The design of real-time multiprocessor systems is a very costly and time-consuming process due to the need for extensive verification efforts. Generic correct-by-construction system-level design flows, targeting predictable platforms, would help to tackle this problem. Unfortunately, because system-level design problems are formulated monolithically, existing methods are either not powerful enough to perform efficient design space exploration, over-customized to a specific class of platforms, or cannot be extended with new heuristics and solving methods, which makes their reuse difficult. We present a formal framework to explicitly capture and characterize predictable platform templates that can be used to formulate a generic design flow for real-time streaming applications in a composable manner. A proof-of-concept implementation of such a flow is performed and used to map a JPEG encoder application onto an FPGA-based time-predictable platform.

Virtual Prototypes (VPs) provide an early development platform to embedded software designers when the hardware is not ready yet and allow them to explore the design space of a system, both from the software and architecture perspective. However, automatic generation of VPs is not straightforward because several aspects, such as the validity of the generated platforms and the timing of the components, need to be considered. To address this problem, based on a framework which characterizes predictable platform templates, we propose a method for automated generation of VPs which is integrated into a combined design flow consisting of analytic and simulation-based design-space exploration. Using our approach, valid TLM-2.0-based simulated VP instances with timing annotation can be generated automatically and used for further development of the system in the design flow. We have demonstrated the potential of our method by designing a JPEG encoder system.

Virtual prototypes (VPs) provide an early development platform to embedded software designers when the hardware is not ready yet and allow them to explore the design space of a system, both from the software and architecture perspective. However, automatic generation of VPs is not straightforward because several aspects, such as the validity of the generated platforms and the timing of the components, need to be considered. To address this problem, based on a framework which characterizes predictable platform templates, we propose a method for automated generation of VPs which is integrated into a combined design flow consisting of analytic and simulation-based design-space exploration. Using our approach, valid TLM 2.0-based simulated VP instances with timing annotation can be generated automatically and used for further development of the system in the design flow. We have demonstrated the potential of our method by designing a JPEG encoder system.

6.

Attarzadeh Niaki, Seyed Hosein

et al.

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Sander, Ingo

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Simulation of complex embedded and cyber-physical systems requires exploitation of the computation power of available parallel architectures. Current simulation environments either do not address this parallelism or use separate models for parallel simulation and for analysis and synthesis, which might lead to model mismatches. We extend a formal modeling framework targeting heterogeneous systems with elements that enable parallel simulations. An automated flow is then proposed that starting from a serial executable specification generates an efficient MPI-based parallel simulation model by using a constraint-based method. The proposed flow generates parallel models with acceptable speedups for a representative example.

New design methodologies and modeling frameworks are required to provide a solution for integrating legacy code and IP models in order to be accepted in the industry. To tackle this problem, we introduce the concept of wrappers in the context of a formal heterogeneous embedded system modeling framework. The formalism is based on the language-independent concept of models of computation. Wrappers enable the framework to co-simulate/co-execute with external models which might be legacy code, an IP block, or an implementation of a partially refined system. They are defined formally in order to keep the analyzability of the original framework and also enable automations such as generation of model wrappers and co-simulation interfaces. As a proof of concept, three wrappers for models in different abstraction levels are introduced and implemented for two case studies.

There is a need for integration of external models in high-level system design flows. We introduce a set of partial refinement operations to implement models of heterogeneous embedded systems. The models are in the form of process networks where each process belongs to a single model of computation. A semi-formal design flow has been introduced based on these operations to incrementally refine system specifications to their implementation. Wrapper processes, which allow co-simulation of a system model in the framework with external models and implementations, are used to keep the intermediate system models verifiable after each refinement step. Additionally, this design flow has the advantage of integrating legacy code and IP cores. Using a simple system as the case study, we show how this design methodology can be applied.

Design of real-time MPSoC systems including multiple applications is challenging because temporal requirements of each application must be respected throughout the entire design flow. Currently, the design of different applications is often interdependent, making convergence to a solution for each application difficult. This chapter proposes a compositional method to design applications independently, and then to execute them without interference. We define a formal modeling framework as a suitable entry point for application design. The models are executable, which enables early detection of specification errors, and include the formal properties of the applications based on well-defined models of computation. We combine this with a predictable MPSoC platform template that has a supporting design flow but lacks a simulation front-end. The structure and behavior of the application models are exported to an intermediate format via introspection which is iteratively transformed for the backend flow. We identify the problems arising in this transformation and provide appropriate solutions. The design flow is demonstrated by a system consisting of two streaming applications where less than half of the design time is dedicated to operating on the integrated system model.

Due to the variety of application models and also the target platforms used in embedded electronic system design, it is challenging to formulate a generic and extensible analytic design-space exploration (DSE) framework. Current approaches support a restricted class of application and platform models and are difficult to extend. This paper proposes a framework for automatic construction of system-level DSE problem models based on a coherent, constraint-based representation of system functionality, flexible target platforms, and binding policies. Heterogeneous semantics is captured using constraints on logical clocks. The applicability of this method is demonstrated by constructing DSE problem models from different combinations of application and platform models. Time-triggered and untimed models of the system functionality and heterogeneous target platforms are used for this purpose. Another potential advantage of this approach is that constructed models can be solved using a variety of standard and ad-hoc solvers and search heuristics.

The Functional Mock-up Interface (FMI) standard defines a method for tool- and platform-independent model exchange and co-simulation of dynamic system models. In FMI, the master algorithm, which executes the imported components, is a timed differential equation solver. This is a limitation for heterogeneous embedded and cyber-physical systems, where models with different time abstractions co-exist and interact. This work integrates FMI into a heterogeneous system modeling and simulation framework as process constructors and co-simulation wrappers. Consequently, each external model communicates with the framework without unnecessary semantic adaptation while the framework provides necessary mechanisms for handling heterogeneity. The presented methods are implemented in the ForSyDe-SystemC modeling framework and tested using a case study.

Design of real-time MPSoC systems including multiple applications is challenging because temporal requirements of each application must be respected throughout the entire design flow. Currently, the design of different applications is often interdependent, making convergence to a solution for each application difficult. This paper proposes a compositional method to design applications independently, and then to execute them without interference. We define a formal modeling framework as a suitable entry point for application design. The models are executable, which enables early detection of specification errors, and include the formal properties of the applications based on well-defined models of computation. We combine this with a predictable MPSoC platform template that has a supporting design flow but lacks a simulation front-end. The structure and behavior of the application models are exported to an intermediate format via introspection which is iteratively adapted for the backend flow. We identify the problems arising in this adaptation and provide appropriate solutions. The design flow is demonstrated by a system consisting of two streaming applications where less than half of the design time is dedicated to operating on the integrated system model.

Abstract models are important tools to manage the increasing complexity of system design. The choice of a modeling language for constructing models governs what types of systems can be modeled and which subsequent design activities can be performed. This is especially true for the area of embedded electronic and cyber-physical system design, which poses several challenging requirements on modeling and design methodologies. This article argues that the ForSyDe methodology with the necessary extensions can fulfill these requirements and thus qualifies for the design of tomorrow’s systems. Based on the theory of models of computation and the concept of process constructors, heterogeneous models are captured in ForSyDe with precise semantics. A refined layer of the formalism is introduced to make its denotational-style semantics easy to implement on top of the commonly used imperative languages and an open-source realization on top of the IEEE standard language SystemC is reported. The introspection mechanism is introduced to automatically export an intermediate representation of the constructed models for further analysis/synthesis by external tools. Flexibility and extensibility of ForSyDe is emphasized by integrating a new timed model of computation without central synchronization, and providing mechanisms for integrating foreign models, parallel and distributed simulation, modeling adaptive, data-parallel, and non-deterministic systems. A set of ForSyDe features are demonstrated in practice and compared to similar approaches using two relevant case studies.
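To make the notion of process constructors more concrete, the following is a minimal Python sketch for the synchronous model of computation. It is an illustration only, not the ForSyDe-SystemC API: the names `comb_sy` and `delay_sy` are hypothetical, and signals are assumed to be finite Python lists of events. A process constructor takes a plain function (or an initial value) and returns a process, so the designer supplies only the functional part while the constructor fixes the communication semantics.

```python
def comb_sy(f):
    """Hypothetical constructor: lift function f to a combinational process.

    The resulting process applies f to the events that share the same
    position (the same synchronous instant) in each input signal.
    """
    def process(*signals):
        # zip stops at the shortest signal, so outputs exist only for
        # instants at which every input carries an event.
        return [f(*events) for events in zip(*signals)]
    return process


def delay_sy(initial):
    """Hypothetical constructor: delay a signal by one event.

    The output starts with `initial`, followed by the input events.
    """
    def process(signal):
        return [initial] + list(signal)
    return process


# Compose two processes into a small network: y[n] = x[n] - x[n-1].
def difference(signal):
    previous = delay_sy(0)(signal)
    return comb_sy(lambda now, before: now - before)(signal, previous)


if __name__ == "__main__":
    print(difference([1, 4, 9, 16]))
```

Because every process is created through a constructor with known semantics, a tool could in principle walk such a network and export its structure, which is the idea behind the introspection mechanism mentioned in the abstract.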

14.

Attarzadeh-Niaki, Seyed-Hosein

et al.

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Sander, Ingo

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Abstract models are important tools to manage the increasing complexity of system design. The choice of a modeling language for constructing models governs what types of systems can be modeled, and which subsequent design activities can be performed. This is especially true for the area of embedded electronic and cyber-physical system design, which poses several challenging requirements on modeling and design methodologies. This article argues that the Formal System Design (ForSyDe) methodology with the necessary presented extensions fulfills these requirements, and thus qualifies for the design of tomorrow's systems. Based on the theory of models of computation and the concept of process constructors, heterogeneous models are captured in ForSyDe with formal semantics. A refined layer of the formalism is introduced to make its denotational-style semantics easy to implement on top of commonly used imperative languages, and an open-source realization on top of the IEEE standard language SystemC is presented. The introspection mechanism is introduced to automatically export an intermediate representation of the constructed models for further analysis/synthesis by external tools. Flexibility and extensibility of ForSyDe are emphasized by integrating a new timed model of computation without central synchronization, and by providing mechanisms for integrating foreign models, parallel and distributed simulation, and modeling adaptive, data-parallel, and non-deterministic systems. A set of ForSyDe features is demonstrated in practice, and compared with similar approaches using a running example and two relevant case studies.

Due to the variety of application semantics and also the target platforms used in embedded electronic system design, it is challenging to propose a generic and extensible analytic design-space exploration (DSE) framework. Current approaches support a restricted class of application and platform models and are difficult to extend. This paper proposes a framework to capture the system functionality, a flexible target platform, and a binding policy explicitly using coherent constraint-based representations, together with a method for automatic construction of DSE problem models from them. Heterogeneous semantics is captured using constraints on logical clocks. The applicability of this method is demonstrated by constructing DSE problem models from various combinations of application and platform models. Time-triggered and untimed models of the system functionality and heterogeneous target platforms are used for this purpose. The constructed models can be solved using different solvers and heuristics.

16.

Beserra, G. S.

et al.

University of Brasilia.

Attarzadeh Niaki, Seyed Hosein

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Sander, Ingo

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

In order to handle the increasing complexity of embedded systems, design methodologies must take into account important aspects, such as abstraction, IP reuse and heterogeneity. System design often starts at a high abstraction level, by developing a virtual platform (VP), which is typically composed of TLM models. TLM has become very popular in the modeling of bus-based systems and currently there is an increasing availability of libraries that provide TLM IPs. Heterogeneity can be naturally captured in a framework supporting different Models of Computation (MoCs). We introduce a novel approach for integrating TLM IPs/VPs into a MoC-based modeling framework, allowing them to be co-simulated as part of heterogeneous systems. This approach makes it possible to raise the abstraction level, enabling a more careful design space exploration before selecting a proper VP. We exemplify the potential of our approach with a case study in which a VP with a processor generated by ArchC communicates with a continuous-time model.

17.

de Medeiros, Jose. E. G.

et al.

Univ Brasilia, Dept Elect Engn, Brasilia, DF, Brazil.

Ungureanu, George

KTH, School of Electrical Engineering and Computer Science (EECS), Electronics.

Sander, Ingo

KTH, School of Electrical Engineering and Computer Science (EECS), Electronics.

Advancements in analog integrated design have led to new possibilities for complex systems combining both continuous- and discrete-time modules in a signal processing chain. However, this also increases the complexity any design flow needs to address in order to describe a synergy between the two domains, as the interactions between them should be better understood. We believe that a common language for describing continuous- and discrete-time computations is beneficial for such a goal, and a step towards it is to gain insight into and describe more fundamental building blocks. In this work we present an algebra based on the General Purpose Analog Computer, a theoretical model of computation recently updated as a continuous-time equivalent of the Turing Machine.

With the growing complexity of Real-Time Embedded Systems (RTES), there is a huge interest in using modeling languages such as the Unified Modeling Language (UML), and other Model-Driven Engineering (MDE) techniques targeting RTES design. These approaches provide language abstractions for system design, allowing designers to focus on the relevant properties. Unfortunately, such approaches still suffer from several shortcomings, including the lack of well-defined semantics. Therefore, it remains difficult to connect the MDE specification tools and the design tools that are based on formal grounds and well-defined semantics to perform analysis, validation or system synthesis for RTES. This paper presents a top-down RTES design flow aiming to reduce the gap between MDE and formal design approaches. We present the connection between a framework dedicated to the enrichment of modeling languages such as UML with formal semantics, a framework based on formal models of computation supporting validation by simulation, and a system synthesis tool targeting a flexible platform with well-defined execution services. Our purpose is to cover several system design phases, from specification and simulation down to implementation on a platform. As a case study, a JPEG Encoder application was realized following the different design steps of the tool-chain.

With the ever increasing industrial demand for bigger, faster and more efficient systems, a growing number of cores is integrated on a single chip. Additionally, their performance is further maximized by simultaneously executing as many processes as possible without regarding their criticality. Even safety critical domains like railway and avionics apply these paradigms under strict certification regulations. As the number of cores is continuously expanding, the importance of cost-effectiveness grows. One way to increase the cost-efficiency of such System on Chip (SoC) is to enhance the way the SoC handles its power resources. By increasing the power efficiency, the reliability of the SoC is raised because the lifetime of the battery lengthens. Secondly, by having less energy consumed, the emitted heat is reduced in the SoC which translates into fewer cooling devices. Though energy efficiency has been thoroughly researched, there is no application of those power saving methods in safety critical domains yet. The EU project SAFEPOWER1.

The increasing processing power of today's HW/SW platforms leads to the integration of more and more functions in a single device. Additional design challenges arise when these functions share computing resources and belong to different criticality levels. The paper presents the CONTREX European project and its preliminary results. CONTREX complements current activities in the area of predictable computing platforms and segregation mechanisms with techniques to consider the extra-functional properties, i.e., timing constraints, power, and temperature. CONTREX enables energy efficient and cost aware design through analysis and optimization of these properties with regard to application demands at different criticality levels.

The increasing processing power of today's HW/SW platforms leads to the integration of more and more functions in a single device. Additional design challenges arise when these functions share computing resources and belong to different criticality levels. CONTREX complements current activities in the area of predictable computing platforms and segregation mechanisms with techniques to consider the extra-functional properties, i.e., timing constraints, power, and temperature. CONTREX enables energy-efficient and cost-aware design through analysis and optimization of these properties with regard to application demands at different criticality levels. This article presents an overview of the CONTREX European project, its main innovative technology (extension of a model-based design approach, functional and extra-functional analysis with executable models, and run-time management) and the final results of three industrial use cases from different domains (avionics, automotive and telecommunication).

Recent work has proposed two-phase joint analytical and simulation-based design space exploration (JAS-DSE) approaches. In such approaches, a first analytical phase relies on static performance estimation and on either exhaustive or heuristic search to perform a very fast filtering of the design space. Then, a second phase obtains the Pareto solutions after an exhaustive simulation of the solutions found to be compliant by the analytical phase. However, the capability of such approaches to find solutions close to the actual Pareto set at a reasonable time cost is compromised by current system complexities. This limitation is due to the fact that such approaches do not support heuristic exploration in the simulation-based phase. Supporting it is not straightforward, because in the second phase the heuristic is constrained to consider only the custom set of solutions found in the first phase. This set is in general unconnected and irregularly distributed, which prevents the application of existing heuristics. As a solution, this paper provides a novel search heuristic called ARS (Adaptive Random Sampling). The ARS strategy enables heuristic search in both phases of the JAS-DSE flow by making heuristics applicable in the second phase, regardless of the type of performance estimation done at each phase. Moreover, it enables the definition of N-phase DSE flows. In an experiment focused on predictable multi-core systems, the paper shows how this enhanced JAS-DSE is capable of finding more efficient solutions and of tuning the trade-off between exploration time and accuracy in finding actual Pareto solutions.

Mixed-criticality system (MCS) design is an emerging discipline, which has been identified as a core foundational concept in fields such as cyber-physical systems. The hard real-time design community has pioneered the contributions to MCS design, extending scheduling theory to consider mixed criticalities and the impact of on-chip and off-chip communication infrastructures. However, the development of MCS design methodologies capable of providing safe and efficient solutions for complex applications and platforms in an acceptable design time demands a more interdisciplinary approach. This paper is a first step towards such an approach in the development of MCS design methodologies. The paper first identifies the main design disciplines to be involved in MCS design, both at SoC and system-of-systems (SoS) scales. Then, the paper proposes a core ontology for modelling a mixed-criticality system at both SoC scale (MCSoC) and SoS scale (MCSoS). Finally, the paper introduces a set of aspects required for MCS design which, in light of the reviewed state of the art, have been identified as open and challenging.

In the context of the design of time-critical systems, analytical models with worst-case workloads are used to identify safe solutions that guarantee hard timing constraints. However, the focus on the worst case often leads to unnecessarily pessimistic and inefficient solutions, in particular for mixed-critical systems. To overcome this situation, the paper proposes a novel design flow integrating analytical and simulation-based Design Space Exploration (DSE). This combined approach is capable of finding more efficient design solutions without sacrificing timing guarantees. To this end, a first analytical DSE phase obtains a set of solutions compliant with the critical time constraints. The search for the Pareto-optimal solutions is done among this set, but it is delegated to a second simulation-based search. The simulation-based search enables more accurate estimations, and the consideration of a specific (or an average-case) scenario. The chapter shows that this can lead to different Pareto sets which reflect improved design decisions with respect to a pure analytical DSE approach, and which are found faster than through a pure simulation-based DSE approach. This is illustrated through an accompanying example and a proof-of-concept implementation of the proposed DSE flow.
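The two-phase idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Design` record, its fields, and all numbers are hypothetical, and the simulated metrics are given as precomputed values where a real flow would run a simulator in the second phase.

```python
from typing import NamedTuple


class Design(NamedTuple):
    name: str
    wcet_bound: float   # analytical worst-case latency in ms (hypothetical)
    sim_latency: float  # simulated average-case latency in ms (hypothetical)
    power: float        # simulated power in mW (hypothetical)


def analytical_phase(designs, deadline):
    """Phase 1: keep only designs whose worst-case bound meets the deadline."""
    return [d for d in designs if d.wcet_bound <= deadline]


def dominates(a, b):
    """True if a is no worse than b in both objectives and better in one."""
    no_worse = a.sim_latency <= b.sim_latency and a.power <= b.power
    better = a.sim_latency < b.sim_latency or a.power < b.power
    return no_worse and better


def simulation_phase(safe):
    """Phase 2: Pareto front over simulated metrics, within the safe set."""
    return [d for d in safe if not any(dominates(o, d) for o in safe)]


def two_phase_dse(designs, deadline):
    return simulation_phase(analytical_phase(designs, deadline))


DESIGNS = [
    Design("A", 9.0, 4.0, 120.0),
    Design("B", 8.0, 5.0, 90.0),
    Design("C", 12.0, 3.0, 60.0),   # best on average, but violates the bound
    Design("D", 9.5, 6.0, 130.0),   # safe, but dominated by A
]

if __name__ == "__main__":
    print([d.name for d in two_phase_dse(DESIGNS, deadline=10.0)])
```

Design C illustrates why the analytical phase comes first: it looks attractive in simulation but cannot guarantee the hard deadline, so it never reaches the Pareto search.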

In the context of the design of time-critical systems, analytical models with worst-case workloads are used to identify safe solutions that guarantee hard timing constraints. However, the focus on the worst case often leads to unnecessarily pessimistic and inefficient solutions, in particular for mixed-critical systems. To overcome this situation, the paper proposes a novel design flow integrating analytical and simulation-based design space exploration (DSE). This combined approach is capable of finding more efficient design solutions without sacrificing timing guarantees. To this end, a first analytical DSE phase obtains a set of solutions compliant with the critical time constraints. The search for the optimum solution is done among this set, but it is delegated to a second simulation-based search, for fine tuning and average-case optimisation. The potential of our approach is illustrated by a proof-of-concept implementation of the proposed DSE flow and an accompanying DSE example.

Today's heterogeneous embedded systems combine components from different domains, such as software, analogue hardware and digital hardware. The design and implementation of these systems is still a complex and error-prone task due to the different Models of Computation (MoCs), design languages and tools associated with each of the domains. Though making such systems adaptive is technologically feasible, most of the current design methodologies do not explicitly support adaptive architectures. This paper presents the ANDRES project. The main objective of ANDRES is the development of a seamless design flow for adaptive heterogeneous embedded systems (AHES) based on the modelling language SystemC. Using domain-specific modelling extensions and libraries, ANDRES will provide means to efficiently use and exploit adaptivity in embedded system design. The design flow is completed by a methodology and tools for automatic hardware and software synthesis for adaptive architectures.

Today multiple frameworks exist for elevating the task of writing programs for GPGPUs, which are massively data-parallel execution platforms. These are needed as writing correct and high-performing applications for GPGPUs is notoriously difficult due to the intricacies of the underlying architecture. However, the existing frameworks lack a formal foundation, which makes them difficult to use together with formal verification, testing, and design space exploration. We present in this chapter a novel software synthesis tool, called f2cc, which is capable of generating efficient GPGPU code from abstract formal models based on the synchronous model of computation. These models can be built using high-level modeling methodologies that hide low-level architecture details from the developer. The correctness of the tool has been experimentally validated on models derived from two applications. The experiments also demonstrate that the synthesized GPGPU code yielded a 28× speedup when executed on a graphics card with 96 cores and compared against a sequential version that uses only the CPU.

Today multiple frameworks exist for elevating the task of writing programs for GPGPUs, which are massively data-parallel execution platforms. These are needed as writing correct and high-performing applications for GPGPUs is notoriously difficult due to the intricacies of the underlying architecture. However, the existing frameworks lack a formal foundation, which makes them difficult to use together with formal verification, testing, and design space exploration. We present in this paper a novel software synthesis tool, called f2cc, which is capable of generating efficient GPGPU code from abstract formal models based on the synchronous model of computation. These models can be built using high-level modeling methodologies that hide low-level architecture details from the developer. The correctness of the tool has been experimentally validated on models derived from two applications. The experiments also demonstrate that the synthesized GPGPU code yielded a 28× speedup when executed on a graphics card with 96 cores and compared against a sequential version that uses only the CPU.

In this paper, we present a system-level design methodology which allows designers to model and analyze their systems from the early stages of the design process until final implementation. The design methodology targets heterogeneous embedded systems and is based on a formal modeling framework called ForSyDe. ForSyDe is available as open source, which gives small and medium enterprises (SMEs) easy access to advanced modeling capabilities and tools. We give an introduction to the design methodology through the system-level modeling of a simple industrial use case, and we outline the basics of the underlying ForSyDe model.

34.

Jantsch, A.

et al.

KTH, School of Information and Communication Technology (ICT), Electronic Systems.

Models of computation (MoC) are reviewed and organised with respect to the time abstraction they use. Continuous time, discrete time, synchronous and untimed MoCs are distinguished. System level models serve a variety of objectives with partially contradicting requirements. Consequently, it is argued that different MoCs are necessary for the various tasks and phases in the design of an embedded system. Moreover, different MoCs have to be integrated to provide a coherent system modelling and analysis environment. The relation between some popular languages and the reviewed MoCs is discussed, with the finding that a given MoC is offered by many languages and a single language can support multiple MoCs. It is contended that the abstraction levels and primitive operators provided in a language are important for the quality of tools and overall design productivity. However, it is observed that there are various flexible ways to provide them, e.g. by way of heterogeneous frameworks, coordination languages and embedding of different MoCs in the same language.

Embedded system designers often face a large number of design alternatives when designing complex systems. A designer must select an alternative which satisfies application constraints (e.g. timing requirements) while optimizing system level objectives such as overall energy consumption. The design space is often very large, giving rise to the need for systematic Design Space Exploration (DSE) methods. In this paper we address the DSE problem for real-time applications that belong to two different domains: (i) streaming applications modeled using synchronous dataflow graphs; (ii) feedback control tasks modeled using the periodic task model. We consider a heterogeneous multiprocessor platform in which processors communicate through a predictable bus architecture. We present our DSE tool, in which the DSE problem is modeled as a constraint satisfaction problem and solved using a constraint programming solver. This approach provides a modular framework in which different constraints such as deadline, throughput and energy consumption can easily be plugged in depending on the system being designed.
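The idea of treating design space exploration as constraint satisfaction can be illustrated with a minimal sketch. It is not the tool described in the abstract and it does not use a constraint programming solver; it simply enumerates every task-to-processor mapping, rejects those that violate a timing constraint, and keeps the lowest-energy survivor. The task names, workloads, processor speeds, power figures and deadline are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical task set: name -> work units (illustrative values only).
TASKS = {"filter": 4, "fft": 8, "control": 2}

# Hypothetical heterogeneous processors:
# name -> (speed in work units per ms, power in mW).
PROCESSORS = {"big": (4.0, 400.0), "little": (1.0, 50.0)}

# Assumed constraint: total execution time on each processor must fit here.
DEADLINE_MS = 6.0


def evaluate(mapping):
    """Return (feasible, energy_uj) for a task -> processor mapping."""
    load_ms = {proc: 0.0 for proc in PROCESSORS}
    energy_uj = 0.0
    for task, proc in mapping.items():
        speed, power = PROCESSORS[proc]
        exec_ms = TASKS[task] / speed
        load_ms[proc] += exec_ms
        energy_uj += exec_ms * power  # mW * ms = microjoules
    feasible = all(t <= DEADLINE_MS for t in load_ms.values())
    return feasible, energy_uj


def explore():
    """Exhaustively search all mappings; return the feasible one with least energy."""
    best = None
    names = list(TASKS)
    for choice in product(PROCESSORS, repeat=len(names)):
        mapping = dict(zip(names, choice))
        feasible, energy = evaluate(mapping)
        if feasible and (best is None or energy < best[1]):
            best = (mapping, energy)
    return best


if __name__ == "__main__":
    print(explore())
```

Further constraints, such as a throughput bound, would be added as extra checks inside `evaluate`, which mirrors the modularity the abstract attributes to the constraint-based formulation. Exhaustive enumeration only works for toy sizes; a real solver prunes the search instead.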

With the ever increasing industrial demand for bigger, faster and more efficient systems, a growing number of cores is integrated on a single chip. Additionally, their performance is further maximized by simultaneously executing as many processes as possible, regardless of their criticality. Even safety critical domains like railway and avionics apply these paradigms under strict certification regulations. As the number of cores is continuously expanding, the importance of cost-effectiveness grows. One way to increase the cost-efficiency of such a System on Chip (SoC) is to enhance the way the SoC handles its power resources. Firstly, by increasing the power efficiency, the reliability of the SoC is raised, because the lifetime of the battery lengthens. Secondly, by having less energy consumed, the emitted heat is reduced in the SoC, which translates into fewer cooling devices. Though energy efficiency has been thoroughly researched, there is no application of those power saving methods in safety critical domains yet. The EU project SAFEPOWER targets this research gap and aims to introduce certifiable methods to improve the power efficiency of mixed-criticality real-time systems (MCRTES). This paper introduces the requirements that a power efficient SoC has to meet and the challenges such an SoC has to overcome.

SYLVA is a system level synthesis framework that transforms DSP sub-systems modeled as synchronous data flow into hardware implementations in ASICs, FPGAs or CGRAs. SYLVA synthesizes in terms of pre-characterized function implementations (FIMPs). It explores the design space in three dimensions: number of FIMPs, type of FIMPs, and pipeline parallelism between the producing and consuming FIMPs. We introduce a timing and interface model of FIMPs to enable reuse and automatic generation of the Global Interconnect and Control (GLIC) that glues the FIMPs together into a working system. SYLVA has been evaluated by applying it to five realistic DSP applications. The results are analyzed for design space exploration, for efficacy in generating GLIC by comparison with manually generated GLIC, and for accuracy of design space exploration by comparing the area and energy costs estimated from pre-characterized FIMPs during exploration with the final results.

The feasibility of a message in a network concerns whether its timing property can be satisfied without preventing any message already in the network from meeting its own timing property. We present a novel feasibility analysis for real-time (RT) and non-real-time (NT) messages in wormhole-routed networks on chip. For RT messages, we formulate a contention tree that captures contentions in the network. For coexisting RT and NT messages, we propose a simple bandwidth partitioning method that allows us to analyze their feasibility independently.

We present a novel approach to refine a system model specified with perfectly synchronous communication onto a network-on-chip (NoC) best-effort communication service. It is a top-down procedure with three steps, namely, channel refinement, process refinement, and communication mapping. In channel refinement, synchronous channels are replaced with stochastic channels abstracting the best-effort service. In process refinement, processes are refined in terms of interfaces and synchronization properties. Particularly, we use synchronizers to maintain local synchronization of processes and thus achieve synchronization consistency, which is a key requirement while mapping a synchronous model onto an asynchronous architecture. Within communication mapping, the refined processes and channels are mapped to an NoC architecture. Adopting the Nostrum NoC platform as target architecture, we use a digital equalizer as a tutorial example to illustrate the feasibility of our concepts.
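The role of a synchronizer in this kind of refinement can be sketched in a few lines. The sketch below is illustrative only and does not reproduce the synchronizers or the channel model of the paper: it assumes a channel that delivers tokens in order with a random extra delay (uniform between 0 and a hypothetical `max_delay` cycles), marks cycles without delivery as absent, and wraps a two-input adder so that it fires only when both inputs hold a token. All function names are hypothetical.

```python
import random
from collections import deque

ABSENT = None  # marks a cycle in which a channel delivers nothing


def stochastic_channel(tokens, max_delay, rng):
    """Deliver tokens in order, each preceded by 0..max_delay absent cycles."""
    stream = []
    for tok in tokens:
        stream.extend([ABSENT] * rng.randint(0, max_delay))
        stream.append(tok)
    return stream


def synchronized_add(stream_a, stream_b):
    """Buffer arrivals and fire the adder only when both inputs hold a token.

    The output carries ABSENT in every cycle where the adder cannot fire,
    so the process still sees its inputs pairwise, as in the synchronous model.
    """
    buf_a, buf_b, out = deque(), deque(), []
    for cycle in range(max(len(stream_a), len(stream_b))):
        for stream, buf in ((stream_a, buf_a), (stream_b, buf_b)):
            if cycle < len(stream) and stream[cycle] is not ABSENT:
                buf.append(stream[cycle])
        if buf_a and buf_b:
            out.append(buf_a.popleft() + buf_b.popleft())
        else:
            out.append(ABSENT)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    a = stochastic_channel([1, 2, 3, 4], 3, rng)
    b = stochastic_channel([10, 20, 30, 40], 3, rng)
    print(synchronized_add(a, b))
```

Dropping the absent values from the output yields the same sequence the adder would produce under perfectly synchronous communication, which is the consistency property the refinement aims to preserve; only the timing of the results changes.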

We present a performance-oriented refinement approach that refines a perfectly synchronous communication model onto Network-on-Chip (NoC) communication. We first identify four basic forms of NoC process interaction patterns at the process level, namely, producer-consumer, peers, client-server and multicast. We propose a three-step top-down refinement method: channel refinement, protocol refinement and channel mapping. We describe the method in detail for the producer-consumer pattern. In channel refinement, we deal with interfacing multiple clock domains and use a stochastic process to model channel delay and jitter. In protocol refinement, we show how to refine communication towards application requirements such as reliability and throughput. In channel mapping, we discuss channel convergence and channel merge arising from channel overlapping. All the refinements have been conducted and validated as an integral design phase towards implementation in ForSyDe, a formal system-level design methodology based on a synchronous model of computation.

We have presented a formal set of synchronization components called synchronizers for refining synchronous communication onto HW/SW codesign architectures. Such an architecture imposes asynchronous communication between HW-HW, SW-SW and HW-SW components. The synchronizers enable local synchronization and thus satisfy the synchronization requirement of a typical IP core. In this paper we present their implementations in HW, SW and HW/SW as well as their application. To validate our concepts, we conduct a case study on a Nios FPGA that comprises a processor, memory and custom logic. The final HW/SW implementation achieves performance equivalent to a pure HW implementation. Our prototyping experience suggests that the synchronizers can be standardized as library modules and effectively separate the design of computation from that of communication.

The Multi-Core NoC is a 4 by 4 Mesh NoC targeted for Altera FPGAs. It implements a deflective routing policy and is used to connect sixteen NIOS II processors. Each NIOS II is connected to the NoC via an address-mapped Resource Network Interface (RNI). The Multi-Core NoC is implemented on four separate Altera Stratix II FPGA boards, each hosting a Quad-Core NoC, which operates on a local 50 MHz clock. It has an onboard throughput of 650 Mbps (12.5 MFlit/s), and uses 28% of the LUs, 18% of the ALUTs, 22% of the dedicated registers and 31% of the total memory blocks of a Stratix II FPGA. Asynchronous clock bridges, with a throughput of 50 Mbps (~1 MFlit/s), are used for the inter-board communication. Application programs use an MPI compatible Hardware Abstraction Layer (HAL) to communicate with the Resource Network Interface of the NoC. The RNI sets up message transfer, with a maximum length of 512 bytes, and sends flits with the size of 32 bit data plus 20 bit headers through the network. The MPI is the bottleneck of the system; it takes 46 µs (43.4 kPackets/s) to send a minimum-sized packet through the protocol stack to a near neighbour and bounce it back to the original application. The bounce-back time for a far neighbour is 56 µs.
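The throughput figures quoted above can be cross-checked with a short calculation, shown here as a sketch. The flit size, flit rate, bridge bandwidth and round-trip time are taken from the abstract; the payload figures are derived values that do not appear in the source, and the reading of the packet rate as two packets per bounce (one out, one back) is an assumption.

```python
# Figures stated in the abstract.
DATA_BITS = 32                 # payload bits per flit
HEADER_BITS = 20               # header bits per flit
ONBOARD_FLITS_PER_S = 12.5e6   # 12.5 MFlit/s
BRIDGE_BITS_PER_S = 50e6       # 50 Mbps inter-board bridge
ROUND_TRIP_NEAR_S = 46e-6      # 46 microseconds, near-neighbour bounce

FLIT_BITS = DATA_BITS + HEADER_BITS  # 52 bits on the wire per flit

# Raw onboard throughput: 52 bit * 12.5 MFlit/s = 650 Mbps, matching the text.
onboard_mbps = FLIT_BITS * ONBOARD_FLITS_PER_S / 1e6

# Derived (not in the source): share of each flit that is payload,
# and the resulting payload-only throughput.
payload_fraction = DATA_BITS / FLIT_BITS
payload_mbps = DATA_BITS * ONBOARD_FLITS_PER_S / 1e6

# Bridge rate in flits: 50 Mbps / 52 bit, roughly 1 MFlit/s as stated.
bridge_mflits = BRIDGE_BITS_PER_S / FLIT_BITS / 1e6

# Assumption: the quoted packet rate counts both the outgoing and the
# returning packet of one bounce, i.e. two packets per 46 microseconds.
packets_per_s = 2 / ROUND_TRIP_NEAR_S

if __name__ == "__main__":
    print(f"onboard raw throughput : {onboard_mbps:.1f} Mbps")
    print(f"payload fraction       : {payload_fraction:.1%}")
    print(f"onboard payload rate   : {payload_mbps:.1f} Mbps")
    print(f"bridge flit rate       : {bridge_mflits:.2f} MFlit/s")
    print(f"packet rate (assumed)  : {packets_per_s / 1e3:.1f} kPackets/s")
```

Under that two-packets-per-bounce assumption the computed rate lands within rounding distance of the 43.4 kPackets/s given in the abstract, which suggests this is how the figure was obtained.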

Multiprocessor system-on-chip (MPSoC) design is becoming a regular feature of embedded systems. Shared-bus systems hold many advantages, but they do not scale. Network on chip (NoC) offers a promising solution to the scalability problem by enhancing the topology design. However, standard NoCs are only scalable within a chip. To be able to build infinitely scalable structures, a method to extend the NoC grid off-chip is needed. In this paper, we present such a method. As a proof of concept, the protocol is implemented on a 4 by 4 Mesh NoC, with NIOS II CPU cores as nodes, partitioned across four separate Altera FPGA boards, each board hosting a Quad-Core (2x2) NoC, operating on a local 50 MHz clock. The inter-chip communication protocol uses asynchronous clock bridges, with a throughput of 50 Mbps (~1 MFlit/s), and is completely scalable. The NoC has an onboard throughput of 650 Mbps (12.5 MFlit/s). Each Quad-Core uses 28% of the LUs, 18% of the ALUTs, 22% of the dedicated registers and 31% of the total memory blocks of the Stratix II FPGAs. Application programs use an MPI compatible Hardware Abstraction Layer (HAL) to communicate with each other over the NoC.