Broadcast encryption (BE) schemes allow a sender to securely broadcast to any subset of members but require a trusted party to distribute decryption keys. Group key agreement (GKA) protocols enable a group of members to negotiate a common encryption key via open networks so that only the group members can decrypt the ciphertexts encrypted under the shared encryption key, but a sender cannot exclud...

In this manuscript, we present a redundant data storage system based on NAND flash memory chips for in-line Pipeline Inspection Gauges (PIGs). The system is the next step for a technique that reduces data from 1,024 to 37 bytes by 80 transducers used for straight-beam ultrasonic inspection. Each inspection is costly, because PIGs check pipelines up to 100 km, collecting data every 3 mm and reachin...

Due to physical limitations, mobile devices are restricted in memory, battery, and processing, among other characteristics. As a result, many applications cannot be run on such devices. This problem is addressed by Edge Cloud Computing, where users offload tasks they cannot run to cloudlet servers at the edge of the network. The main requirement of such a system is having a low Service Delay, ...

Convolutional Neural Networks (CNNs) have shown a great deal of success in diverse application domains including computer vision, speech recognition, and natural language processing. However, as the size of datasets and the depth of neural network architectures continue to grow, it is imperative to design high-performance and energy-efficient computing hardware for training CNNs. In this paper, we...

Traditional standalone embedded systems are limited in their functionality, flexibility, and scalability. The fog computing platform, characterized by pushing cloud services to the network edge, is a promising solution to support and strengthen traditional embedded systems. Resource management is always a critical issue for system performance. In this paper, we consider a fog computing supported s...

Using cloud storage, users can remotely store their data and enjoy the on-demand high-quality applications and services from a shared pool of configurable computing resources, without the burden of local data storage and maintenance. However, the fact that users no longer have physical possession of the outsourced data makes the data integrity protection in cloud computing a formidable task, espec...

Host-based anomaly intrusion detection system design is very challenging due to the notoriously high false alarm rate. This paper introduces a new host-based anomaly intrusion detection methodology using discontiguous system call patterns, in an attempt to increase detection rates whilst reducing false alarm rates. The key concept is to apply a semantic structure to kernel level system calls in or...

Redundant and irrelevant features in data have caused a long-term problem in network traffic classification. These features not only slow down the process of classification but also prevent a classifier from making accurate decisions, especially when coping with big data. In this paper, we propose a mutual information based algorithm that analytically selects the optimal feature for classification...

Many companies are deploying services largely based on machine-learning algorithms for sophisticated processing of large amounts of data, either for consumers or industry. The state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs), which are known to be computationally and memory intensive. A number of neural network accelerato...

Today, security can no longer be treated as a secondary issue in embedded and cyber-physical systems. Therefore, one of the main challenges in these domains is the design of secure embedded systems under stringent resource constraints and real-time requirements. However, there exists an inherent trade-off between the security protection provided and the amount of resources allocated for this purpo...

The traditional hallmark in embedded systems is to minimize energy consumption considering hard or soft real-time deadlines. The basic principle is to transfigure the uncertainties of task execution times in the real world into energy saving opportunities. The energy saving is achieved by suitably controlling the reliable power supply at circuit or system-level with the aim of minim...

As technology is reaching physical limits, reducing power consumption is a key issue on our path to sustained performance. In this paper, we study fundamental tradeoffs and limits in efficiency (as measured in energy per operation) that can be achieved for an important class of kernels, namely the level-3 Basic Linear Algebra Subprograms (BLAS). It is well-accepted that specialization is the key t...

Mobile devices have several restrictions due to design choices that guarantee their mobility. A way of surpassing such limitations is to utilize cloud servers called cloudlets on the edge of the network through Mobile Edge Computing. However, as the number of clients and devices grows, the service must also increase its scalability in order to guarantee a latency limit and quality threshold. This ...

The problem of the curse of dimensionality for processing large high-dimensional datasets has been an open challenge. Numerous research efforts have been proposed for improving query performance in high-dimensional space through hierarchical indexing using the R-tree or its variants and exploring parallel processing of the R-tree on GPUs. Despite these existing efforts, the curse of dimensionality...

Inexact (or approximate) computing is an attractive paradigm for digital processing at nanometric scales. Inexact computing is particularly interesting for computer arithmetic designs. This paper deals with the analysis and design of two new approximate 4-2 compressors for utilization in a multiplier. These designs rely on different features of compression, such that imprecision in computation (as...

In a smart city, all kinds of users' data are stored in electronic devices to make everything intelligent. The smartphone is the most widely used electronic device and the pivot of all smart systems. However, current smartphones are not competent to manage users' sensitive data, and they face privacy leakage caused by data over-collection. Data over-collection, which means smartphones ...

As data volumes of high-performance computing applications continuously increase, low I/O performance becomes a fatal bottleneck of these data-intensive applications. Data replication is a promising approach to improve parallel I/O performance. However, most existing strategies are designed based on the assumption that contiguous requests are being served more efficiently than non-contiguous reque...

Stochastic computation has recently been proposed for implementing artificial neural networks with reduced hardware and power consumption, but at a decreased accuracy and processing speed. Most existing implementations are based on pre-training such that the weights are predetermined for neurons at different layers, thus these implementations lack the ability to update the values of the network pa...

Lattice-based cryptography is one of the most promising branches of quantum resilient cryptography, offering versatility and efficiency. Discrete Gaussian samplers are a core building block in most, if not all, lattice-based cryptosystems, and optimised samplers are desirable both for high-speed and low-area applications. Due to the inherent structure of existing discrete Gaussian sampling methods...

NAND flash memory is the major storage medium for both mobile storage cards and enterprise Solid-State Drives (SSDs). Log-block-based Flash Translation Layer (FTL) schemes have been widely used to manage NAND flash memory storage systems in industry. In log-block-based FTLs, a few physical blocks called log blocks are used to hold all page updates from a large number of data blocks. Frequent page u...

Somewhat Homomorphic Encryption (SHE) schemes allow operations to be carried out on data in the cipher domain. In a cloud computing scenario, personal information can be processed secretly, implying a high level of confidentiality. For many years, practical parameters of SHE schemes were overestimated, so that only the FFT algorithm was considered for accelerating SHE in hardware. Nevertheless, recent work ...

A discrete cosine transform (DCT) is defined and an algorithm to compute it using the fast Fourier transform is developed. It is shown that the discrete cosine transform can be used in the area of digital processing for the purposes of pattern recognition and Wiener filtering. Its performance is compared with that of a class of orthogonal transforms and is found to compare closely to that of the K...

Server workloads operate on large volumes of data. As a result, processors executing these workloads encounter frequent L1-D misses. In a many-core processor, an L1-D miss causes a request packet to be sent to an LLC slice and a response packet to be sent back to the L1-D, which results in high overhead. While prior work targeted response packets, this work focuses on accelerating the request pack...

Approximate computing is an attractive design methodology to achieve low power, high performance (low delay) and reduced circuit complexity by relaxing the requirement of accuracy. In this paper, approximate Booth multipliers are designed based on approximate radix-4 modified Booth encoding (MBE) algorithms and a regular partial product array that employs an approximate Wallace tree. Two approxima...

The Booth multiplier has been widely used for high performance signed multiplication by encoding and thereby reducing the number of partial products. A multiplier using the radix-4 (or ...

Elastic partitioning of computations between mobile devices and cloud is an important and challenging research topic for mobile cloud computing. Existing works focus on the single-user computation partitioning, which aims to optimize the application completion time for one particular single user. These works assume that the cloud always has enough resources to execute the computations immediately ...

Designing soft-error-resilient systems is a complex engineering task, which nowadays follows a cross-layer approach. It requires careful planning of different fault-tolerance mechanisms at different system layers, from the technology up to the software domain. While these design decisions have a positive effect on the reliability of the system, they usually have a detrimental effect...

This paper proposes a new framework for digital image processing; it relies on inexact computing to address some of the challenges associated with discrete cosine transform (DCT) compression. The proposed framework has three levels of processing; the first level uses approximate DCT for image compression to eliminate all computationally intensive floating-point multiplications and executing the ...

With the proliferation of smartphones and an increasing number of services provisioned by clouds, it is commonplace for users to request cloud services from their mobile devices. Accessing services directly from Internet data centers inherently incurs high latency due to long RTTs and possible congestion in the WAN. To lower the latency, some researchers propose to 'cache' the services at edge cloud...

Arbiter Physically Unclonable Functions (APUFs), while being relatively lightweight, are extremely vulnerable to modeling attacks. Hence, various compositions of APUFs such as XOR APUF and Lightweight Secure PUF have been proposed to be secure alternatives. Previous research has demonstrated that PUF compositions have two major challenges to overcome: vulnerability against modeling and statistical...

As the DRAM cell size continues to shrink, the proportion of leaky cells is increasing. As a result, the prior approaches, called retention aware refresh, which skip unnecessary refresh operations for non-leaky cells, are unable to skip as many refresh operations as before. The large granularity of the DRAM refresh mechanism makes this problem more serious. Specifically, even when there are only a...

Physical unclonable functions (PUFs) are security primitives that enable the extraction of digital identifiers from electronic devices, based on the inherent silicon process variations between devices that occur during the manufacturing process. Due to the intrinsic and lightweight nature of PUFs, they have been proposed to provide security at a low cost for many applications, in particular for ...

Globalization of the integrated circuit (IC) design industry is making it easy for rogue elements in the supply chain to pirate ICs, overbuild ICs, and insert hardware Trojans. Due to supply chain attacks, the IC industry is losing approximately $4 billion annually. One way to protect ICs from these attacks is to encrypt the design by inserting additional gates such that correct outputs are produc...

Underwater wireless sensor networks (UWSNs) have been shown to be a promising technology to monitor and explore the oceans in lieu of traditional undersea wireline instruments. Nevertheless, data gathering in UWSNs is still severely limited because of the acoustic channel communication characteristics. One way to improve the data collection in UWSNs is through the design of routing protocols con...

In this paper, we propose a two-factor data security protection mechanism with factor revocability for cloud storage system. Our system allows a sender to send an encrypted message to a receiver through a cloud storage server. The sender only needs to know the identity of the receiver but no other information (such as its public key or its certificate). The receiver needs to possess two things in ...

A general neural-network (connectionist) model for fuzzy logic control and decision systems is proposed. This connectionist model, in the form of feedforward multilayer net, combines the idea of fuzzy logic controller and neural-network structure and learning abilities into an integrated neural-network-based fuzzy logic control and decision system. A fuzzy logic control decision network is constru...

3D integration opens up new opportunities for future multiprocessor chips by enabling fast and highly scalable 3D Network-on-Chip (NoC) topologies. However, with the aim of reducing the cost of through-silicon vias (TSVs), partially vertically connected NoCs, in which only a few vertical TSV links are available, have been gaining relevance. To reliably route packets under such conditions, we introduce a ...

Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of two decimal floating-point multipliers: one whose partial product accumulation strategy employs decimal carry-save addition and one that employs binary carry-save addition. The multiplier based ...

From the little we know about the human brain, the inherent cognitive mechanism is very different from the de facto state-of-the-art computing platforms. The human brain uses distributed, yet integrated memory and computation units, unlike the physically separate memory and computation cores in typical von Neumann architectures. Despite huge success of artificial intelligence, hardware systems run...

Flash memory-based SSD-RAIDs are swiftly replacing conventional hard disk drives by exhibiting improved performance and stability, especially in I/O-intensive environments. However, the variations in latency and throughput occurring due to uncoordinated internal garbage collection cripple further boosting of performance. In addition, the unwanted variations in each SSD can influence the overall p...

We present a thread-voting DVFS technique for manycore networks-on-chip (NoCs). This technique has two notable features that differentiate it from conventional NoC DVFS schemes. (1) Not only network-level but also thread-level runtime performance indicators are used to guide DVFS decisions. (2) To resolve multiple, possibly conflicting performance indicators from many cores, it allows each thread...

This paper proposes a new data prefetching technique for Graphics Processing Units (GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to dynamically select warps whose progress is slower than that of the current warp as prefetching target warps. Under the in-order instruction execution model of GPUs, these prefetching target warps will certainly execute the same load a...

Many large sequential computers execute operations in a different order than is specified by the program. A correct execution is achieved if the results produced are the same as would be produced by executing the program steps in order. For a multiprocessor computer, such a correct execution by each processor does not guarantee the correct execution of the entire program. Additional conditions are...

Auction-style pricing policies can effectively reflect the underlying trends in demand and supply for cloud resources, and have thereby attracted research interest recently. In particular, a desirable cloud auction design should be (1) online, to reflect fluctuations in supply-demand relations in a timely manner, (2) expressive, to support heterogeneous user demands, and (3) truthful, to discourage users...

In this work we present a new 64-bit floating point Fused Multiply Add (FMA) unit that can perform both binary and decimal addition, multiplication, and fused-multiply-add operations. The presented FMA has 6 percent less delay than the fastest stand-alone decimal unit and 23 percent less area than both binary and decimal units together. These results were achieved by the use of: 1) column by colum...

We present ADDSEN middleware as a holistic solution for Adaptive Data processing and dissemination for Drone swarms in urban SENsing. To efficiently process sensed data in the middleware, we have proposed a cyber-physical sensing framework using partially ordered knowledge sharing for distributed knowledge management in drone swarms. A reinforcement learning dissemination strategy is implemented i...

In this paper, the joint optimization problem of energy efficiency and effective resource utilization is investigated for heterogeneous and distributed multi-core embedded systems. The system model is considered to be a fully heterogeneous model, that is, all nodes have different maximum speeds and power consumption levels from the perspective of hardware while they can employ different scheduli...

Power has become a key constraint in nanoscale integrated circuit design due to the increasing demands for mobile computing and higher integration density. As an emerging computational paradigm, an inexact circuit offers a promising approach to significantly reduce both dynamic and static power dissipation for error-tolerant applications. In this paper, an inexact floating-point adder is proposed ...

Owing to the fast-growing demand for larger and faster NAND flash devices, new manufacturing techniques have accelerated the down-scaling process of NAND flash memory. Among these new techniques, 3D charge trap flash is considered to be one of the most promising candidates for next-generation NAND flash devices. However, the long erase latency of 3D charge trap flash has become a critical issue. ...

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.

TC is a scholarly archival journal published monthly. In addition to full papers, brief contributions and comments are also published.