Sample records for scalable isosurface visualization

In this paper, we present a novel isosurface visualization technique that guarantees the accurate visualization of isosurfaces with complex attribute data defined on (un)structured (curvi)linear hexahedral grids. Isosurfaces of high-order hexahedral-based finite element solutions on both uniform grids (including MRI and CT scans) and more complex geometry representing a domain of interest can be rendered using our algorithm. Additionally, our technique can be used to directly visualize solutions and attributes in isogeometric analysis, an area based on trivariate high-order NURBS (Non-Uniform Rational B-splines) geometry and attribute representations for the analysis. Furthermore, our technique can be used to visualize isosurfaces of algebraic functions. Our approach combines subdivision and numerical root finding to form a robust and efficient isosurface visualization algorithm that does not miss surface features, while finding all intersections between a view frustum and desired isosurfaces. This allows the use of view-independent transparency in the rendering process. We demonstrate our technique through a straightforward CPU implementation on both complex-structured and complex-unstructured geometries with high-order simulation solutions, isosurfaces of medical data sets, and isosurfaces of algebraic functions. PMID:22442127
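
The combination of subdivision and numerical root finding described in this abstract can be illustrated, for the algebraic-function case only, with a minimal Python sketch. It is not the authors' algorithm: the function name `ray_isosurface_hits`, the uniform subdivision of the ray parameter, and the bisection solver are all assumptions made for illustration. Unlike the published method, a uniform subdivision with a sign-change test offers no guarantee; it can miss tangential hits and pairs of hits that fall inside a single segment.

```python
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


def ray_isosurface_hits(
    f: Callable[[float, float, float], float],
    origin: Vec3,
    direction: Vec3,
    isovalue: float,
    t_min: float,
    t_max: float,
    segments: int = 256,   # hypothetical subdivision count
    tol: float = 1e-9,     # hypothetical bisection tolerance
) -> List[float]:
    """Return ray parameters t where f(origin + t * direction) == isovalue."""

    def g(t: float) -> float:
        # Signed distance of the field value from the isovalue along the ray.
        x = origin[0] + t * direction[0]
        y = origin[1] + t * direction[1]
        z = origin[2] + t * direction[2]
        return f(x, y, z) - isovalue

    hits: List[float] = []
    step = (t_max - t_min) / segments
    a, ga = t_min, g(t_min)
    for i in range(1, segments + 1):
        b = t_min + i * step
        gb = g(b)
        if ga == 0.0:
            # The segment start lies exactly on the isosurface.
            hits.append(a)
        elif ga * gb < 0.0:
            # Sign change: refine the bracket [a, b] by bisection.
            lo, hi, glo = a, b, ga
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                gm = g(mid)
                if gm == 0.0:
                    lo = hi = mid
                    break
                if glo * gm < 0.0:
                    hi = mid
                else:
                    lo, glo = mid, gm
            hits.append(0.5 * (lo + hi))
        a, ga = b, gb
    if ga == 0.0:
        hits.append(a)
    return hits


def unit_sphere_field(x: float, y: float, z: float) -> float:
    # Algebraic test function: its isosurface at value 1 is the unit sphere.
    return x * x + y * y + z * z


hits = ray_isosurface_hits(
    unit_sphere_field, (-2.0, 0.5, 0.0), (1.0, 0.0, 0.0), 1.0, 0.0, 4.0
)
print(hits)
```

Because every hit along the ray is returned in front-to-back order, rather than only the first one, a renderer built on such a routine could composite semi-transparent isosurfaces.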

We introduce a new subdivision-surface wavelet transform for arbitrary two-manifolds with boundary that is the first to use simple lifting-style filtering operations with bicubic precision. We also describe a conversion process for re-mapping large-scale isosurfaces to have subdivision connectivity and fair parameterizations so that the new wavelet transform can be used for compression and visualization. The main idea enabling our wavelet transform is the circular symmetrization of the filters in irregular neighborhoods, which replaces the traditional separation of filters into two 1-D passes. Our wavelet transform uses polygonal base meshes to represent surface topology, from which a Catmull-Clark-style subdivision hierarchy is generated. The details between these levels of resolution are quickly computed and compactly stored as wavelet coefficients. The isosurface conversion process begins with a contour triangulation computed using conventional techniques, which we subsequently simplify with a variant edge-collapse procedure, followed by an edge-removal process. This provides a coarse initial base mesh, which is subsequently refined, relaxed and attracted in phases to converge to the contour. The conversion is designed to produce smooth, untangled and minimally-skewed parameterizations, which improves the subsequent compression after applying the transform. We have demonstrated our conversion and transform for an isosurface obtained from a high-resolution turbulent-mixing hydrodynamics simulation, showing the potential for compression and level-of-detail visualization.

Data sets that are being produced by today's simulations, such as the ones generated by DOE's ASCI program, are too large for real-time exploration and visualization. Therefore, new methods of visualizing these data sets need to be investigated. The authors present a method that combines isosurface representations of different resolutions into a seamless solution, virtually free of cracks and overlaps. The solution combines existing isosurface generation algorithms and wavelet theory to produce a real-time solution to multiple-resolution isosurfaces.

Isosurface extraction is an important and useful visualization method. Over the past ten years, numerous isosurface techniques have been published, leaving the user in a quandary about which one should be used. Some papers have published complexity analyses of the techniques, yet empirical evidence comparing different methods is lacking. This case study presents a comparative study of several representative isosurface extraction algorithms. It reports and analyzes empirical measurements of execution times and memory behavior for each algorithm. The results show that asymptotically optimal techniques may not be the best choice when implemented on modern computer architectures.

Remote visualization of volumetric data has gained importance over the past few years in order to realize the full potential of tele-radiology. Volume rendering is a computationally intensive process, often requiring hardware acceleration to achieve real time visualization. Hence a remote visualization model that is well-suited for high speed networks would be to transmit rendered images from the server (with dedicated hardware) based on view point requests from clients. In this regard, a compression scheme for the rendered images is vital for efficient utilization of the server-client bandwidth. Also, the complexity of the decompressor should be considered so that a low end client workstation can decode images at the desired frame rate. We present a scalable low complexity image coder that has good compression efficiency and high throughput.

A central challenge in visual analytics is the creation of accessible, widely distributable analysis applications that bring the benefits of visual discovery to as broad a user base as possible. Moreover, to support the role of visualization in the knowledge creation process, it is advantageous to allow users to describe the reasoning strategies they employ while interacting with analytic environments. We introduce an application suite called the Scalable Reasoning System (SRS), which provides web-based and mobile interfaces for visual analysis. The service-oriented analytic framework that underlies SRS provides a platform for deploying pervasive visual analytic environments across an enterprise. SRS represents a “lightweight” approach to visual analytics whereby thin client analytic applications can be rapidly deployed in a platform-agnostic fashion. Client applications support multiple coordinated views while giving analysts the ability to record evidence, assumptions, hypotheses and other reasoning artifacts. We describe the capabilities of SRS in the context of a real-world deployment at a regional law enforcement organization.

In this project we developed a suite of progressive visualization algorithms and a data-streaming infrastructure that enable interactive exploration of scientific datasets of unprecedented size. The methodology aims to globally optimize the data flow in a pipeline of processing modules. Each module reads a multi-resolution representation of the input while producing a multi-resolution representation of the output. The use of multi-resolution representations provides the necessary flexibility to trade speed for accuracy in the visualization process. Maximum coherency and minimum delay in the data-flow is achieved by extensive use of progressive algorithms that continuously map local geometric updates of the input stream into immediate updates of the output stream. We implemented a prototype software infrastructure that demonstrated the flexibility and scalability of this approach by allowing large data visualization on single desktop computers, on PC clusters, and on heterogeneous computing resources distributed over a wide area network. When processing terabytes of scientific data, we have achieved an effective increase in visualization performance of several orders of magnitude in two major settings: (i) interactive visualization on desktop workstations of large datasets that cannot be stored locally; (ii) real-time monitoring of a large scientific simulation with negligible impact on the computing resources available. The ViSUS streaming infrastructure enabled the real-time execution and visualization of the two LLNL simulation codes (Miranda and Raptor) run at Supercomputing 2004 on Blue Gene/L at its presentation as the fastest supercomputer in the world. In addition to SC04, we have run live demonstrations at the IEEE VIS conference and at invited talks at the DOE MICS office, DOE computer graphics forum, UC Riverside, and the University of Maryland. In all cases we have shown the capability to stream and visualize interactively data stored remotely at the San

We present a new method for the interactive rendering of isosurfaces using ray casting on multi-core processors. This method consists of a combination of an object-order traversal that coarsely identifies possible candidate 3D data blocks for each small set of contiguous pixels, and an isosurface ray casting strategy tailored for the resulting limited-size lists of candidate 3D data blocks. While static screen partitioning is widely used in the literature, our scheme performs dynamic allocation of groups of ray casting tasks to ensure almost equal loads among the different threads running on multi-cores while maintaining spatial locality. We also make careful use of the memory management environment commonly present in multi-core processors. We test our system on a two-processor Clovertown platform, each consisting of a Quad-Core 1.86-GHz Intel Xeon Processor, for a number of widely different benchmarks. The detailed experimental results show that our system is efficient and scalable, and achieves high cache performance and excellent load balancing, resulting in an overall performance that is superior to any of the previous algorithms. In fact, we achieve interactive isosurface rendering on a 1024² screen for all the datasets tested up to the maximum size of the main memory of our platform. PMID:18369267
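
The contrast between static screen partitioning and dynamic allocation of ray casting tasks can be sketched in Python as follows. This is only an illustration of the scheduling pattern under stated assumptions: the function `render_tiles_dynamic`, the square tile size, and the `shade` callback are hypothetical and do not come from the paper, and Python threads do not give the multi-core speed-up the authors report.

```python
import queue
import threading
from typing import Callable, List, Optional


def render_tiles_dynamic(
    width: int,
    height: int,
    tile: int,
    num_workers: int,
    shade: Callable[[int, int], int],
) -> List[List[Optional[int]]]:
    """Fill an image by letting worker threads pull pixel tiles from a shared queue."""
    # Each task is one small block of contiguous pixels (keeps spatial locality).
    tasks: "queue.Queue[tuple]" = queue.Queue()
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            tasks.put((x0, y0))

    image: List[List[Optional[int]]] = [[None] * width for _ in range(height)]

    def worker() -> None:
        while True:
            try:
                # A thread that finishes early simply takes the next tile,
                # so expensive tiles do not stall a fixed screen region.
                x0, y0 = tasks.get_nowait()
            except queue.Empty:
                return
            for y in range(y0, min(y0 + tile, height)):
                for x in range(x0, min(x0 + tile, width)):
                    image[y][x] = shade(x, y)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return image


# Stand-in for casting one ray through pixel (x, y).
image = render_tiles_dynamic(10, 7, 4, 3, lambda x, y: x + 10 * y)
print(image[6][9])
```

Tiles write to disjoint pixels, so no lock is needed on the image itself; only the task queue is shared.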

In NURBS-based isogeometric analysis, the basis functions of a 3D model's geometric description also form the basis for the solution space of variational formulations of partial differential equations. In order to visualize the results of a NURBS-based isogeometric analysis, we developed a novel GPU-based multi-pass isosurface visualization technique which operates directly on an equivalent rational Bézier representation without the need for discretization or approximation. Our approach utilizes rasterization to generate a list of intervals along the ray that each potentially contain boundary or isosurface intersections. Depth-sorting this list for each ray allows us to proceed in front-to-back order and enables early ray termination. We detect multiple intersections of a ray with the higher-order surface of the model using a sampling-based root-isolation method. The model's surfaces and the isosurfaces always appear smooth, independent of the zoom level, due to our pixel-precise processing scheme. Our adaptive sampling strategy minimizes costs for point evaluations and intersection computations. The implementation shows that the proposed approach interactively visualizes volume meshes containing hundreds of thousands of Bézier elements on current graphics hardware. A comparison to a GPU-based ray casting implementation using spatial data structures indicates that our approach generally performs significantly faster while being more accurate. PMID:26357373

A scalable and portable code named Atomsviewer has been developed to interactively visualize a large atomistic dataset consisting of up to a billion atoms. The code uses a hierarchical view frustum-culling algorithm based on the octree data structure to efficiently remove atoms outside of the user's field-of-view. Probabilistic and depth-based occlusion-culling algorithms then select atoms that have a high probability of being visible. Finally, a multiresolution algorithm is used to render the selected subset of visible atoms at varying levels of detail. Atomsviewer is written in C++ and OpenGL, and it has been tested on a number of architectures including Windows, Macintosh, and SGI. Atomsviewer has been used to visualize tens of millions of atoms on a standard desktop computer and, in its parallel version, up to a billion atoms. Program summary. Title of program: Atomsviewer. Catalogue identifier: ADUM. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADUM. Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland. Computer for which the program is designed and others on which it has been tested: 2.4 GHz Pentium 4/Xeon processor, professional graphics card; Apple G4 (867 MHz)/G5, professional graphics card. Operating systems under which the program has been tested: Windows 2000/XP, Mac OS 10.2/10.3, SGI IRIX 6.5. Programming languages used: C++, C and OpenGL. Memory required to execute with typical data: 1 gigabyte of RAM. High speed storage required: 60 gigabytes. No. of lines in the distributed program including test data, etc.: 550 241. No. of bytes in the distributed program including test data, etc.: 6 258 245. Number of bits in a word: Arbitrary. Number of processors used: 1. Has the code been vectorized or parallelized: No. Distribution format: tar gzip file. Nature of physical problem: Scientific visualization of atomic systems. Method of solution: Rendering of atoms using computer graphic techniques, culling algorithms for data

Multi-resolution data-structures and algorithms are key in Visualization to achieve real-time interaction with large data-sets. Research has been primarily focused on the off-line construction of such representations mostly using decimation schemes. Drawbacks of this class of approaches include: (i) the inability to maintain interactivity when the displayed surface changes frequently, (ii) inability to control the global geometry of the embedding (no self-intersections) of any approximated level of detail of the output surface. In this paper we introduce a technique for on-line construction and smoothing of progressive isosurfaces. Our hybrid approach combines the flexibility of a progressive multi-resolution representation with the advantages of a recursive sub-division scheme. Our main contributions are: (i) a progressive algorithm that builds a multi-resolution surface by successive refinements so that a coarse representation of the output is generated as soon as a coarse representation of the input is provided, (ii) application of the same scheme to smooth the surface by means of a 3D recursive subdivision rule, (iii) a multi-resolution representation where any adaptively selected level of detail surface is guaranteed to be free of self-intersections.

Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems in areas including decision making, risk assessment, social network analysis, intelligence analysis, scholarly research and others. However, as data sizes continue to grow in these areas, scalable processing, modeling, and semantic analysis of text collections become essential. In this paper, we present the ParaText text analysis engine, a distributed memory software framework for processing, modeling, and analyzing collections of unstructured text documents. Results on several document collections using hundreds of processors are presented to illustrate the flexibility, extensibility, and scalability of the entire process of text modeling from raw data ingestion to application analysis.

This document describes the LBNL vision for issues to be considered when assembling a large, multi-institution visualization and analysis effort. It was drafted at the request of the PNNL National Visual Analytics Center in July 2004.

The authors developed image and video compression algorithms that provide scalability, reconstructibility, and network adaptivity, and developed compression and quantization strategies that are visually optimal at all bit rates. The goal of this research is to enable reliable "universal access" to visual communications over the National Information Infrastructure (NII). All users, regardless of their individual network connection bandwidths, qualities-of-service, or terminal capabilities, should have the ability to access still images, video clips, and multimedia information services, and to use interactive visual communications services. To do so requires special capabilities for image and video compression algorithms: scalability, reconstructibility, and network adaptivity. Scalability allows an information service to provide visual information at many rates, without requiring additional compression or storage after the stream has been compressed the first time. Reconstructibility allows reliable visual communications over an imperfect network. Network adaptivity permits real-time modification of compression parameters to adjust to changing network conditions. Furthermore, to optimize the efficiency of the compression algorithms, they should be visually optimal, where each bit expended reduces the visual distortion. Visual optimality is achieved through first extensive experimentation to quantify human sensitivity to supra-threshold compression artifacts and then incorporation of these experimental results into quantization strategies and compression algorithms.

In many applications, iso-surface is the primary method for visualizing the structure of 3D density maps. We consider a common scenario where the user views the iso-surfaces from a distance and varies the level associated with the iso-surface as well as the view direction to gain a sense of the general 3D structure of the density map. For many types of density data, the iso-surfaces associated with a particular threshold may be nested and never visible during this type of viewing. In this paper, we discuss a simple, conservative culling method that avoids the generation of interior portions of iso-surfaces at the contouring stage. Unlike existing methods that perform culling based on the current view direction, our culling is performed once for all views and requires no additional computation as the view changes. By pre-computing a single visibility map, culling is done at any iso-value with little overhead in contouring. We demonstrate the effectiveness of the algorithm on a range of bio-medical data and discuss a practical application in online visualization. PMID:21673830

A new visualization representation is described, which dramatically improves interactivity for scientific visualizations of structured grid data sets by creating isosurfaces at interactive speeds and with dynamically changeable levels-of-detail (LOD). This representation enables greater interactivity by allowing an analyst to dynamically specify both the desired isosurface threshold and required level-of-detail to be used while rendering the image. A scientist can therefore view very large isosurfaces at interactive speeds (with a low level-of-detail), but has the full data set always available for analysis. The key idea is that various levels-of-detail are represented as differently sized hexahedral virtual voxels, which are stored in a three-dimensional binary tree, or kd-tree; thus the level-of-detail representation is done in voxel space instead of the traditional approach which relies on surface or geometry space decimations. Utilizing the voxel space is an essential step to moving from a post-processing visualization paradigm to a quantitative, real-time paradigm. This algorithm has been implemented as an integral component of the EIGEN/VR project at Sandia National Laboratories, which provides a rich environment for scientists to interactively explore and visualize the results of very large-scale simulations performed on massively parallel supercomputers.
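
The idea of storing differently sized virtual voxels in a three-dimensional binary tree can be sketched in Python as below. The abstract does not say what each tree node stores, so the per-node minimum and maximum, the names `Node`, `build` and `select_voxels`, and the use of tree depth as the level-of-detail control are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Half-open cell ranges: (x0, x1, y0, y1, z0, z1).
Box = Tuple[int, int, int, int, int, int]


@dataclass
class Node:
    box: Box                      # cells covered by this virtual voxel
    vmin: float                   # smallest sample value inside the voxel
    vmax: float                   # largest sample value inside the voxel
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(vol: List[List[List[float]]], box: Box, axis: int = 0) -> Node:
    """Build a kd-tree over grid cells; vol is indexed as vol[z][y][x] (samples)."""
    x0, x1, y0, y1, z0, z1 = box
    lo, hi = [x0, y0, z0], [x1, y1, z1]
    if all(h - l == 1 for l, h in zip(lo, hi)):
        # Leaf: one grid cell, bounded by its eight corner samples.
        corners = [vol[z][y][x]
                   for z in (z0, z0 + 1) for y in (y0, y0 + 1) for x in (x0, x0 + 1)]
        return Node(box, min(corners), max(corners))
    while hi[axis] - lo[axis] < 2:      # skip axes that cannot be split further
        axis = (axis + 1) % 3
    mid = (lo[axis] + hi[axis]) // 2
    lhi, rlo = hi.copy(), lo.copy()
    lhi[axis], rlo[axis] = mid, mid
    nxt = (axis + 1) % 3
    left = build(vol, (lo[0], lhi[0], lo[1], lhi[1], lo[2], lhi[2]), nxt)
    right = build(vol, (rlo[0], hi[0], rlo[1], hi[1], rlo[2], hi[2]), nxt)
    return Node(box, min(left.vmin, right.vmin), max(left.vmax, right.vmax), left, right)


def select_voxels(node: Node, iso: float, max_depth: int, depth: int = 0) -> List[Box]:
    """Return the voxels crossed by the isovalue, no finer than max_depth."""
    if not (node.vmin <= iso <= node.vmax):
        return []                       # isosurface cannot pass through this voxel
    if node.left is None or depth == max_depth:
        return [node.box]               # coarse voxel stands in for its subtree
    return (select_voxels(node.left, iso, max_depth, depth + 1)
            + select_voxels(node.right, iso, max_depth, depth + 1))


# 3x3x3 samples (2x2x2 cells) whose value equals the x coordinate.
volume = [[[float(x) for x in range(3)] for _ in range(3)] for _ in range(3)]
root = build(volume, (0, 2, 0, 2, 0, 2))
print(select_voxels(root, 0.5, max_depth=1))
print(len(select_voxels(root, 0.5, max_depth=10)))
```

A small `max_depth` yields a few large voxels for fast, coarse display, while a large value descends to individual cells, so both the threshold and the level of detail can change per query without rebuilding the tree.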

Datasets commonly include multi-value (set-typed) attributes that describe set memberships over elements, such as genres per movie or courses taken per student. Set-typed attributes describe rich relations across elements, sets, and the set intersections. Increasing the number of sets results in a combinatorial growth of relations and creates scalability challenges. Exploratory tasks (e.g. selection, comparison) have commonly been designed in separation for set-typed attributes, which reduces interface consistency. To improve on scalability and to support rich, contextual exploration of set-typed data, we present AggreSet. AggreSet creates aggregations for each data dimension: sets, set-degrees, set-pair intersections, and other attributes. It visualizes the element count per aggregate using a matrix plot for set-pair intersections, and histograms for set lists, set-degrees and other attributes. Its non-overlapping visual design is scalable to numerous and large sets. AggreSet supports selection, filtering, and comparison as core exploratory tasks. It allows analysis of set relations including subsets, disjoint sets and set intersection strength, and also features perceptual set ordering for detecting patterns in set matrices. Its interaction is designed for rich and rapid data exploration. We demonstrate results on a wide range of datasets from different domains with varying characteristics, and report on expert reviews and a case study using student enrollment and degree data with assistant deans at a major public university. PMID:26390465
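
The aggregations named in this abstract (element counts per set, per set-degree, and per set-pair intersection) can be computed with a short Python sketch. The function name `aggregate_sets` and the toy movie data are hypothetical; this shows only the counting step, not AggreSet's implementation or its visual design.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Tuple


def aggregate_sets(
    memberships: Dict[str, Iterable[str]],
) -> Tuple[Counter, Counter, Counter]:
    """Count elements per set, per set-degree, and per set-pair intersection."""
    set_sizes: Counter = Counter()     # set name -> number of elements
    degree_hist: Counter = Counter()   # number of sets an element is in -> element count
    pair_counts: Counter = Counter()   # (set A, set B) -> size of intersection
    for _element, sets in memberships.items():
        unique = sorted(set(sets))     # sorted so each pair has one canonical order
        degree_hist[len(unique)] += 1
        set_sizes.update(unique)
        pair_counts.update(combinations(unique, 2))
    return set_sizes, degree_hist, pair_counts


# Toy data in the spirit of "genres per movie".
movies = {
    "A": ["Drama", "Comedy"],
    "B": ["Drama"],
    "C": ["Comedy", "Drama", "Action"],
}
set_sizes, degree_hist, pair_counts = aggregate_sets(movies)
print(set_sizes["Drama"], degree_hist[2], pair_counts[("Comedy", "Drama")])
```

The pair counts are the cell values a set-pair matrix plot would display, and the other two counters correspond to the histograms for set lists and set-degrees.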

In this paper, we propose a scalable visualization system to offer high-resolution visualization in multiparty collaborative environments. The proposed system uses a coordination technique to employ a large-scale high-resolution display system and to display multiple high-quality videos effectively on systems with limited resources. To handle these, the proposed system includes a distributed visualization application with a generic structure that enables high-resolution video formats, such as DV (digital video) and HDV (high definition video) streaming, and a decomposable decoding and display structure that assigns the separated visualization tasks (decoding/display) to different system resources. The system is based on a high-performance local area network, and the high-performance network between the decoding and display tasks is used as the system bus to transfer the decoded large pixel data. The main focus of this paper is the technique of decoupling decoding and display over a high-performance network to handle multiple high-resolution videos effectively. We explore the possibility of the proposed system by implementing a prototype and evaluating it over a high-performance network. Finally, the experimental results verify the improved scalable display system achieved through the proposed structure.

Given the explosive growth of modern graph data, new methods are needed that allow for the querying of complex graph structures without the need for a complicated querying language; in short, interactive graph querying is desirable. We describe our work towards achieving our overall research goal of designing and developing an interactive querying system for large network data. We focus on three critical aspects: scalable data mining algorithms, graph visualization, and interaction design. We have already completed an approximate subgraph matching system called MAGE in our previous work that fulfills the algorithmic foundation allowing us to query a graph with hundreds of millions of edges. Our preliminary work on visual graph querying, Graphite, was the first step in the process of making an interactive graph querying system. We are in the process of designing the graph visualization and robust interaction needed to make truly interactive graph querying a reality. PMID:25859567

One way to provide global illumination for the scientist who performs an interactive sweep through a 3D scalar dataset is to pre-compute global illumination, resample the radiance onto a 3D grid, then use it as a 3D texture. The basic approach of repeatedly extracting isosurfaces, illuminating them, and then building a 3D illumination grid suffers from the non-uniform sampling that arises from coupling the sampling of radiance with the sampling of isosurfaces. We demonstrate how the illumination step can be decoupled from the isosurface extraction step by illuminating the entire 3D scalar function as a 3-manifold in 4-dimensional space. By reformulating light transport in a higher dimension, one can sample a 3D volume without requiring the radiance samples to aggregate along individual isosurfaces in the pre-computed illumination grid. PMID:19834238

This paper describes a method for constructing isosurface triangulations of sampled, volumetric, three-dimensional scalar fields. The resulting meshes consist of triangles that are of consistently high quality, making them well suited for accurate interpolation of scalar and vector-valued quantities, as required for numerous applications in visualization and numerical simulation. The proposed method does not rely on a local construction or adjustment of triangles as is done, for instance, in advancing wavefront or adaptive refinement methods. Instead, a system of dynamic particles optimally samples an implicit function such that the particles' relative positions can produce a topologically correct Delaunay triangulation. Thus, the proposed method relies on a global placement of triangle vertices. The main contributions of the paper are the integration of dynamic particles systems with surface sampling theory and PDE-based methods for controlling the local variability of particle densities, as well as detailing a practical method that accommodates Delaunay sampling requirements to generate sparse sets of points for the production of high-quality tessellations. PMID:17968128

With the accumulation of large amounts of health related data, predictive analytics could stimulate the transformation of reactive medicine towards Predictive, Preventive and Personalized (PPPM) Medicine, ultimately affecting both cost and quality of care. However, the high dimensionality and high complexity of the data involved prevent data-driven methods from easy translation into clinically relevant models. Additionally, the application of cutting edge predictive methods and data manipulation requires substantial programming skills, limiting its direct exploitation by medical domain experts. This leaves a gap between potential and actual data usage. In this study, the authors address this problem by focusing on open, visual environments, suited to be applied by the medical community. Moreover, we review code free applications of big data technologies. As a showcase, a framework was developed for the meaningful use of data from critical care patients by integrating the MIMIC-II database in a data mining environment (RapidMiner) supporting scalable predictive analytics using visual tools (RapidMiner’s Radoop extension). Guided by the CRoss-Industry Standard Process for Data Mining (CRISP-DM), the ETL process (Extract, Transform, Load) was initiated by retrieving data from the MIMIC-II tables of interest. As a use case, the correlation of platelet count and ICU survival was quantitatively assessed. Using visual tools for ETL on Hadoop and predictive modeling in RapidMiner, we developed robust processes for automatic building, parameter optimization and evaluation of various predictive models, under different feature selection schemes. Because these processes can be easily adopted in other projects, this environment is attractive for scalable predictive analytics in health research. PMID:26731286

Recent advances in scanning technology provide high resolution EM (Electron Microscopy) datasets that allow neuro-scientists to reconstruct complex neural connections in a nervous system. However, due to the enormous size and complexity of the resulting data, segmentation and visualization of neural processes in EM data is usually a difficult and very time-consuming task. In this paper, we present NeuroTrace, a novel EM volume segmentation and visualization system that consists of two parts: a semi-automatic multiphase level set segmentation with 3D tracking for reconstruction of neural processes, and a specialized volume rendering approach for visualization of EM volumes. It employs view-dependent on-demand filtering and evaluation of a local histogram edge metric, as well as on-the-fly interpolation and ray-casting of implicit surfaces for segmented neural structures. Both methods are implemented on the GPU for interactive performance. NeuroTrace is designed to be scalable to large datasets and data-parallel hardware architectures. A comparison of NeuroTrace with a commonly used manual EM segmentation tool shows that our interactive workflow is faster and easier to use for the reconstruction of complex neural processes. PMID:19834227

JuxtaView is a cluster-based application for viewing ultra-high-resolution images on scalable tiled displays. In JuxtaView, we present a new parallel computing and distributed memory approach for out-of-core montage visualization, using LambdaRAM, a software-based network-level cache system. The ultimate goal of JuxtaView is to enable a user to interactively roam through potentially terabytes of distributed, spatially referenced image data such as those from electron microscopes, satellites and aerial photographs. In working towards this goal, we describe our first prototype implemented over a local area network, where the image is distributed using LambdaRAM across the memory of all nodes of a PC cluster driving a tiled display wall. Aggressive pre-fetching schemes employed by LambdaRAM help to reduce the latency involved in remote memory access. We compare LambdaRAM with a more traditional memory-mapped file approach for out-of-core visualization. © 2004 IEEE.

We present an algorithm for interactively extracting and rendering isosurfaces of large volume datasets in a view-dependent fashion. A recursive tetrahedral mesh refinement scheme, based on longest edge bisection, is used to hierarchically decompose the data into a multiresolution structure. This data structure allows fast extraction of arbitrary isosurfaces to within user specified view-dependent error bounds. A data layout scheme based on hierarchical space filling curves provides access to the data in a cache coherent manner that follows the data access pattern indicated by the mesh refinement.
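As a rough illustration of the longest-edge bisection step named in this abstract, the Python sketch below splits one tetrahedron at the midpoint of its longest edge. The function names and the tuple-based vertex representation are assumptions made for this example, not the authors' data structures, and the space-filling-curve data layout is not shown.

```python
from itertools import combinations


def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def volume(tet):
    """Unsigned volume of a tetrahedron given as four 3D points."""
    a, b, c, d = tet
    u, v, w = sub(b, a), sub(c, a), sub(d, a)
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0


def bisect_longest_edge(tet):
    """Split a tetrahedron into two children at the midpoint of its longest edge."""
    # Pick the vertex pair (i, j) with the largest squared edge length.
    i, j = max(combinations(range(4), 2),
               key=lambda e: sum(d * d for d in sub(tet[e[0]], tet[e[1]])))
    mid = tuple((p + q) / 2.0 for p, q in zip(tet[i], tet[j]))
    # Each child keeps one endpoint of the split edge and gains the midpoint.
    child_a = tuple(mid if k == j else v for k, v in enumerate(tet))
    child_b = tuple(mid if k == i else v for k, v in enumerate(tet))
    return child_a, child_b
```

Applying `bisect_longest_edge` recursively to the children yields a nested hierarchy; a view-dependent scheme would stop refining once a per-tetrahedron error estimate falls below the user-specified bound.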

A cloud-resolving model (CRM) is an atmospheric numerical model that can numerically resolve clouds and cloud systems at 0.25~5 km horizontal grid spacings. The main advantage of the CRM is that it can allow explicit interactive processes between microphysics, radiation, turbulence, surface, and aerosols without subgrid cloud fraction, overlapping and convective parameterization. Because of their fine resolution and complex physical processes, it is challenging for the CRM community to i) visualize/inter-compare CRM simulations, ii) diagnose key processes for cloud-precipitation formation and intensity, and iii) evaluate against NASA's field campaign data and L1/L2 satellite data products, due to the large data volume (~10 TB) and the complexity of the CRM's physical processes. We have been building the Super Cloud Library (SCL) upon a Hadoop framework, capable of CRM database management, distribution, visualization, subsetting, and evaluation in a scalable way. The current SCL capability includes (1) an SCL data model that enables various CRM simulation outputs in NetCDF, including the NASA-Unified Weather Research and Forecasting (NU-WRF) and Goddard Cumulus Ensemble (GCE) model, to be accessed and processed by Hadoop; (2) a parallel NetCDF-to-CSV converter that supports NU-WRF and GCE model outputs; (3) a technique that visualizes Hadoop-resident data with IDL; (4) a technique that subsets Hadoop-resident data, compliant with the SCL data model, with HIVE or Impala via HUE's Web interface; and (5) a prototype that enables a Hadoop MapReduce application to dynamically access and process data residing in a parallel file system, PVFS2 or CephFS, where high performance computing (HPC) simulation outputs such as NU-WRF's and GCE's are located. We are testing Apache Spark to speed up SCL data processing and analysis. With the SCL capabilities, SCL users can conduct large-domain on-demand tasks without downloading voluminous CRM datasets and various observations from NASA Field Campaigns and Satellite data to a

We propose an adaptively synchronous scalable spread spectrum (A4S) data-hiding strategy to integrate disparate data, needed for a typical 3-D visualization, into a single JPEG2000 format file. JPEG2000 encoding provides a standard format on one hand and the needed multiresolution for scalability on the other. The method has the potential of being imperceptible and robust at the same time. While the spread spectrum (SS) methods are known for the high robustness they offer, our data-hiding strategy is removable at the same time, which ensures highest possible visualization quality. The SS embedding of the discrete wavelet transform (DWT)-domain depth map is carried out in transform domain YCrCb components from the JPEG2000 coding stream just after the DWT stage. To maintain synchronization, the embedding is carried out while taking into account the correspondence of subbands. Since security is not the immediate concern, we are at liberty with the strength of embedding. This permits us to increase the robustness and bring the reversibility of our method. To estimate the maximum tolerable error in the depth map according to a given viewpoint, a human visual system (HVS)-based psychovisual analysis is also presented.

A new visualization technique is reported, which dramatically improves interactivity for scientific visualizations by working directly with voxel data and by employing efficient algorithms and data structures. This discussion covers the research software, the file structures, examples of data creation, data search, and triangle rendering codes that allow geometric surfaces to be extracted from volumetric data. Uniquely, these methods enable greater interactivity by allowing an analyst to dynamically specify both the desired isosurface threshold and required level-of-detail to be used while rendering the image. The key idea behind this visualization paradigm is that various levels-of-detail are represented as differently sized hexahedral virtual voxels, which are stored in a three-dimensional kd-tree; thus the level-of-detail representation is done in voxel space instead of the traditional approach which relies on surface or geometry space decimations. This algorithm has been implemented as an integral component in the EIGEN/VR project at Sandia National Laboratories, which provides a rich environment for scientists to interactively explore and visualize the results of very large-scale simulations performed on massively parallel supercomputers.

The use of collaborative scientific visualization systems for the analysis, visualization, and sharing of 'big data' available from new high resolution remote sensing satellite sensors or four-dimensional numerical model simulations is propelling the wider adoption of ultra-resolution tiled display walls interconnected by high speed networks. These systems require a globally connected and well-integrated operating environment that provides persistent visualization and collaboration services. This abstract and subsequent presentation describes a new collaborative visualization system installed for NASA's Short-term Prediction Research and Transition (SPoRT) program at Marshall Space Flight Center and its use for Earth science applications. The system consists of a 3 x 4 array of 1920 x 1080 pixel thin bezel video monitors mounted on a wall in a scientific collaboration lab. The monitors are physically and virtually integrated into a 14' x 7' for video display. The display of scientific data on the video wall is controlled by a single Alienware Aurora PC with a 2nd Generation Intel Core 4.1 GHz processor, 32 GB memory, and an AMD Fire Pro W600 video card with 6 mini display port connections. Six mini display-to-dual DVI cables are used to connect the 12 individual video monitors. The open source Scalable Adaptive Graphics Environment (SAGE) windowing and media control framework, running on top of the Ubuntu 12 Linux operating system, allows several users to simultaneously control the display and storage of high resolution still and moving graphics in a variety of formats, on tiled display walls of any size. The Ubuntu operating system supports the open source Scalable Adaptive Graphics Environment (SAGE) software which provides a common environment, or framework, enabling its users to access, display and share a variety of data-intensive information. This information can be digital-cinema animations, high-resolution images, high-definition video

This project developed software tools for the automation of grid computing. In particular, the project focused on visualization and imaging tools (VTK, ParaView and ITK); i.e., we developed tools to automatically create Grid services from C++ programs implemented using the open-source VTK visualization and ITK segmentation and registration systems. This approach helps non-Grid experts to create applications using tools with which they are familiar, ultimately producing Grid services for visualization and image analysis by invoking an automatic process.

An analyst today has a tremendous amount of data available, but each of the various data sources typically exists in its own silo, so an analyst has limited ability to see an integrated view of the data and has little or no access to contextual information that could help in understanding the data. We have developed the Domain-Insight Graph (DIG) system, an innovative architecture for extracting, aligning, linking, and visualizing massive amounts of domain-specific content from unstructured sources. Under the DARPA Memex program we have already successfully applied this architecture to multiple application domains, including the enormous international problem of human trafficking, where we extracted, aligned and linked data from 50 million online Web pages. DIG builds on our Karma data integration toolkit, which makes it easy to rapidly integrate structured data from a variety of sources, including databases, spreadsheets, XML, JSON, and Web services. The ability to integrate Web services allows Karma to pull in live data from various social media sites, such as Twitter, Instagram, and OpenStreetMaps. DIG then indexes the integrated data and provides an easy-to-use interface for query, visualization, and analysis.

This paper presents a simple approach for rendering isosurfaces of a scalar field. Using the vertex programming capability of commodity graphics cards, we transfer the cost of computing an isosurface from the Central Processing Unit (CPU), running the main application, to the Graphics Processing Unit (GPU), rendering the images. We consider a tetrahedral decomposition of the domain and draw one quadrangle (quad) primitive per tetrahedron. A vertex program transforms the quad into the piece of isosurface within the tetrahedron (see Figure 2). In this way, the main application is only devoted to streaming the vertices of the tetrahedra from main memory to the graphics card. For adaptively refined rectilinear grids, the optimization of this streaming process leads to the definition of a new 3D space-filling curve, which generalizes the 2D Sierpinski curve used for efficient rendering of triangulated terrains. We maintain the simplicity of the scheme when constructing view-dependent adaptive refinements of the domain mesh. In particular, we guarantee the absence of T-junctions by satisfying local bounds in our nested error basis. The expensive stage of fixing cracks in the mesh is completely avoided. We discuss practical tradeoffs in the distribution of the workload between the application and the graphics hardware. With current GPUs, it is convenient to perform certain computations on the main CPU. Beyond the performance considerations that will change with new generations of GPUs, this approach has the major advantage of completely avoiding the storage in memory of the isosurface vertices and triangles.
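To make the per-tetrahedron step concrete, here is a minimal CPU-side Python sketch of the geometry a vertex program would have to produce: the polygon where a linearly interpolated scalar field crosses the isovalue inside one tetrahedron. It is not the authors' vertex program; the function name `tet_isosurface`, the edge ordering, and the list-of-tuples representation are assumptions for illustration only.

```python
# The six edges of a tetrahedron as pairs of vertex indices.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]


def tet_isosurface(verts, values, iso):
    """Return 0, 3 or 4 points where the linear field on a tetrahedron equals iso.

    verts  -- four 3D points
    values -- scalar value at each vertex
    """
    above = [v > iso for v in values]
    points = []
    for i, j in EDGES:
        if above[i] != above[j]:
            # Linear interpolation parameter along edge (i, j).
            t = (iso - values[i]) / (values[j] - values[i])
            points.append(tuple(a + t * (b - a)
                                for a, b in zip(verts[i], verts[j])))
    if len(points) == 4:
        # With the edge order above, swapping the last two points puts
        # the four crossings in cyclic order around the quad.
        points[2], points[3] = points[3], points[2]
    return points
```

A one-against-three split of the vertices gives a triangle and a two-against-two split gives a quad, which is why a single quad primitive per tetrahedron (degenerating to a triangle when needed) is sufficient.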

Many high-performance isosurface extraction algorithms have been proposed in the past several years as a result of intensive research efforts. When applying these algorithms to large-scale time-varying fields, the storage overhead incurred from storing the search index often becomes overwhelming. This paper proposes an algorithm for locating isosurface cells in time-varying fields. We devise a new data structure, called the Temporal Hierarchical Index Tree, which utilizes the temporal coherence that exists in a time-varying field and adaptively coalesces the cells' extreme values over time; the resulting extreme values are then used to create the isosurface cell search index. For a typical time-varying scalar data set, not only does this temporal hierarchical index tree require much less storage space, but the amount of I/O required to access the indices from the disk at different time steps is also substantially reduced. We illustrate the utility and speed of our algorithm with data from several large-scale time-varying CFD simulations. Our algorithm can achieve more than 80% of disk-space savings when compared with the existing techniques, while the isosurface extraction time is nearly optimal.
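The idea of coalescing cell extreme values over time can be sketched as follows. This Python example is a simplified illustration under stated assumptions, not the published data structure: the dictionary-based node layout, the `tol` threshold, and the "coalesced range minus widest per-step range" measure of temporal variation are all hypothetical choices made here.

```python
def coalesce(extremes, t0, t1):
    """Merge per-step (min, max) pairs over time steps t0..t1 inclusive."""
    window = extremes[t0:t1 + 1]
    return min(e[0] for e in window), max(e[1] for e in window)


def build(cells, t0, t1, tol):
    """Build a node covering steps t0..t1.

    cells -- {cell_id: [(min, max) for each time step]}
    Cells whose range barely changes over the span are stored once here;
    the others are pushed down to children covering half the span each.
    """
    node = {"span": (t0, t1), "entries": {}, "children": []}
    rest = {}
    for cid, ext in cells.items():
        lo, hi = coalesce(ext, t0, t1)
        widest = max(e[1] - e[0] for e in ext[t0:t1 + 1])
        if t0 == t1 or (hi - lo) - widest <= tol:
            node["entries"][cid] = (lo, hi)
        else:
            rest[cid] = ext
    if rest:
        mid = (t0 + t1) // 2
        node["children"] = [build(rest, t0, mid, tol),
                            build(rest, mid + 1, t1, tol)]
    return node


def query(node, t, iso):
    """Return candidate cell ids whose stored range contains iso at step t."""
    hits = {cid for cid, (lo, hi) in node["entries"].items() if lo <= iso <= hi}
    for child in node["children"]:
        c0, c1 = child["span"]
        if c0 <= t <= c1:
            hits |= query(child, t, iso)
    return hits
```

Because a coalesced range is a superset of every per-step range it covers, `query` may return a few extra candidates that a final per-cell check would discard. The storage saving comes from storing temporally stable cells once per span instead of once per time step.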

The goal is to develop a Unified Air-Sea Visualization System (UASVS) to enable the rapid fusion of observational, archival, and model data for verification and analysis. To design and develop UASVS, modelers were polled to determine the gridding structures and visualization systems used, and their needs with respect to visual analysis. A basic UASVS requirement is to allow a modeler to explore multiple data sets within a single environment, or to interpolate multiple datasets onto one unified grid. According to this survey, the UASVS should be able to visualize 3D scalar/vector fields; render isosurfaces; visualize arbitrary slices of the 3D data; visualize data defined on spectral element grids with the minimum number of interpolation stages; render contours; produce 3D vector plots and streamlines; provide unified visualization of satellite images, observations and model output overlays; display the visualization on a projection of the user's choice; implement functions so the user can derive diagnostic values; animate the data to see the time-evolution; animate ocean and atmosphere at different rates; store the record of cursor movement, smooth the path, and animate a window around the moving path; repeatedly start and stop the visual time-stepping; generate VHS tape animations; work on a variety of workstations; and allow visualization across clusters of workstations and scalable high performance computer systems.

The purpose of this work is to compare the speed of isosurface rendering in software with that using dedicated hardware. Input data consists of 10 different objects from various parts of the body and various modalities with a variety of surface sizes and shapes. The software rendering technique consists of a particular method of voxel-based surface rendering, called shell rendering. The hardware method is OpenGL-based and uses the surfaces constructed from our implementation of the 'Marching Cubes' algorithm. The hardware environment consists of a variety of platforms including a Sun Ultra I with a Creator3D graphics card and a Silicon Graphics Reality Engine II, both with polygon rendering hardware, and a 300 MHz Pentium PC. The results indicate that the software method was 18 to 31 times faster than any of the hardware rendering methods. This work demonstrates that a software implementation of a particular rendering algorithm can outperform dedicated hardware. We conclude that for medical surface visualization, expensive dedicated hardware engines are not required. More importantly, available software algorithms on a 300 MHz Pentium PC outperform the speed of rendering via hardware engines by a factor of 18 to 31.

Current parallel supercomputers provide sufficient performance to simulate unsteady three-dimensional fluid dynamics in high resolution. However, the visualization of the huge amounts of result data cannot be handled by traditional methods, where post-processing modules are usually coupled to the raw data source, either by files or by data flow. To avoid significant bottlenecks of the storage and communication resources, efficient techniques for data extraction and preprocessing at the source have been realized in the parallel, network-distributed chain of our Distributed Simulation and Virtual Reality Environment (DSVR). Here the 3D data extraction is implemented as a parallel library (libDVRP) and can be done in-situ during the numerical simulations, which avoids storing raw data for visualization altogether.

Until the introduction of non-invasive imaging techniques, the representation of anatomy and pathology relied solely on gross dissection and histological staining. Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) protocols allow for the clinical evaluation of anatomical images derived from complementary modalities, thereby increasing reliability of the diagnosis and the prognosis of disease. Despite the significant improvements in image contrast and resolution of MRI, autopsy and classical histopathological analysis are still indispensable for the correct diagnosis of specific disease. It is therefore important to be able to correlate multiple images from different modalities, in vivo and postmortem, in order to validate non-invasive imaging markers of disease. To that effect, we have developed a methodological pipeline and a visualization environment that allow for the concurrent observation of both macroscopic and microscopic image data relative to the same patient. We describe these applications and sample data relative to the study of the anatomy and disease of the Central Nervous System (CNS). The brain is approached as an organ with a complex 3-dimensional (3-D) architecture that can only be effectively studied combining observation and analysis at the system level as well as at the cellular level. Our computational and visualization environment allows seamless navigation through multiple layers of neurological data that are accessible quickly and simultaneously. PMID:19377104

The behavior of scalar iso-surfaces in turbulent flows is of fundamental interest and also of importance in certain applications, e.g., the stoichiometric surface in nonpremixed, turbulent reacting flows. Of particular interest is the average area per unit volume of the surface, Σ. We report on the use of direct numerical simulations to directly compute Σ and to model its evolution in time for the case of isotropic turbulence. Using both a direct measurement technique and Corrsin's (1955) suggestion of surface-crossing, we find the iso-surface in space and also measure Σ as the surface evolves in time. This allows us to follow the growth of the surface due to local surface stretching and its ultimate decrease due to molecular destruction. We are also able to measure the principal terms in the evolution equation for Σ, including the surface stretching term S and the molecular destruction term M. For example, for the scalar Z we find that its spatial derivative quantities are approximately statistically independent of Z itself, so that S and M are approximately statistically independent of Z as well. Finally, a model is proposed which fairly accurately predicts the evolution of Σ. Supported by NSF Grant No. OCI-0749200.

Stackless traversal techniques are often used to circumvent memory bottlenecks by avoiding a stack and replacing return traversal with extra computation. This paper addresses whether the stackless traversal approaches are useful on newer hardware and technology (such as CUDA). To this end, we present a novel stackless approach for implicit kd-trees, which exploits the benefits of index-based node traversal, without incurring extra node visitation. This approach, which we term Kd-Jump, enables the traversal to immediately return to the next valid node, like a stack, without incurring extra node visitation (kd-restart). Also, Kd-Jump does not require global memory (stack) at all and only requires a small matrix in fast constant-memory. We report that Kd-Jump outperforms a stack by 10 to 20% and kd-restart by 100%. We also present a Hybrid Kd-Jump, which utilizes a volume stepper for leaf testing and a run-time depth threshold to define where kd-tree traversal stops and volume-stepping occurs. By using both methods, we gain the benefits of empty space removal, fast texture-caching and realtime ability to determine the best threshold for current isosurface and view direction. PMID:19834233

Tags are non-invasive features induced in the heart muscle that enable the tracking of heart motion. Each tag line, in fact, corresponds to a 3D tag surface that deforms with the heart muscle during the cardiac cycle. Tracking the deformation of tag surfaces is useful for the analysis of left ventricular motion. Cardiac material markers (Kerwin et al, MIA, 1997) can be obtained from the intersections of orthogonal surfaces, which can be reconstructed from short- and long-axis tagged images. The proposed method uses the Harmonic Phase (HARP) method for tracking tag lines corresponding to a specific harmonic phase value, and then the reconstruction of grid tag surfaces is achieved by a Delaunay triangulation-based interpolation for sparse tag points. Having three different tag orientations from short- and long-axis images, the proposed method showed the deformation of 3D tag surfaces during the cardiac cycle. Previous work on tag surface reconstruction was restricted to the "dark" tag lines; however, the use of HARP as proposed enables the reconstruction of isosurfaces based on their harmonic phase values. The use of HARP also provides a fast and accurate way of identifying and tracking tag lines, and hence of generating the surfaces.

In this paper we describe the application of folding measures to tracking in vivo cortical brain development in premature neonatal brain anatomy. The outer gray matter and the gray-white matter interface surfaces were extracted from semi-interactively segmented high-resolution T1 MRI data. Nine curvature- and geometric descriptor-based folding measures were applied to six premature infants, aged 28-37 weeks, using a direct voxelwise iso-surface representation. We have shown that using such an approach it is feasible to extract meaningful surfaces of adequate quality from typical clinically acquired neonatal MRI data. We have shown that most of the folding measures, including a new proposed measure, are sensitive to changes in age and therefore applicable in developing a model that tracks development in premature infants. For the first time gyrification measures have been computed on the gray-white matter interface and on cases whose age is representative of a period of intense brain development.

terabytes. The combination of different data sources (e.g., MOLA, HRSC, HiRISE) and selection of presented data (e.g., infrared, spectral, imagery) is also supported. Furthermore, the data is presented unchanged and with the highest possible resolution for the target setup (e.g., power-wall, workstation, laptop) and view distance. The visualization techniques for the volumetric data sets can handle VTK [6] based data sets and also support different grid types as well as a time component. In detail, the integrated volume rendering uses a GPU based ray casting algorithm which was adapted to work in spherical coordinate systems. This approach results in interactive frame-rates without compromising visual fidelity. Besides direct visualization via volume rendering the prototype supports interactive slicing, extraction of iso-surfaces and probing. The latter can also be used for side-by-side comparison and on-the-fly diagram generation within the application. Similarly to the surface data, a combination of different data sources is supported as well. For example, the extracted iso-surface of a scalar pressure field can be used for the visualization of the temperature. The software development is supported by the ViSTA VR-toolkit [7] and supports different target systems as well as a wide range of VR-devices. Furthermore, the prototype is scalable to run on laptops, workstations and cluster setups. REFERENCES [1] A. S. Garcia, D. J. Roberts, T. Fernando, C. Bar, R. Wolff, J. Dodiya, W. Engelke, and A. Gerndt, "A collaborative workspace architecture for strengthening collaboration among space scientists," in IEEE Aerospace Conference, (Big Sky, Montana, USA), 7-14 March 2015. [2] W. Engelke, "Mars Cartography VR System 2/3." German Aerospace Center (DLR), 2015. Project Deliverable D4.2. [3] E. Hivon, F. K. Hansen, and A. J. Banday, "The healpix primer," arXiv preprint astro-ph/9905275, 1999. [4] K. M. Gorski, E. Hivon, A. Banday, B. D. Wandelt, F. K. Hansen, M. Reinecke, and M

Characterization of the earth's subsurface involves the construction of 3D models from sparse data and so leads to simulation results that involve some degree of uncertainty. This uncertainty is often neglected in the subsequent visualization, due to the fact that no established methods or available software exist. We describe a visualization method to render scalar fields with a probability density function at each data point. We render these data as isosurfaces and make use of a colour scheme, which intuitively gives the viewer an idea of which parts of the surface are more reliable than others. We further show how to extract an envelope that indicates within which volume the isosurface will lie with a certain confidence, and augment the isosurfaces with additional geometry in order to show this information. The resulting visualization is easy and intuitive to understand and is suitable for rendering multiple distinguishable isosurfaces at a time. It can moreover be easily used together with other visualized objects, such as the geological context. Finally we show how we have integrated this into a visualization pipeline that is based on the Visualization Toolkit (VTK) and the open source scenegraph OpenSG, allowing us to render the results on a desktop and in different kinds of virtual environments.
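One way to picture the per-point uncertainty and the confidence envelope described above is the following Python sketch. It assumes, purely for illustration, that the probability density at each data point is Gaussian with a known mean and standard deviation; the abstract does not state the distribution, and the function names and the envelope criterion are hypothetical.

```python
import math


def exceed_probability(mean, std, iso):
    """Probability that a Gaussian-distributed value lies above the isovalue."""
    if std <= 0.0:
        # No uncertainty: the value is simply above or not above the isovalue.
        return 1.0 if mean > iso else 0.0
    return 0.5 * (1.0 - math.erf((iso - mean) / (std * math.sqrt(2.0))))


def in_envelope(mean, std, iso, confidence=0.95):
    """True if the point cannot be classified as above or below the
    isovalue at the requested confidence, i.e. the isosurface may pass
    through it."""
    alpha = (1.0 - confidence) / 2.0
    p = exceed_probability(mean, std, iso)
    return alpha <= p <= 1.0 - alpha
```

Evaluating `in_envelope` at every grid point marks a volume around the mean isosurface; under this assumption, the local standard deviation sampled on the surface could drive a reliability colour map.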

GRIZ is a general-purpose post-processing application supporting interactive visualization of finite element analysis results on unstructured grids. In addition to basic pseudocolor renderings of state variables over the mesh surface, GRIZ provides modern visualization techniques such as isocontours and isosurfaces, cutting planes, vector field display, and particle traces. GRIZ accepts both command-line and mouse-driven input, and is portable to virtually any UNIX platform which provides Motif and OpenGL libraries.

An interactive visualization system pV3 is being developed for the investigation of advanced computational methodologies employing visualization and parallel processing for the extraction of information contained in large-scale transient engineering simulations. Visual techniques for extracting information from the data in terms of cutting planes, iso-surfaces, particle tracing and vector fields are included in this system. This paper discusses improvements to the pV3 system developed under NASA's Affordable High Performance Computing project.

and methods, we are developing a stand-alone post-processor, adding further data structures and mapping algorithms, and cooperating with the ICON developers and users. With the implementation of a DSVR-based post-processor, a milestone was achieved. By using the DSVR post-processor, the three processes mentioned are completely separated: the data set is processed in batch mode, e.g. on the same supercomputer on which the data is generated, and the interactive 3D rendering is done afterwards on the scientist's local system. In its current state of implementation, the DSVR post-processor supports the generation of isosurfaces and colored slicers on volume data set time series based on rectilinear grids, as well as the visualization of pathlines on time-varying flow fields based on either rectilinear grids or prism grids. The software implementation and evaluation are done on the supercomputers at DKRZ, including scalability tests using ICON output files in NetCDF format. The next milestones will be (a) the in-situ integration of the DSVR library in the ICON model and (b) the implementation of an isosurface algorithm for prism grids.

Muster is a framework for scalable cluster analysis. It includes implementations of classic K-Medoids partitioning algorithms, as well as infrastructure for making these algorithms run scalably on very large systems. In particular, Muster contains algorithms such as CAPEK (described in reference 1) that are capable of clustering highly distributed data sets in-place on a hundred thousand or more processes.
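For readers unfamiliar with K-Medoids, the following Python sketch shows a simple alternating variant: assign each point to its nearest medoid, then pick the member of each cluster with the lowest total distance to the others. It is not Muster's API and not CAPEK; the function names, the index-based medoid representation, and the caller-supplied `dist` function are assumptions for this example.

```python
def assign(points, medoids, dist):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: dist(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, initial_medoids, dist, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing.

    initial_medoids -- indices into points (assumed to be distinct points)
    """
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        clusters = assign(points, medoids, dist)
        updated = []
        for m, members in clusters.items():
            if not members:
                updated.append(m)  # keep an empty cluster's medoid as is
                continue
            # New medoid: the member with the smallest summed distance
            # to every other member of its cluster.
            updated.append(min(
                members,
                key=lambda c: sum(dist(points[c], points[o]) for o in members)))
        if sorted(updated) == sorted(medoids):
            break
        medoids = updated
    return sorted(medoids), assign(points, medoids, dist)
```

Unlike K-Means, the cluster centres are always actual data points, so only a pairwise distance function is needed.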

This case study presents initial results from research targeted at the development of cost-effective scalable visualization and rendering technologies. The implementations of two 3D graphics libraries based on the popular sort-last and sort-middle parallel rendering techniques are discussed. An important goal of these implementations is to provide scalable rendering capability for extremely large datasets (>> 5 million polygons). Applications can use these libraries for either run-time visualization, by linking to an existing parallel simulation, or for traditional post-processing by linking to an interactive display program. The use of parallel, hardware-accelerated rendering on commodity hardware is leveraged to achieve high performance. Current performance results show that, using current hardware (a small 16-node cluster), the libraries can utilize up to 85% of the aggregate graphics performance and achieve rendering rates in excess of 20 million polygons/second using OpenGL® with lighting, Gouraud shading, and individually specified triangles (not t-stripped).

To help in interpreting the polarity of a molecule, charge separation can be visualized by mapping the electrostatic potential at the van der Waals surface using a color gradient or by indicating positive and negative regions of the electrostatic potential using different colored isosurfaces. Although these visualizations capture the molecular…

This report contains an algorithm for decomposing higher-order finite elements into regions appropriate for isosurfacing and proves the conditions under which the algorithm will terminate. Finite elements are used to create piecewise polynomial approximants to the solution of partial differential equations for which no analytical solution exists. These polynomials represent fields such as pressure, stress, and momentum. In the past, these polynomials have been linear in each parametric coordinate. Each polynomial coefficient must be uniquely determined by a simulation, and these coefficients are called degrees of freedom. When there are not enough degrees of freedom, simulations will typically fail to produce a valid approximation to the solution. Recent work has shown that increasing the number of degrees of freedom by increasing the order of the polynomial approximation (instead of increasing the number of finite elements, each of which has its own set of coefficients) can allow some types of simulations to produce a valid approximation with many fewer degrees of freedom than increasing the number of finite elements alone. However, once the simulation has determined the values of all the coefficients in a higher-order approximant, tools do not exist for visual inspection of the solution. This report focuses on a technique for the visual inspection of higher-order finite element simulation results based on decomposing each finite element into simplicial regions where existing visualization algorithms such as isosurfacing will work. The requirements of the isosurfacing algorithm are enumerated and related to the places where the partial derivatives of the polynomial become zero. The original isosurfacing algorithm is then applied to each of these regions in turn.

Sandia Scalable Encryption Library (SSEL) Version 1.0 is a library of functions that implement Sandia's scalable encryption algorithm. This algorithm is used to encrypt Asynchronous Transfer Mode (ATM) data traffic, and is capable of operating on an arbitrary number of bits at a time (which permits scaling via parallel implementations), while being interoperable with differently scaled versions of this algorithm. The routines in this library implement 8-bit and 32-bit versions of a non-linear mixer which is compatible with Sandia's hardware-based ATM encryptor.

A finite frame is said to be scalable if its vectors can be rescaled so that the resulting set of vectors is a tight frame. The theory of scalable frames has been extended to the setting of Laplacian pyramids, which are based on (rectangular) paraunitary matrices whose column vectors are Laurent polynomial vectors. This is equivalent to scaling the polyphase matrices of the associated filter banks. Consequently, tight wavelet frames can be constructed by appropriately scaling the columns of these paraunitary matrices by diagonal matrices whose diagonal entries are squared magnitudes of Laurent polynomials. In this paper we present examples of tight wavelet frames constructed in this manner and discuss some of their properties in comparison to the (non-tight) wavelet frames they arise from.
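The definition in the first sentence can be illustrated in finite dimensions. The sketch below is plain Python with hypothetical helper names (`frame_operator`, `is_tight`, `rescale`) and a made-up frame in R^2; none of it comes from the paper. It uses the standard fact that a frame is tight exactly when its frame operator S = sum of f_i f_i^T is a positive multiple of the identity.

```python
def frame_operator(vectors):
    """Return S = sum_i f_i f_i^T as a nested list, for real vectors."""
    dim = len(vectors[0])
    S = [[0.0] * dim for _ in range(dim)]
    for f in vectors:
        for r in range(dim):
            for c in range(dim):
                S[r][c] += f[r] * f[c]
    return S


def is_tight(vectors, tol=1e-12):
    """A frame is tight when S equals A * I for some constant A > 0."""
    S = frame_operator(vectors)
    A = S[0][0]
    if A <= tol:
        return False
    n = len(S)
    return all(
        abs(S[r][c] - (A if r == c else 0.0)) <= tol
        for r in range(n)
        for c in range(n)
    )


def rescale(vectors, scales):
    """Multiply each frame vector by its own non-negative scale factor."""
    return [[s * x for x in f] for f, s in zip(vectors, scales)]


# Illustrative frame (an assumption, not from the paper): not tight as given,
# because its frame operator is diag(3, 6).
frame = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [1.0, -1.0]]
# Halving the second vector gives frame operator diag(3, 3), so the frame is scalable.
scales = [1.0, 0.5, 1.0, 1.0]
```

Here `is_tight(frame)` is false while `is_tight(rescale(frame, scales))` is true. Finding suitable scales for a general frame is a separate problem that this sketch does not address.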

Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongoing, dynamic load balancing in order to maintain efficiency. While effective at small scale, centralized load balancing schemes quickly become a bottleneck on large-scale clusters. Work stealing is a popular approach to distributed dynamic load balancing; however its performance on large-scale clusters is not well understood. Prior work on work stealing has largely focused on shared memory machines. In this work we investigate the design and scalability of work stealing on modern distributed memory systems. We demonstrate high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.

Atomic orbitals are a theme throughout the undergraduate chemistry curriculum, and visualizing them has been a theme in this journal. Contour plots as isosurfaces or contour lines in a plane are the most familiar representations of the hydrogen wave functions. In these representations, a surface of a fixed value of the wave function ψ is plotted…

The rich history of scalable computing research owes much to a rapid rise in computing platform scale in terms of size and speed. As platforms evolve, so must algorithms and the software expressions of those algorithms. Unbridled growth in scale inevitably leads to complexity. This special issue grapples with two facets of this complexity: scalable execution and scalable development. The former results from efficient programming of novel hardware with increasing numbers of processing units (e.g., cores, processors, threads or processes). The latter results from efficient development of robust, flexible software with increasing numbers of programming units (e.g., procedures, classes, components or developers). The progression in the above two parenthetical lists goes from the lowest levels of abstraction (hardware) to the highest (people). This issue's theme encompasses this entire spectrum. The lead author of each article resides in the Scalable Computing Research and Development Department at Sandia National Laboratories in Livermore, CA. Their co-authors hail from other parts of Sandia, other national laboratories and academia. Their research sponsors include several programs within the Department of Energy's Office of Advanced Scientific Computing Research and its National Nuclear Security Administration, along with Sandia's Laboratory Directed Research and Development program and the Office of Naval Research. The breadth of interests of these authors and their customers reflects in the breadth of applications this issue covers. This article demonstrates how to obtain scalable execution on the increasingly dominant high-performance computing platform: a Linux cluster with multicore chips. The authors describe how deep memory hierarchies necessitate reducing communication overhead by using threads to exploit shared register and cache memory. On a matrix-matrix multiplication problem, they achieve up to 96% parallel efficiency with a three-part strategy: intra

terabytes. The combination of different data sources (e.g., MOLA, HRSC, HiRISE) and selection of presented data (e.g., infrared, spectral, imagery) is also supported. Furthermore, the data is presented unchanged and with the highest possible resolution for the target setup (e.g., power-wall, workstation, laptop) and view distance. The visualization techniques for the volumetric data sets can handle VTK [6] based data sets and also support different grid types as well as a time component. In detail, the integrated volume rendering uses a GPU based ray casting algorithm which was adapted to work in spherical coordinate systems. This approach results in interactive frame-rates without compromising visual fidelity. Besides direct visualization via volume rendering the prototype supports interactive slicing, extraction of iso-surfaces and probing. The latter can also be used for side-by-side comparison and on-the-fly diagram generation within the application. Similarly to the surface data, a combination of different data sources is supported as well. For example, the extracted iso-surface of a scalar pressure field can be used for the visualization of the temperature. The software development is supported by the ViSTA VR-toolkit [7] and supports different target systems as well as a wide range of VR-devices. Furthermore, the prototype is scalable to run on laptops, workstations and cluster setups.

The Scalable Analysis Toolkit (SAT) project aimed to demonstrate that it is feasible and useful to statically detect software bugs in very large systems. The technical focus of the project was on a relatively new class of constraint-based techniques for analysis software, where the desired facts about programs (e.g., the presence of a particular bug) are phrased as constraint problems to be solved. At the beginning of this project, the most successful forms of formal software analysis were limited forms of automatic theorem proving (as exemplified by the analyses used in language type systems and optimizing compilers), semi-automatic theorem proving for full verification, and model checking. With a few notable exceptions these approaches had not been demonstrated to scale to software systems of even 50,000 lines of code. Realistic approaches to large-scale software analysis cannot hope to make every conceivable formal method scale. Thus, the SAT approach is to mix different methods in one application by using coarse and fast but still adequate methods at the largest scales, and reserving the use of more precise but also more expensive methods at smaller scales for critical aspects (that is, aspects critical to the analysis problem under consideration) of a software system. The principled method proposed for combining a heterogeneous collection of formal systems with different scalability characteristics is mixed constraints. This idea had been used previously in small-scale applications with encouraging results: using mostly coarse methods and narrowly targeted precise methods, useful information (meaning the discovery of bugs in real programs) was obtained with excellent scalability.

A way of designing a scalable optical quantum computer based on the photon echo effect is proposed. Individual rare earth ions Pr³⁺, regularly located in the lattice of the orthosilicate (Y₂SiO₅) crystal, are suggested to be used as optical qubits. Operations with qubits are performed using coherent and incoherent laser pulses. The operation protocol includes both the method of measurement-based quantum computations and the technique of optical computations. Modern hybrid photon echo protocols, which provide a sufficient quantum efficiency when reading recorded states, are considered as most promising for quantum computations and communications.

necessarily entirely exposed to scientists writing visualization queries, facilitates the automated construction of visualization pipelines. VisKo queries have been successfully used in support of visualization scenarios from Earth Science domains, including velocity model isosurfaces, gravity data rasters, and contour map renderings. The synergistic environment provided by our CYBER-ShARE initiative at the University of Texas at El Paso has allowed us to work closely with Earth Science experts, who have provided both our test data and validation of whether executing VisKo queries returns visualizations that can be used for data analysis. Additionally, we have employed VisKo queries to support visualization scenarios associated with Giovanni, an online platform for data analysis developed by NASA GES DISC. VisKo-enhanced visualizations included time series plotting of aerosol data as well as contour and raster map generation of gridded brightness-temperature data.

In this paper we present a new technology that we are currently developing within the SFT (Scalable Fault Tolerance) FastOS project, which seeks to implement fault tolerance at the operating system level. Major design goals include dynamic reallocation of resources to allow continuing execution in the presence of hardware failures, very high scalability, high efficiency (low overhead), and transparency, requiring no changes to user applications. Our technology is based on a global coordination mechanism that enforces transparent recovery lines in the system, and on TICK, a lightweight, incremental checkpointing software architecture implemented as a Linux kernel module. TICK is completely user-transparent and does not require any changes to user code or system libraries; it is highly responsive: an interrupt, such as a timer interrupt, can trigger a checkpoint in as little as 2.5 μs; and it supports incremental and full checkpoints with minimal overhead: less than 6% with full checkpointing to disk performed as frequently as once per minute.

In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are: • Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node. • Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently. • Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain. • Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition. These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.

In a massively parallel computing system having a plurality of nodes configured in m multi-dimensions, each node including a computing device, a method for routing packets towards their destination nodes is provided which includes generating at least one of a 2m plurality of compact bit vectors containing information derived from downstream nodes. A multilevel arbitration process in which downstream information stored in the compact vectors, such as link status information and fullness of downstream buffers, is used to determine a preferred direction and virtual channel for packet transmission. Preferred direction ranges are encoded and virtual channels are selected by examining the plurality of compact bit vectors. This dynamic routing method eliminates the necessity of routing tables, thus enhancing scalability of the switch.

Synthetic biology is focused on engineering biological organisms to study natural systems and to provide new solutions for pressing medical, industrial and environmental problems. At the core of engineered organisms are synthetic biological circuits that execute the tasks of sensing inputs, processing logic and performing output functions. In the last decade, significant progress has been made in developing basic designs for a wide range of biological circuits in bacteria, yeast and mammalian systems. However, significant challenges in the construction, probing, modulation and debugging of synthetic biological systems must be addressed in order to achieve scalable higher-complexity biological circuits. Furthermore, concomitant efforts to evaluate the safety and biocontainment of engineered organisms and address public and regulatory concerns will be necessary to ensure that technological advances are translated into real-world solutions. PMID:21468204

The project has two goals: (1) build a high-performance computer; and (2) create a tool to monitor node applications in the Component Based Tool Framework (CBTF) using code from the Lightweight Data Metric Service (LDMS). The project matters because (1) there is a need for a scalable, parallel tool to monitor nodes on clusters; and (2) new LDMS plugins need to be easy to add to the tool. CBTF is scalable and adjusts to different topologies automatically. It uses the MRNet (Multicast/Reduction Network) mechanism for information transport. CBTF is flexible and general enough to be used for any tool that needs to do a task on many nodes, and its components are reusable and easily added to a new tool. There are three levels of CBTF: (1) the frontend node, which interacts with users; (2) filter nodes, which filter or concatenate information from backend nodes; and (3) backend nodes, where the actual work of the tool is done. LDMS is a tool used for monitoring nodes. Ltool is the name of the tool we derived from LDMS. It is dynamically linked and includes the following components: Vmstat, Meminfo, Procinterrupts and more. It works as follows: the Ltool command is run on the frontend node; Ltool collects information from the backend nodes; backend nodes send information to the filter nodes; and filter nodes concatenate information and send it to a database on the frontend node. Ltool is useful for monitoring nodes on a cluster because the overhead of running it is not particularly high and it automatically scales to any cluster size.

A decoder was developed that decodes a serial concatenated pulse position modulation (SCPPM) encoded information sequence. The decoder takes as input a sequence of four-bit log-likelihood ratios (LLR) for each PPM slot in a codeword via a XAUI 10-Gb/s quad optical fiber interface. If the decoder is unavailable, it passes the LLRs on to the next decoder via a XAUI 10-Gb/s quad optical fiber interface. Otherwise, it decodes the sequence and outputs information bits through a 1-GB/s Ethernet UDP/IP (User Datagram Protocol/Internet Protocol) interface. The throughput for a single decoder unit is 150-Mb/s at an average of four decoding iterations; by connecting a number of decoder units in series, a decoding rate equal to that of the aggregate rate is achieved. The unit is controlled through a 1-GB/s Ethernet UDP/IP interface. This ground station decoder was developed to demonstrate a deep space optical communication link capability, and is unique in the scalable design to achieve real-time SCPPM decoding at the aggregate data rate.

Current high-performance computers and advanced image processing capabilities have made three-dimensional visualization of biomedical computed tomography (CT) images a great aid to research in biomedical engineering. To keep pace with current Internet technology, in which 3D data are typically stored and processed on powerful servers accessible over TCP/IP, we aim to make isosurface results generally applicable in medical visualization. Furthermore, this project is a future part of the PACS system our lab is working on. The system therefore uses the 3D file format VRML 2.0, which allows 3D models to be manipulated through a Web interface. The program generates and modifies triangular isosurface meshes with the marching cubes algorithm, and uses OpenGL and MFC to render the isosurface and manipulate voxel data. The software provides a more adequate visualization of volumetric data. Its drawbacks are that 3D image processing on personal computers is rather slow and the set of tools for 3D visualization is limited. However, these limitations have not affected the applicability of this platform to the tasks needed in elementary laboratory experiments or data preprocessing.
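For readers unfamiliar with marching cubes, the sketch below shows two of its core steps in Python: classifying a cell by which corners lie below the isovalue, and placing a mesh vertex on a crossed edge by linear interpolation. The function names are hypothetical and this is not the authors' code. The full algorithm also needs the 256-entry triangle lookup table, which is omitted here, and the "below the isovalue sets the bit" convention is one common choice among several.

```python
def cube_case_index(corner_values, isovalue):
    """Build the 8-bit case index: bit i is set if corner i is below isovalue."""
    index = 0
    for i, value in enumerate(corner_values):
        if value < isovalue:
            index |= 1 << i
    return index


def interpolate_edge(p0, p1, v0, v1, isovalue):
    """Return the point on edge p0-p1 where the field equals isovalue.

    p0, p1 are (x, y, z) corner positions and v0, v1 the scalar values there.
    Assumes the isovalue lies between v0 and v1, i.e. the edge is crossed.
    """
    if v0 == v1:
        # Degenerate edge with no unique crossing: fall back to the midpoint.
        t = 0.5
    else:
        t = (isovalue - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))


# Illustrative cell: four corners at value 0 and four at value 10.
case = cube_case_index([0, 0, 0, 0, 10, 10, 10, 10], isovalue=5)
vertex = interpolate_edge((0, 0, 0), (1, 0, 0), 0.0, 10.0, isovalue=2.5)
```

A case index of 0 or 255 means the cell lies entirely on one side of the isosurface and contributes no triangles.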

Current high-performance computers and advanced image processing capabilities have made three-dimensional visualization of biomedical images a great aid to research in biomedical engineering. To keep pace with current Internet technology, in which 3-D data are typically stored and processed on powerful servers accessible over TCP/IP, we aimed to make isosurface results generally applicable in medical visualization. The system therefore uses the 3-D file format VRML 2.0, which allows 3-D models to be manipulated through a Web interface. The program generates and modifies triangular isosurface meshes with the marching cubes algorithm, and uses OpenGL and MFC to render the isosurface and manipulate voxel data. The software provides a more adequate visualization of volumetric data. Its drawbacks are that 3-D image processing on personal computers is rather slow and the set of tools for 3-D visualization is limited. However, these limitations have not affected the applicability of this platform to the tasks needed in elementary laboratory experiments or data preprocessing. With the help of OCT and MPE scanning image systems, by applying these techniques to the visualization of the rabbit brain and constructing data sets of hierarchical subdivisions of the cerebral information, we can establish a virtual environment on the World Wide Web for rabbit brain research from gross anatomy down to tissue and cellular levels of detail, providing graphical modeling and information management of both the outer and the inner space of the rabbit brain.

This paper presents a scalable framework for real-time raycasting of large unstructured volumes that employs a hybrid bricking approach. It adaptively combines original unstructured bricks in important (focus) regions, with structured bricks that are resampled on demand in less important (context) regions. The basis of this focus+context approach is interactive specification of a scalar degree of interest (DOI) function. Thus, rendering always considers two volumes simultaneously: a scalar data volume, and the current DOI volume. The crucial problem of visibility sorting is solved by raycasting individual bricks and compositing in visibility order from front to back. In order to minimize visual errors at the grid boundary, it is always rendered accurately, even for resampled bricks. A variety of different rendering modes can be combined, including contour enhancement. A very important property of our approach is that it supports a variety of cell types natively, i.e., it is not constrained to tetrahedral grids, even when interpolation within cells is used. Moreover, our framework can handle multi-variate data, e.g., multiple scalar channels such as temperature or pressure, as well as time-dependent data. The combination of unstructured and structured bricks with different quality characteristics such as the type of interpolation or resampling resolution in conjunction with custom texture memory management yields a very scalable system. PMID:17968114

The engineering analysis community at Sandia National Laboratories uses a number of internal and commercial software codes and tools, including mesh generators, preprocessors, mesh manipulators, simulation codes, post-processors, and visualization packages. We define an analysis workflow as the execution of an ordered, logical sequence of these tools. Various forms of analysis (and in particular, methodologies that use multiple function evaluations or samples) involve executing parameterized variations of these workflows. As part of the DART project, we are evaluating various commercial workflow management systems, including iSIGHT-FD from Engineous. This report documents the results of a scalability test that was driven by DAKOTA and conducted on a parallel computer (Thunderbird). The purpose of this experiment was to examine the suitability and performance of iSIGHT-FD for large-scale, parameterized analysis workflows. As the results indicate, we found iSIGHT-FD to be suitable for this type of application.

Understanding vector fields resulting from large scientific simulations is an important and often difficult task. Streamlines, curves that are tangential to a vector field at each point, are a powerful visualization method in this context. Application of streamline-based visualization to very large vector field data represents a significant challenge due to the non-local and data-dependent nature of streamline computation, and requires careful balancing of computational demands placed on I/O, memory, communication, and processors. In this paper we review two parallelization approaches based on established parallelization paradigms (static decomposition and on-demand loading) and present a novel hybrid algorithm for computing streamlines. Our algorithm is aimed at good scalability and performance across the widely varying computational characteristics of streamline-based problems. We perform performance and scalability studies of all three algorithms on a number of prototypical application problems and demonstrate that our hybrid scheme is able to perform well in different settings.
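The data-dependent nature of streamline computation is easy to see in a serial sketch: each integration step needs the field value at a location that is only known once the previous step has finished. The Python below traces a streamline with a fourth-order Runge-Kutta integrator. The integrator choice, the function names, and the analytic test field are assumptions for illustration; the abstract does not say which integrator the authors use, and none of the parallelization strategies is shown.

```python
import math


def rk4_step(field, point, h):
    """Advance `point` by one RK4 step of size h along the vector field."""
    def shifted(p, k, scale):
        return tuple(pi + scale * ki for pi, ki in zip(p, k))

    k1 = field(point)
    k2 = field(shifted(point, k1, h / 2))
    k3 = field(shifted(point, k2, h / 2))
    k4 = field(shifted(point, k3, h))
    return tuple(
        p + h / 6 * (a + 2 * b + 2 * c + d)
        for p, a, b, c, d in zip(point, k1, k2, k3, k4)
    )


def trace_streamline(field, seed, h, steps):
    """Return the list of points visited, starting at `seed`."""
    points = [seed]
    for _ in range(steps):
        # The next sample location depends on the previous result.
        points.append(rk4_step(field, points[-1], h))
    return points


def rotation_field(p):
    """Example field v(x, y) = (-y, x); its streamlines are circles."""
    x, y = p
    return (-y, x)


line = trace_streamline(rotation_field, (1.0, 0.0), 0.01, 100)
radius = math.hypot(*line[-1])  # stays close to 1 for this field
```

In a real data set `field` would interpolate from stored grid blocks, so which blocks must be loaded is not known in advance; that is the difficulty the static-decomposition and on-demand-loading approaches handle differently.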

Libra is a tool for scalable analysis of load balance data from all processes in a parallel application. Libra contains an instrumentation module that collects model data from parallel applications and a parallel compression mechanism that uses distributed wavelet transforms to gather load balance model data in a scalable fashion. Data is output to files, and these files can be viewed in a GUI tool by Libra users. The GUI tool associates particular load balance data with regions of code, enabling users to view the load balance properties of distributed "slices" of their application code.

We report a demonstration of the scalability of optically transparent xenon in the solid phase for use as a particle detector above a kilogram scale. We employed a cryostat cooled by liquid nitrogen combined with a xenon purification and chiller system. A modified Bridgman technique reproduces large-scale, optically transparent solid xenon.

Scalability has been used extensively as a de facto performance criterion for evaluating parallel algorithms and architectures. However, for many, scalability has theoretical interests only since it does not reveal execution time. In this paper, the relation between scalability and execution time is carefully studied. Results show that the isospeed scalability well characterizes the variation of execution time: smaller scalability leads to larger execution time, the same scalability leads to the same execution time, etc. Three algorithms from scientific computing are implemented on an Intel Paragon and an IBM SP2 parallel computer. Experimental and theoretical results show that scalability is an important, distinct metric for parallel and distributed systems, and may be as important as execution time in a scalable parallel and distributed environment.
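The link between isospeed scalability and execution time can be worked through numerically. The sketch below assumes the definition commonly given in the isospeed literature, ψ(p, p') = (p' · W) / (p · W'), where W' is the work needed on p' processors to keep the average per-processor speed achieved with work W on p processors. That formula and all function names are assumptions here, since the abstract itself does not state them.

```python
def average_speed(work, time, processors):
    """Average unit speed: achieved speed divided by the processor count."""
    return work / (time * processors)


def isospeed_scalability(p, work, p_scaled, work_scaled):
    """psi(p, p') = (p' * W) / (p * W').

    `work_scaled` (W') is the work required on `p_scaled` processors to keep
    the average unit speed measured with `work` on `p` processors.
    """
    return (p_scaled * work) / (p * work_scaled)


def scaled_time(time, psi):
    """Execution time of the scaled run when the average speed is held fixed.

    With equal average speed, T' / T = (W' * p) / (W * p') = 1 / psi.
    """
    return time / psi


# Hypothetical numbers: doubling processors requires 2.5x the work.
psi = isospeed_scalability(p=4, work=1000, p_scaled=8, work_scaled=2500)
t_scaled = scaled_time(10.0, psi)
```

With these made-up numbers ψ is 0.8 and the scaled run takes 12.5 time units instead of 10, which matches the abstract's statement that smaller scalability leads to larger execution time. An ideal ψ of 1 leaves the time unchanged.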

The rapidly increasing volume and complexity of MG&G data, and the growing demand from funding agencies and the user community that it be easily accessible, demand that we improve our approach to data management in order to reach a broader user-base and operate more efficiently and effectively. We have chosen an approach based on industry-standard relational database management systems (RDBMS) that use community-wide data specifications, where there is a clear and well-documented external interface that allows use of general purpose as well as customized clients. Rapid prototypes assembled with this approach show significant advantages over the traditional, custom-built data management systems that often use "in-house" legacy file formats, data specifications, and access tools. We have developed an effective database prototype based on a public domain RDBMS (PostgreSQL) and metadata standard (FGDC), and used it as a template for several ongoing MG&G database management projects, including ADGRAV (Antarctic Digital Gravity Synthesis), MARGINS, the Community Review system of the Digital Library for Earth Science Education, multibeam swath bathymetry metadata, and the R/V Maurice Ewing onboard acquisition system. By using standard formats and specifications, and working from a common prototype, we are able to reuse code and deploy rapidly. Rather than spend time on low-level details such as storage and indexing (which are built into the RDBMS), we can focus on high-level details such as documentation and quality control. In addition, because many commercial off-the-shelf (COTS) and public domain data browsers and visualization tools have built-in RDBMS support, we can focus on backend development and leave the choice of a frontend client(s) up to the end user. While our prototype is running under an open source RDBMS on a single processor host, the choice of standard components allows this implementation to scale to commercial RDBMS products and multiprocessor servers as

Despite the fact that human ability to perceive a high degree of realism is directly related to our ability to perceive depth accurately in a scene, most of the commonly used imaging and display technologies are able to provide only a 2D rendering of the 3D real world. Many current as well as emerging applications in areas of entertainment, remote operations, industry and medicine can benefit from the depth perception offered by stereoscopic video systems, which employ two views of a scene imaged under the constraints imposed by the human visual system. Among the many challenges to be overcome for practical realization and widespread use of 3D/stereoscopic systems are efficient techniques for digital compression of enormous amounts of data while maintaining compatibility with normal video decoding and display systems. After a brief discussion on the relationship of digital stereoscopic 3DTV with digital TV and HDTV, we present an overview of tools in the MPEG-2 video standard that are relevant to our discussion on compression of stereoscopic video, which is the main topic of this paper. Next, we determine ways in which temporal scalability concepts can be applied to exploit redundancies inherent between the two views of a scene comprising stereoscopic video. Due consideration is given to masking properties of stereoscopic vision to determine bandwidth partitioning between the two views to realize an efficient coding scheme while providing sufficient quality. Simulations are performed on stereoscopic video of normal TV resolution to compare the performance of the two temporal scalability configurations with each other and with the simulcast solution. Preliminary results are quite promising and indicate that the configuration that exploits motion and disparity compensation significantly outperforms the one that exploits disparity compensation alone. Compression of both views of stereo video of normal TV resolution appears feasible in a total of 8 or 9 Mbit/s. Finally

We discuss and present search strategies for finding new thermoelectric compositions based on first principles electronic structure and transport calculations. We illustrate them by application to a search for potential n-type oxide thermoelectric materials. This includes a screen based on visualization of electronic energy isosurfaces. We report compounds that show potential as thermoelectric materials along with detailed properties, including SrTiO3, which is a known thermoelectric, and appropriately doped KNbO3 and rutile TiO2.

IP multicast has proved infeasible to deploy, whereas Application Layer Multicast (ALM), which is based on end-system multicast, is practical and more scalable than IP multicast on the Internet. In this paper, an ALM protocol called Scalable multicast for High Definition streaming media (SHD) is proposed, in which end-to-end transmission capability is fully exploited for HD media transmission without much additional control overhead. Similar to the transmission style of BitTorrent, hosts forward only part of the data pieces according to the available bandwidth, which greatly improves bandwidth usage. In addition, some novel strategies are adopted to overcome the disadvantages of the BitTorrent protocol in streaming media transmission. In most circumstances, data transmission between hosts follows a many-to-one style within a hierarchical architecture. Simulations on an Internet-like topology indicate that SHD achieves low link stress, low end-to-end latency, and good stability.

The Scalable Tools Communication Infrastructure (STCI) is an open source collaborative effort intended to provide high-performance, scalable, resilient, and portable communications and process control services for a wide variety of user and system tools. STCI is aimed specifically at tools for ultrascale computing and uses a component architecture to simplify tailoring the infrastructure to a wide range of scenarios. This paper describes STCI's design philosophy, the various components that will be used to provide an STCI implementation for a range of ultrascale platforms, and a range of tool types. These include tools supporting parallel run-time environments, such as MPI, parallel application correctness tools and performance analysis tools, as well as system monitoring and management tools.

In this paper we present the results of an evaluation of different visualization methods for angiogram volumetric data: ray casting, marching cubes, and multi-level partition of unity implicits. There are several options available with ray casting: isosurface extraction, maximum intensity projection and alpha compositing, each producing fundamentally different results. Different visualization methods are suitable for different needs, so this choice is crucial in diagnosis and decision-making processes. We also evaluate visual effects such as ambient occlusion, screen space ambient occlusion, and depth of field. Some visualization methods include transparency, so we address the question of the relevance of this additional visual information. We employ transfer functions to map data values to color and transparency, allowing us to view or hide particular tissues. All the methods presented in this paper were developed using OpenCL, striving for real-time rendering and quality interaction. An evaluation has been conducted to assess the suitability of the visualization methods. Results show the superiority of isosurface extraction with ambient occlusion effects. Visual effects may positively or negatively affect perception of depth, motion, and relative positions in space.
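
The three ray-casting options named in this record differ only in how the samples along one ray are reduced to a pixel. The following is a minimal pure-Python sketch of that difference, not the authors' OpenCL implementation. The function names, the transfer-function breakpoints, and the sample values are all hypothetical and chosen only for illustration.

```python
def transfer_function(value):
    """Map a normalized scalar in [0, 1] to (r, g, b, alpha).

    The breakpoints and colors are hypothetical, for illustration only.
    """
    if value < 0.3:
        return (0.0, 0.0, 0.0, 0.0)   # fully transparent: this range is hidden
    if value < 0.7:
        return (0.8, 0.2, 0.2, 0.1)   # faint red, mostly transparent
    return (1.0, 1.0, 1.0, 0.9)       # nearly opaque white


def maximum_intensity_projection(samples):
    """Return the largest sample value encountered along the ray."""
    return max(samples)


def first_isosurface_hit(samples, isovalue):
    """Return the index of the first sample at or above the isovalue.

    Assumes the surface encloses the higher values; returns None on a miss.
    """
    for index, value in enumerate(samples):
        if value >= isovalue:
            return index
    return None


def alpha_composite(samples):
    """Front-to-back alpha compositing of one ray through the transfer function."""
    color = [0.0, 0.0, 0.0]
    alpha = 0.0
    for value in samples:
        r, g, b, a = transfer_function(value)
        weight = (1.0 - alpha) * a     # contribution not yet occluded
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        alpha += weight
        if alpha >= 0.99:              # early ray termination
            break
    return tuple(color), alpha


# Samples along a single ray, ordered front to back (made-up values).
ray_samples = [0.1, 0.5, 0.9, 0.2]
mip_value = maximum_intensity_projection(ray_samples)
hit_index = first_isosurface_hit(ray_samples, isovalue=0.6)
composited_color, composited_alpha = alpha_composite(ray_samples)
```

Changing only `transfer_function` is enough to show or hide a value range, which is the role the abstract assigns to transfer functions.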

NWChem is a general purpose computational chemistry code specifically designed to run on distributed memory parallel computers. The core functionality of the code focuses on molecular dynamics, Hartree-Fock and density functional theory methods for both plane-wave and Gaussian basis sets, tensor contraction engine based coupled cluster capabilities and combined quantum mechanics/molecular mechanics descriptions. It was realized from the beginning that scalable implementations of these methods required a programming paradigm inherently different from what message passing approaches could offer. In response a global address space library, the Global Array Toolkit, was developed. The programming model it offers is based on using predominantly one-sided communication. This model underpins most of the functionality in NWChem, and its power is exemplified by the fact that the code scales to tens of thousands of processors. In this paper the core capabilities of NWChem are described as well as their implementation to achieve an efficient computational chemistry code with high parallel scalability. NWChem is a modern, open source, computational chemistry code specifically designed for large scale parallel applications. To meet the challenges of developing efficient, scalable and portable programs of this nature a particular code design was adopted. This code design involved two main features. First of all, the code is built up in a modular fashion so that a large variety of functionality can be integrated easily. Secondly, to facilitate writing complex parallel algorithms the Global Array toolkit was developed. This toolkit allows one to write parallel applications in a shared-memory-like approach, but offers additional mechanisms to exploit data locality to lower communication overheads. This framework has proven to be very successful in computational chemistry but is applicable to any engineering domain. Within the context created by the features

The problem of scaling chemical oxygen-iodine lasers (COILs) is discussed. The results of an experimental study of a twisted-aerosol singlet oxygen generator meeting the COIL scalability requirements are presented. The energy characteristics of a supersonic COIL with singlet oxygen and iodine mixing in parallel flows are also experimentally studied. An output power of approximately 7.5 kW, corresponding to a specific power of 230 W cm⁻², is achieved. The maximum chemical efficiency of the COIL is approximately 30%.

Visualization systems are complex dynamic software systems. Debugging such systems is difficult using conventional debuggers because the programmer must try to imagine the three-dimensional geometry based on a list of positions and attributes. In addition, the programmer must be able to mentally animate changes in those positions and attributes to grasp dynamic behaviors within the algorithm. In this paper we shall show that representing geometry, attributes, and relationships graphically permits visual pattern recognition skills to be applied to the debugging problem. The particular application is a particle system used for isosurface extraction from volumetric data. Coloring particles based on individual attributes is especially helpful when these colorings are viewed as animations over successive iterations in the program. Although we describe a particular application, the types of tools that we discuss can be applied to a variety of problems.

Taxanes are a large family of terpenes comprising over 350 members, the most famous of which is Taxol (paclitaxel), a billion-dollar anticancer drug. Here, we describe the first practical and scalable synthetic entry to these natural products via a concise preparation of (+)-taxa-4(5),11(12)-dien-2-one, which possesses a suitable functional handle to access more oxidised members of its family. This route enabled a gram-scale preparation of the “parent” taxane, taxadiene, representing the largest quantity of this naturally occurring terpene ever isolated or prepared in pure form. The taxane family’s characteristic 6-8-6 tricyclic system containing a bridgehead alkene is forged via a vicinal difunctionalisation/Diels–Alder strategy. Asymmetry is introduced by means of an enantioselective conjugate addition that forms an all-carbon quaternary centre, from which all other stereocentres are fixed via substrate control. This study lays a critical foundation for a planned access to minimally oxidised taxane analogs and a scalable laboratory preparation of Taxol itself. PMID:22169867

VIMES (Visual Interface for Materials Simulations) is a graphical user interface (GUI) for pre- and post-processing atomistic materials science calculations. The code includes tools for building and visualizing simple crystals, supercells, and surfaces, as well as tools for managing and modifying the input to Sandia materials simulations codes such as Quest (Peter Schultz, SNL 9235) and Towhee (Marcus Martin, SNL 9235). It is often useful to have a graphical interface to construct input for materials simulations codes and to analyze the output of these programs. VIMES has been designed not only to build and visualize different materials systems, but also to make several Sandia codes easier to use and analyze. Furthermore, VIMES has been designed to be reasonably easy to extend to new materials programs. We anticipate that users of Sandia materials simulations codes will use VIMES to simplify the submission and analysis of these simulations. VIMES uses standard OpenGL graphics (as implemented in the Python programming language) to display the molecules. The algorithms used to rotate, zoom, and pan molecules are all standard applications using the OpenGL libraries. VIMES uses the Marching Cubes algorithm for isosurfacing 3D data such as molecular orbitals or electron densities around the molecules.

Scripts for scalable monitoring of parallel filesystem infrastructure provide frameworks for monitoring the health of block storage arrays and large InfiniBand fabrics. The block storage framework uses Python multiprocessing to scale the number of monitored arrays with the number of processors in the system. This enables live monitoring of an HPC-scale filesystem with 10-50 storage arrays. For InfiniBand monitoring, the included scripts monitor the InfiniBand health of each host, along with visualization tools for mapping complex fabric topologies.

An architecture for active coherent fiber laser beam combining using an interferometric measurement is demonstrated. This technique allows the exact phase errors of each fiber beam to be measured in a single shot. Therefore, this method is a promising candidate for combining a very large number of fibers. Our experimental system, composed of 16 independent fiber channels, is used to evaluate the achieved phase locking stability in terms of phase shift error and bandwidth. We show that only 8 pixels per fiber on the camera are required for stable closed-loop operation with a residual phase error of λ/20 rms, which demonstrates the scalability of this concept. Furthermore, we propose a beam shaping technique to increase the combining efficiency.

Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Modern machines may contain 100,000 or more microprocessor cores, and the largest of these, IBM's Blue Gene/L, contains over 200,000 cores. Future systems are expected to support millions of concurrent tasks. In this dissertation, we focus on efficient techniques for measuring and analyzing the performance of applications running on very large parallel machines. Tuning the performance of large-scale applications can be a subtle and time-consuming task because application developers must measure and interpret data from many independent processes. While the volume of the raw data scales linearly with the number of tasks in the running system, the number of tasks is growing exponentially, and data for even small systems quickly becomes unmanageable. Transporting performance data from so many processes over a network can perturb application performance and make measurements inaccurate, and storing such data would require a prohibitive amount of space. Moreover, even if it were stored, analyzing the data would be extremely time-consuming. In this dissertation, we present novel methods for reducing performance data volume. The first draws on multi-scale wavelet techniques from signal processing to compress systemwide, time-varying load-balance data. The second uses statistical sampling to select a small subset of running processes to generate low-volume traces. A third approach combines sampling and wavelet compression to stratify performance data adaptively at run-time and to reduce further the cost of sampled tracing. We have integrated these approaches into Libra, a toolset for scalable load-balance analysis. We present Libra and show how it can be used to analyze data from large scientific applications scalably.
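
To illustrate why wavelet techniques can shrink load-balance data, here is a minimal sketch using an unnormalized Haar transform (pairwise averages and differences) followed by thresholding. It is an assumption-laden toy, not the method implemented in Libra: the choice of the Haar wavelet, the function names, the threshold rule, and the per-process load values are all hypothetical.

```python
def haar_forward(signal):
    """Multi-level unnormalized Haar transform; len(signal) must be a power of two."""
    coeffs = [float(x) for x in signal]
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = averages + details   # coarse part first, details after it
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward by rebuilding each finer level from averages and details."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for average, detail in zip(out[:n], out[n:2 * n]):
            merged += [average + detail, average - detail]
        out[:2 * n] = merged
        n *= 2
    return out


def threshold(coeffs, cutoff):
    """Zero small detail coefficients; always keep the overall average at index 0."""
    return [c if i == 0 or abs(c) >= cutoff else 0.0 for i, c in enumerate(coeffs)]


# Hypothetical per-process load: two groups of processes with different work.
load = [4, 4, 4, 4, 8, 8, 8, 8]
coefficients = haar_forward(load)
sparse = threshold(coefficients, cutoff=0.5)
stored_values = sum(1 for c in sparse if c != 0.0)
reconstructed = haar_inverse(sparse)
```

For this piecewise-constant load, only two coefficients are nonzero, so two stored values reproduce all eight measurements. Real load data is noisier, and the cutoff then trades reconstruction error against volume.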

Multimedia data delivered to mobile devices over wireless channels or the Internet are complicated by bandwidth fluctuation and the variety of mobile devices. Scalable video coding has been developed as an extension of H.264/AVC to solve this problem. Since scalable video codec provides various scalabilities to adapt the bitstream for the channel conditions and terminal types, scalable codec is one of the useful codecs for wired or wireless multimedia communication systems, such as IPTV and streaming services. In such scalable multimedia communication systems, video quality fluctuation degrades the visual perception significantly. It is important to efficiently use the target bits in order to maintain a consistent video quality or achieve a small distortion variation throughout the whole video sequence. The scheme proposed in this paper provides a useful function to control video quality in applications supporting scalability, whereas conventional schemes have been proposed to control video quality in the H.264 and MPEG-4 systems. The proposed algorithm decides the quantization parameter of the enhancement layer to maintain a consistent video quality throughout the entire sequence. The video quality of the enhancement layer is controlled based on a closed-form formula which utilizes the residual data and quantization error of the base layer. The simulation results show that the proposed algorithm controls the frame quality of the enhancement layer in a simple operation, where the parameter decision algorithm is applied to each frame. PMID:21411408

The 9/30/2009 ASC Level 2 Scalable Analysis Tools for Sensitivity Analysis and UQ (Milestone 3160) contains feature recognition capability required by the user community for certain verification and validation tasks focused around sensitivity analysis and uncertainty quantification (UQ). These feature recognition capabilities include crater detection, characterization, and analysis from CTH simulation data; the ability to call fragment and crater identification code from within a CTH simulation; and the ability to output fragments in a geometric format that includes data values over the fragments. The feature recognition capabilities were tested extensively on sample and actual simulations. In addition, a number of stretch criteria were met including the ability to visualize CTH tracer particles and the ability to visualize output from within an S3D simulation.

Vortex modeling can produce attractive visual effects of dynamic fluids, which are widely applicable for dynamic media, computer games, special effects, and virtual reality systems. However, it is challenging to effectively simulate intensive and fine detailed fluids such as smoke with fast increasing vortex filaments and smoke particles. The authors propose a novel vortex filaments in grids scheme in which the uniform grids dynamically bridge the vortex filaments and smoke particles for scalable, fine smoke simulation with macroscopic vortex structures. Using the vortex model, their approach supports the trade-off between simulation speed and scale of details. After computing the whole velocity, external control can be easily exerted on the embedded grid to guide the vortex-based smoke motion. The experimental results demonstrate the efficiency of using the proposed scheme for a visually plausible smoke simulation with macroscopic vortex structures. PMID:25594961

Virtual 3D city models provide powerful user interfaces for communication of 2D and 3D geoinformation. Providing high quality visualization of massive 3D geoinformation in a scalable, fast, and cost efficient manner is still a challenging task. Especially for mobile and web-based system environments, software and hardware configurations of target systems differ significantly. This makes it hard to provide fast, visually appealing renderings of 3D data throughout a variety of platforms and devices. Current mobile or web-based solutions for 3D visualization usually require raw 3D scene data such as triangle meshes together with textures delivered from server to client, which strongly limits the size and complexity of the models they can handle. In this paper, we introduce a new approach for provisioning of massive, virtual 3D city models on different platforms, namely web browsers, smartphones or tablets, by means of an interactive map assembled from artificial oblique image tiles. The key concept is to synthesize such images of a virtual 3D city model by a 3D rendering service in a preprocessing step. This service encapsulates model handling and 3D rendering techniques for high quality visualization of massive 3D models. By generating image tiles using this service, the 3D rendering process is shifted from the client side, which provides major advantages: (a) the complexity of the 3D city model data is decoupled from data transfer complexity; (b) the implementation of client applications is simplified significantly as 3D rendering is encapsulated on the server side; and (c) 3D city models can be easily deployed for and used by a large number of concurrent users, leading to a high degree of scalability of the overall approach. All core 3D rendering techniques are performed on a dedicated 3D rendering server, and thin-client applications can be compactly implemented for various devices and platforms.

Scalability beyond a small number of processors, typically 32 or fewer, is known to be a problem for existing parallel general sparse (PGS) direct solvers. This paper presents a PGS direct solver for general sparse linear systems on distributed memory machines. The algorithm is based on the well-known sequential sparse algorithm Y12M. To achieve efficient parallelization, a 2-D scattered decomposition of the sparse matrix is used. The proposed algorithm is more scalable than existing parallel sparse direct solvers. Its scalability is evaluated on a 256 processor nCUBE2s machine using Boeing/Harwell benchmark matrices.
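
A 2-D scattered decomposition is commonly understood as a cyclic (wrapped) mapping in which matrix entry (i, j) belongs to processor (i mod P_r, j mod P_c) of a P_r x P_c processor grid. The sketch below assumes that reading; the abstract does not give the exact mapping used by the solver, and the function names and the tridiagonal test matrix are hypothetical.

```python
from collections import defaultdict


def owner(row, col, grid_rows, grid_cols):
    """Processor-grid coordinates owning entry (row, col) under a cyclic mapping."""
    return (row % grid_rows, col % grid_cols)


def distribute(nonzeros, grid_rows, grid_cols):
    """Group (row, col, value) triples by owning processor."""
    local = defaultdict(list)
    for row, col, value in nonzeros:
        local[owner(row, col, grid_rows, grid_cols)].append((row, col, value))
    return dict(local)


# Hypothetical 8x8 tridiagonal matrix stored as coordinate triples.
size = 8
nonzeros = []
for i in range(size):
    nonzeros.append((i, i, 2.0))
    if i + 1 < size:
        nonzeros.append((i, i + 1, -1.0))
        nonzeros.append((i + 1, i, -1.0))

parts = distribute(nonzeros, grid_rows=2, grid_cols=2)
counts = {proc: len(entries) for proc, entries in parts.items()}
```

Because neighbouring rows and columns land on different processors, work stays spread out as elimination shrinks the active submatrix, which is the usual motivation for scattering rather than using contiguous blocks.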

Full and partial encryption methods are important for subscription based content providers, such as internet and cable TV pay channels. Providers need to be able to protect their products while at the same time being able to provide demonstrations to attract new customers without giving away the full value of the content. If an algorithm were introduced which could provide any level of full or partial encryption in a fast and cost effective manner, the applications to real-time commercial implementation would be numerous. In this paper, we present a novel application of alpha rooting, using it to achieve fast and straightforward scalable encryption with a single algorithm. We further present use of the measure of enhancement, the Logarithmic AME, to select optimal parameters for the partial encryption. When parameters are selected using the measure, the output image achieves a balance between protecting the important data in the image while still containing a good overall representation of the image. We will show results for this encryption method on a number of images, using histograms to evaluate the effectiveness of the encryption.

The purpose of this techbase project was to investigate the use of parallel array data types to reduce the memory footprint of the Livermore Equation Of State (LEOS) library. Addressing the memory scalability of LEOS is necessary to run large scientific simulations on IBM BG/L and future architectures with low memory per processing core. We considered using normal MPI, one-sided MPI, and Global Arrays to manage the distributed array and ended up choosing Global Arrays because it was the only communication library that provided the level of asynchronous access required. To reduce the runtime overhead of using a parallel array data structure, a least recently used (LRU) caching algorithm was used to provide a local cache of commonly used parts of the parallel array. The approach was initially implemented in an isolated copy of LEOS and was later integrated into the main trunk of the LEOS Subversion repository. The approach was tested using a simple test. Testing indicated that the approach was feasible, and the simple LRU caching had an 86% hit rate.
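
The LRU caching idea in this record can be sketched in a few lines. This is an illustrative Python model only, not the LEOS code: the class name `LRUBlockCache`, the block granularity, and the `fetch_remote_block` stand-in for a one-sided remote read are all hypothetical.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Local cache of remote array blocks with least-recently-used eviction."""

    def __init__(self, capacity, fetch_block):
        self.capacity = capacity
        self._fetch_block = fetch_block   # callable that performs the remote read
        self._blocks = OrderedDict()      # ordered from oldest to most recent use
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)   # mark as most recently used
            self.hits += 1
            return self._blocks[block_id]
        self.misses += 1
        block = self._fetch_block(block_id)
        self._blocks[block_id] = block
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)     # evict the least recently used
        return block

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def fetch_remote_block(block_id):
    """Hypothetical stand-in for fetching one block of the distributed array."""
    return [block_id * 10 + offset for offset in range(4)]


cache = LRUBlockCache(capacity=2, fetch_block=fetch_remote_block)
for block_id in [0, 1, 0, 2, 0, 3, 1]:
    cache.get(block_id)
```

With this made-up access pattern and a two-block capacity the cache records 2 hits and 5 misses. The hit rate in practice depends entirely on how strongly the access pattern reuses recent blocks.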

NCSA’s role in the SCIDAC Scalable Systems Software (SSS) project was to develop interfaces and communication mechanisms for systems monitoring, and to implement a prototype demonstrating those standards. The Scalable Systems Monitoring component of the SSS suite was designed to provide a large volume of both static and dynamic systems data to the components within the SSS infrastructure as well as external data consumers.

A scalability metric, called constant-memory-per-processor (CMP), is described for parallel architecture-algorithm pairs. Its purpose is to predict the behavior of a specific algorithm on a distributed-memory machine as the number of processors is increased while the memory per processor remains constant. While the CMP scalability metric predicts the asymptotic behavior, we show how to use it to predict expected performance on actual parallel machines, specifically the MasPar MP-1 and MP-2.

Computer networks and the internet have taken an important role in modern society. Together with their development, the need for digital video transmission over these networks has grown. To cope with the user demands and limitations of the network, compression of the video material has become an important issue. Additionally, many video-applications require flexibility in terms of scalability and complexity (e.g. HD/SD-TV, video-surveillance). Current ITU-T and ISO/IEC video compression standards (MPEG-x, H.26-x) lack efficient support for these types of scalability. Wavelet-based compression techniques have been proposed to tackle this problem, of which the Motion Compensated Temporal Filtering (MCTF)-based architectures couple state-of-the-art performance with full (quality, resolution, and frame-rate) scalability. However, a significant drawback of these architectures is their high complexity. The computational and memory complexity of both spatial domain (SD) MCTF and in-band (IB) MCTF video codec instantiations are examined in this study. Comparisons in terms of complexity versus performance are presented for both types of codecs. The paper indicates how complexity scalability can be achieved in such video-codecs, and analyses some of the trade-offs between complexity and coding performance. Finally, guidelines on how to implement a fully scalable video-codec that incorporates quality, temporal, resolution and complexity scalability are proposed.

PURPOSE: Validation of image registration algorithms is frequently accomplished by the visual inspection of the resulting linear or deformable transformation due to the lack of ground truth information. Visualization of transformations produced by image registration algorithms during image-guided interventions allows a clinician to evaluate the accuracy of the resulting transformation. Software packages that perform the visualization of transformations exist, but are not part of a clinically usable software application. We present a tool that visualizes both linear and deformable transformations and is integrated in an open-source software application framework suited for intraoperative use and general evaluation of registration algorithms. METHODS: Six different modes are available for visualization of a transform. Glyph visualization mode uses oriented and scaled glyphs, such as arrows, to represent the displacement field in 3D whereas glyph slice visualization mode creates arrows that can be seen as a 2D vector field. Grid visualization mode creates deformed grids shown in 3D whereas grid slice visualization mode creates a series of 2D grids. Block visualization mode creates a deformed bounding box of the warped volume. Finally, contour visualization mode creates isosurfaces and isolines that visualize the magnitude of displacement across a volume. The application 3D Slicer was chosen as the platform for the transform visualizer tool. 3D Slicer is a comprehensive open-source application framework developed for medical image computing and used for intra-operative registration. RESULTS: The transform visualizer tool fulfilled the requirements for quick evaluation of intraoperative image registrations. Visualizations were generated in 3D Slicer with little computation time on realistic datasets. It is freely available as an extension for 3D Slicer. CONCLUSION: A tool for the visualization of displacement fields was created and integrated into 3D Slicer

The high efficiency video coding (HEVC) standard being developed by ITU-T VCEG and ISO/IEC MPEG achieves a compression goal of reducing the bitrate by half for the same visual quality when compared with earlier video compression standards such as H.264/AVC. It achieves this goal with the use of several new tools such as quad-tree based partitioning of data, larger block sizes, improved intra prediction, sophisticated prediction of motion information, and an in-loop sample adaptive offset process. This paper describes an approach where the HEVC framework is extended to achieve spatial scalability using a multi-loop approach. The enhancement layer inter-predictive coding efficiency is improved by including within the decoded picture buffer multiple up-sampled versions of the decoded base layer picture. This approach has the advantage of achieving significant coding gains with a simple extension of base layer tools such as inter-prediction and motion information signaling. Coding efficiency of the enhancement layer is further improved using an adaptive loop filter and internal bit-depth increment. The performance of the proposed scalable video coding approach is compared to simulcast transmission of video data using high efficiency model version 6.1 (HM-6.1). The bitrate savings are measured using the Bjontegaard Delta (BD) rate for spatial scalability factors of 2 and 1.5 when compared with simulcast anchors. It is observed that the proposed approach provides average luma BD rate gains of 33.7% and 50.5%, respectively.

Non-intrusive inspection and non-destructive testing of manufactured objects with complex internal structures typically requires the enhancement, analysis and visualization of high-resolution volumetric data. Given the increasing availability of fast 3D scanning technology (e.g. cone-beam CT), enabling on-line detection and accurate discrimination of components or sub-structures, the inherent complexity of classification algorithms inevitably leads to throughput bottlenecks. Indeed, whereas typical inspection throughput requirements range from 1 to 1000 volumes per hour, depending on density and resolution, current computational capability is one to two orders-of-magnitude less. Accordingly, speeding up classification algorithms requires both reduction of algorithm complexity and acceleration of computer performance. A shape-based classification algorithm, offering algorithm complexity reduction, by using ellipses as generic descriptors of solids-of-revolution, and supporting performance-scalability, by exploiting the inherent parallelism of volumetric data, is presented. A two-stage variant of the classical Hough transform is used for ellipse detection and correlation of the detected ellipses facilitates position-, scale- and orientation-invariant component classification. Performance-scalability is achieved cost-effectively by accelerating a PC host with one or more COTS (Commercial-Off-The-Shelf) PCI multiprocessor cards. Experimental results are reported to demonstrate the feasibility and cost-effectiveness of the data-parallel classification algorithm for on-line industrial inspection applications.

Nowadays, it is very convenient to capture photos with a smartphone. Because the smartphone is a convenient way to share what users experience anytime and anywhere through social networks, users are likely to capture multiple photos to make sure the content is well photographed. In this paper, an effective scalable mobile image retrieval approach is proposed by exploring contextual salient information for the input query image. Our goal is to explore the high-level semantic information of an image by finding the contextual saliency from multiple relevant photos rather than solely using the input image. Thus, the proposed mobile image retrieval approach first determines the relevant photos according to visual similarity, then mines salient features by exploring contextual saliency from multiple relevant images, and finally determines the contributions of salient features for scalable retrieval. Compared with existing mobile-based image retrieval approaches, our approach requires less bandwidth and has better retrieval performance. We can carry out retrieval with less than 200 B of data, which is less than 5% of what existing approaches require. Most importantly, when the bandwidth is limited, we can rank the transmitted features according to their contributions to retrieval. Experimental results show the effectiveness of the proposed approach. PMID:25775488

Scalable tracers are potentially a useful tool to examine diffusion mechanisms and to predict diffusion coefficients, particularly for hindered diffusion in complex, heterogeneous, or crowded systems. Scalable tracers are defined as a series of tracers varying in size but with the same shape, structure, surface chemistry, deformability, and diffusion mechanism. Both chemical homology and constant dynamics are required. In particular, branching must not vary with size, and there must be no transition between ordinary diffusion and reptation. Measurements using scalable tracers yield the mean diffusion coefficient as a function of size alone; measurements using nonscalable tracers yield the variation due to differences in the other properties. Candidate scalable tracers are discussed for two-dimensional (2D) diffusion in membranes and three-dimensional diffusion in aqueous solutions. Correlations to predict the mean diffusion coefficient of globular biomolecules from molecular mass are reviewed briefly. Specific suggestions for the 3D case include the use of synthetic dendrimers or random hyperbranched polymers instead of dextran and the use of core–shell quantum dots. Another useful tool would be a series of scalable tracers varying in deformability alone, prepared by varying the density of crosslinking in a polymer to make say “reinforced Ficoll” or “reinforced hyperbranched polyglycerol.” PMID:25319586

Scientific visualization of geospatial data provides highly effective tools for analysis and communication of information about the land surface and its features, properties, and temporal evolution. Whereas single-surface visualization of landscapes is now routinely used in presentation of Earth surface data, interactive 3D visualization based upon multiple elevation surfaces and cutting planes is gaining recognition as a powerful tool for analyzing landscape structure based on multiple return Light Detection and Ranging (LiDAR) data. This approach also provides valuable insights into land surface changes captured by multi-temporal elevation models. Thus, animations using 2D images and 3D views are becoming essential for communicating results of landscape monitoring and computer simulations of Earth processes. Multiple surfaces and 3D animations are also used to introduce novel concepts for visual analysis of terrain models derived from time-series of LiDAR data using multi-year core and envelope surfaces. Analysis of terrain evolution using voxel models and visualization of contour evolution using isosurfaces has potential for unique insights into geometric properties of rapidly evolving coastal landscapes. In addition to visualization on desktop computers, the coupling of GIS with new types of graphics hardware systems provides opportunities for cutting-edge applications of visualization for geomorphological research. These systems include tangible environments that facilitate intuitive 3D perception, interaction and collaboration. Application of the presented visualization techniques as supporting tools for analyses of landform evolution using airborne LiDAR data and open source geospatial software is illustrated by two case studies from North Carolina, USA.

This report contains an algorithm for decomposing higher-order finite elements into regions appropriate for isosurfacing and proves the conditions under which the algorithm will terminate. Finite elements are used to create piecewise polynomial approximants to the solution of partial differential equations for which no analytical solution exists. These polynomials represent fields such as pressure, stress, and momentum. In the past, these polynomials have been linear in each parametric coordinate. Each polynomial coefficient must be uniquely determined by a simulation, and these coefficients are called degrees of freedom. When there are not enough degrees of freedom, simulations will typically fail to produce a valid approximation to the solution. Recent work has shown that increasing the number of degrees of freedom by increasing the order of the polynomial approximation (instead of increasing the number of finite elements, each of which has its own set of coefficients) can allow some types of simulations to produce a valid approximation with many fewer degrees of freedom than increasing the number of finite elements alone. However, once the simulation has determined the values of all the coefficients in a higher-order approximant, tools do not exist for visual inspection of the solution. This report focuses on a technique for the visual inspection of higher-order finite element simulation results based on decomposing each finite element into simplicial regions where existing visualization algorithms such as isosurfacing will work. The requirements of the isosurfacing algorithm are enumerated and related to the places where the partial derivatives of the polynomial become zero. The original isosurfacing algorithm is then applied to each of these regions in turn. Acknowledgement: The authors would like to thank David Day and Louis Romero for their insight into polynomial system solvers and the LDRD Senior Council for the opportunity to pursue this research. The authors were

Wireless environments present many challenges for secure multimedia access, especially for streaming media. Varying network bandwidths and diverse receiver processing power and storage capacity demand scalable and flexible approaches that can adapt to changing network conditions as well as device capabilities. To meet these requirements, scalable and fine granularity scalable (FGS) compression algorithms were proposed and widely adopted to provide scalable access to multimedia, with interoperability between different services and flexible support for receivers with different device capabilities. Encryption is one of the most important security tools for protecting content from unauthorized use. If a media data stream is encrypted using non-scalable cryptography algorithms, decryption at an arbitrary bit rate to provide scalable services can hardly be accomplished. If media compressed using scalable coding needs to be protected and non-scalable cryptography algorithms are used, the advantages of scalable coding may be lost. Therefore, scalable encryption techniques are needed to provide scalability or to preserve the FGS adaptation capability (if the media stream is FGS coded) and to enable intermediate processing of encrypted data without unnecessary decryption. In this paper, we give an overview of scalable encryption schemes and present a fine-grained scalable encryption algorithm. One desirable feature is its simplicity and flexibility in supporting scalable multimedia communication and multimedia content access control in wireless environments.

The Scalable Checkpoint/Restart (SCR) library provides an interface that codes may use to write out and read in application-level checkpoints in a scalable fashion. In the current implementation, checkpoint files are cached in local storage (hard disk or RAM disk) on the compute nodes. This technique provides scalable aggregate bandwidth and uses storage resources that are fully dedicated to the job. This approach addresses the two common drawbacks of checkpointing a large-scale application to a shared parallel file system, namely, limited bandwidth and file system contention. In fact, on current platforms, SCR scales linearly with the number of compute nodes. It has been benchmarked as high as 720 GB/s on 1094 nodes of Atlas, which is nearly two orders of magnitude faster than the parallel file system.

Subsurface data analysis and visualization is one of the main aspects of planetary observation (e.g., the search for water or geological characterization). The data are collected by subsurface sounding radars carried as instruments on board deep space missions. These data are generally represented as 2D radargrams along the space track and z axes (perpendicular to the subsurface), but without direct correlation to other data acquisitions or knowledge of the planet. In many cases there are plenty of data from other sensors of the same mission, or of other missions, with high continuity in time and space, especially around the scientific sites of interest (e.g., candidate landing areas or sites of particular scientific interest). The 2D perspective is good for analysing single acquisitions and for performing detailed analysis of the returned echo, but it is of little use for comparing the very large datasets now available for many planets and moons of the solar system. The best approach is to base the analysis on a 3D visualization model generated from the entire stack of data. First, this approach makes it possible to navigate the subsurface in all directions and to analyse different sections and slices, or to navigate the iso-surfaces with respect to a value (or interval). The latter makes it possible to isolate one or more iso-surfaces and to remove from the visualization other data not of interest for the analysis; finally, it helps to identify underground 3D bodies. Another aspect is the need to link on-ground data, such as imaging, to the underground data by geographical and contextual field of view.

TOPS is providing high-performance, scalable sparse direct solvers, which have had significant impacts on the SciDAC applications, including fusion simulation (CEMM), accelerator modeling (COMPASS), as well as many other mission-critical applications in DOE and elsewhere. Our recent developments have focused on new techniques to overcome the scalability bottlenecks of direct methods, in both time and memory. These include parallelizing the symbolic analysis phase and developing linear-complexity sparse factorization methods. The new techniques will make sparse direct methods more widely usable in large 3D simulations on highly parallel petascale computers.

This report summarizes existing statistical engines in VTK/Titan and presents both the serial and parallel k-means statistics engines. It is a sequel to [PT08], [BPRT09], and [PT09] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, and contingency engines. The ease of use of the new parallel k-means engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the k-means engine.
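To make the link between k-means and parallel scalability concrete, here is a minimal Python sketch. It is not the VTK/Titan C++ API described in the report; every function name below is hypothetical. It assumes the data are split into partitions, each of which computes per-cluster partial sums that are then reduced, which is one common way to organize a parallel k-means iteration.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for i, c in enumerate(centers):
        d = sum((p - q) ** 2 for p, q in zip(point, c))
        if d < best_d:
            best, best_d = i, d
    return best


def partial_sums(chunk, centers):
    """Per-partition work: coordinate sums and point counts per cluster."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in chunk:
        k = nearest(point, centers)
        counts[k] += 1
        for j, x in enumerate(point):
            sums[k][j] += x
    return sums, counts


def kmeans_step(chunks, centers):
    """One iteration: reduce the partial sums, then update the centers."""
    dim = len(centers[0])
    total = [[0.0] * dim for _ in centers]
    count = [0] * len(centers)
    # Assumption: in a parallel run each chunk would live on its own process
    # and this loop would be replaced by a reduction across processes.
    for chunk in chunks:
        sums, counts = partial_sums(chunk, centers)
        for k in range(len(centers)):
            count[k] += counts[k]
            for j in range(dim):
                total[k][j] += sums[k][j]
    # A cluster that received no points keeps its previous center.
    return [
        [t / count[k] for t in total[k]] if count[k] else list(centers[k])
        for k in range(len(centers))
    ]


def kmeans(chunks, centers, max_iter=100, tol=1e-12):
    """Iterate until the largest squared center shift is at most tol."""
    for _ in range(max_iter):
        new = kmeans_step(chunks, centers)
        shift = max(
            sum((a - b) ** 2 for a, b in zip(c, n)) for c, n in zip(centers, new)
        )
        centers = new
        if shift <= tol:
            break
    return centers
```

Only the per-cluster sums and counts need to be exchanged in each iteration, so the communication volume depends on the number of clusters and dimensions rather than on the number of data points.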

The NASA In-Space Propulsion (ISP) program sponsored intensive solar sail technology and systems design, development, and hardware demonstration activities over the past 3 years. Efforts to validate a scalable solar sail system by functional demonstration in relevant environments, together with test-analysis correlation activities on a scalable solar sail system, have recently been successfully completed. A review of the program is presented, with descriptions of the design, results of testing, and analytical model validations of component and assembly functional, strength, stiffness, shape, and dynamic behavior. The scaled performance of the validated system is projected to demonstrate the applicability to flight demonstration and important NASA road-map missions.

Magnetic resonance imaging (MRI) pulse sequence consoles typically employ closed proprietary hardware, software, and interfaces, making difficult any adaptation for innovative experimental technology. Yet MRI systems research is trending to higher channel count receivers, transmitters, gradient/shims, and unique interfaces for interventional applications. Customized console designs are now feasible for researchers with modern electronic components, but high data rates, synchronization, scalability, and cost present important challenges. Implementing large multichannel MR systems with efficiency and flexibility requires a scalable modular architecture. With Medusa, we propose an open system architecture using the universal serial bus (USB) for scalability, combined with distributed processing and buffering to address the high data rates and strict synchronization required by multichannel MRI. Medusa uses a modular design concept based on digital synthesizer, receiver, and gradient blocks, in conjunction with fast programmable logic for sampling and synchronization. Medusa is a form of synthetic instrument, being reconfigurable for a variety of medical/scientific instrumentation needs. The Medusa distributed architecture, scalability, and data bandwidth limits are presented, and its flexibility is demonstrated in a variety of novel MRI applications. PMID:21954200

The present invention provides a scalable microreactor comprising a multilayered reaction block having alternating reaction plates and heat exchanger plates that have a plurality of microchannels; a multilaminated reactor input manifold, a collecting reactor output manifold, a heat exchange input manifold and a heat exchange output manifold. The present invention also provides methods of using the microreactor for multiphase chemical reactions.

In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.

Scalable Metadata Environments (MDEs) are an artistic approach for designing immersive environments for large-scale data exploration in which users interact with data by forming multiscale patterns that they alternately disrupt and reform. Developed and prototyped as part of an art-science research collaboration, an MDE is defined as a 4D virtual environment structured by quantitative and qualitative metadata describing multidimensional data collections. Entire data sets (e.g., tens of millions of records) can be visualized and sonified at multiple scales and at different levels of detail so they can be explored interactively in real time within MDEs. They are designed to reflect similarities and differences in the underlying data or metadata such that patterns can be visually/aurally sorted in an exploratory fashion by an observer who is not familiar with the details of the mapping from data to visual, auditory or dynamic attributes. While many approaches for visual and auditory data mining exist, MDEs are distinct in that they utilize qualitative and quantitative data and metadata to construct multiple interrelated conceptual coordinate systems. These "regions" function as conceptual lattices for scalable auditory and visual representations within virtual environments computationally driven by multi-GPU CUDA-enabled fluid dynamics systems.

This paper documents our progress during the first year of work on our original proposal entitled 'A Scalable Distributed Approach to Mobile Robot Vision'. We are pursuing a strategy for real-time visual identification and tracking of complex objects which does not rely on specialized image-processing hardware. In this system perceptual schemas represent objects as a graph of primitive features. Distributed software agents identify and track these features, using variable-geometry image subwindows of limited size. Active control of imaging parameters and selective processing makes simultaneous real-time tracking of many primitive features tractable. Perceptual schemas operate independently from the tracking of primitive features, so that real-time tracking of a set of image features is not hurt by latency in recognition of the object that those features make up. The architecture allows semantically significant features to be tracked with limited expenditure of computational resources, and allows the visual computation to be distributed across a network of processors. Early experiments are described which demonstrate the usefulness of this formulation, followed by a brief overview of our more recent progress (after the first year).

Now that the Scalable Coherent Interface (SCI) has solved the bandwidth problem, what can we use it for? SCI was developed to support closely coupled multiprocessors and their caches in a distributed shared-memory environment, but its scalability and the efficient generality of its architecture make it work very well over a wide range of applications. It can replace a local area network for connecting workstations on a campus. It can be a powerful I/O channel for a supercomputer. It can be the processor-cache-memory-I/O connection in a highly parallel computer. It can gather data from enormous particle detectors and distribute it among thousands of processors. It can connect a desktop microprocessor to memory chips a few millimeters away, disk drives a few meters away, and servers a few kilometers away.

Scalable video coding is important in a number of applications where video needs to be decoded and displayed at a variety of resolution scales. It is more efficient than simulcasting, in which all desired resolution scales are coded totally independently of one another within the constraint of a fixed available bandwidth. In this paper, we focus on scalability using the frequency domain approach. We employ the framework proposed for the ongoing second phase of the Motion Picture Experts Group (MPEG-2) standard to study the performance of one such scheme and investigate improvements aimed at increasing its efficiency. Practical issues related to multiplexing of encoded data of various resolution scales to facilitate decoding are considered. Simulations are performed to investigate the potential of a chosen frequency domain scheme. Various prospects and limitations are also discussed.

This report presents in four chapters a model for the scalability analysis of the Data Server subsystem of the Earth Observing System Data and Information System (EOSDIS) Core System (ECS). The model analyzes whether the planned architecture of the Data Server will support an increase in the workload with the possible upgrade and/or addition of processors, storage subsystems, and networks. The report includes a summary of the architecture of ECS's Data Server as well as a high-level description of the Ingest and Retrieval operations as they relate to ECS's Data Server. This description forms the basis for the development of the scalability model of the Data Server and the methodology used to solve it.

We describe our visualization process for a particle-based simulation of the formation of the first stars and their impact on cosmic history. The dataset consists of several hundred time-steps of point simulation data, with each time-step containing approximately two million point particles. For each time-step, we interpolate the point data onto a regular grid using a method taken from the radiance estimate of photon mapping. We import the resulting regular grid representation into ParaView, with which we extract isosurfaces across multiple variables. Our images provide insights into the evolution of the early universe, tracing the cosmic transition from an initially homogeneous state to one of increasing complexity. Specifically, our visualizations capture the build-up of regions of ionized gas around the first stars, their evolution, and their complex interactions with the surrounding matter. These observations will guide the upcoming James Webb Space Telescope, the key astronomy mission of the next decade. PMID:17968129
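The interpolation step can be illustrated with a small Python sketch. The abstract does not give the exact formula, so this assumes a volumetric variant of the photon-mapping density estimate: the summed weight of the k nearest particles divided by the volume of the sphere that just encloses them. The function names, the brute-force search, and the choice of k are all assumptions made for illustration.

```python
import heapq
import math


def knn_density(sample, particles, k):
    """Estimate a density at `sample` from its k nearest particles.

    particles: list of ((x, y, z), weight) pairs. The summed weight of the
    k nearest particles is divided by the volume of the smallest sphere
    centered at `sample` that contains them.
    """
    dists = [(math.dist(sample, pos), w) for pos, w in particles]
    nearest = heapq.nsmallest(k, dists, key=lambda dw: dw[0])
    radius = nearest[-1][0]
    if radius == 0.0:
        raise ValueError("all k nearest particles coincide with the sample point")
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return sum(w for _, w in nearest) / volume


def resample_to_grid(particles, origin, spacing, shape, k):
    """Evaluate the estimate at every node of a regular grid (brute force)."""
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for l in range(nz):
                node = (
                    origin[0] + i * spacing,
                    origin[1] + j * spacing,
                    origin[2] + l * spacing,
                )
                grid[i][j][l] = knn_density(node, particles, k)
    return grid
```

With roughly two million particles per time-step, a spatial index such as a k-d tree would be needed in place of the brute-force distance scan used here.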

This report summarizes the existing statistical engines in VTK/Titan and presents the parallel versions thereof which have already been implemented. The ease of use of these parallel engines is illustrated by the means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; then, this theoretical property is verified with test runs that demonstrate optimal parallel speed-up with up to 200 processors.
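For reference, the quantities behind a speed-up claim can be written as a short Python sketch. These are the standard textbook definitions, not code from the report, and the function names are chosen here for illustration.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speed-up per processor; 1.0 corresponds to optimal (linear) speed-up."""
    return speedup(t_serial, t_parallel) / processors
```

Optimal parallel speed-up on p processors means a speed-up close to p, that is, an efficiency close to 1.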

HINT is a program to measure a wide variety of scalable computer systems. It is capable of demonstrating the benefits of using more memory or processing power, and of improving communications within the system. HINT can be used for measurement of an existing system, while the associated program ANALYTIC HINT can be used to explain the measurements or as a design tool for proposed systems.

We have deployed a 1 PB clustered filesystem for High Energy Physics. The use of commodity storage arrays and bonded Ethernet interconnects makes the array cost effective, whilst providing high bandwidth to the storage. The filesystem is a POSIX filesystem, presented to the Grid using the StoRM Storage Resource Manager (SRM). We describe an upgrade to 10 Gbit/s networking and we present benchmarks demonstrating the performance and scalability of the filesystem.

The software library hypre provides high performance preconditioners and solvers for the solution of large, sparse linear systems on massively parallel computers as well as conceptual interfaces that allow users to access the library in the way they naturally think about their problems. These interfaces include a stencil-based structured interface (Struct); a semi-structured interface (semiStruct), which is appropriate for applications that are mostly structured, e.g. block structured grids, composite grids in structured adaptive mesh refinement applications, and overset grids; a finite element interface (FEI) for unstructured problems, as well as a conventional linear-algebraic interface (IJ). It is extremely important to provide an efficient, scalable implementation of these interfaces in order to support the scalable solvers of the library, especially when using tens of thousands of processors. This paper describes the data structures, parallel implementation and resulting performance of the IJ, Struct and semiStruct interfaces. It investigates their scalability, presents successes as well as pitfalls of some of the approaches and suggests ways of dealing with them.

This report is a summary of the accomplishments of the 'Scalable Solutions for Processing and Searching Very Large Document Collections' LDRD, which ran from FY08 through FY10. Our goal was to investigate scalable text analysis; specifically, methods for information retrieval and visualization that could scale to extremely large document collections. Towards that end, we designed, implemented, and demonstrated a scalable framework for text analysis - ParaText - as a major project deliverable. Further, we demonstrated the benefits of using visual analysis in text analysis algorithm development, improved performance of heterogeneous ensemble models in data classification problems, and the advantages of information theoretic methods in user analysis and interpretation in cross language information retrieval. The project involved 5 members of the technical staff and 3 summer interns (including one who worked two summers). It resulted in a total of 14 publications, 3 new software libraries (2 open source and 1 internal to Sandia), several new end-user software applications, and over 20 presentations. Several follow-on projects have already begun or will start in FY11, with additional projects currently in proposal.

In order to reduce costs, computer manufacturers try to use commodity parts as much as possible. Mainframes using proprietary processors are being replaced by high performance RISC microprocessor-based workstations, which are further being replaced by the commodity microprocessor used in personal computers. Highly reliable disks for mainframes are also being replaced by disk arrays, which are complexes of disk drives. In this paper we try to clarify the feasibility of a large scale tertiary storage system composed of 8-mm tape archivers utilizing robotics. In the near future, the 8-mm tape archiver will be widely used and become a commodity part, since recent rapid growth of multimedia applications requires much larger storage than disk drives can provide. We designed a scalable tape archiver which connects as many 8-mm tape archivers (element archivers) as possible. In the scalable archiver, robotics can exchange a cassette tape between two adjacent element archivers mechanically. Thus, we can build a large scalable archiver inexpensively. In addition, a sophisticated migration mechanism distributes frequently accessed tapes (hot tapes) evenly among all of the element archivers, which improves the throughput considerably. Even with the failures of some tape drives, the system dynamically redistributes hot tapes to the other element archivers which have live tape drives. Several kinds of specially tailored huge archivers are on the market; however, the 8-mm tape scalable archiver could replace them. To maintain high performance in spite of high access locality when a large number of archivers are attached to the scalable archiver, it is necessary to scatter frequently accessed cassettes among the element archivers and to use the tape drives efficiently. For this purpose, we introduce two cassette migration algorithms, foreground migration and background migration. Background migration transfers cassettes between element archivers to redistribute frequently accessed

As the Army's Future Combat Systems (FCS) introduce emerging technologies and new force structures to the battlefield, soldiers will increasingly face new challenges in workload management. The next generation warfighter will be responsible for effectively managing robotic assets in addition to performing other missions. Studies of future battlefield operational scenarios involving the use of automation, including the specification of existing and proposed technologies, will provide significant insight into potential problem areas regarding soldier workload. The US Army Tank Automotive Research, Development, and Engineering Center (TARDEC) is currently executing an Army technology objective program to analyze and evaluate the effect of automated technologies and their associated control devices with respect to soldier workload. The Human-Robotic Interface (HRI) Intelligent Systems Behavior Simulator (ISBS) is a human performance measurement simulation system that allows modelers to develop constructive simulations of military scenarios with various deployments of interface technologies in order to evaluate operator effectiveness. One such interface is TARDEC's Scalable Soldier-Machine Interface (SMI). The scalable SMI provides a configurable machine interface application that is capable of adapting to several hardware platforms by recognizing the physical space limitations of the display device. This paper describes the integration of the ISBS and Scalable SMI applications, which will ultimately benefit both systems. The ISBS will be able to use the Scalable SMI to visualize the behaviors of virtual soldiers performing HRI tasks, such as route planning, and the scalable SMI will benefit from stimuli provided by the ISBS simulation environment. The paper describes the background of each system and details of the system integration approach.

This research applies recent advances in 3D isosurface reconstruction to images of test spheres and plant cells growing in suspension culture. Isosurfaces that represent object boundaries are constructed with a Marching Cubes algorithm applied to simple data sets, i.e., fluorescent test beads, and complex data sets, i.e., fluorescent plant cells, acquired with a Zeiss Confocal Laser Scanning Microscope (LSM). The Marching Cubes algorithm treats each pixel or voxel of the image as a separate entity when performing computations. To test the spatial accuracy of the reconstruction, control data representing the volume of a 25 micrometer test sphere was obtained with the LSM. This volume was then judged on the basis of uniformity and smoothness. Using polygon decimation and smoothing algorithms available through the visualization toolkit, 'voxellated' test spheres and cells were smoothed using several different smoothing algorithms after unessential polygons were eliminated. With these improvements, the shape of subcellular organelles could be modeled at various levels of accuracy. However, in order to accurately reconstruct the complex structures of interest to us, namely the subcellular organelles of the endosomal system or the endoplasmic reticulum of plant cells, measurements of the accuracy of connectedness of structures need to be developed.
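The per-voxel nature of Marching Cubes can be sketched in a few lines of Python. This is a generic illustration of the two core per-cell steps, not the implementation used in the study: classifying the eight corners of a cell against the isovalue, and linearly interpolating where the surface crosses an edge. The corner ordering and the function names are assumptions, and the 256-entry triangle lookup table is omitted.

```python
def cube_index(corner_values, isovalue):
    """Pack the inside/outside state of a cell's 8 corners into one byte.

    Bit i is set when corner i lies below the isovalue. The resulting
    index (0..255) would select a triangle pattern from a lookup table.
    """
    if len(corner_values) != 8:
        raise ValueError("a cubic cell has exactly 8 corners")
    index = 0
    for i, value in enumerate(corner_values):
        if value < isovalue:
            index |= 1 << i
    return index


def edge_crossing(p0, p1, v0, v1, isovalue):
    """Linearly interpolate where the isosurface cuts the edge p0-p1."""
    if v0 == v1:
        # Degenerate edge with equal samples: fall back to the first endpoint.
        return tuple(p0)
    t = (isovalue - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

Indices 0 and 255 correspond to cells that lie entirely on one side of the isosurface and therefore produce no triangles. Because each cell is processed independently, the raw output follows the voxel grid closely, which is consistent with the smoothing and decimation steps described in the abstract.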

An increasingly visual culture is affecting work and training. Achievement of visual literacy means acquiring competence in critical analysis of visual images and in communicating through visual media. (SK)

Contingency analysis is the process of employing different measures to model scenarios, analyze them, and then derive the best response to remove the threats. This application paper focuses on a class of contingency analysis problems found in the power grid management system. A power grid is a geographically distributed interconnected transmission network that transmits and delivers electricity from generators to end users. The power grid contingency analysis problem is increasingly important because of both the growing size of the underlying raw data that need to be analyzed and the urgency to deliver working solutions in an aggressive timeframe. Failure to do so may bring significant financial, economic, and security impacts to all parties involved and the society at large. The paper presents a scalable visual analytics pipeline that transforms about 100 million contingency scenarios to a manageable size and form for grid operators to examine different scenarios and come up with preventive or mitigation strategies to address the problems in a predictive and timely manner. Great attention is given to the computational scalability, information scalability, visual scalability, and display scalability issues surrounding the data analytics pipeline. Most of the large-scale computation requirements of our work are conducted on a Cray XMT multi-threaded parallel computer. The paper demonstrates a number of examples using western North American power grid models and data.

Clusters of workstations have emerged as an important platform for building cost-effective, scalable and highly-available computers. Although many hardware solutions are available today, the largest challenge in making large-scale clusters usable lies in the system software. In this paper we present STORM, a resource management tool designed to provide scalability, low overhead and the flexibility necessary to efficiently support and analyze a wide range of job scheduling algorithms. STORM achieves these feats by closely integrating the management daemons with the low-level features that are common in state-of-the-art high-performance system area networks. The architecture of STORM is based on three main technical innovations. First, a sizable part of the scheduler runs in the thread processor located on the network interface. Second, we use hardware collectives that are highly scalable both for implementing control heartbeats and for distributing the binary of a parallel job in near-constant time, irrespective of job and machine sizes. Third, we use an I/O bypass protocol that allows fast data movements from the file system to the communication buffers in the network interface and vice versa. The experimental results show that STORM can launch a job with a binary of 12 MB on a 64-processor/32-node cluster in less than 0.25 sec on an empty network, in less than 0.45 sec when all the processors are busy computing other jobs, and in less than 0.65 sec when the network is flooded with background traffic. This paper provides experimental and analytical evidence that these results scale to a much larger number of nodes. To the best of our knowledge, STORM is at least two orders of magnitude faster than existing production schedulers in launching jobs, performing resource management tasks and gang scheduling.

The research project RD24 is studying applications of the Scalable Coherent Interface (IEEE-1596) standard for the large hadron collider (LHC). First SCI node chips from Dolphin were used to demonstrate the use and functioning of SCI's packet protocols and to measure data rates. The authors present results from a first, two-node SCI ringlet at CERN, based on an R3000 RISC processor node and a DMA node on an MC68040 processor bus. A diagnostic link analyzer monitors the SCI packet protocols up to full link bandwidth. In its second phase, RD24 will build a first implementation of a multi-ringlet SCI data merger.

Standard Shack-Hartmann wavefront sensors use a CCD element to sample position and distortion of a target or guide star. Digital sampling of the element and transfer to a memory space for subsequent computation adds significant temporal delay, thus limiting the spatial frequency and scalability of the system as a wavefront sensor. A new approach to sampling uses information processing principles found in an insect compound eye. Analog circuitry eliminates digital sampling and extends the useful range of the system to control a deformable mirror and make a faster, more capable wavefront sensor.

Many tools that target parallel and distributed environments must co-locate a set of daemons with the distributed processes of the target application. However, efficient and portable deployment of these daemons on large scale systems is an unsolved problem. We close this gap with LaunchMON, a scalable, robust, portable, secure, and general purpose infrastructure for launching tool daemons. Its API allows tool builders to identify all processes of a target job, launch daemons on the relevant nodes and control daemon interaction. Our results show that LaunchMON scales to very large daemon counts and substantially enhances performance over existing ad hoc mechanisms.

The introduction of parallel processors that run a separate copy of Unix on each processor has introduced new problems in managing the user's environment. This paper discusses some generalizations of common Unix commands for managing files (e.g. ls) and processes (e.g. ps) that are convenient and scalable. These basic tools, just like their Unix counterparts, are text-based. We also discuss a way to use these with a graphical user interface (GUI). Some notes on the implementation are provided. Prototypes of these commands are publicly available.

This revision corrects some errors in SPRNG 1. Users of newer SPRNG versions can obtain the corrected files and build their version with it. This version also improves the scalability of some of the application-based tests in the SPRNG test suite. It also includes an interface to a parallel Mersenne Twister, so that if users install the Mersenne Twister, then they can test this generator with the SPRNG test suite and also use some SPRNG features with that generator.

Superhydrophobic (SH) surfaces, created from hydrophobic materials with micro- or nano- roughness, trap air pockets in the interstices of the roughness, leading, in fluid flow conditions, to shear-free regions with finite interfacial fluid velocity and reduced resistance to flow. Significant attention has been given to SH conditions on ordered, periodic surfaces. However, in practical terms, random surfaces are more applicable due to their relative ease of fabrication. We investigate SH behavior on a novel durable polymeric rough surface created through a scalable roll-coating process with varying micro-scale roughness through velocity and pressure drop measurements. We introduce a new method to construct the velocity profile over SH surfaces with significant roughness in microchannels. Slip length was measured as a function of differing roughness and interstitial air conditions, with roughness and air fraction parameters obtained through direct visualization. The slip length was matched to scaling laws with good agreement. Roughness at high air fractions led to a reduced pressure drop and higher velocities, demonstrating the effectiveness of the considered surface in terms of reduced resistance to flow. We conclude that the observed air fraction under flow conditions is the primary factor determining the response in fluid flow. Such behavior correlated well with the hydrophobic or superhydrophobic response, indicating significant potential for practical use in enhancing fluid flow efficiency.

The Cray Gemini Interconnect has been recently introduced as a next generation network architecture for building multi-petaflop supercomputers. Cray XE6 systems including LANL Cielo, NERSC Hopper, ORNL Titan and the proposed NCSA BlueWaters leverage the Gemini Interconnect as their primary interconnection network. At the same time, programming models such as the Message Passing Interface (MPI) and Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) and Co-Array Fortran (CAF) have become available on these systems. Global Arrays is a popular PGAS model used in a variety of application domains including hydrodynamics, chemistry and visualization. Global Arrays uses the Aggregate Remote Memory Copy Interface (ARMCI) as the communication runtime system for Remote Memory Access communication. This paper presents a design, implementation and performance evaluation of scalable and high performance communication subsystems on the Cray Gemini Interconnect using ARMCI. The design space is explored and time-space complexities of communication protocols for one-sided communication primitives such as contiguous and uniformly non-contiguous datatypes, atomic memory operations (AMOs) and memory synchronization are presented. An implementation of the proposed design (referred to as ARMCI-Gemini) demonstrates the efficacy on communication primitives, application kernels such as LU decomposition and full applications such as a Smooth Particle Hydrodynamics (SPH) application.

Petascale systems will have hundreds of thousands of processor cores so their applications must be massively parallel. Effective use of petascale systems will require efficient interprocess communication through memory hierarchies and complex network topologies. Tools to collect and analyze detailed data about this communication would facilitate its optimization. However, several factors complicate tool design. First, large-scale runs on petascale systems will be a precious commodity, so scalable tools must have almost no overhead. Second, the volume of performance data from petascale runs could easily overwhelm hand analysis and, thus, tools must collect only data that is relevant to diagnosing performance problems. Analysis must be done in-situ, when available processing power is proportional to the data. We describe a tool framework that overcomes these complications. Our approach allows application developers to combine existing techniques for measurement, analysis, and data aggregation to develop application-specific tools quickly. Dynamic configuration enables application developers to select exactly the measurements needed and generic components support scalable aggregation and analysis of this data with little additional effort.

The large number of reagents that have been developed for the synthesis of trifluoromethylated compounds is a testament to the importance of the CF3 group as well as the associated synthetic challenge. Current state-of-the-art reagents for appending the CF3 functionality directly are highly effective; however, their use on preparative scale has minimal precedent because they require multistep synthesis for their preparation, and/or are prohibitively expensive for large-scale application. For a scalable trifluoromethylation methodology, trifluoroacetic acid and its anhydride represent an attractive solution in terms of cost and availability; however, because of the exceedingly high oxidation potential of trifluoroacetate, previous endeavours to use this material as a CF3 source have required the use of highly forcing conditions. Here we report a strategy for the use of trifluoroacetic anhydride for a scalable and operationally simple trifluoromethylation reaction using pyridine N-oxide and photoredox catalysis to effect a facile decarboxylation to the CF3 radical. PMID:26258541

SNES (Scalable Nonlinear Equations Solvers) is a software package for the numerical solution of large-scale systems of nonlinear equations on both uniprocessors and parallel architectures. SNES also contains a component for the solution of unconstrained minimization problems, called SUMS (Scalable Unconstrained Minimization Solvers). Newton-like methods, which are known for their efficiency and robustness, constitute the core of the package. As part of the multilevel PETSc library, SNES incorporates many features and options from other parts of PETSc. In keeping with the spirit of the PETSc library, the nonlinear solution routines are data-structure-neutral, making them flexible and easily extensible. This user's guide contains a detailed description of uniprocessor usage of SNES, with some added comments regarding multiprocessor usage. At this time the parallel version is undergoing refinement and extension, as we work toward a common interface for the uniprocessor and parallel cases. Thus, forthcoming versions of the software will contain additional features, and changes to the parallel interface may occur at any time. The new parallel version will employ the MPI (Message Passing Interface) standard for interprocessor communication. Since most of these details will be hidden, users will need to perform only minimal message-passing programming.

This paper presents a novel unequal erasure protection (UEP) strategy for the transmission of scalable data, formed by interleaving independently decodable and scalable streams, over packet erasure networks. The technique, termed multistream UEP (M-UEP), differs from the traditional UEP strategy by: 1) placing separate streams in separate packets to establish independence and 2) using permuted systematic Reed-Solomon codes to enhance the distribution of message symbols amongst the packets. M-UEP improves upon UEP by ensuring that all received source symbols are decoded. The R-D optimal redundancy allocation problem for M-UEP is formulated and its globally optimal solution is shown to have a time complexity of $O(2^N N (L+1)^{N+1})$, where N is the number of packets and L is the packet length. To address the high complexity of the globally optimal solution, an efficient suboptimal algorithm is proposed which runs in $O(N^2 L^2)$ time. The proposed M-UEP algorithm is applied on SPIHT coded images in conjunction with an appropriate grouping of wavelet coefficients into streams. The experimental results reveal that M-UEP consistently outperforms the traditional UEP reaching peak improvements of 0.6 dB. Moreover, our tests show that M-UEP is more robust than UEP in adverse channel conditions. PMID:19783503

The Fast Line-of-sight Imagery for Target and Exhaust Signatures (FLITES) is a High Performance Computing (HPC-CHSSI) and Missile Defense Agency (MDA) funded effort that provides a scalable program to compute highly resolved temporal, spatial, and spectral hardbody and plume optical signatures. Distributed processing capabilities are included to allow complex, high-fidelity solutions to be generated quickly. The distributed processing logic includes automated load balancing algorithms to facilitate scalability using large numbers of processors. To enhance exhaust plume optical signature capabilities, FLITES employs two different radiance transport algorithms. The first algorithm is the traditional Curtis-Godson bandmodel approach and is provided to support comparisons to historical results and high-frame rate production requirements. The second algorithm is the Quasi Bandmodel Line-by-line (QBL) approach, which uses randomly placed "cloned" spectral lines to yield highly resolved radiation spectra for increased accuracy while maintaining tractable runtimes. This capability will provide a significant advancement over the traditional SPURC/SIRRM radiance transport methodology.

Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564

We have investigated the computational scalability of image pyramid building needed for dissemination of very large image data. The sources of large images include high resolution microscopes and telescopes, remote sensing and airborne imaging, and high resolution scanners. The term 'large' is understood from a user perspective, which means either larger than a display size or larger than a memory/disk to hold the image data. The application drivers for our work are digitization projects such as the Lincoln Papers project (each image scan is about 100-150 MB or about 5000x8000 pixels, with the total number to be around 200,000) and the UIUC library scanning project for historical maps from the 17th and 18th centuries (smaller number but larger images). The goal of our work is to understand the computational scalability of web-based dissemination using image pyramids for these large image scans, as well as the preservation aspects of the data. We report our computational benchmarks for (a) building image pyramids to be disseminated using the Microsoft Seadragon library, (b) a computation execution approach using hyper-threading to generate image pyramids and to utilize the underlying hardware, and (c) an image pyramid preservation approach using various hard drive configurations of Redundant Array of Independent Disks (RAID) drives for input/output operations. The benchmarks are obtained with a map (334.61 MB, JPEG format, 17591x15014 pixels). The discussion combines the speed and preservation objectives.
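
To give a feel for the amount of work involved in building an image pyramid for the benchmark map, the following Python sketch counts pyramid levels and tiles. It assumes the common convention of halving each dimension (rounding up) until a 1x1 image is reached, and a tile size of 256 pixels; both are illustrative assumptions and are not taken from the paper or from the Seadragon documentation. The function names are hypothetical.

```python
def pyramid_levels(width: int, height: int) -> list[tuple[int, int]]:
    """Return (width, height) per level, from full resolution down to 1x1.

    Assumes each level halves the previous one, rounding up.
    """
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        levels.append((width, height))
    return levels


def tile_count(width: int, height: int, tile_size: int = 256) -> int:
    """Number of tiles needed to cover one level (tile_size is an assumed value)."""
    cols = -(-width // tile_size)   # ceiling division
    rows = -(-height // tile_size)
    return cols * rows


# The benchmark map from the abstract: 17591x15014 pixels.
levels = pyramid_levels(17591, 15014)
total_tiles = sum(tile_count(w, h) for w, h in levels)
print(f"{len(levels)} levels, {total_tiles} tiles in total")
```

Because every level has roughly a quarter of the pixels of the one above it, the full-resolution level dominates both the tile count and the I/O cost under these assumptions.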

Taxanes form a large family of terpenes comprising over 350 members, the most famous of which is Taxol (paclitaxel), a billion-dollar anticancer drug. Here, we describe the first practical and scalable synthetic entry to these natural products via a concise preparation of (+)-taxa-4(5),11(12)-dien-2-one, which has a suitable functional handle with which to access more oxidized members of its family. This route enables a gram-scale preparation of the ‘parent’ taxane—taxadiene—which is the largest quantity of this naturally occurring terpene ever isolated or prepared in pure form. The characteristic 6-8-6 tricyclic system of the taxane family, containing a bridgehead alkene, is forged via a vicinal difunctionalization/Diels-Alder strategy. Asymmetry is introduced by means of an enantioselective conjugate addition that forms an all-carbon quaternary centre, from which all other stereocentres are fixed through substrate control. This study lays a critical foundation for a planned access to minimally oxidized taxane analogues and a scalable laboratory preparation of Taxol itself.

The physical implementation of nontrivial quantum computing is an experimental challenge due to decoherence and the need for scalability. Recently we proved a novel theoretical scheme for realizing a scalable quantum register of very large size, entangled in a cluster state, in the optical frequency comb (OFC) defined by the eigenmodes of a single optical parametric oscillator (OPO). The classical OFC is well known as implemented by the femtosecond, carrier-envelope-phase- and mode-locked lasers which have redefined frequency metrology in recent years. The quantum OFC is a set of harmonic oscillators, or Qmodes, whose amplitude and phase quadratures are continuous variables, the manipulation of which is a mature field for one or two Qmodes. We have shown that the nonlinear optical medium of a single OPO can be engineered, in a sophisticated but already demonstrated manner, so as to entangle in constant time the OPO's OFC into a finitely squeezed, Gaussian cluster state suitable for universal quantum computing over continuous variables. Here we summarize our theoretical result and survey the ongoing experimental efforts in this direction.

In this paper, we investigate the problem of scalable visual feature matching in large-scale image search and propose a novel cascaded scalar quantization scheme in dual resolution. We formulate the visual feature matching as a range-based neighbor search problem and approach it by identifying hyper-cubes with a dual-resolution scalar quantization strategy. Specifically, for each dimension of the PCA-transformed feature, scalar quantization is performed at both coarse and fine resolutions. The scalar quantization results at the coarse resolution are cascaded over multiple dimensions to index an image database. The scalar quantization results over multiple dimensions at the fine resolution are concatenated into a binary super-vector and stored into the index list for efficient verification. The proposed cascaded scalar quantization (CSQ) method is free of the costly visual codebook training and thus is independent of any image descriptor training set. The index structure of the CSQ is flexible enough to accommodate new image features and scalable enough to index large-scale image databases. We evaluate our approach on the public benchmark datasets for large-scale image retrieval. Experimental results demonstrate the competitive retrieval performance of the proposed method compared with several recent retrieval algorithms on feature quantization. PMID:26656584

Visualization plays a critical role in the statistical model building and data analysis process. Data analysts, well-versed in statistical and machine learning methods, visualize data to hypothesize and validate models. These analysts need flexible, scalable visualization tools that are not decoupled from their analysis environment. In this paper we introduce Trelliscope, a visualization framework for statistical analysis of large complex data. Trelliscope extends Trellis, an effective visualization framework that divides data into subsets and applies a plotting method to each subset, arranging the results in rows and columns of panels. Trelliscope provides a way to create, arrange and interactively view panels for very large datasets, enabling flexible detailed visualization for data of any size. Scalability is achieved using distributed computing technologies coupled with . We discuss the underlying principles, design, and scalable architecture of Trelliscope, and illustrate its use on three analysis projects in the domains of proteomics, high energy physics, and power systems engineering.
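
The divide-and-plot idea behind Trellis displays can be illustrated with a small, self-contained Python sketch. This is not Trelliscope's API; the function names, the record layout and the text summary standing in for a real plotting method are all hypothetical, chosen only to show the three steps named in the abstract: split the data into subsets, apply one panel function to each subset, and arrange the panels in rows and columns.

```python
from collections import defaultdict
from statistics import mean


def split_into_subsets(records, key):
    """Divide records into subsets, one per distinct value of `key`."""
    subsets = defaultdict(list)
    for record in records:
        subsets[record[key]].append(record)
    return dict(subsets)


def summary_panel(name, subset):
    """Stand-in for a plotting method: a one-line text summary of a subset."""
    values = [r["value"] for r in subset]
    return f"{name}: n={len(values)}, mean={mean(values):.2f}"


def arrange_in_grid(panels, ncol):
    """Lay the panels out row by row in a grid with `ncol` columns."""
    return [panels[i:i + ncol] for i in range(0, len(panels), ncol)]


records = [
    {"group": "a", "value": 1.0},
    {"group": "a", "value": 3.0},
    {"group": "b", "value": 10.0},
    {"group": "c", "value": 4.0},
]
subsets = split_into_subsets(records, "group")
panels = [summary_panel(name, subset) for name, subset in sorted(subsets.items())]
grid = arrange_in_grid(panels, ncol=2)
for row in grid:
    print(" | ".join(row))
```

Since each panel depends only on its own subset, the panel step is naturally parallel, which is what makes a distributed back end a plausible fit for this style of display.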

We present the first distributed paradigm for multiple users to interact simultaneously with large tiled rear projection display walls. Unlike earlier works, our paradigm allows easy scalability across different applications, interaction modalities, displays and users. The novelty of the design lies in its distributed nature allowing well-compartmented, application independent, and application specific modules. This enables adapting to different 2D applications and interaction modalities easily by changing a few application specific modules. We demonstrate four challenging 2D applications on a nine projector display to demonstrate the application scalability of our method: map visualization, virtual graffiti, virtual bulletin board and an emergency management system. We demonstrate the scalability of our method to multiple interaction modalities by showing both gesture-based and laser-based user interfaces. Finally, we improve earlier distributed methods to register multiple projectors. Previous works need multiple patterns to identify the neighbors, the configuration of the display and the registration across multiple projectors in logarithmic time with respect to the number of projectors in the display. We propose a new approach that achieves this using a single pattern based on specially augmented QR codes in constant time. Further, previous distributed registration algorithms are prone to large misregistrations. We propose a novel radially cascading geometric registration technique that yields significantly better accuracy. Thus, our improvements allow a significantly more efficient and accurate technique for distributed self-registration of multi-projector display walls. PMID:20975205

As media processing gradually migrates from hardware to software programmable platforms, the number of media processing functions added on the media processor grows even faster than the ever-increasing media processor power can support. Computational complexity scalable algorithms become powerful vehicles for implementing many time-critical yet complexity-constrained applications, such as MPEG2 video decoding. In this paper, we present an adaptive resource-constrained complexity scalable MPEG2 video decoding scheme that makes a good trade-off between decoding complexity and output quality. Based on the available computational resources and the energy level of B-frame residuals, the scalable decoding algorithm selectively decodes B-residual blocks to significantly reduce system complexity. Furthermore, we describe an iterative procedure designed to dynamically adjust the complexity levels in order to achieve the best possible output quality under a given resource constraint. Experimental results show that up to a 20% reduction in total computational complexity can be obtained with satisfactory output visual quality.

In this work a new, scalable and low cost multi-channel monitoring system for Polymer Electrolyte Fuel Cells (PEFCs) has been designed, constructed and experimentally validated. This monitoring system performs non-intrusive voltage measurement of each individual cell of a PEFC stack and it is scalable, in the sense that it is capable of carrying out measurements in stacks from 1 to 120 cells (from watts to kilowatts). The developed system comprises two main subsystems: hardware devoted to data acquisition (DAQ) and software devoted to real-time monitoring. The DAQ subsystem is based on the low-cost open-source platform Arduino and the real-time monitoring subsystem has been developed using the high-level graphical language NI LabVIEW. Such integration can be considered a novelty in scientific literature for PEFC monitoring systems. An original amplifying and multiplexing board has been designed to increase the Arduino input port availability. Data storage and real-time monitoring have been performed with an easy-to-use interface. Graphical and numerical visualization allows a continuous tracking of cell voltage. Scalability, flexibility, ease of use, versatility and low cost are the main features of the proposed approach. The system is described and experimental results are presented. These results demonstrate its suitability to monitor the voltage in a PEFC at cell level. PMID:27005630

-complete factorizations, possibly with a high percentage of missing values. This promotes additional sparsity beyond rank reduction. Computationally, we design methods based on a "decomposition and combination" strategy, to break large-scale optimization into many small subproblems to solve in a recursive and parallel manner. On this basis, we implement the proposed methods through multi-platform shared-memory parallel programming, and through Mahout, a library for scalable machine learning and data mining, for MapReduce computation. For example, our methods are scalable to a dataset consisting of three billion observations on a single machine with sufficient memory, with good timings. Both theoretical and numerical investigations show that the proposed methods exhibit significant improvement in accuracy over state-of-the-art scalable methods.

Beginning with the ground-breaking work on DNA computation by Adleman in 1994 [2], the idea of using DNA molecules to perform computations has been explored extensively. In this thesis, a computation based on a scalable DNA neural network was discussed and a neuron model was partially implemented using DNA molecules. In order to understand the behavior of short DNA strands in a polyacrylamide gel, we have measured the mobilities of various single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) shorter than 100 bases. We found that sufficiently short lengths of ssDNA had a higher mobility than the same lengths of dsDNA, with a crossover length Lx at which the mobilities are equal. The crossover length decreases approximately linearly with the acrylamide concentration of the polyacrylamide gel. At the same time, the influence of DNA structure on its mobility was studied and the effect of single-stranded overhangs on dsDNA was discussed. The idea of making a scalable DNA neural network was discussed. To prepare our basis vector DNA oligomers, a 90 base DNA template with a 50 base random strand in the middle and two 20 base primers on the ends was designed and purchased. By a series of dilutions, we obtained several aliquots, containing only 30 random sequence molecules each. These were amplified to roughly 5 picomole quantities by 38 cycles of PCR with hot start DNA polymerase. We then used asymmetric PCR followed by polyacrylamide gel purification to get the necessary single-stranded basis vectors (ssDNA) and their complements. We tested the suitability of this scheme by adding two vectors formed from different linear combinations of the basis vectors. The full scheme for DNA neural network computation was tested using two determinate ssDNA strands. We successfully transformed an input DNA oligomer to a different output oligomer using the polymerase reaction required by the proposed DNA neural network algorithm. Isothermal linear amplification was used to obtain a sufficient quantity of
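
As a plausibility check on the amplification figures quoted above (30 template molecules, 38 PCR cycles, roughly 5 picomoles of product), the following Python sketch applies the idealized exponential model in which each cycle multiplies the copy number by (1 + efficiency). The model ignores plateau effects and reagent limits, and the 95% per-cycle efficiency is an assumed value chosen for illustration; neither comes from the thesis.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def pcr_yield_pmol(start_molecules: float, cycles: int, efficiency: float = 1.0) -> float:
    """Picomoles of product under an idealized exponential PCR model.

    Each cycle multiplies the number of copies by (1 + efficiency);
    efficiency = 1.0 means perfect doubling.
    """
    molecules = start_molecules * (1.0 + efficiency) ** cycles
    return molecules / AVOGADRO * 1e12


ideal = pcr_yield_pmol(30, 38)             # perfect doubling in every cycle
realistic = pcr_yield_pmol(30, 38, 0.95)   # assumed 95% per-cycle efficiency
print(f"ideal: {ideal:.1f} pmol, at 95% efficiency: {realistic:.1f} pmol")
```

Under these assumptions, perfect doubling would overshoot the reported quantity by a factor of two to three, while a slightly lower per-cycle efficiency lands in the same range as the roughly 5 picomoles reported.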

The development of methods, algorithms and applications for visualization of molecular dynamics simulation outputs is discussed. The visual analysis of the results of such calculations is a complex and pressing problem, especially in the case of large scale simulations. To solve this challenging task it is necessary to decide on: 1) what data parameters to render, 2) what type of visualization to choose, 3) what development tools to use. In the present work an attempt was made to answer these questions. For visualization we proposed drawing particles at their 3D coordinates together with their velocity vectors, trajectories and volume density in the form of isosurfaces or fog. We tested a post-processing and visualization approach based on the Python language with additional libraries. Parallel software was also developed that allows processing large volumes of data in the 3D regions of the examined system. This software makes it possible to obtain the desired results in parallel with the calculations, and at the end to collect the discrete frames into a video file. The software package "Enthought Mayavi2" was used as the tool for visualization. This visualization application gave us the opportunity to study the interaction of a gas with a metal surface and to closely observe the adsorption effect.

In emergency situations, the ability to remotely monitor unfolding events using high-quality video feeds will significantly improve the incident commander's understanding of the situation and thereby aid effective decision making. This paper presents a novel, adaptive video monitoring system for emergency situations where the normal communications network infrastructure has been severely impaired or is no longer operational. The proposed scheme, operating over a rapidly deployable wireless mesh network, supports real-time video feeds between first responders, forward operating bases and primary command and control centers. Video feeds captured on portable devices carried by first responders and by static visual sensors are encoded in H.264/SVC, the scalable extension to H.264/AVC, allowing efficient, standard-based temporal, spatial, and quality scalability of the video. A three-tier video delivery system is proposed, which balances the need to avoid overuse of mesh nodes with the operational requirements of the emergency management team. In the first tier, the video feeds are delivered at a low spatial and temporal resolution employing only the base layer of the H.264/SVC video stream. Routing in this mode is designed to employ all nodes across the entire mesh network. In the second tier, whenever operational considerations require that commanders or operators focus on a particular video feed, a 'fidelity control' mechanism at the monitoring station sends control messages to the routing and scheduling agents in the mesh network, which increase the quality of the received picture using SNR scalability while conserving bandwidth by maintaining a low frame rate. In this mode, routing decisions are based on reliable packet delivery, with the most reliable routes being used to deliver the base and lower enhancement layers; as fidelity is increased and more scalable layers are transmitted, they will be assigned to routes in descending order of reliability. The third tier

The Scalable Reasoning System (SRS) is a lightweight visual analytics framework that makes analytical capabilities widely accessible to a class of users we have deemed “impromptu analysts.” By focusing on a deployment of SRS, the Lessons Learned Explorer (LLEx), we examine how to develop visualizations around analytical-oriented goals and data availability. We discuss how to help impromptu analysts to explore deeper patterns. Through designing consistent interactions, we arrive at an interdependent view capable of showcasing patterns. With the combination of SRS widget visualizations and interactions around the underlying textual data, we aim to transition the casual, infrequent user into a viable–albeit impromptu–analyst.

Piezoelectric material (PZT) has drawn enormous attention in the past decades due to its ability to convert mechanical deformation energy into electrical potential energy, and vice versa, and has been applied to energy harvesting and vibration control. In this work, we consider the effect of PZT on the stability of a flexible flag using the inviscid vortex-sheet model. We find that the critical flutter speed is increased due to the extra damping effect of the PZT, and can also be altered by tuning the output inductance-resistance circuit. Optimal resistance and inductance are found to either maximize or minimize the flutter speed. The former application is useful for vibration control while the latter is important for energy harvesting. We also discuss the scalability of the above system to actual applications in air and water.

In this paper three models of parallel speedup are studied. They are fixed-size speedup, fixed-time speedup and memory-bounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set considers uneven workload allocation and communication overhead and gives more accurate estimation. Another set considers a simplified case and provides a clear picture on the impact of the sequential portion of an application on the possible performance gain from parallel processing. The simplified fixed-size speedup is Amdahl's law. The simplified fixed-time speedup is Gustafson's scaled speedup. The simplified memory-bounded speedup contains both Amdahl's law and Gustafson's scaled speedup as special cases. This study leads to a better understanding of parallel processing.
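
The three simplified speedup models named above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's own formulation: the function names are hypothetical, `serial_fraction` is assumed to be the sequential share of the workload, and `work_growth` stands for an assumed factor G(p) by which the parallel work grows when memory scales with the processor count.

```python
def amdahl_speedup(serial_fraction, processors):
    """Fixed-size speedup (Amdahl's law): the problem size stays constant."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction, processors):
    """Fixed-time speedup (Gustafson's scaled speedup): parallel work grows with p."""
    return serial_fraction + (1.0 - serial_fraction) * processors


def memory_bounded_speedup(serial_fraction, processors, work_growth):
    """Memory-bounded speedup with an assumed work growth factor G(p).

    work_growth = 1 reduces to Amdahl's law;
    work_growth = processors reduces to Gustafson's scaled speedup.
    """
    parallel = (1.0 - serial_fraction) * work_growth
    return (serial_fraction + parallel) / (serial_fraction + parallel / processors)


if __name__ == "__main__":
    s, p = 0.1, 10
    print(amdahl_speedup(s, p))
    print(gustafson_speedup(s, p))
    print(memory_bounded_speedup(s, p, work_growth=p))
```

Setting `work_growth` to 1 or to `processors` shows the two special cases mentioned in the abstract falling out of the single memory-bounded expression.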

Positron Emission Tomography (PET) historically has major clinical and preclinical applications in cancerous oncology, neurology, and cardiovascular diseases. Recently, in a new direction, an application specific PET system is being developed at Thomas Jefferson National Accelerator Facility (Jefferson Lab) in collaboration with Duke University, University of Maryland at Baltimore (UMAB), and West Virginia University (WVU) targeted for plant eco-physiology research. The new plant imaging PET system is versatile and scalable such that it could adapt to several plant imaging needs - imaging many important plant organs including leaves, roots, and stems. The mechanical arrangement of the detectors is designed to accommodate the unpredictable and random distribution in space of the plant organs without requiring the plant be disturbed. Prototyping such a system requires a new data acquisition system (DAQ) and data processing system which are adaptable to the requirements of these unique and versatile detectors.

Digital vascular computer systems are used for radiology and fluoroscopy (R/F), angiography, and cardiac applications. In the United States alone, about 26 million procedures of these types are performed annually: about 81% R/F, 11% cardiac, and 8% angiography. Digital vascular systems have a very wide range of performance requirements, especially in terms of data rates. In addition, new features are added over time as they are shown to be clinically efficacious. Application-specific processing modes such as roadmapping, peak opacification, and bolus chasing are particular to some vascular systems. New algorithms continue to be developed and proven, such as Cox and deJager's precise registration methods for masks and live images in digital subtraction angiography. A computer architecture must have high scalability and reconfigurability to meet the needs of this modality. Ideally, the architecture could also serve as the basis for a nonvascular R/F system.

We report on a scalable solid-state neutron detector system that is specifically designed to yield high thermal neutron detection sensitivity. The basic detector unit in this system is made of a ⁶Li foil coupled to two crystalline silicon diodes. The theoretical intrinsic efficiency of a detector-unit is 23.8% and that of a detector element comprising a stack of five detector-units is 60%. Based on the measured performance of this detector-unit, the performance of a detector system comprising a planar array of detector elements, scaled to encompass an effective area of 0.43 m², is estimated to yield the minimum absolute efficiency required of radiological portal monitors used in homeland security.

Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree–Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.
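
To make the idea of density matrix purification concrete, here is a minimal sketch of one classic scheme, the McWeeny iteration P ← 3P² − 2P³. The abstract does not say which purification variant the authors use, so McWeeny is an assumption chosen for illustration, as are the function names, the orthonormal basis, and the tiny dense matrix. The point it shows is that purification is built only from matrix multiplications, with no eigendecomposition.

```python
def matmul(a, b):
    """Dense matrix product for small lists-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mcweeny_purify(p, iterations=20):
    """Drive a nearly idempotent symmetric matrix toward P @ P == P.

    Assumes an orthonormal basis and eigenvalues of the starting guess
    close enough to 0 or 1 for the iteration to converge.
    """
    n = len(p)
    for _ in range(iterations):
        p2 = matmul(p, p)
        p3 = matmul(p2, p)
        p = [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)]
             for i in range(n)]
    return p


if __name__ == "__main__":
    guess = [[0.9, 0.1],
             [0.1, 0.2]]  # hypothetical starting matrix, not from the paper
    purified = mcweeny_purify(guess)
    print(purified)
```

Each iteration pushes eigenvalues above 0.5 toward 1 and those below 0.5 toward 0, so the result approaches a projector.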

Despite the explosion of text on the Internet, hard copy documents that have been scanned as images still play a significant role for some tasks. The best method to perform ranked retrieval on a large corpus of document images, however, remains an open research question. The most common approach has been to perform text retrieval using terms generated by optical character recognition. This paper, by contrast, examines whether a scalable segmentation-free image retrieval algorithm, which matches sub-images containing text or graphical objects, can provide additional benefit in satisfying a user's information needs on a large, real world dataset. Results on 7 million scanned pages from the CDIP v1.0 test collection show that content based image retrieval finds a substantial number of documents that text retrieval misses, and that when used as a basis for relevance feedback can yield improvements in retrieval effectiveness.

We report on a scalable solid-state neutron detector system that is specifically designed to yield high thermal neutron detection sensitivity. The basic detector unit in this system is made of a ⁶Li foil coupled to two crystalline silicon diodes. The theoretical intrinsic efficiency of a detector-unit is 23.8% and that of a detector element comprising a stack of five detector-units is 60%. Based on the measured performance of this detector-unit, the performance of a detector system comprising a planar array of detector elements, scaled to encompass an effective area of 0.43 m², is estimated to yield the minimum absolute efficiency required of radiological portal monitors used in homeland security.

We report on a scalable solid-state neutron detector system that is specifically designed to yield high thermal neutron detection sensitivity. The basic detector unit in this system is made of a ⁶Li foil coupled to two crystalline silicon diodes. The theoretical intrinsic efficiency of a detector-unit is 23.8% and that of a detector element comprising a stack of five detector-units is 60%. Based on the measured performance of this detector-unit, the performance of a detector system comprising a planar array of detector elements, scaled to encompass an effective area of 0.43 m², is estimated to yield the minimum absolute efficiency required of radiological portal monitors used in homeland security. PMID:26133869

Catalysis has been a topic of continuous interest since it was discovered in chemistry centuries ago. Aiming at the advance of reactions for efficient processes, a number of approaches have been developed over the last 180 years, and more recently, porphyrins have come to occupy an important role in this field. Porphyrins and metalloporphyrins are fascinating compounds which are involved in a number of synthetic transformations of great interest for industry and academia. The aim of this review is to cover the most recent progress in reactions catalysed by porphyrins in scalable procedures, thus presenting the state of the art in reactions of epoxidation, sulfoxidation, oxidation of alcohols to carbonyl compounds and C-H functionalization. In addition, the use of porphyrins as photocatalysts in continuous flow processes is covered. PMID:27005601

Given a social network, who is the best person to introduce you to, say, Chris Ferguson, the poker champion? Or, given a network of people and skills, who is the best person to help you learn about, say, wavelets? The goal is to find a small group of 'gateways': persons who are close enough to us, as well as close enough to the target (person, or skill) or, in other words, are crucial in connecting us to the target. The main contributions are the following: (a) we show how to formulate this problem precisely; (b) we show that it is sub-modular and thus it can be solved near-optimally; (c) we give fast, scalable algorithms to find such gateways. Experiments on real data sets validate the effectiveness and efficiency of the proposed methods, achieving up to 6,000,000x speedup.
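
Contribution (b) relies on a standard fact: a monotone submodular objective can be maximized near-optimally by a simple greedy loop. The sketch below illustrates that greedy loop on a stand-in objective (set coverage), because the abstract does not give the actual gateway score. The function name, the `reach` data, and the coverage objective are all hypothetical.

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k candidates, each time taking the largest marginal gain.

    candidates: dict mapping a candidate name to the set of items it covers.
    Coverage is a stand-in monotone submodular objective, not the paper's score.
    """
    chosen, covered = [], set()
    for _ in range(k):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda c: len(candidates[c] - covered))
        if not candidates[best] - covered:
            break  # no candidate adds anything new
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


if __name__ == "__main__":
    # Hypothetical data: which people each candidate gateway can reach.
    reach = {"ann": {1, 2, 3}, "bob": {3, 4}, "cy": {4, 5, 6, 7}, "dee": {1}}
    print(greedy_max_coverage(reach, k=2))
```

For monotone submodular objectives under a size limit, this greedy rule is known to reach at least a (1 − 1/e) fraction of the optimum, which is the sense of "near-optimally" in the classical theory.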

This paper presents guidelines for the design of a mass storage system benchmark suite, along with preliminary suggestions for programs to be included. The benchmarks will measure both peak and sustained performance of the system and predict both short- and long-term behavior. These benchmarks should be both portable and scalable so they may be used on storage systems from tens of gigabytes to petabytes or more. By developing a standard set of benchmarks that reflect real user workload, we hope to encourage system designers and users to publish performance figures that can be compared with those of other systems. This will allow users to choose the system that best meets their needs and give designers a tool with which they can measure the performance effects of improvements to their systems.

Clip art is a simplified illustration form consisting of layered filled polygons or closed curves used to convey 3D shape information in a 2D vector graphics format. This paper focuses on the problem of direct conversion of smooth surfaces, ranging from the free-form shapes of art and design to the mathematical structures of geometry and topology, into a clip art form suitable for illustration use in books, papers and presentations. We show how to represent silhouette, shadow, gleam and other surface feature curves as the intersection of implicit surfaces, and derive equations for their efficient interrogation via particle chains. We further describe how to sort, orient, identify and fill the closed regions that overlay to form clip art. We demonstrate the results with numerous renderings used to illustrate the paper itself. PMID:17993708

We present two new applications that engage the network as a tool for astronomical research and/or education. The first is a VRML server which allows users over the Web to interactively create three-dimensional visualizations of FITS images contained in the NCSA Astronomy Digital Image Library (ADIL). The server's Web interface allows users to select images from the ADIL, fill in processing parameters, and create renderings featuring isosurfaces, slices, contours, and annotations; the often extensive computations are carried out on an NCSA SGI supercomputer server without the user having an individual account on the system. The user can then download the 3D visualizations as VRML files, which may be rotated and manipulated locally on virtually any class of computer. The second application is the ADILBrowser, a part of the NCSA Horizon Image Data Browser Java package. ADILBrowser allows a group of participants to browse images from the ADIL within a collaborative session. The collaborative environment is provided by the NCSA Habanero package which includes text and audio chat tools and a white board. The ADILBrowser is just an example of a collaborative tool that can be built with the Horizon and Habanero packages. The classes provided by these packages can be assembled to create custom collaborative applications that visualize data either from local disk or from anywhere on the network.

Recently a methodology for representation and adaptation of arbitrary scalable bit-streams in a fully content non-specific manner has been proposed on the basis of a universal model for all scalable bit-streams called Scalable Structured Meta-formats (SSM). According to this model, elementary scalable bit-streams are naturally organized in a symmetric multi-dimensional logical structure. The model parameters for a specific bit-stream along with information guiding decision-making among possible adaptation choices are represented in a binary or XML descriptor to accompany the bit-stream flowing downstream. The capabilities and preferences of receiving terminals flow upstream and are also specified in binary or XML form to represent constraints that guide adaptation. By interpreting the descriptor and the constraint specifications, a universal adaptation engine sitting on a network node can adapt the content appropriately to suit the specified needs and preferences of recipients, without knowledge of the specifics of the content, its encoding and/or encryption. In this framework, different adaptation infrastructures are no longer needed for different types of scalable media. In this work, we show how this framework can be used to adapt fully scalable video bit-streams, specifically ones obtained by the fully scalable MC-EZBC video coding system. MC-EZBC uses a 3-D subband/wavelet transform that exploits correlation by filtering along motion trajectories, to obtain a 3-dimensional scalable bit-stream combining temporal, spatial and SNR scalability in a compact bit-stream. Several adaptation use cases are presented to demonstrate the flexibility and advantages of a fully scalable video bit-stream when used in conjunction with a network adaptation engine for transmission.

Scalable video coding (SVC) is attractive due to the capability of reconstructing lower resolution or lower quality signals from partial bit streams, which allows for simple solutions adapted to network and terminal capabilities. This article addresses the spatial scalability of SVC and proposes an efficient H.264-based scalable intra coding algorithm. In comparison with the previous single layer intra prediction (SLIP) method, the proposed algorithm aims to improve the intra coding performance of the enhancement layer by a new inter layer intra prediction (ILIP) method. The main idea of ILIP is that up-sampled and reconstructed pixels of the base layer are very useful to predict and encode the pixels of the enhancement layer, especially when the neighbouring pixels are not available. Experimental results show that the peak signal to noise ratio (PSNR) of the luminance component of encoded frames is improved, while both bit-rates and computational complexity are maintained very well. For the sequence Football, the average increase of PSNR is up to 0.21 dB, while for Foreman and Bus, the increases are 0.14 dB and 0.17 dB, respectively.
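
For readers unfamiliar with the metric behind the reported dB gains, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE). The function name and the flat lists of 8-bit luminance samples are assumptions for illustration; the abstract does not describe the authors' measurement code.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    max_value is the peak sample value (255 assumes 8-bit luminance).
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]   # hypothetical luminance samples
    reconstructed = [101, 99, 101, 99]
    print(round(psnr(original, reconstructed), 2))
```

Because the scale is logarithmic, a gain of a few tenths of a dB corresponds to a reduction of a few percent in mean squared error.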

The current research will investigate the possibility of developing a computing-visualization system using a public domain software system built on a personal computer. Visualization Toolkit (VTK) is available on UNIX and PC platforms. VTK uses C++ to build an executable. It has abundant programming classes/objects that are contained in the system library. Users can also develop their own classes/objects in addition to those existing in the class library. Users can develop applications in any of the C++, Tcl/Tk, and Java environments. The present research will show how a data visualization system can be developed with VTK running on a personal computer. The topics will include: execution efficiency; visual object quality; availability of the user interface design; and exploring the feasibility of a VTK-based World Wide Web data visualization system. The present research will feature a case study showing how to use VTK to visualize meteorological data with techniques including isosurfaces, volume rendering, vector display, and composite analysis. The study also shows how the VTK outline, axes, and two-dimensional annotation text and title enhance the data presentation. The present research will also demonstrate how VTK works in an internet environment while accessing an executable with a Java application program in a webpage.

The feasibility of high energy computed tomography (9 MeV) to detect volumetric and planar discontinuities in large pressure vessel mock-up blocks was studied. The data supplied by the manufacturer of the test blocks on the intended flaw geometry were compared to manual, contact ultrasonic test and computed tomography test data. Subsequently, a visualization program was used to construct fully three-dimensional morphological information enabling interactive data analysis on the detected flaws. Density isosurfaces show the relative shape and location of the volumetric defects within the mock-up blocks. Such a technique may be used to qualify personnel or newly developed ultrasonic test methods without the associated high cost of destructive evaluation. Data is presented showing the capability of the volumetric data analysis program to overlay the computed tomography and destructive evaluation (serial metallography) data for a direct, three-dimensional comparison.

Conventional methods to diagnose and follow treatment of Multiple Sclerosis require radiologists and technicians to compare current images with older images of a particular patient, on a slice-by-slice basis. Although there has been progress in creating 3D displays of medical images, little attempt has been made to design visual tools that emphasize change over time. We implemented several ideas that attempt to address this deficiency. In one approach, isosurfaces of segmented lesions at each time step were displayed either on the same image (each time step in a different color), or consecutively in an animation. In a second approach, voxel-wise differences between time steps were calculated and displayed statically using ray casting. Animation was used to show cumulative changes over time. Finally, in a method borrowed from computational fluid dynamics (CFD), glyphs (small arrow-like objects) were rendered with a surface model of the lesions to indicate changes at localized points.

Visual agnosia is defined as an impairment of object recognition, in the absence of visual acuity or cognitive dysfunction that would explain this impairment. This condition is caused by lesions in the visual association cortex, sparing primary visual cortex. There are 2 main pathways that process visual information: the ventral stream, tasked with object recognition, and the dorsal stream, in charge of locating objects in space. Visual agnosia can therefore be divided into 2 major groups depending on which of the two streams is damaged. The aim of this article is to conduct a narrative review of the various visual agnosia syndromes, including recent developments in a number of these syndromes. PMID:26358494

Detecting and predicting maneuvering satellites is an important problem for Space Situational Awareness. The spatial envelope of all possible locations within reach of such a maneuvering satellite is known as the Reachable Volume (RV). As soon as custody of a satellite is lost, calculating the RV and its subsequent time evolution is a critical component in the rapid recovery of the satellite. In this paper, we present a Monte Carlo approach to computing the RV for a given object. Essentially, our approach samples all possible trajectories by randomizing thrust-vectors, thrust magnitudes and time of burn. At any given instance, the distribution of the 'point-cloud' of the virtual particles defines the RV. For short orbital time-scales, the temporal evolution of the point-cloud can result in complex, multi-reentrant manifolds. Visualization plays an important role in gaining insight and understanding into this complex and evolving manifold. In the second part of this paper, we focus on how to effectively visualize the large number of virtual trajectories and the computed RV. We present a real-time out-of-core rendering technique for visualizing the large number of virtual trajectories. We also examine different techniques for visualizing the computed volume of probability density distribution, including volume slicing, convex hull and isosurfacing. We compare and contrast these techniques in terms of computational cost and visualization effectiveness, and describe the main implementation issues encountered during our development process. Finally, we will present some of the results from our end-to-end system for computing and visualizing RVs using examples of maneuvering satellites.

This paper examines the scalability of several types of parallel genetic algorithms (GAs). The objective is to determine the optimal number of processors that can be used by each type to minimize the execution time. The first part of the paper considers algorithms with a single population. The investigation focuses on an implementation where the population is distributed to several processors, but the results are applicable to more common master-slave implementations, where the population is entirely stored in a master processor and multiple slaves are used to evaluate the fitness. The second part of the paper deals with parallel GAs with multiple populations. It first considers a bounding case where the connectivity, the migration rate, and the frequency of migrations are set to their maximal values. Then, arbitrary regular topologies with lower migration rates are considered and the frequency of migrations is set to its lowest value. The investigation is mainly theoretical, but experimental evidence with an additively-decomposable function is included to illustrate the accuracy of the theory. In all cases, the calculations show that the optimal number of processors that minimizes the execution time is directly proportional to the square root of the population size and the fitness evaluation time. Since these two factors usually increase as the domain becomes more difficult, the results of the paper suggest that parallel GAs can integrate large numbers of processors and significantly reduce the execution time of many practical applications. PMID:10578030
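
The square-root result can be illustrated with a simple cost model for the master-slave case. The model below is an assumption made for this sketch, not taken from the abstract: time per generation is taken as n·Tf/P for evaluating the population plus P·Tc for communicating with P processors, where the per-processor communication time `comm_time` (Tc) is a hypothetical parameter. Minimizing that expression over P gives P* = sqrt(n·Tf/Tc).

```python
import math


def generation_time(processors, population, eval_time, comm_time):
    """Assumed cost model: evaluation work is split, communication grows with P."""
    return population * eval_time / processors + processors * comm_time


def optimal_processors(population, eval_time, comm_time):
    """Minimizer of the assumed model: sqrt(n * Tf / Tc)."""
    return math.sqrt(population * eval_time / comm_time)


if __name__ == "__main__":
    n, tf, tc = 400, 0.01, 0.01  # hypothetical values
    p_star = optimal_processors(n, tf, tc)
    print(p_star, generation_time(p_star, n, tf, tc))
```

Under this model, quadrupling either the population size or the fitness evaluation time doubles the number of processors that can be used profitably.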

The current limitations of commercially available thermoelectric (TE) generators include their incompatibility with human body applications due to the toxicity of commonly used alloys and possible future shortage of raw materials (Bi-Sb-Te and Se). In this respect, exploiting silicon as an environmentally friendly candidate for thermoelectric applications is a promising alternative since it is an abundant, ecofriendly semiconductor for which there already exists an infrastructure for low-cost and high-yield processing. Contrary to the existing approaches, where n/p-legs were either heavily doped to an optimal carrier concentration of 10¹⁹ cm⁻³ or morphologically modified by increasing their roughness, in this work improved thermoelectric performance was achieved in smooth silicon nanostructures with low doping concentration (1.5 × 10¹⁵ cm⁻³). Scalable, highly reproducible e-beam lithographies, which are compatible with nanoimprint and followed by deep reactive-ion etching (DRIE), were employed to produce arrays of regularly spaced nanopillars of 400 nm height with diameters varying from 140 nm to 300 nm. A potential Seebeck microprobe (PSM) was used to measure the Seebeck coefficients of such nanostructures. This resulted in values ranging from -75 μV/K to -120 μV/K for n-type and 100 μV/K to 140 μV/K for p-type, which are significant improvements over previously reported data.

We report progress towards improving our previous demonstrations that combined all the fundamental building blocks required for scalable quantum information processing using trapped atomic ions. Included elements are long-lived qubits; a laser-induced universal gate set; state initialization and readout; and information transport, including co-trapping a second ion species to reinitialize motion without qubit decoherence. Recent efforts have focused on reducing experimental overhead and increasing gate fidelity. Most of the experimental duty cycle was previously used for transport, separation, and recombination of ion chains as well as re-cooling of motional excitation. We have addressed these issues by developing and implementing an arbitrary waveform generator with an update rate far above the ions' motional frequencies. To reduce gate errors, we actively stabilize the position of several UV (313 nm) laser beams. We have also switched the two-qubit entangling gate to one that acts directly on ⁹Be⁺ hyperfine qubit states whose energy separation is magnetic-fluctuation insensitive. This work is supported by DARPA, NSA, ONR, IARPA, Sandia, and the NIST Quantum Information Program.

The goal of healthcare is to provide high quality care at an affordable cost for its patients. However, the population it serves has changed dramatically since the popularization of hospital-based healthcare. With available new technology, alternative healthcare delivery methods can be designed and tested. This study examines Scalable Office Based Healthcare for Small Business, where healthcare is delivered to the office floor. This delivery was tested in 18 individuals at a small business in Minneapolis, Minnesota. The goal was to deliver modular healthcare and mitigate conditions such as diabetes, hyperlipidemia, obesity, sedentariness, and metabolic disease. The modular healthcare system was welcomed by employees – 70% of those eligible enrolled. The findings showed that the modular healthcare deliverable was feasible and effective. The data demonstrated significant improvements in weight loss, fat loss, and blood variables for at risk participants. This study leaves room for improvement and further innovation. Expansion to include offerings such as physicals, diabetes management, smoking cessation, and pre-natal treatment would improve its utility. Future studies could include testing the adaptability of delivery method, as it should adapt to reach rural and underserved populations. PMID:21471576

To prevent messages from being tampered with and forged in vehicular ad hoc networks (VANETs), the digital signature method is adopted by IEEE 1609.2. However, the costs of the method are excessively high for large-scale networks. This paper addresses the issue with a secure communication framework that introduces some lightweight cryptography primitives. In our framework, point-to-point and broadcast communications for vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) are studied, mainly based on symmetric cryptography. A new issue incurred is symmetric key management. Thus, we develop key distribution and agreement protocols for two-party keys and group keys under different environments, whether a road side unit (RSU) is deployed or not. The analysis shows that our protocols provide confidentiality, authentication, perfect forward secrecy, forward secrecy and backward secrecy. In particular, the proposed group key agreement protocol solves the key leak problem caused by members joining or leaving in existing key agreement protocols. Due to aggregated signatures and the substitution of XOR for point addition, the average computation and communication costs do not significantly increase with the number of vehicles; hence, our framework provides good scalability.

This study presents an evaluation of the Stream Control Transmission Protocol (SCTP) for the transport of the scalable video codec (SVC), proposed by MPEG as an extension to H.264/AVC. The two technologies fit together well. On the one hand, SVC makes it easy to split the bitstream into substreams carrying different video layers, each with different importance for the reconstruction of the complete video sequence at the receiver end. On the other hand, SCTP includes features, such as multi-streaming and multi-homing, that allow the SVC layers to be transported robustly and efficiently. Several transmission strategies built on baseline SCTP and its concurrent multipath transfer (CMT) extension are compared with the classical solutions based on the Transmission Control Protocol (TCP) and the Real-time Transport Protocol (RTP). Using ns-2 simulations, it is shown that CMT-SCTP outperforms TCP and RTP in error-prone networking environments. The comparison is established according to several performance measurements, including delay, throughput, packet loss, and peak signal-to-noise ratio of the received video.

This article proposes the design and architecture of a dynamically scalable dual-core pipelined processor. The design methodology is core fusion: two independent cores can dynamically morph into a larger processing unit, or they can be used as distinct processing elements, to achieve both high sequential performance and high parallel performance. The processor provides two execution modes. Mode 1 is a multiprogramming mode for executing instruction streams of lower data width, i.e., each core can perform 16-bit operations individually. Performance is improved in this mode by the parallel execution of instructions in both cores, at the cost of area. In mode 2, both processing cores are coupled and behave like a single processing unit of higher data width, i.e., one that can perform 32-bit operations. Additional core-to-core communication is needed to realise this mode. The mode can be switched dynamically; therefore, this processor can provide multiple functions with a single design. The processor has been designed and verified using Verilog on the Xilinx 14.1 platform. It is verified in both simulation and synthesis with the help of test programs. The design is intended to be implemented on a Xilinx Spartan 3E XC3S500E FPGA.
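
One way to picture why the coupled mode needs core-to-core communication is a 32-bit addition built from two 16-bit adders, where the carry out of the low half must be handed to the high half. The sketch below is a behavioural illustration in Python, not the authors' Verilog design; the function names and the carry-passing scheme are assumptions, since the abstract does not describe how the cores are coupled.

```python
MASK16 = 0xFFFF


def add16(a, b, carry_in=0):
    """Model of one core's 16-bit adder: returns (16-bit sum, carry out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def fused_add32(a, b):
    """Assumed coupled mode: low core adds the low halves, high core the high halves.

    The carry passed between the two calls stands in for the core-to-core link.
    """
    low, carry = add16(a & MASK16, b & MASK16)
    high, _ = add16(a >> 16, b >> 16, carry)
    return (high << 16) | low


if __name__ == "__main__":
    # Uncoupled mode: two independent 16-bit additions.
    print(add16(0x1234, 0x0001), add16(0xFFFF, 0x0001))
    # Coupled mode: one 32-bit addition with a carry crossing the halves.
    print(hex(fused_add32(0x0001FFFF, 0x00000001)))
```

In the uncoupled case the two `add16` calls are independent and could run in parallel, while in the coupled case the high half has to wait for the carry from the low half.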

Training an effective and scalable system for medical image analysis usually requires a large amount of labeled data, which incurs a tremendous annotation burden for pathologists. Recent progress in active learning can alleviate this issue, leading to a great reduction in the labeling cost without sacrificing too much prediction accuracy. However, most existing active learning methods disregard the "structured information" that may exist in medical images (e.g., data from individual patients), and make a simplifying assumption that unlabeled data is independently and identically distributed. Both may not be suitable for real-world medical images. In this paper, we propose a novel batch-mode active learning method which explores and leverages such structured information in annotations of medical images to enforce diversity among the selected data, therefore maximizing the information gain. We formulate the active learning problem as an adaptive submodular function maximization problem subject to a partition matroid constraint, and further present an efficient greedy algorithm to achieve a good solution with a theoretically proven bound. We demonstrate the efficacy of our algorithm on thousands of histopathological images of breast microscopic tissues. PMID:25320821
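
A partition matroid constraint can be read as "at most a fixed number of picks per group", for example per patient. The sketch below shows a plain (non-adaptive) greedy selection under such a constraint. It is a simplified illustration, not the authors' adaptive algorithm: the function names, the per-group cap, the image identifiers, and the feature-coverage gain are all hypothetical.

```python
def greedy_partition_matroid(items, group_of, cap, gain):
    """Greedily select items, never taking more than `cap` from any group.

    gain(selected, item) must return the marginal gain of adding `item`.
    """
    selected, used = [], {}
    remaining = list(items)
    while True:
        feasible = [i for i in remaining if used.get(group_of[i], 0) < cap]
        if not feasible:
            break
        best = max(feasible, key=lambda i: gain(selected, i))
        if gain(selected, best) <= 0:
            break  # nothing left adds information
        selected.append(best)
        remaining.remove(best)
        used[group_of[best]] = used.get(group_of[best], 0) + 1
    return selected


if __name__ == "__main__":
    # Hypothetical images, each described by the set of features it exhibits.
    features = {"a1": {1, 2, 3}, "a2": {1, 2, 4}, "b1": {5}, "b2": {5, 6}}
    patient = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

    def coverage_gain(selected, item):
        covered = set().union(*(features[s] for s in selected))
        return len(features[item] - covered)

    print(greedy_partition_matroid(features, patient, cap=1, gain=coverage_gain))
```

With a cap of one image per patient, the second image of patient A is never selected even though it would add a new feature, which is how the constraint enforces diversity across patients.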

The field of bioinformatics and computational biology is experiencing a data revolution: experimental techniques to procure data have increased in throughput, improved in accuracy and reduced in costs. This has spurred an array of high profile sequencing and data generation projects. While the data repositories represent untapped reservoirs of rich information critical for scientific breakthroughs, the analytical software tools that are needed to analyze large volumes of such sequence data have significantly lagged behind in their capacity to scale. In this paper, we address homology detection, which is a fundamental problem in large-scale sequence analysis with numerous applications. We present a scalable framework to conduct large-scale optimal homology detection on massively parallel supercomputing platforms. Our approach employs distributed memory work stealing to effectively parallelize optimal pairwise alignment computation tasks. Results on 120,000 cores of the Hopper Cray XE6 supercomputer demonstrate strong scaling and up to 2.42 × 10^7 optimal pairwise sequence alignments computed per second (PSAPS), the highest reported in the literature.

Despite staggering investments made in unraveling the human genome, current estimates suggest that as much as 90% of the variance in cancer and chronic diseases can be attributed to factors outside an individual’s genetic endowment, particularly to environmental exposures experienced across his or her life course. New analytical approaches are clearly required as investigators turn to complicated systems theory and ecological, place-based and life-history perspectives in order to understand more clearly the relationships between social determinants, environmental exposures and health disparities. While traditional data analysis techniques remain foundational to health disparities research, they are easily overwhelmed by the ever-increasing size and heterogeneity of available data needed to illuminate latent gene x environment interactions. This has prompted the adaptation and application of scalable combinatorial methods, many from genome science research, to the study of population health. Most of these powerful tools are algorithmically sophisticated, highly automated and mathematically abstract. Their utility motivates the main theme of this paper, which is to describe real applications of innovative transdisciplinary models and analyses in an effort to help move the research community closer toward identifying the causal mechanisms and associated environmental contexts underlying health disparities. The public health exposome is used as a contemporary focus for addressing the complex nature of this subject. PMID:25310540

Physicists have access to thousands of CPUs in grid federations such as OSG and EGEE. With the start-up of the LHC, it is essential for individuals or groups of users to wrap together available resources from multiple sites across multiple grids under a higher user-controlled layer in order to provide a homogeneous pool of available resources. One such system is glideinWMS, which is based on the Condor batch system. A general discussion of glideinWMS can be found elsewhere. Here, we focus on recent advances in extending its reach: scalability and integration of heterogeneous compute elements. We demonstrate that the new developments exceed the design goal of over 10,000 simultaneous running jobs under a single Condor schedd, using strong security protocols across global networks, and sustaining a steady-state job completion rate of a few Hz. We also show interoperability across heterogeneous computing elements achieved using client-side methods. We discuss this technique and the challenges in direct access to NorduGrid and CREAM compute elements, in addition to Globus based systems.

A scalable multichannel digital MRI receiver system was designed to achieve high bandwidth echo-planar imaging (EPI) acquisitions for applications such as BOLD-fMRI. The modular system design allows for easy extension to an arbitrary number of channels. A 16-channel receiver was developed and integrated with a General Electric (GE) Signa 3T VH/3 clinical scanner. Receiver performance was evaluated on phantoms and human volunteers using a custom-built 16-element receive-only brain surface coil array. At an output bandwidth of 1 MHz, a 100% acquisition duty cycle was achieved. Overall system noise figure and dynamic range were better than 0.85 dB and 84 dB, respectively. During repetitive EPI scanning on phantoms, the relative temporal standard deviation of the image intensity time-course was below 0.2%. As compared to the product birdcage head coil, 16-channel reception with the custom array yielded a nearly 6-fold SNR gain in the cerebral cortex and a 1.8-fold SNR gain in the center of the brain. The excellent system stability combined with the increased sensitivity and SENSE capabilities of 16-channel coils are expected to significantly benefit and enhance fMRI applications. PMID:14705057

The Hodgkin-Huxley model for action potential generation in biological axons is central for understanding the computational capability of the nervous system and emulating its functionality. Owing to the historical success of silicon complementary metal-oxide-semiconductors, spike-based computing is primarily confined to software simulations and specialized analogue metal-oxide-semiconductor field-effect transistor circuits. However, there is interest in constructing physical systems that emulate biological functionality more directly, with the goal of improving efficiency and scale. The neuristor was proposed as an electronic device with properties similar to the Hodgkin-Huxley axon, but previous implementations were not scalable. Here we demonstrate a neuristor built using two nanoscale Mott memristors, dynamical devices that exhibit transient memory and negative differential resistance arising from an insulating-to-conducting phase transition driven by Joule heating. This neuristor exhibits the important neural functions of all-or-nothing spiking with signal gain and diverse periodic spiking, using materials and structures that are amenable to extremely high-density integration with or without silicon transistors. PMID:23241533

A new approach to calculating the properties of large systems within the local density approximation (LDA) that offers the promise of scalability on massively parallel supercomputers is outlined. The electronic structure problem is formulated in real space using multiple scattering theory. The standard LDA algorithm is divided into two parts: first, finding the self-consistent field (SCF) electron density; second, calculating the energy corresponding to the SCF density. We show, at least for metals and alloys, that the former problem is easily solved using real-space methods. For the second we take advantage of the variational properties of a generalized Harris-Foulkes free energy functional, a new conduction band Fermi function, and a fictitious finite electron temperature that again allow us to use real-space methods. Using a compute-node → atom equivalence, the new method is naturally highly parallel and leads to O(N) scaling, where N is the number of atoms making up the system. We show scaling data gathered on the Intel XP/S 35 Paragon for systems of up to 512 atoms per simulation cell. To demonstrate that we can achieve metallurgical precision, we apply the new method to the calculation of the energies of disordered CuO_0.5Zn_0.5 alloys using a large random sample.

Rhinologists are often faced with the challenge of assessing nasal breathing from a functional point of view to derive effective therapeutic interventions. While the complex nasal anatomy can be revealed by visual inspection and medical imaging, only vague information is available regarding the nasal airflow itself: Rhinomanometry delivers rather unspecific integral information on the pressure gradient as well as on total flow and nasal flow resistance. In this article we demonstrate how the understanding of physiological nasal breathing can be improved by simulating and visually analyzing nasal airflow, based on an anatomically correct model of the upper human respiratory tract. In particular we demonstrate how various Information Visualization (InfoVis) techniques, such as a highly scalable implementation of parallel coordinates, time series visualizations, as well as unstructured grid multi-volume rendering, all integrated within a multiple linked views framework, can be utilized to gain a deeper understanding of nasal breathing. Evaluation is accomplished by visual exploration of spatio-temporal airflow characteristics that include not only information on flow features but also on accompanying quantities such as temperature and humidity. To our knowledge, this is the first in-depth visual exploration of the physiological function of the nose over several simulated breathing cycles under consideration of a complete model of the nasal airways, realistic boundary conditions, and all physically relevant time-varying quantities. PMID:19834215

Memory scalability is an enduring problem and bottleneck that plagues many parallel codes. Parallel codes designed for High Performance Systems are typically designed over the span of several, and in some instances 10+, years. As a result, optimization practices which were appropriate for earlier systems may no longer be valid and thus require careful optimization consideration. Specifically, parallel codes whose memory footprint is a function of their scalability must be carefully considered for future exa-scale systems. In this paper we present a methodology and tool to study the memory scalability of parallel codes. Using our methodology we evaluate an application's memory footprint as a function of scalability, which we coined memory efficiency, and describe our results. In particular, using our in-house tools we can pinpoint the specific application components which contribute to the application's overall memory footprint (application data structures, libraries, etc.).

The Community Atmosphere Model (CAM), which serves as the atmosphere component of the Community Climate System Model (CCSM), is the most computationally expensive CCSM component in typical configurations. On current and next-generation leadership class computing systems, the performance of CAM is tied to its parallel scalability. Improving performance scalability in CAM has been a challenge, due largely to algorithmic restrictions necessitated by the polar singularities in its latitude-longitude computational grid. Nevertheless, through a combination of exploiting additional parallelism, implementing improved communication protocols, and eliminating scalability bottlenecks, we have been able to more than double the maximum throughput rate of CAM on production platforms. We describe these improvements and present results on the Cray XT5 and IBM BG/P. The approaches taken are not specific to CAM and may inform similar scalability enhancement activities for other codes.

Our understanding of the structuring of the Universe from large-scale cosmological structures down to the formation of galaxies now largely benefits from numerical simulations. The RAMSES code, relying on the Adaptive Mesh Refinement technique, is used to perform massively parallel simulations at multiple scales. The interactive, immersive, three-dimensional visualization of such complex simulations is a challenge that is addressed using the SDvision software package. Several rendering techniques are available, including ray-casting and isosurface reconstruction, to explore the simulated volumes at various resolution levels and construct temporal sequences. These techniques are illustrated in the context of different classes of simulations. We first report on the visualization of the HORIZON Galaxy Formation Simulation at MareNostrum, a cosmological simulation with detailed physics at work in the galaxy formation process. We then carry on in the context of an intermediate zoom simulation leading to the formation of a Milky-Way like galaxy. Finally, we present a variety of simulations of interacting galaxies, including a case-study of the Antennae Galaxies interaction.

Matching Pursuit Decomposition (MPD) is a powerful iterative algorithm for signal decomposition and feature extraction. MPD decomposes any signal into linear combinations of its dictionary elements or atoms. A best fit atom from an arbitrarily defined dictionary is determined through cross-correlation. The selected atom is subtracted from the signal and this procedure is repeated on the residual in the subsequent iterations until a stopping criterion is met. The reconstructed signal reveals the waveform structure of the original signal. However, a sufficiently large dictionary is required for an accurate reconstruction; this in return increases the computational burden of the algorithm, thus limiting its applicability and level of adoption. The purpose of this research is to improve the scalability and performance of the classical MPD algorithm. Correlation thresholds were defined to prune insignificant atoms from the dictionary. The Coarse-Fine Grids and Multiple Atom Extraction techniques were proposed to decrease the computational burden of the algorithm. The Coarse-Fine Grids method enabled the approximation and refinement of the parameters for the best fit atom. The ability to extract multiple atoms within a single iteration enhanced the effectiveness and efficiency of each iteration. These improvements were implemented to produce an improved Matching Pursuit Decomposition algorithm entitled MPD++. Disparate signal decomposition applications may require a particular emphasis of accuracy or computational efficiency. The prominence of the key signal features required for the proper signal classification dictates the level of accuracy necessary in the decomposition. The MPD++ algorithm may be easily adapted to accommodate the imposed requirements. Certain feature extraction applications may require rapid signal decomposition. The full potential of MPD++ may be utilized to produce incredible performance gains while extracting only slightly less energy than the
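
The classical loop described in this record (pick the best-correlated atom, subtract it, repeat on the residual) can be sketched in a few lines. This is a minimal illustration of plain matching pursuit, not of MPD++: it assumes a small fixed dictionary of unit-norm atoms, uses a plain inner product in place of a full cross-correlation over time shifts, and stops on an iteration cap or a small correlation. The function name and parameters are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))

def matching_pursuit(signal, atoms, max_iter=10, tol=1e-9):
    """Return ([(atom_index, coefficient), ...], residual).

    Assumes every atom in `atoms` has unit Euclidean norm.
    """
    residual = list(signal)
    picks = []
    for _ in range(max_iter):
        # Correlate the current residual with every atom in the dictionary.
        scores = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < tol:
            break  # stopping criterion: nothing left that any atom explains
        picks.append((k, scores[k]))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return picks, residual
```

Each iteration costs one correlation per atom, which is why the dictionary size dominates the running time and why pruning atoms or searching a coarse grid first, as the record describes, can reduce the work.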

In many educator professional development workshops, scientists present content in a slideshow-type format and field questions afterwards. Drawbacks of this approach include: inability to begin the lecture with content that is responsive to audience needs; lack of flexible access to specific material within the linear presentation; and “Q&A” sessions are not easily scalable to broader audiences. Often this type of traditional interaction provides little direct benefit to the scientists. The Centers for Ocean Sciences Education Excellence - Ocean Systems (COSEE-OS) applies the technique of concept mapping with demonstrated effectiveness in helping scientists and educators “get on the same page” (deCharon et al., 2009). A key aspect is scientist professional development geared towards improving face-to-face and online communication with non-scientists. COSEE-OS promotes scientist-educator collaboration, tests the application of scientist-educator maps in new contexts through webinars, and is piloting the expansion of maps as long-lived resources for the broader community. Collaboration - COSEE-OS has developed and tested a workshop model bringing scientists and educators together in a peer-oriented process, often clarifying common misconceptions. Scientist-educator teams develop online concept maps that are hyperlinked to “assets” (i.e., images, videos, news) and are responsive to the needs of non-scientist audiences. In workshop evaluations, 91% of educators said that the process of concept mapping helped them think through science topics and 89% said that concept mapping helped build a bridge of communication with scientists (n=53). Application - After developing a concept map, with COSEE-OS staff assistance, scientists are invited to give webinar presentations that include live “Q&A” sessions. The webinars extend the reach of scientist-created concept maps to new contexts, both geographically and topically (e.g., oil spill), with a relatively small

This paper describes an extension of the upcoming High Efficiency Video Coding (HEVC) standard for supporting spatial and quality scalable video coding. Besides scalable coding tools known from scalable profiles of prior video coding standards such as H.262/MPEG-2 Video and H.264/MPEG-4 AVC, the proposed scalable HEVC extension includes new coding tools that further improve the coding efficiency of the enhancement layer. In particular, new coding modes by which base and enhancement layer signals are combined for forming an improved enhancement layer prediction signal have been added. All scalable coding tools have been integrated in a way that the low-level syntax and decoding process of HEVC remain unchanged to a large extent. Simulation results for typical application scenarios demonstrate the effectiveness of the proposed design. For spatial and quality scalable coding with two layers, bit-rate savings of about 20-30% have been measured relative to simulcasting the layers, which corresponds to a bit-rate overhead of about 5-15% relative to single-layer coding of the enhancement layer.

Background The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. Results In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. Conclusions BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics. PMID:26329021

The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.

This report documents the 2005 Revolutionary System Concept for Aeronautics (RSCA) study entitled "A Robust, Scalable Transportation System Concept". The objective of the study was to generate, at a high level of abstraction, characteristics of a new concept for the National Airspace System, or the new NAS, under which transportation goals such as increased throughput, delay reduction, and improved robustness could be realized. Since such an objective can be overwhelmingly complex if pursued at the lowest levels of detail, a System-of-Systems (SoS) approach was adopted instead to model alternative air transportation architectures at a high level. The SoS approach not only allows the consideration of the technical aspects of the NAS, but also incorporates policy, socio-economic, and alternative transportation system considerations into one architecture. While the representations of the individual systems are basic, the higher level approach allows for ways to optimize the SoS at the network level, determining the best topology (i.e. configuration of nodes and links). The final product (concept) is a set of rules of behavior and network structure that not only satisfies national transportation goals, but represents the high impact rules that accomplish those goals by getting the agents to "do the right thing" naturally. The novel combination of Agent Based Modeling and Network Theory provides the core analysis methodology in the System-of-Systems approach. Our method of approach is non-deterministic, which means that, fundamentally, it asks and answers different questions than deterministic models. The non-deterministic method is necessary primarily due to our marriage of human systems with technological ones in a partially unknown set of future worlds. Our goal is to understand and simulate how the SoS, human and technological components combined, evolves.

Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability to problems that can be solved on desktops. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose multiple heuristics that are designed to break the sequential barrier. Our heuristics are agnostic to the underlying parallel architecture. For evaluation purposes, we implemented our heuristics on shared memory (OpenMP) and distributed memory (MapReduce-MPI) machines, and tested them over real world graphs derived from multiple application domains (internet, biological, natural language processing). Experimental results demonstrate the ability of our heuristics to converge to high modularity solutions comparable to those output by the serial algorithm in nearly the same number of iterations, while also drastically reducing time to solution.
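
For reference, the quantity that the Louvain method optimizes can be computed directly. The sketch below evaluates the standard Newman modularity Q = Σ_c [ L_c / m − (d_c / 2m)² ] for an undirected, unweighted graph, where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. It illustrates the objective only; the function name is hypothetical and none of the paper's parallel heuristics are reproduced.

```python
def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node -> community label
    """
    m = len(edges)
    degree, internal = {}, {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0) + 1
    # Sum node degrees per community.
    total = {}
    for node, d in degree.items():
        c = community[node]
        total[c] = total.get(c, 0) + d
    return sum(internal.get(c, 0) / m - (total[c] / (2 * m)) ** 2
               for c in total)
```

Each Louvain step asks how this value would change if a single node moved to a neighbouring community. That change depends on the current assignment of the node's neighbours, which is one source of the sequential dependence the abstract describes.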

The problem of missing data is ubiquitous in domains such as biomedical signal processing, network traffic analysis, bibliometrics, social network analysis, chemometrics, computer vision, and communication networks: all domains in which data collection is subject to occasional errors. Moreover, these data sets can be quite large and have more than two axes of variation, e.g., sender, receiver, time. Many applications in those domains aim to capture the underlying latent structure of the data; in other words, they need to factorize data sets with missing entries. If we cannot address the problem of missing data, many important data sets will be discarded or improperly analyzed. Therefore, we need a robust and scalable approach for factorizing multi-way arrays (i.e., tensors) in the presence of missing data. We focus on one of the most well-known tensor factorizations, CANDECOMP/PARAFAC (CP), and formulate the CP model as a weighted least squares problem that models only the known entries. We develop an algorithm called CP-WOPT (CP Weighted OPTimization) using a first-order optimization approach to solve the weighted least squares problem. Based on extensive numerical experiments, our algorithm is shown to successfully factor tensors with noise and up to 70% missing data. Moreover, our approach is significantly faster than the leading alternative and scales to larger problems. To show the real-world usefulness of CP-WOPT, we illustrate its applicability on a novel EEG (electroencephalogram) application where missing data is frequently encountered due to disconnections of electrodes.
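
The weighted least squares formulation can be made concrete for a three-way tensor. In the sketch below, X is the data tensor, W is a 0/1 mask marking known entries, and A, B, C are the CP factor matrices with one row per index and one column per component. The objective is one half of the sum, over known entries only, of the squared difference between x_ijk and Σ_r a_ir b_jr c_kr. The nested-list representation, the function names, and the conventional factor of 1/2 are assumptions for illustration. This evaluates the objective only and is not the CP-WOPT optimizer.

```python
def cp_entry(A, B, C, i, j, k):
    """Entry (i, j, k) of the CP model: sum over components of a_ir * b_jr * c_kr."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))

def weighted_cp_error(X, W, A, B, C):
    """Half the squared error of the CP model, counted over known entries only."""
    total = 0.0
    for i, slab in enumerate(X):
        for j, row in enumerate(slab):
            for k, x in enumerate(row):
                if W[i][j][k]:  # entries with weight 0 (missing) are skipped
                    total += (x - cp_entry(A, B, C, i, j, k)) ** 2
    return 0.5 * total
```

Because masked entries never enter the sum, whatever placeholder value is stored at a missing position has no effect on the fit, which is the sense in which the model "models only the known entries".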

Simultaneously measuring the activities of all neurons in a mammalian brain at millisecond resolution is a challenge beyond the limits of existing techniques in neuroscience. Entirely new approaches may be required, motivating an analysis of the fundamental physical constraints on the problem. We outline the physical principles governing brain activity mapping using optical, electrical, magnetic resonance, and molecular modalities of neural recording. Focusing on the mouse brain, we analyze the scalability of each method, concentrating on the limitations imposed by spatiotemporal resolution, energy dissipation, and volume displacement. Based on this analysis, all existing approaches require orders of magnitude improvement in key parameters. Electrical recording is limited by the low multiplexing capacity of electrodes and their lack of intrinsic spatial resolution, optical methods are constrained by the scattering of visible light in brain tissue, magnetic resonance is hindered by the diffusion and relaxation timescales of water protons, and the implementation of molecular recording is complicated by the stochastic kinetics of enzymes. Understanding the physical limits of brain activity mapping may provide insight into opportunities for novel solutions. For example, unconventional methods for delivering electrodes may enable unprecedented numbers of recording sites, embedded optical devices could allow optical detectors to be placed within a few scattering lengths of the measured neurons, and new classes of molecularly engineered sensors might obviate cumbersome hardware architectures. We also study the physics of powering and communicating with microscale devices embedded in brain tissue and find that, while radio-frequency electromagnetic data transmission suffers from a severe power-bandwidth tradeoff, communication via infrared light or ultrasound may allow high data rates due to the possibility of spatial multiplexing. The use of embedded local recording and

Strong interactions can amplify quantum effects such that they become important on macroscopic scales. Controlling these coherently on a single-particle level is essential for the tailored preparation of strongly correlated quantum systems and opens up new prospects for quantum technologies. Rydberg atoms offer such strong interactions, which lead to extreme nonlinearities in laser-coupled atomic ensembles. As a result, multiple excitation of a micrometer-sized cloud can be blocked while the light-matter coupling becomes collectively enhanced. The resulting two-level system, often called a "superatom," is a valuable resource for quantum information, providing a collective qubit. Here, we report on the preparation of 2 orders of magnitude scalable superatoms utilizing the large interaction strength provided by Rydberg atoms combined with precise control of an ensemble of ultracold atoms in an optical lattice. The latter is achieved with sub-shot-noise precision by local manipulation of a two-dimensional Mott insulator. We microscopically confirm the superatom picture by in situ detection of the Rydberg excitations and observe the characteristic square-root scaling of the optical coupling with the number of atoms. Enabled by the full control over the atomic sample, including the motional degrees of freedom, we infer the overlap of the produced many-body state with a W state from the observed Rabi oscillations and deduce the presence of entanglement. Finally, we investigate the breakdown of the superatom picture when two Rydberg excitations are present in the system, which leads to dephasing and a loss of coherence.

We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors. This is likely of particular interest to the radio astronomy community given, for example, that survey projects contain groups dedicated to this topic. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex

At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first-class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from the irrelevant technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.
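To make "relational algebra extended with iteration" concrete, here is a minimal sketch of a graph traversal written as a repeated join until a fixpoint is reached. It is plain Python, not MyriaL, and the function names (`join_step`, `reachable`) and the toy edge relation are hypothetical illustrations rather than anything taken from Myria.

```python
def join_step(reached, edges):
    """One relational step: join reached(node) with edges(src, dst)
    on node = src, then project onto dst."""
    return {dst for (src, dst) in edges if src in reached}


def reachable(edges, start):
    """Iterate the join until no new tuples appear (a fixpoint)."""
    reached = {start}
    while True:
        new = join_step(reached, edges) - reached
        if not new:
            return reached
        reached |= new


# Toy edge relation (assumed data, for illustration only).
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")}
print(sorted(reachable(edges, "a")))
```

Each pass of the loop is an ordinary set-at-a-time query, which is why a system built on this model could in principle run every pass in parallel across partitions of the edge relation.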

Simultaneously measuring the activities of all neurons in a mammalian brain at millisecond resolution is a challenge beyond the limits of existing techniques in neuroscience. Entirely new approaches may be required, motivating an analysis of the fundamental physical constraints on the problem. We outline the physical principles governing brain activity mapping using optical, electrical, magnetic resonance, and molecular modalities of neural recording. Focusing on the mouse brain, we analyze the scalability of each method, concentrating on the limitations imposed by spatiotemporal resolution, energy dissipation, and volume displacement. Based on this analysis, all existing approaches require orders of magnitude improvement in key parameters. Electrical recording is limited by the low multiplexing capacity of electrodes and their lack of intrinsic spatial resolution, optical methods are constrained by the scattering of visible light in brain tissue, magnetic resonance is hindered by the diffusion and relaxation timescales of water protons, and the implementation of molecular recording is complicated by the stochastic kinetics of enzymes. Understanding the physical limits of brain activity mapping may provide insight into opportunities for novel solutions. For example, unconventional methods for delivering electrodes may enable unprecedented numbers of recording sites, embedded optical devices could allow optical detectors to be placed within a few scattering lengths of the measured neurons, and new classes of molecularly engineered sensors might obviate cumbersome hardware architectures. We also study the physics of powering and communicating with microscale devices embedded in brain tissue and find that, while radio-frequency electromagnetic data transmission suffers from a severe power–bandwidth tradeoff, communication via infrared light or ultrasound may allow high data rates due to the possibility of spatial multiplexing. The use of embedded local recording and

We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms: kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors, and the local outlier factor. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex datasets that wishes to extract the full scientific value from its data.
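As an illustration of nearest-neighbor outlier finding, the sketch below scores each point by the distance to its k-th nearest neighbor. It is a brute-force, quadratic-time toy in pure Python, not the linearly scaling implementation the abstract describes, and the function name `knn_outlier_scores` and the sample points are assumptions made for the example.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour.
    Brute force: every point is compared against every other point."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Four clustered points plus one isolated point (assumed toy data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_outlier_scores(points, k=2)
outlier_index = max(range(len(points)), key=scores.__getitem__)
print(outlier_index, points[outlier_index])
```

A larger score means the point sits in a sparser region; ranking the catalog by this score is one simple way to shortlist candidate outliers.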

Temporal flow control of a jet has been widely studied in the past to enhance jet mixing or reduce jet noise. Most of this research, however, has been done using small diameter low Reynolds number jets that often have little resemblance to the much larger jets common in real world applications because the flow actuators available lacked either the power or bandwidth to sufficiently impact these larger higher energy jets. The Localized Arc Filament Plasma Actuators (LAFPA), developed at the Ohio State University (OSU), have demonstrated the ability to impact a small high speed jet in experiments conducted at OSU and the power to perturb a larger high Reynolds number jet in experiments conducted at the NASA Glenn Research Center. However, the response measured in the large-scale experiments was significantly reduced for the same number of actuators compared to the jet response found in the small-scale experiments. A computational study has been initiated to simulate the LAFPA system with additional actuators on a large-scale jet to determine the number of actuators required to achieve the same desired response for a given jet diameter. Central to this computational study is a model for the LAFPA that both accurately represents the physics of the actuator and can be implemented into a computational fluid dynamics solver. One possible model, based on pressure waves created by the rapid localized heating that occurs at the actuator, is investigated using simplified axisymmetric simulations. The results of these simulations will be used to determine the validity of the model before more realistic and time consuming three-dimensional simulations are conducted to ultimately determine the scalability of the LAFPA system.

Visual scripting is the coordination of words with pictures in sequence. This book presents the methods and viewpoints on visual scripting of fourteen film makers, from nine countries, who are involved in animated cinema; it contains concise examples of how a storybook and preproduction script can be prepared in visual terms; and it includes a…

In situ visualization has become a popular method for avoiding the slowest component of many visualization pipelines: reading data from disk. Most previous in situ work has focused on achieving visualization scalability on par with simulation codes, or on the data movement concerns that become prevalent at extreme scales. In this work, we consider in situ analysis with respect to ease of use and programmability. We describe an abstraction that opens up new applications for in situ visualization, and demonstrate that this abstraction and an expanded set of use cases can be realized without a performance cost. PMID:25995996

This paper introduces a scalable “climate health justice” model for assessing and projecting incidence, treatment costs, and sociospatial disparities for diseases with well-documented climate change linkages. The model is designed to employ low-cost secondary data, and it is rooted in a perspective that merges normative environmental justice concerns with theoretical grounding in health inequalities. Since the model employs International Classification of Diseases, Ninth Revision Clinical Modification (ICD-9-CM) disease codes, it is transferable to other contexts, appropriate for use across spatial scales, and suitable for comparative analyses. We demonstrate the utility of the model through analysis of 2008–2010 hospitalization discharge data at state and county levels in Texas (USA). We identified several disease categories (i.e., cardiovascular, gastrointestinal, heat-related, and respiratory) associated with climate change, and then selected corresponding ICD-9 codes with the highest hospitalization counts for further analyses. Selected diseases include ischemic heart disease, diarrhea, heat exhaustion/cramps/stroke/syncope, and asthma. Cardiovascular disease ranked first among the general categories of diseases for age-adjusted hospital admission rate (5286.37 per 100,000). In terms of specific selected diseases (per 100,000 population), asthma ranked first (517.51), followed by ischemic heart disease (195.20), diarrhea (75.35), and heat exhaustion/cramps/stroke/syncope (7.81). Charges associated with the selected diseases over the 3-year period amounted to US$5.6 billion. Blacks were disproportionately burdened by the selected diseases in comparison to non-Hispanic whites, while Hispanics were not. Spatial distributions of the selected disease rates revealed geographic zones of disproportionate risk. Based upon a downscaled regional climate-change projection model, we estimate a >5% increase in the incidence and treatment costs of asthma attributable to

Recent studies confirm that climate change will cause wildfires to increase in frequency and severity in the coming decades, especially in California and much of the North American West. The most critical sustainability issue in the midst of these ever-changing dynamics is how to achieve a new social-ecological equilibrium of this fire ecology. Wildfire wind speeds and directions change in an instant, and first responders can only be effective when they take action as quickly as the conditions change. To deliver information needed for sustainable policy and management in this dynamically changing fire regime, we must capture these details to understand the environmental processes. We are building an end-to-end cyberinfrastructure (CI), called WIFIRE, for real-time and data-driven simulation, prediction and visualization of wildfire behavior. The WIFIRE integrated CI system supports social-ecological resilience to the changing fire ecology regime in the face of urban dynamics and climate change. Networked observations, e.g., heterogeneous satellite data and real-time remote sensor data, are integrated with computational techniques in signal processing, visualization, modeling and data assimilation to provide a scalable, technological, and educational solution to monitor weather patterns to predict a wildfire's Rate of Spread. Our collaborative WIFIRE team of scientists, engineers, technologists, government policy managers, private industry, and firefighters architects and implements CI pathways that enable joint innovation for wildfire management. Scientific workflows are used as an integrative distributed programming model and simplify the implementation of engineering modules for data-driven simulation, prediction and visualization while allowing integration with large-scale computing facilities. WIFIRE will be scalable to users with different skill levels via specialized web interfaces and user-specified alerts for environmental events broadcast to receivers before

The question regarding visual imagery and visual perception remains an open issue. Many studies have tried to understand if the two processes share the same mechanisms or if they are independent, using different neural substrates. Most research has been directed towards the need of activation of primary visual areas during imagery. Here we review…

The MPEG-4 Fine Grained Scalability (FGS) profile aims at scalable layered video encoding, in order to ensure efficient video streaming in networks with fluctuating bandwidths. In this paper, we propose a novel technique, termed FMOE-MR, which delivers significantly improved rate distortion performance compared to existing MPEG-4 Base Layer encoding techniques. The video frames are re-encoded at high resolution at semantically and visually important regions of the video (termed Features, Motion and Objects) that are defined using a mask (FMO-Mask) and at low resolution in the remaining regions. The multiple-resolution re-rendering step is implemented such that further MPEG-4 compression leads to low bit rate Base Layer video encoding. The Features, Motion and Objects Encoded-Multi-Resolution (FMOE-MR) scheme is an integrated approach that requires only encoder-side modifications, and is transparent to the decoder. Further, since the FMOE-MR scheme incorporates "smart" video preprocessing, it requires no change in existing MPEG-4 codecs. As a result, it is straightforward to use the proposed FMOE-MR scheme with any existing MPEG codec, thus allowing great flexibility in implementation. In this paper, we have described, and implemented, unsupervised and semi-supervised algorithms to create the FMO-Mask from a given video sequence, using state-of-the-art computer vision algorithms.

Virtual Reality (VR) environments can offer immersion, interaction and realistic images to users. A VR system is usually expensive and requires special equipment in a complex setup. One approach is to use Commodity-Off-The-Shelf (COTS) desktop multi-projectors, calibrated manually or with a camera, to reduce the cost of VR systems without a significant decrease in the visual experience. Additionally, for non-planar screen shapes, special optics such as lenses and mirrors are required, thus increasing costs. We propose a low-cost, scalable, flexible and mobile solution that allows building complex VR systems that project images onto a variety of arbitrary surfaces such as planar, cylindrical and spherical surfaces. This approach combines three key aspects: 1) clusters of DLP-picoprojectors to provide homogeneous and continuous pixel density upon arbitrary surfaces without additional optics; 2) LED lighting technology for energy efficiency and light control; 3) smaller physical footprint for flexibility purposes. Therefore, the proposed system is scalable in terms of pixel density, energy and physical space. To achieve these goals, we developed a multi-projector software library called FastFusion that calibrates all projectors to form a uniform image that is presented to viewers. FastFusion uses a camera to automatically calibrate geometric and photometric correction of projected images from ad-hoc positioned projectors; the only requirement is that a few pixels overlap among them. We present results with eight Pico-projectors, with 7 lumens (LED) and DLP 0.17 HVGA Chipset.

A paper discusses NEXUS, a common, next-generation avionics interconnect that is transparently compatible with wired, fiber-optic, and RF physical layers; provides a flexible, scalable, packet switched topology; is fault-tolerant with sub-microsecond detection/recovery latency; has scalable bandwidth from 1 Kbps to 10 Gbps; has guaranteed real-time determinism with sub-microsecond latency/jitter; has built-in testability; features low power consumption (< 100 mW per Gbps); is lightweight with about a 5,000-logic-gate footprint; and is implemented in a small Bus Interface Unit (BIU) with reconfigurable back-end providing interface to legacy subsystems. NEXUS enhances a commercial interconnect standard, Serial RapidIO, to meet avionics interconnect requirements without breaking the standard. This unified interconnect technology can be used to meet performance, power, size, and reliability requirements of all ranges of equipment, sensors, and actuators at chip-to-chip, board-to-board, or box-to-box boundary. Early results from in-house modeling activity of Serial RapidIO using VisualSim indicate that the use of a switched, high-performance avionics network will provide a quantum leap in spacecraft onboard science and autonomy capability for science and exploration missions.

Dynamic change in the topology of an ad hoc network makes it difficult to design an efficient routing protocol. Scalability of an ad hoc network is also one of the important criteria of research in this field. Most research on ad hoc networks focuses on routing and medium access protocols and produces simulation results for limited-size networks. Ad hoc on-demand distance vector (AODV) is one of the best reactive routing protocols. In this article, modified routing protocols based on local link repairing of AODV are proposed. A method of finding alternate routes to the next-to-next node is proposed for the case of link failure. These protocols are beacon-less, meaning that the periodic hello message is removed from basic AODV to improve scalability. A few control packet formats have been changed to accommodate the suggested modification. The proposed protocols are simulated to investigate scalability performance and compared with the basic AODV protocol. This also proves that local link repairing in the proposed protocols improves the scalability of the network. From the simulation results, it is clear that the scalability performance of the routing protocol is improved because of the link repairing method. We have tested the protocols for different terrain areas with approximately constant node densities and different traffic loads.

The Laboratory for the Ocean Observatory Knowledge INtegration Grid (LOOKING) is an NSF research project focused on the identification, synthesis and assembly of existing and emerging concepts and technologies into a coherent viable cyberinfrastructure design for ocean observatories. One of the goals of the project is to prototype an automated pipeline for continuously generating visualization products (time-variant geometric representations and rendered image sequences) from streaming and regenerating data sources. Current work involves remote visualization of NASA JPL's Our Ocean Data Assimilation of the Central California region on a continuous basis. The prototype uses OPeNDAP as the data retrieval mechanism to fetch NetCDF-formatted data for specific variables or time steps. A geometry conversion engine transforms this data into 3D geometric models (e.g. isosurfaces for scalar data like temperature and salinity or streamlines for ocean currents) using the Visualization Toolkit (VTK) and delivers a 3D scene graph that can be imported into the end user's choice of visualization software for viewing the scene. Our current preferred 3D viewer is ossimPlanet (a 3D Geospatial viewer built using OpenSceneGraph, libwms and OSSIM) embedded inside the COVISE framework for interactive exploration in a georeferenced framework.

We present visualization tools for analyzing molecular simulations of liquid crystal (LC) behavior. The simulation data consists of terabytes of data describing the position and orientation of every molecule in the simulated system over time. Condensed matter physicists study the evolution of topological defects in these data, and our visualization tools focus on that goal. We first convert the discrete simulation data to a sampled version of a continuous second-order tensor field and then use combinations of visualization methods to simultaneously display combinations of contractions of the tensor data, providing an interactive environment for exploring these complicated data. The system, built using AVS, employs colored cutting planes, colored isosurfaces, and colored integral curves to display fields of tensor contractions including Westin's scalar cl, cp, and cs metrics and the principal eigenvector. Our approach has been in active use in the physics lab for over a year. It correctly displays structures already known; it displays the data in a spatially and temporally smoother way than earlier approaches, avoiding confusing grid effects and facilitating the study of multiple time steps; it extends the use of tools developed for visualizing diffusion tensor data, re-interpreting them in the context of molecular simulations; and it has answered long-standing questions regarding the orientation of molecules around defects and the conformational changes of the defects. PMID:17080868

least cubic and commonly quartic or higher. Therefore, practical implementations require attention to the scalability of the algorithms, when one is dealing with the very large number of observations from large surveillance telescopes. We address two broad categories of algorithms. The first category includes and extends the classical methods of Laplace and Gauss, as well as the more modern method of Gooding, in which one solves explicitly for the apparent range to the target in terms of the given data. In particular, recent ideas offered by Mortari and Karimi allow us to construct a family of range-solution methods that can be scaled to many processors efficiently. We find that the orbit solutions (data association hypotheses) can be ranked by means of a concept we call persistence, in which a simple statistical measure of likelihood is based on the frequency of occurrence of combinations of observations in consistent orbit solutions. Of course, range-solution methods can be expected to perform poorly if the orbit solutions of most interest are not well conditioned. The second category of algorithms addresses this difficulty. Instead of solving for range, these methods attach a set of range hypotheses to each measured line of sight. Then all pair-wise combinations of observations are considered and the family of Lambert problems is solved for each pair. These algorithms also have polynomial complexity, though now the complexity is quadratic in the number of observations and also quadratic in the number of range hypotheses. We offer a novel type of admissible-region analysis, constructing partitions of the orbital element space and deriving rigorous upper and lower bounds on the possible values of the range for each partition. This analysis allows us to parallelize with respect to the element partitions and to reduce the number of range hypotheses that have to be considered in each processor simply by making the partitions smaller. Naturally, there are many ways to

The NSF-funded WIFIRE project has developed an open-source, online geospatial workflow platform for unifying geoprocessing tools and models for fire and other geospatially dependent modeling applications. It is a product of WIFIRE's objective to build an end-to-end cyberinfrastructure for real-time and data-driven simulation, prediction and visualization of wildfire behavior. geoKepler includes a set of reusable GIS components, or actors, for the Kepler Scientific Workflow System (https://kepler-project.org). Actors exist for reading and writing GIS data in formats such as Shapefile, GeoJSON, KML, and using OGC web services such as WFS. The actors also allow for calling geoprocessing tools in other packages such as GDAL and GRASS. Kepler integrates functions from multiple platforms and file formats into one framework, thus enabling optimal GIS interoperability, model coupling, and scalability. Products of the GIS actors can be fed directly to models such as FARSITE and WRF. Kepler's ability to schedule and scale processes using Hadoop and Spark also makes geoprocessing ultimately extensible and computationally scalable. The reusable workflows in geoKepler can be made to run automatically when alerted by real-time environmental conditions. Here, we show breakthroughs in the speed of creating complex data for hazard assessments with this platform. We also demonstrate geoKepler workflows that use Data Assimilation to ingest real-time weather data into wildfire simulations, and for data mining techniques to gain insight into environmental conditions affecting fire behavior. Existing machine learning tools and libraries such as R and MLlib are being leveraged for this purpose in Kepler, as well as Kepler's Distributed Data Parallel (DDP) capability to provide a framework for scalable processing. geoKepler workflows can be executed via an iPython notebook as a part of a Jupyter hub at UC San Diego for sharing and reporting of the scientific analysis and results from

We implemented a scalable parallel quasi-Monte Carlo numerical high-dimensional integration for tera-scale data points. The implemented algorithm uses Sobol's quasi-random sequences to generate samples. Sobol's sequence was used to avoid clustering effects in the generated samples and to produce low-discrepancy samples which cover the entire integration domain. The performance of the algorithm was tested. The results obtained demonstrate the scalability and accuracy of the implemented algorithms. The implemented algorithm could be used in different applications where a huge data volume is generated and numerical integration is required. We suggest using the hybrid MPI and OpenMP programming model to improve the performance of the algorithms. If the mixed model is used, attention should be paid to the scalability and accuracy.
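The idea of integrating with a low-discrepancy sequence can be sketched in a few lines. The example below deliberately substitutes a Halton sequence for the Sobol sequence used in the abstract, because Halton points need no direction-number tables and fit in pure Python; it is serial, and the function names and the test integrand are assumptions made for illustration.

```python
def van_der_corput(index, base):
    """Radical inverse of `index` in the given base, a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n, bases=(2, 3)):
    """First n Halton points; one coprime base per dimension."""
    return [tuple(van_der_corput(i, b) for b in bases) for i in range(1, n + 1)]


def qmc_integrate(f, n, bases=(2, 3)):
    """Estimate the integral of f over the unit cube as a plain average."""
    return sum(f(*p) for p in halton_points(n, bases)) / n


# Integrand x * y over the unit square; the exact value is 1/4.
estimate = qmc_integrate(lambda x, y: x * y, 4096)
print(estimate)
```

Because each point depends only on its index, disjoint index ranges could be handed to separate workers and the partial sums combined at the end, which is the property a parallel implementation would rely on.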

This report describes the limitations to parallel scalability which we have encountered when applying our otherwise optimally scalable parallel statistical analysis tool kit to large data sets distributed across the parallel file system of the current premier DOE computational facility. This report describes our study to evaluate the effect of parallel I/O on the overall scalability of a parallel data analysis pipeline using our scalable parallel statistics tool kit [PTBM11]. To this end, we tested it using the Jaguar-pf DOE/ORNL peta-scale platform on large combustion simulation data under a variety of process counts and domain decomposition scenarios. In this report we have recalled the foundations of the parallel statistical analysis tool kit which we have designed and implemented, with the specific double intent of reproducing typical data analysis workflows, and achieving optimal design for scalable parallel implementations. We have briefly reviewed those earlier results and publications which allow us to conclude that we have achieved both goals. However, in this report we have further established that, when used in conjunction with a state-of-the-art parallel I/O system, as can be found on the premier DOE peta-scale platform, the scaling properties of the overall analysis pipeline comprising parallel data access routines degrade rapidly. This finding is problematic and must be addressed if peta-scale data analysis is to be made scalable, or even possible. In order to attempt to address these parallel I/O limitations, we will investigate the use of the Adaptable IO System (ADIOS) [LZL+10] to improve I/O performance, while maintaining flexibility for a variety of IO options, such as MPI IO and POSIX IO. This system is developed at ORNL and other collaborating institutions, and is being tested extensively on Jaguar-pf. Simulation code being developed on these systems will also use ADIOS to output the data thereby making it easier for other systems, such as ours, to

Based on a parallel scalable library for Coulomb interactions in particle systems, a comparison between the fast multipole method (FMM), multigrid-based methods, fast Fourier transform (FFT)-based methods, and a Maxwell solver is provided for the case of three-dimensional periodic boundary conditions. These methods are directly compared with respect to complexity, scalability, performance, and accuracy. To ensure comparable conditions for all methods and to cover typical applications, we tested all methods on the same set of computers using identical benchmark systems. Our findings suggest that, depending on system size and desired accuracy, the FMM- and FFT-based methods are most efficient in performance and stability. PMID:24483585

Sandia Scalable Encryption Library (SSEL) Version 1.0 is a library of functions that implement Sandia's scalable encryption algorithm. This algorithm is used to encrypt Asynchronous Transfer Mode (ATM) data traffic, and is capable of operating on an arbitrary number of bits at a time (which permits scaling via parallel implementations), while being interoperable with differently scaled versions of this algorithm. The routines in this library implement 8 bit and 32 bit versions of a non-linear mixer which is compatible with Sandia's hardware-based ATM encryptor.

Detailed, full-system, complex physics simulations have been shown to be feasible on systems containing thousands of processors. In order to manage these computer systems it has been necessary to create scalable system services. In this talk Sandia's research on scalable systems will be described. The key concepts of low overhead data movement through portals and of flexible services through multi-partition architectures will be illustrated in detail. The talk will conclude with a discussion of how these techniques can be applied outside of the standard monolithic MPP system.

Multi-sensor Data Fusion is the synergistic integration of multiple data sets. Data fusion includes processes for aligning, associating and combining data and information in estimating and predicting the state of objects, their relationships, and characterizing situations and their significance. The combination of complex data sets and the need for real-time data storage and retrieval compound the data fusion problem. The systematic development and use of data fusion techniques are particularly critical in applications requiring massive, diverse, ambiguous, and time-critical data. Such conditions are characteristic of new emerging requirements; e.g., network-centric and information-centric warfare, low intensity conflicts such as special operations, counter narcotics, antiterrorism, information operations and CALOW (Conventional Arms, Limited Objectives Warfare), economic and political intelligence. In this paper, Aximetric presents a novel, scalable, object-oriented, metamodel framework for a parallel, cluster-based data-fusion engine on High Performance Computing (HPC) Systems. The data-clustering algorithms provide a fast, scalable technique to sift through massive, complex data sets coming through multiple streams in real-time. The load-balancing algorithm provides the capability to evenly distribute the workload among processors on-the-fly and achieve real-time scalability. The proposed data-fusion engine exploits unique data-structures for fast storage, retrieval and interactive visualization of the multiple data streams.

Advances in computer graphics have provided mathematicians with the ability to create stunning visualizations, both to gain insight and to help demonstrate the beauty of mathematics to others. As educators these tools can be particularly important as we search for ways to work with students raised with constant visual stimulation, from video games…

Living in an image-rich world does not mean students (or faculty and administrators) naturally possess sophisticated visual literacy skills, just as continually listening to an iPod does not teach a person to critically analyze or create music. Instead, "visual literacy involves the ability to understand, produce, and use culturally significant…

A series of articles examines visual literacy from the perspectives of definition, research, curriculum, and resources. Articles examining the definition of visual literacy approach it in terms of semantics, techniques, and exploratory definition areas. There are surveys of present and potential research, and a discussion of the problem of…

An experimental test of visual closure based on an information-theory concept of perception was devised to test the ability to discriminate visual stimuli with reduced cues. The test is to be administered in a timed individual situation in which the subject is presented with sets of incomplete drawings of simple objects that he is required to name…

Based on the more general principle that all thinking (including reasoning) is basically perceptual in nature, the author proposes that visual perception is not a passive recording of stimulus material but an active concern of the mind. He delineates the task of visually distinguishing changes in size, shape, and position and points out the…

Cyber analysts are tasked with the identification and mitigation of network exploits and threats. These compromises are difficult to identify due to the characteristics of cyber communication, the volume of traffic, and the duration of possible attack. It is necessary to have analytical tools to help analysts identify anomalies that span seconds, days, and weeks. Unfortunately, providing analytical tools effective access to the volumes of underlying data requires novel architectures, which is often overlooked in operational deployments. Our work is focused on a summary record of communication, called a flow. Flow records are intended to summarize a communication session between a source and a destination, providing a level of aggregation from the base data. Despite this aggregation, many enterprise network perimeter sensors store millions of network flow records per day. The volume of data makes analytics difficult, requiring the development of new techniques to efficiently identify temporal patterns and potential threats. The massive volume makes analytics difficult, but there are other characteristics in the data which compound the problem. Within the billions of records of communication that transact, there are millions of distinct IP addresses involved. Characterizing patterns of entity behavior is very difficult with the vast number of entities that exist in the data. Research has struggled to validate a model for typical network behavior with hopes it will enable the identification of atypical behavior. Complicating matters more, typically analysts are only able to visualize and interact with fractions of data and have the potential to miss long term trends and behaviors. Our analysis approach focuses on aggregate views and visualization techniques to enable flexible and efficient data exploration as well as the capability to view trends over long periods of time. Realizing that interactively exploring summary data allowed analysts to effectively identify

Scalable wavelet video coders based on Motion Compensated Temporal Filtering (MCTF) have been shown to exhibit good coding efficiency over a large range of bit-rates, in addition to providing spatial, temporal and SNR scalabilities. However, the complexity of these wavelet video coding schemes has not been thoroughly investigated. In this paper, we analyze the computational complexity of a fully-scalable MCTF-based wavelet video decoder that is likely to become part of the emerging MPEG-21 standard. We model the change in computational complexity of various components of the decoder as a function of bit-rate, encoding parameters such as filter types for spatial and temporal decomposition and the number of decomposition levels, and sequence characteristics. A key by-product of our analysis is the observation that fixed-function hardware accelerators are not appropriate for implementing these next generation fully scalable video decoders. The absolute complexity of the various functional units as well as their relative complexity varies depending on the transmission bit-rate, thereby requiring different hardware/software architecture support at different bit-rates. To cope with these variations, a preliminary architecture comprising a reconfigurable co-processor and a general purpose processor is proposed as an implementation platform for these video decoders. We also propose an algorithm to utilize the co-processor efficiently.

A continuous and scalable bubbling system to generate functional nanodroplets dispersed in a continuous phase is proposed. Scaling up of this system can be achieved by simply tuning the bubbling parameters. This new and versatile system is capable of encapsulating various functional nanomaterials to form functional nanoemulsions and nanoparticles in one step. PMID:27007617

is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, sik. In the largest runs, has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.

This paper proposes procedures for assessing the fit of a psychometric model at the level of the individual respondent. The procedures are intended for personality measures made up of Likert-type items, which, in applied research, are usually analyzed by means of factor analysis. Two scalability indices are proposed, which can be considered as…

Understanding the impact of noise and incomplete data is a critical need for using atom probe tomography effectively. Although many tools and techniques have been developed to address this problem, visualization of the raw data remains an important part of this process. In this paper, we present two contributions to the visualization of data acquired through atom probe tomography. First, we describe the application of a rendering technique, ray-cast spherical impostors, that enables the interactive rendering of large numbers (as large as 10 million plus) of pixel perfect, lit spheres representing individual atoms. This technique is made possible by the use of a consumer-level graphics processing unit (GPU), and it yields an order of magnitude improvement both in render quality and speed over techniques previously used to render spherical glyphs in this domain. Second, we present an interactive tool that allows the user to mask, filter, and colorize the data in real time to help them understand and visualize a precise subset and properties of the raw data. We demonstrate the effectiveness of our tool through benchmarks and an example that shows how the ability to interactively render large numbers of spheres, combined with the use of filters and masks, leads to improved understanding of the three-dimensional (3D) and incomplete nature of atom probe data. This improvement arises from the ability of lit spheres to more effectively show the 3D position and the local spatial distribution of individual atoms than what is possible with point or isosurface renderings. The techniques described in this paper serve to introduce new rendering and interaction techniques that have only recently become practical as well as new ways of interactively exploring the raw data. PMID:23352804
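
As an illustration of the ray-cast spherical impostor idea mentioned above, the following is a minimal CPU sketch, not the authors' GPU implementation. It assumes each atom is a sphere and each pixel casts one unit-length ray; the function names `ray_sphere_hit` and `shade_impostor_fragment` and the simple diffuse lighting term are hypothetical choices made for this example.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest sphere hit, or None on a miss."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, oc))   # half the linear coefficient
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c                                # quarter of the discriminant
    if disc < 0.0:
        return None                                 # ray misses: fragment is discarded
    t = -b - math.sqrt(disc)                        # nearer of the two roots
    # Simplification for this sketch: rays that start inside the sphere are ignored.
    return t if t >= 0.0 else None

def shade_impostor_fragment(origin, direction, center, radius, light_dir):
    """Per-pixel work of an impostor: exact hit depth, surface normal, diffuse term."""
    t = ray_sphere_hit(origin, direction, center, radius)
    if t is None:
        return None
    hit = [o + t * d for o, d in zip(origin, direction)]
    normal = [(h - c) / radius for h, c in zip(hit, center)]
    diffuse = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return t, normal, diffuse
```

Because the intersection is solved analytically per pixel, the silhouette and depth of each sphere are exact regardless of how little geometry is submitted, which is what makes the spheres "pixel perfect" in this kind of technique.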

The massively parallel nature of video Time Encoding Machines (TEMs) calls for scalable, massively parallel decoders that are implemented with neural components. The current generation of decoding algorithms is based on computing the pseudo-inverse of a matrix and does not satisfy these requirements. Here we consider video TEMs with an architecture built using Gabor receptive fields and a population of Integrate-and-Fire neurons. We show how to build a scalable architecture for video Time Decoding Machines using recurrent neural networks. Furthermore, we extend our architecture to handle the reconstruction of visual stimuli encoded with massively parallel video TEMs having neurons with random thresholds. Finally, we discuss in detail our algorithms and demonstrate their scalability and performance on a large scale GPU cluster. PMID:22397951
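
To make the encoding side concrete, here is a minimal sketch of how a single ideal integrate-and-fire neuron can turn a sampled signal into spike times. It is an assumption-laden illustration rather than the architecture from the abstract: the parameter names `bias`, `kappa` and `delta`, the forward-Euler integration, and the subtractive reset are all choices made for this example, and the Gabor receptive-field stage is omitted.

```python
def iaf_encode(samples, dt, bias=1.0, kappa=1.0, delta=0.1):
    """Encode a sampled signal as spike times with an ideal integrate-and-fire neuron.

    The membrane value integrates (u + bias) / kappa; whenever it reaches the
    threshold delta, a spike time is recorded and delta is subtracted.
    """
    spikes = []
    v = 0.0
    for k, u in enumerate(samples):
        v += dt * (u + bias) / kappa     # forward-Euler integration step
        if v >= delta:
            spikes.append((k + 1) * dt)  # spike at the end of this step
            v -= delta                   # subtractive reset keeps the remainder
    return spikes
```

With a constant input the inter-spike interval is approximately `kappa * delta / (u + bias)`, so a larger input produces denser spikes; that dependence is the information a decoder has to invert.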

Visual cognition, high-level vision, mid-level vision and top-down processing all refer to decision-based scene analyses that combine prior knowledge with retinal input to generate representations. The label “visual cognition” is little used at present, but research and experiments on mid- and high-level, inference-based vision have flourished, becoming in the 21st century a significant, if often understated part, of current vision research. How does visual cognition work? What are its moving parts? This paper reviews the origins and architecture of visual cognition and briefly describes some work in the areas of routines, attention, surfaces, objects, and events (motion, causality, and agency). Most vision scientists avoid being too explicit when presenting concepts about visual cognition, having learned that explicit models invite easy criticism. What we see in the literature is ample evidence for visual cognition, but few or only cautious attempts to detail how it might work. This is the great unfinished business of vision research: at some point we will be done with characterizing how the visual system measures the world and we will have to return to the question of how vision constructs models of objects, surfaces, scenes, and events. PMID:21329719

Ifremer, the French marine institute, is deeply involved in data management for different ocean in-situ observation programs (ARGO, OceanSites, GOSUD, ...) and for other European programs aiming at networking ocean in-situ observation data repositories (myOcean, seaDataNet, Emodnet). To capitalize on the effort of implementing advanced data dissemination services (visualization, download with subsetting) for these programs and, more generally, for water-column observation repositories, Ifremer decided to develop the oceanotron server (2010). Given the diversity of data repository formats (RDBMS, netCDF, ODV, ...) and the temperamental nature of the standard interoperability interface profiles (OGC/WMS, OGC/WFS, OGC/SOS, OPeNDAP, ...), the server is designed to manage plugins:
- StorageUnits, which read specific data repository formats (netCDF/OceanSites, RDBMS schema, ODV binary format).
- FrontDesks, which receive external requests and send results through interoperable protocols (OGC/WMS, OGC/SOS, OPeNDAP).
In between, a third type of plugin may be inserted:
- TransformationUnits, which apply ocean-business-related transformations to the features (for example, conversion of vertical coordinates from pressure in decibars to meters below the sea surface).
The server is released under an open-source license so that partners can develop their own plugins. Within the MyOcean project, the University of Reading has plugged in a WMS implementation as an oceanotron frontdesk. The modules are connected together by sharing the same information model for marine observations (or sampling features: vertical profiles, point series and trajectories), dataset metadata and queries. The shared information model is based on the OGC/Observation & Measurement and Unidata/Common Data Model initiatives. The model is implemented in Java (http://www.ifremer.fr/isi/oceanotron/javadoc/). This inner interoperability level makes it possible to capitalize on ocean business expertise in software development without being indentured to

Understanding the science behind ultra-scale simulations requires extracting meaning from data sets of hundreds of terabytes or more. Developing scalable parallel visualization algorithms is a key step enabling scientists to interact and visualize their data at this scale. However, at extreme scales, the datasets are so huge, there is not even enough time to view the data, let alone explore it with basic visualization methods. Automated tools are necessary for knowledge discovery -- to help sift through the information and isolate characteristic patterns, thereby enabling the scientist to study local interactions, the origin of features and their evolution in large volumes of data. These tools must be able to operate on data of this scale and work with the visualization process. In this project, we developed a framework for activity detection to allow scientists to model and extract spatio-temporal patterns from time-varying data.

This paper discusses some aspects of design of a data distributed, massively parallel volume rendering library for runtime visualization of parallel computational fluid dynamics simulations in a message-passing environment. Unlike the traditional scheme in which visualization is a postprocessing step, the rendering is done in place on each node processor. Computational scientists who run large-scale simulations on a massively parallel computer can thus perform interactive monitoring of their simulations. The current library provides an interface to handle volume data on rectilinear grids. The same design principles can be generalized to handle other types of grids. For demonstration, we run a parallel Navier-Stokes solver making use of this rendering library on the Intel Paragon XP/S. The interactive visual response achieved is found to be very useful. Performance studies show that the parallel rendering process is scalable with the size of the simulation as well as with the parallel computer.

This book consists of essays covering issues in visual cognition presenting experimental techniques from cognitive psychology, methods of modeling cognitive processes on computers from artificial intelligence, and methods of studying brain organization from neuropsychology. Topics considered include: parts of recognition; visual routines; upward direction; mental rotation, and discrimination of left and right turns in maps; individual differences in mental imagery, computational analysis and the neurological basis of mental imagery: componental analysis.

With the goal of improving the ability of people around the world to share the development and use of intelligent systems, Sandia National Laboratories' Intelligent Systems and Robotics Center is developing new Virtual Collaborative Engineering (VCE) and Virtual Collaborative Control (VCC) technologies. A key area of VCE and VCC research is in shared visualization of virtual environments. This paper describes a Virtual Collaborative Visualizer (VCV), named Rocinante, that Sandia developed for VCE and VCC applications. Rocinante allows multiple participants to simultaneously view dynamic geometrically-defined environments. Each viewer can exclude extraneous detail or include additional information in the scene as desired. Shared information can be saved and later replayed in a stand-alone mode. Rocinante automatically scales visualization requirements with computer system capabilities. Models with 30,000 polygons and 4 Megabytes of texture display at 12 to 15 frames per second (fps) on an SGI Onyx and at 3 to 8 fps (without texture) on Indigo 2 Extreme computers. In its networked mode, Rocinante synchronizes its local geometric model with remote simulators and sensory systems by monitoring data transmitted through UDP packets. Rocinante's scalability and performance make it an ideal VCC tool. Users throughout the country can monitor robot motions and the thinking behind their motion planners and simulators.

Many protocols for quantum information processing use a control sequence or circuit of interactions between qubits and control fields wherein arbitrary qubits can be made to interact with one another. The primary problem with many "physically scalable" architectures is that the qubits are restricted to nearest neighbor interactions and quantum wires between distant qubits do not exist. Because of errors, nearest neighbor interactions often present difficulty with scalability. We describe a protocol that efficiently performs non-local gates between elements of separated static logical qubits using a bus of dynamic qubits as a refreshable entanglement resource. Imperfect resource preparation due to error propagation from noisy gates and measurement errors can be purified within the bus channel. Because of the inherent parallelism of entanglement swapping, communication latency within the quantum computer can be significantly reduced.

We discuss the essential design features of a library of scalable software for performing dense linear algebra computations on distributed memory concurrent computers. The square block scattered decomposition is proposed as a flexible and general-purpose way of decomposing most, if not all, dense matrix problems. An object-oriented interface to the library permits more portable applications to be written, and is easy to learn and use, since details of the parallel implementation are hidden from the user. Experiments on the Intel Touchstone Delta system with a prototype code that uses the square block scattered decomposition to perform LU factorization are presented and analyzed. It was found that the code was both scalable and efficient, performing at about 14 Gflop/s (double precision) for the largest problem considered.
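
The square block scattered decomposition can be pictured as a block-cyclic mapping: the matrix is cut into square blocks, and the blocks are dealt out cyclically over a two-dimensional process grid. The sketch below assumes that reading; the function names `block_owner` and `local_index` and the parameters `block`, `prows` and `pcols` are hypothetical and do not come from the library described in the abstract.

```python
def block_owner(i, j, block, prows, pcols):
    """Grid coordinates of the process that owns global matrix entry (i, j)."""
    return ((i // block) % prows, (j // block) % pcols)

def local_index(i, j, block, prows, pcols):
    """Position of global entry (i, j) inside its owner's local array."""
    bi, bj = i // block, j // block              # global block coordinates
    li = (bi // prows) * block + i % block       # local block row, then offset in block
    lj = (bj // pcols) * block + j % block       # local block column, then offset in block
    return li, lj
```

For example, with 2 x 2 blocks on a 2 x 2 process grid, entry (5, 2) lies in block (2, 1), so it belongs to process (0, 1) and sits at local position (3, 0). Scattering blocks this way keeps every process busy as an LU factorization shrinks the active submatrix, which is the usual motivation for the layout.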

The production system for Grid Data Processing handles petascale ATLAS data reprocessing and Monte Carlo activities. The production system empowered further data processing steps on the Grid performed by dozens of ATLAS physics groups with coordinated access to computing resources worldwide, including additional resources sponsored by regional facilities. The system provides knowledge management of configuration parameters for massive data processing tasks, reproducibility of results, scalable database access, orchestrated workflow and performance monitoring, dynamic workload sharing, automated fault tolerance and petascale data integrity control. The system evolves to accommodate a growing number of users and new requirements from our contacts in ATLAS main areas: Trigger, Physics, Data Preparation and Software & Computing. To assure scalability, the next generation production system architecture development is in progress. We report on scaling up the production system for a growing number of users providing data for physics analysis and other ATLAS main activities.

Harvesting mechanical energy from irregular sources is a potential way to charge batteries for devices and sensor nodes. The triboelectric effect has been extensively utilized in energy harvesting devices as a method to convert mechanical energy into electrical energy. As triboelectric nanogenerators have immense potential to be commercialized, it is important to develop scalable fabrication methods to manufacture these devices. This paper presents scalable fabrication steps to realize large scale triboelectric nanogenerators. Roll-to-roll UV embossing and lamination techniques are used to fabricate different components of large scale triboelectric nanogenerators. The device generated a peak-to-peak voltage and current of 486 V and 21.2 μA, respectively, at a frequency of 5 Hz.

This letter proposes a scalable network emulator architecture to support IP optical network management. The network emulator uses the same router interfaces to communicate with the IP optical TE server as the actual IP optical network, and behaves as an actual IP optical network between the interfaces. The network emulator mainly consists of databases and three modules: interface module, resource simulator module, and traffic generator module. To make the network emulator scalable in terms of network size, we employ TCP/IP socket communications between the modules. The proposed network emulator has the benefit that its implementation is not strongly dependent on hardware limitations. We develop a prototype of the network emulator based on the proposed architecture. Our design and experiments show that the proposed architecture is effective.

Plasmonic colour printing has drawn wide attention as a promising candidate for the next-generation colour-printing technology. However, an efficient approach to realize full colour and scalable fabrication is still lacking, which prevents plasmonic colour printing from practical applications. Here we present a scalable and full-colour plasmonic printing approach by combining conjugate twin-phase modulation with a plasmonic broadband absorber. More importantly, our approach also demonstrates controllable chromotropic capability, that is, the ability of reversible colour transformations. This chromotropic capability affords enormous potentials in building functionalized prints for anticounterfeiting, special label, and high-density data encryption storage. With such excellent performances in functional colour applications, this colour-printing approach could pave the way for plasmonic colour printing in real-world commercial utilization. PMID:26567803

The unique properties of graphene make it a promising material for interconnects in flexible and transparent electronics. To increase the commercial impact of graphene in those applications, a scalable and economical method for producing graphene patterns is required. The direct synthesis of graphene from an area-selectively passivated catalyst substrate can generate patterned graphene of high quality. We here present a solution-based method for producing patterned passivation layers. Various deposition methods, such as ink-jet deposition and microcontact printing, were explored that can satisfy application demands for low cost, high resolution and scalable production of patterned graphene. The demonstrated high quality and nanometer precision of grown graphene establish the potential of this synthesis approach for future commercial applications of graphene. Finally, the ability to transfer high resolution graphene patterns onto complex three-dimensional surfaces affords the vision of graphene-based interconnects in novel electronics. PMID:24189709

Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial locations. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. Our work greatly extends the usability of distance fields for demanding applications. PMID:26357251
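
For readers unfamiliar with distance fields, the sketch below shows what is being computed, using a deliberately naive serial baseline: every grid point takes the minimum Euclidean distance to a set of surface sample points. This is not the parallel distance tree from the abstract; the function name `brute_force_distance_field`, the point-sampled surface, and the uniform `spacing` parameter are assumptions made for illustration.

```python
import math

def brute_force_distance_field(shape, surface_points, spacing=1.0):
    """Unsigned Euclidean distance from every grid point to the nearest surface point."""
    nx, ny, nz = shape
    field = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                p = (i * spacing, j * spacing, k * spacing)
                # Cost is O(grid points x surface points), which is exactly
                # what scalable methods are designed to avoid.
                field[i][j][k] = min(math.dist(p, s) for s in surface_points)
    return field
```

The quadratic cost of this baseline is the reason spatial data structures and distributed-memory parallelism matter once the volume and the surface both become large.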

Many of the challenges of scaling quantum computer hardware lie at the interface between the qubits and the classical control signals used to manipulate them. Modular ion trap quantum computer architectures address scalability by constructing individual quantum processors interconnected via a network of quantum communication channels. Successful operation of such quantum hardware requires a fully programmable classical control system capable of frequency stabilizing the continuous wave lasers necessary for loading, cooling, initialization, and detection of the ion qubits, stabilizing the optical frequency combs used to drive logic gate operations on the ion qubits, providing a large number of analog voltage sources to drive the trap electrodes, and a scheme for maintaining phase coherence among all the controllers that manipulate the qubits. In this work, we describe scalable solutions to these hardware development challenges.

Circuit quantum electrodynamics, consisting of superconducting artificial atoms coupled to on-chip resonators, represents a prime candidate to implement the scalable quantum computing architecture because of the presence of good tunability and controllability. Furthermore, recent advances have pushed the technology towards the ultrastrong coupling regime of light-matter interaction, where the qubit-resonator coupling strength reaches a considerable fraction of the resonator frequency. Here, we propose a qubit-resonator system operating in that regime, as a quantum memory device and study the storage and retrieval of quantum information in and from the Z2 parity-protected quantum memory, within experimentally feasible schemes. We are also convinced that our proposal might pave a way to realize a scalable quantum random-access memory due to its fast storage and readout performances. PMID:25727251

There are more than 40 million blind individuals in the world whose plight would be greatly ameliorated by creating a visual prosthetic. We begin by outlining the basic operational characteristics of the visual system as this knowledge is essential for producing a prosthetic device based on electrical stimulation through arrays of implanted electrodes. We then list a series of tenets that we believe need to be followed in this effort. Central among these is our belief that the initial research in this area, which is in its infancy, should first be carried out in animals. We suggest that implantation of area V1 holds high promise as the area is of a large volume and can therefore accommodate extensive electrode arrays. We then proceed to consider coding operations that can effectively convert visual images viewed by a camera to stimulate electrode arrays to yield visual impressions that can provide shape, motion and depth information. We advocate experimental work that mimics electrical stimulation effects non-invasively in sighted human subjects using a camera from which visual images are converted into displays on a monitor akin to those created by electrical stimulation. PMID:19065857

Our vision remains stable even though the movements of our eyes, head and bodies create a motion pattern on the retina. One of the most important, yet basic, feats of the visual system is to correctly determine whether this retinal motion is owing to real movement in the world or rather our own self-movement. This problem has occupied many great thinkers, such as Descartes and Helmholtz, at least since the time of Alhazen. This theme issue brings together leading researchers from animal neurophysiology, clinical neurology, psychophysics and cognitive neuroscience to summarize the state of the art in the study of visual stability. Recently, there has been significant progress in understanding the limits of visual stability in humans and in identifying many of the brain circuits involved in maintaining a stable percept of the world. Clinical studies and new experimental methods, such as transcranial magnetic stimulation, now make it possible to test the causal role of different brain regions in creating visual stability and also allow us to measure the consequences when the mechanisms of visual stability break down. PMID:21242136

Reported is the first scalable synthesis of rac-jungermannenones B and C starting from the commercially available and inexpensive geraniol in 10 and 9 steps, respectively. The unique jungermannenone framework is rapidly assembled by an unprecedented regioselective 1,6-dienyne reductive cyclization reaction which proceeds through a vinyl radical cyclization/allylic radical isomerization mechanism. DFT calculations explain the high regioselectivity observed in the 1,6-dienyne reductive radical cyclization. PMID:26823176

Several features make Java an attractive choice for scientific applications. In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for scientific applications.

The unique properties of graphene make it a promising material for interconnects in flexible and transparent electronics. To increase the commercial impact of graphene in those applications, a scalable and economical method for producing graphene patterns is required. The direct synthesis of graphene from an area-selectively passivated catalyst substrate can generate patterned graphene of high quality. We here present a solution-based method for producing patterned passivation layers. Various deposition methods such as ink-jet deposition and microcontact printing were explored, that can satisfy application demands for low cost, high resolution and scalable production of patterned graphene. The demonstrated high quality and nanometer precision of grown graphene establishes the potential of this synthesis approach for future commercial applications of graphene. Finally, the ability to transfer high resolution graphene patterns onto complex three-dimensional surfaces affords the vision of graphene-based interconnects in novel electronics.

Space Situational Awareness (SSA) is a fundamental and critical component of current space operations. The term SSA encompasses the awareness, understanding and predictability of all objects in space. As the population of orbital space objects and debris increases, the number of collision avoidance maneuvers grows and prompts the need for accurate and timely process measures. The SSA mission continually evolves toward near real-time assessment and analysis, demanding higher processing capabilities. By conventional methods, meeting these demands requires the integration of new hardware to keep pace with the growing complexity of maneuver planning algorithms. SpaceNav has implemented a highly scalable architecture that will track satellites and debris by utilizing powerful virtual machines on the Google Cloud Platform. SpaceNav algorithms for processing CDMs outpace conventional means. A robust processing environment for tracking data, collision avoidance maneuvers and various other aspects of SSA can be created and deleted on demand. The migration of SpaceNav tools and algorithms into the Google Cloud Platform will be discussed, along with the trials and tribulations involved. Information will be shared on how and why certain cloud products were used as well as integration techniques that were implemented. Key items to be presented are: 1. Scientific algorithms and SpaceNav tools integrated into a scalable architecture a) Maneuver Planning b) Parallel Processing c) Monte Carlo Simulations d) Optimization Algorithms e) SW Application Development/Integration into the Google Cloud Platform 2. Compute Engine Processing a) Application Engine Automated Processing b) Performance testing and Performance Scalability c) Cloud MySQL databases and Database Scalability d) Cloud Data Storage e) Redundancy and Availability

Transactional data mining (association rules, decision trees etc.) has been effectively used to find non-trivial patterns in categorical and unstructured data. For applications that have an inherent structure (e.g., social networks, proteins), graph mining is useful since mapping the structured data into a transactional representation will lead to loss of information. Graph mining is used for identifying interesting or frequent subgraphs. Database mining uses SQL and relational representation to overcome limitations of main memory algorithms and to achieve scalability.

This paper presents GridMW, a scalable and reliable data middleware for smart grids. Smart grids promise to improve the efficiency of power grid systems and reduce greenhouse emissions by incorporating power generation from renewable sources and shaping demand to match the supply. As a result, power grid systems will become much more dynamic and require constant adjustments, which in turn require analysis and decision-making applications to improve the efficiency and reliability of smart grid systems.

Increasing production and use of digital medical imagery are driving new approaches to information storage and management. Traditional, centralized approaches to image communication, storage and archiving are becoming increasingly expensive to scale and operate with high levels of reliability. Multi-site, geographically-distributed deployments connected by limited-bandwidth networks present further scalability, reliability, and availability challenges. A grid storage architecture built from a distributed network of low cost, off-the-shelf servers (nodes) provides scalable data and metadata storage, processing, and communication without single points of failure. Imaging studies are stored, replicated, cached, managed, and retrieved based on defined rules, and nodes within the grid can acquire studies and respond to queries. Grid nodes transparently load-balance queries, storage/retrieval requests, and replicate data for automated backup and disaster recovery. This approach reduces latency, increases availability, provides near-linear scalability and allows the creation of a geographically distributed medical imaging network infrastructure. This paper presents some key concepts in grid storage and discusses the results of a clinical deployment of a multi-site storage grid for cancer care in the province of British Columbia.

Reliable group ordered delivery of multicast messages in a distributed system is a useful service that simplifies the programming of distributed applications. Such a service helps to maintain the consistency of replicated information and to coordinate the activities of the various processes. With the increasing popularity of the Internet, there is an increasing interest in scaling the protocols that provide this service to the environment of the Internet. The InterGroup protocol suite, described in this dissertation, provides such a service, and is intended for the environment of the Internet with scalability to large numbers of nodes and high latency links. The InterGroup protocols approach the scalability problem from various directions. They redefine the meaning of group membership, allow voluntary membership changes, add a receiver-oriented selection of delivery guarantees that permits heterogeneity of the receiver set, and provide a scalable reliability service. The InterGroup system comprises several components, executing at various sites within the system. Each component provides part of the services necessary to implement a group communication system for the wide-area. The components can be categorized as: (1) control hierarchy, (2) reliable multicast, (3) message distribution and delivery, and (4) process group membership. We have implemented a prototype of the InterGroup protocols in Java, and have tested the system performance in both local-area and wide-area networks.

Emerging needs in transportation network modeling and simulation are raising new challenges with respect to scalability of network size and vehicular traffic intensity, speed of simulation for simulation-based optimization, and fidelity of vehicular behavior for accurate capture of event phenomena. Parallel execution is warranted to sustain the required detail, size and speed. However, few parallel simulators exist for such applications, partly due to the challenges underlying their development. Moreover, many simulators are based on time-stepped models, which can be computationally inefficient for the purposes of modeling evacuation traffic. Here an approach is presented to designing a simulator with memory and speed efficiency as the goals from the outset, and, specifically, scalability via parallel execution. The design makes use of discrete event modeling techniques as well as parallel simulation methods. Our simulator, called SCATTER, is being developed, incorporating such design considerations. Preliminary performance results are presented on benchmark road networks, showing scalability to one million vehicles simulated on one processor.

In this paper we address several issues pertinent to intrinsic evolvable hardware (EHW). The first issue is scalability; namely, how the design space scales as the programming string for the programmable device gets longer. We develop a model for population size and the number of generations as a function of the programming string length, L, and show that the number of circuit evaluations is an O(L^2) process. We compare our model to several successful intrinsic EHW experiments and discuss the many implications of our model. The second issue that we address is the timing of intrinsic EHW experiments. We show that the processing time is a small part of the overall time to derive or evolve a circuit and that major improvements in processor speed alone will have only a minimal impact on improving the scalability of intrinsic EHW. The third issue we consider is the system-level design of intrinsic EHW experiments. We review what other researchers have done to break the scalability barrier and contend that the type of reconfigurable platform and the evolutionary algorithm are tied together and impose limits on each other.

Evolutionary neural networks, or neuroevolution, appear to be a promising way to build versatile adaptive systems, combining evolution and learning. One of the most challenging problems of neuroevolution is finding a scalable and robust genetic representation that allows increasingly complex networks to be grown effectively for increasingly complex tasks. In this paper we propose a novel developmental encoding for networks, featuring scalability, modularity, regularity and hierarchy. The encoding can represent structural regularities of networks and build them from encapsulated and possibly reused subnetworks. These capabilities are demonstrated on several test problems. In particular, for parity and symmetry problems we evolve solutions that are fully general with respect to the number of inputs. We also evolve scalable and modular weightless recurrent networks capable of autonomous learning in a simple generic classification task. The encoding is very flexible, and we demonstrate this by evolving networks capable of learning via neuromodulation. Finally, we evolve modular solutions to the retina problem, for which another well-known neuroevolution method, HyperNEAT, was previously shown to fail. The proposed encoding also outperformed HyperNEAT and Cellular Encoding in another experiment, in which certain connectivity patterns must be discovered between layers. We therefore conclude that the proposed encoding is an interesting and competitive approach to evolving networks. PMID:21957432

Scalable heterogeneous computing systems, which are composed of a mix of compute devices, such as commodity multicore processors, graphics processors, reconfigurable processors, and others, are gaining attention as one approach to continuing performance improvement while managing the new challenge of energy efficiency. As these systems become more common, it is important to be able to compare and contrast architectural designs and programming systems in a fair and open forum. To this end, we have designed the Scalable HeterOgeneous Computing benchmark suite (SHOC). SHOC's initial focus is on systems containing graphics processing units (GPUs) and multi-core processors, and on the new OpenCL programming standard. SHOC is a spectrum of programs that test the performance and stability of these scalable heterogeneous computing systems. At the lowest level, SHOC uses microbenchmarks to assess architectural features of the system. At higher levels, SHOC uses application kernels to determine system-wide performance including many system features such as intranode and internode communication among devices. SHOC includes benchmark implementations in both OpenCL and CUDA in order to provide a comparison of these programming models.

This paper presents a new, scalable coding technique that can be used in interactive image/video communications over the Internet. The proposed technique generates a fully embedded bit stream that provides scalability with high quality for the whole image, and it can be used to implement region-based coding as well. The embedded bit stream comprises a basic layer and many enhancement layers. The enhancement layers refine the quality of the image that has been reconstructed using the basic layer. The proposed coding technique uses multiple quantizers with thresholds (QT) for layering, and it creates a bit plane for each layer. The bit plane is then partitioned into sets of small areas to be coded independently. Run length and entropy coding are applied to each of the sets to provide scalability for the entire image, resulting in high picture quality in the user-specific area of interest (ROI). We tested this technique by applying it to various test images, and the results consistently show a high level of performance.
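
The layering idea in this abstract can be illustrated with a deliberately simplified sketch. It assumes plain binary bit planes of 8-bit pixel values rather than the paper's multiple quantizers with thresholds, and it omits the partitioning, run-length and entropy coding stages. The function names `bit_planes` and `reconstruct` are hypothetical.

```python
def bit_planes(pixels, bits=8):
    """Split non-negative integers into bit planes, most significant plane first.

    Plane 0 plays the role of a basic layer; later planes act as enhancement layers.
    (Assumption: plain binary planes, not the paper's thresholded quantizers.)
    """
    return [[(p >> b) & 1 for p in pixels] for b in range(bits - 1, -1, -1)]


def reconstruct(planes, bits=8, layers=None):
    """Rebuild pixel values from the first `layers` planes (all planes if None)."""
    layers = len(planes) if layers is None else layers
    out = [0] * len(planes[0])
    for k in range(layers):
        shift = bits - 1 - k  # bit position that plane k contributes
        for i, bit in enumerate(planes[k]):
            out[i] |= bit << shift
    return out


pixels = [200, 13, 255, 0]
planes = bit_planes(pixels)
coarse = reconstruct(planes, layers=4)  # basic layer plus three enhancement layers
exact = reconstruct(planes)             # every layer received
```

Decoding a prefix of the planes gives a coarse image, and each further plane halves the maximum error per pixel, which is the sense in which such a stream is embedded.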

File system designers continue to look to new architectures to improve scalability. Object-based storage diverges from server-based (e.g. NFS) and SAN-based storage systems by coupling processors and memory with disk drives, delegating low-level allocation to object storage devices (OSDs) and decoupling I/O (read/write) from metadata (file open/close) operations. Even recent object-based systems inherit decades-old architectural choices going back to early UNIX file systems, however, limiting their ability to effectively scale to hundreds of petabytes. We present Ceph, a distributed file system that provides excellent performance and reliability with unprecedented scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable OSDs. We leverage OSD intelligence to distribute data replication, failure detection and recovery with semi-autonomous OSDs running a specialized local object storage file system (EBOFS). Finally, Ceph is built around a dynamic distributed metadata management cluster that provides extremely efficient metadata management that seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. We present performance measurements under a variety of workloads that show superior I/O performance and scalable metadata management (more than a quarter million metadata ops/sec).
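
To make concrete what it means to replace allocation tables with a computed placement function, here is a minimal sketch. It is not CRUSH: it swaps in rendezvous (highest-random-weight) hashing, ignores device weights and failure domains, and the name `placement` and the OSD labels are hypothetical.

```python
import hashlib


def placement(object_id, osds, replicas=2):
    """Return the devices that should hold `object_id`, computed rather than looked up.

    Assumption: rendezvous hashing stands in for CRUSH purely for illustration.
    """
    def score(osd):
        # Deterministic pseudo-random score for this (object, device) pair.
        digest = hashlib.sha256(f"{object_id}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring devices win; any client can recompute this locally.
    return sorted(osds, key=score, reverse=True)[:replicas]


osds = ["osd0", "osd1", "osd2", "osd3", "osd4"]
holders = placement("obj-42", osds)
```

Because every client can recompute the same answer from the object name and the device list, no central table has to be consulted on the I/O path, and removing a device that does not hold an object leaves that object's placement unchanged.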

We illustrate through theory and numerical simulations that redundant coupled dynamical systems can be extremely robust against local noise in comparison to uncoupled dynamical systems evolving in the same noisy environment. Previous studies have shown that the noise robustness of redundant coupled dynamical systems is linearly scalable and deviations due to noise can be minimized by increasing the number of coupled units. Here, we demonstrate that the noise robustness can actually be scaled superlinearly if some conditions are met and very high noise robustness can be realized with very few coupled units. We discuss these conditions and show that this superlinear scalability depends on the nonlinearity of the individual dynamical units. The phenomenon is demonstrated in discrete as well as continuous dynamical systems. This superlinear scalability not only provides us an opportunity to exploit the nonlinearity of physical systems without being bogged down by noise but may also help us in understanding the functional role of coupled redundancy found in many biological systems. Moreover, engineers can exploit superlinear noise suppression by starting a coupled system near (not necessarily at) the appropriate initial condition.

Many problems in astronomy and astrophysics require the computation of spherical harmonic transforms. This is in particular the case whenever data to be analyzed are distributed over the sphere or a set of corresponding mock data sets has to be generated. In many of those contexts, rapidly improving resolutions of both the data and simulations put increasingly greater emphasis on our ability to calculate the transforms quickly and reliably. The scalable spherical harmonic transform library S2HAT consists of a set of flexible, massively parallel, and scalable routines for calculating diverse (scalar, spin-weighted, etc.) spherical harmonic transforms for a class of isolatitude sky grids or pixelizations. The library routines implement the standard algorithm with a complexity of O(n^(3/2)), where n is the number of pixels/grid points on the sphere; however, owing to their efficient parallelization and advanced numerical implementation, they achieve very competitive performance and near perfect scalability. S2HAT is written in Fortran 90 with a C interface. This software is a derivative of the spherical harmonic transforms included in the HEALPix package and is based on both serial and MPI routines of its version 2.01; however, since version 2.5 this software is fully autonomous of HEALPix and can be compiled and run without the HEALPix library.

In this paper, we propose a low complexity prioritized bit-plane coding scheme to improve the rate-distortion performance of cyclical block coding in MPEG-21 scalable video coding. Specifically, we use a block priority assignment algorithm to firstly transmit the symbols and the blocks with potentially better rate-distortion performance. Different blocks are allowed to be coded unequally in a coding cycle. To avoid transmitting priority overhead, the encoder and the decoder refer to the same context to assign priority. Furthermore, to reduce the complexity, the priority assignment is done by a look-up-table and the coding of each block is controlled by a simple threshold comparison mechanism. Experimental results show that our prioritized bit-plane coding scheme can offer up to 0.5dB PSNR improvement over the cyclical block coding described in the joint scalable verification model (JSVM).

Understanding relationships between sets is an important analysis task that has received widespread attention in the visualization community. The major challenge in this context is the combinatorial explosion of the number of set intersections if the number of sets exceeds a trivial threshold. In this paper we introduce UpSet, a novel visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections. UpSet is focused on creating task-driven aggregates, communicating the size and properties of aggregates and intersections, and a duality between the visualization of the elements in a dataset and their set membership. UpSet visualizes set intersections in a matrix layout and introduces aggregates based on groupings and queries. The matrix layout enables the effective representation of associated data, such as the number of elements in the aggregates and intersections, as well as additional summary statistics derived from subset or element attributes. Sorting according to various measures enables a task-driven analysis of relevant intersections and aggregates. The elements represented in the sets and their associated attributes are visualized in a separate view. Queries based on containment in specific intersections, aggregates or driven by attribute filters are propagated between both views. We also introduce several advanced visual encodings and interaction methods to overcome the problems of varying scales and to address scalability. UpSet is web-based and open source. We demonstrate its general utility in multiple use cases from various domains. PMID:26356912
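
The quantities laid out in a matrix view of this kind can be sketched as follows. This is a minimal illustration, not UpSet's implementation; it computes exclusive intersections, meaning elements that belong to exactly a given combination of sets, and the function name `exclusive_intersections` is hypothetical.

```python
from collections import Counter


def exclusive_intersections(sets):
    """Map each combination of set names to the number of elements in exactly those sets."""
    membership = {}
    for name, elements in sets.items():
        for element in elements:
            membership.setdefault(element, set()).add(name)
    # Each element contributes to exactly one combination: the sets that contain it.
    return Counter(frozenset(names) for names in membership.values())


sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
counts = exclusive_intersections(sets)
# Largest combinations first, as a size-sorted matrix layout would show them.
rows = sorted(counts.items(), key=lambda item: -item[1])
```

With n sets there are up to 2^n - 1 non-empty combinations, which is the combinatorial explosion the abstract refers to; counting by element membership only materializes the combinations that actually occur.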

The study of socioeconomic inequality is of substantial importance, scientific and general alike. The graphic visualization of inequality is commonly conveyed by Lorenz curves. While Lorenz curves are a highly effective statistical tool for quantifying the distribution of wealth in human societies, they are a less effective tool for the visual depiction of socioeconomic inequality. This paper introduces an alternative to Lorenz curves: the hill curves. On the one hand, the hill curves are a potent scientific tool: they provide detailed scans of the rich-poor gaps in the human societies under consideration, and are capable of accommodating infinitely many degrees of freedom. On the other hand, the hill curves are a powerful infographic tool: they visualize inequality in a most vivid and tangible way, and no quantitative skills are required in order to grasp the visualization. The application of hill curves extends far beyond socioeconomic inequality. Indeed, the hill curves are highly effective 'hyperspectral' measures of statistical variability that are applicable in the context of size distributions at large. This paper establishes the notion of hill curves, analyzes them, and describes their application in the context of general size distributions.
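
For readers unfamiliar with the baseline this abstract compares against, the following sketch computes the points of an empirical Lorenz curve. It does not implement hill curves, whose construction the abstract does not give. The function name `lorenz_curve` is hypothetical, and the input is assumed to be non-negative with a positive total.

```python
def lorenz_curve(values):
    """Return (population share, cumulative wealth share) points, poorest first.

    Assumes non-negative values whose sum is positive.
    """
    ordered = sorted(values)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    cumulative = 0.0
    for i, value in enumerate(ordered, start=1):
        cumulative += value
        points.append((i / len(ordered), cumulative / total))
    return points


equal = lorenz_curve([1, 1, 1, 1])     # lies on the diagonal
unequal = lorenz_curve([0, 0, 0, 10])  # stays at zero until the last person
```

Perfect equality puts every point on the diagonal, while concentration of wealth pulls the curve toward the lower right corner.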

A previous paper described some numerical experiments performed using the ParaView/Catalyst in-situ visualization infrastructure deployed in the Los Alamos RAGE radiation-hydrodynamics code to produce images from a running large scale 3D ICF simulation. One challenge of the in-situ approach apparent in these experiments was the difficulty of choosing parameters like isosurface values for the visualizations to be produced from the running simulation without the benefit of prior knowledge of the simulation results, and the resultant cost of recomputing in-situ generated images when parameters are chosen suboptimally. A proposed method of addressing this difficulty is to simply render multiple images at runtime with a range of possible parameter values to produce a large database of images and to provide the user with a tool for managing the resulting database of imagery. Recently, ParaView/Catalyst has been extended to include such a capability via the so-called Cinema framework. Here I describe some initial experiments with the first delivery of Cinema and make some recommendations for future extensions of Cinema’s capabilities.

The hybrid visualization and interaction tool EarthScape is presented here. The software can simultaneously display LiDAR point clouds, draped videos with a moving footprint, volume scientific data (using volume rendering, isosurface and slice plane), raster data such as still satellite images, vector data and 3D models such as buildings or vehicles. The application runs on touch screen devices such as tablets. The software is based on open source libraries, such as OpenSceneGraph, osgEarth and OpenCV, and shader programming is used to implement volume rendering of scientific data. The next goal of EarthScape is to perform data analysis using ENVI Services Engine, a cloud data analysis solution. EarthScape is also designed to be a client of Jagwire, which provides multisource geo-referenced video streams. Once all these components are included, EarthScape will be a multi-purpose platform that provides data analysis, hybrid visualization and complex interactions at the same time. The software is available on demand for free at france@exelisvis.com.

Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
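
The two per-ray operations named in this abstract, Maximum Intensity Projection and isosurface detection, can be sketched for a single ray as below. This is an illustration only: it assumes the ray has already been sampled at uniform steps through the volume, and it includes none of the paper's speedups (gradient precalculation, RADC, speculative ray selection). Both function names are hypothetical.

```python
def max_intensity(samples):
    """Maximum Intensity Projection for one ray: the brightest sample wins."""
    return max(samples)


def first_isosurface_hit(samples, isovalue, step=1.0):
    """Distance along the ray to the first crossing of `isovalue`, or None.

    Assumes uniformly spaced samples; the crossing inside an interval is
    located by linear interpolation between the two bracketing samples.
    """
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a == isovalue:
            return i * step
        if (a - isovalue) * (b - isovalue) < 0:
            t = (isovalue - a) / (b - a)
            return (i + t) * step
    if samples and samples[-1] == isovalue:
        return (len(samples) - 1) * step
    return None


ray = [0.0, 10.0, 20.0, 5.0]
brightest = max_intensity(ray)
depth = first_isosurface_hit(ray, 15.0)
```

Rendering several transparent isosurfaces would continue the march past the first hit and composite each crossing, and a perspective renderer would generate one such sample list per pixel from rays that diverge from the eye point.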

We present a method for automatically identifying and validating predictive relationships between the visual appearance of a city and its non-visual attributes (e.g. crime statistics, housing prices, population density etc.). Given a set of street-level images and (location, city-attribute-value) pairs of measurements, we first identify visual elements in the images that are discriminative of the attribute. We then train a predictor by learning a set of weights over these elements using non-linear Support Vector Regression. To perform these operations efficiently, we implement a scalable distributed processing framework that speeds up the main computational bottleneck (extracting visual elements) by an order of magnitude. This speedup allows us to investigate a variety of city attributes across 6 different American cities. We find that indeed there is a predictive relationship between visual elements and a number of city attributes including violent crime rates, theft rates, housing prices, population density, tree presence, graffiti presence, and the perception of danger. We also test human performance for predicting theft based on street-level images and show that our predictor outperforms this baseline with 33% higher accuracy on average. Finally, we present three prototype applications that use our system to (1) define the visual boundary of city neighborhoods, (2) generate walking directions that avoid or seek out exposure to city attributes, and (3) validate user-specified visual elements for prediction. PMID:26356976

We present NeuroLines, a novel visualization technique designed for scalable detailed analysis of neuronal connectivity at the nanoscale level. The topology of 3D brain tissue data is abstracted into a multi-scale, relative distance-preserving subway map visualization that allows domain scientists to conduct an interactive analysis of neurons and their connectivity. Nanoscale connectomics aims at reverse-engineering the wiring of the brain. Reconstructing and analyzing the detailed connectivity of neurons and neurites (axons, dendrites) will be crucial for understanding the brain and its development and diseases. However, the enormous scale and complexity of nanoscale neuronal connectivity pose big challenges to existing visualization techniques in terms of scalability. NeuroLines offers a scalable visualization framework that can interactively render thousands of neurites, and that supports the detailed analysis of neuronal structures and their connectivity. We describe and analyze the design of NeuroLines based on two real-world use-cases of our collaborators in developmental neuroscience, and investigate its scalability to large-scale neuronal connectivity data. PMID:26356951

Reality Capture Technologies, Inc. is a spinoff company from Ames Research Center. Offering e-business solutions for optimizing management, design and production processes, RCT uses visual collaboration environments (VCEs) such as those used to prepare the Mars Pathfinder mission. The product, 4-D Reality Framework, allows multiple users from different locations to manage and share data. The insurance industry is one targeted commercial application for this technology.

Flow visualization techniques are reviewed, with particular attention given to those applicable to liquid helium flows. Three techniques capable of obtaining qualitative and quantitative measurements of complex 3D flow fields are discussed, including focusing schlieren, particle image velocimetry, and holocinematography (HCV). It is concluded that the HCV appears to be uniquely capable of obtaining full time-varying, 3D velocity field data, but is limited to the low speeds typical of liquid helium facilities.

This paper describes DV3D, a Vistrails package of high-level modules for the Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT) interactive visual exploration system that enables exploratory analysis of diverse and rich data sets stored in the Earth System Grid Federation (ESGF). DV3D provides user-friendly workflow interfaces for advanced visualization and analysis of climate data at a level appropriate for scientists. The application builds on VTK, an open-source, object-oriented library, for visualization and analysis. DV3D provides the high-level interfaces, tools, and application integrations required to make the analysis and visualization power of VTK readily accessible to users without exposing burdensome details such as actors, cameras, renderers, and transfer functions. It can run as a desktop application or distributed over a set of nodes for hyperwall or distributed visualization applications. DV3D is structured as a set of modules which can be linked to create workflows in Vistrails. Figure 1 displays a typical DV3D workflow as it would appear in the Vistrails workflow builder interface of UV-CDAT and, on the right, the visualization spreadsheet output of the workflow. Each DV3D module encapsulates a complex VTK pipeline with numerous supporting objects. Each visualization module implements a unique interactive 3D display. The integrated Vistrails visualization spreadsheet offers multiple synchronized visualization displays for desktop or hyperwall. The currently available displays include volume renderers, volume slicers, 3D isosurfaces, 3D hovmoller, and various vector plots. The DV3D GUI offers a rich selection of interactive query, browse, navigate, and configure options for all displays. All configuration operations are saved as Vistrails provenance. DV3D's seamless integration with UV-CDAT's climate data management system (CDMS) and other climate data analysis tools provides a wide range of climate data analysis operations, e

There is a rising need for low-cost and scalable platforms for sensitive medical diagnostic testing. Fabric weaving is a mature, scalable manufacturing technology and can be used as a platform to manufacture microfluidic diagnostic tests with controlled, tunable flow. Given its scalability, low manufacturing cost (

We describe a method for assessing the visualization literacy (VL) of a user. Assessing how well people understand visualizations has great value for research (e.g., to avoid confounds), for design (e.g., to best determine the capabilities of an audience), for teaching (e.g., to assess the level of new students), and for recruiting (e.g., to assess the level of interviewees). This paper proposes a method for assessing VL based on Item Response Theory. It describes the design and evaluation of two VL tests for line graphs, and presents the extension of the method to bar charts and scatterplots. Finally, it discusses the reimplementation of these tests for fast, effective, and scalable web-based use. PMID:26356910

AstroVis enables rapid visualization of large data files on platforms supporting the OpenGL rendering library. Radio astronomical observations are typically three dimensional and stored as data cubes. AstroVis implements a scalable approach to accessing these files using three components: a File Access Component (FAC), which reduces the impact of reading time and so speeds up access to the data; an Image Processing Component (IPC), which breaks up the data cube into smaller pieces that can be processed locally and gives a representation of the whole file; and a Data Visualization component, which uses an Overview + Detail approach to reduce the dimensions of the data being worked with and the amount of memory required to store it. The result is a 3D display paired with a 2D detail display that contains a small subsection of the original file in full resolution without reducing the data in any way.

Cooperation has become a key word in the emerging Web 2.0 paradigm. However, the nature and motivations of the various behaviours related to this type of cooperative activity remain incompletely understood. Information visualization tools can play a crucial role in analysing the collected data. This paper presents a prototype for visualizing data about the Wikipedia history with a technique called ellimaps. In this context, the recent CGD algorithm is used to increase the scalability of the ellimaps approach.

We show that measurement-based quantum computation on scalable continuous-variable (CV) cluster states admits more quantum-circuit flexibility and compactness than similar protocols for standard square-lattice CV cluster states. This advantage is a direct result of the macronode structure of these states—that is, a lattice structure in which each graph node actually consists of several physical modes. These extra modes provide additional measurement degrees of freedom at each graph location, which can be used to manipulate the flow and processing of quantum information more robustly and with additional flexibility that is not available on an ordinary lattice.

The Simplex-Stochastic Collocation (SSC) method is a robust tool used to propagate uncertain input distributions through a computer code. However, it becomes prohibitively expensive for problems with dimensions higher than 5. The main purpose of this paper is to identify bottlenecks, and to improve upon this bad scalability. In order to do so, we propose an alternative interpolation stencil technique based upon the Set-Covering problem, and we integrate the SSC method in the High-Dimensional Model-Reduction framework. In addition, we address the issue of ill-conditioned sample matrices, and we present an analytical map to facilitate uniformly-distributed simplex sampling.
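The abstract mentions an analytical map for uniformly distributed simplex sampling but does not give it. As an illustration only, the sketch below uses the well-known uniform-spacings construction (sort uniform variates and take consecutive differences), which may differ from the map proposed in the paper; the function name is hypothetical.

```python
import random


def sample_simplex(dim: int, rng: random.Random) -> list[float]:
    """Draw one point uniformly from the standard `dim`-dimensional simplex.

    Returns dim + 1 non-negative barycentric coordinates that sum to 1.
    Uses the uniform-spacings construction, not necessarily the paper's map.
    """
    cuts = sorted(rng.random() for _ in range(dim))
    edges = [0.0] + cuts + [1.0]
    # Gaps between consecutive sorted cut points are the coordinates.
    return [b - a for a, b in zip(edges, edges[1:])]
```

A point inside an arbitrary simplex then follows by weighting its vertices with these coordinates.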

In this paper the authors present performance results from several parallel benchmarks and applications on a 400-node Linux cluster at Sandia National Laboratories. They compare the results on the Linux cluster to performance obtained on a traditional distributed-memory massively parallel processing machine, the Intel TeraFLOPS. They discuss the characteristics of these machines that influence the performance results and identify the key components of the system software that they feel are important to allow for scalability of commodity-based PC clusters to hundreds and possibly thousands of processors.

Here, we present the new UCL Bioinformatics Group's PSIPRED Protein Analysis Workbench. The Workbench unites all of our previously available analysis methods into a single web-based framework. The new web portal provides a greatly streamlined user interface with a number of new features to allow users to better explore their results. We offer a number of additional services to enable computationally scalable execution of our prediction methods; these include SOAP and XML-RPC web server access and new HADOOP packages. All software and services are available via the UCL Bioinformatics Group website at http://bioinf.cs.ucl.ac.uk/. PMID:23748958

A novel optoelectronically-controlled wideband 2-D phased-array antenna system is demonstrated. The inclusion of WDM devices makes the system structure highly scalable. Only (M+N) delay lines are required to control an M×N array. The optical true-time delay lines are a combination of polymer waveguides and optical switches, using a single polymeric platform, and are monolithically integrated on a single substrate. The 16 time delays generated by the device are measured to range from 0 to 175 ps in 11.6 ps steps. Far-field patterns at different steering angles in the X-band are measured.

DTI offers a unique opportunity to characterize the structural connectivity of the human brain non-invasively by tracing white matter fiber tracts. Whole brain tractography studies routinely generate up to half a million tracts per brain, which serve as edges in an extremely large 3D graph with up to half a million edges. Currently there is no agreed-upon method for constructing brain structural network graphs out of a large number of white matter tracts. In this paper, we present a scalable iterative framework called the ɛ-neighbor method for building a network graph and apply it to testing abnormal connectivity in autism.

In large-scale parallel applications a graph coloring is often carried out to schedule computational tasks. In this paper, we describe a new distributed memory algorithm for doing the coloring itself in parallel. The algorithm operates in an iterative fashion; in each round vertices are speculatively colored based on limited information, and then a set of incorrectly colored vertices, to be recolored in the next round, is identified. Parallel speedup is achieved in part by reducing the frequency of communication among processors. Experimental results on a PC cluster using up to 16 processors show that the algorithm is scalable.
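The round structure described above (speculative coloring on limited information, then detection of conflicts to recolor in the next round) can be sketched as a sequential simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the round-robin partitioning, the greedy first-fit color choice, and the rule that the higher-numbered endpoint of a conflicting edge is recolored are all hypothetical choices, and vertices are assumed to be integers.

```python
def speculative_coloring(adjacency, num_parts=4):
    """Simulate round-based speculative coloring on `num_parts` pretend processors.

    adjacency: dict mapping each integer vertex to the set of its neighbors
               (undirected graph, so every edge appears in both directions).
    Returns (color, rounds) where color maps vertex -> non-negative int.
    """
    vertices = sorted(adjacency)
    owner = {v: i % num_parts for i, v in enumerate(vertices)}  # assumed partitioning
    color = {}
    to_color = list(vertices)
    rounds = 0
    while to_color:
        rounds += 1
        # Colors of remote vertices are only known as of the start of the round,
        # which stands in for communicating once per round.
        snapshot = dict(color)
        for part in range(num_parts):
            for v in to_color:
                if owner[v] != part:
                    continue
                forbidden = set()
                for u in adjacency[v]:
                    known = color if owner[u] == part else snapshot
                    if u in known:
                        forbidden.add(known[u])
                c = 0
                while c in forbidden:  # greedy first-fit choice
                    c += 1
                color[v] = c
        # Conflict detection: an edge across parts whose endpoints share a color.
        recolor = set()
        for v in to_color:
            for u in adjacency[v]:
                if owner[u] != owner[v] and color[u] == color[v]:
                    recolor.add(max(u, v))
        to_color = sorted(recolor)
    return color, rounds
```

Because the lowest-numbered vertex colored in a round is never selected for recoloring, the set of vertices to recolor shrinks every round and the loop terminates with a proper coloring.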

Most file system software is written for conventional local file systems; it is serialized and cannot take advantage of a large-scale parallel file system. The "pcircle" software builds on the ubiquitous MPI in cluster computing environments and the "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular, it implements parallel data copy and parallel data checksumming, with advanced features such as asynchronous progress reporting, checkpoint and restart, and integrity checking.

Due to the proliferation of mobile devices connected to the Internet, implementing a secure and practical Mobile IP has become an important goal. Mobile IP cannot work properly without authentication between the mobile node (MN), the home agent (HA) and the foreign agent (FA). In this paper, we propose a practical Mobile IP authentication protocol that uses public key cryptography only during the initial authentication. The proposed scheme is compatible with the conventional Mobile IP protocol and scales with the number of MNs. We also show that the proposed protocol offers secure operation.

We present a scalable decoding architecture for a certain class of structured LDPC codes. The codes are designed using a small (n,r) protograph that is replicated Z times to produce a decoding graph for a (Z x n, Z x r) code. Using this architecture, we have implemented a decoder for a (4096,2048) LDPC code on a Xilinx Virtex-II 2000 FPGA, and achieved decoding speeds of 31 Mbps with 10 fixed iterations. The implemented message-passing algorithm uses an optimized 3-bit non-uniform quantizer that operates with 0.2dB implementation loss relative to a floating point decoder.
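The replication of a small protograph Z times can be illustrated by lifting a base matrix into a full parity-check matrix. The sketch below assumes the common construction in which each edge of the protograph becomes a Z×Z circulant permutation; the abstract does not say which permutations are used, and the base matrix and shift values in the usage note are made up for illustration.

```python
def lift_protograph(base, shifts, Z):
    """Expand a protograph base matrix into a parity-check matrix.

    base:   r x n list of lists with 0/1 entries (1 = edge in the protograph).
    shifts: r x n list of lists of circulant shifts (ignored where base is 0).
    Z:      number of copies of the protograph.
    Returns an (r*Z) x (n*Z) list of lists with 0/1 entries.
    """
    r, n = len(base), len(base[0])
    H = [[0] * (n * Z) for _ in range(r * Z)]
    for i in range(r):
        for j in range(n):
            if base[i][j]:
                s = shifts[i][j] % Z
                # Place an identity matrix cyclically shifted by s columns.
                for k in range(Z):
                    H[i * Z + k][j * Z + (k + s) % Z] = 1
    return H
```

For example, `lift_protograph([[1, 1, 1, 0], [0, 1, 1, 1]], [[0, 1, 2, 0], [0, 3, 1, 2]], 4)` yields an 8×16 matrix whose row and column weights match those of the base matrix, which is the regularity a decoder can exploit by processing Z copies in parallel.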

A scalable architecture for wireless digital data and voice communications via ad hoc networks has been proposed. Although the details of the architecture and of its implementation in hardware and software have yet to be developed, the broad outlines of the architecture are fairly clear: This architecture departs from current commercial wireless communication architectures, which are characterized by low effective bandwidth per user and are not well suited to low-cost, rapid scaling in large metropolitan areas. This architecture is inspired by a vision more akin to that of more than two dozen noncommercial community wireless networking organizations established by volunteers in North America and several European countries.

To improve access to a key synthetic intermediate we targeted a direct hydrobromination-Negishi route. Unsurprisingly, the anti-Markovnikov addition of HBr to estragole in the presence of AIBN proved successful. However, even in the absence of an added initiator, anti-Markovnikov addition was observed. Re-examination of early reports revealed that selective Markovnikov addition, often simply termed "normal" addition, is not always observed with HBr unless air is excluded, leading to the rediscovery of a reproducible and scalable initiator-free protocol. PMID:27185636

This paper presents an ADBASE-based parallel algorithm for solving multiple objective linear programs (MOLPs). Job balance, speedup and scalability are of primary interest in evaluating the efficiency of the new algorithm. Implementation results on Intel iPSC/2 and Paragon multiprocessors show that the algorithm significantly speeds up the process of solving MOLPs, which is understood as generating all or some efficient extreme points and unbounded efficient edges. The algorithm gives especially good results for large and very large problems. Motivation and justification for solving such large MOLPs are also included.

A modular, scalable focal plane array is provided as an array of integrated circuit dice, wherein each die includes a given amount of modular pixel array circuitry. The array of dice effectively multiplies the amount of modular pixel array circuitry to produce a larger pixel array without increasing die size. Desired pixel pitch across the enlarged pixel array is preserved by forming die stacks with each pixel array circuitry die stacked on a separate die that contains the corresponding signal processing circuitry. Techniques for die stack interconnections and die stack placement are implemented to ensure that the desired pixel pitch is preserved across the enlarged pixel array.

Large-scale information processing environments must rapidly search through massive streams of raw data to locate useful information. These data streams contain textual and numeric data items, and may be highly structured or mostly freeform text. This project aims to create a high performance and scalable engine for locating relevant content in data streams. Based on the J2EE Java Messaging Service (JMS), the content-based messaging (CBM) engine provides highly efficient message formatting and filtering. This paper describes the design of the CBM engine, and presents empirical results that compare the performance with a standard JMS to demonstrate the performance improvements that are achieved.

Alternative splicing is a process by which the same DNA sequence is used to assemble different proteins, called protein isoforms. Alternative splicing works by selectively omitting some of the coding regions (exons) typically associated with a gene. Detection of alternative splicing is difficult and uses a combination of advanced data acquisition methods and statistical inference. Knowledge about the abundance of isoforms is important for understanding both normal processes and diseases and to eventually improve treatment through targeted therapies. The data, however, is complex and current visualizations for isoforms are neither perceptually efficient nor scalable. To remedy this, we developed Vials, a novel visual analysis tool that enables analysts to explore the various datasets that scientists use to make judgments about isoforms: the abundance of reads associated with the coding regions of the gene, evidence for junctions, i.e., edges connecting the coding regions, and predictions of isoform frequencies. Vials is scalable as it allows for the simultaneous analysis of many samples in multiple groups. Our tool thus enables experts to (a) identify patterns of isoform abundance in groups of samples and (b) evaluate the quality of the data. We demonstrate the value of our tool in case studies using publicly available datasets. PMID:26529712

We investigate the design of declarative, domain-specific languages for constructing interactive visualizations. By separating specification from execution, declarative languages can simplify development, enable unobtrusive optimization, and support retargeting across platforms. We describe the design of the Protovis specification language and its implementation within an object-oriented, statically-typed programming language (Java). We demonstrate how to support rich visualizations without requiring a toolkit-specific data model and extend Protovis to enable declarative specification of animated transitions. To support cross-platform deployment, we introduce rendering and event-handling infrastructures decoupled from the runtime platform, letting designers retarget visualization specifications (e.g., from desktop to mobile phone) with reduced effort. We also explore optimizations such as runtime compilation of visualization specifications, parallelized execution, and hardware-accelerated rendering. We present benchmark studies measuring the performance gains provided by these optimizations and compare performance to existing Java-based visualization tools, demonstrating scalability improvements exceeding an order of magnitude. PMID:20975153

Graphs are a vital way of organizing data with complex correlations. A good visualization of a graph can fundamentally change human understanding of the data. Consequently, there is a rich body of work on graph visualization. Although there are many techniques that are effective on small to medium sized graphs (tens of thousands of nodes), there is a void in the research for visualizing massive graphs containing millions of nodes. Sandia is one of the few entities in the world that has the means and motivation to handle data on such a massive scale. For example, homeland security generates graphs from prolific media sources such as television, telephone, and the Internet. The purpose of this project is to provide the groundwork for visualizing such massive graphs. The research addresses two major feature gaps: a parallel, interactive visualization framework, and scalable algorithms that make the framework usable in practical applications. Both the framework and the algorithms are designed to run on distributed parallel computers, which are already available at Sandia. Some features are integrated into the ThreatView™ application, and future work will integrate further parallel algorithms.

A neutron generator is provided with a flat, rectilinear geometry and surface mounted metallizations. This construction provides scalability and ease of fabrication, and permits multiple ion source functionalities.

The increasing need for remote medical investigation services in the framework of collaborative multidisciplinary meetings (e.g. cancer follow-up) raises the challenge of on-line remote access to large amounts of radiologic data in a limited period of time. This paper proposes a scalable compression framework for DICOM images that provides low-latency display over low-speed networks. The approach relies on removing useless information from images (i.e. information not related to the patient's body) and on exploiting the JPEG2000 standard to achieve progressive quality encoding and access of the data. This mechanism also allows any idle time (corresponding to on-line visual image analysis) to be used to download the remaining data at lossless quality in a way that is transparent to the user, thus minimizing the perceived latency. Experiments comparing the approach with the exchange of uncompressed or JPEG lossless compressed DICOM data showed its benefit for collaborative on-line remote diagnosis and follow-up services.

The selective enhancement mechanism of Fine-Granular-Scalability (FGS) in MPEG-4 is able to enhance specific objects under bandwidth variation. A novel technique for self-adaptive enhancement of regions of interest based on the Motion Vectors (MVs) of the base layer is proposed. It is suited to video sequences with a still background in which only the moving objects are of interest, such as news broadcasting, video surveillance, and Internet education. Motion vectors generated during base layer encoding are obtained and analyzed. A Gaussian model is introduced to describe non-moving macroblocks that may have non-zero MVs caused by random noise or luminance variation. The MVs of these macroblocks are set to zero to prevent them from being enhanced. A region-growing segmentation algorithm based on MV values is used to separate foreground from background. Post-processing is needed to reduce the influence of burst noise so that only the moving regions of interest are left. In our experiments, applying the result to selective enhancement during enhancement layer encoding significantly improves the visual quality of the regions of interest in such video transmitted at different bit rates.

Technical evolution in the field of information technology has changed many aspects of industry and of human life. Internet and broadcasting technologies are core ingredients of this revolution. Various new services that were never possible before are now available to the general public through these technologies. Multimedia service over IP networks has become one of the most easily accessible services today. Technical advances in Internet services, constantly increasing network bandwidth capacity, and the evolution of multimedia technologies have made the demand for multimedia streaming services increase explosively. With this increasing demand, the Internet is becoming deluged with multimedia traffic. Although multimedia streaming services have become indispensable, the quality of a multimedia service over the Internet cannot be technically guaranteed. Users now demand multimedia services whose quality is competitive with traditional TV broadcasting and that offer additional functionalities. Such additional functionalities include interactivity, scalability, and adaptability. Multimedia that comprises these ancillary functionalities is often called richmedia. To satisfy these requirements, the Interactive Scalable Multimedia Streaming (ISMuS) platform was designed and developed. In this paper, the architecture, implementation, and additional functionalities of the ISMuS platform are presented. The platform is capable of providing user interactions based on MPEG-4 Systems technology [1] and supporting efficient multimedia distribution through an overlay network technology. Loaded with feature-rich technologies, the platform can serve both on-demand and broadcast-like richmedia services.

Along with the emergence of massive graph-modeled data, it is of great importance to investigate graph similarity joins due to their wide applications for multiple purposes, including data cleaning, and near duplicate detection. This paper considers graph similarity joins with edit distance constraints, which return pairs of graphs such that their edit distances are no larger than a given threshold. Leveraging the MapReduce programming model, we propose MGSJoin, a scalable algorithm following the filtering-verification framework for efficient graph similarity joins. It relies on counting overlapping graph signatures for filtering out nonpromising candidates. With the potential issue of too many key-value pairs in the filtering phase, spectral Bloom filters are introduced to reduce the number of key-value pairs. Furthermore, we integrate the multiway join strategy to boost the verification, where a MapReduce-based method is proposed for GED calculation. The superior efficiency and scalability of the proposed algorithms are demonstrated by extensive experimental results. PMID:25121135

We describe here the optical properties of a scalable nano-mesh film, both experimentally measured and calculated by FDTD numerical modeling. Typically, applications for optically responsive nano-plasmonic or photonic films are limited by the available tractable fabrication techniques to several hundred microns or a few millimeters in size. The films described here have been demonstrated over an extent of several inches and could be readily scaled to larger sizes. The films comprise a quasi-regular periodic array of nanoscale holes in a metallic film. The nanostructure is fabricated in a scalable, multi-step fashion via sputtering on a nanoscale template created by nanoparticle self-assembly. Both the numerical modeling and the experimentally measured scattering demonstrate that these films are highly resonant, with the resonance location in the visible or near infrared and set by the hole size and pattern geometry. Such films can also readily be made on flexible substrates if desired. Potential applications include newly proposed photonic thermal management coatings or plasmoelectric devices.

Energy models of existing buildings are unreliable unless calibrated so they correlate well with actual energy usage. Manual tuning requires a skilled professional, is prohibitively expensive for small projects, imperfect, non-repeatable, non-transferable, and not scalable to the dozens of sensor channels that smart meters, smart appliances, and cheap/ubiquitous sensors are beginning to make available today. A scalable, automated methodology is needed to quickly and intelligently calibrate building energy models to all available data, increase the usefulness of those models, and facilitate speed-and-scale penetration of simulation-based capabilities into the marketplace for actualized energy savings. The "Autotune" project is a novel, model-agnostic methodology which leverages supercomputing, large simulation ensembles, and big data mining with multiple machine learning algorithms to allow automatic calibration of simulations that match measured experimental data in a way that is deployable on commodity hardware. This paper shares several methodologies employed to reduce the combinatorial complexity to a computationally tractable search problem for hundreds of input parameters. Furthermore, accuracy metrics are provided which quantify model error to measured data for either monthly or hourly electrical usage from a highly-instrumented, emulated-occupancy research home.

Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial location. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. In conclusion, our work greatly extends the usability of distance fields for demanding applications.

This paper introduces a novel approach to facilitating image search based on a compact semantic embedding. A novel method is developed to explicitly map concepts and image contents into a unified latent semantic space for the representation of semantic concept prototypes. Then, a linear embedding matrix is learned that maps images into the semantic space, such that each image is closer to its relevant concept prototype than other prototypes. In our approach, the semantic concepts equated with query keywords and the images mapped into the vicinity of the prototype are retrieved by our scheme. In addition, a computationally efficient method is introduced to incorporate new semantic concept prototypes into the semantic space by updating the embedding matrix. This novelty improves the scalability of the method and allows it to be applied to dynamic image repositories. Therefore, the proposed approach not only narrows semantic gap but also supports an efficient image search process. We have carried out extensive experiments on various cross-modality image search tasks over three widely-used benchmark image datasets. Results demonstrate the superior effectiveness, efficiency, and scalability of our proposed approach. PMID:25248210

New methods and strategies for the direct functionalization of C-H bonds are beginning to reshape the field of retrosynthetic analysis, affecting the synthesis of natural products, medicines and materials. The oxidation of allylic systems has played a prominent role in this context as possibly the most widely applied C-H functionalization, owing to the utility of enones and allylic alcohols as versatile intermediates, and their prevalence in natural and unnatural materials. Allylic oxidations have featured in hundreds of syntheses, including some natural product syntheses regarded as "classics". Despite many attempts to improve the efficiency and practicality of this transformation, the majority of conditions still use highly toxic reagents (based around toxic elements such as chromium or selenium) or expensive catalysts (such as palladium or rhodium). These requirements are problematic in industrial settings; currently, no scalable and sustainable solution to allylic oxidation exists. This oxidation strategy is therefore rarely used for large-scale synthetic applications, limiting the adoption of this retrosynthetic strategy by industrial scientists. Here we describe an electrochemical C-H oxidation strategy that exhibits broad substrate scope, operational simplicity and high chemoselectivity. It uses inexpensive and readily available materials, and represents a scalable allylic C-H oxidation (demonstrated on 100 grams), enabling the adoption of this C-H oxidation strategy in large-scale industrial settings without substantial environmental impact. PMID:27096371

A report discusses the continuing development of a scalable multiprocessor computing system for hard real-time applications aboard a spacecraft. "Hard realtime applications" signifies applications, like real-time radar signal processing, in which the data to be processed are generated at "hundreds" of pulses per second, each pulse "requiring" millions of arithmetic operations. In these applications, the digital processors must be tightly integrated with analog instrumentation (e.g., radar equipment), and data input/output must be synchronized with analog instrumentation, controlled to within fractions of a microsecond. The scalable multiprocessor is a cluster of identical commercial-off-the-shelf generic DSP (digital-signal-processing) computers plus generic interface circuits, including analog-to-digital converters, all controlled by software. The processors are computers interconnected by high-speed serial links. Performance can be increased by adding hardware modules and correspondingly modifying the software. Work is distributed among the processors in a parallel or pipeline fashion by means of a flexible master/slave control and timing scheme. Each processor operates under its own local clock; synchronization is achieved by broadcasting master time signals to all the processors, which compute offsets between the master clock and their local clocks.
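The synchronization scheme described above, in which each processor computes an offset between the broadcast master time and its own local clock, can be sketched as follows. This is a minimal illustration, not the system's software: the function names are hypothetical, and the `link_delay` correction is an assumption that the report summary does not mention.

```python
def clock_offset(master_time: float, local_time_at_receipt: float,
                 link_delay: float = 0.0) -> float:
    """Offset to add to the local clock to obtain master time.

    master_time: timestamp carried in the broadcast master time signal.
    local_time_at_receipt: local clock reading when the signal arrived.
    link_delay: assumed known propagation delay of the broadcast (hypothetical).
    """
    return (master_time + link_delay) - local_time_at_receipt


def to_master_time(local_time: float, offset: float) -> float:
    """Convert a local clock reading to master time using a stored offset."""
    return local_time + offset
```

Each processor keeps running on its own clock and only applies the stored offset when it needs to align data input/output with the master timeline.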

This research describes Fountain, a suite of programs used to monitor the resources of a cluster. A cluster is a collection of individual computers that are connected via a high speed communication network. Clusters are traditionally used by users who desire more resources, such as processing power and memory, than any single computer can provide. A common drawback to effectively utilizing such a large-scale system is the management infrastructure, which often does not scale well as the system grows. Large-scale parallel systems provide new research challenges in the area of systems software, the programs or tools that manage the system from boot-up to running a parallel job. The approach presented in this thesis utilizes a collection of separate components that communicate with each other to achieve a common goal. While systems software comprises a broad array of components, this thesis focuses on the design choices for a node monitoring component. We will describe Fountain, an implementation of the Scalable Systems Software (SSS) node monitor specification. It is targeted at aggregate node monitoring for clusters, focusing on both scalability and fault tolerance as its design goals. It leverages widely used technologies such as XML and HTTP to present an interface to other components in the SSS environment.

This paper describes the two-layer scalable wavelength routing optical interconnection network being developed at Tianjin University. The top layer of the network is a multi-wavelength bi-directional optical bus, which has high bandwidth and low latency. The optical bus is made up of passive components; no wavelength-tunable devices are used. As a result, the optical bus has low communication latency that is mainly decided by the optical fiber length. The sub-layer of the network is a single-wavelength ring, which has low communication latency and high scalability. In each ring, a wavelength routing node is used for data transmission between the ring and the optical bus. Each node computer is connected to the ring using an optical network interface card, which is based on the peripheral component interconnect bus. The communication latency inside the ring is decreased using a synchronous pipelining transmission technique. The scale of the ring is mainly limited by the effective bandwidth required by each node computer. The number of rings is mainly decided by the optical power of the laser diodes and the sensitivity of the optical detectors. If an Erbium-doped fiber amplifier is used in the optical bus, the scale of the network can be extended further.

NOA is a multi-parent, N-tiered, hierarchical clustering algorithm that provides a scalable, robust and reliable solution to autonomous configuration of large-scale wireless sensor networks. The novel clustering hierarchy's inherent benefits can be utilized by in-network data processing techniques to provide equally robust, reliable and scalable in-network data processing solutions capable of reducing the amount of data sent to sinks. Utilizing a multi-parent framework, NOA reduces the cost of network setup when compared to hierarchical beaconing solutions by removing the expense of r-hop broadcasting (r is the radius of the cluster) needed to build the network and instead passes network topology information among shared children. NOA2, a two-parent clustering hierarchy solution, and NOA3, the three-parent variant, saw up to an 83% and 72% reduction in overhead, respectively, when compared to performing one round of a one-parent hierarchical beaconing, as well as 92% and 88% less overhead when compared to one round of two- and three-parent hierarchical beaconing hierarchy.

Future extreme-scale high-performance computing systems will be required to work under frequent component failures. The MPI Forum's User Level Failure Mitigation proposal has introduced an operation, MPI_Comm_shrink, to synchronize the alive processes on the list of failed processes, so that applications can continue to execute even in the presence of failures by adopting algorithm-based fault tolerance techniques. This MPI_Comm_shrink operation requires a fault tolerant failure detection and consensus algorithm. This paper presents and compares two novel failure detection and consensus algorithms. The proposed algorithms are based on Gossip protocols and are inherently fault-tolerant and scalable. The proposed algorithms were implemented and tested using the Extreme-scale Simulator. The results show that in both algorithms the number of Gossip cycles to achieve global consensus scales logarithmically with system size. The second algorithm also shows better scalability in terms of memory and network bandwidth usage and a perfect synchronization in achieving global consensus.
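The logarithmic growth in Gossip cycles reported above can be illustrated with a small simulation. The sketch below is not either of the paper's two algorithms; it assumes a plain push-gossip model in which process 0 initially knows about a failure and, in each cycle, every informed process forwards that knowledge to one uniformly random peer. The function name `gossip_cycles` and the seeding scheme are hypothetical.

```python
import random


def gossip_cycles(n, seed=0):
    """Count push-gossip cycles until all n live processes know of a failure.

    Assumption: process 0 is the only one that knows at the start, and each
    informed process contacts exactly one random peer per cycle.
    """
    rng = random.Random(seed)
    informed = [False] * n
    informed[0] = True
    cycles = 0
    while not all(informed):
        # Snapshot the senders so knowledge gained in this cycle
        # is only forwarded from the next cycle onward.
        senders = [i for i in range(n) if informed[i]]
        for _ in senders:
            informed[rng.randrange(n)] = True
        cycles += 1
    return cycles


if __name__ == "__main__":
    for size in (64, 256, 1024, 4096):
        print(size, gossip_cycles(size))
```

Because the informed set can at most double per cycle, at least log2(n) cycles are always needed in this model; the printed counts let a reader compare how slowly the cycle count grows as the system size is multiplied.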

Water vapor condensation is commonly observed in nature and routinely used as an effective means of transferring heat with dropwise condensation on nonwetting surfaces exhibiting heat transfer improvement compared to filmwise condensation on wetting surfaces. However, state-of-the-art techniques to promote dropwise condensation rely on functional hydrophobic coatings that either have challenges with chemical stability or are so thick that any potential heat transfer improvement is negated due to the added thermal resistance of the coating. In this work, we show the effectiveness of ultrathin scalable chemical vapor deposited (CVD) graphene coatings to promote dropwise condensation while offering robust chemical stability and maintaining low thermal resistance. Heat transfer enhancements of 4× were demonstrated compared to filmwise condensation, and the robustness of these CVD coatings was superior to typical hydrophobic monolayer coatings. Our results indicate that graphene is a promising surface coating to promote dropwise condensation of water in industrial conditions with the potential for scalable application via CVD. PMID:25826223

New methods and strategies for the direct functionalization of C–H bonds are beginning to reshape the field of retrosynthetic analysis, affecting the synthesis of natural products, medicines and materials. The oxidation of allylic systems has played a prominent role in this context as possibly the most widely applied C–H functionalization, owing to the utility of enones and allylic alcohols as versatile intermediates, and their prevalence in natural and unnatural materials. Allylic oxidations have featured in hundreds of syntheses, including some natural product syntheses regarded as “classics”. Despite many attempts to improve the efficiency and practicality of this transformation, the majority of conditions still use highly toxic reagents (based around toxic elements such as chromium or selenium) or expensive catalysts (such as palladium or rhodium). These requirements are problematic in industrial settings; currently, no scalable and sustainable solution to allylic oxidation exists. This oxidation strategy is therefore rarely used for large-scale synthetic applications, limiting the adoption of this retrosynthetic strategy by industrial scientists. Here we describe an electrochemical C–H oxidation strategy that exhibits broad substrate scope, operational simplicity and high chemoselectivity. It uses inexpensive and readily available materials, and represents a scalable allylic C–H oxidation (demonstrated on 100 grams), enabling the adoption of this C–H oxidation strategy in large-scale industrial settings without substantial environmental impact.

Collective communication operations, used by many scientific applications, tend to limit overall parallel application performance and scalability. Computer systems are becoming more heterogeneous with increasing node and core-per-node counts. Also, a growing number of data-access mechanisms, of varying characteristics, are supported within a single computer system. We describe a new hierarchical collective communication framework that takes advantage of hardware-specific data-access mechanisms. It is flexible, with run-time hierarchy specification and sharing of collective communication primitives between collective algorithms. Data buffers are shared between levels in the hierarchy, reducing collective communication management overhead. We have implemented several versions of the Message Passing Interface (MPI) collective operations MPI_Barrier() and MPI_Bcast(), and run experiments using up to 49,152 processes on a Cray XT5 and a small InfiniBand-based cluster. At 49,152 processes our barrier implementation outperforms the optimized native implementation by 75%. 32-byte and one-megabyte broadcasts outperform it by 62% and 11%, respectively, with better scalability characteristics. Improvements relative to the default Open MPI implementation are much larger.

Motivation: Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. Results: A method is presented to shift computational costs to an off-line computation by creating a taxonomy/genome index that supports scalable metagenomic classification. Scalable performance is demonstrated on real and simulated data to show accurate classification in the presence of novel organisms on samples that include viruses, prokaryotes, fungi and protists. Taxonomic classification of the previously published 150 giga-base Tyrolean Iceman dataset was found to take <20 h on a single node 40 core large memory machine and provide new insights on the metagenomic contents of the sample. Availability: Software was implemented in C++ and is freely available at http://sourceforge.net/projects/lmat Contact: allen99@llnl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23828782

Although images are pervasive in public policy debates in bioethics, few who work in the field attend carefully to the way that images function rhetorically. If the use of images is discussed at all, it is usually to dismiss appeals to images as a form of manipulation. Yet it is possible to speak meaningfully of visual arguments. Examining the appeal to images of the embryo and fetus in debates about abortion and stem cell research, I suggest that bioethicists would be well served by attending much more carefully to how images function in public policy debates. PMID:19085479

Background: Whole-slide imaging (WSI), while technologically mature, remains in the early adopter phase of the technology adoption lifecycle. One reason for this current situation is that current methods of visualizing and using WSI closely follow long-existing workflows for glass slides. We set out to “reimagine” the digital microscope in the era of cloud computing by combining WSI with the rich collaborative environment of the Scalable Adaptive Graphics Environment (SAGE). SAGE is a cross-platform, open-source visualization and collaboration tool that enables users to access, display and share a variety of data-intensive information, in a variety of resolutions and formats, from multiple sources, on display walls of arbitrary size. Methods: A prototype of a WSI viewer app in the SAGE environment was created. While not full featured, it enabled the testing of our hypothesis that these technologies could be blended together to change the essential nature of how microscopic images are utilized for patient care, medical education, and research. Results: Using the newly created WSI viewer app, demonstration scenarios were created in the patient care and medical education scenarios. This included a live demonstration of a pathology consultation at the International Academy of Digital Pathology meeting in Boston in November 2014. Conclusions: SAGE is well suited to display, manipulate and collaborate using WSIs, along with other images and data, for a variety of purposes. It goes beyond how glass slides and current WSI viewers are being used today, changing the nature of digital pathology in the process. A fully developed WSI viewer app within SAGE has the potential to encourage the wider adoption of WSI throughout pathology. PMID:26110092

Pursuits in geological sciences and other branches of quantitative sciences often require data visualization frameworks that are in continual need of improvement and new ideas. Virtual reality is a visualization medium with a large audience that was originally designed for gaming purposes. Virtual reality can be delivered in CAVE-like environments, but these are unwieldy and expensive to maintain. Recent efforts by major companies such as Facebook have focused more on a large market; the Oculus is the first of this kind of mobile device. The Unity engine makes it possible for us to convert the data files into a mesh of isosurfaces and render them in 3D. A user is immersed inside the virtual reality and is able to move within and around the data using arrow keys and other steering devices, similar to those employed with the Xbox. With the introduction of products like the Oculus Rift and HoloLens, combined with ever-increasing mobile computing strength, mobile virtual reality data visualization can be implemented for better analysis of 3D geological and mineralogical data sets. As more new products like the Surface Pro 4 and other high-power yet very mobile computers are introduced to the market, the RAM and graphics card capacity necessary to run these models is more available, opening doors to this new reality. The computing requirements needed to run these models are a mere 8 GB of RAM and 2 GHz of CPU speed, which many mobile computers are starting to exceed. Using Unity 3D software to create a virtual environment containing a visual representation of the data, any data set converted into FBX or OBJ format can be traversed by wearing the Oculus Rift device. This new method for analysis in conjunction with 3D scanning has potential applications in many fields, including the analysis of precious stones or jewelry. Using hologram technology to capture in high-resolution the 3D shape, color, and imperfections of minerals and stones, detailed review and

The visualization of 3D groundwater flow is a challenging task. Previous versions of our software STRING [1] solely focused on intuitive visualization of complex flow scenarios for non-professional audiences. STRING, developed by Fraunhofer ITWM (Kaiserslautern, Germany) and delta h Ingenieurgesellschaft mbH (Witten, Germany), provides the necessary means for visualization of both 2D and 3D data on planar and curved surfaces. In this contribution we discuss how to extend this approach to a full 3D tool and its challenges in continuation of Michel et al. [2]. This elevates STRING from a post-production to an exploration tool for experts. In STRING moving pathlets provide an intuition of velocity and direction of both steady-state and transient flows. The visualization concept is based on the Lagrangian view of the flow. To capture every detail of the flow an advanced method for intelligent, time-dependent seeding is used building on the Finite Pointset Method (FPM) developed by Fraunhofer ITWM. Lifting our visualization approach from 2D into 3D provides many new challenges. With the implementation of a seeding strategy for 3D one of the major problems has already been solved (see Schröder et al. [3]). As pathlets only provide an overview of the velocity field other means are required for the visualization of additional flow properties. We suggest the use of Direct Volume Rendering and isosurfaces for scalar features. In this regard we were able to develop an efficient approach for combining the rendering through raytracing of the volume and regular OpenGL geometries. This is achieved through the use of Depth Peeling or A-Buffers for the rendering of transparent geometries. Animation of pathlets requires a strict boundary of the simulation domain. Hence, STRING needs to extract the boundary, even from unstructured data, if it is not provided. In 3D we additionally need a good visualization of the boundary itself. For this the silhouette based on the angle of

The chapter presents ten selected user interfaces and interaction challenges in extreme-scale visual analytics. The study of visual analytics is often referred to as 'the science of analytical reasoning facilitated by interactive visual interfaces' in the literature. The discussion focuses on the issues of applying visual analytics technologies to extreme-scale scientific and non-scientific data ranging from petabyte to exabyte in sizes. The ten challenges are: in situ interactive analysis, user-driven data reduction, scalability and multi-level hierarchy, representation of evidence and uncertainty, heterogeneous data fusion, data summarization and triage for interactive query, analytics of temporally evolving features, the human bottleneck, design and engineering development, and the Renaissance of conventional wisdom. The discussion addresses concerns that arise from different areas of hardware, software, computation, algorithms, and human factors. The chapter also evaluates the likelihood of success in meeting these challenges in the near future.

Advances in modeling and simulation for General Recognition Theory have produced more data than can be easily visualized using traditional techniques. In this area of psychological modeling, domain experts are struggling to find effective ways to compare large-scale simulation results. This paper describes methods that adapt the web-based D3 visualization framework combined with pre-processing tools to enable domain specialists to more easily interpret their data. The D3 framework utilizes Javascript and scalable vector graphics (SVG) to generate visualizations that can run readily within the web browser for domain specialists. Parallel coordinate plots and heat maps were developed for identification-confusion matrix data, and the results were shown to a GRT expert for an informal evaluation of their utility. There is a clear benefit to model interpretation from these visualizations when researchers need to interpret larger amounts of simulated data.

Graphene, a newly discovered and extensively investigated material, has many unique and extraordinary properties which promise major technological advances in fields ranging from electronics to mechanical engineering and food production. Unfortunately, complex techniques and high production costs hinder commonplace applications. Scaling of existing graphene production techniques to the industrial level without compromising its properties is a current challenge. This article focuses on the perspectives and challenges of scalability, equipment, and technological perspectives of the plasma-based techniques which offer many unique possibilities for the synthesis of graphene and graphene-containing products. The plasma-based processes are amenable to scaling and could also be useful to enhance the controllability of the conventional chemical vapour deposition method and some other techniques, and to ensure a good quality of the produced graphene. We examine the unique features of the plasma-enhanced graphene production approaches, including the techniques based on inductively-coupled and arc discharges, in the context of their potential scaling to mass production following the generic scaling approaches applicable to the existing processes and systems. This work analyses a large amount of the recent literature on graphene production by various techniques and summarizes the results in a tabular form to provide a simple and convenient comparison of several available techniques. Our analysis reveals a significant potential of scalability for plasma-based technologies, based on the scaling-related process characteristics. Among other processes, a greater yield of 1 g × h⁻¹ m⁻² was reached for the arc discharge technology, whereas the other plasma-based techniques show process yields comparable to the neutral-gas based methods. Selected plasma-based techniques show lower energy consumption than in thermal CVD processes, and the ability to produce graphene flakes of various

Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoW) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ² and intersection kernels; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach. PMID:26742135

Narrative visualizations combine conventions of communicative and exploratory information visualization to convey an intended story. We demonstrate visualization rhetoric as an analytical framework for understanding how design techniques that prioritize particular interpretations in visualizations that "tell a story" can significantly affect end-user interpretation. We draw a parallel between narrative visualization interpretation and evidence from framing studies in political messaging, decision-making, and literary studies. Devices for understanding the rhetorical nature of narrative information visualizations are presented, informed by the rigorous application of concepts from critical theory, semiotics, journalism, and political theory. We draw attention to how design tactics represent additions or omissions of information at various levels (the data, visual representation, textual annotations, and interactivity) and how visualizations denote and connote phenomena with reference to unstated viewing conventions and codes. Classes of rhetorical techniques identified via a systematic analysis of recent narrative visualizations are presented, and characterized according to their rhetorical contribution to the visualization. We describe how designers and researchers can benefit from the potentially positive aspects of visualization rhetoric in designing engaging, layered narrative visualizations and how our framework can shed light on how a visualization design prioritizes specific interpretations. We identify areas where future inquiry into visualization rhetoric can improve understanding of visualization interpretation. PMID:22034342

Statistical clustering is critical in designing scalable image retrieval systems. This paper presents a scalable algorithm for indexing and retrieving images based on region segmentation. The method uses statistical clustering on region features and IRM (Integrated Region Matching), a measure developed to evaluate overall similarity between images…

Today's climate datasets are characterized by large volume, a high degree of spatiotemporal complexity, and rapid evolution over time. As visualizing large-volume distributed climate datasets is computationally intensive, traditional desktop-based visualization applications fail to handle the computational intensity. Recently, scientists have developed remote visualization techniques to address the computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients over the network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built on ParaView, which is one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support the deployment of the platform. In this platform, all climate datasets are regular grid data stored in NetCDF format. Three types of data access methods are supported in the platform: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server, and accessing local datasets. Regardless of the data access method, all visualization tasks are completed on the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computational limitations of desktop-based visualization applications.

The Scalable Coherent Interface standard defines a new generation of interconnection that spans the full range from supercomputer memory 'bus' to campus-wide network. SCI provides bus-like services and a shared-memory software model while using an underlying packet protocol on many independent communication links. Initially these links are 1 GByte/s (wires) and 1 GBit/s (fiber), but the protocol scales well to future faster or lower-cost technologies. The interconnect may use switches, meshes, and rings. The SCI distributed-shared-memory model is simple and versatile, enabling for the first time a smooth integration of highly parallel multiprocessors, workstations, personal computers, I/O, networking and data acquisition.

In this manuscript, we develop printable graphene ink through a solvent-exchange method. Printable graphene ink in ethanol and water free of any surfactant is dependent on matching the surface tension of the cross-solvent with the graphene surface energy. Percolative transport behavior is observed for films made of this printable ink. Optical conductivity is then calculated based on sheet resistance, optical transmittance, and thickness. Upon analyzing the ratio of dc/optical conductivity versus flake size/layer number, we report that our dc/optical conductivity is among the highest of films based on direct deposited graphene ink. This is the first demonstration of scalable, printable, surfactant-free graphene ink derived directly from graphite. PMID:23609377

We propose a scalable scheme for engineering multipartite entangled W states in a Heisenberg spin chain. The rather simple scheme is mainly built on the accumulative angular squeezing technique first proposed in the context of the quantum kicked rotor for focusing a rotor to a delta-like angular distribution [I. Sh. Averbukh and R. Arvieu, Phys. Rev. Lett. 87, 163601 (2001)]. We show how the efficient generation of various W states may be achieved by engineering the interaction between a spin chain (short or long) and a time-dependent parabolic magnetic field. Our results may further motivate the use of spin chains as a test bed to investigate complex properties of multipartite entangled states. We further numerically demonstrate that our scheme can be extended to engineer arbitrary spin chain quasimomentum states as well as their superposition states.

Human embryonic stem cells (hESCs) and human induced pluripotent stem cells (hiPSCs), collectively termed human pluripotent stem cells (hPSCs), are typically derived and maintained in adherent and semi-defined culture conditions. Recently a number of groups, including Chen et al., 2012, have demonstrated that hESCs can now be expanded efficiently and maintain pluripotency over long-term passaging as aggregates in a serum-free defined suspension culture system, permitting the preparation of scalable cGMP derived hPSC cultures for cell banking, high throughput research programs and clinical applications. In this short commentary we describe the utility and potential future uses of suspension culture systems for hPSCs. PMID:22771716

In order to realize a useful atom-based quantum computer, a means to efficiently distribute critical laser resources to multiple trap locations is essential. Optical micro-electromechanical systems (MEMS) can provide the scalability, flexibility, and stability needed to help bridge the gap between fundamental demonstrations of quantum gates and large-scale quantum computing with multiple qubits. Using controllable, broadband micromirrors, an arbitrary atom in a 1-, 2-, or 3-dimensional optical lattice can be addressed with a single laser source. It is straightforward to scale this base system to address an arbitrary set of n atoms simultaneously using n laser sources. We explore on-demand addressability of individual atoms trapped in a 1D lattice, and investigate the effect the micromirrors have on the laser beam quality and phase stability.

Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) design and implementation of cafc, the first multi-platform CAF compiler for distributed and shared-memory machines, (2) performance studies of the efficiency of programs written using the CAF and UPC programming models, (3) a novel technique to analyze explicitly-parallel SPMD programs that facilitates optimization, (4) design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading to simplify development of high-performance codes in CAF, and (5) a synchronization strength reduction transformation for automatically replacing barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.

Good load balance is crucial on very large parallel systems, but the most sophisticated algorithms introduce dynamic imbalances through adaptation in domain decomposition or use of adaptive solvers. To observe and diagnose imbalance, developers need system-wide, temporally-ordered measurements from full-scale runs. This potentially requires data collection from multiple code regions on all processors over the entire execution. Doing this instrumentation naively can, in combination with the application itself, exceed available I/O bandwidth and storage capacity, and can induce severe behavioral perturbations. We present and evaluate a novel technique for scalable, low-error load balance measurement. This uses a parallel wavelet transform and other parallel encoding methods. We show that our technique collects and reconstructs system-wide measurements with low error. Compression time scales sublinearly with system size and data volume is several orders of magnitude smaller than the raw data. The overhead is low enough for online use in a production environment.
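To make the idea of wavelet-based compression of load measurements concrete, here is a minimal sequential sketch. It is an illustration under stated assumptions, not the parallel transform or encoder the abstract describes: it uses an unnormalized Haar wavelet, assumes the number of samples is a power of two, and drops coefficients below a user-chosen threshold. The names `haar_forward`, `haar_inverse`, and `compress` are hypothetical.

```python
def haar_forward(values):
    """Full Haar decomposition (averages and half-differences).

    Assumption: len(values) is a power of two.
    """
    out = [float(v) for v in values]
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + diff  # coarse part first, detail part after it
        n = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal level by level."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [a + d, a - d]
        out[:2 * n] = merged
        n *= 2
    return out


def compress(values, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in haar_forward(values)]


if __name__ == "__main__":
    # Hypothetical per-process load samples: nearly balanced, one outlier.
    load = [10, 10, 10, 10, 10, 10, 10, 11]
    kept = compress(load, threshold=0.2)
    approx = haar_inverse(kept)
    print("nonzero coefficients:", sum(1 for c in kept if c != 0.0))
    print("max error:", max(abs(a - b) for a, b in zip(approx, load)))
```

Nearly balanced measurements produce mostly small detail coefficients, so only a few values need to be stored, while the reconstruction error stays bounded by what was discarded.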

We turn the Self-organizing Map (SOM) into an Oriented and Scalable Map (OS-Map) by generalizing the neighborhood function and the winner selection. The homogeneous Gaussian neighborhood function is replaced with the matrix exponential. Thus we can specify the orientation either in the map space or in the data space. Moreover, we associate the map's global scale with the locality of winner selection. Our model is suited for a number of graphical applications such as texture/image synthesis, surface parameterization, and solid texture synthesis. OS-Map is more generic and versatile than the task-specific algorithms for these applications. Our work reveals the overlooked strength of SOMs in processing images and geometries. PMID:26897100

Background Biomedical research has traditionally been conducted via surveys and the analysis of medical records. However, these resources are limited in their content, such that non-traditional domains (eg, online forums and social media) have an opportunity to supplement the view of an individual’s health. Objective The objective of this study was to develop a scalable framework to detect personal health status mentions on Twitter and assess the extent to which such information is disclosed. Methods We collected more than 250 million tweets via the Twitter streaming API over a 2-month period in 2014. The corpus was filtered down to approximately 250,000 tweets, stratified across 34 high-impact health issues, based on guidance from the Medical Expenditure Panel Survey. We created a labeled corpus of several thousand tweets via a survey, administered over Amazon Mechanical Turk, that documents when terms correspond to mentions of personal health issues or an alternative (eg, a metaphor). We engineered a scalable classifier for personal health mentions via feature selection and assessed its potential over the health issues. We further investigated the utility of the tweets by determining the extent to which Twitter users disclose personal health status. Results Our investigation yielded several notable findings. First, we find that tweets from a small subset of the health issues can train a scalable classifier to detect health mentions. Specifically, training on 2000 tweets from four health issues (cancer, depression, hypertension, and leukemia) yielded a classifier with precision of 0.77 on all 34 health issues. Second, Twitter users disclosed personal health status for all health issues. Notably, personal health status was disclosed over 50% of the time for 11 out of 34 (33%) investigated health issues. Third, the disclosure rate was dependent on the health issue in a statistically significant manner (P

Diffuse correlation spectroscopy (DCS) is a technique which enables powerful and robust non-invasive optical studies of tissue micro-circulation and vascular blood flow. The technique amounts to autocorrelation analysis of coherent photons after their migration through moving scatterers and subsequent collection by single-mode optical fibers. A primary cost driver of DCS instruments is the commercial hardware-based correlator, which limits the proliferation of multi-channel instruments for validation of perfusion analysis as a clinical diagnostic metric. We present the development of a low-cost scalable correlator enabled by microchip-based time-tagging, and a software-based multi-tau data analysis method. We will discuss the capabilities of the instrument as well as the implementation and validation of 2- and 8-channel systems built for live animal and pre-clinical settings.

Exfoliation of graphite is a promising approach for large-scale production of graphene. Oxidation of graphite effectively facilitates the exfoliation process, yet necessitates several lengthy washing and reduction processes to convert the exfoliated graphite oxide (graphene oxide, GO) to reduced graphene oxide (RGO). Although filtration, centrifugation and dialysis have been frequently used in the washing stage, none of them is favorable for large-scale production. Here, we report the synthesis of RGO by sonication-assisted oxidation of graphite in a solution of potassium permanganate and concentrated sulfuric acid followed by reduction with ascorbic acid prior to any washing processes. GO loses its hydrophilicity during the reduction stage which facilitates the washing step and reduces the time required for production of RGO. Furthermore, simultaneous oxidation and exfoliation significantly enhance the yield of few-layer GO. We hope this one-pot and fully-scalable protocol paves the road toward out of lab applications of graphene. PMID:25976732

A facile strategy for fabricating scalable stamps has been developed using cross-linked polyacrylamide gel (PAMG) that controllably and precisely shrinks and swells with water content. Aligned patterns of natural DNA molecules were prepared by evaporative self-assembly on a PMMA substrate, and were transferred to unsaturated polyester resin (UPR) to form a negative replica. The negative was used to pattern the linear structures onto the surface of water-swollen PAMG, and the pattern sizes on the PAMG stamp were customized by adjusting the water content of the PAMG. As a result, consistent reproduction of DNA patterns could be achieved with feature sizes that can be controlled over the range of 40%–200% of the original pattern dimensions. This methodology is novel and may pave a new avenue for manufacturing stamp-based functional nanostructures in a simple and cost-effective manner on a large scale. PMID:26639572

The Center for Scalable Application Development Software (CScADS) was established as a partnership between Rice University, Argonne National Laboratory, University of California Berkeley, University of Tennessee – Knoxville, and University of Wisconsin – Madison. CScADS pursued an integrated set of activities with the aim of increasing the productivity of DOE computational scientists by catalyzing the development of systems software, libraries, compilers, and tools for leadership computing platforms. Principal Center activities were workshops to engage the research community in the challenges of leadership computing, research and development of open-source software, and work with computational scientists to help them develop codes for leadership computing platforms. This final report summarizes CScADS activities at Rice University in these areas.

Fermilab provides a multi-Petabyte scale mass storage system for High Energy Physics (HEP) Experiments and other scientific endeavors. We describe the scalability aspects of the hardware and software architecture that were designed into the Mass Storage System to permit us to scale to multiple petabytes of storage capacity, manage tens of terabytes per day in data transfers, support hundreds of users, and maintain data integrity. We discuss in detail how we scale the system over time to meet the ever-increasing needs of the scientific community, and relate our experiences with many of the technical and economic issues related to scaling the system. Since the 2003 MSST conference, the experiments at Fermilab have generated more than 1.9 PB of additional data. We present results on how this system has scaled and performed for the Fermilab CDF and D0 Run II experiments as well as other HEP experiments and scientific endeavors.

We present a method to produce anti-fouling reverse osmosis (RO) membranes that maintains the process and scalability of current RO membrane manufacturing. Utilizing perfluorophenyl azide (PFPA) photochemistry, commercial reverse osmosis membranes were dipped into an aqueous solution containing PFPA-terminated poly(ethyleneglycol) species and then exposed to ultraviolet light under ambient conditions, a process that can easily be adapted to a roll-to-roll process. Successful covalent modification of commercial reverse osmosis membranes was confirmed with attenuated total reflectance infrared spectroscopy and contact angle measurements. By employing X-ray photoelectron spectroscopy, it was determined that PFPAs undergo UV-generated nitrene addition and bind to the membrane through an aziridine linkage. After modification with the PFPA-PEG derivatives, the reverse osmosis membranes exhibit high fouling-resistance. PMID:25042670

We have developed methods involving the use of alternate, safer reagents for the scalable syntheses of the potent BET bromodomain inhibitor JQ1. A one-pot, three-step method, involving the conversion of a benzodiazepine to a thioamide using Lawesson’s reagent, followed by amidrazone formation and installation of the triazole moiety, furnished JQ1. This method provides good yields and a facile purification process. For the synthesis of enantiomerically enriched (+)-JQ1, the highly toxic reagent diethyl chlorophosphate, used in a previous synthesis, was replaced with the safer reagent diphenyl chlorophosphate in the three-step one-pot triazole formation without affecting the yields and enantiomeric purity of (+)-JQ1. PMID:26034331

Changes in high performance computing have necessitated the ability to utilize and interrogate potentially many thousands of processors. The ASCI (Accelerated Strategic Computing Initiative) program conducted by the United States Department of Energy, for example, envisions thousands of distinct operating systems connected by low-latency gigabit-per-second networks. In addition, multiple systems of this kind will be linked via high-capacity networks with latencies as low as the speed of light will allow. Code which spans systems of this sort must be scalable; yet constructing such code, whether for applications, debugging, or maintenance, is an unsolved problem. Lilith is a research software platform that attempts to answer these questions with an end toward meeting these needs. Presently, Lilith exists as a test-bed, written in Java, for various spanning algorithms and security schemes. The test-bed software has, and enforces, hooks allowing implementation and testing of various security schemes.

We describe the architecture of the Patient Centered Outcomes Research Institute (PCORI) funded Scalable Collaborative Infrastructure for a Learning Healthcare System (SCILHS, http://www.SCILHS.org) clinical data research network, which leverages the $48 billion federal investment in health information technology (IT) to enable a queryable semantic data model across 10 health systems covering more than 8 million patients, plugging universally into the point of care, generating evidence and discovery, and thereby enabling clinician and patient participation in research during the patient encounter. Central to the success of SCILHS is development of innovative ‘apps’ to improve PCOR research methods and capacitate point of care functions such as consent, enrollment, randomization, and outreach for patient-reported outcomes. SCILHS adapts and extends an existing national research network formed on an advanced IT infrastructure built with open source, free, modular components. PMID:24821734

We report on the development of electronics hardware, FPGA firmware and software to provide a flexible multi-chip readout of the Timepix ASIC within the framework of the Scalable Readout System (SRS). The system features FPGA-based zero-suppression and the possibility to read out up to 4×8 chips with a single Front End Concentrator (FEC). By operating several FECs in parallel, in principle an arbitrary number of chips can be read out, exploiting the scaling features of SRS. Specifically, we tested the system with a setup consisting of 160 Timepix ASICs, operated as GridPix devices in a large TPC field cage in a 1 T magnetic field at a DESY test beam facility providing an electron beam of up to 6 GeV. We discuss the design choices, the dedicated hardware components, the FPGA firmware as well as the performance of the system in the test beam.

Randomized benchmarking is a widely used experimental technique to characterize the average error of quantum operations. Benchmarking procedures that scale to enable characterization of n-qubit circuits rely on efficient procedures for manipulating those circuits and, as such, have been limited to subgroups of the Clifford group. However, universal quantum computers require additional, non-Clifford gates to approximate arbitrary unitary transformations. We define a scalable randomized benchmarking procedure over n-qubit unitary matrices that correspond to protected non-Clifford gates for a class of stabilizer codes. We present efficient methods for representing and composing group elements, sampling them uniformly, and synthesizing corresponding poly(n)-sized circuits. The procedure provides experimental access to two independent parameters that together characterize the average gate fidelity of a group element. We acknowledge support from ARO under Contract W911NF-14-1-0124.

As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

The evolution of the next generation sequencing technology increases the demand for efficient solutions, in terms of space and time, for several bioinformatics problems. This paper presents a practical and easy-to-implement solution for one of these problems, namely, the all-pairs suffix-prefix problem, using a compact prefix tree. The paper demonstrates an efficient construction of this time-efficient and space-economical tree data structure. The paper presents techniques for parallel implementations of the proposed solution. Experimental evaluation indicates superior results in terms of space and time over existing solutions. Results also show that the proposed technique is highly scalable in a parallel execution environment. PMID:25961045
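The all-pairs suffix-prefix problem named in the preceding record asks, for every ordered pair of strings (s_i, s_j), for the longest suffix of s_i that is also a prefix of s_j. The sketch below illustrates the problem with a plain, uncompacted prefix trie. It is not the compact prefix tree or the parallel scheme of the paper, and it makes no claim to that method's time or space bounds; the function names and the sample reads are hypothetical.

```python
def build_prefix_trie(strings):
    """Trie over all strings; each node lists the strings that pass through it."""
    root = {"children": {}, "ids": []}
    for idx, s in enumerate(strings):
        node = root
        for ch in s:
            node = node["children"].setdefault(ch, {"children": {}, "ids": []})
            node["ids"].append(idx)
    return root


def all_pairs_suffix_prefix(strings):
    """overlap[i][j] = length of the longest suffix of strings[i]
    that is a prefix of strings[j] (0 if none, diagonal left at 0)."""
    root = build_prefix_trie(strings)
    n = len(strings)
    overlap = [[0] * n for _ in range(n)]
    for i, s in enumerate(strings):
        # Try suffixes from longest to shortest, so the first hit is the longest.
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node["children"].get(ch)
                if node is None:
                    break
            else:
                # The whole suffix was matched: it is a prefix of every
                # string recorded at this node.
                length = len(s) - start
                for j in node["ids"]:
                    if j != i and overlap[i][j] == 0:
                        overlap[i][j] = length
    return overlap


reads = ["ACGT", "GTAC", "TACG"]  # hypothetical short reads
table = all_pairs_suffix_prefix(reads)
```

This naive walk costs time quadratic in the string lengths in the worst case, which is the kind of overhead that more compact tree structures are designed to avoid.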

A quantum computer protects a quantum state from the environment through the careful manipulations of thousands or millions of physical qubits. However, operating such quantities of qubits at the necessary level of precision is an open challenge, as optimal control parameters can vary between qubits and drift in time. We present a method to optimize physical qubit parameters while error detection is running using a nine qubit system performing the bit-flip repetition code. We demonstrate how gate optimization can be parallelized in a large-scale qubit array and show that the presented method can be used to simultaneously compensate for independent or correlated qubit parameter drifts. Our method is O(1) scalable to systems of arbitrary size, providing a path towards controlling the large numbers of qubits needed for a fault-tolerant quantum computer.

Being able to accurately record body motion allows complex movements to be characterised and studied. This is especially important in the film or sport coaching industry. Unfortunately, the human body has over 600 skeletal muscles, giving rise to multiple degrees of freedom. In order to accurately capture motion such as hand gestures, elbow or knee flexion and extension, vast numbers of sensors are required. Dielectric elastomer (DE) sensors are an emerging class of electroactive polymer (EAP) that is soft, lightweight and compliant. These characteristics are ideal for a motion capture suit. One challenge is to design sensing electronics that can simultaneously measure multiple sensors. This paper describes a scalable capacitive sensing device that can measure up to 8 different sensors with an update rate of 20 Hz.

Managing petabytes of data with hundreds of millions of files is the first step necessary towards an effective big data computing and collaboration environment in a distributed system. We describe here the MODAPS LAADS Virtual File System (LVFS), a new storage architecture which replaces the previous MODAPS operational Level 1 Land Atmosphere Archive Distribution System (LAADS) NFS-based approach to storing and distributing datasets from several instruments, such as MODIS, MERIS, and VIIRS. LAADS is responsible for the distribution of over 4 petabytes of data and over 300 million files across more than 500 disks. We present here the first LVFS big data comparative performance results and new capabilities not previously possible with the LAADS system. We consider two aspects in addressing the inefficiencies of massive scales of data: first, dealing in a reliable and resilient manner with the volume and quantity of files in such a dataset, and, second, minimizing the discovery and lookup times for accessing files in such large datasets. There are several popular file systems that successfully deal with the first aspect of the problem. Their solution, in general, is through distribution, replication, and parallelism of the storage architecture. The Hadoop Distributed File System (HDFS), Parallel Virtual File System (PVFS), and Lustre are examples of such file systems that deal with petabyte data volumes. The second aspect deals with data discovery among billions of files, the largest bottleneck in reducing access time. The metadata of a file, generally represented in a directory layout, is stored in ways that are not readily scalable. This is true for HDFS, PVFS, and Lustre as well. Recent experimental file systems, such as Spyglass or Pantheon, have attempted to address this problem through redesign of the metadata directory architecture. LVFS takes a radically different architectural approach by eliminating the need for a separate directory within the file system

Decades of research to build programmable intelligent machines have demonstrated limited utility in complex, real-world environments. Comparing their performance with biological systems, these machines are less efficient by a factor of 1 million to 1 billion in complex, real-world environments. The Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE) program is a multifaceted Defense Advanced Research Projects Agency (DARPA) project that seeks to break the programmable machine paradigm and define a new path for creating useful, intelligent machines. Since real-world systems exhibit infinite combinatorial complexity, electronic neuromorphic machine technology would be preferable in a host of applications, but useful and practical implementations still do not exist. HRL Laboratories LLC has embarked on addressing these challenges, and, in this article, we provide an overview of our project and progress made thus far. PMID:22344953

Teranovi Technologies, Inc., has developed innovative network architecture, protocols, and algorithms for both lunar surface and orbit access networks. A key component of the overall architecture is a medium access control (MAC) protocol that includes a novel mechanism of overlaying time division multiple access (TDMA) and carrier sense multiple access with collision avoidance (CSMA/CA), ensuring scalable throughput and quality of service. The new MAC protocol is compatible with legacy Institute of Electrical and Electronics Engineers (IEEE) 802.11 networks. Advanced features include efficient power management, adaptive channel width adjustment, and error control capability. A hybrid routing protocol combines the advantages of ad hoc on-demand distance vector (AODV) routing and disruption/delay-tolerant network (DTN) routing. Performance is significantly better than AODV or DTN and will be particularly effective for wireless networks with intermittent links, such as lunar and planetary surface networks and orbit access networks.

In this paper we present a scalable pointer analysis for embedded applications that is able to distinguish between instances of recursively defined data structures and elements of arrays. The main contribution consists of an efficient yet precise algorithm that can handle multithreaded programs. We first perform an inexpensive flow-sensitive analysis of each function in the program that generates semantic equations describing the effect of the function on the memory graph. These equations bear numerical constraints that describe nonuniform points-to relationships. We then iteratively solve these equations in order to obtain an abstract storage graph that describes the shape of data structures at every point of the program for all possible thread interleavings. We bring experimental evidence that this approach is tractable and precise for real-size embedded applications.

Observation data from radio telescopes is typically stored in three (or higher) dimensional data cubes, the resolution, coverage and size of which continue to grow as ever larger radio telescopes come online. The Square Kilometre Array, tabled to be the largest radio telescope in the world, will generate multi-terabyte data cubes - several orders of magnitude larger than the current norm. Despite this imminent data deluge, scalable approaches to file access in astronomical visualisation software are rare: most current software packages cannot read astronomical data cubes that do not fit into computer system memory, or else provide access only at a serious performance cost. In addition, there is little support for interactive exploration of 3D data. We describe a scalable, hierarchical approach to 3D visualisation of very large spectral data cubes to enable rapid visualisation of large data files on standard desktop hardware. Our hierarchical approach, embodied in the AstroVis prototype, aims to provide a means of viewing large datasets that do not fit into system memory. The focus is on rapid initial response: our system initially rapidly presents a reduced, coarse-grained 3D view of the data cube selected, which is gradually refined. The user may select sub-regions of the cube to be explored in more detail, or extracted for use in applications that do not support large files. We thus shift the focus from data analysis informed by narrow slices of detailed information, to analysis informed by overview information, with details on demand. Our hierarchical solution to the rendering of large data cubes reduces the overall time to complete file reading, provides user feedback during file processing and is memory efficient. This solution does not require high performance computing hardware and can be implemented on any platform supporting the OpenGL rendering library.

The collision warnings produced by the Joint Space Operations Center (JSpOC) are of critical importance in protecting U.S. and allied spacecraft against destructive collisions and protecting the lives of astronauts during space flight. As the Space Surveillance Network (SSN) improves its sensor capabilities for tracking small and dim space objects, the number of tracked objects increases from thousands to hundreds of thousands of objects, while the number of potential conjunctions increases with the square of the number of tracked objects. Classical filtering techniques such as apogee and perigee filters have proven insufficient. Novel and orders of magnitude faster conjunction analysis algorithms are required to find conjunctions in a timely manner. Stellar Science has developed innovative filtering techniques for satellite conjunction processing using spatiotemporally indexed ephemeris data that efficiently and accurately reduces the number of objects requiring high-fidelity and computationally-intensive conjunction analysis. Two such algorithms, one based on the k-d Tree pioneered in robotics applications and the other based on Spatial Hash Tables used in computer gaming and animation, use, at worst, an initial O(N log N) preprocessing pass (where N is the number of tracked objects) to build large O(N) spatial data structures that substantially reduce the required number of O(N^2) computations, substituting linear memory usage for quadratic processing time. The filters have been implemented as Open Services Gateway initiative (OSGi) plug-ins for the Continuous Anomalous Orbital Situation Discriminator (CAOS-D) conjunction analysis architecture. We have demonstrated the effectiveness, efficiency, and scalability of the techniques using a catalog of 100,000 objects, an analysis window of one day, on a 64-core computer with 1TB shared memory. Each algorithm can process the full catalog in 6 minutes or less, almost a twenty-fold performance improvement over the
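The spatial-hash idea mentioned in the preceding record can be illustrated with a small sketch that screens for close pairs at a single instant. It is only an assumption-laden illustration, not the CAOS-D plug-in: real conjunction screening works on time-indexed ephemerides, whereas here the positions are static, and the function name, the cell size (set equal to the screening distance), and the sample coordinates are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def close_pairs_spatial_hash(positions, threshold):
    """Return index pairs (i, j), i < j, whose distance is <= threshold.

    Points are binned into cubic cells of side `threshold`, so any two
    points within `threshold` of each other lie in the same or in
    adjacent cells; only those cells are compared exactly.
    """
    grid = defaultdict(list)
    for idx, (x, y, z) in enumerate(positions):
        cell = (math.floor(x / threshold),
                math.floor(y / threshold),
                math.floor(z / threshold))
        grid[cell].append(idx)

    offsets = list(product((-1, 0, 1), repeat=3))  # 27 neighbouring cells
    pairs = set()
    for (cx, cy, cz), members in grid.items():
        for dx, dy, dz in offsets:
            neighbours = grid.get((cx + dx, cy + dy, cz + dz))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(positions[i], positions[j]) <= threshold:
                        pairs.add((i, j))
    return pairs


# Hypothetical positions (arbitrary units): two tight clusters far apart.
objects = [(0, 0, 0), (0.5, 0, 0), (10, 10, 10), (10.2, 10, 10.1), (-0.4, 0, 0)]
found = close_pairs_spatial_hash(objects, threshold=1.0)
```

Building the grid takes one pass over the objects and linear memory; the expensive distance test then runs only on objects that share a neighbourhood, rather than on all N(N-1)/2 pairs.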

Whether it be for building multi-wavelength datasets from independent surveys, studying changes in objects' luminosities, or detecting moving objects (stellar proper motions, asteroids), cross-catalog matching is a technique widely used in astronomy. The need for efficient, reliable and scalable cross-catalog matching is becoming even more pressing with forthcoming projects which will produce huge catalogs in which astronomers will dig for rare objects, perform statistical analysis and classification, or carry out real-time transient detection. We have developed a formalism and the corresponding technical framework to address the challenge of fast cross-catalog matching. Our formalism supports more than simple nearest-neighbor search, and handles elliptical positional errors. Scalability is improved by partitioning the sky using the HEALPix scheme, and processing each sky cell independently. The use of multi-threaded two-dimensional kd-trees adapted to managing equatorial coordinates enables efficient neighbor search. The whole process can run on a single computer, but could also use clusters of machines to cross-match future very large surveys such as GAIA or LSST in reasonable times. We already achieve performance such that the 2MASS (~470M sources) and SDSS DR7 (~350M sources) catalogs can be matched on a single machine in less than 10 minutes. We aim at providing astronomers with a catalog cross-matching service, available on-line and leveraging the catalogs present in the VizieR database. This service will allow users both to access pre-computed cross-matches across some very large catalogs, and to run customized cross-matching operations. It will also support VO protocols for synchronous or asynchronous queries.

MPLS (Multi-Protocol Label Switching) has recently emerged to facilitate the engineering of network traffic. This can be achieved by directing packet flows over paths that satisfy multiple requirements. MPLS has been regarded as an enhancement to traditional IP routing, which has the following problems: (1) all packets with the same IP destination address have to follow the same path through the network; and (2) paths have often been computed based on static and single link metrics. These problems may cause traffic concentration, and thus degradation in quality of service. In this paper, we investigate by simulations a range of routing solutions and examine the tradeoff between scalability and performance. At one extreme, IP packet routing using dynamic link metrics provides a stateless solution but may lead to routing oscillations. At the other extreme, we consider a recently proposed Profile-based Routing (PBR), which uses knowledge of potential ingress-egress pairs as well as the traffic profile among them. Minimum Interference Routing (MIRA) is another recently proposed MPLS-based scheme, which only exploits knowledge of potential ingress-egress pairs but not their traffic profile. MIRA and the more conventional widest-shortest path (WSP) routing represent alternative MPLS-based approaches on the spectrum of routing solutions. We compare these solutions in terms of utility, bandwidth acceptance ratio as well as their scalability (routing state and computational overhead) and load balancing capability. Although WSP is the simplest of the per-flow algorithms we consider, its performance is close to that of dynamic per-packet routing, without the potential instabilities of dynamic routing.
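For readers unfamiliar with widest-shortest path routing, the sketch below follows the usual textbook formulation: among all minimum-hop paths, pick the one whose bottleneck (smallest residual link bandwidth) is largest. It is not taken from the paper's simulator; the function name, the link table, and the bandwidth figures are hypothetical, and a per-flow implementation would typically first prune links that cannot carry the requested bandwidth.

```python
from collections import deque


def widest_shortest_path(links, source, target):
    """links maps a directed link (u, v) to its residual bandwidth.

    Returns (path, bottleneck) for the minimum-hop path with the largest
    bottleneck bandwidth, or None if target is unreachable.
    """
    adjacency = {}
    for (u, v), bandwidth in links.items():
        adjacency.setdefault(u, []).append((v, bandwidth))

    hops = {source: 0}
    width = {source: float("inf")}
    parent = {source: None}
    queue = deque([source])
    while queue:  # breadth-first search visits nodes in hop-count order
        u = queue.popleft()
        for v, bandwidth in adjacency.get(u, []):
            candidate = min(width[u], bandwidth)
            if v not in hops:
                hops[v] = hops[u] + 1
                width[v] = candidate
                parent[v] = u
                queue.append(v)
            elif hops[v] == hops[u] + 1 and candidate > width[v]:
                # Same hop count but a wider bottleneck: prefer this route.
                width[v] = candidate
                parent[v] = u

    if target not in hops:
        return None
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path, width[target]


# Hypothetical topology: two 2-hop routes and one wide 3-hop route.
links = {
    ("A", "B"): 10, ("B", "D"): 2,
    ("A", "C"): 5, ("C", "D"): 4,
    ("A", "E"): 100, ("E", "F"): 100, ("F", "D"): 100,
}
route = widest_shortest_path(links, "A", "D")
```

In this example the 3-hop route is much wider, but WSP ignores it because hop count takes priority; width only breaks ties among the shortest routes.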

Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.

Automated processing, modeling, and analysis of unstructured text (news documents, web content, journal articles, etc.) is a key task in many data analysis and decision making applications. As data sizes grow, scalability is essential for deep analysis. In many cases, documents are modeled as term or feature vectors and latent semantic analysis (LSA) is used to model latent, or hidden, relationships between documents and terms appearing in those documents. LSA supplies conceptual organization and analysis of document collections by modeling high-dimension feature vectors in many fewer dimensions. While past work on the scalability of LSA modeling has focused on the SVD, the goal of our work is to investigate the use of distributed memory architectures for the entire text analysis process, from data ingestion to semantic modeling and analysis. ParaText is a set of software components for distributed processing, modeling, and analysis of unstructured text. The ParaText source code is available under a BSD license, as an integral part of the Titan toolkit. ParaText components are chained-together into data-parallel pipelines that are replicated across processes on distributed-memory architectures. Individual components can be replaced or rewired to explore different computational strategies and implement new functionality. ParaText functionality can be embedded in applications on any platform using the native C++ API, Python, or Java. The ParaText MPI Process provides a 'generic' text analysis pipeline in a command-line executable that can be used for many serial and parallel analysis tasks. ParaText can also be deployed as a web service accessible via a RESTful (HTTP) API. In the web service configuration, any client can access the functionality provided by ParaText using commodity protocols ... from standard web browsers to custom clients written in any language.

The authors propose visual embedding as a model for automatically generating and evaluating visualizations. A visual embedding is a function from data points to a space of visual primitives that measurably preserves structures in the data (domain) within the mapped perceptual space (range). The authors demonstrate its use with three examples: coloring of neural tracts, scatterplots with icons, and evaluation of alternative diffusion tensor glyphs. They discuss several techniques for generating visual-embedding functions, including probabilistic graphical models for embedding in discrete visual spaces. They also describe two complementary approaches, crowdsourcing and visual product spaces, for building visual spaces with associated perceptual-distance measures. In addition, they recommend several research directions for further developing the visual-embedding model. PMID:24808163

be devoted to morphological microsemiology (microscopic morphology semantics). Besides ensuring the traceability of the results (second opinion) and supporting the orchestration of high-content image analysis modules, the role of semantics will be crucial for the correlation between digital pathology and noninvasive medical imaging modalities. In addition, semantics has an important role in modelling the links between traditional microscopy and recent label-free technologies. The massive amount of visual data is challenging and represents a characteristic intrinsic to digital pathology. The design of an operational integrative microscopy framework needs to focus on scalable multiscale imaging formalism. In this sense, we prospectively consider some of the most recent scalable methodologies adapted to digital pathology, such as marked point processes for nuclear atypia and point-set mathematical morphology for architecture grading. To orchestrate this scalable framework, semantics-based WSI management (analysis, exploration, indexing, retrieval and report generation support) represents an important means towards approaches to integrating big data into biomedicine. This insight reflects our vision through an instantiation of essential bricks of this type of architecture. The generic approach introduced here is applicable to a number of challenges related to molecular imaging, high-content image management and, more generally, bioinformatics. PMID:27100713

The pervasive aspect of the Internet increases the demand for tools that support both monitoring and auditing of security aspects in computer networks. Ideally, these tools should provide a clear and objective presentation of security data in such a way as to let network administrators detect or even predict network security breaches. However, most of these data are still presented only in raw text form, or through inadequate data presentation techniques. Our work tackles this problem by designing and developing a powerful tool that aims at integrating several information visualization techniques in an effective and expressive visualization. We have tested our tool in the context of network security, presenting two case studies that demonstrate important features such as scalability and detection of critical network security issues.

This paper presents a brand new approach for automated feature-point label de-confliction. It outlines a method for labeling the point-features on dynamic maps in real time without a pre-processing stage. The algorithm described provides an efficient, scalable, and exceptionally fast method of labeling interactive charts and diagrams, offering interaction speeds at multiple frames per second on maps with tens of thousands of nodes. To accomplish this, the algorithm employs an efficient approach, called the "trellis strategy", along with a unique label candidate cost analysis, to determine the "least expensive" label configuration. The speed and scalability of this approach make it suitable for the complex and ever-accelerating demands of interactive visual analytic applications.

Visual culture is a hot topic in art education right now as some teachers are dedicated to teaching it and others are adamant that it has no place in a traditional art class. Visual culture, the author asserts, can include just about anything that is visually represented. Although people often think of visual culture as contemporary visuals such…

A direct-realist account of visual sensation is outlined. The explanatory notion of elements in visual sensation (atomic sensations) is reinterpreted, and the suggested interpretation is formally justified by constructing a Boolean algebra for visual sensations. The related notion of sensory levels (visual field vs visual world) is discussed. PMID:887374

Within the last half decade or so, two technological evolutions have culminated in mature products of potentially great utility to computer simulation. One is the emergence of low-cost workstations with versatile graphics and substantial local CPU power. The other is the adoption of UNIX as a de facto "standard" operating system on at least some machines offered by virtually all vendors. It is now possible to perform transient simulations in which the number-crunching capability of a supercomputer is harnessed to allow both process control and graphical visualization on a workstation. Such a distributed computer system is described as it now exists: a large FORTRAN application on a CRAY communicates with the balance of the simulation on a SUN-3 or SUN-4 via remote procedure call (RPC) protocol. The hooks to the application and the graphics have been made very flexible. Piping of output from the CRAY to the SUN is nonselective, allowing the user to summon data and draw or plot at will. The ensemble of control, application, data handling, and graphics modules is loosely coupled, which further generalizes the utility of the software design.

The runthru system consists of five programs: workcell filter, just do it, transl8g, decim8, and runthru. The workcell filter program is useful if the source of your 3D triangle mesh model is IGRIP. It will traverse a directory structure of Deneb IGRIP files and filter out any IGRIP part files that are not referenced by an accompanying IGRIP work cell file. The just do it program automates translating and/or filtering of large numbers of parts that are organized in hierarchical directory structures. The transl8g program facilitates the interchange, topology generation, error checking, and enhancement of large 3D triangle meshes. Such data is frequently used to represent conceptual designs, scientific visualization volume modeling, or discrete sample data. Interchange is provided between several popular commercial and de facto standard geometry formats. Error checking is included to identify duplicate and zero-area triangles. Model enhancement features include common vertex joining, consistent triangle vertex ordering, vertex normal vector averaging, and triangle strip generation. Many of the traditional O(n^2) algorithms required to provide the above features have been recast as O(n log n) algorithms, which support large mesh sizes. The decim8 program is based on a data filter algorithm that significantly reduces the number of triangles required to represent 3D models of geometry, scientific visualization results, and discretely sampled data. It eliminates local patches of triangles whose geometries are not appreciably different and replaces them with fewer, larger triangles. The algorithm has been used to reduce triangles in large conceptual design models to facilitate virtual walk-throughs and to enable interactive viewing of large 3D iso-surface volume visualizations. The runthru program provides high-performance interactive display and manipulation of 3D triangle mesh models.
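The error checks mentioned for transl8g (duplicate and zero-area triangles) can be sketched in O(n log n) time by sorting canonical triangle keys rather than comparing every pair. This is only an illustration of the complexity argument; the function names and the indexed-mesh layout are assumptions, not the tool's actual interface.

```python
import math


def triangle_area(a, b, c):
    """Area of the 3D triangle with corner points a, b, c."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def check_mesh(vertices, triangles, area_eps=1e-12):
    """Return (duplicates, zero_area) as sorted lists of triangle indices.

    vertices:  list of (x, y, z) points
    triangles: list of (i, j, k) vertex-index triples
    """
    zero_area = [
        t for t, (i, j, k) in enumerate(triangles)
        if triangle_area(vertices[i], vertices[j], vertices[k]) <= area_eps
    ]
    # Canonical key: vertex indices in sorted order, so winding is ignored.
    keys = [tuple(sorted(tri)) for tri in triangles]
    order = sorted(range(len(triangles)), key=lambda t: (keys[t], t))
    # After sorting, identical triangles sit next to each other.
    duplicates = [
        cur for prev, cur in zip(order, order[1:]) if keys[prev] == keys[cur]
    ]
    return sorted(duplicates), zero_area
```

The first occurrence of each triangle is kept and only later copies are reported, so the returned indices can be deleted directly.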

Understanding of visual hallucinations is developing rapidly. Single-factor explanations based on specific pathologies have given way to complex multifactor models with wide potential applicability. Clinical studies of disorders with frequent hallucinations (dementia, delirium, eye disease and psychosis) show that dysfunction within many parts of the distributed ventral object perception system is associated with a range of perceptions from simple flashes and dots to complex formed figures and landscapes. Dissociations between these simple and complex hallucinations indicate at least two hallucinatory syndromes, though exact boundaries need clarification. Neural models of hallucinations variably emphasize the importance of constraints from top-down dorsolateral frontal systems, bottom-up occipital systems, interconnecting tracts, and thalamic and brainstem regulatory systems. No model has yet gained general acceptance. Both qualitative (a small number of necessary and sufficient constraints) and quantitative explanations (an accumulation of many nonspecific factors) fit existing data. Variable associations of hallucinations with emotional distress and thought disorders across and within pathologies may reflect the roles of cognitive and regulatory systems outside of the purely perceptual. Functional imaging demonstrates that hallucinations and veridical perceptions occur in the same brain areas, intimating a key role for the negotiating interface between top-down and bottom-up processes. Thus, hallucinations occur when a perception that incorporates a hallucinatory element can provide a better match between predicted and actual sensory input than does a purely veridical experience. Translational research that integrates understandings from clinical hallucinations and basic vision science is likely to be the key to better treatments. WIREs Cogn Sci 2010 1 781-786 PMID:26271777

We propose the concept of teaching (and learning) unfamiliar visualizations by analogy, that is, demonstrating an unfamiliar visualization method by linking it to another more familiar one, where the in-betweens are designed to bridge the gap between these two visualizations and explain the difference in a gradual manner. As opposed to a textual description, our morphing explains an unfamiliar visualization through purely visual means. We demonstrate our idea by way of four visualization pair examples: data table and parallel coordinates, scatterplot matrix and hyperbox, linear chart and spiral chart, and hierarchical pie chart and treemap. The analogy is commutative, i.e., any member of the pair can be the unfamiliar visualization. A series of studies showed that this new paradigm can be an effective teaching tool. The participants could understand the unfamiliar visualization methods in all of the four pairs either fully or at least significantly better after they observed or interacted with the transitions from the familiar counterpart. The four examples suggest how helpful visualization pairings can be identified, and they will hopefully inspire other visualization morphings and associated transition strategies. PMID:26357285

Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs) along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to use GASPRNG the same way as SPRNG on traditional serial or parallel computers as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install GASPRNG and use it. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates the same streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications. Catalogue identifier: AEOI_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: UTK license. No. of lines in distributed program, including test data, etc.: 167900 No. of bytes in distributed program, including test data, etc.: 1422058 Distribution format: tar.gz Programming language: C and CUDA. Computer: Any PC or

To improve the scalability of InfiniBand on large scale clusters, Open MPI introduced a protocol known as B-SRQ [2]. This protocol was shown to provide much better memory utilization of send and receive buffers for a wide variety of benchmarks and real-world applications. Unfortunately, B-SRQ increases the number of connections between communicating peers. While addressing one scalability problem of InfiniBand, the protocol introduced another. To alleviate the connection scalability problem of the B-SRQ protocol, a small enhancement to the reliable connection transport was requested which would allow multiple shared receive queues to be attached to a single reliable connection. This modified reliable connection transport is now known as the extended reliable connection transport. X-SRQ is a new transport protocol in Open MPI based on B-SRQ, which takes advantage of this improvement in connection scalability. This paper introduces the X-SRQ protocol and details the significantly improved scalability of the protocol over B-SRQ and its reduction of the memory footprint of connection state by as much as 2 orders of magnitude on large scale multi-core systems. In addition to improving scalability, performance of latency-sensitive collective operations is improved by up to 38% while significantly decreasing the variability of results. A detailed analysis of the improved memory scalability as well as the improved performance is presented.

Biomass monitoring is vital for studying the carbon cycle of earth's ecosystem and has several significant implications, especially in the context of understanding climate change and its impacts. Recently, several change detection methods have been proposed to identify land cover changes in temporal profiles (time series) of vegetation collected using remote sensing instruments, but they do not satisfy one or both of the two requirements of the biomass monitoring problem, i.e., operating in online mode and handling periodic time series. In this paper, we adapt Gaussian process regression to detect changes in such time series in an online fashion. While Gaussian processes (GPs) have been widely used as a kernel based learning method for regression and classification, their applicability to massive spatio-temporal data sets, such as remote sensing data, has been limited owing to the high computational costs involved. We focus on addressing the scalability issues associated with the proposed GP based change detection algorithm. This paper makes several significant contributions. First, we propose a GP based online time series change detection algorithm and demonstrate its effectiveness in detecting different types of changes in Normalized Difference Vegetation Index (NDVI) data obtained from a study area in Iowa, USA. Second, we propose an efficient Toeplitz matrix based solution which significantly improves the computational complexity and memory requirements of the proposed GP based method. Specifically, the proposed solution can analyze a time series of length $t$ in $O(t^2)$ time while maintaining an $O(t)$ memory footprint, compared to the $O(t^3)$ time and $O(t^2)$ memory requirement of standard matrix manipulation based methods. Third, we describe a parallel version of the proposed solution which can be used to simultaneously analyze a large number of time series. We study three different parallel implementations: using threads, MPI, and a hybrid
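The $O(t^2)$ time and $O(t)$ memory figures quoted above are what a Levinson-type recursion achieves for a symmetric Toeplitz system, which is the structure a stationary covariance kernel produces on an evenly spaced time grid. The abstract does not say which solver the authors use, so the following is only a sketch of the classic Levinson recursion under that assumption; `solve_symmetric_toeplitz` is a hypothetical name.

```python
def solve_symmetric_toeplitz(r, y):
    """Solve T x = y, where T[i][j] = r[abs(i - j)] is symmetric Toeplitz.

    Only the first column r is stored, so memory is O(n); the two inner
    sums per step give O(n^2) time overall. Assumes T is positive definite.
    """
    n = len(r)
    if n == 0 or len(y) != n:
        raise ValueError("r and y must be non-empty and of equal length")
    f = [1.0 / r[0]]   # forward vector: T_m f = first unit vector
    x = [y[0] / r[0]]  # solution of the leading m-by-m system
    for m in range(1, n):
        # Residual in the new last row when f is padded with a zero.
        eps_f = sum(r[m - i] * f[i] for i in range(m))
        denom = 1.0 - eps_f * eps_f
        b_old = f[::-1]  # backward vector is the reversed forward vector
        f = [(fi - eps_f * bi) / denom
             for fi, bi in zip(f + [0.0], [0.0] + b_old)]
        b = f[::-1]
        # Correct the padded solution so the new last equation holds.
        eps_x = sum(r[m - i] * x[i] for i in range(m))
        x = [xi + (y[m] - eps_x) * bi for xi, bi in zip(x + [0.0], b)]
    return x


# Small check: T = [[2, 1], [1, 2]], y = [3, 3] has solution [1, 1].
print(solve_symmetric_toeplitz([2.0, 1.0], [3.0, 3.0]))
```

In a GP setting, `r` would hold the kernel evaluated at lags 0, 1, ..., t-1 (plus the noise variance at lag 0) and `y` the observed series, so the full covariance matrix never has to be formed.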

We present an on-chip liquid routing technique intended for application in well-based microfluidic systems that require long-term active pumping at low to medium flow rates. Our technique requires only one fluidic feature layer and one pneumatic control line, and does not rely on flexible membranes or on mechanical or moving parts. The presented bubble pump is therefore compatible with both elastomeric and rigid substrate materials and the associated scalable manufacturing processes. Directed liquid flow was achieved in a microchannel by an in-series configuration of two previously described "bubble gates", i.e., gas-bubble enabled miniature gate valves. Only one time-dependent pressure signal is required; it initiates a reciprocating bubble motion at the upstream (active) bubble gate. A time-constant gas pressure level is applied at the downstream (passive) gate. In its rest state, the passive gate remains closed and only temporarily opens while the liquid pressure rises due to the active gate's reciprocating bubble motion. We have designed, fabricated and consistently operated our bubble pump with a variety of working liquids for >72 hours. Flow rates of 0-5.5 μl/min were obtained and depended on the selected geometric dimensions, working fluids and actuation frequencies. The maximum operational pressure was 2.9 kPa-9.1 kPa and depended on the interfacial tension of the working fluids. Attainable flow rates compared favorably with those of available micropumps. We achieved flow rate enhancements of 30-100% by operating two bubble pumps in tandem and demonstrated scalability of the concept in a multi-well format with 12 individually and uniformly perfused microchannels (variation in flow rate <7%). We envision the demonstrated concept to allow for the consistent on-chip delivery of a wide range of different liquids that may even include highly reactive or moisture sensitive solutions. The presented bubble pump may provide active flow control for

The rapid advance of technology enables a large number of processing cores to be integrated into a single chip, which is called a Chip Multiprocessor (CMP) or a Multiprocessor System-on-Chip (MPSoC) design. The on-chip interconnection network, which is the communication infrastructure for these processing cores, plays a central role in a many-core system. With the continuously increasing complexity of many-core systems, traditional metallic wired electronic networks-on-chip (NoC) have become a bottleneck because of the unbearable latency in data transmission and extremely high energy consumption on chip. Optical networks-on-chip (ONoC) have been proposed as a promising alternative paradigm to electronic NoC, with the benefits of optical signaling communication such as extremely high bandwidth, negligible latency, and low power consumption. This dissertation focuses on the design of high-performance and scalable ONoC architectures, and the contributions are highlighted as follows: 1. A micro-ring resonator (MRR)-based Generic Wavelength-routed Optical Router (GWOR) is proposed. A method for developing a GWOR of any size is introduced. GWOR is a scalable non-blocking ONoC architecture with simple structure, low cost and high power efficiency compared to existing ONoC designs. 2. To expand the bandwidth and improve the fault tolerance of the GWOR, a redundant GWOR architecture is designed by cascading different types of GWORs into one network. 3. The redundant GWOR built with MRR-based comb switches is proposed. Replacing the general MRRs with comb switches expands the bandwidth while keeping the topology of the GWOR unchanged. 4. A butterfly fat tree (BFT)-based hybrid optoelectronic NoC (HONoC) architecture is developed in which GWORs are used for global communication and electronic routers are used for local communication. The proposed HONoC uses fewer electronic routers and links than its electronic BFT-based NoC counterpart. It takes the advantages of

Introduction: Lack of access to empirically-supported psychological treatments (EPT) that are contextually appropriate and feasible to deliver by non-specialist health workers (referred to as ‘counsellors’) is a major barrier for the treatment of mental health problems in resource poor countries. To address this barrier, the ‘Program for Effective Mental Health Interventions in Under-resourced Health Systems’ (PREMIUM) designed a method for the development of EPT for severe depression and harmful drinking. This was implemented over three years in India. This study assessed the relative usefulness and costs of the five ‘steps’ (Systematic reviews, In-depth interviews, Key informant surveys, Workshops with international experts, and Workshops with local experts) in the first phase of identifying the strategies and theoretical model of the treatment and two ‘steps’ (Case series with specialists, and Case series and pilot trial with counsellors) in the second phase of enhancing the acceptability and feasibility of its delivery by counsellors in PREMIUM, with the aim of arriving at a parsimonious set of steps for future investigators to use for developing scalable EPT. Data and Methods: The study used two sources of data: the usefulness ratings by the investigators and the resource utilization. The usefulness of each of the seven steps was assessed through the ratings by the investigators involved in the development of each of the two EPT, viz. Healthy Activity Program for severe depression and Counselling for Alcohol Problems for harmful drinking. Quantitative responses were elicited to rate the utility (usefulness/influence), followed by open-ended questions for explaining the rankings. The resources used by PREMIUM were computed in terms of time (months) and monetary costs. Results: The theoretical core of the new treatments was consistent with that of EPT derived from global evidence, viz. Behavioural Activation and Motivational Enhancement for severe

Summary: NAViGaTOR is a powerful graphing application for the 2D and 3D visualization of biological networks. NAViGaTOR includes a rich suite of visual mark-up tools for manual and automated annotation, fast and scalable layout algorithms and OpenGL hardware acceleration to facilitate the visualization of large graphs. Publication-quality images can be rendered through SVG graphics export. NAViGaTOR supports community-developed data formats (PSI-XML, BioPax and GML), is platform-independent and is extensible through a plug-in architecture. Availability: NAViGaTOR is freely available to the research community from http://ophid.utoronto.ca/navigator/. Installers and documentation are provided for 32- and 64-bit Windows, Mac, Linux and Unix. Contact: juris@ai.utoronto.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19837718

One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.

This paper describes key concepts in the design and implementation of a deblocking filter (DF) for an H.264/SVC video decoder. The DF supports QCIF and CIF video formats with temporal and spatial scalability. The design flow starts from a SystemC functional model and has been refined using high-level synthesis methodology to RTL microarchitecture. The process is guided with performance measurements (latency, cycle time, power, resource utilization) with the objective of assuring the quality of results of the final system. The functional model of the DF is created in an incremental way from the AVC DF model using OpenSVC source code as reference. The design flow continues with the logic synthesis and the implementation on the FPGA using various strategies. The final implementation is chosen among the implementations that meet the timing constraints. The DF is capable of running at 100 MHz, and macroblocks are processed in 6,500 clock cycles for a throughput of 130 fps for QCIF format and 37 fps for CIF format. The proposed architecture for the complete H.264/SVC decoder is composed of an OMAP 3530 SoC (ARM Cortex-A8 GPP + DSP) and a Virtex-5 FPGA acting as a coprocessor for the DF implementation. The DF is connected to the OMAP SoC using the GPMC interface. A validation platform has been developed using the embedded PowerPC processor in the FPGA, composing a SoC that integrates the frame generation and visualization on a TFT screen. The FPGA implements both the DF core and a GPMC slave core. Both cores are connected to the PowerPC440 embedded processor using LocalLink interfaces. The FPGA also contains a local memory capable of storing information necessary to filter a complete frame and to store a decoded picture frame. The complete system is implemented in a Virtex-5 FX70T device.

Static crushing tests were conducted on graphite/epoxy and Kevlar/epoxy square cross section tubes to study the influence of specimen geometry on the energy-absorption capability and scalability of composite materials. The tube inside width-to-wall thickness (W/t) ratio was determined to significantly affect the energy-absorption capability of composite materials. As W/t ratio decreases, the energy-absorption capability increases nonlinearly. The energy-absorption capability of Kevlar epoxy tubes was found to be geometrically scalable, but the energy-absorption capability of graphite/epoxy tubes was not geometrically scalable.

Large datasets are becoming more and more common in science, particularly in neuroscience where experimental techniques are rapidly evolving. Obtaining interpretable results from raw data can sometimes be done automatically; however, there are numerous situations where there is a need, at all processing stages, to visualize the data in an interactive way. This enables the scientist to gain intuition, discover unexpected patterns, and find guidance about subsequent analysis steps. Existing visualization tools mostly focus on static publication-quality figures and do not support interactive visualization of large datasets. While working on Python software for visualization of neurophysiological data, we developed techniques to leverage the computational power of modern graphics cards for high-performance interactive data visualization. We were able to achieve very high performance despite the interpreted and dynamic nature of Python, by using state-of-the-art, fast libraries such as NumPy, PyOpenGL, and PyTables. We present applications of these methods to visualization of neurophysiological data. We believe our tools will be useful in a broad range of domains, in neuroscience and beyond, where there is an increasing need for scalable and fast interactive visualization. PMID:24391582

When droplets coalesce on a superhydrophobic nanostructured surface, the resulting droplet can jump from the surface due to the release of excess surface energy. If designed properly, these superhydrophobic nanostructured surfaces can not only allow for easy droplet removal at micrometric length scales during condensation but also promise to enhance heat transfer performance. However, the rationale for the design of an ideal nanostructured surface as well as heat transfer experiments demonstrating the advantage of this jumping behavior are lacking. Here, we show that silanized copper oxide surfaces created via a simple fabrication method can achieve highly efficient jumping-droplet condensation heat transfer. We experimentally demonstrated a 25% higher overall heat flux and 30% higher condensation heat transfer coefficient compared to state-of-the-art hydrophobic condensing surfaces at low supersaturations (<1.12). This work not only shows significant condensation heat transfer enhancement but also promises a low cost and scalable approach to increase efficiency for applications such as atmospheric water harvesting and dehumidification. Furthermore, the results offer insights and an avenue to achieve high flux superhydrophobic condensation.

This paper describes a new architecture for a multicast ATM switch that is scalable from a few tens to thousands of input ports. The switch, called the Abacus switch, has a nonblocking memoryless switch fabric followed by small switch modules at the output ports; the switch has input and output buffers. Cell replication, cell routing, output contention resolution, and cell addressing are all performed in a distributed manner in the Abacus switch so that it can be scaled up to thousands of input and output ports. A novel algorithm has been proposed to resolve output port contention while achieving input buffer sharing, fairness among the input ports, and multicast call splitting. The channel grouping concept is also adopted in the switch to reduce the hardware complexity and improve the switch's throughput. The Abacus switch has a regular structure and thus has the advantages of: 1) easy expansion, 2) relaxed synchronization for data and clock signals, and 3) building the switch fabric using existing CMOS technology.

Laser scanning has become a well-established surveying solution for obtaining 3D geo-spatial information on objects and the environment. Nowadays scanners acquire up to millions of points per second, which makes point clouds huge. Laser scanning is widely applied from airborne, carborne and stable platforms, resulting in point clouds obtained at different attitudes and with different extents. Working with such different large point clouds makes the determination of their overlapping area necessary but often time consuming. In this paper, a scalable point cloud intersection determination method is presented based on voxels. The method takes two overlapping point clouds as input. It first resamples the input point clouds according to a preset voxel cell size. For all non-empty cells, the center of gravity of the points it contains is computed. Subsequently, for those centers it is checked whether they are in a voxel cell of the other point cloud. The same process is repeated after interchanging the roles of the two point clouds. The quality of the results is evaluated by the distance to the points from the other data set. Also, computation time and quality of the results are compared for different voxel cell sizes. The results are demonstrated on determining the intersection between airborne and carborne laser point clouds and show that the proposed method takes 0.10%, 0.15%, 1.26% and 14.35% of the computation time of the classic method when using cell sizes of 10, 8, 5 and 3 meters, respectively.
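The voxel procedure described in this record can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`voxel_key`, `voxel_centroids`, `overlap_centroids`, `intersection`) are hypothetical, points are plain coordinate tuples, and the voxel grid is assumed to be axis-aligned with its origin at zero.

```python
from collections import defaultdict
from math import floor


def voxel_key(point, cell):
    """Integer index of the voxel cell (edge length `cell`) containing `point`."""
    return tuple(floor(c / cell) for c in point)


def voxel_centroids(points, cell):
    """Map each non-empty voxel to the center of gravity of the points in it."""
    buckets = defaultdict(list)
    for p in points:
        buckets[voxel_key(p, cell)].append(p)
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }


def overlap_centroids(cloud_a, cloud_b, cell):
    """Centroids of cloud_a that fall inside a non-empty voxel of cloud_b."""
    occupied_b = {voxel_key(p, cell) for p in cloud_b}
    return [
        c for c in voxel_centroids(cloud_a, cell).values()
        if voxel_key(c, cell) in occupied_b
    ]


def intersection(cloud_a, cloud_b, cell):
    """Run the check in both directions, interchanging the roles of the clouds."""
    return (overlap_centroids(cloud_a, cloud_b, cell),
            overlap_centroids(cloud_b, cloud_a, cell))


# Toy data (made up for illustration): only the voxel around x = 1..2 is shared.
cloud_a = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (10.2, 0.1, 0.1)]
cloud_b = [(1.2, 0.2, 0.2), (1.8, 0.8, 0.8), (20.0, 0.0, 0.0)]
a_in_b, b_in_a = intersection(cloud_a, cloud_b, cell=1.0)
```

A larger `cell` gives fewer voxels and therefore less work, at the cost of a coarser overlap estimate, which matches the time-versus-quality comparison across cell sizes mentioned in the record.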

Business intelligence and bioinformatics applications increasingly require the mining of datasets consisting of millions of data points, or crafting real-time enterprise-level decision support systems for large corporations and drug companies. In all cases, there needs to be an underlying data mining system, and this mining system must be highly scalable. To this end, we describe a new rule learner called DataSqueezer. The learner belongs to the family of inductive supervised rule extraction algorithms. DataSqueezer is a simple, greedy, rule builder that generates a set of production rules from labeled input data. In spite of its relative simplicity, DataSqueezer is a very effective learner. The rules generated by the algorithm are compact, comprehensible, and have accuracy comparable to rules generated by other state-of-the-art rule extraction algorithms. The main advantages of DataSqueezer are very high efficiency, and missing data resistance. DataSqueezer exhibits log-linear asymptotic complexity with the number of training examples, and it is faster than other state-of-the-art rule learners. The learner is also robust to large quantities of missing data, as verified by extensive experimental comparison with the other learners. DataSqueezer is thus well suited to modern data mining and business intelligence tasks, which commonly involve huge datasets with a large fraction of missing data. PMID:16468565

The anthrax toxin consists of three proteins, protective antigen (PA), lethal factor, and edema factor, that are produced by the Gram-positive bacterium Bacillus anthracis. Current vaccines against anthrax use PA as their primary component. In this study, we developed a scalable process to produce and purify multi-gram quantities of highly pure, recombinant PA (rPA) from Escherichia coli. The rPA protein was produced in a 50-L fermentor and purified to >99% purity using anion-exchange, hydrophobic interaction, and hydroxyapatite chromatography. The final yield from medium cell density fermentations was approximately 2.7 g of highly pure, biologically active rPA per kg of cell paste (approximately 270 mg/L). The results presented here demonstrate the ability to generate multi-gram quantities of rPA from E. coli that may be used for the development of new anthrax vaccines and anthrax therapeutics. PMID:15935696

Some of the most intractable challenges in prehospital medicine include response time optimization, inefficiencies at the emergency medical services (EMS)-emergency department (ED) interface, and the ability to correlate field interventions with patient outcomes. Information technology (IT) can address these and other concerns by ensuring that system and patient information is received when and where it is needed, is fully integrated with prior and subsequent patient information, and is securely archived. Some EMS agencies have begun adopting information technologies, such as wireless transmission of 12-lead electrocardiograms, but few agencies have developed a comprehensive plan for management of their prehospital information and integration with other electronic medical records. This perspective article highlights the challenges and limitations of integrating IT elements without a strategic plan, and proposes an open, interoperable, and scalable prehospital information technology (PHIT) architecture. The two core components of this PHIT architecture are 1) routers with broadband network connectivity to share data between ambulance devices and EMS system information services and 2) an electronic patient care report to organize and archive all electronic prehospital data. To successfully implement this comprehensive PHIT architecture, data and technology requirements must be based on best available evidence, and the system must adhere to health data standards as well as privacy and security regulations. Recent federal legislation prioritizing health information technology may position federal agencies to help design and fund PHIT architectures. PMID:21294627

In next-generation sequencing techniques, millions of short reads are produced from a genomic sequence in a single run. The chances of low read coverage in some regions of the sequence are very high. The reads are short and very large in number. Due to erroneous base calling, there can be errors in the reads. As a consequence, sequence assemblers often fail to sequence an entire DNA molecule and instead output a set of overlapping segments that together represent a consensus region of the DNA. These overlapping segments are collectively called contigs in the literature. The final step of the sequencing process, called scaffolding, is to assemble the contigs into a correct order. Scaffolding techniques typically exploit additional information such as mate-pairs, pair-ends, or optical restriction maps. In this paper we introduce a series of novel algorithms for scaffolding that exploit optical restriction maps (ORMs). Simulation results show that our algorithms are indeed reliable, scalable, and efficient compared to the best known algorithms in the literature. PMID:25081913

We describe a prototype grid proxy cache system developed at Nikhef, motivated by a desire to construct the first building block of a future https-based Content Delivery Network for grid infrastructures. Two goals drove the project: firstly to provide a “native view” of the grid for desktop-type users, and secondly to improve performance for physics-analysis type use cases, where multiple passes are made over the same set of data (residing on the grid). We further constrained the design by requiring that the system should be made of standard components wherever possible. The prototype that emerged from this exercise is a horizontally-scalable, cooperating system of web server / cache nodes, fronted by a customized webDAV server. The webDAV server is custom only in the sense that it supports http redirects (providing horizontal scaling) and that the authentication module has, as back end, a proxy delegation chain that can be used by the cache nodes to retrieve files from the grid. The prototype was deployed at Nikhef and tested at a scale of several terabytes of data and approximately one hundred fast cores of computing. Both small and large files were tested, in a number of scenarios, and with various numbers of cache nodes, in order to understand the scaling properties of the system. For properly-dimensioned cache-node hardware, the system showed speedup of several integer factors for the analysis-type use cases. These results and others are presented and discussed.

We design and implement a scalable hard particle Monte Carlo simulation toolkit (HPMC), and release it open source as part of HOOMD-blue. HPMC runs in parallel on many CPUs and many GPUs using domain decomposition. We employ BVH trees instead of cell lists on the CPU for fast performance, especially with large particle size disparity, and optimize inner loops with SIMD vector intrinsics on the CPU. Our GPU kernel proposes many trial moves in parallel on a checkerboard and uses a block-level queue to redistribute work among threads and avoid divergence. HPMC supports a wide variety of shape classes, including spheres/disks, unions of spheres, convex polygons, convex spheropolygons, concave polygons, ellipsoids/ellipses, convex polyhedra, convex spheropolyhedra, spheres cut by planes, and concave polyhedra. NVT and NPT ensembles can be run in 2D or 3D triclinic boxes. Additional integration schemes permit Frenkel-Ladd free energy computations and implicit depletant simulations. In a benchmark system of a fluid of 4096 pentagons, HPMC performs 10 million sweeps in 10 min on 96 CPU cores on XSEDE Comet. The same simulation would take 7.6 h in serial. HPMC also scales to large system sizes, and the same benchmark with 16.8 million particles runs in 1.4 h on 2048 GPUs on OLCF Titan.
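The benchmark figures in this record imply a parallel speedup and efficiency that the abstract does not state explicitly. The short calculation below derives them from the quoted numbers only (7.6 h in serial, 10 min on 96 CPU cores); the helper names are hypothetical, and the result is arithmetic on the abstract's figures rather than a measurement.

```python
def parallel_speedup(serial_seconds, parallel_seconds):
    """Ratio of serial run time to parallel run time."""
    return serial_seconds / parallel_seconds


def parallel_efficiency(speedup, workers):
    """Fraction of ideal (linear) speedup achieved on `workers` cores."""
    return speedup / workers


serial_seconds = 7.6 * 3600    # 7.6 h serial run quoted in the record
parallel_seconds = 10 * 60     # 10 min on 96 CPU cores quoted in the record

speedup = parallel_speedup(serial_seconds, parallel_seconds)
efficiency = parallel_efficiency(speedup, workers=96)

print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.1%}")
```

For a system of only 4096 particles, domain decomposition leaves few particles per core, so an efficiency well below 100% at 96 cores is not surprising.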

Geometric partitioning is fast and effective for load-balancing dynamic applications, particularly those requiring geometric locality of data (particle methods, crash simulations). We present, to our knowledge, the first parallel implementation of a multidimensional-jagged geometric partitioner. In contrast to the traditional recursive coordinate bisection algorithm (RCB), which recursively bisects subdomains perpendicular to their longest dimension until the desired number of parts is obtained, our algorithm does recursive multi-section with a given number of parts in each dimension. By computing multiple cut lines concurrently and intelligently deciding when to migrate data while computing the partition, we minimize data movement compared to efficient implementations of recursive bisection. We demonstrate the algorithm's scalability and quality relative to the RCB implementation in Zoltan on both real and synthetic datasets. Our experiments show that the proposed algorithm performs and scales better than RCB in terms of run-time without degrading the load balance. Lastly, our implementation partitions 24 billion points into 65,536 parts within a few seconds and exhibits near perfect weak scaling up to 6K cores.
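The idea of recursive multi-section, with a given number of parts in each dimension, can be illustrated with a small sequential sketch. It is not the parallel algorithm of the record: cut positions are found here by simply sorting along each dimension, there is no data migration, and the names `split_equal` and `multi_jagged` are hypothetical.

```python
def split_equal(items, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def multi_jagged(points, parts_per_dim, dim=0):
    """Multi-section along dimension `dim`, then recurse into each slab.

    `parts_per_dim[d]` is the number of parts requested in dimension d, so the
    total number of parts is the product of the entries.
    """
    if dim == len(parts_per_dim):
        return [points]
    ordered = sorted(points, key=lambda p: p[dim])
    result = []
    for slab in split_equal(ordered, parts_per_dim[dim]):
        # Each slab chooses its own cuts in the next dimension, which is what
        # makes the resulting partition "jagged".
        result.extend(multi_jagged(slab, parts_per_dim, dim + 1))
    return result


# Toy data: a 6 x 4 lattice of 2D points split into 3 x 2 = 6 parts.
points = [(x, y) for x in range(6) for y in range(4)]
parts = multi_jagged(points, parts_per_dim=(3, 2))
```

With RCB the same six parts would need several levels of bisection, whereas here all cuts in a dimension are produced in one pass over that dimension.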

We present PRRT (Parallel RRT) and PRRT* (Parallel RRT*), sampling-based methods for feasible and optimal motion planning designed for modern multicore CPUs. We parallelize RRT and RRT* such that all threads concurrently build a single motion planning tree. Parallelization in this manner requires that data structures, such as the nearest neighbor search tree and the motion planning tree, are safely shared across multiple threads. Rather than rely on traditional locks which can result in slowdowns due to lock contention, we introduce algorithms based on lock-free concurrency using atomic operations. We further improve scalability by using partition-based sampling (which shrinks each core’s working data set to improve cache efficiency) and parallel work-saving (in reducing the number of rewiring steps performed in PRRT*). Because PRRT and PRRT* are CPU-based, they can be directly integrated with existing libraries. We demonstrate that PRRT and PRRT* scale well as core counts increase, in some cases exhibiting superlinear speedup, for scenarios such as the Alpha Puzzle and Cubicles scenarios and the Aldebaran Nao robot performing a 2-handed task. PMID:26167135

Patch-based structured adaptive mesh refinement (SAMR) is widely used for high-resolution simulations. Combined with modern supercomputers, it could provide simulations of unprecedented size and resolution. A persistent challenge for this combination has been managing dynamically adaptive meshes on more and more MPI tasks. The distributed mesh management scheme in SAMRAI has made some progress on SAMR scalability, but early algorithms still had trouble scaling past the regime of 10^5 MPI tasks. This work provides two critical SAMR regridding algorithms, which are integrated into that scheme to ensure efficiency of the whole. The clustering algorithm is an extension of the tile-clustering approach, making it more flexible and efficient in both clustering and parallelism. The partitioner is a new algorithm designed to prevent the network congestion experienced by its predecessor. We evaluated performance using weak- and strong-scaling benchmarks designed to be difficult for dynamic adaptivity. Results show good scaling on up to 1.5M cores and 2M MPI tasks. Detailed timing diagnostics suggest scaling would continue well past that.

Technological advances in Multielectrode Arrays (MEAs) used for multisite, parallel electrophysiological recordings lead to an ever increasing amount of raw data being generated. Arrays with hundreds up to a few thousands of electrodes are slowly seeing widespread use, and the expectation is that more sophisticated arrays will become available in the near future. In order to process the large data volumes resulting from MEA recordings, there is a pressing need for new software tools able to process many data channels in parallel. Here we present a new tool for processing MEA data recordings that makes use of new programming paradigms and recent technology developments to unleash the power of modern highly parallel hardware, such as multi-core CPUs with vector instruction sets or GPGPUs. Our tool builds on and complements existing MEA data analysis packages. It shows high scalability and can be used to speed up some performance-critical pre-processing steps such as data filtering and spike detection, helping to make the analysis of larger data sets tractable. PMID:26737215

One of the challenges in experimental quantum information science involves reliable transport (communication) of quantum bits over long distances under realistic conditions involving decoherence and noise. Photons are the fastest and simplest carriers of quantum information since they interact weakly with the environment, but they are difficult to localize and store. It appears that an ideal solution would be to store and process quantum information in matter (i.e. nodes of quantum memory), and to communicate between these nodes using photons. In this talk we discuss how quantum optical techniques can be used to accomplish this goal using atomic ensembles and light as tools. In particular, we describe a fast and robust mechanism for quantum state transfer between light fields and atoms. This is achieved by adiabatically reducing the group velocity of propagating light to zero, thereby "trapping" the photon states in atomic ensembles. We describe the basic principles of this technique as well as our recent experimental progress toward realization of these ideas. We then describe how these techniques can be used to implement a scalable technique for long-distance quantum communication in realistic noisy channels.

We performed an investigation into explicit algorithms for the simulation of incompressible flows using methods with a finite, but small amount of compressibility added. Such methods include the artificial compressibility method and the lattice-Boltzmann method. The impetus for investigating such techniques stems from the increasing use of parallel computation at all levels (processors, clusters, and graphics processing units). Explicit algorithms have the potential to leverage these resources. In our investigation, a new form of artificial compressibility was derived. This method, referred to as the Entropically Damped Artificial Compressibility (EDAC) method, demonstrated superior results to traditional artificial compressibility methods by damping the numerical acoustic waves associated with these methods. Performance nearing that of the lattice-Boltzmann technique was observed, without the requirement of recasting the problem in terms of particle distribution functions; continuum variables may be used. Several example problems were investigated using finite-difference and finite-element discretizations of the EDAC equations. Example problems included lid-driven cavity flow, a convecting Taylor-Green vortex, a doubly periodic shear layer, freely decaying turbulence, and flow over a square cylinder. Additionally, a scalability study was performed using in excess of one million processing cores. Explicit methods were found to have desirable scaling properties; however, some robustness and general applicability issues remained.

Flexible thin-film transistors (TFTs) are of central importance for diverse electronic and particularly macroelectronic applications. The current TFTs using organic or inorganic thin film semiconductors are usually limited by either poor electrical performance or insufficient mechanical flexibility. Here, we report a new design of highly flexible vertical TFTs (VTFTs) with superior electrical performance and mechanical robustness. By using graphene as a work-function tunable contact for amorphous indium gallium zinc oxide (IGZO) thin film, the vertical current flow across the graphene-IGZO junction can be effectively modulated by an external gate potential to enable VTFTs with a highest on-off ratio exceeding 10^5. The unique vertical transistor architecture can readily enable ultrashort channel devices with very high delivering current and exceptional mechanical flexibility. With large area graphene and IGZO thin film available, our strategy is intrinsically scalable for large scale integration of VTFT arrays and logic circuits, opening up a new pathway to highly flexible macroelectronics. PMID:24502192

Grids enable uniform access to resources by implementing standard interfaces to resource gateways. In the Open Science Grid (OSG), privileges are granted on the basis of the user's membership in a Virtual Organization (VO). However, Grid sites are solely responsible for determining and controlling access privileges to resources using users' identity and personal attributes, which are available through Grid credentials. While this guarantees the sites full control over access rights, it makes VO privileges heterogeneous throughout the Grid and hardly fits with the Grid paradigm of uniform access to resources. To address these challenges, we are developing the Scalable Virtual Organization Privileges Management Environment (SVOPME), which provides tools for VOs to define and publish desired privileges and assists sites in providing the appropriate access policies. Moreover, SVOPME provides tools for Grid sites to analyze site access policies for various resources, verify compliance with preferred VO policies, and generate directives for site administrators on how the local access policies can be amended to achieve such compliance without taking control of local configurations away from site administrators. This paper discusses what access policies are of interest to the OSG community and how SVOPME implements privilege management for OSG.

Programming models intended to run on exascale systems have a number of challenges to overcome, especially the sheer size of the system as measured by the number of concurrent software entities created and managed by the underlying runtime. It is clear from the size of these systems that any state maintained by the programming model has to be strictly sub-linear in size, in order not to overwhelm memory usage with pure overhead. A principal feature of Partitioned Global Address Space (PGAS) models is providing easy access to global-view distributed data structures. In order to provide efficient access to these distributed data structures, PGAS models must keep track of metadata such as where array sections are located with respect to processes/threads running on the HPC system. As PGAS models and applications become ubiquitous on very large transpetascale systems, a key component to their performance and scalability will be efficient and judicious use of memory for model overhead (metadata) compared to application data. We present an evaluation of several strategies to manage PGAS metadata that exhibit different space/time tradeoffs. We use two real-world PGAS applications to capture metadata usage patterns and gain insight into their communication behavior.
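To make the notion of a space/time tradeoff in locating array sections concrete, the sketch below contrasts two generic ways of answering "which process owns element i?". These are textbook illustrations chosen for this note, not the strategies evaluated in the record, and the class names `RegularBlockMap` and `IrregularBlockMap` are hypothetical.

```python
from bisect import bisect_right


class RegularBlockMap:
    """Constant-size metadata: the owner is computed arithmetically.

    Assumes a regular block distribution, where every process holds one
    contiguous block of the same size (the last block may be shorter).
    """

    def __init__(self, n_elements, n_procs):
        self.block = -(-n_elements // n_procs)  # ceiling division

    def owner(self, index):
        return index // self.block


class IrregularBlockMap:
    """Metadata that grows with the number of blocks.

    Stores the start offset and owner of every block, which allows arbitrary
    block sizes at the cost of memory and a binary search per lookup.
    """

    def __init__(self, starts, owners):
        self.starts = list(starts)   # sorted start index of each block
        self.owners = list(owners)   # owning process of each block

    def owner(self, index):
        return self.owners[bisect_right(self.starts, index) - 1]


regular = RegularBlockMap(n_elements=10, n_procs=4)          # blocks of 3
irregular = IrregularBlockMap(starts=[0, 2, 7], owners=[2, 0, 1])
```

The first map keeps a single integer per array no matter how many processes there are, while the second keeps one entry per block, which is the kind of per-process state that becomes costly at very large scale.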

The Scalable Coherent Interface (SCI) project (IEEE P1596) found a way to avoid the limits that are inherent in bus technology. SCI provides bus-like services by transmitting packets on a collection of point-to-point unidirectional links. The SCI protocols support cache coherence in a distributed-shared-memory multiprocessor model, message passing, I/O, and local-area-network-like communication over fiber optic or wire links. VLSI circuits that operate parallel links at 1000 MByte/s and serial links at 1000 Mbit/s will be available early in 1992. Several ongoing SCI-related projects are applying the SCI technology to new areas or extending it to more difficult problems. P1596.1 defines the architecture of a bridge between SCI and VME; P1596.2 compatibly extends the cache coherence mechanism for efficient operation with kiloprocessor systems; P1596.3 defines new low-voltage (about 0.25 V) differential signals suitable for low power interfaces for CMOS or GaAs VLSI implementations of SCI; P1596.4 defines a high performance memory chip interface using these signals; P1596.5 defines data transfer formats for efficient interprocessor communication in heterogeneous multiprocessor systems. This paper reports the current status of SCI, related standards, and new projects. 16 refs.

We propose a scalable, low-noise imager architecture for terahertz recordings that helps to build large-scale integrated arrays from any field-effect transistor (FET)- or HEMT-based terahertz detector. It enhances the signal-to-noise ratio (SNR) by inherently enabling complex sampling schemes. The distinguishing feature of the architecture is the serially connected detectors with electronically controllable photoresponse. We show that this architecture facilitates room temperature imaging by decreasing the low-noise amplifier (LNA) noise to one-sixteenth of that of a non-serial sensor while also reducing the number of multiplexed signals in the same proportion. The serially coupled architecture can be combined with the existing read-out circuit organizations to create high-resolution, coarse-grain sensor arrays. In addition, it adds the capability to suppress overall noise with increasing array size. The theoretical considerations are proven on a 4 by 4 detector array manufactured in a standard CMOS technology with a 180 nm feature size. The detector array is integrated with a low-noise AC-coupled amplifier of 40 dB gain and has a resonant peak at 460 GHz with 200 kV/W overall sensitivity.

Indoor localization using Received Signal Strength Indication (RSSI) fingerprinting has been extensively studied for decades. The positioning accuracy is highly dependent on the density of the signal database. In areas without calibration data, however, this algorithm breaks down. Building and updating a dense signal database is labor intensive, expensive, and even impossible in some areas. Researchers are continually searching for better algorithms to create and update dense databases more efficiently. In this paper, we propose a scalable indoor positioning algorithm that works in both surveyed and unsurveyed areas. We first propose a Minimum Inverse Distance (MID) algorithm to build a virtual database with uniformly distributed virtual Reference Points (RPs). The area covered by the virtual RPs can be larger than the surveyed area. A Local Gaussian Process (LGP) is then applied to estimate the virtual RPs' RSSI values based on the crowdsourced training data. Finally, we improve the Bayesian algorithm to estimate the user's location using the virtual database. All the parameters are optimized by simulations, and the new algorithm is tested on real-case scenarios. The results show that the new algorithm improves the accuracy by 25.5% in the surveyed area, with an average positioning error below 2.2 m for 80% of the cases. Moreover, the proposed algorithm can localize users in the neighboring unsurveyed area. PMID:26999139

If a piece of information is released from a media site, can we predict whether it may spread to one million web pages in a month? This influence estimation problem is very challenging since both the time-sensitive nature of the task and the requirement of scalability need to be addressed simultaneously. In this paper, we propose a randomized algorithm for influence estimation in continuous-time diffusion networks. Our algorithm can estimate the influence of every node in a network with |V| nodes and |E| edges to an accuracy of ε using n = O(1/ε^2) randomizations and, up to logarithmic factors, O(n|E| + n|V|) computations. When used as a subroutine in a greedy influence maximization approach, our proposed algorithm is guaranteed to find a set of C nodes with an influence of at least (1 − 1/e) OPT − 2Cε, where OPT is the optimal value. Experiments on both synthetic and real-world data show that the proposed algorithm can easily scale up to networks of millions of nodes while significantly improving over the previous state of the art in terms of the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence. PMID:26752940
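The greedy influence maximization loop that uses an influence estimator as a subroutine can be sketched as follows. This is a generic greedy selection under stated assumptions: `greedy_max_influence` and `coverage_influence` are hypothetical names, and the toy coverage function merely stands in for the record's randomized estimator, which is not implemented here.

```python
def greedy_max_influence(candidates, budget, influence):
    """Pick up to `budget` nodes, each time adding the node that maximizes
    the estimated influence of the selected set."""
    chosen = []
    for _ in range(budget):
        remaining = [v for v in candidates if v not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda v: influence(chosen + [v]))
        chosen.append(best)
    return chosen


# Stand-in influence oracle (an assumption for illustration): the influence of
# a set is the number of distinct pages reached by any node in the set.
reach = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}


def coverage_influence(nodes):
    covered = set()
    for v in nodes:
        covered |= reach[v]
    return len(covered)


selected = greedy_max_influence(list(reach), budget=2, influence=coverage_influence)
```

With an exact monotone submodular influence function, greedy selection achieves at least (1 − 1/e) of the optimum; the extra −2Cε term in the record's bound accounts for using an estimate with accuracy ε in place of the exact value.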

The High Performance Storage System (HPSS) provides scalable hierarchical storage management (HSM), archive, and file system services. Its design, implementation and current dominant use are focused on HSM and archive services. It is also a general-purpose, global, shared, parallel file system, potentially useful in other application domains. When HPSS design and implementation began over a decade ago, scientific computing power and storage capabilities at a site, such as a DOE national laboratory, were measured in a few 10s of gigaops, data archived in HSMs in a few 10s of terabytes at most, data throughput rates to an HSM in a few megabytes/s, and daily throughput with the HSM in a few gigabytes/day. At that time, the DOE national laboratories and IBM HPSS design team recognized that we were headed for a data storage explosion driven by computing power rising to teraops/petaops, requiring data stored in HSMs to rise to petabytes and beyond, data transfer rates with the HSM to rise to gigabytes/s and higher, and daily throughput with an HSM to rise to 10s of terabytes/day. This paper discusses HPSS architectural, implementation and deployment experiences that contributed to its success in meeting the above orders of magnitude scaling targets. We also discuss areas that need additional attention as we continue significant scaling into the future.

Previous generation low light detection platforms have been based on the photomultiplier tube (PMT) or the silicon single photon counting module (SPCM) from Perkin Elmer. A new generation of silicon CMOS compatible photon counting sensors is being developed, offering high quantum efficiency, low operating voltage, high levels of robustness and compatibility with CMOS processing for integration into large format imaging arrays. This latest generation yields a new detector for emerging applications which demand photon counting performance, providing high performance and flexibility not possible to date. We describe a 4-channel photon detection platform, which allows the use of 4 separate photon counting detectors in either free space or fibre-coupled mode. The platform is scalable up to 16 channels with plug-in modules allowing active quenching or Peltier cooling as required. A graphical user interface allows feedback and control of all device parameters. We show a novel ability to integrate separate detection modules to extend the dynamic range of the system. This allows a PIN or APD mode detector to be used alongside sensitive photon counting detectors. An advanced FPGA and microcontroller interface has been designed which allows simultaneous time binning of counting rates and readout of the analog signals when used with linear detectors. This new architecture will be discussed, presenting a full characterization of count rate, quantum efficiency, time binning and sensitivity across the broad spectrum of light flux applicable to PIN diodes, APDs and Geiger-mode photon counting sensors.

A wide variety of applications (from industrial to entertainment) need reliable and accurate 3D information about the motion of an object and its parts. Very often the movement is rather fast, as in vehicle movement, sport biomechanics, and animation of cartoon characters. Motion capture systems based on different physical principles are used for these purposes. Vision-based systems have great potential for high accuracy and a high degree of automation, owing to progress in image processing and analysis. A scalable, inexpensive motion capture system has been developed as a convenient and flexible tool for solving various tasks requiring 3D motion analysis. It is based on photogrammetric techniques of 3D measurement and provides high-speed image acquisition, high accuracy of 3D measurements and highly automated processing of captured data. Depending on the application, the system can be easily modified for different working areas from 100 mm to 10 m. The developed motion capture system uses from 2 to 4 machine vision cameras to acquire video sequences of object motion. All cameras work in synchronized mode at frame rates up to 100 frames per second under the control of a personal computer, making accurate calculation of the 3D coordinates of points of interest possible. The system was used in a set of different application fields and demonstrated high accuracy and a high level of automation.

The production of Earth Science data from orbiting spacecraft is an activity that takes place 24 hours a day, 7 days a week. At the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC), this results in as many as 16,000 program executions each day, far too many to be run by human operators. In fact, when the Moderate Resolution Imaging Spectroradiometer (MODIS) was launched aboard the Terra spacecraft in 1999, the automated commercial system for running science processing was able to manage no more than 4,000 executions per day. Consequently, the GES DAAC developed a lightweight system based on the popular Perl scripting language, named the Simple, Scalable, Script-based Science Processor (S4P). S4P automates science processing, allowing operators to focus on the rare problems occurring from anomalies in data or algorithms. S4P has been reused in several systems ranging from routine processing of MODIS data to data mining and is publicly available from NASA.

Indoor localization using Received Signal Strength Indication (RSSI) fingerprinting has been extensively studied for decades. The positioning accuracy is highly dependent on the density of the signal database. In areas without calibration data, however, this algorithm breaks down. Building and updating a dense signal database is labor intensive, expensive, and even impossible in some areas. Researchers are continually searching for better algorithms to create and update dense databases more efficiently. In this paper, we propose a scalable indoor positioning algorithm that works both in surveyed and unsurveyed areas. We first propose Minimum Inverse Distance (MID) algorithm to build a virtual database with uniformly distributed virtual Reference Points (RP). The area covered by the virtual RPs can be larger than the surveyed area. A Local Gaussian Process (LGP) is then applied to estimate the virtual RPs' RSSI values based on the crowdsourced training data. Finally, we improve the Bayesian algorithm to estimate the user's location using the virtual database. All the parameters are optimized by simulations, and the new algorithm is tested on real-case scenarios. The results show that the new algorithm improves the accuracy by 25.5% in the surveyed area, with an average positioning error below 2.2 m for 80% of the cases. Moreover, the proposed algorithm can localize the users in the neighboring unsurveyed area. PMID:26999139

Superconductor digital electronics using Josephson junctions as ultrafast switches and magnetic-flux encoding of information was proposed over 30 years ago as a sub-terahertz clock frequency alternative to semiconductor electronics based on complementary metal-oxide-semiconductor (CMOS) transistors. Recently, interest in developing superconductor electronics has been renewed due to a search for energy saving solutions in applications related to high-performance computing. The current state of superconductor electronics and fabrication processes are reviewed in order to evaluate whether this technology is scalable to the very large scale integration (VLSI) required to achieve computation complexities comparable to CMOS processors. A fully planarized process at MIT Lincoln Laboratory, perhaps the most advanced process developed so far for superconductor electronics, is used as an example. The process has nine superconducting layers: eight Nb wiring layers with a minimum feature size of 350 nm, and a thin superconducting layer for making compact high-kinetic-inductance bias inductors. All circuit layers are fully planarized using chemical mechanical planarization (CMP) of the SiO2 interlayer dielectric. The physical limitations imposed on the circuit density by Josephson junctions, circuit inductors, shunt and bias resistors, etc., are discussed. Energy dissipation in superconducting circuits is also reviewed in order to estimate whether this technology, which requires cryogenic refrigeration, can be energy efficient. Fabrication process development required for increasing the density of superconductor digital circuits by a factor of ten and achieving densities above 10^7 Josephson junctions per cm^2 is described.

By using the principle of fixed-time benchmarking, it is possible to compare a very wide range of computers, from a small personal computer to the most powerful parallel supercomputer, on a single scale. Fixed-time benchmarks promise far greater longevity than those based on a particular problem size, and are more appropriate for "grand challenge" capability comparison. We present the design of a benchmark, SLALOM(TM), that scales automatically to the computing power available, and corrects several deficiencies in various existing benchmarks: it is highly scalable, it solves a real problem, it includes input and output times, and it can be run on parallel machines of all kinds, using any convenient language. The benchmark provides a reasonable estimate of the size of problem solvable on scientific computers. Results are presented that span six orders of magnitude for contemporary computers of various architectures. The benchmarks also can be used to demonstrate a new source of superlinear speedup in parallel computers. 15 refs., 14 figs., 3 tabs.
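
The fixed-time idea can be illustrated with a short sketch: instead of timing a fixed problem, hold the time budget constant and report the largest problem size that fits in it. The Python code below is not SLALOM itself; the function name `largest_size_within_budget`, the `run_seconds` callback, and the cubic cost model in the usage lines are all assumptions made for illustration, and the search assumes run time does not decrease as the problem grows.

```python
from typing import Callable


def largest_size_within_budget(
    run_seconds: Callable[[int], float],
    budget_seconds: float,
    max_size: int = 1 << 30,
) -> int:
    """Return the largest size n in [1, max_size] whose run fits the budget.

    Returns 0 if even n = 1 does not fit. Assumes run_seconds(n) is
    non-decreasing in n (a hypothetical but typical property).
    """
    if run_seconds(1) > budget_seconds:
        return 0
    # Exponential search: double n until the budget is exceeded.
    lo, hi = 1, 2
    while hi <= max_size and run_seconds(hi) <= budget_seconds:
        lo, hi = hi, hi * 2
    hi = min(hi, max_size + 1)
    # Binary search: lo always fits, hi never does (or is out of range).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_seconds(mid) <= budget_seconds:
            lo = mid
        else:
            hi = mid
    return lo


# Synthetic cost model (assumption): an O(n^3) solver at 1 microsecond per unit.
def model(n: int) -> float:
    return 1e-6 * n ** 3


score = largest_size_within_budget(model, budget_seconds=60.0)
```

In a real measurement `run_seconds` would execute the workload and return wall-clock time, including input and output, so the reported size becomes the figure of merit for the machine.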

Most statistical software packages implement a broad range of techniques but do so in an ad hoc fashion, leaving users who do not have a broad knowledge of statistics at a disadvantage since they may not understand all the implications of a given analysis or how to test the validity of results. These packages are also largely serial in nature, or target multicore architectures instead of distributed-memory systems, or provide only a small number of statistics in parallel. This paper surveys a collection of parallel implementations of statistical algorithms developed as part of a common framework over the last 3 years. The framework strategically groups modeling techniques with associated verification and validation techniques to make the underlying assumptions of the statistics clearer. Furthermore, it employs a design pattern specifically targeted at distributed-memory parallelism, where architectural advances in large-scale high-performance computing have been focused. Moment-based statistics (which include descriptive, correlative, and multicorrelative statistics, principal component analysis (PCA), and k-means statistics) scale nearly linearly with the data set size and number of processes. Entropy-based statistics (which include order and contingency statistics) do not scale well when the data in question is continuous or quasi-diffuse but do scale well when the data is discrete and compact. We confirm and extend our earlier results by now establishing near-optimal scalability with up to 10,000 processes.
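
One reason moment-based statistics tend to parallelize well is that each process can summarize its own data in a small fixed-size record, and records can be merged pairwise in any order. The Python sketch below shows this for count, mean and variance using the standard pairwise-update formulas; it is an illustration under that assumption, not the framework's actual code, and the names `Moments`, `local_moments` and `merge` are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Moments:
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    @property
    def variance(self) -> float:
        """Sample variance (n - 1 in the denominator)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def local_moments(values: Iterable[float]) -> Moments:
    """Single-pass (Welford) summary of the data held by one process."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return Moments(n, mean, m2)


def merge(a: Moments, b: Moments) -> Moments:
    """Combine two partial summaries; associative, so usable in a reduction."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


# Each inner list stands in for the data owned by one process.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0]]
total = reduce(merge, map(local_moments, chunks), Moments())
```

Only the three-field summaries cross process boundaries, so communication cost is independent of the data set size. On a distributed-memory machine the `reduce` call would be replaced by a collective reduction.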

Geometric partitioning is fast and effective for load-balancing dynamic applications, particularly those requiring geometric locality of data (particle methods, crash simulations). We present, to our knowledge, the first parallel implementation of a multidimensional-jagged geometric partitioner. In contrast to the traditional recursive coordinate bisection algorithm (RCB), which recursively bisects subdomains perpendicular to their longest dimension until the desired number of parts is obtained, our algorithm does recursive multi-section with a given number of parts in each dimension. By computing multiple cut lines concurrently and intelligently deciding when to migrate data while computing the partition, we minimize data movement compared to efficient implementations of recursive bisection. We demonstrate the algorithm's scalability and quality relative to the RCB implementation in Zoltan on both real and synthetic datasets. Our experiments show that the proposed algorithm performs and scales better than RCB in terms of run-time without degrading the load balance. Lastly, our implementation partitions 24 billion points into 65,536 parts within a few seconds and exhibits near perfect weak scaling up to 6K cores.
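
The difference from recursive bisection can be seen in a small serial sketch: the points are cut into several slabs along x in one step, and each slab is then cut independently along y, so the y cut lines differ from slab to slab (hence "jagged"). The Python code below is a two-dimensional, single-process illustration only; it does not reproduce the paper's parallel cut-line search or data migration logic, and the names `split_even` and `multisection_2d` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def split_even(items: List[Point], parts: int) -> List[List[Point]]:
    """Cut an already sorted list into `parts` contiguous, near-equal chunks."""
    n = len(items)
    bounds = [(i * n) // parts for i in range(parts + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(parts)]


def multisection_2d(
    points: Sequence[Point], parts_x: int, parts_y: int
) -> List[List[Point]]:
    """Partition points into parts_x * parts_y balanced parts.

    All x cuts are chosen at once, then every slab gets its own y cuts.
    """
    slabs = split_even(sorted(points, key=lambda p: p[0]), parts_x)
    parts: List[List[Point]] = []
    for slab in slabs:
        parts.extend(split_even(sorted(slab, key=lambda p: p[1]), parts_y))
    return parts


# Deterministic toy data: 24 distinct points split into 3 x 4 = 12 parts.
pts = [(float((i * 7) % 24), float((i * 5) % 24)) for i in range(24)]
parts = multisection_2d(pts, parts_x=3, parts_y=4)
```

With 24 points and 12 parts, every part ends up holding exactly two points, which is the load-balance property the partitioner aims for. Sorting is used here for brevity and is an assumption of the sketch, not the method described in the abstract.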

This paper presents a design of scalable Partitioned Global Address Space (PGAS) communication subsystems on the recently proposed Blue Gene/Q architecture. The proposed design provides an in-depth modeling of communication infrastructure using the Parallel Active Messaging Interface (PAMI). The communication infrastructure is used to design time- and space-efficient communication protocols for frequently used data types (contiguous, uniformly non-contiguous) using Remote Direct Memory Access (RDMA) get/put primitives. The proposed design accelerates load balance counters by using asynchronous threads, which are required due to the missing network hardware support for Atomic Memory Operations (AMOs). Under the proposed design, the synchronization traffic is reduced by tracking conflicting memory accesses in distributed space with a slight increase in space complexity. An evaluation with simple communication benchmarks shows an adjacent-node get latency of 2.89 μs and a peak bandwidth of 1775 MB/s, resulting in approximately 99% communication efficiency. The evaluation shows a reduction in execution time of up to 30% for an NWChem self-consistent field calculation on 4096 processes using the proposed asynchronous-thread-based design.

Recently, density-based clustering algorithms (DBSCAN and OPTICS) have received significant attention from the scientific community due to their unique capability of discovering arbitrarily shaped clusters and eliminating noise data. These algorithms have several applications that require high performance computing, including finding halos and subhalos (clusters) in massive cosmology data in astrophysics, analyzing satellite images, X-ray crystallography, and anomaly detection. However, parallelization of these algorithms is extremely challenging, as they exhibit an inherently sequential data access order and an unbalanced workload, resulting in low parallel efficiency. To break the data access sequentiality and to achieve high parallelism, we develop new parallel algorithms, both for DBSCAN and OPTICS, designed using graph algorithmic techniques. For example, our parallel DBSCAN algorithm exploits the similarities between DBSCAN and computing connected components. Using datasets containing up to a billion floating point numbers, we show that our parallel density-based clustering algorithms significantly outperform the existing algorithms, achieving speedups of up to 27.5 on 40 cores on a shared memory architecture and speedups of up to 5,765 using 8,192 cores on a distributed memory architecture. In our experiments, we found that while achieving this scalability, our algorithms produce clustering results of comparable quality to the classical algorithms.
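
The link between DBSCAN and connected components can be sketched in a few lines: core points that lie within eps of each other are joined with a union-find structure, and the resulting components are the clusters. Because unions can be applied in any order, the strict point-by-point expansion of the classical algorithm is no longer needed. The serial Python code below is a minimal illustration with brute-force neighbor search, not the authors' parallel implementation; the function names are hypothetical.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple


def find(parent: List[int], i: int) -> int:
    """Return the representative of i's component (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent: List[int], a: int, b: int) -> None:
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def dbscan_union_find(
    points: Sequence[Tuple[float, ...]], eps: float, min_pts: int
) -> List[int]:
    """Label each point with a cluster id, or -1 for noise."""
    n = len(points)
    # Brute-force O(n^2) neighborhoods; each point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_pts for nb in neighbors]
    parent = list(range(n))
    # Clusters are the connected components of the core-point graph.
    for i in range(n):
        if core[i]:
            for j in neighbors[i]:
                if core[j]:
                    union(parent, i, j)
    labels = [-1] * n
    ids: Dict[int, int] = {}
    for i in range(n):
        if core[i]:
            labels[i] = ids.setdefault(find(parent, i), len(ids))
    # Border points adopt the cluster of any core neighbor; the rest stay noise.
    for i in range(n):
        if not core[i]:
            for j in neighbors[i]:
                if core[j]:
                    labels[i] = labels[j]
                    break
    return labels


pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11), (50, 50)]
labels = dbscan_union_find(pts, eps=1.5, min_pts=3)
```

A scalable version would replace the brute-force neighbor lists with a spatial index and perform the unions concurrently; those choices are outside the scope of this sketch.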

BlockSolve is a scalable parallel software library for the solution of large sparse, symmetric systems of linear equations. It runs on a variety of parallel architectures and can easily be ported to others. BlockSolve is primarily intended for the solution of sparse linear systems that arise from physical problems having multiple degrees of freedom at each node point. For example, when the finite element method is used to solve practical problems in structural engineering, each node will typically have anywhere from 3 to 6 degrees of freedom associated with it. BlockSolve is written to take advantage of problems of this nature; however, it is still reasonably efficient for problems that have only one degree of freedom associated with each node, such as the three-dimensional Poisson problem. It does not require that the matrices have any particular structure other than being sparse and symmetric. BlockSolve is intended to be used within real application codes. It is designed to work best in the context of our experience, which indicates that most application codes solve the same linear systems with several different right-hand sides and/or solve linear systems with the same structure but different matrix values multiple times.

To facilitate interoperability of models in a scalable environment, and provide a relevant virtual environment in which Survivability technologies can be evaluated, the US Army Research Development and Engineering Command (RDECOM) Modeling Architecture for Technology Research and Experimentation (MATREX) Science and Technology Objective (STO) program has initiated the Survivability Thread, which will seek to address some of the many technical and programmatic challenges associated with the effort. In coordination with different Thread customers, such as the Survivability branches of various Army labs, a collaborative group has been formed to define the requirements for the simulation environment that would in turn provide them with a value-added tool for assessing models and gauging system-level performance relevant to Future Combat Systems (FCS) and the Survivability requirements of other burgeoning programs. An initial set of customer requirements has been generated in coordination with the RDECOM Survivability IPT lead, through the Survivability Technology Area at RDECOM Tank-automotive Research Development and Engineering Center (TARDEC, Warren, MI). The results of this project are aimed at a culminating experiment and demonstration scheduled for September 2006, which will include a multitude of components from within RDECOM and provide the framework for future experiments to support Survivability research. This paper details the components with which the MATREX Survivability Thread was created and executed, and provides insight into the capabilities currently demanded by the Survivability faculty within RDECOM.

Mixed species chains of barium and ytterbium ions are investigated as a tool for building scalable quantum computation devices. Ytterbium ions provide a stable, environmentally insensitive qubit that is easily initialized and manipulated, while barium ions are easily entangled with photons that can allow quantum information to be transmitted between systems in modular quantum computation units. Barium and ytterbium are trapped together in a linear chain in a linear rf trap, and their normal mode structure and the thermal occupation numbers of these modes are measured with a narrow band laser addressing an electric quadrupole transition in barium ions. Before these measurements, barium ions are directly cooled using Doppler cooling, while the ytterbium ions are sympathetically cooled by the barium. For radial modes strongly coupled to ytterbium ions, the average thermal occupation numbers vary between 400 and 12,000 depending on ion species configuration and trap parameters. Ion chain temperatures are also measured using a technique based on ion species reordering. Surface traps with many dc electrodes provide the ability to controllably reorder the chain to optimize normal mode cooling, and initial work towards realizing this capability is discussed. Quantum information can be transferred between ions in a linear chain using an optical system that is well coupled to the motional degrees of freedom of the chain. For this reason, a 532 nm Raman system is developed and its expected performance is evaluated.

Scalable Coherent Interface (SCI, IEEE/ANSI Std 1596-1992) (SCI1, SCI2) is a high performance interconnect for shared memory multiprocessor systems. In this project we investigate an SCI Real Time Protocol (RTSCI1) using Directed Flow Control Symbols. We studied the issues of efficient generation of control symbols, and created a simulation model of the protocol on a ring-based SCI system. This report presents the results of the study. The project has been implemented using SES/Workbench. The details that follow encompass aspects of both SCI and Flow Control Protocols, as well as the effect of realistic client/server processing delay. The report is organized as follows. Section 2 provides a description of the simulation model. Section 3 describes the protocol implementation details. The next three sections of the report elaborate on the workload, results and conclusions. Appended to the report is a description of the tool, SES/Workbench, used in our simulation, and internal details of our implementation of the protocol.

Household assistance robots are expected to become more prominent in the future and will require inherently safe design. Conducting polymer-based artificial muscle actuators are one potential option for achieving this safety, as they are flexible, lightweight and can be driven using low input voltages, unlike electromagnetic motors; however, practical implementation also requires a scalable structure and stability in air. In this paper we propose and practically implement a multilayer conducting polymer actuator which could achieve these targets using polypyrrole film and ionic liquid-soaked separators. The practical work density of a nine-layer multilayer actuator was 1.4 kJ m^-3 at 0.5 Hz, when the volumes of the electrolyte and counter electrodes were included, which approaches the performance of mammalian muscle. To achieve air stability, we analyzed the effect of air-stable ionic liquid gels on actuator displacement using finite element simulation and it was found that the majority of strain could be retained when the elastic modulus of the gel was kept below 3 kPa. As a result of this work, we have shown that multilayered conducting polymer actuators are a feasible idea for household robotics, as they provide a substantial practical work density in a compact structure and can be easily scaled as required.

In this paper, the general characteristics and the scalability of Schottky barrier metal-oxide-semiconductor field effect transistors (SB-MOSFETs) are introduced and reviewed. The most important factors, i.e., interface-trap density, lifetime and Schottky barrier height of an erbium-silicided Schottky diode, are estimated using an equivalent circuit method. The extracted interface trap density, lifetime and Schottky barrier height for holes are estimated as 1.5 × 10^13 traps/cm^2, 3.75 ms and 0.76 eV, respectively. The interface traps are efficiently cured by N2 annealing. Based on the diode characteristics, various sizes of erbium-silicided/platinum-silicided n/p-type SB-MOSFETs are manufactured and analyzed. The manufactured SB-MOSFETs show enhanced drain induced barrier lowering (DIBL) characteristics due to the existence of a Schottky barrier between source and channel. DIBL and subthreshold swing characteristics are comparable with the ultimate scaling limit of double gate MOSFETs, which shows the possible application of SB-MOSFETs in the nanoscale regime.

Quantum computing fundamentally depends on the ability to concurrently entangle and individually address/control a large number of qubits. In general, the primary inhibitors of large scale entanglement are qubit dependent; for example inhomogeneity in quantum dots, spectral crowding brought about by proximity-based entanglement in ions, weak interactions of neutral atoms, and the fabrication tolerances in the case of Si-vacancies or SQUIDs. We propose an inherently scalable solid-state qubit system with individually addressable qubits based on the coupling of a phonon with an acceptor impurity in a high-Q Phononic Crystal resonant cavity. Due to their unique nonlinear properties, phonons enable new opportunities for quantum devices and physics. We present a phononic crystal-based platform for observing the phonon analogy of cavity quantum electrodynamics, called phonodynamics, in a solid-state system. Practical schemes involve selective placement of a single acceptor atom in the peak of the strain field in a high-Q phononic crystal cavity that enables strong coupling of the phonon modes to the energy levels of the atom. A qubit is then created by entangling a phonon at the resonance frequency of the cavity with the atomic acceptor states. We show theoretical optimization of the cavity design and excitation waveguides, along with estimated performance figures of the phoniton system. Qubits based on this half-sound, half-matter quasi-particle, may outcompete other quantum architectures in terms of combined emission rate, coherence lifetime, and fabrication demands.

Patch-based structured adaptive mesh refinement (SAMR) is widely used for high-resolution simulations. Combined with modern supercomputers, it could provide simulations of unprecedented size and resolution. A persistent challenge for this combination has been managing dynamically adaptive meshes on more and more MPI tasks. The distributed mesh management scheme in SAMRAI has made some progress on SAMR scalability, but early algorithms still had trouble scaling past the regime of 10^5 MPI tasks. This work provides two critical SAMR regridding algorithms, which are integrated into that scheme to ensure efficiency of the whole. The clustering algorithm is an extension of the tile-clustering approach, making it more flexible and efficient in both clustering and parallelism. The partitioner is a new algorithm designed to prevent the network congestion experienced by its predecessor. We evaluated performance using weak- and strong-scaling benchmarks designed to be difficult for dynamic adaptivity. Results show good scaling on up to 1.5M cores and 2M MPI tasks. Detailed timing diagnostics suggest scaling would continue well past that.

Objective: To create a clinical decision support (CDS) system that is shareable across healthcare delivery systems and settings over large geographic regions. Materials and methods: The enterprise clinical rules service (ECRS) realizes nine design principles through a series of enterprise java beans and leverages off-the-shelf rules management systems in order to provide consistent, maintainable, and scalable decision support in a variety of settings. Results: The ECRS is deployed at Partners HealthCare System (PHS) and is in use for a series of trials by members of the CDS consortium, including internally developed systems at PHS, the Regenstrief Institute, and vendor-based systems deployed at locations in Oregon and New Jersey. Performance measures indicate that the ECRS provides sub-second response time when measured apart from services required to retrieve data and assemble the continuity of care document used as input. Discussion: We consider related work, design decisions, comparisons with emerging national standards, and discuss uses and limitations of the ECRS. Conclusions: ECRS design, implementation, and use in CDS consortium trials indicate that it provides the flexibility and modularity needed for broad use and performs adequately. Future work will investigate additional CDS patterns, alternative methods of data passing, and further optimizations in ECRS performance. PMID:23828174

The proper allocation of network resources from a common physical substrate to a set of virtual networks (VNs) is one of the key technical challenges of network virtualization. While a variety of state-of-the-art algorithms have been proposed in an attempt to address this issue from different facets, the challenge still remains in the context of large-scale networks, as the existing solutions mainly operate in a centralized manner, which requires maintaining complete and up-to-date information about the underlying substrate network. This restricts scalability and computational efficiency when the network scale becomes large. This paper tackles the virtual network mapping problem and proposes a novel hierarchical algorithm in conjunction with a substrate network decomposition approach. By appropriately transforming the underlying substrate network into a collection of sub-networks, the hierarchical virtual network mapping algorithm can be carried out through a global virtual network mapping algorithm (GVNMA) and a local virtual network mapping algorithm (LVNMA), operated in the network central server and within individual sub-networks respectively, with cooperation and coordination between them as necessary. The proposed algorithm is assessed against the centralized approaches through a set of numerical simulation experiments for a range of network scenarios. The results show that the proposed hierarchical approach can be about 5-20 times faster for VN mapping tasks than conventional centralized approaches, with acceptable communication overhead between GVNMA and LVNMA for all examined networks, while performing almost as well as the centralized solutions.

Coral reefs represent one of the roughest structures in the marine environment. This roughness is a significant component of the high degree of habitat complexity associated with reefs. Various studies have investigated reef rugosity at discrete spatial scales, ranging from kilometers down to millimeters, and the associated biological (e.g. organism assemblages and recruitment) and physical (e.g. circulation and mass transfer) impacts. In this study, we devised a new technique for quantifying rugosity over a continuum of fine spatial scales from 200 μm to 1 cm. To achieve this high spatial resolution, a digital approach was developed, based on images collected with a functional magnetic resonance imaging (fMRI) system. Each fMRI image represents a 200 μm thick slice through a coral. Consecutive image slices were acquired as the specimen was translated through the fMRI system. These images were processed to create a three dimensional model representing the external surface of a coral. Analogous to the commonly used chain method for quantifying rugosity, digital "chains" with link sizes ranging from 200 μm to 1 cm were draped over the coral model. Here, we present results pertaining to the scalability of rugosity over these small spatial scales for three different coral species exhibiting different morphologies.
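
The chain method reduces to a ratio: the length of the chain draped over the surface divided by the straight-line distance it spans. The Python sketch below applies that ratio to a one-dimensional height profile and varies the scale by keeping only every `stride`-th sample. This is a simplified stand-in for draping a chain with a fixed link size over a 3D model, not the study's procedure; the function name, the profile and the 0.2 mm spacing in the usage lines are assumptions for illustration.

```python
from math import hypot
from typing import Sequence


def rugosity(heights: Sequence[float], spacing: float, stride: int = 1) -> float:
    """Contour length divided by planar length for a 1-D height profile.

    `spacing` is the horizontal distance between consecutive samples and
    `stride` coarsens the profile to mimic a longer chain link.
    """
    sampled = list(heights[::stride])
    if len(sampled) < 2:
        raise ValueError("need at least two samples at this stride")
    step = spacing * stride
    contour = sum(hypot(step, b - a) for a, b in zip(sampled, sampled[1:]))
    planar = step * (len(sampled) - 1)
    return contour / planar


# Toy zigzag profile in millimetres, sampled every 0.2 mm (assumed values).
profile = [0.0, 0.2, 0.0, 0.2, 0.0]
fine = rugosity(profile, spacing=0.2, stride=1)
coarse = rugosity(profile, spacing=0.2, stride=2)
```

On this toy profile the fine-scale value is the square root of 2, while the coarse-scale value is exactly 1 because the longer links bridge the ridges. That scale dependence is the quantity a multi-scale rugosity study tracks.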

The growing complexity of scientific data poses serious challenges for effective visualization. Data sets, e.g., catalogs of objects detected in sky surveys, can have a very high dimensionality, ~ 100 - 1000. Visualizing such hyper-dimensional data parameter spaces is essentially impossible, but there are ways of visualizing up to ~ 10 dimensions in a pseudo-3D display. We have been experimenting with the emerging technologies of immersive virtual reality (VR) as a platform for scientific, interactive, collaborative data visualization. Our initial experiments used the virtual world of Second Life, and more recently VR worlds based on its open source code, OpenSimulator. There we can visualize up to ~ 100,000 data points in ~ 7 - 8 dimensions (3 spatial and others encoded as shapes, colors, sizes, etc.), in an immersive virtual space where scientists can interact with their data and with each other. We are now developing a more scalable visualization environment using the popular (practically an emerging standard) Unity 3D Game Engine, coded using C#, JavaScript, and the Unity Scripting Language. This visualization tool can be used through a standard web browser, or a standalone browser of its own. Rather than merely plotting data points, the application creates interactive three-dimensional objects of various shapes, colors, and sizes, and of course the XYZ positions, encoding various dimensions of the parameter space, that can be associated interactively. Multiple users can navigate through this data space simultaneously, either with their own, independent vantage points, or with a shared view. At this stage ~ 100,000 data points can be easily visualized within seconds on a simple laptop. The displayed data points can contain linked information; e.g., upon clicking on a data point, a webpage with additional information can be rendered within the 3D world. A range of functionalities has already been deployed, and more are being added. We expect to make this

NASA Snowflake Video Imagers (SVIs) enable snowflake visualization at diverse field sites. The natural variability of frozen precipitation is a complicating factor for remote sensing retrievals in high latitude regions. Particle classification is important for understanding snow/ice physics, remote sensing polarimetry, bulk radiative properties, surface emissivity, and ultimately, precipitation rates and accumulations. Yet intermittent storms, low temperatures, high winds, remote locations and complex terrain can impede us from observing falling snow in situ. SVI hardware and software have some special features. The standard camera and optics yield 8-bit gray-scale images with a resolution of 0.05 x 0.1 mm, at 60 frames per second. Gray-scale images are highly desirable because they display contrast that aids particle classification. Black and white (1-bit) systems display no contrast, so there is less information to recognize particle types, which is particularly burdensome for aggregates. Data are analyzed at one-minute intervals using NASA's Precipitation Link Software that produces (a) Particle Catalogs and (b) Particle Size Distributions (PSDs). SVIs can operate nearly continuously for long periods (e.g., an entire winter season), so natural variability can be documented. We summarize results from field studies this past winter and review some recent SVI enhancements. During the winter of 2009-2010, SVIs were deployed at two sites. One SVI supported weather observations during the 2010 Winter Olympics and Paralympics. It was located close to the summit (Roundhouse) of Whistler Mountain, near the town of Whistler, British Columbia, Canada. In addition, two SVIs were located at the King City Weather Radar Station (WKR) near Toronto, Ontario, Canada. Access to the SVI on Whistler Mountain was prohibited during the Olympics due to security concerns, so to meet the schedule for daily data products, we operated the SVI by remote control. We also upgraded the

Modern imaging and simulation techniques have enhanced system-level understanding of neural function. In this article, we present an application of interactive visualization to understanding neuronal dynamics causing locomotion of a single hip joint, based on pattern generator output of the spinal cord. Our earlier work visualized cell-level responses of multiple neuronal populations. However, the spatial relationships were abstract, making communication with colleagues difficult. We propose two approaches to overcome this: (1) building a 3D anatomical model of the spinal cord with neurons distributed inside, animated by the simulation and (2) adding limb movements predicted by neuronal activity. The new system was tested using a cat walking central pattern generator driving a pair of opposed spinal motoneuron pools. Output of opposing motoneuron pools was combined into a single metric, called "Net Neural Drive", which generated angular limb movement in proportion to its magnitude. Net neural drive constitutes a new description of limb movement control. The combination of spatial and temporal information in the visualizations elegantly conveys the neural activity of the output elements (motoneurons), as well as the resulting movement. The new system encompasses five biological levels of organization from ion channels to observed behavior. The system is easily scalable, and provides an efficient interactive platform for rapid hypothesis testing.

We propose a multivariate volume visualization framework that tightly couples dynamic projections with a high-dimensional transfer function design for interactive volume visualization. We assume that the complex, high-dimensional data in the attribute space can be well-represented through a collection of low-dimensional linear subspaces, and embed the data points in a variety of 2D views created as projections onto these subspaces. Through dynamic projections, we present animated transitions between different views to help the user navigate and explore the attribute space for effective transfer function design. Our framework not only provides a more intuitive understanding of the attribute space but also allows the design of the transfer function under multiple dynamic views, which is more flexible than being restricted to a single static view of the data. For large volumetric datasets, we maintain interactivity during the transfer function design via intelligent sampling and scalable clustering. As a result, using examples in combustion and climate simulations, we demonstrate how our framework can be used to visualize interesting structures in the volumetric space.

Identification of sets of objects with shared features is a common operation in all disciplines. Analysis of intersections among multiple sets is fundamental for in-depth understanding of their complex relationships. However, so far no method has been developed to assess statistical significance of intersections among three or more sets. Moreover, the state-of-the-art approaches for visualization of multi-set intersections are not scalable. Here, we first developed a theoretical framework for computing the statistical distributions of multi-set intersections based upon combinatorial theory, and then accordingly designed a procedure to efficiently calculate the exact probabilities of multi-set intersections. We further developed multiple efficient and scalable techniques to visualize multi-set intersections and the corresponding intersection statistics. We implemented both the theoretical framework and the visualization techniques in a unified R software package, SuperExactTest. We demonstrated the utility of SuperExactTest through an intensive simulation study and a comprehensive analysis of seven independently curated cancer gene sets as well as six disease or trait associated gene sets identified by genome-wide association studies. We expect SuperExactTest developed by this study will have a broad range of applications in scientific data analysis in many disciplines. PMID:26603754
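As a rough illustration of what an exact multi-set intersection probability looks like, the following Python sketch computes the distribution of the intersection size under one simple null model. The null model is an assumption made here for illustration: every set is an independent, uniformly random subset of fixed size drawn from a common population of n items. Under that assumption the running intersection can be updated one set at a time with a hypergeometric step. The function names are hypothetical, and this is not the SuperExactTest implementation.

```python
from math import comb


def hypergeom_pmf(k, n, marked, draws):
    """P(exactly k marked items) when drawing `draws` of n items without replacement."""
    if k < max(0, marked + draws - n) or k > min(marked, draws):
        return 0.0
    return comb(marked, k) * comb(n - marked, draws - k) / comb(n, draws)


def intersection_distribution(n, sizes):
    """Return dist where dist[j] = P(|A1 ∩ ... ∩ Ak| = j).

    Assumes (for illustration only) that each set is an independent,
    uniformly random subset of the given size from a population of n items.
    """
    # Start with the first set alone: its "intersection" is the set itself.
    dist = [0.0] * (sizes[0] + 1)
    dist[sizes[0]] = 1.0
    for m in sizes[1:]:
        # The new intersection cannot exceed the old maximum or the new set size.
        new = [0.0] * (min(len(dist) - 1, m) + 1)
        for j, p in enumerate(dist):
            if p == 0.0:
                continue
            # Given j items in the running intersection, the overlap with a
            # random set of size m is hypergeometric.
            for k in range(len(new)):
                new[k] += p * hypergeom_pmf(k, n, j, m)
        dist = new
    return dist


def upper_tail(n, sizes, observed):
    """P(intersection size >= observed) under the null model above."""
    return sum(intersection_distribution(n, sizes)[observed:])


if __name__ == "__main__":
    # Three random sets of 200 items each from a population of 1000;
    # how surprising is an intersection of at least 20 items?
    print(upper_tail(1000, [200, 200, 200], 20))
```

The chained update keeps the cost polynomial in the set sizes rather than enumerating subsets, which is the kind of saving an exact procedure needs in order to be practical for several sets.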

This report focuses on the visual component of verbo-visual literacy, a communications concept involving the production, transmission, and perception of verbal and visual images. Five current problem areas in verbal-visual research are introduced and discussed: (1) communication (communication models, media consumption, new media, the information…

Spelling problems arise due to problems with form discrimination and inadequate visualization. A child's sequence of visual development involves learning motor control and coordination, with vision directing and monitoring the movements; learning visual comparison of size, shape, directionality, and solidity; developing visual memory or recall;…

Mechanochemical approaches to chemical synthesis offer the promise of improved yields, new reaction pathways and greener syntheses. Scaling these syntheses is a crucial step toward realizing a commercially viable process. Although much work has been performed on laboratory-scale investigations, little has been done to move these approaches toward industrially relevant scales. Moving reactions from shaker-type mills and planetary-type mills to scalable solutions can present a challenge. We have investigated scalability through discrete element models, thermal monitoring and reactor design. We have found that impact forces and macroscopic mixing are important factors in implementing a truly scalable process. These observations have allowed us to scale reactions from a few grams to several hundred grams, and we have successfully implemented scalable solutions for the mechanocatalytic conversion of cellulose to value-added compounds and the synthesis of edge-functionalized graphene. PMID:25407922

In this paper, we present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with focus on scalability issues, and demonstrate the parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication pattern for sparse Gaussian elimination, which makes it more scalable on distributed memory machines. Based on this a priori knowledge, we designed highly parallel and scalable algorithms for both LU decomposition and triangular solve and we show that they are suitable for large-scale distributed memory machines.
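The idea of fixing the pivot order before elimination can be illustrated with a small dense Python sketch. It is a toy under stated assumptions, not the solver's algorithm: the matrix is assumed to have been permuted beforehand so that its diagonal entries are acceptable pivots, no rows are exchanged during the factorization, and a pivot that turns out to be tiny is replaced by a small threshold (the `tiny` parameter and both function names are hypothetical). Because no exchanges happen at run time, the sequence of operations depends only on the matrix layout, which is what allows data structures to be set up in advance.

```python
def lu_static(a, tiny=1e-8):
    """LU factorization of a dense square matrix with no row exchanges.

    Returns one matrix holding L (below the diagonal, unit diagonal implied)
    and U (on and above the diagonal). Pivots smaller than `tiny` in
    magnitude are replaced by +/- tiny, an illustrative safeguard.
    """
    n = len(a)
    lu = [row[:] for row in a]  # work on a copy
    for k in range(n):
        if abs(lu[k][k]) < tiny:
            lu[k][k] = tiny if lu[k][k] >= 0 else -tiny
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]          # multiplier, stored in L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def lu_solve(lu, b):
    """Solve A x = b given the combined L/U matrix from lu_static."""
    n = len(lu)
    y = b[:]
    for i in range(n):                    # forward substitution with L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):          # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


if __name__ == "__main__":
    a = [[4.0, 3.0], [6.0, 3.0]]
    print(lu_solve(lu_static(a), [10.0, 12.0]))
```

When a pivot has been perturbed, the computed solution is only approximate, so a production solver would typically follow the triangular solves with some corrective step; that part is omitted here.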

This paper describes scalability issues of evolutionary-driven automatic synthesis of electronic circuits. The article begins by reviewing the concepts of circuit evolution and discussing the limitations of this technique when trying to achieve more complex systems.

Lilith is a general purpose framework, written in Java, that provides a highly scalable distribution of user code across a heterogeneous computing platform. By creation of suitable user code, the Lilith framework can be used for tool development. The scalable performance provided by Lilith is crucial to the development of effective tools for large distributed systems. Furthermore, since Lilith handles the details of code distribution and communication, the user code needs to focus primarily on the tool functionality, which greatly decreases the time required for tool development. In this paper, the authors concentrate on the use of the Lilith framework to develop scalable tools. The authors review the functionality of Lilith and introduce a typical tool capitalizing on the features of the framework. They present new Objects directly involved with tool creation. They explain details of development and illustrate with an example. They present timing results demonstrating scalability.

The feasibility of software-defined optical networking (SDON) for a practical application critically depends on the scalability of centralized control performance. In this paper, highly scalable routing and wavelength assignment (RWA) algorithms are investigated on an OpenFlow-based SDON testbed for proof-of-concept demonstration. Efficient RWA algorithms are proposed to achieve high performance in terms of network capacity at reduced computation cost, which is a significant attribute in a scalable centralized-control SDON. The proposed heuristic RWA algorithms differ in the order in which requests are processed and in the procedures for routing table updates. Combined with a shortest-path-based routing algorithm, a hottest-request-first processing policy that considers demand intensity and end-to-end distance information offers both the highest network throughput and acceptable computation scalability. We further investigate the trade-off between network throughput and computation complexity in the routing table update procedure through a simulation study. PMID:26480397
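The effect of request ordering on a shortest-path RWA heuristic can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the record does not give the paper's ordering rule, so requests are sorted by demand intensity (highest first) and then by hop distance (longest first); routing is fixed shortest-path by hop count; wavelengths are assigned first-fit under a wavelength-continuity constraint; and each request needs a single wavelength. The function names are hypothetical.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first search; returns the node list of a minimum-hop path or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    if dst not in prev:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def hottest_first_rwa(adj, requests, num_wavelengths):
    """Serve (src, dst, demand) requests, hottest first.

    Returns a list of (src, dst, path, wavelength) in processing order;
    path and wavelength are None for blocked requests.
    """
    routed = [(s, d, dem, shortest_path(adj, s, d)) for s, d, dem in requests]
    # Assumed ordering: higher demand first, then longer paths first.
    routed.sort(key=lambda r: (-r[2], -len(r[3] or [])))
    in_use = {}  # undirected link -> set of wavelengths already taken
    result = []
    for s, d, _, path in routed:
        chosen = None
        if path is not None:
            links = [frozenset(e) for e in zip(path, path[1:])]
            for w in range(num_wavelengths):  # first-fit search
                if all(w not in in_use.get(link, ()) for link in links):
                    for link in links:
                        in_use.setdefault(link, set()).add(w)
                    chosen = w
                    break
        result.append((s, d, path if chosen is not None else None, chosen))
    return result


if __name__ == "__main__":
    adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
    reqs = [("A", "B", 1), ("A", "C", 5)]
    for row in hottest_first_rwa(adj, reqs, num_wavelengths=1):
        print(row)
```

With a single wavelength on the three-node line above, the high-demand A-to-C request is served first and the A-to-B request is blocked; processing in arrival order would instead admit the low-demand request and block the high-demand one.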

In this paper, we introduce layered low-density generator matrix (Layered-LDGM) codes for super high definition (SHD) scalable video systems. The layered-LDGM codes maintain the correspondence relationship of each layer from the encoder side to the decoder side. This resulting structure supports partial decoding. Furthermore, the proposed layered-LDGM codes create highly efficient forward error correcting (FEC) data by considering the relationship between each scalable component. Therefore, the proposed layered-LDGM codes raise the probability of restoring the important components. Simulations show that the proposed layered-LDGM codes offer better error resiliency than the existing method which creates FEC data for each scalable component independently. The proposed layered-LDGM codes support partial decoding and raise the probability of restoring the base component. These characteristics are very suitable for scalable video coding systems.

MACSio is a Multi-purpose, Application-Centric, Scalable I/O proxy application. It is designed to support a number of goals with respect to parallel I/O performance testing and benchmarking including the ability to test and compare various I/O libraries and I/O paradigms, to predict scalable performance of real applications and to help identify where improvements in I/O performance can be made within the HPC I/O software stack.

As clusters rapidly grow in size, transferring files between nodes can no longer be handled by the traditional transfer utilities due to their inherent lack of scalability. In this paper, we describe a new file transfer utility called XGet, which was designed to address the scalability problem of standard tools. We compared XGet against four transfer tools (BitTorrent, Rsync, TFTP, and Udpcast), and our results show that XGet's performance is superior to these utilities in many cases.

An automated visual examination apparatus for measuring visual sensitivity and mapping blind spot location includes a projection system for displaying to a patient a series of visual stimuli. A response switch enables the patient to indicate a reaction to the stimuli, and a recording system is responsive to both the visual stimuli per se and the patient's response. The recording system thereby provides a correlated permanent record of both stimuli and response from which a substantive and readily apparent visual evaluation can be made.

The ever-increasing size and complexity of geophysical and other scientific datasets have forced developers to turn to more powerful alternatives for visualizing results of computations and experiments. These alternatives need to be faster, scalable, more efficient, and able to run on large machines. At the same time, advances in scripting languages and visualization libraries have significantly decreased the development time of smaller, desktop visualization tools. Ideally, programmers would be able to develop visualization tools in a high-level, local, scripted environment and then automatically convert their programs into compiled, remote visualization tools for integration into larger computation environments. The Web Automation and Translation Toolkit (WATT) [1] converts a Tcl script for the Visualization Toolkit (VTK) [2] into a standards-compliant web service. We will demonstrate the use of WATT for the automated conversion of a desktop visualization application (written in Tcl for VTK) into a remote visualization service of interest to geoscientists. The resulting service will allow real-time access to a large dataset through the Internet, and will be easily integrated into the existing architecture of the Virtual Laboratory for Earth and Planetary Materials (VLab) [3]. [1] Jensen, P.A., Yuen, D.A., Erlebacher, G., Bollig, E.F., Kigelman, D.G., Shukh, E.A., Automated Generation of Web Services for Visualization Toolkits, Eos Trans. AGU, 86(52), Fall Meet. Suppl., Abstract IN42A-06, 2005. [2] The Visualization Toolkit, http://www.vtk.org [3] The Virtual Laboratory for Earth and Planetary Materials, http://vlab.msi.umn.edu

We describe ASK-GraphView, a node-link-based graph visualization system that allows clustering and interactive navigation of large graphs, ranging in size up to 16 million edges. The system uses a scalable architecture and a series of increasingly sophisticated clustering algorithms to construct a hierarchy on an arbitrary, weighted undirected input graph. By lowering the interactivity requirements we can scale to substantially bigger graphs. The user is allowed to navigate this hierarchy in a top down manner by interactively expanding individual clusters. ASK-GraphView also provides facilities for filtering and coloring, annotation and cluster labeling. PMID:17080786

In order to enhance the perception of indoor and unfamiliar environments for the blind and visually-impaired, we introduce the PERCEPT system that supports a number of unique features such as: a) Low deployment and maintenance cost; b) Scalability, i.e. we can deploy the system in very large buildings; c) An on-demand system that does not overwhelm the user, as it offers small amounts of information on demand; and d) Portability and ease-of-use, i.e., the custom handheld device carried by the user is compact and instructions are received audibly. PMID:22254445

The development and deployment of data processing systems to process Earth Observing System (EOS) data have proven to be costly and prone to technical and schedule risk. Integration of science algorithms into a robust operational system has been difficult. The core processing system, based on commercial tools, has demonstrated limitations at the rates needed to produce the several terabytes per day for EOS, primarily due to job management overhead. This has motivated an evolution in the EOS Data Information System toward a more distributed one incorporating Science Investigator-led Processing Systems (SIPS). As part of this evolution, the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC) has developed a simplified processing system to accommodate the increased load expected with the advent of reprocessing and launch of a second satellite. This system, the Simple, Scalable, Script-based Science Processor (S4P), may also serve as a resource for future SIPS. The current EOSDIS Core System was designed to be general, resulting in a large, complex mix of commercial and custom software. In contrast, many simpler systems, such as the EROS Data Center AVHRR IKM system, rely on a simple directory structure to drive processing, with directories representing different stages of production. The system passes input data to a directory, and the output data is placed in a "downstream" directory. The GES DAAC's Simple Scalable Script-based Science Processing System is based on the latter concept, but with modifications to allow varied science algorithms and improve portability. It uses a factory assembly-line paradigm: when work orders arrive at a station, an executable is run, and output work orders are sent to downstream stations. The stations are implemented as UNIX directories, while work orders are simple ASCII files. The core S4P infrastructure consists of a Perl program called stationmaster, which detects newly arrived work orders and forks a job to run the
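The station and work-order paradigm described in this record can be sketched in Python. This is a hedged illustration of the directory-driven idea only, not the Perl stationmaster: the `*.wo` file naming, the station names, the `process_station` function, and the convention of passing the work order on standard input are all assumptions made for the example, and the job is run synchronously rather than forked.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def process_station(station, downstream, command):
    """Run `command` once for each work order waiting in the `station` directory.

    The work order's text is fed to the command on stdin; whatever the command
    prints becomes the output work order placed in the `downstream` directory.
    """
    station = Path(station)
    downstream = Path(downstream)
    downstream.mkdir(parents=True, exist_ok=True)
    produced = []
    for order in sorted(station.glob("*.wo")):  # hypothetical naming scheme
        result = subprocess.run(
            command,
            input=order.read_text(),
            capture_output=True,
            text=True,
            check=True,  # a failing executable raises instead of passing work on
        )
        out_order = downstream / order.name
        out_order.write_text(result.stdout)
        order.unlink()  # the order has left this station
        produced.append(out_order)
    return produced


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        station = Path(root, "calibrate")    # hypothetical station names
        downstream = Path(root, "archive")
        station.mkdir()
        (station / "granule_001.wo").write_text("input=granule_001\n")
        # Stand-in executable: upper-cases the work order text.
        upper = [sys.executable, "-c",
                 "import sys; sys.stdout.write(sys.stdin.read().upper())"]
        for path in process_station(station, downstream, upper):
            print(path.name, "->", path.read_text().strip())
```

Chaining several such directories, each polled by its own loop, gives the assembly line: the file system itself serves as the job queue, so no separate scheduler state has to be maintained.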

Nanomaterials and nanotechnologies have attracted a great deal of attention in recent decades due to their novel physical properties, such as high aspect ratio, surface morphology, and impurities, which lead to unique chemical, optical and electronic properties. Awareness of the importance of nanomaterials has motivated researchers to develop nanomaterial growth techniques to further control nanostructure properties, such as size and surface morphology, that may alter their fundamental behavior. Carbon nanotubes (CNTs) are among the most promising materials for future applications because of their rigidity, strength, elasticity and electric conductivity. Despite the excellent properties established by abundant research, introducing them into the macroscopic world for practical applications remains a big challenge. This thesis first gives a brief overview of CNTs; it then covers the mechanical and oil absorption properties of macro-scale CNT assemblies, followed by CNT energy storage applications and finally fundamental studies of defect-introduced graphene systems. Chapter Two focuses on helically coiled carbon nanotube (HCNT) foams in compression. Similarly to other foams, HCNT foams exhibit preconditioning effects in response to cyclic loading; however, their fundamental deformation mechanisms are unique. Bulk HCNT foams exhibit super-compressibility and recover more than 90% of large compressive strains (up to 80%). When subjected to striker impacts, HCNT foams mitigate impact stresses more effectively compared to other CNT foams comprised of non-helical CNTs (~50% improvement). The unique mechanical properties we revealed demonstrate that the HCNT foams are ideally suited for applications in packaging, impact protection, and vibration mitigation. The third chapter describes a simple method for the scalable synthesis of three-dimensional, elastic, and recyclable multi-walled carbon nanotube (MWCNT) based light weight bucky-aerogels (BAGs) that are

HOW TO OBTAIN CONTACT HOURS BY READING THIS ISSUE Instructions: 1.2 contact hours will be awarded by Villanova University College of Nursing upon successful completion of this activity. A contact hour is a unit of measurement that denotes 60 minutes of an organized learning activity. This is a learner-based activity. Villanova University College of Nursing does not require submission of your answers to the quiz. A contact hour certificate will be awarded after you register, pay the registration fee, and complete the evaluation form online at http://goo.gl/gMfXaf. In order to obtain contact hours you must: 1. Read the article, "Improving Rural Geriatric Care Through Education: A Scalable, Collaborative Project," found on pages 306-313, carefully noting any tables and other illustrative materials that are included to enhance your knowledge and understanding of the content. Be sure to keep track of the amount of time (number of minutes) you spend reading the article and completing the quiz. 2. Read and answer each question on the quiz. After completing all of the questions, compare your answers to those provided within this issue. If you have incorrect answers, return to the article for further study. 3. Go to the Villanova website to register for contact hour credit. You will be asked to provide your name, contact information, and a VISA, MasterCard, or Discover card number for payment of the $20.00 fee. Once you complete the online evaluation, a certificate will be automatically generated. This activity is valid for continuing education credit until June 30, 2019. CONTACT HOURS This activity is co-provided by Villanova University College of Nursing and SLACK Incorporated. Villanova University College of Nursing is accredited as a provider of continuing nursing education by the American Nurses Credentialing Center's Commission on Accreditation. OBJECTIVES Describe the unique nursing challenges that occur in caring for older adults in rural areas. Discuss the