The bucket sort is a very efficient algorithm for sorting integers when the range is known in advance. The major portion of this algorithm, however, is inherently sequential. As a working example for this study, we have selected the variant of the bucket sort algorithm described in the NAS Parallel Benchmark suite of the NASA Ames Research Center [1]. In general, how one implements the bucket sort algorithm on a distributed memory system depends on a variety of factors, including the communications bandwidth and latency, the type of network topology, the amount of memory per processor, the type of processor, and the size of the keys, i.e., the number of buckets. The VPP500 is a vector-parallel system with a distributed memory architecture, in which each processing element (PE) has a vector architecture. Attempting both efficient vectorization and parallelization of the bucket sort in a distributed memory system provides a backdrop for novel algorithm development.
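
For orientation, a minimal serial sketch of a bucket sort over integer keys with a known range is given here. It is an illustration only, not the NAS benchmark code or the VPP500 implementation; the function name `bucket_sort` and the parameter `num_buckets` are assumptions introduced for this example, with one bucket per possible key value.

```python
def bucket_sort(keys, num_buckets):
    """Sort integers in range(num_buckets), using one bucket per key value.

    Hypothetical helper for illustration; assumes 0 <= key < num_buckets.
    """
    # Pass 1: histogram of key values. Each update depends on the key read,
    # so repeated keys collide on the same counter.
    counts = [0] * num_buckets
    for k in keys:
        counts[k] += 1

    # Pass 2: exclusive prefix sum. starts[v] is the output index of the
    # first key equal to v. Each entry depends on the previous one.
    starts = [0] * num_buckets
    total = 0
    for v in range(num_buckets):
        starts[v] = total
        total += counts[v]

    # Pass 3: scatter every key to its final position.
    out = [0] * len(keys)
    for k in keys:
        out[starts[k]] = k
        starts[k] += 1
    return out


if __name__ == "__main__":
    print(bucket_sort([3, 1, 4, 1, 5, 2, 0, 5], num_buckets=6))
```

The loop-carried dependences in the histogram and prefix-sum passes are one way to see why a direct vector or parallel version is not straightforward.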

Development of a complex metal-casting computer model requires information about how varying the problem parameters affects the results (metal flow and solidification). For example, we would like to know how the last point to solidify or the cooling rate at a given location changes when the physical properties of the metal, boundary conditions, or mold geometry are changed. As a preliminary step towards a complete sensitivity analysis of a three-dimensional casting simulation, we examine a one-dimensional version of a metal-alloy phase-change conductive-heat-transfer model by means of Automatic Differentiation (AD). The non-linear 'Jacobian-free' solution method is a combination of an outer Newton-based iteration and an inner conjugate gradient-like (Krylov) iteration. The implicit solution algorithm has enthalpy as the dependent variable, from which temperatures are determined. We examine the sensitivities, with respect to the problem parameters, of the difference between an exact analytical solution for the final temperature and the solution produced by this algorithm. In all there are 17 parameters (12 physical constants such as liquid density, heat capacity, and thermal conductivity, 2 initial and boundary condition parameters, the final solution time, and 2 algorithm tolerances). We apply AD in the forward and reverse modes and verify the sensitivities by means of finite differences. By forward and reverse, we mean the direction, through the solution and in time and space, in which the derivative values are obtained. In general, the finite-difference method requires at least N+1 computer runs to determine sensitivities for N problem parameters. The forward mode is typically more efficient for determining the sensitivity of many responses to one or a few parameters, while the reverse mode is better suited for sensitivities of one or a few responses with respect to many parameters. The sensitivities produced by all the methods agreed to at least three significant figures.
The forward and reverse AD code run times were similar and were approximately 34% faster than those of the finite-difference sensitivities. Real problems in three dimensions will certainly have many more parameters describing mold geometry and pouring conditions. If the trend seen here holds true, reverse mode AD is favored since the computational time increases only slightly for additional parameters.
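
The comparison of forward-mode AD against finite differences can be illustrated on a toy problem. The sketch below is not the casting model or the AD tool used in the study: it assumes a hypothetical lumped cooling law with two parameters (`h`, a heat-transfer coefficient, and `c`, a heat capacity), advances it with explicit time steps, and propagates derivatives with a small hand-written dual-number class. All names and default values are assumptions made for the example.

```python
class Dual:
    """A value and its derivative with respect to one chosen input."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (zero derivative).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / (o.val * o.val))

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def final_temperature(h, c, t0=1000.0, t_env=300.0, dt=0.01, steps=100):
    """Toy model (assumed): explicit steps of dT/dt = -(h / c) * (T - t_env)."""
    temp = t0
    for _ in range(steps):
        temp = temp - dt * (h / c) * (temp - t_env)
    return temp


def ad_sensitivities(h, c):
    # Forward mode: one derivative-carrying run per parameter.
    d_h = final_temperature(Dual(h, 1.0), c).dot
    d_c = final_temperature(h, Dual(c, 1.0)).dot
    return d_h, d_c


def fd_sensitivities(h, c, rel_step=1e-6):
    # One-sided finite differences: N + 1 = 3 runs for N = 2 parameters.
    base = final_temperature(h, c)
    d_h = (final_temperature(h * (1 + rel_step), c) - base) / (h * rel_step)
    d_c = (final_temperature(h, c * (1 + rel_step)) - base) / (c * rel_step)
    return d_h, d_c


if __name__ == "__main__":
    print("AD:", ad_sensitivities(2.0, 0.5))
    print("FD:", fd_sensitivities(2.0, 0.5))
```

In this sketch the forward-mode cost still grows with the number of parameters, one pass per parameter; reverse mode, which is not shown here, is the variant whose cost is expected to depend only weakly on the parameter count.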