~ Computational modeling & simulation doesn't have to be boring

Monthly Archives: May 2015

I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.

― Abraham Maslow

High performance computing is a big deal these days and may become a bigger deal very soon. It has become a new battleground for national supremacy. The United States will very likely soon commit to a new program for achieving progress in computing. By all accounts this program will focus primarily on computing hardware, and then on the system software that directly connects to that hardware. The goal will be the creation of a new generation of supercomputers that attempt to continue the growth of computing power into the next decade, and provide a path to “exascale”. I think it is past time to ask, “Do we have the right priorities?” and “Is this goal important and worthy of achieving?”

Lack of direction, not lack of time, is the problem. We all have twenty-four hour days.

― Zig Ziglar

I’ll return to these two questions at the end, but first I’d like to touch on an essential concept in high performance computing: scaling. Scaling is a big deal; it is how success in computing is measured and, in a nutshell, it describes the efficiency of solving problems, particularly with respect to changing problem size or computing resources. In scientific computing one of the primary assumptions is that bigger, faster computers yield better, more accurate results with greater relevance to the real world. The success of computing depends on scaling, and breakthroughs in achieving it define the sort of problems that can be solved.

Nothing is less productive than to make more efficient what should not be done at all.

― Peter Drucker

There are several types of scaling with distinctly different character. Lately the dominant scaling in computing has been associated with parallel computing performance. Originally the focus was on strong scaling, which is defined by the ability of greater computing resources to solve a problem of fixed size faster. In other words, perfect strong scaling would mean solving a problem twice as fast with two CPUs as with one CPU.

Lately this has been replaced by weak scaling, where the problem size is adjusted along with the resources: the goal is to solve a problem twice as big with two CPUs just as fast as the original problem is solved with one CPU. These scaling results depend on both the software implementation and the quality of the hardware. They are the stock-in-trade of success in the currently envisioned national high-performance computing program. They are also both relatively unimportant and poor measures of the power of computing to solve scientific problems.

Two things are infinite: the universe and human stupidity; and I’m not sure about the universe.

― Albert Einstein

Algorithmic scaling is another form of scaling, and it is massive in its power. We are failing to measure, invest in, and utilize it as we move forward in computing nationally. The gains to be made through algorithmic scaling will almost certainly lay waste to anything that computing hardware will deliver. It isn’t that hardware investments aren’t necessary; they are simply grossly over-emphasized to a harmful degree.

The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom.

― Isaac Asimov

The archetype of algorithmic scaling is sorting a list, an amazingly common and important function for a computer program. Common sorting algorithms include insertion sort and quicksort, and each comes with a scaling for the memory required and the number of operations needed to run to completion. For general comparison-based sorting the best that can be done is log-linear scaling: for a list N items long, it takes order N log N operations, meaning that for a sufficiently large list the cost is proportional to some constant times N log N. Quicksort achieves order N log N on average and carries a smaller constant than other high-grade algorithms. Simpler methods like insertion sort scale like N^2, but their small constants can make them faster for short lists. If one chooses very poorly the sorting can scale like N^2 even on large lists. There are also aspects of an algorithm and its scaling that speak to the memory-storage needed and the complexity of the algorithm’s implementation. These themes carry over to a discussion of more esoteric computational science algorithms next.
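The gap between N^2 and N log N sorting is easy to see by counting comparisons directly. Below is a minimal, illustrative sketch (the list size of 2000 is an arbitrary choice) comparing insertion sort against merge sort:

```python
import random

def insertion_sort(items):
    """Sort a copy of items; return (sorted list, comparison count). Order N^2."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

def merge_sort(items):
    """Sort a copy of items; return (sorted list, comparison count). Order N log N."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, cl = merge_sort(items[:mid])
    right, cr = merge_sort(items[mid:])
    merged, comparisons = [], cl + cr
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons

random.seed(1)
data = [random.random() for _ in range(2000)]
_, c_ins = insertion_sort(data)  # roughly N^2 / 4 comparisons on random data
_, c_mrg = merge_sort(data)      # roughly N log2 N comparisons
print(c_ins, c_mrg)              # the N^2 method does far more work
```

At N = 2000 the quadratic method already does tens of times more comparisons; at a billion items the difference would be astronomical.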

In scientific computing two categories of algorithm loom large over substantial swaths of the field: numerical linear algebra and discretization methods. Both of these categories have important scaling relations associated with their use that have a huge impact on the efficiency of solution. We have not been paying much attention at all to the efficiencies possible from these areas. Improvements in both could yield gains in performance that would put any new computer to shame.

For numerical linear algebra the issue is the cost of solving the matrix problem with respect to the number of equations. In the simplest view of the problem one uses a naïve method like Gaussian elimination (or LU decomposition), which scales like N^3, where N is the number of equations to be solved. This method is designed to solve a dense matrix, where few entries are zero. In scientific computing the matrices are typically “sparse,” meaning most entries are zero. An algorithm designed specifically for sparse matrices lowers the scaling to N^2. These methods both produce “exact” solutions to the system (modulo poorly conditioned problems).

If an approximate solution is desired or useful, one can use lower-cost iterative methods. The simplest methods, like the Jacobi or Gauss-Seidel iterations, also scale at N^2. Modern iterative methods are based on Krylov subspaces, with the conjugate gradient method being the classical example. Run to an exact solution these methods also scale as N^2, but used as iterative methods for approximate solutions the scaling lowers to N^(3/2). One can do even better with multigrid methods, lowering the scaling to N.

Each method in this sequence has a constant in front of the scaling, and the constant gets larger as the scaling gets better. Nonetheless it is easy to see that if you’re solving for a billion unknowns the difference between N and N^2 is immense: a billion operations versus a billion billion. The difference in constants between the two methods is only several thousand. In the long run multigrid wins. One might even do better than multigrid, with current research in data analysis producing sublinear algorithms for large-scale data analysis. Another issue is the difficulty of making multigrid work in parallel, as the method is inherently NOT parallel in important parts. Multigrid performance is also not robust, and Krylov subspace methods still dominate actual use.
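The N^2-type cost of the simplest iterative methods can be seen in a small experiment. A minimal pure-Python sketch, using Jacobi iteration on a one-dimensional Poisson problem (the grid sizes and tolerance are arbitrary illustrative choices): when the mesh is refined by 2x, the iteration count grows roughly 4x.

```python
def jacobi_iterations(n, tol=1e-6):
    """Count Jacobi sweeps to solve -u'' = 1 on (0,1) with u(0)=u(1)=0,
    discretized with n interior points, to a fixed relative residual."""
    h = 1.0 / (n + 1)
    f = [1.0] * n
    u = [0.0] * n

    def residual(u):
        # Max-norm residual of the discrete equations (2u_i - u_{i-1} - u_{i+1})/h^2 = f_i
        r = 0.0
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            r = max(r, abs(f[i] - (2 * u[i] - left - right) / h**2))
        return r

    r0 = residual(u)
    iters = 0
    while residual(u) > tol * r0:
        new = u[:]
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            new[i] = 0.5 * (left + right + h**2 * f[i])  # Jacobi update
        u = new
        iters += 1
    return iters

i16 = jacobi_iterations(16)
i32 = jacobi_iterations(32)
print(i16, i32)  # iteration count roughly quadruples when the mesh is refined 2x
```

Multiplying the iteration count by the O(n) cost per sweep gives the quadratic-type total cost discussed above; multigrid's whole point is to make the iteration count independent of n.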

Learn from yesterday, live for today, hope for tomorrow. The important thing is to not stop questioning.

― Albert Einstein

Discretization can provide even greater wins. If a problem is amenable to high-order accuracy, a high-order method will unequivocally win over a low-order method. The problem is that most practical problems you can get paid to solve don’t have this property: in almost every case the solution will converge at first-order accuracy. This is the nature of the world. The knee-jerk response is that high-order methods are therefore not useful. This shows a lack of understanding of what they bring to the table and how they scale. High-order methods produce lower errors than low-order methods even when high-order convergence rates cannot be achieved.

As a simple example, take a high-order method that delivers half the error of a low-order method at the same resolution. The mesh is defined by a number N of “cells” or “elements” per dimension, and for time-dependent problems the number of time steps is usually proportional to N. Hence a one-dimensional problem requires order N^2 degrees of freedom (N cells times N steps). For equivalent accuracy the high-order method needs a mesh only half as fine, N/2 cells, and thus one-fourth of the degrees of freedom; it breaks even even if it costs four times as much per degree of freedom. In three-dimensional time-dependent problems the scaling is N^4 and the break-even point is a factor of 16 in cost. This is eminently doable. Even larger improvements in accuracy would provide an even more insurmountable advantage.
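The break-even arithmetic above is easy to check directly. A small sketch, taking the factor-of-two error reduction from the text as the working assumption and an arbitrary resolution of 100 cells per dimension for the low-order method:

```python
def degrees_of_freedom(n_cells_per_dim, spatial_dims):
    """Total space-time degrees of freedom: n^dims cells times ~n time steps."""
    return n_cells_per_dim ** spatial_dims * n_cells_per_dim

n = 100  # arbitrary resolution for the low-order method

# A high-order method with half the error at the same mesh matches the
# low-order accuracy with half the cells per dimension (first-order
# convergence: error proportional to cell size).
for dims in (1, 3):
    low = degrees_of_freedom(n, dims)
    high = degrees_of_freedom(n // 2, dims)
    print(dims, low // high)  # break-even cost ratio per degree of freedom
```

The ratios come out to 4 in one dimension plus time and 16 in three dimensions plus time, matching the break-even points in the text.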

The counter-point to these methods is their computational cost and complexity. The second issue is their fragility, which can be recast as a lack of robustness or stability in the face of real problems. Still, their performance gains are sufficient to amortize the costs, given the vast magnitude of the accuracy gains and effective scaling.

An expert is a person who has made all the mistakes that can be made in a very narrow field.

― Niels Bohr

The last issue to touch upon is the need to make algorithms robust, which is just another word for stable. Work on stability of algorithms is simply not happening these days. Part of the consequence is a lack of progress. For example one way to view the lack of ability of multigrid to dominate numerical linear algebra is its lack of robustness (stability). The same thing holds for high-order discretizations, which are typically not as robust or stable as low order ones. As a result low-order methods dominate scientific computing. For algorithms to prosper work on stability and robustness needs to be part of the recipe.

If we knew what it was we were doing, it would not be called research, would it?

― Albert Einstein

Performance is a monotonically sucking function of time. Our current approach to HPC will not help matters, and effectively ignores the ability of algorithms to make things better. So “do we have the right priorities?” and “is this goal (of computing supremacy) important and worthy of achieving?” The answers are an unqualified NO and a qualified YES. The goal of computing dominance and supremacy is certainly worth achieving, but having the fastest computer will absolutely not get us there. It is neither necessary, nor sufficient for success.

This gets to the issue of priorities directly. Our current program is so intellectually bankrupt as to be comical, and reflects starkly superficial thinking that ignores the facts staring it directly in the face, such as the evidence from commercial computing. Computing matters because of how it impacts the real world we live in. This means the applications of computing matter most of all. In the approach to computing taken today the applications are taken completely for granted, and reality is a mere afterthought.

What can be asserted without evidence can also be dismissed without evidence.

― Christopher Hitchens

Commercial CFD and mechanics codes are an increasingly big deal. Their vendors would have you believe that the only things users should concern themselves with are meshing, graphical user interfaces, and computer power. The innards of the code, with its models and methods, are supposedly in the bag, no big deal, and the quality of the solutions is assured because it’s a solved problem. Input your problem in a point-and-click manner, mesh it, run it on a big enough computer, point and click through the visuals, and you’re done.

Marketing is what you do when your product is no good.

― Edwin H. Land

Of course this is not the case, not even close. One can understand why vendors might prefer to sell their product with this mindset. The truth is that the methods in the codes are old and generally not very good by modern standards; if we had a healthy research agenda for developing improved methods, the methods in the codes would look appalling by comparison. On the other hand they are well understood and highly reliable or robust (meaning they will run to completion without undue user intervention). This doesn’t mean they are correct or accurate. The problem is that they represent a very low bar of success. The codes and the methods they use are far below what might be possible with a healthy computational science program.

Another huge issue is the target audience of users for these codes. If we go back in time we could rely upon the codes being used only by people with PhDs. Nowadays the codes are targeted at people with Bachelor’s degrees and little or no expertise or interest in numerical methods or advanced models in the context of partial differential equations. As a result, these aspects of a code’s makeup and behavior have been systematically reduced in importance to the point of being practically ignored. Because I work on these very details, I have the knowledge and evidence that these aspects are paramount in importance. All the meshing, graphics and gee-whiz interfaces can’t overcome a bad model or method in a code.

Reality is one of the possibilities I cannot afford to ignore.

― Leonard Cohen

One way to get to the truth is verification and validation (V&V). While V&V has become an important technical endeavor, it usually is applied more as a buzzword than an actual technical competence. The result is usually a set of activities that have more of the look and feel of V&V than the actual proper practice of V&V. Those marketing the codes tend to trumpet their commitment to V&V while actually espousing the cutting of V&V corners. Part of the problem is that rigorous V&V would in large part undercut many of their marketing premises.

We all die. The goal isn’t to live forever, the goal is to create something that will.

― Chuck Palahniuk

What is truly terrifying about the state of affairs today is that this attitude has gone beyond the commercial code vendors and increasingly defines the attitude at the National Labs and in academia, the places where innovation should be coming from. Money for developing new methods and models has dried up. The emphasis in computational science has shifted to parallel computing and the acquisition of massive new computing platforms.

Reality is that which, when you stop believing in it, doesn’t go away.

― Philip K. Dick

The unifying theme in all of this is the perception being floated that modeling and numerical methods are a solved area of investigation, and that we simply await a powerful enough computer to unveil the secrets of the universe. This sort of mindset is more appropriate to a cultish religion than to science. It is actually antithetical to science, and the result is a lack of real scientific progress.

Don’t try to follow trends. Create them.

― Simon Zingerman

So, what is needed?

Computational science needs to acknowledge and play by the rules of the scientific method. Increasingly, today it does not. It acts on articles of faith and follows politically correct, low-risk paths to “progress.”

We need to cease believing that all our problems will be solved by a faster computer.

Computational science should balance the benefits of models, methods, algorithms, implementation, software and hardware, instead of following the articles of faith taken today.

Embrace the risks needed for breakthroughs in all of these areas, especially models, methods and algorithms, which require creative work and generally need inspired results for progress.

Acknowledge that the impact of computational science on reality is most greatly improved by modeling improvements. Next in impact are methods and algorithms, which provide greater efficiency. Our current focus on implementation, software and hardware actually produces the least impact on reality.

Practice V&V with rigor and depth in a way that provides unambiguous evidence that calculations are trustworthy in a well-defined and supportable manner.

Acknowledge the absolute need for experimental and observational science in providing the window into reality.

Stop overselling modeling and simulation as an absolute replacement for experiments; present it instead as a guide for intuition and exploration to be used in concert with other scientific methods.

A life spent making mistakes is not only more honorable, but more useful than a life spent doing nothing.

― George Bernard Shaw

High performance computing is a hot topic these days. All sorts of promises have been made regarding its transformative potential. Computational modeling is viewed as the cure to the lack of ability to do expensive, dangerous or even illegal experiments. All sorts of benefits are supposed to rain down upon society as a driver of a faster, better and cheaper future. If we were collectively doing everything that should be done these promises might have a chance of coming true, but we’re not and they won’t, unless we start doing things differently.

The Chinese use two brush strokes to write the word ‘crisis.’ One brush stroke stands for danger; the other for opportunity. In a crisis, be aware of the danger–but recognize the opportunity.

― John F. Kennedy

So, what the hell?

Computing’s ability to deliver on these promises is at risk, ironically due to a lack of risk taking. The scientific computing community seems to have rallied around the safe path of looking toward faster computing hardware as the route to enhanced performance. High-payoff activities such as new model or algorithm development are risky and likely to fail, but the relatively small number of successful projects in these areas produce massive payoffs in performance.

Despite a strong historical track record of providing greater benefits for computational simulation than hardware, efforts to improve modeling, methods and algorithms are starved for support. This will kill the proverbial goose that lays the golden eggs. We are figuratively strangling the baby in the crib by failing to feed the core of creative value in simulation. We have prematurely declared that computational simulation is mature and ready for prime time. In the process we are stunting its growth and throwing money away on monstrous computers that feed computational power to a “petulant teen.” Instead we need to develop the field of simulation further and take some key steps toward providing society with a mature and vibrant scientific enterprise. Policy makers have defined a future where the only thing that determines computational simulation capability is the computing power of the computer it runs on.

This mindset has allowed the focus to shift almost entirely toward computing hardware. Growth in computing power is commonly used as an advertisement for the accessibility and ease of computational modeling. An increasing number of options exist for simply buying simulation capability in the form of computational codes. The user interfaces for these codes allow relatively broad access to modeling and definitively take the capability out of the hands of experts. For those selling capability this democratization is a benefit because it increases the size of the market. Describing the area as a mature, solved problem is another marketing benefit.

The question of whether this is a good thing still needs to be asked. How true are these marketing pitches?

It is relatively easy to solve problems today. Computer power allows the definition of seemingly highly detailed models and fine computational grids, as well as stunning visual representations. All of these characteristics give users the feeling of simulation quality. The rise of verification and validation should allow users to determine whether these feelings are justified; generally, V&V undermines one’s belief in how good the results are. On the other hand, people like to feel that their analysis is good, which means much of the negative evidence is discounted or even dismissed when conducting V&V. The real effect of slipshod V&V is to avoid the sort of deep feedback that the quality of results should have on the codes.
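Rigorous verification has a concrete, quantitative core: measuring the observed order of convergence and checking it against the theoretical order. A minimal sketch of the idea using finite-difference derivatives of a known function (the function sin(x), the point x = 1, and the step sizes are arbitrary illustrative choices):

```python
import math

def observed_order(error_coarse, error_fine, refinement=2.0):
    """Observed convergence order from errors on two grids related by a refinement factor."""
    return math.log(error_coarse / error_fine) / math.log(refinement)

x, h = 1.0, 0.1
exact = math.cos(x)  # exact derivative of sin(x)

def forward_error(h):
    # First-order one-sided difference
    return abs((math.sin(x + h) - math.sin(x)) / h - exact)

def central_error(h):
    # Second-order centered difference
    return abs((math.sin(x + h) - math.sin(x - h)) / (2 * h) - exact)

p_fwd = observed_order(forward_error(h), forward_error(h / 2))
p_cen = observed_order(central_error(h), central_error(h / 2))
print(round(p_fwd, 2), round(p_cen, 2))  # close to the theoretical orders 1 and 2
```

If the observed order does not match the theoretical one, something is wrong with the code, the test, or the claim; this is exactly the sort of unambiguous evidence slipshod V&V avoids producing.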

When you fail, that is when you get closer to success.

― Stephen Richards

At this juncture it’s important to talk about current codes and the models and methods contained in them. The core philosophy of code-based modeling goes all the way back to the 1960s and has not changed much since. This is a problem. In many cases the methods used in the codes to solve the models are nearly as old; they were largely perfected during the 1970s and 1980s. Little or no effort is presently being put toward advancing the solution techniques. In summary, most effort is being applied to simply implementing the existing solution techniques on the next generation of computers.

Remember the two benefits of failure. First, if you do fail, you learn what doesn’t work; and second, the failure gives you the opportunity to try a new approach.

― Roger von Oech

Almost certainly the models themselves are even more deeply ensconced and effectively permanent. No one even considers changing the governing equations being solved. Models have a couple of components: the basic governing equations, which are generally quite classical, and their closure, which is the part that slowly evolves. These equations were the product of 17th-19th century science and a philosophical mindset that would be questioned if science itself were healthy. Meanwhile our ability to resolve new length and time scales has changed monumentally, and we should be able to solve vastly nonlinear systems of equations (in practice we really can’t, at least not robustly). Is it even appropriate to be solving the same equations? Or should the nature of the equations change as a function of the characteristic scales being resolved? Closure modeling evolves more readily, but only within the philosophical confines defined by the governing equations. Again, we are woefully static, and the lack of risk taking is undermining any hope of actual progress.

Take the practice of how material properties are applied to a problem as a key example. The standard way to apply material properties is to “paint” the properties into regions containing a material. For example, if aluminum exists in the problem, a model defines its properties and its response to forces, and the aluminum is defined as being the same everywhere there is aluminum. But as the scale size gets smaller, aluminum (or any material) gets less and less homogeneous: significant differences in structure appear, typically defined by the grain structure of the material and any imperfections. The painted model systematically ignores these heterogeneous features. Usually their collective effects are incorporated in an averaged way, but the local effects of the details are ignored. Modern application questions are more and more focused on exactly the sort of unusual effects that happen due to these local defects.
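The difference between “painted” and heterogeneous properties can be illustrated in a few lines. A toy sketch, where the nominal modulus of aluminum and a 5% grain-to-grain scatter are purely illustrative numbers, not calibrated data:

```python
import random

random.seed(42)
n_cells = 1000
nominal_modulus = 69.0e9  # Pa, a typical nominal Young's modulus for aluminum

# "Painted" model: every cell containing aluminum gets the identical value.
painted = [nominal_modulus] * n_cells

# Heterogeneous model: each cell perturbed by illustrative grain-scale scatter.
scatter = 0.05  # 5% relative standard deviation (an assumption for illustration)
heterogeneous = [random.gauss(nominal_modulus, scatter * nominal_modulus)
                 for _ in range(n_cells)]

mean_het = sum(heterogeneous) / n_cells
weakest = min(heterogeneous)
# The average matches the painted value, but the local weak spots, where
# unusual behavior and failure often start, exist only in the heterogeneous field.
print(mean_het / nominal_modulus, weakest / nominal_modulus)
```

The averaged (painted) description and the heterogeneous one agree in the mean, which is exactly why the painted model works for average responses and fails for questions driven by local defects.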

Experimental observations are only experience carefully planned in advance, and designed to form a secure basis of new knowledge.

― Sir Ronald Fisher

Let’s be perfectly blunt and clear about the topic of modeling. The model we solve in simulation is the single most important and valuable aspect of a computation. A change in the model that opens new physical vistas is more valuable than any computer including one with limitless power. A computer is no better than the model it solves. This point of view is utterly lost today.

More dangerously, we are continuing to write codes for the future in the same manner today. In other words, we have had the same philosophy in computational modeling for the last 50 years or more; the same governing equations and closure philosophy are still being used. How much longer will we continue to do the same thing? I believe we should have changed a while ago. We could begin to study the impact of material and solution heterogeneity already, but the models and methods to do so are not being given any priority.

The reason is that it would be disruptive and risky. It would require changing our codes and practices significantly. It would undermine the narrative of computer power as the tonic for what ails us. It would be a messy and difficult path. It would also be consistent with the scientific method, instead of following a poorly-thought-through, intellectually empty article of faith. Because risk taking is so antithetical to today’s culture, this path has been avoided.

Our most significant opportunities will be found in times of greatest difficulty.

― Thomas S. Monson

The investments in faster computers are valuable and beneficial, but only if they are balanced with other investments. Modeling is the aspect of computation closest to reality, and it holds the greatest leverage and value. Methods for solving models and the associated algorithms are next closest and have the next highest leverage. Neither of these areas is being invested in at a healthy level. Implementing these algorithms and models is next most important; here there is a little more effort, because existing models need to work on the computers. The two areas with the highest level of effort are system software and hardware. Ironically, these two areas have the least value in terms of affecting reality. No one in a position of power seems to recognize how antithetical to progress this state of affairs is.

Sometimes it’s the mistakes that turn out to be the best parts of life

The overall quality of computational modeling depends on a lot of things, and one very big one isn’t generally acknowledged: whoever is using the code. How much does it matter? A lot, much more than almost anyone would admit, and the effect becomes greater as problem complexity grows.

The most important property of a program is whether it accomplishes the intention of its user.

― C.A.R. Hoare

The computer, the code, the computational resolution (i.e., mesh), the data, the models, and the theory all get acute and continuous attention from verification and validation. When the human element in quality is raised as an issue, people become immensely defensive. At the same time it is generally acknowledged by knowledgeable people that the impact of the user of the code (or modeler) is huge. In many cases it may be the single greatest source of uncertainty.

We don’t see things as they are, we see them as we are.

― Anaïs Nin

This isn’t a matter of simple mistakes made in the modeling process; it is associated with reasonable choices made in representing complex problems. Different modelers make different decisions about dealing with circumstances and representing all the “gray” areas. In many cases these choices live in the space where the variability in results should be. For example, boundary or initial conditions are common sources of the changes. Reality is rarely fully reproducible, and details that are generally fuzzy result in subtle changes in outcomes. In this space the user of a code can make different, but equally reasonable, choices about how to model a problem, and these can result in very large changes in results.

Despite this, the whole area of uncertainty quantification of this effect is largely missing, because it is such an uncomfortable source of variation in results. Only a few areas readily acknowledge or account for it, such as nuclear reactor safety work, the Sandia “Fracture Challenge” and a handful of other isolated cases. It is something that needs much greater attention, but only if we are courageous enough to attack the problem.
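One way to start quantifying the user effect is to treat equally reasonable modeling choices as an ensemble and look at the spread in the result. A toy sketch with a decaying quantity, where different analysts pick different, defensible values for an uncertain rate coefficient (all numbers are illustrative assumptions, not data from any real study):

```python
import math

def model_output(rate, t=5.0, y0=1.0):
    """Toy model: exponential decay y(t) = y0 * exp(-rate * t)."""
    return y0 * math.exp(-rate * t)

# Equally defensible choices different modelers might make for the rate,
# e.g. drawn from different handbooks or fits to slightly different data.
reasonable_rates = [0.20, 0.25, 0.30, 0.35]

outputs = [model_output(k) for k in reasonable_rates]
spread = max(outputs) / min(outputs)
print(spread)  # the answer varies by more than 2x across "reasonable" inputs
```

Even in this trivial model the modeler's choice moves the answer by more than a factor of two; in a complex simulation with dozens of such choices the user can easily be the largest single source of uncertainty.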

It’s funny how humans can wrap their mind around things and fit them into their version of reality.

― Rick Riordan

The capacity to acknowledge this effect and measure it is largely resisted by the community. We are supposed to live in an age where everything is automatic and the computer will magically unveil the truths of the universe to us. This is magical thinking, but it is the commonly accepted dogma of modernity. Instead, the core of value is fundamentally connected to the human element, and this truth seems to be beyond our ability to admit.

We do not need magic to transform our world. We carry all of the power we need inside ourselves already.

Physics is becoming so unbelievably complex that it is taking longer and longer to train a physicist. It is taking so long, in fact, to train a physicist to the place where he understands the nature of physical problems that he is already too old to solve them.

— Eugene Paul Wigner

The past decade has seen the rise of commercial modeling and simulation tools with seemingly great capabilities. Computer power has opened vistas of simulation to the common engineer and scientist. Advances in other related technologies like visualization have provided an increasingly “turn-key” experience to users who can do seemingly credible cutting-edge work on their laptops. These advances also carry with them some real dangers most acutely summarized as a “black box” mentality toward the entire modeling and simulation enterprise.

artificial intelligence is no match for natural stupidity

—Albert Einstein

Black box thinking causes problems because people get answers without understanding how those answers are arrived at. When problems are simple and straightforward this can work, but as soon as the problems become difficult, issues arise. The models and methods in a code can do funny things that only make sense if you know the inner workings of the code. The black box point of view usually comes with too much trust in what the code is doing. This can lead people to accept solutions that really should have been subjected to far more scrutiny.

The missing element in the black box mentality is the sense of collaboration needed to make modeling and simulation work. The best work is always collaborative, including elements of computation and modeling, but also experiments, mathematics and their practical application. This degree of multi-disciplinary work is strongly discouraged today. Ironically, the demands of cost accounting often work steadfastly to undermine the quality of work by dividing people and their efforts into tidy bins. The drive to make everything accountable discourages conducting work in the best way possible. Instead, our current system of management encourages the black box mentality.

Another force pushing black box thinking is education. Students now run codes whose interfaces are easy enough to bear some resemblance to video games. With a generation of scientists and engineers raised on video games this could be quite powerful. At the same time the details of the codes are not generally emphasized, and they tend to be viewed as black boxes. In classes, when the details of the codes are unveiled, eyes glaze over and it becomes clear that the only thing students are really interested in is getting results, not knowing how the results were arrived at.

One way this trend is being blunted is the adoption of verification and validation (V&V) in modeling and simulation. V&V encourages a distinctly multidisciplinary point of view in its execution, particularly when coupled to uncertainty quantification. Doing V&V correctly requires deep knowledge of many technical areas; this is really difficult, and engaging in the technical work necessary for good V&V is simply beyond most people’s capabilities and tolerance for effort. Those paying for modeling and simulation are, for the most part, unwilling to pay for good V&V. They would rather have V&V that is cheap and fools people into confidence.

Computers are incredibly fast, accurate, and stupid: humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination.

― Albert Einstein

Two elements are leading to this problem. No one is willing to pay for high-quality technical work in either the development or the use of simulation codes, and no one is willing to pay for the developers of the codes and their users to work together. The funding, environment and tolerance needed to support the sort of multi-disciplinary activities that produce good modeling and simulation (and by virtue of that, good V&V) are shrinking with each passing year. Developing professionals who do this sort of work well is really expensive and time-consuming. When the edict is to simply crank out calculations with a turnkey technology, the appetite for running issues to ground, which quality demands, simply doesn’t exist.

A couple of issues have really “poisoned the well” of modeling and simulation. The belief that the technology is completely trustworthy and mature enough for novices to use is an illusion. Commercial codes are certainly useful, but they need to be used with skill and care by curious, doubtful users. These codes often place a serious premium on robustness over accuracy, and cut lots of corners to keep their users happy. A happy user is usually, first and foremost, someone with a completed calculation, regardless of the credibility of that calculation. We also believe that everything is deeply enabled by almost limitless computing power.

Think? Why think! We have computers to do that for us.

— Jean Rostand

Computing power doesn’t relieve us of the responsibility to think about what we are doing. We should stop believing that computational tools can be used like magic: black magic in black boxes we don’t understand. If you don’t understand how you got your answer, you probably shouldn’t trust it until you do.

A computer lets you make more mistakes faster than any other invention with the possible exceptions of handguns and Tequila.