Posted
by
Soulskill
on Tuesday April 15, 2014 @01:59PM
from the new-ways-to-argue-about-your-favorite-language dept.

An anonymous reader writes "Deciding which programming language to use is often based on considerations such as what the development team is most familiar with, what will generate code the fastest, or simply what will get the job done. How secure the language might be is simply an afterthought, which is usually too late. A new WhiteHat Security report approaches application security not from the standpoint of what risks exist on sites and applications once they have been pushed into production, but rather by examining how the languages themselves perform in the field. In doing so, we hope to elevate security considerations and deepen those conversations earlier in the decision process, which will ultimately lead to more secure websites and applications."

Do they mean Classic ASP? They list .NET separately, so I don't think they mean ASP.NET, but they also don't include ASP in their list of "legacy" languages. I also seriously doubt that 16% of companies are still using Classic ASP.

ASP isn't even a language; it's a framework. You can write a Classic ASP app in VBScript or JavaScript. You can write ASP.NET in any .NET-supported language. Then there is ASP.NET MVC.

If they can't get their list of tested "languages" straight, I doubt the rest of the article.

Reducing a complex issue to just one number, like "mean number of vulnerabilities," is often an oversimplification. It is tempting to think it is better than nothing. But are we really better off making decisions based on an overly simplified view of things?

One bug that allows silent remote code execution on the WAN side and another that is merely a possible privilege escalation on the LAN cannot be treated as one bug each, right? This is not limited to security vulnerabilities alone. Many top managers at software companies insist on looking at bug counts, sometimes sorted into five or so priority/severity levels.

It gets worse in planning and progress monitoring. They use fancy tools like rallydev.com or something, but they allow each team to define its own story points. The Bangalore team uses 1 story point = 1 engineer-week. The Boston team uses 1 story point = 1 engineer-day. The Bangkok team uses engineer-hours. And top management gets the report: "This SAGA feature story was estimated to take 3264 story points, and it is 2376 points complete." Complete b.s. that is.

We pay ridiculously high salaries to top management, and instead of expecting them to put in the time, energy, and effort commensurate with that kind of pay, to make valuable judgments, take hard decisions, step on people's toes, tell it like it is, and paint an accurate picture of the state of the company, we let them shirk their responsibilities.

In the wake of Heartbleed, one might think that this would be talking about array bounds checking or buffer overflow mitigation. No. It is talking about web site frameworks.

examined the vulnerability assessment results of the more than 30,000 websites

First of all: this is not measuring the security of the programming language. This is measuring the security of the OS infrastructure and toolchains. Notice C/C++ is not on the list, since it is hardly ever used for creating web sites.

There was no significant difference between languages in examining the highest averages of vulnerabilities per slot.

What the heck is a slot?

Any summary where Perl scores the best must be deeply questioned. I doubt this is an apples-to-apples comparison. Surely these Perl sites are not doing nearly as much as the sites written in other languages.

If the language specification doesn't expressly say what happens when things "outside the design" happen, then different implementations may behave differently.

For example:

If the language design spec says

"If an array index is out of bounds, exit the program and return a value of ABEND_ARRAY_BOUNDS_VIOLATION to the calling program,"

that may seem very specific, but if how to "exit the program and return a value of ABEND_ARRAY_BOUNDS_VIOLATION to the calling program" isn't specified by someone (usually the operating system), then it may not be specific enough. If different operating systems specify how to do this differently, then the expected "under the hood" behavior will not necessarily be consistent across operating systems.

For example, does "exit the program" mean simply returning control to the caller, or does it mean first explicitly returning any resources that the operating system previously granted to the program? Or is that optional? If it's optional as far as the operating system is concerned, does the language provide a compile- or run-time switch to force such a cleanup? Does returning memory to the operating system guarantee that the OS will sanitize the memory, and if not, does the language guarantee it? If the language doesn't guarantee it, does it provide a compile- or run-time switch so the program will sanitize memory prior to returning it to the operating system?

These differences in language implementations and even differences in how operating systems handle the starting and stopping of processes can lead to differences in what the code actually does. Usually these differences are unimportant but sometimes they are very important.

I do not think C++ would have helped here; all it would have done was make things a bit more obscure. It should also be noted that you can build custom allocators in C++ too (I worked on a couple of projects that used them), so that part of the problem would still be there.

C++ makes a lot of things easier, but under the hood it is still essentially C with an expanded library and a fancy preprocessor (I know modern compilers do not actually preprocess C++ into C and then compile it), thus all the same problems are still there, and they are mostly mitigated by using libraries that wrap things up in a safer way.

The security level of a piece of code is 95% coder competence and 5% language, i.e., language is mostly irrelevant. One caveat, though: a language can add to the security problems if its run-time implementation has errors.

One reason most security-critical software is written in C is that there the coder gets full control. A good coder with skills in secure coding will do fine with C. A coder who does not understand software security will do badly in any language, but in C he/she might not even produce anything that works, which is actually an advantage. Also, in C it is far more obvious when somebody is clueless, which makes review easier.

But "language is important for code security" is even more wrong than "language is important for code reliability". Language is important for code performance, though, but only in the sense that a bad choice can kill it. A good language choice can also make a good coder more productive (a bad coder has negative productivity, so it hardly matters...). This nonsense about the language being capable of fixing problems with the people using it comes from "management" types who are unable to handle people as individuals. These utterly incompetent "managers" can be found in many large companies, and they believe that in IT, individuals do not matter. Typically these "managers" are not engineers and have no clue what a good engineer can do that a bad one cannot. They also believe that engineering productivity can be measured in hours spent, or that all coders are equal and just implement specifications, so outsourcing looks like a good idea.

If you read TFA, you will quickly find that the headline (both on this site and on TFA) is seriously misleading! What TFA actually describes is gathering statistics on 30k websites, determining which language/framework each was implemented in, and then checking each one for vulnerabilities such as SQL injection, XSS, etc. So the language itself has NOTHING to do with the security; it is the implementation of the site itself! The article is not well written either... too many quotes from many people in copy-and-paste style. Then the author tries to throw in numbers (i.e., percentages of this and that) to make it look useful... NOT!

TLDR? Below is what TFA is actually about...

WhiteHat researchers examined the vulnerability assessment results of the more than 30,000 websites to measure how the underlying programming languages and frameworks perform in the field. With that information, the report yields key findings around which languages are most prone to which classes of attack, for how often and how long as well as a determination as to whether or not popular modern languages and frameworks yield similar results in production websites

I thought that PHP was born around the same time as Java (and definitely way before .NET). So how is PHP a legacy language and Java isn't? Or is the writer just throwing in words to mess with my programming language history?

C++ (and to a lesser extent C) loses support because of its extremely poor support for UTF-8. And the absurd part of it is that they could easily do a good job. UTF-8 is just a byte array with various routines to interpret the code points. Glibc does a reasonable job for a C library... not ideal, but reasonable.

All the array needs is a way to address a chunk by character number rather than by byte number, a way to copy a character or a slice of characters, and a way to determine the general character classification of any character. Also a few methods: first(), last(), hasnext(), hasprior(), next(), and prior(). And these all "sort of" exist, except getting the general character classification. (Do note that these functions need to operate on UTF-8 characters rather than on bytes.) But several different ways of doing this are already known. Vala, e.g., handles it without difficulty, and is able to emit C code (using Glibc libraries).

So it's not a programming difficulty that's holding things up. It's the standards bodies...or, perhaps, some members of them.

But I've looked at C++11, and it is not a satisfactory answer. Vala has a good answer. D (Digital Mars D) has a different good answer. Even Python3 has a pretty good answer. (I don't like that in Python you can't determine memory allocation within the string.) Also Racket, etc. But C++ doesn't.