Anything you write that uses a regex should get beaten on with some fuzzing logic, since matching time can grow non-linearly with the input, and the next thing you know you've got a DoS on your hands.
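A minimal sketch of that blowup (Python's re engine here, though Perl's backtracking engine behaves the same way; the pattern is a textbook pathological case, not anything from the article): the nested quantifiers give the engine exponentially many ways to split the run of 'a's, so each extra character roughly doubles the time a failed match takes.

    import re
    import time

    # Nested quantifiers: the engine can split the run of 'a's between
    # the inner and outer '+' in exponentially many ways, and on a
    # failed match it tries them all.
    PATTERN = re.compile(r"^(a+)+$")

    for n in (10, 15, 20, 24):
        s = "a" * n + "b"  # the trailing 'b' guarantees the match fails
        start = time.perf_counter()
        PATTERN.match(s)
        print(f"n={n:2d}: {time.perf_counter() - start:.4f}s")

On a typical machine the first case is instantaneous and the last takes seconds, which is exactly the half-hour-on-a-1KB-file failure mode described below.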

This man speaks the truth. Just yesterday I had to deal with a Perl script whose execution time blew up once it had to process files larger than 1 KB. It'd work fine on 500-character files, but give it more than 1000 characters and it would take over half an hour to run! (Yes, we had one user sit there and wait more than 30 minutes for it to finish.)

In the end, a poorly written regular expression was to blame. It was easy enough to fix, and we've since ditched the Indian team that developed it.

Yeah, and checking specifically for every possible kind of pathological regex in order to work around it would mean slowing the engine down for the common case. So I'd say the power of the engine to do so many things is a good trade for the easily avoided pathological cases.

There are only two situations in which such pathological cases are likely to be handed to the regex engine anyway. One is when long regexes are built automatically by a program based on a set of rules and a particular data set, which should only happen in tooling you control. The other is when input is deliberately crafted to trigger the worst case.

No, it actually happens in the wild (see the post above that had BS called on it), typically when parsing a string up into many substrings where the boundaries aren't specified in a way that prevents backtracking.
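A hedged illustration of that boundary problem (Python; both patterns are made-up examples, not from the thread): when the separator can match the empty string, the engine has many ways to carve each word across repetitions of the group, and a failed match explodes; forcing each repetition to consume a real separator removes the ambiguity.

    import re
    import time

    # Ambiguous boundaries: \s* can match the empty string, so a single
    # word can be split across iterations of the group in many ways,
    # and a failed match backtracks through all of them.
    ambiguous = re.compile(r"^(\w+\s*)*$")

    # Fixed boundaries: each repetition must consume a literal space,
    # so there is essentially one way to carve the input.
    bounded = re.compile(r"^(\w+ )*\w+$")

    s = "word " * 8 + "!"  # the trailing '!' forces both matches to fail
    for name, rx in (("ambiguous", ambiguous), ("bounded", bounded)):
        start = time.perf_counter()
        rx.match(s)
        print(f"{name}: {time.perf_counter() - start:.4f}s")

The bounded version fails in microseconds; the ambiguous one takes seconds at eight words, and each extra word multiplies that by roughly eight.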

Any artificial language you're parsing should be designed with this in mind, or should at least make it clear that a different tool is the right choice.

If you're parsing any sort of natural language, well, first of all you kind of deserve what you get. ;-) The variability of the language would seem to preclude this sort of massive backtracking in the common case, and the parser could use multiple separate regexes to help lower the chances of triggering the problem.
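One way that multiple-regex approach can look (a sketch; SENTENCE_SPLIT and TOKEN are illustrative names, not anyone's real parser): chop the text into short pieces first, then run simple patterns per piece, so even an imperfect regex only ever sees a few dozen characters and has little room to backtrack.

    import re

    # Split on sentence-ending punctuation, then tokenize each short
    # sentence separately instead of running one big pattern over the
    # whole document.
    SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")
    TOKEN = re.compile(r"[A-Za-z]+(?:'[a-z]+)?")

    def tokens(text):
        for sentence in SENTENCE_SPLIT.split(text):
            yield from TOKEN.findall(sentence)

    print(list(tokens("It works. Doesn't it? Mostly!")))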

Dare I inquire as to the thought process behind the notion that the inferiority of an OSS program called "Fuzz", plus the superiority of a Debian-based VM running a GPLed Perl script driving a WTFPLv2-licensed fuzzer, proves the unimpressiveness of OSS?

If there is one place I've seen worse code than in OSS, it would be academia.

Bizarrely, this is also where I've seen the most brilliant code.

If you look closely, you'll find that the "brilliant code" is most often written by academics who have industry programming experience. Similarly, in industry, you will find that the best code is written by experienced programmers with rigorous academic backgrounds. In contrast, the academics who insist that computer science has nothing to do with programming, and the self-taught hackers who proudly proclaim their lack of all that fancy book-larnin', are two sides of the same worthless coin.

In their whitepaper they referenced my 'axfuzz' tool, which I wrote years ago, and they even used a modified version of it in their testing. Hope they didn't judge me on that code; it was a pile of crap that I kept hacking together until it finally worked, with no thought to proper software design.

I propose that every website that handles private data (credit cards, SSNs, health records, etc.) should integrate these kinds of tools into its normal test procedures, both in development and on mirrored production sites.
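For what it's worth, even a crude harness will catch the regex blowups discussed upthread. A rough sketch (Python; fuzz_regex and its parameters are made up for illustration, and a serious harness would mutate structured seed inputs rather than sampling uniformly random strings):

    import multiprocessing as mp
    import random
    import re
    import string

    def _match(pattern, s):
        # Runs in a worker process so a runaway match can be killed.
        re.compile(pattern).search(s)

    def fuzz_regex(pattern, trials=200, max_len=60, budget_s=0.5):
        """Throw random strings at a regex, each under a hard timeout;
        return the inputs whose match blew the budget (likely seeds
        for a real DoS)."""
        alphabet = string.ascii_letters + " ,.!"
        slow = []
        for _ in range(trials):
            s = "".join(random.choices(alphabet, k=random.randint(1, max_len)))
            p = mp.Process(target=_match, args=(pattern, s))
            p.start()
            p.join(budget_s)
            if p.is_alive():  # still matching after the budget: suspect
                p.kill()
                p.join()
                slow.append(s)
        return slow

    if __name__ == "__main__":
        # Example target: the ambiguous-boundary pattern from upthread.
        suspects = fuzz_regex(r"^(\w+\s*)*$")
        print(f"{len(suspects)} slow inputs out of 200 trials")

The per-trial process is deliberate: a catastrophic match can't be interrupted from inside the same interpreter, so the only reliable budget is one you can enforce by killing the worker.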