while (true) {

While the mysql library is rather dangerous, do note that it is basically a carbon copy of the underlying MySQL C API. And yes, I suspect there are a considerable number of MySQL C applications that are vulnerable to exactly the same sorts of issues. They're just harder to get at.

I'd wager that people writing in C tend to know more about what they're doing than people writing in PHP. A language designed for non-technical people shouldn't take the same approach to safety as a language that expects the programmer to know enough to avoid buffer overflow exploits.

The MySQL C API doc for mysql_escape_string() actually says:

Quote:

Use mysql_real_escape_string() instead!

I think it's dangerous to assume C programmers know about every safety issue. I'd wager a lot of programmers spend very little time thinking about non-ASCII character sets regardless of what language they work in. Nobody knows every issue and a lot of security situations are much easier to get subtly wrong than exactly right.
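To make the charset issue concrete: a byte-wise escaper can be defeated by multibyte encodings like GBK. Here's a toy sketch of the idea, with hypothetical helper functions standing in for the real C API:

```javascript
// Toy demonstration of why byte-wise escaping fails under multibyte
// charsets like GBK. Hypothetical helpers, not the actual MySQL API.

// Naive escaper: prepend a backslash (0x5c) to every quote or backslash
// byte, with no knowledge of the connection's character set -- this is
// essentially what a charset-unaware escape function does.
function naiveEscape(bytes) {
  const out = [];
  for (const b of bytes) {
    if (b === 0x27 || b === 0x5c) out.push(0x5c);
    out.push(b);
  }
  return out;
}

// Simplified GBK-style decoder: any lead byte >= 0x81 swallows the next
// byte as the second half of a two-byte character.
function countUnescapedQuotes(bytes) {
  let quotes = 0;
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (b >= 0x81) { i++; continue; } // two-byte character, skip trailer
    if (b === 0x5c) { i++; continue; } // escape sequence, skip next
    if (b === 0x27) quotes++;          // a bare quote survived escaping
  }
  return quotes;
}

// Attacker sends the byte 0xbf followed by a quote. The escaper inserts
// 0x5c before the quote, but the decoder merges 0xbf 0x5c into a single
// GBK character -- freeing the quote to terminate the string literal.
console.log(countUnescapedQuotes(naiveEscape([0xbf, 0x27]))); // 1
```

This is exactly the class of bug mysql_real_escape_string() exists to prevent: it needs the live connection so it knows the character set in use.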

Who knows - now that Oracle's gotten burned repeatedly on Java, maybe they'll do an all-up security review and fix crap like mysql_escape_string().

Only programmers will blame Oracle for this stuff, though. Everyone else blames the programmers who wrote the bad code*, rather than the company that wrote the libraries that encouraged them to write bad code.

* Well, more likely they just blame the company that lost their password and credit card number.

If you're going to be doing PHP in 2013, you need to be aware of the way the leading edge is going, and that's pretty much "wherever Sensio is taking us." Laravel is alright, but its good parts seem mostly to be "we interop with Symfony2" and I don't see why you wouldn't just go with the actual core framework.

I still don't understand why you'd spend all your time, in 2013, writing Yet Another PHP On Rails Framework, and it's not like learning Ruby takes more than about fifteen minutes. There are a handful of syntactic differences, an actual well-designed core library, and a better object system in Ruby. Any PHP developer can pick it up almost instantly, and the gem system in Ruby is just so much better than PHP's it's not even funny.

It boggles the mind.

You could say the same thing, substituting Perl for Ruby, and CPAN for gem.

Well that, and the whole "Having a different prefix character for variable references depending on their data type" thing is sort of the hallmark of insane language design. Perl should also be automatically disqualified based solely on the way it passes parameters to methods/subroutines.

It's part and parcel of Perl's context sensitivity. I think the fact that no other language has really attempted to reproduce Perl's "guess what you think I mean" sensibilities suggests that nobody really thinks it's a very clever idea.

Fortunately, JavaScript also has some pretty solid linters like JSHint/JSLint, which are plumbed directly into common editors like Sublime to give you instant feedback, right in your editor, when you run into the few corner cases where semicolon insertion could bite you, and it works really well.

This might have been a big problem at one time, but I think that with modern workflows and tools this one JavaScript "feature" shouldn't be a big threat.
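For reference, the canonical corner case those linters flag is a line break directly after `return`:

```javascript
// The classic ASI trap: a line break directly after `return` gets a
// semicolon inserted, so the object literal below is never returned.
function broken() {
  return
  { ok: true }; // unreachable; parsed as a labeled block statement
}

// Keeping the opening brace on the same line as `return` avoids it.
function fixed() {
  return {
    ok: true
  };
}

console.log(broken());   // undefined
console.log(fixed().ok); // true
```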

"Hey, you know that web service where an individual can sign up for a subscription? I'm trying to push a B2B contact/lead (as in, data with a completely different purpose and attributes) to that web service and it's not letting me."

When I write code, I make far too big a deal over dumb nuances by overthinking things and confusing myself.

Let's take the example of a business that sells contracts of services to customers. When I design my classes and functions, I get confused about where the related objects should be handled from. Each customer has contracts, and a contract cannot exist if it doesn't have a specified customer. Do I pull the customer contracts from the customer instance (var cust = new Customer(); var contracts = cust.getContracts())? From the contracts given a specified customer (var cust = new Customer(); var contracts = Contracts.GetTheirs(cust))? Do I assign a new contract to a customer or a customer to a contract?

The first question I would ask is, "Do I really need an object graph of all the entities in my business model?" The answer to that question is frequently "No".

That said, you're really asking the questions:

Given a customer object, do I need a way to lookup (one or all) of their contracts?

Given a contract object, do I need a way to lookup (one or all) of its customers?

The two are effectively independent and should be treated as such.

If the question you're trying to ask is, 'Which one should take the other as a constructor parameter' then that depends entirely on the data source, but the answers are probably either, "Neither" or "Entirely arbitrary".
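A minimal sketch of treating the two lookups independently, with a plain repository answering both questions (all names here are hypothetical):

```javascript
// Neither Customer nor Contract owns the other; lookup logic lives in
// plain functions over a shared store (an in-memory stand-in for a DB).
const contractTable = [
  { id: 1, customerId: 42 },
  { id: 2, customerId: 42 },
  { id: 3, customerId: 7 },
];
const customerTable = [
  { id: 42, name: "Acme" },
  { id: 7, name: "Initech" },
];

// "Given a customer, which contracts?" -- one independent lookup.
function contractsForCustomer(customerId) {
  return contractTable.filter(c => c.customerId === customerId);
}

// "Given a contract, which customer?" -- a second, unrelated lookup.
function customerForContract(contract) {
  return customerTable.find(c => c.id === contract.customerId);
}

console.log(contractsForCustomer(42).length);            // 2
console.log(customerForContract(contractTable[2]).name); // "Initech"
```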

Quote:

When I write code, I make far too big a deal over dumb nuances by overthinking things and confusing myself.

I do this too!

Quote:

Let's take the example of a business that sells contracts of services to customers. When I design my classes and functions, I get confused about where the related objects should be handled from. Each customer has contracts, and a contract cannot exist if it doesn't have a specified customer. Do I pull the customer contracts from the customer instance (var cust = new Customer(); var contracts = cust.getContracts())? From the contracts given a specified customer (var cust = new Customer(); var contracts = Contracts.GetTheirs(cust))?

You know, I've traditionally tended to use dedicated static data access objects for that, though it isn't the most OO approach, so in my case it would look something like:
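Something in that spirit, sketched with hypothetical names (a real version would issue a database query instead of filtering an in-memory table):

```javascript
// Static DAO: every contract query lives in one dedicated class,
// rather than on Customer or Contract themselves.
class ContractsDAO {
  static getContractsForCustomer(customerId) {
    return ContractsDAO.table.filter(c => c.customerId === customerId);
  }
}
// In-memory stand-in for the database.
ContractsDAO.table = [
  { id: 1, customerId: 42 },
  { id: 2, customerId: 42 },
];

var cust = { id: 42 };
var contracts = ContractsDAO.getContractsForCustomer(cust.id);
console.log(contracts.length); // 2
```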

This is not always ideal and I'm actually trying to use more modern approaches, though -- I've started really delving into Entity Framework and am liking it so far, but I still have a lot of legacy apps with code like that. It's a pattern that we used pretty heavily throughout my company in the past few years.

That said, when retrieving data, I'm not sure I always like having the object handle its own data access. In one system I worked on, we had some code that would retrieve a record based on a street address, and my coworker set it up so that it would instantiate itself based on an address in the constructor:

Code:

var record = new Record("1234 Main St", "Apt 1", "90210");

That, or he may have used a static method that returned the live record -- either way, similar and simple enough. The problem was that a few weeks later, the requirement changed to "pull all records for that street address" -- that was slightly frustrating to refactor my way out of.
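One refactor-friendly alternative is a static finder that returns a collection from day one; then "one record" becoming "all records at that address" doesn't break callers. A sketch with hypothetical names:

```javascript
// A finder that always returns an array absorbs the requirement change
// without a painful refactor of every call site.
class Record {
  constructor(row) {
    this.street = row.street;
    this.unit = row.unit;
    this.zip = row.zip;
  }
  // Static finder instead of a self-loading constructor.
  static findByStreetAddress(street, zip) {
    return rows
      .filter(r => r.street === street && r.zip === zip)
      .map(r => new Record(r));
  }
}

// In-memory stand-in for the database.
const rows = [
  { street: "1234 Main St", unit: "Apt 1", zip: "90210" },
  { street: "1234 Main St", unit: "Apt 2", zip: "90210" },
];

const records = Record.findByStreetAddress("1234 Main St", "90210");
console.log(records.length); // 2
```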

As levine says, this is what happens when you try to shove relational data into an object system.

Quote:

Do I assign a new contract to a customer or a customer to a contract?

If a customer can have multiple contracts but a contract can only have one customer, I would probably assign the contract to the customer.

Django's ORM creates references both ways: so if a customer may have many contracts, it'll create an accessor on the Customer object that lets you get a list of Contracts, and an accessor on the Contract object that lets you get the Customer that owns the contract.

Neither object "owns" the other, though. If ownership had to be assigned anywhere, it would probably best be assigned to the factory responsible for creating the objects in the first place (whatever it is that takes a query, runs it on the database, and constructs the appropriate Python objects).
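The wiring can be sketched roughly like this (plain JavaScript, hypothetical names, not Django's actual internals):

```javascript
// The "factory" wires references in both directions, so neither object
// owns the other -- each just knows how to reach its counterpart.
function linkCustomerAndContract(customer, contract) {
  customer.contracts.push(contract); // Customer -> [Contract, ...]
  contract.customer = customer;      // Contract -> Customer
}

const acme = { name: "Acme", contracts: [] };
const deal = { id: 1, customer: null };
linkCustomerAndContract(acme, deal);

console.log(acme.contracts.length); // 1
console.log(deal.customer.name);    // "Acme"
```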

If you want to enforce integrity, add internal methods that can add a customer or contract to the opposite class, and when the public method is called, it can also call the associated internal method.

Then you can make a decision like "Does calling RemoveCustomer on a Contract where that's the last customer throw a CantRemoveLastCustomer exception?"

Another option is to have some sort of Contract Management class that holds the actual methods for creating contracts and associating customers, and adds contracts and customers to each other as necessary.
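A toy version of the paired public/internal methods, with the "last customer" policy surfaced as an exception (all names hypothetical):

```javascript
// Policy decision surfaced as a dedicated exception type.
class CantRemoveLastCustomer extends Error {}

class Customer {
  constructor() { this.contracts = []; }
  // Internal counterparts, called only by Contract's public methods.
  _addContract(contract) { this.contracts.push(contract); }
  _removeContract(contract) {
    this.contracts = this.contracts.filter(c => c !== contract);
  }
}

class Contract {
  constructor() { this.customers = []; }
  // Public API keeps both sides of the relationship consistent.
  addCustomer(customer) {
    this.customers.push(customer);
    customer._addContract(this);
  }
  removeCustomer(customer) {
    if (this.customers.length === 1 && this.customers[0] === customer) {
      throw new CantRemoveLastCustomer();
    }
    this.customers = this.customers.filter(c => c !== customer);
    customer._removeContract(this);
  }
}

const contract = new Contract();
const alice = new Customer();
contract.addCustomer(alice);
try {
  contract.removeCustomer(alice); // here the policy is: this throws
} catch (e) {
  console.log(e instanceof CantRemoveLastCustomer); // true
}
```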

I've spent the past few weeks running a few of our products through a profiler and reworking bottlenecks/inefficiencies. With it, I've made some serious progress with some fairly small changes, but when I went to test my new changes on use cases that have been given to me by various customers, I find that my changes are having little effect because the bottleneck in their runs is/was not inefficient routines, but I/O (mostly O).

The obvious answer (and the one we've given in the past) is stop writing so much out, especially if you don't need a significant percent of it. However, this isn't always possible, and some customers really do need the large amounts of data that are being written to files many, many times throughout the program's life. So, are there any general strategies for speeding up I/O? As a bonus, are there any that are specific to Fortran? (Seriously?!? Firefox's spell-checker doesn't think Fortran is spelled right) Right now, the code is basically doing the simplest possible IO, and is dumping array contents to files so that other parts of our software suite can analyze them. Any easy/generic ways to improve this?

Quote:

Right now, the code is basically doing the simplest possible IO, and is dumping array contents to files so that other parts of our software suite can analyze them. Any easy/generic ways to improve this?

Is it constantly writing out small amounts of data, or is it big files that get pushed out sporadically? If the former, caching the output and minimizing the actual write operations is a simple thing to do (you don't get as-real-time output that way but it doesn't sound necessary). One big write is faster than lots of small ones.

Alternatively, if the purpose of this IO is to let other software analyze it, you could couple those tools together so that they can pass stuff around in RAM instead of writing/reading files. If you still need that array in the end, maybe this doesn't save a ton of time (although it should skip a reading step at least). This might be a big software project, of course.

If those things aren't possible, then the fastest options might be on the hardware side--get an SSD or set up a fast memory cache for writes so that the slow part can be handled later by the OS.

If they're running on a medium to large parallel computer, then their IO probably needs to be made parallel as well as their computation. Depending on the decomposition and the storage format, this may be trivial, or this may be a monumental effort.

Quote:

I've spent the past few weeks running a few of our products through a profiler and reworking bottlenecks/inefficiencies. With it, I've made some serious progress with some fairly small changes, but when I went to test my new changes on use cases that have been given to me by various customers, I find that my changes are having little effect because the bottleneck in their runs is/was not inefficient routines, but I/O (mostly O).

Use a profiler to measure the differences. You may have to do some fiddling with the output in Excel to get the numbers you want, but it should be possible without too much work.

Quote:

The obvious answer (and the one we've given in the past) is stop writing so much out, especially if you don't need a significant percent of it. However, this isn't always possible, and some customers really do need the large amounts of data that are being written to files many, many times throughout the program's life. So, are there any general strategies for speeding up I/O?

You gave the answer, but it's a bit of a PITA in Fortran.

The biggest single thing to do is make sure your I/O operations and buffers are suitably large. Write out entire records and arrays, not elements. Adjust BUFFERCOUNT, then BUFFERSIZE as necessary.

However, if this is true:

Quote:

Right now, the code is basically doing the simplest possible IO, and is dumping array contents to files so that other parts of our software suite can analyze them. Any easy/generic ways to improve this?

then it may be as fast as it can go. Have you measured your disk I/O rate? Your CPU usage?

Quote:

I've spent the past few weeks running a few of our products through a profiler and reworking bottlenecks/inefficiencies. With it, I've made some serious progress with some fairly small changes, but when I went to test my new changes on use cases that have been given to me by various customers, I find that my changes are having little effect because the bottleneck in their runs is/was not inefficient routines, but I/O (mostly O).

The obvious answer (and the one we've given in the past) is stop writing so much out, especially if you don't need a significant percent of it. However, this isn't always possible, and some customers really do need the large amounts of data that are being written to files many, many times throughout the program's life. So, are there any general strategies for speeding up I/O? As a bonus, are there any that are specific to Fortran? (Seriously?!? Firefox's spell-checker doesn't think Fortran is spelled right) Right now, the code is basically doing the simplest possible IO, and is dumping array contents to files so that other parts of our software suite can analyze them. Any easy/generic ways to improve this?

First of all, do you really need to store the data on the hard drive or can you stream it directly to another process for analysis? Secondly, if you need to store the data on the HD, could you compress it? You'd have to measure whether the increase in CPU usage is worth it but it may be. Thirdly, you mention reducing the amount you write but it isn't clear if you're actually doing that when possible. Can the app producing data take flags that let it know when it can write less? Finally, maybe you should just get a SSD, at least for intermediate files. Depending on your usage pattern, you might wear it out fast, but it'd probably cost less than spending a couple days manually optimizing.