The plot:
By default, MySQL was running with innodb_flush_log_at_trx_commit = 1 which led to performance problems for even simple DML statements(i.e. delete from table where id = ..., id being a primary or having an index)
I set innodb_flush_log_at_trx_commit = 0, which really helped , being able to sustain the load, but now mysql is prone to lose data in case it crashes (and I've been told it's not acceptable that we could lose transactions). The so called happy compromise it the I've set it to innodb_flush_log_at_trx_commit = 2, which in theory, mysql should be able not to lose any transactions as long as a transaction made it to log_file. If it crashes, MySQL will roll-back or commit whatever it finds in the log_file.
The questions:
1. How reliable is this process? Should I risk and go with innodb_flush_log_at_trx_commit = 2 and count on the fact MySQL/innodb will be reliable enough to commit those changes found in the log_file or just use the ...trx_commit=1 and pray I will not face bigger loads?

2. In case I have to go with innodb_flush_log_at_trx_commit = 1, what workarounds do I have to further optimize Innodb and make it speedier? Increasing the allocated memory for the innodb_buffer_pool_size is not an option. The developers are using some g.d. triggers that have the habit of leaking memory and this eventually leads to a MySQL instance crash. The current value of 2560M seems to lead to a fairly stable instance. Taking out the triggers is also not option (a coliding course with developers' ego/insufficient human resources(?!)).

If the innodb_flush_log_at_trx_commit=0 sped things up for you then it indicates that your application is performing a lot of insert/updates.

The problem here is that if you are using:
innodb_flush_log_at_trx_commit=1
sync_binlog=1
then you have two flushes that has to happen to disk for each transaction.
And the problem with this is that if you don't have a RAID card with battery backed up cache (or the OS is faking sync's, but then you wont have any use for it) is that flushing data to disk after each transaction slows things down a _lot_ due to seek times for the disk.
And in this case if you haven't separated the InnoDB log files and the binary log to different disks then they will compete for the disk time.

What you can do:
1.
Buy a RAID card with battery backed up cache.

2.
Separate InnoDB log files on one disk, tablespace on another and binary log on a third.

3.
Change the application (if possible) to run longer transactions to avoid the disk writes.

4.
Skip sync_binlog=1, do you actually need it or is it ok to manually sync the master and slave on those rare occasions if they end up out of sync due to this?

5.
Run with innodb_flush_log_at_trx_commit = 2
The thing is that with this setting an OS crash can cause data loss.
But if you have a good UPS with lot of batteries and you got a stable OS and a new solid hardware.
How often does this kind of machine crash (as long as you don't let any developers have shell access to it )?

The answer is so very seldom, so that usually when you have to trade performance for solidity then this setting is quite enough.
Otherwise I suggest buying the RAID card.

Comment

Phasing out sync_binlog=1 is giving me some degree of restlessness ) . The application is suppose to rely on the slave in case of master-slave switch, so having data consistency is a must.

1. Ain't gonna happen to soon, at least not before the application launches.

2. It's the next logical step, will do.

3. Almost impossible, I have to check out though.

4. Might take it out eventually, especially that I got used to idea that I have to manually reseed the slave. Would have been nice though, if reliability would've been the first objective MySQL AB struggle to meet for their product and then speed/performance. sync_binlog=0 is a ticking bomb.

5. Is there a test case that I could build in order to show other people around that innodb_flush_log_at_trx_commit = 2 is REALLY able to recover all the commmited transaction after a mysql crash? Otherwise, I don't really trust MySQL/innodb what it says in the manual.
And yes, developers have access to the database but so far the machine didn't experience any crash. Don't ask me how this didn't happen though.

6. I case I have to go with innodb_flush_log_at_trx_commit = 1, how can I mess with innodb_log_buffer_size/innodb_log_file_size in order to provide the minimum the performance increase and meet the load? Provided that I implement the innodb log file/binlogs/data separation you suggested.

Comment

4.
As always a balance.
How often does a well functioning server crash compared to how often do you need good performance? Because these settings are only fixing a problem if the server crashes.

But the problem you are mainly facing is is tied to hardware.
When you force mysql to flush log and bin-log to disk for each commit you are forcing the OS to wait for the disk to complete the write.
On todays hard disks you have seek times averaging about 8ms which means that the OS can only perform about 125 writes per second per disk. So if you have both bin-log and innodb log on same disk you can't get more than 125/2 commits per second.
And if you have the tablespace on same disk then that drops even more.

5.
To test it I would write an application that inserts records and writes it's own log file in parallel over successfully commited transactions.
And then just run kill -9 on the mysqld process while your test application is running queries against mysql.
Then you start up mysqld and compare that all records that you inserted actually exist in the database.
I think that is as close as you can get to test it.

6.
You can't do anything about it.
Since it will still flush the log file for each transaction commit.
The only thing you can avoid by increasing the size is when mysqld reaches the end of log file flush, close, open and that is very minor since it happens pretty seldom.
And if you create a very large log file you can instead have a very long recovery time.