About this Author

Derek Lowe, an Arkansan by birth, got his BA from Hendrix College and his PhD in organic chemistry from Duke before spending time in Germany on a Humboldt Fellowship on his post-doc. He's worked for several major pharmaceutical companies since 1989 on drug discovery projects against schizophrenia, Alzheimer's, diabetes, osteoporosis and other diseases.
To contact Derek email him directly: derekb.lowe@gmail.com
Twitter: Dereklowe

October 18, 2010

Palladium Couplings: You Can't Run Them All

Posted by Derek

This year's Nobel for palladium-catalyzed coupling reactions highlighted how useful these have become. But what every practicing organic chemist knows is how complicated they can be, particularly when your first couple of favorite recipes don't work. I've long thought that almost any metal-catalyzed transformation can be optimized, if you're just willing to devote enough of your life to it. But you have to have a good reason to wade into the swamp, because there sure are a lot of variables that can be tweaked. Here's a good case in point, recently published in Organic Letters. A perfectly reasonable reaction (C-H arylation of a chloropyrazole, which had been demonstrated before) was run through the statistical wringer to track down the best conditions.

They looked at 6 solvents, 10 bases, 4 catalysts, 5 ligands, and 4 additives, which would give you 4800 combinations (6 × 10 × 4 × 5 × 4) if you ran the whole shebang. A Design of Experiments approach cut the number of actual runs down significantly, and then (fortunately) some of the variables turned out to be pretty insensitive. So this one wasn't as bad as some of them get - the ligand didn't seem to have too much effect, for example, whereas in some other Pd couplings it's crucial. (The choice of base had a much bigger effect, in case you're wondering.) Their best set of conditions seems to work reasonably well across a range of possible substrates.
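To make the size of that swamp concrete, here is a minimal sketch of what "running the whole shebang" means: a full factorial is every combination of every level, so its size is just the product of the level counts. The counts are the ones listed above; the factor names and integer level indices are placeholders for illustration, not the paper's actual reagents.

```python
from itertools import product
from math import prod

# Level counts as listed in the post. The individual levels are represented
# only by integer indices here (placeholders, not real reagents).
levels = {"solvent": 6, "base": 10, "catalyst": 4, "ligand": 5, "additive": 4}


def full_factorial_size(level_counts):
    """Number of runs in a full factorial: the product of all level counts."""
    return prod(level_counts.values())


def full_factorial(level_counts):
    """Lazily yield every combination as a dict of factor -> level index."""
    names = list(level_counts)
    ranges = [range(level_counts[name]) for name in names]
    for combo in product(*ranges):
        yield dict(zip(names, combo))


n_runs = full_factorial_size(levels)
print(n_runs)
```

With the counts as listed, the product comes to 4,800 runs, which is the number a designed experiment is trying to avoid.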

DoE is worth a post of its own, and that'll be a timely thing for me. After brushing up against it for years, I may finally have a use for the technique soon. For those who don't know it, it's basically a way to figure out how to most efficiently sample "experiment space", by getting the most information out of each different run. And then you use principal components analysis (or something similar) to see what the most important changes were, and how they correlate to each other. It's asking, mathematically, what a synthetic chemist wants to know about a complicated reaction recipe: what changes are responsible for most of the variation in the results, and how can I track them down by running a reasonable number of experiments? In the drug industry, process chemists think about this sort of thing a lot more than discovery chemists do, but it's worth keeping an eye out for any time the approach could be helpful.
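As a rough illustration of "which changes are responsible for most of the variation", here is a small sketch using a two-level full factorial and simple main-effect contrasts (mean response at the high level minus mean response at the low level). This is a simpler stand-in for principal components analysis, not the method the paper used, and the `toy_yield` function is entirely made up so that the base dominates and the ligand barely matters.

```python
from itertools import product


def two_level_design(factors):
    """All combinations of -1/+1 settings for the named factors (2**k runs)."""
    return [dict(zip(factors, signs))
            for signs in product((-1, 1), repeat=len(factors))]


def main_effects(design, responses):
    """Mean response at +1 minus mean response at -1, for each factor."""
    effects = {}
    for factor in design[0]:
        high = [y for run, y in zip(design, responses) if run[factor] == 1]
        low = [y for run, y in zip(design, responses) if run[factor] == -1]
        effects[factor] = sum(high) / len(high) - sum(low) / len(low)
    return effects


def toy_yield(run):
    # Hypothetical response surface, invented for this example only.
    return 50 + 15 * run["base"] + 1 * run["ligand"] + 5 * run["solvent"]


factors = ["base", "ligand", "solvent"]
design = two_level_design(factors)
responses = [toy_yield(run) for run in design]
effects = main_effects(design, responses)

# Rank factors by the size of their effect, largest first.
ranking = sorted(effects, key=lambda f: abs(effects[f]), reverse=True)
print(ranking)
```

In this toy case the ranking puts the base first and the ligand last, which is the sort of answer you are hoping to get out of eight runs rather than hundreds.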

I'm not familiar with JMP (the software package they used), but since the paper gives no information about the design, they probably let the software choose it for them. Reducing a full factorial design for their experiment, with 5 discrete parameters at several levels each, is not a trivial issue, and some detail would be interesting.

Forget computer programs; one of the most interesting lectures that I ever saw was from an old guy at Salford University (when they did chemistry....). He took the different parameters, put values for each of them on a piece of paper and drew them out of a hat (several hats) to get a random mixture.

Seemed to give the optimal conditions very quickly, but I don't know how applicable it is to the above.
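The several-hats approach is easy to mimic in a few lines. This is only a sketch of random sampling of conditions: one hat per parameter, one independent draw from each hat per run. The parameter names and values are placeholders, not anything from the lecture or the paper.

```python
import random


def draw_from_hats(hats, n_runs, seed=None):
    """Draw one value per parameter from its own 'hat', once for each run."""
    rng = random.Random(seed)  # a fixed seed makes the draws reproducible
    return [
        {name: rng.choice(options) for name, options in hats.items()}
        for _ in range(n_runs)
    ]


# Hypothetical hats; swap in your own parameters and values.
hats = {
    "solvent": ["solvent_1", "solvent_2", "solvent_3"],
    "base": ["base_1", "base_2", "base_3", "base_4"],
    "temperature_C": [60, 80, 100, 120],
}

runs = draw_from_hats(hats, n_runs=8, seed=0)
for run in runs:
    print(run)
```

Unlike a designed experiment, nothing here guarantees balanced coverage of the levels, so how quickly it finds good conditions depends on luck and on how many runs you draw.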

When the conclusion is that the ligand isn't that critical, but the base is, I look twice at the base. In this case, I'll bet that acetate is not merely a base--it's on the metal at some critical point.

Sometimes DOE just tells you the obvious answer for suitable reaction conditions, but in other cases it can really improve things beyond what you would expect from trial and error guided only by your knowledge and intuition. What book and software would those of you who have experience in this field recommend? It seems the standard for software is Minitab or Statistica, and for books it would be Carlson or Montgomery. Any ideas? Thanks in advance.

I highly recommend using JMP software. It's much easier and more user-friendly for the novice. In my experience, DOE has either told me the obvious, or led me in a new direction which triggered another DOE. It's grossly underutilized in the chemistry world, but I've used it extensively for QbD work for a new process. It was so useful there that it's trickled down to early process and Med Chem work to find quick fixes. The important thing about DOE is that for reaction conditions that contain multiple variables, it can quickly point you in the right direction and tell you which variables are linked (secondary and even tertiary interactions if you power it well). I used to scoff, but just look through some of the crazy catalyst systems in the lit. and it becomes obvious that this is the only way to approach a problem with lots of variables.

The idea of this paper is interesting if you have a challenging direct arylation coupling (which this is not).
I don't like people who nitpick about "worthiness" but I would have declined to accept it in Org. Lett. since I could have guessed their final conditions off the top of my head.
I really wish they had picked a tricky coupling to demonstrate this cool screening method.

Derek has raised a very interesting point: DOE is a powerful tool not used very often by medicinal chemists, who see it as a tool for process chemists only.

In fact, DOE is not about mathematics, software, or anything else of that kind. It is about designing experiments so that you get the most information from each one, usually with fewer experiments! Any medicinal chemist should appreciate this, since medchem projects need many compounds in the shortest possible time.

As some people have pointed out, sometimes DOE tells you what you knew from the beginning (we have run DOEs on some hydrogenations, and the software told us that the main variable was catalyst loading - Great!!), but other times, especially when variables are aliased and you have several factors to play with, the software gets really, really useful: it is the only practical way to approach the problem. Just think about a Heck reaction and how many factors can matter to the system. The software is then critical, though you can go without it and use even an Excel spreadsheet! We use DesignEase, which we find user-friendly and intuitive.
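For anyone unsure what "aliased" means here, a minimal sketch with a textbook half-fraction design may help: three two-level factors are run in only four experiments by setting C = A × B. The price is that the column for C is identical to the column for the A×B interaction, so the two effects cannot be told apart from these runs alone. The factor labels A, B and C are generic, not tied to any real reaction.

```python
from itertools import product


def half_fraction():
    """2^(3-1) design: A and B take all four sign combinations, and C = A*B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


design = half_fraction()

# Column of settings for factor C, and the column for the A*B interaction.
c_column = [c for _, _, c in design]
ab_column = [a * b for a, b, _ in design]

# Identical columns mean the C main effect is aliased with the AB interaction.
print(c_column == ab_column)

# The same holds for the other pairs: A with B*C, and B with A*C.
a_aliased_with_bc = all(a == b * c for a, b, c in design)
b_aliased_with_ac = all(b == a * c for a, b, c in design)
print(a_aliased_with_bc, b_aliased_with_ac)
```

Each check comes out true, which is why sorting out aliased effects usually takes follow-up runs or a larger design.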

But for med chemists, it is really worth trying DOE when you have some time. As the CRO we are, we have run DOE for both process and medchem projects, and have made a big effort to teach med chemists that DOE can be really useful in drug discovery. Keep in mind that the improvement you are looking for may be measured in terms of yield, but also in terms of fewer by-products or other outcomes. For example, we ran a DOE for a critical reaction in a drug discovery route where yields were low, the reaction did not go to completion, and the purification was really nasty. Though the timeframe we had was really short and we were subject to some limitations, we made improvements in the reaction (see http://www.galchimia.com/sites/default/files/newsletter/issue10/10art02v2.html). Note that in this example we ran only the screening part of the DOE; there was no time to tune up. But the med chemists told us that it was much easier to purify the reaction crudes and that they were saving a lot of time in both the reaction and the purification.

As A Nonny Mouse has also pointed out, randomization is critical in DOE (you can use dice, a hat or the software). Any DOE handbook will comment thoroughly on this and other critical points.

And I concur with Fagnou Graduate. I have rejected a couple of papers submitted to J. Org. Chem. because of bad DOE. Remember, DOE is about design... the experiments you choose to run must yield information.

Anyone know if DOE has been applied to any (molecular) biological problems? Strikes me that many things e.g. protein structure/function wrt amino acid sequence that are highly multifactorial could be studied in this manner? Or is it just too multi-dimensional?

"Anyone know if DOE has been applied to any (molecular) biological problems? Strikes me that many things e.g. protein structure/function wrt amino acid sequence that are highly multifactorial could be studied in this manner? Or is it just too multi-dimensional?"

It isn't so much that it's too multi-dimensional as that it's too non-linear. This means that you can't really study the effects of changing a protein one residue at a time and then combine the best single residue variants into an optimal protein.
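A toy example shows the non-linearity problem. In the sketch below, two hypothetical single changes each improve on the starting point when made alone, but combining them gives something worse than where you started. The `toy_fitness` function and its numbers are invented purely for illustration and do not model any real protein.

```python
from itertools import product


def toy_fitness(variant):
    """Made-up score: each change helps alone, but the two clash when combined."""
    x, y = variant  # 0 = original residue, 1 = changed residue
    return 1.0 + 0.5 * x + 0.4 * y - 1.2 * x * y


variants = list(product((0, 1), repeat=2))
scores = {variant: toy_fitness(variant) for variant in variants}

wild_type = scores[(0, 0)]
singles_help = scores[(1, 0)] > wild_type and scores[(0, 1)] > wild_type
combined_worse = scores[(1, 1)] < wild_type
best = max(scores, key=scores.get)

print(singles_help, combined_worse, best)
```

Here the best variant carries only one of the two "good" changes, so optimizing one position at a time and then stacking the winners would miss it.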

"When the conclusion is that the ligand isn't that critical, but the base is, I look twice at the base. In this case, I'll bet that acetate is not merely a base--it's on the metal at some critical point." - Barry

In fact, it is now widely believed that soluble carboxylates are involved in a concerted metalation-deprotonation event, in which the Pd-C bond forms at the same time as the C-H bond is cleaved, through a bridging carboxylate on the metal. See Fagnou, Echavarren and Maseras.

You do not need statistical analysis in this case; just read some literature on the reaction you are planning. Fagnou and Echavarren have described, both experimentally and computationally, how butyric acid (and pivalic acid) and acetate facilitate this reaction through a concerted intermediate. This is well known in direct arylation, and claiming that you get this from an analysis just shows that you forgot to do your homework and that there is no real innovative step...

Possibly ignorant comment - I thought the value of this paper wasn't to find conditions for direct arylation, but rather selective direct arylation for one position. Yes, those conditions would be considered standard for direct arylation, but conditions considered standard for direct arylation could just as easily give you C3 arylation as well.

This comment was posted by someone familiar with the field of direct arylation, but by no means an expert.

HK, I agree. This is more about the use of DoE for both regio- and chemo-selectivity. Clearly the starting reaction conditions, which are the more general set developed by Fagnou et al., were inferior, and the optimization study resulted in a nicely selective reaction. I actually think this is a pretty good application, regardless of whether the transformation itself is Org Lett worthy or not.

I tend to use statistical mechanics to optimize my reactions using as few experiments as possible. For me, it is an invaluable tool (and my friends in process chemistry love it too). Sadly, I am working under an organic chemist who has no mathematical skill whatsoever, and she *complains* she can't follow my model for optimization. She will insist I run the reactions her way. Naturally, I refuse. And in a short amount of time, I get pretty good reaction conditions anyway.