I am using example_cposv.c to solve a system of equations through Cholesky factorization, and I found something strange. Whenever I take A = [100x100], the execution time is longer than for A = [120x120]. Ideally, a bigger problem size should take more time. I ran it on a single core and on multiple cores, and the behavior is the same in both cases. Any possible guesses or explanations?

Firstly, why doesn't it show up on your curve for one core? You said you had the same problem?

Secondly, you should not use several threads/cores for those problem sizes. They correspond to what we use for the tile size, so there is no good explanation. Maybe in one case all the tasks, of which there are only 3 or 4, are executed by the same core, while in the other case they are distributed over several cores. Or maybe it appears when you just go over the NB parameter, which creates new tasks. If you use the default parameters, NB is set to 120, so the copy from LAPACK layout to tile layout and from tile layout back to LAPACK is done with M=LDA. Maybe MKL is a little slower with a leading dimension greater than M.

Finally, if you want to do timing, you should use the timing directory, which is more complete and up to date. It will allow you to change parameters more easily.