I understand how Texel tuning works for PSTs, or for other evaluation parameters. But I don't understand how someone can tune piece values this way. In the sample positions (some millions), the number of queens will almost always be the same (1 per color), and the number of pawns will also mostly be the same (+/-1). Only the number of knights/bishops/rooks will sometimes differ, thanks to exchanges that have been made. But among the millions of positions used, most will show the exact same number of pieces on the two sides. This way there is no chance to tune piece values, I think: the convergence rate would be infinitely small, and mini-batch or stochastic gradient sampling will almost always lead to a near-zero gradient for the queen or pawn value.
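To make the objection concrete, here is a small sketch assuming the usual mean-squared-error loss and logistic win-probability mapping (the function names and the 400-centipawn scaling inside the sigmoid are illustrative conventions, not from this thread):

```python
import math

def sigmoid(x, k=1.0):
    # Texel-style mapping from centipawn eval to expected score; k is the scale factor
    return 1.0 / (1.0 + math.exp(-k * x / 400.0))

def queen_gradient_term(result, static_eval, wq, bq, k=1.0):
    """Hypothetical contribution of one position to dE/d(queen value)
    for the loss E = (result - sigmoid(eval))^2.
    The chain rule leaves a factor (wq - bq), the queen-count difference,
    so a position with equal queen counts contributes exactly zero."""
    s = sigmoid(static_eval, k)
    return -2.0 * (result - s) * s * (1.0 - s) * (k / 400.0) * (wq - bq)

# A balanced position (one queen each) contributes nothing to the queen gradient:
print(queen_gradient_term(1.0, 150, wq=1, bq=1) == 0.0)  # True
```

This is exactly the worry stated above: the gradient only picks up signal from positions where the counts differ.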

There also is little need to do it, because Larry Kaufman already did it for us, and shared the results.

It is true that you cannot derive imbalance values for, say, 3 Queens vs 7 Knights this way. But if you are interested in those, you can simply generate your own set of positions with imbalances of that type through computer self-play from start positions with a similar imbalance. In the worst case the self-play was poor because it used completely different values from those that came out of the tuning procedure. But then you simply repeat the procedure through self-play with the better-tuned values, and so on.

You cannot interpret these piece values outside the context of other evaluation terms. Especially '1 Pawn' is an ill-defined concept. There are many kinds of Pawns, varying in value from 50cP to 250cP, depending on whether it is a blocked, isolated, doubled edge Pawn or a 7th-rank passer. The common lore that R=5 and Q=9 was probably derived from endgames with few Pawns, where most Pawns that compensate the piece imbalance are passers, not scattered too much. In the world of Pawn evaluations that would count as one of the better Pawns.

You have to fix something.
You can fix the MG pawn value at 100, if you like. Personally, I fix the scale factor between evaluation score and expected outcome (I think it’s called “k” in most posts that describe the method) at 1.0.
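The reason something must be fixed is a scale invariance in the model: multiplying every evaluation term by a constant c while dividing k by c leaves every predicted outcome unchanged, so one degree of freedom is redundant. A minimal sketch (the 400-centipawn constant in the logistic is an assumed convention for illustration):

```python
import math

def win_prob(eval_cp, k):
    # Logistic mapping from evaluation (centipawns) to expected score
    return 1.0 / (1.0 + math.exp(-k * eval_cp / 400.0))

# Scaling the whole evaluation by c and k by 1/c changes nothing,
# which is why either k or one reference value (e.g. MG pawn = 100) is pinned:
c = 2.5
p1 = win_prob(137.0, 1.0)
p2 = win_prob(137.0 * c, 1.0 / c)
print(abs(p1 - p2) < 1e-12)  # True
```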
It is true that the positions you sample must have the material imbalance terms you want to tune, but that is actually true of all terms you want to tune. You cannot tune passed pawn terms on positions that don’t have passed pawns, for instance.

Yes, I am working at a fixed "k", using the optimal one based on the initial guess parameters.

I meant: shouldn't at least one piece value be fixed as well? That's why I was saying pawnMG = 100; of course a malus/bonus will be applied to that base value, for a passed pawn for instance.

Using mini-batch gradient descent (with momentum, alpha = 0.9, normalized gradient and a learning rate of 20, with a batch size of 1024; the full data set is 1,300,000 positions: the famous Ethereal.fens), the optimal "k", and starting from piece values all equal to 450, I see the error decrease but the piece values going quite weird. After 1200 iterations I get this:

The values Larry Kaufman gives are just a guideline that most engines were using in the past. All values are relative to each other, so it really doesn't matter what value you assign to a pawn to begin with.

In the past I used hand-tuned values, and I started with values like { 100, 325, 325, 500, 900 }; these are really the old-school values. After tuning them with Texel-style tuning, they now look like this:

Of course these values also depend upon the values of the PST and the values you assign to the positional evaluation; my PSTs are scaled in such a way that the sum over all squares for one piece == 0. Personally I think that the whole concept of PSTs is flawed; it is just a remnant of the past, where we used to have lazy evaluation with material and just a few positional terms to gain a few percent in speed.
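The zero-sum PST convention mentioned above can be sketched like this (normalize_pst is a hypothetical helper, not from any particular engine; the shifted-out average ends up in the material value instead):

```python
def normalize_pst(pst):
    """Shift a piece-square table so its entries sum to zero,
    returning the centered table and the average that was removed
    (which would be folded into the piece's material value)."""
    avg = sum(pst) / len(pst)
    return [v - avg for v in pst], avg

# Toy 4-entry "table" for illustration (a real PST has 64 entries):
centered, shift = normalize_pst([10, 20, 30, 40])
print(sum(centered))  # 0.0
print(shift)          # 25.0
```

With this convention the PST only encodes placement preferences, and the tuned material values are directly comparable between pieces.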

BTW, my k-factor is currently 1.68, but that also doesn't matter much, of course.

My material evaluation is still in centipawns, but for the positional evaluation I use millipawns because the resolution seemed a bit too low.

Over here gradient descent works fine, I don't see any problem with it, so if it goes weird in your case there must be a bug of some kind.