24 feb. 2017

The Win ISI help in the software explains some concepts about how LOCAL works. As Marc says in a comment, there is not much information available, so this Help is important. We will continue, because I have some information about how the algorithm for the weighted average works, so I hope to give you more details in the coming days.

LOCAL Optimization

Background

LOCAL™ is a patented calibration technique developed by Infrasoft International LLC. For information on LOCAL and how it works, please see the LOCAL discussion topic.

WINISI 4™ introduces an enhanced version of LOCAL that will allow calibration originators to easily optimize the regression parameters for LOCAL databases.

Overview

LOCAL regressions are optimized using a number of calibration parameters. Among the most important are the maximum and minimum number of PLS factors used. The enhanced version of LOCAL (available in Win ISI 4) can automatically determine the optimal values for the maximum and minimum number of PLS factors.

Details

LOCAL calibrations work by creating temporary custom calibrations for each unknown sample. The calibrations are made from a small sample set chosen by selecting spectrally similar samples from a larger calibration library. The spectrally similar samples are then regressed using PLS. The regression is performed for each number of factors in the range described by the minimum number of factors and the maximum number of factors. The final predicted value is the weighted average of all predictions over that range of factors. (This point is important: several predictions are made, one for every number of factors in the range chosen between the minimum and the maximum. At the same time, several GH and NH values are calculated over that range based on the PLS scores, just as PL1 does in the "Create a Score file from a Spectra file" option in Win ISI.)
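As a rough illustration only (not the actual Win ISI implementation: its sample weighting scheme is proprietary, so equal weights are assumed here, and a minimal NIPALS PLS1 stands in for the real regression), the idea could be sketched like this:

```python
import numpy as np

def pls1_predictions(X_cal, y_cal, x, max_factors):
    """Fit PLS1 (NIPALS) on the local calibration set and return one
    prediction of x for every number of factors 1..max_factors."""
    Xm, ym = X_cal.mean(axis=0), y_cal.mean()
    X, y = X_cal - Xm, y_cal - ym
    W, P, q, preds = [], [], [], []
    for _ in range(max_factors):
        w = X.T @ y
        w = w / np.linalg.norm(w)       # weight vector for this factor
        t = X @ w                       # scores
        p = X.T @ t / (t @ t)           # spectral loadings
        qk = y @ t / (t @ t)            # reference loading
        X = X - np.outer(t, p)          # deflate spectra
        y = y - qk * t                  # deflate reference values
        W.append(w); P.append(p); q.append(qk)
        # regression vector using all factors extracted so far
        Wm, Pm = np.column_stack(W), np.column_stack(P)
        b = Wm @ np.linalg.solve(Pm.T @ Wm, np.array(q))
        preds.append(ym + (x - Xm) @ b)
    return np.array(preds)

def local_predict(lib_X, lib_y, x, n_select, fmin, fmax):
    """LOCAL-style prediction sketch: select the most spectrally
    similar library samples, regress with PLS for every factor count
    in [fmin, fmax], and average the predictions (equal weights here;
    the actual LOCAL weighting is proprietary)."""
    sim = np.array([np.corrcoef(x, s)[0, 1] for s in lib_X])
    idx = np.argsort(sim)[-n_select:]   # n_select most similar spectra
    preds = pls1_predictions(lib_X[idx], lib_y[idx], x, fmax)
    return preds[fmin - 1:fmax].mean()
```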

The LOCAL program in WinISI III worked by computing a weighted average of predicted values from PLS prediction models that vary the number of PLS factors. The user specified in advance the minimum and maximum number of PLS factors to include in the weighted average. To evaluate a different combination of Minimum and Maximum number of factors, a completely new regression needed to be performed.

The new LOCAL analysis program stores all the data needed to compute a weighted average with any minimum and maximum number of PLS factors for each sample in the test file. As a post-processing step, the program evaluates all possible minimum and maximum pairs to determine the set with the smallest prediction error on the test file. Thus the LOCAL analysis program might evaluate a 4 - 30 factor range and output an optimal range of 4 - 15 factors for best performance. The Min / Max range of 4 - 15 would then be entered into ISIScan™.
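The post-processing step can be pictured as a simple grid search over all (minimum, maximum) pairs. This sketch assumes equal weights within each range (the real LOCAL weighting is proprietary) and a stored matrix of per-factor predictions for the test file:

```python
import numpy as np

def best_factor_range(preds, y_ref):
    """preds[i, k] holds the prediction for test sample i using k+1
    PLS factors. Evaluate every (min, max) pair of factor counts and
    return the pair with the lowest RMSEP on the test set, averaging
    predictions with equal weights within the range."""
    n, kmax = preds.shape
    best = (1, kmax, np.inf)
    for lo in range(1, kmax + 1):
        for hi in range(lo, kmax + 1):
            yhat = preds[:, lo - 1:hi].mean(axis=1)
            rmsep = np.sqrt(np.mean((yhat - y_ref) ** 2))
            if rmsep < best[2]:
                best = (lo, hi, rmsep)
    return best
```

With a 4 - 30 factor run, `best_factor_range` would report something like the 4 - 15 range mentioned above, which is what would then be entered into ISIScan.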

I see that many of you like the LOCAL concept, so why not try to get deeper into it? I have some questions and hope to give you answers, as far as I know and understand.

From this blog, thanks to Mark Westerhaus for answering my mails whenever I have some doubts.

First we have to say that when a sample is analyzed, a certain number of samples are selected for every constituent, and the "GH reported for each constituent is an average of several H values, using the same weights and range of factors used for the final predicted value" (MW).

"Each H value is computed during a PLS calibration using the PLS scores, and is equivalent to using a PL1 file" (MW). In the LOCAL database there can be several products quite different from one another, but that is not the purpose of LOCAL. The idea is to have a library of closely related products, and in that case we should be able to find a common standardization for all of them.
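A sketch of how an H value can be computed from PLS scores. The scaling to a "GH" below assumes the usual Win ISI convention of dividing the Mahalanobis distance by the number of factors so that the library average comes out close to 1; treat that exact scaling as an assumption:

```python
import numpy as np

def gh_from_scores(T_cal, t_new):
    """Mahalanobis H of a new sample in PLS-score space, divided by
    the number of factors (assumed Win ISI-style GH scaling, so the
    average GH over the calibration samples is close to 1)."""
    mean = T_cal.mean(axis=0)
    Tc = T_cal - mean
    cov = Tc.T @ Tc / (len(T_cal) - 1)   # covariance of the scores
    d = t_new - mean
    return (d @ np.linalg.solve(cov, d)) / T_cal.shape[1]
```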

20 feb. 2017

It is convenient to check how well the histograms of the groups match before the calibration procedure. In this case the whole sample set has been split into four groups for a LOCAL calibration study. One group was kept out for validation and the other three were mixed for calibration, making all possible combinations.

Before proceeding, a look at the histograms will help to see how well the groups are balanced.
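The splitting described above, assuming random assignment of the samples to the four groups, could be sketched as:

```python
from itertools import combinations  # noqa: F401  (combinations of 3-of-4 groups are generated implicitly below)
import numpy as np

def four_group_splits(n_samples, seed=0):
    """Split sample indices into four random groups and yield every
    calibration/validation combination: three groups mixed for
    calibration, the remaining one kept out for validation
    (four combinations in total)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    groups = np.array_split(idx, 4)
    for i in range(4):
        val = groups[i]
        cal = np.concatenate([g for j, g in enumerate(groups) if j != i])
        yield cal, val
```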

19 feb. 2017

We can have a cal file with several parameters, and with all of them we start a LOCAL study to develop a LOCAL calibration. We can split the cal file into two or more sample sets (at least one for the calibration and another for the validation; 75% of the samples for calibration and 25% for validation, selected in a random way, could be fine). Now we can start the LOCAL procedure to find the best configuration for:

Minimum and maximum number of samples to select (there is a batch option to check this).

Minimum and maximum number of terms for the calibration.

Wavelength range.

Math treatments.
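A minimal sketch of the random 75/25 calibration/validation split mentioned above:

```python
import numpy as np

def cal_val_split(n_samples, val_fraction=0.25, seed=0):
    """Randomly assign sample indices: about 75 % of the samples for
    calibration and about 25 % for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_val = int(round(val_fraction * n_samples))
    return idx[n_val:], idx[:n_val]   # (calibration, validation)
```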

If we select all the constituents for this study, we will get the minimum and maximum range for the terms, and probably a maximum close or equal to 50 (due to the ash constituent, for example) and a minimum of 3 (due to the moisture). So this is the configuration that we have to add to the LOCAL model profile.

That does not mean that it will use 50 terms for the moisture, because the algorithm treats every sample and constituent individually.

You can split the cal file into several cal files of just one constituent each and run the study for one of them at a time. You will see that, in the case of the moisture, you get a different minimum and maximum for each file, and the overall range, seen all together, is almost the same as the one we get from the LOCAL study with all the constituents.

Splitting the cal file by constituent can be important when some constituents work better with one math treatment while another treatment is better for the others. In that case we can prepare different RED files and different configurations.

17 feb. 2017

I share this video as an example of the transflectance concept used to analyze liquid samples in a reflectance instrument. In this case the cup used is a slurry cup, which is easier to clean, and the reflector is placed over the sample so the light goes through the sample and is reflected back to the detector. It is important to select the right pathlength in order to get a good signal and avoid saturated NIR bands in the spectra.

15 feb. 2017

This is the case of a sample scanned in the same cup at different temperatures. Normally when we scan a sample it is at a temperature similar to the laboratory's, but sometimes we receive the sample from the process and it can be warmer, and for different reasons we analyze it warmer than usual. There are other cases, especially in winter, when we take the sample from the truck and it is very cold, and we analyze it by NIR anyway.

In both cases we will probably get a warning from the Mahalanobis distance, and a strange result, because the model does not incorporate the variance due to the temperature of the sample. In this case, wait until the sample reaches the lab temperature and then analyze it by NIR.

Of course, you may be able to make the model robust to this effect.

In the next figure we can see the spectra of a sample in second derivative, scanned 46 times at 46 different temperatures (from very warm to very cold). All the spectra seem the same except in certain zones:

Now we calculate the average spectrum and subtract every spectrum from it. To do this I have exported the spectra to Excel, and these curious spectra appear:
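The same mean-subtraction done in Excel can be sketched as (the sign convention is arbitrary; either direction shows the same differences):

```python
import numpy as np

def spectra_minus_mean(spectra):
    """Subtract the average spectrum from every scan (rows = scans of
    the same sample at different temperatures); the residuals reveal
    the wavelength zones that change with temperature."""
    return spectra - spectra.mean(axis=0)
```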

As we can see, some kind of first-derivative shapes appear in the O-H wavelength zones, probably due to hydrogen bonds which shift the peaks of the water band.

Using repeatability files we can minimize this effect and obtain similar results when analyzing the same sample at different temperatures.