Spectroscopy and Chemometrics News Weekly #47, 2015

Near Infrared

NIR sensor determines dry matter while the feed mixer wagon is being loaded | feed component LINK

Ultra-low maintenance FTNIR analyzer for the refining & petrochemical industries | pauto LINK


Infrared

Seeing Through Crude Oil for Efficient Oil Separations using Short-Wave Infrared (SWIR) Cameras – AZoSensors LINK


Facts

RoboBees Can Fly and Swim. What’s Next? Laser Vision – Smithsonian UAS UAV LINK


Equipment

Scientists create an all-organic UV on-chip spectrometer – The U.S. Department of Energy’s Ames LINK


Agriculture

… detection of contaminants in agro-food products, … melamine levels in milk using vibrational spectroscopy LINK


Laboratory

Examining Pigmented Human Tissue using SWIR Raman Spectroscopy – AZoSensors LINK


Other

SCiO Molecular Scanner UNBOXING – Video LINK



CalibrationModel.com

Dear NIR-Spectrometer vendors, this is about how you can improve customer web-traffic | NIRS Spectrometer LINK

Efficient development of new quantitative prediction equations for multivariate NIR spectra | spectra LINK

How to Develop Chemometric Near-Infrared Spectroscopy Calibrations in the 21st Century? | NIR LINK

How to Develop Near-Infrared Spectroscopy Application Today? | pharma lab analysis chemist TechTrends LINK

Improve chemical analysis accuracy by optimized chemometric models for Near-Infra-Red (NIR) Spectroscopy LINK

Improving Accuracy, Precision and Robustness of NIR-analysis LINK

NewsLetter: Spectroscopy and Chemometrics News Weekly 46, 2015 | Molecular Spectroscopy NIRS Chemometrics Raman LINK

Pro Tip: The NIR calibration is the central key to accurate NIR measurement LINK

Services for professional Development of Near-Infrared Spectroscopy Calibration Methods | NIR Quality Testing LINK



Procedures for NIR calibration – Creation of NIRS spectroscopy calibration curves

Do you know this effect: you tend to try out your favorite data pretreatments in combination, and often try the same wavelength selections, chosen from the visualized spectra?

You try some six to ten combinations until you select a favorite calibration model, which you then continue to optimize. Only then do outliers suddenly stand out, because you are now going into depth: you have become familiar with the data, you know the spectrum numbers of the outliers, and you know the extreme values.

Now the focus shifts to the principal components (latent variables, factors), making sure the model is neither over-fitted nor under-fitted. The whole process takes a few hours, and in the end you are content with the model you found.

So what would happen if you removed the outliers you found from all the variants you tried at the beginning, and then re-evaluated and compared them? Would the results be better than those of your earlier model choice? Why does nobody try this? Because it is cumbersome and takes hours again?

We have developed software that simplifies this, so that the number of model variations can also be increased as desired. The generation of variants is automated by an intelligent control system, as are the optimization and comparison of the models and the final selection of the best calibration model.

Our software includes all the commonly used data pretreatment methods (data pre-processing) and combines them in useful ways. Many pretreatments depend directly on the wavelength selection: a normalization, for example, determines the scaling factors used to normalize the spectra within a wavelength range. Pretreatments are therefore combined with wavelength ranges. This yields a large variety of sensible model settings, all of which are calculated and optimized. For the automatic selection of the relevant wavelength ranges, different methods are used that are based on the spectral intensities. For example, regions with total absorption are not used, and the often interfering water bands are either removed or retained.
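To illustrate how a normalization can depend on the wavelength selection, here is a minimal sketch. It uses Standard Normal Variate (SNV) as one example of a normalization; which normalizations the software actually applies is not stated here, and the function name `snv_in_range` and the sample numbers are hypothetical.

```python
from statistics import mean, stdev


def snv_in_range(spectrum, wavelengths, lo, hi):
    """SNV-style normalization (illustrative assumption, not the vendor's code).

    The centering and scaling factors are estimated only from the points
    whose wavelength lies in [lo, hi]; they are then applied to the whole
    spectrum.
    """
    selected = [a for a, w in zip(spectrum, wavelengths) if lo <= w <= hi]
    if len(selected) < 2:
        raise ValueError("need at least two points inside the range")
    center = mean(selected)
    scale = stdev(selected)
    if scale == 0:
        raise ValueError("selected range has no variation")
    return [(a - center) / scale for a in spectrum]


# Hypothetical spectrum with a saturated point at 1400 nm (total absorption).
wavelengths = [1100, 1200, 1300, 1400, 1500]
absorbance = [0.2, 0.4, 0.6, 3.0, 0.8]

# Scaling factors come from 1100-1300 nm only, so the saturated point
# does not distort them.
normalized = snv_in_range(absorbance, wavelengths, 1100, 1300)
```

Changing the range changes the scaling factors and therefore every normalized value, which is why pretreatment and wavelength selection have to be varied together.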

A summary outlier analysis can be made across all the calculated model variations. If new outliers (hidden outliers) are discovered, all previous models can be automatically recalculated, optimized and compared without these outliers.

From this large number of calculated models and their statistical quality measures (prediction performance), the optimum calibration can now be selected. The models are not simply sorted by the prediction error (SEP, RMSEP) or the coefficient of determination (r²); instead, several statistical and test values are used jointly for the final assessment of the optimal calibration.
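One simple way to use several statistics jointly is to rank the models on each criterion and add up the ranks. The sketch below is only an illustration of that idea: the actual criteria and their weighting are not disclosed in this text, and the function names, the chosen statistics and the numbers are assumptions.

```python
def rank_positions(values, reverse=False):
    """Return the 0-based rank of each value (ties share the best rank)."""
    ordered = sorted(values, reverse=reverse)
    return [ordered.index(v) for v in values]


def select_by_joint_rank(models):
    """Pick the model with the lowest sum of ranks over four criteria."""
    rmsep = rank_positions([m["rmsep"] for m in models])          # lower is better
    bias = rank_positions([abs(m["bias"]) for m in models])       # lower is better
    r2 = rank_positions([m["r2"] for m in models], reverse=True)  # higher is better
    pcs = rank_positions([m["pcs"] for m in models])              # fewer is simpler
    totals = [sum(ranks) for ranks in zip(rmsep, bias, r2, pcs)]
    best = min(range(len(models)), key=totals.__getitem__)
    return models[best], totals


# Hypothetical candidates: A has the lowest RMSEP but a larger bias and many PCs.
models = [
    {"name": "A", "rmsep": 0.20, "bias": 0.08, "r2": 0.95, "pcs": 12},
    {"name": "B", "rmsep": 0.21, "bias": 0.01, "r2": 0.96, "pcs": 6},
    {"name": "C", "rmsep": 0.35, "bias": 0.02, "r2": 0.88, "pcs": 5},
]
best, totals = select_by_joint_rank(models)
```

Sorting by RMSEP alone would select model A in this example, while the joint ranking prefers model B.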

Thus we have created a platform that allows highly automated work of a kind that a person could never do by hand with commercial software.

We therefore offer the largest number of modeling calculations matched to your application problem, and choose the best calibration for you!

This means that our results are obtained faster, are more accurate, more robust and objective (person-independent), and are quite easy for you to apply.

You have the full control of the models supplied by us, because we provide a clearly structured and detailed blueprint of the complete calibration, with all settings and parameters, with all necessary statistical characteristics and graphics.

Using this blueprint, you can rebuild, understand and compare the quantitative calibration model yourself in the software you use. You have everything under control, from model creation to model validation and model refinement.

Your privacy is very important to us. The NIR data that you temporarily provide to us for the custom calibration development of course remain your property. Your NIR data will be deleted from our systems after the job.

Interested? Then do not hesitate to contact us.

NIRS Calibration Model Equation – Optimal Predictive Model Selection

To give you an insight into what we do to find the optimal model, imagine a NIR data set on which a NIR specialist works hard for 4 hours in his chemometric software, trying what he can with his chemometric, NIR-spectroscopic and product knowledge to get a good model. During the 4 hours he finds 3 final candidate models for his application, with RMSEP values of 0.49, 0.51 and 0.6. Now he has to choose one, or test all three models on newly measured NIR spectra.

That is common practice. But is this good practice?

And nobody asks: how long and how hard have you tried, how many trials have you done, and is this really the best model that is possible from the data?
And imagine the cost of the data collection, including the lab analytics!
Given these costs, have you really tried hard enough to get the best out of your data? Was the calibration done quick and dirty on a Friday afternoon? Yes, time is limited, and manually clicking around and waiting in this kind of software is not really fun. So what are the consequences?

Now I come to the most important point of all: if you own an expensive NIR spectrometer system, or even many of them, and your company has collected a lot of NIR spectra and expensive lab reference data over the years, do you spend just a few hours developing the model that will define the whole system’s measurement performance for the future? And ask yourself again (and your boss will ask you later): have you really tried hard enough to get the best out of your data? Really?

What else is possible? What does your competition do?

There is no measure (yet) of what can be reached with a specific NIR data set.
And this is very interesting, because there are different beliefs about whether a secondary method like NIR or Raman can be more precise and accurate than the primary method.

What we do differently is that our highly specialized software is capable of creating large numbers of useful calibrations to investigate these limits, i.e. what is possible. This is done by permutation and combination of spectra selection, wavelength selection, pre-processing sequences and PC selections. If you are familiar with this, you know that the number of possibilities is huge.
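How quickly such combinations multiply can be sketched with a few lines of Python. The option lists, the function names and the limit of two pretreatments per sequence are illustrative assumptions, and the selection of spectra is left out for brevity.

```python
from itertools import permutations, product


def pretreatment_sequences(methods, max_len):
    """All ordered sequences of up to max_len distinct pretreatments,
    including the empty sequence (raw spectra)."""
    sequences = [()]
    for length in range(1, max_len + 1):
        sequences.extend(permutations(methods, length))
    return sequences


def candidate_settings(wave_ranges, methods, max_len, max_pcs):
    """Yield one settings dictionary per combination to be calculated."""
    sequences = pretreatment_sequences(methods, max_len)
    for waves, seq, pcs in product(wave_ranges, sequences, range(1, max_pcs + 1)):
        yield {"waves": waves, "pretreatments": seq, "pcs": pcs}


# Hypothetical options: 3 wavelength ranges, 3 pretreatments, up to 10 PCs.
wave_ranges = [(1100, 2500), (1100, 1800), (1500, 2500)]
methods = ["SNV", "MSC", "1st derivative"]

settings = list(candidate_settings(wave_ranges, methods, max_len=2, max_pcs=10))
print(len(settings))  # 3 ranges * 10 sequences * 10 PC counts
```

Even this tiny option set gives 300 candidate settings; realistic option lists grow into the tens of thousands.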

For a pre-screening, we create e.g. 42’000 useful calibrations for the mentioned data set. By useful we mean that the model is usable, e.g. R² is higher than 0.8, which shows a good correlation between the spectra and the constituent, and that it is well fitted (neither over-fitted nor under-fitted), because the number of PCs for the calibration set is estimated using the validation set, and the predictive performance on the test set is used for model comparisons.
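The division of roles between the three sets can be sketched as follows. This is a simplified illustration under assumptions: the error curves per number of PCs are taken as given, the R² threshold is applied to the test set (the text does not say which set it refers to), and all names and numbers are hypothetical.

```python
def pick_pcs_on_validation(val_rmse_by_pc):
    """Return the 1-based number of PCs with the lowest validation-set RMSE.
    On ties, the smaller number of PCs wins."""
    best = min(range(len(val_rmse_by_pc)), key=val_rmse_by_pc.__getitem__)
    return best + 1


def screen(candidates, min_r2=0.8):
    """Choose PCs on the validation set, keep models with R² above min_r2,
    and sort the remaining models by their test-set RMSEP."""
    results = []
    for c in candidates:
        pcs = pick_pcs_on_validation(c["val_rmse"])
        r2 = c["test_r2"][pcs - 1]
        if r2 > min_r2:
            results.append({"name": c["name"], "pcs": pcs, "r2": r2,
                            "rmsep": c["test_rmse"][pcs - 1]})
    return sorted(results, key=lambda r: r["rmsep"])


# Hypothetical error curves for 1 to 4 PCs.
candidates = [
    {"name": "snv+d1", "val_rmse": [0.9, 0.5, 0.3, 0.35],
     "test_rmse": [0.95, 0.55, 0.32, 0.30], "test_r2": [0.5, 0.8, 0.93, 0.94]},
    {"name": "raw", "val_rmse": [1.0, 0.8, 0.7, 0.75],
     "test_rmse": [1.1, 0.9, 0.8, 0.7], "test_r2": [0.3, 0.5, 0.6, 0.7]},
    {"name": "msc", "val_rmse": [0.8, 0.4, 0.45, 0.5],
     "test_rmse": [0.85, 0.45, 0.4, 0.6], "test_r2": [0.6, 0.88, 0.9, 0.8]},
]
ranking = screen(candidates)
```

Note that for "snv+d1" the test-set RMSEP would be slightly lower with 4 PCs, but the number of PCs is fixed by the validation set, so the test set stays an independent measure for comparing models.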

Here the sorted RMSEP values of the test set are shown for 42’000 calibrations.
You can immediately see that the manually found performance of 0.49 lies just in the starting phase of our optimization. Interesting is the steep fall from 1.0 to 0.5, where manual optimization found its solutions; this range contains ca. 2500 different useful calibrations. The following, less steep fall from 0.5 to 0.2 contains many more useful models, and between 0.2 and 0.08 there are around 2500 different, obviously highly accurate models. So the golden needle is not among the first 2500 models; it must be somewhere among the last 2500 models in the haystack.

Sorted RMSEP plot of 42'000 NIR Calibration Model Candidates

That allows us to pick the best calibration out of 42’000 models based on multiple statistical evaluation criteria, not just R², RPD, SEC, SEP or RMSEP (or the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Multivariate AIC (MAIC), etc.).

Dendrogram plot of similar NIR Calibration Models

To compare the calibration models by similarity, dendrogram plots like this one (zoomed in) work best; they show the settings against the similarity of the models’ overall performance. In the settings you can see many different permutations of pre-processing steps combined with different wavelength selections.

NIR Spectroscopy Calibration Report for quantitative predictive models

When you send your quantitative NIR spectra data to our NIR Calibration Model Service, you get a detailed calibration report (calibration protocol) of the found optimal calibration settings, so you are able to see all insights and easily re-build the model in your NIR/Chemometric software.

Here is a part of our calibration report that describes exactly which data are used in the calibration set (CSet), the validation set (VSet) and the test set (TSet). The numbers are the IDs of the spectra in the NIR data file you delivered.


The calibration method settings and parameters are
Waveselection : the variable selection or wavenumber selection or wavelength selection
Pretreatments : the spectral data pre-processing
PCs : the number of  Principal Components (PC) or Latent Variables (LV)
Method : the modeling method algorithm used, e.g. PLS

Then follows the statistical analysis of the PLS model for the different sets (CSet, VSet, TSet).

Calibration Report

Statistical analysis of calibration, validation and test results : 1 Name, 2 Unit, 3 N : number of spectra, 4 N : number of samples, 5 Average spectra count per sample, 6 Reference values, 7 Min, 8 Mean, 9 Median, 10 Max, 11 Standard deviation, 12 Skewness : left (-) or right (+) lack of symmetry, 13 Kurtosis : flat (-) or peaked (+) shape, 14 Model statistics, 15 RPD, 16 R², 17 RMSEC, RMSEP, RMSET : root mean square of prediction errors, 18 SEC, SEP, SET : standard error (bias corrected), 19 Bias, 20 Skewness of prediction errors, 21 Kurtosis of prediction errors, 22 Intercept, 23 Slope, 24 Intercept (reverse), 25 Slope (reverse), 26 Sample Prediction Repeatability Error, 27 Sample Prediction Repeatability Error (of Missing data MSet)

This shows how we deliver the optimal settings. With the statistical values, the NIR model predicted values of all spectra and additional plots you are able to compare with your re-built model to verify that the models perform nearly equally.
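To check a re-built model against the report, the main model statistics can be recomputed from reference and predicted values. The sketch below uses common textbook definitions, which are assumptions here: errors are predicted minus reference, SEP is the bias-corrected standard error with n - 1 in the denominator, RPD is the standard deviation of the reference values divided by SEP, and R² is computed as 1 - SS_res / SS_tot. The report may define some of these differently.

```python
from math import sqrt
from statistics import mean, stdev


def prediction_statistics(reference, predicted):
    """Bias, RMSEP, SEP, RPD and R² for one set of predictions."""
    errors = [p - r for p, r in zip(predicted, reference)]
    n = len(errors)
    bias = mean(errors)
    ss_res = sum(e * e for e in errors)
    rmsep = sqrt(ss_res / n)
    # Bias-corrected standard error of prediction.
    sep = sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    rpd = stdev(reference) / sep
    ref_mean = mean(reference)
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    r2 = 1 - ss_res / ss_tot
    return {"bias": bias, "rmsep": rmsep, "sep": sep, "rpd": rpd, "r2": r2}


# Hypothetical lab reference values and NIR predictions.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 16.5]
stats = prediction_statistics(reference, predicted)
```

If the statistics of the re-built model agree with the reported ones to within rounding, the settings were transferred correctly.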

How to develop near-infrared spectroscopy calibrations in the 21st Century?


The Problem

Calibration modeling is a complex and very important part of NIR spectroscopy, especially for quantitative analysis. If the model is badly designed, the best instrument precision and the highest data quality do not help in getting good and robust measurement results. And NIR spectroscopy requires periodic recalibration and validation.


How are NIR models built today?

In typical industrial use, a single person is responsible for developing the models (see survey). He or she uses chemometric software with a click-and-wait working process: adjusting all the possible settings of the algorithms in dialogs, waiting for calculations and graphics, and then thinking about the next modeling steps, all with limited time. Do we expect to find the best usable or optimal model that way? How to develop near-infrared spectroscopy calibrations in the 21st Century?


Our Solution

Why not put all the knowledge a good model builder uses into software, and let the machines run the many possible calculations and present the results? This means designing the software so that the domain knowledge is built in, not just the machine learning algorithms, and making it possible to scale the calculations to multi-core computers and up to cloud servers. Extend the chemometric software with the domain knowledge and make as much computing power available as needed.

As it was since the beginning

User  → Chemometric Software → one Computer → some results to choose from

==> User’s time needed to click-and-wait for creating results

Our Solution

User → (Domain Knowledge → automated Chemometric Software) → many Computers → the best models

==> User’s time used to study the best models and reasoning about his product / process

Note that the “Domain Knowledge” here supports the user’s product and process knowledge, so that things get done right and efficiently.


Scaling at three layers

  • Knowledge : use the domain knowledge to drive the Chemometric Software
  • Chemometric Software : support many machine learning algorithms and data pre-processings and make it automatic
  • Computer : support multi-core calculations and scale it to the cloud

The hard part of doing this is, of course, aggregating the needed domain knowledge and transforming it into software. The domain knowledge for building chemometric NIR spectroscopic models is well known, but it is huge and spans multiple disciplines. Knowledge-driven computing software helps to find the golden needle in the haystack. It is the scaling that makes this possible. See Proof of Concept.


New possibilities

  • NIR users can get help to work more efficiently and to get better models.
  • New types of applications for NIR can be discovered.
  • Evaluation of NIR Applications to replace conventional analytical methods.
  • Seemingly hopeless calibration development efforts can be restarted.
  • Higher model accuracy and robustness can be delivered.
  • Automate the experimental data part of your application study.
  • Person independent optimization will show new solutions, because it’s not limited by a single mindset => combining all the aggregated knowledge and its combinations.
  • Software independent optimization will show new solutions, because no vendor-specific limitations or missing algorithms stand in the way => combining all openly available algorithms and their permutations.
  • Computing service is included.

Contact us for trial

Your NIR data is modeled by thousands of different useful calibration models and you get the best of them! That was not possible before in such an easy and fast way! See How it works

NIR Calibration Modeling (Part 2)

( to part 1 )

All the categories below are implemented using multiple different algorithms and formulas, which leads to many different calibrations.

Steps in modeling
  • Data Cleaning – (bad data, missing values, duplicate elimination, spectral quality / intensity / noise, input value typing errors, …)
  • Initial Calibration set up – selection of calibration, validation and test samples
  • Wavelengths selection
  • Data preprocessing, pretreatments
  • Method calculation
  • Choosing the number of Principal Components / Latent Variables
  • Validation of calibration model / Statistics of performance – (accuracy, precision, linearity, repeatability, range, distribution, robustness / stability, sensitivity, simplicity, etc.)
  • Outlier examination and removal


The problem of choosing the optimal number of factors, i.e. finding the optimum between underfitting and overfitting, is solved by implementing multiple methods and protocols, which leads to multiple calibrations.
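Two commonly used rules for choosing the number of factors from a validation error curve are sketched below, to show how different rules lead to different calibrations. They are generic examples and not necessarily the methods implemented in the service; the function names, the 5 % tolerance and the error values are assumptions.

```python
def pcs_at_global_minimum(rmse_by_pc):
    """Number of PCs (1-based) with the lowest validation RMSE."""
    return min(range(len(rmse_by_pc)), key=rmse_by_pc.__getitem__) + 1


def pcs_parsimonious(rmse_by_pc, tolerance=0.05):
    """Fewest PCs whose RMSE is within a relative tolerance of the minimum.
    Favors simpler models to reduce the risk of overfitting."""
    limit = min(rmse_by_pc) * (1 + tolerance)
    for index, rmse in enumerate(rmse_by_pc):
        if rmse <= limit:
            return index + 1


# Hypothetical validation RMSE for 1 to 6 PCs.
curve = [0.90, 0.52, 0.41, 0.40, 0.395, 0.42]

print(pcs_at_global_minimum(curve))  # 5
print(pcs_parsimonious(curve))       # 3
```

Both answers are defensible for this curve, so each one is kept as a separate calibration candidate and the final choice is left to the overall model comparison.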

The evaluation and the selection of the best calibration are based on many individual statistical values, including the most popular ones such as RMSEP, SEP, bias, SEC, R² and the number of PCs.

Results and Reporting

A detailed calibration report is provided that describes the best available calibration, containing all calibration parameter settings and the prediction performance statistics of the calibration set, the validation set and the test set. A visual representation of the calibration is provided with the most important plots.

Our service works with any quantitative NIR spectra data set in the standard JCAMP-DX format and uses mainly PLS and PCR to be compatible with other chemometric calibration software.

NIR Calibration Service

Services and software for data analysis and analytical modeling for spectroscopy.

This NIR calibration service provides the custom development of optimal quantitative NIR calibration models based on your collected NIR and reference data for vendor independent full range NIR spectrometer analyzers (NIR = Near Infra Red spectroscopy) based on chemometric multivariate methods like Partial Least Square Regression (PLS, PLSR) and Principal Component Regression (PCR).

The key points

The NIR calibration model is decisive for the analysis accuracy.

NIR analysis results make the difference.

Near-Infrared Data Modeling Calibration Service

The problems

Imagine how many publications and how much literature on NIR spectroscopy (JNIRS) and chemometrics (Journal of Chemometrics) exist.

Have you found the time to pick out the right ones, to read and study them, and to put them into practice? Do you always have all this knowledge at hand during your calibration development, so that you consider everything important, interpret the statistical results correctly, analyze the graphs accurately, and apply all the optimization tips & tricks correctly?

We have the solution for you!

We’ll help you to create and optimize your calibrations. You retain complete control. With our help, you have your calibration under control yourself.

You can view the complete calibration with all its settings, precisely documented and visualized down to the smallest detail.

You can also make any changes to the settings. This means you remain independent and keep control in your own hands.

We will help you with the time-consuming and knowledge-intensive part. You get the best calibration solution and decide for yourself.

Try it and see for yourself