SP: Smoothing F: movav P: prospectr |
It is a simple moving average of the spectral data computed using a convolution function. |
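As a sketch of the operation (AlradSpectra itself calls R's prospectr::movav; the Python function below is illustrative):

```python
# Pure-Python sketch of a centered moving-average smoother; the app uses
# prospectr::movav in R, so this function name and signature are illustrative.
def moving_average(spectrum, window=3):
    """Smooth reflectance values with a centered moving average."""
    half = window // 2
    smoothed = []
    for i in range(half, len(spectrum) - half):
        segment = spectrum[i - half:i + half + 1]
        smoothed.append(sum(segment) / window)
    return smoothed  # shorter than the input: edge bands are dropped

print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3))  # [2.0, 3.0, 4.0]
```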
SP: Binning F: binning P: prospectr |
Binning is used to reduce the effects of minor observation errors by computing average values of the spectral data over groups of adjacent bands. To perform spectral binning, the bin size has to be specified. |
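A minimal sketch of the averaging step (the app calls prospectr::binning; this pure-Python function is illustrative):

```python
# Illustrative binning: average each consecutive group of bin_size bands.
def bin_spectrum(spectrum, bin_size):
    bins = []
    for i in range(0, len(spectrum), bin_size):
        group = spectrum[i:i + bin_size]
        bins.append(sum(group) / len(group))  # last bin may be smaller
    return bins

print(bin_spectrum([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], bin_size=2))  # [1.5, 3.5, 5.5]
```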
SP: Absorbance F: A = log10(1/R) |
Absorbance quantifies the amount of light absorbed by a sample at a given wavelength, computed here from the measured reflectance (R). |
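The transformation is a direct application of the formula above:

```python
import math

# Apparent absorbance from reflectance, A = log10(1/R) (sketch).
def absorbance(reflectance):
    """Convert reflectance values (0 < R <= 1) to absorbance."""
    return [math.log10(1.0 / r) for r in reflectance]

print(absorbance([1.0, 0.1]))  # [0.0, 1.0]
```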
SP: Detrend F: detrend P: prospectr |
Detrend normalizes the spectral data by applying a standard normal variate transformation followed by fitting a second-degree polynomial regression model and returning the fitted residuals. |
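The two steps can be sketched in pure Python: SNV first, then a second-degree polynomial fitted over wavelength via the normal equations, returning the residuals (AlradSpectra calls prospectr::detrend in R; all function names below are illustrative):

```python
import statistics

def snv(spectrum):
    """Standard normal variate: center to zero mean, scale to unit sd."""
    m, s = statistics.mean(spectrum), statistics.stdev(spectrum)
    return [(x - m) / s for x in spectrum]

def polyfit2(x, y):
    """Least-squares fit of y = a + b*x + c*x^2 via the 3x3 normal equations."""
    sx = [sum(xi ** k for xi in x) for k in range(5)]   # sums of x^0..x^4
    A = [[sx[0], sx[1], sx[2]],
         [sx[1], sx[2], sx[3]],
         [sx[2], sx[3], sx[4]]]
    b = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    for col in range(3):                                 # Gaussian elimination
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):                                  # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, 3))) / A[r][r]
    return coef

def detrend(wavelengths, spectrum):
    """SNV followed by removal of a fitted second-degree polynomial trend."""
    z = snv(spectrum)
    a, b, c = polyfit2(wavelengths, z)
    return [zi - (a + b * w + c * w * w) for w, zi in zip(wavelengths, z)]
```

A purely quadratic spectrum is flattened to (near-)zero residuals, which is the intended effect of the transformation.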
SP: Continuum Removal (CR) F: continuumRemoval P: prospectr |
Continuum Removal removes the continuum features of the spectra and is often used to isolate specific absorption features present in the spectrum and to minimize noise. The continuum is represented by a mathematical function used to separate and highlight specific absorption bands of the reflectance spectrum. |
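One common choice of that mathematical function is the upper convex hull of the spectrum, which is interpolated at every band and divided out. A pure-Python sketch under that hull-based assumption (the app calls prospectr::continuumRemoval; this function is illustrative):

```python
def continuum_removal(wavelengths, reflectance):
    """Divide a spectrum by its upper convex hull (one common continuum model)."""
    pts = list(zip(wavelengths, reflectance))
    # Upper convex hull via a monotone chain over wavelength-sorted points.
    hull = []
    for p in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Pop the middle point if the chain turns left (point lies below).
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    # Linearly interpolate the hull at every wavelength, then divide.
    out, seg = [], 0
    for x, y in pts:
        while seg < len(hull) - 2 and hull[seg + 1][0] < x:
            seg += 1
        (x1, y1), (x2, y2) = hull[seg], hull[seg + 1]
        cont = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
        out.append(y / cont)
    return out
```

Bands on the hull map to 1, and absorption features appear as dips below 1.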
SP: Savitzky–Golay Derivative F: savitzkyGolay P: prospectr |
Derivatives are used to remove unimportant baseline signal from the samples by differentiating the measured responses with respect to wavelength. The Savitzky-Golay derivative algorithm requires selection of the number of smoothing points (filter width) and of the polynomial and derivative orders. |
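For one fixed choice of those settings (5-point window, second-order polynomial, first derivative), the filter reduces to a fixed convolution with the tabulated Savitzky-Golay weights (-2, -1, 0, 1, 2)/10. A pure-Python sketch of that special case (the app uses prospectr::savitzkyGolay, which supports general settings):

```python
def sg_first_derivative(spectrum, delta=1.0):
    """Savitzky-Golay 1st derivative: 5-point window, 2nd-order polynomial.

    Uses the tabulated convolution weights (-2, -1, 0, 1, 2) / 10;
    delta is the spacing between bands.
    """
    weights = [-2, -1, 0, 1, 2]
    out = []
    for i in range(2, len(spectrum) - 2):
        acc = sum(w * spectrum[i + k - 2] for k, w in enumerate(weights))
        out.append(acc / (10.0 * delta))
    return out

# A straight line has a constant first derivative:
print(sg_first_derivative([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))  # [1.0, 1.0, 1.0]
```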
SP: Standard Normal Variate (SNV) F: standardNormalVariate P: prospectr |
Standard Normal Variate is applied to spectral data to remove scatter. It operates on each spectrum individually, centering it on its own mean and scaling it by its own standard deviation. |
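The per-spectrum correction is compact enough to sketch directly (illustrative Python; the app calls prospectr::standardNormalVariate):

```python
import statistics

# Each spectrum is corrected on its own: subtract its mean, divide by its
# standard deviation. No other spectra are needed.
def snv(spectrum):
    m = statistics.mean(spectrum)
    s = statistics.stdev(spectrum)
    return [(x - m) / s for x in spectrum]

print(snv([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```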
SP: Multiplicative Scatter Correction (MSC) F: msc P: pls |
Multiplicative Scatter Correction is achieved by regressing each measured spectrum against a reference spectrum. The MSC is effective in minimizing baseline offsets and multiplicative effects. The outcome of MSC is, in many cases, very similar to SNV, except that SNV corrects each spectrum individually and does not require the entire data set. |
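A sketch of the regression step, assuming the mean spectrum as the reference (a common choice; the app uses msc from the pls package, and the Python names here are illustrative):

```python
import statistics

def msc(spectra):
    """Multiplicative scatter correction (sketch): regress each spectrum on
    the mean spectrum (x ~ a + b*ref) and return (x - a) / b."""
    n_bands = len(spectra[0])
    ref = [statistics.mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = statistics.mean(ref)
    corrected = []
    for s in spectra:
        s_mean = statistics.mean(s)
        b = (sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s))
             / sum((r - ref_mean) ** 2 for r in ref))   # regression slope
        a = s_mean - b * ref_mean                        # regression intercept
        corrected.append([(x - a) / b for x in s])
    return corrected
```

Because the reference is the mean spectrum of the set, MSC needs the whole data set, which is exactly the contrast with SNV noted above.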
SP: Normalization F: data.Normalization P: clusterSim |
Normalization means adjusting values measured on different scales to a common scale, which removes scattering effects. Five types of normalization are included in AlradSpectra: standardization, normalization in range, quotient transformation, normalization, and normalization with zero as the central point. |
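Two of the five options can be sketched directly (illustrative pure Python; the app calls clusterSim::data.Normalization, which selects the variant via a type argument):

```python
import statistics

def standardize(values):
    """Standardization: (x - mean) / standard deviation."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [(x - m) / s for x in values]

def normalize_in_range(values):
    """Normalization in range: rescale linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

print(standardize([1.0, 2.0, 3.0]))         # [-1.0, 0.0, 1.0]
print(normalize_in_range([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```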
M: Multiple Linear Regression (MLR) F: glmStepAIC P: caret |
Multiple Linear Regression is a statistical method that uses several explanatory variables to predict the outcome of a response variable with a linear model (Galton, 1886). The MLR assumes that the relationships between the independent variables and the dependent variable are linear. |
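A plain ordinary-least-squares fit illustrates the linear model itself; note that AlradSpectra additionally performs stepwise variable selection through caret's glmStepAIC, which this pure-Python sketch omits:

```python
def mlr_fit(X, y):
    """Ordinary least squares for y = b0 + b1*x1 + ... via normal equations.
    A sketch of the plain linear model only, with no stepwise selection."""
    rows = [[1.0] + list(r) for r in X]            # prepend intercept column
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for col in range(k):                            # Gaussian elimination
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):                  # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    return coef

coef = mlr_fit([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]],
               [3.0, 4.0, 6.0, 8.0])
# coef is approximately [1.0, 2.0, 3.0]: intercept, then one slope per predictor
```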
M: Partial Least Squares Regression (PLSR) F: plsr P: pls |
Partial Least Squares Regression can handle complicated relationships between predictors and responses and can deal with complex modeling problems. Additionally, PLSR is a method for constructing predictive models when the factors are many and highly collinear (Wold et al., 1984), which is the case for hyperspectral data. |
M: Support Vector Machines (SVM) F: svm P: e1071 |
Support Vector Machines are a group of supervised learning methods that extend generalized linear algorithms to nonlinear models, with the capability of training nonlinear classifiers (Ivanciuc, 2007). A criterion associated with the SVM algorithm is that a smaller number of support vectors yields a better model performance (Loosli et al., 2007). |
M: Random Forest (RF) F: randomForest P: randomForest |
Random Forest is a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest (Breiman, 2001). The RF is versatile and flexible with small or large data sets, although model interpretability is an issue when compared to linear models. |
M: Gaussian Process Regression (GPR) F: gausspr P: kernlab |
Gaussian Process Regression is a nonparametric regression method based on Gaussian processes, which applies a kernel function for training and prediction. In machine learning, kernel methods are a class of algorithms for pattern analysis. This approach replaces the features (predictors) with a kernel function. Several classes of kernels can be used for machine learning, and the selection of the kernel is critical to the success of these algorithms (Karatzoglou et al., 2004). |
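The role of the kernel can be sketched with a pure-Python posterior-mean computation using an RBF (Gaussian) kernel, one of the kernels kernlab supports; the tiny linear solver and the function names below are illustrative, not kernlab's API:

```python
import math

def rbf(a, b, length=1.0):
    """RBF (Gaussian) kernel between two feature vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * length ** 2))

def gpr_predict(X, y, x_new, noise=1e-6):
    """GP regression posterior mean: k(x*, X) (K + noise*I)^-1 y."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Solve K * alpha = y by Gaussian elimination with partial pivoting.
    A = [row[:] for row in K]
    b = list(y)
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    alpha = [0.0] * n
    for r in range(n - 1, -1, -1):
        alpha[r] = (b[r] - sum(A[r][c] * alpha[c] for c in range(r + 1, n))) / A[r][r]
    # Prediction depends on the data only through kernel evaluations.
    return sum(rbf(x_new, X[i]) * alpha[i] for i in range(n))
```

Note that the prediction touches the inputs only through the kernel, which is why the choice of kernel determines what kinds of relationships the model can express.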