Abstract
To assess the physiological quality of soybean seeds, regulations require germination of at least 80% for a batch of seeds to be approved. With the evolution of technology and computing, spectrophotometric methods are advancing, making it possible to analyze the molecular vibrations present in seed samples and correlate them with germination results. The aim of this work was to analyze the use of Near Infrared (NIR) spectrometry with the aid of machine learning, in conjunction with physiological attributes, to predict the classification of soybean seed samples into accepted (at least 80% germination) and rejected (below 80% germination), according to the legislation. A total of 134 soybean samples were analyzed for germination on paper, vigor by first count and tetrazolium. The NIR equipment was used to obtain the spectral curves of these samples. The Savitzky-Golay filter was used for pre-processing. Data processing showed that the algorithm with the highest precision and accuracy was Random Forest, with 96.1% recall and 98% average precision for the class of accepted lots, and 98.8% and 97.5%, respectively, for the class of rejected lots. After applying the algorithms, it was found that the range of 807.18 to 817.65 nm can express information on physiological attributes and can be used to rank batches of soybean seeds, allowing rapid estimation of their physiological quality. Spectroscopy can be a complementary methodology with a quick response, with the potential to be used analogously to physiological quality analysis.
Keywords: machine learning; germination; physiological quality; NIR; Glycine max
HIGHLIGHTS
The NIR technique expresses information on the physiological attributes of seeds.
The Random Forest algorithm showed 98% accuracy in classifying soybeans.
NIR, in conjunction with machine learning, is an effective approach.
INTRODUCTION
Soybeans are one of the most important agricultural commodities in the world and are becoming increasingly important on the world stage [1], due to their versatility as human and animal food, as well as their economic value [2]. Brazil is among the largest producers, with 43.459 million hectares of planted area [3], spread across different regions of the country, corresponding to around 56% of the area occupied by grains in Brazil. Among the states with the highest production are Mato Grosso, Paraná, Rio Grande do Sul and Goiás respectively [3]. Given this scenario, quality criteria for soybeans and seeds, and their marketing standards, are strict. However, the traditional analytical procedures used in these quality controls are time-consuming, destructive and expensive, requiring modernization of decision-making methods [4], [5], with fast and non-destructive methods [6].
In the search for faster techniques to assess the physiological quality of soybeans, optical methods such as near-infrared spectroscopy (NIR) can stand out [7]. The information obtained from the near infrared provides faster data interpretation, since there is a large amount of information on the seed's physiological quality components [4].
With the evolution of technology and computing, spectrophotometric methods are advancing, making it possible to analyze the molecular vibrations present in samples [8] [9] [6]. By analyzing molecular vibrations, information can be obtained about the structure and conformation of the molecules present in seeds, which is directly related to their biological functionality.
Therefore, the analysis of molecular vibrations can be a powerful tool for evaluating the quality of seeds, helping to select the best performing seeds and contributing to the optimization of agricultural production processes [10]. The infrared spectroscopy technique can identify and analyze the molecular vibrations of the substances present in seeds [11] [10].
The use of NIR combined with artificial intelligence (AI) methods to analyze post-harvest seed processes is promising for investigating physiological quality and generating reliable results. AI is a field of computer science dedicated to the study and development of computer equipment and programs capable of reproducing human behavior in decision-making and multitasking simultaneously. According to [12], the main objective of this technique is to be able to predict or classify a variable based on previously collected data. The detection of seed physiological quality and the NIR technique have been reported in recent publications, with significant advantages for the technology [13] [14]. Machine learning and NIR are therefore promising tools for improving seed physiological quality control methods.
The objective of this work was to evaluate the use of the NIR technique as an alternative for the rapid determination of quality control parameters, predicting the classification of soybean seed lots based on these data, correlated to the quality attributes determined by the legislation governing the quality of commercial seeds.
MATERIAL AND METHODS
A total of 134 soybean seed samples received from the production fields were used, described by 13 attributes (Table 1). Of these, 81 lots were accepted and 53 were rejected, based on the results of the first germination assessment carried out when the samples were received in the laboratory; only the physiological quality of the samples was considered.
According to IN 45 of September 17, 2013 [15], a lot of soybean seeds must have at least 80% germination to be considered acceptable for sale. Below this, it is rejected.
The samples were subjected to germination tests immediately after harvesting, called Initial Germination - G (5 days after sowing on paper) and First Germination Count - PCG (8 days after sowing on paper). After six months of storage under controlled conditions of temperature and relative humidity (25°C and 60%), the samples were again subjected to Germination and First Germination Count assessments. This stage was conducted in accordance with the Rules for Seed Analysis - RAS [16]. Tetrazolium evaluation was also carried out, according to [17] and [18]. After these evaluations, the NIR reflectance spectral curves of the samples were obtained. We used a prototype developed by the company Fooze, in partnership with Grandeo Technology, which was made available to Base Assessoria Agronômica Ltda. The wavelength range studied was between 634 and 1126 nm, with a total of 1023 bands. This range was used due to the factory setting of the equipment. Once the spectral curves had been collected, they were plotted individually to check them for the presence of noise.
A Savitzky-Golay filter was applied to the spectroscopy data using the scipy.signal package [19] to reduce noise interference and thus favor the observation of spectral behavior. Next, the first derivative of each curve was computed, according to [20], also with the signal processing library (scipy.signal). The window used to calculate the first-order Savitzky-Golay derivative, defined by the window_length parameter, which specifies the number of adjacent points included in the analysis window, was 9, following the convention that this parameter should be an odd number. A second-order polynomial was used.
After plotting the first-derivative curves, the spectral region around the points showing significant differences was selected, and the range between 634.61 and 950 nm was then used (Figure 1b).
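As an illustration of the pre-processing described above, the following is a minimal sketch using scipy.signal.savgol_filter with a 9-point window and a second-order polynomial. The evenly spaced wavelength axis and the synthetic reflectance curve are assumptions made only for this example; the instrument's real band grid and the measured spectra may differ.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical wavelength axis: 1023 evenly spaced bands from 634 to 1126 nm.
# Even spacing is an assumption; the instrument's real grid may differ.
wavelengths = np.linspace(634.0, 1126.0, 1023)

# Synthetic reflectance curve standing in for one measured spectrum.
rng = np.random.default_rng(0)
reflectance = (
    0.6
    + 0.1 * np.sin(wavelengths / 60.0)
    + rng.normal(0.0, 0.002, wavelengths.size)
)

# Smoothed curve: 9-point window, second-order polynomial.
smoothed = savgol_filter(reflectance, window_length=9, polyorder=2)

# First derivative with the same window and polynomial order.
# delta is the band spacing, so the derivative is expressed per nanometre.
step = wavelengths[1] - wavelengths[0]
first_derivative = savgol_filter(
    reflectance, window_length=9, polyorder=2, deriv=1, delta=step
)
```

Passing deriv=1 makes the filter return the derivative of the locally fitted polynomial, so smoothing and differentiation happen in a single step rather than differentiating an already smoothed curve.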
As the objective is to understand the correlation between the data generated by the NIR technique and the results obtained by the seed physiological quality analysis (Table 1), data mining was performed using machine learning algorithms. The software Weka, version 3.9.6, was used on a computer with an Intel® Core™ i5-10210U CPU at 2.11 GHz, an NVIDIA GeForce MX250 graphics card and 8 GB of RAM. After data preprocessing, there were a total of 700 rows for algorithm analysis, with 70 rows for each genotype.
This study used the J48, Random Forest, Random Tree, LDA and LIBSVM algorithms. The first three are decision tree algorithms, LDA is a linear discriminant classifier and LIBSVM is a support vector machine implementation. The k-fold method was used to cross-validate the models, dividing the data into 10 subsets, as described by [22, 23]. To improve the algorithms' performance, the Resample filter was used, according to [24].
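The 10-fold split can be sketched as follows. This is a plain, unstratified illustration written for this text, not Weka's implementation; Weka's own evaluation may additionally stratify the folds by class. The function name k_fold_indices and the seed are hypothetical.

```python
import random


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Plain shuffled split without stratification (a simplification).
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With the 700 rows mentioned in the text, each fold holds 70 test rows
# and the model is trained on the remaining 630.
for train, test in k_fold_indices(700, k=10):
    pass  # fit the classifier on `train`, evaluate it on `test`
```

Each row is used for testing exactly once, so the reported metrics are averaged over predictions made on data the model did not see during training.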
The values of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) were extracted from the confusion matrix of each classifier [22], so that its correct and incorrect classifications could be evaluated and the Precision and Recall metrics calculated, according to equation (1) [21]:
\[ \text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN} \qquad (1) \]
Where:
TP = True Positives
TN = True Negatives
FP = False Positives
FN = False Negatives
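Precision and recall follow directly from the confusion-matrix counts. The sketch below uses hypothetical counts chosen only for illustration; they are not results from this study.

```python
def precision(tp, fp):
    """Fraction of lots predicted as positive that really are positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of truly positive lots that the classifier found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for the "accepted" class (illustrative only).
tp, fp, fn, tn = 90, 10, 5, 95

accepted_precision = precision(tp, fp)
accepted_recall = recall(tp, fn)

# For the "rejected" class the roles swap: TN become its true positives,
# FN its false positives and FP its false negatives.
rejected_precision = precision(tn, fn)
rejected_recall = recall(tn, fp)
```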
To better assess this correlation, it was necessary to filter the NIR data. In this way, noise in the other spectral bands evaluated, caused by their proximity to the bands influenced by water molecules, did not affect the results of the algorithms. Thus, the data used together with the quality attributes when applying the algorithms came from the range between 807.18 and 817.65 nm.
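Restricting a spectrum to a wavelength band is a simple filtering step. The sketch below assumes an evenly spaced wavelength axis and a placeholder reflectance value; both are hypothetical, as is the helper name select_band.

```python
def select_band(wavelengths, values, low_nm=807.18, high_nm=817.65):
    """Keep only the (wavelength, value) pairs inside [low_nm, high_nm]."""
    return [(w, v) for w, v in zip(wavelengths, values) if low_nm <= w <= high_nm]


# Hypothetical axis: 1023 evenly spaced bands from 634 to 1126 nm.
n_bands = 1023
axis = [634.0 + i * (1126.0 - 634.0) / (n_bands - 1) for i in range(n_bands)]

# Placeholder reflectance values standing in for one measured spectrum.
spectrum = [0.5 for _ in axis]

band = select_band(axis, spectrum)
band_wavelengths = [w for w, _ in band]
```

The selected reflectance values can then be joined with the physiological attributes of the same sample to form one row of the data set passed to the classifiers.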
RESULTS
In this study, the seeds used were categorized according to the results obtained in the germination test as accepted (at least 80% germination) and rejected (below 80% germination). The requirement of 80% germination for a lot of soybean seeds to be considered accepted is based on the legal seed standards established by the Ministry of Agriculture - MAPA.
Figure 1a shows the spectral curves obtained from the reflectance readings acquired from the samples studied using NIR, in the spectral range between 634.61 and 1126 nm. The spectral profiles of the seed lots followed a similar trend across the spectral bands, with variations in amplitude and waveform across the levels of vigor, and differed in the magnitude of relative reflectance (between 0.41 and 0.77). The spectral curves of the soybean seed lots decreased to a plateau in the 760-1114 nm region, and the fluctuation of the curves in the near infrared region (858-953 nm) was smaller than in the visible region (634-760 nm). Although assigning bands in the near infrared spectrum is difficult because the bands are broad and highly overlapping, it is possible to identify spectral regions that are important for differentiating seed lots.
The difference in spectral reflectance in relation to vigor levels gradually increased at wavelengths above 690 nm. Therefore, spectra in the 600 to 900 nm range were used to classify the lots. To do this, a plot was made to magnify the differences in the spectral characteristics of seed quality (Figure 1a). To visualize the characteristic spectral behaviors, the first-order derivative was calculated (Figure 1b), according to the study by [25]. In this way, the spectral patterns between the seed samples and knowledge of their levels of vigor can be visualized.
Figure 1. (a) Spectral curves of soybean seed samples. (b) Spectral curves of soybean seed lots (1st derivative).
The performance of each algorithm used is presented in Table 2. The IBk algorithm showed the best performance in classifying the studied genotypes (89.43% correct classification of attributes), demonstrating a better fit to the proposed study data.
The performance of the machine learning techniques was assessed by analyzing the algorithm's confusion matrix, as shown in Table 3. When analyzing the algorithm's ability to predict real values using the Accuracy performance metric, it was observed that most of the algorithms obtained high values, above 80%. The exception was the LibSVM algorithm, which had values below 70%. These results indicate that, in general, the machine learning techniques used performed well in predicting values, with high accuracy in most cases. However, it is important to consider the specific results of each algorithm for a more detailed analysis of its performance.
The J48 algorithm was highly accurate in classifying soybean samples. In the class of samples with germination of at least 80% (accepted), its accuracy was 95.9%, which means that 95.9% of the predictions were correct, while 4.1% of the data was misclassified. In the class of rejected samples (germination below 80%), accuracy was also high, reaching 95.1%. These results indicate that the J48 algorithm was able to make highly assertive predictions when classifying soybean lots. Other algorithms also showed good results, such as Random Forest (98% accuracy for accepted samples), Random Tree (88.9% accuracy), LDA (94.7% accuracy), FLDA (88.9% accuracy) and LibSVM (100% accuracy). These values demonstrate the ability of the algorithms to make accurate and reliable predictions when classifying soybean samples, contributing to the assessment of their physiological quality (Table 4).
Metrics of the different algorithms used: Recall (sensitivity), Precision, ROC (Receiver Operating Characteristic) curve and F-Measure.
The ROC (Receiver Operating Characteristic) curve represents the relationship between the sensitivity and specificity of a classification model as the discrimination threshold varies. Models that are close to the point (0, 1) on the ROC curve are considered better classifiers [26]. When analyzing the classification of soybean seed lots, it was observed that the LibSVM model showed the lowest values on the ROC curve for both classes (Table 4).
In the spectral range between 807.18 and 817.65 nm, the J48 supervised algorithm used the variables “germination”, “moisture” and “first count” to classify the samples as accepted or rejected, as shown in Figure 2.
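To make the role of the point (0, 1) concrete, the sketch below places a classifier in ROC space from its confusion-matrix counts and measures its distance to that ideal corner. The counts are hypothetical and the helper names are introduced only for this example.

```python
import math


def roc_point(tp, fn, fp, tn):
    """Return (false positive rate, true positive rate) for one threshold."""
    tpr = tp / (tp + fn)  # sensitivity (recall)
    fpr = fp / (fp + tn)  # 1 - specificity
    return fpr, tpr


def distance_to_ideal(fpr, tpr):
    """Euclidean distance to the ideal corner (0, 1) of ROC space."""
    return math.hypot(fpr, 1.0 - tpr)


# Hypothetical counts (illustrative only, not results from this study).
fpr, tpr = roc_point(tp=90, fn=10, fp=5, tn=95)
gap = distance_to_ideal(fpr, tpr)
```

A classifier with no false positives and no false negatives sits exactly at (0, 1), so its distance is zero; larger distances indicate weaker discrimination at that threshold.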
The Random Tree algorithm used physiological quality variables such as germination, moisture, TZ mechanical damage and the 808.60 and 811.46 nm spectra to predict the lot's classification, as shown in Figure 3. This result is relevant, as the algorithm identified a relationship between the spectra's reflectance information and the seed's quality parameters. This indicates that analyzing the spectra at these specific wavelengths can provide valuable information about seed quality. This finding could be useful for improving the processes of selecting and classifying lots of soybean seeds based on physiological and spectral characteristics.
Table 3 shows the confusion matrix of the algorithms, that is, how each algorithm classified the lots as accepted or rejected. It is important to note that a 100% hit rate may indicate overfitting or bias in the algorithm, rather than real assertiveness. It is therefore necessary to interpret these results with caution and consider other evaluation metrics, such as precision, recall and F1-score, to obtain a more complete picture of the algorithm's performance in classifying soybean seed lots.
The confusion matrices show the predictions of the results generated, making it possible to see that J48 and Random Forest behave similarly. This is because they mine the data using similar techniques. Table 3 shows that the J48 matrix has more incorrectly classified lots, as J48 builds only one decision tree and is subject to more errors. Random Forest, on the other hand, builds several trees with different configurations and chooses the answer that occurs most often, making more assertive decisions through multiple tests.
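The voting step that distinguishes Random Forest from a single tree can be sketched in a few lines. The per-tree predictions below are hypothetical, and the sketch covers only the final vote, not how the individual trees are grown.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions from five trees for one seed lot.
votes = ["accepted", "rejected", "accepted", "accepted", "rejected"]
forest_prediction = majority_vote(votes)
```

Because individual trees tend to make different mistakes, an error made by one tree can be outvoted by the others, which is one reason an ensemble may misclassify fewer lots than a single tree.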
For this study, LDA and FLDA were also applied, where the former works generically with LDA models, and the latter constructs Fisher's linear discriminant function, selecting the threshold so that the separator is between the centroids. Both algorithms obtained close results, with LDA showing greater error in classifying accepted lots.
For a more detailed analysis of the accuracy of the models applied in this study, it is important to examine the algorithms' evaluation metrics. Table 4 shows the accuracy parameters of each algorithm, providing information on the classification of soybean seeds based on the reflectance of the Near Infrared Spectrum (NIR) and the quality parameters. This data is essential for deciding which algorithm is best suited to classifying soybean seeds. By considering these evaluation metrics, it is possible to gain a more complete understanding of the effectiveness and performance of each algorithm, helping to choose the most suitable model for classifying soybeans based on NIR reflectance and quality parameters.
Table 4 shows that the evaluation metrics have high values for classifying soybean seeds based on physiological parameters and NIR reflectance. The Random Forest algorithm obtained a recall of 96.1% and an average precision of 98% for the class of accepted lots. For the class of rejected lots, the values were 98.8% and 97.5%, respectively. The recall and precision values for the accepted lot class were 92.6% and 90.9%, respectively. For the rejected lots class, the values were 92.6% recall and 73.59% average precision.
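As a worked example of how the F-Measure combines the two metrics, the sketch below takes the harmonic mean of the recall and precision reported above for Random Forest on the accepted class. It assumes that the reported average precision can be used as the precision term; the values in Table 4 remain the reference and may differ slightly because of rounding or averaging.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Random Forest, accepted lots: 98% precision and 96.1% recall (from the text).
f1_accepted = f_measure(0.98, 0.961)

# Random Forest, rejected lots: 97.5% precision and 98.8% recall (from the text).
f1_rejected = f_measure(0.975, 0.988)
```

The harmonic mean is dominated by the lower of the two values, so a high F-Measure requires both precision and recall to be high.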
DISCUSSION
The use of NIR technology by the scientific community has grown in recent years, as has the number of articles published on this analytical technique in recent decades [27]. Obtaining knowledge about the physiological aspects of seeds quickly is extremely important for classifying seed lots during harvest, and monitoring the quality of soybean seeds using physiological tests is essential for establishing a successful crop [28]. However, given the analysis time and the reliance on methodologies based on the analyst's visual assessment, new tools and technologies are being sought for this monitoring.
Studies show that Near Infrared Spectroscopy (NIR) can be used to predict the biodegradation and molecular structure properties of seeds. For example, [10] distinguished cottonseed genotypes by spectroscopic methods. [29] analyzed the integration of hyperspectral imaging, untargeted metabolomics and machine learning for predicting the vigor of naturally aged and accelerated-aged sweet corn seeds, in which seed storage prediction was optimized. In a study of oilseeds, [30] evaluated the vibrational spectra associated with the properties of their molecular structure. In a similar study, [31] used a Fourier transform infrared microspectroscopic approach to investigate the macromolecular distribution in cross-sections of the seed coat.
To evaluate the spectral patterns obtained between soybean samples and their known levels of vigor, adjustments and normalization in the analyses are necessary. This helps to obtain more accurate and reliable results, allowing decisions to be made based on quality information. Data quality is essential for obtaining accurate and reliable results. Poor quality data can lead to biased analyses, wrong conclusions and poorly performing models. It is therefore essential to invest time and effort in properly preparing the data, ensuring that it is representative and suitable for statistical or machine learning analysis [32].
The distinct patterns can indicate the potential challenges in predicting seed vigor in different lots, as their quality can vary among low, medium and high vigor.
In this context, understanding and predicting the stability of vigor between seed lots requires extensive data analysis from quality control, which is fundamental for the management and efficiency of the seed sector chain. Combining data mining with near-infrared reflectance spectroscopy is a technique that is considered economical and fast for classifying seed lots [33]. This approach makes it possible to rank lots based on the interactions between the results of the tests applied. In addition, by fitting models with different data sets, it is possible to optimize laboratory analyses, resulting in more accurate and efficient identification of seed lots. This combination of techniques offers a promising alternative for speeding up the seed classification process, reducing costs and analysis time. A recent study by [34] highlights the benefits of this approach.
In a similar vein, data mining plays a crucial role in the ranking of seed lots, allowing the prediction of physiological quality based on data analysis. This approach involves defining methods and classifications for a large data set, with the objective of building models or functions that describe the classes or criteria relevant to seed quality. Recent studies, such as [35] and [36], highlight the importance of data mining in this context, emphasizing the classification and selection of seed lots based on physiological criteria. This approach allows for a more efficient and accurate analysis, contributing to the optimization of seed production and selection processes.
These results indicate that the J48 algorithm was able to make highly assertive predictions when classifying soybean lots. The study by [37] did not find good accuracy for classifying lots with the J48 algorithm. The combination of a predictive model generated using hyperspectral image data with differential values captured in metabolites related to vigor tests proved to be an appropriate approach for interpreting and optimizing the application of the vigor predictive model. This corroborates [29], who used an effective approach to interpret and optimize the application of the predictive model, based on accelerated and natural aging, for sweet corn seeds.
The J48 algorithm, used here to build the decision tree, is a Java implementation of the C4.5 algorithm, which is widely used and considered reliable as a statistical classifier. It builds the decision tree based on the concept of entropy: at each node, the algorithm selects the attribute that best partitions the data according to the normalized information gain. This approach allows the algorithm to make decisions based on reducing uncertainty and maximizing information gain [24].
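The entropy-based criterion can be illustrated with a small sketch of entropy and normalized information gain (gain ratio) for one candidate split. It is a simplified illustration of the idea behind C4.5, not the J48 source code; the labels and splits are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def gain_ratio(labels, groups):
    """Information gain of a split, normalized by the split information.

    `groups` is the list of label lists produced by splitting on one attribute.
    """
    total = len(labels)
    non_empty = [g for g in groups if g]
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in non_empty)
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes that create many small branches.
    split_info = -sum(
        (len(g) / total) * math.log2(len(g) / total) for g in non_empty
    )
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical lots: four accepted and four rejected.
lots = ["accepted"] * 4 + ["rejected"] * 4

# A split that separates the classes perfectly.
perfect = [["accepted"] * 4, ["rejected"] * 4]

# A split that leaves both branches as mixed as the parent node.
useless = [["accepted", "accepted", "rejected", "rejected"]] * 2
```

In this toy case the perfect split removes all uncertainty and receives the highest possible ratio, while the uninformative split receives zero, so the first attribute would be chosen for the node.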
The Random Tree algorithm (Figure 3) identified a relationship between the reflectance information in the spectra and the seed quality parameters. This indicates that analyzing the spectra at these specific wavelengths can provide valuable information on seed quality. This finding could be useful for improving the processes of selecting and classifying soybean seed lots based on physiological and spectral characteristics.
Linear discriminant analysis (LDA), also called Fisher's linear discriminant, is a widely used model for feature recognition, as is principal component analysis (PCA) [38]. Also working with soybean seeds, [34] used PCA to explain that seeds with high infrared values had low germination and vigor. LDA is a technique used in statistics and machine learning to search for linear combinations of variables that separate classes [39].
In the confusion matrix of the LibSVM algorithm, it is important to note that a 100% hit rate may indicate overfitting, rather than real assertiveness [40]. Therefore, it is necessary to interpret these results with caution and consider other evaluation metrics, such as precision, recall and F1-score, to obtain a more complete picture of the LibSVM algorithm's performance in classifying soybean seed lots [23].
The confusion matrices show the predictions of the results generated, making it possible to see that J48 and Random Forest behave similarly. This is because they mine the data using similar techniques [41]. Table 3 shows that the J48 matrix has more incorrectly classified lots, as J48 builds only one decision tree and is subject to more errors. Random Forest (Table 3), on the other hand, builds several trees with different configurations and chooses the answer that occurs most often, making more assertive decisions through multiple tests.
For this study, LDA and FLDA were also applied, where the former works generically with LDA models, and the latter constructs Fisher's linear discriminant function, selecting the threshold so that the separator is between the centroids [39]. Both algorithms obtained close results, with LDA showing greater error in classifying accepted lots.
For a more detailed analysis of the accuracy of the models applied in this study, it is important to examine the algorithms' evaluation metrics. Table 4 shows the accuracy parameters of each algorithm, providing information on the classification of soybean seeds based on the reflectance of the Near Infrared Spectrum (NIR) and the quality parameters. This data is essential for deciding which algorithm is best suited to classifying soybean seeds [24]. By considering these evaluation metrics, it is possible to gain a more complete understanding of the effectiveness and performance of each algorithm, helping to choose the most suitable model for classifying soybeans based on NIR reflectance and quality parameters [24].
Table 4 shows that the evaluation metrics have high values for classifying soybean seeds based on physiological parameters and NIR reflectance. These results indicate that the model is highly accurate in identifying and classifying seed lots, for both classes, with a very high hit rate. This demonstrates the effectiveness of the Random Forest algorithm in analyzing physiological and reflectance parameters for classifying soybean seeds. According to the study carried out by [24], on prediction in the ranking of soybean seed lots, the algorithm that obtained the best results was Random Forest.
It is important to note that the germination, first germination count and moisture percentages shown in the decision tree are choices made by the algorithm to predict the classification of the samples. The inclusion of the “moisture” variable in the decision tree in Figure 2 may indicate noise in the algorithm's processing, possibly due to the influence of water molecules in the wavelength range analyzed. To investigate this influence, water content was considered as a parameter in the data mining. The assertiveness of the NIR technique in the wavelength range between 807.18 and 817.65 nm stands out, since correlated studies have identified the 730 nm range as being associated with chlorophyll fluorescence [42].
It is important to note that the average accuracy may depend on the database used in the study, which indicates the importance of considering the quality and representativeness of the data when interpreting this metric. However, the results indicate that the Random Forest algorithm performed satisfactorily in classifying and ranking soybean seed lots.
CONCLUSION
After applying the algorithms, it was found that the NIR technique in the 807.18 to 817.65 nm range can express information on the physiological attributes of soybean seeds. This means that this technique can be used to rank seed lots, allowing a quick estimate of physiological quality. Furthermore, spectroscopy appears to be a complementary methodology with a quick response, with the potential to be used in conjunction with traditional physiological quality analyses.
The Random Forest algorithm showed 98% accuracy in classifying soybean seed lots, demonstrating a superior fit in relation to the data used in this study. These results indicate that using the NIR technique in conjunction with machine learning can be an effective approach to classifying and assessing the quality of soybean seeds.
REFERENCES
- 1 Aulia R, Kim Y, Amanah HZ, Andi AMA, Kim H, Kim H, et al. Non-destructive prediction of protein contents of soybean seeds using near-infrared hyperspectral imaging. Infrared Phys Technol. 2022 Dec 1;127:104365.
-
2 Hirakuri MH, Lazzarotto JJ. [The soybean agribusiness in the world and Brazilian contexts] [Internet]. Londrina: Embrapa Soja; 2014. p. 37. Available from: https://ainfo.cnptia.embrapa.br/digital/bitstream/item/104753/1/O-agronegocio-da-soja-nos-contextos-mundial-e-brasileiro.pdf
» https://ainfo.cnptia.embrapa.br/digital/bitstream/item/104753/1/O-agronegocio-da-soja-nos-contextos-mundial-e-brasileiro.pdf -
3 Companhia Nacional de Abastecimento. [Monitoring of the Brazilian harvest - grains - harvest 2023/24] [Internet]. 6th ed. Vol. 11. Brasília: Companhia Nacional de Abastecimento; 2024 [cited 2024 Mar 12]. Available from: https://www.conab.gov.br/info-agro/safras/graos/boletim-da-safra-de-graos
» https://www.conab.gov.br/info-agro/safras/graos/boletim-da-safra-de-graos - 4 Pinheiro RM, Gadotti GI, Bernardy R, Monteiro RCM, Moreira IB. [Image processing as an important tool for artificial intelligence in the seed sector]. Rev Agraria Acad. 2022 Jan 1;5(1):89-101.
- 5 Pinheiro R de M, Gadotti GI, Bernardy R, Tim RR, Pinto KVA, Buck G. Computer vision by unsupervised machine learning in seed drying process. Cienc Agrotec. 2023 Jan 1;47.
- 6 Ma H, Wang J, Chen Y, Cheng J, Lai Z. Rapid authentication of starch adulterations in ultrafine granular powder of Shanyao by near-infrared spectroscopy coupled with chemometric methods. Food Chem. 2017 Jan 1;215:108-15.
- 7 Bevilacqua M, Bucci R, Materazzi S, Marini F. Application of near infrared (NIR) spectroscopy coupled to chemometrics for dried egg-pasta characterization and egg content quantification. Food Chem. 2013 Oct;140(4):726-34.
-
8 Larios G, Nicolodelli G, Ribeiro M, Canassa T, Reis AR, Oliveira SL, et al. Soybean seed vigor discrimination by using infrared spectroscopy and machine learning algorithms. Anal Methods. 2020 Sep 17;12(35):4303-9. Available from: https://pubs.rsc.org/en/content/articlelanding/2020/ay/d0ay01238f/unauth
» https://pubs.rsc.org/en/content/articlelanding/2020/ay/d0ay01238f/unauth - 9 Restaino E, Fassio A, Cozzolino D. Discrimination of meat patés according to the animal species by means of near infrared spectroscopy and chemometrics. CyTA J Food. 2011 Sep 1;9(3):210-3.
- 10 Mata MM da, Rocha PD, Farias IKT de, Silva JLB da, Medeiros EP, Silva CS, et al. Distinguishing cotton seed genotypes by means of vibrational spectroscopic methods (NIR and Raman) and chemometrics. Spectrochim Acta A Mol Biomol Spectrosc. 2022 Feb;266:120399.
- 11 Reddy P, Guthridge KM, Panozzo J, Ludlow E, Spangenberg G, Rochfort S. Near-infrared hyperspectral imaging pipelines for pasture seed quality evaluation: An overview. Sensors. 2022 Mar 3;22(5):1981.
- 12 Ludermir TB. [Artificial intelligence and machine learning: current state and trends]. Estud Av. 2021 Apr;35(101):85-94.
- 13 Pasquini C. Near infrared spectroscopy: A mature analytical technique with new perspectives - A review. Anal Chim Acta. 2018 Oct;1026:8-36.
- 14 Andriazzi CVG, Rocha DK, Custódio CC. Determination of the physiological quality of corn seeds by infrared equipment. J Seed Sci. 2023 Jan 27;45. Available from: https://www.scielo.br/j/jss/a/mThdtjJ9fd8D3WkTMwgF93b/?lang=en
- 15 Brasil. [Normative Instruction No 45/2013] [Internet]. Sep 13, 2013. Available from: https://www.gov.br/agricultura/pt-br/assuntos/insumos-agropecuarios/insumos-agricolas/sementes-e-mudas/publicacoes-sementes-e-mudas/copy_of_INN45de17desetembrode2013.pdf
- 16 Brasil. [Rules for seed analysis] [Internet]. 1st ed. Brasília: Mapa/ACS; 2009 [cited 2024 Mar 12]. Available from: https://www.gov.br/agricultura/pt-br/assuntos/insumos-agropecuarios/arquivos-publicacoes-insumos/2946_regras_analise__sementes.pdf
- 17 Neto F, Krzyzanowski FC, Costa NP da. [The tetrazolium test in soybean seeds] [Internet]. Londrina: Embrapa-CNPSo; 1998 [cited 2024 Mar 12]. Available from: https://www.agrolink.com.br/downloads/TRETRAZ%C3%93LIO.pdf
- 18 França-Neto J de B, Henning AA. [Physiological and sanitary qualities of soybean seeds]. 1st ed. Londrina: Embrapa; 1986 [cited 2024 Mar 12]. Available from: https://www.infoteca.cnptia.embrapa.br/handle/doc/444358
- 19 Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020 Feb 3;17(3):261-72.
- 20 Panero JS, Silva HEB da, Panero PS, Smiderle OJ, Panero FS, Faria FSEDV, et al. Separation of cultivars of soybeans by chemometric methods using near infrared spectroscopy. J Agric Sci. 2018 Mar 5;10(4):351. Available from: https://ainfo.cnptia.embrapa.br/digital/bitstream/item/173795/1/72794-275002-1-PB.pdf
- 21 Lever J, Krzywinski M, Altman N. Classification evaluation. Nat Methods [Internet]. 2016 Jul 28;13(8):603-4. Available from: https://www.nature.com/articles/nmeth.3945
- 22 Bernardy R, Gadotti GI, Monteiro RCM, Pinto KVA, Pinheiro RM. FITTING Data Mining Settings for Ranking Seed Lots. Eng Agrícola. 2023 Jan 1;43(2).
- 23 Adagbasa EG, Adelabu SA, Okello TW. Application of deep learning with stratified K-fold for vegetation species discrimination in a protected mountainous region using Sentinel-2 image. Geocarto Int. 2019 Dec 19;37(1):1-21.
- 24 Gadotti GI, Ascoli CA, Bernardy R, Monteiro RCM, Pinheiro RM. Machine learning for soybean seeds lots classification. Eng Agrícola. 2022 Jan 1;42(spe).
- 25 Femenias A, Gatius F, Ramos AJ, Teixido-Orries I, Marín S. Hyperspectral imaging for the classification of individual cereal kernels according to fungal and mycotoxins contamination: A review. Food Res Int. 2022 Mar 1;155:111102.
- 26 Valero-Carreras D, Alcaraz J, Landete M. Comparing two SVM models through different metrics based on the confusion matrix. Comput Oper Res [Internet]. 2023 Apr 1 [cited 2023 Feb 16];152:106131. Available from: https://www.sciencedirect.com/science/article/pii/S0305054822003616
- 27 Guerrero C, Viscarra RA, Mouazen AM. Special issue “Diffuse reflectance spectroscopy in soil science and land resource assessment.” Geoderma. 2010 Aug 1;158(1-2):1-2.
- 28 Meneguzzo MRR, Meneghello GE, Nadal AP, Xavier FM, Dellagostin SM, Carvalho IR, et al. [Seedling length and soybean seed vigor]. Cienc Rural [Internet]. 2021 Mar 29 [cited 2024 Jan 28];51(7). Available from: https://www.scielo.br/j/cr/a/yGhqSW5Cxr69x5tYFxhCzmk/?lang=en
- 29 Zhang T, Lu L, Yang N, Fisk ID, Wei W, Wang L, et al. Integration of hyperspectral imaging, non-targeted metabolomics and machine learning for vigour prediction of naturally and accelerated aged sweetcorn seeds. Food Control. 2023 Nov 1;153:109930.
- 30 Gomaa WMS, Zhang X, Deng H, Peng Q, Mosaad GM, Zhang H, et al. Vibrational spectroscopic study on feed molecular structure properties of oil-seeds and co-products from Canadian and Chinese bio-processing and relationship with protein and carbohydrate degradation fractions in ruminant systems. Spectrochim Acta A Mol Biomol Spectrosc. 2019 Jun 1;216:249-57.
- 31 Hossain M, Liyanage S, Abidi N. FTIR microspectroscopic approach to investigate macromolecular distribution in seed coat cross-sections. Vib Spectrosc. 2022 May;120:103376.
- 32 Arantes Filho LR, Guimarães LNF, Nascimento FB, Rosa RR. [Double filtering strategy using the Savitzky-Golay filter in supernova spectral data]. Rev Bras Comput Apl. 2019 Jun 26;11(2):86-99.
- 33 Shafiee S, Minaei S. Combined data mining/NIR spectroscopy for purity assessment of lime juice. Infrared Phys Technol. 2018 Jun;91:193-9.
- 34 Soares JM, Medeiros AD, Pinheiro DT, Rosas JTF, Silva LJ, Machado DLM, et al. Low-cost system for multispectral image acquisition and its applicability to analysis of the physiological potential of soybean seeds. Acta Sci Agron [Internet]. 2023 [cited 2023 Jan 17];45(1). Available from: https://periodicos.uem.br/ojs/index.php/ActaSciAgron/article/view/57060/751375155038
- 35 Gadotti GI, Moraes NAB, Silva JG, Pinheiro RM, Monteiro RCM. [Prediction of ranking of lots of corn seeds by artificial intelligence]. Eng Agrícola. 2022 Jan 1;42(4).
- 36 Pinheiro RM, Gadotti GI, Monteiro RCM, Bernardy R. [Artificial intelligence in agriculture with applicability in the seed sector]. Diversitas J. 2021;6(3):2984-95.
- 37 Belhumeur PN, Hespanha JP, Kriegman DJ. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell. 1997 Jul;19(7):711-20.
- 38 Soares JM, Medeiros AD, Pinheiro DT, Rosas JTF, Silva LJ, Machado DLM, et al. Low-cost system for multispectral image acquisition and its applicability to analysis of the physiological potential of soybean seeds. Acta Sci Agron [Internet]. 2023 [cited 2023 Jan 17];45. Available from: https://periodicos.uem.br/ojs/index.php/ActaSciAgron/article/view/57060/751375155038
- 39 Guo C, Lu M, Wei W. An Improved LDA Topic Modeling Method Based on Partition for Medium and Long Texts. Ann Data Sci. 2019 Apr 25;8(2):331-44.
- 40 Chang CC, Lin CJ. LIBSVM -- A Library for Support Vector Machines [Internet]. www.csie.ntu.edu.tw 2023 [cited 2024 Dec 3]. Available from: https://www.csie.ntu.edu.tw/~cjlin/libsvm/
- 41 John M, Shaiba H. Ensemble Based Foetal State Diagnosis. Conf Data Sci Mach Learn Appl [Internet]. 2020 Mar 1 [cited 2024 Mar 13];6:129-33. Available from: https://ieeexplore.ieee.org/document/9044244
- 42 França-Silva F, Cicero SM, Gomes-Junior FG, Medeiros AD, França-Neto JB, Dias DCFS. Quantification of chlorophyll fluorescence in soybean seeds by multispectral images and their relationship with physiological potential. J Seed Sci. 2022 Jan 1;44.
Publication Dates
- Publication in this collection: 15 Nov 2024
- Date of issue: 2024
History
- Received: 02 Apr 2024
- Accepted: 16 Sept 2024