NIR Spectroscopy Identification of Persimmon Varieties Based on PCA-SVM

Similar documents
Design and experiment of NIR wheat quality quick. detection system

Robust model of fresh jujube soluble solids content with near-infrared (NIR) spectroscopy

ADAPTIVE LINEAR NEURON IN VISIBLE AND NEAR INFRARED SPECTROSCOPIC ANALYSIS: PREDICTIVE MODEL AND VARIABLE SELECTION

Visible-near infrared spectroscopy to assess soil contaminated with cobalt

Influence of different pre-processing methods in predicting sugarcane quality from near-infrared (NIR) spectral data

Journal of Chemical and Pharmaceutical Research, 2015, 7(4): Research Article

Fast Detection of Inorganic Phosphorus Fractions and their Phosphorus Contents in Soil Based on Near-Infrared Spectroscopy

ABASTRACT. Determination of nitrate content and ascorbic acid in intact pineapple by Vis-NIR spectroscopy Student. Thesis

USE OF NIR SPECTROSCOPY AND LS-SVM MODEL FOR THE DISCRIMINATION OF VARIETIES OF SOIL

Detection of Chlorophyll Content of Rice Leaves by Chlorophyll Fluorescence Spectrum Based on PCA-ANN Zhou Lina1,a Cheng Shuchao1,b Yu Haiye2,c 1

Potential of Near-infrared Spectroscopy to Detect Defects on the Surface of Solid Wood Boards

Detection and Quantification of Adulteration in Honey through Near Infrared Spectroscopy

FTIR characterization of Mexican honey and its adulteration with sugar syrups by using chemometric methods

Application of Raman Spectroscopy for Noninvasive Detection of Target Compounds. Kyung-Min Lee

Application of Raman Spectroscopy for Detection of Aflatoxins and Fumonisins in Ground Maize Samples

Using radial basis elements for classification of food products by spectrophotometric data

Analysis of Cocoa Butter Using the SpectraStar 2400 NIR Spectrometer

Application of near-infrared (NIR) spectroscopy to sensor based sorting of an epithermal Au-Ag ore

Application of color detection by near infrared spectroscopy in color detection of printing

The Application of Classical Least Square Algorithm in the Quantitative Analysis of Lime in Wheat Flour by ATR-MIR Spectroscopy *

Multi-wind Field Output Power Prediction Method based on Energy Internet and DBPSO-LSSVM

Recent Progress of Terahertz Spectroscopy on Medicine and Biology in China

Plum-tasting using near infra-red (NIR) technology**

Putting Near-Infrared Spectroscopy (NIR) in the spotlight. 13. May 2006

Study of blood fat concentration based on serum ultraviolet absorption spectra and neural network

Chemometrics: Classification of spectra

Method Transfer across Multiple MicroNIR Spectrometers for Raw Material Identification

Neural network time series classification of changes in nuclear power plant processes

QUALITY AND DRYING BEHAVIOUR OF ORGANIC FRUIT PRODUCTS

Nondestructive Determination of Tomato Fruit Quality Characteristics Using Vis/NIR Spectroscopy Technique

Construction of a Ginsenoside Content-predicting Model based on Hyperspectral Imaging

Performance of spectrometers to estimate soil properties

Corresponding Author: Jason Wight,

Optimizing Model Development and Validation Procedures of Partial Least Squares for Spectral Based Prediction of Soil Properties

Classification of High Spatial Resolution Remote Sensing Images Based on Decision Fusion

Development of Near Infrared Spectroscopy for Rapid Quality Assessment of Red Ginseng

Spectroscopy in Transmission

PTG-NIR Powder Characterisation System (not yet released for selling to end users)

Dynamic Data Modeling of SCR De-NOx System Based on NARX Neural Network Wen-jie ZHAO * and Kai ZHANG

College of Life Science and Pharmacy, Nanjing University of Technology, Nanjing, China

Rapid and nondestructive detection of watercore and sugar content in Asian pear by near infrared spectroscopy for commercial trade

Online Evaluation of Yellow Peach Quality by Visible and Near-Infrared Spectroscopy

Introduction to Fourier Transform Infrared Spectroscopy

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Plant Leaf Water Detection Instrument Based on Near Infrared Spectroscopy

PRECISION AGRICULTURE APPLICATIONS OF AN ON-THE-GO SOIL REFLECTANCE SENSOR ABSTRACT

Introduction to Fourier Transform Infrared Spectroscopy

Fast Analysis of Superoxide Dismutase (SOD) Activity in Barley Leaves Using Visible and Near Infrared Spectroscopy

BASICS OF CHEMOMETRICS

Positive and Nondestructive Identification of Acrylic-Based Coatings

Bearing fault diagnosis based on TEO and SVM

Near Infrared reflectance spectroscopy (NIRS) Dr A T Adesogan Department of Animal Sciences University of Florida

Prediction of Vegetable Price Based on Neural Network and Genetic Algorithm

Making Accurate Field Spectral Reflectance Measurements By Dr. Alexander F. H. Goetz, Co-founder ASD Inc., Boulder, Colorado, 80301, USA October 2012

Application of Near Infrared Spectroscopy to Predict Crude Protein in Shrimp Feed

A Wavelet Neural Network Forecasting Model Based On ARIMA

Near Infrared Spectral Linearisation in Quantifying Soluble Solids Content of Intact Carambola

Comparison of the HPLC Method and FT-NIR Analysis for Quantification of Glucose, Fructose, and Sucrose in Intact Apple Fruits

APPLICATION OF GENETIC ALGORITHM IN THE MODELING OF LEAF CHLOROPHYLL LEVEL BASED ON VIS/NIR REFLECTION SPECTROSCOPY

RESEARCH ON COMPLEX THREE ORDER CUMULANTS COUPLING FEATURES IN FAULT DIAGNOSIS

A Comparative Study Concerning Linear and Nonlinear Models to Determine Sugar Content in Sugar Beet by Near Infrared Spectroscopy (NIR)

Fault prediction of power system distribution equipment based on support vector machine

Optical Sensor in the Measurement of Fruits Quality: A Review on an Innovative Approach

Power Supply Quality Analysis Using S-Transform and SVM Classifier

Support Vector Regression (SVR) Descriptions of SVR in this discussion follow that in Refs. (2, 6, 7, 8, 9). The literature

Method for Recognizing Mechanical Status of Container Crane Motor Based on SOM Neural Network

Eyal Ben-Dor Department of Geography Tel Aviv University

Predicting Roadway Surrounding Rock Deformation and Plastic Zone. Depth Using BP Neural Network

Comparison of Four Nondestructive Sensors for Firmness Assessment of Apples

Food authenticity and adulteration testing using trace elemental and inorganic mass spectrometric techniques

Active Learning with Support Vector Machines for Tornado Prediction

Myoelectrical signal classification based on S transform and two-directional 2DPCA

Research Article Honey Discrimination Using Visible and Near-Infrared Spectroscopy

Radial basis function neural networks model to estimate global solar radiation in semi-arid area

Nondestructive Determination of Tomato Fruit Quality Parameters Using Raman Spectroscopy

Detection of Soil Total Nitrogen by Vis-SWNIR Spectroscopy

Identification of terahertz fingerprint spectra extracted from Gas-Fat coal

Simulation on Magnetic Field Characteristics of Permanent-Magnet Seed-Metering Device

Study on Thermal Conductivities Prediction for Apple Fruit Juice by Using Neural Network

Journal of Chemical and Pharmaceutical Research, 2015, 7(4): Research Article. Kinetic model of pickles shelf life prediction

Reading, UK 1 2 Abstract

Study on the Concentration Inversion of NO & NO2 Gas from the Vehicle Exhaust Based on Weighted PLS

Discussion About Nonlinear Time Series Prediction Using Least Squares Support Vector Machine

Multivariate calibration

Elimination of variability sources in NIR spectra for the determination of fat content in olive fruits

Study on Computer-assisted Infrared Spectroscopy for Identification of Chemical Structure System

Quantification of Eplerenone Polymorphs by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS)

Hierarchical models for the rainfall forecast DATA MINING APPROACH

Building Kinetic Models for Determining Vitamin C Content in Fresh Jujube and Predicting Its Shelf Life Based on Near-Infrared Spectroscopy

The Study of Soil Fertility Spatial Variation Feature Based on GIS and Data Mining *

THE APPLICATION OF GREY SYSTEM THEORY TO EXCHANGE RATE PREDICTION IN THE POST-CRISIS ERA

Sparse Support Vector Machines by Kernel Discriminant Analysis

Information retrieval methods for high resolution γ-ray spectra

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics.

NOTE: The color of the actual product may differ from the color pictured in this catalog due to printing limitation.

Real Estate Price Prediction with Regression and Classification CS 229 Autumn 2016 Project Final Report

Development of Method for Non-destructive Measurement of Nitrate Concentration in Vegetable Leaves by Near-infrared Spectroscopy

Research Article Time Series Adaptive Online Prediction Method Combined with Modified LS-SVR and AGO

2.01 INFRARED ANALYZER

(Slides for Tue start here.)

Transcription:

NIR Spectroscopy Identification of Persimmon Varieties Based on PCA-SVM Shujuan Zhang, Dengfei Jie, and Haihong Zhang zsujuan@263.net Abstract. In order to achieve non-destructive measurement of varieties in persimmon, a fast discrimination method based on Vis NIRS spectroscopy was put forward. A Field Spec 3 spectroradiometer was used for collecting 22 sample spectra data of the three kinds of persimmon separately. Then principal component analysis (PCA) was used to process the spectral data after pretreatment. The near infrared fingerprint of persimmon was acquired by principal component analysis(pca) and support vector machine (SVM) methods were used to further identify the persimmon separately. The result of PCA indicated that the score map made by the scores of PCI PC2 and PC3 was used, and 8 principal components (PCs) were selected as the input of support vector machine (SVM) based on the reliabilities of PCs of 99 888.51 persimmon samples were used for calibration and the remaining 15 persimmon samples were used for validation. A one-against- all multi-class SVM model was built, and the result showed that SVM possessing with the RBF kernel function has the best identification capabilities with the accuracy of 100. This research indicated that the mixed algorithm method of principal component analysis(pca) and support vector Machine(SVM) has a good identification effect, and can work as a new method for quick, efficient and correct identification of persimmon separately. Keywords: Vis-NIR Spectroscopy, support vector machine (SVM), Persimmon, Principal component analysis (PCA). 1 Introduction Visible_Near Infrared Diffuse Reflectance (Vis_NIR) Spectroscopy analysis technology can be used for qualitative and quantitative data analysis with full band or multiwavelength spectrum. With many advantages of the spectra technology, such as dispense with pre-treatment, fast, pollution-free, no damage, multi-component analyzing synchronously, good reproducibility and online analysis, it has been widely used in quality detecting studies of agricultural products [1-10]. At present, some scholars have researched the Varieties of some kinds of fruit by near-infrared spectroscopy, for example, Peach [2] ÂSoy sauce [3] ÂYogurt [4 ]and Juice [5]. But there is no research paper to study the varieties of persimmon. Persimmon, as a characteristic fruit of our country, has many characteristics, such as artistic contour, sweet, succulent and rich nutrition. It contains a lot of carotene, vitamin C, glucose, fructose, calcium, phosphorus, iron and other minerals. So D. Li, Y. Liu, and Y. Chen (Eds.): CCTA 2010, Part II, IFIP AICT 345, pp. 118 123, 2011. IFIP International Federation for Information Processing 2011

NIR Spectroscopy Identification of Persimmon Varieties Based on PCA-SVM 119 persimmon has very high culinary and medicinal properties. It can be used to eat as the fresh food, to make wine and to make vinegar. It also can be processed into persimmons biscuits, dried persimmons and persimmon cake. Therefore, the persimmon enjoys the holy in the fruit of reputation. In recent years, with the rural industrial structure adjustment, the persimmon has become an important source of increasing income for farmers in some areas. However, as the case stands, the domestic and foreign markets of fresh persimmon sales have not developed fully. As native products, the storage and grading of persimmon can not meet the market demands. Therefore, the developments and research of detecting technology for the quality of persimmon are important. In particular, Quality detection of persimmon for Simple, rapid, nondestructive is very necessary. Initial establishment of persimmon varieties of forecast model was calculated, using Near-Infrared Spectroscopy, based on principal component analysis and support vector machine combined data mining approach. 2 Materials and Methods 2.1 Software and Analysis Equipment The experiment equipment is composed by computer, spectrometer, and halogen lamp and correction whiteboard. The spectrum sampling has been collected by diffuse reflectance methods using the FieldSpec3 spectrometer made by America ASD (Analytical Spectral Device) company. The interval of spectral sampling is 1nm, the range of sampling is 350 ~ 2500nm, the times of scanning are 30, the resolving power is 3nm and the view angle of probe field is 25. The halogen light of 14.5V is used to match with the spectrometer. Spectral data is exported into the form of ASCII code for processing and the analysis software is the ASD View Spec Pro, Unscramble V9.7, Matlab7.3 and DPS (Data Procession System for Practical Statistics). 2.2 Sample Source and Spectral Data Acquisition Three kinds of persimmon including covered persimmon, Cynanchum persimmon and red persimmon have been purchased from the market. The 22 samples have been collected for each kind of persimmon and the total are 66 samples. All the samples are randomly divided into modeling sets including 51 samples and prediction sets including 15 samples. The spectral data are collected by spectrometer located at the top of persimmon. The distance between the top of persimmon and spectrometer is 50mm, the view angle of probe field is 25 and the time for scanning each sample is 10. 2.3 Spectral Data Preprocessing In order to remove the high-frequency random noise, baseline drift, sample asymmetry and the impact of light scattering, the move average smoothing method with 9 smoothing points was used for spectral pre-processing, which can remove the high frequency noise effectively. Then the data was processed by MSC (Multiplicative Scatter Correction) [6]. All of the pretreatment process carried out in the Unscrambler V9.6. The typical near-infrared spectroscopy curve of three kinds of persimmon is shown in Fig.1.

120 S. Zhang, D. Jie, and H. Zhang 5 J O H F Q D E U R V E $ Disc-shaped persimmon Bovine heart-shaped persimmon Red persimmon :DYHOHQJWK QP Fig. 1. Visible_Near infrared reflectance spectroscopy of three different varieties of persimmon 2.4 Principal Component Analysis of Three Varieties of Persimmon Spectral curve has a larger noise form the first side to the end. Spectrum of 350 ~ 2050nm band has been selected for eliminating noise. PCA (principal component analysis) was used in this study. In the Fig.2, X-axis samples the PC1 (first principal component scores), Y-axis samples the PC2 (second principal component scores), Z- axis samples the PC3 (third principal component scores). Classification performance Fig. 2. Principal Component scores scatter plot (PC1 PC2 PC3) of persimmon

NIR Spectroscopy Identification of Persimmon Varieties Based on PCA-SVM 121 of persimmon is shown from the Fig.2, and the characteristics curve of three different varieties of persimmon was described. However, the distinction wasn t obvious between the edges of the sample. To improve prediction accuracy, the prediction model of three persimmons was established using SVM and PCA in this experiment. 3 Test Results and Analysis 3.1 Principal Component Analysis Sample spectral bands have total 2150 points from 350 ~ 2500nm.Using full spectrum, computation is large, spectral information is weak in some of the regional samples and correlation is lack between the composition of the sample and the nature. The purpose of PCA is data reduction. Under the premise of more variable spectral information without loss of participate, the original variables are replace using a smaller number of new options, and many information of overlapping chemical are excluded [11]. Therefore, the Unscramble V9.6 software is used, principal component of three persimmons were analyzed. The accumulated credibility of first 8 principal components of was got in Table 1.It has reached 99.888%. So the original wavelength can be interpreted by the 8 principal components. Table 1. Accumulative reliabilities of the first 8 Principal components PCs PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 Accumulative reliabilities /% 84.988 95.515 97.742 99.301 99.550 99.722 99.821 99.888 3.2 Support Vector Machine Identification of Persimmon SVM is a new pattern recognition method. The principle risk minimization of structural is uses, Training error and generalization are both, and many advantages are expressed in solving small sample, nonlinear, high dimension and local minimum problems. The 51 samples of the principal component scores as SVM training set, the remaining 15 samples were taken as the prediction set of SVM, and the radial basis function (RBF) is used as a core function to identify varieties of persimmon. Optimal parameter is found using Grid-search function. Namely, main parameters of gamma (γ) and the RBF of kernel function in the sig2 (σ2) are found. The appropriate gamma (γ) can increase the ability of SVM model. The parameter sig2 (σ2) can control the error of regression function, and directly affect the noise immunity of the SVM model. Generalization ability, as learning and prediction of the model, are largely determined by those two parameters. The SVM is realized by the Matlab 7.3. Classification performance of the RBF kernel function was listed in the Table 2. From Table 2, the forecast error of the SVM model was less than ± 0.1, and the recognition rate has reached to 100%. Compared with the conventional model of discriminate analysis, the SVM model of near infrared spectral classification was more accurate.

122 S. Zhang, D. Jie, and H. Zhang Table 2. Results of PLS model with different spectrum Sample number Real value Predicted value Bias 1 1 0.9608-0.0392 2 1 1.0566 0.0566 3 1 1.0131 0.0131 4 1 0.9829-0.0171 5 1 1.0191 0.0191 6 2 2.0263 0.0263 7 2 1.9957-0.0043 8 2 2.0493 0.0493 9 2 1.9331-0.0669 10 2 1.9269-0.0731 11 3 2.9930-0.0070 12 3 3.0981 0.0981 13 3 3.0272 0.0272 14 3 2.9040-0.0960 15 3 2.9606-0.0394 4 Conclusion Identification of persimmon varieties were studied based on Near Infrared Spectroscopy visible. On the basis of Visible-near infrared spectroscopy and spectral data pretreatment, the variety identification model of persimmon was established using PCA and SVM. The Principal Components of PCA as input set, the Gaussian radial basis (RBF) kernel function of SVM as choose, the model of varieties of persimmon has been established. Experimental results show that the model prediction error of unknown varieties of persimmon was less than ± 0.1, and the recognition rate has reached to 100%. The model was accurate, predicted better results, and improved the speed and accuracy. New way of persimmon detection was provided for future spectroscopy. Acknowledgements Funding for this research was provided by Shanxi Provincial Department of Science and technology (NO.2007031109-2). References 1. Zhao, J., Zhang, H., Liu, M.: Non-destructive determination of sugar contents of apples using near infrared diffuse reflectance. Transactions of the Chinese Society of Agricultural Engineering 21(3), 162 165 (2005) 2. Li, X., Hu, X., He, Y.: New approach of discrimination of varieties of juicy peach by Near Infrared spectra based on PCA and MDA model. Journal of Infrared and millimeter waves 25(6), 417 420 (2006)

NIR Spectroscopy Identification of Persimmon Varieties Based on PCA-SVM 123 3. Tong, X., Bao, Y., He, Y.: Study on Fast Discrimination of Brands of Soy Sauce Using Near Infrared Spectra. Spectroscopy and Spectral Analysis 8(3), 597 601 (2008) 4. He, Y., Feng, S., Li, X., et al.: Study on Fast Discrimination of Varieties of Acidophilous Milk Using Near Infrared Spectra. Spectroscopy and Spectral Analysis 26(11), 2021 2023 (2006) 5. Zhang, H., Zhang, S., Jie, D., et al.: Research on Varieties of Sea Buckthorn Juice by Near- Infrared Diffuse Reflectance Spectroscopy Based on the PCR and PLS. Journal of Shanxi Agricultural University (Nature Science Edition) 30(1), 46 48 (2010) 6. Li, G., Zhao, G., Wang, X., et al.: Nondestructive measurement and fingerprint analysis of apple texture quality based on NIR spectra. Transactions of the Chinese Society of Agricultural Engineering 24(6), 169 173 (2008) 7. Zhang, S., Wang, F., Zhang, H., et al.: Detection of the Fresh Jujube Varieties and SSC by NIR Spectroscopy. Transactions of the Chinese Society for Agricultural Machinery 40(4), 139 142 (2009) 8. Li, X., He, Y., Qiu, Z.: Application PCA-ANN Method to Fast Discrimination of Tea Varieties Using Visible/Near Infrared Spectroscopy. Spectroscopy and Spectral Analysis 27(2), 279 282 (2007) 9. Liu, Y., Sun, X., Chen, X.: Research on the Soluble Solids Content of Pear Internal Quality Index by Near-Infrared Diffuse Reflectance Spectroscopy. Spectroscopy and Spectral Analysis 28(4), 797 800 (2008) 10. Wang, G., Zhu, S., Kan, J., et al.: Nondestructive Detection of Volatile Oil Content in Zanthoxylum Bungeagum Maxim by Near Infrared Spectroscopy. Transactions of the Chinese Society for Agricultural Machinery 39(3), 79 85 (2008) 11. He, Y., Li, X., Shao, Y.: Discrimination of Varieties of Apple Using Near Infrared Spectra Based on Principal Component Analysis and Artificial Neural Network Model. Spectroscopy and Spectral Analysis 26(5), 850 853 (2006) 12. Li, G., Wang, M., Zeng, H.: An Introduction to Support Vector Machines and other Kernel-Based Learning Methods. Publishing House of Electronics Industry, Beijing (2004) 13. Deng, N., Tian, Y.: New Method of Data Minning_Support Vector Machine. Science Press, Beijing (2004)