Iron Ore Rock Recognition Trials

Australian Centre for Field Robotics
A Key Centre of Teaching and Research
The Rose Street Building J04
The University of Sydney 2006 NSW, Australia

Iron Ore Rock Recognition Trials
Fabio Ramos, Peter Hatherly & Sildomar Monteiro
T: +61 2 9036 7058
E: f.ramos@acfr.usyd.edu.au

Technical Report ACFR-TR-2009-001
1 June 2008
Released 12 March 2009

Table of Contents

1 Introduction
2 The Learning Procedure
2.1 Boosting
3 Experiments
3.1 Trial Area
3.2 Drill Sensors
4 Geophysical Interpretation of Trial Site
5 Results
6 Conclusions
7 References
Appendix 1 Selected Drill Monitoring Data for Row 1
Appendix 2 Geophysical Logging Results for Row 1
Appendix 3 Core Photographs and Geotechnical Logs

1 Introduction

With the availability of monitoring data relating to drill performance and operation, the possibility arises of using that data to estimate the properties of the rocks being drilled. These properties could include rock type and strength. ACFR has previously developed theory and algorithms for the analysis of multi-parameter data in other automation projects requiring the fusion of sensor information to provide the most likely estimate of a quantity of interest. It was felt that these techniques could be applied to the problem of estimating ground conditions using the monitoring data gathered during the drilling of blast holes.

For a trial in November 2006, a comprehensive set of drill monitoring data was obtained together with geotechnical, geological and geophysical support data aimed at providing an insight into the ground conditions. This report concerns the analysis of the drill monitoring data. In keeping with the terminology of the petroleum industry, the drill monitoring data is termed measurement-while-drilling (MWD) data. The analysis of the MWD data to determine subsurface geological conditions is termed rock recognition.

2 The Learning Procedure

2.1 Boosting

A machine learning approach was used to combine the sensor inputs for rock recognition. Among the techniques tested (logistic regression, K-nearest neighbour and boosting), boosting performed the best.

A classical supervised learning approach consists of two phases. In the learning phase, the algorithm is presented with input data and corresponding labels (rock recognition assignments). An optimisation procedure is then performed to learn a function that maps the inputs to the labels. Once this function is learned, new input data can be evaluated and a label obtained. This second phase is known as the testing phase. The final implementation of rock recognition will consist of an off-line algorithm for learning and an online (embedded) algorithm for testing. Importantly, the machine learning procedure employed in this work can easily accommodate new information. There is no need for a specialist in the method; the only requirement is a training set consisting of input data and corresponding labels. This allows considerable flexibility for using the method with different drills and different geologies.

Boosting is a machine learning technique for supervised classification that combines simple, weak classifiers to produce a powerful committee [2][1]. It has a sound theoretical foundation and provides probability estimates for each class. Boosting has become very popular because many empirical studies show that it tends to yield smaller classification error rates and to be more robust to overfitting than competing methods such as Support Vector Machines or Neural Networks. Other advantages of boosting are implementation simplicity and computational efficiency. The main disadvantage is the difficulty of interpreting the classification boundary due to its high degree of nonlinearity.

There are many different variations of boosting algorithms. The most commonly used version is AdaBoost (from "adaptive boosting"), developed by Freund and Schapire [2]. The idea of AdaBoost is to train many weak learners on various distributions (or sets of weights) of the input data and then combine the classifiers produced into a single committee. A weak learner can be any classifier whose performance is guaranteed to be better than a random guess.
A very common weak learner is the decision stump, which produces an axis-orthogonal hyperplane and can be viewed as a one-level binary decision tree. Initially, the weights of all training examples are set equally, but after each round of the AdaBoost algorithm, the weights of incorrectly classified examples are increased. The final committee, or ensemble, is a weighted majority combination of the weak classifiers. A formal description of the AdaBoost algorithm for binary classification is shown below. Empirical observations have shown that, although AdaBoost presents good generalisation, it may overfit in some cases and its performance is degraded in the presence of noise.

AdaBoost

Given a training set of $n$ labelled examples $(x_1, y_1), \ldots, (x_n, y_n)$ where $y_i \in \{-1, +1\}$.

Initialise the weighting coefficients $w_i^1 = 1/n$ for $i = 1, \ldots, n$.

For $m = 1, \ldots, M$:

a. Normalise the weighting coefficients: $w_i^m \leftarrow w_i^m / \sum_{j=1}^{n} w_j^m$.

b. Fit a classifier $g_m$ to the training data that minimises the weighted error function $\varepsilon_m = \sum_{i=1}^{n} w_i^m \, I\big(y_i \neq g_m(x_i)\big)$.

c. Evaluate the log-odds $\alpha_m = \tfrac{1}{2} \log \dfrac{1 - \varepsilon_m}{\varepsilon_m}$.

d. Update the weighting coefficients: $w_i^{m+1} = w_i^m \exp\{-\alpha_m \, g_m(x_i) \, y_i\}$.

Return the prediction of the final ensemble: $Y(x) = 1$ if $\sum_{m=1}^{M} \alpha_m g_m(x) > 0$, and $-1$ otherwise.
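To make the loop above concrete, the following is a minimal sketch of binary AdaBoost with decision stumps in Python. It is illustrative only and is not the implementation used in the trials; the function and variable names are ours.

```python
import numpy as np

def fit_stump(X, y, w):
    """Pick the (feature, threshold, polarity) decision stump with the lowest
    weighted error. X is (n, d), y is in {-1, +1}, w holds the example weights."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) > 0, 1, -1)
                err = np.sum(w * (pred != y))
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_train(X, y, M=50):
    """Binary AdaBoost: returns a list of (alpha, feature, threshold, polarity)."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # initialise weights to 1/n
    ensemble = []
    for _ in range(M):
        w = w / w.sum()                          # a. normalise the weights
        err, j, thr, pol = fit_stump(X, y, w)    # b. fit the weak learner
        err = np.clip(err, 1e-10, 1 - 1e-10)     # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # c. log-odds weight of this round
        pred = np.where(pol * (X[:, j] - thr) > 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)        # d. up-weight the misclassified examples
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted majority vote of the stumps: +1 if the score is positive, else -1."""
    score = sum(a * np.where(p * (X[:, j] - t) > 0, 1, -1) for a, j, t, p in ensemble)
    return np.where(score > 0, 1, -1)
```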

Several methods have been proposed to extend boosting to the multiclass case. The typical strategy, called the one-vs-all approach, reduces the multiclass problem to a set of independent binary problems. However, this approach has scalability issues: the computational cost grows with the number of classes because each binary classifier must be trained and run independently. For rock recognition purposes we use a version of boosting named LogitBoost. LogitBoost fits additive logistic regression models by stagewise optimisation of the maximum likelihood. The algorithm below shows the multiclass version of LogitBoost, which optimises the likelihood through adaptive Newton steps [1].

LogitBoost (J classes)

1. Start with weights $w_{ij} = 1/N$ for $i = 1, \ldots, N$, $j = 1, \ldots, J$, and set $F_j(x) = 0$ and $p_j(x) = 1/J$ for all $j$.

2. Repeat for $m = 1, 2, \ldots, M$:

(a) Repeat for $j = 1, \ldots, J$:

(i) Compute the working responses and weights in the $j$th class, where $y^*_{ij}$ is 1 if example $i$ belongs to class $j$ and 0 otherwise:
$z_{ij} = \dfrac{y^*_{ij} - p_j(x_i)}{p_j(x_i)\,(1 - p_j(x_i))}$, $\quad w_{ij} = p_j(x_i)\,(1 - p_j(x_i))$.

(ii) Fit the function $f_{mj}(x)$ by a weighted least-squares regression of $z_{ij}$ to $x_i$ with weights $w_{ij}$.

(b) Set $f_{mj}(x) \leftarrow \dfrac{J-1}{J}\Big(f_{mj}(x) - \dfrac{1}{J}\sum_{k=1}^{J} f_{mk}(x)\Big)$ and $F_j(x) \leftarrow F_j(x) + f_{mj}(x)$.

(c) Update $p_j(x) = \dfrac{e^{F_j(x)}}{\sum_{k=1}^{J} e^{F_k(x)}}$, with $\sum_{k=1}^{J} F_k(x) = 0$.

3. Output the classifier $\arg\max_j F_j(x)$.
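As a rough illustration of the update above, the following Python sketch implements one possible multiclass LogitBoost using shallow regression trees as the weighted least-squares base learner. The report does not specify the base regressor or software used, so that choice, the capping of the working responses, and all names here are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor  # assumed base learner; the report does not name one

def logitboost_fit(X, y, J, M=100):
    """Multiclass LogitBoost sketch. y holds integer class labels 0..J-1."""
    N = len(y)
    Y = np.eye(J)[y]                              # y*_ij: one-hot class indicators
    F = np.zeros((N, J))
    p = np.full((N, J), 1.0 / J)
    rounds = []
    for _ in range(M):
        fm = np.zeros((N, J))
        trees = []
        for j in range(J):
            w = np.clip(p[:, j] * (1 - p[:, j]), 1e-10, None)    # working weights
            z = np.clip((Y[:, j] - p[:, j]) / w, -4, 4)           # working responses, capped for stability
            tree = DecisionTreeRegressor(max_depth=2).fit(X, z, sample_weight=w)
            fm[:, j] = tree.predict(X)
            trees.append(tree)
        fm = (J - 1) / J * (fm - fm.mean(axis=1, keepdims=True))  # symmetrise so the J scores sum to zero
        F += fm
        p = np.exp(F - F.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)                         # class probability estimates
        rounds.append(trees)
    return rounds

def logitboost_predict(rounds, X, J):
    """Accumulate the symmetrised scores of every round and return argmax_j F_j(x)."""
    F = np.zeros((len(X), J))
    for trees in rounds:
        fm = np.column_stack([t.predict(X) for t in trees])
        F += (J - 1) / J * (fm - fm.mean(axis=1, keepdims=True))
    return F.argmax(axis=1)
```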

3 Experiments

3.1 Trial Area

Figure 1 shows the arrangement of the drill holes. The site was chosen because a variety of down-hole conditions was expected. Three rows of holes were drilled with the drill operating in percussion mode (down-hole hammer) with and without the shock absorber, and in rotary mode with and without the shock absorber. Of the three rows of holes, row 1 was the most comprehensive. In this row, 28 holes 12 m deep were drilled at 3 m spacings with the drill in percussion mode and with the use of the shock absorber.

In addition to the test holes drilled with the blast hole rig, 7 diamond drill holes were drilled to a depth of 12 m at the locations shown in Figure 1. The cores were recovered for logging and testing. At the completion of the drilling, all blast holes and the diamond drill holes were geophysically logged using caliper, natural gamma, magnetic susceptibility and density (gamma-gamma) logging tools. The detailed geology was determined by site geologists using a combination of cone logging, core logging and the geophysical logging results. As shown by the geological section for this set of holes (Figure 2), the geological conditions ranged from shale to ore to banded iron formation (BIF). All cores were photographed and geotechnical logs were prepared. Appendix 2 contains the geophysical logging data for row 1. Core photographs and the geotechnical logs are provided in Appendix 3.

Figure 1. Pattern of resource holes, test holes and subsequent blast holes drilled at the test site (plan in local coordinates; the legend distinguishes blast holes, resource holes, rock recognition rows 1 to 3 and rock recognition core holes).

Figure 2. Geological section through row 1 of the test holes.

3.2 Drill Sensors

The MWD sensors used for the rock recognition were:
1. Bit air pressure;
2. Pull-down pressure;
3. Rotation pressure;
4. Pull-down rate;
5. Head speed;
6. Pressure transducers (7 in total);
7. Accelerometers mounted on the mast (5 in total).

A Programmable Logic Controller (PLC) was used to log the data from sensors 1-5 of the above list. Although the sampling frequency was variable, the average was 10 readings per sensor per second. The pressure transducers were logged at 500 Hz. The accelerometers were logged at a much higher sampling rate of about 10 kHz.

As the data were quite noisy, a feature extraction procedure was performed before the data were input into the classifier. For sensors 1-6, the feature extraction procedure consisted of grouping the measurements into sections of 0.1 m using the sensor time stamps and the corresponding head position of the drill; from each section an average value was obtained. This process worked well for the sensor data for row 1, but unfortunately there was a serious sensor failure for the subsequent rows and those data could not be analysed. For the accelerometers, the measurements were grouped into sections of 0.1 m as before and the standard deviation was computed, the idea being to quantify the vibration in a simple manner. Frequency-domain analyses of the accelerometer data were also performed at the University of Queensland; however, no conclusive results were obtained and the simple feature extraction procedure above was used for the results reported here. During the data acquisition, the drill operated in percussion and rotary modes, which caused large vibrations in the machine and makes the accelerometer data challenging to interpret.
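As an illustration of this binning step, a minimal sketch follows; the column names and the use of pandas are our assumptions and are not part of the original processing chain.

```python
import pandas as pd

DEPTH_BIN = 0.1  # section length in metres, as used in the report

def extract_features(samples: pd.DataFrame, sensor_cols, accel_cols):
    """Group raw drill-monitoring readings into 0.1 m depth sections.

    `samples` is assumed to have one row per reading, a 'head_position' column
    in metres, and the listed sensor/accelerometer channels (names illustrative).
    Returns one feature row per section: the mean of each MWD channel and the
    standard deviation of each accelerometer channel (a simple vibration proxy).
    """
    section = (samples["head_position"] // DEPTH_BIN).astype(int)
    grouped = samples.groupby(section)
    means = grouped[sensor_cols].mean()
    vibration = grouped[accel_cols].std().add_suffix("_std")
    features = means.join(vibration)
    features.index = features.index * DEPTH_BIN   # section start depth in metres
    return features
```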

4 Geophysical Interpretation of Trial Site

Detailed analysis of MWD data for rock recognition purposes requires independent data for training the learning algorithm and for evaluating the results. While the geological data from the cone and core logging provided a general geological section, these data were not at the 0.1 m increments used for the MWD data. The geotechnical data were also not available at this scale. Fortunately, the geophysical borehole logs provided quantitative down-hole measurements that could be correlated with the geology and also allowed direct comparison with the MWD data at 0.1 m increments. The analysis of the geophysical logging data was undertaken as follows.

The natural gamma logs are ideally suited to discriminating shale from ore and banded iron formation (BIF). Figure 3 shows the data for row 1. If a cut-off of 15 API is used to mark the transition from shale to ore and BIF, the results shown in Figure 4 are obtained. These compare favourably with the geological section shown in Figure 2.

Figure 3. Smoothed natural gamma data for row 1 shown at 0.1 m depth intervals (units are API).

Figure 4. The occurrence of shale (bright red) defined by natural gamma data being > 15 API.

To discriminate ore from BIF, the magnetic susceptibility logs were used. Figure 5 shows the magnetic susceptibility data. The criterion applied to distinguish BIF from the ore types (the shale had already been identified) was that, for BIF, the magnetic susceptibility needed to exceed a preset threshold (about 12) somewhere within a preset distance (less than 1 m). Figure 6 shows the results of applying this rule.

Figure 5. Smoothed magnetic susceptibility data for row 1 (dimensionless susceptibility values are ×10^-5).

Figure 6. Interpreted geological section showing shale (bright red), BIF (blue) and ore (brown).

Finally, an attempt was made to estimate the hardness of the ore types on the basis of the density logs. Figure 7 shows the smoothed density. It can be seen that higher densities are present within the ore zones in holes 11 to 15. It was assumed that these higher densities represented harder ore. The distribution of the densities within the ore zones is shown in more detail in Figure 8. On the basis of this distribution, hard, medium and soft ore types were assumed to be present. Figure 9 shows the final geophysical interpretation.

Figure 7. Smoothed density data for row 1 (units of t/m³).

Figure 8. Distribution of densities within the ore zone. Interpretations of hard, medium and soft ore were made on the basis of this distribution.

Figure 9. Final classification of the geological section for row 1 based on the analysis of the geophysical logs.
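The rules above lend themselves to a simple labelling script. The sketch below is one plausible reading of them (in particular, the "within 1 m" rule is interpreted as a symmetric window around each section); the thresholds come from the text, but everything else, including how hardness cut points would be applied, is an assumption.

```python
import numpy as np

GAMMA_SHALE_API = 15.0   # natural gamma cut-off for shale (from the text)
SUSC_BIF = 12.0          # magnetic susceptibility threshold for BIF (from the text, x10^-5)
WINDOW_M = 1.0           # "preset distance" used in the BIF rule (assumed symmetric window)
STEP_M = 0.1             # depth increment of the logs

def label_hole(gamma, susceptibility):
    """Label each 0.1 m section of one hole as 'shale', 'BIF' or 'ore'.

    gamma and susceptibility are 1-D arrays sampled every 0.1 m down the hole.
    Ore sections would subsequently be split into hard/medium/soft using density
    cut points read off the distribution in Figure 8 (not reproduced here).
    """
    n = len(gamma)
    half = int(WINDOW_M / STEP_M)
    labels = []
    for i in range(n):
        if gamma[i] > GAMMA_SHALE_API:
            labels.append("shale")
        elif np.max(susceptibility[max(0, i - half):min(n, i + half + 1)]) > SUSC_BIF:
            labels.append("BIF")
        else:
            labels.append("ore")
    return labels
```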

5 Results

As mentioned above, there were problems with the MWD data for rows 2 and 3; it was therefore only possible to analyse the results from row 1. Furthermore, in the results shown below, accelerometer data have not been used.

In the first experiment, the objective was to distinguish shale from the other rock types. The natural gamma-ray data were used to label the training data. Figure 10 shows the results for leave-one-out cross validation; i.e., each 0.1 m section is evaluated using a model trained with all the other sections.

A direct comparison between Figure 10 and the basic shale model in Figure 4 shows that the boosting classifier is able to predict the shale bands quite accurately from the MWD data. This is not a complete surprise given that shale is the softest rock type considered and, in general, possesses very different mechanical properties from the others.

Figure 10. Shale detection results using boosting. The red colour indicates shale.

The second experiment was aimed at recognising four different rock types: shale, zone A iron ore, zone B iron ore and BIF. The same evaluation procedure as before was used, i.e., leave-one-out cross validation. Figure 11 depicts the results, which can be compared to the geological interpretation of Figure 2. As can be observed, the shale bands are detected accurately except at the top of hole 28, where a sensor failure occurred. The BIF zone is on the right part of the section, separated from zone B by a shale band. In this particular experiment it was possible to distinguish between the iron ore zones (zones A and B); however, we believe this will be difficult in the general case, since iron ore zones may have very similar mechanical properties.

Figure 11. Geological recognition using boosting.

In the final experiment, the goal was to combine geology and hardness in the classification process. Labelling of the training data was obtained using the procedure described in the previous section. In summary, the system was trained to identify shale, BIF and iron ore zones; within the iron ore zones, the system classifies sections according to hardness as soft, medium and hard ore. In this particular experiment, the cross validation was performed differently. Instead of leaving one section out as before, the whole test hole was left aside; i.e., the classifier was trained on 27 holes and tested on the remaining hole. This process was then repeated 28 times. Compared with the other cross validation procedure, where only one section was left out of the training, this procedure is much more demanding and requires better generalisation, so the previous results are not directly comparable. However, comparison of Figure 12 with the geophysical interpretation in Figure 9 shows a favourable correlation.
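The hole-wise split described here amounts to the following sketch, in which train_fn and predict_fn are placeholders for whichever classifier (for example, the boosting routines sketched earlier) is being evaluated.

```python
import numpy as np

def leave_one_hole_out(X, y, hole_ids, train_fn, predict_fn):
    """Leave-one-hole-out cross validation: train on 27 holes, test on the 28th,
    repeated for every hole. hole_ids gives the hole each 0.1 m section came from."""
    predictions = np.empty_like(y)
    for hole in np.unique(hole_ids):
        test = (hole_ids == hole)
        model = train_fn(X[~test], y[~test])            # fit on all other holes
        predictions[test] = predict_fn(model, X[test])  # predict the held-out hole
    return predictions
```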

Figure 12. Combined geological recognition and hardness results using boosting.

A qualitative assessment of these results was obtained by visually inspecting the bench face at the trial position and the diamond drill core next to hole 12. This core had large sections of relatively strong, competent ore with few fractures. Observation of the size of the rocks on the floor after blasting showed that larger rocks were formed around holes 12 and 13. This is a good indication that this area had harder rock that was not blasted effectively, resulting in larger post-blast fragments.

6 Conclusions

While the failure of the sensor logging for the drilling of the second and third rows was unfortunate, the results for row 1, which utilised percussion drilling, are most encouraging given the degree of damping introduced by the shock absorber. Indeed, this may be one of the first successful instances of rock recognition being undertaken on down-hole (in-the-hole) hammer percussion drilling anywhere.

With the good geological control provided by the geophysical logging, the rock recognition tests utilising the MWD data were clearly able to delineate the ore zones from shale and BIF. If the variations in density shown by the geophysical logging can be taken to indicate differences in the hardness of the ore, then it would seem that hardness can also be recognised from the MWD data.

Provision of good geological control will continue to be an issue for further rock recognition trials. Geophysical logging will be essential. Coring and good geological logging are also desirable. Means of calibrating results against blast performance are needed.

The approach to supervised learning that has been utilised here has been successful. It is hoped that a machine learning procedure that can easily accommodate new information will be the final outcome, without the need for specialist involvement. The main requirement will be a training set consisting of input data and corresponding labels. This will allow considerable flexibility for using the method with different drills and different geological conditions.

7 References

[1] Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. Annals of Statistics, vol. 28, no. 2, pp. 337-407.

[2] Freund, Y. and Schapire, R.E. (1996). Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning, pp. 148-156.

[3] Torralba, A., Murphy, K.P. and Freeman, W.T. (2007). Sharing visual features for multiclass and multiview object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, pp. 854-869.

Appendix 1 Selected Drill Monitoring Data for Row 1

Of the following data, the PLC fitted to the drill provided the bit pressure, the rotation pressure, the pull-down pressure, the pull-down rate and the head speed. In addition, pressure transducers were installed to monitor the following drill functions:

Feed down (Transducer 1)
Feed up (Transducer 2)
Reverse rotation (Transducer 3)
Forward rotation (Transducer 4)
Rotation relief (Transducer 5)
Feed relief (Transducer 6)
Hold back (Transducer 7)

Appendix 2 Geophysical Logging Results for Row 1

Appendix 3 Core Photographs and Geotechnical Logs
