Neural Networks & Fuzzy Logic. Introduction


Definition & Area of Application

Neural Networks (NN) are:
- mathematical models that resemble nonlinear regression models, but are also useful for modelling nonlinearly separable spaces
- knowledge-acquisition tools that learn from examples

Neural Networks are used for:
- pattern recognition (objects in images, voice, medical diagnostics for diseases, etc.)
- exploratory analysis (data mining)
- predictive models and control

Biological Analogy

Perceptrons

- Output of unit j: o_j = f(a_j)
- Input to unit j: a_j = Σ_i w_ij a_i
- Input to input unit i: a_i = measured value of variable i

Example: Logical AND Function with a NN

A single unit with weights w1, w2 and threshold θ = 0.5 computes y = f(x1 w1 + x2 w2), where

f(a) = 1, for a > θ
f(a) = 0, for a ≤ θ

Truth table:

x1  x2 | y
0   0  | 0
0   1  | 0
1   0  | 0
1   1  | 1

Some possible values for w1 and w2:

w1    w2
0.20  0.35
0.20  0.40
0.25  0.30
0.40  0.20
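The AND unit can be sketched directly from the slide's numbers; a minimal sketch, using one of the listed weight pairs (w1 = 0.20, w2 = 0.35) and the threshold θ = 0.5 (the function names are mine):

```python
def step(a, theta=0.5):
    """Threshold activation: y = 1 for a > theta, else 0."""
    return 1 if a > theta else 0

def and_unit(x1, x2, w1=0.20, w2=0.35):
    """Single perceptron unit: y = f(x1*w1 + x2*w2) with the step activation."""
    return step(x1 * w1 + x2 * w2)

# Reproduce the truth table: only (1, 1) pushes the sum (0.55) over the threshold.
table = [(x1, x2, and_unit(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)]
```

Any weight pair from the table works, as long as each weight alone stays at or below θ while their sum exceeds it.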

Example: Perceptrons (NN) in Medical Diagnostics

(Figure: input units Cough and Headache connect through weights to output units No disease, Pneumonia, Flu, Meningitis.) The error is the difference between what we got and what we wanted; the Δ rule changes the weights to decrease that error.
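The Δ rule update just mentioned can be written as one line per weight: w_i ← w_i + η (target − output) a_i. A minimal sketch, assuming a learning rate η (the variable names are mine, not the slide's):

```python
def delta_rule(weights, inputs, target, output, eta=0.1):
    """Delta rule: each weight moves by eta * error * its input activation,
    where error = target - output."""
    error = target - output
    return [w + eta * error * a for w, a in zip(weights, inputs)]

# One update on a single training case: the error was +1, so weights on
# active inputs increase, nudging the next output toward the target.
new_weights = delta_rule([0.0, 0.0], [1.0, 1.0], target=1.0, output=0.0)
```

Repeated over many cases (epochs), these small corrections drive the error down.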

Linear Separation

Nonlinear Separation

(Figure: decision boundaries with a linear activation vs. a nonlinear activation.)

Multilayered Perceptrons

Example: Multilayer NN for Diagnosis of Abdominal Pain

(Figure: a network with adjustable weights mapping the inputs Male, Age, Temp, WBC, Pain Intensity, Pain Duration to output units for Appendicitis, Diverticulitis, Perforated Duodenal Ulcer, Non-specific Pain, Cholecystitis, Small Bowel Obstruction, Pancreatitis.)

Regression vs. Neural Networks: Jargon Pseudo-Correspondence

- independent variable = input variable
- dependent variable = output variable
- coefficients = weights
- estimates = targets
- cycles = epochs

Logistic Regression Model

(Figure: inputs Age = 34, Gender, Stage feed a single Σ unit; the weighted sum Σ = 34(.5) + 1(.4) + 4(.8) = 20.6 is squashed into the output 0.6, the predicted probability of being alive.)

Independent variables: x1, x2, x3. Coefficients: a, b, c. Dependent variable: the prediction.
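The model in the figure, a weighted sum of the inputs pushed through the logistic function, can be sketched as follows; the coefficient and intercept values below are illustrative placeholders, not fitted values from the slide:

```python
import math

def predict(inputs, coeffs, intercept=0.0):
    """Logistic regression: weighted sum of inputs, squashed to a probability."""
    s = intercept + sum(x * c for x, c in zip(inputs, coeffs))
    return 1.0 / (1.0 + math.exp(-s))

# Hypothetical patient (age 34, gender 1, stage 4) and illustrative coefficients:
p = predict([34, 1, 4], [0.05, 0.4, 0.8], intercept=-5.0)
```

Whatever the coefficients, the output is always squeezed into (0, 1), which is what lets it be read as a probability.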

Neural Network Model

(Figure: inputs Age = 34, Gender = 2, Stage = 4 feed a hidden layer of Σ units through one set of weights; the hidden units feed an output Σ unit through a second set of weights, producing the output 0.6, the predicted probability of being alive.)

Activation functions:
- linear
- threshold or step function
- logistic (sigmoid, squash)
- hyperbolic tangent
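The forward pass of such a network is two rounds of the same operation: weighted sum, then activation. A minimal sketch with a logistic (sigmoid) activation; the layer sizes and weight values are illustrative, not the figure's:

```python
import math

def sigmoid(a):
    """Logistic (squashing) activation."""
    return 1.0 / (1.0 + math.exp(-a))

def forward(inputs, hidden_weights, output_weights):
    """One hidden layer: weighted sums into the hidden units, then the
    hidden activations are weighted and summed into the single output."""
    hidden = [sigmoid(sum(x * w for x, w in zip(inputs, ws)))
              for ws in hidden_weights]
    return sigmoid(sum(h * w for h, w in zip(hidden, output_weights)))

# Illustrative weights: 3 inputs -> 2 hidden units -> 1 output.
out = forward([34, 2, 4], [[0.6, 0.2, 0.1], [0.3, 0.7, 0.2]], [0.5, 0.8])
```

Swapping `sigmoid` for a threshold, linear, or tanh function changes only the activation, not the structure of the pass.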

Learning: Hidden Units and Backpropagation

Minimizing the Error

(Figure: an error surface over a weight w, showing the initial error at w_initial, the negative derivative driving the change toward w_trained, the final error, and a local minimum.)

Error functions:
- Mean squared error (for most problems): Σ(t − o)² / n
- Cross-entropy error (for dichotomous or binary outcomes): −Σ [t ln o + (1 − t) ln(1 − o)]
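Both error functions can be computed directly from targets t and outputs o; a minimal sketch (the cross-entropy carries the conventional leading minus sign so the error is non-negative):

```python
import math

def mse(targets, outputs):
    """Mean squared error: sum of (t - o)^2 over the n cases."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def cross_entropy(targets, outputs):
    """Binary cross-entropy error: -sum of t*ln(o) + (1-t)*ln(1-o).
    Outputs must lie strictly between 0 and 1."""
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for t, o in zip(targets, outputs))
```

Either function is zero when every output matches its target exactly, and grows as the outputs drift away; training descends on whichever one is chosen.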

Implementation of Learning: Gradient Descent and Minima

(Figure: error against epochs, showing descent that can settle in a local minimum rather than the global minimum.)

Implementation of Learning: the Problem of Overfitting

(Figure: an overfitted model vs. the real model of CHD against age; and the error on the training set vs. a holdout set as training epochs increase.)

Implementation of Learning: Problem of Overfitting (continued)

(Figure: total sum of squares (tss) against epochs for the test set (tss_a) and the training set (tss_b); training error keeps falling while test error reaches a minimum.) Stopping criterion: stop training at the minimum of the test-set tss.
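The stopping criterion above, stop when the test-set error bottoms out, can be sketched as a small check run after each epoch; this is a hypothetical helper, and the `patience` knob (how many epochs without improvement to tolerate) is my own addition, not from the slide:

```python
def should_stop(test_errors, patience=2):
    """Stop once the test-set error has failed to improve on its
    earlier best for `patience` consecutive epochs."""
    if len(test_errors) <= patience:
        return False
    best_so_far = min(test_errors[:-patience])
    return min(test_errors[-patience:]) >= best_so_far

# Test error fell, then started climbing: time to stop.
stop_now = should_stop([0.9, 0.5, 0.6, 0.7])
```

In practice one also keeps the weights from the best epoch, not the last one, so the returned model sits at the test-error minimum.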

Parameter Estimation

Logistic regression:
- models just one function
- maximum likelihood; fast
- optimizations: Fisher scoring, Newton-Raphson

Neural network:
- models several functions
- backpropagation; iterative, slow
- optimizations: Quickprop, scaled conjugate gradient descent, adaptive learning rate

What Do You Want? Insight versus Prediction

Insight into the model:
- explain the importance of each variable
- assess model fit to existing data

Accurate predictions:
- make a good estimate of the real probability
- assess model prediction in new data

Model Selection: Finding Influential Variables

Logistic regression:
- forward, backward, stepwise
- arbitrary
- all combinations
- relative risk

Neural network:
- weight elimination
- automatic relevance determination
- relevance

Regression Diagnostics: Finding Influential Observations

Logistic regression — analysis of residuals:
- Cook's distance
- deviance
- difference in coefficients when a case is left out

Neural network:
- ad hoc methods

How Accurate Are the Predictions?

Construct training and test sets, or bootstrap, to assess unbiased error. Then assess:
- Discrimination: how well the model separates alive and dead
- Calibration: how close the estimates are to the real probabilities

Unbiased Evaluation: Training and Test Sets

- The training set is used to build the model (it may include a holdout set to control for overfitting).
- The test set is left aside for evaluation purposes.
- Ideal: yet another validation data set, from a different source, to test whether the model generalizes to other settings.
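The first step, carving the data into a training set and a held-out test set, can be sketched as a single shuffle-and-cut; a minimal sketch with an assumed 70/30 split (the fraction and seed are illustrative choices):

```python
import random

def split(rows, train_frac=0.7, seed=0):
    """Shuffle once (deterministically, for reproducibility), then cut the
    data into a training set and a held-out test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

train, test = split(range(10))
```

The test rows must never influence model building; they are touched only once, at evaluation time.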

Evaluation of NN

More Examples: ECG Interpretation

More Examples: Thyroid Diseases

Expert Systems and Neural Nets

Fuzzy Logic

Fuzzy Logic Definition

Experts rely on common sense when they solve problems. How can we represent expert knowledge that uses vague and ambiguous terms in a computer?

Fuzzy logic is not logic that is fuzzy, but logic that is used to describe fuzziness. Fuzzy logic is the theory of fuzzy sets: sets that calibrate vagueness. Fuzzy logic is based on the idea that all things admit of degrees. Temperature, height, speed, distance, beauty — all come on a sliding scale. The motor is running really hot. Tom is a very tall guy.

Fuzzy Logic Definition (continued)

Many decision-making and problem-solving tasks are too complex to be understood quantitatively; however, people succeed by using knowledge that is imprecise rather than precise. Fuzzy set theory resembles human reasoning in its use of approximate information and uncertainty to generate decisions. It was specifically designed to mathematically represent uncertainty and vagueness, and to provide formalized tools for dealing with the imprecision intrinsic to many engineering and decision problems in a more natural way.

Boolean logic uses sharp distinctions. It forces us to draw lines between members of a class and non-members. For instance, we may say Tom is tall because his height is 181 cm. If we drew a line at 180 cm, we would find that David, who is 179 cm, is small. Is David really a small man, or have we just drawn an arbitrary line in the sand?

Fuzzy Logic: a Bit of History

Fuzzy, or multi-valued, logic was introduced in the 1930s by Jan Lukasiewicz, a Polish philosopher. While classical logic operates with only two values, 1 (true) and 0 (false), Lukasiewicz introduced a logic that extended the range of truth values to all real numbers in the interval between 0 and 1. For example, the possibility that a man 181 cm tall is really tall might be set to a value of 0.86: it is likely that the man is tall. This work led to an inexact reasoning technique often called possibility theory.

In 1965, Lotfi Zadeh published his famous paper "Fuzzy sets". Zadeh extended the work on possibility theory into a formal system of mathematical logic, and introduced a new concept for applying natural language terms. This new logic for representing and manipulating fuzzy terms was called fuzzy logic.

Why Fuzzy Logic?

Why fuzzy? As Zadeh said, the term is concrete, immediate and descriptive; we all know what it means. However, many people in the West were repelled by the word fuzzy, because it is usually used in a negative sense.

Why logic? Fuzziness rests on fuzzy set theory, and fuzzy logic is just a small part of that theory. The term fuzzy logic is used in two senses:
- Narrow sense: fuzzy logic is a branch of fuzzy set theory which deals (as logical systems do) with the representation of, and inference from, knowledge. Unlike other logical systems, it deals with imprecise or uncertain knowledge. In this narrow, and perhaps correct, sense, fuzzy logic is just one of the branches of fuzzy set theory.
- Broad sense: fuzzy logic is used synonymously with fuzzy set theory.

Fuzzy Logic Fuzzy Applications Theory of fuzzy sets and fuzzy logic has been applied to problems in a variety of fields: taxonomy; topology; linguistics; logic; automata theory; game theory; pattern recognition; medicine; law; decision support; Information retrieval; etc. And more recently fuzzy machines have been developed including: automatic train control; tunnel digging machinery; washing machines; rice cookers; vacuum cleaners; air conditioners, etc.

Fuzzy Logic Fuzzy Applications

Advertisement: Extraklasse Washing Machine, 1200 rpm. The Extraklasse machine has a number of features which will make life easier for you.

- Fuzzy Logic detects the type and amount of laundry in the drum and allows only as much water to enter the machine as is really needed for the loaded amount. And less water will heat up quicker, which means less energy consumption.
- Foam detection: too much foam is compensated by an additional rinse cycle. If Fuzzy Logic detects the formation of too much foam in the rinsing spin cycle, it simply activates an additional rinse cycle. Fantastic!
- Imbalance compensation: in the event of imbalance, Fuzzy Logic immediately calculates the maximum possible speed, sets this speed and starts spinning. This provides optimum utilization of the spinning time at full speed [...]
- Washing without wasting, with automatic water level adjustment: fuzzy automatic water level adjustment adapts water and energy consumption to the individual requirements of each wash programme, depending on the amount of laundry and type of fabric [...]

Fuzzy Logic More Definitions

Fuzzy logic is a set of mathematical principles for knowledge representation based on degrees of membership. Unlike two-valued Boolean logic, fuzzy logic is multi-valued. It deals with degrees of membership and degrees of truth. Fuzzy logic uses the continuum of logical values between 0 (completely false) and 1 (completely true). Instead of just black and white, it employs the spectrum of colours, accepting that things can be partly true and partly false at the same time.

(Figure: (a) Boolean logic admits only the values 0 and 1; (b) multi-valued logic admits intermediate values such as 0, 0.2, 0.4, 0.6, 0.8, 1.)

Fuzzy Logic Fuzzy Sets

The concept of a set is fundamental to mathematics. However, our own language is also the supreme expression of sets. For example, car indicates the set of cars; when we say a car, we mean one out of the set of cars.

The classical example in fuzzy sets is tall men. The elements of the fuzzy set tall men are all men, but their degrees of membership depend on their height.

Name    Height, cm  Crisp membership  Fuzzy degree of membership
Chris   208         1                 1.00
Mark    205         1                 1.00
John    198         1                 0.98
Tom     181         1                 0.82
David   179         0                 0.78
Mike    172         0                 0.24
Bob     167         0                 0.15
Steven  158         0                 0.06
Bill    155         0                 0.01
Peter   152         0                 0.00

Fuzzy Logic Crisp vs. Fuzzy Sets

(Figure: the crisp set and the fuzzy set of tall men, with degree of membership from 0.0 to 1.0 on the y-axis and height from 150 to 210 cm on the x-axis.)

The x-axis represents the universe of discourse: the range of all possible values applicable to a chosen variable. In our case, the variable is the man's height. According to this representation, the universe of men's heights consists of all tall men.

The y-axis represents the membership value of the fuzzy set. In our case, the fuzzy set of tall men maps height values into corresponding membership values.

Fuzzy Logic A Fuzzy Set Has Fuzzy Boundaries

Let X be the universe of discourse and its elements be denoted as x. In classical set theory, a crisp set A of X is defined by the function f_A(x), called the characteristic function of A:

f_A(x): X → {0, 1}, where f_A(x) = 1 if x ∈ A, and f_A(x) = 0 if x ∉ A.

For any element x of universe X, the characteristic function f_A(x) equals 1 if x is an element of set A, and 0 if it is not.

In fuzzy theory, a fuzzy set A of universe X is defined by the function µ_A(x), called the membership function of set A:

µ_A(x): X → [0, 1], where
µ_A(x) = 1 if x is totally in A;
µ_A(x) = 0 if x is not in A;
0 < µ_A(x) < 1 if x is partly in A.

For any element x of universe X, the membership function µ_A(x) is the degree of membership to which x is an element of set A.
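The contrast between the characteristic function and the membership function can be sketched for the tall men example; the 180 cm cutoff and the 150–210 cm ramp are illustrative boundaries, not values fixed by the text:

```python
def crisp_tall(height_cm, cutoff=180):
    """Characteristic function f_A: exactly 0 or 1, with a sharp cutoff."""
    return 1 if height_cm >= cutoff else 0

def fuzzy_tall(height_cm, low=150, high=210):
    """Membership function mu_A: a degree in [0, 1], here rising
    linearly between `low` and `high` (illustrative shape)."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)
```

The crisp function calls David (179 cm) simply not tall, while the fuzzy one grades him almost as tall as Tom (181 cm), which is the whole point of the fuzzy boundary.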

Fuzzy Logic Fuzzy Set Representation

First, we determine the membership functions. In our tall men example, we can obtain fuzzy sets of tall, short and average men. The universe of discourse, the men's heights, consists of three sets: short, average and tall men. A man who is 184 cm tall is a member of the average men set with a degree of membership of 0.1, and at the same time he is also a member of the tall men set with a degree of 0.4.

(Figure: crisp vs. fuzzy versions of the short, average and tall sets, with degree of membership on the y-axis and heights from 150 to 210 cm on the x-axis.)
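One possible pair of membership functions consistent with the 184 cm example above; the breakpoints (165–175 cm plateau for average, 180–190 cm ramp for tall) are my own illustrative choices, picked so that they reproduce the stated degrees of 0.1 and 0.4:

```python
def mu_average(h):
    """'Average men': full membership on 165-175 cm, with linear
    shoulders falling to 0 at 155 cm and at 185 cm (illustrative)."""
    if h <= 155 or h >= 185:
        return 0.0
    if 165 <= h <= 175:
        return 1.0
    if h < 165:
        return (h - 155) / 10
    return (185 - h) / 10

def mu_tall(h):
    """'Tall men': rises linearly from 0 at 180 cm to full membership
    at 190 cm (illustrative)."""
    if h <= 180:
        return 0.0
    if h >= 190:
        return 1.0
    return (h - 180) / 10

# A man of 184 cm belongs to both sets at once, to different degrees.
degrees = (mu_average(184), mu_tall(184))
```

Note that the two degrees need not sum to 1: overlapping fuzzy sets simply each grade the same element independently.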

Fuzzy Logic Linguistic Variables and Inference

At the root of fuzzy set theory lies the idea of linguistic variables. A linguistic variable is a fuzzy variable. For example, the statement John is tall implies that the linguistic variable John takes the linguistic value tall. In fuzzy expert systems, linguistic variables are used in fuzzy rules. For example:

- IF wind is strong THEN sailing is good
- IF project_duration is long THEN completion_risk is high
- IF speed is slow THEN stopping_distance is short
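A fuzzy rule such as "IF wind is strong THEN sailing is good" fires to a degree rather than simply true or false; a minimal sketch, where the membership function for "strong" (a linear ramp from 10 to 25 knots) and the min-combination of antecedents are my own illustrative assumptions:

```python
def mu_strong_wind(knots):
    """Hypothetical membership for 'wind is strong':
    0 below 10 knots, rising linearly to 1 at 25 knots."""
    return max(0.0, min(1.0, (knots - 10) / 15))

def rule_fires(*antecedent_degrees):
    """A rule with several antecedents fires to the degree of its
    weakest antecedent (min combination)."""
    return min(antecedent_degrees)

# IF wind is strong THEN sailing is good: the conclusion 'sailing is good'
# holds to the same degree as the antecedent.
sailing_is_good = rule_fires(mu_strong_wind(20))
```

With a 20-knot wind, "wind is strong" holds only partly, so "sailing is good" is asserted only partly too, which is exactly the graded inference the rules above are written for.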