UNIVERSITY OF SURREY

B.Sc. Undergraduate Programmes in Computing
B.Sc. Undergraduate Programmes in Mathematical Studies

Level HE3 Examination

MODULE CS364 Artificial Intelligence

Time allowed: 2 hours
Autumn Semester 2005

Attempt TWO Questions from THREE, each question is worth 50 marks. If any candidate attempts more than TWO questions, only the best TWO solutions will be taken into account.
1. This question is about (a) Knowledge Acquisition and (b) Knowledge Representation.

(a) Knowledge Acquisition is defined as an important step that characterises the Knowledge Representation process and the system implementation that follows it.

(i) Elaborate on this definition of Knowledge Acquisition, and specify the main players in Knowledge Acquisition.

(ii) Briefly describe the process of Knowledge Acquisition.

[9 marks]

(b) John Sowa is one of the key proponents of conceptual graphs (CG) for Knowledge Representation.

(i) Define what conceptual graphs are, and briefly describe their main characteristics.

[8 marks]

(ii) Using the concept nodes, relations, and arrow directions, write in words what each of the following conceptual graphs means:

[Person] <- (Agnt) <- [Walk]
[Man: John] -> (Poss) -> [PC] -> (Attr) -> [Powerful]
[Mouse: Jerry] -> (Chrc) -> [Colour: Brown]

(iii) Create the conceptual graphs of the following sentences:

"Bus number 9 is going to Copenhagen"
"John was singing"
"Romeo marries Juliet"
(iv) New conceptual graphs may be derived from other canonical graphs by either generalising or specialising. Consider the following two graphs (g1 and g2). Construct a single new graph that can be derived from g1 and g2 by applying the generalisation and/or specialisation rules as appropriate.

g1: man, age, old, agent, play, object, guitar
g2: person: "Tom", age, old, location, pub

[8 marks]

(c) Collins and Quillian's semantic networks are found to be logically inadequate. The notion of concept nodes, within a conceptual graph, tackles this problem.

(i) Explain what concept nodes represent and how they are used to tackle the disadvantages of semantic networks. Use examples to support your statements.

[8 marks]
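The linear notation used in part (b) can be sketched in code. The following is a minimal illustration only, assuming dyadic relations whose arcs run from a source concept into the relation and on to a target concept. The class names `Concept` and `Relation` are hypothetical, and the sentence "a cat is on a mat" is an illustrative example that does not appear in this paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """A concept node: a type label plus an optional referent, e.g. Man: John."""
    type_label: str
    referent: Optional[str] = None

    def __str__(self) -> str:
        if self.referent is None:
            return f"[{self.type_label}]"
        return f"[{self.type_label}: {self.referent}]"


@dataclass(frozen=True)
class Relation:
    """A dyadic conceptual relation linking a source concept to a target concept."""
    name: str
    source: Concept
    target: Concept

    def linear_form(self) -> str:
        # Arcs point from the source concept into the relation, then on to the target.
        return f"{self.source} -> ({self.name}) -> {self.target}"


# Illustrative sentence (an assumption, not taken from the paper): "A cat is on a mat".
cat_on_mat = Relation("On", Concept("Cat"), Concept("Mat"))
print(cat_on_mat.linear_form())
```

Concepts are written in square brackets and relations in round brackets, so the direction of the arrows is the only thing that distinguishes the first argument of a relation from the second.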
2. This question is about Uncertainty Management and Fuzzy Logic.

(a) Define what uncertainty management is, and give three main areas of uncertain knowledge, with example(s).

[8 marks]

(b) Fuzzy Logic is one of the methods used to tackle uncertainty. Consider the example of a washing machine, one of the first devices to use fuzzy logic. The problem is to identify the appropriate time t needed to wash the load, given the dirtiness of the load and the volume of the load. The following table shows the rule base for the washing time problem:

                   load dirtiness
load volume    vd      md      ld      nd
fl             vlot    vlot    lot     lit
ml             vlot    mt      mt      lit
ll             lot     lot     lit     lit

where:

vd: very dirty       fl: full load      vlot: very long time
md: medium dirty     ml: medium load    lot: long time
ld: lightly dirty    ll: low load       mt: medium time
nd: not dirty                           lit: little time

The rule table consists of 12 rules and is interpreted in the following way. For instance, the entry in the second row and the third column of the table specifies the rule:

If load volume is medium load and load dirtiness is lightly dirty then washing time is medium time
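Reading a rule off the table amounts to a two-key lookup. The sketch below encodes the rule base as given in the table; the names `DIRTINESS`, `RULE_TABLE` and `washing_time_label` are hypothetical and chosen only for illustration.

```python
# Column order of the rule table: load dirtiness labels.
DIRTINESS = ["vd", "md", "ld", "nd"]

# One row per load volume label; each row lists the washing time label per column.
RULE_TABLE = {
    "fl": ["vlot", "vlot", "lot", "lit"],
    "ml": ["vlot", "mt", "mt", "lit"],
    "ll": ["lot", "lot", "lit", "lit"],
}


def washing_time_label(volume: str, dirtiness: str) -> str:
    """Return the washing time label for one (load volume, load dirtiness) pair."""
    return RULE_TABLE[volume][DIRTINESS.index(dirtiness)]


# Enumerate every rule as (volume, dirtiness, washing time).
rules = [(v, d, washing_time_label(v, d)) for v in RULE_TABLE for d in DIRTINESS]

for volume, dirtiness, time in rules:
    print(f"If load volume is {volume} and load dirtiness is {dirtiness} "
          f"then washing time is {time}")
```

The entry in the second row and third column corresponds to `washing_time_label("ml", "ld")`.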
The fuzzy sets for load dirtiness, load volume and washing time are piecewise linear, with each segment following the linear equation μ(x) = ax + b. They are defined by the following tables:

Load Dirtiness (D)    nd     ld     md     vd
μ(d) = 0 if D ≤       -      1      2      6
μ(d) = 1 if D =       0      3      5      10
μ(d) = 0 if D ≥       2      5      7      -

Load Volume (V)       ll     ml     fl
μ(v) = 0 if V ≤       -      2      6
μ(v) = 1 if V =       0      5      10
μ(v) = 0 if V ≥       4      8      -

Washing Time (T)      lit    mt     lot    vlot
μ(t) = 0 if T ≤       -      20     50     90
μ(t) = 1 if T =       10     50     80     120
μ(t) = 0 if T ≥       30     60     100    -

The fuzzy sets tables are interpreted in the following way. For instance, the mt set of T:

$$\mu_{mt}(T) = \begin{cases} 0, & \text{if } T \le 20 \\ 1, & \text{if } T = 50 \\ 0, & \text{if } T \ge 60 \end{cases}$$
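As an illustration of how one column of these tables turns into a membership function, the sketch below implements a triangular set and applies it to the mt set of T (0 at T = 20, 1 at T = 50, 0 at T = 60). The function names `triangular` and `mt` are hypothetical. Each edge is one linear segment μ(x) = ax + b; on the rising edge of mt, for example, a = 1/30 and b = -20/30.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Membership that rises linearly from 0 at `left` to 1 at `peak`,
    then falls linearly back to 0 at `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)   # rising edge
    return (right - x) / (right - peak)     # falling edge


def mt(t: float) -> float:
    """Medium time set of T, using the three points from the table."""
    return triangular(t, 20, 50, 60)


for t in (20, 35, 50, 55, 60):
    print(t, mt(t))
```

The two edges have different slopes because the peak at 50 is not midway between 20 and 60.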
(i) Based on the fuzzy sets tables above, draw three graphs, one each for D, V, and T, showing the fuzzy sets.

(ii) The inference of a fuzzy expert system based on the Mamdani method depends on the execution of four major tasks: Fuzzification, Rule Evaluation, Aggregation, and Defuzzification. Consider the case when the input variables are D = 9 and V = 3. Using the rule base, execute each of the four inference tasks to compute the washing time T necessary to wash the load, using Centre of Gravity in the Defuzzification task.

NOTE: To calculate the degree of truth μ(x) for a given member you can either use triangular proportions, or calculate and use the appropriate linear function μ(x) = ax + b.

[20 marks]

(iii) How, in your opinion, would you create the output sets for T if using Sugeno's concept of spikes? Draw the graph comprising the fuzzy sets for T and explain your thinking.

(iv) Calculate the output of the washing machine, given the same inputs, if the Sugeno inference mechanism were used instead of Mamdani.

(v) Which of the two inference methods, Mamdani or Sugeno, is more appropriate for the problem, and why?
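The four Mamdani tasks named in part (ii) can be sketched as a short program. This is a rough illustration under stated assumptions, not a model answer: sets with only one bound in the tables (nd, vd, ll, fl, lit, vlot) are treated as shoulders that stay at 1 beyond their peak, the universe of T is assumed to be 0 to 120, AND is taken as min, rules sharing an output are combined with max, and the Centre of Gravity is approximated by sampling T at a fixed step. All function and variable names are hypothetical.

```python
def ramp_up(x, zero_at, one_at):
    """0 at or below zero_at, 1 at or above one_at, linear in between."""
    if x <= zero_at:
        return 0.0
    if x >= one_at:
        return 1.0
    return (x - zero_at) / (one_at - zero_at)


def ramp_down(x, one_at, zero_at):
    """1 at or below one_at, 0 at or above zero_at, linear in between."""
    return 1.0 - ramp_up(x, one_at, zero_at)


def tri(x, left, peak, right):
    """Triangle with 0 at left, 1 at peak, 0 at right."""
    return min(ramp_up(x, left, peak), ramp_down(x, peak, right))


# Fuzzy sets from the tables; shoulder shapes are an assumption (see lead-in).
DIRT = {"nd": lambda d: ramp_down(d, 0, 2), "ld": lambda d: tri(d, 1, 3, 5),
        "md": lambda d: tri(d, 2, 5, 7), "vd": lambda d: ramp_up(d, 6, 10)}
VOL = {"ll": lambda v: ramp_down(v, 0, 4), "ml": lambda v: tri(v, 2, 5, 8),
       "fl": lambda v: ramp_up(v, 6, 10)}
TIME = {"lit": lambda t: ramp_down(t, 10, 30), "mt": lambda t: tri(t, 20, 50, 60),
        "lot": lambda t: tri(t, 50, 80, 100), "vlot": lambda t: ramp_up(t, 90, 120)}

DIRT_ORDER = ["vd", "md", "ld", "nd"]
RULES = {"fl": ["vlot", "vlot", "lot", "lit"],
         "ml": ["vlot", "mt", "mt", "lit"],
         "ll": ["lot", "lot", "lit", "lit"]}


def mamdani(d, v, step=0.5, t_max=120.0):
    # 1. Fuzzification: degree of membership of each crisp input in each set.
    mu_d = {name: f(d) for name, f in DIRT.items()}
    mu_v = {name: f(v) for name, f in VOL.items()}

    # 2. Rule evaluation: AND as min; rules with the same output combined by max.
    strength = {name: 0.0 for name in TIME}
    for vol, row in RULES.items():
        for dirt, out in zip(DIRT_ORDER, row):
            strength[out] = max(strength[out], min(mu_v[vol], mu_d[dirt]))

    # 3. Aggregation: clip each output set at its strength, take the pointwise max.
    # 4. Defuzzification: sampled Centre of Gravity over the assumed universe of T.
    numerator = denominator = 0.0
    for i in range(int(t_max / step) + 1):
        t = i * step
        mu = max(min(strength[name], f(t)) for name, f in TIME.items())
        numerator += t * mu
        denominator += mu
    crisp = numerator / denominator if denominator else None
    return strength, crisp


strength, washing_time = mamdani(d=9, v=3)
print(strength)
print(washing_time)
```

A smaller `step` gives a closer approximation to the continuous Centre of Gravity, at the cost of more samples.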
3. This question is about Machine Learning and Decision Trees.

(a) Describe three key characteristics of machine learning techniques. When would you use machine learning in preference to other artificial intelligence techniques?

(b) Neural networks and decision trees are two popular machine learning techniques, with decision trees often favoured for classification. Describe the reasons why you would use a decision tree rather than a neural network for a classification task.

(c) Data for an example classification task is given in the following table:

Example    Blood Test    Build     Diagnosis
1          Present       Slight    Positive
2          Clear         Medium    Negative
3          Present       Heavy     Positive
4          Clear         Slight    Positive
5          Present       Medium    Negative
6          Clear         Heavy     Negative

The table shows six example medical diagnoses. The two attributes show the presence of a particular substance in the blood of the example patient (Present, Clear) and the patient's build (Slight, Medium or Heavy). The diagnosis has two outcomes (Positive, Negative). A decision tree can be constructed to assist in the diagnosis of future examples using the ID3 algorithm. The ID3 algorithm uses the value of the Entropy (E) for each attribute/value pair:

$$E(a = v) = -\sum_{i=1}^{c} p_i \log_2 p_i$$

where c is the number of classification categories (Diagnosis), a is the attribute (Blood Test, Build) with value v, and p_i is the probability of a particular diagnosis for an attribute with a given value.
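The entropy of an attribute/value pair described above can be computed mechanically. The sketch below is one possible illustration using the six examples from the table; the names `EXAMPLES`, `entropy` and `entropy_for` are hypothetical, and probabilities are estimated as simple relative frequencies.

```python
from collections import Counter
from math import log2

# (Blood Test, Build, Diagnosis) for examples 1 to 6, as listed in the table.
EXAMPLES = [
    ("Present", "Slight", "Positive"),
    ("Clear", "Medium", "Negative"),
    ("Present", "Heavy", "Positive"),
    ("Clear", "Slight", "Positive"),
    ("Present", "Medium", "Negative"),
    ("Clear", "Heavy", "Negative"),
]


def entropy(labels):
    """Entropy in bits of a list of class labels: the sum of -p_i * log2(p_i)."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def entropy_for(attribute_index, value):
    """Entropy of the diagnoses among examples where the attribute equals `value`.

    attribute_index 0 is Blood Test, 1 is Build.
    """
    labels = [row[-1] for row in EXAMPLES if row[attribute_index] == value]
    return entropy(labels)


for index, name in enumerate(("Blood Test", "Build")):
    for value in sorted({row[index] for row in EXAMPLES}):
        print(f"E({name} = {value}) = {entropy_for(index, value):.4f}")
```

A subset in which every example has the same diagnosis has entropy 0, and a subset split evenly between the two diagnoses has entropy 1.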
The Entropy values for each attribute/value are then used to calculate the Information Gain for each attribute:

$$\mathrm{Gain}(T, a) = E(T) - \sum_{j=1}^{v} \frac{|T_{a=j}|}{|T|} E(T_{a=j})$$

where T is the set of examples, E(T) is the Entropy for all of the examples, v is the number of values for the given attribute, |T| is the total number of examples and |T_{a=j}| is the number of examples with the given attribute/value pair.

(i) Using the ID3 algorithm, determine which of the two attributes (Blood Test or Build) should be used as the root node in a decision tree. You should show the Entropy values for each attribute and value pair, together with the Information Gain for each attribute. Describe how you would use these values to select the root attribute.

[24 marks]

(ii) Draw the resulting decision tree.

(iii) Explain why you think your tree is the best representation. Illustrate your answer with a drawing of an alternative tree.

(d) Machine learning techniques rely upon the data to construct an appropriate classifier. What techniques would you use to ensure that you construct the best possible classifier given the available data? Illustrate your answer by giving examples of the techniques applied to the construction of a decision tree.

INTERNAL EXAMINERS: DR B. VRUSIAS, DR M. CASEY
EXTERNAL EXAMINER: