Knowledge-Based Morphological Classification of Galaxies from Vision Features

The AAAI-17 Workshop on Knowledge-Based Techniques for Problem Solving and Reasoning WS

Devendra Singh Dhami, David Leake, Sriraam Natarajan
School of Informatics and Computing, Indiana University

Abstract

This paper presents a knowledge-based approach to the task of learning and identifying galaxies from their images. To this effect, we propose a crowd-sourced pipeline approach that employs two systems: a case-based system and a rule-based system. First, the approach extracts morphological features, i.e., features describing the structure of the galaxy such as its shape and central characteristics (e.g., whether it has a bar or bulge at its center), using computer vision techniques. Then it employs a case-based reasoning system and a rule-based system to perform the classification task. Our initial results show that this pipeline is effective in learning reasonably accurate models on this complex task.

Introduction

We consider the task of morphological classification of galaxies from images, i.e., identifying galaxies from astronomy images. This task has classically been addressed using standard supervised learning methods (De La Calleja and Fuentes 2004b; Goderya and Lolling 2002). While reasonably successful, most of these methods make restrictive assumptions, including but not limited to rotational and scale invariance and the availability of standardized features. In addition, they cannot provide explanations for their conclusions. Knowledge-based approaches are a step toward overcoming these restrictions, as rules can express the knowledge obtained from the features in a natural way.

We take a knowledge-based approach to this task. Specifically, we design two systems: the first is a case-based reasoning (CBR) system and the second is a rule-based system.
These systems are then employed in a pipeline in which the features are obtained from an automated computer vision system. Our hypothesis, which we verify empirically, is that such a computer vision feature extraction system, when coupled with knowledge-based systems for prediction, allows for effective and efficient identification of galaxies from raw images. We compare our automated system with a human-annotated data set called Galaxy Zoo and demonstrate the superiority of the proposed system in this challenging task.

We first present the related work in the next section. We then outline the feature extraction component, which identifies the computer vision features. Next, we discuss the two knowledge-based systems in greater detail. Finally, we present our initial empirical results before concluding the paper by discussing avenues for future research.

Copyright © 2017, Association for the Advancement of Artificial Intelligence. All rights reserved.

Background and Related Work

De La Calleja and Fuentes (2004a) developed a three-stage pipeline for galaxy classification: image analysis, data compression, and machine learning. They applied three learning methods, Naive Bayes, C4.5 and random forests, to the New General Catalog (NGC) released by the Astronomical Society of the Pacific. Their results show that random forests performed better than Naive Bayes or C4.5. Banerji et al. (2010) applied neural networks to classify images into three classes: early types, spirals, and point sources/artifacts. Their results show that the color or the shape parameters, when taken individually, are not sufficient to capture the morphological features of the galaxy. However, combining those parameters increased the accuracy remarkably. Cui et al. (2014) developed a system in which a galaxy is queried by providing a galaxy image as input, after which the system retrieves and ranks the most similar galaxies.
They made the assumption that the input images must be invariant to rotation and scale. Kasivajhula, Raghavan, and Shah (2007) compare three classical machine learning algorithms, namely support vector machines, random forests and Naive Bayes, on the task of galaxy classification. The galaxies are classified into three categories (elliptical, spiral and irregular) and then subdivided into seven classifications, although the dataset considered is different from the one we consider in this paper.

Rule-based systems (Hayes-Roth 1985) operate by using if-then rules that are generally defined by experts. Leake (1996) contrasts this process, in which reasoning draws conclusions by chaining together generalized rules, starting from scratch, with reasoning from prior cases in CBR, and observes that the two differ in two ways: first, the general rules are replaced by specific cases, and second, chaining is replaced by retrieval and adaptation of cases. De Mantaras et al. (2005) provide an overview of the foundations of CBR, including aspects such as case retrieval and reuse, and of standard methods. Our work can be seen as building on these early ideas for the task of morphological classification. To our knowledge, although rule-based expert systems have been used for this task (Heck and Murtagh 1989), case-based systems have not. It is therefore interesting to compare the two methods. We develop a pipeline and employ a knowledge-based approach with the aim of performing a comprehensive analysis of this task.

Morphological Classification using Knowledge-based Systems

We now outline our Morphological Classification of Galaxies using Knowledge-based systems (MCG-KB) algorithm. We first present the feature extraction process before presenting the knowledge-based systems that we design for this task.

Feature extraction

We develop six unique features for determining the characteristics of a galaxy. The images we use are color images of size 424 x 424. The galaxies are centered in the image, so position-invariant feature detectors are not needed. The images are pre-processed using the following steps: the color image is first converted to grayscale, and then to a binary image. Manually adjusting the binarization threshold was not practical, because some galaxies have sparse patterns that were lost during the conversion. To handle this, we applied Otsu thresholding (Otsu 1975), which adaptively adjusts the threshold. We now list each set of features and outline the procedure of feature extraction for each set.

Detecting shape: Circularity

A given galaxy can be either circular, elliptical, or spiral.
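Since every shape feature in this section starts from the binary image, it may help to see the binarization step concretely. The following is a minimal sketch of Otsu's method on a flat list of 8-bit gray values, written in plain Python so that it has no dependencies. The function names `otsu_threshold` and `binarize` are illustrative and are not taken from the paper, which does not say which implementation was used.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t are treated as background, the rest as foreground.
    gray_values is a flat iterable of integers in [0, levels).
    """
    gray_values = list(gray_values)
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of the intensities of those pixels
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray_values, threshold):
    """Map gray values to 1 (white, above threshold) or 0 (black)."""
    return [1 if v > threshold else 0 for v in gray_values]
```

Because the threshold is derived from the histogram of each image, a faint, sparse galaxy gets a lower threshold than a bright one, which is the adaptivity the pre-processing step relies on.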
We detect these shapes after converting the galaxy image to a binary image. Although every galaxy image contains noise from the many surrounding stars, if we draw contours around every shape in the image, the galaxy will be the shape with the longest contour. This allows us to calculate the area of the galaxy. Given the length of the contour around the galaxy, i.e., the perimeter L, as well as the area A, we plug these values into the isoperimetric inequality:

$4 \pi A \leq L^2$

From this inequality we compute the ratio $4 \pi A / L^2$, which tells us how similar a shape is to a circle, i.e., the circularity of the shape. A value of 1.0 indicates that the shape is a perfect circle. As the value approaches zero, the shape is less like a circle. Although a value close to zero could correspond to any shape, galaxies take only a limited number of shapes, and thus we may assume that galaxies with low values are spiral galaxies. An intermediate value indicates an elliptical galaxy.

Detecting Viewing Angle

Galaxies may be viewed either head-on or edge-on. To understand this, consider an analogy to a frisbee, which may appear as a circle when held still and viewed from above (head-on), or as a thin bar when it is flying through the air (edge-on). After a galaxy has been reduced to a binary image, our approach is to compare the magnitudes of the gradients in two directions, along the width and the height of the galaxy. In a circular galaxy, since the width and the height are approximately the same, the two gradients will be approximately the same. In a galaxy viewed edge-on, the width of the galaxy is much greater than the height, or vice versa. Thus a mismatch between the gradients is expected, and it is exactly what we look for. To detect the two gradients of a circular galaxy, it is enough to take the gradients in the x and y directions.
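Both shape cues introduced so far reduce to simple ratios, which the sketch below illustrates. It assumes that the area and perimeter have already been measured from the longest contour, and it approximates the "gradient magnitude" of a binary image by summing absolute differences between neighbouring pixels. That approximation is an assumption, since the paper does not state which gradient operator was used. The helper `filled_rectangle` is hypothetical and exists only to build toy images.

```python
import math


def circularity(area, perimeter):
    """Isoperimetric quotient 4*pi*A / L**2: 1.0 for a perfect circle."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


def axis_gradient_magnitudes(binary):
    """Sum of absolute neighbour differences along x and along y.

    binary is a list of rows containing 0 (black) or 1 (white).
    """
    rows, cols = len(binary), len(binary[0])
    gx = sum(abs(binary[r][c + 1] - binary[r][c])
             for r in range(rows) for c in range(cols - 1))
    gy = sum(abs(binary[r + 1][c] - binary[r][c])
             for r in range(rows - 1) for c in range(cols))
    return gx, gy


def filled_rectangle(height, width, pad=1):
    """Hypothetical helper: a white height x width block with a black border."""
    return [[1 if pad <= r < pad + height and pad <= c < pad + width else 0
             for c in range(width + 2 * pad)]
            for r in range(height + 2 * pad)]
```

For a square block the two magnitudes agree, while for a flat bar (`filled_rectangle(2, 10)`) they differ by the aspect ratio, which is the mismatch the viewing-angle feature is built on.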
However, because an edge-on galaxy might be oriented at any angle, it is necessary to consider gradients in more than two directions. In practice, we found that simply rotating the binary image by 45° and taking the x and y gradients a second time was sufficient for determining the viewing angle of a galaxy. Thus, we take the magnitude of the gradient of the binary image in the x and y directions, rotate the binary image by 45°, and take the magnitude of the gradient in the x and y directions again. We then use the two sets of magnitudes (x and y in the normal image, x and y in the rotated image) to identify the set that yields the greatest difference between its two magnitudes. This set allows us to determine the difference between the height and width of a galaxy at any angle. To generate a value for thresholding, we use the ratio of the x magnitude to the y magnitude; higher values indicate a galaxy being viewed edge-on, whereas lower values indicate a galaxy being viewed head-on. This enables us to reliably determine the viewing angle of a galaxy, and produces an angle value that tells us the general orientation of the galaxy.

Detecting existence of a bulge

If a galaxy is being viewed edge-on, it is possible to see whether or not the galaxy has a bulge about its center, as shown in figure 1. Thus we only search for the presence of a bulge after detecting the viewing angle of the galaxy.

Figure 1: An example image where bulge can be seen

Once we know the orientation of the galaxy, a one-pixel-wide line is passed along the width of the binary image of the galaxy. If the width of the galaxy is along the x-axis, a vertical line (running from the top of the image to the bottom) is swept from left to right. Likewise, if the width of the galaxy is along the y-axis, a horizontal line (running from the left side of the image to the right) is swept from top to bottom. If the width of the galaxy is oriented at some other angle, we use the rotated image instead of the normal image. Since galaxies are typically symmetrical about the axis perpendicular to the galaxy's width, it is sufficient and computationally efficient to pass the line across only half of the width of the galaxy. As we pass the line over the galaxy, we calculate the height of the galaxy by counting the number of white pixels along the line. Our method for detecting bulges is to look for a sudden increase in the height of the galaxy over a short distance towards the center of the galaxy. In a galaxy without a bulge, we can expect the height of the galaxy to stay consistent along its full width. Therefore, we score a galaxy by the derivative of its height, applying a weight towards the center of the galaxy, and use a threshold to determine the presence of a bulge.

Detecting number of spiral arms

Galaxies may possess two or more arms. It is only possible to detect spiral arms on galaxies which are being viewed head-on. We first calculate the pixels which belong to a series of lines radiating from the center of a galaxy. Each line is one pixel wide and extends a set length away from the center of the galaxy. The set length is a thresholded value that is sufficient to extend beyond the limits of the galaxy. A length extending just beyond the limits of the galaxy is used to avoid wasting computation time on the pixel values of empty space. The lines are drawn at 10° intervals about the galaxy, as shown in figure 2.

Figure 2: Detecting spiral arms

Figure 3: Detecting tightness of spiral arms
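As an illustration of how such radial lines could be enumerated, the sketch below lists the pixel coordinates along one-pixel-wide rays at 10° intervals. The function name `ray_pixels`, the rounding scheme and the example length of 50 pixels are assumptions made for this sketch. The center (212, 212) follows from the 424 x 424 images, in which the galaxy is centered.

```python
import math


def ray_pixels(center, length, step_deg=10):
    """Map each angle in degrees to the pixels on a ray from the center.

    Coordinates are (x, y) with the image y axis pointing down, so an
    angle of 90 degrees points towards the top of the image.
    """
    cx, cy = center
    rays = {}
    for angle in range(0, 360, step_deg):
        theta = math.radians(angle)
        pixels = []
        for r in range(1, length + 1):
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy - r * math.sin(theta)))
            # Rounding can map two consecutive radii to the same pixel.
            if not pixels or pixels[-1] != (x, y):
                pixels.append((x, y))
        rays[angle] = pixels
    return rays


# Example: 36 rays of at most 50 pixels around the center of a 424 x 424 image.
rays = ray_pixels((212, 212), 50)
```

Reading the binary image at these coordinates, in order, gives for each ray the sequence of black and white pixels from the galaxy center outwards.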
Then, by examining the runs of black and white pixels along an individual line, it can be determined whether the line intersects a spiral arm. If, beginning at the point which lies in the galaxy and going outwards to the point which lies in space, we see a long run of white pixels followed by a long run of black pixels, it can be inferred that the line does not intersect a spiral arm. If, however, we see a long run of white pixels, followed by a short run of black pixels, a short run of white pixels, and then a long run of black pixels, it can be inferred that the line intersects a spiral arm. The short run of black pixels is the empty space between the galaxy and the spiral arm, and the short run of white pixels is the spiral arm itself. At 10° intervals, multiple lines might intersect the same spiral arm; to avoid counting an arm more than once, we use a flag which initially assumes that a spiral arm has not been detected. Beginning at the line at 0°, we check for a spiral arm. If a spiral arm exists, we set the flag, and do not reset it until we no longer detect a spiral arm as we move through each subsequent line.

Detecting the tightness of the arms

For spiral galaxies, the spiral arms may be tightly wound about the galaxy, loosely bound, or somewhere in between. Our approach for detecting the tightness of the arms is simple. We fit the smallest bounding box around the binary image of a galaxy which does not intersect the arms of the galaxy; that is, the bounding box which best fits the galaxy, as shown in figure 3. Then the ratio of black pixels to white pixels within the bounding box is taken.
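The arm-counting procedure described above (run patterns along each ray, plus a flag to avoid counting one arm several times) can be sketched as follows. Each ray is given as a list of 0/1 pixel values read from the center outwards. The parameters `max_gap` and `max_arm_width`, which stand in for the "short run" thresholds, are hypothetical, and the wrap-around correction at the end is an addition of this sketch that the paper does not mention.

```python
from itertools import groupby


def run_lengths(ray):
    """Collapse a 0/1 sequence into (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(ray)]


def intersects_arm(ray, max_gap=5, max_arm_width=5):
    """True if the ray shows white core, short black gap, short white arm, black."""
    runs = run_lengths(ray)
    if len(runs) < 4:
        return False
    (v0, _), (v1, gap), (v2, arm), (v3, _) = runs[:4]
    return ((v0, v1, v2, v3) == (1, 0, 1, 0)
            and gap <= max_gap and arm <= max_arm_width)


def count_arms(rays, **thresholds):
    """Count arms over rays given in angular order, starting at 0 degrees."""
    hits = [intersects_arm(ray, **thresholds) for ray in rays]
    arms = 0
    in_arm = False  # the flag: are we currently inside a detected arm?
    for hit in hits:
        if hit and not in_arm:
            arms += 1
        in_arm = hit
    # Assumption of this sketch: an arm spanning the 350 -> 0 degree
    # boundary would otherwise be counted twice.
    if arms > 1 and hits[0] and hits[-1]:
        arms -= 1
    return arms
```

With 36 rays, two separate groups of consecutive arm-intersecting rays yield a count of two, regardless of how many rays each arm covers.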
Galaxies with tightly bound arms tend to have smaller bounding boxes and a lower ratio of black pixels to white pixels, whereas galaxies with loosely bound arms tend to have bounding boxes which extend well beyond the center of the galaxy, and thus tend to have a high ratio of black pixels to white pixels. The number of spiral arms present is also taken into consideration, as a spiral galaxy with four arms might have roughly the same bounding box as a galaxy with two arms. In such a situation, the bounding box of the galaxy with four arms would contain significantly more white pixels, which makes it difficult to set a threshold value that works across all galaxies.

Detecting existence of a bar

Galaxies may have bars in their center, which emit brighter light than the rest of the galaxy. The first step in exploiting the brightness of a potential bar is to increase the contrast of the galaxy image. We first convert the RGB image into the HSV color space, since we found that increasing the contrast in HSV produced better results after thresholding. The threshold for the bar may be set very high; we used a threshold value of 255. We then extract from the contrasted image the pixels which satisfy the threshold. After constructing a contour around the mass of bright pixels, we determine the width and height of the shape bounded by the contour. If the width is much greater than the height, a bar is present in the galaxy. The process is shown in figure 4.

Figure 4: Detecting bar in an image. (a) Original image (b) Contrasted image (c) Extracted contour image (d) Rotated contour

Figure 5: Examples of cases for case based reasoning system for human data

Class        Description
Class 1.1    Is the galaxy smooth
Class 1.2    Is the galaxy a disk
Class 1.3    Is the galaxy a star/artifact
Class 2.1    The galaxy can be viewed edge on
Class 2.2    The galaxy cannot be viewed edge on (thus face-on)
Class 3.1    The galaxy has a bar structure in its center
Class 3.2    The galaxy does not have a bar structure in its center
Class 4.1    The galaxy has a spiral arm pattern
Class 4.2    The galaxy does not have a spiral arm pattern
Class 9.1    The galaxy has a rounded bulge at its center
Class 9.2    The galaxy has a boxy bulge at its center
Class 9.3    The galaxy has no rounded bulge at its center
Class 10.1   The galaxy has tightly bounded arms
Class 10.2   The galaxy has medium bounded arms
Class 10.3   The galaxy has loosely bounded arms
Class 11.1   The galaxy has 1 spiral arm
Class 11.2   The galaxy has 2 spiral arms
Class 11.3   The galaxy has 3 spiral arms
Class 11.4   The galaxy has 4 spiral arms
Class 11.5   The galaxy has more than 4 spiral arms
Class 11.6   The galaxy has no spiral arms

Table 1: Class description

With respect to the extracted features, only a few of the classes from the original human classification (Willett et al. 2013) remain relevant. These classes are shown in table 1.

Knowledge-based systems

We implement three systems in this work: two case-based reasoning systems and one rule-based system. One of the CBR systems uses the human data obtained from the Galaxy Zoo project (Lintott et al. 2008), in which human volunteers classified the galaxies into various classes (we denote this system as CBH). The other case-based system and the rule-based system were implemented for the features obtained from our computer vision system. We denote these systems as CBCVF and RBCVF respectively.

CBH system

The format of the human classified data is static.
The data provide the galaxy id and, for each class name, the probability (as stated by the human volunteers) that the galaxy belongs to that class. We make the assumption that the galaxies classified by the humans belong to a single class. The class to which they belong depends on the probabilities of their subclasses. Note that, because of this assumption, by subclasses we denote the classes used for the CBCVF and RBCVF systems. A sample of the data is presented in table 2.

Figure 6: Examples of rules for rule based system

We build the cases by converting the data to the appropriate case representation. Kolodner (1993) mentions that getting a case into a representation that works is more important than deciding on a single format for all systems. Our experimentation showed that the representation for the CBH system did not perform well with the CBCVF system, and thus two representations were designed. For the CBH system our cases are of the format shown in figure 5. As can be seen, in the CBH system we have explicit probabilities for all subclasses of a galaxy class. In the CBCVF system, we employ the numerical values for the parameters of every feature to construct a case, as shown in figure 7.

Our first goal is to find the class for which the probabilities of the subclasses sum to 1 (or close enough, given floating point inaccuracies), as this indicates a 100% probability that the galaxy belongs to that specific class. If this goal is not satisfied, we take the approach of finding the nearest neighbor of the case in question. We determine the stored case whose distance to the new case is lowest, and assign the new case the class of that nearest case.

RBCVF System

For the rule-based system we use the computer vision features extracted as explained above. An example output from the CV system is shown in table 3. Each rule has the complete rule number as its first part (Rule-1, Rule-2, etc.). This is necessary to model the fact that Rule 1 is related to Rule 2, Rule 3 to Rule 9, and so on. The second part of the rule has two components: (1) the antecedent, which represents the feature of the galaxies in that subclass, and (2) the consequent, which contains the class number and the rule which needs to be fired next for that galaxy.
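The rule structure just described (a rule number, an antecedent over the extracted features, and a consequent naming a class and the next rule to fire) might be represented as in the sketch below. The concrete rules, the feature keys (`smooth_or_disk`, `edge_on`, `bar`) and the chaining order are hypothetical, because the paper's actual rules (figure 6) are not reproduced here. Only the class meanings follow table 1.

```python
# Each rule id maps to alternatives of the form
# (antecedent, class label, id of the rule to fire next or None).
# The rule contents are illustrative assumptions, not the paper's rule base.
RULES = {
    "Rule-1": [
        (lambda f: f["smooth_or_disk"] == "smooth", "Class-1.1", "Rule-3"),
        (lambda f: f["smooth_or_disk"] == "disk", "Class-1.2", "Rule-2"),
    ],
    "Rule-2": [
        (lambda f: f["edge_on"], "Class-2.1", "Rule-3"),
        (lambda f: not f["edge_on"], "Class-2.2", "Rule-3"),
    ],
    "Rule-3": [
        (lambda f: f["bar"], "Class-3.1", None),
        (lambda f: not f["bar"], "Class-3.2", None),
    ],
}


def classify(features, rules=RULES, start="Rule-1"):
    """Fire rules in sequence, collecting the class from each consequent."""
    classes = []
    current = start
    visited = set()
    while current is not None and current not in visited:
        visited.add(current)  # guard against accidental cycles
        for antecedent, label, next_rule in rules[current]:
            if antecedent(features):  # only fire when the antecedent holds
                classes.append(label)
                current = next_rule
                break
        else:
            current = None  # no antecedent holds, so the chain stops
    return classes
```

For example, `classify({"smooth_or_disk": "disk", "edge_on": True, "bar": False})` walks Rule-1, Rule-2 and Rule-3 in turn. Storing the next rule in the consequent keeps the chaining explicit, so the sequence of fired rules doubles as an explanation of the classification.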

Our system includes a conflict resolution component that prevents the system from firing rules whose antecedents are false.

Table 2: Example human annotated data (columns: Galaxy ID, Class 1.1, Class 1.2, Class 1.3, Class 2.1, Class 2.2, Class 3.1 and further class columns; the values are not recoverable)

Shape         Arm tightness  Disk arms  Smooth/disk  Edge on             Bar feature  Bulge
not-a-spiral  (missing)      no-arms    smooth       not-viewed-edge-on  no-bar       no-bulge
not-a-spiral  loose          one        smooth       not-viewed-edge-on  bar          no-bulge
not-a-spiral  (missing)      no-arms    smooth       not-viewed-edge-on  no-bar       no-bulge
spiral        loose          one        smooth       viewed-edge-on      bar          no-bulge
spiral        tight          one        disk         viewed-edge-on      bar          no-bulge
spiral        loose          four       disk         not-viewed-edge-on  no-bar       no-bulge

Table 3: Extracted features from the vision system for rule based system (the Image ID column is not recoverable)

Table 4: Extracted features from the vision system for case based system (columns: Image ID, Circularity, Bulge, Viewing angle, Bar feature, Arm tightness, Number of arms; the values are not recoverable)

Figure 7: Examples of training cases for case based reasoning system

CBCVF

The extracted features for the CBR system differ from those for the rule-based system. As shown in table 4, in contrast to the RBCVF, we employ the numerical values for the parameters of every feature. This is necessary to test the ability of the system to perform automatic feature extraction. We train our CBR system with selected cases which are not among the input cases. Some of the cases are shown in figure 7.

We take two different approaches for the first three features and the last three. For the first three features a weight matrix is used. This consists of weights that are assigned to each feature for each case. These weights are assigned according to the intuition that the new case should be near one of the training cases. The weights are either 1 (very near to a case in the knowledge base), 0.5 (neither very near to nor very far from a case in the knowledge base) or -1 (very far from a case in the knowledge base). The best case is then obtained according to the weights to determine the feature. For the last three features we find the minimum value for a feature from the new case and all cases in the knowledge base. Next, all the values for that feature are normalized by dividing the feature values by the minimum value. Finally, the nearest case to the normalized value is found to obtain the feature. After the features are obtained, we can map them to the corresponding class based on the corresponding case in the case base.

Experiments

We aim to explicitly ask the following questions:

Q1: Does the choice between these knowledge-based systems matter for the task of morphological classification?
Q2: Do the features extracted from our computer vision system provide as much information for the task as the human labelled data?
Q3: Finally, do knowledge-based system techniques provide a potential tool for astronomy tasks, specifically the task of classification of galaxies?

To this effect, we employed a data set from the SDSS database (Lintott et al. 2008). This data set contains images of the galaxies and the crowd-sourced data. We test the systems on 32 images, but claim that the system can be scaled up to any number of images with little or no modification.

The output obtained by using the case-based reasoning approach on the crowd-sourced human data is shown in table 5. The output obtained by using the rule-based approach on the data obtained from our vision system is shown in table 6. The output obtained by using the case-based reasoning approach on the data obtained from our vision system is shown in table 7.

CBR system output
Knowledge Base is empty
Adding the first case to knowledge base
The galaxy belongs to [ Class 6 ]
The galaxy belongs to [ Class 10 ]
Appending the case to KB.
The galaxy belongs to [ Class 5 ]
Appending the case to KB.
The galaxy belongs to [ Class 1 ]
Appending the case to KB.

Table 5: CBR system output for human annotated data

Rule based system output
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.1 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.2 Class-2.1 Class-3.1 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.2 Class-3.2 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.2 Class-3.2 Class-4.1 Class-9.2 Class-10.2 Class
Class-1.2 Class-2.1 Class-3.1 Class-4.1 Class-9.2 Class-10.1 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-10.3 Class
Class-1.1 Class-2.2 Class-3.1 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.2 Class-2.1 Class-3.1 Class-4.1 Class-10.3 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.2 Class-2.2 Class-3.2 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.2 Class-3.2 Class-4.1 Class-9.2 Class-10.2 Class
Class-1.2 Class-2.2 Class-3.2 Class-4.1 Class-9.2 Class-10.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.1 Class-2.1 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.2 Class-2.2 Class-3.2 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-10.3 Class
Class-1.2 Class-2.1 Class-3.1 Class-4.1 Class-10.3 Class
Class-1.1 Class-2.1 Class-3.2 Class-4.2 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-10.3 Class
Class-1.2 Class-2.2 Class-3.1 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-10.3 Class
Class-1.1 Class-2.2 Class-3.2 Class-4.2 Class-9.2 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-9.2 Class-10.3 Class
Class-1.2 Class-2.1 Class-3.2 Class-4.1 Class-9.2 Class-10.3 Class-11.1

Table 6: Rule based system output for extracted features data

Case based system output
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.1, Class
Class 1.2, Class 3.1, Class 4.1, Class 9.1, Class 10.2, Class
Class 1.1, Class 2.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.2, Class 2.1, Class 3.1, Class 4.1, Class 9.2, Class 10.2, Class
Class 1.2, Class 3.2, Class 4.1, Class 9.2, Class 10.2, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.1, Class 2.1, Class 3.1, Class 4.2, Class 9.2, Class 10.1, Class
Class 1.2, Class 2.1, Class 3.1, Class 4.1, Class 9.1, Class 10.3, Class
Class 1.1, Class 3.1, Class 4.2, Class 9.2, Class 10.1, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.1, Class
Class 1.2, Class 2.1, Class 3.1, Class 4.1, Class 9.2, Class 10.3, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.2, Class 3.2, Class 4.1, Class 9.2, Class 10.3, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.1, Class 10.2, Class
Class 1.2, Class 3.2, Class 4.1, Class 9.2, Class 10.2, Class
Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class
Class 1.1, Class 2.1, Class 3.2, Class 4.2, Class 9.1, Class 10.2, Class
Class 1.1, Class 2.1, Class 3.2, Class 4.2, Class 9.2, Class 10.3, Class
Class 1.2, Class 3.2, Class 4.1, Class 9.1, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.1, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.1, Class 4.1, Class 9.1, Class 10.3, Class
Class 1.1, Class 2.1, Class 3.2, Class 4.2, Class 9.1, Class 10.2, Class
Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.2, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.1, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.1, Class 4.1, Class 9.2, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.2, Class 10.3, Class
Class 1.1, Class 2.1, Class 3.2, Class 4.2, Class 9.1, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.1, Class 4.1, Class 9.2, Class 10.3, Class
Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.2, Class 10.3, Class 11.1

Table 7: CBR system output for extracted features data

As can be seen, both the rule-based system and the CBR system identify the galaxies correctly. We take the human annotated data as the baseline for evaluation of the systems. As an example, for one of the galaxies the case-based system gives the following classes:

Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class

The rule-based system for the same galaxy gives the following result:

Class-1.1, Class-2.2, Class-3.2, Class-4.2, Class-9.2, Class

It can be seen clearly that the results are fairly close to each other, with only Class 10.2 missing from the rule-based system and Class 2.2 missing from the case-based system. The human data classifies the galaxy into the following classes:

Class 1.1, Class 3.2, Class 4.2, Class 10.2

It is easy to see that the case-based system is closer to the human classification than the rule-based system. Consider another example galaxy. The case-based system classifies it into:

Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.2, Class 10.3, Class

whereas the rule-based system outputs:

Class-1.2, Class-2.1, Class-3.2, Class-4.1, Class-9.2, Class-10.3, Class

The human data classifies the galaxy into the following classes:

Class 1.2, Class 2.1, Class 3.2, Class 4.1, Class 9.2, Class 10.3

As can be seen from the results, the rule-based and the case-based systems perform almost the same. This helps us answer Q1 clearly: the choice of knowledge-based system does not seem to matter for the task of morphological classification, although we can see that the case-based system performs slightly better.
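The comparison made in the first example above can be reproduced mechanically by treating each output as a set of class ids. The sketch below does so for the three class lists quoted in the text. The function names are illustrative, and trailing `Class` tokens whose number is missing are simply skipped by the parser rather than guessed.

```python
import re


def parse_classes(text):
    """Extract ids such as '1.1' from 'Class 1.1' or 'Class-1.1' tokens.

    A 'Class' token without a number is ignored.
    """
    return set(re.findall(r"Class[ -]\s*(\d+\.\d+)", text))


def compare(system_output, human_output):
    """Return the classes a system misses and adds relative to the human labels."""
    system = parse_classes(system_output)
    human = parse_classes(human_output)
    return {"missing": human - system, "extra": system - human}


# Class lists quoted in the first example above.
case_based = "Class 1.1, Class 3.2, Class 4.2, Class 9.2, Class 10.2, Class"
rule_based = "Class-1.1, Class-2.2, Class-3.2, Class-4.2, Class-9.2, Class"
human = "Class 1.1, Class 3.2, Class 4.2, Class 10.2"

case_diff = compare(case_based, human)
rule_diff = compare(rule_based, human)
```

On these lists the case-based output covers every human class and adds one class (9.2), while the rule-based output misses Class 10.2 and adds two classes (2.2 and 9.2), which matches the observation that the case-based system is closer to the human classification.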
The results show that, for all galaxies tested, our knowledge-based systems classify the galaxies into all the classes assigned by the humans, plus some extra classes. Since human labellers cannot be considered perfect, since the extra classes are related to the previous classification, and since both systems exhibit performance similar to the human labelled data, we can answer Q2 affirmatively: the extracted vision features provide enough information to identify the galaxies correctly. The initial results also help us answer Q3 positively. Knowledge-based system techniques do provide a potential tool for astronomy tasks, and are worthy of additional study for such domains.

Conclusion

We considered the problem of classifying galaxy images and constructed three knowledge-based systems. Our initial results show that our MCG-KB algorithm can indeed identify the galaxies robustly and comparably to the human classification. An interesting insight that we obtained from our experiments is that the choice of the knowledge-based system did not significantly affect the performance. Extending these systems to a larger set of images remains an important research direction. Dhami (2015) has defined a larger set of features for the dataset. Using these features in the knowledge-based systems remains a challenge. Marling et al. (2002) focus primarily on systems that simultaneously use both case-based and rule-based methods (to mutually support a single process), and this integrated approach, or integrations with other methods, might be a potential future research area.

References

Banerji, M.; Lahav, O.; Lintott, C. J.; Abdalla, F. B.; Schawinski, K.; Bamford, S. P.; Andreescu, D.; Murray, P.; Raddick, M. J.; Slosar, A.; et al. 2010. Galaxy Zoo: Reproducing galaxy morphologies via machine learning. Monthly Notices of the Royal Astronomical Society 406(1).

Cui, Y.; Xiang, Y.; Rong, K.; Feris, R.; and Cao, L. 2014. A spatial-color layout feature for content-based galaxy image retrieval. In IEEE Winter Conference on Applications of Computer Vision (WACV).

De La Calleja, J., and Fuentes, O. 2004a. Automated classification of galaxy images. In International Conference on Knowledge-Based and Intelligent Information and Engineering Systems. Springer.

De La Calleja, J., and Fuentes, O. 2004b. Machine learning and image analysis for morphological galaxy classification. Monthly Notices of the Royal Astronomical Society 349(1).

De Mantaras, R. L.; McSherry, D.; Bridge, D.; Leake, D.; Smyth, B.; Craw, S.; Faltings, B.; Maher, M. L.; Cox, M. T.; Forbus, K.; et al. 2005. Retrieval, reuse, revision and retention in case-based reasoning. The Knowledge Engineering Review 20(03).

Dhami, D. S. 2015. Morphological Classification of Galaxies into Spirals and Non-Spirals. Ph.D. Dissertation, Indiana University.

Goderya, S. N., and Lolling, S. M. 2002. Morphological classification of galaxies using computer vision and artificial neural networks: A computational scheme. Astrophysics and Space Science 279(4).

Hayes-Roth, F. 1985. Rule-based systems. Communications of the ACM 28(9).

Heck, A., and Murtagh, F. 1989. Knowledge-based systems in astronomy. In Knowledge Based Systems in Astronomy, volume 329.

Kasivajhula, S.; Raghavan, N.; and Shah, H. 2007. Morphological galaxy classification using machine learning. Monthly Notices of the Royal Astronomical Society 8:1–8.

Kolodner, J. 1993. Case-Based Reasoning. Morgan Kaufmann.

Leake, D. 1996. CBR in context: The present and future. In Leake, D., ed., Case-Based Reasoning: Experiences, Lessons, and Future Directions.

Lintott, C. J.; Schawinski, K.; Slosar, A.; Land, K.; Bamford, S.; Thomas, D.; Raddick, M. J.; Nichol, R. C.; Szalay, A.; Andreescu, D.; et al. 2008. Galaxy Zoo: Morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389(3).

Marling, C.; Sqalli, M.; Rissland, E.; Muñoz-Avila, H.; and Aha, D. 2002. Case-based reasoning integrations. AI Magazine 23(1):69.

Otsu, N. 1975. A threshold selection method from gray-level histograms. Automatica 11.

Willett, K. W.; Lintott, C. J.; Bamford, S. P.; Masters, K. L.; Simmons, B. D.; Casteels, K. R.; Edmondson, E. M.; Fortson, L. F.; Kaviraj, S.; Keel, W. C.; et al. 2013. Galaxy Zoo 2: Detailed morphological classifications for galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society stt

Morphological Classification of Galaxies based on Computer Vision features using CBR and Rule Based Systems

Morphological Classification of Galaxies based on Computer Vision features using CBR and Rule Based Systems Morphological Classification of Galaxies based on Computer Vision features using CBR and Rule Based Systems Devendra Singh Dhami Tasneem Alowaisheq Graduate Candidate School of Informatics and Computing

More information

Classifying Galaxy Morphology using Machine Learning

Classifying Galaxy Morphology using Machine Learning Julian Kates-Harbeck, Introduction: Classifying Galaxy Morphology using Machine Learning The goal of this project is to classify galaxy morphologies. Generally, galaxy morphologies fall into one of two

More information

A Spatial-Color Layout Feature for Content-based Galaxy Image Retrieval

A Spatial-Color Layout Feature for Content-based Galaxy Image Retrieval A Spatial-Color Layout Feature for Content-based Galaxy Image Retrieval Yin Cui, Yongzhou Xiang, Kun Rong, Rogerio Feris, Liangliang Cao Department of Electrical Engineering, Columbia University IBM T.

More information

A Spatial-Color Layout Feature for Representing Galaxy Images

A Spatial-Color Layout Feature for Representing Galaxy Images A Spatial-Color Layout Feature for Representing Galaxy Images Yin Cui, Yongzhou Xiang, Kun Rong, Rogerio Feris, Liangliang Cao Department of Electrical Engineering, Columbia University IBM T. J. Watson

More information

Galaxy Morphologies with

Galaxy Morphologies with Galaxy Morphologies with Karen Masters ICG, Portsmouth 6.5 years of Galaxy Zoo! July 2007Feb 2009 Feb 2009April 2010 Sept 2009Jan 2010 Karen Masters: Galaxy Zoo, 18th November 2013 Apr 2010Aug 2012 Aug

More information

Automated Classification of Galaxy Zoo Images CS229 Final Report

Automated Classification of Galaxy Zoo Images CS229 Final Report Automated Classification of Galaxy Zoo Images CS229 Final Report 1. Introduction Michael J. Broxton - broxton@stanford.edu The Sloan Digital Sky Survey (SDSS) in an ongoing effort to collect an extensive

More information

Shape Descriptors in Morphological Galaxy Classification

Shape Descriptors in Morphological Galaxy Classification Shape Descriptors in Morphological Galaxy Classification Ishita Dutta 1, S. Banerjee 2 & M. De 3 1&2 Department of Natural Science, West Bengal University of Technology 3 Department of Engineering and

More information

Galaxy Growth and Classification

Galaxy Growth and Classification Observational Astronomy Lab: I-1FS Objectives: First Name: Last Name: Galaxy Growth and Classification To understand the concept of color in astronomy. To be able to classify galaxies based on their morphology

More information

Combining Human and Machine Learning for Morphological Analysis of Galaxy Images

Combining Human and Machine Learning for Morphological Analysis of Galaxy Images PUBLICATIONS OF THE ASTRONOMICAL SOCIETY OF THE PACIFIC, 126:959 967, 2014 October 2014. The Astronomical Society of the Pacific. All rights reserved. Printed in U.S.A. Combining Human and Machine Learning

More information

Galaxies. What is a Galaxy? A bit of History. A bit of History. Three major components: 1. A thin disk consisting of young and intermediate age stars

Galaxies. What is a Galaxy? A bit of History. A bit of History. Three major components: 1. A thin disk consisting of young and intermediate age stars What is a Galaxy? Galaxies A galaxy is a collection of billions of stars, dust, and gas all held together by gravity. Galaxies are scattered throughout the universe. They vary greatly in size and shape.

More information

Automatic morphological classification of galaxy images

Automatic morphological classification of galaxy images Mon. Not. R. Astron. Soc. 000, 1?? (2005) Printed 17 July 2009 (MN LATEX style file v2.2) Automatic morphological classification of galaxy images Lior Shamir 1 1 Laboratory of Genetics, NIA/NIH, 251 Bayview

More information

9. High-level processing (astronomical data analysis)

9. High-level processing (astronomical data analysis) Master ISTI / PARI / IV Introduction to Astronomical Image Processing 9. High-level processing (astronomical data analysis) André Jalobeanu LSIIT / MIV / PASEO group Jan. 2006 lsiit-miv.u-strasbg.fr/paseo

More information

Combining human and machine learning for morphological analysis of galaxy images

Combining human and machine learning for morphological analysis of galaxy images 1 Combining human and machine learning for morphological analysis of galaxy images Evan Kuminski, Lawrence Technological University, 21 W Ten Mile Rd., Southfield, MI 4875, USA Email: ekuminski@ltu.edu

More information

Towards a Data-driven Approach to Exploring Galaxy Evolution via Generative Adversarial Networks

Towards a Data-driven Approach to Exploring Galaxy Evolution via Generative Adversarial Networks Towards a Data-driven Approach to Exploring Galaxy Evolution via Generative Adversarial Networks Tian Li tian.li@pku.edu.cn EECS, Peking University Abstract Since laboratory experiments for exploring astrophysical

More information

BHS Astronomy: Galaxy Classification and Evolution

BHS Astronomy: Galaxy Classification and Evolution Name Pd Date BHS Astronomy: Galaxy Classification and Evolution This lab comes from http://cosmos.phy.tufts.edu/~zirbel/ast21/homework/hw-8.pdf (Tufts University) The word galaxy, having been used in English

More information

Modern Image Processing Techniques in Astronomical Sky Surveys

Modern Image Processing Techniques in Astronomical Sky Surveys Modern Image Processing Techniques in Astronomical Sky Surveys Items of the PhD thesis József Varga Astronomy MSc Eötvös Loránd University, Faculty of Science PhD School of Physics, Programme of Particle

More information

AUTOMATIC MORPHOLOGICAL CLASSIFICATION OF GALAXIES. 1. Introduction

AUTOMATIC MORPHOLOGICAL CLASSIFICATION OF GALAXIES. 1. Introduction AUTOMATIC MORPHOLOGICAL CLASSIFICATION OF GALAXIES ZSOLT FREI Institute of Physics, Eötvös University, Budapest, Pázmány P. s. 1/A, H-1117, Hungary; E-mail: frei@alcyone.elte.hu Abstract. Sky-survey projects

More information

Laboratory: Milky Way

Laboratory: Milky Way Department of Physics and Geology Laboratory: Milky Way Astronomy 1402 Equipment Needed Quantity Equipment Needed Quantity Milky Way galaxy Model 1 Ruler 1 1.1 Our Milky Way Part 1: Background Milky Way

More information

Galaxy Zoo. Materials Computer Internet connection

Galaxy Zoo. Materials Computer Internet connection Name: Date: Galaxy Zoo Objectives: Distinguish between different types of galaxies Identify the various features of each subclass Contribute data that will be used by astronomers in their work Learn to

More information

Galaxy Zoo: the independence of morphology and colour

Galaxy Zoo: the independence of morphology and colour Galaxy Zoo: the independence of morphology and colour Steven Bamford University of Portsmouth / University of Nottingham Chris Lintott, Kevin Schawinski, Kate Land, Anze Slosar, Daniel Thomas, Bob Nichol,

More information

Inferring Galaxy Morphology Through Texture Analysis

Inferring Galaxy Morphology Through Texture Analysis Inferring Galaxy Morphology Through Texture Analysis 1 Galaxy Morphology Kinman Au, Christopher Genovese, Andrew Connolly Galaxies are not static objects; they evolve by interacting with the gas, dust

More information

Midterm, Fall 2003

Midterm, Fall 2003 5-78 Midterm, Fall 2003 YOUR ANDREW USERID IN CAPITAL LETTERS: YOUR NAME: There are 9 questions. The ninth may be more time-consuming and is worth only three points, so do not attempt 9 unless you are

More information

Galaxy Classification and the Hubble Deep Field

Galaxy Classification and the Hubble Deep Field Galaxy Classification and the Hubble Deep Field A. The Hubble Galaxy Classification Scheme Adapted from the UW Astronomy Dept., 1999 Introduction A galaxy is an assembly of between a billion (10 9 ) and

More information

Machine Learning and Deep Learning! Vincent Lepetit!

Machine Learning and Deep Learning! Vincent Lepetit! Machine Learning and Deep Learning!! Vincent Lepetit! 1! What is Machine Learning?! 2! Hand-Written Digit Recognition! 2 9 3! Hand-Written Digit Recognition! Formalization! 0 1 x = @ A Images are 28x28

More information

Lecture 27 Galaxy Types and the Distance Ladder December 3, 2018

Lecture 27 Galaxy Types and the Distance Ladder December 3, 2018 Lecture 27 Galaxy Types and the Distance Ladder December 3, 2018 1 2 Early Observations Some galaxies had been observed before 1900 s. Distances were not known. Some looked like faint spirals. Originally

More information

Introduction. Chapter 1

Introduction. Chapter 1 Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics

More information

The Neighbors Looking outward from the Sun s location in the Milky Way, we can see a variety of other galaxies:

The Neighbors Looking outward from the Sun s location in the Milky Way, we can see a variety of other galaxies: Galaxies The Neighbors Looking outward from the Sun s location in the Milky Way, we can see a variety of other galaxies: Small Magellanic Cloud (Digital Sky Survey) Large Magellanic Cloud (credit: Eckhard

More information

SOURCES AND RESOURCES:

SOURCES AND RESOURCES: A Galactic Zoo Lesson plan for grades K-2 Length of lesson: 1 Class Period (60 minutes) Adapted by: Jesús Aguilar-Landaverde, Environmental Science Institute, February 24, 2012 SOURCES AND RESOURCES: An

More information

Galaxy classification

Galaxy classification Galaxy classification Questions of the Day What are elliptical, spiral, lenticular and dwarf galaxies? What is the Hubble sequence? What determines the colors of galaxies? Top View of the Milky Way The

More information

Chapter 30. Galaxies and the Universe. Chapter 30:

Chapter 30. Galaxies and the Universe. Chapter 30: Chapter 30 Galaxies and the Universe Chapter 30: Galaxies and the Universe Chapter 30.1: Stars with varying light output allowed astronomers to map the Milky Way, which has a halo, spiral arm, and a massive

More information

Physics Lab #10: Citizen Science - The Galaxy Zoo

Physics Lab #10: Citizen Science - The Galaxy Zoo Physics 10263 Lab #10: Citizen Science - The Galaxy Zoo Introduction Astronomy over the last two decades has been dominated by large sky survey projects. The Sloan Digital Sky Survey was one of the first

More information

CS 188: Artificial Intelligence. Outline

CS 188: Artificial Intelligence. Outline CS 188: Artificial Intelligence Lecture 21: Perceptrons Pieter Abbeel UC Berkeley Many slides adapted from Dan Klein. Outline Generative vs. Discriminative Binary Linear Classifiers Perceptron Multi-class

More information

A Tour of the Messier Catalog. ~~ in ~~ Eight Spellbinding and Enlightening Episodes. ~~ This Being Episode Three ~~

A Tour of the Messier Catalog. ~~ in ~~ Eight Spellbinding and Enlightening Episodes. ~~ This Being Episode Three ~~ A Tour of the Messier Catalog ~~ in ~~ Eight Spellbinding and Enlightening Episodes ~~ This Being Episode Three ~~ Globulars and Galaxies Warm-up for The Realm M83 Spiral Galaxy Constellation Hydra

More information

Applied Machine Learning for Design Optimization in Cosmology, Neuroscience, and Drug Discovery

Applied Machine Learning for Design Optimization in Cosmology, Neuroscience, and Drug Discovery Applied Machine Learning for Design Optimization in Cosmology, Neuroscience, and Drug Discovery Barnabas Poczos Machine Learning Department Carnegie Mellon University Machine Learning Technologies and

More information

View of the Galaxy from within. Lecture 12: Galaxies. Comparison to an external disk galaxy. Where do we lie in our Galaxy?

View of the Galaxy from within. Lecture 12: Galaxies. Comparison to an external disk galaxy. Where do we lie in our Galaxy? Lecture 12: Galaxies View of the Galaxy from within The Milky Way galaxy Rotation curves and dark matter External galaxies and the Hubble classification scheme Plotting the sky brightness in galactic coordinates,

More information

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run

More information

CS 188: Artificial Intelligence Spring Announcements

CS 188: Artificial Intelligence Spring Announcements CS 188: Artificial Intelligence Spring 2010 Lecture 22: Nearest Neighbors, Kernels 4/18/2011 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements On-going: contest (optional and FUN!)

More information

Bayes Classifiers. CAP5610 Machine Learning Instructor: Guo-Jun QI

Bayes Classifiers. CAP5610 Machine Learning Instructor: Guo-Jun QI Bayes Classifiers CAP5610 Machine Learning Instructor: Guo-Jun QI Recap: Joint distributions Joint distribution over Input vector X = (X 1, X 2 ) X 1 =B or B (drinking beer or not) X 2 = H or H (headache

More information

Analyzing Spiral Galaxies Observed in Near-Infrared

Analyzing Spiral Galaxies Observed in Near-Infrared Analyzing Spiral Galaxies Observed in Near-Infrared Preben Grosbøl European Southern Observatory Karl-Schwarzschild-Str. 2, D-85748 Garching, Germany Abstract A sample of 54 spiral galaxies was observed

More information

GALAXIES. Hello Mission Team members. Today our mission is to learn about galaxies.

GALAXIES. Hello Mission Team members. Today our mission is to learn about galaxies. GALAXIES Discussion Hello Mission Team members. Today our mission is to learn about galaxies. (Intro slide- 1) Galaxies span a vast range of properties, from dwarf galaxies with a few million stars barely

More information

Understanding Generalization Error: Bounds and Decompositions

Understanding Generalization Error: Bounds and Decompositions CIS 520: Machine Learning Spring 2018: Lecture 11 Understanding Generalization Error: Bounds and Decompositions Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the

More information

Hubble s Law and the Cosmic Distance Scale

Hubble s Law and the Cosmic Distance Scale Lab 7 Hubble s Law and the Cosmic Distance Scale 7.1 Overview Exercise seven is our first extragalactic exercise, highlighting the immense scale of the Universe. It addresses the challenge of determining

More information

arxiv: v2 [astro-ph.co] 19 Aug 2013

arxiv: v2 [astro-ph.co] 19 Aug 2013 Mon. Not. R. Astron. Soc., 1 9 (13) Printed August 13 (MN LATX style file v.) Galaxy Zoo : detailed morphological classifications for 3,1 galaxies from the Sloan Digital Sky Survey arxiv:138.396v [astro-ph.co]

More information

Galaxies & Introduction to Cosmology

Galaxies & Introduction to Cosmology Galaxies & Introduction to Cosmology Other Galaxies: How many are there? Hubble Deep Field Project 100 hour exposures over 10 days Covered an area of the sky about 1/100 the size of the full moon Probably

More information

Excerpts from previous presentations. Lauren Nicholson CWRU Departments of Astronomy and Physics

Excerpts from previous presentations. Lauren Nicholson CWRU Departments of Astronomy and Physics Excerpts from previous presentations Lauren Nicholson CWRU Departments of Astronomy and Physics Part 1: Review of Sloan Digital Sky Survey and the Galaxy Zoo Project Part 2: Putting it all together Part

More information

Be able to define the following terms and answer basic questions about them:

Be able to define the following terms and answer basic questions about them: CS440/ECE448 Section Q Fall 2017 Final Review Be able to define the following terms and answer basic questions about them: Probability o Random variables, axioms of probability o Joint, marginal, conditional

More information

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition Data Mining Classification: Basic Concepts and Techniques Lecture Notes for Chapter 3 by Tan, Steinbach, Karpatne, Kumar 1 Classification: Definition Given a collection of records (training set ) Each

More information

The Milky Way Galaxy (ch. 23)

The Milky Way Galaxy (ch. 23) The Milky Way Galaxy (ch. 23) [Exceptions: We won t discuss sec. 23.7 (Galactic Center) much in class, but read it there will probably be a question or a few on it. In following lecture outline, numbers

More information

Lecture 15: Galaxy morphology and environment

Lecture 15: Galaxy morphology and environment GALAXIES 626 Lecture 15: Galaxy morphology and environment Why classify galaxies? The Hubble system gives us our basic description of galaxies. The sequence of galaxy types may reflect an underlying physical

More information

Machine Learning, Fall 2009: Midterm

Machine Learning, Fall 2009: Midterm 10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all

More information

2MHR. Protein structure classification is important because it organizes the protein structure universe that is independent of sequence similarity.

2MHR. Protein structure classification is important because it organizes the protein structure universe that is independent of sequence similarity. Protein structure classification is important because it organizes the protein structure universe that is independent of sequence similarity. A global picture of the protein universe will help us to understand

More information

SKINAKAS OBSERVATORY. Astronomy Projects for University Students PROJECT GALAXIES

SKINAKAS OBSERVATORY. Astronomy Projects for University Students PROJECT GALAXIES PROJECT 7 GALAXIES Objective: The topics covered in the previous lessons target celestial objects located in our neighbourhood, i.e. objects which are within our own Galaxy. However, the Universe extends

More information

Group Member Names: You may work in groups of two, or you may work alone. Due November 20 in Class!

Group Member Names: You may work in groups of two, or you may work alone. Due November 20 in Class! Galaxy Classification and Their Properties Group Member Names: You may work in groups of two, or you may work alone. Due November 20 in Class! Learning Objectives Classify a collection of galaxies based

More information

Lecture 2. Judging the Performance of Classifiers. Nitin R. Patel

Lecture 2. Judging the Performance of Classifiers. Nitin R. Patel Lecture 2 Judging the Performance of Classifiers Nitin R. Patel 1 In this note we will examine the question of how to udge the usefulness of a classifier and how to compare different classifiers. Not only

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts Data Mining: Concepts and Techniques (3 rd ed.) Chapter 8 1 Chapter 8. Classification: Basic Concepts Classification: Basic Concepts Decision Tree Induction Bayes Classification Methods Rule-Based Classification

More information

Bayesian Classifiers and Probability Estimation. Vassilis Athitsos CSE 4308/5360: Artificial Intelligence I University of Texas at Arlington

Bayesian Classifiers and Probability Estimation. Vassilis Athitsos CSE 4308/5360: Artificial Intelligence I University of Texas at Arlington Bayesian Classifiers and Probability Estimation Vassilis Athitsos CSE 4308/5360: Artificial Intelligence I University of Texas at Arlington 1 Data Space Suppose that we have a classification problem The

More information

Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University

Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University kborne@gmu.edu, http://classweb.gmu.edu/kborne/ Outline What is Surprise Detection? Example Application: The LSST

More information

Kjersti Aas Line Eikvil Otto Milvang. Norwegian Computing Center, P.O. Box 114 Blindern, N-0314 Oslo, Norway. sharp reexes was a challenge. machine.

Kjersti Aas Line Eikvil Otto Milvang. Norwegian Computing Center, P.O. Box 114 Blindern, N-0314 Oslo, Norway. sharp reexes was a challenge. machine. Automatic Can Separation Kjersti Aas Line Eikvil Otto Milvang Norwegian Computing Center, P.O. Box 114 Blindern, N-0314 Oslo, Norway e-mail: Kjersti.Aas@nr.no Tel: (+47) 22 85 25 00 Fax: (+47) 22 69 76

More information

A Hierarchical Model for Morphological Galaxy Classification

A Hierarchical Model for Morphological Galaxy Classification Proceedings of the Twenty-Sixth International Florida Artificial Intelligence Research Society Conference A Hierarchical Model for Morphological Galaxy Classification Maribel Marin and L. Enrique Sucar

More information

Observing Dark Worlds (Final Report)

Observing Dark Worlds (Final Report) Observing Dark Worlds (Final Report) Bingrui Joel Li (0009) Abstract Dark matter is hypothesized to account for a large proportion of the universe s total mass. It does not emit or absorb light, making

More information

National Aeronautics and Space Administration. Glos. Glossary. of Astronomy. Terms. Related to Galaxies

National Aeronautics and Space Administration. Glos. Glossary. of Astronomy. Terms. Related to Galaxies National Aeronautics and Space Administration Glos of Astronomy Glossary Terms Related to Galaxies Asterism: A pattern formed by stars not recognized as one of the official 88 constellations. Examples

More information

The Galaxy Zoo Project

The Galaxy Zoo Project Astronomy 201: Cosmology Fall 2009 Prof. Bechtold NAME: The Galaxy Zoo Project 200 points Due: Nov. 23, 2010, in class Professional astronomers often have to search through enormous quantities of data

More information

CSE 546 Final Exam, Autumn 2013

CSE 546 Final Exam, Autumn 2013 CSE 546 Final Exam, Autumn 0. Personal info: Name: Student ID: E-mail address:. There should be 5 numbered pages in this exam (including this cover sheet).. You can use any material you brought: any book,

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

INSIDE LAB 9: Classification of Stars and Other Celestial Objects

INSIDE LAB 9: Classification of Stars and Other Celestial Objects INSIDE LAB 9: Classification of Stars and Other Celestial Objects OBJECTIVE: To become familiar with the classification of stars by spectral type, and the classification of celestial objects such as galaxies.

More information

Classification of Galaxy Morphological Image Based on Convolutional Neural Network

Classification of Galaxy Morphological Image Based on Convolutional Neural Network Classification of Galaxy Morphological Image Based on Convolutional Neural Network Wahyono, Muhammad Arif Rahman, and Azhari SN Department of Computer Science and Electronics, Universitas Gadjah Mada,

More information

HubVis: Software for Gravitational Lens Estimation and Visualization from Hubble Data

HubVis: Software for Gravitational Lens Estimation and Visualization from Hubble Data HubVis: Software for Gravitational Lens Estimation and Visualization from Hubble Data Sam L. Shue, Andrew R. Willis, and Thomas P. Weldon Dept. of Electrical and Computer Engineering University of North

More information

April 11, Astronomy Notes Chapter 16.notebook. Types of Galaxies

April 11, Astronomy Notes Chapter 16.notebook. Types of Galaxies The Milky Way is just one of about 50 billion galaxies that are thought to exist. Just as stars can be classified using an H R diagram, galaxies can also be classified according to certain physical properties.

More information

Prediction of Citations for Academic Papers

Prediction of Citations for Academic Papers 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

CS 2750: Machine Learning. Bayesian Networks. Prof. Adriana Kovashka University of Pittsburgh March 14, 2016

CS 2750: Machine Learning. Bayesian Networks. Prof. Adriana Kovashka University of Pittsburgh March 14, 2016 CS 2750: Machine Learning Bayesian Networks Prof. Adriana Kovashka University of Pittsburgh March 14, 2016 Plan for today and next week Today and next time: Bayesian networks (Bishop Sec. 8.1) Conditional

More information

Quality and Coverage of Data Sources

Quality and Coverage of Data Sources Quality and Coverage of Data Sources Objectives Selecting an appropriate source for each item of information to be stored in the GIS database is very important for GIS Data Capture. Selection of quality

More information

Automatic detection and of dipoles in large area SQUID magnetometry

Automatic detection and of dipoles in large area SQUID magnetometry Automatic detection and of dipoles in large area SQUID magnetometry Lisa Qian December 4, Introduction. Scanning SQUID magnetometry Scanning SQUID magnetometry is a powerful tool for metrology of individual

More information

MIRA, SVM, k-nn. Lirong Xia

MIRA, SVM, k-nn. Lirong Xia MIRA, SVM, k-nn Lirong Xia Linear Classifiers (perceptrons) Inputs are feature values Each feature has a weight Sum is the activation activation w If the activation is: Positive: output +1 Negative, output

More information

Corners, Blobs & Descriptors. With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros

Corners, Blobs & Descriptors. With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros Corners, Blobs & Descriptors With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros Motivation: Build a Panorama M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 2003 How do we build panorama?

More information

Galactic Census: Population of the Galaxy grades 9 12

Galactic Census: Population of the Galaxy grades 9 12 Galactic Census: Population of the Galaxy grades 9 12 Objective Introduce students to a range of celestial objects that populate the galaxy, having them calculate estimates of how common each object is

More information

Probabilistic Graphical Models for Image Analysis - Lecture 1

Probabilistic Graphical Models for Image Analysis - Lecture 1 Probabilistic Graphical Models for Image Analysis - Lecture 1 Alexey Gronskiy, Stefan Bauer 21 September 2018 Max Planck ETH Center for Learning Systems Overview 1. Motivation - Why Graphical Models 2.

More information

Machine Learning, Midterm Exam: Spring 2008 SOLUTIONS. Q Topic Max. Score Score. 1 Short answer questions 20.

Machine Learning, Midterm Exam: Spring 2008 SOLUTIONS. Q Topic Max. Score Score. 1 Short answer questions 20. 10-601 Machine Learning, Midterm Exam: Spring 2008 Please put your name on this cover sheet If you need more room to work out your answer to a question, use the back of the page and clearly mark on the

More information

arxiv: v1 [cs.ds] 3 Feb 2018

arxiv: v1 [cs.ds] 3 Feb 2018 A Model for Learned Bloom Filters and Related Structures Michael Mitzenmacher 1 arxiv:1802.00884v1 [cs.ds] 3 Feb 2018 Abstract Recent work has suggested enhancing Bloom filters by using a pre-filter, based

More information

Randomized Decision Trees

Randomized Decision Trees Randomized Decision Trees compiled by Alvin Wan from Professor Jitendra Malik s lecture Discrete Variables First, let us consider some terminology. We have primarily been dealing with real-valued data,

More information

A100H Exploring the Universe: Discovering Galaxies. Martin D. Weinberg UMass Astronomy

A100H Exploring the Universe: Discovering Galaxies. Martin D. Weinberg UMass Astronomy A100H Exploring the Universe: Discovering Galaxies Martin D. Weinberg UMass Astronomy astron100h-mdw@courses.umass.edu April 05, 2016 Read: Chap 19 04/05/16 slide 1 Exam #2 Returned by next class meeting

More information

Detecting Dark Matter Halos using Principal Component Analysis

Detecting Dark Matter Halos using Principal Component Analysis Detecting Dark Matter Halos using Principal Component Analysis William Chickering, Yu-Han Chou Computer Science, Stanford University, Stanford, CA 9435 (Dated: December 15, 212) Principal Component Analysis

More information

Efficiently merging symbolic rules into integrated rules

Efficiently merging symbolic rules into integrated rules Efficiently merging symbolic rules into integrated rules Jim Prentzas a, Ioannis Hatzilygeroudis b a Democritus University of Thrace, School of Education Sciences Department of Education Sciences in Pre-School

More information

Rapid Object Recognition from Discriminative Regions of Interest

Rapid Object Recognition from Discriminative Regions of Interest Rapid Object Recognition from Discriminative Regions of Interest Gerald Fritz, Christin Seifert, Lucas Paletta JOANNEUM RESEARCH Institute of Digital Image Processing Wastiangasse 6, A-81 Graz, Austria

More information

Normal Galaxies (Ch. 24) + Galaxies and Dark Matter (Ch. 25) Symbolically: E0.E7.. S0..Sa..Sb..Sc..Sd..Irr

Normal Galaxies (Ch. 24) + Galaxies and Dark Matter (Ch. 25) Symbolically: E0.E7.. S0..Sa..Sb..Sc..Sd..Irr Normal Galaxies (Ch. 24) + Galaxies and Dark Matter (Ch. 25) Here we will cover topics in Ch. 24 up to 24.4, but then skip 24.4, 24.5 and proceed to 25.1, 25.2, 25.3. Then, if there is time remaining,

More information

Massachusetts Tests for Educator Licensure (MTEL )

Massachusetts Tests for Educator Licensure (MTEL ) Massachusetts Tests for Educator Licensure (MTEL ) BOOKLET 2 Mathematics Subtest Copyright 2010 Pearson Education, Inc. or its affiliate(s). All rights reserved. Evaluation Systems, Pearson, P.O. Box 226,

More information

CS 188: Artificial Intelligence Spring Announcements

CS 188: Artificial Intelligence Spring Announcements CS 188: Artificial Intelligence Spring 2010 Lecture 24: Perceptrons and More! 4/22/2010 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements W7 due tonight [this is your last written for

More information

Machine Learning Applications in Astronomy

Machine Learning Applications in Astronomy Machine Learning Applications in Astronomy Umaa Rebbapragada, Ph.D. Machine Learning and Instrument Autonomy Group Big Data Task Force November 1, 2017 Research described in this presentation was carried

More information












