Mining in Hepatitis Data by LISp-Miner and SumatraTT
Petr Aubrecht 1, Martin Kejkula 2, Petr Křemen 1, Lenka Nováková 1, Jan Rauch 2, Milan Šimůnek 2, Olga Štěpánková 1, and Monika Žáková 1

1 Czech Technical University in Prague, FEE, Prague 6, Czech Republic, {aubrech,kremep1,novakova,step}@fel.cvut.cz
2 University of Economics, Prague, W. Churchill Sq. 4, Praha 3, Czech Republic, {kejkula,rauch,simunek}@vse.cz

Abstract. The paper suggests a methodology for the search for temporal patterns, which is tested on the problem of the difference between hepatitis B and C. To reach this goal, two software systems, LISp-Miner and SumatraTT, are combined: sophisticated data transformations and enhancements are designed and carried out in SumatraTT, while LISp-Miner takes care of the search for significant and interesting differences in the resulting datasets. The main results are reviewed in section 4: several examinations are identified whose values differ significantly between the two types of hepatitis. This indicates that the suggested general methodology has promising potential when applied to the considered type of data. A plan for additional data-mining questions to be studied later is presented in the Conclusions.

1 Introduction

This paper presents results of mining in the hepatitis data set that was offered as a part of the Discovery Challenge at PKDD 2005. We try to contribute to discovering differences in temporal patterns between hepatitis B and C. A similar question has been analyzed in [2]; while the authors of [2] use temporal abstraction, we introduce trend characteristics (see section 2.2), which are calculated during data preprocessing in SumatraTT, and we apply LISp-Miner to find relevant association rules (see section 3).

SumatraTT [8] is a modular system for data preprocessing and data transformations. It offers a number of easy-to-use reusable modules for loading or exporting data from/to different formats, for analysis of individual attributes (elementary statistics, contingency tables, etc.) and for definition of additional derived attributes through sophisticated processing (scripting, integration with SQL databases, etc.). The last property represents one of the main advantages of SumatraTT for the considered data-mining task: to characterize the temporal patterns in the measured attributes of patients, it proved necessary to introduce a number of new derived attributes. They have been obtained through transformations and aggregations carried out by SumatraTT. The individual modules can be combined into a project through an intuitive graphical interface, which automatically creates detailed documentation as a by-product. In this way, the SumatraTT project becomes an efficient communication platform for our team during the work on any data-mining task.

LISp-Miner is an academic software system intended to support teaching and research. It consists of six data mining procedures, the machine learning procedure KEX, and two procedures for data transformation [5, 7]; see also http://lispminer.vse.cz. All six data mining procedures of the LISp-Miner system are GUHA procedures in the sense of [1]. The input of a GUHA procedure consists of the analyzed data and a simple definition of relevant (i.e. potentially interesting) patterns. The GUHA procedure automatically generates each particular pattern and tests whether it is true in the analyzed data. The output of the procedure consists of all prime patterns. A pattern is prime if it is true in the analyzed data and if it does not immediately follow from other, simpler output patterns [1]. Each GUHA procedure of the LISp-Miner system mines for a particular type of pattern. The most frequently used procedure is 4ft-Miner, which mines for enhanced association rules [5]. In this paper, we use the procedure SD4ft-Miner [6].

The paper is organized as follows. Applications of SumatraTT to derive several data matrices suitable for further analysis are described in section 2. The procedure SD4ft-Miner mines for SD4ft-patterns, which are introduced in section 3; results of several applications of SD4ft-Miner are given in section 4. Some concluding remarks are in section 5.

2 Transforming Data by SumatraTT

2.1 Data Understanding: Review of Important Properties

The hepatitis source data has the form of CSV files with good documentation. The data was first loaded from the text files (ilab e030704.csv etc.) into an SQL database. We prepared several steps of data preprocessing. First, the data from both tables ilab and olab (internal and external exams) were merged and further considered together. The merged table contains 1060 different types of exams (mainly due to disparate exam types in olab, with 845 different exams). To reduce this excessive number, we first decided to omit rare exam types and take into account only exam types with more than [...] occurrences; there are 41 exams of this sort. Moreover, 81 important exam types were considered upon a specialist's recommendation. In this way, we ended up with 105 exam types.

During the data preprocessing we identified some important properties of the considered dataset which were not explicitly mentioned in the earlier articles dedicated to this dataset. After reviewing the summary numbers in the table that follows, we decided to study the group with exactly one biopsy and without interferon therapy.
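The exam-type reduction described above can be sketched as follows. This is only an illustration under stated assumptions, not the SumatraTT project used by the authors: the function name `select_exam_types`, the parameter `min_occurrences` and the toy exam codes are hypothetical, and the actual occurrence threshold is not given in this text. Taking the union of the frequent and the recommended exam types would also explain why 41 and 81 types can combine to 105 when the two sets overlap.

```python
from collections import Counter


def select_exam_types(exam_codes, min_occurrences, recommended):
    """Keep exam types that occur more than `min_occurrences` times,
    plus every type on the (hypothetical) specialist's list."""
    counts = Counter(exam_codes)
    frequent = {code for code, n in counts.items() if n > min_occurrences}
    return frequent | set(recommended)


# Toy data: one exam code per measurement row of the merged ilab/olab table.
records = ["GOT", "GOT", "GOT", "GPT", "GPT", "ALB", "RARE1"]
selected = select_exam_types(records, min_occurrences=1, recommended={"ALB"})
print(sorted(selected))
```

In this toy run, ALB is kept only because it is on the recommended list, and RARE1 is dropped.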
Patients   Description
771        all
99         > 1 biopsy (all have type C)
503        with a biopsy
123        = 1 biopsy and interferon
74         > 1 biopsy and interferon
           biopsy, no interferon
1          with interferon, no biopsy
           exam before the first biopsy
281        = 1 biopsy, no interferon
           exam in hemat table before the first biopsy
3          1 biopsy, no other exam (mid=808, 500, 202)

2.2 Temporal Characteristics of the Considered Attributes

The patients are not examined regularly: the period between two examinations can range from one day to several months. Periods in which a patient is observed frequently alternate with quieter periods. The highest number of all exams for one patient is [...] (patient # 321). This irregularity has to be taken into account when choosing the characteristics for describing the temporal properties of the measured values.

In order to standardize the information provided about individual patients, we decided to concentrate on data collected during a specific, well-defined time interval of a fixed length τ for each individual patient. The considered interval does not start on a single date for all the patients. On the contrary, it is tightly bound to the state of the individual patient: the considered interval ends (or begins) at a significant instant, which can be easily recognized in the available data of the patient, e.g. the time of his/her first biopsy or the time when a specific treatment (e.g. interferon) was introduced. The length of the time interval is the same for all patients and is understood to be a parameter τ of the considered project.

The number of measurements for one patient during one year usually ranges between 2 and 10. To make up for this non-uniformity, we decided to use the following trend characteristics of the considered sequence of time-stamped data: average, number of measurements, gradient (resulting from linear approximation), maximum, minimum, and variance. For the purposes of further data mining, the results were saved as a matrix with patients in rows and trend characteristics in columns. The rest of the paper tries to show that this type of derived attribute can capture interesting dependencies in the considered data.

For that purpose we have to fix the significant instant and τ. In the rest of the paper, the significant instant is set to the time of the first biopsy. Moreover, we do not include in the studied dataset patients treated with interferon. Only those patients with measurements during the τ period (see section 2.3) before the first biopsy were selected. In the next step, all the exams were filtered according to the following requirements: the exam must provide a numeric value (omitting values like +, > 3, etc.) and the type of exam must be measured at least 10 times for each considered patient. The size of the resulting set is mentioned in section 2.3. The data selected in the previous steps were then analyzed as sequences, and the above-mentioned trend characteristics were calculated. Finally, data about patients were added (sex, age, type of hepatitis, maximum fibrosis and activity).
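The trend characteristics listed above can be illustrated with a short sketch for one patient and one exam type. It is not the SumatraTT implementation. Several details are assumptions made only for this example: the window is given in days, the gradient is an ordinary least-squares slope per day, the variance is the population variance, and the name `trend_characteristics` is hypothetical.

```python
from datetime import date, timedelta
from statistics import mean, pvariance


def trend_characteristics(measurements, instant, window_days):
    """Summarize (date, value) pairs measured within `window_days`
    before the significant instant (e.g. the first biopsy)."""
    start = instant - timedelta(days=window_days)
    points = sorted((d, v) for d, v in measurements if start <= d <= instant)
    if not points:
        return None  # no measurement in the window: patient is not selected
    xs = [(d - start).days for d, _ in points]
    ys = [v for _, v in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return {
        "avg": y_bar,
        "count": len(ys),
        # Slope of the least-squares line; undefined for a single time point.
        "grad": sxy / sxx if sxx > 0 else None,
        "max": max(ys),
        "min": min(ys),
        "var": pvariance(ys),
    }


# Toy series: the 1998 value lies outside the 365-day window and is ignored.
series = [
    (date(2000, 1, 1), 10.0),
    (date(2000, 1, 11), 12.0),
    (date(2000, 1, 21), 14.0),
    (date(1998, 1, 1), 99.0),
]
row = trend_characteristics(series, instant=date(2000, 1, 31), window_days=365)
print(row)
```

One such dictionary per patient and exam type, flattened into columns such as an average and a gradient column, would give a matrix with patients in rows and trend characteristics in columns.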
The data preprocessing resulted in a data matrix whose rows correspond to particular patients identified by MID. The columns of this data matrix contain various trend characteristics of the considered examinations for the corresponding patients (e.g. ALB avg is the average of the values of ALB exam results).

2.3 Enhanced Datasets

The following data matrices describing the behavior of various characteristics before the first biopsy were prepared for data mining: TRENDS BIO 24 relates to patients who have a history of exams at least τ = 24 months long, TRENDS BIO 12 to patients with an exam history of τ = 12 months (investigated in detail further in the article), and TRENDS BIO 3 to patients with an exam history of τ = 3 months. The size of the resulting datasets increases as τ decreases (53, 85, and 171 patients). For pilot experiments, the dataset corresponding to 12 months was chosen.

3 SD4ft-patterns

We use the data matrix TRENDS BIO 12 shown in Fig. 1 to introduce the SD4ft-patterns. Each row of TRENDS BIO 12 corresponds to one patient identified by the value of column MID.

row      Basic attributes                            In-hospital examinations
number   MID   Sex   Age   Type   Fibrosis   Activity   CL avg   CL grad
1        1     M     29    B      2          2          X        X
2        42    M     33    C                                     -5.5E-7
               M     58    C      1          FALSE      X        X

Fig. 1. Data matrix TRENDS BIO 12

Values of the column Sex come from the table pt e030704.csv. Column Age contains the age of the patient at the time of the first biopsy (bio e030704.csv and pt e030704.csv are used). Columns Type (i.e. hepatitis type), Fibrosis and Activity come from bio e030704.csv and indicate values at the time of the first biopsy. The value X in the column CL avg in row 1 means that the value of CL (i.e. chloride, see ilab e030704.csv) for the patient with MID = 1 was not measured. The value -5.5E-7 in the column CL grad in row 2 is the gradient of the linear approximation of the time series of the CL examinations taken during the 12 months before the first biopsy for the patient with MID = 42. The same holds analogously for further patients and columns. The data matrix TRENDS BIO 12 has 224 columns with gradient, average, etc. values of specific examinations.

The procedure SD4ft-Miner mines for SD4ft-patterns of the form

α ⋈ β : ϕ ≈ ψ / γ
M/(α ∧ γ)   ψ          ¬ψ
ϕ           a_{α∧γ}    b_{α∧γ}
¬ϕ          c_{α∧γ}    d_{α∧γ}
4ft(ϕ, ψ, M/(α ∧ γ))

M/(β ∧ γ)   ψ          ¬ψ
ϕ           a_{β∧γ}    b_{β∧γ}
¬ϕ          c_{β∧γ}    d_{β∧γ}
4ft(ϕ, ψ, M/(β ∧ γ))

Fig. 2. 4ft-tables 4ft(ϕ, ψ, M/(α ∧ γ)) and 4ft(ϕ, ψ, M/(β ∧ γ))

Here α, β, γ, ϕ, and ψ are Boolean attributes defined from the columns of the analyzed data matrix M. The SD4ft-pattern α ⋈ β : ϕ ≈ ψ / γ means that the subsets of patients meeting the Boolean conditions α and β differ in the validity of the association rule ϕ ≈ ψ when the condition given by the Boolean attribute γ is satisfied. A measure of difference is defined by the symbol ≈, which is called the SD4ft-quantifier. The association rule ϕ ≈ ψ means here a general relation of the Boolean attributes ϕ and ψ in the sense of [5].

An example of an SD4ft-pattern is the pattern

Type(B) ⋈ Type(C) : LDH grad(≥ 0) ⇒^D_{0.4} GOT grad(≥ 0) / Age(30-69)

It means that the patients with hepatitis B differ from the patients with hepatitis C in the relation of the Boolean attributes LDH grad(≥ 0) (i.e. the value of LDH grad is ≥ 0) and GOT grad(≥ 0) when we consider patients aged 30-69 years. The difference is given by the SD4ft-quantifier ⇒^D_{0.4}. We introduce it using the general notation α, β, γ, ϕ, and ψ.

The SD4ft-quantifier concerns two four-fold contingency tables (i.e. 4ft-tables) 4ft(ϕ, ψ, M/(α ∧ γ)) and 4ft(ϕ, ψ, M/(β ∧ γ)), see Fig. 2. The 4ft-table 4ft(ϕ, ψ, M/(α ∧ γ)) of ϕ and ψ on M/(α ∧ γ) is the contingency table of ϕ and ψ on M/(α ∧ γ). The data matrix M/(α ∧ γ) is the submatrix of M that consists of exactly those rows of M that satisfy α ∧ γ. In other words, M/(α ∧ γ) corresponds to all objects (i.e. rows) from the set defined by α that also satisfy the condition γ. We have 4ft(ϕ, ψ, M/(α ∧ γ)) = ⟨a_{α∧γ}, b_{α∧γ}, c_{α∧γ}, d_{α∧γ}⟩, where a_{α∧γ} is the number of rows of the data matrix M/(α ∧ γ) satisfying both ϕ and ψ, b_{α∧γ} is the number of rows satisfying ϕ and not ψ, etc. The 4ft-table 4ft(ϕ, ψ, M/(β ∧ γ)) of ϕ and ψ on M/(β ∧ γ) is defined analogously.

The SD4ft-quantifier ⇒^D_{0.4} is defined by the condition

\[ \left| \frac{a_{\alpha\wedge\gamma}}{a_{\alpha\wedge\gamma}+b_{\alpha\wedge\gamma}} - \frac{a_{\beta\wedge\gamma}}{a_{\beta\wedge\gamma}+b_{\beta\wedge\gamma}} \right| \geq 0.4 \]

This condition means that the difference between the confidence of the classical association rule ϕ → ψ on the data matrix M/(α ∧ γ) and the confidence of this association rule on the data matrix M/(β ∧ γ) is at least 0.4. The SD4ft-pattern α ⋈ β : ϕ ⇒^D_{0.4} ψ / γ is true on the data matrix M if the condition

\[ \left| \frac{a_{\beta\wedge\gamma}}{a_{\beta\wedge\gamma}+b_{\beta\wedge\gamma}} - \frac{a_{\alpha\wedge\gamma}}{a_{\alpha\wedge\gamma}+b_{\alpha\wedge\gamma}} \right| \geq 0.4 \]

is satisfied. The example SD4ft-pattern is verified using the 4ft-tables T_B and T_C, see Fig. 3. Let us note that the sum of all frequencies from the 4ft-tables T_B and T_C is smaller than 60 because rows with missing values X are omitted.

TRENDS BIO 12 / (Type(B) ∧ Age(30-69))   GOT grad(≥ 0)   ¬GOT grad(≥ 0)
LDH grad(≥ 0)                             11              0
¬LDH grad(≥ 0)                            6               5
T_B = 4ft(LDH grad(≥ 0), GOT grad(≥ 0), TRENDS BIO 12/(Type(B) ∧ Age(30-69)))

TRENDS BIO 12 / (Type(C) ∧ Age(30-69))   GOT grad(≥ 0)   ¬GOT grad(≥ 0)
LDH grad(≥ 0)
¬LDH grad(≥ 0)                            0               4
T_C = 4ft(LDH grad(≥ 0), GOT grad(≥ 0), TRENDS BIO 12/(Type(C) ∧ Age(30-69)))

Fig. 3. 4ft-tables T_B and T_C

It is easy to verify that the condition corresponding to the SD4ft-quantifier ⇒^D_{0.4} is satisfied. We can conclude that the SD4ft-pattern

Type(B) ⋈ Type(C) : LDH grad(≥ 0) ⇒^D_{0.4} GOT grad(≥ 0) / Age(30-69)

is true on the data matrix TRENDS BIO 12. Very informally speaking, we can interpret this SD4ft-pattern as follows: the confidence of the association rule (non-negative gradient of LDH) → (non-negative gradient of GOT) is at least 0.4 greater for type B than for type C when we consider patients 30-69 years old.

4 SD4ft-Miner Application Results

We solved three different tasks. In the first task we searched for very simple SD4ft-patterns (without condition γ)

Type(B) ⋈ Type(C) : TRUE ≈ ψ

where TRUE is a specially prepared basic Boolean attribute that is identically true and ≈ is a suitable SD4ft-quantifier (see below). Note that the confidence of the association rule TRUE → ψ is equal to the relative frequency of rows of the analyzed data matrix satisfying ψ. It means that we can use the SD4ft-quantifier

⇒^D_{0.15} ∧ a_α ≥ 10 ∧ a_β ≥ 10

which says that the difference of relative frequencies is at least 0.15 and that there are at least 10 patients with type B hepatitis satisfying ψ and also at least 10 patients with type C hepatitis satisfying ψ.

We use the set of relevant SD4ft-patterns Type(B) ⋈ Type(C) : TRUE ≈ ψ such that the succedents are 903 intervals of averages of 22 in-hospital examinations, namely CL, D-BIL, F-CHO, FE, G-GL, G-GTP, GOT, GPT, HBE-AB, HBE-AG, CHE, I-BIL, K, LDH, NA, Oudan, T-BIL, T-CHO, TG, TP, U-UBG, UN. The number of 903 intervals results from a few parameters chosen so that the resulting intervals are of reasonable size. These patterns were generated and verified in 1 s (PC with 3.06 GHz, 512 MB DDR SDRAM). Due to various optimizations, only 308 verifications were actually performed. The result is 18 true SD4ft-patterns concerning 8 attributes. The strongest pattern for each of these attributes is shown in Table 1. Note that there are 27 patients with hepatitis type B and 33 patients with hepatitis type C.

Table 1. Differences of relative frequencies
Columns: literal; frequency for type B (relative R_B, absolute); frequency for type C (relative R_C, absolute); R_B − R_C.
Literals, one per row (interval bounds as far as preserved; numeric values not preserved):
TP avg( 7)
CHE avg(100;
F-CHO avg(45;
CL avg(102;
I-BIL avg(0.3;
UN avg(12;
T-BIL avg(0.6;
G-GTP avg(20;

The difference of relative frequencies can be understood as a difference of confidences of the association rules TRUE → ψ for types B and C. Thus it is reasonable to ask whether there is a difference stronger than 0.4 between the confidences of association rules ϕ → ψ, where both ϕ and ψ are literals similar to ψ in the first task. In the second task we therefore searched for SD4ft-patterns (without condition γ) of the form

Type(B) ⋈ Type(C) : ϕ ≈ ψ

where ≈ is the SD4ft-quantifier defined as

⇒^D_{0.4} ∧ a_α ≥ 10 ∧ a_β ≥ 10

This quantifier says, among other things, that the difference of confidences is at least 0.4. We defined a set of more than [...] relevant SD4ft-patterns. Due to various optimizations, only [...] were generated and verified, in about 2 seconds; see also [5]. There are 27 SD4ft-patterns satisfying the given condition, and all of them have the attribute TP avg in the succedent. We therefore show only the three strongest ones, together with three further ones that do not contain the attribute TP avg and were found by another run of the SD4ft-Miner procedure; see Table 2.
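The check performed for each SD4ft-pattern can be sketched in a few lines. This is an illustration only, not LISp-Miner code: the function names are hypothetical, the difference of confidences is taken in absolute value (an assumption about the quantifier), and rows with a missing value are skipped, in line with the remark that missing values X are omitted. The type B table is the one from Fig. 3; the first row of the type C table is not preserved in this text, so the values a = 6 and b = 6 used below are made up for the example.

```python
def four_ft_table(rows, phi, psi):
    """Count (a, b, c, d) for Boolean attributes phi and psi.
    A predicate returning None marks a missing value; such rows are skipped."""
    a = b = c = d = 0
    for row in rows:
        p, s = phi(row), psi(row)
        if p is None or s is None:
            continue
        if p and s:
            a += 1
        elif p:
            b += 1
        elif s:
            c += 1
        else:
            d += 1
    return a, b, c, d


def confidence(table):
    """Confidence a / (a + b) of the rule phi -> psi, or None if a + b == 0."""
    a, b, _, _ = table
    return a / (a + b) if a + b else None


def sd4ft_holds(table_alpha, table_beta, min_diff=0.4, min_a=0):
    """True if the confidences differ by at least `min_diff` (assumed to be
    an absolute difference) and both a-frequencies reach `min_a`."""
    conf_alpha, conf_beta = confidence(table_alpha), confidence(table_beta)
    if conf_alpha is None or conf_beta is None:
        return False
    return (abs(conf_alpha - conf_beta) >= min_diff
            and table_alpha[0] >= min_a and table_beta[0] >= min_a)


def non_negative(column):
    """Build a predicate 'column >= 0' that propagates missing values."""
    return lambda row: None if row[column] is None else row[column] >= 0


# Toy rows with hypothetical column names; None plays the role of X.
toy_rows = [
    {"LDH_grad": 0.1, "GOT_grad": 0.2},
    {"LDH_grad": 0.3, "GOT_grad": -0.1},
    {"LDH_grad": -0.2, "GOT_grad": None},
]
print(four_ft_table(toy_rows, non_negative("LDH_grad"), non_negative("GOT_grad")))

t_b = (11, 0, 6, 5)   # T_B as given in Fig. 3
t_c = (6, 6, 0, 4)    # hypothetical: a and b are NOT from the paper
print(sd4ft_holds(t_b, t_c, min_diff=0.4))
print(sd4ft_holds(t_b, t_c, min_diff=0.4, min_a=10))
```

With these illustrative numbers, the confidences are 1.0 and 0.5, so the plain 0.4 condition holds, while the additional requirement of at least 10 rows in each a-cell fails for the made-up type C table.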
In the third task we tried to find conditions under which the difference of confidences is even stronger. We searched for SD4ft-patterns of the more general form

α ⋈ β : ϕ ≈ ψ / γ

where the condition γ was created from Sex, Age, Fibrosis and Activity. About [...] relevant patterns were verified. In total, 76 true SD4ft-patterns with a condition were found in 2 minutes and 23 seconds. Some examples of the strongest and most interesting ones are in Table 3.

Table 2. Differences concerning pairs of examinations
Columns: rule; type B (Conf_B, support a_B, %); type C (Conf_C, support a_C, %); Conf_B − Conf_C.
Rules, one per row (interval bounds as far as preserved; numeric values not preserved):
CHE avg(100; 300 → TP avg(5;
CL avg(102; 106 → TP avg(65;
K avg 4; 44) → TP avg(6;
(further rules with TP avg in the succedent skipped)
LDH grad 0; 0.5) → GPT grad 0; 0.5)
F-CHO grad 0; 0.5) → GOT grad 0; 0.5)
CL avg(103; 107 → I-BIL avg 0.3; 0.6)

Table 3. Differences concerning pairs of examinations under conditions
Columns: rule; type B (Conf_B, support a_B, %); type C (Conf_C, support a_C, %); Conf_B − Conf_C.
Condition: Age 40; Type B: 8 patients; Type C: 26 patients
CHE avg(0; 300 → TP avg(50;
Condition: Age 35; Type B: 15 patients; Type C: 28 patients
K avg 4; 44) → TP avg(60;
Oudan avg(4; 6 → T-BIL avg(0.5;
Oudan avg(4; 6 → TP avg(0.5;
Condition: Fibrosis(1,2); Type B: 18 patients; Type C: 21 patients
LDH grad 0; 0.5) → GPT grad 0; 0.5)
F-CHO grad 0; 0.5) → GPT grad 0; 0.5)
D-BIL avg 0.1; 0.3) → TP avg(50;

5 Conclusions and Further Work

We have succeeded in finding several patterns that indicate the existence of differences in trend characteristics between hepatitis type B and type C. The process is far from straightforward. First, it was necessary to transform the original data into a suitable data matrix using SumatraTT, and then the SD4ft-Miner procedure was applied several times. Some strong rules seem to appear, but interpretation of the results given in Tables 1, 2, and 3 is impossible without relevant medical knowledge. It will be very interesting to compare our results with those in [2]: several attributes have been identified as important by both approaches, namely T-BIL, CHE, GOT, GPT and TP. However, the considered set of 60 patients is too small: the restrictions applied when creating the considered data matrix do not take full advantage of all the available data.

All the steps of our approach are easy to repeat and modify, and there are many ways to do so. Based on the experience with the present data and the results of the current data mining efforts, we are planning to modify the selection criteria used in preprocessing. We believe that the suggested methodology, based on the selection of a time window related to some significant instant, could prove useful when studying the influence of interferon therapy. Further analysis should work with a new enhanced data set in which two significant instants are considered: one corresponds to the beginning of the interferon therapy, while the other is set several months after that. This setting makes it possible to study changes in time patterns due to the therapy. Moreover, measurements from the table hemat will be included. Results from this new data are under investigation now.

The project showed that the cooperation of the two tools, SumatraTT and LISp-Miner, is effective and allows a fast cycle of data preprocessing and data mining. The whole process can be easily modified and reused for different data mining tasks (e.g. the influence of interferon) and even for different datasets.

Acknowledgements

The work described here has been supported by the grant 201/05/0325 of the Czech Science Foundation and the research program No. MSM [...] Transdisciplinary Research in Biomedical Engineering II of the CTU in Prague.

References

1. Hájek, P., Havránek, T.: Mechanizing Hypothesis Formation (Mathematical Foundations for a General Theory). Springer Verlag
2. Ho, T.B. et al.: Combining temporal abstraction and data mining to study hepatitis. In: Proceedings of the Discovery Challenge 2004, A Collaborative Effort in Knowledge Discovery from Databases. Prague: University of Economics
3. Kléma, J., Nováková, L., Karel, F., Štěpánková, O.: Trend Analysis in Stulong Data. In: Proceedings of the Discovery Challenge 2004, A Collaborative Effort in Knowledge Discovery from Databases. Prague: University of Economics, 2004
4. Rauch, J., Šimůnek, M. (2000): Mining for 4ft Association Rules. In: Arikawa, S., Morishita (eds.) Discovery Science. Springer Verlag
5. Rauch, J., Šimůnek, M. (2005): An Alternative Approach to Mining Association Rules. In: Lin, T.Y., Ohsuga, S., Liau, C.J., Tsumoto, S. (eds.) Data Mining: Foundations, Methods, and Applications. Springer-Verlag, 2005 (to appear)
6. Rauch, J., Šimůnek, M. (2005): GUHA Method and Granular Computing. In: Hu, Xiaohua, Liu, Qing, Skowron, Andrzej, Lin, Tsau Young, Yager, Ronald R., Zang, Bo (eds.) Proceedings of IEEE International Conference on Granular Computing. IEEE, 2005
7. Šimůnek, M. (2003): Academic KDD Project LISp-Miner. In: Abraham, A. et al. (eds.) Advances in Soft Computing: Intelligent Systems Design and Applications. Springer, Berlin Heidelberg New York
8. Štěpánková, O., Aubrecht, P., Kouba, Z., Mikšovský, P.: Preprocessing for Data Mining and Decision Support. Kluwer Academic Publishers, Dordrecht, 2003
Advanced Techniques for Mining Structured Data: Process Mining Frequent Pattern Discovery /Event Forecasting Dr A. Appice Scuola di Dottorato in Informatica e Matematica XXXII Problem definition 1. Given
More informationIntegrated Cheminformatics to Guide Drug Discovery
Integrated Cheminformatics to Guide Drug Discovery Matthew Segall, Ed Champness, Peter Hunt, Tamsin Mansley CINF Drug Discovery Cheminformatics Approaches August 23 rd 2017 Optibrium, StarDrop, Auto-Modeller,
More informationOptimizing Abstaining Classifiers using ROC Analysis. Tadek Pietraszek / 'tʌ dek pɪe 'trʌ ʃek / ICML 2005 August 9, 2005
IBM Zurich Research Laboratory, GSAL Optimizing Abstaining Classifiers using ROC Analysis Tadek Pietraszek / 'tʌ dek pɪe 'trʌ ʃek / pie@zurich.ibm.com ICML 2005 August 9, 2005 To classify, or not to classify:
More informationDesigning and Evaluating Generic Ontologies
Designing and Evaluating Generic Ontologies Michael Grüninger Department of Industrial Engineering University of Toronto gruninger@ie.utoronto.ca August 28, 2007 1 Introduction One of the many uses of
More informationTopology Proceedings. COPYRIGHT c by Topology Proceedings. All rights reserved.
Topology Proceedings Web: http://topology.auburn.edu/tp/ Mail: Topology Proceedings Department of Mathematics & Statistics Auburn University, Alabama 36849, USA E-mail: topolog@auburn.edu ISSN: 0146-4124
More informationPattern Structures 1
Pattern Structures 1 Pattern Structures Models describe whole or a large part of the data Pattern characterizes some local aspect of the data Pattern is a predicate that returns true for those objects
More informationCommunication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi
Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking
More informationGeospatial Intelligence
Geospatial Intelligence Geospatial analysis has existed as long as humans have made and studied maps but its importance to the intelligence community has skyrocketed in the past several years, with Unmanned
More informationMIXED DATA GENERATOR
MIXED DATA GENERATOR Martin Matějka Jiří Procházka Zdeněk Šulc Abstract Very frequently, simulated data are required for quality evaluation of newly developed coefficients. In some cases, datasets with
More informationMapcube and Mapview. Two Web-based Spatial Data Visualization and Mining Systems. C.T. Lu, Y. Kou, H. Wang Dept. of Computer Science Virginia Tech
Mapcube and Mapview Two Web-based Spatial Data Visualization and Mining Systems C.T. Lu, Y. Kou, H. Wang Dept. of Computer Science Virginia Tech S. Shekhar, P. Zhang, R. Liu Dept. of Computer Science University
More informationBIOS 2041: Introduction to Statistical Methods
BIOS 2041: Introduction to Statistical Methods Abdus S Wahed* *Some of the materials in this chapter has been adapted from Dr. John Wilson s lecture notes for the same course. Chapter 0 2 Chapter 1 Introduction
More informationCalculus at Rutgers. Course descriptions
Calculus at Rutgers This edition of Jon Rogawski s text, Calculus Early Transcendentals, is intended for students to use in the three-semester calculus sequence Math 151/152/251 beginning with Math 151
More informationOn Tuning OWA Operators in a Flexible Querying Interface
On Tuning OWA Operators in a Flexible Querying Interface Sławomir Zadrożny 1 and Janusz Kacprzyk 2 1 Warsaw School of Information Technology, ul. Newelska 6, 01-447 Warsaw, Poland 2 Systems Research Institute
More informationEYE-TRACKING TESTING OF GIS INTERFACES
Geoinformatics EYE-TRACKING TESTING OF GIS INTERFACES Bc. Vaclav Kudelka Ing. Zdena Dobesova, Ph.D. Department of Geoinformatics, Palacký University, Olomouc, Czech Republic ABSTRACT Eye-tracking is currently
More informationInferring Passenger Boarding and Alighting Preference for the Marguerite Shuttle Bus System
Inferring Passenger Boarding and Alighting Preference for the Marguerite Shuttle Bus System Adrian Albert Abstract We analyze passenger count data from the Marguerite Shuttle system operating on the Stanford
More informationStandards-Based Quantification in DTSA-II Part II
Standards-Based Quantification in DTSA-II Part II Nicholas W.M. Ritchie National Institute of Standards and Technology, Gaithersburg, MD 20899-8371 nicholas.ritchie@nist.gov Introduction This article is
More informationMODULE- 07 : FLUIDICS AND FLUID LOGIC
MODULE- 07 : FLUIDICS AND FLUID LOGIC LECTURE- 26 : INTRODUCTION TO FLUID LOGIC INTRODUCTION Fluidics (also known as Fluidic logic) is the use of a fluid or compressible medium to perform analog or digital
More informationImpact of Data Characteristics on Recommender Systems Performance
Impact of Data Characteristics on Recommender Systems Performance Gediminas Adomavicius YoungOk Kwon Jingjing Zhang Department of Information and Decision Sciences Carlson School of Management, University
More informationAdministering your Enterprise Geodatabase using Python. Jill Penney
Administering your Enterprise Geodatabase using Python Jill Penney Assumptions Basic knowledge of python Basic knowledge enterprise geodatabases and workflows You want code Please turn off or silence cell
More informationLearning ArcGIS: Introduction to ArcCatalog 10.1
Learning ArcGIS: Introduction to ArcCatalog 10.1 Estimated Time: 1 Hour Information systems help us to manage what we know by making it easier to organize, access, manipulate, and apply knowledge to the
More informationAn Efficient Decision Procedure for Functional Decomposable Theories Based on Dual Constraints
An Efficient Decision Procedure for Functional Decomposable Theories Based on Dual Constraints Khalil Djelloul Laboratoire d Informatique Fondamentale d Orléans. Bat. 3IA, rue Léonard de Vinci. 45067 Orléans,
More informationRanking Verification Counterexamples: An Invariant guided approach
Ranking Verification Counterexamples: An Invariant guided approach Ansuman Banerjee Indian Statistical Institute Joint work with Pallab Dasgupta, Srobona Mitra and Harish Kumar Complex Systems Everywhere
More informationAnalysis of United States Rainfall
Analysis of United States Rainfall Trevyn Currie, Stephen Blatt CurrieTrevyn@gmail.com, SBlattJ@gmail.com Abstract Using hourly rainfall data in the United States, we used SQL to construct a data mart
More informationSPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM
SPATIAL DATA MINING Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM INTRODUCTION The main difference between data mining in relational DBS and in spatial DBS is that attributes of the neighbors
More informationPredictive Modelling of Ag, Au, U, and Hg Ore Deposits in West Texas Carl R. Stockmeyer. December 5, GEO 327G
Predictive Modelling of Ag, Au, U, and Hg Ore Deposits in West Texas Carl R. Stockmeyer December 5, 2013 - GEO 327G Objectives and Motivations The goal of this project is to use ArcGIS to create models
More informationIntroducing GIS analysis
1 Introducing GIS analysis GIS analysis lets you see patterns and relationships in your geographic data. The results of your analysis will give you insight into a place, help you focus your actions, or
More informationPatent Searching using Bayesian Statistics
Patent Searching using Bayesian Statistics Willem van Hoorn, Exscientia Ltd Biovia European Forum, London, June 2017 Contents Who are we? Searching molecules in patents What can Pipeline Pilot do for you?
More informationAbout the impossibility to prove P NP or P = NP and the pseudo-randomness in NP
About the impossibility to prove P NP or P = NP and the pseudo-randomness in NP Prof. Marcel Rémon 1 arxiv:0904.0698v3 [cs.cc] 24 Mar 2016 Abstract The relationship between the complexity classes P and
More informationDrawing Conclusions from Data The Rough Set Way
Drawing Conclusions from Data The Rough et Way Zdzisław Pawlak Institute of Theoretical and Applied Informatics, Polish Academy of ciences, ul Bałtycka 5, 44 000 Gliwice, Poland In the rough set theory
More informationModule 03 Lecture 14 Inferential Statistics ANOVA and TOI
Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module
More informationBV4.1 Methodology and User-friendly Software for Decomposing Economic Time Series
Conference on Seasonality, Seasonal Adjustment and their implications for Short-Term Analysis and Forecasting 10-12 May 2006 BV4.1 Methodology and User-friendly Software for Decomposing Economic Time Series
More informationApplying Bayesian networks in the game of Minesweeper
Applying Bayesian networks in the game of Minesweeper Marta Vomlelová Faculty of Mathematics and Physics Charles University in Prague http://kti.mff.cuni.cz/~marta/ Jiří Vomlel Institute of Information
More informationComputational Tasks and Models
1 Computational Tasks and Models Overview: We assume that the reader is familiar with computing devices but may associate the notion of computation with specific incarnations of it. Our first goal is to
More informationSound Recognition in Mixtures
Sound Recognition in Mixtures Juhan Nam, Gautham J. Mysore 2, and Paris Smaragdis 2,3 Center for Computer Research in Music and Acoustics, Stanford University, 2 Advanced Technology Labs, Adobe Systems
More informationA new Approach to Drawing Conclusions from Data A Rough Set Perspective
Motto: Let the data speak for themselves R.A. Fisher A new Approach to Drawing Conclusions from Data A Rough et Perspective Zdzisław Pawlak Institute for Theoretical and Applied Informatics Polish Academy
More informationGaussian EDA and Truncation Selection: Setting Limits for Sustainable Progress
Gaussian EDA and Truncation Selection: Setting Limits for Sustainable Progress Petr Pošík Czech Technical University, Faculty of Electrical Engineering, Department of Cybernetics Technická, 66 7 Prague
More informationan efficient procedure for the decision problem. We illustrate this phenomenon for the Satisfiability problem.
1 More on NP In this set of lecture notes, we examine the class NP in more detail. We give a characterization of NP which justifies the guess and verify paradigm, and study the complexity of solving search
More informationAdvanced Machine Learning Practical 4b Solution: Regression (BLR, GPR & Gradient Boosting)
Advanced Machine Learning Practical 4b Solution: Regression (BLR, GPR & Gradient Boosting) Professor: Aude Billard Assistants: Nadia Figueroa, Ilaria Lauzana and Brice Platerrier E-mails: aude.billard@epfl.ch,
More informationSatisfaction Equilibrium: Achieving Cooperation in Incomplete Information Games
Satisfaction Equilibrium: Achieving Cooperation in Incomplete Information Games Stéphane Ross and Brahim Chaib-draa Department of Computer Science and Software Engineering Laval University, Québec (Qc),
More informationMachine Learning 2010
Machine Learning 2010 Concept Learning: The Logical Approach Michael M Richter Email: mrichter@ucalgary.ca 1 - Part 1 Basic Concepts and Representation Languages 2 - Why Concept Learning? Concepts describe
More informationEvaluation, transformation, and parameterization of epipolar conics
Evaluation, transformation, and parameterization of epipolar conics Tomáš Svoboda svoboda@cmp.felk.cvut.cz N - CTU CMP 2000 11 July 31, 2000 Available at ftp://cmp.felk.cvut.cz/pub/cmp/articles/svoboda/svoboda-tr-2000-11.pdf
More informationLet s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc.
Finite State Machines Introduction Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc. Such devices form
More informationMathematics 1104B. Systems of Equations and Inequalities, and Matrices. Study Guide. Text: Mathematics 11. Alexander and Kelly; Addison-Wesley, 1998.
Adult Basic Education Mathematics Systems of Equations and Inequalities, and Matrices Prerequisites: Mathematics 1104A, Mathematics 1104B Credit Value: 1 Text: Mathematics 11. Alexander and Kelly; Addison-Wesley,