Pooling Experiments for High Throughput Screening in Drug Discovery
1 Pooling Experiments for High Throughput Screening in Drug Discovery Jacqueline M. Hughes-Oliver Department of Statistics North Carolina State University Spring Research Conference, June
2 Outline
- Motivation
- What is a Pooling Experiment?
  + synergism & blocking; saves money, time, materials
  − logistics are difficult; needs careful design & analysis
- Analysis of Pooling Experiments
- Issues
- Current Work
3 High Throughput Screening
- 500,000+ molecules available for screening
- < 5% are active (high potencies)
- Must find m diverse leads
  - leads → toxicity → Phase I clinical trials → etc.
- Search for Structure-Activity Relationships (SARs)
  - Relate activity to chemical structure
  - Often, n = # responses << p = # descriptors
  - Assay 1: n = 1000, p = 1873
  - Assay 2: n 500,000, p > 1 mil
- Testing done by liquid-handling robotic systems
4 State of the Art
- Test all molecules in training set
- Recursive partitioning (RP)
  - Nonlinear, fragmented relationships
  - Use hypothesis testing to split nodes
  - Excellent for n << p
  - Needs large n, since Pr(active) is small
- Make predictions for untested molecules, then do ordered testing
- Accumulation curves: # actives found vs. # tests performed
5 State of the Art
- Test all molecules in training set
- Recursive partitioning (RP)
  - Nonlinear, fragmented relationships
  - Use hypothesis testing to split nodes
  - Excellent for n << p
  - Needs large n, since Pr(active) is small
- Predict for untested molecules, then do ordered testing
- Accumulation curves: # actives found vs. # tests performed
- Can we:
  - Increase efficiency?
  - Discover combination therapies in vitro?
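The accumulation curve mentioned on this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the function name `accumulation_curve` and the inputs `scores` (predicted activity, higher means tested earlier) and `is_active` (true outcomes) are assumptions made for the example.

```python
from typing import List, Sequence


def accumulation_curve(scores: Sequence[float], is_active: Sequence[bool]) -> List[int]:
    """Cumulative number of actives found when molecules are tested
    in decreasing order of predicted score (ordered testing).

    Element t-1 of the result is the number of actives found after t tests.
    Ties keep their original order because sorted() is stable.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    # Indices of molecules, best predicted score first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve = []
    found = 0
    for i in order:
        found += int(bool(is_active[i]))
        curve.append(found)
    return curve


# Hypothetical example: four molecules, two of them active.
example_curve = accumulation_curve([0.9, 0.1, 0.8, 0.3], [True, False, False, True])
```

A good ranking produces a curve that rises steeply at the start; random testing gives, on average, a straight line from zero to the total number of actives.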
6 Pooling Experiment
- Test molecules in mixtures, not individually
7 Pooling Experiment
- Test molecules in mixtures, not individually
- HTS Plate
8 Pooling Experiment
- Test molecules in mixtures, not individually
- Figure 1: One-way pooling experiment where pooling is by column. (Panel labels: Individual Compounds, Pools.)
9 Pooling Experiment: Dorfman Assumptions
- Test molecules in mixtures, not individually
- p = Pr(active), k = pool size, n = # pools, X = # active pools
- $X \sim \mathrm{Bin}(n, \theta)$, $\theta = 1 - (1 - p)^k$
- p same for all molecules
- No errors in interpreting pooled responses
  - all molecules in pool are inactive ⇒ inactive pool
  - 1+ molecule active ⇒ active pool
- Pooling does not alter behavior of individuals
  - No degeneration of activity
  - No enhancement of activity
10 Pooling Experiment: Dorfman Assumptions Violated
- Test molecules in mixtures, not individually
- p = Pr(active), k = pool size, n = # pools, X = # active pools
- $X \sim \mathrm{Bin}(n, \theta)$, $\theta = 1 - (1 - p)^k$
- p same for all molecules (SAR)
- No errors in interpreting pooled responses
  - all molecules inactive ⇒ inactive pool (Specificity+)
  - 1+ molecule active ⇒ active pool (Sensitivity+)
- Pooling does not alter behavior of individuals
  - No degeneration of activity (Dilution? Blocking?)
  - No enhancement of activity (Additivity? Synergism?)
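As a small worked example of the Dorfman model on these slides, the sketch below computes the pool-activity probability θ = 1 − (1 − p)^k and the expected number of active pools nθ. The function names are illustrative assumptions, and the calculation holds only under the ideal assumptions listed above (common p, no errors, no interaction between pooled molecules).

```python
def pool_active_prob(p: float, k: int) -> float:
    """theta = 1 - (1 - p)^k: probability that a pool of k molecules
    contains at least one active, when each molecule is active
    independently with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if k < 1:
        raise ValueError("k must be a positive pool size")
    return 1.0 - (1.0 - p) ** k


def expected_active_pools(n: int, p: float, k: int) -> float:
    """Mean of X ~ Bin(n, theta), the number of active pools out of n."""
    return n * pool_active_prob(p, k)


# Illustrative values only: p = 0.04 and k = 10 give theta of about 0.335.
theta_example = pool_active_prob(0.04, 10)
```

A large gap between the observed fraction of active pools and this θ is one sign that the ideal assumptions do not hold.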
11 Figure 2: One-way pooling experiment where pooling is by column. Pool 1 illustrates synergism and Pool 8 illustrates blocking. Pools 4 and 11 show regular activity. (Legend: Active Compound, Inactive Compound, Blocker Compound. Panel labels: Individual Compounds, Pools.)
12 Assay 1
- y = % inhibition relative to reference molecule
- n = 100 pools, each of size k = 10
- Pooling by dissimilarity according to Burden numbers ⇒ avoid additivity
- conc. for pool = 10 × conc. for individual ⇒ avoid dilution
- Control over design
- Activity threshold for y: 60
- Active pools: 4 of 100 (4%)
- Active molecules: 40 of 1000 (4%)
13 Blocking
- [Tables: two 10 × 10 grids of individual results with a pools margin; starred entries mark the pools that illustrate blocking]
14 Synergism
- Pool along the rows, using activity thresholds 60 (individuals) and (pools)
- [Table: 10 × 10 grid of individual results with a pools margin; the starred entry marks the pool that illustrates synergism]
15 Pooling Experiment: Decoding
- Retesting?
  - Dorfman: individually test all molecules in active pools
  - Random: individually test some molecules in active pools and some molecules in inactive pools
    - Can estimate synergism and blocking probabilities
- No retesting?
  - Saves time and money
  - Lose information
  - Not a good idea
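Dorfman decoding can be sketched as follows. This is a hedged illustration under the ideal Dorfman assumptions (no blocking, synergism, or measurement error); the function names and the pool representation (a list of molecule identifiers per pool) are assumptions made for the example.

```python
from typing import Hashable, List, Sequence


def dorfman_retest_list(pools: Sequence[Sequence[Hashable]],
                        pool_active: Sequence[bool]) -> List[Hashable]:
    """Molecules to test individually: every member of every active pool."""
    if len(pools) != len(pool_active):
        raise ValueError("need one activity flag per pool")
    return [m for members, active in zip(pools, pool_active) if active
            for m in members]


def dorfman_expected_tests(n_pools: int, k: int, p: float) -> float:
    """Expected total tests: n pooled tests plus k retests for each
    active pool, where a pool is active with probability 1 - (1 - p)^k."""
    theta = 1.0 - (1.0 - p) ** k
    return n_pools * (1.0 + k * theta)


# Hypothetical plate: three pools of two molecules, only the second active.
to_retest = dorfman_retest_list([["a", "b"], ["c", "d"], ["e", "f"]],
                                [False, True, False])
```

For illustrative values n = 100, k = 10, p = 0.04, the expected total is roughly 435 tests, against 1000 tests when every molecule is screened individually. Random retesting would additionally sample molecules from inactive pools, which this sketch does not cover.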
16 Analysis of Pooling Experiments
- Nonparametric
- Fully parametric
- Semiparametric
- Model as a missing data problem; Yi et al. (2003, JSM)
- Chemical descriptors: atom pairs, BCUT numbers, mol. weight, etc.
17 Nonparametric: RP on Pools
- Pooled descriptors: binary (atom pair in pool?)
- Need large number of pools for this to be effective
- Useful for determining preliminary covariate classes for (semi-)parametric models
- Excellent indicator of synergism
18 [Figure: recursive partitioning tree on pools; node statistics n, u, s, ap, bp as labeled on the slide]
N: n=140, u=13, s=29, ap=7.28E-004, bp=6.74E-001; split on x.1348
  N1 (NO x.1348): n=52, u=24, s=38, ap=3.16E-003, bp=1.00E+000; split on x.1637
    N11 (NO x.1637): n=38, u=14, s=28, ap=9.24E-004, bp=2.45E-001; split on x.106
      N111 (NO x.106): n=31, u=8, s=17, ap=4.43E-003, bp=7.85E-001; split on x.583
        N1111 (NO x.583): n=26, u=4, s=10
        N1112 (YES x.583): n=5, u=27, s=34
      N112 (YES x.106): n=7, u=44, s=45
    N12 (YES x.1637): n=14, u=49, s=50, ap=3.42E-004, bp=6.46E-002; split on x.1392
      N121 (NO x.1392): n=7, u=88, s=42
      N122 (YES x.1392): n=7, u=9, s=7
  N2 (YES x.1348): n=88, u=7, s=21, ap=9.12E-005, bp=7.43E-002; split on x.1048
    N21 (NO x.1048): n=83, u=4, s=17
    N22 (YES x.1048): n=5, u=40, s=44
min node size is 5; splits forced based on = 140 tests
19 Atom Pairs In Tree
- [Table: individuals by class; columns: class, active, total; final row: the rest]
- Synergism in class 2?
20 [Figure: accumulation curves, Number of Actives Found vs. Number of Tests; series: Random testing; RP on pools, PT=]
- RP on only 140 tests; need more data
- Testing order within a node?
21 [Figure: accumulation curves, Number of Actives Found vs. Number of Tests; series: Random testing; RP on pools, PT=60; RP on pools, PT=]
- RP on 390 tests when PT=13.14
- Why PT=13.14?
22 Fully Parametric
- Model at the individual molecule level
  - Trinomial (active, blocker, other), with class probabilities dependent on chemical features; see Zhu et al. (2001)
  - Binomial (active or not), conditioned on interactions in a pool
    - Blocking probability same across all classes
    - Synergism probability same across all classes
    - Activity probabilities dependent on chemical features
- Scale up to obtain model on pooled responses
- Predict activities of untested molecules
- Test molecules according to rank from predictions
23 Parametric: Conditional Binomial
- For $i = 1, \ldots, n$ and $l = 1, \ldots, L$:
  - $s_{il}$ = # molecules in pool $i$ and covariate class $l$
  - $W_{il}$ = # active molecules in pool $i$ and class $l$
  - $Y_i = I(\text{pool } i \text{ active})$
- $W_{il} \sim \mathrm{Bin}(s_{il}, p_l)$, independent over $i$ and $l$
- $\sum_l s_{il} = k$
- $b = \Pr(Y_i = 0 \mid \sum_l W_{il} > 0)$, constant blocking
- $g = \Pr(Y_i = 1 \mid \sum_l W_{il} = 0)$, constant synergism
- Can also model sensitivity and specificity in this manner
24 Dorfman: test all molecules in active pools. Then
$$L(\theta) = \prod_i \phi_i^{y_i} (1 - \psi_i)^{1 - y_i},$$
where
$$\psi_i = \Pr(Y_i = 1) = (1 - b) + (g + b - 1) \prod_l (1 - p_l)^{s_{il}},$$
$$\phi_i = \begin{cases} (1 - b) \prod_l \binom{s_{il}}{w_{il}} p_l^{w_{il}} (1 - p_l)^{s_{il} - w_{il}}, & \sum_l w_{il} > 0, \\ g \prod_l (1 - p_l)^{s_{il}}, & \sum_l w_{il} = 0. \end{cases}$$
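The likelihood on this slide can be evaluated numerically along the following lines. This is a sketch under an assumed reading of the slide: φ_i is taken to be a product of binomial terms over covariate classes, scaled by (1 − b) when the pool contains actives and by g when it contains none. All function and argument names are hypothetical.

```python
import math
from typing import Optional, Sequence


def prob_no_actives(s_i: Sequence[int], p: Sequence[float]) -> float:
    """prod_l (1 - p_l)^{s_il}: chance that pool i holds no active molecule."""
    q = 1.0
    for s, p_l in zip(s_i, p):
        q *= (1.0 - p_l) ** s
    return q


def psi(s_i: Sequence[int], p: Sequence[float], b: float, g: float) -> float:
    """Pr(Y_i = 1) = (1 - b) + (g + b - 1) * prod_l (1 - p_l)^{s_il}."""
    return (1.0 - b) + (g + b - 1.0) * prob_no_actives(s_i, p)


def phi(s_i: Sequence[int], w_i: Sequence[int],
        p: Sequence[float], b: float, g: float) -> float:
    """Joint probability of an active pool and its decoded active counts w_i."""
    if sum(w_i) == 0:
        # Active pool with no active members: synergism.
        return g * prob_no_actives(s_i, p)
    out = 1.0 - b  # pool contains actives and was not blocked
    for s, w, p_l in zip(s_i, w_i, p):
        out *= math.comb(s, w) * p_l ** w * (1.0 - p_l) ** (s - w)
    return out


def log_likelihood(S: Sequence[Sequence[int]],
                   W: Sequence[Optional[Sequence[int]]],
                   y: Sequence[int],
                   p: Sequence[float], b: float, g: float) -> float:
    """Sum of y_i * log(phi_i) + (1 - y_i) * log(1 - psi_i) over pools.

    W[i] is only read for active pools (y_i = 1), since Dorfman decoding
    retests only those; it may be None for inactive pools.
    """
    total = 0.0
    for s_i, w_i, y_i in zip(S, W, y):
        if y_i == 1:
            total += math.log(phi(s_i, w_i, p, b, g))
        else:
            total += math.log(1.0 - psi(s_i, p, b, g))
    return total
```

A useful consistency check is that φ_i summed over all possible counts w_i equals ψ_i, and that setting b = g = 0 recovers the Dorfman pool probability 1 − ∏_l (1 − p_l)^{s_il}. Maximizing `log_likelihood` over (p, b, g) would require a numerical optimizer, which is outside this sketch.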
25 Assay 1: Dorfman Experiment, PT=13.14
- [Table by class; columns: Observed (Active, Total, Pr(active)) and Conditionally Binomial (Pr(active))]
- Pr(blocking) = .292
- Pr(synergism) = .101
26 [Figure: accumulation curves, Number of Actives Found vs. Number of Tests; series: Random testing; RP on pools, PT=13.14; MLE, PT=]
27 Issues
- Design of pooling experiments
  - Large flawed designs are not as informative as small but carefully selected designs (additivity, dilution); Zhu et al. (2002), Remlinger et al. (2002)
- Dilution effect may be unavoidable. Model it.
- Can we truly disentangle (synergism, blocking), (additivity, dilution), (effect of activity threshold), (sensitivity, specificity)? Yi et al. (2002)
- Variable selection under parametric models
- Large dataset
28 Summary
- Pooling experiments can be risky: pharmaceutical industry is cautious
- Pooling experiments can pay off in big ways:
  - Reduce testing costs
  - Shorten testing and development cycle
  - Discover synergistic relationships
  - Discover blocking relationships
29 Acknowledgements
- Katja Remlinger, NC State
- Bingming Yi, Merck
- Stan Young, NISS & CGStat
- Ke Zhang, NC State
- Lei Zhu, GlaxoSmithKline
30 Current Work
- Design
- Random retesting schemes
- Effect of pool threshold for activity
- Semi-parametric model, data missing at random
- Explore pairs/triplets of chemical descriptors, stochastic search
- Multiple trees
Combinatorial Heterogeneous Catalysis 650 μm by 650 μm, spaced 100 μm apart Identification of a new blue photoluminescent (PL) composite material, Gd 3 Ga 5 O 12 /SiO 2 Science 13 March 1998: Vol. 279
More informationST4241 Design and Analysis of Clinical Trials Lecture 9: N. Lecture 9: Non-parametric procedures for CRBD
ST21 Design and Analysis of Clinical Trials Lecture 9: Non-parametric procedures for CRBD Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 9, 2016 Outline Nonparametric tests
More informationIntroduction to Data Science Data Mining for Business Analytics
Introduction to Data Science Data Mining for Business Analytics BRIAN D ALESSANDRO VP DATA SCIENCE, DSTILLERY ADJUNCT PROFESSOR, NYU FALL 2014 Fine Print: these slides are, and always will be a work in
More informationMolecular Descriptors Theory and tips for real-world applications
Molecular Descriptors Theory and tips for real-world applications Francesca Grisoni University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, Milan, Italy ETH Zurich, Dept. of Chemistry
More informationA review of some semiparametric regression models with application to scoring
A review of some semiparametric regression models with application to scoring Jean-Loïc Berthet 1 and Valentin Patilea 2 1 ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France
More informationAdvanced Medicinal Chemistry SLIDES B
Advanced Medicinal Chemistry Filippo Minutolo CFU 3 (21 hours) SLIDES B Drug likeness - ADME two contradictory physico-chemical parameters to balance: 1) aqueous solubility 2) lipid membrane permeability
More informationDecision Trees. CS57300 Data Mining Fall Instructor: Bruno Ribeiro
Decision Trees CS57300 Data Mining Fall 2016 Instructor: Bruno Ribeiro Goal } Classification without Models Well, partially without a model } Today: Decision Trees 2015 Bruno Ribeiro 2 3 Why Trees? } interpretable/intuitive,
More informationQuality control analytical methods- Switch from HPLC to UPLC
Quality control analytical methods- Switch from HPLC to UPLC Dr. Y. Padmavathi M.pharm,Ph.D. Outline of Talk Analytical techniques in QC Introduction to HPLC UPLC - Principles - Advantages of UPLC - Considerations
More informationLoglikelihood and Confidence Intervals
Stat 504, Lecture 2 1 Loglikelihood and Confidence Intervals The loglikelihood function is defined to be the natural logarithm of the likelihood function, l(θ ; x) = log L(θ ; x). For a variety of reasons,
More informationDescribing Contingency tables
Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds
More informationA3. Statistical Inference Hypothesis Testing for General Population Parameters
Appendix / A3. Statistical Inference / General Parameters- A3. Statistical Inference Hypothesis Testing for General Population Parameters POPULATION H 0 : θ = θ 0 θ is a generic parameter of interest (e.g.,
More informationKeywords: anti-coagulants, factor Xa, QSAR, Thrombosis. Introduction
PostDoc Journal Vol. 2, No. 3, March 2014 Journal of Postdoctoral Research www.postdocjournal.com QSAR Study of Thiophene-Anthranilamides Based Factor Xa Direct Inhibitors Preetpal S. Sidhu Department
More informationClass 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio
Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant
More information