Determining Optimum Path in Synthesis of Organic Compounds using Branch and Bound Algorithm

Similar documents
Chapter 3: Cluster Analysis

Administrativia. Assignment 1 due thursday 9/23/2004 BEFORE midnight. Midterm exam 10/07/2003 in class. CS 460, Sessions 8-9 1

Department of Economics, University of California, Davis Ecn 200C Micro Theory Professor Giacomo Bonanno. Insurance Markets

This section is primarily focused on tools to aid us in finding roots/zeros/ -intercepts of polynomials. Essentially, our focus turns to solving.

, which yields. where z1. and z2

CHAPTER Read Chapter 17, sections 1,2,3. End of Chapter problems: 25

CHAPTER 3 INEQUALITIES. Copyright -The Institute of Chartered Accountants of India

You need to be able to define the following terms and answer basic questions about them:

We can see from the graph above that the intersection is, i.e., [ ).

Pattern Recognition 2014 Support Vector Machines

Admin. MDP Search Trees. Optimal Quantities. Reinforcement Learning

Graduate AI Lecture 16: Planning 2. Teachers: Martial Hebert Ariel Procaccia (this time)

Five Whys How To Do It Better

Physics 2B Chapter 23 Notes - Faraday s Law & Inductors Spring 2018

Thermodynamics Partial Outline of Topics

CHEM Thermodynamics. Change in Gibbs Free Energy, G. Review. Gibbs Free Energy, G. Review

Engineering Decision Methods

A New Evaluation Measure. J. Joiner and L. Werner. The problems of evaluation and the needed criteria of evaluation

x x

Lab #3: Pendulum Period and Proportionalities

Public Key Cryptography. Tim van der Horst & Kent Seamons

Section 6-2: Simplex Method: Maximization with Problem Constraints of the Form ~

Lecture 5: Equilibrium and Oscillations

Preparation work for A2 Mathematics [2017]

The Law of Total Probability, Bayes Rule, and Random Variables (Oh My!)

Intelligent Pharma- Chemical and Oil & Gas Division Page 1 of 7. Global Business Centre Ave SE, Calgary, AB T2G 0K6, AB.

Thermodynamics and Equilibrium

NAME: Prof. Ruiz. 1. [5 points] What is the difference between simple random sampling and stratified random sampling?

2004 AP CHEMISTRY FREE-RESPONSE QUESTIONS

Lab 1 The Scientific Method

Chapter Summary. Mathematical Induction Strong Induction Recursive Definitions Structural Induction Recursive Algorithms

Matter Content from State Frameworks and Other State Documents

Activity Guide Loops and Random Numbers

Sequential Allocation with Minimal Switching

Lecture 17: Free Energy of Multi-phase Solutions at Equilibrium

Medium Scale Integrated (MSI) devices [Sections 2.9 and 2.10]

CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank

A Few Basic Facts About Isothermal Mass Transfer in a Binary Mixture

IAML: Support Vector Machines

In the OLG model, agents live for two periods. they work and divide their labour income between consumption and

Computational modeling techniques

4 Fe + 3 O 2 2 Fe 2 O 3

Kinetic Model Completeness

Assessment Primer: Writing Instructional Objectives

Inflow Control on Expressway Considering Traffic Equilibria

CESAR Science Case The differential rotation of the Sun and its Chromosphere. Introduction. Material that is necessary during the laboratory

Differentiation Applications 1: Related Rates

LHS Mathematics Department Honors Pre-Calculus Final Exam 2002 Answers

When a substance heats up (absorbs heat) it is an endothermic reaction with a (+)q

Unit 9: The Mole- Guided Notes What is a Mole?

CS 477/677 Analysis of Algorithms Fall 2007 Dr. George Bebis Course Project Due Date: 11/29/2007

CHM112 Lab Graphing with Excel Grading Rubric

Experiment #3. Graphing with Excel

B. Definition of an exponential

Find this material useful? You can help our team to keep this site up and bring you even more content consider donating via the link on our site.

T Algorithmic methods for data mining. Slide set 6: dimensionality reduction

Bootstrap Method > # Purpose: understand how bootstrap method works > obs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(obs) >

Support-Vector Machines

Unit 14 Thermochemistry Notes

CHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came.

Dataflow Analysis and Abstract Interpretation

UN Committee of Experts on Environmental Accounting New York, June Peter Cosier Wentworth Group of Concerned Scientists.

Physics 2010 Motion with Constant Acceleration Experiment 1

ANSWER KEY FOR MATH 10 SAMPLE EXAMINATION. Instructions: If asked to label the axes please use real world (contextual) labels

NUROP CONGRESS PAPER CHINESE PINYIN TO CHINESE CHARACTER CONVERSION

5 th grade Common Core Standards

AP CHEMISTRY CHAPTER 6 NOTES THERMOCHEMISTRY

The steps of the engineering design process are to:

20 Faraday s Law and Maxwell s Extension to Ampere s Law

Large Sample Hypothesis Tests for a Population Proportion

Introduction to Spacetime Geometry

Materials Engineering 272-C Fall 2001, Lecture 7 & 8 Fundamentals of Diffusion

I. Analytical Potential and Field of a Uniform Rod. V E d. The definition of electric potential difference is

Preparation work for A2 Mathematics [2018]

AP Literature and Composition. Summer Reading Packet. Instructions and Guidelines

k-nearest Neighbor How to choose k Average of k points more reliable when: Large k: noise in attributes +o o noise in class labels

LECTURE NOTES. Chapter 3: Classical Macroeconomics: Output and Employment. 1. The starting point

Find this material useful? You can help our team to keep this site up and bring you even more content consider donating via the link on our site.

MODULE FOUR. This module addresses functions. SC Academic Elementary Algebra Standards:

Tree Structured Classifier

NUMBERS, MATHEMATICS AND EQUATIONS

Chapters 29 and 35 Thermochemistry and Chemical Thermodynamics

A proposition is a statement that can be either true (T) or false (F), (but not both).

ECEN 4872/5827 Lecture Notes

Tutorial 3: Building a spectral library in Skyline

Building Consensus The Art of Getting to Yes

COMP 551 Applied Machine Learning Lecture 11: Support Vector Machines

Floating Point Method for Solving Transportation. Problems with Additional Constraints

In SMV I. IAML: Support Vector Machines II. This Time. The SVM optimization problem. We saw:

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Getting Involved O. Responsibilities of a Member. People Are Depending On You. Participation Is Important. Think It Through

AP Statistics Notes Unit Two: The Normal Distributions

Section 5.8 Notes Page Exponential Growth and Decay Models; Newton s Law

Department of Electrical Engineering, University of Waterloo. Introduction

Checking the resolved resonance region in EXFOR database

How do scientists measure trees? What is DBH?

Reinforcement Learning" CMPSCI 383 Nov 29, 2011!

CHAPTER 8b Static Equilibrium Units

Testing Groups of Genes

The standards are taught in the following sequence.

Transcription:

Determining Optimum Path in Synthesis f Organic Cmpunds using Branch and Bund Algrithm Diastuti Utami 13514071 Prgram Studi Teknik Infrmatika Seklah Teknik Elektr dan Infrmatika Institut Teknlgi Bandung, Jl. Ganesha 10 Bandung 40132, Indnesia 13514071@std.stei.itb.ac.id Abstract Synthesis f an rganic cmpund frm a certain cmpund ffers a wide variety f pssibility. Getting the ptimum yield in a synthesis reactin has been a cncern fr many years in the field f rganic chemistry. Smetimes, the reactin can be mre cmplicated that it requires sme time t determine the mst efficient path. Nwadays, there is a number f sftware that culd assist in determining the ptimum path, such as sftware ffering a database f reactins that culd be used in path determining. In this paper, we will discuss hw branch and bund algrithm culd find ptimum path in synthesis f rganic cmpund(s). Keywrds branch and bund; cmputatinal chemistry;branch and bund algrithm; I. INTRODUCTION Organic chemistry is a field f chemistry that studies cmpunds and substances made f a chain f carbn atms. Reactins in rganic chemistry ften cnsidered unique because it have distinct mechanism and almst always nt yielding 100% prducts. The thing with rganic reactins is that it takes a lt f time. Therefre, catalysts are ften invlved in many reactins, but as we knew, the cst f catalyst ften nt cheap. Anther thing is, the reactin is nt always effective. Suppse we want t synthesize cmpund A frm cmpund B, it will nly yield 45% cmpund A, meaning that if we predict we can get, say, ne mle f A, in reality we nly get 0.45 mles f A. Organic cmpunds are ften used in industrial scales, yet the prcess t synthesize an rganic cmpund ften invlves many steps and takes a lt f time. On the ther hand, there are many ways pssible t synthesize an rganic cmpund. This results in a challenge f getting the mst efficient way t synthesize an rganic cmpund. Anther challenge is t pen all the pssibilities f reactins. This is essential t determine the ptimum path, a synthesis reactin ften has many pssibilities, but t create a cmprehensive database f rganic reactins is anther challenge. Furthermre, we have t match the cmpund we want t search with the database that s anther thing. The breakthrugh in cmputatinal chemistry allws an insight t slve the challenge. Nwadays, many sftware ffer assistant n searching the pssible reactins, as database f rganic reactins have been develped everywhere. Out f many pssibilities t determine the mst efficient way t synthesize an rganic cmpund, we will see hw the branch and bund algrithm slves the prblem. II. THEORY OF BRANCH AND BOUND ALGORITHM Branch and bund is a state space search methd in which all the children f a nde are generated befre expanding any f its children. It is similar t backtracking technique but uses BFS-like search. In branch and bund, we will use the term live-nde t describe a nde that has nt been expanded, dead nde which is a nde that has been expanded, and slutin nde. Figure 1 Synthesis f benzcaine frm tluene Makalah IF2211 Strategi Algritma, Semester II Tahun 2015/2016

Figure 2 FIFO and LIFO Branch & Bund The searching used in branch and bund is called least-cst search (LC). Least-cst search have sme attributes: The selectin rule fr the next E-nde in FIFO r LIFO branch-and-bund is smetimes blind. i.e. the selectin rule des nt give any preference t a nde that has a very gd chance f getting the search t an answer nde quickly. The search fr an answer nde can ften be speeded by using an intelligent ranking functin, als called an apprximate cst functin C Expanded-nde (E-nde): is the live nde with best C value Requirements: Branching: A set f slutins, which is represented by a nde, can be partitined int mutually exclusive sets. Each subset in the partitin is represented by a child f the riginal nde. Lwer bunding: An algrithm is available fr calculating a lwer bund n the cst f any slutin in a given subset. Example is 8-puzzle, where: Cst functin: C = g(x) +h(x), where h(x) = the number f misplaced tiles and g(x) = the number f mves s far Assumptin: mve ne tile in any directin cst 1 Figure 3 8-puzzle branch and bund Glbal branch and bund algrithm: 1. Insert rt nde t queue Q. If rt nde is a slutin nde, then stp. 2. If Q empty and n slutin fund, then stp. 3. If Q nt empty, select frm Q a nde i that have minimum cst C(i). if there is mre than 1 nde that satisfies the cnditin, chse ne randmly. 4. If nde i is a slutin nde, then stp. Else, generate all the child ndes. If there is n child fund, return t step 2. 5. Fr each child j frm nde i, cunt C(j), and insert the children t Q. 6. Return t step 2. The algrithm in this paper is similar t the knapsack prblem. Knapsack prblem is a prblem where: Input Capacity K Gal n items with weights w i and values v i Output a set f items S such that the sum f weights f items in S is at mst K and the sum f values f items in S is maximized Makalah IF2211 Strategi Algritma, Semester II Tahun 2015/2016

Where: Figure 4 Tree fr a knapsack prblem Prperties f a knapsack tree as fllws: Left child is always xi = 1 and right child is always xi = 0 Bunding functin t prune tree: At a live nde in the tree, if we can estimate the upper bund (best case) prfit at that nde, and if that upper bund is less than the prfit f an actual slutin fund already, then we dn t need t explre that nde. We can use the greedy knapsack as ur bund functin as it gives an upper bund, since the last item in the knapsack is usually fractinal Greedy algrithms are ften gd ways t cmpute upper (ptimistic) bunds n prblems, e.g., fr jb scheduling with varying jb times, we can cut each jb int equal length parts and use the greedy jb scheduler t get an upper bund Linear prgrams that treat the 0-1 variables as cntinuus between 0 and 1 are ften anther gd chice Suppse we have a knapsack prblem like in the table abve, where maximum weight is 110. This generate a slutin tree: Numbers inside a nde are prfit and weight at that nde, based n decisins frm rt t that nde Ndes withut numbers inside have same values as their parent Numbers utside the nde are upper bund calculated by greedy algrithm Upper bund fr every feasible left child (xi =1) is same as its parent s bund Chain f left children in tree is same as greedy slutin at that pint in the tree We nly recmpute the upper bund when we can t mve t a feasible left child Final prfit and final weight (lwer bund) are updated at each leaf nde reached by algrithm Slutin imprves at each leaf nde reached N further leaf ndes reached after D because lwer bund (ptimal value) is sufficient t prune all ther tree branches befre leaf is reached By using flr f upper bund at ndes E and F, we avid generating the tree belw either nde Since ptimal slutin must be integer, we can truncate upper bunds By truncating bunds at E and F t 159, we avid explring E and F Branch and bund algrithm can als use depth first search. Depth first search is used in cmbinatin with breadth first search in many prblems. Cmmn strategy is t use depth first search n ndes that have nt been pruned. This gets t a leaf nde, and a feasible slutin, which is a lwer bund that can be used t prune the tree in cnjunctin with the greedy upper bunds. If greedy upper bund is less than lwer bund, prune the tree. Once a nde has been pruned, breadth first search is used t mve t a different part f the tree. Depth first search bunds tend t be very quick t cmpute if we mve dwn the tree sequentially, e.g. the greedy bund desn t need t be recmputed. Linear prgram as bunds are ften quick t: few simplex pivts. III. BRANCH AND BOUND IN SYNTHESIS REACTIONS First ff, we shuld determine hw we will cunt the bund. The factrs related t the bund include yield, time, and cst. Because here we can t predict the cst, the bund will be determined using yield and time nly. Let s take a lk at this simple reactin diagram. We can make a table ut f it, cnsisting reagents, prducts, yield, time, and yield/time rati. In this table C stands fr cmpund, s C 1 means cmpund number 1, and s n. We will assume that the alkyl R is CH 3. Figure 5 Generated tree frm the knapsack prblem Makalah IF2211 Strategi Algritma, Semester II Tahun 2015/2016

In real life, the diagram was generated thrugh a searching in the database. A user will input a final prduct and sme pssible reagent that he/she has, and then frm the database, we will get a few pssibilities available frm reagents t prduce a certain prduct. The reactin path is represented with a directed graph. Reagent Prduct Yield Time(m) Yield/Time C 2 C 1 90% 60 1.500 C 2 C 3 82% 35 2.343 C 2 C 4 78% 75 1.040 C 2 C 5 75% 120 0.625 C 2 C 7 85% 80 1.063 C 3 C 6 62% 95 0.653 C 3 C 7 46% 120 0.383 C 5 C 2 70% 120 0.583 C 7 C 6 73% 34 3.042 C 7 C 8 80% 40 2 C 7 C 9 79% 60 1.317 C 7 C 10 60% 55 1.091 C 7 C 11 55% 25 2.200 C 9 C 12 74% 20 3.700 C 9 C 13 92% 30 3.067 Table 1 Yield and time data table f reactin map in figure 6 Figure 6 Reactin Map f Armatic Cmpunds As this prblem similar t the knapsack prblem, we can derived the bund frmula as bund = ttal yield + (( time time taken) * next best yield/time) with ttal yield = current yield * selected reactin yield There are a few differences frm the knapsack prblem, such as: The next best yield/time can nly be determined by accessible cmpund frm the reagent cmpund. The final bund calculatin use the yield/time fr final nde t replace the next best yield/time Makalah IF2211 Strategi Algritma, Semester II Tahun 2015/2016

We assume that the ndes accessible frm a certain nde have been determined when we retrieved the data frm the given database. If there was n further reactin pssible frm a nde, and the nde isn t slutin nde, then deactivate the nde. Suppse we want t synthesize cmpund 13, and we dn t have any cmpund 7 yet (and later cmpunds), there are few pssibilities. First, we can start frm cmpund 2, 3, r 5. The prcess f getting the result will be summarized n the table belw. C 0 Expanded Ndes C 3, C 2 C 27 C 279 Active Ndes C 2 = 0 + (969 * 2.343) = 2270.367 C 5 = 0 + (969 * 0.583) = 564.927 C 3 = 0 + (969 * 0.653) = 632.757 C 36 = 0.62 + ((969 95) * 0) (dead because n further reactin pssible) C 37 = 0.46 + ((969-120) * 3.042) = 2583.118 - C 21 = 0.9 + ((969 60) * 0) = 0.9 (dead because n further reactin pssible) C 24 = 0.78 + ((969 75) * 0) = 0.78 (dead because n further reactin pssible) C 27 = 0.85 + ((969 80) * 3.042) = 2705.188 (C 5 dead because reactin pssible is nly t cmpund 2 and cmpund 2 already accessible frm the start) C 37 = 2583.118 C 276 = 0.85 * 0.73 + ((969 80 34) * 0) = 0.621 (dead because n further reactin pssible) C 278 = 0.68 (dead because n further reactin pssible) C 279 = 3067.9715 C 2710 = 0.51 (dead because n further reactin pssible) C 2711 = 0.468 (dead because n further reactin pssible) C 37 = 2583.118 C 27912 = 0.497 (dead because n further reactin pssible) C 27913 Slutin fund, bund = 2451.151 C 37 C 379 C 27913 C 27913 = 2451.151 C 378 = 0.368 (dead because n further reactin pssible) C 379 = 2919.6634 C 3710 = 0.276 (dead because n further reactin pssible) C 3711 = 0.253 (dead because n further reactin pssible) C 27913 = 2451.151 C 37913 = 2328.187 Final slutin fund. Nde C 37913 dead because the bund is lwer. N ther nde active. Table 2 Branch and Bund Decmpsitin f the Armatic Reactin Frm the table abve, we can determine the slutin is C 2 C 7 C 9 C 13, which has the ptimum path cnsidering the yield and time taken. We shuld nte that the result n the table abve is generated by nly cnsidering yield (number f prduct that can be generated) and time taken. In reality, many things shuld be taken int accunt and there shuld be a way t make a quantificatin f thse things, such as cst, accessibility t the prduct, and s n. By lking at the diagram, we culd see that this result is valid because the nly pssible way t synthesize cmpund 13 is by is C 2 C 7 C 9 C 13 r is C 3 C 7 C 9 C 13, and it is clear that the yield and time taken in via C 2 is much mre effective than via C 3. In real life, the algrithm itself can be applied t a mre cmplicated reactin map, such as: Figure 7 A Mre Cmplicated Reactin Paths which wn t be discussed here because there are numerus pssibilities and it will be t lng. Makalah IF2211 Strategi Algritma, Semester II Tahun 2015/2016

IV. CONCLUSION By using branch and bund algrithm strategy, the bund can be cunted by using the frmula: bund = ttal yield + (( time time taken) * next best yield/time) with ttal yield = current yield * selected reactin yield and is prven valid as it culd generate the ptimum path f the reactin n figure 6. The branch and bund algrithm can find the ptimum path in rganic reactins, given many pssibilities, yield, and time taken. This algrithm can be imprved by cnsidering the cst r mney t underg a reactin. This is als essential because in industrial wrld, we shuld make a prductin nt nly time effective, but als cst effective. V. ACKNOWLEDGEMENT I thank Allah SWT fr his will s I can finish this paper and my parents wh had always supprted me. I als want t send my gratitude t Mr. Rinaldi Munir and Mrs. Nur Ulfa Maulidevi as IF 2211 lecturers fr bth have taught Algrithm Strategy. I wuld als like t thank all the peple wh were invlved in the prcess f writing this paper. Thank yu fr the peple wh have helped me bth directly and indirectly. I hpe this paper I wrte will help peple wh read it t gain mre knwledge like I did while writing it. REFERENCES [1] Munir, Rinaldi. Diktat Kuliah IF2211 Strategi Algritma. Bandung: Teknik Infrmatika Institut Teknlgi Bandung, 2009. [2] Slmn, Graham T.W. Organic Chemistry. Jhn Wiley & Sns, 2011. [3] J, Clayden, et al. Organic Chemistry. Ocfrd, 2001 [4] Zumdahl, Steven S, Susan A. Zumdahl. General Chemistry. Brks & Cle, 2010 [5] http://cw.mit.edu/curses/civil-and-envirnmental-engineering/1-204- cmputer-algrithms-in-systems-engineering-spring-2010/lecturentes/mit1_204s10_lec16.pdf [6] https://www.seas.gwu.edu/~bell/csci212/branch_and_bund.pdf [7] http://stanfrd.edu/class/ee364b/lectures/bb_slides.pdf STATEMENT I hereby declare that this paper is my wn wrk and nt a cpy, translatin, nr plagiarism f smebdy else s wrk. Bandung, May 9 2016 Diastuti Utami 13514071 Makalah IF2211 Strategi Algritma, Semester II Tahun 2015/2016