How to Make the Most Out of Evolving Information: Function Identification Using Epi-Splines
1 How to Make the Most Out of Evolving Information: Function Identification Using Epi-Splines Johannes O. Royset Operations Research, NPS in collaboration with L. Bonfiglio, S. Buttrey, S. Brizzolara, J. Carbaugh, I. Rios, D. Singham, K. Spurkel, G. Vernengo, J-P. Watson, R. Wets, D. Woodruff Smart Energy and Stochastic Optimization, ENPC ParisTech June 2015 This material is based upon work supported in part by the U.S. Army Research Laboratory and the U.S. Army Research Office under grants W911NF and W911NF, as well as by DARPA under HR 1 / 126
2 Goal: estimate, predict, forecast 2 / 126
3 Copper prices [figure: copper price (USD) over months] 3 / 126
4 Copper prices [figure: real prices with E(hist+market) and E(hist+market) ± σ bands over months] 4 / 126
5 Hourly electricity loads 5 / 126
6 Hard information: data Observations Usually scarce, excessive, corrupted, uncertain 6 / 126
7 Soft information Indirect observations, external information Knowledge about structures established laws physical restrictions shapes, smoothness of curves 7 / 126
8 Evolving information data acquisition, information growth 8 / 126
9 Function Identification Problem Identify a function that best represents all available information 9 / 126
10 Applications of function identification approach energy, natural resources, financial markets, image reconstruction, uncertainty quantification, variograms, nonparametric regression, deconvolution, density estimation 10 / 126
11 Outline Illustrations and formulations (slides 12-33) Framework (slides 34-60) Epi-splines and approximations (slides 61-77) Implementations (slides 78-85) Examples (slides ) 11 / 126
12 Illustrations and formulations 12 / 126
13 Identifying financial curves Estimate discount factor curve given instruments i = 1, 2, ..., I and payments p^i_0, p^i_1, ..., p^i_{N_i} at times 0 = t^i_0, t^i_1, ..., t^i_{N_i} Identify nonnegative, nonincreasing function f with f(0) = 1 and Σ_{j=0}^{N_i} f(t^i_j) p^i_j ≥ 0 for all i Constrained infinite-dimensional optimization problem: minimize Σ_{i=1}^I Σ_{j=0}^{N_i} f(t^i_j) p^i_j such that f(0) = 1, f ≥ 0, f nonincreasing 13 / 126
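Discretized on the payment dates, the discount-curve problem above becomes a linear program. A minimal sketch with toy instruments (the payment data and grid below are hypothetical, purely for illustration), assuming scipy is available:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical example: identify a discount factor curve f on a time grid.
# f must satisfy f(0) = 1, f >= 0, f nonincreasing, and each instrument's
# discounted payments must be nonnegative; minimize total discounted value.
times = np.array([0.0, 0.5, 1.0, 1.5, 2.0])          # payment dates (grid)
# payments[i, j] = payment of instrument i at times[j] (toy data)
payments = np.array([
    [-0.98, 0.03, 1.03, 0.0, 0.0],   # 1-year bond bought at 0.98
    [-0.96, 0.02, 0.02, 0.02, 1.02], # 2-year bond bought at 0.96
])
n = len(times)

# Objective: minimize sum_i sum_j f(t_j) p^i_j = (sum_i p^i) . f
c = payments.sum(axis=0)

# Inequalities in the form A_ub x <= b_ub:
# (a) pricing: -sum_j p^i_j f(t_j) <= 0      for each instrument i
# (b) monotone: f(t_{j+1}) - f(t_j) <= 0     (nonincreasing)
A_mono = np.zeros((n - 1, n))
for j in range(n - 1):
    A_mono[j, j], A_mono[j, j + 1] = -1.0, 1.0
A_ub = np.vstack([-payments, A_mono])
b_ub = np.zeros(A_ub.shape[0])

# Equality: f(0) = 1
A_eq = np.zeros((1, n)); A_eq[0, 0] = 1.0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * n)
print(res.x)  # discount factors at the grid times
```

The epi-spline machinery of the later slides generalizes this finite grid to a genuine function class; the LP above only captures the finite-dimensional skeleton.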
16 Day-ahead electricity load forecast Data: predicted temperature, dew point; observed load 14 / 126
17 Day-ahead load forecast problem Using relevant data and a weather forecast for tomorrow, determine tomorrow's load curve and its uncertainty j = 1, 2, ..., J: days in data set h = 1, 2, ..., 24: hours of the day t^j_h: predicted temperature in hour h of day j d^j_h: predicted dew point in hour h of day j l^j_h: actual observed load in hour h of day j 15 / 126
23 Day-ahead load forecast problem (cont.) Functional regression model: l^j_h = f_tmp(h) t^j_h + f_dpt(h) d^j_h + e^j_h, where the regression coefficients f_tmp and f_dpt are functions of time and e^j_h is the error between the observed and predicted loads Regression problem: find functions f_tmp and f_dpt on [0,24] that minimize Σ_{j=1}^J Σ_{h=1}^{24} | l^j_h − [f_tmp(h) t^j_h + f_dpt(h) d^j_h] | subject to smoothness and curvature conditions Constrained infinite-dimensional optimization problem 16 / 126
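A finite-dimensional sketch of this functional regression, using synthetic data and a least-squares error measure with a difference penalty standing in for the smoothness conditions (the data, the penalty weight `lam`, and the squared-error choice are all illustrative assumptions, not the talk's exact formulation):

```python
import numpy as np

# Hypothetical sketch: fit hourly regression coefficients f_tmp(h), f_dpt(h)
# by least squares with a smoothness penalty on adjacent hours.
rng = np.random.default_rng(0)
J, H = 60, 24                       # days, hours
t = rng.normal(15, 5, (J, H))       # predicted temperature (toy data)
d = rng.normal(8, 3, (J, H))        # predicted dew point (toy data)
l = 2.0 * t + 0.5 * d + rng.normal(0, 1, (J, H))  # observed load

lam = 10.0                          # smoothness weight (a modeling choice)
# Unknowns: x = [f_tmp(1..24), f_dpt(1..24)]; stack data and penalty rows.
rows, rhs = [], []
for j in range(J):
    for h in range(H):
        row = np.zeros(2 * H)
        row[h], row[H + h] = t[j, h], d[j, h]
        rows.append(row); rhs.append(l[j, h])
# Penalty rows: sqrt(lam) * (f(h+1) - f(h)) ~ 0 for each coefficient function
for base in (0, H):
    for h in range(H - 1):
        row = np.zeros(2 * H)
        row[base + h], row[base + h + 1] = -np.sqrt(lam), np.sqrt(lam)
        rows.append(row); rhs.append(0.0)
A, b = np.array(rows), np.array(rhs)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
f_tmp, f_dpt = x[:H], x[H:]
```

With the generating coefficients 2.0 and 0.5 built into the toy data, the recovered hourly curves hover around those values; the epi-spline approach replaces the 24 discrete coefficients with genuine functions on [0,24].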
26 Day-ahead load forecast problem (cont.) Given temperature t and dew point d forecasts (functions of time) for tomorrow, the load forecast becomes f_tmp(h) t(h) + f_dpt(h) d(h), h ∈ [0,24] But what about uncertainty? 17 / 126
28 Estimation of errors For hour h, the minimized errors e^j_h = l^j_h − [f_tmp(h) t^j_h + f_dpt(h) d^j_h], j = 1, ..., J, are samples from the actual error density. But that alone wouldn't capture the uncertainty one would expect [figure: error density at hour h] A probability density estimation problem with a very small data set 18 / 126
30 Estimation of errors (cont.) Probability density estimation problem: given an iid sample x^1, x^2, ..., x^ν, find a function h ∈ H that maximizes Π_{i=1}^ν h(x^i) H = constraints on h: h ≥ 0, ∫ h(x) dx = 1, and more Constrained infinite-dimensional optimization problem 19 / 126
33 Estimation of errors (cont.) 20 / 126
34 Generating load scenarios Also need to consider conditioning: if the load is above the regression function early in the day, it is likely to be above it later; the error data for density estimation is reduced further 21 / 126
36 Forecast precision (Connecticut) Season / mean % error: Fall 3.99, Spring 2.73, Summer 4.19, Winter [value not recovered] 22 / 126
37 Uncertainty quantification (UQ) Engineering, biological, physical systems Input: random vector V (known distribution) System function g, implicitly defined e.g. by simulation Output: random variable X = g(V) 23 / 126
38 Illustration of UQ challenges Output amplitude of a dynamical system [figure: density of X] How to estimate densities like this with a small sample? 24 / 126
39 Probability density estimation Sample X^1, ..., X^ν: maximize Π_{i=1}^ν h(X^i) s.t. h ∈ H^ν ∩ ℍ Sample size might be growing Constraint set H^ν might be evolving 25 / 126
40 Evolving probability density estimation maximize ( Π_{i=1}^ν h(X^i) )^{1/ν} s.t. h ∈ H^ν ∩ ℍ equivalently: maximize (1/ν) Σ_{i=1}^ν log h(X^i) s.t. h ∈ H^ν ∩ ℍ actual problem: maximize E[log h(X)] s.t. h ∈ H ∩ ℍ 26 / 126
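The equivalence of the two evolving criteria rests on log being increasing: the scaled geometric mean and the mean log-likelihood rank every candidate density the same way. A quick numerical check with two toy densities on [0,1] (both densities and the sample are illustrative):

```python
import numpy as np

# The geometric-mean and mean-log criteria rank candidate densities
# identically, since t -> log t is increasing and the two criteria are
# related by geo(h) = exp(meanlog(h)).
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 50)
h1 = lambda v: 2 * v            # a density on [0, 1]
h2 = lambda v: 3 * v ** 2       # another density on [0, 1]
geo = lambda h: np.prod(h(x)) ** (1 / len(x))
meanlog = lambda h: np.mean(np.log(h(x)))
# same ordering under both criteria:
assert (geo(h1) > geo(h2)) == (meanlog(h1) > meanlog(h2))
```

The log form matters computationally: the raw product underflows for large ν, while the mean log-likelihood stays well scaled and is concave in h.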
43 Evolving constraints Constraints: nonnegative support, smooth, curvature bound [figure: true density vs. epi-spline, kernel, and empirical estimates; MSE values for epi-spline and kernel not recovered] 27 / 126
44 Evolving constraints (cont.) Also log-concave [figure: same comparison] 28 / 126
45 Evolving constraints (cont.) Also nonincreasing [figure: same comparison] 29 / 126
46 Evolving constraints (cont.) Also slope bounds [figure: same comparison] 30 / 126
47 Take-aways so far Challenges in forecasting and estimation are function identification problems Day-ahead forecasting of electricity demand involves functional regression to get trend probability density estimation to get errors Soft information supplements hard information (data) 31 / 126
48 Questions? 32 / 126
49 Outline Illustrations and formulations (slides 12-33) Framework (slides 34-60) Epi-splines and approximations (slides 61-77) Implementations (slides 78-85) Examples (slides ) 33 / 126
50 Framework 34 / 126
51 Actual problem Constrained infinite-dimensional optimization problem: min ψ(f) such that f ∈ F ∩ 𝔽, within some function space criterion ψ (norms, error measures, etc.) feasible set F (shape restrictions, external info) subspace of interest 𝔽 (continuity, smoothness) 35 / 126
55 Special case: linear regression [figure: food expenditure vs. household income with least-squares, 0.75-quantile, and superquantile regression lines] find affine f that minimizes Σ_j (y^j − f(x^j))² or other error measures 36 / 126
56 Special case: interpolation [figure: data with smoothing and interpolating splines] min ‖f″‖² such that f(x^j) = y^j for all j and f in a Sobolev space 37 / 126
57 Special case: smoothing [figure: data with smoothing and interpolating splines] min Σ_j (y^j − f(x^j))² + λ‖f″‖² such that f in a Sobolev space 38 / 126
58 Evolving problems Evolving infinite-dimensional optimization problems: min ψ^ν(f) such that f ∈ F^ν ∩ 𝔽, within some function space evolving/approximating criterion ψ^ν evolving/approximating feasible set F^ν 39 / 126
60 Function space: extended real-valued lower semicontinuous functions on IR^n [figure: a lower semicontinuous function f] (lsc-fcns(IR^n), dl): complete separable metric space, dl = epi-distance 41 / 126
63 Modeling possibilities with lsc functions functions with jumps and high growth; response surface building with implicit constraints; system identification with subsequent minimization; nonlinear transformations requiring ±∞; functions with unbounded domain 42 / 126
64 Jumps Fits by polynomial and lsc function 43 / 126
65 Modeling possibilities with lsc functions functions with jumps and high growth; response surface building with implicit constraints; system identification with subsequent minimization; nonlinear transformations requiring ±∞; functions with unbounded domain 44 / 126
66 Nonlinear transformations Recall: probability density estimation with sample X^1, ..., X^ν: maximize (1/ν) Σ_{i=1}^ν log h(X^i) s.t. h ∈ H^ν ∩ ℍ, including h ≥ 0 Exponential transformation: h(x) = exp(−s(x)) minimize Σ_{i=1}^ν s(X^i) s.t. s ∈ S^ν ∩ 𝕊, excluding nonnegativity h log-concave ⇔ s convex If s = ⟨c(·), r⟩, then certain expressions are linear in r But s needs to take on +∞ for h = exp(−s(·)) to vanish 45 / 126
72 Identifying functions with unbounded domains Complications for standard spaces: is f^ν near f ≡ 0? [figure] Not in L^p-norm, but the epi-distance is ≤ 1/ν 46 / 126
75 Modeling possibilities with lsc functions functions with jumps and high growth; response surface building with implicit constraints; system identification with subsequent minimization; nonlinear transformations requiring ±∞; functions with unbounded domain 47 / 126
76 Epi-graph [figure: function f and its epigraph epi f] 49 / 126
78 Geometric view of lsc functions f ∈ lsc-fcns(IR^n) ⇔ epi f is a nonempty closed subset of IR^{n+1} Notation: ρIB = IB(0,ρ) = origin-centered ball with radius ρ; d(ȳ, S) = inf_{y ∈ S} ‖ȳ − y‖ = distance between ȳ and S 50 / 126
80 Distances between epi-graphs [figure: epi f and epi g] 51 / 126
81 Distances between epi-graphs (cont.) ρ-epi-distance: dl_ρ(f,g) = sup_{x̄ ∈ ρIB} | d(x̄, epi f) − d(x̄, epi g) |, ρ ≥ 0 [figure: f and g] 52 / 126
83 Distances between epi-graphs (cont.) [figure: test point x̄ realizing dl_ρ(f,g) = δ] 53 / 126
84 Distances between epi-graphs (cont.) [figure: distances from a test point to epi f and epi g] 54 / 126
85 Metric on the lsc functions The ρ-epi-distance dl_ρ is a pseudo-metric on lsc-fcns(IR^n) epi-distance: dl(f,g) = ∫_0^∞ dl_ρ(f,g) e^{−ρ} dρ dl is a metric on lsc-fcns(IR^n) The epi-distance induces the epi-topology on lsc-fcns(IR^n) (lsc-fcns(IR^n), dl) is a complete separable metric space (Polish) 55 / 126
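The ρ-epi-distance can be approximated numerically by sampling both epigraphs on a grid. A rough sketch for 1-D functions (the truncation window, grid sizes, and test functions are all illustrative assumptions; this is a brute-force approximation, not an exact computation):

```python
import numpy as np

# Numerical sketch of the rho-epi-distance between two 1-D lsc functions:
# dl_rho(f, g) = sup_{|xbar| <= rho} | d(xbar, epi f) - d(xbar, epi g) |,
# approximated by sampling the (truncated) epigraphs and the test points.
def epi_points(fn, lo=-3.0, hi=3.0, top=5.0, n=80):
    """Sample points of the epigraph {(x, a): a >= fn(x)} on a grid."""
    xs = np.linspace(lo, hi, n)
    pts = [(x, a) for x in xs
           for a in np.linspace(fn(x), top, n) if a >= fn(x)]
    return np.array(pts)

def dl_rho(f, g, rho):
    ef, eg = epi_points(f), epi_points(g)
    grid = np.linspace(-rho, rho, 30)
    best = 0.0
    for x in grid:                      # test points xbar in the ball rho*IB
        for a in grid:
            if x * x + a * a > rho * rho:
                continue
            p = np.array([x, a])
            df = np.min(np.linalg.norm(ef - p, axis=1))
            dg = np.min(np.linalg.norm(eg - p, axis=1))
            best = max(best, abs(df - dg))
    return best

f = lambda x: x ** 2
g = lambda x: x ** 2 + 0.5          # vertical shift of f by 0.5
print(dl_rho(f, g, rho=2.0))        # close functions -> small epi-distance
```

Shifting a function up by 0.5 shifts its epigraph, so the distance functions to the two epigraphs differ by at most 0.5 everywhere, and the approximation returns a value near that bound.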
86 Epi-convergence f^ν ∈ lsc-fcns(IR^n) epi-converges to f if dl(f^ν, f) → 0 56 / 126
87 Actual problem Constrained infinite-dimensional optimization problem: min ψ(f) such that f ∈ F ∩ 𝔽 ⊆ lsc-fcns(IR^n) criterion ψ (norms, error measures, etc.) feasible set F (shape restrictions, external info) subspace of interest 𝔽 (continuity, smoothness) 57 / 126
88 Take-aways in this section Actual problem: find the best extended real-valued lower semicontinuous function captures regression, curve fitting, interpolation, density estimation, etc.: allows functions on IR^n, jumps, transformations requiring ±∞ Evolving problem due to changes in information and approximations Theory about lsc functions: metric space under the epi-distance epi-distance = distance between epi-graphs 58 / 126
89 Questions? 59 / 126
90 Outline Illustrations and formulations (slides 12-33) Framework (slides 34-60) Epi-splines and approximations (slides 61-77) Implementations (slides 78-85) Examples (slides ) 60 / 126
91 Epi-splines and approximations 61 / 126
92 Epi-splines = piecewise polynomial + lower semicontinuity Piecewise polynomial structure is not automatic; it must be constructed [figure: s(x) on mesh m_1, m_2, ..., m_{N−1}] Optimization over a new class of piecewise polynomial functions 62 / 126
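A first-order epi-spline in one dimension can be sketched in a few lines: piecewise linear on the mesh intervals, with the value at a mesh point taken as the liminf, i.e. the smaller of the one-sided limits. The class name and representation below are illustrative, not from the talk's software:

```python
import numpy as np

# Minimal sketch of a first-order (piecewise-linear) epi-spline in 1-D:
# on each interval (m[k], m[k+1]) the function is a0[k] + a1[k]*x; at a
# mesh point the lower-semicontinuous value is the liminf, realized here
# by taking the min over all pieces whose closure contains x.
class EpiSpline1:
    def __init__(self, mesh, a0, a1):
        self.m = np.asarray(mesh, float)    # m[0] < m[1] < ... < m[N]
        self.a0 = np.asarray(a0, float)     # N intercepts
        self.a1 = np.asarray(a1, float)     # N slopes

    def __call__(self, x):
        vals = [self.a0[k] + self.a1[k] * x
                for k in range(len(self.a0))
                if self.m[k] <= x <= self.m[k + 1]]
        return min(vals)    # liminf: the lower of the one-sided limits

# A jump at x = 1: the left piece reaches 1, the right piece sits at 2;
# lower semicontinuity picks the smaller value at the knot.
s = EpiSpline1(mesh=[0.0, 1.0, 2.0], a0=[0.0, 2.0], a1=[1.0, 0.0])
```

Note that no continuity is imposed across knots; this is exactly the extra freedom (jumps, unbounded growth per piece) that distinguishes epi-splines from classical splines.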
95 n-dim epi-splines 63 / 126
96 n-dim epi-splines (cont.) An epi-spline s of order p defined on IR^n with partition R = {R_k}_{k=1}^N is a real-valued function that on each R_k, k = 1, ..., N, is a polynomial of total degree p and, for every x ∈ IR^n, satisfies s(x) = liminf_{x′ → x} s(x′) e-spl^p_n(R) = the epi-splines of order p on IR^n with partition R 64 / 126
98 Evolving and approximating problems min ψ^ν(s) such that s ∈ S^ν = F^ν ∩ 𝕊^ν subspace of interest 𝕊^ν ⊆ e-spl^{p^ν}_n(R^ν) ∩ 𝔽 evolving/approximate feasible set F^ν evolving/approximate criterion ψ^ν 65 / 126
102 Function identification framework Evolving Problems: information growth; criterion and constraint approximations; function approximations; solutions = epi-splines. Actual Problem: complete information; infinite-dimensional; conceptual solutions. Linked by consistency and rates 66 / 126
103 Convergence of evolving problems Epi-convergence of functionals on a metric space. Definition: {ψ^ν : S^ν → IR}_{ν=1}^∞ epi-converges to ψ : F → IR if and only if (i) for every s^ν → f ∈ lsc-fcns(IR^n), with s^ν ∈ S^ν, we have liminf ψ^ν(s^ν) ≥ ψ(f) if f ∈ F and ψ^ν(s^ν) → ∞ otherwise; (ii) for every f ∈ F, there exists {s^ν}_{ν=1}^∞, with s^ν ∈ S^ν, such that s^ν → f and limsup ψ^ν(s^ν) ≤ ψ(f). Equivalent to the previous definition. We also say that the evolving optimization problems epi-converge to the actual problem. 67 / 126
107 Consequence of epi-convergence If the evolving problems epi-converge to the actual problem, then limsup_ν ( inf_{s ∈ S^ν} ψ^ν(s) ) ≤ inf_{f ∈ F} ψ(f). Moreover, if s^k is optimal for the evolving problem min_{s ∈ S^{ν_k}} ψ^{ν_k}(s) and s^k → f⁰, then f⁰ is optimal for the actual problem and lim_k inf_{s ∈ S^{ν_k}} ψ^{ν_k}(s) = inf_{f ∈ F} ψ(f) = ψ(f⁰). 68 / 126
109 When will evolving problems epi-converge to the actual one? Actual: min ψ(f) such that f ∈ F ∩ 𝔽 ⊆ lsc-fcns(IR^n); evolving: min ψ^ν(s) such that s ∈ S^ν = F^ν ∩ 𝕊^ν. Theorem: sufficient conditions for epi-convergence are: ψ^ν converges continuously to ψ relative to 𝔽; epi-splines are dense in the lsc functions; F^ν set-converges to F; F is solid 69 / 126
110 Epi-splines dense in lsc functions {R^ν}_{ν=1}^∞, with R^ν = {R^ν_1, ..., R^ν_{N^ν}}, is an infinite refinement if for every x ∈ IR^n and ε > 0 there exists a positive integer ν̄ such that R^ν_k ⊆ IB(x,ε) for every ν ≥ ν̄ and k satisfying x ∈ cl R^ν_k. Theorem: For any p = 0, 1, 2, ... and infinite refinement {R^ν}_{ν=1}^∞, ∪_{ν=1}^∞ e-spl^p_n(R^ν) is dense in lsc-fcns(IR^n) 70 / 126
112 Dense in continuous functions under simplex partitioning Definition: A simplex S in IR^n is the convex hull of n+1 points x_0, x_1, ..., x_n ∈ IR^n, with x_1 − x_0, x_2 − x_0, ..., x_n − x_0 linearly independent. Definition: A partition R_1, R_2, ..., R_N of IR^n is a simplex partition of IR^n if cl R_1, ..., cl R_N are mostly simplexes. 71 / 126
114 Dense in cnts fcns under simplex partitioning (cont.) Theorem: For any p = 1, 2, ... and {R^ν}_{ν=1}^∞ an infinite refinement of IR^n consisting of simplex partitions of IR^n, ∪_{ν=1}^∞ e-spl^p_n(R^ν) ∩ C⁰(IR^n) is dense in C⁰(IR^n). 72 / 126
115 Rates in univariate probability density estimation For a convex and finite-dimensional estimation problem: Theorem: For any correct soft information, ν^{1/2} d_KL(h⁰, h^ν_ε) = O_p(1) for some near-optimal solution h^ν_ε 73 / 126
116 Decomposition Theorem: For every s ∈ e-spl^p_n(R), with n > p ≥ 1 and R = {R_k}_{k=1}^N, there exist q_{k,i} ∈ poly^p(IR^{n−1}), i = 1, 2, ..., n and k = 1, 2, ..., N, such that s(x) = Σ_{i=1}^n q_{k,i}(x_{−i}) for all x ∈ R_k: an n-dim epi-spline is the sum of n (n−1)-dim epi-splines 74 / 126
117 Take-aways in this section Epi-splines are piecewise polynomials defined on an arbitrary partition of IR^n Only lower semicontinuity required (not smoothness) Dense in the space of extended real-valued lower semicontinuous functions under the epi-distance (i.e., epi-splines can approximate every lsc function to arbitrary accuracy) Solutions of the evolving problem (in terms of epi-splines) are approximate solutions of the actual problem 75 / 126
118 Questions? 76 / 126
119 Outline Illustrations and formulations (slides 12-33) Framework (slides 34-60) Epi-splines and approximations (slides 61-77) Implementations (slides 78-85) Examples (slides ) 77 / 126
120 Implementations 78 / 126
121 Implementation considerations Selection of epi-spline order and composition h = exp(−s) Selection of partition: go fine! Implementation of the criterion functional and soft info We provide details for one-dimensional epi-splines of order 1, focusing on criteria and constraints for probability densities 79 / 126
123 Implementation of criteria functionals Probability densities h ∈ e-spl¹(m), with mesh m = {m_0, m_1, ..., m_N}, −∞ < m_0 < m_1 < ... < m_N < ∞ First-order epi-splines (piecewise linear): h(x) = a^k_0 + a^k_1 x for x ∈ (m_{k−1}, m_k) 80 / 126
125 Implementation of criteria functionals (cont.) Log-likelihood for observations x^1, ..., x^ν: Σ_{i=1}^ν log h(x^i) = Σ_{i=1}^ν log( a^{k_i}_0 + a^{k_i}_1 x^i ), where k_i is such that x^i ∈ (m_{k_i−1}, m_{k_i}) Entropy: −∫ h(x) log h(x) dx = −Σ_{k=1}^N ∫_{m_{k−1}}^{m_k} ( a^k_0 + a^k_1 x ) log( a^k_0 + a^k_1 x ) dx Both concave in the epi-parameters a^k_0 and a^k_1, k = 1, ..., N 81 / 126
128 Soft information Nonnegativity: h ≥ 0 if a^k_0 + a^k_1 m_{k−1} ≥ 0 and a^k_0 + a^k_1 m_k ≥ 0, k = 1, ..., N Integrate to one: ∫ h(x) dx = Σ_{k=1}^N a^k_0 (m_k − m_{k−1}) + Σ_{k=1}^N (a^k_1 / 2)(m_k² − m_{k−1}²) = 1 Continuity: a^k_0 + a^k_1 m_k = a^{k+1}_0 + a^{k+1}_1 m_k for k = 1, ..., N−1 Log-concavity, convexity, monotonicity 82 / 126
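The pieces above assemble into a small constrained maximum-likelihood problem. A sketch for a first-order epi-spline density with the slide's constraints (nonnegativity at knots, unit integral, continuity), using a synthetic truncated-exponential sample and a general-purpose solver; the mesh, data, and solver choice are illustrative, not the talk's implementation:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# Sketch: maximum-likelihood first-order epi-spline density on mesh m,
# h(x) = a0[k] + a1[k]*x on (m[k], m[k+1]), with nonnegativity at knots,
# integral one, and continuity at interior knots.
rng = np.random.default_rng(1)
data = rng.exponential(1.0, 200)
data = data[data < 4.0]                       # keep observations in [0, 4]
m = np.linspace(0.0, 4.0, 9)                  # mesh, N = 8 pieces
N = len(m) - 1

def negloglik(z):
    a0, a1 = z[:N], z[N:]
    k = np.clip(np.searchsorted(m, data) - 1, 0, N - 1)
    vals = a0[k] + a1[k] * data
    return -np.sum(np.log(np.maximum(vals, 1e-12)))

rows_eq, rows_ge = [], []
row = np.zeros(2 * N)                         # integral equals one
for k in range(N):
    row[k] = m[k + 1] - m[k]
    row[N + k] = 0.5 * (m[k + 1] ** 2 - m[k] ** 2)
rows_eq.append(row)
for k in range(N - 1):                        # continuity at interior knots
    row = np.zeros(2 * N)
    row[k], row[N + k] = 1.0, m[k + 1]
    row[k + 1], row[N + k + 1] = -1.0, -m[k + 1]
    rows_eq.append(row)
for k in range(N):                            # nonnegativity at both knots
    for x in (m[k], m[k + 1]):
        row = np.zeros(2 * N)
        row[k], row[N + k] = 1.0, x
        rows_ge.append(row)

cons = [LinearConstraint(np.array(rows_eq),
                         np.concatenate([[1.0], np.zeros(N - 1)]),
                         np.concatenate([[1.0], np.zeros(N - 1)])),
        LinearConstraint(np.array(rows_ge), 0.0, np.inf)]
z0 = np.concatenate([np.full(N, 0.25), np.zeros(N)])  # feasible uniform start
res = minimize(negloglik, z0, constraints=cons, method="SLSQP",
               options={"maxiter": 500})
a0, a1 = res.x[:N], res.x[N:]

def h_hat(x):
    k = int(np.clip(np.searchsorted(m, x) - 1, 0, N - 1))
    return a0[k] + a1[k] * x
```

Since the pieces are linear, nonnegativity at both endpoints of a piece implies nonnegativity on the whole piece, which is why the knot constraints suffice.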
132 Soft information (cont.) Deconvolution Y = X + W: given densities h_W and h_Y, h_Y(y) = ∫ h(x) h_W(y − x) dx = Σ_{k=1}^N a^k_0 ∫_{m_{k−1}}^{m_k} h_W(y − x) dx + Σ_{k=1}^N a^k_1 ∫_{m_{k−1}}^{m_k} x h_W(y − x) dx for all y Inverse problem Y = g(X): given v_q = ∫ y^q h_Y(y) dy, q = 1, 2, ..., v_q = ∫ [g(x)]^q h(x) dx ≈ Σ_{k=1}^N Σ_{j=1}^{M_k} w_{jk} [g(x_{jk})]^q ( a^k_0 + a^k_1 x_{jk} ) Convex optimization problems 83 / 126
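The key point of the deconvolution constraint is its linearity in the epi-parameters: each value h_Y(y) is a linear functional of (a^k_0, a^k_1). A sketch that assembles one such coefficient row with a trapezoid rule per piece (the helper names, quadrature, and sanity-check case are illustrative assumptions):

```python
import numpy as np

# For a first-order epi-spline h on mesh m, h_Y(y) = int h(x) h_W(y-x) dx
# is linear in the epi-parameters (a0, a1); assemble the coefficient row
# for one y by per-piece trapezoid quadrature.
def trap(vals, xs):
    return float(np.sum((vals[:-1] + vals[1:]) / 2 * np.diff(xs)))

def convolution_row(y, m, h_W, nq=200):
    N = len(m) - 1
    row = np.zeros(2 * N)          # coefficients of (a0[1..N], a1[1..N])
    for k in range(N):
        xs = np.linspace(m[k], m[k + 1], nq)
        w = h_W(y - xs)
        row[k] = trap(w, xs)             # multiplies a0[k]
        row[N + k] = trap(xs * w, xs)    # multiplies a1[k]
    return row

# Sanity check against a closed-form case: the uniform density on [0, 4]
# (a0 = 1/4, a1 = 0) convolved with standard normal noise at y = 2 gives
# h_Y(2) = (1/4) * (Phi(2) - Phi(-2)).
h_W = lambda t: np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)
m = np.linspace(0.0, 4.0, 5)
z = np.concatenate([np.full(4, 0.25), np.zeros(4)])
hY2 = convolution_row(2.0, m, h_W) @ z
```

Stacking such rows for many y values gives the linear system that, together with the soft-information constraints, yields the convex deconvolution problem of the slide.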
135 Software, references Papers and tutorials: Software for univariate probability density estimation Matlab toolbox R toolbox (S. Buttrey) 84 / 126
136 Outline Illustrations and formulations (slides 12-33) Framework (slides 34-60) Epi-splines and approximations (slides 61-77) Implementations (slides 78-85) Examples (slides ) 85 / 126
137 Examples: forecasting and fitting 86 / 126
138 Copper price forecast Stochastic differential equation: estimate drift, volatility Steps: identify the discount factor curve using epi-splines and futures market information ⇒ future spot prices; identify the drift using epi-splines, historical data, future spot prices, etc.; identify the volatility using epi-splines and observed/estimated errors 87 / 126
139 Copper price forecast (cont.) [figure: real prices with E(hist+market) and E(hist+market) ± σ forecasts over months] 88 / 126
140 Surface reconstruction f(x) = (cos(πx_1) + cos(πx_2))³ on [−3,3]² continuity; second-order epi-splines; N = 400; 900 uniform points [figure: actual function and epi-spline approximation] 89 / 126
141 Surface reconstruction f(x) = sin(π‖x‖)/(π‖x‖) for x ∈ [−5,5]² \ {0}, f(0) = 1 continuously differentiable; second-order epi-splines; N = 225; 600 random points [figure: actual function and epi-spline approximation] 90 / 126
142 Examples: density estimation 91 / 126
143 Examples of probability density estimation Find a density h that maximizes log-likelihood of observations and satisfies soft information Examples: earthquake losses, queuing, robustness, deconvolution, UQ, mixture, bivariate densities 92 / 126
144 Earthquake losses 93 / 126
145 Earthquake losses, Vancouver Region* Comprehensive damage model (284 random variables) 100,000 simulations of 50-year loss *Data from Mahsuli, 2012; see also Mahsuli & Haukaas 94 / 126
146 Earthquake losses (cont.) Probability density of 50-year loss (billion CAD) [figure: density vs. loss] 95 / 126
147 Use fewer simulations? Using 100 simulations only and a kernel estimator [figure: density vs. loss] 96 / 126
149 Using function identification approach Max likelihood of 100 simulations; second-order exponential epi-splines Engineering knowledge: nonincreasing, smooth, nonnegative support [figure: estimates from a sample of 100, epi-spline vs. kernel] 97 / 126
151 Using function identification approach (cont.) Varying number of simulations; same soft information [figure: estimates for sample sizes 10, 100, 1000, ...] 98 / 126
152 Using function identification approach (cont.) 30 meta-replications of 100 simulations [figure: 100,000-simulation reference vs. 100-simulation mean (±1.3) and average of the 10% highest (±4.4)] 99 / 126
153 Using function identification approach (cont.) Also pointwise Fisher information: h′(y)/h(y) ∈ [−0.5, −0.1] [figure: 100,000-simulation reference vs. 100-simulation mean (±1.2) and average of the 10% highest (±4.0)] 100 / 126
154 Using function identification approach (cont.) Also pointwise Fisher information: h′(x)/h(x) ∈ [−0.35, −0.25] [figure: 100,000-simulation reference vs. 100-simulation mean (±0.5) and average of the 10% highest (±1.7)] 101 / 126
155 Building robustness 102 / 126
156 Diversity of estimates using KL-divergence Returning to the exponential density Continuously differentiable, nonincreasing, nonnegative support [figure: original estimate, empirical density, true density, and alternatives at KL = 0.01 and KL = 0.1] 103 / 126
157 Queuing 104 / 126
158 M/M/1; 50% of customers delayed for a fixed time X = customer time-in-service; 100 obs.; exponential epi-spline Soft info: lsc, X ≥ 0, pointwise Fisher, unimodal upper tail [figure: true density vs. epi-spline, kernel, and empirical estimates] 105 / 126
159 Deconvolution 106 / 126
160 True density Gamma(5,1) First-order epi-spline on [0, 23], N = 1000 Three sample points Continuity, unimodal, convex tails, bounds on gradient jumps [figure: true density of X, epi-spline estimate, sample data] 107 / 126
161 Also noisy observations 5000 observations of Y = X + W W independent normal noise; mean 0, stdev 3.2 Estimate h_Y separately Enforce h_Y(y) ≈ ∫ h(x) h_W(y − x) dx at 101 y points [figure: true density of X, epi-spline estimate, sample data, error density] 108 / 126
165 Uncertainty quantification 109 / 126
166 Dynamical system Recall the true density of the amplitude [figure: density of X] 110 / 126
167 Density of amplitude Sample size 100; continuously differentiable; unimodal tails; exponential epi-splines [figure: true density vs. epi-spline and kernel estimates, with sample] 111 / 126
168 Gradient information Gradient information for bijective g : IR IR Recall: If X = g(v), then h(x) = h V (g 1 (x))/ g (g 1 (x)) 112 / 126
169 Gradient information Gradient information for bijective g: IR → IR Recall: If X = g(V), then h(x) = h_V(g⁻¹(x)) / |g′(g⁻¹(x))| Present context without a bijection and data x_i = g(v_i), g′(v_i): h(x_i) ≥ h_V(v_i) / |g′(v_i)| Value of pdf bounded from below at x_i 112 / 126
170 Gradient information (cont.) Sample size [Plot: true density, epi-spline estimate, kernel estimate, sample] 113 / 126
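The change-of-variables identity recalled above is easy to sanity-check numerically. A minimal sketch with the illustrative choice g(v) = exp(v) and V standard normal (so X is lognormal), which is an assumption for this example rather than a case from the talk:

```python
import math

# Change of variables for increasing bijective g: if X = g(V), then
# h(x) = h_V(g^{-1}(x)) / |g'(g^{-1}(x))|.
# Illustration (not from the talk): g(v) = exp(v), V ~ N(0,1), so X is lognormal.

def h_V(v):
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

def h_X(x):
    v = math.log(x)    # g^{-1}(x)
    return h_V(v) / x  # |g'(v)| = exp(v) = x

def lognormal_pdf(x):
    return math.exp(-0.5 * math.log(x) ** 2) / (x * math.sqrt(2.0 * math.pi))

print(h_X(2.0), lognormal_pdf(2.0))  # the two values agree
```

Without a bijection, only the lower bound h(x_i) ≥ h_V(v_i)/|g′(v_i)| survives, but it can still enter the identification problem as a pointwise constraint at each observed x_i.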
171 Fluid dynamics Drag/lift estimates: High-fidelity RANSE solves (each 4 hours on 8 cores) Low-fidelity potential flow solves (each 5 sec on 1 core) 114 / 126
172 Predicting high-fidelity performance 898 high- and low-fidelity solves [Plot: drag/lift using high-fidelity solver vs. drag/lift using low-fidelity solver] Learn from low-fidelity solves and avoid (many) high-fidelity solves 115 / 126
173 Density using high or low solves Exponential epi-splines of second order; mesh N = 50 Soft info: log-concavity and bounds on second-order derivatives [Plot: density vs. response; legend: high 898, Sample, low 800] 116 / 126
174 Estimating conditional error X = high-fidelity; Y = low-fidelity h_X(x) = ∫ h_X|Y(x|y) h_Y(y) dy 117 / 126
175 Estimating conditional error X = high-fidelity; Y = low-fidelity h_X(x) = ∫ h_X|Y(x|y) h_Y(y) dy Normal linear least-squares regression model on training data of size 50 gives h_X|Y(x|y) 117 / 126
176 Estimating conditional error X = high-fidelity; Y = low-fidelity h_X(x) = ∫ h_X|Y(x|y) h_Y(y) dy Normal linear least-squares regression model on training data of size 50 gives h_X|Y(x|y) [Plot: density vs. response; legend: high 898, cond ls, Sample, low] 117 / 126
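Mixing a fitted normal linear model with h_Y can be checked on a case with a closed form. A minimal sketch where the regression coefficients a, b, s and a normal h_Y are hypothetical placeholders (the talk instead fits the model to 50 training points and uses an epi-spline h_Y):

```python
import math

def normal_pdf(z, mu, sigma):
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical regression fit h_{X|Y}(x|y) = N(a + b*y, s) and a normal h_Y,
# chosen so the mixed density h_X has a closed form to compare against.
a, b, s = 0.5, 0.9, 0.3
mu_y, sig_y = 2.0, 1.0

dy = 10.0 * sig_y / 400
ys = [mu_y - 5.0 * sig_y + k * dy for k in range(401)]

def h_X(x):
    # h_X(x) = integral of h_{X|Y}(x|y) h_Y(y) dy, by a simple quadrature rule
    return sum(normal_pdf(x, a + b * y, s) * normal_pdf(y, mu_y, sig_y) for y in ys) * dy

# Closed form: X ~ N(a + b*mu_y, sqrt(s^2 + b^2 * sig_y^2))
mu_x = a + b * mu_y
sig_x = math.sqrt(s * s + b * b * sig_y * sig_y)
print(h_X(mu_x), normal_pdf(mu_x, mu_x, sig_x))  # the two values agree closely
```

The same quadrature of h_X|Y against h_Y is what produces the "cond ls" density curve on the slide.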
177 Information fusion Max likelihood using 10 high-fidelity simulations 0.5 h_X(x) ≤ ∫ h_X|Y(x|y) h_Y(y) dy ≤ 1.5 h_X(x) 118 / 126
178 Information fusion Max likelihood using 10 high-fidelity simulations 0.5 h_X(x) ≤ ∫ h_X|Y(x|y) h_Y(y) dy ≤ 1.5 h_X(x) [Plot: density vs. response; legend: high 898, cond ls, Sample, high 10 + cond, 10 low] 118 / 126
179 Only 10 high-fidelity [Plot: density vs. response; legend: high 898, cond ls, Sample, high 10 + cond, 10 high, 10 low] 119 / 126
180 Uniform mixture density 120 / 126
181 Uniform mixture density Sample size 100; lsc, slope constraints; exp. epi-splines [Plot: true density, epi-spline estimate, kernel estimate, empirical density] 121 / 126
182 Bivariate normal density 122 / 126
183 Bivariate normal probability density 123 / 126
184 Bivariate normal probability density Curvature, log-concave, 25 sample points, exp. epi-spline 124 / 126
185 Conclusions Function identification problems: rich class 125 / 126
186 Conclusions Function identification problems: rich class lsc functions provide modeling flexibility 125 / 126
187 Conclusions Function identification problems: rich class lsc functions provide modeling flexibility Epi-convergence allows evolution of info./approx. 125 / 126
188 Conclusions Function identification problems: rich class lsc functions provide modeling flexibility Epi-convergence allows evolution of info./approx. Epi-splines central approximation tools 125 / 126
189 References Singham, Royset, & Wets, Density estimation of simulation output using exponential epi-splines Royset, Sukumar, & Wets, Uncertainty quantification using exponential epi-splines Royset & Wets, From data to assessments and decisions: epi-spline technology Royset & Wets, Fusion of hard and soft information in nonparametric density estimation Royset & Wets, Multivariate epi-splines and evolving function identification problems Feng, Rios, Ryan, Spurkel, Watson, Wets & Woodruff, Toward scalable stochastic unit commitment - Part 1 Rios, Wets & Woodruff, Multi-period forecasting with limited information and scenario generation with limited data Royset & Wets, On function identification problems 126 / 126