SCORING RULES

ROBERT L. WINKLER
Fuqua School of Business, Duke University, Durham, North Carolina

VICTOR RICHMOND R. JOSE
McDonough School of Business, Georgetown University, Washington, D.C.

INTRODUCTION

Uncertainty is a pervasive feature of our world, and fields such as decision analysis and statistics provide methods to help us make decisions, forecasts, and inferences in the face of uncertainty. Our everyday language includes many terms that relate to the degree of uncertainty in a situation: for example, rain is unlikely today, the chances are good that a surgical procedure will be successful, the prospects for an improved economic situation are not favorable, and so on. As the mathematical language of uncertainty, probability theory provides a structure to quantify uncertainty. Probabilities are encountered in the media (e.g., the probability of rain this afternoon is 20%) and widely used in modeling. Although probability forecasts are formulated and used extensively, very often they are never evaluated after the event or variable of interest is observed. Scoring rules provide such evaluations by giving a numerical score based on the probabilities and on the actual observation. For example, a probability of rain of 40% in a simple two-state setting of rain versus no rain will receive a higher score than a probability of 20% if it rains, and a lower score if it does not rain. In this manner, we can use scoring rules to compare the sources of the probabilities, which might be experts, models, or simply past data.

The first scoring rule used on a regular basis was a quadratic rule developed by Brier [1] to evaluate probabilistic weather forecasts. Indeed, weather forecasting is the area in which scoring rules have been used most extensively. The presence of such an ex post evaluation using suitably designed scoring rules also provides ex ante incentives for careful formulation of probability forecasts. Much of the early development of scoring rules emphasized this ex ante role of scoring rules (e.g., [2-6]). Attention was focused on strictly proper scoring rules, for which a forecaster can maximize his or her expected score only by honestly reporting the probabilities and also has the incentive to obtain further information to increase the accuracy of the probabilities. This ex ante motivation yields rules that reward probabilities that have good characteristics ex post, as we shall see. For a general discussion of scoring rules and reviews of the scoring rule literature, see Winkler [7] and Gneiting and Raftery [8].

We discuss some basic properties of scoring rules in the second section, focusing on the aspects related to ex ante incentives, and present some commonly encountered rules. In the next section, we turn to ex post evaluation, showing how some notions involving strictly proper scoring rules relate to ex post evaluation. The next two sections involve scoring rules with special characteristics, namely those that provide evaluations of probabilities relative to baseline distributions and those that take into account any ordering of the events of interest. A brief summary and discussion, including some connections with other fields, is presented in the final section.

STRICTLY PROPER SCORING RULES

We begin by considering the simplest possible situation, that of a single event A and its complement. Suppose that an expert is assessing a probability for A and is being evaluated with a scoring rule S.
If a probability r is reported for A, then the score will be S(r, e), where e = 1 if A occurs and e = 0 if A does not occur. Furthermore, assume that the expert's best judgment about the probability of A is denoted by p. Then the expected score is

S(r, p) = pS(r, 1) + (1 − p)S(r, 0).

The scoring rule S is said to be strictly proper if

S(p, p) > S(r, p) for any r ≠ p. (1)

To maximize the expected score with a strictly proper rule, the expert should set r = p, thereby reporting the probability honestly. The scoring rules discussed here are oriented such that a higher score is better. Some rules in the literature, such as the Brier score [1], are oriented with a negative score being better, in which case the expert should set r = p to minimize the expected score. Rules such as the Brier score can be converted to a positive orientation by changing the sign of the score, so the focus on scores with a positive orientation here is not restrictive.

The expected score S(p, p) for honest reporting from a strictly proper scoring rule is strictly convex, and conversely, a strictly proper scoring rule can be generated from any strictly convex function of p that is taken as the expected score function S(p, p) for honest reporting [6]. Thus, there are an infinite number of rules satisfying Equation (1). Three commonly used rules are as follows:

Quadratic: S(r, e) = 1 − 2(e − r)^2, (2)
Logarithmic: S(r, e) = log[re + (1 − r)(1 − e)], (3)
Spherical: S(r, e) = [re + (1 − r)(1 − e)] / [r^2 + (1 − r)^2]^{1/2}. (4)

These and any other strictly proper rules can be scaled as desired (e.g., to avoid negative scores), because any positive affine transformation of a strictly proper rule is itself strictly proper.

Figure 1 shows S(r, 1), S(r, 0), and S(p, p) for the quadratic scoring rule. Note that for this simple two-event setting, S(r, 1) and S(r, 0) are mirror images of each other, with S(r, 1) increasing in r and S(r, 0) decreasing in r.

[Figure 1. (a) Score functions S(r, 1) and S(r, 0) and (b) expected score S(p, p) under honest reporting for the quadratic scoring rule.]
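As a concrete illustration (not part of the original article), here is a minimal Python sketch of Equations (2)-(4), using NumPy, together with a numerical check of strict propriety: for a judged probability p, the expected score of each rule is maximized by reporting r = p.

```python
import numpy as np

def quadratic(r, e):
    """Quadratic score, Eq. (2): 1 - 2(e - r)^2."""
    return 1 - 2 * (e - r) ** 2

def logarithmic(r, e):
    """Logarithmic score, Eq. (3): log of the probability assigned to the outcome."""
    return np.log(r * e + (1 - r) * (1 - e))

def spherical(r, e):
    """Spherical score, Eq. (4)."""
    return (r * e + (1 - r) * (1 - e)) / np.sqrt(r ** 2 + (1 - r) ** 2)

def expected_score(score, r, p):
    """Expected score S(r, p) when the expert judges the probability of A to be p."""
    return p * score(r, 1) + (1 - p) * score(r, 0)

# Strict propriety: for each rule, the expected score is maximized at r = p.
p = 0.7
grid = np.linspace(0.01, 0.99, 99)
for rule in (quadratic, logarithmic, spherical):
    best_r = grid[np.argmax([expected_score(rule, r, p) for r in grid])]
    print(rule.__name__, round(best_r, 2))   # each rule prints 0.7
```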

With the general concept of a strictly proper scoring rule established for the case of a single event (and its complement), we next generalize to the case of a set of mutually exclusive and exhaustive events {A_1, ..., A_k}, for which the expert's probabilities are given by the vector p = (p_1, ..., p_k) and the reported probabilities are r = (r_1, ..., r_k). With a scoring rule S, the expert's score is S(r, e_i) if A_i occurs, where e_i is a vector with the ith element equal to 1 and the other elements all equal to 0. The expected score from the perspective of the expert is S(r, p) = Σ_{i=1}^{k} p_i S(r, e_i), and the scoring rule is strictly proper if S(p, p) > S(r, p) for any r ≠ p. Quadratic, logarithmic, and spherical rules for this case are

S(r, e_i) = 2r_i − Σ_{j=1}^{k} r_j^2, (5)
S(r, e_i) = log r_i, (6)
and S(r, e_i) = r_i / (Σ_{j=1}^{k} r_j^2)^{1/2}, (7)

respectively. Note that this setup could be used when we are considering a discrete distribution of a random variable, which could include a discretization of a continuous random variable into a set of intervals.

Finally, we present scoring rules for probability distributions of a continuous random variable x. Let p denote the expert's probability density function for x, and let r denote the corresponding reported density function. Then, for a scoring rule that gives a score S(r, x) when the value x is observed, the expert's expected score is S(r, p) = ∫ S(r, x)p(x)dx. Quadratic, logarithmic, and spherical scoring rules in the continuous case are

S(r, x) = 2r(x) − ∫ r^2(x)dx, (8)
S(r, x) = log r(x), (9)
and S(r, x) = r(x) / (∫ r^2(x)dx)^{1/2}. (10)

Our focus has been on strictly proper scoring rules, which have been developed with the goal of providing the expert with an incentive to report honestly. But if the expert is not well informed with respect to the situation, reporting probabilities honestly may not mean reporting good probabilities. Not all probability forecasts are necessarily good forecasts. Fortunately, assuming honest forecasting, strictly proper scoring rules will reward forecasts by providing higher expected scores to forecasts for which p is closer to 0 or 1.

To see how this works for the quadratic, logarithmic, and spherical rules that have been presented here, we note that they share an important characteristic: they are symmetric. In the single-event case, that means that the expected score for an honestly reported probability of r is the same as the expected score for an honestly reported probability of 1 − r. As noted earlier, reporting r = p under a strictly proper scoring rule results in an expected score function S(p, p) that is strictly convex in p. This convexity, combined with the symmetry, means that S(p, p) is minimized at p = 0.5 and increases as p → 0 or p → 1 from p = 0.5. These features of S(p, p) are illustrated for the quadratic rule in Fig. 1. That means that under honest reporting, the expected score is higher for probability forecasts that are sharper, where sharpness refers to the degree that p is closer to 0 or 1. For example, a probability of 0 or 1 is perfectly sharp, whereas a probability of 0.5 admits a lot of uncertainty about the outcome.

To illustrate the incentives from strictly proper scoring rules for both honesty and sharpness, consider a decomposition of the expected score for the quadratic rule in the case of a probability for a single event. The expected score is S(r, p) = p[1 − 2(1 − r)^2] + (1 − p)(1 − 2r^2). Expanding, adding and subtracting p^2, and rearranging yields

S(r, p) = 1 − 2(p − r)^2 − 2p(1 − p). (11)

The second term on the right-hand side of Equation (11) can be viewed as a penalty (because of the negative sign) for not setting r = p, and it thus provides an incentive for honesty. The last term is a penalty for lack of sharpness, because p(1 − p) is maximized at p = 0.5 and decreases as p → 0 or p → 1. The best possible expected score is one, and dishonesty (r ≠ p) or lack of perfect sharpness (0 < p < 1) will reduce the expected score. Other strictly proper rules (e.g., the logarithmic and spherical rules) can be decomposed in a similar manner.
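A quick numerical check of Equation (11) (an illustrative Python sketch, not from the article): the expected quadratic score computed directly agrees with the honesty-plus-sharpness decomposition, and a sharper honest judgment earns a higher expected score.

```python
def expected_quadratic(r, p):
    """Expected quadratic score p[1 - 2(1 - r)^2] + (1 - p)(1 - 2r^2)."""
    return p * (1 - 2 * (1 - r) ** 2) + (1 - p) * (1 - 2 * r ** 2)

def decomposed(r, p):
    """Eq. (11): 1 minus the honesty penalty 2(p - r)^2 minus the sharpness penalty 2p(1 - p)."""
    return 1 - 2 * (p - r) ** 2 - 2 * p * (1 - p)

print(expected_quadratic(0.7, 0.7), decomposed(0.7, 0.7))  # honest report: 0.58
print(expected_quadratic(0.9, 0.7), decomposed(0.9, 0.7))  # dishonest report: drops to 0.50
print(expected_quadratic(0.9, 0.9), decomposed(0.9, 0.9))  # sharper honest judgment: 0.82
```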
Keep in mind that to maximize expected score, the expert has to report honestly, so attempting to have probabilities look sharp artificially (i.e., sharp reported probabilities that are not consistent with the expert's judgments) will decrease the expected score, not increase it. Note from Equation (11) that the sharpness term relates to the sharpness of p, not the sharpness of r.

The primary aspect of strictly proper scoring rules is to encourage honesty, and the nature of strictly proper scoring rules is such that honest reporting by experts who have sharper probabilities will yield higher expected scores than honest reporting by experts who have probabilities that are not so sharp. In the final analysis, then, strictly proper scoring rules reward both honesty and sharpness.

Strictly proper scoring rules differ in some characteristics. We will present different types of strictly proper rules in later sections and discuss some of their characteristics. As we shall see, not all strictly proper rules are symmetric in the sense discussed above, and in some cases it may be desirable to use a rule that is not symmetric. Another characteristic of note is that the logarithmic rule is the only rule for which the score depends only on the probability or density that has been assigned to the event or value of the variable that actually occurs. It does not depend on the probability or density assigned to other events or values. For example, if we consider scoring rules for k mutually exclusive and exhaustive events and event A_i occurs, the logarithmic score, log r_i, depends only on r_i, whereas the quadratic score, 2r_i − Σ_{j=1}^{k} r_j^2, depends on all of the probabilities r_1, ..., r_k. This property, unique to the logarithmic rule when k > 2, is called locality, and it is consistent in spirit with the likelihood principle that plays a major role in statistics.

Although the quadratic, logarithmic, and spherical rules given above are the usual suspects when we think about scoring rules, they are special cases of two rich families of strictly proper scoring rules [9]. When probabilities for a set of k mutually exclusive and exhaustive events are reported, scores for the pseudospherical and power families are given by

S^S_β(r, e_i) = (1/(β − 1)) [ ( r_i / E_r(r^{β−1})^{1/β} )^{β−1} − 1 ] (12)

and

S^P_β(r, e_i) = (r_i^{β−1} − 1)/(β − 1) − (E_r(r^{β−1}) − 1)/β, (13)

respectively, where E_r(r^{β−1}) = Σ_{i=1}^{k} r_i r_i^{β−1} and −∞ < β < ∞. When β = 2, Equations (12) and (13) yield the spherical and quadratic rules, respectively. When β → 1, both families converge to the logarithmic rule.
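The following sketch (Python with NumPy; illustrative only) implements Equations (12) and (13). With β = 2 the scores reproduce the spherical and quadratic rules up to a positive affine transformation, and with β near 1 the power score approaches the logarithmic score log r_i.

```python
import numpy as np

def pseudospherical(r, i, beta):
    """Eq. (12); beta = 2 matches the spherical rule up to a positive affine transformation."""
    r = np.asarray(r, dtype=float)
    e_r = np.sum(r ** beta)                     # E_r(r^(beta-1)) = sum_j r_j^beta
    return ((r[i] / e_r ** (1 / beta)) ** (beta - 1) - 1) / (beta - 1)

def power(r, i, beta):
    """Eq. (13); beta = 2 matches the quadratic rule up to a positive affine transformation."""
    r = np.asarray(r, dtype=float)
    return (r[i] ** (beta - 1) - 1) / (beta - 1) - (np.sum(r ** beta) - 1) / beta

r, i = [0.3, 0.4, 0.2, 0.1], 1                  # event A_2 (0-indexed 1) occurs
print(power(r, i, beta=2))                      # -0.25
print(pseudospherical(r, i, beta=2))            # about -0.27
print(power(r, i, beta=1.0001), np.log(0.4))    # both about -0.916: the logarithmic limit
```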
In summary, the most important characteristic of scoring rules in an ex ante sense related to probability assessment is that they should be strictly proper, and there are many such rules from which to choose. If a rule is indeed strictly proper, then it should provide incentives for an expert to honestly report probabilities and to invest effort in an attempt to make those probabilities sharper. Such effort might be directed toward such things as gathering more data, using more powerful methods to analyze the data, and learning more about the processes affecting the events in question or about forecasts provided by others.

EX-POST EVALUATION WITH STRICTLY PROPER SCORING RULES

We shift now from the ex ante perspective of the previous section to consider the ex post evaluation of probability forecasts. For a given situation, we assume that an expert has probabilities that are based on the information available and consistent with the expert's best judgment. The ex ante viewpoint involves rules that provide incentives for the expert to attempt to come up with good probabilities and to report those probabilities honestly. Many of the characteristics discussed in the preceding section have counterparts when we consider their use for evaluation purposes.

As is the case with statistical analysis of data in general, a single observation is not very informative. To have a reliable evaluation (when comparing experts or models in terms of their probabilities, for example), we would like to have a large number of observations. Thus, instead of considering a given situation with a single probability or set of probabilities, we have a set of data consisting of many situations with different probabilities.

Suppose that we have a sample of probability forecasts and the corresponding observations for the occurrence of an event. For example, this might consist of probabilities of default for loans (generated by a model or assessed by a bank officer) or probabilities of rain for different days at a given location. First, we can look at all of the occasions in the data set for which a particular value of the reported probability r (say, 0.30) was used, and determine the relative frequency of occurrence of the event of interest on those occasions. Denote this relative frequency by f_r. With the quadratic scoring rule, the average score on all of the occasions with this value of r is S(r, f_r) = f_r[1 − 2(1 − r)^2] + (1 − f_r)(1 − 2r^2). Ex ante, the expected score from the perspective of the expert is a function of r and p, with p being known only to the expert. Ex post, the average score is a function of r and f_r, where we are able to observe f_r:

S(r, f_r) = 1 − 2(f_r − r)^2 − 2f_r(1 − f_r). (14)

Note that this is simply Equation (11) with f_r used in place of p. The second term on the right-hand side of Equation (14) is a measure of calibration, which involves the correspondence between the reported probability and the relative frequency of occurrence of the event when that probability is used. If f_r = r, then the reported probabilities of r are perfectly calibrated. The more the relative frequency deviates from r, the worse the calibration is. The last term on the right-hand side of Equation (14) is a measure of sharpness, which is better as f_r → 0 or f_r → 1. Poorer calibration and less sharpness lead to lower average scores.

In data with reported probabilities and outcomes, different probabilities will be used on different occasions. The overall average score S̄ for an expert is found by aggregating the average scores for the different values of r. If we let n_r represent the number of times a reported probability of r is used in the data set and let n = Σ_r n_r represent the overall sample size, the overall average score can be expressed as

S̄ = Σ_r (n_r/n) S(r, f_r) = 1 − 2 Σ_r (n_r/n)(f_r − r)^2 − 2 Σ_r (n_r/n) f_r(1 − f_r). (15)

The first summation on the right-hand side of Equation (15) is an overall measure of calibration for the data set, and the second summation is an overall measure of sharpness. This decomposition into calibration and sharpness components can be generalized beyond the quadratic rule to any strictly proper scoring rule [10].

One convenient way to think about calibration and sharpness is to think of the probability assessment process for a single event as a two-step process. First, the expert puts forecast situations in equivalence classes, or boxes, such that the expert feels that the events in a given box have roughly the same probability of occurrence. Second, the expert assigns numbers (probability values) to the boxes. Calibration is then an evaluation of how well the expert assigns the numbers. Sharpness, on the other hand, is unrelated to the probability values. Instead, it measures how effective the expert is in creating boxes for which the relative frequency of occurrence of the events is close to 0 or 1. This is unlikely to be the way an expert really thinks about the forecasting process, but it is a convenient way to emphasize key differences between calibration and sharpness.
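The decomposition in Equation (15) is easy to compute from a sample of forecasts and outcomes. The following Python sketch (illustrative only, with made-up data) returns the overall average quadratic score along with its calibration and sharpness summations.

```python
import numpy as np

def decompose_quadratic(reported, outcomes):
    """Average quadratic score with its calibration and sharpness terms, Eq. (15)."""
    reported = np.asarray(reported, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = len(reported)
    calibration, sharpness = 0.0, 0.0
    for r in np.unique(reported):
        mask = reported == r
        n_r = mask.sum()
        f_r = outcomes[mask].mean()                    # relative frequency for this r
        calibration += (n_r / n) * (f_r - r) ** 2      # miscalibration penalty
        sharpness += (n_r / n) * f_r * (1 - f_r)       # lack-of-sharpness penalty
    return 1 - 2 * calibration - 2 * sharpness, calibration, sharpness

# Made-up data: r = 0.3 used five times (event occurred twice), r = 0.8 used five times (four times).
reported = [0.3] * 5 + [0.8] * 5
outcomes = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
print(decompose_quadratic(reported, outcomes))         # (0.59, 0.005, 0.20)
```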
For one thing, we can always attempt to correct for miscalibration. If an expert always gives probabilities that are too high, for example, a decision maker using those probabilities can reduce reported probabilities from that expert. Correcting for poor sharpness is a much trickier business. We have illustrated the notion of decomposition of an average score using the quadratic scoring rule, but the same idea can be applied to other rules. Also, an ex post average score or an ex ante expected score can be decomposed in different ways.

The decomposition into terms measuring calibration and sharpness is the most frequently used decomposition, and it is arguably the most important decomposition. Gneiting and Raftery [8] comment that the goal of probabilistic forecasting is to maximize the sharpness of the (probabilities) subject to calibration. Although that seems reasonable, we feel that in reported evaluations, too much emphasis is typically given to calibration and not enough to sharpness.

Ex post evaluation with strictly proper scoring rules involves most of the same ideas encountered in the role of scoring rules in terms of ex ante incentives. Ex ante incentives for honesty translate into ex post evaluations of calibration, and ex ante measures of sharpness based on the probabilities translate into ex post measures based on the relative frequencies of occurrence of the events of interest for given probability values (given boxes). The use of scoring rules and their decompositions to evaluate probabilities ex post can be thought of as exploratory data analysis. Such evaluations can be used to compare experts or models. In the case of a single expert or model, they can be used to learn more about that expert's or model's characteristics and abilities as a probability forecaster. Feedback can also help the expert understand his or her own characteristics as a probability forecaster and attempt to improve them in the future.

SCORING RULES WITH BASELINE DISTRIBUTIONS

Scoring rules such as the quadratic, logarithmic, and spherical rules given earlier can be thought of as providing an absolute evaluation of probabilities. Often, we would like to have a relative evaluation by comparing how good a probability or probability distribution is, relative to some baseline. When assessing probabilities of rain, for example, it is easier to get a high score in a location where rain seldom occurs than it is in a location where it rains reasonably often. Does that mean that the probability forecasts in the drier area are better?

As noted earlier, the most commonly used scoring rules are symmetric in the sense that any permutation of the labels on the events and their associated probabilities does not change the expected score. One implication of this symmetry, when combined with the convexity of the expected score function, is that the expected score is minimized for a uniform distribution that gives a probability of 1/k to each event in the k-event case. Thus, these scores are implicitly being evaluated relative to a uniform baseline distribution. To avoid comparison with a uniform distribution, we could consider the percentage improvement in average scores over the scores for a baseline distribution. In assessing a probability of rain, for example, we might use climatology, which is the long-term relative frequency of rain in a given location at a specific time of year, as a baseline. However, a percentage improvement in the score over the baseline, which is called a skill score, is not strictly proper. For a strictly proper rule with a baseline distribution that is not uniform, we can choose a desired convex expected score function and generate a strictly proper rule that yields that expected score function [11]. For example, we might choose a function that is minimized at what might be viewed as a least skillful forecast. In forecasting rain, climatology might be considered least skillful among forecasts that seem reasonable, since it just involves looking up some past data and does not require any weather-forecasting expertise.
In contrast, although a uniform distribution requires no expertise, it may not seem at all reasonable, as in the case of a dry location with a very low climatological relative frequency of rain.

Asymmetric rules can be generated from symmetric rules. For example, in the single-event case for which it is felt that the expected score with honest reporting should be minimized at a probability of q, we can take any symmetric strictly proper scoring rule S and create a new rule S*(r, e | q) = [S(r, e) − S(q, e)]/T(q), where T(q) = S(1, 1) − S(q, 1) if r ≥ q and T(q) = S(0, 0) − S(q, 0) if r < q. More generally, the families of scoring rules given by Equations (12) and (13) can be generalized to pseudospherical and power families of strictly proper scoring rules that allow for the incorporation of baseline distributions [9].

If the baseline distribution for a set of k mutually exclusive and collectively exhaustive events is denoted by q = (q_1, ..., q_k), then we can define the pseudospherical and power families of scoring rules with baselines as follows:

S^S_β(r, e_i | q) = (1/(β − 1)) [ ( (r_i/q_i) / E_r[(r/q)^{β−1}]^{1/β} )^{β−1} − 1 ] (16)

and

S^P_β(r, e_i | q) = [ (r_i/q_i)^{β−1} − 1 ]/(β − 1) − [ E_r[(r/q)^{β−1}] − 1 ]/β, (17)

where E_r[(r/q)^{β−1}] = Σ_{i=1}^{k} r_i (r_i/q_i)^{β−1} and −∞ < β < ∞. These scoring rules are scaled so that they yield scores of 0 when r = q. Thus, a positive score represents improvement over the baseline and a negative score indicates a forecast worse than the baseline. (The expert's expected score with honest reporting is positive except at r = q, where it is 0.) As with Equations (12) and (13), β = 2 corresponds to spherical and quadratic rules, respectively, in Equations (16) and (17), and both families converge to a logarithmic rule when β → 1. Figure 2 shows S(r, e_1 | q), S(r, e_2 | q), and S(p, p | q) for the power scoring rule with β = 2 and q = (0.2, 0.8).

[Figure 2. (a) Score functions S(r, e_1 | q) and S(r, e_2 | q) and (b) expected score S(p, p | q) under honest reporting for the power scoring rule with β = 2 and q = (0.2, 0.8).]

The consideration of baseline distributions provides a relative evaluation as opposed to an absolute evaluation, and relative evaluations are often of great interest. In addition, evaluations with baseline distributions can be useful in evaluating probabilities (and evaluating the forecasters providing the probabilities) that are made under different circumstances. For example, if one weather forecaster assesses probabilities of rain in a very dry climate (say, with a climatology of 0.05) and another forecaster assesses probabilities in a more moist climate (climatology 0.40), then it is much easier for the first forecaster to obtain higher scores in an absolute evaluation. This is because the first forecaster is able, on average, to make sharper forecasts. If we use a relative evaluation with climatology as the baseline, then we are comparing the two forecasters in terms of how effective they are at improving upon a forecast based solely on climatology, thereby adjusting for the differences in the forecast situations. While not perfect, this will tend to even the playing field somewhat and make for fairer comparisons.
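As an illustration (a Python sketch with NumPy; the climatological baseline of 0.05 is an assumption for the example), Equations (16) and (17) can be computed directly. The scores are 0 when r = q, and a forecast that moves probability toward rain scores above the baseline when it rains and below it when it does not.

```python
import numpy as np

def power_baseline(r, i, q, beta):
    """Power score relative to baseline q, Eq. (17); equals 0 when r = q."""
    r, q = np.asarray(r, dtype=float), np.asarray(q, dtype=float)
    ratio = r / q
    e_r = np.sum(r * ratio ** (beta - 1))              # E_r[(r/q)^(beta-1)]
    return (ratio[i] ** (beta - 1) - 1) / (beta - 1) - (e_r - 1) / beta

def pseudospherical_baseline(r, i, q, beta):
    """Pseudospherical score relative to baseline q, Eq. (16); equals 0 when r = q."""
    r, q = np.asarray(r, dtype=float), np.asarray(q, dtype=float)
    ratio = r / q
    e_r = np.sum(r * ratio ** (beta - 1))
    return ((ratio[i] / e_r ** (1 / beta)) ** (beta - 1) - 1) / (beta - 1)

q = [0.05, 0.95]                                       # assumed climatology: rain is rare
print(power_baseline(q, 0, q, beta=2))                 # 0.0: reporting the baseline itself
print(power_baseline([0.40, 0.60], 0, q, beta=2))      # positive when rain (event 0) occurs
print(power_baseline([0.40, 0.60], 1, q, beta=2))      # negative when no rain occurs
print(pseudospherical_baseline([0.40, 0.60], 0, q, beta=2))   # also positive when rain occurs
```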

SCORING RULES THAT ARE SENSITIVE TO DISTANCE

In some situations, the events of interest are ordered. For example, in a soccer match, a team can win, lose, or tie. A win is better than a tie, which in turn is better than a loss, so there is an ordering. If we are giving probabilities for x, the amount of rain in inches on a given day, we might assess probabilities for x = 0, 0 < x ≤ 0.5, 0.5 < x ≤ 1, and x > 1. These four events are ordered.

The scoring rules for multiple events, discussed in the preceding sections, ignore any ordering. Suppose that two experts report probabilities of (0.3, 0.4, 0.2, 0.1) and (0.3, 0.1, 0.2, 0.4). If there is no rain, then the experts will receive the same score. They both gave a probability of 0.3 for the event that occurred, and they gave probabilities of 0.4, 0.2, and 0.1 for the other three events. The fact that the latter three probabilities were in different orders does not affect the score because the scoring rule does not take ordering of the events into account. Some might argue that the first expert gave more probability to the event closest to the event that occurred and should therefore receive a higher score. Scoring rules have been developed that would take ordering into account in this way, and we say that such rules are sensitive to distance. Informally, this means that for the probability not assigned to the event that occurs, a higher score will result if more probability is given to events closer to the event that occurs and less probability to more distant events.

The first strictly proper sensitive-to-distance scoring rule was a quadratic rule called the ranked probability score [12]:

S(r, e_i) = −Σ_{j=1}^{i−1} R_j^2 − Σ_{j=i}^{k−1} (1 − R_j)^2,

where R_j = Σ_{l=1}^{j} r_l is a cumulative probability. By connecting the score to cumulative probabilities, the rule is able to take sensitivity to distance into consideration. As probability moves from events more distant from the event that occurs to events closer to the event that occurs, the cumulative probabilities change accordingly and result in an increase in the score.

The same idea that is used to generate the ranked probability score from the quadratic score can be used to obtain a sensitive-to-distance scoring rule S* based on any strictly proper rule S for a single event and its complement:

S*(r, e_i) = Σ_{j=1}^{i−1} S(R_j, 0) + Σ_{j=i}^{k−1} S(R_j, 1). (18)

The corresponding expected score is S*(r, p) = Σ_{i=1}^{k−1} [P_i S(R_i, 1) + (1 − P_i) S(R_i, 0)], using a vector P of cumulative probabilities P_j = Σ_{l=1}^{j} p_l based on p. Note that Equation (18) can be used to generate new pseudospherical and power families of strictly proper scoring rules, with or without baseline distributions [13]. If baseline distributions are used, they will be expressed in cumulative form, with a vector Q of cumulative probabilities Q_j = Σ_{l=1}^{j} q_l representing the cumulative baseline distribution.

It is important to mention that properties of the scoring rule S in Equation (18), other than the fact that it is strictly proper, are not necessarily inherited by S*. For example, if S is logarithmic, S* will not inherit the property of locality mentioned earlier. The score is based on the cumulative probabilities R_1, ..., R_{k−1}, so it clearly depends on more than just r_i. Also, if S* is determined from Equation (18) using a symmetric, strictly proper S without a baseline distribution, then the expected score S*(p, p) is minimized at p = (0.5, 0, ..., 0, 0.5). That is, if a baseline distribution is not chosen, the default baseline distribution is (0.5, 0, ..., 0, 0.5), not (1/k, ..., 1/k), and the score for a uniform distribution will not be the same for all events because the ordering of the events is relevant and some events are more distant from others.
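The construction in Equation (18) is straightforward to implement. The sketch below (Python with NumPy; illustrative, not from the article) uses the binary quadratic rule as the building block S and reproduces the rain example above: when no rain occurs, the forecast (0.3, 0.4, 0.2, 0.1) scores higher than (0.3, 0.1, 0.2, 0.4) because its remaining probability is closer to the observed event.

```python
import numpy as np

def binary_quadratic(r, e):
    """Strictly proper binary rule used as the building block S in Eq. (18)."""
    return 1 - 2 * (e - r) ** 2

def sensitive_to_distance(r, i, S=binary_quadratic):
    """Eq. (18): apply S to the cumulative probabilities R_1, ..., R_{k-1}.

    r : reported probabilities for the ordered events A_1, ..., A_k
    i : 0-based index of the event that occurred
    """
    R = np.cumsum(np.asarray(r, dtype=float))[:-1]     # R_1, ..., R_{k-1}
    e = (np.arange(len(R)) >= i).astype(float)         # cumulative outcome indicators
    return sum(S(R_j, e_j) for R_j, e_j in zip(R, e))

r1 = [0.3, 0.4, 0.2, 0.1]
r2 = [0.3, 0.1, 0.2, 0.4]
print(sensitive_to_distance(r1, 0))   # 1.82: remaining probability is near the observed event
print(sensitive_to_distance(r2, 0))   # 0.98: remaining probability is far from the observed event
```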
Note that the default baseline distribution of (0.5, 0, ..., 0, 0.5) for S* translates to (0.5, ..., 0.5) when expressed in terms of the cumulative probabilities (R_1, ..., R_{k−1}). Since the relevant probabilities are cumulative in a score that is sensitive to distance, the baseline distribution is uniform in the cumulative probabilities. Furthermore, this distribution will give the same score S* regardless of which event occurs, because R_j = 0.5 in each of the S(R_j, 0) and S(R_j, 1) terms in Equation (18).

Commonly encountered scoring rules ignore any ordering of the events. When the events of interest are ordered, however, the ordering may be important in terms of the underlying real-world situation.

For instance, with forecasts of returns on an investment, high probabilities for values that are not identical to the returns that actually occur but are close to those returns would seem to be more valuable for investment purposes than high probabilities for values that are quite distant from the actual returns. In such a setting, a scoring rule that is sensitive to distance might provide incentives and an ex post evaluation that are more consistent with the decision-making problem.

SUMMARY AND DISCUSSION

Probability forecasts are important inputs to quantify uncertainty in inferential and decision-making problems. It is therefore important to have appropriate incentives for careful formulation of probability forecasts and to have measures to evaluate the forecasts once the uncertainty is resolved and we see what actually happens. Those are exactly the roles that are played by scoring rules. In particular, strictly proper scoring rules provide incentives for making good forecasts (i.e., sharp forecasts) and reporting them honestly. In terms of ex post evaluation, the incentive for honest reporting ex ante translates into measures of the calibration of the forecasts, and given good calibration, sharper forecasts will earn higher scores on average.

A few scoring rules tend to be used most often, but rich families of strictly proper scoring rules have been developed. Beyond the basic rules, there are options that add more flexibility while maintaining the strictly proper nature of the rules. Some rules allow the evaluation of probabilities relative to a chosen baseline distribution. Among other things, this makes scores for probabilities made in different situations more comparable. Other rules take into account any ordering of the events and are sensitive to distance in the sense of giving higher scores to probability distributions assigning higher probabilities to events near the event that occurs, all other things being equal. This feature is relevant when being close with the probability forecast can lead to better decisions or inferences.

How might a user choose a scoring rule in a given situation? Among the basic rules, some have different properties from others, and different rules can lead to different scores and different rankings of experts [14]. Thus, the choice of a rule might depend on how one feels about those properties and about the situation at hand. For example, with probabilistic answers for a multiple-choice test, the locality property of the logarithmic rule might have strong appeal. At the same time, the possibility of a score of negative infinity with the logarithmic score is a potential concern; some claim that the logarithmic rule has undesirable properties [15], and in certain settings, locality becomes less important. For example, in a two-event setting, all rules satisfy locality, and in a setting where sensitivity to distance is considered important, a sensitive-to-distance logarithmic rule no longer has the locality property. There is no general agreement on a single best rule for all situations. The use of a scoring rule that involves the choice of a baseline distribution depends on whether a relative evaluation is desired and whether there is a specific baseline distribution against which to compare probabilities. An important thing to keep in mind is that using the basic rules without choosing a baseline distribution means that probabilities are being evaluated relative to the default distribution, which is a uniform distribution.
Another choice when events are ordered is whether to use a rule that is sensitive to distance, and this choice is related to whether giving higher probabilities to events close to the event that occurs is viewed as important for the situation at hand.

What about practical issues in using the rules? The use of the rules ex post to evaluate probabilities is straightforward, just involving the computation of scores using the formula for any scoring rule that is chosen. Scores can then be used as feedback to enable experts and modelers to see their performance and perhaps learn from it. The use of scoring rules ex ante means that they should be part of a general probability assessment process that might include some training regarding probability if necessary. For many experts, the connection of the scoring rule formulas with the incentives is too opaque to make it valuable to dwell on the formulas. Discussing the incentives in an intuitive fashion is generally more effective. One option for relatively simple cases (e.g., a probability for a single event) is to present the possible scores in graphical or tabular form.

The incentive to maximize expected score with strictly proper scoring rules is probably reasonable in most cases. In the context of thinking of the expert as wanting to maximize expected utility, it implies that the utility function is linear in the score (or linear in money if the score is translated into a monetary reward). If the expert's utility function U(S) for the score is known, a modification of the score to S* = U^{−1}(S) with a strictly proper S will adjust for U, since U(S*) = U[U^{−1}(S)] = S, and will thereby encourage honest reporting.
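As an illustration of this adjustment (a hedged sketch; the exponential utility and the risk-tolerance value are assumptions for the example, not part of the article), paying the expert S* = U^{−1}(S) makes the utility of the payment equal to the underlying proper score, so expected utility is maximized by honest reporting.

```python
import numpy as np

RHO = 5.0                                   # assumed risk-tolerance parameter (illustrative)

def utility(s):
    """Hypothetical exponential utility for the score, U(s) = 1 - exp(-s/RHO)."""
    return 1 - np.exp(-s / RHO)

def utility_inverse(u):
    """U^{-1}; paying S* = U^{-1}(S) gives U(S*) = S, the original proper score."""
    return -RHO * np.log(1 - u)

def quadratic(r, e):
    """Binary quadratic score, Eq. (2)."""
    return 1 - 2 * (e - r) ** 2

s = quadratic(0.7, 1)                       # proper score when r = 0.7 and the event occurs
payment = utility_inverse(s)                # adjusted payment S*
print(s, payment, utility(payment))         # the utility of the payment equals the score: 0.82
```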

A practical problem with this is that we are not likely to know U, and eliciting it from the expert is not easy. In many cases, we feel that the importance of the score is probably not large enough to cause major violations of such linear utility due to risk aversion or risk taking, for example. However, if there are other stakes related to the probability forecasts, those stakes might cause significant shifting of the probabilities away from honest reporting. For example, if the situation is viewed as a contest against other experts (rightly or wrongly), the consideration of strategic play might lead the expert to give more extreme probabilities (i.e., probabilities closer to and often equal to 0 or 1) than justified by the expert's best judgments, in order to try to win the contest [16]. In most cases, however, we would expect that experts will try to come up with the best set of probabilities given the information that is available to them, and will not think strategically. In any event, strictly proper scoring rules can always be used for ex post evaluation purposes, and any hedging of reported probabilities can be expected to lead to lower average scores.

In closing, we note that work on scoring rules has interesting connections to other fields. Scoring rules are closely connected to decision theory/decision analysis. A decision maker may hire an expert to report probabilities for events related to the decision and might like to tailor a scoring rule to the decision-making problem, in the spirit of Savage's "share of the business" notion [6]. The expert's reported probabilities can be viewed as new information by the decision maker, and connections between scores and the value of that information are of interest. These notions are related to the literature on incentives and mechanism design in economics and especially to agency theory. On a different tack, expected scores from strictly proper scoring rules are related to information measures from signal processing and information theory [9]. For example, the expected score for honest reporting under a logarithmic scoring rule is the negative Shannon entropy of the expert's probability distribution p and is the Kullback-Leibler divergence of p with respect to q if the baseline distribution q is chosen. Finally, extensive experimental work by psychologists has investigated the degree to which individuals' probability assessments are well calibrated and has led to various theories of the calibration of subjective probabilities [17]. Judgments about others' probabilities are important in competitive situations, and economics and psychology are both relevant for this issue.
Given the importance of probability forecasts in decision modeling and statistics as well as the connections with these different (and somewhat disparate) fields, we expect the interest in scoring rules and the application of such rules in practice to grow.

REFERENCES

1. Brier GW. Verification of forecasts expressed in terms of probability. Mon Weather Rev 1950;78(1).
2. Good IJ. Rational decisions. J R Stat Soc Ser B 1952;14(1).
3. McCarthy J. Measures of the value of information. Proc Natl Acad Sci USA 1956;42(9).
4. de Finetti B. Does it make sense to speak of good probability appraisers? In: Good IJ, editor. The scientist speculates: an anthology of partly-baked ideas. New York: Wiley.
5. Winkler RL, Murphy AH. Good probability assessors. J Appl Meteorol 1968;7(5).
6. Savage LJ. Elicitation of personal probabilities and expectations. J Am Stat Assoc 1971;66(336).

7. Winkler RL. Scoring rules and the evaluation of probabilities. Test 1996;5(1).
8. Gneiting T, Raftery A. Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 2007;102(477).
9. Jose VRR, Nau RF, Winkler RL. Scoring rules, generalized entropy, and utility maximization. Oper Res 2008;56(5).
10. DeGroot MH, Fienberg SE. Assessing probability assessors: calibration and refinement. In: Gupta SS, Berger JO, editors. Statistical decision theory and related topics. New York: Academic Press.
11. Winkler RL. Evaluating probabilities: asymmetric scoring rules. Manage Sci 1994;40(11).
12. Epstein ES. A scoring system for probability forecasts of ranked categories. J Appl Meteorol 1969;8(6).
13. Jose VRR, Nau RF, Winkler RL. Sensitivity to distance and baseline distributions in forecast evaluation. Manage Sci 2009;55(4).
14. Bickel JE. Some comparisons among quadratic, spherical, and logarithmic rules. Decis Anal 2007;4(2).
15. Selten R. Axiomatic characterization of the quadratic scoring rule. Exp Econ 1998;1(1).
16. Lichtendahl KC, Winkler RL. Probability elicitation, scoring rules, and competition among forecasters. Manage Sci 2007;53(11).
17. O'Hagan A, Buck CE, Daneshkhah A, et al. Uncertain judgements: eliciting experts' probabilities. Chichester: Wiley; 2006.


2. Probability. Chris Piech and Mehran Sahami. Oct 2017 2. Probability Chris Piech and Mehran Sahami Oct 2017 1 Introduction It is that time in the quarter (it is still week one) when we get to talk about probability. Again we are going to build up from first

More information

Are Probabilities Used in Markets? 1

Are Probabilities Used in Markets? 1 Journal of Economic Theory 91, 8690 (2000) doi:10.1006jeth.1999.2590, available online at http:www.idealibrary.com on NOTES, COMMENTS, AND LETTERS TO THE EDITOR Are Probabilities Used in Markets? 1 Larry

More information

Foundations of Mathematics 11 and 12 (2008)

Foundations of Mathematics 11 and 12 (2008) Foundations of Mathematics 11 and 12 (2008) FOUNDATIONS OF MATHEMATICS GRADE 11 [C] Communication Measurement General Outcome: Develop spatial sense and proportional reasoning. A1. Solve problems that

More information

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com 1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com Gujarati D. Basic Econometrics, Appendix

More information

The Laplace Rule of Succession Under A General Prior

The Laplace Rule of Succession Under A General Prior 1 The Laplace Rule of Succession Under A General Prior Kalyan Raman University of Michigan in Flint School of Management Flint, MI 48502 May 2000 ------------------------------------------------------------------------------------------------

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 Introduction One of the key properties of coin flips is independence: if you flip a fair coin ten times and get ten

More information

Cómo contar la predicción probabilistica? Part I: The problem with probabilities

Cómo contar la predicción probabilistica? Part I: The problem with probabilities Cómo contar la predicción probabilistica? How to tell probabilities in weather forecasting? Part I: The problem with probabilities 1 -Why all this talk about uncertainty and probabilities? The computer

More information

Microeconomics II Lecture 4: Incomplete Information Karl Wärneryd Stockholm School of Economics November 2016

Microeconomics II Lecture 4: Incomplete Information Karl Wärneryd Stockholm School of Economics November 2016 Microeconomics II Lecture 4: Incomplete Information Karl Wärneryd Stockholm School of Economics November 2016 1 Modelling incomplete information So far, we have studied games in which information was complete,

More information

PHY 123 Lab 1 - Error and Uncertainty and the Simple Pendulum

PHY 123 Lab 1 - Error and Uncertainty and the Simple Pendulum To print higher-resolution math symbols, click the Hi-Res Fonts for Printing button on the jsmath control panel. PHY 13 Lab 1 - Error and Uncertainty and the Simple Pendulum Important: You need to print

More information

Transitive Regret over Statistically Independent Lotteries

Transitive Regret over Statistically Independent Lotteries Transitive Regret over Statistically Independent Lotteries April 2, 2012 Abstract Preferences may arise from regret, i.e., from comparisons with alternatives forgone by the decision maker. We show that

More information

SUPPORTING INFORMATION ALGEBRA II. Texas Education Agency

SUPPORTING INFORMATION ALGEBRA II. Texas Education Agency SUPPORTING INFORMATION ALGEBRA II Texas Education Agency The materials are copyrighted (c) and trademarked (tm) as the property of the Texas Education Agency (TEA) and may not be reproduced without the

More information

The Complexity of Forecast Testing

The Complexity of Forecast Testing The Complexity of Forecast Testing LANCE FORTNOW and RAKESH V. VOHRA Northwestern University Consider a weather forecaster predicting the probability of rain for the next day. We consider tests that given

More information

Measuring the Standard of Living: Uncertainty about Its Development

Measuring the Standard of Living: Uncertainty about Its Development Measuring the Standard of Living: Uncertainty about Its Development Wulf Gaertner Department of Economics, University of Osnabrück D 49069 Osnabrück, Germany E-mail: WGaertner@oec.uni-osnabrueck.de Yongsheng

More information

A Eliciting Predictions for Discrete Decision Making

A Eliciting Predictions for Discrete Decision Making A Eliciting Predictions for Discrete Decision Making YILING CHEN, Harvard University IAN A. KASH, Microsoft Research MIKE RUBERRY, Harvard University VICTOR SHNAYDER, Harvard University We consider a decision

More information

Notes and Solutions #6 Meeting of 21 October 2008

Notes and Solutions #6 Meeting of 21 October 2008 HAVERFORD COLLEGE PROBLEM SOLVING GROUP 008-9 Notes and Solutions #6 Meeting of October 008 The Two Envelope Problem ( Box Problem ) An extremely wealthy intergalactic charitable institution has awarded

More information

Extended Range Truing, Why and How

Extended Range Truing, Why and How Extended Range Truing, Why and How For a hundred and fifty years, ballisticians have sought to improve their predictions of the path of the bullet after it leaves the gun. The bullet s behavior is primarily

More information

STAT Chapter 3: Probability

STAT Chapter 3: Probability Basic Definitions STAT 515 --- Chapter 3: Probability Experiment: A process which leads to a single outcome (called a sample point) that cannot be predicted with certainty. Sample Space (of an experiment):

More information

Deceptive Advertising with Rational Buyers

Deceptive Advertising with Rational Buyers Deceptive Advertising with Rational Buyers September 6, 016 ONLINE APPENDIX In this Appendix we present in full additional results and extensions which are only mentioned in the paper. In the exposition

More information

Averaging Probability Forecasts: Back to the Future

Averaging Probability Forecasts: Back to the Future Averaging Probability Forecasts: Back to the Future Robert L. Winkler Kenneth C. Lichtendahl Jr. Yael Grushka-Cockayne Victor Richmond R. Jose Working Paper 19-039 Averaging Probability Forecasts: Back

More information

Math 710 Homework 1. Austin Mohr September 2, 2010

Math 710 Homework 1. Austin Mohr September 2, 2010 Math 710 Homework 1 Austin Mohr September 2, 2010 1 For the following random experiments, describe the sample space Ω For each experiment, describe also two subsets (events) that might be of interest,

More information

II. Analysis of Linear Programming Solutions

II. Analysis of Linear Programming Solutions Optimization Methods Draft of August 26, 2005 II. Analysis of Linear Programming Solutions Robert Fourer Department of Industrial Engineering and Management Sciences Northwestern University Evanston, Illinois

More information

Recitation 7: Uncertainty. Xincheng Qiu

Recitation 7: Uncertainty. Xincheng Qiu Econ 701A Fall 2018 University of Pennsylvania Recitation 7: Uncertainty Xincheng Qiu (qiux@sas.upenn.edu 1 Expected Utility Remark 1. Primitives: in the basic consumer theory, a preference relation is

More information

A New Interpretation of Information Rate

A New Interpretation of Information Rate A New Interpretation of Information Rate reproduced with permission of AT&T By J. L. Kelly, jr. (Manuscript received March 2, 956) If the input symbols to a communication channel represent the outcomes

More information

Evidence with Uncertain Likelihoods

Evidence with Uncertain Likelihoods Evidence with Uncertain Likelihoods Joseph Y. Halpern Cornell University Ithaca, NY 14853 USA halpern@cs.cornell.edu Riccardo Pucella Cornell University Ithaca, NY 14853 USA riccardo@cs.cornell.edu Abstract

More information

COHERENCE AND PROBABILITY. Rosangela H. Loschi and Sergio Wechsler. Universidade Federal de Minas Gerais e Universidade de S~ao Paulo ABSTRACT

COHERENCE AND PROBABILITY. Rosangela H. Loschi and Sergio Wechsler. Universidade Federal de Minas Gerais e Universidade de S~ao Paulo ABSTRACT COHERENCE AND PROBABILITY Rosangela H. Loschi and Sergio Wechsler Universidade Federal de Minas Gerais e Universidade de S~ao Paulo ABSTRACT A process of construction of subjective probability based on

More information

May 4, Statement

May 4, Statement Global Warming Alarm Based on Faulty Forecasting Procedures: Comments on the United States Department of State s U.S. Climate Action Report 2010. 5 th ed. Submitted by: May 4, 2010 J. Scott Armstrong (Ph.D.,

More information

HOW GOOD WERE THOSE PROBABILITY PREDICTIONS? THE EXPECTED RECOMMENDATION LOSS (ERL) SCORING RULE

HOW GOOD WERE THOSE PROBABILITY PREDICTIONS? THE EXPECTED RECOMMENDATION LOSS (ERL) SCORING RULE HOW GOOD WERE THOSE PROBABILITY PREDITIONS? THE EXPETED REOMMENDATION LOSS (ERL) SORING RULE David B. Rosen enter for Biomedical Modeling Research University of Nevada, Reno Present address: Department

More information

7.5 Partial Fractions and Integration

7.5 Partial Fractions and Integration 650 CHPTER 7. DVNCED INTEGRTION TECHNIQUES 7.5 Partial Fractions and Integration In this section we are interested in techniques for computing integrals of the form P(x) dx, (7.49) Q(x) where P(x) and

More information

CHAPTER 3. THE IMPERFECT CUMULATIVE SCALE

CHAPTER 3. THE IMPERFECT CUMULATIVE SCALE CHAPTER 3. THE IMPERFECT CUMULATIVE SCALE 3.1 Model Violations If a set of items does not form a perfect Guttman scale but contains a few wrong responses, we do not necessarily need to discard it. A wrong

More information

Microeconomic theory focuses on a small number of concepts. The most fundamental concept is the notion of opportunity cost.

Microeconomic theory focuses on a small number of concepts. The most fundamental concept is the notion of opportunity cost. Microeconomic theory focuses on a small number of concepts. The most fundamental concept is the notion of opportunity cost. Opportunity Cost (or "Wow, I coulda had a V8!") The underlying idea is derived

More information

Verification of ensemble and probability forecasts

Verification of ensemble and probability forecasts Verification of ensemble and probability forecasts Barbara Brown NCAR, USA bgb@ucar.edu Collaborators: Tara Jensen (NCAR), Eric Gilleland (NCAR), Ed Tollerud (NOAA/ESRL), Beth Ebert (CAWCR), Laurence Wilson

More information

Combining Forecasts: The End of the Beginning or the Beginning of the End? *

Combining Forecasts: The End of the Beginning or the Beginning of the End? * Published in International Journal of Forecasting (1989), 5, 585-588 Combining Forecasts: The End of the Beginning or the Beginning of the End? * Abstract J. Scott Armstrong The Wharton School, University

More information

Decision Making Beyond Arrow s Impossibility Theorem, with the Analysis of Effects of Collusion and Mutual Attraction

Decision Making Beyond Arrow s Impossibility Theorem, with the Analysis of Effects of Collusion and Mutual Attraction Decision Making Beyond Arrow s Impossibility Theorem, with the Analysis of Effects of Collusion and Mutual Attraction Hung T. Nguyen New Mexico State University hunguyen@nmsu.edu Olga Kosheleva and Vladik

More information

Conjectural Variations in Aggregative Games: An Evolutionary Perspective

Conjectural Variations in Aggregative Games: An Evolutionary Perspective Conjectural Variations in Aggregative Games: An Evolutionary Perspective Alex Possajennikov University of Nottingham January 2012 Abstract Suppose that in aggregative games, in which a player s payoff

More information

KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION. Unit : I - V

KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION. Unit : I - V KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION Unit : I - V Unit I: Syllabus Probability and its types Theorems on Probability Law Decision Theory Decision Environment Decision Process Decision tree

More information

Preliminary Results on Social Learning with Partial Observations

Preliminary Results on Social Learning with Partial Observations Preliminary Results on Social Learning with Partial Observations Ilan Lobel, Daron Acemoglu, Munther Dahleh and Asuman Ozdaglar ABSTRACT We study a model of social learning with partial observations from

More information

ECO 317 Economics of Uncertainty Fall Term 2009 Notes for lectures 1. Reminder and Review of Probability Concepts

ECO 317 Economics of Uncertainty Fall Term 2009 Notes for lectures 1. Reminder and Review of Probability Concepts ECO 317 Economics of Uncertainty Fall Term 2009 Notes for lectures 1. Reminder and Review of Probability Concepts 1 States and Events In an uncertain situation, any one of several possible outcomes may

More information

Chapter 4 - Introduction to Probability

Chapter 4 - Introduction to Probability Chapter 4 - Introduction to Probability Probability is a numerical measure of the likelihood that an event will occur. Probability values are always assigned on a scale from 0 to 1. A probability near

More information

Section 13.3 Probability

Section 13.3 Probability 288 Section 13.3 Probability Probability is a measure of how likely an event will occur. When the weather forecaster says that there will be a 50% chance of rain this afternoon, the probability that it

More information

A Note on the Existence of Ratifiable Acts

A Note on the Existence of Ratifiable Acts A Note on the Existence of Ratifiable Acts Joseph Y. Halpern Cornell University Computer Science Department Ithaca, NY 14853 halpern@cs.cornell.edu http://www.cs.cornell.edu/home/halpern August 15, 2018

More information