APPENDIX 1: CONSTRUCTING GINI COEFFICIENTS
A. Non-parametric estimation of bounds

We follow the technique described in Murray (1978) for finding the upper and lower bounds of the Gini from grouped data on income distributions. [1] Calculating these bounds involves two optimization problems, which can be solved using linear and quadratic programming methods. We briefly outline the two optimization problems in this section.

The structure of the American Indian reservation income data provides the following information: $N$ is the total number of families, $\mu$ is the population mean income, $n_k$ is the number of families in the $k$-th interval where $k = 1, \ldots, K$, and $[y_k^L, y_k^U]$ are the lower and upper income limits of this interval. For the terminal interval $K$, there is no upper limit, which we denote as $y_K^U = \infty$. Two important statistics are unknown to us, given data limitations. These are: (1) the mean income within each interval, which we define as $y_k$, and (2) the distribution of income within each interval. Solving the optimization problems requires choosing these unknowns to solve for the upper and lower bounds of the Gini coefficient.

Intuitively, finding the lower-bound Gini for a given grouped income distribution requires concentrating the within-interval mean incomes as much as possible towards the mean of the entire distribution, $\mu$. Minimizing the Gini also requires concentrating the distribution of within-interval incomes on the (unknown) within-interval mean income $y_k$. Table A1 provides a simplified numerical example where $K = 3$ bins and $N = 100$ families.

Table A1: Numerical Example to Illustrate Optimization Procedure

Total number of families                    $N = 100$
Number of families within income range
  \$0 to \$33 = $[y_1^L, y_1^U)$            $n_1 = 40$
  \$33 to \$66 = $[y_2^L, y_2^U)$           $n_2 = 40$
  \$66 or greater = $[y_3^L, \infty)$       $n_3 = 20$
Population Mean Income                      $\mu$ = \$50
Within-Group Mean Incomes                   $y_1, y_2, y_3$ = unknowns to be chosen,
                                            constrained by $y_k^L \le y_k \le y_k^U$

[1] For general non-parametric estimation techniques for inequality measures see Cowell and Mehta (1982), Gastwirth and Glauberman (1976), and Cowell (2000). Depending on what information is available, different methods are available for calculating the bounds of inequality measures. Gastwirth (1972), Cowell (1991), McDonald and Ransom (1981), and Murray (1978) are all papers with these types of objectives in mind.
Figure A1 illustrates how the upper and lower bounds of the Gini are found based on the numerical example in Table A1. With respect to the lower bound, note how the outer income ranges converge as much as possible towards the population mean income $\mu$. Intuitively, one might expect $y_1$ to be chosen so that it is exactly equal to $y_1^U$, and $y_3$ to be chosen so that it is exactly equal to $y_3^L$, but there is the constraint that the sum of total income across all groups must be equal to the total income of the reservation, i.e., $n_1 y_1 + n_2 y_2 + n_3 y_3 = N\mu$. Shown surrounding each $y_k$ is the (unobservable) intermediate income distribution within interval $k$. To find the lower bound on the Gini, the optimization procedure forces the within-interval income distributions to collapse until every family within each interval has the same income (equal to $y_k$) and there is no within-interval inequality.

Figure A1: Finding Upper and Lower Bounds of Gini
Notes: $y_k^{LB}$ corresponds to finding the lower bound of the Gini and $y_k^{UB}$ corresponds to finding the upper bound of the Gini.

Now consider finding the upper bound on the Gini. Here the optimization procedure involves allowing the within-interval mean income to diverge as much as possible from the population mean income $\mu$. Note in Figure A1 how the within-interval mean incomes for the outer bins, $y_1$ and $y_3$, diverge from the population mean $\mu$. Similar to the lower bound case, there is the constraint $n_1 y_1 + n_2 y_2 + n_3 y_3 = N\mu$, which limits how far the $y_k$s can move. If this constraint were not effective, then the within-interval mean for the terminal bin $y_3$ would approach infinity and there would be no solution. The within-interval inequality is also maximized such that a certain percentage of families earn the upper limit $y_k^U$ and the rest earn the lower limit $y_k^L$, shown as the dividers between each income range.
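The lower-bound intuition can be checked numerically for the Table A1 example. The sketch below is illustrative only: the helper name `grouped_gini` and the two candidate sets of within-interval means are assumptions made for this example, not values taken from the analysis. It computes the Gini when every family in an interval earns exactly that interval's mean, for two candidates that both satisfy the total-income constraint.

```python
def grouped_gini(counts, means):
    """Gini when every family in group k earns exactly means[k].

    Uses the pairwise definition of the Gini applied to groups of
    identical incomes: sum over group pairs of n_j * n_k * |y_k - y_j|,
    divided by N^2 * mu.
    """
    n_total = sum(counts)
    mu = sum(n * y for n, y in zip(counts, means)) / n_total
    between = 0.0
    for j in range(len(counts)):
        for k in range(j + 1, len(counts)):
            between += counts[j] * counts[k] * abs(means[k] - means[j])
    return between / (n_total ** 2 * mu)


counts = [40, 40, 20]        # families per interval (Table A1)
total_income = 100 * 50      # N times the population mean (Table A1)

# Hypothetical candidates: midpoints of the two closed bins, versus
# outer means pushed toward the population mean.
spread_out = [16.5, 49.5, 118.0]
concentrated = [33.0, 59.0, 66.0]

for means in (spread_out, concentrated):
    income = sum(n * y for n, y in zip(counts, means))
    print(means, income, round(grouped_gini(counts, means), 4))
```

Both candidates reproduce the same total income, but the one whose outer means sit closer to the population mean yields the smaller Gini, which is the direction the lower-bound problem pushes in.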
To formally solve the optimization problems we define $\lambda_k$ for all closed income intervals as the proportion of families in interval $k$ having income $y_k^U$ and $1 - \lambda_k$ as the proportion having income $y_k^L$. The mean income within interval $k$ can now be written as $y_k = \lambda_k y_k^U + (1 - \lambda_k) y_k^L$ where $0 \le \lambda_k \le 1$ for $k = 1, \ldots, K-1$. For the unbounded, terminal interval we define $\lambda_K$ as the ratio of mean income $y_K$ to the lower bound $y_K^L$. The mean income for the terminal interval can be written as $y_K = \lambda_K y_K^L$ where $\lambda_K \ge 1$.

Given this specification, we can vary the two unknowns in the optimization problem, within-interval mean income $y_k$ and within-interval inequality, by choosing the $\lambda$s defined above. Moving $\lambda_k$ from 0 to 1 moves $y_k$ from $y_k^L$ to $y_k^U$. In the upper bound Gini case, each family in interval $k$ can have income equal to either $y_k^L$ or $y_k^U$. Therefore, if $\lambda_k = 0$ then $y_k = y_k^L$, and if $\lambda_k = 1$, then $y_k = y_k^U$. In either case, incomes are equal for all families within this group. As $\lambda_k$ moves away from 0 or 1, within-interval inequality increases as a certain proportion of families earn income at the opposite ends of the income range.

When finding the lower bound of the Gini coefficient, the Gini can be expressed in the following way as a linear function of $\lambda_k$. Minimizing this function yields the lower bound Gini.

$$\text{Gini}^{LB} = \min_{\lambda_k} G(\lambda_k) = \frac{1}{N^2 \mu} \left[ n_K P_K y_K^L \lambda_K + \sum_{k=2}^{K-1} n_k P_k \left(y_k^U - y_k^L\right) \lambda_k - \lambda_1 \left(y_1^U - y_1^L\right) \left( \sum_{k=2}^{K} n_k P_k \right) + \sum_{k=2}^{K-1} n_k P_k \left(y_k^L - y_1^L\right) - n_K P_K y_1^L \right]$$

where $P_k = \sum_{j<k} n_j - \sum_{j>k} n_j$ is the number of families in the intervals below $k$ minus the number in the intervals above $k$. This function takes into account the fact that within-interval incomes are equal.

When finding the upper bound of the Gini, there is an additional degree of freedom allowing within-interval inequality to vary. The Gini is defined as a quadratic function of $\lambda_k$. Finding the upper bound Gini requires maximizing this function as below,

$$\text{Gini}^{UB} = \max_{\lambda_k} G(\lambda_k) = \frac{1}{N^2 \mu} \left[ n_K P_K y_K^L \lambda_K + \sum_{k=2}^{K-1} n_k P_k \left(y_k^U - y_k^L\right) \lambda_k - \lambda_1 \left(y_1^U - y_1^L\right) \left( \sum_{k=2}^{K} n_k P_k \right) + \sum_{k=2}^{K-1} n_k P_k \left(y_k^L - y_1^L\right) - n_K P_K y_1^L + \sum_{k=1}^{K-1} n_k^2 \left(y_k^U - y_k^L\right) \left(\lambda_k - \lambda_k^2\right) + n_K (n_K - 1) y_K^L (\lambda_K - 1) \right]$$

where $P_k$ is defined as above.
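As a check on the algebra, the two objectives can be evaluated numerically and compared with a brute-force Gini computed from an explicit list of family incomes. The sketch below is a minimal Python illustration, not the authors' code: the function names are hypothetical, the weight attached to each interval is taken to be the number of families in lower intervals minus the number in higher intervals (an assumption of this sketch), and the example shares are made up for the Table A1 bins.

```python
from itertools import combinations


def pairwise_gini(incomes):
    """Population Gini: sum of |x_i - x_j| over unordered pairs / (N^2 * mu)."""
    n = len(incomes)
    mu = sum(incomes) / n
    diff = sum(abs(a - b) for a, b in combinations(incomes, 2))
    return diff / (n ** 2 * mu)


def bound_objectives(counts, lower, upper, lam):
    """Return (lower-bound objective, upper-bound objective) evaluated at lam.

    lam[k] is the share of families at the upper limit for closed
    intervals, and the ratio of the mean to the lower limit for the
    terminal interval (whose upper limit is never used).
    """
    K = len(counts)
    n_total = sum(counts)
    means = [lam[k] * upper[k] + (1 - lam[k]) * lower[k] for k in range(K - 1)]
    means.append(lam[-1] * lower[-1])
    mu = sum(n * y for n, y in zip(counts, means)) / n_total
    # Assumed weight: families in lower intervals minus families in higher ones.
    P = [sum(counts[:k]) - sum(counts[k + 1:]) for k in range(K)]
    between = sum(n * p * y for n, p, y in zip(counts, P, means))
    # Within-interval terms: two-point split in closed intervals, and one
    # family holding all the excess income in the terminal interval.
    within = sum(
        counts[k] ** 2 * (upper[k] - lower[k]) * (lam[k] - lam[k] ** 2)
        for k in range(K - 1)
    )
    within += counts[-1] * (counts[-1] - 1) * lower[-1] * (lam[-1] - 1)
    scale = n_total ** 2 * mu
    return between / scale, (between + within) / scale


counts, lower = [40, 40, 20], [0.0, 33.0, 66.0]
upper = [33.0, 66.0, float("inf")]
lam = [0.5, 0.5, 118.0 / 66.0]      # hypothetical; total income is 5000
lb, ub = bound_objectives(counts, lower, upper, lam)

# Explicit family incomes matching each case.
equal_incomes = [16.5] * 40 + [49.5] * 40 + [118.0] * 20
split_incomes = ([0.0] * 20 + [33.0] * 20 + [33.0] * 20 + [66.0] * 20
                 + [66.0] * 19 + [1106.0])
print(round(lb, 4), round(pairwise_gini(equal_incomes), 4))
print(round(ub, 4), round(pairwise_gini(split_incomes), 4))
```

The two numbers on each printed line agree, which suggests the closed-form objective and the pairwise definition describe the same quantity at this (arbitrary, non-optimal) choice of shares. An actual bound would still require optimizing over the shares with a linear or quadratic programming routine.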
Whether minimizing or maximizing the objective functions to find the bounds on the Gini, the optimization procedures are constrained by the limits of $\lambda_k$ and by total reservation income. We define these constraints as:

(C1) $0 \le \lambda_k \le 1$ for $k = 1, \ldots, K-1$,
(C2) $1 \le \lambda_K$,
(C3) $\sum_{k=1}^{K-1} n_k \left[ \lambda_k y_k^U + (1 - \lambda_k) y_k^L \right] + n_K \lambda_K y_K^L = N\mu$.

Because both the objective function and the constraints for the minimization problem are linear in the decision variable $\lambda_k$, we use linear programming methods to calculate the lower bound Gini. In the case of the maximization problem, the objective function is quadratic in $\lambda_k$, which is why we employ quadratic programming methods to calculate the upper bound Gini.

For the analysis that follows, we simply use the midpoint between the upper and lower bounds as the point estimate for the Gini coefficients. We justify this choice of a 0.50 weight between the lower and upper bound in section B below. For the simple numerical example in Table A1, where $K = 3$, the lower bound Gini is [...] and the upper bound Gini is [...]. The midpoint is [...]. It is worth noting that the range between lower and upper bounds decreases with the number of intervals $K$, holding constant the distribution of income. For this reason, the upper minus lower bound difference in our actual Gini computations is significantly narrower than the [...] range computed from the simple example. [2]

B. GMM estimation with Maximum Entropy distribution

In order to produce a point estimate of the Gini coefficient, one must select a compromise value between the upper and lower bound Gini. For cases in which the grouped data contains information on group frequencies and group means, Cowell (2000) suggests a weighted average of the bounds giving a weight of 2/3 to the upper bound and 1/3 to the lower bound.
Due to data limitations in our setting, we observe only group frequencies and population means and therefore have little guidance from Cowell (2000), or the rest of the literature, in terms of selecting a compromise value between lower and upper bound Ginis.

[2] We calculated the Gini bounds with $K = 5$ and $K = 10$ assuming the family incomes were distributed uniformly from \$0-100 and kept the lower limit of the terminal interval at \$[...]/$K$. For example, for $K = 5$, there are approximately 10 families to a bin with a terminal interval of \$80 or greater and, for $K = 10$, there are roughly five families to an interval with a terminal interval of \$90 or greater. The difference between the upper and lower Gini bounds diminishes rapidly as $K$ increases. When $K = 5$ this difference is [...] and when $K = 10$ this difference is [...].
To assess whether or not the midpoint of the lower and upper bound Ginis is a reasonable compromise value, we estimate the Gini coefficient after first estimating the continuous income distribution using techniques in Wu and Perloff (2007). Their procedure involves assuming a variable (income, in our case) is distributed according to a flexible maximum entropy density function [3], and then estimating the parameters of the so-called maxent density by Generalized Method of Moments (GMM). This procedure gives us a blueprint for calculating a Gini coefficient directly from an estimated density function. Because of computational constraints resulting from our data structure, however, we are unable to use the Wu and Perloff method to estimate the Gini coefficient for all reservations in our sample. [4] Therefore, we use the Wu and Perloff method to calculate Gini coefficients for a subset of our sample, and then compare the estimates with the midpoints of the lower and upper bound calculations described above.

The following is a brief sketch of the estimation procedure. For theoretical motivation and a more in-depth explanation of the procedure, see Wu and Perloff (2007). The maxent density function we use to approximate the distribution is defined as

$$f(x \mid \theta) = \exp\left( -\theta_0 - \theta_1 x - \theta_2 x^2 - \theta_3 \arctan(x) - \theta_4 \log(1 + x^2) \right)$$

where $\theta_0$ is a normalization term that is defined such that $\int_{-\infty}^{\infty} f(x \mid \theta)\,dx = 1$. A special case of this function is the normal distribution, which occurs when $\theta_3$ and $\theta_4$ are both equal to zero. The $\arctan(x)$ component allows for deviations from a symmetric distribution, e.g., skewness or multimodality. The $\log(1 + x^2)$ component allows for fat tails. We employ GMM methods to estimate the parameters of this density function that best fit the grouped income data. This method involves minimizing a weighted quadratic function of the moment conditions of the density function, which takes the form of (1).
$$\hat{\theta} = \arg\min_{\theta} \; m(\theta)' \, W \, m(\theta) \qquad (1)$$

Here the moment conditions are defined as

$$m(\theta) = \begin{bmatrix} \int_{y_k^L}^{y_k^U} f(x \mid \theta)\,dx - s_k \\ \int x f(x \mid \theta)\,dx - \mu \end{bmatrix},$$

where $[y_k^L, y_k^U]$ are the lower and upper bounds of interval $k$, $s_k = n_k / N$ is the share of the population in interval $k$, and $\mu$ is the population mean. The objective is to make these moment conditions as close to zero as possible subject to the weights in the weight matrix $W$. $W$ is estimated using the following simulation procedure.

1. Draw an i.i.d. random sample $\tilde{x}$ of size $N$ from $f(x \mid \hat{\theta}_0)$, where $\hat{\theta}_0$ is a consistent preliminary estimate calculated by setting $W$ to the identity matrix.
2. Group $\tilde{x}$ in the same way as the original interval data and calculate the simulated shares $\tilde{s}_k$ and the simulated population mean $\tilde{\mu}$.
3. Calculate the simulated weight matrix $\tilde{W}$, defined in terms of the simulated shares $\tilde{s}$ and the simulated mean $\tilde{\mu}$: [...]

Repeat this procedure $B$ times; the weighting matrix is the average of these simulated weight matrices: $W = \frac{1}{B} \sum_{b=1}^{B} \tilde{W}^{(b)}$.

Once the entire distribution of income is estimated, the Gini coefficient can be calculated as the following:

$$\text{Gini} = \mu^{-1} \int_0^{\infty} F(x) \left(1 - F(x)\right) dx$$

where $\mu = \int_0^{\infty} x f(x)\,dx$ and $F(x) = \int_0^{x} f(t \mid \hat{\theta})\,dt$.

In order to get an actual estimate of $\hat{\theta}$, we minimize function (1) using the BFGS algorithm with numerical gradients. Because the maxent density is a rather complex function, a significant amount of numerical integration is required in order to evaluate this function, its gradients, and the moment conditions. This makes optimization slow. Moreover, it is often difficult to get the objective function to converge when trying to estimate the income distribution from grouped data which have bin or group intervals containing zero families. For this reason, we are unable to estimate the distribution of income and the Gini coefficient for many of the reservations whose grouped data have gaps (generally sparsely populated reservations). Nonetheless, this method allows us to estimate the Gini coefficients for a significant number of reservations. We compare the estimates with the estimates from the non-parametric procedure described above using different weights on the upper and lower bound Gini coefficients in order to find a reasonable compromise point estimate.

[3] The principle of maximum entropy is to choose the probability distribution consistent with known information while being as noncommittal as possible with regard to missing information.

[4] The algorithm we use to minimize the objective function has difficulty converging for reservations which have an interval with no families in it, i.e., there is an empty bin in the frequency table. Generally, these are the smaller reservations.
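The last step, turning an estimated density into a Gini coefficient, can be sketched with plain numerical integration. The code below is a hedged illustration rather than the procedure actually used: the function names, the truncation point `upper`, the grid size, and the restriction of the density to non-negative incomes are all assumptions of this sketch. It normalizes a density with linear, quadratic, arctan and log(1 + x^2) terms in the exponent on a grid, builds the CDF, and integrates F(x)(1 - F(x)) divided by the mean.

```python
import math


def maxent_unnormalized(x, theta):
    """exp(-t1*x - t2*x^2 - t3*arctan(x) - t4*log(1 + x^2)), before normalizing."""
    t1, t2, t3, t4 = theta
    return math.exp(
        -t1 * x - t2 * x * x - t3 * math.atan(x) - t4 * math.log1p(x * x)
    )


def gini_from_density(theta, upper=200.0, steps=20_000):
    """Gini = (1 / mu) * integral over [0, upper] of F(x) * (1 - F(x)).

    All integrals use the trapezoid rule on an evenly spaced grid.
    """
    h = upper / steps
    xs = [i * h for i in range(steps + 1)]
    dens = [maxent_unnormalized(x, theta) for x in xs]

    # Cumulative trapezoid integral: the unnormalized CDF on the grid.
    cdf = [0.0]
    for i in range(1, steps + 1):
        cdf.append(cdf[-1] + 0.5 * h * (dens[i - 1] + dens[i]))
    mass = cdf[-1]                       # the normalizing constant
    cdf = [c / mass for c in cdf]
    dens = [d / mass for d in dens]

    def trapz(vals):
        return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

    mu = trapz([x * d for x, d in zip(xs, dens)])
    return trapz([c * (1.0 - c) for c in cdf]) / mu


# Exponential special case (only the linear coefficient is non-zero).
# The Gini of an exponential distribution is known to be 1/2.
print(round(gini_from_density((0.1, 0.0, 0.0, 0.0)), 4))
```

Checking against a distribution with a known Gini is a cheap way to validate the integration grid before applying the same routine to fitted parameters.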
In order to get a reasonable number of estimates with which to compare against the non-parametric estimates, we estimated the income distribution and Gini coefficients for all reservations in 1990 and 2000 for which it was feasible. We then took the GMM Gini estimates $G_i^{GMM}$ and compared them with the Gini estimates from the non-parametric procedure using different weights on the upper and lower bound of the Gini: $G_i^{NP}(w) = w\,G_i^{U} + (1 - w)\,G_i^{L}$. In order to find a value of $w$ which makes $G_i^{GMM}$ and $G_i^{NP}(w)$ close, we minimize the sum of the squared differences:

$$\hat{w} = \arg\min_{w} \sum_i \left( G_i^{GMM} - G_i^{NP}(w) \right)^2.$$

This yields a value of $\hat{w}$ = [...]. This finding indicates that roughly equal weights on the upper and lower bounds of the Gini generate the Gini point estimate which is closest to the GMM procedure. For this reason we use the midpoint between the upper and lower bounds as the point estimate of the Gini coefficient for all of the analysis.
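Because the non-parametric estimate is linear in the weight, the least-squares weight has a closed form obtained by setting the derivative of the sum of squares to zero. The sketch below illustrates that calculation; the function name and all input values are made up for illustration and are not estimates from the analysis.

```python
def best_weight(g_gmm, g_lower, g_upper):
    """Least-squares weight w placed on the upper bound.

    Minimizes sum_i (g_gmm[i] - (w * g_upper[i] + (1 - w) * g_lower[i]))^2.
    The first-order condition gives
    w = sum((U - L) * (G - L)) / sum((U - L)^2).
    """
    num = sum((u - l) * (g - l) for g, l, u in zip(g_gmm, g_lower, g_upper))
    den = sum((u - l) ** 2 for l, u in zip(g_lower, g_upper))
    return num / den


# Hypothetical inputs: the "GMM" values are set to the exact midpoints
# of the bounds, so the fitted weight should be one half.
g_lower = [0.30, 0.35, 0.40]
g_upper = [0.50, 0.45, 0.60]
g_gmm = [0.40, 0.40, 0.50]
print(best_weight(g_gmm, g_lower, g_upper))
```

Reservations with a wider gap between the bounds carry more influence on the fitted weight, since the gap enters both sums.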
APPENDIX 2: GROWTH-INEQUALITY RELATIONSHIPS FOR AMERICAN INDIAN RESERVATIONS BY GEOGRAPHIC REGION

Notes: The geographical regions are depicted in figures 3 and 4.

APPENDIX 3: TRITILE GRAPHS OF GROWTH-INEQUALITY RELATIONSHIP

Figure A.3.1: Tritile Graph of Growth-Inequality, Acoma to Cocopah
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.2: Tritile Graph of Growth-Inequality, Coeur d'Alene to Fond du Lac
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.3: Tritile Graph of Growth-Inequality, Forest County Potawatomi to Fort Yuma
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.4: Tritile Graph of Growth-Inequality, Gila River to Lac Courte Oreilles
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.5: Tritile Graph of Growth-Inequality, Lac du Flambeau to Mescalero
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.6: Tritile Graph of Growth-Inequality, Mille Lacs to Osage
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.7: Tritile Graph of Growth-Inequality, Pala to Reno-Sparks
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.8: Tritile Graph of Growth-Inequality, Rincon to San Pascual
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.9: Tritile Graph of Growth-Inequality, Santa Clara to Standing Rock
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.10: Tritile Graph of Growth-Inequality, Stockbridge-Munsee to Umatilla
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.11: Tritile Graph of Growth-Inequality, Ute Mountain to Yavapai-Apache Nation
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.

Figure A.3.12: Tritile Graph of Growth-Inequality, Zuni
Note: The vertical axis shows the Gini Coefficient and the horizontal axis shows the income tritile.
APPENDIX 4: ADDITIONAL TABLES

A. Variable Definitions and Sources

Reservation Income
o American Indian Per Capita Income: Per capita income information from 1943 to 1945 is available from BIA reports housed in the National Archives in Washington, D.C. Income data from 1970 to 2010 is available from the Census Bureau.
o Slot Machines Per Capita: Number of slot machines in tribal casinos on the reservation divided by the American Indian population. For more information see Anderson and Parker (2008) and Cookson (2010).

Ethnicity
o Blood Quantum: Blood quantum information is available from 1938 in a BIA report housed in the National Archives in Washington, D.C. The data includes the number of individuals in four blood quantum bins: 100%, 50-99%, 25-49%, and less than 25%.

Reservation Demographics
o American Indian Population: American Indian population data is from the same set of reports as the per capita income data.
o Pct. Completed High School: This variable reports the share of the American Indian population with at least a high school degree. This measure is available from the Census Bureau for [...].
o Migration: Migration measures were calculated from Census Bureau data available through NHGIS. The migration share used in Appendix Table A.4.3 is calculated as the share of the American Indian population that moved from either out of the state or out of the country to the reservation in the last 5 years in 2000 and in the prior year for [...].

Reservation Characteristics
o Land Tenure: Land tenure shares are measured as the fraction of the reservation in different tenure types as of [...]. The measures include the share of land in tribal trust, individual trust, and fee-simple. These measures are from the Bureau of Indian Affairs.
o Indian Reorganization Act: IRA adoption is from Frye and Parker (2016) and is a binary measure of whether or not the reservation voted to adopt the Indian Reorganization Act between 1934 and [...].
o Public Law 280: Public Law 280 is from Anderson and Parker (2008) and is a binary measure of whether or not Public Law 280 was applied to the reservation.

Regional Economic Controls
o State Per Capita Income: State per capita income is from the Bureau of Economic Analysis.
o Adjacent County Per Capita Income: Adjacent county per capita income is from the Bureau of Economic Analysis. This measure is the average income of those counties that border the reservation and do not overlap with the reservation.
o Distance to Nearest MSA: Distance is calculated from the centroid of each reservation to the closest MSA in [...].
B. Summary Statistics

Table A.4.1: Summary Statistics by Period

Gini Coefficient                        (9.109)   (7.431)   (8.134)   (4.288)   (4.484)   (5.559)
American Indian Per Capita Income       (5007.5)  (2142.8)  (3105.2)  (2107.1)  (3076.1)  (4272.9)
Slots per capita                        (0)       (0)       (0)       (0.0523)  (0.589)   (1.165)
Less Ethnically Assimilated             (0.217)   (0.217)   (0.217)   (0.217)   (0.217)   (0.217)
Ethnic Fragmentation                    (0.159)   (0.159)   (0.159)   (0.159)   (0.159)   (0.159)
American Indian Population              (5324.9)  (6501.7)  ( )       ( )       ( )       ( )
Pct. Completed High School              (.)       (0.103)   (0.0729)  (0.127)   (0.0913)  (0.0879)
State Per Capita Income                 (2329.4)  (2782.7)  (2877.5)  (3417.2)  (4447.2)  (4203.2)
Adjacent County Per Capita Income       (.)       (2546.9)  (3112.5)  (4122.0)  (4264.4)  (4015.5)
Distance to Nearest MSA (in mi)         (83.95)   (83.68)   (81.86)   (81.86)   (81.86)   (81.86)
Share of Acreage in Tribal Trust        (0.383)   (0.383)   (0.383)   (0.383)   (0.383)   (0.383)
Share of Acreage in Individual Trust    (0.176)   (0.176)   (0.176)   (0.176)   (0.176)   (0.176)
Share of Acreage in Fee-Simple          (0.314)   (0.314)   (0.314)   (0.314)   (0.314)   (0.314)
Indian Reorganization Act (0/1)         (0.402)   (0.402)   (0.402)   (0.402)   (0.402)   (0.402)
Public Law 280 Reservation (0/1)        (0.480)   (0.480)   (0.480)   (0.480)   (0.480)   (0.480)

Notes: Table presents means and standard deviations for the outcomes and covariates used throughout the paper. For sources and definitions see Appendix A.4.1. Incomes are adjusted to 2010 dollars.
C. Inequality and Migration Tables

Table A.4.2: Panel Model Estimates of Relationship between Income and Inequality, Including Endogenous Controls

                                        (1) (2) (3) (4) (5)
Ln(Income Per Capita)                   *** **
                                        (0.070) (0.159) (0.103) (0.171) (0.213)
Ln(Income Per Capita) LBQA              **
                                        (0.200) (0.152) (0.264)
Ln(Income Per Capita) BQP               0.873*** 0.789*** 2.826***
                                        (0.262) (0.295) (0.864)
Ln(Income Per Capita) LBQA BQP          **
                                        (1.323)
Reservation Fixed-Effects               x x x x x
Year Fixed-Effects                      x x x x x
Time-Varying Controls                   x x x x x
Historic Time-Trend Controls            x x x x x
Endogenous Population Controls          x x x x x
Endogenous Education Controls           x x x x x
Number of Reservations
Number of Observations
R-Squared

Notes: * p < 0.10, ** p < 0.05, *** p < [...]. Standard errors, reported in parentheses, are clustered at the reservation level. Time-Varying controls include state per capita income and adjacent county per capita income. Historic Time-Trend controls include a dummy variable for whether the reservation adopted the IRA, a dummy variable for whether Public Law 280 applied to the reservation, log distance from the closest MSA, and controls for the share of reservation land held in tribal trust and individual trust. All of these variables are interacted with time period. The null hypothesis is that all the coefficients in the model are equal to zero.
Table A.4.3: Panel Model Estimates of Relationship between Income and Inequality, Considering Migration

                                        (1)        (2)        (3)
                                        Ln(Gini)   Ln(Gini)   Share Migrate
Ln(Income Per Capita)                   *** ***
                                        (0.538)    (0.538)    (8.509)
Ln(Income Per Capita) LBQA              1.323** 1.323**
                                        (0.588)    (0.587)    (10.632)
Ln(Income Per Capita) BQP               5.372*** 5.401***
                                        (1.942)    (1.927)    (31.934)
Ln(Income Per Capita) LBQA BQP          * *
                                        (2.544)    (2.524)    (45.198)
Reservation Fixed-Effects               x x x
Year Fixed-Effects                      x x x
Time-Varying Controls                   x x x
Historic Time-Trend Controls            x x x
Migration Controls                      x
Number of Reservations
Number of Observations
R-Squared

Notes: * p < 0.10, ** p < 0.05, *** p < [...]. Standard errors, reported in parentheses, are clustered at the reservation level. Data is from 2000 and [...]. Time-Varying controls include state per capita income and adjacent county per capita income. Historic Time-Trend controls include a dummy variable for whether the reservation adopted the IRA, a dummy variable for whether Public Law 280 applied to the reservation, log distance from the closest MSA, and controls for the share of reservation land held in tribal trust and individual trust. All of these variables are interacted with time period. The null hypothesis is that all the coefficients in the model are equal to zero.
ECE 592 Topics in Data Science Final Fall 2017 December 11, 2017 Please remember to justify your answers carefully, and to staple your test sheet and answers together before submitting. Name: Student ID:
More informationEngineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers
Engineering Part IIB: Module 4F0 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 202 Engineering Part IIB:
More informationSTANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part II 1 st Nine Weeks,
STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I Part II 1 st Nine Weeks, 2016-2017 OVERVIEW Algebra I Content Review Notes are designed by the High School Mathematics Steering Committee as a resource
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationExam D0M61A Advanced econometrics
Exam D0M61A Advanced econometrics 19 January 2009, 9 12am Question 1 (5 pts.) Consider the wage function w i = β 0 + β 1 S i + β 2 E i + β 0 3h i + ε i, where w i is the log-wage of individual i, S i is
More informationClassification and Prediction
Classification Classification and Prediction Classification: predict categorical class labels Build a model for a set of classes/concepts Classify loan applications (approve/decline) Prediction: model
More informationSTANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part I. 2 nd Nine Weeks,
STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I Part I 2 nd Nine Weeks, 2016-2017 OVERVIEW Algebra I Content Review Notes are designed by the High School Mathematics Steering Committee as a resource
More informationCensus Geography, Geographic Standards, and Geographic Information
Census Geography, Geographic Standards, and Geographic Information Michael Ratcliffe Geography Division US Census Bureau New Mexico State Data Center Data Users Conference November 19, 2015 Today s Presentation
More informationprobability of k samples out of J fall in R.
Nonparametric Techniques for Density Estimation (DHS Ch. 4) n Introduction n Estimation Procedure n Parzen Window Estimation n Parzen Window Example n K n -Nearest Neighbor Estimation Introduction Suppose
More informationREPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY
REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY J.D. Opsomer, W.A. Fuller and X. Li Iowa State University, Ames, IA 50011, USA 1. Introduction Replication methods are often used in
More informationCS534 Machine Learning - Spring Final Exam
CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the
More informationSelection on Observables: Propensity Score Matching.
Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017
More information4. Nonlinear regression functions
4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change
More informationAnnouncements. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 February 8, / 45
Announcements Solutions to Problem Set 3 are posted Problem Set 4 is posted, It will be graded and is due a week from Friday You already know everything you need to work on Problem Set 4 Professor Miller
More informationAnswer all questions from part I. Answer two question from part II.a, and one question from part II.b.
B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries
More informationARTNeT Interactive Gravity Modeling Tool
Evidence-Based Trade Policymaking Capacity Building Programme ARTNeT Interactive Gravity Modeling Tool Witada Anukoonwattaka (PhD) UNESCAP 26 July 2011 Outline Background on gravity model of trade and
More informationDOE 2017 National Tribal Energy Summit Tribal Energy Sovereignty Strengthening Strategic Partnerships
DEPARTMENT OF THE INTERIOR DOE 2017 National Tribal Energy Summit Tribal Energy Sovereignty Strengthening Strategic Partnerships Office of Indian Energy & Economic Development IEED Division of Energy &
More informationMontana Content Standards Science Grade: 6 - Adopted: 2016
Main Criteria: Montana Content Standards Secondary Criteria: Subjects: Science, Social Studies Grade: 6 Correlation Options: Show Correlated MT.6-8.PS. BENCHMARK / STANDARD 6-8.PS.3. MT.6-8.LS. BENCHMARK
More informationAssociation Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression
Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression Last couple of classes: Measures of Association: Phi, Cramer s V and Lambda (nominal level of measurement)
More informationAPPENDIX 1 BASIC STATISTICS. Summarizing Data
1 APPENDIX 1 Figure A1.1: Normal Distribution BASIC STATISTICS The problem that we face in financial analysis today is not having too little information but too much. Making sense of large and often contradictory
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2
MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and
More informationSonoran Substation to Wilmot Energy Center 138 kv Transmission Line Project August 2018 EXHIBIT A. Exhibit Page 1
Tucson Electric Power Company CEC Application Sonoran Substation to Wilmot Energy Center 138 kv Transmission Line Project August 2018 EXHIBIT A Exhibit Page 1 Tucson Electric Power Company CEC Application
More informationApéndice 1: Figuras y Tablas del Marco Teórico
Apéndice 1: Figuras y Tablas del Marco Teórico FIGURA A.1.1 Manufacture poles and manufacture regions Poles: Share of employment in manufacture at least 12% and population of 250,000 or more. Regions:
More informationThe Governance of Land Use
The planning system Levels of government and their responsibilities The Governance of Land Use COUNTRY FACT SHEET NORWAY Norway is a unitary state with three levels of government; the national level, 19
More informationCHAPTER 1. Introduction
CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing
More informationSummer Assignment IB Math Studies L2 Paper 1 Practice. Calculate the sum of the first 90 terms of the sequence. Answers:
1. The first five terms of an arithmetic sequence are shown below. 2, 6, 10, 14, 18 Write down the sith number in the sequence. Calculate the 200 th term. Calculate the sum of the first 90 terms of the
More informationTesting Restrictions and Comparing Models
Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by
More information1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX
Well, it depends on where you're born: A practical application of geographically weighted regression to the study of infant mortality in the U.S. P. Johnelle Sparks and Corey S. Sparks 1 Introduction Infant
More informationMachine Learning, Midterm Exam
10-601 Machine Learning, Midterm Exam Instructors: Tom Mitchell, Ziv Bar-Joseph Wednesday 12 th December, 2012 There are 9 questions, for a total of 100 points. This exam has 20 pages, make sure you have
More informationVariables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010
Variables, distributions, and samples (cont.) Phil 12: Logic and Decision Making Fall 2010 UC San Diego 10/18/2010 Review Recording observations - Must extract that which is to be analyzed: coding systems,
More informationOregon Population Forecast Program Rulemaking Advisory Committee (RAC) Population Research Center (PRC)
Oregon Population Forecast Program Rulemaking Advisory Committee (RAC) Population Research Center (PRC) RAC Meeting Agenda 1. Committee member introductions 2. Review charge of the Oregon Population Forecast
More informationComprehensive Examination Quantitative Methods Spring, 2018
Comprehensive Examination Quantitative Methods Spring, 2018 Instruction: This exam consists of three parts. You are required to answer all the questions in all the parts. 1 Grading policy: 1. Each part
More informationKernel Methods and Support Vector Machines
Kernel Methods and Support Vector Machines Oliver Schulte - CMPT 726 Bishop PRML Ch. 6 Support Vector Machines Defining Characteristics Like logistic regression, good for continuous input features, discrete
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationIncome Distribution Dynamics with Endogenous Fertility. By Michael Kremer and Daniel Chen
Income Distribution Dynamics with Endogenous Fertility By Michael Kremer and Daniel Chen I. Introduction II. III. IV. Theory Empirical Evidence A More General Utility Function V. Conclusions Introduction
More informationTDT4173 Machine Learning
TDT4173 Machine Learning Lecture 3 Bagging & Boosting + SVMs Norwegian University of Science and Technology Helge Langseth IT-VEST 310 helgel@idi.ntnu.no 1 TDT4173 Machine Learning Outline 1 Ensemble-methods
More informationAnalysis of Bank Branches in the Greater Los Angeles Region
Analysis of Bank Branches in the Greater Los Angeles Region Brian Moore Introduction The Community Reinvestment Act, passed by Congress in 1977, was written to address redlining by financial institutions.
More informationDwelling Price Ranking vs. Socio-Economic Ranking: Possibility of Imputation
Dwelling Price Ranking vs. Socio-Economic Ranking: Possibility of Imputation Larisa Fleishman Yury Gubman Aviad Tur-Sinai Israeli Central Bureau of Statistics The main goals 1. To examine if dwelling prices
More informationIntroduction of Recruit
Apr. 11, 2018 Introduction of Recruit We provide various kinds of online services from job search to hotel reservations across the world. Housing Beauty Travel Life & Local O2O Education Automobile Bridal
More informationAppendix A Taylor Approximations and Definite Matrices
Appendix A Taylor Approximations and Definite Matrices Taylor approximations provide an easy way to approximate a function as a polynomial, using the derivatives of the function. We know, from elementary
More informationKey Issue 1: Why Do Services Cluster Downtown?
Key Issue 1: Why Do Services Cluster Downtown? Pages 460-465 1. Define the term CBD in one word. 2. List four characteristics of a typical CBD. Using your knowledge of services from chapter 12, define
More informationEntrepôts and Economic Geography
Entrepôts and Economic Geography Hugh Montag & Heyu Xiong 6/2/17 Motivation What explains the uneven distribution of economic activities across space? A large empirical literature has emphasized the significance
More informationAgile Mind Mathematics 6 Scope and Sequence, Common Core State Standards for Mathematics
In the three years preceding Grade 6, students have acquired a strong foundation in numbers and operations, geometry, measurement, and data. They are fluent in multiplication of multi- digit whole numbers
More informationPhD/MA Econometrics Examination January 2012 PART A
PhD/MA Econometrics Examination January 2012 PART A ANSWER ANY TWO QUESTIONS IN THIS SECTION NOTE: (1) The indicator function has the properties: (2) Question 1 Let, [defined as if using the indicator
More informationClimate and Health Vulnerability & Adaptation Assessment Profile Manaus - Brazil
Climate and Health Vulnerability & Adaptation Assessment Profile Manaus - Brazil Christovam Barcellos (ICICT/Fiocruz) Diego Xavier Silva (ICICT/Fiocruz) Rita Bacuri (CPqLMD/Fiocruz) Assessment Objectives
More information18.9 SUPPORT VECTOR MACHINES
744 Chapter 8. Learning from Examples is the fact that each regression problem will be easier to solve, because it involves only the examples with nonzero weight the examples whose kernels overlap the
More informationChapte The McGraw-Hill Companies, Inc. All rights reserved.
er15 Chapte Chi-Square Tests d Chi-Square Tests for -Fit Uniform Goodness- Poisson Goodness- Goodness- ECDF Tests (Optional) Contingency Tables A contingency table is a cross-tabulation of n paired observations
More informationMaximum Likelihood (ML) Estimation
Econometrics 2 Fall 2004 Maximum Likelihood (ML) Estimation Heino Bohn Nielsen 1of32 Outline of the Lecture (1) Introduction. (2) ML estimation defined. (3) ExampleI:Binomialtrials. (4) Example II: Linear
More informationEconometrics for PhDs
Econometrics for PhDs Amine Ouazad April 2012, Final Assessment - Answer Key 1 Questions with a require some Stata in the answer. Other questions do not. 1 Ordinary Least Squares: Equality of Estimates
More informationWhat s New in Econometrics. Lecture 13
What s New in Econometrics Lecture 13 Weak Instruments and Many Instruments Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Motivation 3. Weak Instruments 4. Many Weak) Instruments
More information2. Topic: Series (Mathematical Induction, Method of Difference) (i) Let P n be the statement. Whenn = 1,
GCE A Level October/November 200 Suggested Solutions Mathematics H (9740/02) version 2. MATHEMATICS (H2) Paper 2 Suggested Solutions 9740/02 October/November 200. Topic:Complex Numbers (Complex Roots of
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project
More informationClustering with k-means and Gaussian mixture distributions
Clustering with k-means and Gaussian mixture distributions Machine Learning and Category Representation 2012-2013 Jakob Verbeek, ovember 23, 2012 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.12.13
More informationEconometrics (60 points) as the multivariate regression of Y on X 1 and X 2? [6 points]
Econometrics (60 points) Question 7: Short Answers (30 points) Answer parts 1-6 with a brief explanation. 1. Suppose the model of interest is Y i = 0 + 1 X 1i + 2 X 2i + u i, where E(u X)=0 and E(u 2 X)=
More informationEco517 Fall 2004 C. Sims MIDTERM EXAM
Eco517 Fall 2004 C. Sims MIDTERM EXAM Answer all four questions. Each is worth 23 points. Do not devote disproportionate time to any one question unless you have answered all the others. (1) We are considering
More informationGaussian Mixture Models
Gaussian Mixture Models Pradeep Ravikumar Co-instructor: Manuela Veloso Machine Learning 10-701 Some slides courtesy of Eric Xing, Carlos Guestrin (One) bad case for K- means Clusters may overlap Some
More information2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0
Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct
More informationNOTES ON COOPERATIVE GAME THEORY AND THE CORE. 1. Introduction
NOTES ON COOPERATIVE GAME THEORY AND THE CORE SARA FROEHLICH 1. Introduction Cooperative game theory is fundamentally different from the types of games we have studied so far, which we will now refer to
More informationSpecification testing in panel data models estimated by fixed effects with instrumental variables
Specification testing in panel data models estimated by fixed effects wh instrumental variables Carrie Falls Department of Economics Michigan State Universy Abstract I show that a handful of the regressions
More informationPanel data panel data set not
Panel data A panel data set contains repeated observations on the same units collected over a number of periods: it combines cross-section and time series data. Examples The Penn World Table provides national
More informationClustering with k-means and Gaussian mixture distributions
Clustering with k-means and Gaussian mixture distributions Machine Learning and Object Recognition 2017-2018 Jakob Verbeek Clustering Finding a group structure in the data Data in one cluster similar to
More information