COMBINING ENUMERATION AREA MAPS AND SATELLITE IMAGES (LAND COVER) FOR THE DEVELOPMENT OF AREA FRAME (MULTIPLE FRAMES) IN AN AFRICAN COUNTRY: PRELIMINARY LESSONS FROM THE EXPERIENCE OF ETHIOPIA

BY ABERASH TARIKU ABAYE
NATIONAL STATISTICAL DATA QUALITY DIRECTORATE DIRECTOR
CENTRAL STATISTICAL AGENCY
ADDIS ABABA, ETHIOPIA
OCTOBER 2010
TABLE OF CONTENTS

1. Introduction
2. Land cover classification
3. Creating the sampling frame in area frame survey using EA maps and land cover map
4. Sample size and sample selection
5. Training of field staff and field organization
6. Data collection in the selected segments
7. Estimation
8. Validation of the area frame result
9. Conclusion and recommendations
10. Appendixes
Acronyms

CSA   Central Statistical Agency
EA    Enumeration Area
LCCS  Land Cover Classification System
PPS   Probability Proportional to Size
PSU   Primary Sampling Unit
SSU   Secondary Sampling Unit
1. Introduction

CSA has been conducting Annual Agricultural Surveys for more than three decades. The integrated annual agricultural survey uses a list frame approach to collect agricultural data, and area and production are estimated for cereals, pulses, oilseeds, vegetables, root crops, fruit crops and other crops. The Population and Housing Census is used as the frame for the National Integrated Household Surveys. The most recent Population and Housing Census was conducted in 2007. The frame is a list of enumeration areas (EAs) delineated during the census cartographic work. The enumeration areas are georeferenced.

In the list frame approach, stratified random sampling is implemented with zones as strata, EAs as PSUs and households as SSUs. The PSUs are selected by PPS systematic sampling, size being the number of households from the census. Once the PSUs are selected, the households within each PSU are listed and a sample of households (SSUs) is selected by systematic sampling.

To assess the feasibility of the area frame, to compare its results with those of the list frame and to reach a justifiable conclusion, CSA conducted a pilot area frame survey in West Shoa zone of the Oromiya region in 2008/09. A full-scale pilot of the area frame approach followed in all zones of the Oromiya region in 2009/10. After incorporating comments, improvements and additional variables, CSA is planning to conduct a further area frame pilot survey in 2010/11.

In the area frame the observation units are territorial subdivisions instead of a list of holders/holdings as in the list frame. In an area frame survey the sampling units are defined on a cartographic representation of the surveyed territory. The units of an area frame can be points, transects (straight lines of a certain length) or pieces of territory, often called segments. CSA used segments as the units of the area frame. To develop the frame in the area frame approach, the digitized EA map from the census cartographic work is overlaid on the land cover map.
The percent cropland for each EA is then estimated using the land cover data. The EAs are stratified into four strata based on cropland intensity. Sample EAs are selected by PPS, size being the number of segments in the EA, and each sampled EA is divided into segments of 40 hectares. Two segments are selected systematically from each sampled EA. Data are then collected for all fields within the sampled segments.

2. Land cover classification

Land cover is the observed physical cover of the earth's surface: the physical material at the surface, including grass, trees, bare ground, water, vegetation and artificial structures. In land cover classification, the land is stratified into land cover categories. The objective of the Ethiopia land cover mapping activity is to produce a national land cover database using satellite images. Land cover information is critical for sustainable management of natural resources, environmental protection and food security, and it provides core data for monitoring and modelling.

CSA, in collaboration with the mapping agency, the Ministry of Agriculture and Rural Development (MOARD) and the Information Network Security Agency (INSA), is working on the land cover classification. The activity is funded by the EU, and FAO provides technical support. Operational matters are handled by CSA, the EU and FAO, whereas the technical committee is drawn from CSA, MOARD, EMA, INSA, the EU, FAO and WFP.
The land cover activity is expected to produce a land cover database that provides a standardized, multipurpose product useful for environmental and agricultural purposes. In addition to this major output for the country, the land cover classification will help to prepare the baseline for the area frame.

Satellite imagery, appropriate software and a predefined legend are required for land cover classification. At CSA, SPOT 5 satellite imagery and MADCAT software are used, and a legend appropriate for Ethiopia has been prepared and applied. Intensive training on the land cover activities was given by FAO.

The land cover data production activities include:

1. Preliminary steps
   - Ancillary data collection
   - SPOT index generation
   - SPOT mosaicking
   - Segmentation
   - Preliminary legend set-up
2. Interpretation
   - Preliminary interpretation
   - Field work
   - Photo keys definition
   - Supervision
   - Edge matching process
   - Final interpretation
   - Final legend set-up
3. GIS activity
   - Database processing
     o Merge, dissolve, topology check, generalize, smoothing, simplifying, etc.
     o Database finalization

SPOT image (5 m resolution) used for land cover classification
In the land cover classification, image segmentation follows once the preparatory activities are done. Image segmentation is based on a region-merging technique that divides the image into spatially continuous and spectrally homogeneous regions, or objects. Segmentation is done using Definiens software, which generates polygon segments from the SPOT 5 imagery based on the spectral value of each pixel.

Example of image segmentation for the LCCS in Ethiopia using Definiens
A legend appropriate for Ethiopia was developed from the Land Cover Classification System (LCCS) developed by FAO and is used for the land cover classification. LCCS is a comprehensive methodology for the description, characterization, classification and comparison of most land cover features identified anywhere in the world, at any scale or level of detail. It provides a basis for comparative classification. LCCS was created in response to the need for harmonized and standardized collection and reporting on the status of land cover.

The legend prepared for Ethiopia has the following classes:

Natural and semi-natural vegetation (terrestrial and aquatic)   15 classes
Cultivated areas (terrestrial and aquatic)                      19 classes
Artificial surfaces and associated areas                         6 classes
Bare areas                                                       4 classes
Artificial and natural water bodies                              6 classes

The detailed legend is given in Appendix 1.
In this land cover activity, MADCAT (Mapping Device and Change Analysis Tool) software is used for image processing. MADCAT provides a multiple-window editing environment. Interpretation in the LCCS is done visually.
A dissolve operation is then carried out to reduce the number of polygons by merging adjacent polygons that have the same LCCS class.
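The dissolve rule can be illustrated with a minimal sketch. It is not the MADCAT or GIS implementation used by CSA: polygons are reduced to hypothetical ids, the class labels are placeholders rather than actual LCCS codes, and adjacency is assumed to be supplied as a list of neighbouring id pairs. Under those assumptions, polygons end up in the same merged unit when they are connected through neighbours of the same class.

```python
def dissolve(classes, adjacency):
    """Group polygon ids into merged units.

    classes:   dict mapping polygon id -> class label (labels are placeholders)
    adjacency: iterable of (id_a, id_b) pairs for polygons sharing a boundary
    Returns a sorted list of sorted id lists, one list per merged polygon.
    """
    parent = {pid: pid for pid in classes}

    def find(p):
        # Follow parent links to the representative, shortening the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in adjacency:
        # Only neighbours with the same class are merged.
        if classes[a] == classes[b]:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for pid in classes:
        groups.setdefault(find(pid), []).append(pid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical tile with four polygons in a row: 1-2-3-4.
example_classes = {1: "cultivated", 2: "cultivated", 3: "forest", 4: "cultivated"}
example_adjacency = [(1, 2), (2, 3), (3, 4)]
merged = dissolve(example_classes, example_adjacency)
```

In this example polygons 1 and 2 are merged, while polygon 4 stays separate even though it has the same class, because it does not touch them.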
After the interpretation of each tile is completed, the individual shapefiles are merged into mosaics.
3. Creating the sampling frame in area frame survey using EA maps and land cover map

The basic steps in an area frame survey are to create the sampling frame for the study area, stratify the area frame, define the sampling units, and allocate and select the samples. Estimation and validation are done after data collection.

The first step in the area frame survey is frame development. Two inputs, the enumeration area maps and the land cover map, are used to develop the area frame. The EAs, which are the PSUs, were delineated for the Population and Housing Census. The criterion for delineating an EA was that it contain 150 to 200 households in rural areas. A topo-sheet was used as the base map: the GPS readings for the EA corners (turning points) were plotted on the topo-sheet, and the EA map was then traced from it. The enumeration areas are georeferenced. To develop the area frame, the digitized EA map obtained from the census cartographic work is overlaid on the land cover map.
174 EAs in one wereda overlaid on the LCCS land cover database
The second step is to estimate the percent cropland (crop intensity) for each EA. The percent cropland for each EA is calculated using the interpreted imagery and appropriate software.

Percent cropland values shown for each EA
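The percent cropland calculation can be sketched as follows. This is an illustration only, not the GIS procedure used at CSA: it assumes the overlay of the EA map on the land cover map has already been exported as a table of pieces, one row per EA and land cover polygon intersection with its area in hectares. The function name, the EA identifiers and the class labels are hypothetical.

```python
from collections import defaultdict


def percent_cropland(pieces, crop_classes):
    """Percent cropland per EA from overlay pieces.

    pieces:       iterable of (ea_id, land_cover_class, area_ha) tuples
    crop_classes: set of class labels counted as cropland (an assumption;
                  the real legend classes are listed in the appendix)
    """
    total_area = defaultdict(float)
    crop_area = defaultdict(float)
    for ea_id, land_cover_class, area_ha in pieces:
        total_area[ea_id] += area_ha
        if land_cover_class in crop_classes:
            crop_area[ea_id] += area_ha
    # EAs with no recorded area are skipped to avoid division by zero.
    return {
        ea_id: 100.0 * crop_area[ea_id] / area
        for ea_id, area in total_area.items()
        if area > 0
    }


# Hypothetical overlay output for two EAs.
overlay_pieces = [
    ("EA-01", "cultivated", 60.0),
    ("EA-01", "forest", 40.0),
    ("EA-02", "bare", 100.0),
]
crop_intensity = percent_cropland(overlay_pieces, {"cultivated"})
```

Here EA-01 has 60 of its 100 hectares under cultivated classes, and EA-02 has none.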
Stratification is done by forming groups of EAs that are similar with respect to crop intensity. The four strata created on the basis of crop intensity are:

Stratum I    Crop intensity 75% or more
Stratum II   Crop intensity 50 to 74%
Stratum III  Crop intensity 25 to 49%
Stratum IV   Crop intensity less than 25%

Enumeration areas classified into strata based on the percent cropland from land cover (map legend: 0 to 24, 25 to 49, 50 to 74, 75 and above)
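The stratum assignment can be written as a small function. This is a sketch under one stated assumption: the published ranges are given in whole percentages (for example 50 to 74%), so fractional values such as 74.5% are assigned here by treating 25, 50 and 75 as lower bounds. The function name is hypothetical.

```python
def crop_stratum(percent_cropland):
    """Return the stratum label (I to IV) for an EA's percent cropland.

    Assumption: thresholds 75, 50 and 25 are inclusive lower bounds, so a
    value such as 74.5 falls in stratum II.
    """
    if not 0 <= percent_cropland <= 100:
        raise ValueError("percent cropland must be between 0 and 100")
    if percent_cropland >= 75:
        return "I"
    if percent_cropland >= 50:
        return "II"
    if percent_cropland >= 25:
        return "III"
    return "IV"


# Hypothetical crop intensities for three EAs.
strata = {ea: crop_stratum(p) for ea, p in {"EA-01": 82.0, "EA-02": 50.0, "EA-03": 3.5}.items()}
```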
The allocated number of EAs in each stratum is selected by PPS, and the sampled EAs are divided into segments of 40 hectares. These segments are treated as SSUs. Segments are formed to cover about 40 hectares and, as far as possible, to follow clearly identifiable physical boundaries. Two segments are selected systematically from each EA for data collection.

An EA subdivided into seven 40-hectare SSUs
Segment number 2 selected for the EA (PSU) shown above.
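The two selection stages described above can be sketched in a few lines. This is an illustration under stated assumptions, not CSA's selection program: EAs are drawn by systematic PPS with the number of segments as the measure of size, and segments within an EA are drawn with a random start and a fixed interval. The function names, the EA identifiers and the segment counts are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units by systematic PPS sampling.

    sizes: dict mapping unit id -> measure of size (here, number of segments)
    rng:   object with a random() method returning a float in [0, 1)
    A unit whose size exceeds the sampling interval can be selected twice.
    """
    units = list(sizes)
    total = sum(sizes[u] for u in units)
    step = total / n
    start = rng.random() * step
    selected = []
    cumulative = 0.0
    idx = 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative size range contains the point.
        while idx < len(units) - 1 and cumulative + sizes[units[idx]] <= point:
            cumulative += sizes[units[idx]]
            idx += 1
        selected.append(units[idx])
    return selected


def systematic_sample(n_units, n, rng):
    """Select n of the units numbered 1..n_units with a random start."""
    if n_units <= n:
        return list(range(1, n_units + 1))
    step = n_units / n
    start = rng.random() * step
    return [int(start + k * step) + 1 for k in range(n)]


# Hypothetical stratum with four EAs and their segment counts.
segments_per_ea = {"EA-01": 7, "EA-02": 3, "EA-03": 10, "EA-04": 5}
rng = random.Random(2010)
for ea in pps_systematic(segments_per_ea, 2, rng):
    print(ea, systematic_sample(segments_per_ea[ea], 2, rng))
```

Passing the random number generator explicitly keeps a selection reproducible, which is convenient when the selected EAs and segments have to be documented for field work.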
4. Sample size and sample selection

Enumeration areas are the primary sampling units (PSUs) for the area frame. They were delineated during the population census cartographic work and are georeferenced. Once an EA is divided into 40-hectare segments, the segments are treated as secondary sampling units (SSUs).

The sample size for the Oromiya region area frame was 215 PSUs (EAs) and 430 SSUs (segments; 2 segments per EA). This sample size was determined from the precision and sample size of the West Shoa area frame pilot. The 215 EAs were allocated to the 17 zones in proportion to the cultivated area reported by the integrated survey. Within each zone, the sample was allocated to the strata on the basis of cultivated area from LCCS. EAs in each stratum are selected by PPS, size being the number of segments. Two segments are selected systematically from each selected PSU (EA).

5. Training of field staff and field organization

Enumerators and supervisors were trained on data collection. The training included both classroom training and field practice. The classroom training covered:

i) How to use a segment map to identify and delineate a segment
ii) How to use GPS
iii) How to list fields within a segment
iv) How to measure fields by GPS
v) How to fill in forms on agricultural practices
6. Data collection in the selected segments

Segments are clusters of fields (land uses). A segment map is prepared for each segment and used for delineation in the field. The closed segment approach is used in this pilot to collect data; that is, all the fields (land uses) within the selected segment are listed and a questionnaire is filled in for each field.

The first task of the enumerator in data collection is to identify the segment boundary using the segment map and GPS coordinates. GPS is used to delineate the segment and to measure all the fields within it. The area of every listed field is measured by GPS twice (clockwise and anticlockwise) and the results are entered in the form. The size of the segment, which is around 40 hectares, is used as an indirect control on data collection.

Commercial farms should be treated separately and an independent survey is conducted for them. A frame listing the commercial farms is compiled and a separate commercial farm survey is conducted using the list frame approach. Hence a multiple frame approach, combining the area frame for private farms with a list frame for commercial farms, will be implemented.

7. Estimation

The estimate of the total $\hat{Y}_h$ in each stratum is given by

\[ \hat{Y}_h = \sum_{i=1}^{n_{ea}} \sum_{j=1}^{n_{hi}} W_{hi} Y_{hij} \]

The weight for each segment is calculated as

\[ W_{hi} = \frac{M_{sh} N_{hi}}{n_{ea} m_{sea} n_{hi}} = \frac{M_{sh}}{n_{ea} n_{hi}} \]

The variance of the estimate in each stratum is calculated as
\[ \mathrm{Var}(\hat{Y}_h) = \frac{n_{ea}}{n_{ea} - 1} \sum_{i=1}^{n_{ea}} \left( Y_{hi} - \frac{\hat{Y}_h}{n_h} \right)^2 \]

\[ V(\hat{Y}) = \sum_{h=1}^{l} \mathrm{Var}(\hat{Y}_h) \]

8. Validation of the area frame result

To validate the area frame result, the estimate is compared with the list frame result. The comparison shows that the estimated area under different crops is higher in the area frame approach than in the list frame. Several checks were carried out to identify possible sources of the difference. These include:

- Checking the methodology used in both approaches by estimating the number of households under each approach and comparing it with the population census result
- Checking the list of crops covered in both approaches
- Estimating the number of fields in both approaches
- Checking crop-wise average area per field in both approaches
- Checking the multiplier for zones/strata with abnormal weights
- Evaluating the adjustments made for non-responding segments
- Evaluating the treatment of commercial farms in the area frame approach

9. Conclusion and recommendations

The methodology used for both the area frame and the list frame follows standard procedure. The estimates of average field size and number of fields under both methods need to be checked in the coming area frame survey.
For zones with a high nomadic population, and for the low-cropland stratum, care should be taken in sampling and estimation. For the coming area frame survey, areas with less than 2 percent cropland are excluded from the frame in these nomadic zones.

In the area frame data, some segments with an area significantly greater than 40 hectares were observed. One reason was segment boundaries dissecting a field. The manual should state clearly that only the part of a field within the segment boundary is to be included. Intensive and organized supervision is also needed, and segment size should be checked against the target of around 40 hectares.

When commercial farms are observed in a segment, their data should be collected separately under the list frame approach. This was one of the sources of overestimation in the area frame survey.

After accommodating all the recommendations above and adding a crop cutting activity, the area frame data collection will be repeated in October 2010 in the Oromiya region.
Appendix 1

Ethiopia Legend