Calculating Land Values by Using Advanced Statistical Approaches in Pendik

Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey
Calculating Land Values by Using Advanced Statistical Approaches in Pendik
Prof. Dr. Arif Cagdas AYDINOGLU, Res. Asst. Rabia BOVKIR and Dr. Ismail COLKESEN
Department of Geomatics Engineering, Gebze Technical University

Overview
Introduction to land valuation
Valuation methods
Purpose of the study
Determining factors affecting the land value
Case study in Pendik, Istanbul
Fuzzy logic approach for determining thematic value trends
Random forest algorithm for determining land values
Conclusion

Land Valuation
In the concept of sustainable land development, managing land-related information efficiently is highly important, because definitive and reliable information about land and real property promotes geographical enablement, which in turn supports sustainable development. In addition, governments need reliable value information on land and real property to implement many legal practices such as taxation, expropriation, market capitalization, rural-urban transformation and land consolidation.
Land valuation is the process of determining the current market value of a real property, such as land, a residence, cropland, a workplace, a vineyard or a garden, at a given time. There are many valuation methods, yet no single, fully objective method exists. The best-known and most widely used are the sales comparison, cost and income methods, which are called traditional approaches. In addition, with developing technology, advanced statistics and machine learning algorithms, combined with the spatial analysis capabilities of GIS, can be a good tool for obtaining better results.

The sales comparison method analyses and evaluates the characteristics of recently sold properties and specifies how these characteristics (location, landscaping, use, quality of buildings and services, etc.) influence prices.
The cost method is based on the principle that the value equals the substitution cost minus accrued depreciation. In this approach, the value is obtained by adding the estimated value of the land to the construction cost, including all reproduction or replacement costs.
The income method estimates the present value of the future benefits and net income of the property. The main criterion in determining the value by means of income is the net income to be obtained in the future.
Statistical valuation is based on building a mathematical model of the numerical or proportional relations between land value and land-related factors. According to the literature, statistical approaches such as multiple regression, artificial neural networks, decision trees, support vector machines and fuzzy logic have been used for land and real property valuation in many studies.
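The cost and income definitions above reduce to simple arithmetic, which can be sketched in Python. This is only an illustration: the function names, the discount rate and the sample figures are assumptions made here, not values from the study.

```python
def cost_method_value(land_value, replacement_cost, accrued_depreciation):
    """Cost approach: land value plus replacement cost, minus accrued depreciation."""
    return land_value + replacement_cost - accrued_depreciation


def income_method_value(net_incomes, discount_rate):
    """Income approach: present value of expected future net incomes.

    net_incomes[t] is the net income expected at the end of year t + 1.
    """
    return sum(
        income / (1.0 + discount_rate) ** (year + 1)
        for year, income in enumerate(net_incomes)
    )


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
cost_value = cost_method_value(
    land_value=200_000, replacement_cost=300_000, accrued_depreciation=50_000
)
income_value = income_method_value([10_000, 10_000, 10_000], discount_rate=0.10)
print(cost_value)
print(round(income_value, 2))
```

Both functions return a single value per property, which is why these traditional approaches are hard to scale to every parcel in a district without a statistical model.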

Purpose of the Study
A wide variety of parameters affect property value, which makes objective evaluation quite difficult. To perform objective and accurate evaluations, all parameters related to the land and property should be taken into consideration. Because information alone is not enough for decision making, it should be analysed and interpreted logically and statistically to obtain better results. Therefore, advanced statistical methods and machine learning algorithms can be a good tool for land valuation. In this study, the fuzzy logic approach and the random forest algorithm were used for land valuation in Pendik with the help of GIS analysis tools. The resulting maps and the performance of the two methods were compared and analysed in detail.

Methodology: Determining factors affecting land value
Based on a literature study covering academic research and nationally and internationally accepted standard documents, the factors that significantly affect the value of real estate were determined.

Value factor: Explanation
Distance to Public Services: This factor expresses the importance of administrative and official buildings for the property valuation process.
Distance to Educational Facilities: Educational facilities such as kindergartens, primary schools, secondary schools, high schools and universities affect the value of the property.
Distance to Health Services: The presence and proximity of health services such as family health centers, polyclinics and hospitals affect the value of the property.
Distance to Religious Centers: Although the effect is smaller, proximity of religious centers to the property affects its value positively.
Distance to Cultural Centers: The presence of cultural centers such as theaters, cinemas and libraries in the area where the property is located affects its value positively.
Distance to Shopping Centers: Big shopping malls provide many food, entertainment and shopping services, which makes them important attraction centers for property.
Exit to the Main Roads: Closeness to the main roads increases the value of the property.
Distance to the Main Roads: Proximity to the main roads and motorways affects the value positively.
Distance to Bus Stations: Distance to bus stations is an important element for a property, and proximity increases the value.
Distance to Metro and Tram Stations: Proximity of the property to metro and tram stations affects the value positively.
Distance to the Green Areas: The presence and closeness of green areas such as parks, playgrounds and picnic areas affect the value positively.
Parking Area: Proximity of the property to parking areas affects the value positively.
Distance to the Airport: Closeness to air transportation is considerably influential on the property value.
Distance to Industrial Areas: Proximity to industrial and production areas affects the value negatively.
Distance to Cemeteries: Closeness to cemetery areas generally affects the value negatively.
Distance to Infrastructure Facilities: Infrastructure facilities such as treatment facilities, transformers and power transmission stations have a negative effect on the value.
Slope: The slope characteristics of the land are very important for proper settlement and affect the property value.
Aspect: Because Turkey is located in the northern hemisphere, south-facing properties are more valuable.
Population Density: Both very high and very low population density have a negative effect on the property value.
Education Level: According to the literature, the education and culture levels of the people living in the area where the property is located affect the value in a socio-cultural sense.

Methodology: Fuzzy Logic as Unsupervised Valuation, Random Forest as Supervised Valuation
The same 20 value factors and explanations listed above are the input to both approaches.
Fuzzy logic (unsupervised valuation): Fuzzy Overlay: Public Services; Fuzzy Overlay: Transportation; Fuzzy Overlay: Planning; Fuzzy Overlay: Socio-cultural factors; Fuzzy Overlay: Utilization; Fuzzy Overlay: Environmental factors; Fuzzy Overlay: Cultural factors
Random forest (supervised valuation): Training data set: ground truth value

Case Study in Pendik
Pendik was chosen as the case study area because of its large surface area and its urban-rural differences. A database covering these factors was produced in a GIS environment for Pendik. The geographic data sets obtained were optimized with the geographical analysis tools in ArcGIS software. Optimization is required so that the existing data can be analysed and processed successfully, and raster surfaces expressing the geographical characteristics of the study area have to be produced. All analysis processes were performed in the ArcGIS software environment.

Pixel-based thematic surfaces were created for the determined thematic factors in the study area. In ArcGIS software, slope and elevation analyses were performed using the Digital Elevation Model (DEM), density analysis was used to create the surfaces reflecting population and education level characteristics, and Euclidean distance was used to calculate the thematic distances for the other thematic factors.
Figure panels: Slope, Aspect, Education level, Population density
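The idea behind a Euclidean distance surface can be sketched in a few lines of Python. This is a brute-force stand-in for the concept, not the ArcGIS implementation; the function name, the grid size, the cell size and the facility cells are all hypothetical.

```python
import math


def euclidean_distance_surface(n_rows, n_cols, sources, cell_size=1.0):
    """Return a grid where each cell holds the distance to the nearest source cell.

    sources is a list of (row, col) cells that contain a facility.
    Brute force over all sources; fine for a small illustration.
    """
    surface = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(math.hypot(r - sr, c - sc) for sr, sc in sources)
            row.append(nearest * cell_size)
        surface.append(row)
    return surface


# Hypothetical 4 x 5 raster with 10 m cells and two facility cells.
distance = euclidean_distance_surface(4, 5, sources=[(0, 0), (3, 4)], cell_size=10.0)
for row in distance:
    print([round(v, 1) for v in row])
```

Each thematic factor of the "distance to ..." type would get one such surface, with the facility cells taken from the corresponding GIS layer.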

Thematic distance surfaces of Public (a), Education (b), Health (c), Cultural (d) Facilities for study area
Thematic distance surfaces of Religious Centers (a), Cemeteries (b) for study area

Thematic distance surfaces of Bus station (a), Parking Area (b), Metro (c), Green Space (d) for study area
Thematic distance surfaces of Infrastructure Services (c), Shopping (d) Areas for study area

Fuzzy logic approach for determining land values
The factors affecting land value are highly complex and depend on various parameters. Because fuzzy logic can represent information in the form of verbal expressions such as large, small and little, it is an effective method for processing linguistic variables, which is very important for the valuation process. An element of a fuzzy set can be a full member (100% membership) or a partial member (between 0% and 100% membership). That means the membership value assigned to an element is not restricted to two values, but can be 0, 1 or any value in between. The shape of the membership function varies according to the requirements of the application area.
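As an illustration of partial membership, a linear membership function can be sketched as follows. The linear shape and the 2000 m threshold are assumptions chosen for the example; the slides do not state which membership functions or thresholds were used.

```python
def fuzzy_linear(value, low, high):
    """Linear membership rising from 0 at `low` to 1 at `high`.

    Pass low > high to get a decreasing function (e.g. closer is better).
    """
    if low == high:
        raise ValueError("low and high must differ")
    membership = (value - low) / (high - low)
    return max(0.0, min(1.0, membership))


# Hypothetical thresholds: full membership at 0 m from a metro station,
# no membership at or beyond 2000 m.
for distance_m in (0, 500, 1000, 2000, 3000):
    print(distance_m, fuzzy_linear(distance_m, low=2000, high=0))
```

For factors with a negative effect, such as distance to industrial areas, the same function can be called with `low` and `high` swapped so that membership grows with distance.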

The produced surfaces were gathered into seven thematic groups according to their characteristics: public services, transportation, planning, socio-cultural factors, utilization, environmental factors and cultural factors. Fuzzy membership functions were defined and applied to each thematic raster surface according to its characteristics; fuzzy overlay maps were then created for each thematic group.
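A fuzzy overlay combines the membership surfaces of one thematic group cell by cell. The sketch below shows the standard fuzzy operators (AND, OR, product, algebraic sum and gamma); the slides do not say which operator was used, so the choice of operator, the function names and the 2 x 2 sample grids are assumptions.

```python
import math


def fuzzy_overlay(memberships, method="and", gamma=0.9):
    """Combine membership values (each in [0, 1]) for one cell."""
    product = math.prod(memberships)
    algebraic_sum = 1.0 - math.prod(1.0 - m for m in memberships)
    if method == "and":
        return min(memberships)
    if method == "or":
        return max(memberships)
    if method == "product":
        return product
    if method == "sum":
        return algebraic_sum
    if method == "gamma":
        return (algebraic_sum ** gamma) * (product ** (1.0 - gamma))
    raise ValueError(f"unknown method: {method}")


def overlay_rasters(layers, method="and", gamma=0.9):
    """Apply fuzzy_overlay cell by cell to equally sized membership grids."""
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    return [
        [fuzzy_overlay([layer[r][c] for layer in layers], method, gamma)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 2 x 2 membership surfaces for one thematic group (e.g. transportation).
bus = [[1.0, 0.5], [0.25, 0.0]]
metro = [[0.5, 0.5], [1.0, 0.5]]
transport_and = overlay_rasters([bus, metro], method="and")
transport_or = overlay_rasters([bus, metro], method="or")
print(transport_and)
print(transport_or)
```

AND keeps the most pessimistic factor in each cell and OR the most optimistic, while gamma interpolates between the product and the algebraic sum.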

Random forest algorithm for determining land values
Random forest (RF) is one of the popular machine learning algorithms applied to various classification and regression problems. RF, widely known as a tree-based ensemble learning algorithm, is based on the idea of training a set of decision tree (DT) learners on randomly selected samples, through the bootstrap aggregating (bagging) strategy, and combining them to make a final prediction. To implement the RF algorithm, two parameters (the number of trees and the number of variables) have to be set by the analyst. Two randomization processes are employed to construct an RF prediction model. First, the training samples for each individual tree are randomly selected by bootstrap sampling. Second, rather than choosing the best split among all attributes, the tree inducer randomly samples a subset of the attributes and chooses the best split among them.
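The two randomization steps can be made concrete with a deliberately small random forest regressor written in plain Python. This is a teaching sketch, not the implementation used in the study: the function names, the tree depth limit, the synthetic data and the parameter values are all assumptions.

```python
import random
import statistics


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets, features):
    """Return the lowest-error (error, feature, threshold) among `features`, or None."""
    best = None
    for f in features:
        # Skipping the smallest value keeps both sides of every split non-empty.
        for threshold in sorted({row[f] for row in rows})[1:]:
            left = [t for row, t in zip(rows, targets) if row[f] < threshold]
            right = [t for row, t in zip(rows, targets) if row[f] >= threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[0]:
                best = (error, f, threshold)
    return best


def build_tree(rows, targets, k, rng, depth=0, max_depth=4, min_size=5):
    """Grow one regression tree; a leaf is the mean target of its samples."""
    if depth >= max_depth or len(rows) < min_size:
        return statistics.fmean(targets)
    # Randomization 2: only k randomly chosen factors compete at this split.
    candidates = rng.sample(range(len(rows[0])), k)
    split = best_split(rows, targets, candidates)
    if split is None:
        return statistics.fmean(targets)
    _, f, threshold = split
    branches = []
    for keep_left in (True, False):
        idx = [i for i, row in enumerate(rows) if (row[f] < threshold) == keep_left]
        branches.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                   k, rng, depth + 1, max_depth, min_size))
    return (f, threshold, branches[0], branches[1])


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node


def fit_forest(rows, targets, n_trees, k, seed=0):
    """Train n_trees trees, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        # Randomization 1: draw n samples with replacement (bootstrap).
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx], k, rng))
    return forest


def predict_forest(forest, row):
    """Final prediction: the average of the individual tree predictions."""
    return statistics.fmean([predict_tree(tree, row) for tree in forest])


# Synthetic data: three factors, only the first one drives the value.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(100)]
targets = [10.0 * row[0] + data_rng.gauss(0.0, 0.5) for row in rows]
forest = fit_forest(rows, targets, n_trees=15, k=2, seed=1)
high = predict_forest(forest, [0.9, 0.5, 0.5])
low = predict_forest(forest, [0.1, 0.5, 0.5])
print(round(low, 2), round(high, 2))
```

In practice a library implementation would be used; the point here is only to show where bootstrap sampling and the random attribute subset enter the algorithm.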

Since RF is a supervised learning algorithm, up-to-date market values for 1,789 real-estate properties located in the Pendik district of Istanbul province were collected as ground-truth data. The ground-truth data set was then randomly split in a 70:30 ratio into a construction (training) set and a validation set in order to evaluate the prediction performance of the RF algorithm. As a result, 1,250 points were selected as the training set and 537 points as the validation set.
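A random 70:30 split of this kind can be sketched as below. The function name, the seed and the 1,000 placeholder sample IDs are assumptions; the exact sampling procedure used in the study is not described.

```python
import random


def split_train_validation(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the ground-truth points.
sample_ids = list(range(1000))
train, validation = split_train_validation(sample_ids)
print(len(train), len(validation))
```

Fixing the seed keeps the split reproducible, so the validation scores of different parameter settings are compared on the same points.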

The input data set consisted of 20 factors (m = 20), hence the number of variables tried at each split (k) was set to 5, following $k = \sqrt{m}$ ($\sqrt{20} \approx 4.47$, rounded up to 5). The out-of-bag (OOB) error of the RF ensemble model was used to determine the number of trees. For this purpose, the model was first run with a large number of trees (500) to estimate how the OOB error changes as the number of trees increases. There was a sharp decline in OOB error from 0.040 to less than 0.030 as the number of trees increased from 1 to 150. After that, the OOB error continued to decrease slightly until the number of trees reached 250. Beyond this critical point, the OOB error stays stable. For this reason, the number of trees (n) was set to 250 for the current study.
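The parameter choice described above can be sketched as two small helpers. The OOB curve in the example is hypothetical: it only mimics the shape described in the text (sharp drop, slight decrease, then a plateau) and is not the study's data; the function names and the tolerance are also assumptions.

```python
import math


def variables_per_split(n_factors):
    """Square-root rule for the number of variables tried at each split, rounded up."""
    return math.ceil(math.sqrt(n_factors))


def stable_tree_count(oob_curve, tolerance=1e-4):
    """Smallest tree count after which OOB error never improves by more than tolerance.

    oob_curve is a non-empty list of (n_trees, oob_error) pairs sorted by n_trees.
    """
    for i, (n_trees, error) in enumerate(oob_curve):
        later_errors = [e for _, e in oob_curve[i + 1:]]
        if all(error - e <= tolerance for e in later_errors):
            return n_trees
    return oob_curve[-1][0]


# Hypothetical OOB errors, shaped like the behaviour described in the text.
oob_curve = [
    (1, 0.0400), (50, 0.0330), (150, 0.0295), (200, 0.0290),
    (250, 0.0286), (300, 0.0286), (400, 0.0286), (500, 0.0286),
]
print(variables_per_split(20))
print(stable_tree_count(oob_curve))
```

Choosing the smallest stable tree count keeps training time down without giving up accuracy, since additional trees beyond the plateau no longer reduce the OOB error.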

RF represents the combination of many decision trees. In this structure, each decision tree is trained with a randomly selected sample set from the training data set. After the decision trees are trained, some data are held out from the training data sets and used to determine the degree of importance of the different factors. Finally, all factors in the data set are ordered by importance according to the Z value.

The predictive accuracy of the RF model constructed with the user-defined parameters was tested using the following measures:
Root-mean-square error (RMSE) = 0.103
Mean absolute error (MAE) = 0.0758
Correlation coefficient (CorC) = 0.799
The RF model constructed with the optimal parameter setting was then applied to the whole data set to produce the land valuation map of the study area. The resulting thematic map was produced for Pendik.
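The three accuracy measures can be computed from observed and predicted values as sketched below. The sample numbers are made up for illustration and the correlation coefficient is assumed to be Pearson's r, which the slides do not state explicitly.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root-mean-square error."""
    return math.sqrt(statistics.fmean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def mae(observed, predicted):
    """Mean absolute error."""
    return statistics.fmean([abs(o - p) for o, p in zip(observed, predicted)])


def pearson(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return covariance / math.sqrt(var_o * var_p)


# Made-up validation values, only to exercise the functions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(observed, predicted))
print(mae(observed, predicted))
print(round(pearson(observed, predicted), 4))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both, as the study does, shows whether the error is dominated by a few badly predicted points.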

When the produced thematic map was analysed, it was observed that the south-west part of the Pendik district, especially the land along the seaside, was predicted to have high land value. Furthermore, areas close to the shopping centre in the north-east part of the study area were also predicted as high-value zones.

Conclusion
The thematic map produced with the RF algorithm shows the model predictions (supervised), while the thematic maps produced with the fuzzy logic approach show value trends in each thematic factor group (unsupervised). If ground-truth or real market values with geographic locations are available, random forest can be used to predict the real value of each pixel. According to the results, fuzzy logic and random forest algorithms, with the help of GIS analysis capabilities, give successful results and can be used for mass real estate valuation. Both can be programmed easily and can be used together or separately. Valuation maps can be produced more accurately, with fewer parameter selections by users.

Thank you for your attention
Prof. Dr. Arif Cagdas AYDINOGLU
Gebze Technical University, Dept. of Geomatics Engineering
41400 Gebze/Kocaeli, Turkey
Email: aydinoglu@gtu.edu.tr
Web: arifcagdas@com
Twitter: https://www.twitter.com/arifcagdas