Statistical methods for real estate data prof. RNDr. Beáta Stehlíková, CSc. 2013



Beáta Stehliková, Bratislava 2

Information is currently, besides financial, energy, and material resources, the main factor of progress.


How to obtain new knowledge?

We want to answer the question: How to obtain new information, new knowledge from data?


This talk covers only one method of spatial statistics.

Why spatial statistics? Methods of spatial statistics are designed for spatial data.

Real estate data very often contain information about geographic location, so they are spatial data.


Variable and data

A variable is a characteristic of a population or sample that is of interest to us.

Data are the actual values of variables.


Different kinds of data in real estate

Cross-sectional data are data on one or more variables collected at a single point in time.

Time series data are data collected over a period of time on one or more variables.

Panel data follow the same cross-section over time.

Example of cross-sectional data:

Obs   Price (SEK)   Living Area
1     600 000       80
2     750 000       95
3     675 000       75
4     825 000       84
...   ...           ...
200   925 000       96

Example of time series data:

Obs   Year   Index   GDP
1     1981   101     900
2     1982   105     1050
3     1983   110     1200
...   ...    ...     ...
20    1999   250     8500


Types of data (scale)

We have said that data are the actual values of variables.

Types of data:
Interval data are numerical observations
Ordinal data are ordered categorical observations
Nominal data are categorical observations


Types of data (scale)

Knowing the type of data (scale) is necessary to properly select the technique to be used when analyzing data.

Descriptive statistics

Descriptive statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information is produced.


Descriptive statistics

Graphical techniques (histogram)

Numerical descriptive measures:
Mean (average)
Median (middle value)
Mode (most frequent value)
Variance
Standard deviation
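The numerical measures listed above can be computed with Python's standard `statistics` module. The sketch below is only an illustration: the living areas are hypothetical values, not data from this talk, and the sample (n - 1) versions of variance and standard deviation are an assumed choice.

```python
import statistics

# Hypothetical living areas in square metres (illustrative values only).
living_area = [80, 95, 75, 84, 96, 80]

mean = statistics.mean(living_area)          # average
median = statistics.median(living_area)      # middle value
modes = statistics.multimode(living_area)    # most frequent value(s)
variance = statistics.variance(living_area)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(living_area)      # sample standard deviation
cv_percent = 100 * std_dev / mean            # coefficient of variation in percent

print(f"mean={mean}, median={median}, modes={modes}")
print(f"variance={variance:.2f}, std_dev={std_dev:.2f}, CV={cv_percent:.1f} %")
```

The coefficient of variation is the standard deviation divided by the mean, which makes spread comparable between variables measured in different units.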


Descriptive statistics are not enough

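Consider two data sets A and B. Both have the same summary statistics: average 17.8, standard deviation 4.7, coefficient of variation 26.4 %, n = 25.

[Figure: histograms of data sets A and B]

Data set A
Bin centre   6.1   10.1   14.1   18.1   22.1
Frequency    1     4      0      11     9

Data set B
Bin centre   9.8   13.8   17.8   21.8   25.8
Frequency    2     10     0      12     1

It is necessary to know the probability distribution.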
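That A and B share the same summary statistics can be checked from the histogram frequencies. A minimal Python sketch, assuming that the axis labels of the two histograms are bin centres, that the first histogram belongs to A and the second to B, and that the population form of the standard deviation is meant (the slide does not say which form was used):

```python
import statistics

def expand(bin_centres, frequencies):
    """Repeat each bin centre as many times as its frequency."""
    data = []
    for centre, freq in zip(bin_centres, frequencies):
        data.extend([centre] * freq)
    return data

# Bin centres and frequencies read from the two histograms (assumed mapping).
data_a = expand([6.1, 10.1, 14.1, 18.1, 22.1], [1, 4, 0, 11, 9])
data_b = expand([9.8, 13.8, 17.8, 21.8, 25.8], [2, 10, 0, 12, 1])

for name, data in (("A", data_a), ("B", data_b)):
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # population standard deviation (assumption)
    print(f"{name}: n={len(data)}, mean={mean:.1f}, sd={sd:.1f}")
```

The two summaries agree after rounding although the frequencies are distributed differently, which is why the summary numbers alone are not enough.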


Second example

Consider two large data sets A and B


The location information

It is not possible to identify differences between the data sets unless we take the location information into account.


The location information

Variograms quantify how values change across space.

There is spatial autocorrelation: small distances correspond to small changes in values.

There is no spatial autocorrelation: small distances correspond to large changes in values.
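One common way to quantify this is the empirical semivariogram: half the average squared difference between pairs of values, grouped by the distance separating them. The sketch below is a minimal illustration in plain Python; the function name, the coordinates, and the values are hypothetical and not taken from this talk.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, bin_width):
    """Return {bin index k: semivariance}, where bin k covers distances
    in [k * bin_width, (k + 1) * bin_width)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            k = int(d // bin_width)
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    # Semivariance: half the mean squared difference within each distance bin.
    return {k: sums[k] / (2 * counts[k]) for k in sorted(counts)}

# Hypothetical example: five dwellings along a street whose values
# change smoothly with position.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
values = [10.0, 11.0, 12.0, 13.0, 14.0]
gamma = empirical_semivariogram(points, values, bin_width=1.0)
for k, g in gamma.items():
    print(f"distance bin {k}: semivariance {g}")
```

In this example the semivariance is small at short distances and grows with distance, which is the pattern described for spatially autocorrelated data.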


Spatial autocorrelation

The degree to which near and more distant things are interrelated

Measures of spatial autocorrelation attempt to deal with similarities in the location of spatial objects and their attributes


Spatial autocorrelation

Positive (objects similar in location are similar in attribute)

Negative (objects similar in location are very different)

Zero (attributes are independent of location)


Spatial autocorrelation - measures.

Several measures available: Moran’s coefficient I, Geary’s C coefficient, Getis-Ord coefficient G.

These measures may be
• “global”: they apply to the whole study region, or
• “local”: autocorrelation may exist in some parts of the region but not in others.


Moran’s coefficient I

Varies between –1.0 and +1.0
A value close to 0 (more precisely, close to the expected value –1/(n–1)) indicates no spatial autocorrelation (a random pattern)
When autocorrelation is high, the I coefficient is close to 1 or –1
Negative values of I indicate negative autocorrelation
Positive values of I indicate positive autocorrelation (a tendency toward clustering)
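For illustration, global Moran's I can be computed as (n / S0) times the weighted sum of cross-products of deviations from the mean, divided by the sum of squared deviations, where S0 is the sum of all weights. The sketch below is a minimal plain-Python version; the four locations, the binary neighbour weights, and the values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Hypothetical weights: four locations in a row, each a neighbour
# of the locations directly next to it.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_clustered = morans_i([1, 1, 5, 5], weights)    # similar values are neighbours
i_alternating = morans_i([1, 5, 1, 5], weights)  # neighbours differ strongly
print(i_clustered, i_alternating)
```

The clustered arrangement gives a positive coefficient and the alternating arrangement a negative one, matching the interpretation of the sign given above.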


Regression analysis

is a technique for using data to identify relationships among variables and use these relationships to make predictions.
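A minimal sketch of a simple linear regression fitted by ordinary least squares, using only the five rows visible in the example table of this talk (price in SEK and living area). The function name and the living area of 90 used for the prediction are illustrative assumptions, and five rows are far too few for a real analysis.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx            # slope
    b0 = mean_y - b1 * mean_x   # intercept
    return b0, b1

# The five rows shown in the example table (observations 1-4 and 200).
living_area = [80, 95, 75, 84, 96]
price = [600_000, 750_000, 675_000, 825_000, 925_000]

b0, b1 = fit_line(living_area, price)
predicted = b0 + b1 * 90  # predicted price for a hypothetical living area of 90
print(f"intercept={b0:.0f}, slope={b1:.0f}, prediction={predicted:.0f}")
```

The slope estimates how much the price changes per additional unit of living area, and the fitted line is then used to make predictions.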


Regression analyses that ignore spatial dependency can have unstable parameter estimates and unreliable significance tests.

Solution: spatial autoregressive models (the spatial lag model and the spatial error model)


Spatial Models

Ordinary Least Squares: no influence from neighbors
Y = β0 + Xβ + ε

Spatial lag: dependent variable influenced by neighbors
Y = β0 + λWY + Xβ + ε

Spatial error: residuals influenced by neighbors
Y = β0 + Xβ + ρWε + ξ

Here W is the spatial weights matrix, λ and ρ are the spatial autoregressive coefficients, ε is the error term, and ξ is the uncorrelated part of the error.

The lag model controls spatial autocorrelation in the dependent variable.

The error model controls spatial autocorrelation in the residuals; thus it controls autocorrelation in the dependent and the independent variables.
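The term WY in the lag model is the spatial lag of the dependent variable. With row-standardised weights (an assumption here, since the talk does not say how W is scaled) it is simply the average of Y over the neighbors of each location. A minimal plain-Python sketch with hypothetical locations and prices:

```python
def row_standardize(weights):
    """Scale each row of the weights matrix so that it sums to 1."""
    result = []
    for row in weights:
        total = sum(row)
        result.append([w / total if total else 0.0 for w in row])
    return result

def spatial_lag(weights, y):
    """Return the vector W y: a weighted sum of neighbours' values."""
    return [sum(w * v for w, v in zip(row, y)) for row in weights]

# Hypothetical example: four dwellings in a row, neighbours are adjacent ones.
binary_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
prices = [100, 200, 300, 400]  # hypothetical prices

w = row_standardize(binary_weights)
wy = spatial_lag(w, prices)
print(wy)  # average price of each dwelling's neighbours
```

Estimating λ or ρ requires maximum likelihood or a similar method and is not shown here; the sketch only illustrates what the WY term contains.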


Software GeoDa


Compare different spatial models

Neither R² nor adjusted R² can be used to compare different spatial regression models

We can use the Akaike Information Criterion (AIC): the smaller the AIC value, the better the model
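The AIC is defined as 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximised log-likelihood of the model. The sketch below shows the comparison; the log-likelihoods and parameter counts are invented for illustration and are not results from this talk.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fitted models: (maximised log-likelihood, number of parameters).
models = {
    "OLS": (-310.0, 2),
    "spatial lag": (-305.5, 3),
    "spatial error": (-303.0, 3),
}

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)  # smallest AIC is preferred

for name, score in scores.items():
    print(f"{name}: AIC = {score}")
print("preferred model:", best)
```

Because the criterion penalises extra parameters, a spatial model is preferred only if its gain in log-likelihood outweighs the additional coefficient.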


Example

Dependent variable y – price of dwelling

Independent variable x – living area

Classical regression analysis


Residuals

Moran’s I = 0.193022

Significance: p-value = 0.03140 < 0.05

This indicates positive spatial autocorrelation among the residuals.


Spatial error model


Local Moran’s coefficients

Which values produce spatial autocorrelation?
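A common form of the local Moran coefficient for location i is I_i = (z_i / m2) * sum_j w_ij z_j, where z_i is the deviation from the mean and m2 is the mean squared deviation. The sketch below assumes row-standardised weights, in which case the average of the local coefficients equals the global Moran's I; the locations and values are hypothetical.

```python
def local_morans_i(values, weights):
    """Local Moran's I_i for each location, given spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # mean squared deviation
    return [
        z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]

# Hypothetical row-standardised weights for four locations in a row.
weights = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
values = [1, 1, 5, 5]  # hypothetical attribute values

local_i = local_morans_i(values, weights)
global_i = sum(local_i) / len(local_i)
print(local_i, global_i)
```

Locations with large positive local coefficients are the ones surrounded by similar values, so they contribute most to positive global autocorrelation.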


Spatial statistics

Methods of spatial statistics are very useful for data with location information.

Art looks for beauty, and science looks for truth.

Spatial statistics will help us find the truth when we use the right methods.
