Moran’s I in PostGIS

artlembo
Aug 25, 2016

Just thinking out loud here. I’ve always been bothered by the complexity of Moran’s I. Actually, it’s not complex, it’s just math. And it’s really nothing more than Pearson’s correlation coefficient tricked into a spatial context. Granted, even calculating Pearson’s by hand is a pain. But really, it is nothing more than performing a correlation on two arrays. In this case, the arrays hold the values of adjacent features. Nowadays, we have great tools for calculating the correlation coefficient, so if you can get two arrays representing the adjacent data, you simply wrap your query in that function. Take a look here as I revisit Figure 15.4 of my textbook, An Introduction to Statistical Problem Solving in Geography:

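-- Correlate each tract's pctwhite with that of every tract it touches
SELECT corr(a.pctwhite, b.pctwhite)
FROM cleveland AS a, cleveland AS b
WHERE ST_Touches(a.geometry, b.geometry)  -- adjacent tracts only
  AND a."OID" <> b."OID";                 -- never pair a tract with itself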

We are simply finding those census tracts that are adjacent to one another and obtaining their respective pctwhite values. The self-join returns two columns, one value for each tract in an adjacent pair, which we pass into the correlation function (corr).

The results are nearly identical to Esri’s Moran’s I index.
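
For anyone who wants to check that comparison inside the database, here is a minimal sketch of the textbook Moran’s I written in the same style. It assumes binary contiguity weights (1 if two tracts touch, 0 otherwise) with no row standardization, which may not match the settings used in Esri’s tool, and the CTE names (stats, dev, pairs) are made up for illustration. The one real difference from the corr() shortcut is that Moran’s I takes the mean and variance over all tracts, whereas corr() computes them over the list of adjacent pairs, so tracts with many neighbors count more heavily there.

```sql
-- Sketch only: global Moran's I with binary "touches" weights (an assumption).
WITH stats AS (
    -- overall mean and number of tracts
    SELECT avg(pctwhite::float8) AS mean_x, count(*) AS n
    FROM cleveland
),
dev AS (
    -- each tract's deviation from the overall mean
    SELECT c."OID" AS id, c.geometry, c.pctwhite::float8 - s.mean_x AS d
    FROM cleveland AS c CROSS JOIN stats AS s
),
pairs AS (
    -- one row per ordered pair of adjacent tracts (each pair appears twice)
    SELECT a.d AS da, b.d AS db
    FROM dev AS a
    JOIN dev AS b
      ON ST_Touches(a.geometry, b.geometry)
     AND a.id <> b.id
)
-- I = (N / W) * sum of neighbor cross-products / sum of squared deviations
SELECT (SELECT n FROM stats)::float8
       / (SELECT count(*) FROM pairs)
       * (SELECT sum(da * db) FROM pairs)
       / (SELECT sum(d * d) FROM dev) AS morans_i;
```

Here W is simply the number of rows in pairs, since every weight is 1. Running this next to the corr() query on the same table should show how close the two numbers are for a given dataset.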

What do our statistician friends think?

If you want to learn how to write spatial SQL code, work with Postgres, or understand statistics and geography, check out my courses here.
