Doing a GROUP BY in ArcGIS
As most of you know, I am a big fan of spatial SQL. It is my go-to tool whenever working with GIS. But, I have seen too many people using ArcGIS get tripped up with trying to summarize the results of a spatial operation because ArcGIS does not support SQL. So today, to spare my ArcGIS friends the trouble of writing large “for” loops in Arcpy to populate data that takes hours to run, I want to show you two lines of Arcpy to very quickly replicate the GROUP BY function in SQL:
Using my favorite GIS data set in Tompkins County, NY, assume we have two layers: parcels2007 and watersheds:
What I want to do is calculate for each watershed, the sum of all the parcel areas. In SQL, we would create a query statement like:
SELECT Sum(Area(ClipIntersect([Watersheds].id,[Parcels2007].id))) AS watpar, watershed INTO temptable FROM [Watersheds],[Parcels2007] WHERE touches([Watersheds].id,[Parcels2007].id) GROUP BY watershed
In ArcGIS, you can perform an Intersect_Analysis, and then you can summarize the result of that analysis by taking the SUM of the resulting area (Shape_Area) and essentially group it by the watershed name (watershed):
arcpy.Intersect_analysis(["watersheds", "parcels2007"],"watpar","ALL") arcpy.Statistics_analysis("watpar","watsum",[["Shape_Area","SUM"]],"watershed")
So that’s it. Two lines of Arcpy code.