
Descriptive Spatial Statistics
In Lesson 2, we highlighted some descriptive statistics that are useful for measuring geographic distributions. Additional details are provided in Table 3.3. Functions for these methods are also available in ArcGIS.
Table 3.3. Measures of geographic distributions that can be used to identify the center, shape, and orientation of a pattern, or how dispersed its features are in space.
Spatial Descriptive Statistic | Description | Calculation |
---|---|---|
Central Distance | Identifies the most centrally located feature for a set of points, polygon(s) or line(s) | The point with the shortest total distance to all other points is the most central feature: $D_i = \sum_{j=1,\, j \neq i}^{n} \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}$, $D_{central} = \min_i D_i$ |
Mean Center (there is also a median center called Manhattan Center) | Identifies the geographic center (or the center of concentration) for a set of features. **The mean is sensitive to outliers.** | Simply the mean of the X coordinates and the mean of the Y coordinates for a set of points: $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$, $\bar{Y} = \frac{\sum_{i=1}^{n} y_i}{n}$ |
Weighted Mean Center | Like the mean center, but allows weighting by an attribute. | Produced by weighting each X and Y coordinate by another variable ($w_i$): $\bar{X}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$, $\bar{Y}_w = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$ |
Standard Distance | Measures the degree to which features are concentrated or dispersed around the geometric mean center. The greater the standard distance, the more the distances vary from the average, and thus the more widely the features are dispersed around the center. Standard distance is a good single measure of the dispersion of the points around the mean center, but it doesn’t capture the shape of the distribution. | Represents the standard deviation of the distance of each point from the mean center: $SD = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n} + \frac{\sum_{i=1}^{n} (y_i - \bar{Y})^2}{n}}$, where $x_i$ and $y_i$ are the coordinates of feature $i$ and $\bar{X}$ and $\bar{Y}$ are the coordinates of the mean center. Weighted SD: $SD_w = \sqrt{\frac{\sum_{i=1}^{n} w_i (x_i - \bar{X}_w)^2}{\sum_{i=1}^{n} w_i} + \frac{\sum_{i=1}^{n} w_i (y_i - \bar{Y}_w)^2}{\sum_{i=1}^{n} w_i}}$, where $w_i$ is the weight of feature $i$ and $\bar{X}_w$ and $\bar{Y}_w$ are the coordinates of the weighted mean center. |
Standard Deviational Ellipse | Captures the shape of the distribution. | Creates standard deviational ellipses to summarize the spatial characteristics of geographic features: Central tendency, Dispersion and Directional trends |
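
The weighted standard distance in Table 3.3 can be sketched in a few lines of R. The coordinates and weights below are hypothetical toy values, not the lesson's crime data, and the variable names are illustrative only. The sketch assumes the weighted form that divides by the sum of the weights and measures deviations from the weighted mean center.

```r
# Hypothetical toy data: four point locations with incident counts as weights
x <- c(0, 2, 2, 0)
y <- c(0, 0, 2, 2)
w <- c(1, 1, 1, 5)

# Weighted mean center: each coordinate is weighted by w
xbar_w <- sum(w * x) / sum(w)
ybar_w <- sum(w * y) / sum(w)

# Unweighted standard distance around the ordinary mean center
sd_unweighted <- sqrt(mean((x - mean(x))^2) + mean((y - mean(y))^2))

# Weighted standard distance around the weighted mean center
# (assumption: the denominator is the sum of the weights)
sd_weighted <- sqrt(sum(w * (x - xbar_w)^2) / sum(w) +
                    sum(w * (y - ybar_w)^2) / sum(w))

print(c(xbar_w = xbar_w, ybar_w = ybar_w))
print(c(sd_unweighted = sd_unweighted, sd_weighted = sd_weighted))
```

In this toy case the heavily weighted point pulls the center toward itself, so the weighted standard distance comes out smaller than the unweighted one.
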
For this analysis, use the crime types that you selected earlier. The example here is for the homicide data.
```r
#------MEAN CENTER
# Calculate the mean center of the crime locations
xmean <- mean(xhomicide$x)
ymean <- mean(xhomicide$y)

#------MEDIAN CENTER
# Calculate the median center of the crime locations
xmed <- median(xhomicide$x)
ymed <- median(xhomicide$y)

# To access the variables in the shapefile, convert the data to a data.frame
newhom_df <- data.frame(xhomicide)

# Check the definition of the variables
str(newhom_df)

# If the variables you are using are defined as factors, convert them to integers.
# Going through as.character() keeps the recorded values; as.integer() applied
# directly to a factor would return the internal level codes instead.
newhom_df$FREQUENCY <- as.integer(as.character(newhom_df$FREQUENCY))
newhom_df$OBJECTID <- as.integer(as.character(newhom_df$OBJECTID))

# Number of points; used to control the loops below
n <- length(xhomicide$x)

#------WEIGHTED MEAN CENTER
# Calculate the weighted mean, using FREQUENCY as the weight
sumcount <- 0
sumxbar <- 0
sumybar <- 0
for (i in 1:n) {
  xbar <- xhomicide$x[i] * newhom_df$FREQUENCY[i]
  ybar <- xhomicide$y[i] * newhom_df$FREQUENCY[i]
  sumxbar <- xbar + sumxbar
  sumybar <- ybar + sumybar
  sumcount <- newhom_df$FREQUENCY[i] + sumcount
}
xbarw <- sumxbar / sumcount
ybarw <- sumybar / sumcount

#------STANDARD DISTANCE OF CRIMES
# Compute the standard distance of the crimes.
# Vectorized equivalent:
# Std_Dist <- sqrt(sum((xhomicide$x - xmean)^2 + (xhomicide$y - ymean)^2) / n)
d <- 0
for (i in 1:n) {
  sqdist <- (xhomicide$x[i] - xmean)^2 + (xhomicide$y[i] - ymean)^2
  d <- d + sqdist
}
Std_Dist <- sqrt(d / n)

# Make a circle of one standard distance about the mean center
bearing <- 1:360 * pi / 180
cx <- xmean + Std_Dist * cos(bearing)
cy <- ymean + Std_Dist * sin(bearing)
circle <- cbind(cx, cy)

#------CENTRAL POINT
# Identify the most central point: the point with the shortest
# total distance to all other points, sqrt((x2-x1)^2 + (y2-y1)^2)
sumdist2 <- Inf  # smallest total distance found so far
for (i in 1:n) {
  x1 <- xhomicide$x[i]
  y1 <- xhomicide$y[i]
  recno <- newhom_df$OBJECTID[i]

  # Sum the distances from point i to all other points
  sumdist1 <- 0
  for (j in 1:n) {
    recno2 <- newhom_df$OBJECTID[j]
    x2 <- xhomicide$x[j]
    y2 <- xhomicide$y[j]
    if (recno != recno2) {
      dist1 <- sqrt((x2 - x1)^2 + (y2 - y1)^2)
      sumdist1 <- sumdist1 + dist1
    }
  }

  # Keep this point if its total distance is the smallest so far
  if (sumdist1 < sumdist2) {
    dist3 <- list(recno, sumdist1, x1, y1)
    sumdist2 <- sumdist1
    xdistmin <- x1
    ydistmin <- y1
  }
}

#------MAP THE RESULTS
# Plot the different centers with the crime data
plot(Sbnd)
points(xhomicide$x, xhomicide$y)
points(xmean, ymean, col = "red", cex = 1.5, pch = 19)           # draw the mean center
points(xmed, ymed, col = "green", cex = 1.5, pch = 19)           # draw the median center
points(xbarw, ybarw, col = "blue", cex = 1.5, pch = 19)          # draw the weighted mean center
points(xdistmin, ydistmin, col = "orange", cex = 1.5, pch = 19)  # draw the central point
lines(circle, col = "red", lwd = 2)                              # draw the standard distance circle
```
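
Table 3.3 also lists the standard deviational ellipse, which the script above does not compute. The following is a minimal sketch of one way to approximate it, assuming the ellipse axes are taken from the eigen-decomposition of the coordinate covariance matrix. This is a substituted technique rather than the lesson's or ArcGIS's own procedure, and GIS packages may scale the axes differently. The coordinates are hypothetical toy values.

```r
# Hypothetical toy coordinates with a south-west to north-east trend
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1.2, 1.9, 3.4, 3.8, 5.1, 6.0)
n <- length(x)

xmean <- mean(x)
ymean <- mean(y)

# Deviations from the mean center and their population (divide-by-n) covariance
dev <- cbind(x - xmean, y - ymean)
S <- crossprod(dev) / n

# Eigenvectors give the axis directions; eigenvalues give the variance along
# each axis (returned in decreasing order)
e <- eigen(S, symmetric = TRUE)
semi_major <- sqrt(e$values[1])
semi_minor <- sqrt(e$values[2])

# Orientation of the major axis, in radians from the x-axis. The sign of an
# eigenvector is arbitrary, so theta may differ by pi; the ellipse is the same.
theta <- atan2(e$vectors[2, 1], e$vectors[1, 1])

# Trace the ellipse outline around the mean center
ang <- seq(0, 2 * pi, length.out = 361)
ellipse <- cbind(
  xmean + semi_major * cos(ang) * cos(theta) - semi_minor * sin(ang) * sin(theta),
  ymean + semi_major * cos(ang) * sin(theta) + semi_minor * sin(ang) * cos(theta)
)

# Standard distance for comparison: its square equals the sum of the two
# axis variances
std_dist <- sqrt(sum(dev^2) / n)
print(c(semi_major = semi_major, semi_minor = semi_minor, std_dist = std_dist))
```

With real data, substituting the crime coordinates for `x` and `y` and calling `lines(ellipse, lwd = 2)` after the existing plot would overlay the outline on the map. Under this scaling the standard distance circle and the ellipse summarize the same total variance, but only the ellipse shows the directional trend.
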
Deliverable
Perform point pattern analysis on any two of the available crime datasets (DUI, arson, or homicide). It would be beneficial to choose crimes with contrasting patterns. For your analysis, you should choose whatever methods seem the most useful, and present your findings in the form of maps, graphs, and accompanying commentary.