
Saturday, April 13, 2019

WSP Collision Data for WM County 01/01/2017 - 04/01/2019: Initial text analysis of MV_Drvr_CitationCharge

From the WSP Collision Data for Whatcom County, I am doing my best to summarize the 968 unique 'Citation Charges' from the 7,174 Whatcom County collisions I am tracking for the 27-month period covering 01/01/2017 - 04/01/2019. I have developed some text mining techniques to summarize the 'top citations', and I am interested in whether this information gives me insight into Whatcom County driver behavior in collisions. If I had to summarize my findings to date, I would say that the top causes of collisions, or of dangerous driver behavior, are:

(1) "Following too close"
(2) "Driving too fast"
(3) "Driver inattention"
(4) "Driving under the influence of alcohol, drugs or other attention diverting substances."

The MV_Drvr_CitationCharge field contains much duplication. I am developing some text mining and aggregation techniques to abstract more generalized categories. Yet it seems clear that speeding, 'tailgating' (following too closely to stop safely), and driving under the influence (probably most often of alcohol) are the most significant precursors to collisions in Whatcom County. One can see from a simple aggregation of the unique citation terms (e.g. "FOLLOWING TOO CLOSE", "FOLLOW TOO CLOSE") that more generalized grouping is needed:

mergeCD[MV_Drvr_CitationCharge != "",.N,.(MV_Drvr_CitationCharge)][order(-N)][1:15]
                 MV_Drvr_CitationCharge   N
 1:                 FOLLOWING TOO CLOSE 437
 2:                  DRIVER INATTENTION 422
 3:       SPEED TOO FAST FOR CONDITIONS 397
 4:                                 DUI 174
 5:     DRIVING WITH WHEELS OFF ROADWAY 148
 6: FAIL TO YIELD RIGHT OF WAY-LEFT TUR 145
 7:                 IMPROPER LANE USAGE 143
 8:      FAIL TO YIELD THE RIGHT OF WAY 114
 9: FAIL TO REDUCE SPEED FOR CONDITIONS  81
10: FLD TO YIELD ROW FROM DRIVEWAY OR P  76
11:  FAIL TO STOP/YIELD AT INTERSECTION  71
12: FAIL STOP AT STOP SIGN/INTERSECTION  68
13:  FLD SIGNAL STOPS/TURNS-UNSAFE LANE  66
14:                    FOLLOW TOO CLOSE  61
15:         FAILURE TO MAINTAIN CONTROL  52
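
One way to fold such spelling variants together before counting is to pass each charge through a small set of regular-expression rules. The following is a minimal base R sketch of that idea, using a few charges and counts copied from the output above. The `rules` vector and the `normalize_charge()` helper are hypothetical names introduced for illustration, and reading "FLD" as an abbreviation of "FAILED" is an assumption.

```r
# Charge strings and counts copied from the top-15 output above.
charges <- c("FOLLOWING TOO CLOSE", "DRIVER INATTENTION",
             "FAIL TO YIELD THE RIGHT OF WAY",
             "FLD TO YIELD ROW FROM DRIVEWAY OR P", "FOLLOW TOO CLOSE")
counts  <- c(437, 422, 114, 76, 61)

# Hypothetical rules: regex pattern -> generalized label (first match wins).
rules <- c("^FOLLOW(ING)? TOO CLOSE" = "FOLLOWING TOO CLOSE",
           "^(FAIL|FLD) TO YIELD"    = "FAIL TO YIELD")

normalize_charge <- function(x) {
  out  <- x                         # charges matching no rule are kept as-is
  done <- rep(FALSE, length(x))
  for (i in seq_along(rules)) {
    hit <- !done & grepl(names(rules)[i], x)
    out[hit] <- rules[[i]]
    done <- done | hit
  }
  out
}

# Sum the counts within each generalized label, largest first.
grouped <- sort(sapply(split(counts, normalize_charge(charges)), sum),
                decreasing = TRUE)
print(grouped)
```

With these five rows, the two "follow too close" variants collapse into one category of 437 + 61 = 498 citations. A hand-written rule table like this stays readable for a few dozen variants, but it would not scale well to all 968 unique charges.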

If I use the citation fields as an index, my 'corpus' of the top 15 single-word terms with at least 100 mentions each is as follows:

as.matrix(findFreqTerms(CitDtm, lowfreq=100)[1:15])
      [,1]         
 [1,] "dui" # Note: 'Driving under the influence'     
 [2,] "inattention"
 [3,] "fail"       
 [4,] "obey"       
 [5,] "improper"   
 [6,] "turn"       
 [7,] "control"    
 [8,] "left"       
 [9,] "yield"      
[10,] "dwls"  # Note: 'Driving with license suspended'     
[11,] "speed"      
[12,] "stop"       
[13,] "close"      
[14,] "follow"     
[15,] "driver"     
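
The call above depends on a document-term matrix (CitDtm) whose construction is not shown here. As a rough stand-in, the same kind of term count can be sketched in base R by splitting each charge into words and weighting each word by the number of collisions citing that charge. The names `stopwords`, `term_freq` and `min_freq` are hypothetical, the tiny stopword list is an assumption, and the sample rows are copied from the first output above.

```r
# A few charge strings with their counts, copied from the output above.
charges <- c("FOLLOWING TOO CLOSE", "DRIVER INATTENTION",
             "SPEED TOO FAST FOR CONDITIONS", "DUI", "FOLLOW TOO CLOSE")
counts  <- c(437, 422, 397, 174, 61)

# Hypothetical stopword list; a real text mining pipeline would use a fuller one.
stopwords <- c("too", "for", "to", "the", "of")

# Split each charge into lowercase single-word terms and drop stopwords.
tokens <- strsplit(tolower(charges), "[^a-z]+")
tokens <- lapply(tokens, function(w) w[nzchar(w) & !(w %in% stopwords)])

# Weight every term by the number of collisions citing that charge.
terms     <- unlist(tokens)
weights   <- rep(counts, lengths(tokens))
term_freq <- sort(sapply(split(weights, terms), sum), decreasing = TRUE)

# Keep terms mentioned at least `min_freq` times.
min_freq <- 100
frequent <- names(term_freq)[term_freq >= min_freq]
print(term_freq[frequent])
```

Without stemming, "following" and "follow" are counted as separate terms in this sketch, so "follow" (61) falls below the threshold; a stemming step would merge the two.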

Using those top 'corpus' terms, I can aggregate a top 15 set of "citation categories" for Whatcom County collisions for my 01/2017 - 04/2019 study period:

             CitCharge   N
 1:                 NA 878 # No citation
 2:       CLOSE FOLLOW 635
 3:              SPEED 538
 4: INATTENTION DRIVER 463
 5:         FAIL YIELD 289
 6:                DUI 261
 7:           IMPROPER 261
 8:    FAIL LEFT YIELD 160
 9:              YIELD 112
10:          FAIL STOP 107
11:         FAIL SPEED 102
12:    FAIL YIELD STOP  97
13:  FAIL OBEY CONTROL  88
14:       FAIL CONTROL  74
15:          TURN STOP  70
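
A minimal sketch of this kind of aggregation, assuming each charge is labeled by whichever top terms it contains, joined in the order of the term list above (which is consistent with labels such as "FAIL LEFT YIELD"). The helper name `label_charge()`, the use of plain substring matching, and the "OTHER" bucket for unmatched charges are illustrative assumptions rather than the exact code behind the table.

```r
# Top terms copied from the findFreqTerms() output above, in the same order.
top_terms <- c("dui", "inattention", "fail", "obey", "improper", "turn",
               "control", "left", "yield", "dwls", "speed", "stop",
               "close", "follow", "driver")

# Hypothetical helper: join every top term found in a charge (substring match,
# so "follow" also matches "FOLLOWING"); return NA when no term matches.
label_charge <- function(charge) {
  found <- top_terms[vapply(top_terms, grepl, logical(1),
                            x = tolower(charge), fixed = TRUE)]
  if (length(found) == 0) NA_character_
  else toupper(paste(found, collapse = " "))
}

# Sample charges and counts copied from the first output above.
charges <- c("FOLLOWING TOO CLOSE", "FOLLOW TOO CLOSE", "DRIVER INATTENTION",
             "FAIL TO YIELD RIGHT OF WAY-LEFT TUR",
             "DRIVING WITH WHEELS OFF ROADWAY")
counts  <- c(437, 61, 422, 145, 148)

labels <- vapply(charges, label_charge, character(1), USE.NAMES = FALSE)
labels[is.na(labels)] <- "OTHER"   # charges containing none of the top terms

category_counts <- sort(sapply(split(counts, labels), sum), decreasing = TRUE)
print(category_counts)
```

Substring matching is what lets "FOLLOWING TOO CLOSE" and "FOLLOW TOO CLOSE" land in the same "CLOSE FOLLOW" category here. It also means a truncated word such as "TUR" does not match "turn".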


Inference and Conclusion

A possible inference from the WSP MV_Drvr_CitationCharge field data, for the average of 8.7 collisions per day in Whatcom County over my 27-month study period, is that the causes of most collisions are behavioral. A possible hypothesis for future study is that an increasing Whatcom County growth rate, increased vehicle traffic (especially on I-5), a changing social fabric, and status anxiety among residents are factors that could be having psychological effects on driver alertness in Whatcom County.


Appendix 1

7174 Collisions / 820 days
[1] 8.74878 collisions per day
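
The 820-day figure can be checked from the study dates themselves. A short base R sketch, assuming the period runs from 01/01/2017 up to but not including 04/01/2019 (the variable names are illustrative):

```r
# Study period boundaries taken from the post title.
start_date <- as.Date("2017-01-01")
end_date   <- as.Date("2019-04-01")

# Subtracting two Date objects gives a difference in days.
n_days <- as.numeric(end_date - start_date)

n_collisions <- 7174
per_day <- n_collisions / n_days

print(n_days)
print(per_day)
```

This gives 365 + 365 + 90 = 820 days and about 8.75 collisions per day, matching the figures above.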
