(1) "Following too close"
(2) "Driving too fast"
(3) "Driver inattention"
(4) "Driving under the influence of alcohol, drugs or other attention diverting substances."
The MV_Drvr_CitationCharge field contains many near-duplicate entries. I am developing some text mining and aggregation techniques to abstract more generalized categories. Still, it seems clear that speeding, 'tailgating' (following with too short a stopping distance), and driving under the influence (probably most often alcohol) are the major precursors to collisions in Whatcom County. A simple aggregation of the unique citation terms (e.g. "FOLLOWING TOO CLOSE", "FOLLOW TOO CLOSE") shows that more generalized grouping is needed:
mergeCD[MV_Drvr_CitationCharge != "",.N,.(MV_Drvr_CitationCharge)][order(-N)][1:15]
MV_Drvr_CitationCharge N
1: FOLLOWING TOO CLOSE 437
2: DRIVER INATTENTION 422
3: SPEED TOO FAST FOR CONDITIONS 397
4: DUI 174
5: DRIVING WITH WHEELS OFF ROADWAY 148
6: FAIL TO YIELD RIGHT OF WAY-LEFT TUR 145
7: IMPROPER LANE USAGE 143
8: FAIL TO YIELD THE RIGHT OF WAY 114
9: FAIL TO REDUCE SPEED FOR CONDITIONS 81
10: FLD TO YIELD ROW FROM DRIVEWAY OR P 76
11: FAIL TO STOP/YIELD AT INTERSECTION 71
12: FAIL STOP AT STOP SIGN/INTERSECTION 68
13: FLD SIGNAL STOPS/TURNS-UNSAFE LANE 66
14: FOLLOW TOO CLOSE 61
15: FAILURE TO MAINTAIN CONTROL 52
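To get a feel for how much the spelling variants split the counts, the top-15 rows above can be regrouped by a few keyword patterns. This is only a rough sketch: the counts are copied from the output above (not the full data set), and the pattern list `patterns` and the other variable names are illustrative assumptions, not the grouping used later in this post.

```r
# Counts copied from the top-15 output above (not the full data set).
top15 <- c(
  "FOLLOWING TOO CLOSE"                 = 437,
  "DRIVER INATTENTION"                  = 422,
  "SPEED TOO FAST FOR CONDITIONS"       = 397,
  "DUI"                                 = 174,
  "DRIVING WITH WHEELS OFF ROADWAY"     = 148,
  "FAIL TO YIELD RIGHT OF WAY-LEFT TUR" = 145,
  "IMPROPER LANE USAGE"                 = 143,
  "FAIL TO YIELD THE RIGHT OF WAY"      = 114,
  "FAIL TO REDUCE SPEED FOR CONDITIONS" = 81,
  "FLD TO YIELD ROW FROM DRIVEWAY OR P" = 76,
  "FAIL TO STOP/YIELD AT INTERSECTION"  = 71,
  "FAIL STOP AT STOP SIGN/INTERSECTION" = 68,
  "FLD SIGNAL STOPS/TURNS-UNSAFE LANE"  = 66,
  "FOLLOW TOO CLOSE"                    = 61,
  "FAILURE TO MAINTAIN CONTROL"         = 52
)

# Hypothetical keyword patterns; each is matched as a plain substring.
patterns <- c(FOLLOW = "FOLLOW", SPEED = "SPEED", YIELD = "YIELD")

# Sum the counts of every row whose charge text contains the pattern.
grouped <- vapply(
  patterns,
  function(p) sum(top15[grepl(p, names(top15), fixed = TRUE)]),
  numeric(1)
)
print(grouped)
```

Within the top 15 alone, the two "follow too close" spellings combine to 498 citations and the two speed-related charges to 478, which is why grouping changes the ranking.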
If I treat the citation field as a text 'corpus', the first 15 single-word terms with at least 100 mentions each are as follows:
as.matrix(findFreqTerms(CitDtm, lowfreq=100)[1:15])
[,1]
[1,] "dui" # Note: 'Driving under the influence'
[2,] "inattention"
[3,] "fail"
[4,] "obey"
[5,] "improper"
[6,] "turn"
[7,] "control"
[8,] "left"
[9,] "yield"
[10,] "dwls" # Note: 'Driving while license suspended'
[11,] "speed"
[12,] "stop"
[13,] "close"
[14,] "follow"
[15,] "driver"
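This post does not show how CitDtm was built. As a rough illustration of what a frequency threshold such as lowfreq does, here is a base-R stand-in that counts single-word terms and keeps those at or above a minimum count. It is not the tm pipeline itself; the sample charges, the stop-word list, and all variable names are illustrative assumptions.

```r
# A few charge strings taken from the output above; a toy sample, not the data.
charges <- c("FOLLOWING TOO CLOSE", "FOLLOW TOO CLOSE", "DRIVER INATTENTION",
             "SPEED TOO FAST FOR CONDITIONS",
             "FAIL TO REDUCE SPEED FOR CONDITIONS")

# Assumed stop words, for this toy example only.
stop_words <- c("too", "to", "for")

# Lower-case, split on anything that is not a letter,
# then drop empty strings and stop words.
tokens <- unlist(strsplit(tolower(charges), "[^a-z]+"))
tokens <- tokens[nzchar(tokens) & !(tokens %in% stop_words)]

# Count each term and keep those seen at least min_freq times.
min_freq <- 2
term_freq <- table(tokens)
freq_terms <- names(term_freq)[term_freq >= min_freq]
print(freq_terms)
```

In this toy sample "following" and "follow" are counted as separate terms, so exact word counts alone do not merge the spelling variants.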
Using those top 'corpus' terms, I can aggregate a top 15 set of "citation categories" for Whatcom County collisions for my 01/2017 - 04/2019 study period:
CitCharge N
1: NA 878 # No citation
2: CLOSE FOLLOW 635
3: SPEED 538
4: INATTENTION DRIVER 463
5: FAIL YIELD 289
6: DUI 261
7: IMPROPER 261
8: FAIL LEFT YIELD 160
9: YIELD 112
10: FAIL STOP 107
11: FAIL SPEED 102
12: FAIL YIELD STOP 97
13: FAIL OBEY CONTROL 88
14: FAIL CONTROL 74
15: TURN STOP 70
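The code that produced these categories is not shown. One approach that is consistent with the labels above (for example "CLOSE FOLLOW" and "INATTENTION DRIVER", where the words follow the order of the term list) is to tag each citation with every top term it contains as a substring. The sketch below assumes that approach; the helper `label_charge` and the sample charges are hypothetical.

```r
# The 15 terms listed earlier, in the same order.
top_terms <- c("dui", "inattention", "fail", "obey", "improper", "turn",
               "control", "left", "yield", "dwls", "speed", "stop",
               "close", "follow", "driver")

# Hypothetical helper: label a charge with every top term it contains,
# in top_terms order; returns NA when no term matches (e.g. no citation).
label_charge <- function(charge, terms = top_terms) {
  found <- vapply(terms,
                  function(t) grepl(t, tolower(charge), fixed = TRUE),
                  logical(1))
  hits <- terms[found]
  if (length(hits) == 0) NA_character_ else toupper(paste(hits, collapse = " "))
}

# Toy sample: a few charge strings from the earlier output plus a blank.
charges <- c("FOLLOWING TOO CLOSE", "FOLLOW TOO CLOSE", "DRIVER INATTENTION",
             "FAIL TO YIELD RIGHT OF WAY-LEFT TUR", "DUI", "")
labels <- vapply(charges, label_charge, character(1), USE.NAMES = FALSE)

# Count citations per label, keeping the NA (no match) group.
print(table(labels, useNA = "ifany"))
```

Because matching is by substring, "FOLLOWING" and "FOLLOW" both land in "CLOSE FOLLOW". The same looseness means "FAILURE" would match "fail", so labels built this way should be spot-checked against the raw charges.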
Inference and Conclusion
A possible inference from the WSP MV_Drvr_CitationCharge field data, which covers an average of 8.7 collisions per day in Whatcom County over my 27-month study period, is that the causes of most collisions are behavioral. A possible hypothesis for future study is that an increasing Whatcom County growth rate, increased vehicle traffic (especially on I-5), a changing social fabric, and status anxiety among residents could be affecting driver alertness in Whatcom County.
Appendix 1
7174 collisions / 820 days
[1] 8.74878 collisions per day