Summary
View this post in wide mode on your cell phone. My best modeling almost always shows potential Democratic votes exceeding Republican votes in any countywide, high-turnout election. My modeling tends to overestimate Democrats, but this is difficult to confirm completely because predicted Democrats have historically turned out at lower rates than predicted Republicans. GE 2018 saw very strong turnout in Whatcom County. Because 2019 is a local election, turnout is unknown. The data below give model predictions and GE 2018 votes for Senate*.
- MC = Maria Cantwell
- SH = Susan Hutchison
sumD.Model sumMC* sumR.Model sumSH*
1: 95087 64971 49916 43757
*Does not include votes from hidden precincts.
The Whatcom County voter database has experienced net growth year over year. This is probably because of increased registration efforts and increased net migration into Western WA, and Whatcom County especially.
- October 2019 WM Active = 145,414
- October 2018 WM Active = 139,876
This gives us a 5,538 net increase year over year. Comparing registrations between the two points of 10/2018 and 10/2019:
- 9,130 new active registration numbers
- 2,098 lost (once active) registration numbers
Those numbers are hardly indicative of voter "churn" or "flux". The WA SoS provides an excellent description of the effects of voter mobility on the voter database. Many voters are like light bulbs on a board with sketchy wiring. They toggle on and off between active and inactive status depending on migration, in-county moves, or simply forgetting to register a new address after the last move. You can also be deactivated if you fail to vote in two consecutive federal elections.
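The year-over-year comparison above can be sketched as a set difference on registration numbers. This is only a minimal illustration: the IDs below are made up, and `active_2018` / `active_2019` are hypothetical names for the sets of registration numbers with Active status in each October snapshot.

```r
# Hypothetical registration numbers that are Active in each snapshot (illustrative only).
active_2018 <- c("R001", "R002", "R003", "R004", "R005")
active_2019 <- c("R002", "R003", "R005", "R006", "R007", "R008")

# Active in 2019 but not active a year earlier.
new_active <- setdiff(active_2019, active_2018)
# Active in 2018 but no longer active in 2019.
lost_active <- setdiff(active_2018, active_2019)

print(new_active)
print(lost_active)

# Net increase reported in the post, from the two October active totals.
reported_net <- 145414 - 139876
print(reported_net)
```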
Precinct Snapshot:
VR[CountyCode == "WM" & StatusCode == "Active",.N,.(PrecinctCode)][order(PrecinctCode)]:
PrecinctCode N
1: 101 1086
2: 102 716
3: 103 868
4: 104 514
5: 105 493
---
175: 609 976
176: 610 938
177: 611 697
178: 701 882
179: 801 839
Precincts GTR.1000 LT.500 mean min max mad sd var
1: 179 47 20 812 2 1446 249 291 84856
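A summary row like the one above can be produced in a single data.table call. The sketch below uses only the ten precinct rows printed in the snapshot (not all 179), so its numbers will not match the full-county summary. Reading GTR.1000 and LT.500 as counts of precincts with more than 1000 and fewer than 500 active voters is my assumption about how those columns are defined.

```r
library(data.table)

# The ten precinct rows printed above (first and last five of 179).
snap <- data.table(
  PrecinctCode = c(101, 102, 103, 104, 105, 609, 610, 611, 701, 801),
  N            = c(1086, 716, 868, 514, 493, 976, 938, 697, 882, 839)
)

# One-row summary; strict inequalities for GTR.1000 and LT.500 are an assumption.
stats <- snap[, .(
  Precincts = .N,
  GTR.1000  = sum(N > 1000),
  LT.500    = sum(N < 500),
  mean      = round(mean(N)),
  min       = min(N),
  max       = max(N),
  mad       = round(mad(N)),
  sd        = round(sd(N)),
  var       = round(var(N))
)]
print(stats)
```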
I describe my use of the poLCA library here in this post. To reiterate, latent class analysis or regression is not usually described as a "big data" or "machine learning" technique. However, the approach to prediction is essentially the same. Without 'high dimensional' data, I attempt to regress a latent class composed of manifests (age, gender, location) against covariates (e.g. 'training set' in ML speak), here derived from the AVBallotParty field of the May 24, 2016 Presidential Primary. In this primary, WA residents were asked to identify their AVBallotParty as 'Democratic' or 'Republican'.
Latent class analysis requires 'parsimony' for accuracy. The manifests should be informative enough to be able to accurately regress against the covariates or latent classes. In this case I judge the accuracy of latent class regression by how close the prediction comes to recent precinct results (Senate GE 2018). To do this, I divide the precincts into groups by level of support for the Democratic candidate. I then regress my manifests against covariates specific to those groups and recombine all the posteriors into a separate individual score, as in the table below. I make no guarantees that I have any real idea what I am doing, but at least I am searching for an electoral prediction solution that doesn't involve spending thousands or millions of dollars to purchase someone's private Facebook data!
Row | Age | Gender | PID | AVBallotParty | DemPred | RepubPred | Region | Party |
---|---|---|---|---|---|---|---|---|
1 | 70.00 | 1 | 251 | | 1.00 | 0.00 | SD | SD |
2 | 67.00 | 2 | 251 | | 1.00 | 0.00 | SD | SD |
3 | 51.00 | 1 | 509 | Republican | 0.00 | 1.00 | LR | SR |
4 | 78.00 | 2 | 224 | | 1.00 | 0.00 | SD | SD |
5 | 42.00 | 2 | 104 | NO PARTY SELECTED | 0.00 | 1.00 | LR | SR |
6 | 88.00 | 1 | 115 | Democratic | 1.00 | 0.00 | SR | SD |
7 | 61.00 | 1 | 115 | NO PARTY SELECTED | 0.00 | 1.00 | SR | SR |
8 | 50.00 | 2 | 115 | NO PARTY SELECTED | 0.00 | 1.00 | SR | SR |
9 | 80.00 | 2 | 115 | Democratic | 1.00 | 0.00 | SR | SD |
10 | 63.00 | 2 | 115 | Republican | 0.00 | 1.00 | SR | SR |
11 | 63.00 | 1 | 115 | Republican | 0.00 | 1.00 | SR | SR |
12 | 29.00 | 2 | 141 | | 1.00 | 0.00 | SR | SD |
13 | 79.00 | 2 | 145 | | 0.00 | 1.00 | SR | SR |
14 | 69.00 | 1 | 145 | | 0.00 | 1.00 | SR | SR |
15 | 50.00 | 1 | 604 | | 0.00 | 1.00 | SR | SR |
16 | 22.00 | 2 | 243 | | 0.00 | 1.00 | SD | SR |
Precinct Division by GE2018 Democratic Senate Vote:
t1[div <= .55 & div >= .45,Region:= "Ind"] # Independent
t1[div > .55 & div <= .65,Region:= "LD"] # Light Democrat
t1[div > .65,Region:= "SD"] # Strong Democrat
t1[div < .45 & div >= .35,Region:= "LR"] # Light Republican
t1[div < .35,Region:= "SR"] # Strong Republican
Regions Cantwell Hutchison RegionTotal
1: SD 40292 11485 67202
2: SR 4073 11529 20870
3: Ind 7069 7200 20044
4: LD 7485 4841 16631
5: LR 6052 8702 20256
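To make the division rules above concrete, here is a small runnable sketch that applies them to four precincts whose GE 2018 Senate votes appear in the problem-precinct table later in this post. I am assuming `div` is Cantwell's share of the two-candidate vote; the post does not spell out how `div` was built, so treat that definition as a guess.

```r
library(data.table)

# Four precincts with GE 2018 Senate votes taken from the problem-precinct table.
t1 <- data.table(
  PrecinctID     = c(105, 108, 245, 503),
  MariaCantwell  = c(207, 393, 635, 336),
  SusanHutchison = c(161, 428, 65, 375)
)

# Assumption: div is the Democratic share of the two-candidate vote.
t1[, div := MariaCantwell / (MariaCantwell + SusanHutchison)]

t1[div <= .55 & div >= .45, Region := "Ind"]  # Independent
t1[div > .55 & div <= .65, Region := "LD"]    # Light Democrat
t1[div > .65, Region := "SD"]                 # Strong Democrat
t1[div < .45 & div >= .35, Region := "LR"]    # Light Republican
t1[div < .35, Region := "SR"]                 # Strong Republican

print(t1)

# Aggregate votes by region, in the spirit of the regional totals above.
print(t1[, .(Cantwell = sum(MariaCantwell), Hutchison = sum(SusanHutchison)), by = Region])
```

None of these four precincts falls in the LR or SR bands, so those two assignments match zero rows here.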
Latent Class Regression Prediction Percentages by aggregated GE 2018 Senate Precinct vote (See Precinct Division above):
Row | SD | SR | IND | LD | LR |
---|---|---|---|---|---|
1 | 86.00 | 28.00 | 57.00 | 75.00 | 43.00 |
2 | 14.00 | 72.00 | 43.00 | 25.00 | 57.00 |
Modeled by Predicted Party vote without Independents (i.e., voters pushed to one party or the other based on individual score).
m1[,.N,.(Party)]
Party N
1: SD 93951
2: SR 48849
3: LD 1136
4: LR 1067
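The Party counts above add up to the model totals reported in the Summary if the strong and light categories are combined by party. That combination is my reading of how sumD.Model and sumR.Model were formed; the arithmetic matches exactly, but the post does not state it. A quick check:

```r
# Predicted party counts from m1[,.N,.(Party)] above.
party_n <- c(SD = 93951, SR = 48849, LD = 1136, LR = 1067)

# Assumption: model totals combine the strong and light categories of each party.
sumD.Model <- party_n[["SD"]] + party_n[["LD"]]
sumR.Model <- party_n[["SR"]] + party_n[["LR"]]
print(c(sumD.Model = sumD.Model, sumR.Model = sumR.Model))

# GE 2018 Senate totals from the Summary (hidden precincts excluded).
sumMC <- 64971
sumSH <- 43757

# Two-party Democratic share: model versus actual, in percent.
model_dem_pct  <- round(100 * sumD.Model / (sumD.Model + sumR.Model), 1)
actual_dem_pct <- round(100 * sumMC / (sumMC + sumSH), 1)
print(c(model = model_dem_pct, actual = actual_dem_pct))
```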
Model vs GE 2018 Senate Vote (excludes hidden precincts in GE 2018)
- MC = Maria Cantwell
- SH = Susan Hutchison
sumD.Model sumMC sumR.Model sumSH
1: 95087 64971 49916 43757
Problem Precincts for LCA Prediction
These are the precinct predictions with the greatest distance from the Senate GE 2018 vote. Each of these precincts has an absolute difference of over 25 percentage points between the projected Democratic share and the Senate GE 2018 Democratic share. Most of these precinct predictions are biased toward Democrats. The extreme Democratic votes of the WWU dorm district precincts (245, 252) cause the poLCA library's use of Newton-Raphson to reverse their vote patterns almost entirely!
Row | PrecinctID | MariaCantwell | SusanHutchison | D.lca | R.lca | D.pct.abs.diff |
---|---|---|---|---|---|---|
1 | 105 | 207 | 161 | 404 | 89 | 25.70 |
2 | 108 | 393 | 428 | 943 | 268 | 29.10 |
3 | 159 | 115 | 86 | 117 | 17 | 30.10 |
4 | 162 | 341 | 254 | 623 | 133 | 25.10 |
5 | 166 | 418 | 317 | 839 | 157 | 27.30 |
6 | 182 | 530 | 403 | 1149 | 201 | 28.30 |
7 | 226 | 577 | 47 | 480 | 394 | 37.60 |
8 | 245 | 635 | 65 | 7 | 537 | 89.40 |
9 | 247 | 713 | 71 | 671 | 359 | 25.80 |
10 | 252 | 269 | 25 | 16 | 215 | 84.60 |
11 | 253 | 874 | 118 | 742 | 534 | 29.90 |
12 | 257 | 568 | 51 | 382 | 365 | 40.70 |
13 | 501 | 373 | 417 | 830 | 269 | 27.40 |
14 | 502 | 301 | 283 | 653 | 165 | 27.50 |
15 | 503 | 336 | 375 | 665 | 241 | 25.90 |
16 | 504 | 373 | 362 | 862 | 248 | 26.70 |
17 | 507 | 287 | 298 | 682 | 213 | 26.20 |
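The D.pct.abs.diff column can be reproduced from the other four columns. The formula below is inferred from the table rather than taken from the original code: it is the absolute gap, in percentage points, between Cantwell's two-candidate share and the model's Democratic share. For precincts 105, 245 and 252 it returns the printed 25.7, 89.4 and 84.6.

```r
# Three rows copied from the problem-precinct table above.
prob <- data.frame(
  PrecinctID     = c(105, 245, 252),
  MariaCantwell  = c(207, 635, 269),
  SusanHutchison = c(161, 65, 25),
  D.lca          = c(404, 7, 16),
  R.lca          = c(89, 537, 215)
)

# Actual and modeled Democratic share of the two-party total, in percent.
actual_dem_pct <- 100 * prob$MariaCantwell / (prob$MariaCantwell + prob$SusanHutchison)
model_dem_pct  <- 100 * prob$D.lca / (prob$D.lca + prob$R.lca)

# Inferred definition of D.pct.abs.diff: absolute gap in percentage points.
prob$D.pct.abs.diff <- round(abs(actual_dem_pct - model_dem_pct), 1)
print(prob)
```

In the two WWU dorm precincts the actual share is above 90% while the modeled share is below 7%, which is the reversal described above.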
By LCA Predicted Region: Senate GE 2018 Totals and Precinct Prediction Percent
Row | Regions | Cantwell | Hutchison | RegionTotal | LiklDemPct | LiklRepPct | CantPct | HutchPct |
---|---|---|---|---|---|---|---|---|
1 | Ind | 7069 | 7200 | 20044 | 57.1 | 42.9 | 49.50 | 50.50 |
2 | LD | 7485 | 4841 | 16631 | 74.8 | 25.2 | 60.70 | 39.30 |
3 | LR | 6052 | 8702 | 20256 | 43.3 | 56.7 | 41.00 | 59.00 |
4 | SD | 40292 | 11485 | 67202 | 86.3 | 13.7 | 77.80 | 22.20 |
5 | SR | 4073 | 11529 | 20870 | 28 | 72 | 26.10 | 73.90 |
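One way to read this table is to subtract the actual Cantwell share from the predicted Democratic share in each region. The sketch below recomputes CantPct from the vote counts and then takes that gap; the column name `DemGap` is my own label, not something from the original analysis.

```r
# Regional totals and predicted Democratic percentages from the table above.
reg <- data.frame(
  Regions    = c("Ind", "LD", "LR", "SD", "SR"),
  Cantwell   = c(7069, 7485, 6052, 40292, 4073),
  Hutchison  = c(7200, 4841, 8702, 11485, 11529),
  LiklDemPct = c(57.1, 74.8, 43.3, 86.3, 28)
)

# Cantwell's share of the two-candidate vote, in percent.
reg$CantPct <- round(100 * reg$Cantwell / (reg$Cantwell + reg$Hutchison), 1)

# Hypothetical helper column: predicted minus actual Democratic share (points).
reg$DemGap <- round(reg$LiklDemPct - reg$CantPct, 1)
print(reg)
```

The gap is positive in all five regions, which is consistent with the tendency to overestimate Democrats noted in the Summary.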
Prediction By Category with LastVoted
Row | LastVoted | LD | LR | SD | SR |
---|---|---|---|---|---|
1 | 2019-08-06 | 369 | 395 | 36412 | 20838 |
2 | 2018-11-06 | 389 | 368 | 31609 | 15200 |
3 | No Last Vote Record | 131 | 110 | 10197 | 5316 |
4 | 2016-11-08 | 0 | 0 | 6475 | 2829 |
5 | 2019-02-12 | 0 | 0 | 2272 | 2086 |
6 | 2012-11-06 | 0 | 0 | 1431 | 406 |
7 | 2018-08-07 | 0 | 0 | 662 | 293 |
8 | 2017-11-07 | 0 | 0 | 641 | 339 |
9 | 2008-11-04 | 0 | 0 | 515 | 136 |
10 | 2014-11-04 | 0 | 0 | 407 | 0 |
11 | 2018-02-13 | 0 | 0 | 371 | 118 |
12 | 2016-05-24 | 0 | 0 | 365 | 181 |
13 | 2013-11-05 | 0 | 0 | 250 | 0 |
14 | 2010-11-02 | 0 | 0 | 244 | 0 |
15 | 2017-08-01 | 0 | 0 | 221 | 0 |
16 | 2004-11-02 | 0 | 0 | 211 | 0 |
17 | 2015-11-03 | 0 | 0 | 161 | 0 |
18 | 2016-02-09 | 0 | 0 | 154 | 0 |
Prediction: By Age Decade and Predicted Party
Row | Decade | LD | LR | SD | SR |
---|---|---|---|---|---|
1 | 10 | 0 | 0 | 23 | 24 |
2 | 9 | 2 | 2 | 787 | 621 |
3 | 8 | 0 | 11 | 3299 | 2965 |
4 | 7 | 0 | 76 | 10324 | 7166 |
5 | 6 | 490 | 706 | 15353 | 8525 |
6 | 5 | 589 | 262 | 14021 | 6953 |
7 | 4 | 55 | 10 | 14584 | 6449 |
8 | 3 | 0 | 0 | 19292 | 4315 |
9 | 2 | 0 | 0 | 15565 | 9520 |
10 | 1 | 0 | 0 | 703 | 2311 |
Prediction: 42nd by Party
Row | Party | N |
---|---|---|
1 | SR | 39211 |
2 | SD | 58221 |
3 | LD | 1037 |
4 | LR | 969 |
Prediction: 40th by Party
Row | Party | N |
---|---|---|
1 | SD | 35730 |
2 | SR | 9638 |
3 | LD | 99 |
4 | LR | 98 |
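As a consistency check, the 42nd and 40th tables above should add up to the countywide predicted party counts (SD 93951, SR 48849, LD 1136, LR 1067). The sketch below sums them and also computes a predicted Democratic share per district; combining SD with LD and SR with LR for that share is my assumption.

```r
# Predicted party counts for each district, from the two tables above.
d42 <- c(SD = 58221, SR = 39211, LD = 1037, LR = 969)
d40 <- c(SD = 35730, SR = 9638, LD = 99, LR = 98)

# Element-wise sum should reproduce the countywide counts.
county <- d42 + d40
print(county)

# Assumption: Democratic share combines the SD and LD categories.
dem_share <- function(x) round(100 * (x[["SD"]] + x[["LD"]]) / sum(x), 1)
print(c(`42nd` = dem_share(d42), `40th` = dem_share(d40)))
```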
poLCA citation:
Linzer, Drew A. and Jeffrey Lewis. 2013. "poLCA: Polytomous Variable Latent Class Analysis." R package version 1.4. http://dlinzer.github.com/poLCA.
Linzer, Drew A. and Jeffrey Lewis. 2011. "poLCA: an R Package for Polytomous Variable Latent Class Analysis." Journal of Statistical Software. 42(10): 1-29. http://www.jstatsoft.org/v42/i10