Some analysis of the SoCal Governors Cup G01 preliminary-round predictions. Scroll way down to see the simulations, or click this --> Link to page where you can see the simulations with links to team pages. The simulation was run with a model based on data from before the Governors Cup started. Here are the actual results from the tournament.
So how well did the model do comparing the predictions and actual results?
- Predicting the #1 finisher. In what fraction of brackets would you expect the team with the highest probability to win? To compare, I sum the highest probability in the 1st column of each bracket, divide by 100, and compare that to the number of brackets where the #1 finisher was the team with the highest probability.
- If my model were perfect and there were no uncertainty in the estimates, I would expect to get 11.29 brackets right (right = I correctly predicted the #1 in the bracket).
- If I chose randomly (meaning all probabilities in the 1st column = 25), I would expect to get 0.25 x 17 = 4.25 brackets right.
- I actually got 7 right. Better than random, but well short of 11.
- So, the estimated strengths are uncertain (certainly true) and/or past performance is not a perfect predictor of future performance (also true).
- Overall, my impression of my success at predicting the #1 finisher in a bracket is... 'Meh'. I suspect that it is just hard to predict (which is part of what makes tournaments fun: the favorites do not always finish #1).
- How well does the model predict which teams get through their bracket? Here things look a lot better (to me). This plot shows the model probability of getting through (probability of finishing #1 + probability of finishing #2) against whether or not the team did get through (0 or 1). You can see there is a strong correlation. If there were none, the red line, which is the logistic regression fit, would be flat.
- If the model probability is less than 30, chances are grim BUT some do still make it and it could be your team!
- If the probability is greater than 70, chances are good but don't get complacent. Even teams almost certain of making it through (according to my model) have been stopped in the prelims.
- Between 30 and 70, chances are about 50/50.
- Next step: compare to the National Cup logistic regression after that cup is done. My intuition is that a higher-level cup (National Cup) will be more predictable, though this one is already pretty spot on.
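The arithmetic behind the #1-finisher comparison above can be sketched in a few lines of R. This is only an illustration, not the code used for the post: the variable names are made up here, and the numbers are transcribed by hand from the 1st column and the #1 markers in the simulation tables below, assuming those markers are the actual finishing places.

```r
# Highest 1st-place probability (percent) in each of the 17 simulated brackets,
# transcribed from the simulation tables (brackets A-H, J-M and O-S).
top_prob <- c(A = 37, B = 91, C = 47, D = 95, E = 44, F = 80, G = 43, H = 98,
              J = 40, K = 57, L = 87, M = 28, O = 93, P = 52, Q = 61, R = 100,
              S = 76)

# TRUE where the team with the highest 1st-place probability actually finished #1.
favorite_won <- c(A = FALSE, B = FALSE, C = FALSE, D = TRUE, E = FALSE, F = TRUE,
                  G = FALSE, H = TRUE, J = FALSE, K = TRUE, L = FALSE, M = FALSE,
                  O = TRUE, P = TRUE, Q = FALSE, R = TRUE, S = FALSE)

n_brackets      <- length(top_prob)
expected_model  <- sum(top_prob) / 100  # expected hits if the probabilities were exactly right
expected_random <- 0.25 * n_brackets    # four teams per bracket, all equally likely
observed        <- sum(favorite_won)    # brackets where the favorite really finished #1

cat("brackets:", n_brackets, "\n")                  # 17
cat("expected (model):", expected_model, "\n")      # 11.29
cat("expected (random):", expected_random, "\n")    # 4.25
cat("observed:", observed, "\n")                    # 7
```

The observed count of 7 falls between the random baseline and the model's own expectation, which is the 'Meh' result described above.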
SIMULATION RESULTS
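SoCal Governors Bracket A

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Conejo Valley United White G01 | 0 | 2 | 10 | 88 | |
| Temecula United G01 | 32 | 33 | 31 | 4 | #2 |
| West Coast SA East Valley Lions G01 | 37 | 33 | 27 | 4 | #3 |
| Westside Breakers Blue G01 | 31 | 32 | 32 | 4 | #1 |

SoCal Governors Bracket B

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Carlsbad Lightning White G01 | 0 | 3 | 28 | 69 | |
| Murrieta Surf Blue G01 | 8 | 75 | 14 | 2 | #1 |
| Santa Monica United G01 | 91 | 9 | 0 | 0 | #2 |
| Simi Valley Eclipse White G01 | 0 | 13 | 58 | 29 | #3 |

SoCal Governors Bracket C

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Fram Soccer Club White G01 | 44 | 38 | 14 | 4 | #1 |
| Santa Anita SC Bronze G01 | 47 | 36 | 13 | 4 | #2 |
| Baldwin Park United G01 | 4 | 11 | 33 | 52 | #3 |
| Santa Barbara SC Red G01 | 5 | 15 | 40 | 40 | |

SoCal Governors Bracket D

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Desert Elite FC Natives G01 | 1 | 24 | 31 | 44 | |
| Legends FC White G01 | 95 | 4 | 0 | 0 | #1 |
| Vista Storm G01 | 2 | 32 | 35 | 31 | #2 |
| Wolfpack G01 | 2 | 39 | 33 | 25 | #3 |

SoCal Governors Bracket E

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Downtown SC G01 | 1 | 3 | 12 | 84 | |
| San Luis Obispo Storm G01 | 23 | 31 | 38 | 8 | #1 |
| Elite Soccer League - Saddleback Strikers G01 | 44 | 32 | 21 | 3 | #3 |
| Sherman Oaks Extreme G01 | 33 | 34 | 28 | 5 | #2 |

SoCal Governors Bracket F

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Arsenal FC SD Avengers | 0 | 1 | 15 | 84 | #3 |
| Central Coast Condors G01 | 2 | 16 | 68 | 14 | |
| CVSC Rebels G01 | 18 | 65 | 16 | 1 | #2 |
| So Cal Athletic Stanwood G01 | 80 | 18 | 1 | 0 | #1 |

SoCal Governors Bracket G [I'm not sure I used the right Wolves team]

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| FC Deportivo Hacienda Wolves G01 | 43 | 28 | 19 | 10 | |
| LA Galaxy South Bay HB Black G01 | 7 | 15 | 27 | 51 | #2 |
| Legacy FC G01 | 26 | 29 | 27 | 18 | #3 |
| San Diego SC Black G01 | 25 | 28 | 27 | 20 | #1 |

SoCal Governors Bracket H

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| AYSO Matrix Bonita Blue G01 | 2 | 59 | 28 | 11 | #2 |
| Westfield SC Blue G01 | 0 | 27 | 43 | 30 | #3 |
| CVSC Rampage G01 | 98 | 2 | 0 | 0 | #1 |
| Empire SC G01 | 0 | 12 | 30 | 58 | |

Bracket I missing because no data on AYSO team
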
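SoCal Governors Bracket J

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| California FC Pumas G01 | 2 | 6 | 18 | 75 | #3 |
| San Diego Fusion G01 | 22 | 30 | 36 | 12 | |
| Santa Clarita Valley Green G01 | 37 | 32 | 25 | 7 | #1 |
| Simi Valley Eclipse Blue G01 | 40 | 32 | 22 | 6 | #2 |

SoCal Governors Bracket K

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| RSF Attack Green G01 | 0 | 3 | 41 | 56 | |
| Scripps Ranch SC Red G01 | 0 | 4 | 52 | 43 | #3 |
| Coachella Valley Soccer Strikers G01 | 57 | 40 | 3 | 0 | #1 |
| Slammers FC White G02 | 42 | 53 | 5 | 1 | #2 |

SoCal Governors Bracket L

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Real So Cal Black G01 | 8 | 51 | 31 | 10 | #1 |
| City SC Westchester G01 | 5 | 31 | 47 | 17 | |
| Elite Soccer League Surf G01 | 87 | 11 | 2 | 0 | #2 |
| FC Sol Black G01 | 0 | 7 | 20 | 73 | #3 |
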
Bracket N missing since no data on the AYSO team
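
SoCal Governors Bracket M

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Foothill Storm G01 | 24 | 25 | 26 | 25 | #2 |
| LA Galaxy South Bay HB Green G01 | 28 | 26 | 24 | 22 | |
| San Marcos YS Revolution G01 | 20 | 22 | 25 | 33 | #3 |
| Xplosion SC Whittier G01 | 27 | 27 | 25 | 21 | #1 |

SoCal Governors Bracket O

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Celtic Hibs G01 | 0 | 5 | 19 | 76 | |
| Crescenta Valley SC White G01 | 93 | 6 | 1 | 0 | #1 |
| LA Premier FC White G01 | 4 | 51 | 35 | 9 | #3 |
| Poway Vaqueros White G01 | 2 | 37 | 45 | 15 | #2 |

SoCal Governors Bracket P

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Cypress FC Premier White G01 | 52 | 37 | 11 | 0 | #1 |
| La Jolla Impact Blue G01 | 42 | 43 | 14 | 1 | #2 |
| Ventura County Fusion YSA G01 | 0 | 0 | 8 | 92 | |
| Exiles SC Red G01 | 6 | 19 | 67 | 7 | #3 |

SoCal Governors Bracket Q

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| Liverpool Premier Trask G01 | 4 | 15 | 56 | 25 | #2 |
| Pateadores HB Blue G01 | 35 | 50 | 14 | 2 | #1 |
| Madrid North G01 | 61 | 32 | 6 | 1 | #3 |
| Valley United Sedaghat G01 | 0 | 3 | 24 | 72 | |

SoCal Governors Bracket R

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| AYSO Matrix San Elijo G01 | 0 | 61 | 27 | 12 | #2 |
| BYSC Corona United G01 | 100 | 0 | 0 | 0 | #1 |
| Irvine Slammers White - FC Blades Red G01 | 0 | 16 | 32 | 52 | |
| Wolfpack White G01 | 0 | 23 | 41 | 36 | #3 |

SoCal Governors Bracket S

| Team | 1st | 2nd | 3rd | 4th | Actual finish |
|---|---|---|---|---|---|
| El Segundo Gunners G01 | 20 | 55 | 18 | 6 | #2 |
| Encinitas Express Soccer Blue G01 | 1 | 11 | 38 | 50 | #1 |
| Greater Long Beach SC Tidal Waves G01 | 76 | 20 | 3 | 1 | #3 |
| LA Galaxy South Bay White G01 | 2 | 13 | 41 | 44 | |
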
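Code that made the plot:

    # `a` is a data frame with one row per team. Judging by the axis labels,
    # `made` is the model probability of getting through and `P` is the
    # outcome (1 = got through, 0 = did not).
    glm.out <- glm(P ~ made, family = binomial(logit), data = a)
    plot(a$made, a$P, type = "p",
         ylab = "Probability of Making it Through",
         xlab = "Model Probability of Making it Through")
    # Red line: fitted logistic curve, drawn in order of increasing model probability
    lines(a$made[order(a$made)], glm.out$fitted.values[order(a$made)], col = "red", lwd = 4)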
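
For anyone who wants to rerun something like this fit without the original data frame, here is a self-contained sketch. It is not the post's own code: the names `model_prob`, `advanced`, `results` and `fit` are made up for this example, and the data are transcribed by hand from the simulation tables above, so they may not match the original `a` exactly. It assumes that the top two teams in each bracket get through and that the #1/#2 markers in the tables are the actual finishing places.

```r
# Model probability (percent) of getting through = P(1st) + P(2nd), four teams
# per bracket in table order, transcribed from the simulation tables.
model_prob <- c(
   2, 65,  70, 63,   # A
   3, 83, 100, 13,   # B
  82, 83,  15, 20,   # C
  25, 99,  34, 41,   # D
   4, 54,  76, 67,   # E
   1, 18,  83, 98,   # F
  71, 22,  55, 53,   # G
  61, 27, 100, 12,   # H
   8, 52,  69, 72,   # J
   3,  4,  97, 95,   # K
  59, 36,  98,  7,   # L
  49, 54,  42, 54,   # M
   5, 99,  55, 39,   # O
  89, 85,   0, 25,   # P
  19, 85,  93,  3,   # Q
  61, 100, 16, 23,   # R
  75, 12,  96, 15    # S
)

# 1 if the team is marked #1 or #2 in its table, else 0 (same order as above).
advanced <- c(
  0, 1, 0, 1,  0, 1, 1, 0,  1, 1, 0, 0,  0, 1, 1, 0,   # A-D
  0, 1, 0, 1,  0, 0, 1, 1,  0, 1, 0, 1,  1, 0, 1, 0,   # E-H
  0, 0, 1, 1,  0, 0, 1, 1,  1, 0, 1, 0,  1, 0, 0, 1,   # J-M
  0, 1, 0, 1,  1, 1, 0, 0,  1, 1, 0, 0,  1, 1, 0, 0,   # O-R
  1, 1, 0, 0                                           # S
)

results <- data.frame(model_prob, advanced)

# Logistic regression of the 0/1 outcome on the model probability.
fit <- glm(advanced ~ model_prob, family = binomial(link = "logit"), data = results)

# Observed outcomes as points, fitted curve in red (sorted by model probability).
ord <- order(results$model_prob)
plot(results$model_prob, results$advanced, type = "p",
     ylab = "Got through (0 or 1)",
     xlab = "Model probability of getting through")
lines(results$model_prob[ord], fitted(fit)[ord], col = "red", lwd = 4)

# Share of teams that got through in each of the three probability bands.
band <- cut(results$model_prob, breaks = c(-Inf, 29, 70, Inf),
            labels = c("under 30", "30 to 70", "over 70"))
print(tapply(results$advanced, band, mean))
```

The last two statements group teams into the same three bands discussed in the bullets (under 30, 30 to 70, over 70) and print the share of each band that got through, which gives a quick numeric check on the plot.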