In the “Bill James Handbook 2008,” Bill James projected John Maine to have a 12-11 record with a 4.05 ERA, and Oliver Perez to have a 9-12 record with a 4.69 ERA. After watching these two pitchers emerge into top-of-the-rotation starters over the past two years, I found it hard to believe that they would each take such a significant step backwards next year. The only explanation I could come up with for this bold prediction was that Bill James had found something indicating that Maine and Perez over-achieved last year. I personally felt that they were both due for another step forward next year, but the god of sabermetrics disagreed with me, so I needed to statistically back up my hypothesis.
The ultimate goal of baseball is to score the most runs, and in order to score a run you need to gain four bases. So under this principle each base is worth 1/4 of a run. To confirm this number I took a simple random sample (SRS) of 30 major league pitchers and divided the average Runs Allowed by the average Total Bases (1B + 2(2B) + 3(3B) + 4(HR) + BB + HBP). I found the mean ratio to be .248, which is basically 1/4. I then devised a regression formula, R = TB/4. This formula gives you a predicted number of runs allowed, which allows you to find the residual value: observed (actual) runs minus predicted runs. A positive residual meant that the pitcher was actually a victim of bad luck and pitched better than his statistics indicated, and a negative residual meant that the player over-achieved. The r-squared value of this regression was .74, meaning that 74% of the variation in runs allowed can be accounted for by this regression. I believe this value, however, would increase with a larger sample size.
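As a rough sketch of the formula described above, the calculation might look like this in Python. The function names and the sample stat line are made up for illustration; they are not from the sample of 30 pitchers.

```python
def total_bases_allowed(singles, doubles, triples, home_runs, walks, hit_by_pitch):
    """Total bases as defined in the post: 1B + 2(2B) + 3(3B) + 4(HR) + BB + HBP."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs + walks + hit_by_pitch


def predicted_runs(total_bases):
    """Predicted runs allowed under the one-base-equals-a-quarter-run idea: R = TB / 4."""
    return total_bases / 4


def residual(observed_runs, total_bases):
    """Observed minus predicted runs. Positive = allowed more runs than the formula expects."""
    return observed_runs - predicted_runs(total_bases)


# Hypothetical stat line, invented purely to show the arithmetic.
tb = total_bases_allowed(singles=100, doubles=40, triples=4,
                         home_runs=20, walks=60, hit_by_pitch=5)
print(tb, predicted_runs(tb), residual(90, tb))
```

With this invented line the pitcher allows 337 total bases, so the formula predicts 84.25 runs; if he actually gave up 90, his residual would be +5.75.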
While researching this stat further I found two pretty significant outliers. In 2007 Brad Penny gave up only 75 runs, but according to the regression formula he should’ve given up 88.25. However, in 2006 and 2005 he had a +1.25 and a +4.5 run residual value, meaning that his stats reflected the way he actually pitched in each of those years. The other outlier was Roy Oswalt, who had a +17.75 run residual in 2007. Unlike Penny, though, this was a trend for Oswalt, who had a +19.25 residual value in each of the prior two years. I found that in each of the three years (2007, 2006, and 2005), with runners in scoring position, opposing batters had a BABIP of .285, .262, and .272, and an OPS of .637, .619, and .734 respectively. This indicated to me that Roy Oswalt was an unbelievably gifted pitcher when it came to stranding runners on base.
Now we finally get to the focus of this post, John Maine and Oliver Perez. Last year Maine gave up 90 runs, and the regression formula predicted he would give up 90.25, a residual of -.25, meaning that Maine’s statistics last year reflected his overall performance. His BABIP for last year was .281, which again means he was neither lucky nor unlucky. Perez also gave up 90 runs last year, but his predicted run total was only 86, giving him a residual of +4. This actually means that Ollie pitched better last year than his stats indicate, and his BABIP of .273 also reflects this belief.
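The residuals quoted in this post can be checked with a few lines of Python. This is just the subtraction described earlier (observed minus predicted runs), using the run totals given in the text; the variable names are my own.

```python
# (observed runs, predicted runs) as quoted in the post.
pitchers = {
    "Brad Penny": (75, 88.25),
    "John Maine": (90, 90.25),
    "Oliver Perez": (90, 86.0),
}

# Residual = observed - predicted; negative means fewer runs allowed than predicted.
residuals = {name: observed - predicted
             for name, (observed, predicted) in pitchers.items()}

for name, value in residuals.items():
    print(f"{name}: {value:+.2f}")
```

Penny comes out at -13.25, Maine at -0.25, and Perez at +4.00, which matches the figures above.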
Nothing I have found or calculated indicates that either one of these pitchers is due for a decline in performance next year. It seems more logical to believe that experience will only help these two young pitchers continue to improve as they have over the past two years. I have the utmost respect for Bill James and his sabermetric statistics, but in this case I think he is well off in his prediction, and basically I find it hard to imagine either of these two pitchers regressing next year.
I think Bill’s numbers are derived from factoring in their performances from previous years, which leaves out things like changing methods and attitudes when handled by the Mets’ coaching staff, etc. Perez would definitely suffer from such an analysis given his 2006.
I don’t think the Bill James projection is really picking Maine to take a significant step back. A 4.05 ERA really isn’t all that different from the 3.91 he actually had last year, given random noise.
As for Ollie, I wouldn’t be surprised to see a step backward, either. Your method above is a little simplistic: Bill’s projections look beyond just the past season, and weight different aspects of performance differently, while also probably taking into account the types of batted balls Ollie produced. Assuming that his batted balls from last season would have the same results this year is a little dangerous.
Most important, prior to 2007, Ollie had two seasons that were quite bad. He also gave up a lot of unearned runs last year, which makes his ERA look better than he pitched. His RA (including unearned runs) was 4.56 last season.
Wow– serious geekitude.
From a non-statistical standpoint, I’d be concerned about Maine’s second-half performance (I believe he dropped off a bit in the second half), and Perez’s combustibility.
On the other hand, I think that having Castillo and Schneider for the entire year may help them both, as will more consistency in the OF; the injuries in the OF last year resulted in a lot of turnover, and may have resulted in a few more runs scoring. Also, moving down a slot in the rotation may be of benefit as well, as they will theoretically be facing less accomplished pitchers than they would have in the 2-3 slots in the rotation.
For Maine, 12-11 just seems too low a win total given the fact that the Mets still have one of the best lineups in the league. If you project a 4.05 ERA, he should have the run support for more wins.
Enjoyed the statistical analysis, but couldn’t you have also used wOBA for pitchers? Also, should this stat be used in conjunction with BABIP, since it is so hard to tell whether the 1B, 2B, and 3B were the result of luck or not?