Promoted from the User Journals, john provides an analysis of what makes Johan Johan.
When the Mets’ season ended too soon for my liking on September 28th, 2008, I thought to myself, “What Now?” Baseball is a long season and now I suddenly found myself with some free time. I figured that I would go back and look at the various pitchers that pitched for the Mets in 2008 and see how they did.
I was going to start with the bullpen and work my way up to the starters, but I am still pretty upset with the majority of the pen, so I figured I’d start with a big bright spot for us: Johan Santana. First I’ll give a brief overview of the basic stats, and then I’ll go into further detail using MLB’s pitch f/x data. If you guys ever read Dan Fox’s articles on BP, I’ll probably be following a similar format. Hopefully, next year I can do a comparison for most pitchers since I’ll have two years of data instead of one. Anything from 2008 in this article is data I obtained myself. Anything from 2007 comes from various articles on the web.
Basic Stats
Johan Santana signed last offseason and, in my opinion, was worth every single penny of that big contract. In 2008, Johan posted the lowest ERA of his career. Known for being especially dominant in the second halves of seasons, Johan did not disappoint in the second half of 2008, going 8-0 with a 2.17 ERA down the stretch. Johan finished the season 16-7 (mostly because the bullpen blew a lot of his games) with an ERA of 2.53.
There were a couple of interesting things that happened this year for Johan. With him going from the AL to the NL, I (and a bunch of others, I’d imagine) figured that Johan would strike out more batters with the DH being replaced by the pitcher hitting; however, this was not the case. Johan’s K/9 dropped to 7.91, his lowest rate since 2001. Additionally, while the K rate went down quite a bit, the walk rate went up slightly, rising from 2.14 BB/9 to 2.42 BB/9. This resulted in a FIP of 3.51, nearly a run higher than the ERA he posted. Usually, FIP is a better indicator of future ERA than this season’s ERA, so we may see his ERA rise next season. While that’s entirely possible, Johan has consistently outperformed his FIP, so I do not think the increase will be that significant, if there is one at all. His left-on-base percentage was 82.6%, the highest mark of his career, so again we may see a slightly higher ERA next season. While the strikeouts were down, he made up for it by inducing more ground balls: 41.2% (compared to his average of 38.2%). Lastly, his HR/9 rate was in line with the majority of his career, so there is nothing to discuss there, though one of the biggest differences between his first and second halves was that he gave up 14 home runs in the first half and only nine in the second half. Also, only five of those 23 home runs came with runners on base (one a grand slam by you know who).
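For anyone who has not seen how the rate stats and FIP are put together, here is a minimal sketch. The FIP formula is the commonly published one; the league constant changes from year to year, so the 3.10 used here is only a placeholder, and the stat line in the example is hypothetical rather than Johan’s actual totals (the article only quotes his rates).

```python
def per_nine(events, innings):
    """Rate per nine innings (K/9, BB/9, HR/9)."""
    return 9 * events / innings


def fip(hr, bb, hbp, k, innings, constant=3.10):
    """Fielding Independent Pitching.

    `constant` is a league-wide value that varies by season;
    3.10 is a placeholder assumption, not the 2008 figure.
    `innings` must be a true decimal (e.g. 200 1/3 innings is 200.333,
    not the box-score notation 200.1).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Hypothetical stat line, for illustration only.
example_k9 = per_nine(200, 200.0)
example_fip = fip(hr=20, bb=50, hbp=5, k=200, innings=200.0)
print(f"K/9: {example_k9:.2f}  FIP: {example_fip:.2f}")
```

Because strikeouts are subtracted and walks are added, a lower K rate and a higher walk rate both push FIP up, which is the pattern described above.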
Pitch F/X
1) What Does He Throw?
Using data from MLB’s Gameday application, we can look in further detail at what Johan Santana threw in 2008. Johan threw 3598 pitches during the 2008 season; of these, Gameday tracked 3490, or 97% of his total. The following table lists the data:
| Pitch Type | Count | Pct Thrown | Avg Speed (mph) | Horizontal Move (in) | Vertical Move (in) |
| --- | --- | --- | --- | --- | --- |
| Changeup | 962 | 27.56% | 79.94 | 7.23 | 7.09 |
| 4-Seamer | 905 | 25.93% | 91.28 | 4.49 | 10.70 |
| 2-Seamer | 1175 | 33.67% | 91.06 | 7.83 | 8.14 |
| Slider | 448 | 12.84% | 83.28 | 0.13 | 3.63 |
| Grand Total | 3490 | 100.00% | | | |
A couple of things here before I go forward, for those not familiar with the data. Horizontal movement is the number of inches of movement compared to a ball with no spin. Positive values mean the ball breaks inward towards a left-handed hitter. All left-handed pitchers have natural movement on their four-seam and two-seam fastballs towards the left-handed hitter, hence the positive values. The changeup behaves just like a fastball and moves in the same direction, whereas the slider or curve will break away from a left-handed batter when thrown by a left-handed pitcher. Vertical movement is measured the same way, in inches compared to a ball with no spin: positive values mean the pitch drops less than a spinless ball would, and negative values mean it drops more. Typically, a curveball will be in the negatives.
Let’s review Johan’s pitches, starting with the fastball. I broke his fastball down into a four-seam fastball and a two-seam fastball. At first, I had trouble telling the two apart. Whereas Mike Pelfrey has a clear distinction between the two, Johan’s clusters were a bit closer together. Johan’s fastball averaged 91 mph, which is exactly league average. The thing that jumped out at me the most was the number of two-seam fastballs compared to four-seamers. This year’s four-seam/two-seam split was 25.93%/33.67%, whereas last year’s split, according to a Hardball Times article, was 51%/19%. After seeing that, it’s no surprise that Johan’s groundball rate was the highest it’s been in his career. Typically two-seamers are thrown with a bit less velocity than four-seamers, so perhaps he’s getting more groundballs at the expense of the strikeout. Overall, he throws the fastball about 59.6% of the time, pretty much in line with the MLB average of 59%.
The next pitch in Johan’s arsenal is the changeup. As seen in the table above, Johan’s changeup averages around 80 mph, roughly 11 mph slower than his fastballs. This difference is what makes Johan so tough; most changeups have a 5-10 mph difference. Compared to the MLB average, he throws the pitch with similar horizontal movement but about an inch less vertical movement.
The last pitch is Johan’s slider. As seen above, Johan doesn’t throw this pitch very often. The MLB average speed for a slider is 84 mph, so Johan is pretty much in line there, and his movement is similar to league average as well. Compared to last year’s data, his slider has improved drastically, with much more horizontal movement away from a left-handed batter than in 2007.
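The usage shares and the fastball/changeup speed gap can be re-derived straight from the counts and average speeds in the table. This is just a quick sketch of that arithmetic; the variable names are mine, and the fastball speed is a count-weighted average of the two fastball types.

```python
# (count, average speed in mph) per pitch type, copied from the table above.
pitches = {
    "Changeup": (962, 79.94),
    "4-Seamer": (905, 91.28),
    "2-Seamer": (1175, 91.06),
    "Slider": (448, 83.28),
}

total = sum(count for count, _ in pitches.values())
share = {name: 100 * count / total for name, (count, _) in pitches.items()}

fastballs = ("4-Seamer", "2-Seamer")
fastball_share = sum(share[name] for name in fastballs)

# Weight each fastball type by how often it was thrown.
fastball_count = sum(pitches[name][0] for name in fastballs)
fastball_speed = sum(pitches[n][0] * pitches[n][1] for n in fastballs) / fastball_count
changeup_gap = fastball_speed - pitches["Changeup"][1]

print(f"fastball share: {fastball_share:.2f}%")
print(f"average fastball: {fastball_speed:.2f} mph")
print(f"changeup gap: {changeup_gap:.2f} mph")
```

On these numbers the two fastballs together make up about 59.6% of his pitches, and the changeup comes in about 11.2 mph slower than the average fastball.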
2) Who Does He Throw It To?
Next, let’s break down Johan’s pitches between lefties and righties:
| Pitch Type | Pct Thrown All | Vs Lefty | Vs Righty |
| --- | --- | --- | --- |
| Changeup | 27.56% | 11.87% | 33.16% |
| 4-Seamer | 25.93% | 32.24% | 23.68% |
| 2-Seamer | 33.67% | 29.19% | 35.26% |
| Slider | 12.84% | 26.69% | 7.89% |
| Grand Total | 100.00% | 100.00% | 100.00% |
Johan follows the typical pattern, throwing about the same percentage of fastballs to everyone, preferring the slider over the changeup against lefties, and the changeup over the slider against righties. Typically, left-handed pitchers are tougher on left-handed hitters; however, Johan can keep his splits closer together because he has such a great changeup. You’ll see this in more detail when I go over some bullpen members (some do not have changeups or do not throw them enough, hence the huge splits). This is also why a guy like Ollie, who’s mainly fastball/slider, is so effective against lefties but not so much against righties.
3) What Happens When He Throws It?
In this section, I wanted to break down the numbers by count, but I was unable to do so with the database I had (I know very little MySQL). Maybe later on I’ll figure it out. What I also wanted to do was break things down by outcome, to perhaps gain a better understanding of what his best pitches are and which pitches tend to get hit the most. This turned out to be difficult as well. I wanted to do AVG/OBP/SLG for each pitch type, but the best I can do is give outcomes. So here goes:
| Outcome/Pitch | Changeup | 4-Seamer | 2-Seamer | Slider | Grand Total |
| --- | --- | --- | --- | --- | --- |
| Bunt G. Out | 0 | 3 | 0 | 0 | 3 |
| Double | 9 | 9 | 11 | 6 | 35 |
| Field Error | 3 | 0 | 3 | 2 | 8 |
| Fielder’s Choice | 0 | 1 | 0 | 0 | 1 |
| Fly Out | 43 | 35 | 55 | 13 | 146 |
| Force Out | 8 | 10 | 5 | 4 | 27 |
| Ground Out | 37 | 31 | 68 | 26 | 162 |
| GIDP | 2 | 4 | 6 | 0 | 12 |
| Home Run | 4 | 9 | 6 | 3 | 22 |
| Line Out | 9 | 11 | 15 | 4 | 39 |
| Pop Out | 24 | 18 | 13 | 8 | 63 |
| Sac Bunt | 1 | 4 | 2 | 2 | 9 |
| Sac Fly | 0 | 0 | 1 | 0 | 1 |
| Single | 27 | 26 | 66 | 21 | 140 |
| Triple | 1 | 1 | 2 | 0 | 4 |
| Grand Total | 168 | 162 | 253 | 89 | 672 |
It might be missing some data due to missing pitches, but I think it gives us a good idea, at least, of what happens each time Johan throws each pitch. I get 672 balls in play compared to 691 on FanGraphs, if I calculated that correctly. His overall BABIP of .299 is a bit higher than the .287 on FanGraphs though, so perhaps I’m a bit off here.
| Pitch Type | BABIP |
| --- | --- |
| Changeup | .253 |
| 4-Seamer | .278 |
| 2-Seamer | .336 |
| Slider | .337 |
Not surprisingly, the changeup comes out on top big time here, with the four-seamer right behind. The slider and the two-seamer are the pitches that get hit the most.
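For anyone who wants to check the per-pitch numbers against the outcome table, here is a rough sketch of two ways to compute them. The first, hits (home runs included) divided by every listed outcome, reproduces the .278, .336, .337 and overall .299 figures, so it may be the formula that was used, though that is an inference from the numbers and not something stated above; on that formula the changeup works out to about .244 rather than .253. The second is the more usual BABIP definition, which leaves home runs out of both the hits and the denominator and also drops sacrifice bunts from the denominator.

```python
# Outcome counts per pitch, copied from the outcome table above.
outcomes = {
    "Changeup": {"single": 27, "double": 9, "triple": 1, "home_run": 4, "sac_bunt": 1, "total": 168},
    "4-Seamer": {"single": 26, "double": 9, "triple": 1, "home_run": 9, "sac_bunt": 4, "total": 162},
    "2-Seamer": {"single": 66, "double": 11, "triple": 2, "home_run": 6, "sac_bunt": 2, "total": 253},
    "Slider": {"single": 21, "double": 6, "triple": 0, "home_run": 3, "sac_bunt": 2, "total": 89},
}


def hits_per_outcome(row):
    """All hits, home runs included, over every listed outcome."""
    hits = row["single"] + row["double"] + row["triple"] + row["home_run"]
    return hits / row["total"]


def babip(row):
    """Non-home-run hits over balls in play, excluding home runs and sac bunts."""
    hits = row["single"] + row["double"] + row["triple"]
    return hits / (row["total"] - row["home_run"] - row["sac_bunt"])


keys = ("single", "double", "triple", "home_run", "sac_bunt", "total")
overall = {key: sum(row[key] for row in outcomes.values()) for key in keys}

for name, row in list(outcomes.items()) + [("Overall", overall)]:
    print(f"{name:9s} hits/outcomes {hits_per_outcome(row):.3f}   BABIP {babip(row):.3f}")
```

Either way the ordering is the same: the changeup and four-seamer sit well below the two-seamer and slider.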
Conclusion
I hope this article has given some insight into why Johan Santana has been a successful pitcher and, hopefully, will remain one throughout his Mets career.
Nice work john.
Johan is good at pitching.
what an awesome article
you rock john, almost as much as johan!
Hi John,
I’ve been fiddling around with the pitchfx analysis for Mike Pelfrey and some of the other Mets pitchers, and one statement of yours came to mind:
“At first, I had trouble telling the two apart. Whereas Mike Pelfrey has a clear distinction between the two, Johan’s clusters were a bit closer together.”
I have all of Pelf’s data and do not see this distinction you claim to see. In movement and speed terms, it is incredibly difficult if doable at all to distinguish between the two types of pitches he throws. My email is wraithlead@hotmail.com. Please check on this if you can and contact me if you do think you find such a distinction.
Thanks
Hi Loran
Shoot. I had just written a lot and it all got deleted.
The difference mostly comes in the vertical category. I think a typical 4-seamer has more vertical movement and less horizontal movement than a 2-seamer. I wish I had that chart handy that shows the range of pitches. Two-seamers are pretty much identical to changeups, just thrown at fastball speed.
Anyway, I couldn’t tell just by looking. I used K-Means clustering to cluster the pitches together.
I think the difference in vertical movement between Mike Pelfrey’s 4-seamer and 2-seamer is greater than Johan’s. Looking at Johan on a game-by-game basis, I honestly couldn’t tell the difference, and neither could K-Means clustering. However, when I bunched all his pitches together, it gave me two separate clusters. Combine that with the fact that Johan actually said he felt his 2-seamer was his best pitch. He said it set up all his others.
With Pelfrey, the pitches separated just fine when using K-Means.
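To give a feel for what clustering fastballs by movement looks like, here is a minimal sketch of plain k-means in Python. It is not the code used for the article. The pitch data is synthetic: made-up points scattered around Johan’s 4-seam and 2-seam movement averages from the table, with an assumed spread of 0.8 inches, so it only illustrates the technique and says nothing about how cleanly real pitches separate.

```python
import random


def kmeans(points, k, max_iterations=100, seed=0):
    """Plain k-means on tuples of floats; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assign every point to its nearest center (squared distance).
        clusters = [[] for _ in range(k)]
        for point in points:
            distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
            clusters[distances.index(min(distances))].append(point)
        # Move each center to the mean of its cluster.
        new_centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def synthetic_fastballs(seed=1, per_type=300, spread=0.8):
    """Made-up (horizontal, vertical) movement in inches; NOT real pitch data."""
    rng = random.Random(seed)
    averages = [(4.49, 10.70), (7.83, 8.14)]  # 4-seam and 2-seam averages from the table
    return [
        (rng.gauss(h, spread), rng.gauss(v, spread))
        for h, v in averages
        for _ in range(per_type)
    ]


points = synthetic_fastballs()
centers, clusters = kmeans(points, k=2)
for center, cluster in sorted(zip(centers, clusters)):
    print(f"center (h={center[0]:.2f}, v={center[1]:.2f}) with {len(cluster)} pitches")
```

With real data you would probably add speed as a third dimension and scale the columns so that mph does not swamp inches.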
I think a hard rule is anything under 5 inches of vert can be considered a sinker.
I don’t know. I plan on looking at all the data sometime this weekend though.
I guess that should read LESS vertical movement. The numbers are a bit tricky.
Typical 4-seamer = 10.00 inches
Typical 2-seamer = 5.00 inches.
Also, 4-seamers are thrown harder most of the time (though not by a huge amount). Though with Pelfrey I’m not so sure that’s the case, IIRC.
Well, part of the issue is that Pelf’s speed does not seem to correlate with his movement on his pitches. In other words, as you said, it doesn’t look like on pitchfx that his “4 seamers” are going faster than his “2 seamers,” and if they are at all, it is certainly not by a large amount.
Now, if you want to call all of his fastballs at 5 or lower 2-seamers, that makes sense. But the number of pitches that fall under that category is rather small, so we’re likely excluding a lot of 2-seamers. Most of Pelf’s pitches are in the 6-8 vertical movement range, which is imo a grey zone.
On the graph of horizontal by vertical movement, there simply doesn’t seem to be any way to distinguish between pitches in these areas (and using speed doesn’t help, for the aforementioned reasons).
We know Pelf throws the two pitches, but I don’t think the difference is particularly noticeable via pitchfx, except at the extremes that you’ve mentioned.
Hmm I’ll have to take a look at the data.
Right now, pitch classification is extremely difficult. I don’t think there’s really a way to be 100% sure when determining pitch types (especially when dealing with subsets of pitches). I mean, you’ve got 2-seam and 4-seam, and then you’ve got the cutter, split-finger fastball, etc.
Perhaps 5 inches shouldn’t be the cutoff point. Looking at Johan, his 2-seamer averages around 8.
It’s difficult because each pitcher is different. A Tom Glavine fastball might be another pitcher’s changeup.
I’ve seen people calculate spin (rpm) using the data and classify in that fashion. Though they are using the same data we have, so I’m not sure whether that’s any more accurate than what I’m doing here.
I am pretty comfortable with general pitches (fastball, slider, curve, changeup). As for breaking those down any further, I think perhaps in the future we can get a bit more accurate.
I do think, though, that 2-seamers are going to give you more ground outs and probably give up more singles, whereas 4-seamers might give up more extra-base hits. Looking at the data above (and Pelfrey’s), when I broke them down between the two pitches, both pitchers got more ground balls on the 2-seamers. So while I could be classifying them incorrectly, I’m reasonably confident most are being classified right. But it’s definitely something that perhaps in the future we can get better at. I’ve read a ton of pitch f/x articles, and I haven’t seen a better way of classifying yet, unfortunately.
Another thing: vertical movement should not be the only difference. There’s a difference in horizontal movement as well between the two, so both need to be factored in.
Above you’ve got 10.7 vs 8.1 vertical, but you’ve also got 4.5 vs 7.8 horizontal; both, combined with fastball speed, could help separate the two.
Horizontal movement also does not seem to correlate that well with any other factor in Pelf’s case.
I mean, I agree, it’s a lot easier to categorize them in terms of FA, CH, SL, etc. And it’s clear the type of fastball it is SHOULD matter a lot (especially in Pelf’s case).
Anyhow, I’m just nitpicking since I saw that sentence and was working on Pelf. The article on Johan seems pretty solid. Hopefully we’ll get more pitchfx stuff here in the future.
Thanks. I’m hoping to go through the majority of the pitchers this offseason.
What if you combined the two together, though?
While horizontal or vertical movement might not correlate well in and of itself, I was thinking the two combined would.
It seems like a sizable difference, 10.7/4.5 to 8.1/7.8. But those are only averages. There might be some in the 2-seam cluster that are also very close to the 4-seam. Perhaps there’s a better way.
Another question would be the magnitude of the difference. I mean, what’s a big difference? Is a ball that breaks 9 inches a lot different than one that breaks 10 inches?
I think a lot of questions have yet to be answered. But I do believe we can still learn a lot from what we have.
For instance, I always wondered why some pitchers tend to have more extreme splits than others. I’ve learned from this analysis that, generally, pitchers with good changeups have less extreme splits.
Whenever I see Schoeneweis struggle against righties I think to myself, “Man, if he could just learn another pitch, a changeup to keep the hitter off balance or perhaps a cutter.” Perhaps it’s not that easy, though.
One of the things you’ll see with Mike Pelfrey is that he doesn’t throw the changeup very often, and when he does, it’s mostly out of the strike zone, hence the reason lefties have hit him pretty well in his career. I think Alex did an article on here that mentioned that throwing a good changeup isn’t easy, so perhaps that’s why some are ineffective in throwing it.
Great article.
Where did you get all this data?
And one thing that disturbs me is the amount of lineouts, unless some are being read incorrectly.
And how could horizontal movement of 0.13 be much more than in 2007? Was it 0.001 then?
I used a Perl script from Mike Fast to spider the data from mlb.com. I put the script into my scheduled tasks so that it automatically downloaded the data each night. I altered the script because I only wanted Mets pitchers. Once the data was obtained, I parsed it into a MySQL database. I queried a specific pitcher, exported the results to Excel, and played around with it from there.
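For anyone curious what the parse-and-query step might look like, here is a small sketch in Python. It is not Mike Fast’s script and not the setup described above: it uses SQLite as a stand-in for MySQL, skips the downloading entirely, and reads a made-up XML snippet. The attribute names (`type`, `des`, `start_speed`, `pfx_x`, `pfx_z`) are modeled on the Gameday pitch feed, but treat the exact layout as an assumption. It also shows one way to get the by-count breakdown mentioned in the article, by tracking balls and strikes while loading and then using a GROUP BY.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like a Gameday at-bat; every value is made up.
SAMPLE_XML = """
<inning>
  <atbat pitcher="1001" stand="R">
    <pitch type="S" des="Called Strike" start_speed="91.3" pfx_x="4.6" pfx_z="10.9"/>
    <pitch type="B" des="Ball" start_speed="80.1" pfx_x="7.1" pfx_z="7.3"/>
    <pitch type="X" des="In play, out(s)" start_speed="90.8" pfx_x="7.9" pfx_z="8.0"/>
  </atbat>
</inning>
"""


def load_pitches(xml_text, conn):
    """Insert one row per pitch, recording the count BEFORE the pitch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pitches ("
        "pitcher TEXT, stand TEXT, balls INTEGER, strikes INTEGER, "
        "result TEXT, speed REAL, pfx_x REAL, pfx_z REAL)"
    )
    for atbat in ET.fromstring(xml_text).iter("atbat"):
        balls = strikes = 0
        for pitch in atbat.iter("pitch"):
            conn.execute(
                "INSERT INTO pitches VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                (atbat.get("pitcher"), atbat.get("stand"), balls, strikes,
                 pitch.get("des"), float(pitch.get("start_speed")),
                 float(pitch.get("pfx_x")), float(pitch.get("pfx_z"))),
            )
            # Update the count; a foul with two strikes does not add a strike.
            if pitch.get("type") == "B":
                balls += 1
            elif pitch.get("type") == "S" and not (pitch.get("des") == "Foul" and strikes == 2):
                strikes += 1


conn = sqlite3.connect(":memory:")
load_pitches(SAMPLE_XML, conn)
rows = conn.execute(
    "SELECT balls, strikes, COUNT(*), AVG(speed) FROM pitches "
    "GROUP BY balls, strikes ORDER BY balls, strikes"
).fetchall()
for balls, strikes, n, speed in rows:
    print(f"{balls}-{strikes}: {n} pitch(es), avg {speed:.1f} mph")
```

The same GROUP BY query should carry over to MySQL more or less unchanged once the count columns exist.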
That was a lot easier than what I was originally doing, which was manually downloading the XML files every morning, opening them in Excel, renaming them, and saving them. That was tiresome, but I continued to do it because sometimes after watching the game I’d want to know the next day just how that start went. I had a folder for each pitcher, and each file was a game log.
The lineouts make sense. According to FanGraphs, 151 line drives were hit against Johan. Typically 75% of line drives are hits, so that’s about 113 LD hits and 38 line outs.
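The line-drive estimate is just a couple of lines of arithmetic, shown here so it can be compared with the 39 line outs in the outcome table. The 75% figure is the rule of thumb quoted above, not a measured value for Johan.

```python
line_drives = 151   # line drives allowed, per the FanGraphs figure quoted above
hit_share = 0.75    # rule-of-thumb share of line drives that fall for hits

expected_hits = round(line_drives * hit_share)
expected_line_outs = line_drives - expected_hits

print(f"expected line-drive hits: {expected_hits}")
print(f"expected line outs: {expected_line_outs} (outcome table shows 39)")
```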
According to a Hardball Times article, his horizontal movement on the slider in 2007 was 2.74; on Josh Kalk’s player cards for 2007 it was 2.98. So his slider wasn’t getting much horizontal movement at all then. You want that number to be close to zero or in the negatives (away from the LHB).
Not only does the FIP of 3.5 scare me, but if you check fangraphs, a lot of his peripherals have mildly but steadily declined since ‘04 (his first Cy Young season)….These include K/9, BB/9, K/BB, AVG, WHIP, and BABIP…..
Just a little worrisome.