Monday, March 8, 2010

Day and Night

The Man with the "Goat"-Tee.

Day and Night. No, not the Kid Cudi song, though it rang through my head almost the entire time I was working on this. We're talking about batting splits. Almost a week ago, Paul Sullivan quoted Cubs statistician Chuck Wasserstrom as saying: "You don't look at that as much, but you definitely pay attention to day vs. night because of the amount of games we play during the day." This kind of shocked a lot of us, but still failed to dull my overall pleasure at the revelation.

Anyway, Sullivan went on to mention how Aaron Miles had hit for a high daytime batting average in 2008 (yuck; bad stat, weird split, and small sample), yet was not able to translate that into better production. So, I decided to delve further into this. Is there really a difference between day and night splits? Can the Cubs benefit from adding a guy based on his day splits?

First of all: in deference to the local neighborhood traffic, the Cubs play a majority of their home games during the day. So, if we can cause even one of the 60-some day games to flip from loss to win, it would really do us good in a sport where five to ten extra wins could be the difference between third and first place. So: I'm all for taking advantage of whatever home field advantages we can find. But is there really a quantifiable difference in day and night hitting?

From my own experience, playing baseball during the day can be an entirely separate experience from playing at night. At night, the ball cuts through distinct, white lights, and during the day, funny shadows can play tricks on pitches (and have done so historically) and everything -- from the grass to the sky -- is just really bright. But, I ask myself, would these issues really still occur for someone who plays the sport professionally -- i.e. someone who has, since youth, played at night, practiced in the day, and many times vice versa; someone who plays the sport all year long (if we include winter ball) and has practiced for years? Here, I tend to become suspicious.

Normally, I just turn to The Book whenever I have a mathematical question of this nature, but I could not find anything about day/night splits in there. Typically, most sabermetricians don't consider day/night splits simply because: a) there doesn't seem to be significant variation and b) more importantly, there is really not a large enough sample to examine a player's variation.

Well, let's address (a). Is it a misperception that day/night splits are pretty neutral? Here are the last five years by wOBA:

(Note: wOBA includes stolen bases, which presumably should not differ between day and night. However, any misleading results from this seem to be nullified, because I also examined the variations for OBP and SLG and saw results similar to wOBA's.)

Year  MLB    Wrig   Diff
2009  0.331  0.328   0.003
2008  0.330  0.338  -0.008
2007  0.334  0.335  -0.001
2006  0.337  0.339  -0.002
2005  0.330  0.327   0.003
AVG   0.333  0.333  -0.001

Year  MLB Diff  AL Diff  NL Diff
2009   0.000     0.006    0.001
2008   0.000     0.003    0.001
2007   0.000    -0.001    0.004
2006  -0.001     0.005    0.000
2005  -0.002     0.005   -0.002
AVG   -0.001     0.004    0.001

Year Day Night Day Night Day Night Day Night
2009 0.332 0.332 0.326 0.325 0.334 0.340 0.331 0.332
2008 0.330 0.330 0.328 0.327 0.333 0.335 0.331 0.331
2007 0.334 0.334 0.335 0.330 0.333 0.338 0.334 0.334
2006 0.336 0.337 0.333 0.332 0.341 0.343 0.337 0.337
2005 0.329 0.331 0.326 0.328 0.333 0.334 0.329 0.331
AVG 0.332 0.333 0.329 0.328 0.335 0.338 0.332 0.333

Hmmm... The above charts don't offer much compelling evidence that batters hit differently in day versus night games. The top chart shows the MLB average wOBA versus the wOBA at Wrigley Field. This is just a kind of park-adjustment reminder. Basically, it shows that over the last five years, players have hit pretty much what we would expect them to hit at Wrigley (note: this is not a full-scale park adjustment; we have previously noticed that Wrigley, on average, is a hitter's park, and our above-average pitching influences the wOBA results).

The second chart presents the differences between the day and night columns of the bottom chart. Basically, we can assume that hitters generally hit very similarly in both day and night, across both leagues.
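For anyone who wants to re-check the subtraction, here is a minimal sketch of the day-minus-night calculation. It assumes the first Day/Night pair in the bottom chart is the MLB-wide column (the numbers line up with the "MLB Diff" column, but the column labels are my assumption), and the function name `day_minus_night` is just something made up for this example.

```python
# MLB-wide wOBA by year as (day, night), copied from the first Day/Night
# pair of the bottom chart. Assumption: that pair is the MLB-wide column.
mlb_woba = {
    2009: (0.332, 0.332),
    2008: (0.330, 0.330),
    2007: (0.334, 0.334),
    2006: (0.336, 0.337),
    2005: (0.329, 0.331),
}


def day_minus_night(splits):
    """Return {year: day wOBA minus night wOBA}, rounded to three decimals."""
    return {year: round(day - night, 3) for year, (day, night) in splits.items()}


diffs = day_minus_night(mlb_woba)
for year, diff in sorted(diffs.items(), reverse=True):
    print(year, f"{diff:+.3f}")
```

A negative number means hitters did slightly better at night that year; none of the five gaps is larger than two points of wOBA.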

However, these are just averages; somewhere behind the scenes, hundreds of players with different skill-sets are potentially pulling the results in different directions. I wish B-Ref's data included standard deviations, so we could get a better feel for the overall shape of this distribution. Nonetheless, any variation seems (on the surface) to be a product of chance.
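To get a feel for how much spread chance alone can produce, here is a rough simulation sketch. Everything in it is hypothetical: 300 identical hitters with a true OBP of .333, 200 day and 400 night plate appearances each, and every plate appearance treated as an independent coin flip. It is not built from B-Ref data; it only shows what "a product of chance" could look like.

```python
import random
import statistics


def simulate_split_gaps(n_players, day_pa, night_pa, true_obp, seed=0):
    """Simulate day-minus-night OBP gaps for players with identical true talent."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    gaps = []
    for _ in range(n_players):
        # Count times on base in each split; each PA is an independent trial.
        day_on_base = sum(rng.random() < true_obp for _ in range(day_pa))
        night_on_base = sum(rng.random() < true_obp for _ in range(night_pa))
        gaps.append(day_on_base / day_pa - night_on_base / night_pa)
    return gaps


# All inputs below are made-up illustration values, not real player data.
gaps = simulate_split_gaps(n_players=300, day_pa=200, night_pa=400, true_obp=0.333)
print("mean gap:", round(statistics.mean(gaps), 3))
print("std dev of gaps:", round(statistics.stdev(gaps), 3))
```

Even though every simulated hitter is exactly the same in day and night, the individual gaps scatter on the order of 40 points of OBP, which is why a single player's split can look dramatic without meaning anything.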

But let's look at Aaron Miles. Is there something that sticks out in his numbers? In hopes of capturing a larger sample size, let's look at his career day/night splits:

Split  G    PA    HR  BA     OBP    SLG    OPS    BAbip  tOPS+
Day    254  855   6   0.307  0.344  0.402  0.746  0.334  120
Night  463  1568  10  0.268  0.309  0.331  0.640  0.291  89

Wow! His daytime OBP and SLG are enormous next to his nighttime numbers! But two problems: a) his sample size -- 855 PAs is about one season and a month (and Ken Griffey in 2009 showed us how reliable one season of data is) -- which brings us to b) his BABIP. In that sample, Miles appears to have been extraordinarily lucky (a BABIP about 30 points higher than his career average), and some of those hits appear to have gone for doubles and triples too, as his SLG is far above his career norms (he hit 11 triples during the day!).
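One back-of-the-envelope way to put a number on the sample-size worry is a two-proportion z-score on his OBP. This is a rough sketch, not a proper analysis: it treats every plate appearance as an independent on-base-or-not trial and uses PA as the OBP denominator, which is only approximately right. The helper name `split_z_score` is made up for this example.

```python
import math


def split_z_score(rate_a, n_a, rate_b, n_b):
    """Rough z-score for the gap between two rates, treating each
    plate appearance as an independent yes/no trial."""
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)
    return (rate_a - rate_b) / se


# Miles's career OBP and PA from the day and night rows above.
z = split_z_score(0.344, 855, 0.309, 1568)
print(round(z, 2))
```

The result lands a little under 2, the usual rough cutoff for calling a gap more than noise, so even a 35-point OBP difference over these sample sizes is not clearly distinguishable from luck.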

However, we also have to ask: is his BABIP higher for a different reason? If he runs faster during the day (highly unlikely) or blinds fielders with his goatee (also unlikely), then this would be easier to discern. However, there certainly is a chance that he's simply zeroing in on the ball better during the day and therefore hitting more line drives. Unfortunately, we don't have day/night splits for line-drive data (not that I know of), so this will have to go unanswered.

Ultimately, I can only comfortably assert the following:

1) the majority of players hit the same during day or night games,

2) exceptions to this may well exist, and

3) Aaron Miles may be one of those exceptions, but it does not seem likely and we cannot be certain.

I leave you with this:


  1. Kid Cudi and sabermetrics. Love it. I've listened to a ton of Cudi lately.

    Also: Question on one of your last posts: Where did you get your wOBA's for your post on wOBA's per position?

  2. Tragically, I had to compute it myself because I was using numbers from Baseball-Reference. I will probably make all these datasets available in the next few days, so people can play with them and double-check my numbers.