Friday, February 26, 2010

A Deeper Look at 2009 Batting Splits: By Position

Well, a week or so ago, I wrote what I considered to be my final, consummate analysis of 2009. This seemed reasonable, given that Spring Training was right around the corner and I'd have something else to do. Well, I admit now that seeing pictures and videos of the Cubs running drills and playing catch has not in fact sated my desire for statistical games -- it's only increased it. So, without any new statistical information (the games don't start until next Thursday), I must continue to plow through 2009 data to remind myself that 2010 is full of potential.

Anyway, one thing I want to look at is the old manager's axiom: "We've got to have better situational hitting." This tired, old statement seems to spill out of each manager's mouth at some point in the season and can mean a lot of things, but typically it's about hitting well with runners on -- when the run expectancy is highest. I'll get to that eventually, but I also want to peruse some other topics of the Splits nature:

Thursday, February 25, 2010

So It Begins...

There's something very comforting about all of this:

The above video comes from the Boys of Spring blog. Fans who don't check it every day may not be ready for baseball!

Tuesday, February 23, 2010

Around the Interlines

First of all, and this won't be the last time I mention it, our pals from DRaysBay -- namely Steve Slowinski -- created an excellent definitions database for sabermetric terms and tools. It's called the Sabermetrics Library, and we should all visit it often. (It's also linked on the side of our page layout.)

On a similar note, my idol, RJ Anderson, took some time to link up a bunch of useful saber sources.

Oh, and Soriano's knee still hurts. Geez, he is turning into a caricature of an aging power hitter. I really hope his health turns around and we don't have to DFA him after the All-Star Break.

At ACB, mb21 looks into Hendry's multi-year contracts. It's a really great study and frankly surprised and intrigued me. Also, it continues to add to my Winter of Apologies, as I begin to better appreciate the Cubs FO.

Lastly, this baby:

It comes from some data that Papa Tango was playing with on his blog. Basically, Tom is looking at peak pitcher ages, and how such a study can be corrupted by selection bias -- a topic that was pretty heated a month or two ago, and still worth examining in my opinion.

Oh, and our friend Berselius made me really, really hungry.

Saturday, February 20, 2010

The Pitch

A few days ago, I gave my friends and family quite a scare: I tweeted a photo of myself, appearing to have received some minor stitching on my forehead. Well, I certainly received some sort of stitching. It all started in early January, when the accumulated snow was thickening to ice and the skies were daily gray...

Friday, February 19, 2010

Lent and Why I Abstained from Purchasing Cubs Tickets

It is Lent and I am a little loopy from quasi-fasting and not eating meat (which is a real struggle for me). I haven't had a full meal in 48 hours (unless curling in mass quantities counts).

Anyways, Cubs tickets went on sale today to the general public.

I decided to abstain from Cubs tickets this year because




Like faith, I have many more questions than answers for this year's team. Those questions are many and varied:

Will we compete for the division?

Will we remain relatively healthy?

Will Soto rebound?

Has Zambrano used 7 minute abs?

Will Soriano catch a flyball without his hop?

Will the Ricketts family get rid of the guest singer for the 7th inning stretch?

So this year, I'm going to keep the Cubs at arm's length, focusing more on production analysis than complete fandom. I will continue to pray for the Cubs. But sacrificing the naivete of being a Cubs fan will be good for the soul.

Sometimes you need a break.

So today's meal consisted of (bad) minestrone soup...

...sometimes during Lent, fasting is the better choice.

Wednesday, February 17, 2010

Ryan Theriot, Arbitration, Starlin Castro, Benny Hill and Curling?

Rob G. from The Cub Reporter had a GREAT post on the arbitration process with the Cubs and Ryan Theriot. It includes a brief synopsis of arbitration, a comparison (to Stephen Drew), UZR, other stats, and Benny Hill theme music (all is right in the world).

All of this arbitration talk made me think of something, though: this could be Theriot's last season as the Cubs' starting shortstop. The reason is that we have Starlin Castro waiting in the wings.

Marc Hulet from Fangraphs had this to say about Castro:

"Just 19, Castro was pushed from rookie ball to high-A ball and he still hit .302/.340/.391 in 358 at-bats...Castro’s walk rate of just 5.0% was worrisome, but it improved to 8.3% upon a promotion to double-A (111 at-bats). Despite the late-season jump, the shortstop actually showed improvements in his game against better pitching...he also kept his strikeout rate low at 10.8% while hitting .288...successful in all six of his stolen base attempts in double-A. Defensively, he has the skill set to remain at shortstop...He’ll likely return to double-A to begin the season, but he’ll probably be the Cubs’ starting shortstop before his 21st birthday."

What I gather from this arbitration process and the talented Castro paying his dues in the minors is that Theriot doesn't fit into the Cubs' long term plans at shortstop. If the Cubs are smart, they give Castro a hard look this spring and maybe call him up at some point during the season. Then, hopefully, next year's middle infield would be Castro and Theriot.

In other news: I noticed something about the Olympics, and it is this: curling sneaks up on you. I've caught myself watching and enjoying it these past couple of days. Sure, it looks ridiculous, but admit it: you've given a thought or two to becoming a curler just to say you're an Olympian.

Chase that dream.

Injured Pitchers, Korean Pitchers, and Something Else

Over on RotoBlog, Josh Hermsmeyer released a pretty significant injuries database, ranging back to 2002. It's a well-constructed, unprecedented database. His findings give us this:

You mean baseball players tend to get arm/shoulder injuries?! Yeah, I realize that's not really groundbreaking, but I think this database will really be fun moving forward.

In other news, the Cubs appear to have signed Kim Jin-yeong, a prospect out of Korea. He's an 18-year-old RHP, straight out of high school. I do not speak Korean, so my information is limited to what Korea Baseball has to offer. However, Korean prospects are certainly unique in that, oftentimes, the government interrupts their development with the nationally required military service. So hopefully Kim breaks into the majors fast and earns an exemption.

Also, I just finished an article reviewing the Cubs 2009 season over at ACB. Please check it out.

UPDATE: And for those who didn't believe Soto lost the 40 pounds, I refer you here.

Tuesday, February 16, 2010

Spurned by SIERA

Today, we finish the Cubs-SIERA analysis, taking a look at the bottom of SIERA's barrel. But, before we dive into that, I'd like to mention mb21's ACB post about Carlos Silva. I am beginning to campaign for Silva's forgiveness because: a) I really do hope he turns his career around and b) I may have looked at his numbers and scoffed at him too quickly. That being said, I really would be surprised if he has a good season. I suspect he gets designated for assignment by mid-season. :(

Anyway, Cubs Stats readers may recall my previous post, which both explained SIERA and applied it to the 2009 Cubs. In that post, we discovered SIERA dislikes these pitchers:

The Kind-of Losers (anyone with a higher SIERA than xFIP)
Sean Marshall: (SIERA) 4.01, (xFIP) 3.82
Randy Wells: 4.34, 4.24

Carlos Silva: 5.65, 5.38
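
For anyone who wants to see the size of each gap at a glance, here is a minimal sketch that subtracts xFIP from SIERA for the three pitchers listed above. The numbers are copied straight from the list; the names `cubs` and `siera_gap` are just my own labels for illustration.

```python
# (SIERA, xFIP) pairs copied from the list above.
cubs = {
    "Sean Marshall": (4.01, 3.82),
    "Randy Wells": (4.34, 4.24),
    "Carlos Silva": (5.65, 5.38),
}


def siera_gap(siera, xfip):
    """Positive means SIERA is harsher on the pitcher than xFIP."""
    return round(siera - xfip, 2)


gaps = {name: siera_gap(s, x) for name, (s, x) in cubs.items()}

# Print the biggest "losers" first.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:+.2f}")
```

All three gaps are positive, which is exactly what puts these guys in the "Kind-of Losers" bucket.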

Now, what I found most intriguing about the differences in SIERA and xFIP is that the maximum gain from SIERA was around 0.94 (Russ Springer) and the maximum loss was equally close to 1 (J.J. Putz). In other words, there is really not a whole lot of variation there. But, considering SIERA uses a lot of similar inputs as xFIP, we should expect that any variation is pretty close to being significant (even if it is not statistically significant, which is a whole different can of worms).

So what is it about these pitchers' profiles that makes SIERA punish them? Or, more importantly, do any of these pitchers have profiles more akin to receiving SIERA's blessing rather than its spurning?

Sean Marshall earned some above-average ground ball rates (GB%), but his strikeouts and walks were close to league average (K/BB 2.13; league average is about 2.00, and higher is better). Also, he had a normal home run rate. Therefore, SIERA doesn't see anything to reward.

Randy Wells had a great rookie season, but was also pretty lucky. Almost every metric tells us that. He had an unusually low HR-rate (0.76 per 9 IP; league average is about 1.00) with fewer strikeouts than you would like for a guy giving up a standard amount of flyballs. In other words, SIERA can see through the luck, too.

Carlos Silva really only pitched a handful of games in 2009, a total of 30 IP. It's hard to critique on that small of a sample, but SIERA rightly detects that he: a) struck no one out and b) surrendered long balls like mad.

Ultimately, it again appears that SIERA has done a pretty good job identifying what the marginal statistics confirm. With the exception of Carlos Marmol, I can't think of any SIERA results that left me scratching my head.
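
For readers who want to check rate stats like the K/BB and HR/9 figures quoted in the profiles above, here is a rough sketch of the arithmetic. The function names are my own, and the stat line at the bottom is invented (picked only so the results land near the rates quoted above); it is not anyone's official line.

```python
def innings(ip):
    """Convert box-score innings (165.1 means 165 and 1/3) to a real number."""
    whole = int(ip)
    outs = round((ip - whole) * 10)  # the digit after the decimal counts outs
    return whole + outs / 3


def k_per_bb(so, bb):
    """Strikeout-to-walk ratio; higher is better."""
    return so / bb


def hr_per_9(hr, ip):
    """Home runs allowed per nine innings; lower is better."""
    return 9 * hr / innings(ip)


# Hypothetical inputs, made up for illustration only.
print(round(k_per_bb(98, 46), 2))
print(round(hr_per_9(14, 165.1), 2))
```

The only trap here is the innings notation: treating 165.1 as a plain decimal slightly misstates any per-nine rate, which is why the sketch converts it to thirds first.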

Monday, February 15, 2010

WAR Calculator, Spring Training Photos, Pseudo-SIERA, and Jay Cutler

Our friend at Another Cubs Blog, mb21, has updated Colin Wyers' 2008 WAR Calculator, applied it to the Cubs, and made it available to all of us! This is really handy when trying to make predictions using different numbers than the norm.

Meanwhile, the fine bloggers at WPBC brought to my attention Tim Sheridan's blog, Boys of Spring. Sheridan has some really neat posts, complete with photos of some early-bird Cubs. Take some time to get excited about baseball again!

The SIERA testing phase continues as Patriot of Walk Like a Sabermetrician dives into SIERA's inner workings and makes his own faux-SIERA. It's an interesting read for those keeping tabs.

In the world of football, the ever-excitable Ron Jaworski took some time to analyze Cutler's 26 picks in 2009. Many of these throws certainly seem bone-headed in a vacuum, but some of the reasons he was so jittery and throwing into double coverage include: 1) no running game, so everyone sits back in coverage, and 2) weak pass protection made him a scared, scared man -- teams didn't need to rush more than four because his porous line (which improved steadily near the end of the year) had already put The Fear into him.

Saturday, February 13, 2010

SIERA Watch: Day 5 (FIP and xFIP)

Above, we have SIERA vs FIP and xFIP. The line is approximately 45 degrees. In other words, the line represents a 1-to-1 equality. If SIERA and FIP/xFIP were by and large similar (which wouldn't necessarily be bad), then they would follow along this line. We can see, particularly in the higher numbers, how xFIP tends to be a touch lower than SIERA. It's hard to say, just yet, what this means.
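
To make the 1-to-1 idea concrete, here is a small sketch that fits an ordinary least-squares line through a few (xFIP, SIERA) points and compares the slope to 1. I'm assuming xFIP on the horizontal axis and SIERA on the vertical one, and I'm only using the three Cubs pairs quoted elsewhere on this blog (Marshall, Wells, Silva), so treat it as a toy, not a replacement for the full data set.

```python
# (xFIP, SIERA) pairs for Marshall, Wells and Silva; a tiny sample, for illustration.
pairs = [(3.82, 4.01), (4.24, 4.34), (5.38, 5.65)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(pairs)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")

# On the 45-degree line the slope would be exactly 1 and the intercept 0.
print("steeper than 1-to-1" if slope > 1 else "not steeper than 1-to-1")
```

A slope above 1 means the gap between SIERA and xFIP widens as the numbers get higher, which is the fanning-out pattern described above.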

Anyway, let's continue profiling our beloved Cubs and see if this coincides with our other knowledge.

Friday, February 12, 2010

SIERA Watch: Day 4 (Data!)

I've created two (separate, but equal) docs -- one in Excel, one in OpenOffice -- for a complete SIERA list, including pitchers who pitched 25 innings or more. You can download these from my Google Docs, here:

Excel SIERA Test


OpenOffice SIERA Test

For those just catching up, here is Day 1 and Day 3 (I was busy on Day 2 with my classes or looking at pictures of cats or something -- doesn't matter, nothing super significant happened).


I just ran a quick little regression or two in STATA. It looks like FIP and SIERA, as a whole, are correlated but are indeed somewhat different. The chart below shows how they tend to fan out in the higher ranges:

Thursday, February 11, 2010

SIERA Watch: Day 3

Today, Tom Tango took a look at SIERA's treatment of walks. I'm really happy with the way the sabermetric community has so steadfastidly (?) prodded this new tool. It's very much a wiki-like reaction: in the first few days, everyone began tinkering with SIERA, opening the hood, poking the engine with a wrench (or whatever you do to cars). On the first day, I built a little Quick Calculator for the world to play with, and my dawg, FreeZorrilla, put it to the AL East test. Meanwhile, the big dogs at The Book blog put the metric through the rigors. The latest article produces some eyebrow-raising questions -- like: "Since when has allowing more walks been a good thing for pitchers?"

Anyway, Tango mentions the need for real-world comparisons to help gauge SIERA in non-isolation. So, let's take a closer look at our initial results:

Tuesday, February 9, 2010

I Can't Decide Whether to Call it SEE-air-UH or SEER-uh...

Today, Tango mentioned Matt Swartz and Eric Seidman's latest pitching metric, what they call Skill-Interactive Earned Run Average (SIERA). Anyway, I had some time on my hands and some intrigue in my heart, so I thought I'd apply their nifty formula (below) to our beloved Cubs (2009 stats, plus Carlos Silva in the Google Doc). Here is their formula:

SIERA = 6.262 – 18.055*(SO/PA) + 11.292*(BB/PA) – 1.721*((GB-FB-PU)/PA) +10.169*((SO/PA)^2) – 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) – 4.027*(BB/PA)*((GB-FB-PU)/PA)
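
If you'd rather not punch that into a spreadsheet, the formula above can be sketched as a small function. The coefficients are copied from the line above as written; the function and argument names are my own, and the sample stat line is invented purely to show the call.

```python
def siera(so, bb, gb, fb, pu, pa):
    """SIERA as written in the formula above.

    so, bb, gb, fb, pu are counts of strikeouts, walks, ground balls,
    fly balls and pop-ups; pa is plate appearances (batters faced).
    """
    k = so / pa               # strikeout rate
    w = bb / pa               # walk rate
    g = (gb - fb - pu) / pa   # net ground-ball rate
    return (
        6.262
        - 18.055 * k
        + 11.292 * w
        - 1.721 * g
        + 10.169 * k ** 2
        - 7.069 * g ** 2
        + 9.561 * k * g
        - 4.027 * w * g
    )


# Hypothetical pitcher: 800 batters faced, 160 K, 64 BB, 280 GB, 200 FB, 40 PU.
print(round(siera(so=160, bb=64, gb=280, fb=200, pu=40, pa=800), 2))
```

Everything is divided by plate appearances rather than innings, so a pitcher's luck on balls in play doesn't leak into the denominator.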

And here is why we like it:
1. Allows for the fact that a high ground-ball rate is more useful to pitchers who walk more batters, due to the potential that double plays wipe away runners.

2. Allows for the fact that a low fly-ball rate (and therefore, a low HR rate) is less useful to pitchers who strike out a lot of batters (e.g. Johan Santana's FIP tends to be higher than his ERA because the former treats all HR the same, even though Santana’s skill set portends that the bombs he allows will usually be solo shots).

3. Allows for the fact that adding strikeouts is more useful when you don't strike out many guys to begin with, since more runners get stranded.

4. Allows for the fact that adding ground balls is more useful when you already allow a lot of ground balls because there are frequently runners on first.

5. Corrects for the fact that QERA used GB/BIP instead of GB/PA (e.g. Joel Pineiro is all contact, so increasing his ground-ball rate means more ground balls than if Oliver Perez had done it, given he's not a high contact guy).

6. Corrects for the fact that FIP and xFIP use IP as a denominator which means that luck on balls in play changes one's FIP.

Follow the jump as we examine how this relates to our beloved Cubbies.

Friday, February 5, 2010

Dear Cubs Fans, LET IT GO!

The argument I HATE, with a passion, is when certain Cub fans come up to me and say Steve Bartman blew the Cubs' chances in 2003. As a studying economist and sabermetrician hobbyist, I know this isn't correct. But I need proof. Data. Analysis. Hard numbers. Ron Santo's hair piece.

So I embarked on a journey through the inter-web, trying to find some statistical measurements about that fateful October night and, lo and behold, I found it courtesy of Tom Tango from the Hardball Times. He had this to say about the Cubs' collapse.

Basically, the Bartman incident only decreased the Cubs' chances of winning by 3%. The odds were still HEAVILY in their favor with a whopping 92% at that point.

There we have it. Finally. Some sense to this nonsensical argument.

So please, Cubs fans that blame Bartman for 2003, Brangelina, Miley Cyrus, and the bailout, stop embarrassing yourselves and the rest of Cubs nation and LET IT GO.

Besides, you know you would've reached for that ball as well.

Something extra:
1) ESPN is doing a documentary on Steve Bartman for their "30 for 30" series titled "Catching Hell".
2) In July of 2005, Wayne Drehs of ESPN wrote an excellent piece about finding Steve; you can read it here.
3) And if you still aren't sold on this, allow me to tug on your heart strings; imagine your last game at Wrigley Field to watch your beloved Cubbies play ended like this.

The Rotation Outlook

As the precious glimmers of Spring Training approach, many Cubs fans are really getting anxious about our rotation. The top four (Zambrano, Lilly, Dempster, and Wells) will be only three for the first month as Terrible Ted recuperates from surgery. So that leaves an open 5th spot and a month-long vacant 4th spot. And frankly, this state of flux has produced a lot of hand-wringing. So let's examine this no-Lilly situation:

Starting Pitchers, last three xFIPs (recent to old):

Carlos Zambrano - 4.27, 4.45, 4.62
Ryan Dempster - 3.81, 3.74, 4.25
Randy Wells - 4.24, n/a, n/a

These are good numbers. The numbers we want from our rotation. Now let's look for these numbers among...

The Contenders, last three xFIPs:

Jeff Samardzija - 5.16, 4.34, n/a
Carlos Silva - 5.53, 4.64, 4.57
Sean Marshall - 3.82, 4.25, 4.56
Tom Gorzelanny - 3.74, 5.84, 4.82
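
As a crude first pass at the lists above, here is a sketch that averages each pitcher's available xFIPs and ranks the contenders. A plain average is my own simplification for illustration (it ignores innings pitched and weights all seasons equally), and the names `xfips` and `average_xfip` are just labels I picked.

```python
from statistics import mean

# Last three xFIPs, most recent first, copied from the lists above (None = n/a).
xfips = {
    "Carlos Zambrano": [4.27, 4.45, 4.62],
    "Ryan Dempster": [3.81, 3.74, 4.25],
    "Randy Wells": [4.24, None, None],
    "Jeff Samardzija": [5.16, 4.34, None],
    "Carlos Silva": [5.53, 4.64, 4.57],
    "Sean Marshall": [3.82, 4.25, 4.56],
    "Tom Gorzelanny": [3.74, 5.84, 4.82],
}

contenders = ["Jeff Samardzija", "Carlos Silva", "Sean Marshall", "Tom Gorzelanny"]


def average_xfip(seasons):
    """Unweighted mean of the seasons we actually have numbers for."""
    known = [value for value in seasons if value is not None]
    return round(mean(known), 2)


averages = {name: average_xfip(seasons) for name, seasons in xfips.items()}

# Rank the contenders from lowest (best) average xFIP to highest.
ranked = sorted(contenders, key=lambda name: averages[name])
for name in ranked:
    print(f"{name}: {averages[name]:.2f}")
```

Keep in mind that a pitcher with one or two seasons on record gets the same treatment here as one with three, so the averages for Wells and Samardzija rest on much less data.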

Follow the jump as we take these numbers apart.

Thursday, February 4, 2010

What's Going On, Internet?

First, this video (h/t The Book blog):

Dave Cameron breaks my heart today on Fangraphs, suggesting the Reds will be "right in the thick of things come September."

Also, on THT, Jeremy Greenhouse muses about the linear weights and reliever WAR issues of late. This is a very interesting debate in the sabermetric community right now, and I'm hoping we get some super WAR out of it in the end.

Meanwhile, Phil Birnbaum on Sabermetric Research looks at the Yankees' payroll -- and like most people, does a double-take.

And keep an eye out as we prepare to release our Sabermetric Glossary. Hopefully it will provide a lot of assistance to those who wonder about all the fancy acronyms but don't understand (or care about) how they're produced.

Tuesday, February 2, 2010

An Exercise in Expectations

In my continued effort to spread statistical good-will among Chicago sports blogs, I recently completed a study on Another Cubs Blog concerning expectations based on a previous season’s success (or, more specifically, winning percentage). Check it out!

Here: An Exercise in Expectations

Update: Julie, over at a League of Her Own, muses about Mark Prior and shoulder pain. An excellent read.