Winegrowing and Baseball – Rotobase

Wine and baseball are intertwined in the best of ways. The growing season and the playing season overlap almost perfectly. Pitchers and catchers report in February, foreshadowing the beginning of spring training. In the vineyard we prune and train our vines in February in anticipation of spring.

Play begins in earnest in April, and hope is in the heart of every fan for the possibilities of a great season. Bud break and initial vine growth occur with the crack of the first bat.

Early season injuries can devastate a team (~cough~ Jose Reyes ~cough~) just as easily as early season frost can decimate a vineyard.

In September and October the seasons wind down and the harvest and playoffs begin amid frantic activity and excitement. A winner is crowned as baby wines are barreled down for the winter.

And when it’s all over and the last leaves fall from the vine, we’re eager to sit down, reflect on the past season and begin looking forward to the new.

I love baseball.

Each year, in the quiet period after crush has ended and before the work of growing begins anew, I take a few weeks to work on a project that both interests me and expands my skill set. Four years ago, along with my family, I decided to start building a winery. Last year I wrote a desktop database client and a companion iPhone app for the winery (BTW, here are my thoughts on the recently announced iPad).

This year my mind turned to baseball. For my wine geek friends who aren’t into baseball (but should be), you can click away now. It’s about to become a baseball stat geekatorium up in here.

Basically I said to myself, “Self, you’ve always wanted your own baseball stats database, and a pretty way to access it. You also need to get a deeper understanding of MySQL and PHP for projects like Help a Winery Out. Why not do that for your yearly project?” To which I replied, “Hell yeah.”

There are so many incredible resources out there for the baseball fan with some technical chops, it’s breathtaking. Retrosheet, for instance, is a complete record of every play made in every game stretching back to the 50s, and they are adding more historical data each year. And it’s completely free. Truly remarkable.

So I downloaded the sucker and got to work building a cool way to interface with it.

Now, dear reader, if you count yourself as one of those baseball purists who don’t sugar the whole fantasy baseball thing, you may want to click away at this point as well. That should leave under ten interested readers. Excellent! You are my peeps.

Here’s what I built. It’s called Rotobase. Like FanGraphs, but for fantasy baseball nuts.

I think I do these projects now as a reaction to being unable to complete my winery project. I feel a very pressing need to complete *something* and “ship” it each year, even if it isn’t a bottle of wine.

Happily, next year will be different. Bottles of wine will finally ship. Which makes me wonder if my desire to do these projects will ship with them.

For now I’ll be competing in the NFBC Auction (nationwide high stakes league) in Vegas in March and using this tool to aid me in my research.

Wish me luck!

Fair use is made of a cropped copy of a photo appearing on Uncork for a Cause

Capozzi Winery

11 Comments → “Winegrowing and Baseball – Rotobase”

  1. Bradley 2 years ago  

    About 49 seconds into this I knew I was awash in baseball geekdom. I love it. Even if I don’t make it to a game this year, I’ll have Rotobase to soothe my fiery unmet needs and perhaps allow me to gain some measure of respect if I choose to return to fantasy ball.
    I smell a comeback.

  2. Jay Ducote 2 years ago  

    Rotobase looks amazing. As a food and drink blogger and rabid fantasy baseball enthusiast, it is great to see some other statistical junkies out there. It looks like you did an amazing job with it. Now the question is, how do I get my hands on it??

  3. Greg 2 years ago  

    Nicely done!

    What stat library do you have working in there? It would be cool to get R running inside of the program to do some crazy modeling of the data.

    Again, nice!

    Greg

  4. Josh Hermsmeyer 2 years ago  

    @ Bradley and Jay,

    There are some IP and ethical issues that would need to be worked out before I released the tool. It would be free, but there is some data in it that I couldn’t remove without crippling it, but would need a contractual agreement to use. We’ll see…

    @ Greg I’m using a Retrosheet database. I also have a PitchF/X database and was looking at rApache as a way to auto-generate R graphs. Need to learn a lot more about R tho. Would love some help when the time comes!

    Thx for the kind words guys.

  5. Craig Gummer 2 years ago  

    Wow, this takes “Who’s on first, What’s on second” to a whole new level… very cool Josh. My favorite fantasy league auction quote ever was a buddy who replied “Hmm, I would have to write him down to cross him off.” This would spare a lot of no-hit weak-throw guys from ever being written down. Play Ball and skål. CG

  6. Craig Gummer 2 years ago  

    Oops, should have added great post! Can’t wait for your next post!

  7. Steve Paulo 2 years ago  

    What part took the longest? I’ve tried to take some time to put something like this together for myself in the past, and the place I always trip over is “lining up” the database (BDB or Retrosheet) with the projections systems, and the projections with each other, with the Rotowire links, the FanGraphs URLs, etc…

    Basically… how did you deal with getting all the different player IDs to match yours? (or really, how did you deal with converting yours to the myriad others)

  8. Steve Paulo 2 years ago  

    Oh, and would you consider releasing the PHP source? Obviously it’s not a working system w/o the data, but I would love to see the code!

  9. Josh Hermsmeyer 2 years ago  

    Steve,

    Check this out!

    “I have found a willing partner in MLBAM to provide all their IDs. I have
    taken the first step of cobbling ID maps of various people and sources,
    and have created the following:

    http://www.insidethebook.com/ee/imag…ORT_ID_MAP.zip

    More details are here:
    http://www.insidethebook.com/ee/inde…_data_project/

    For all of you who have an ID mapping file, PLEASE, download the above,
    link it to yours and:
    1. fix your files if you find errors
    2. tell me about any errors I have
    3. after you fix your errors, submit your ID mapping file to me; I can
    certify it as accurate, or it might help me in finding more errors

    I have so far cobbled data from several sources. My hope is that after 2
    or 3 iterations of this that I will come up with the definitive ID mapping
    file.

    If someone out there works for STATS, then be a good guy and help me out.
    The BIS data looks like it’s complete, as is the Retro and BDB. There is
    still some problems with the MLBAM data (a few dozen corrections
    required).

    Once this is done, we can then link to the biographical information (name,
    handedness, birth, death, school), and MLBAM as I said is a willing
    partner here. MLBAM includes not only the 17,000 or so MLB player ID, but
    another 65,000 or so minors and others player IDs.

    Hopefully, at some point in the near future, we can make it so that we
    have overnight updates any time corrections are made.

    Help me, help you.

    Thanks, Tom”

    Tangotiger to the rescue!

    As to releasing the code, if I’m ultimately unable to figure out a way to make the tool public I’ll definitely post it.
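
For anyone stuck on the same ID-matching problem Steve describes, here is a minimal sketch of how a crosswalk file like the one mentioned above might be used. Everything specific in it is an assumption for illustration: the column names (`retro_id`, `mlbam_id`, `bdb_id`, `proj_hr`), the player IDs, and the two helper functions are all made up, and the real ID map may be laid out differently. The idea is simply to build a lookup from one system's ID to another's, then set aside any rows that fail to match so they can be fixed by hand.

```python
import csv
import io

# Hypothetical ID map. Column names and ID values are invented for
# illustration and are NOT taken from the actual ID map file.
ID_MAP_CSV = """retro_id,mlbam_id,bdb_id
playa001,100001,playera01
playb001,100002,playerb01
"""

# Hypothetical projection rows keyed by a different system's ID.
PROJECTIONS_CSV = """mlbam_id,proj_hr
100001,25
100003,12
"""


def load_crosswalk(text, source_col, target_col):
    """Build a dict translating IDs in source_col to IDs in target_col."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row[source_col]: row[target_col]
        for row in reader
        if row[source_col] and row[target_col]  # skip incomplete mappings
    }


def attach_ids(rows_text, crosswalk, key_col, new_col):
    """Split rows into (matched, unmatched); matched rows gain new_col."""
    matched, unmatched = [], []
    for row in csv.DictReader(io.StringIO(rows_text)):
        target = crosswalk.get(row[key_col])
        if target is None:
            unmatched.append(row)  # keep for manual review
        else:
            matched.append({**row, new_col: target})
    return matched, unmatched


mlbam_to_retro = load_crosswalk(ID_MAP_CSV, "mlbam_id", "retro_id")
matched, unmatched = attach_ids(
    PROJECTIONS_CSV, mlbam_to_retro, "mlbam_id", "retro_id"
)
```

Keeping the unmatched rows instead of silently dropping them is the point of the exercise: those are the players whose IDs need correcting in one file or the other.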
