Wednesday, April 20, 2011

Retrosheet MySQL Queries

Here are a few Retrosheet queries I have found helpful. You can download Retrosheet using the steps outlined here. These queries assume you followed those exact steps. If your columns are named something different, you'll have to change that part of the code a bit. All of this was done by trial and error, so there may be a better way to do the code.

Add Season Column:
This should add in a column on the events table that puts the year next to the Game ID.
ALTER TABLE events
ADD Season YEAR AFTER GAME_ID;
UPDATE events
SET Season = (SELECT MID(GAME_ID,4,4))

Add Pitcher Splits:
This should add pitcher splits per year for Left/Right matchups.
CREATE TABLE Pitcher_Splits(
SELECT PIT_ID,
CONCAT(id.first," ",id.last) AS Pitcher, Season, BAT_HAND_CD AS Batter_Hand,
COUNT(IF(BAT_EVENT_FL='T',1,NULL)) AS PA,
COUNT(IF(AB_FL='T',1,NULL)) AS AB,
SUM(EVENT_RUNS_CT) AS R,
COUNT(IF(H_CD<>'0',1,NULL)) AS H,
COUNT(IF(EVENT_CD='20',1,NULL)) AS 1B,
COUNT(IF(EVENT_CD='21',1,NULL)) AS 2B,
COUNT(IF(EVENT_CD='22',1,NULL)) AS 3B,
COUNT(IF(EVENT_CD='23',1,NULL)) AS HR,
COUNT(IF(RUN1_SB_FL='T',1, NULL)) AS SB1,
COUNT(IF(RUN2_SB_FL='T',1, NULL)) AS SB2,
COUNT(IF(EVENT_CD='6',1,NULL)) AS CS,
COUNT(IF(EVENT_CD='14',1,NULL)) AS BB,
COUNT(IF(EVENT_CD='3',1,NULL)) AS K,
SUM(H_CD) AS TB,
COUNT(IF(DP_FL='T',1,NULL)) AS GDP,
COUNT(IF(EVENT_CD='16',1,NULL)) AS HBP,
COUNT(IF(SH_FL='T',1,NULL)) AS SH,
COUNT(IF(SF_FL='T',1,NULL)) AS SF,
COUNT(IF(EVENT_CD='15',1,NULL)) AS IBB,
COUNT(IF(EVENT_CD='18',1,NULL)) AS RBOE
FROM EVENTS, id
WHERE EVENTS.PIT_ID=id.id AND (BAT_HAND_CD="L" OR BAT_HAND_CD="R")
GROUP BY Season, PIT_ID, BAT_HAND_CD);

Fill Calculated Columns:
This fills in the columns you created above.
UPDATE Hitter_SPLITS
SET BA=H/AB;
UPDATE Hitter_SPLITS
SET OBP=((H+BB+HBP)/(AB+BB+HBP+SF));
UPDATE Hitter_SPLITS
SET SLG=((1B+2*2B+3*3B+4*HR)/AB);
UPDATE Hitter_SPLITS
SET OPS=OBP+SLG;
UPDATE Hitter_SPLITS
SET BABIP=((H-HR)/(AB-K-HR+SF));