Cumulative statistics can be tricky. How often does the baseball world see a prospect or young hitter come up, make a huge splash by hitting 4 home runs in his first 3 games, and then end up back in AAA by mid-season? I've always been curious as to why that happens. Is it adrenaline? Is it because the pitcher has no scouting report on this new hitter and just lobs a 'whatever' right over the heart of the plate?
I think back to Eric Thames and the incredible start he had in 2017, hitting homers like a madman in his 'back for revenge' season after spending the previous three years playing in Korea. Let's just say Thames went off! But it didn't last long. By the end of his first 100 ABs, he started to come back down to earth or, as they say, regress. Eventually, a player regresses toward something closer to his career average.
This post isn't about regression. It simply showcases a function that lets Python users visualize a cumulative statistic, in this case batting average. When you download Baseball Savant game data, there are a few things that need to be fixed up before you can visualize a player's average over the course of the season. See the function below and the comments for all the steps I took.
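A minimal sketch of that kind of cleanup, using only the standard library, might look like the following. It is not necessarily the exact set of steps behind the charts in this post. It assumes the Savant CSV export has one row per pitch with `game_date`, `game_pk`, `at_bat_number` and `events` columns, and that `events` is filled in only on the pitch that ends a plate appearance. The event labels in `HITS` and `AT_BAT_OUTS` are my best reading of the export and may be incomplete, and the function names are just for illustration.

```python
import csv

# Assumed Savant event labels; anything not listed here (walks, hit-by-pitch,
# sacrifices, catcher's interference, baserunning events) is not an official AB.
HITS = {"single", "double", "triple", "home_run"}
AT_BAT_OUTS = {
    "field_out", "strikeout", "grounded_into_double_play", "force_out",
    "double_play", "field_error", "fielders_choice", "fielders_choice_out",
    "strikeout_double_play", "triple_play",
}


def load_savant_csv(path):
    """Read a Savant CSV export into a list of dicts, one per pitch."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


def cumulative_batting_average(rows):
    """Return the running batting average after each official at-bat."""
    # The export is not guaranteed to be oldest-first, so sort by date,
    # then game, then at-bat number within the game.
    ordered = sorted(
        rows,
        key=lambda r: (r["game_date"], int(r["game_pk"]), int(r["at_bat_number"])),
    )
    hits = at_bats = 0
    averages = []
    for row in ordered:
        event = row["events"]
        # Skips mid-at-bat pitches (empty events) and non-AB outcomes alike.
        if event not in HITS and event not in AT_BAT_OUTS:
            continue
        at_bats += 1
        if event in HITS:
            hits += 1
        averages.append(round(hits / at_bats, 3))
    return averages
```

Calling `cumulative_batting_average(load_savant_csv("thames_2017.csv"))` (the file name is hypothetical) gives one value per at-bat, ready to plot.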
Now that you have your data, you've cleaned it up and you're rounding second, it's time to visualize!
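Here is a rough plotting sketch, assuming matplotlib and a list of running averages with one value per at-bat. The function name, axis labels and y-axis range are my own choices for illustration. The function takes an existing Axes object so that several hitters can share one chart.

```python
def plot_cumulative_avg(ax, averages, label):
    """Draw one hitter's running batting average on an existing matplotlib Axes."""
    # x-axis is simply the at-bat count: 1st AB, 2nd AB, ...
    at_bats = list(range(1, len(averages) + 1))
    ax.plot(at_bats, averages, label=label)
    ax.set_xlabel("At-bat number")
    ax.set_ylabel("Cumulative batting average")
    ax.set_ylim(0, 1)  # batting average always falls between .000 and 1.000
    ax.legend()
    return ax
```

With matplotlib installed, `import matplotlib.pyplot as plt`, then `fig, ax = plt.subplots()`, then `plot_cumulative_avg(ax, averages, "Eric Thames, 2017")` and finally `plt.show()` should draw the line.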
Plotting Thames's cumulative average over the course of the season makes it clear that the hot start didn't last long.
Let's take a look at a few younger hitters. These graphs are fun to look at because we don't really know what these players should be expected to regress to; they haven't played much yet. Here you see second-year Juan Soto and rookie Bo Bichette.
NOTE: I dropped each batter's first 2 ABs if they were hits, just to keep 1.000 out of the image.
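One way to read that note in code: strip hits from the front of the at-bat list, up to a maximum of two, and then compute the running average on what is left. This is only a sketch of that reading. The helper names and the list-of-booleans input (`True` for a hit, `False` for any other official at-bat, oldest first) are assumptions for illustration.

```python
from itertools import accumulate


def drop_leading_hits(is_hit, max_drop=2):
    """Remove up to `max_drop` hits from the start of a chronological AB list."""
    dropped = 0
    while dropped < max_drop and dropped < len(is_hit) and is_hit[dropped]:
        dropped += 1
    return is_hit[dropped:]


def running_average(is_hit):
    """Running batting average after each at-bat in the list."""
    hit_totals = accumulate(int(h) for h in is_hit)
    return [round(hits / n, 3) for n, hits in enumerate(hit_totals, start=1)]
```

For a hitter who opens with two hits, an out and another hit, `running_average(drop_leading_hits([True, True, False, True]))` starts the line at .000 and then .500 rather than at 1.000.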
Will these images allow you to deeply evaluate a player or make clear projections? Probably not. But, they do allow you to see that cumulative stats, like a batting average, can tell a story. With vertical markers we could point out injury spells, slumps, streaks and other points in time where there was some kind of influential side note. If you could only look at these graphs, which hitter would you rather have on your team in 2020?
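On those vertical markers: a small sketch of how they could be added, assuming the x-axis is the at-bat count and that you have one ISO date string (YYYY-MM-DD) per at-bat, oldest first. Both helper names are hypothetical. `axvline`, `text` and `get_ylim` are standard matplotlib Axes methods.

```python
from bisect import bisect_left


def at_bat_number_on(dates, target_date):
    """Return the 1-based number of the first at-bat on or after target_date.

    `dates` holds one ISO date string per at-bat, sorted oldest first.
    If target_date is later than every game, the result is len(dates) + 1.
    """
    return bisect_left(dates, target_date) + 1


def add_markers(ax, markers):
    """Draw a dashed vertical line and a rotated label for each (at_bat, text) pair."""
    top = ax.get_ylim()[1]  # put each label at the top edge of the chart
    for at_bat, text in markers:
        ax.axvline(at_bat, linestyle="--", color="gray", linewidth=1)
        ax.text(at_bat, top, text, rotation=90, va="top", ha="right")
```

Something like `add_markers(ax, [(at_bat_number_on(dates, "2019-05-01"), "slump begins")])` would then flag a point in the season. The date and label here are made up for the example.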