Fantasy baseball mailbag: Beware of the dangers of sample size
You ask, I answer. Here are some of my thoughts on the questions that I receive all week at the
Perception isn't always reality.
All of us, at one time or another, get sucked into the sample size morass, yet very few of us emerge from it with our senses intact. Example. Emilio Bonifacio may be the most valuable player in the fantasy game in the month of July as he's hitting .441 with eight steals and 10 runs scored in nine games. So you pick up Bonifacio, ride him until he returns to being the below average hitter he has always been (.259/.319/.333 for his career) and move on.
The problem is, most of the time in just such a scenario you don't realize that the minor player is hot until he's been that way for two weeks, so by the time you actually make the move to pick him up he's already tailing off. It's why, more often than not, going with the more skilled player results in a better outcome than trying to play the hot hand. Two weeks, or even a month, just isn't that long when the season goes on for six months.
As for Stubbs, he is a highly skilled, albeit flawed, player. There's no disputing that he has struggled of late, hitting .206 over his last 10 games and homerless in his last 14 games. Further, the last time he stole a base was June 27. But back to our old friend, sample size. Let's look past his past three weeks and compare his season-long pace this year to the numbers he produced last year.
Stubbs is actually on pace to have a slightly better fantasy season this year than last because of the addition of the steals and runs.
If you drafted Stubbs expecting him to hit .280, you were fooling yourself. If you drafted him thinking he would be as consistent as the sun rising and setting, you were fooling yourself. Stubbs is an all-or-nothing hitter who strikes out too much, and therefore has long stretches of ineffectiveness, but as long as he keeps up his season-long pace his numbers will be just fine at year's end.
I know this isn't a question, but I'll use it as a springboard to mention something that you should all be made aware of -- first and second half splits usually mean little. You might say, "Player A hits 50 points higher in the second half, how could you not care about that?" My response: the split is totally random. Why not choose May 3 through June 29 as the sample to review? Because it's ugly to look at. Using the All-Star Game as a dividing line makes all the sense in the world because it's a natural break point. However, that's all it is -- a natural break point. Let me give you an example.
Let's say Player A is a .250 hitter in the first half, but a .300 hitter in the second half. If you saw an article pointing that out, your natural inclination would be to add that player right now. But should you? Let's say that Player A had exactly 250 second-half at-bats in each of the six years he has been in the league. Again, let's postulate that he is a .300 hitter in the second half. What if I told you the following six second-half batting averages would add up to a .300 second-half average despite looking pretty scary?
.250, .350, .375, .225, .335, .265
Those totals would net you an average of .300, but as you can tell, in two of those years Player A was well below average, and in one season he was league average at best. If you get the player who hits .375 you win your league. If you get the guy who hits .225, well, fantasy football will start soon (we all hope). And don't forget about sample size. Make sure there is enough data at your disposal to truly ferret out what is going on, as two seasons of splits aren't likely to give you a crystal clear outlook on the situation.
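If you want to check the math on Player A yourself, here is a minimal Python sketch. It assumes, as the example does, that every second half had the same 250 at-bats, so a plain mean of the six averages equals the combined average. The variable names are just for illustration.

```python
from statistics import mean

# Player A's six second-half batting averages from the example above.
second_half_avgs = [0.250, 0.350, 0.375, 0.225, 0.335, 0.265]

# Equal at-bats (250) in every second half, so the combined average
# is simply the mean of the six seasons.
combined_avg = mean(second_half_avgs)

# Gap between the best and worst second half.
spread = max(second_half_avgs) - min(second_half_avgs)

# Seasons in which he fell short of his own "second-half" number.
seasons_below = [avg for avg in second_half_avgs if avg < combined_avg]

print(f"Combined second-half average: {combined_avg:.3f}")
print(f"Best-to-worst spread: {spread:.3f}")
print(f"Seasons below the combined average: {len(seasons_below)} of {len(second_half_avgs)}")
```

The combined number looks tidy, but the spread between the best and worst seasons is the part that decides your league.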
Be careful not to buy into a number without checking out the data behind that number. As much as I love numbers, even I know that they can be deceiving at times.
Napoli's usage over the years is one of the more vexing situations in the game. Year after year the guy flat out mashes, yet his manager never seems to have confidence in him. He's not the greatest defensive catcher in the game, though his
Napoli has hit a mere .232 this season, but his OPS is .873, fourth at the position among fellas with at least 180 plate appearances. He's also powered 12 homers with 33 RBI for the Rangers in just 155 at-bats. That's a pace that would net him 36 homers and 99 RBI over 465 at-bats. Why isn't someone willing to use him at catcher, first base and DH to give him 500 at-bats? I guess no team in baseball could use 30 homers.
Honestly, with the way that the Rangers have used him all year, I don't have much faith that they will suddenly start running him out there every day. The best way for that to happen would likely be if he were dealt to a team that understands the talents he possesses.
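The pace math here is a straight-line extrapolation, and it can be sketched in a few lines of Python. The helper name `pace` is hypothetical, and the sketch assumes Napoli keeps producing at exactly his current per-at-bat rate, which is a big assumption given everything said above about sample size.

```python
def pace(stat_total, at_bats_so_far, projected_at_bats):
    """Scale a counting stat linearly to a projected number of at-bats.

    Assumes the per-at-bat rate stays constant, which is a simplification.
    """
    return stat_total * projected_at_bats / at_bats_so_far


# Napoli's line so far: 12 homers and 33 RBI in 155 at-bats.
homers_465 = pace(12, 155, 465)
rbi_465 = pace(33, 155, 465)

# The same rate stretched to the 500 at-bats he'd get as a regular.
homers_500 = pace(12, 155, 500)
rbi_500 = pace(33, 155, 500)

print(f"465 at-bats: {homers_465:.0f} HR, {rbi_465:.0f} RBI")
print(f"500 at-bats: {homers_500:.1f} HR, {rbi_500:.1f} RBI")
```

Since 465 is exactly three times 155, the first projection is just his current totals tripled.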
Remember at the top when I said perception isn't always reality? The perception is that Joyce started out hot and is now a dud, while LoMo is a better all-around hitter. However, is that true?
One thing is completely clear: Joyce was phenomenal, and then poor. Joyce hit .370 with nine homers over his first 51 games, and since then he's hit .163 with three homers. Yikes is right.
For his part, LoMo has also struggled recently. After hitting .320 over his first 32 games, he's hit .221 over his last 37 games. Still, I bet it would surprise many of you out there to learn that Joyce still bests LoMo in runs, batting average, OBP, SLG and OPS.
Joyce: .290-12-41-45-5 with a .351 OBP, .513 SLG, .864 OPS
Surprising, isn't it?
So who would I take the rest of the way? Have you got a coin to flip? I'd go with Morrison, but I admit that it's quite possible that the numbers of the two outfielders this season will end up being pretty similar.