Category Archives: Sports

Humans without Vulnerability

Every good story is, at its core, a story about human nature. Who are we? What are we really like?

In order to answer these questions about ourselves, we tell stories that put human beings to test. What happens if the various aspects of human nature get pitted against each other? What happens when we test our human nature against its limits? What happens if you change or remove some vital part of human nature?

Then, once we have concluded a story about human nature, we are left with a question: what does this story say about how we should behave and organize ourselves?

I recently found myself unintentionally but simultaneously binge-watching two stories that tested human nature in a very similar way, but drew completely different conclusions. Those stories were a web serial called 17776 by Jon Bois, and the HBO television drama Westworld.

(WARNING: some mild spoilers follow.)

Both of these stories imagine a near-future where human beings find themselves in an environment where the intrinsic physical vulnerability of human nature has been removed. In the case of 17776, there are some mysterious nano-bots which automatically fix things anytime someone gets sick or hurt. In Westworld, humans interact with robots who they are free to treat however well or badly they like with no repercussions. People can kill the robots, but the robots can’t kill humans. If a robot is killed, they are removed, fixed, and returned to service good as new.

That’s pretty much where the similarity between these two stories ends. The two stories reach very different conclusions about what humans would do if they suddenly became physically invulnerable. Bois imagines that people would spend their days playing and watching increasingly elaborate games of football. Westworld, on the other hand, thinks that people would primarily indulge themselves with sex and violence.

17776 is optimistic about human nature, and the conclusion you could draw from it is that our vulnerability essentially causes us to indulge in behaviors that harm other people. Human nature is essentially good, and if you removed the external sources of our vulnerabilities, there would be no point in bothering to harm anyone else, so we wouldn’t. Westworld, more pessimistically, implies that it is our vulnerability that prevents us from harming others, because we are afraid of reciprocal harm. If you remove that fear, we will all become psychopaths and indulge in orgies of harm. We are by nature essentially wild, dangerous animals that need to be restrained.

Which model of human nature is more correct? It’s hard to know for sure. These are fictional stories. In real life, you cannot simply devise a scientific experiment where you remove vulnerability from human nature and see what happens. Every aspect of human nature (our emotions, our intellectual capacity, our built-in heuristics) is an evolutionary response to the various sorts of vulnerabilities that our ancestors have faced since the creation of the earth. Human nature consists of a complex jumble of behaviors that are not easy to reduce down with a simple two-dimensional A/B test.

But speaking of simple, two-dimensional A/B tests, I will leave it as an exercise for the reader to draw the parallel between these two views of human nature, and the views of human nature that underlie the policies of our two American political parties.

One Small Step Towards a Theory of Pitch Sequencing

Three years ago, I wrote an article called “10 Things I Believe About Baseball Without Evidence”, in which I hypothesized that it ought to be possible to develop some sort of theory of pitch sequencing. To me, pitch sequencing is the very heart of the sport, the chess match between batter and pitcher which makes the sport compelling. But for all our progress in sports analytics in recent years, a theory of pitch sequencing (what it is, how it works, which pitchers are good at it, which batters can be fooled by it) seems as distant as ever.

In that article, I hypothesized (without evidence, as the title suggests) that such a theory would involve somehow understanding that the brain of the batter makes predictions for the next pitch based on previous pitches:

I believe that before any given pitch, the batter is in some sort of Prediction State for the next pitch. After each pitch, the batter then moves into a different Prediction State.

One year after I wrote this evidence-free idea, a piece of evidence came in which supported my hypothesis.

Jeff Hawkins and Subutai Ahmad, who work for a company called Numenta which is trying to reverse engineer the brain with computers, published in October of 2015 a paper called “Why Neurons Have Thousands of Synapses, A Theory of Sequence Memory in Neocortex”.

You can read a nice layperson’s summary of the paper here. But I’ll summarize the summary even further.

Memory in the brain consists of cells called neurons. These neurons have different parts, and one of these parts is called “distal synapses”. Up until this point, nobody really had a good idea what these distal synapses were for, because they didn’t seem to do anything while a particular memory was firing. Hawkins and Ahmad theorize that this is because the distal synapses don’t cause the neuron to fire immediately. Instead, they electrically prepare the cell to fire quickly if a signal comes in from a certain direction. And it is this preparation which allows the brain to make predictions about sequences of events. Relevant quote from the paper:

“Each neuron learns to recognize hundreds of patterns that often precede the cell becoming active. The recognition of any one of these learned patterns acts as a prediction by depolarizing the cell without directly causing an action potential. Finally, we show how a network of neurons with this property will learn and recall sequences of patterns. The network model relies on depolarized neurons firing quickly and inhibiting other nearby neurons, thus biasing the network’s activation towards its predictions.”

And herein lies the physical foundation of a theory of pitch sequencing. For if Hawkins and Ahmad are correct about sequential learning, it means that there is indeed some sort of Prediction State that the brain is in before each pitch.

Once the brain has seen some sort of sequence of inputs, it prepares itself to recognize that sequence again, and to recognize and react to it more quickly the next time it appears, by being electrically primed to react through this neuronal depolarization.

At this point, it’s important to understand that we’re not just talking about sequences of individual pitches here (a curve followed by a fastball followed by a changeup). It can be that, too, but not only that.

A single pitch in and of itself is a sequence of patterns happening that the brain needs to recognize. It’s a windup, and then a release, and then a ball movement out of the hand, and then a spin which one can perhaps recognize, and then a speed and a directional movement of the ball in one way or another.

Each of these patterns and sub-patterns and sub-sub-patterns that compose a pitch are represented in the brain at the neuronal level. As a batter observes sequences of (sub-)(sub-)patterns, the brain automatically prepares itself to see those sequences again by depolarizing the neurons to make them respond faster to these patterns. Thus, from the pitches it has seen in the past, the brain moves into a sort of Prediction State about the pitches it anticipates seeing in the future.

This has the effect, as Hawkins and Ahmad put it, of “biasing the network’s activation towards its predictions”. The batter’s Prediction State has a bias, and pitchers can exploit this bias. The brain is ready to react to some patterns, which it will react quickly to, but at the expense of inhibiting a reaction to other patterns, which it will be slower to react to.

So if you throw three fastballs with the same speed and the same location in a row, the batter’s brain will become more and more prepared/biased to predict that pitch accurately with each subsequent pitch, and the batter becomes more likely to hit the ball hard.

But if pitchers understand what the batter’s brain is biased towards, they can fool the batter by defying that prediction. Throw a changeup to the same location, but with a different speed, and you can make the batter swing too early. The wrong neurons get fired, and the ones that should have fired to hit the ball properly are instead inhibited by the bias, and the batter does the wrong thing.

They say that pitching is an art, and perhaps at this time it is, but this information has the potential to eventually turn it into a science.

* * *

This information doesn’t explain everything about how the brain processes sequencing, obviously. It’s just an initial framework for understanding how the brain learns to understand sequences of events and to predict them. And since we don’t really understand exactly how it works in that general case in the brain, we therefore also don’t understand how it works for the specific case of pitch sequencing.

So if we have unanswered questions about the brain like, “how long does this cell depolarization last?” we also have corresponding unanswered questions about pitch sequencing, like, “how long does a batter remain biased towards a kind of pitch once he has seen it?”

The good news is, we can probably answer the second question without necessarily answering the first. There is data that will tell us how much better a batter gets when he sees the same pitch multiple times, either in a row, or in close proximity. Understanding the basic framework of how the brain works can help us ask better questions about pitch sequencing, and to develop useful theories about how it works, even before the neuroscientists figure out precisely how it works in the brain.
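As a starting point for that kind of question, here is a minimal sketch of how one might tag each pitch with the number of identical pitches that came immediately before it, and then average some outcome by that count. Everything in it is an assumption for illustration: the function names, the idea of representing a pitch as a simple label, and the choice of outcome measure are all hypothetical, not something taken from real pitch-tracking data.

```python
from collections import defaultdict


def repeat_counts(pitches):
    """For each pitch, count how many identical pitches immediately preceded it."""
    counts = []
    run = 0
    previous = None
    for pitch in pitches:
        # Extend the run if this pitch matches the last one, otherwise reset.
        run = run + 1 if pitch == previous else 0
        counts.append(run)
        previous = pitch
    return counts


def outcome_by_repeat(pitches, outcomes):
    """Average an outcome value, grouped by each pitch's repeat count."""
    totals = defaultdict(lambda: [0.0, 0])  # repeat count -> [sum, n]
    for repeats, value in zip(repeat_counts(pitches), outcomes):
        totals[repeats][0] += value
        totals[repeats][1] += 1
    return {r: total / n for r, (total, n) in sorted(totals.items())}


# A made-up pitch sequence, only to show the tagging step.
# "FB" = fastball, "CH" = changeup (hypothetical labels).
print(repeat_counts(["FB", "FB", "FB", "CH", "FB"]))  # [0, 1, 2, 0, 0]
```

A real analysis would need actual pitch-by-pitch data, a definition of "the same pitch" that accounts for speed and location, and some handling of at-bat boundaries. This sketch only covers pitches seen in a row, not pitches seen in close proximity.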

Good luck, baseball analysts.

The Data/Human Goal Gap

As I was writing a letter to my third-grade daughter’s principal in support of a change in homework policy (a letter which I’ve posted here), it occurred to me I was making a point about a phenomenon that isn’t unique to education at all, but happens in a lot of other fields, too: baseball, business, economics, and politics.

I don’t know if this phenomenon has a name. It probably does, because you’re very rarely the first person to think of an idea. If it does, I’m sure someone will soon enlighten me. The phenomenon goes like this:

* * *

Suppose you suck at something. Doesn’t matter what it is. You’re bad at this thing, and you know it. You don’t really understand why you’re so bad, but you know you could be so much better. One day, you get tired of sucking, and you decide it’s time to commit yourself to a program of systematic improvement, to try to be good at the thing you want to be good at.

So you decide to collect data on what you are doing, and then study that data to learn where exactly things are going so wrong. Then you’ll try some experiments to see what effect those experiments have on your results. Then you keep the good stuff, and throw out the bad stuff, and pretty soon you find yourself getting better and better at this thing you used to suck at.

So far so good, eh? But there’s a problem. You don’t really notice there’s a problem, because things are getting better and better. But the problem is there, and it has been there the whole time. The problem is this: the thing your data is measuring is not *exactly* the thing you’re trying to accomplish.

Why is this a problem? Let’s draw a simplified graph of this issue, so I can explain.

Let’s call the place you started at, the point where you really sucked, “Point A”.
Let’s call the goal you’re trying to reach “Point G”.
And let’s call the best place the data can lead you to “Point D”.

Note that Point D is near Point G, but it’s not exactly the same point. Doesn’t matter why they’re not the same point. Perhaps some part of your goal is not a thing that can be measured easily with data. Maybe you have more than one goal at a time, or your goals change over time. Whatever, doesn’t matter why, it just matters they’re just not exactly the same point.

Now here’s what happens:

You start out very far from your goal. You likely don’t even know exactly what or where your goal is, precisely, but (a) you’ll know it when you see it, and (b) you know it’s sorta in the Point D direction. So, off you go. You embark on your data-driven journey. As a simplified example, we’ll graph your journey like this:

[Figure: statsgraph2]

On this particular graph, your starting point, Point A, is 14.8 units away from your goal at Point G. Then you start following the path that the data leads you. You gather data, test, experiment, study the results, and repeat.

After a period of time, you reach Point B on the graph. You are now 10.8 units away from your goal. Wow, you think, this data-driven system is great! Look how much better you are than you were before!

So you keep going. You eventually reach Point C. You’re even closer now: only 6.0 units away from your goal!

And so you invest even more into your data-driven approach, because you’ve had nothing but success with it so far. You organize everything you do around this process. The process, and changes that you’ve made because of it, actually begin to become your new identity.

In time, you reach Point D. Amazing! You’re only 4.2 units away from your goal now! Everything is awesome! You believe in this process wholeheartedly now. The lessons you’ve learned permeate your entire worldview now. To deviate from the process would be insane, a betrayal of your values, a rejection of the very ideas you stand for. You can’t even imagine that the path you’ve chosen will not get any better than right here, now, at Point D.

Full speed ahead!

And then you reach Point E.

Eek!

Egads, you’re 6.0 units away from your goal now. You’ve followed the data like you always have, and suddenly, for no apparent reason, things have gotten worse.

And you go, what on Earth is going on? Why are you having problems now? You never had problems before.

And you’re human, and you’ve locked into this process and woven it into your identity. You loved Points C & D so much that you can’t stand to see them discredited, so your Cognitive Dissonance kicks in, and you start looking for Excuses. You go looking for someone or something External to blame, so you can mentally wave off this little blip in the road. It’s not you, it’s them, those Evil people over there!

But it’s not a blip in the road. It’s the road itself. The road you chose doesn’t take you all the way to your destination. It gets close, but then it zooms on by.

But you won’t accept this, not now, not after the small sample size of just one little blip. So you continue on your same trajectory, until you reach Point F.

You stop, and look around, and realize you’re now 10.8 units away from your goal. What the F? Things are still getting worse, not better! You’re having more and more problems. You’re really, really F’ed up. What do you do now?

Can you let go of your Cognitive Dissonance, of your Excuse seeking, and step off the trajectory you’ve been on for so long?

F is a really F’ing dangerous point. Because you’re really F’ing confused now. Your belief system, your identity, is being called into question. You need to change direction, but how? How do you know where to aim next if you can’t trust your data to lead you in the right direction? You could head off in a completely wrong direction, and F things up even worse than they were before. And when that happens, it becomes easy for you to say, F this, and blow the whole process up. And then you’re right back to Point A Again. All your effort and all the lessons you learned will be for nothing.

WTF do you do now?

F’ing hell!
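For anyone who wants to check the arithmetic of the journey, here is a minimal sketch that reproduces the distances quoted above. The coordinates are an assumption: the post gives only distances, and these particular points were picked because they happen to produce the same numbers on a straight-line path that passes near, but not through, Point G.

```python
import math

# Hypothetical coordinates: not given in the post, chosen only because
# they reproduce the distances quoted in the text.
GOAL = (13, 7)  # Point G
PATH = {
    "A": (0, 0),
    "B": (3, 3),
    "C": (7, 7),
    "D": (10, 10),
    "E": (13, 13),
    "F": (17, 17),
}


def distance_to_goal(point, goal=GOAL):
    """Straight-line distance from a point on the path to the goal."""
    return math.dist(point, goal)


distances = {name: round(distance_to_goal(p), 1) for name, p in PATH.items()}
closest = min(distances, key=distances.get)

for name, d in distances.items():
    print(f"Point {name}: {d} units from Point G")
print(f"Closest approach: Point {closest}")
```

Under these assumed coordinates the distances come out to 14.8, 10.8, 6.0, 4.2, 6.0 and 10.8, with the closest approach at Point D. Every step along the path moves in exactly the same direction, and yet the distance to the goal falls, bottoms out, and then rises again.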

* * *

That’s the generic version of this phenomenon. Now let’s talk about some real-world examples. Of course, in the real world, things aren’t as simple as I projected above. The real world isn’t two-dimensional, and the data doesn’t lead you in a straight line. But the phenomenon does, I believe, exist in the wild. And it’s becoming more and more common as computers make data-driven processes easy for organizations and industries to implement and follow.

Education

As I said, homework policy is what got me thinking about this phenomenon. I have no doubt whatsoever that the schools my kids are going to now are better than the ones I went to 30-40 years ago. The kids learn more information at a faster rate than my generation ever did. And that improvement, I am confident, is in many ways a result of the data-driven processes that have arisen in the education system over the last few decades. Test scores are how school districts are judged by home buyers, they’re how administrators are judged by school boards, they’re how principals are judged by administrators, and they’re how teachers are judged by principals. The numbers allow education workers to be held accountable for their performance, and provide information about what is working and what needs fixing so that schools have a process that leads to continual improvement.

From my perspective, it’s fairly obvious that my kids’ generation is smarter than mine. But: I’m also pretty sure they’re more stressed out than we were. Way more stressed out, especially when they get to high school. I feel like by the time our kids get to high school, they have internalized a pressure-to-perform ethic that has built up over years. They hear stories about how you need such and such on your SATs and this many AP classes with these particular exam scores to get into the college of their dreams. And the pressure builds as some (otherwise excellent) teachers think nothing of giving hours and hours of homework every day.

Depression, anxiety, panic attacks, psychological breakdowns that require hospitalization: I’m sure those things existed when I went to school, too, but I never heard about them, and now they seem routine. When clusters of kids who should have everything going for them end up committing suicide, something has gone wrong. That’s your Point F moment: perhaps we’ve gone too far down this data-driven path.

Whatever we decide our goal of education is, I’m pretty sure that our Point G will not feature stressed-out kids who spend every waking hour studying. That’s not the exact spot we’re trying to get to. I’m not suggesting we throw out testing or stop giving homework. I am arguing that there exists a Point D, a sweet spot with just the right amount of testing, and just the right amount of homework, that challenges kids the right amount without stressing them out, and leaves the kids with the time they deserve to just be kids. Whatever gap remains between Point D and Point G should be closed not with data, but with wisdom.

Baseball

The first and most popular story of an industry that transforms itself with data-driven processes is probably Michael Lewis’s Moneyball. It’s the story of how the revenue-challenged Oakland A’s baseball team used statistical analysis to compete with economic powerhouses like the New York Yankees.

I’ve been an A’s fan my whole life, and I covered them closely as an A’s blogger for several years. So I can appreciate the value that the A’s emphasis on statistical analysis has produced. But as an A’s fan, there’s also a certain frustration that comes with the A’s assumption that there is no difference between Point D and Point G. The A’s assume that the best way to win is to be excruciatingly logical in their decisions, and that if you win, everyone will be happy.

But many A’s fans, including myself, do not agree with that assumption. The Point F moment for us came when, during a stretch of three straight post-season appearances, the A’s traded their two most popular players, Yoenis Cespedes and Josh Donaldson, within a span of six months.

I wrote about my displeasure with these moves in a long essay called The Long, Long History of Why I Do Not Like the Josh Donaldson Trade. My argument was, in effect, that the purpose of baseball was not merely winning, it was the emotional connection that fans feel to a team in the process of trying to win.

When you have a data-driven process that takes emotion out of your decisions, but your Point G includes emotions in the goal of the process, it’s unavoidable that you will have a gap between your Point D and your Point G. The anger and betrayal that A’s fans like myself felt about these trades is the result of the process inevitably shooting beyond its Point D.

Business

If Moneyball is not the most influential business book of the last few decades, it’s only because of Clayton Christensen’s book, The Innovator’s Dilemma. The Innovator’s Dilemma tells the story of a process in which large, established businesses can often find themselves defeated by small, upstart businesses with “disruptive innovations.”

I suppose you can think of the phenomenon described in the Innovator’s Dilemma as a subset of, or perhaps a corollary to, the phenomenon I am trying to describe. The dilemma happens because the established company has some statistical method for measuring its success, usually profit ratios or return on investment or some such thing. It’s on a data-driven track that has served it well and delivered it the success it has. Then the upstart company comes along and sells a worse product with worse statistical results, and because of these bad numbers, the established company ignores it. But the upstart company is on a statistical path of its own, and eventually improves to the point where it passes the established company by. The established company does not realize its Point D and Point G are separate points, and finds itself turning towards Point G too late.

Here, let’s graph the Innovator’s Dilemma on the same scale as our phenomenon above:

[Figure: statsgraph3]

The established company is the red line. They have reached Point D by the time the upstart, with the blue line, gets started. The established company thinks, they’re not a threat to us down at Point A. And even if they reach our current level at Point D, we will be beyond Point F by then. They will never catch up.

This line of thinking is how Blockbuster lost to Netflix, how GM lost to Toyota, and how the newspaper industry lost its cash cow, classified ads, to Craigslist.

The mistake the established company makes is assuming that Point G lies on/near the same path that they are currently on, that their current method of measuring success is the best path to victory in the competitive market. But it turns out that the smaller company is taking a shorter path with a more direct line to the real-life Point G, because their technology or business model has, by some twist, a different trajectory which takes it closer to Point G than the established one. By the time the larger company realizes its mistake, the smaller company has already gotten closer to Point G than the larger company, and the race is essentially over.

* * *

There are other ways in which businesses succumb to this phenomenon besides just the Innovator’s Dilemma. Those companies that hold closely to Milton Friedman’s idea that the sole purpose of a company is to maximize shareholder value are essentially saying that Point D is always the same as Point G.

But that creates political conflict with those who think that all stakeholders in a corporation (customers, employees, shareholders and the society and environment at large) need to have a role in the goals of a corporation. In that view, Point D is not the same as Point G. Maximizing profits for the shareholders will take you on a different trajectory from maximizing the outcomes for other stakeholders in various proportions. When a company forgets that, or ignores it, and shoots beyond its Point D, then there is going to inevitably be trouble. It creates distrust in the corporation in particular, and corporations in general. Take any corporate PR disaster you want as an example.

Economics

I’m a big fan of Star Trek, but one of the things I never understood about it was how they say that they don’t use money in the 23rd century. How do they measure the value of things if not by money? Our whole economic system is based on the idea that we measure economic success with money.

But if you think about it, accumulating money is not the goal of human activity. Money takes us to Point D, it’s not the path to Point G. What Star Trek is saying is that they somehow found a path to Point G without needing to pass through Point D first.

But that’s 200 years into a fictional future. Right now, in real life, we use money to measure human activity. But money is not the goal. The goal is human welfare, human happiness, human flourishing, or some such thing. Economics can show us how to get close to the goal, but it can’t take us all the way there. There is a gap between the Point D we can reach with a money-based system of measurement, and our real-life Point G.

And as such, it will be inevitable that if we optimize our economic systems to optimize some monetary outcome, like GDP or inflation or tax revenues or some such thing, that eventually that optimization will shoot past the real-life target. In a sense, that’s kind of what we’re experiencing in our current economy. America’s GDP is fine, production is up, the inflation rate is low, unemployment is down, but there’s still a general unease about our economy. Some people point to economic inequality as the problem now, but measurements of economic inequality aren’t Point G, either, and if you optimized for that, you’d shoot past the real-life Point G, too, only in a different direction. Look at any historically Communist country (or Venezuela right now) to see how miserable missing in that direction can be.

The correct answer, as it seems to me in all of these examples, is to trust your data up to a certain point, your Point D, and then let wisdom be your guide the rest of the way.

Politics

Which brings us to politics. In 2016. Hoo boy.

Well, how did we get here?

I think there are essentially two data-driven processes that have landed us where we are today. Both of these processes have a gap between what we think of as the real-life goals of these entities, and the direction that the data leads them to. One is the process of news outlets chasing media ratings. And the other is political polling.

In the case of the media, the drive for ratings pushes journalism towards sensationalism and outrage and controversy and anger and conflict and drama. What we think journalism should actually do is inform and guide us towards wisdom. Everybody says they hate the media now, because everybody knows that the gap between Point D and Point G is growing larger and larger the further down the path of ratings the media goes. But it is difficult, particularly in a time where the technology and business models that the media operate under are changing rapidly, to change direction off that track.

And then there’s political polling. The process of winning elections has grown more and more data-driven over recent decades. A candidate has to say A, B, and C, but can’t say X, Y, or Z, in order to win. They have to cast votes for D, E, and F, but can’t vote for U, V or W. They have to make this many phone calls and attend that many fundraisers and kiss the butts of such and such donors in order to raise however many millions of dollars it takes to win. The process has created a generation of robopoliticians, none of whom have an original idea in their heads at all (or if they do, won’t say so for fear of What The Numbers Say.) You pretty much know what every politician will say on every issue if you know whether there’s a “D” or an “R” next to their name. Politicians on neither side of the aisle can formulate a coherent idea of what Point G looks like beyond a checklist spit out of a statistical regression.

That leads us to the state of the union in 2016, where both politicians and the media have overshot their respective Point Ds.

And nobody feels like anyone gives a crap about the Point G of this whole process: to make the lives of the citizens that the media and the politicians represent as fruitful as possible. Both of these groups are zooming full speed ahead towards Point F instead of Point G.

And here are the American people, standing at Point E, going, whoa whoa whoa, where are you all going? And then the Republicans put up 13 robocandidates who want to lead everybody to the Republican version of Point F, plus Donald Trump. The Democrats put up Hillary Clinton, who can probably check all the data-driven boxes more skillfully than anybody else in the world, asking to lead everybody to the Democratic version of Point F, plus Bernie Sanders.

And Trump and Sanders surprise the experts, because they’re the only ones who are saying, let’s get off this path. Trump says, this is stupid, let’s head towards Point Fascism. Sanders says, we need a revolution, let’s head towards Point Socialism.

And most Americans like me just shake our heads, unhappy with our options, because Fascism and Socialism sound more like Point A than Point G to us. I don’t want to keep going, I don’t want to start over, and I don’t want to head in some old discredited direction that other countries have headed towards and failed. I just want to turn in the direction of wisdom.

“It’s not that hard. Tell him, Wash.”

“It’s incredibly hard.”

The Long, Long History of Why I Do Not Like the Josh Donaldson Trade

Once upon a time, about a billion years ago, life was simple. Everybody lived in the oceans, and everybody had only one cell each. This was quite a fair and egalitarian way to live. Nobody really had significantly more resources than anyone else. Every individual just floated around, and took whatever it needed and could find, and just let the rest be.

This golden equilibrium was how life did business for a couple billion years. There was no such thing as jealousy or envy, and as a result, everyone lived pretty happy lives.

Then, one day about 800 million years ago, a pair of single-celled organisms merged to become the first multi-cellular organism in the history of the earth.

At first, these multi-celled creatures were just kind of like big blobs of single-celled organisms, and didn’t cause a lot of problems. Everybody was still kind of doing the same job as everyone else, even if they had organized themselves into a limited corporation of sorts. Most other single-celled creatures just figured they were harmless weirdos hanging out together, and ignored them.

They could not have been more wrong. For once the multi-cell genie was out of the bottle, Pandora’s box could not be closed, and the dominos began to fall. This simple change may have seemed innocent at first, but little did the single-cells know that they were the first creatures on earth to fall victim to the innovator’s dilemma. The single-celled creatures were far too invested in the status quo to change, and consequently ignored the multi-cellulars as irrelevant, and did not realize until it was too late that the game had suddenly shifted.


The Short, Short Josh Donaldson Trade Story Based on Platoon Splits

Ok, look, I told y’all with the Cespedes trade that you can’t analyze an A’s trade of a position player without breaking it down by platoon splits across the whole lineup. But did any of y’all listen to me? No. Y’all are still trying to analyze Donaldson vs Lawrie as if they are single players on single teams instead of two players on two platoon teams with other players on the team. So stop that.

Now look, I’m gonna make this simple. I’m going to assume that both Lawrie and Donaldson will be equally healthy, and they’re roughly comparable defensive players. They may not be, but this is a quick and dirty exercise here, so bear with me. And I’m just going to use OPS, so I don’t have to make this story as long as the other Josh Donaldson story that’s coming later today.

Let us begin.

* * *

OPS 2014/career, Josh Donaldson vs. RHP: .727 / .744
OPS 2014/career, Josh Donaldson vs. LHP: 1.007 / .953

OPS 2014/career, Brett Lawrie vs RHP: .760 / .760
OPS 2014/career, Brett Lawrie vs LHP: .595 / .713

See, Brett Lawrie is actually better than Josh Donaldson against RHPs. The difference is that Donaldson crushes LHPs, and Lawrie for whatever reason actually is worse against LHPs than RHPs. He was particularly bad in 2014. I do not know why.

So for the platoon team that plays 2/3s of the A’s games, the one against RHPs, the A’s lineup actually just got better.

* * *

So now we need to fix the 1/3 of the A’s games against LHPs.

Last year, one of the A’s primary 1B/DHs against LHPs was Alberto Callaspo. He was awful. The A’s have signed Billy Butler to replace him.

OPS 2014/career, Alberto Callaspo vs LHP: .518 / .729
OPS 2014/career, Billy Butler vs LHP: .847 / .912

So the A’s are losing about .400 OPS points by downgrading from Donaldson to Lawrie vs LHPs, but they get back about .300 of those OPS points by upgrading from Callaspo to Butler.

So now all Billy Beane has to do is find that extra .100 points of OPS against LHPs, and the math works. Maybe it will come just out of the fact that most players don’t have reverse splits that last their whole careers, and Lawrie will actually bounce back and hit better against LHPs in the future. If so, QED.
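For anyone who wants to check the back-of-the-envelope numbers, here is a minimal sketch using only the 2014 OPS figures quoted above. The variable names are mine, and simply adding and subtracting OPS across lineup spots is the same quick and dirty shortcut the post takes, not a rigorous method.

```python
# 2014 OPS vs LHP, copied from the splits quoted above.
donaldson_vs_lhp = 1.007
lawrie_vs_lhp = 0.595
callaspo_vs_lhp = 0.518
butler_vs_lhp = 0.847

loss_at_3b = donaldson_vs_lhp - lawrie_vs_lhp   # Donaldson -> Lawrie
gain_at_dh = butler_vs_lhp - callaspo_vs_lhp    # Callaspo -> Butler
still_needed = loss_at_3b - gain_at_dh          # what Beane still has to find

print(f"lost at 3B:   {loss_at_3b:.3f}")    # 0.412
print(f"gained at DH: {gain_at_dh:.3f}")    # 0.329
print(f"still needed: {still_needed:.3f}")  # 0.083
```

So the rounded ".400 lost, .300 back, .100 to go" is in the right neighborhood of the unrounded figures.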

* * *

Disclaimer: the above analysis does not mean I like this trade. I do not like this trade. That (much longer) explanation is here.

10 Things I Believe About Baseball Without Evidence

Well, here we are. The Giants won another World Series, while the A’s flopped in the playoffs yet again. I’m not one of those A’s fans who hate the Giants, but it’s starting to annoy the crap out of even me to see the Giants always succeed in the playoffs, while seeing the A’s always fail.

The A’s have had 14 chances in the last 14 years to win a game to advance to the next round of the playoffs. They have lost 13 of those 14 games. If the playoffs are truly a crapshoot, the odds of this happening are 1-in-1,170. (So it’s not technically always: they could have gone on to lose the 2006 ALDS against the Twins, too, which would have made them 0-for-16, with odds of 1-in-65,536. So if you want to look on the bright side, things could be 56 times worse than they are.) And in a crapshoot, the odds of the Giants winning 10 playoff series in a row, as they have now done, are 1-in-1,024.

So if you’re an A’s fan who hates the Giants, and who believes that the playoffs are just a crapshoot, you’ve been struck with a series of unfortunate events that had literally less than a 1-in-a-million chance of happening.
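The crapshoot arithmetic can be reproduced in a few lines. This sketch assumes every game or series is an independent 50/50 coin flip, and it assumes the 1-in-1,170 figure means exactly one win in 14 tries, which is the reading that reproduces the number. The function name is mine.

```python
from math import comb

def prob_exact_wins(games, wins, p=0.5):
    """Binomial probability of exactly `wins` wins in `games` independent tries."""
    return comb(games, wins) * p**wins * (1 - p)**(games - wins)

as_one_for_14 = prob_exact_wins(14, 1)    # A's: one win in 14 clinching games
as_zero_for_16 = prob_exact_wins(16, 0)   # the hypothetical 0-for-16
giants_ten_straight = 0.5 ** 10           # Giants: 10 series wins in a row
both = as_one_for_14 * giants_ten_straight

print(round(1 / as_one_for_14))               # 1170
print(round(1 / as_zero_for_16))              # 65536
print(round(as_one_for_14 / as_zero_for_16))  # 56
print(round(1 / giants_ten_straight))         # 1024
print(1 / both > 1_000_000)                   # True
```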

Sabermetrics has come up with no good explanation for it except to say, well, these things happen about once every thousand times, or once every million times, sorry A’s fans, it just happened to be your turn to hit that unfortunate lottery, and it’s just bad luck. Oh, and you have a crappy stadium that’s falling apart and a team ownership and a local government who all seem too incompetent to do anything about it unlike those guys across the bay, sorry about that, too, gosh you guys are unlucky, tsk tsk tsk.

Which is just a deeply, deeply unsatisfying answer. If you have an ounce of humanity, you will reject that explanation, and ask the obvious question.

But Why?

And to answer that question, the sabermetrician dives into the numbers, and pulls out some numbers with some number-pulling-out tools, and finds nothing to report. Nope, no evidence here of anything, so it must just be bad luck.

To which I ask: what if the reason the number-pulling-out tools can’t find any cause for the problem is because those number-pulling-out tools themselves are the problem?

I have no evidence of that. But it’s something I believe might be true, even though I can’t prove it.

* * *

I have a number of these beliefs–or hypotheses, if you will–about baseball, but I’ve mostly kept them to myself because of this lack of evidence. What the hell do I know, anyway? Who am I to pontificate? And why bother spouting these theories when I can’t defend them with evidence? So I just keep my mouth shut.

But I got a little bit of self-confidence in my belief system when Robert Arthur of Baseball Prospectus took one of my hypotheses (that injured A’s in the second half of 2014 had begun cheating on fastballs, making themselves vulnerable to offspeed pitches) and found evidence to support it ($):

The overall pattern of changes is beautifully consistent with Ken’s theory…

It’s very satisfying to find that the data supports one’s theory!

But I didn’t just come to this particular hypothesis that Mr. Arthur investigated out of thin air. This hypothesis arose out of a deeper foundation of hypotheses that color the way I look at baseball. I want to put all those hypotheses out on the table now, lack of evidence be damned. And maybe someone (maybe me someday, if I ever find the time and energy and resources and willpower to do so, which hasn’t happened yet) will take those hypotheses and invent the technology needed to find the evidence to support them.

So let’s put it out there.

* * *

Belief Without Evidence #1. A technological Sapir-Whorf hypothesis

The Sapir-Whorf hypothesis, a/k/a the linguistic relativity principle, holds that the language that a person speaks influences the way a person conceptualizes their world. The obvious example of this is that people have trouble distinguishing between colors if their language does not have a word for that color.

To a certain extent, I believe this hypothesis. Being fluent in both Swedish and English, I know there are certain concepts, such as the difference between belief in an opinion and belief in a fact, where the Swedish language makes clear distinctions (tycka and tro) and English does not. English speakers spend ridiculous amounts of time arguing about these things, and Swedes simply don’t need to. It’s not that English speakers can’t conceptualize the difference between opinion and fact, but doing so is way more difficult in English, because the word “belief” in English is quite fuzzy, whereas in Swedish, the language makes it simply impossible to confuse the two.

I touched upon this in my essay in the 2014 Baseball Prospectus annual: I believe a similar concept applies to the technology we use. The reason statistical analysis began to influence the way we conceptualize baseball in the 1990s is not because human beings suddenly became smarter in the 1990s. There were statistically informed people who suggested such analysis almost a century earlier. It happened in the 1990s because the price of the technology needed to perform such analysis had finally become reasonable.

The predominant technology we use to perform such analysis is SQL, which is the primary language used to query relational databases. SQL and relational databases are technologies which are built upon set theory. A set is basically an unordered collection of objects.

And this is where I believe that a technological Sapir-Whorf hypothesis applies to baseball. Practically all of our analysis of baseball statistics treats its data as an unordered collection of baseball events: pitches, plate appearances, games, series. Standard baseball analysis (the public kind anyway, who knows what is being done inside these organizations) treats its data that way because that’s the way SQL treats its data. The available technology guides our conceptualization of the world. And that leads us to my second hypothesis:

Belief Without Evidence #2. Baseball events are NOT unordered

For any batter to hit a ball, the batter needs to predict where the ball is going to be before it reaches the bat. There are two different mechanisms for this prediction.

First, there is a conscious prediction. The batter may decide, consciously, based on some sort of rational analysis, that he is looking for a fastball down and in, and wants to swing at only a pitch in that location that he can pull.

But once the pitcher releases the ball, this kind of conscious prediction mechanism is far, far too slow to be of any use. At this point, everything is turned over to a much faster, subconscious, automatic system to predict the actual flight of the ball, and to send the muscles in motion to meet the ball.

My thoughts here are heavily influenced by Jeff Hawkins’ book On Intelligence, which lays out a framework for how this automatic system in the brain works as a memory-based prediction machine.

Order matters in baseball, because this automatic prediction mechanism has a strong recency bias. (A conscious prediction might not have a recency bias if truly rational, but how often does a batter perform a purely rational analysis at the plate?) The speed, location and movement of the most recent pitch will affect the brain’s automatic prediction of the speed, location and movement of the next pitch. The more recent a pitch, the more it affects the automatic system’s prediction for the next pitch.

Pitch sequencing, therefore, is at the heart of the very sport of baseball, yet it is woefully understudied in current public analysis, because our tools, based on a foundation of unordered sets, are woefully bad at processing and studying sequenced events.
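To make the difference concrete, here is a small sketch of what gets lost when a pitch sequence is treated as an unordered collection. The exponential weighting and the `decay` value are purely my own illustrative assumptions, not a measured model of how any batter's brain works.

```python
from collections import Counter

def unordered_view(pitches):
    """What a set-style aggregate sees: counts only, order thrown away."""
    return Counter(pitches)

def recency_weighted_expectation(pitches, decay=0.5):
    """Weight each past pitch by decay**age, where the newest pitch has age 0.

    `decay` is a made-up parameter chosen for illustration.
    """
    if not pitches:
        return {}
    weights = {}
    for age, pitch in enumerate(reversed(pitches)):
        weights[pitch] = weights.get(pitch, 0.0) + decay ** age
    total = sum(weights.values())
    return {pitch: w / total for pitch, w in weights.items()}

seq_a = ["fastball", "fastball", "curve"]
seq_b = ["curve", "fastball", "fastball"]

# Same pitches, different order: the unordered view cannot tell them apart.
print(unordered_view(seq_a) == unordered_view(seq_b))  # True

# The recency-weighted view can.
print(recency_weighted_expectation(seq_a))
print(recency_weighted_expectation(seq_b))
```

In this toy version, the batter who just saw a curve leans toward expecting another curve, while the batter who just saw two fastballs leans heavily toward a fastball, even though both have seen the identical set of three pitches.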

There is a whole industry now dedicated to the statistical analysis of baseball using these set-based SQL tools. But SQL does not have a recency bias clause in its syntax that you can apply to a query. Because these tools don’t handle the ordered data well, they basically ignore The. Very. Core. of the sport: the sequencing battle between pitcher and batter.

Let me say that again: statistical analysis (that we in the public are aware of) takes the most important element of the sport, and ignores it.

It’s like having Newtonian physics without relativity and quantum mechanics. There’s a lot you can do with Newtonian physics, but at the extremes, it begins to break down, because it is ignoring some deeper, more fundamental truths.

If you’re a team that relies on constructing its roster using such statistical analysis, what mistakes are you making by ignoring the most important part of the game?

Belief Without Evidence #3. All high-level sabermetric truths derive from lower-level truths about human biomechanics and psychology

And not vice versa. Things like platoon splits and home field advantage are not Constants of the Universe like the speed of light or the Planck-Einstein relation. They arise from more fundamental truths about human anatomy and psychology.

For instance, I once got in an argument in which I did not believe that Sean Doolittle pitched better to certain catchers than others. The stats did not agree with me, albeit perhaps with a small sample size. But my objection wasn’t to the numbers, adequate sample size or not; it was to the lack of any sort of underlying physical/psychological mechanism from which these numbers could derive. Sean Doolittle throws 90% fastballs. What the hell difference physically/psychologically does it make what catcher is back there catching it? It’s the same pitch, no matter who is catching it.

I do not consider a sabermetric truth to really be a truth unless there is a biomechanical/psychological foundation upon which that truth can rest, and from which that truth is capable of being derived.

Belief Without Evidence #4. Pitches are paths between states in a Prediction State Automaton

First, a little explanation of automata:

Automata theory is used in computer science to study states. For example, you can look at baseball as a base/out automaton, where before each plate appearance, the base/out combination is in one “state”, and in another “state” after the plate appearance. There are rules that tell you what possible states you can be in before and after a plate appearance.

So, at the beginning of an inning, the baseball base/out automaton is in a {Nobody on, 0 out} state. After the first plate appearance, you will be in one of five possible states:

{Runner on 1st, 0 out}
{Runner on 2nd, 0 out}
{Runner on 3rd, 0 out}
{Nobody on, 0 out} (Batter homered)
{Nobody on, 1 out}

You can’t, after the first plate appearance, reach a state where there are two runners on or two outs. You have to go to an intermediate state first. There are exactly 24 possible states you can have in this automaton. Each state in this automaton is a two dimensional {base, out} object. And from any of these 24 possible states, there are a limited, finite number of possible following states.

The “automaton” then, defines what possible states can exist, and the rules by which you can move from one state to another.
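Here is a minimal sketch of that base/out automaton. The state representation and the `successors` rule are my own simplifications: the rule only does a head count (every runner plus the batter has to end up on base, out, or across the plate), so it is a necessary condition for a legal transition, not a complete rulebook. It also leaves out the inning-ending third out, just as the 24-state count does.

```python
from itertools import product

# A state is (bases, outs); bases is a tuple of booleans for (1st, 2nd, 3rd).
BASES = list(product([False, True], repeat=3))   # 8 ways to occupy the bases
STATES = [(bases, outs) for bases in BASES for outs in range(3)]

START = ((False, False, False), 0)               # {Nobody on, 0 out}

def successors(state):
    """States that pass a simple head-count check after one plate appearance.

    Assumption: runners already on base plus the batter must each end up
    on base, out, or scored. This does not check that runners never move
    backwards, so it can allow a few transitions real baseball forbids.
    """
    bases, outs = state
    people = sum(bases) + 1
    result = []
    for next_bases, next_outs in STATES:
        outs_added = next_outs - outs
        if outs_added >= 0 and sum(next_bases) + outs_added <= people:
            result.append((next_bases, next_outs))
    return result

print(len(STATES))             # 24
print(len(successors(START)))  # 5
```

From the starting state, the five successors are exactly the five listed above, including the return to {Nobody on, 0 out} when the batter homers.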

Got it?

OK, now to the thing I believe without evidence: I believe that before any given pitch, the batter is in some sort of Prediction State for the next pitch. After each pitch, the batter then moves into a different Prediction State.

I don’t have a clear belief on exactly how many dimensions these Prediction States have. Maybe the Prediction State has three dimensions to it:

1. Whether to swing
2. When to swing
3. Where to swing

Or maybe these Prediction States are much more complex, combining the above three states with specific kinds of pitches and movements and locations. It may be expressed by something like this, for example:

{60% expectation of fastball, 30% changeup, 10% curve;
80% outside, 10% middle, 10% inside;
60% down, 30% middle, 10% up;
70% in the strike zone, 30% out of the strike zone}

and then if the pitcher throws you a fastball on the lower outside corner for a strike, perhaps you move to a state like this:

{70% fastball, 20% changeup, 10% curve;
85% outside, 8% middle, 7% inside;
70% down, 20% middle, 10% up;
75% in the strike zone, 25% out of the strike zone}

Or whatever. I don’t really know what the parameters for these Prediction States should be. Is it {pitch type, in/out, up/down, movement/straight, fast/slow} or some other combination of pitch attributes? I don’t know.

And to what extent are these prediction automata more or less universal, or does each batter have his own unique automaton with its own unique rules? Again, I don’t know.
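Just to show what a Prediction State transition might look like in practice, here is a sketch that stores the example state above as four probability distributions and nudges each one toward the pitch that was just thrown. The update rule (simple exponential smoothing) and the `rate` of 0.25 are assumptions of mine, picked only because they land the leading numbers near the example. The dictionary keys are invented labels too.

```python
def update(belief, observed, rate=0.25):
    """Shift probability toward the observed outcome (exponential smoothing).

    `rate` is a hypothetical parameter: 0 means ignore the pitch entirely,
    1 means expect nothing but a repeat of it.
    """
    return {
        outcome: (1 - rate) * p + (rate if outcome == observed else 0.0)
        for outcome, p in belief.items()
    }

# The starting Prediction State from the example above.
state = {
    "type": {"fastball": 0.60, "changeup": 0.30, "curve": 0.10},
    "horizontal": {"outside": 0.80, "middle": 0.10, "inside": 0.10},
    "vertical": {"down": 0.60, "middle": 0.30, "up": 0.10},
    "zone": {"strike": 0.70, "ball": 0.30},
}

# A fastball on the lower outside corner for a strike.
pitch = {"type": "fastball", "horizontal": "outside",
         "vertical": "down", "zone": "strike"}

next_state = {dim: update(belief, pitch[dim]) for dim, belief in state.items()}

for dim, belief in next_state.items():
    print(dim, {k: round(v, 3) for k, v in belief.items()})
```

A real batter's update rule, if there is one, is presumably far messier than this, and the remaining numbers will not match the hand-written example exactly.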

But I do know that if I were to build a technology for analyzing baseball, this is where I would begin, right at the core of the game, the engine that drives the sport: what pitch the batter is expecting from the pitcher, and what happens when the pitch he gets conforms to or deviates from that expectation.

In order to unite the quantum and Newtonian versions of baseball analysis, the biophysical and the statistical, any Grand Unified Theory Of Everything Baseball must, in my belief, have some way to handle the Prediction State of the batter.

Belief Without Evidence #5: The quality of a pitch is a function of its speed, location, and movement, and also of the batter’s swing and prediction state

There are a few pitchers, like Aroldis Chapman, who can throw a pitch with such high-quality speed that the location, movement, and prediction state are rather irrelevant. And there are some, like Mariano Rivera, who have such a combination of high-quality location and movement that the speed and prediction state don’t matter much. With pitchers like that, the batter can predict perfectly what pitch he’s going to get, and still not hit it.

But most pitchers do not possess such a high-quality pitch that they can be predictable and get away with it at the Major League level. They need to manipulate the prediction state of the batter in order to succeed.

The less a batter is expecting a certain pitch, the less likely he is to make good contact. But pitching is not just a function of being unpredictable: the pitcher must balance what the Prediction State of the batter is and the batter’s ability to hit it, with his ability to also throw a pitch with good speed, location, and movement.

The complex nature of that 5-dimensional object ( {speed, location, movement, swing, prediction state} ) is what makes baseball so fascinating from pitch to pitch.

So for each pitch, the pitcher wants to:

1. Choose a pitch the batter is likely to predict incorrectly
2. Choose a pitch the pitcher is likely to throw with good speed, location, and movement
3. Choose a pitch which will result in a suboptimal swing path, resulting either in a miss or weak contact
4. Choose a pitch which, if not put in play, worsens the batter’s Prediction State for the next pitch

Belief Without Evidence #6: The quality of an at-bat is a 3-dimensional function

Those three dimensions being:
1. Getting a good pitch to hit
2. Hitting a ball hard when you do
3. Hitting a ball hard if you don’t.

A good pitch to hit is a pitch that (a) the batter is successfully predicting, and (b) he can get a good swing on. Whether he can get a good swing on a particular pitch depends on what his swing path is.

And again, there are two kinds of predictions: the automatic subconscious one where the batter just reacts to a pitch, and a conscious one where the batter decides beforehand to look for a certain pitch and ignore all others. And the count plays a big role in whether the batter can take an approach to consciously look for a particular pitch, or whether he should (with two strikes especially) just let his subconscious react to whatever comes in there.

On the subconscious level, the more the pitcher keeps throwing the same pitch, the more the batter predicts that pitch accurately, and the more likely the batter is to hit that pitch. When pitchers talk about “establishing the inside fastball” for example, this is what they mean: to change the Prediction State in such a way that an inside fastball becomes part of the Prediction State, and thereby necessarily reduces the expectation of a different pitch in the future.

Just because a batter gets a pitch he is predicting, does not mean he will hit it. Most batters have some kind of hole in their swing. Some batters prefer high pitches, others low. Some are vulnerable inside, and others can’t hit the outside pitch well. Some can hit a fastball, but can’t time an offspeed pitch. Others have a slow bat speed and struggle with fastballs, but feast on the slower pitches.

So for each pitch, the batter wants to:
1. Predict a pitch correctly
2. Swing at a pitch if it lets him approximate his optimal swing path
3. Take a pitch if it would cause a suboptimal swing path (unless 2 strikes in zone)
4. Take pitches out of the zone to move to a better Prediction State for the next pitch
5. If in a 2-strike situation, make contact (foul or fair) on a pitch in the zone

Belief Without Evidence #7: SQL-reliant GMs don’t value the third dimension of #6 enough

In a vast sea of unordered pitches from an unordered group of pitchers, you will get a randomly-distributed plethora of good pitches to hit, so the numbers will all work out in the end. So you acquire hitters based on these vast seas of data, ignoring what the batter does with difficult pitches to hit, because in the long run, they don’t matter much.

But against a good pitcher on a good day who does not give you a good pitch to hit, what do those batters do? Do they hit a ball hard if they don’t get a good pitch to hit?

To me, the biggest difference between the A’s in the playoffs and the Giants in the playoffs is Pablo Sandoval. Because there may not be anyone in baseball right now better than Sandoval at doing damage even when he does not get a good pitch to hit. He can turn pitches in the dirt, in his eyes, and/or six inches off the plate into a hit. He’s almost immune to prediction state manipulation by opposing pitchers. And Hunter Pence, though not as extreme as Sandoval, has similar characteristics.

The A’s simply do not pursue those types of players. Players like Sandoval tend to have low OBPs, because they swing at so many bad pitches. Minor leaguers with that profile flop far more than they succeed, so they’re a bad risk to take. But there are times, against a good pitcher on a good day who is simply not giving hitters a good pitch to hit, that it is valuable to have a player who often does damage even with a bad pitch to hit. And those times happen more often in the playoffs.

A technology that used a system of evaluating players in which high-level statistics of player value were derived from a low-level {speed, location, movement, swing path, prediction state} matrix would better identify the true value of such players.

Belief Without Evidence #8: Diversity is Good for Batting Lineups

This belief is related to the belief about the definition of the quality of a pitch, and to the belief of a biomechanical/psychological foundation to all of this. A lineup with too many batters with similar strengths and weaknesses can make it easier for a pitcher to settle into a psychological/mechanical rhythm and mow down such a lineup. A lineup that is diverse (some hit fastballs, some like it inside, or low, some slug, others make contact, etc.) makes a pitcher have to change his approach from at-bat to at-bat. That forces the pitcher to have to make a variety of quality pitches in order to win. It’s harder for a pitcher to win if he has to have multiple pitches working well.

So when I praised the Giants for having Pablo Sandoval, I did not mean that an entire team of hitters like Pablo Sandoval would be ideal. But having one or two guys like him in a lineup with some more patient-type hitters is a good thing.

Belief Without Evidence #9: A lineup without holes scores runs exponentially, not linearly

This is probably the easiest of my hypotheses to disprove. But I have the gut feeling that one guy who is an automatic out in the middle of a lineup can take a rally that might score five runs and drop that rally down to 0 or 1 runs.

I think we saw this play out with the 2014 Oakland A’s. At the beginning of the year, everyone in the lineup was healthy and hitting somewhat near or above expectations. The A’s were just killing it in the pythagorean win column, because they’d get a rally going and that rally would just keep going and going.

But then Josh Donaldson and Brandon Moss started having some nagging injuries, and Moss in particular became pretty much an automatic out for a month or two. Those five-run rallies, once plentiful, almost instantly disappeared. Every rally seemed to be killed by a terrible at-bat in the middle of it.

Almost every team has a hole in the lineup at any given time, someone who is slumping for whatever reason. So for most teams, run scoring appears to be linear. But in those rare cases when everyone is clicking at the same time, their run scoring graph turns like a hockey stick and shoots upward.

The A’s success early in the year depended on the lineup being holeless, and when holes appeared, the whole thing collapsed back from exponential scoring into linear.
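Since I said this one should be easy to disprove, here is the kind of toy simulation someone could start from. Everything in it is a made-up simplification: every plate appearance is either an out or a single, every single moves each runner up exactly one base, and the .350 on-base figure is invented. It proves nothing about real baseball. It only shows how you would begin to compare a holeless lineup with one that has an automatic out in the middle of it.

```python
import random

def simulate_runs_per_game(on_base_probs, games=2000, seed=0):
    """Average runs per 9-inning game in a singles-only, station-to-station model."""
    rng = random.Random(seed)
    total_runs = 0
    for _ in range(games):
        batter = 0                      # the batting order carries over between innings
        for _inning in range(9):
            outs, runners = 0, 0        # runners = number of occupied bases
            while outs < 3:
                p = on_base_probs[batter % len(on_base_probs)]
                if rng.random() < p:
                    if runners == 3:
                        total_runs += 1  # bases loaded: a single forces in a run
                    else:
                        runners += 1
                else:
                    outs += 1
                batter += 1
    return total_runs / games

holeless = [0.35] * 9                          # hypothetical healthy lineup
with_hole = [0.35] * 4 + [0.0] + [0.35] * 4    # automatic out batting fifth

print(simulate_runs_per_game(holeless))
print(simulate_runs_per_game(with_hole))
```

To actually test the hypothesis, you would want to compare the lineup with a hole against one where the same total loss of on-base ability is spread evenly across all nine hitters, and you would want a far more realistic model of extra-base hits and baserunning than this one.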

Belief Without Evidence #10: A’s fans are magical elves

I’ve been playing in my mind lately with the idea that A’s fans are like the house elves in the Harry Potter stories.

We exist so that others may abuse us. The greatest triumphs of others often come at our expense. We dress in ratty clothing (stadium). Yet despite this constant abuse, we are fiercely loyal to our master. We attack viciously anyone who dares attack our master. We perform magic (great stadium atmosphere) on their behalf, no matter how awfully our masters treat us in return.

If ever we were given clothing (a new stadium) by our master, we would be free of our bondage. Some, like Dobby, desire this, but others would not know what to do with themselves with freedom and wealth. It would ruin the very essence of their being.

I used to be like Dobby, longing for the freedom that a World Series victory and/or a new stadium would bring. But now, I am beginning to feel that the other elves are right — that it is wrong to support S.P.E.W. and long for something that would destroy who we are.

We are meant to suffer, so that other wizards may have their glories. We are elves. Let us be that we are and seek not to alter us.

Hubris

I believe the evidence is clear enough to tell us this much: We were created not by a supernatural intelligence but by chance and necessity as one species out of millions in Earth’s biosphere. Hope and wish for otherwise as we will, there is no evidence of an external grace shining down upon us, no demonstrable destiny or purpose assigned us, no second life vouchsafed us for the end of the present one. We are, it seems, completely alone.

Edward O. Wilson

In Sophocles’ play Oedipus the King, the title character hears a rumor that he may not be what he thinks he is: the son of Polybus and Merope, the King and Queen of Corinth. Polybus and Merope deny the rumor, but Oedipus seeks external confirmation, and visits the Oracle at Delphi. The oracle ignores his question, and instead prophesies that he will kill his father and wed his mother.

Oedipus has no evidence he is not his parents’ son. He has no evidence to suggest he will eventually kill Polybus and marry Merope. But the latter is a much bigger problem than the former, so Oedipus ignores the first small problem and acts on the second, leaving Corinth forever, so as to avoid this horrible fate. He then proceeds to live his life as if he had solved his problem. And, of course, because this is a Greek tragedy, he hadn’t.

Rumors are not facts. Prophecies are not proven theorems. Yet it is not true that Oedipus had no evidence that he was not his parents’ son. He had the rumor. He had the prophecy. In a Bayesian sense, he should have considered the odds of his being adopted as having increased from close to 0% before hearing the rumor and the prophecy, to what–1%? 10%? 25%?–afterwards.

The odds being less than 50%, however, the logical thing for Oedipus to do when faced with any given binary decision is to act as if the rumor was false. That’s the choice that gives him the best odds of succeeding, based on the information he has.

 

Hubris is extreme pride and arrogance shown by a character that ultimately brings about his downfall.

Hubris is a typical flaw in the personality of a character who enjoys a powerful position; as a result of which, he overestimates his capabilities to such an extent that he loses contact with reality. A character suffering from Hubris tries to cross normal human limits and violates moral codes.

–Definition of Hubris from Literary Devices

Is it extreme pride and arrogance to make the most logical decision? If so, then the human condition is tragic no matter what decisions we make.

If we choose with the odds based on the best information we have, we risk making a catastrophic decision because we lacked a critical piece of data. If we choose out of rumor and superstition and fear, we risk living a life where bad decisions compound themselves with every choice we make, and we end up living a suboptimal life.

The more successful we are, however, the more likely we are to make the catastrophic decision that results in a classical, Greek-style tragedy. With every successful decision we make, the less likely it is, in a Bayesian sense, that we are lacking that critical piece of information, and the more likely it is, in a Bayesian sense, that our decision-making process is sound.

If you have a decision-making algorithm, and you’re 50% sure it’s good, and then you test it, and it works, now you’re, what–51%? 55%? 60%?–sure that it works. Test it again and it works again, and the odds rise again. Eventually, if you reach the top of a hierarchy and stay there, you get really confident that you know what you’re doing. You’re the king!
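To put some numbers on that climb, here is a sketch of the Bayesian update. The two likelihoods are pure assumptions of mine: a good algorithm passes a test 60% of the time, and a bad one still gets lucky 50% of the time. Change those guesses and the pace of the climb changes with them.

```python
def posterior_good(prior, p_success_if_good, p_success_if_bad):
    """Bayes' rule: probability the algorithm is good, given one more success."""
    evidence = prior * p_success_if_good + (1 - prior) * p_success_if_bad
    return prior * p_success_if_good / evidence

belief = 0.50                 # "50% sure it's good"
history = [belief]
for test_number in range(1, 6):
    # Hypothetical likelihoods: 0.60 if good, 0.50 if bad.
    belief = posterior_good(belief, 0.60, 0.50)
    history.append(belief)
    print(f"after success #{test_number}: {belief:.1%}")
```

With these particular guesses, one success moves you from 50% to about 55%, and every further success pushes the number higher. No single step feels like arrogance, which is rather the point.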

Hubris, then, is the logical result of success. In every form of competition, somebody has to reach the top. The closer to the top you get, the more likely it is that you think your success is because of your knowledge and your decision-making process. The more you become certain that your data and your process are sound, the more you should logically make bigger and bigger bets based on that data and that process. And because of those bigger and bigger bets, the harder you will fall if and when it turns out that your data and/or your decision-making process was flawed.

 

But if you look at the impact those trades have on this particular team’s offense, it’s negligible. Offensively, the numbers tell us that losing Cespedes is no big deal.

Ken Arneson

If you look at Yoenis Cespedes statistically, there’s no real evidence that trading him would hurt the A’s very much. His numbers are mediocre, and easily replaced.

But looking back on the trade now, it feels like the A’s and their fans were focused on the wrong prophecy. The prophecy that a superstar ace pitcher was the missing piece to Moneyball. The significant rumor, the important piece of Bayesian evidence that we ignored was this: that the 2012-14 A’s team was not a product of Billy Beane’s genius. That this team played like complete and utter crap for five years, and then Yoenis Cespedes showed up, and it suddenly and immediately became good. That for 2 1/2 years, when Cespedes was in the lineup, the team played well, and when he was out of the lineup, the team played like crap, regardless of how well Cespedes was playing.

And then Beane, in his moment of hubris, trusting the logic and the data and the decision-making process that had made a best-selling book and a Hollywood movie of his life and had seemingly landed him in first place for 2 1/2 years, traded Cespedes away, and the team reverted immediately to playing like complete and utter crap again.

Could this Cespedes anomaly possibly, actually be a real thing? No one can explain it. The fans don’t know why this Cespedes anomaly exists, and all the statisticians don’t know why, and Bob Melvin doesn’t know why, and Billy Beane doesn’t know why. There is no evidence! It’s just rumor, innuendo, speculation, unfactual gobbledygook, completely illogical bullshit ex-post-facto rationalization.

But it’s there. It exists. It hurts to look at it. And it has all of us A’s fans wanting to poke our eyes out.

The gods hate us. They want to punish us for our pride and arrogance.

And you may say, gods are superstitious nonsense, that there is no evidence of an external wrath raining down upon us, no demonstrable cruel destiny or fate assigned us, no eternal Sisyphean existence vouchsafed us for the end of the present one.

And that’s true. There is no evidence for the existence of God, or gods. Except for the small, annoying, persistent rumor that at this particular point in time, we are here.

The Yoenis Cespedes Trade

The Oakland A’s made a huge trade yesterday, sending their biggest name, Yoenis Cespedes, and a draft pick to the Boston Red Sox for Jon Lester and Jonny Gomes. They also made a smaller trade, sending Tommy Milone to the Minnesota Twins in exchange for Sam Fuld. Of course, the sports world was abuzz from the Cespedes trade, which stunned many.

A couple of things left me unsatisfied about the reactions I’ve seen to the Cespedes trade. One is an old idea, expressed in Moneyball back in 2002: you don’t try to replace Giambi/Cespedes with one player, you replace him with other players in aggregate across the roster. The other is a newer idea: the A’s platoon so much that you can’t just analyze A’s players as atomic units. You can’t just say X is a 5 WAR player and Y is a 2 WAR player, and X – Y = 3 WAR. You have to break them down into their platoon split components, because the A’s use platoons far more efficiently than is baked into most of these formulas.

For example, if you look at Jonny Gomes as an atomic unit, he has suffered a severe decline this year. He’s hitting .234/.329/.354 this year, a far cry from the .262/.377/.491 he hit with the A’s in 2012, and in no way close to being able to replace Cespedes’ production. However, if you break Gomes down into platoon splits, you can see that his decline is entirely against right-handed pitching, where he is hitting a godawful .151/.236/.258 this year. Against left-handed pitching, however, he is still hitting a very healthy .302/.400/.431. A’s manager Bob Melvin is a master at getting the platoon advantage for his players, so we can bet we won’t see much of Jonny Gomes against RHPs.

So what I want to see is an analysis that really looks at the A’s as two teams: one team against RHPs which plays 72% of the time, and another team against LHPs which plays 28% of the time. Let’s look at those teams before and after the trade, and see how much the trades affected those two teams, even if we calculate these things in a kind of quick and dirty fashion.

To do that, you need to project performance by splits, which isn’t easy to find. PECOTA has a Marcel-like calculation called “Platoon multi”. Dan Szymborski pointed me to a platoon projection spreadsheet he created for his ZiPS projection. So I took that pre-season projected data, and combined it with their 2014 performance in a spreadsheet, to create a rest-of-season projection. (Okay, that wasn’t so quick, so the rest of this will be kind of dirty. We don’t have to be precise here, we just want a ballpark understanding of what’s going on.)

There’s another complicating factor here, in that the A’s currently have three players who are injured: Coco Crisp, Craig Gentry, and Kyle Blanks. Plus, Stephen Vogt has an injury that prevents him from catching, but not from playing 1B or OF. So we’re going to run one set of numbers assuming everyone is healthy, and another assuming these injuries. Here are the best-hitting lineups (not by batting order, but sorted by GPA, from best player to worst). We’ll make removed (traded or optioned) players red, and added players blue.


Healthy lineup vs LHP: (position,obp,slg)

Donaldson (3b, .373, .604)
Norris (c, .399, .519)
Gomes (dh, .380, .440)
Cespedes (lf, .332, .473)
Crisp (cf, .353, .411)
Moss (rf/dh, .326, .439)
Blanks (1b, .336, .407)
Gentry (lf/rf, .348, .361)
Lowrie (ss, .320, .395)
Callaspo (2b, .304, .324)

Bench: Fuld, Vogt, Burns, Reddick, Punto, Jaso, Sogard.

Estimated runs per game, new lineup: 5.266
Estimated runs per game, old lineup: 5.218

The offense improves vs LHPs, because Gomes is actually slightly more productive than Cespedes, thanks to his high OBP. The defensive effect is that Moss gets moved from DH into the outfield, because he’s a better fielder than Jonny Gomes, but not a better fielder than Cespedes.
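For anyone who wants to check the ordering, here is a rough sketch of the GPA sort for the healthy lineup vs LHP. It assumes the standard Gross Production Average formula, (1.8 × OBP + SLG) / 4, and uses the OBP/SLG figures listed above; the function and variable names are just mine for illustration.

```python
# Gross Production Average, assuming the standard formula: (1.8 * OBP + SLG) / 4
def gpa(obp, slg):
    return (1.8 * obp + slg) / 4

# Projected (OBP, SLG) vs LHP for the healthy lineup, in alphabetical order
healthy_vs_lhp = {
    "Blanks": (0.336, 0.407),
    "Callaspo": (0.304, 0.324),
    "Cespedes": (0.332, 0.473),
    "Crisp": (0.353, 0.411),
    "Donaldson": (0.373, 0.604),
    "Gentry": (0.348, 0.361),
    "Gomes": (0.380, 0.440),
    "Lowrie": (0.320, 0.395),
    "Moss": (0.326, 0.439),
    "Norris": (0.399, 0.519),
}

# Sort from best hitter to worst by GPA
ranked = sorted(
    healthy_vs_lhp,
    key=lambda name: gpa(*healthy_vs_lhp[name]),
    reverse=True,
)

for name in ranked:
    obp, slg = healthy_vs_lhp[name]
    print(f"{name:10s} {gpa(obp, slg):.3f}")
```

By this measure Gomes comes out a bit ahead of Cespedes vs LHPs, since the formula weights OBP more heavily than SLG.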


Healthy lineup vs RHP:

Jaso (dh, .372, .452)
Moss (lf/1b, .333, .510)
Reddick (rf, .325, .458)
Vogt (1b/c, .328, .422)
Cespedes (lf, .302, .453)
Crisp (cf, .321, .417)
Lowrie (ss, .329, .395)
Donaldson (3b, .321, .404)
Callaspo (2b, .333, .351)
Norris (c, .331, .353)

Bench: Blanks, Gentry, Fuld, Sogard, Punto, Gomes, Burns.

Estimated runs per game, new lineup: 4.810
Estimated runs per game, old lineup: 4.841

Losing Cespedes against RHPs has a more noticeable effect. Gomes and Cespedes are equivalent players vs LHPs, but the gap between Cespedes and his replacement against RHPs, Derek Norris, is larger, and creates a slight loss of runs per game. It also shifts Vogt and Moss around defensively to get Norris into the lineup.


Injured lineup vs LHP: (position,obp,slg)

Donaldson (3b, .373, .604)
Norris (c, .399, .519)
Gomes (dh, .380, .440)
Cespedes (lf, .332, .473)
Moss (lf/dh, .326, .439)
Fuld (cf, .337, .378)
Lowrie (ss, .320, .395)
Vogt (1b, .275, .448)
Callaspo (2b, .304, .324)
Burns (cf, .318, .292)
Reddick (rf, .245, .411)

Bench: Punto, Jaso, Sogard.
Out: Crisp, Blanks, Gentry.

Estimated runs per game, new lineup: 5.023
Estimated runs per game, old lineup: 4.852

Yeesh, those are some atrocious OBPs at the bottom of the lineup with these injuries, because LH batters Vogt and Reddick are forced into the lineup against LHPs. Fuld is also a LH batter, but he has a weird reverse platoon split in his career; he’s actually been better vs LHPs than RHPs. Like with the healthy group, going from Cespedes to Gomes is a slight upgrade against LHPs; but the upgrade from Burns to Fuld is enormous.


Injured lineup vs RHP:

Jaso (dh, .372, .452)
Moss (lf/rf, .333, .510)
Reddick (rf/cf, .325, .458)
Vogt (1b, .328, .422)
Cespedes (lf, .302, .453)
Lowrie (ss, .329, .395)
Donaldson (3b, .321, .404)
Callaspo (2b, .333, .351)
Norris (c, .331, .353)
Fuld (cf, .311, .321)

Bench: Sogard, Punto, Gomes, Burns.
Out: Crisp, Blanks, Gentry.

Estimated runs per game, new lineup: 4.685
Estimated runs per game, old lineup: 4.708

The main effect here is that Fuld gets Cespedes’ at bats, and that Reddick can move back to right field. But without the Fuld trade to complement the Cespedes trade, Sogard would be getting Cespedes’ at bats, and you’d have an awful outfield of Moss-Reddick-Vogt with Callaspo at 1b. Yeesh. You’re going to lose some offense, but that defensive alignment would probably kill you. I suspect that avoiding that defensive alignment alone is probably justification for trading Milone.


So let’s take those estimated runs per game, and extrapolate them over 162 games, and assume the average split of 72% RHPs and 28% LHPs, and combine those two split-handed teams into one team again, leaving us with just a healthy team and an injured team.

Of course, the injured team is not as good as the healthy team, and will be scoring fewer runs than the healthy team. But to analyze the trades, we don’t need to know the raw totals, we really only need to know how much the trades change the run scoring.

The healthy team loses 3.6 runs vs RHPs in the trades, but gains 2.2 runs vs LHPs, for a total loss of 1.4 runs over a whole season. It’s practically no loss of offense at all.

The injured team loses 2.7 runs vs RHPs in the trades, but gains 7.8 runs vs LHPs, for a total gain of 5.1 runs over a whole season. Most of that gain is from playing Fuld over Burns (vs RHPs) and Reddick/Vogt (vs LHPs).

Let’s say these three injured players are going to miss one-third of the remaining games to play. Multiply that 5.1 by one-third, and the -1.4 by two-thirds, and what you end up with is actually a slight gain (0.25 runs over the rest of the season), albeit so small that it is practically a wash.
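If you want to redo the arithmetic yourself, here is a quick and dirty sketch. It takes the estimated runs per game from the four lineups above, weights them 72/28 over 162 games, and blends the healthy and injured teams. The last step, scaling to the rest of the season, assumes roughly one-third of the schedule is left; that share is an assumption for illustration, and it only lands close to (not exactly on) the 0.25 figure.

```python
GAMES = 162
SHARE_VS_RHP = 0.72
SHARE_VS_LHP = 0.28

# (new lineup, old lineup) estimated runs per game, copied from the lists above
runs_per_game = {
    "healthy": {"rhp": (4.810, 4.841), "lhp": (5.266, 5.218)},
    "injured": {"rhp": (4.685, 4.708), "lhp": (5.023, 4.852)},
}

def season_change(team):
    """Return (change vs RHP, change vs LHP, net change) in runs per 162 games."""
    rhp_new, rhp_old = runs_per_game[team]["rhp"]
    lhp_new, lhp_old = runs_per_game[team]["lhp"]
    vs_rhp = (rhp_new - rhp_old) * GAMES * SHARE_VS_RHP
    vs_lhp = (lhp_new - lhp_old) * GAMES * SHARE_VS_LHP
    return vs_rhp, vs_lhp, vs_rhp + vs_lhp

healthy = season_change("healthy")
injured = season_change("injured")

# The injured players miss one-third of the remaining games
INJURED_SHARE = 1 / 3
blended_full_season = INJURED_SHARE * injured[2] + (1 - INJURED_SHARE) * healthy[2]

# Assumption: about one-third of the 162-game schedule remains
REMAINING_SHARE = 1 / 3
blended_rest_of_season = blended_full_season * REMAINING_SHARE

print(f"healthy: {healthy[0]:+.1f} vs RHP, {healthy[1]:+.1f} vs LHP, {healthy[2]:+.1f} net")
print(f"injured: {injured[0]:+.1f} vs RHP, {injured[1]:+.1f} vs LHP, {injured[2]:+.1f} net")
print(f"blended, rest of season: {blended_rest_of_season:+.2f}")
```

Either way you slice it, the blended number is a fraction of a run, which is the whole point.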


The trades felt like a shock to many of us. On the surface, losing Cespedes’s sexy bat hurts, and trading a decent starting pitcher like Tommy Milone for a fourth outfielder seems like a waste. In a vacuum, that is true. But if you look at the impact those trades have on this particular team’s offense, it’s negligible.

Offensively, the numbers tell us that losing Cespedes is no big deal. And if everyone is healthy, trading for Fuld is a waste, because he wouldn’t play. But not everyone is healthy, especially in CF, and so Fuld is essential to keeping the offense at the level it would be without the trades.

So basically, we can consider the offense a wash. Now we can move on to analyzing the effect these trades have on the A’s defense and pitching. But I’m leaving that as an exercise for the reader. I’ve done enough for today.

Fixing the Oakland Coliseum Fences (and Foul Territory)

Grant Brisbee has a fun series over on SB Nation where he ranks MLB stadiums by how well they make home runs look impressive. Surprisingly, he ranks the Oakland Coliseum 13th. It gets that high ranking because the various levels of Mount Davis provide a good contrast between a mediocre home run, and a towering one. When someone crushes one at the Coliseum, you can tell it’s crushed because it lands in the 2nd deck (down the line) or hits off the luxury boxes in center field.

That’s fine and all. I suppose it’s good that Mount Davis has some redeeming feature. But there are far more mediocre home runs than monster ones, and it’s what the current version of the Coliseum does to those wimpy home runs that I hate.

Hate hate HATE.

Really, there is nothing I hate more about the Coliseum than the placement of the outfield walls. Nothing. Not the troughs, not the sewage, not the crap we A’s fans have to take from other teams’ fans about the troughs and the sewage, not the 8th-inning Call Me Maybe, not even Mount Davis itself. I hate the placement of the outfield walls more than all of those things.

Except at the foul poles, there is no logic to the outfield walls at all. None. Look at the fence at any point between the foul poles. Why is the fence there? Why is it that height? No reason at all, really.

And worse than that, what really drives me bonkers about it is this: at EVERY point from pole to pole, if you hit the ball just barely over the fence, it DOES NOT LAND IN A SEAT.

Home runs should land in seats. Or if not IN seats, then OVER seats. Period.

* * *

Ok, Ken, you’ve been made Dictator of the Oakland Athletics for a day, and you can change one thing and one thing only. Give us your plan.

OK, I’m going to assume the A’s will sign a rumored 5-10 year lease extension, and are therefore planning to stay at the Coliseum awhile. This may be putting lipstick on a pig, but nonetheless, let’s make it a better place to watch a ballgame.

First of all, do you know why there is so much foul territory in Oakland? The story goes, as former A’s broadcaster Monte Moore used to tell, that the third deck had obstructed views of home plate because of its slope, so they had to move home plate further out than they planned.

I don’t know if that’s true or not, but let’s say that it is. Well, guess what? We’re not using the 3rd deck anymore. It’s (mostly) tarped off. So why is home plate still pushed out so far?

We’re going to put home plate and the foul poles back to where they originally were supposed to be. Then we’re going to use the extra eight feet or so we gain to add some seats in front of the current bleacher seats. What we end up with is (a) an outfield configuration where, except for at the stairs, every home run lands in or over a seat, and (b) every seat in the main seating bowl is suddenly about two rows closer to the action, in a way that (c) shouldn’t cost ridiculous amounts of money to implement.

Here’s what it looks like with the new configuration in left field, and the old configuration in right field:

coliseumremodelcompare900

Let’s look at this in more detail:

 

1. We’re moving the foul poles over about 6-7 feet, so that there’s only about 1 foot between the pole and the foul line seats. This pushes home plate back about eight feet or so, thusly:

coliseumhomeplate

 

2. The wall nearest to the foul poles is about 2-3 feet shorter than the seats, and begins to angle away from those seats as you move more towards center field. We’re fixing this. The walls go all the way up to the seats, and hug the seating section all the way. No more balls that land over this fence, but fall short of the seats. Compare the new and old corners:

coliseumcorners

 

3. We’ll get rid of that stupid idiotic ledge above the out-of-town scoreboard. With home plate being pushed about 8 feet back, we have room to add two or three extra rows of seats, and still keep roughly the same distance from home plate as before.

I don’t know if we keep a scoreboard there or not. If you give free wifi throughout the stadium instead, you probably don’t need it.

I cut and pasted Fenway’s Green Monster seats here, to show you don’t need to add seats identical to the other bleacher seats. There’s room for some creativity in this new section.

coliseumbleachers

 

4. Centerfield is now about 405 feet from home instead of 400, but we’ve cut down on the foul territory quite a bit, so this may keep the amount of offense roughly the same as before.

coliseumcenterfield

* * *

Ahhhhhhhh, now see? That’s much better.

I’m sure you have all loved your Dictator for the Day, and Wish Long Life for your Beloved Comrade Who Brings Glory to the Homeland. Now please excuse me, I have some propaganda posters to go photoshop.

Projected 2014 Oakland Athletics Anagram Roster

There’s no way to be gentle about this: A’s General Manager Baby Nellie’s offseason moves have clearly weakened the A’s anagram roster for 2014. They have become slightly worse across the board, but some of his moves in the bullpen…well, I just don’t know what he was thinking.

Starting Rotation:

The A’s have lost the two best anagrams from their 2013 starting rotation: Bartender Snot and No Local Robot. Angry Nosy and Rat Mocks Zit are decent replacements to be sure, but are also both clearly a step down. Fin Jar GIF looks like odd man out, as acronyms are purely replacement-level stuff, even if they can be pronounced.

11: Pro Radar Jerk
54: Angry Nosy
57: I Melt My Moon
64: Fin Jar GIF
67: Daily Rants
??: Rat Mocks Zit

Bullpen:

Ask the Pen: is there any better anagram for a reliever? No, there is not. And yet, the A’s just let him go for nothing. To ask the pen without him to match what they were with him is unfair.

It gets worse before it gets better. Swapping closer No Fat Burglar with Oh MJ Is On NJ is nothing but a disaster.

Trading away JV Errs Byline is addition by subtraction, but similarly wretched She Aces JV EZ is somehow still around.

On the bright side, there remains a solid young core led by Oldest Toenail. Greek Loungers may be the best A’s acquisition this offseason, and don’t overlook Banana Fodder.

With no options remaining, there may be no room for Fedora Groupie, so perhaps Baby Nellie can find a match for him with the Astros.

48: Okay Corn
60: She Aces JV EZ
61: Neat Odor
62: Oldest Toenail
65: Fedora Groupie
??: Oh MJ Is On NJ
??: Greek Loungers
??: Banana Fodder

Catchers:

The roster of catchers remains the same. Order Sinker is the best gamecaller of the group, of course. Pegs Hot Vent remains to fill in should either of the other two catchers need to go on midseason pilgrimages again.

5: Hajj Soon
21: Pegs Hot Vent
36: Order Sinker

Infielders:

Armload Seas gnip-gnopped his way to Texas last summer, so the A’s have replaced him with Tonic Punk. It’s a slight upgrade, to a mostly intact infield where even the weakest link redeems himself with a Star Wars reference.

7: Mean Fainter
8: Roid Jewel
10: Rat Brain Doc
18: Palatable Colors
20: DJ Han Solo Nods
28: Scarier Dog
37: Random Snobs
??: Tonic Punk

Outfielders:

Grouchy Sin is out, Great Crying is in. You reap what you sow, I guess. Don’t forget that Random Snobs can play outfield if needed, which may leave no room for Erotically Ham.

4: Cisco Crop
16: Jocks Did Her
23: Erotically Ham
52: Eyes Second Pies
??: Great Crying

My Letter from 1989 about the Earthquake World Series

Grantland posted an oral history of the 1989 World Series and earthquake the other day. That prompted me to dig up an old letter I sent to my friends and family outside the Bay Area, mostly in Sweden, about my experiences during that time.

A bit of background: in October of 1989, I had just returned from a year living in Sweden with my girlfriend (now wife) Pam. Pam was staying at her parents’ house and I was staying with her brother, until we could find jobs and afford to get our own place.

In hindsight, this letter is quite long, full of unnecessary details and subplots, not unlike a Victorian novel. It also lacks a good plot, because, well, no buildings fell down around me or anything. Nobody in the story was hurt, nobody was rescued. But in my defense, this was back in the days when you couldn’t just send an email or post something on Twitter or Facebook or Instagram and have everyone you know around the world instantly know what’s going on in your life. My Swedish friends probably got some horrific pictures on TV of collapsed buildings and fires and thought San Francisco had fallen into the sea. We weren’t so overwhelmed with data that a lack of filtering was a problem. TL;DR was not a thing back then.

So, here it is, what I wrote back in 1989:

Continue reading

How to Remove the Yahoo Sports background image on Chrome

Yahoo Sports remodeled their site this morning, and it’s awful. Mostly, I think, because the new background image is really distracting and annoying. So I decided to zap it. Here’s how I did it, and you can too:

1. Install the Stylebot Chrome extension.

2. Once you install it, you will get a little “CSS” button on your toolbar.

3. Go to a Yahoo Sports web page.

4. Click the Stylebot “CSS” button on your toolbar.

5. Click “Install Style from Social…”

6. It should show “Loading…” for a second, then bring up “Remove background image from Yahoo Sports by kenarneson”.

You should now see a plain gray background instead of the grass field background.

* * *

I wanted to make it a plain white background like it used to be, but the text on the Yahoo Sports site is now mostly white, so that wouldn’t work. So I made it gray.

1973 in video gaming

I don’t remember the first time I ever saw a video game. I doubt it was as early as 1973. I know my next-door neighbor had an Atari 2600 in 1978, and I had a Mattel Electronics Football game around the same time. I know I went minigolfing for a couple birthdays in between there, and the minigolf place had an arcade. They probably had Pong, if not a few other video games in the arcade. Probably, then, I first laid eyes on a video game around 1976 or so.

So this Random Wikipedia article, 1973 in video gaming, comes a few years too early for me to have any personal memories. As a historical landmark, it’s one year too late. The big year in video gaming is 1972. In 1972, Atari was founded and they produced Pong. Additionally in 1972, Magnavox introduced the Odyssey, the first home video game console.

So 1973 was a period of infancy for video games–after they were invented, but before they became a major force in popular culture. Did the people working on video games back then really believe it would later become a huge deal? Or did they assume they were just part of a temporary fad, just trying to figure something out, maybe eking out a living or something if they were lucky, but not really suspecting they were incubating a baby entertainment industry that would eventually be as big as movies or TV?

And what’s the 2013 version of video gaming — the rough beast that’s just a baby now, barely even noticed, but one day will grow to be king of the world?

2013–14 Clemson Tigers men’s basketball team

To be honest, I don’t give a crap about the 2013–14 Clemson Tigers men’s basketball team.

* * *

For a week now, I’ve been writing a blog entry each weekday about a random Wikipedia article. I’m not sure why. Something about it struck me as an interesting idea, so I went with it.

But when the Random Wikipedia Wheel of Fortune brought me to the 2013–14 Clemson Tigers men’s basketball team, I almost quit the idea. It annoyed the hell out of me. I mean, look at this, here’s the entire Wikipedia entry:

The 2013–14 Clemson Tigers men’s basketball team will represent the Clemson University during the 2013–14 NCAA Division I men’s basketball season.

It’s basically a tautology. It’s nonsense. It’s vaporware. It’s nothing.

Pffffft. The 2013–14 Clemson Tigers men’s basketball team doesn’t even exist yet. I didn’t go to Clemson University. Why the heck should I care about it? I don’t think I personally know anybody who went to Clemson. Heck, I barely know anyone who went to any of the schools in Clemson’s athletic conference, the ACC. Why should I bother writing about it?

* * *

The past few weeks, I’ve been taking an online course in Behavioral Economics. One of the issues they talk about is how much we overvalue the present and undervalue the future. We also overvalue things that are near to us, and undervalue things that are far away from us. For example:

Would you give $100 if it would pay for an operation that would, guaranteed to work or your money back, save the life of a 5-year-old child today? Probably, you would.

Would you give $100 if it would pay for an operation that would, guaranteed to work or your money back, save the lives of a hundred 5-year-old children in Belgium in 2043, kids who won’t even be born for another 25 years? Hmm…it’s a tougher question, isn’t it?

Why is it so hard to feel sympathy for people and events far away and in the future?

* * *

Taking that knowledge, I plowed ahead and did some googling about next year’s Clemson basketball team. I found an article on RealGM Basketball which uses some statistical analysis of college basketball players to predict that Clemson will go 6-12 in the ACC during 2013-14. Dan Hanner explains:

Given that they lose their two best players and have zero players who were elite high school recruits on their roster, I think a lot of preseason predictions will have them even lower than this. There really isn’t anyone on the roster who looks like a likely offensive star. (The only good news is that Clemson was young last year and the sophomore leap should help at least a couple of their freshmen become solid players.) But let’s face it, this is going to be an ugly team to watch. The only reason the model doesn’t have Clemson lower is because of Brad Brownell’s ability to teach defense.

Maybe that’s accurate. Or not. A year from now, we’ll know for sure.

But I’m from California, not Carolina. I follow the Pac-12, not the ACC. So again, I really don’t care. Because I am human. I concern myself mostly with the here and now. I am, as my behavioral economics class suggests, biased against the people and things that are separated from me by large gaps of space and time.

* * *

In my last Random Wikipedia entry about Błudowo, Poland, I examined a picture of a bible passage about the Lamb of God. I didn’t examine a matching companion text on the ceiling of that same church, partly because the image is interrupted by an ugly ceiling lamp, but partly because it seems to contradict the first image. The text is a quote from Revelation 1:8:

Alpha and Omega

In English, the Błudowo text quotes God saying, “I am the Alpha and the Omega, the Beginning and the End.” The next part of that passage goes: “I am the one who is, who always was, and who is coming. I am the Almighty.”

It’s an interesting pairing. In the first image, God is presented as being meek and humble. Here in the second, God is powerful and eternal. What does it mean to put these passages together?

* * *

Will Leitch had a good article recently about what Christians mean when they thank God after a sporting event. Money quote:

When you live a Christian life, everything you do, from showing up to church on Sunday, to going to the grocery store, to pumping gas, to hitting a home run, to striking out, is done for the glory of Christ. Hamilton isn’t thanking Jesus for helping him hit a homer; he is thanking Jesus for everything.

I think that’s right, but incomplete. Living a Christian life doesn’t just mean understanding or believing Christianity, it means practicing it. And I don’t mean practicing as in “doing”, I mean practicing as in “training.”

We are naturally biased towards the here and near and now. We naturally discount the distant, both in time and space. You can’t just overcome that built-in bias with rational understanding. That bias is our default mode. You have to overcome that bias by actively training yourself to overcome it, otherwise you slip right back into your default mode.

In default mode, you think that three-point shot you just made to win the game is the most important thing in the world. You’re so awesome!

Expressing gratitude toward God, as a practice, removes you from that default mode. It strips away your bias, in two ways:

  1. It affirms that second passage in the Błudowo church. It’s an acknowledgement that there is some thing more awesome than you, and some time more important than now. It is, as Leitch suggests, gratitude towards everything that was, is, and shall be.
  2. It reminds us that our natural biases, a/k/a our sins, are not washed away by conquering the here and now like a tiger. On the contrary, our selfish, competitive biases toward satisfying the desires of ourselves and those nearest to us at the expense of others are actually a cause of suffering in the world. The practice of thanking God is an act of humility and generosity, of caring about something beyond the immediate. Thanking God makes you more lamb-like than tiger-like.

* * *

Funny though, how in a large Christian nation like America, there aren’t any major sports teams nicknamed “the Lambs”.

* * *

So here I am, a sinner who doesn’t give a crap about the 2013–14 Clemson Tigers men’s basketball team. If I were more God-like, more Christ-like, I would. I would overcome the bias that makes me care more about my local team and my local league and the current and most recent year than about some team far away and in the future. I’d be more generous, more caring, about everything.

* * *

And this is why I think I am writing these random Wikipedia articles. Like thanking God, it is a kind of practice, designed to train me away from my biases. Away from my compulsive desire to compete, to be great, to win in the here and now. A random Wikipedia article can send me anywhere–past, present, future–and it forces me to contemplate it, to be generous towards it. Contemplation leads to empathy and compassion, and the world becomes a better place for it. And perhaps I become a better human being, too.

* * *

So Godspeed, 2013–14 Clemson Tigers men’s basketball team.

2000 Ericsson Open – Women’s Singles

Martina Hingis

There are two tiers of professional tennis tournaments: the Grand Slam events, and all the others.

The Ericsson Open, a/k/a the Sony Open, a/k/a the Miami Masters, may be the Grandest of the Ungrand. Most Ungrand events are one-week, single-gender tournaments. The Miami tournament, like the Grand Slam events, plays over two weeks, hosts both genders, and has a large prize purse. It probably has visions of one day becoming Grand itself.

But so far, it remains the Biggest Fish in the Small Pond. Is that such a bad deal?

The Random Wikipedia Wheel of Fortune has sent us today back in time to the 2000 Ericsson Open Women’s Singles tournament. It is not a particularly remarkable tournament, other than serving as one affirmation, among many, of the greatness of Martina Hingis. Hingis marched through this tournament basically unchallenged. She never got close to losing a single set. She won 6-3, 6-1 in the quarterfinals against Amanda Coetzer. She destroyed Monica Seles in the semifinals, 6-0, 6-0. Hingis then trounced Lindsay Davenport in the finals, 6-3, 6-2.

In the past two days, the RWWoF sent us to examine the ordinary, unremarkable moments before and after greatness. Today, our eyes are opened to the existence of many other utterly ordinary moments, even in the middle of greatness. Perhaps we are meant to wonder: if greatness is so short and fleeting, what exactly is so great about greatness anyway?

Photo credit: Taís Melillo on Flickr via Creative Commons license.

A’s Pitcher Similarity Scores

Over at Beyond the Boxscore, Stephen Loftus has posted Pitcher Similarity Scores. The scores compare pitchers to each other based on:

  • Pitch Velocity
  • Pitch Break (Horizontally and Vertically)
  • Pitch Locations
  • Pitch Release Point

Curious about how the A’s scored, I extracted the A’s pitchers from the spreadsheet. A few pitchers didn’t seem to throw enough pitches last year to qualify (Brett Anderson, Sean Doolittle, Pat Neshek), while Fernando Rodriguez is on it, even though he hasn’t been seen in Oakland yet, because he got hurt in spring training.

A few notes:

  • A.J. Griffin is only mildly Zitoesque, and is actually more similar to Jerry Blevins, of all people.
  • Griffin is the only player on the A’s who does not have R.A. Dickey among his 10 least-similar players.
  • Bartolo Colon has the most-similar least-similar player in baseball, if that makes sense. His similarity to John Axford, his least similar player, scores higher in similarity than any other player’s least-similar player. I assume that’s because Colon throws mostly fastballs.
  • Tommy Milone seems to be the most unique pitcher on the A’s. His #1 comp score (Jason Vargas, 0.739) would be the 24th-highest score on Bartolo Colon’s list.

Enjoy:

Bartolo Colon
Jeanmar Gomez 0.868
Zach Britton 0.843
Dillon Gee 0.823
Travis Blackley 0.818
Joe Saunders 0.804
Cristhian Martinez 0.799
Luis Ayala 0.797
Hiroki Kuroda 0.792
Rick Porcello 0.788
Kyle Lohse 0.781
R.A. Dickey 0.356
Matt Thornton 0.355
Steve Delabar 0.350
Fernando Rodriguez 0.339
Esmil Rogers 0.319
Tim Collins 0.315
Aroldis Chapman 0.306
Mike Fiers 0.296
Josh Collmenter 0.247
John Axford 0.241
A.J. Griffin
Liam Hendriks 0.835
Wei-Yin Chen 0.806
Jerry Blevins 0.750
Jeff Karstens 0.743
Kyle Lohse 0.732
Randy Wolf 0.724
Phil Hughes 0.710
Colby Lewis 0.707
Ernesto Frieri 0.700
Josh Lindblom 0.696
Barry Zito 0.533
Mitchell Boggs 0.290
Jim Johnson 0.287
Aaron Laffey 0.276
Jake Westbrook 0.274
Jonathan Papelbon 0.249
Vinnie Pestano 0.248
Brandon League 0.240
Steve Cishek 0.222
Roy Halladay 0.196
R.A. Dickey 0.098
Jarrod Parker
Anthony Swarzak 0.879
Ubaldo Jimenez 0.878
Jeremy Guthrie 0.869
Randall Delgado 0.864
Bud Norris 0.858
Zack Greinke 0.819
James Shields 0.813
Christian Friedrich 0.812
Chris Resop 0.811
Tyson Ross 0.803
Carlos Zambrano 0.367
Kameron Loe 0.364
Ronald Belisario 0.357
Derek Lowe 0.351
Jose Arredondo 0.349
Roy Halladay 0.312
Brandon League 0.291
Josh Collmenter 0.281
Joe Smith 0.256
R.A. Dickey 0.199
Tommy Milone
Jason Vargas 0.739
Travis Wood 0.704
Wade LeBlanc 0.703
P.J. Walters 0.678
Chris Capuano 0.669
Cole Hamels 0.648
Ryan Vogelsong 0.642
Ian Kennedy 0.631
Phil Hughes 0.626
Jonathan Sanchez 0.622
Luke Gregerson 0.187
Jose Arredondo 0.184
Vinnie Pestano 0.179
Luis Ayala 0.172
Ronald Belisario 0.148
Jared Hughes 0.147
Joe Smith 0.133
R.A. Dickey 0.131
Brandon League 0.110
Steve Cishek 0.101
Grant Balfour
Clayton Kershaw 0.870
Wade Davis 0.858
Chris Tillman 0.844
Mat Latos 0.799
Joe Nathan 0.760
Matt Garza 0.750
Brandon Morrow 0.735
Kenley Jansen 0.730
Bud Norris 0.714
Greg Holland 0.714
Erik Bedard 0.187
Chris Capuano 0.186
Roy Halladay 0.186
Jeff Francis 0.180
Shawn Camp 0.173
Derek Lowe 0.170
Doug Fister 0.162
Dallas Keuchel 0.160
Joe Smith 0.158
R.A. Dickey 0.072
Jerry Blevins
Brian Duensing 0.784
Liam Hendriks 0.767
Jeff Karstens 0.753
A.J. Griffin 0.750
Tom Gorzelanny 0.735
Randy Wolf 0.734
Madison Bumgarner 0.724
Kyle Lohse 0.719
Dillon Gee 0.715
Blake Beavan 0.708
Joel Peralta 0.282
Vinnie Pestano 0.280
Tom Wilhelmsen 0.279
Aroldis Chapman 0.274
John Axford 0.274
Chris Young 0.259
Roy Halladay 0.244
Josh Collmenter 0.225
Tim Collins 0.219
R.A. Dickey 0.192
Ryan Cook
Pedro Strop 0.875
Glen Perkins 0.835
Jeff Samardzija 0.802
Carlos Marmol 0.801
Henderson Alvarez 0.801
Jason Hammel 0.751
Tyler Chatwood 0.735
Joel Hanrahan 0.735
Adam Ottavino 0.723
Garrett Richards 0.706
Chris Young 0.245
Jose Arredondo 0.242
R.A. Dickey 0.223
Tim Collins 0.215
Scott Diamond 0.212
Mike Fiers 0.204
Samuel Deduno 0.194
Josh Tomlin 0.176
Joel Peralta 0.137
Josh Collmenter 0.103
Chris Resop
Fernando Salas 0.821
Tyson Ross 0.812
Jarrod Parker 0.811
Anthony Swarzak 0.796
Jeremy Guthrie 0.792
Heath Bell 0.790
Homer Bailey 0.778
Ivan Nova 0.777
Ubaldo Jimenez 0.776
Randall Delgado 0.771
Carlos Zambrano 0.291
Dallas Keuchel 0.288
Roy Halladay 0.242
Josh Collmenter 0.240
Brandon League 0.235
Jason Marquis 0.233
R.A. Dickey 0.191
Derek Lowe 0.187
Shawn Camp 0.170
Joe Smith 0.142
Fernando Rodriguez
Heath Bell 0.797
Wade Davis 0.771
Fernando Salas 0.758
Tim Collins 0.727
Joe Nathan 0.709
James Shields 0.707
Tyler Clippard 0.689
Edinson Volquez 0.684
Mat Latos 0.684
John Axford 0.684
Chris Resop 0.678
Carlos Zambrano 0.189
Cristhian Martinez 0.160
Luis Ayala 0.159
Luke Gregerson 0.144
Steve Cishek 0.143
R.A. Dickey 0.114
Derek Lowe 0.101
Jason Marquis 0.091
Shawn Camp 0.088
Joe Smith 0.050
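
A couple of the notes above can be spot-checked against these lists. Here is a small sketch that takes the highest and lowest score on each A’s pitcher’s list and finds who has the highest floor and who has the lowest ceiling. It only covers the nine A’s pitchers shown here, so it can’t confirm anything about the rest of baseball; the variable names are mine.

```python
# (highest score, lowest score) on each A's pitcher's list, copied from above
comps = {
    "Bartolo Colon": (0.868, 0.241),
    "A.J. Griffin": (0.835, 0.098),
    "Jarrod Parker": (0.879, 0.199),
    "Tommy Milone": (0.739, 0.101),
    "Grant Balfour": (0.870, 0.072),
    "Jerry Blevins": (0.784, 0.192),
    "Ryan Cook": (0.875, 0.103),
    "Chris Resop": (0.821, 0.142),
    "Fernando Rodriguez": (0.797, 0.050),
}

# Pitcher whose least-similar comp still scores the highest
highest_floor = max(comps, key=lambda pitcher: comps[pitcher][1])

# Pitcher whose best comp scores the lowest, i.e. the hardest to match
lowest_ceiling = min(comps, key=lambda pitcher: comps[pitcher][0])

print("Most-similar least-similar comp:", highest_floor)
print("Weakest #1 comp:", lowest_ceiling)
```

Among the A’s, at least, that lines up with the notes on Colon and Milone.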

2012 Vacation Photos and Baseball Player Names

Back on the old Baseball Toaster, I wrote 8,320 entries of various sorts.

I recently surpassed that number of posts on Twitter, and I have now reached my 10,000th tweet.

I wanted to do something special to commemorate the milestone, so I dug something up out of my old bag of tricks, and made a slideshow of my Top 30 2012 Vacation Photos and Baseball Player names.

Check it out.

* * *

If you enjoyed those, here are some older, similar slideshows built on outdated technology: