New: Book Report: The Theory That Would Not Die

It's a book on the history of Bayes' Theorem. Bayes' Theorem is, roughly, a handy tool for practical probability problems. Suppose you are an email system's spam filter. You see a new email message that says "Best bargains Vi@gra". You need to put this message in the Spam folder or the Inbox. What do you do? Bayes says you can figure the odds. Suppose you know:

60% of email messages are spam.
In a set of 1000 not-spam messages, 2 mentioned "Best".
In a set of 1000 spam messages, 3 mentioned "Best".

Bayes gives you a nice way to multiply together the relevant numbers: ignore all the non-"Best" messages and compare the relative probabilities, .003 × 60% for the spammy case versus .002 × 40% for the non-spammy case. So if you're looking at a message that contains "Best" and trying to decide whether it's spam, considering just that word the odds are 18:8 (that is, 9:4) that this message is spam. And you can fold in more information from the other words in the message.
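That single-word calculation is small enough to sketch in code. This is just my restatement of the arithmetic above; the numbers come from the post, and the function name is my own.

```python
def spam_odds(p_spam, p_word_given_spam, p_word_given_ham):
    """Odds (spam : not-spam) that a message containing the word is spam,
    considering just that one word."""
    spammy = p_word_given_spam * p_spam          # .003 * 60%
    hammy = p_word_given_ham * (1 - p_spam)      # .002 * 40%
    return spammy / hammy

odds = spam_odds(p_spam=0.60,
                 p_word_given_spam=3 / 1000,     # "Best" in spam messages
                 p_word_given_ham=2 / 1000)      # "Best" in not-spam messages
print(odds)  # 2.25 -- the 18:8 from above, reduced
```

A real filter would multiply in the odds contribution from each word, which is why this "naive Bayes" trick scales past one word.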

At least I think that's roughly how Bayes works. This book traces the struggle of the Bayesians versus some other group, the frequentists, who use "frequency probability". Unfortunately, all I know about statistics is a few techniques. Of the tools in my toolbox... I don't know whether they're Bayesian or Frequentist or Pickle Sandwich or whatever. I have a hard time understanding frequentism. This book only tries to kinda hand-wavily describe it; I guess the author doesn't want to lose non-technical readers. So... I knew just enough to find myself confused nonetheless.

Mmmmaybe the difference is: I carefully computed that "60% of email is spam" statistic. (Where by "carefully computed", I mean "Looked in the email and spam folders on one email account and eyeballed a rough count.") But what if I didn't have that historical data? If I understand this book correctly, the Bayesian answer is "We need a decision. So plug in an estimate. What percentage of mail do you think is spam? Now you can use that to multiply with the other numbers. (But be sure to update that guess when you know more)"; but the Frequentist answer is "Give up! Wait until you have a significant number of emails to count!!" That sounds weird to me. Anyhow.

I still enjoyed this book, even though I didn't understand the struggle that it used as a framing story. Why? Because it presented some interesting statistics problems that have occurred through history. Interesting problems are good for the brain.

Laplace rediscovered Bayes' Rule... actually, this book makes a good case that Bayes' Rule, as stated by Bayes, was not so useful. That maybe we should call it Laplace's Rule or something. Anyhow, Laplace applied the rule to many things, including jury trials. There were a lot of guessed factors in there, few known statistics. Still, he made a pretty good case that juries were wrong... not super-often, but not infinitesimally-often, either. He used this as an argument against capital punishment: being found guilty wasn't a strong enough indicator of guilt.

A bunch of the stories involve looking for things at sea. Suppose you're a WWII US Navy fleet commander. Many merchant convoys are trying to cross the Atlantic to get supplies from the USA to England. U-Boats prowl the Atlantic, sinking the convoys. You have some destroyers, some search planes... but not enough to patrol all of the Atlantic. How do you organize your search? How long should a destroyer search in one place before moving on to another?

Or what about the Broken Arrow incidents at Palomares and Thule? You want to find a nuclear bomb that's... in the water. Wow, the Earth has a lot of water. Again, how do you figure out where to look? When do you decide to give up on that spot and look somewhere else?

There was election prediction. You might think that Nate Silver has a tough job, but when Tukey ran a group predicting elections for the TV news, one year his bosses sequestered the team because they didn't trust the prediction. It's never a good sign when someone locks up the statisticians in a room. Anyhow.

There's also some love for computing here: statistics, whether Bayesian, Frequentist, or whatever, wasn't super-practical until it was easy to work with big piles of data. Maybe the anti-Bayesians had a point: until you had computers, if you didn't have enough data to get a statistically significant result, why on earth would you spend weeks of your life wrestling with data to get an answer that's not going to be that much better than the guess you'd make by eyeball? Nowadays, gathering data is still tough; but once you've got it, it's relatively easy to press a button and say "Computer, tell me about any weird correlations in here".

All in all, a good read. I was tempted to put the book down a few pages in when it said "Laplace emerged from Caen a swashbuckling mathematical virtuoso..." and then there was no swashbuckling; it was bad like when Steven Levy is bad. Fortunately, McGrayne isn't Levy-bad nearly as often as Levy is, so I kept reading.

Tags: book vintage computing
