What’s the Big Deal About Bayes?

Plenty of folks I admire love Bayesianism, a school of statistics and philosophy of knowledge. As do I, though as I'll discuss here, it's not magic, and sometimes Bayes gets credit for a principle that's a lot broader. (Sometimes, though less commonly, this happens for frequentism too!)

So much ink has been spilled on Bayes that I doubt I’ll make any strictly original claims. But I can at least give a synthesis of both philosophical and statistical perspectives (most rabid philosophical Bayesians on the Internet are not statisticians, and vice versa), and touch on the significance of these ideas for practical decision-making.

The Rule

First, the math. I’ll state Bayes’ rule in words:

The credibility of some claim given some evidence is proportional to the chance of seeing that evidence if the claim is true, times the credibility of the claim before considering the evidence.

I used the weaselly “proportional” partly because the sentence would’ve been too long and confusing otherwise. But more importantly, the factor that’s missing is one that doesn’t matter when you want to assess the odds of a claim relative to its denial. In fact, lots of popular methods for using this rule in complicated stats models don’t compute that missing factor, precisely for this reason.

The upshot is: to assess how much more credible a claim is than its denial, you adjust your starting judgment (the prior) in the direction of whichever of the two makes the evidence less surprising. And the more skewed this ratio of surprise (the likelihood ratio) is, the more you adjust. More on this later. But as one example, the fossil record is much more surprising in a universe where creationism is true than one where evolution is—awfully kind of God to not put any rabbit fossils in the Precambrian strata!—hence it makes sense to interpret the fossil record as evidence for evolution. Not that these are the only two options, but the principle is the same.

Here’s the symbolic way to state it, for your viewing pleasure:
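In standard notation, with H the claim and E the evidence:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

The denominator P(E) is the "missing factor" I mentioned: divide this equation for H by the same equation for its denial ¬H and it cancels, leaving the odds form,

$$\frac{P(H \mid E)}{P(\lnot H \mid E)} = \frac{P(E \mid H)}{P(E \mid \lnot H)} \times \frac{P(H)}{P(\lnot H)},$$

where the first factor on the right is the likelihood ratio from the previous paragraph and the second is the prior odds.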

Perhaps an obvious point: since this is just a mathematical fact, a tiny bit of algebra applied to the definition of conditional probability, there’s ironically nothing uniquely “Bayesian” about Bayes’ rule. Frequentists acknowledge it as much as Bayesians do. They definitely acknowledge the likelihood ratio principle in designing some hypothesis tests, though as we’ll see, the notorious P-value muddies the waters. (See RustyStatistician’s answer here.)

Where frequentists and Bayesians disagree, among other points, is on whether it makes sense to apply Bayes' rule to quantities like "the probability that this claim [about some deterministic value] is true given that evidence." On the Bayesian interpretation, probability is a measure of our uncertainty, of how plausible the proposition in question is. The frequentist interpretation, as I understand it, distinguishes how subjectively plausible a claim is (or "should" be) from its probability. Probability computations other than 0% or 100% are reserved for things that are—apologies for scare quote abuse—"inherently random/stochastic." Technically this seems to be a feature of the propensity interpretation, but it's also how frequentists in practice often object to Bayesian claims, and it's consistent with frequentism when an event (supposedly) can't be put in a class of relevantly similar events. By "relevantly similar" I mean, for instance, that if I flip a quarter, it is sufficiently similar to every other quarter that has ever been flipped that we can model it as having a 50% probability of heads, based on past flip frequencies.

There’s some intuitive appeal to this. Things like coin flips, poker hands, fluctuations of financial graphs around a trend … these seem like random things, while there seems to be nothing random about the fraction of Americans who support raising the minimum wage. There definitely doesn’t appear to be anything random about whether the 1000th digit of pi is even; it’s just a number, you can look it up and confirm that it’s odd.

On closer inspection, though, this distinction doesn't really make sense. The difference is not qualitative, but merely a difference in how readily we could resolve our uncertainty by gathering the data within our grasp. To explain that argument, we need to go where angels and undergrads fear to tread.

Why introductory stats is confusing

(Or at least, one reason it’s confusing. Another reason is that combinatorics is hard, but I digress.)

Here’s an innocent-enough claim: “There is a 95% probability that the true effect size in [insert scientific regression study here] is in the 95% confidence interval.”

The way confidence intervals are defined in frequentist hypothesis testing, this is false. Much to the chagrin of the students in the intro stats courses I’ve TA’d. The canonically “correct” statement in this paradigm is: “The probability that the true effect size in [insert scientific regression study here] is in the 95% confidence interval is either 0% or 100%, we just don’t know which. But if arbitrarily many 95% confidence intervals were constructed from this same sampling procedure, 95% of them would contain the true effect size.”

Clear as mud.
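To be fair, the second half of that official statement, the coverage claim, is at least easy to check by simulation. Here's a minimal sketch in Python, with made-up numbers (the true mean, noise level, and sample size are chosen purely for illustration) and the simple known-variance interval to keep it short:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, sigma, n, trials = 1.0, 2.0, 50, 10_000
z = 1.96  # ~97.5th percentile of the standard normal

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, sigma, n)
    half_width = z * sigma / np.sqrt(n)  # known-sigma interval, for simplicity
    lo, hi = sample.mean() - half_width, sample.mean() + half_width
    covered += (lo <= true_mean <= hi)

print(covered / trials)  # ~0.95: about 95% of the intervals contain the true mean
```

That's the sense in which the procedure "works" 95% of the time. What it refuses to say is anything about the one interval you actually computed.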

Now, I’ll be the first to say that statistics does involve a lot of genuine subtleties that need to be respected if you’re going to understand it. “Linear” regression doesn’t mean you can’t have powers greater than one in your predictors. In a continuous distribution, every outcome has probability 0%. Simpson’s paradox is a thing. I could go on.

But I think this is one concept that is so needlessly confusing that students shouldn’t be blamed for not getting it.

Suppose our frequentist hero is an exit pollster who, on November 4, 2020, estimated a 95% confidence interval of [-0.01, 0.10] for Joe Biden’s popular vote margin of victory. If you had offered them a bet for 90 cents paying a dollar if the true margin ended up in this interval, I claim they’d be a fool not to take it—assuming they truly believed this interval, as something based on a representative sample of voters. Why? Well, the expected monetary value is +5 cents if you assign 95% credibility to the true margin being in [-0.01, 0.10]. (And I’m assuming that for money on the scale of cents, utility increases linearly with money.)
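Spelled out: paying 90 cents for a ticket that returns a dollar with 95% subjective probability gives

$$\mathbb{E}[\text{profit}] = 0.95 \times \$1.00 - \$0.90 = \$0.05.$$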

But the frequentist view apparently can’t make sense of this. It’s just one confidence interval, and the true margin either is or isn’t in that interval. The expected value of the bet is undefined, on this view. If your philosophy of probability is inconsistent with how you would ideally make decisions under uncertainty, the very sort of thing that motivates humanity to study probability in the first place, I think that’s a damning mark against such a philosophy.

We can see the same logic in the case of our “1000th digit of pi” question. If someone offers you a bet that costs 40 cents if that digit is odd and pays 60 if it’s even, then—unless you memorize digits of pi as a hobby, and assuming you are banned from checking the Internet, and further assuming we ignore the psychological evidence provided by the fact that this person was willing to offer the bet at all—clearly you should take it. If you’re very risk averse, fine, let’s modify it to let the law of large numbers do its magic. If you get offered the equivalent of this bet for every thousandth digit of pi up to the trillionth, for heaven’s sake, take those bets and become a multimillionaire on average. To be fair, the frequentist will not necessarily call you irrational here. But as far as I can tell, they will have no basis for agreeing with the very plausible claim that you are doing something positively rational, either. They have to suspend judgment. (See also: this video.)
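For the record, the arithmetic behind "multimillionaire on average": with a 50/50 credence for each digit, every bet is worth

$$\mathbb{E}[\text{profit per bet}] = 0.5 \times \$0.60 - 0.5 \times \$0.40 = \$0.10,$$

and one bet per thousandth digit up to the trillionth is $10^{12}/10^{3} = 10^{9}$ bets, about \$100 million in expectation.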

“But what about quantum mechanics? Radioactive decay?” you might protest. Surely these are genuine instances of objective randomness?

Well, not necessarily. The many-worlds interpretation of quantum mechanics is deterministic, in the sense that it says not even these phenomena are intrinsically random. It happens that an observer, confined to one branch, would find it physically impossible to predict the future within their branch even if they knew a Theory of Everything, but this does not mean the cosmos taken as a whole is fundamentally unpredictable even from a God's-eye view. I am certainly not a quantum physicist, but a sizable proportion of experts in this field endorse MWI, and for what it's worth I find its mathematical consistency more convincing than objections that it is simply counterintuitive.

This might seem like hair-splitting, since I conceded that an individual simply cannot predict quantum events within the branch they experience. That sure sounds like deep randomness, no? But the key is that this is at bottom a limitation of the individual’s knowledge, or access to all the physical data. It’s not a property of the quantum events themselves.

And the same is true for chaos, coin flips, and so on. If you had to guess a coin flip as soon as the coin left the flipper’s thumb, with sufficiently sophisticated physics knowledge you would be able to beat the 50% success rate by at least a small margin. There doesn’t seem to be any reason this couldn’t be taken to an extreme of practical certainty, with arbitrarily accurate knowledge of the initial conditions. Given this, it’s not clear why a frequentist wouldn’t say of the coin flip as well, “Its probability of landing heads is either 0% or 100%, you just don’t know which.”

One response I could foresee is that they actually would agree with that statement when it comes to a real-world coin flip; “coin flips” in stats textbooks are just a convenient abstraction. To which I say, sure, but why doesn’t this reasoning equally apply to the exit poll case? It is a convenient abstraction that, to the pollster, Biden’s margin of victory was random, since they couldn’t practically know that margin without counting all the votes. We can’t explain this in terms of time, either, since I assume a frequentist will agree that a coin that has already been flipped, and is covered by someone’s hand, has 50% probability of being heads. If that assumption is wrong, well, this is my incredulous stare.

I should note: some of my credence in MWI comes from pre-existing credence in Bayesianism—the idea of “inherent randomness” just doesn’t make intuitive sense to me, upon examination of everything else in physics and analysis of the concept. So it would not make sense to count this point as an argument in favor of Bayesianism per se. But it does show that physics is not inconsistent with the Bayesian view of probability.

The incompleteness of P-values

Recall that The Rule for adjusting your belief in some claim versus its denial is to go in the direction of whichever of the two makes the evidence more likely. You have to consider both how likely the evidence is if the claim is true, and how likely it is if it’s false.

A P-value by definition only reports one of the two. It is nothing more or less than the probability of the evidence—or anything more “extreme” than that evidence, which sounds fuzzy but is generally clear from context—if a certain privileged claim called the null hypothesis is true.

I explain this at length here. In that piece, I lamented the frequency (pun fully intended) with which people get the definition of a P-value backwards, and yet, as with the confidence interval confusion, can I blame them? The probability of a hypothesis given evidence is a perfectly reasonable thing to want to quantify. It’s what science, indeed basic truth-seeking, is in the business of assessing. The probability of the evidence given the hypothesis, while obviously useful information, is not sufficient to make decisions.

Coming back to our fossil record example: imagine that a creationist argued, "Let's be generous to the godless Darwinists, and call evolution the null hypothesis. Innocent until proven guilty. It would be absurdly unlikely for the fossils to be arranged in exactly the configuration we see, if evolution were true. Null hypothesis rejected!" This is technically true! Any particular arrangement of fossils is unlikely under basically any hypothesis other than the super-specific one that says, "The laws of physics are such that fossils will accumulate in exactly the pattern observed by contemporary paleontologists." But of course, 1) creationism is not that specific hypothesis, and by considering the ratio of the very tiny probabilities these two hypotheses assign to the evidence, evolution wins on this score; and 2) Occam's razor is not kind to that specific hypothesis, if we interpret it as more than just a tautology, i.e., as claiming that the laws of physics are "rigged" in some way that favors this configuration of fossils.

(To be fair, there’s a sense in which our straw creationist is not just wrong about the implications of this evidence, but also cheating. Real hypothesis tests that frequentist statisticians perform have some directionality to them.)

The Problem with Popper

Much of the sympathy for frequentist philosophy seems to have its roots in a view that science is all about falsifying claims, not supporting them; deduction, not induction. The great philosopher Karl Popper championed this view, and I'd be surprised if the dear reader were not taught Popper's philosophy implicitly (not by name, certainly!) in grade school science classes. I'll let this paper speak for itself:

Some people may want to think about whether it makes scientific sense to “directly address whether the hypothesis is correct.” Some people may have already concluded that usually it does not, and be surprised that a statement on hypothesis testing that is at odds with mainstream scientific thought is apparently being advocated by the ASA leadership. Albert Einstein’s views on the scientific method are paraphrased by the assertion that, “No amount of experimentation can ever prove me right; a single experiment can prove me wrong” (Calaprice 2005). This approach to the logic of scientific progress, that data can serve to falsify scientific hypotheses but not to demonstrate their truth, was developed by Popper (1959) and has broad acceptance within the scientific community. … It is held widely, though less than universally, that only deductive reasoning is appropriate for generating scientific knowledge. Usually, frequentist statistical analysis is associated with deductive reasoning and Bayesian analysis is associated with inductive reasoning

This essay makes the excellent counterpoint that scientists usually do not, and should not, take some evidence from experiment that appears inconsistent with theory H as deductive proof that H is false. What they actually do is consider background assumptions they’ve made in the experimental design and analysis (chief of which is “I made no mistakes in data collection and there was no measurement error”), weigh how plausible it is that one of those assumptions is false rather than H, and, if the track record of H is quite strong, often settle on rejecting one of the assumptions. If, however, the track record of the assumptions is even stronger, then it’s arguably time to throw H into the dustbin.

This should remind you of a prior.

Our Platonic ideal scientist might also ask, “Okay, even if H isn’t perfectly consistent with this evidence, is there another theory that does any better, without cheating/overfitting?”

This should remind you of a likelihood ratio, and the prior of the alternative.

Point being, even the most precise physical theories, like Newton’s laws, can’t be logically refuted by experiment without a tapestry of assumptions about humans’ ability to perfectly measure nature. And fields like biology make a whole host of theoretical claims that are not mathematically precise, and therefore aren’t subject to strict deduction. This doesn’t make them unscientific.

While some readers may take this as a pedantic point, strictly speaking “deduction” doesn’t even seem possible in anything other than pure mathematics and logic:

If all premises are true, the terms are clear, and the rules of deductive logic are followed, then the conclusion reached is necessarily true.

Wikipedia entry, “Deductive reasoning.”

I cannot overstate how strong a standard this is. As anyone familiar with theoretical math knows, deduction only allows you to make the most modest, conservative claims, since you are constrained by rules that demand absolute certainty (up to human margin of error, and the objections of radical skeptics, anyway). This is the difference between a “theorem” and a “theory.” Doesn’t matter how many examples you test the Riemann hypothesis on, if you don’t prove it, no Millennium Prize for you. Unless you prove one of the others, anyway.

Which is why I find it bizarre to uphold frequentist philosophy as on the side of deduction. “P < 0.00001” doesn’t give you deductive certainty that the null hypothesis is false. Not even “P < 0.000000000000000000000001” would. (Technically, some null hypotheses are of the form “the effect of X on Y is exactly 0,” which you can certainly reject without any evidence in the first place, if the definition of effect in question pertains to values on a continuum. But that’s tangential to this point, and would render frequentist hypothesis testing unnecessary anyway.)

This is not a vice of frequentism. It’s a virtue. I claim that frequentists already in practice agree with Bayesians that deduction is not feasible in science. What they seem to be doing is an approximation to (an attempt at) deduction. In everyday language, you can round off “this is extremely unlikely if H is true” to “this is impossible if H is true,” and “deduce” that H is false. And the appeal of this strategy to the frequentist is that it doesn’t require claims about probabilities of hypotheses, other than 0%. But this isn’t really deduction, nor is it the proper method of induction either, and I can only wonder how many frequentists would be more open to Bayesianism if they embraced this. Induction is not epistemological anarchy, Hume notwithstanding. We have a nice mathematical formalism, Bayes’ rule, telling us how to do induction just fine.

On subjectivity

Bayesianism is accused of being too subjective, since probabilities depend on priors that are at the researcher’s discretion. Gelman has a decent reply to this:

The prior distribution requires information and user input, that’s for sure, but I don’t see this as being any more “subjective” than other aspects of a statistical procedure, such as the choice of model for the data (for example, logistic regression) or the choice of which variables to include in a prediction, the choice of which coefficients should vary over time or across situations, the choice of statistical test, and so forth.

Yet, as elegantly explained in the dialogue I linked at the start, the frequentist P-value for a given hypothesis test can depend on the subjective intentions of the researcher. For example, the P-value of a sequence of coin flips “HHHHT,” relative to the null hypothesis “this coin is fair,” depends on whether you resolved to flip the coin exactly 5 times or stop at the first tails. Or stop at the first “HT.” The adjusted P-value that won’t get your paper rejected depends on how many models you truly, deeply decided to test, and are trusted on your honor to report honestly. (Even I, fan of Bayes that I am, was a bit surprised to learn that multiple testing isn’t really a problem for Bayesian inference.)
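To make the coin example concrete, here's a minimal sketch in Python (my own illustration, not taken from that dialogue) of the two stopping rules applied to the same data:

```python
from math import comb

# Observed data: HHHHT (four heads, then a tail). Null hypothesis: the coin is fair.

# Intention 1: "I resolved to flip exactly 5 times."
# Test statistic: number of heads. One-sided P-value = P(heads >= 4 | n = 5, p = 0.5).
p_fixed_n = sum(comb(5, k) for k in (4, 5)) / 2**5  # = 6/32 = 0.1875

# Intention 2: "I resolved to stop at the first tails."
# Test statistic: number of heads before the first tail.
# One-sided P-value = P(at least 4 heads before the first tail | p = 0.5) = (1/2)^4.
p_stop_at_tails = (1 / 2) ** 4  # = 1/16 = 0.0625

# The likelihood of the exact sequence HHHHT under the null is (1/2)^5 = 1/32
# no matter which stopping rule was intended, so a Bayesian update based on the
# likelihood doesn't care about the researcher's intentions.
likelihood_under_null = (1 / 2) ** 5

print(p_fixed_n, p_stop_at_tails, likelihood_under_null)  # 0.1875 0.0625 0.03125
```

Same data, two different P-values, one likelihood.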

If your tool for establishing scientific truth can be gamed so systematically, and avoiding this gaming dissuades researchers from asking as many questions as they please, that speaks to something perverse about the tool.

Moreover, many frequentist procedures are equivalent to Bayesian ones with an uninformative, or “flat,” prior. In the context of a study about the sizes of drug effects on some health outcome, this would be the prior that models all effect sizes as equally likely. Does that model sound at all plausible to you? Do you expect, before considering the evidence contained in a given study, that it’s just as likely a dose of a new antidepressant instantly and permanently cures depression as that it makes the user feel moderately better for a day? Of course not. By considering the record of similar drugs, you probably expect a bit more than a placebo’s worth of effect (or, heck, not even a bit more), and not much variance around that mean. This is not a bald, unscientific assumption or fuzzy subjectivity. It’s an entirely reasonable summary of previous relevant evidence and common sense, that is, a prior.
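As a minimal sketch of how much this choice matters, take a conjugate normal-normal model with made-up numbers: a hypothetical study estimates an effect of 0.40 with standard error 0.20, and we compare an essentially flat prior to a skeptical prior informed by the track record of similar drugs:

```python
import numpy as np

def posterior(y, se, mu0, tau):
    """Posterior mean and sd of the effect, given one noisy estimate
    y ~ Normal(effect, se^2) and a prior effect ~ Normal(mu0, tau^2)."""
    precision = 1 / tau**2 + 1 / se**2
    mean = (mu0 / tau**2 + y / se**2) / precision
    return mean, np.sqrt(1 / precision)

y, se = 0.40, 0.20  # hypothetical study result, made up for illustration

# Essentially flat prior: the posterior just reproduces the study estimate.
print(posterior(y, se, mu0=0.0, tau=100.0))  # ~ (0.40, 0.20)

# Skeptical prior: effects of similar drugs cluster near zero with modest spread.
print(posterior(y, se, mu0=0.0, tau=0.10))   # ~ (0.08, 0.09)
```

The first line is, roughly, what many frequentist point estimates and intervals reproduce; the second is what "a bit more than a placebo's worth of effect, and not much variance around that mean" looks like in numbers.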

As that dialogue also states, proponents of Bayesianism are not recommending that researchers just report the final answer of their prior-times-update-factor computation and call it a day. The sensible standard is to require researchers to report the likelihoods they use in their analysis. And if you, intrepid reader, disagree with their choice of prior, it’s your prerogative to take a prior that makes more sense to you, and update it with the evidence.

“Why should I care?”

You’re probably not a statistician, scientist, or anyone else who uses this stuff in your day job. (My prior says so!)

So what’s at stake for you?

For one, these are some compelling reasons to be generally skeptical of claims backed up by P-values alone. Including negative claims! “P > 0.05” does not mean “no evidence for a decision-relevant effect.” This has some pretty far-reaching implications for assessing evidence about public health, policy, charity, and such.

But I think this stuff is important at a more general level. As incredibly simple as this idea is—that something counts as “evidence for” a claim to the extent (and only to the extent) that it is more likely under that claim than its denial—it’s all too easy not to abide by it when evaluating non-deductive arguments, or incorporating new information. Having the mathematical concept on hand helps me try to follow the evidence where it actually leads. There’s some evidence that Bayesian reasoning is partly responsible for the success of the world’s best predictors.

I suspect this could have crucial applications in philosophy, too. Many philosophical arguments rely on the strength of intuitions in favor of different ideas. I do think intuition is basically all we’ve got as the bedrock of, for example, ethics. But not all intuitions are created equal. An important question to ask of any given intuition is, “Would I expect people to believe this even in a world where it was false?” If so, all else equal, the fact that you find an intuition compelling is not particularly good evidence for that intuition. I highly recommend that the reader try cross-examining their beliefs with this standard.

One area where I've found this exercise enlightening is in moral consideration of animals. In a world where animals hypothetically did have all the features we'd consider necessary for a being to be morally important, would it be surprising for humans to nonetheless feel that we are overwhelmingly more important than animals—creatures that cannot protest verbally against the harm we inflict on them, that are so genetically distant from us that helping them provides "us" basically no benefit in inclusive fitness? No. This suggests that the fact that most of us consider it weird to care as much about a chicken's suffering as a human's is pretty weak evidence, if it is evidence at all, that we wouldn't care as much upon careful moral reflection.

Then there’s The Future. I will give the question of cluelessness a more thorough treatment in a forthcoming post. But as a teaser, the Bayesian approach at least gives us a way to confront uncertainty about radically unprecedented events. We don’t have to simply say no probabilities can be assigned because the events aren’t repetitions of near-identical processes. This certainly isn’t to say we should be cavalier about making decisions based on extremely imperfect estimates, only that we aren’t totally hopeless. Choosing to ignore that which you can’t confidently predict isn’t noble skepticism; it’s an implicit prediction that the thing you’re ignoring washes out in the end, which is a strong assumption!

Appendix: Why you might not want to use Bayesian statistical methods anyway

This essay is fundamentally about philosophical frequentism vs Bayesianism, not the clusters of statistical methods that have been labeled frequentist vs Bayesian. I don’t have much of a problem with many “frequentist” methods, any more than I have a problem as a pure consequentialist with following rules. Null hypothesis significance testing, as we’ve seen, is a glaring exception. Non-Bayesian methods can be super useful heuristics when, as is often the case, computing the full Bayesian solution isn’t feasible. Andrew Gelman at Columbia University is as Bayesian as they come, and he agrees with this assessment.

In my research territory of online and reinforcement learning, for instance, “frequentist regret bound” is a common term for a guarantee about the performance of an algorithm that holds universally across some class of problems. There’s nothing inherently frequentist about this. I’ve recently done some work building on a paper that uses a Bayesian approach and proves exactly this sort of guarantee. (For the record, this is why I strongly prefer to call it a “worst-case regret bound.”) (For the second record, so far Bayesian methods have been found superior in reinforcement learning, but who’s counting?)

Though I don’t agree with every point in this Fervent Defense of Frequentist Statistics, it’s worth a read.

Tranquilism Respects Individual Desires

[Brian Tomasik has written on this objection to negative utilitarianism as well, and also applies a perspective of considering different person-moments as different “people” in moral tradeoffs, but here I address a different problem with the objection. Further, I’d expect the biggest criticism of my argument here would be that preferences aren’t the only thing that matters—hedonic experience is the main moral object. I’m sympathetic to this view, but my target audience here is people who consider respect for individual preferences to be the main virtue of a moral theory.]

The criticism

A common critique I encounter of negative utilitarianism and tranquilism—briefly, the idea that any experience without desire for change is perfect, and an absence of experiences is not an imperfection—is something like:

Look, I know what I prefer better than you do, and I prefer to experience small amounts of suffering in exchange for great happiness, independent of any confounding factors about effects on other people or fulfilling my desires about a life narrative. I would just prefer that sequence of experiences, over total neutrality. I’ve thought long and hard about this, and it’s what I want. Who are you to say otherwise, and impose your weird conception of value upon me?

This is fair insofar as the preferences we'd like ethics to respect are people's stated preferences over future world-states, in the abstract. Or people's retrospective judgment that some past bundle of experiences was worth it. But I don't think this is actually the right unit of "preference" to consider. From the NU FAQ:

My craving for the pizza is greater than my desire for eating potatoes, and it is probably accurate that the pizza-experience is of greater pleasure-intensity. However, if we are going to therefore conclude that what is more pleasurable is automatically better (and that the worse states are in need of improvement!), Buddhist intuitions will object: We have been looking at it from the wrong perspective! We have been comparing, from the outside, two different states and our current cravings for being in one or the other. Why not instead look at how the states are like from the inside?

This seems far more sensible to me. How I feel about a given sequence of experiences my future self might have, as a third party considering those experiences, isn’t the final word. Sure, the best we can do to prevent bad futures is reflect on our memories of experiences, and use those to predict what we and others would like, but there are serious biases we need to account for in this process. And I think one of the strongest is this: we confuse how we feel about a given set of experiences when we’re not experiencing them, with the nature of the set of experiences itself.

Suppose you say that you would rather, at the end of your life, 1) be uploaded to an experience machine in which you burn alive for 5 minutes but then experience 1 million years of pure unadulterated joy, than 2) die as normal. By hypothesis, since this is an experience machine, there are no effects on other people. This is purely a matter of the value of the experiences themselves. Whose preferences are you really honoring here by choosing (1)? It does honor your current self’s preference-over-futures. It certainly doesn’t respect the preferences-over-immediate-states of the moments of your future self that would be burning. Those moments of experience surely would extremely disprefer burning alive, wanting desperately for it to stop. If this discussion of moments of experience is puzzling, I recommend learning about the problems with the folk view of personal identity; here is an eloquent statement of an alternative view:

Thinking this way, I sometimes slip into thinking of myself as a “time-slice,” where the experiences of the time-slice have the special “I am there for it” property, but none of the experiences of any future people do; rather, future versions of myself are more like especially intimate friends, people who I love and support, towards whom I feel deep understanding and loyalty, but whose experiences fall on the other side of the same chasm that separates my experience from the experiences of other people around me. [emphasis mine]

Here, I don’t mean the intellectual attitude you might have of believing, during the burning-alive moments, “Although this sucks right now, it will be worth it.” That is a preference in some sense, a von Neumann-Morgenstern (VNM) utility theory “preference,” but not the kind of basic preference—desire, or craving—that is clearly morally problematic to violate. A preference of the former sense can be attributed to a machine programmed to convert the galaxy into plastic. I see no reason to honor that preference in itself, independent of whether the plastic-maximizer has subjective experiences of dissatisfaction upon seeing how little plastic is in the universe, or any game-theoretic considerations. What really matters is any attitude of the form, “I (do not) want this current experience to change.”

What about the preferences for joy?

Perhaps you think that while choosing (1) disrespects the preferences of the person-moments who burn, this is just outweighed by respecting the preferences of the far, far, far more numerous person-moments experiencing the joy. Far more and stronger desires, not mere plastic-maximizer preferences, are satisfied by (1), on this account.

The problem is that choosing (2) could not possibly violate the desires of the joyous person-moments in any consequential sense of the term “violate,” because under that choice such moments don’t exist. Again, we should not confuse “making a world that would be evaluated negatively according to some hypothetical utility function” with “making a world that is experienced as dispreferred.” Why exactly should we think the former is a problem?

Notice the genuine asymmetry here. Choosing (2) does not instantiate any person-moments whose craving for the many joyous moments is denied. But choosing (1) does instantiate some person-moments whose craving to not burn alive is denied, and flagrantly so. A common misconception is that by rejecting the goodness of eventual preference satisfaction for currently nonexistent beings (or experience moments), one necessarily has to reject the badness of eventual preference dissatisfaction for currently nonexistent beings. I might agree with this claim if the preferences in question were third-person VNM preferences over futures, the sorts of preferences that I argued above weren’t really relevant moral objects. But that’s not what we’re considering here. The asymmetry lies in this: the alternative to creating those joyous person-moments, in this thought experiment, is not the creation of person-moments with a frustrated desire for joy. (If it were, a negative utilitarian would object to that just as the critic would.) It is simply no creation at all. But the alternative to no creation is to create person-moments that do obviously experience a severely frustrated desire, by virtue of burning.

I must stress that by affirming this asymmetry, I am not asserting some special metaphysical status of “persons,” or badness-for-someone. The point is not that you can’t harm a being who doesn’t exist yet. Such a view would be subject to the non-identity problem, and the absurd implication that a prospective parent should be indifferent between having a child with chronic pain and one without. Quite the opposite. By considering the moments of experience that might result from each option in our dilemma, we see how cruel it is to subject some of them to torture, however relatively brief, without relieving any greater harm.

It seems more paternalistic, in the relevantly problematic sense, to insist to those moments of oneself that will suffer, “Your sacrifice, which I impose on you, is for the greater good. You wouldn’t want to violate the preferences of your past self, who decided that you should go through this, would you? Don’t be so authoritarian!” To insist this would be to attribute desire frustration to imaginary beings, moments that would not have existed in the alternative option.

Another advantage of this approach is that we don’t need to apply ad hoc patches to the problems arising from uninformed preferences. The typical solution is to “idealize” preferences, considering what they would be if the subject were fully informed and reflected for a long time. I have never found this approach satisfactory, especially because it proves too much. “Idealized” preferences are super underdetermined, and in practice seem to reduce to whichever preferences best accord with valence utilitarianism (or immediately experienced desires) anyway. It’s more appropriate and parsimonious, I think, to assess the moment-to-moment preferences about their respective momentary experiences. These are surely as informed as they reasonably can be, because they are in the midst of the experience itself. So when, for example, a doctor gives a painful vaccination to a child, what makes this acceptable is not that the child’s preference against the shot is uninformed—surely that preference counts for something in its own right. Rather, it’s that the child will predictably disprefer the experience of the corresponding illness.

Other statements of the objection

Some other (anonymized) quotes on this point, and my response from this perspective:

I maintain that people are the final arbiters of what is good for them.

Discord user

This is probably roughly right, but only when such people are assessing “what is good for them” while they’re experiencing it. It makes no sense to say that I, currently, am the final arbiter of whether it would be good for my future experience-moments to have my consciousness extended after death in the experience machine described above. Who am I to impose that terrible fate on a person who isn’t my current self, who just happens to share memories with me? I almost certainly know better than someone else whether those future experience-moments would regret the whole sequence at the end of it, or consider it worthwhile—and this is exactly why paternalism is generally dangerous. But this doesn’t mean I “know” that the pleasure of the many outweighs the agony of the few, even if the many wouldn’t miss it.

If I found out [someone] had tried to prevent my birth (convince my parents not to have me, for example) in order to “save” me from suffering, I’d be really mad at them.

Comment on "Fear and Loathing at Effective Altruism Global 2017"

Mad on whose behalf? Your actual, born self? Clearly not, since you’re alive. This purported wrong didn’t happen to you. Your hypothetical non-born self? How would they be wronged? They had no preferences that would be violated by refraining from creating them. If you’d be really mad in this case, it seems you’d have to be really mad at everyone who is in the financial position to have another child with a decently high standard of living, but doesn’t do so.

“Preferences” in particular seems like an obvious candidate for ‘thing to reduce morality to’; what’s your argument for only basing our decisions on dispreference or displeasure and ignoring positive preferences or pleasure (except instrumentally)?

Rob Bensinger

I suppose that by “positive preference” he means a preference of the form, “I do not want this experience to change.” If so, there is no problem posed by not having that preference. What is the point in putting yourself in a state of wanting something to continue, given the risk that it may not continue, rather than simply avoiding the want in the first place? It’s perfectly good to not want.

Concluding thoughts

I’ve frequently seen symmetric utilitarians assert that negative utilitarianism is based on an arbitrary asymmetry, as if we treated all negative numbers as infinitely greater in absolute value than positive numbers. But not only does that rely on the controversial stipulation that happiness and suffering are opposites of the same currency, it doesn’t account for the independent arguments for the asymmetry, one of which I’ve presented here.

It’s not just that NUs have the intuition that the procreation asymmetry is true, that the Very Repugnant Conclusion is extremely unpalatable, that Omelas is an unjust civilization and Ivan Karamazov is right to return his ticket, that the Logic of the Larder is wrong even if we very implausibly grant that farmed animals have more happiness than suffering in their lives, etc.—not just that we have inferred from these intuitions the fundamental frivolity of pleasure in comparison with misery. Rather, the asymmetry is inherent to the structure of the preferences of sentient beings.

How Do You Know?: Some Amateur Moral Epistemology

Despite having thought hard about thorny philosophical questions for years, as recorded excruciatingly in this blog, here is a brutal truth: I can’t say I know exactly what are good reasons for having some ethical beliefs over others. (Though I have some guesses!)

This is very unfortunate, not least because I think it undermines my perceived credibility relative to that of people I’d consider overconfident.

On one hand, I’m tempted to just say, “Well, shoot, the problem of good reasons for believing things in general hasn’t been solved! There’s the problem of induction, Descartes’ radical skepticism, etc. But we get on with our lives just fine. We might be brains in vats, but does that really matter?”

I think there’s something to this pragmatism. But there’s a key disanalogy here. You can’t sweep moral epistemology under the rug of just-do-what-works, because the meaning of “what works” is exactly what’s under dispute! The proof of the pudding of science, say, is the construction of technologies that would fail if the scientific findings were wrong. What about ethics?

Option 1: Motivation of “good” behavior

The seemingly obvious answer is that someone who understands ethics well should act like a common-sensically “good person.” This is the premise behind articles lamenting the inability of ethicists to behave better than other philosophers (or other people generally). But this doesn’t really answer the question, now does it? It presumes we already know what good is.

A charitable interpretation is that we roughly do know what good is, by the definition of good as alignment with our intuitions. It's just that we need to make our understanding of those intuitions more systematic and consistent. Arguably many supposed moral disagreements are about matters of empirical fact. If I believed that the death penalty was systematically much better than the alternatives at reducing rates of extreme violence (outcomes worse than the executions themselves), it would be much harder to oppose. While I'm inclined to think intuitions are unavoidable in the chase for The Ought, this account isn't satisfactory. It seems more likely to me that what we generally "know" is what social norms most people endorse, or feel uneasy violating, and history attests to the massive gulf that can exist between those norms and what actually (probably) increases wellbeing.

To see this, consider that some people have the intuition that terrible actions merit eternal punishment, i.e., hell. This is a majority belief among Americans! (In Western Europe the numbers are much lower, around 20%, but to me this is still shockingly high.) Conceivably, I suppose some could believe in hell but not that it is just. But I doubt this is very common; belief in hell only really makes sense if one believes religions that claim its existence, and almost invariably these religions purport that a morally flawless God condemns people to such a hell. Even in thought experiments divorced from the influence of religious dogma, most people evidently think it is good to inflict suffering on someone guilty of massive wrongdoings, without any possibility of rehabilitation or deterrence of others.

I find these intuitions not only unbelievable, but thoroughly, bone-chillingly repulsive. There is absolutely no rehabilitative purpose eternal suffering could serve. Even as a potential deterrent, belief in hell is quite clearly not effective enough to relieve more suffering than it would entail if it were real; negative infinity plus a very very large positive number is still negative infinity. This moral belief appeals to our basest desires for pure retribution, and does not reflect a sensible principle for making the world better.

Maybe one could rationalize this as believers not really thinking hell is just, but only resigning to the unimpeachable wisdom of a higher power who evidently thinks it is just. This could be like how I find it mind-rending to try to believe in any of the interpretations of quantum mechanics, but I defer to the expertise of the physicists who give the most sensible endorsements of them. Consider this post partly an invitation to my hell-believer readers, to tell me if they see things this way. In any case, I prefer to take people at their word.

So I don’t accept that “we” know what good is, if “we” can make what seems to me to be such an obvious and horrifyingly backwards error.

And they would surely say the same about my views. So would my future self, probably! That’s the particularly scary part, the part that prompted me to write all this. My premises seem just obvious to me, the sort of thing no person who really comprehends the facts of human experience and isn’t arbitrarily privileging themselves could deny—although how these premises are put together into conclusions, not so much. The implication seems to be one of: 1) No one is “right” about this, we’re just starting from irreconcilable premises. 2) I’m just so good at thinking that I found a truth very few people recognize. 3) I am actually dead wrong about those supposedly undeniable axioms. 4) Most others actually have already found the narrow truths I find basically undeniable, it’s just that we have stacked different extra layers onto the same core.

I’ll address the possibility of (1) later. (2) is enormously unlikely from the outside, and pretty arrogant, though not out of the question. It feels about as hard to accept (3) as to deny the law of noncontradiction. Maybe that’s too strong, but consider that some logicians have argued there are reasons to deny that law. And one of my friends has said they find one of my views on value about as (in)credible as the view that tolerates contradictions. I have changed my mind before, but if I recall correctly, these changes weren’t radical changes in bedrock. They were changes in how I made sense of the bedrock.

I think I’d always known I could explain the “goodness” or “badness” of states of affairs, character traits, rules, whatever, in terms of conscious experience and wellbeing. But I had to learn that I could go one step further, that the non-instrumental value of “positive” wellbeing was in reasonable doubt, while the intrinsic disvalue of the negative wasn’t. I had to learn that comparisons of the wellbeing of populations of different sizes could force my intuitions into a corner, and the most plausible resolution was to accept that my brain was just not designed to comprehend 1) astronomical numbers and 2) circumstances drastically different from the real world. Others should not suffer because of that problem. I had to learn that, contra Rawls, “separateness of persons” is not a deep fact built into the universe that withstands philosophical scrutiny. It is a disconnect between moments of experience no more fundamental than the disconnect between myself before and after I am put to sleep for a surgery. Reifying this disconnect too much could have horrific consequences for the many moments of experience that one refuses to help. To be specific, when you catch yourself thinking that some utilitarian policy harms the individual for the sake of the collective, remember that the collective is a lot of individuals. They would not be comforted by the idea that you’d be defending one individual by neglecting their needs.

I mention these examples not so much to proselytize my perspective, as to show that even when everyone shares the basic principles, there are a lot of details to work out. I suppose this makes Option 1, properly interpreted, more reasonable than I suspected at first.

This blends into (4), which I find most plausible. Basically everyone thinks suffering is bad, often really bad, all else equal. And I share basically everyone’s response of disgust at other non-suffering things, or absences of things. I hear about the organ-harvesting doctor example and think, “Gross. I don’t want that.” I hear Lucretius’ and Epicurus’ take on death and think, “That feels off, come on, almost no one wants to die so it has to be bad.” Most others, prompted by their disgust at such things, consider them bad as well in a sense commensurable with suffering. The difference between us is that I see my responses of disgust, in these cases, as either rough signals of instrumental badness, or signals of evolutionary-fitness-thwarting-ness. The consequences of this difference are substantial, but it’s no more bizarre than the difference between those who desire money strongly for its own sake, and those who recognize that money is just a means to ends. (I have used an analogy that’s flattering to myself here, but I know I could be wrong!)

So if I end up changing my views again, it will probably be because of a discovery that the instrumental-badness hypothesis for some X simply isn’t more plausible than its denial. I’d encourage the reader to think themselves about their own answer to the question, “What would make you change your mind?”

This line of thinking gestures at an overlapping option:

Option 1.1: Accordance with common sense

I can’t tell you how many times I’ve read defenses of ethical theories that attempt to avoid the weird problems by only forming local theories, ones that pertain to the problems people actually face in daily life and leave thought experiments to the birds.

This is a deeply problematic response, because these supposedly weird problems do bear on daily life, if you include in “daily life” things like your choice of career, what you might donate to, what policies you support, whether you have children… Hard questions don’t go away by ignoring them, or relegating them to a separate sphere from the “easy” stuff. I won’t say more on this point simply because the perspective I’m rejecting here doesn’t seem remotely defensible, and I haven’t heard such a defense even from the smart people I know who endorse it. But I’d welcome comments providing this defense, if it exists.

Option 2: Accordance with one’s reflected preferences

This is a perspective that is very common in circles of people I respect and talk to a lot, but not really elsewhere. The idea is that "right" and "wrong" are no more or less than whatever satisfies or violates the preferences of a given speaker, provided that the speaker thinks critically about what those preferences actually are. This is not merely following instantaneous whims. But it is still fundamentally relative to the subject.

While I’m not fully sympathetic to this view, it’s important to appreciate how far it could take you. First, “preferences” here are not just self-interest. An altruistic person can want bad things not to happen to others. They can want this quite strongly, in a way that motivates them to sacrifice self-interest. Second, what matters on this view is not whether you lack the feeling “my preference is frustrated,” but whether (and to what extent) the preference actually is satisfied. In particular this means that relative to an altruist’s values, it is very wrong to take a magic pill that makes them an egoist—eliminating the altruistic desires, so the feeling of preference frustration never occurs.

Also, the altruist has very good reasons for encouraging others to do altruistic things as well, even if they don’t really want (yet) to be altruistic. Practically, even if you can’t convince Hitler to be a good person by appeals to reason or to what he wants, you can still stop him by force and social institutions. The process of moral evolution within a person, according to Option 2, is really a process of discovering more coherently what you truly want. “I should do X” is purported to only make sense as “Doing X will achieve what I want,” with “want” broadly construed, as something like “value” or “prefer.”

I think this is closer to the truth than Option 1. It’s almost surely closer than a view of cultural relativism. (Why should it matter that “my culture” has a given set of values, rather than just myself or the set of people I generally respect as moral thinkers? Why is a given culture’s set of values inherently worth honoring, no matter how destructive?) But it still is unsatisfactory.

For one, my experience with forming new moral beliefs just hasn’t matched this picture. This doesn’t necessarily mean there’s an inconsistency. I’m sure a proponent of Option 2 could insist that I’m mistakenly interpreting my data. But changing my mind on questions surrounding “what ought I do with my time/life decisions” doesn’t feel like figuring out my preferences. I roughly know what I prefer, or value. Many things I value, like a disproportionate degree of comfort for myself, or avoiding the icky feeling of a conclusion that defies the common sense of my primate brain, are things I very much wish I didn’t value because I don’t think I should value them.

A common response I’ve received is something like: “The values you don’t think you ‘should’ have are simply ones that contradict stronger values you hold. You have meta-preferences/meta-values.” Sure, but I don’t think this has always been the case. Before I learned about the cluster of ideas I’ve been obsessing over the past few years, I don’t think it would have been accurate to say I really did “value” Impartial Maximization of Good Across Sentient Beings (TM). This was a value I had to adopt, to bring my motivations in line with my reasons. Encountering said ideas did not feel like “Oh, you know what, deep down this was always what I would’ve wanted to optimize for, I just didn’t know I would’ve wanted it.” I probably would’ve agreed that Impartial Maximization of Good Across Sentient Beings (TM) was what I ought to do, and disagreed on the details because I was less informed, but I didn’t want it. I wanted my own happiness, and largely still do.

Further, this view seems to imply that if I preferred to be in pointless suffering for an hour, I ought to bring that about. The evidently sensible reply is that no one prefers pointless suffering, by the very nature of suffering, so the hypothetical here is misleading. I’m not sure about this. I have talked to at least one person who affirms that it’s conceivable for a being to want its own suffering, and that there’s nothing irrational about this. Others affirm this, though maybe they’re confused.

The fundamental question here is whether terminal goals, values, and preferences can be irrational, or if this is a category error. And this is not merely definitional, I think. “Rational” in the sense that both I and proponents of Option 2 use it is, I think, a normative term. It declares that which an agent has reason to do, that which it would be a mistake not to do. For both me and the Option 2 proponent, rationality is an idealized function that labels actions with their extent of “what you have reason to do”-ness. The difference is that they would say this function is always relative to terminal goals, and indeed makes no sense absent such goals. I see the appeal of this view, but as suggested by the paragraph above, I don’t see what makes a goal so special. The existence of a goal makes no demands for its own achievement, except as far as a being feels dissatisfied when they fail at their goal. That dissatisfaction, to my eyes, does make such a demand. How could it not?

But what would I say to the following observation? It sure seems like moral attitudes are strong social emotions, an outgrowth of evolutionarily adaptive machinery. Why would we expect these to track any "truth," rather than express our desires? It appears the simplest explanation for why we have the attitudes we do is a combination of genetic and memetic fitness.

I’d actually agree with this, when it comes to many attitudes. All else equal, yes, a given moral intuition doesn’t deserve sacred respect. It probably has some game-theoretic function that could be irrelevant, in different environments, to the business of “making the world a better place.” I’d probably go so far as to say the vast majority of moral disagreement that can’t be explained by empirical disagreement, is attributable to misapplications of useful principles outside of their appropriate contexts, and our failure to internalize the fact that other consciousnesses exist besides our own.

But something like the disvalue of suffering, well, explaining that away with evolution would miss the point. Yes, when you suffer, it’s because your genes wanted you to register destructive stimuli so you could avoid them. That doesn’t mean your experience of its badness is an illusion. I can’t say the same for, say, the feeling that bad people deserve hell.

Why, again, is all this relevant to my original question? Because if I could accept Option 2, the answer would be pretty obvious. I could look at what I want, check that I’ve identified the most fundamental wants and eliminated contradictory wants, and go from there. Thought experiments would still be somewhat confusing, but they would just be evidence that my wants are complicated—they would not make strong demands on my choice of principles to endorse.

Unfortunately I can’t accept it. It doesn’t quite make sense.

If you started reading this essay expecting a satisfying answer by the end, sorry to disappoint you. My bumper sticker answer would be something like: “I observe what I don’t like to experience, that which I feel needs to stop. I consider abstract dilemmas that help me extend this evidence to less intuitive or familiar cases. Then I acknowledge that I am not special. My own experiences should not be privileged above others’, any more than my present experiences should be privileged above my future ones.” There are probably some philosophical holes here, but I think everyone needs to consider this question if they are to not be carried through life by short-term wants alone.

On My Hypocrisy

For eight years as a vegetarian, I wondered how thoughtful, kind, ethical people could continue eating meat when they knew where it came from. Unlike many positions I hold about politics or philosophy, this is one that I simply can’t see any valid justification for disagreeing with. It’s literal torture for the sake of a product that isn’t necessary.

Then it hit me that this was exactly what vegans probably thought about me, as someone who had continued eating dairy and eggs despite knowing where they came from.

It’s not as if I didn’t know how much pain these industries were causing. I knew that male calves are slaughtered as excess byproducts of the dairy pipeline. I knew that cows and chickens were repeatedly impregnated to their physical limits to produce milk and eggs, respectively. I knew that male chicks were ground up en masse, treated as useless objects. All these practices would continue if no one ever ate another bite of meat.

So what took me so long? What was stopping me from removing all animal products, not just meat, from my diet?

To be fair to my past self, this wasn’t for lack of trying, exactly. I briefly went vegan in high school until my doctor expressed concerns about my weight loss. In hindsight, this was because I wasn’t eating a balanced diet, not because of limitations of veganism. Now that I eat according to something more reasonable than a teenager’s naive idea of veganism, this isn’t an issue, and I’m reasonably sure I could have returned to it sooner if I hadn’t rejected beans, a variety of vegetables, and vitamin supplements like the plague. (Perhaps poor timing for that idiom.)

Still, it was hypocrisy. And it demands an explanation.

I should preface the claims that follow with a couple notes. First, the actual causes of many human behaviors often differ from the stories we tell ourselves about our reasons for such behaviors. And I’m no exception. So the best I can do is try to reasonably diagnose my past disease of cognitive dissonance, but this will be imperfect.

Second, I don’t mean to suggest that vegetarians are evil. In general I think that people treat hypocrisy with a level of contempt that is disproportionate to the harm it causes, independent of the badness of whatever bad thing the hypocrite is doing. Consuming dairy and eggs was a moral failing of mine, yes, and I’m not proud of it. But my motivation for writing this is mostly to try to reflect on a weird inconsistency, not to harshly condemn it. (I would, of course, encourage vegetarian readers to take the next step, just not out of a sense of contempt.) If you’re considering going vegetarian but afraid that’s not enough, for the love of God, don’t listen to the purists. I promise you the animals would prefer you do something that helps them rather than say “why bother?” and walk away.

Okay, that’s out of the way.

The most obvious answer (read: excuse) is that they just tasted so good. Pizza, cookies, doughnuts, risotto, even egg-based meat substitutes. These things were nontrivial temptations. But it’s not so obvious that I really liked these foods much more than I’d liked meat before going vegetarian. As a kid I’d happily gorged on meat as much as the average American, probably, and if anything I was relatively repulsed by dairy and eggs.

Maybe it was the sheer accessibility of these foods. You sort of have to go out of your way to eat meat at social functions, while cookies at a party or a box of cheese crackers are just there, ready for the taking. Pizza especially is everywhere in college, basically synonymous with “free food” at talks.

Maybe it wasn’t how much I liked them, but how much I wanted them. Cheese is addictive. This is a neuroscientific claim, not hyperbole, and its pull apparently isn’t far behind caffeine’s.

Maybe I just underestimated how easy it would be to stop. And I do mean easy. People’s experiences vary, and of course it takes some adjustment, but considering I made this switch just as I started grad school, that should tell you it really didn’t require a Herculean feat of willpower or a fat wallet.

But I believe it was most strongly a matter of plausible deniability of the harm. Or at least theoretical deniability. There’s no reason in principle that dairy or egg production requires exploiting or painfully slaughtering animals, I told myself. By purchasing these products I was not really expressing demand for the practices behind them, just the products themselves. Right? Meat, by contrast, was inherently violent.

This was, of course, nonsense. Cows and chickens don’t care about “in principle,” not when they are denied damn near every aspect of a flourishing life because profit trumps well-being. This unfortunately describes the overwhelming majority of farmed cows and chickens. (Link to a disturbing but frank and truthful source of evidence on this.) Yet somehow I was able to convince myself, for so long, that this didn’t matter, or at least didn’t matter enough to make me change my behavior.

And that’s terrifying.

I’d like to believe I had good reasons, that I was doing what was justified given the information available to me. But I can’t honestly say that’s what happened. What happened was that I wanted something, there were no sufficiently strong social norms preventing me from taking it, and I could point to some semblance of a relevant difference between meat and everything else that’s forcibly taken from animals.

What’s particularly distressing is not just that I was a hypocrite, but that I knew I was a hypocrite. I knew it as much as Hank Green knew it about his omnivory. Such is the power of akrasia.

All of this is to say, to any readers who eat meat: I get it. I know where you’re coming from, I really do. I know how impossible it appears to be to remove beloved foods from your plate, especially in the face of social inertia. I know what it’s like to really care about animals, as you probably do, to not want them to suffer, and to recognize that factory farms should be abolished, as about half of Americans in a recent survey apparently do—and yet to nonetheless do something on a daily basis that contradicts these values. I also know how deeply uncomfortable and frustrating it can be to be criticized for that action, by people who are surely no saints themselves.

I won’t call you a bad person. Not just because I think that speaking in terms of good or bad people is inherently counterproductive (though it is, especially since believing you’re a bad person tends to leave you wondering, “why bother trying?”). But also because I don’t think I was a “bad person,” and that is the conclusion I’d be forced to draw about myself if I applied the same standards I apply to other people.

I did bad things. They were serious failures. Yet here’s the thing: they were acts, and they didn’t determine what I was capable of doing (or not doing) in the future. Not to say they didn’t reinforce their own habit; that much is beyond question. The point is that once I let go of the idea that doing these bad things marked me as a Bad Person, the need to rationalize them started to fall away. (You’d think I would’ve learned this lesson from going vegetarian, but I never claimed to be smart.)

I disabused myself of the myth that I was, or needed to be, a timeless being who has achieved moral enlightenment, and therefore could not abide evidence that I’d made a large mistake.

I am not some exceptional paragon of self-control or virtue. So I know you can do this too. If it feels hard, please reach out to me, as I’m happy to share recipes!

Diary of an Amateur Machine Learning Guy

[These numberings of the days are mostly made up. I can’t remember exactly how long it took to do each step of this wild process. I hope this is readable and vaguely enjoyable to readers who don’t do what I do.]

Day 1.

I have an idea.

It’s not a terribly brilliant idea, and I have my doubts that it will be able to scale well. But it’s an idea nonetheless, and that’s precious.

Day 2.

What do you mean it’s not differentiable?

… Oh.

I guess the bright side of this is that harder problems have more academic prestige.

Day 5.

Worry not, friend, it seems there’s a solution to this that has to do with GANs! Those magic things that let people create fake faces.

No, my idea has nothing to do with faces whatsoever. I’m as confused as you are.

But it should work. In theory. The same basic principle is there: make artificial imitations of some behavior by pitting them against a judge who rejects the imitations that are bad. Kind of like evolution I suppose, but, uh…

Much faster. Hopefully.

I even got to do the sexy Good-Will-Hunting-furiously-writing-math-on-a-chalkboard-that-isn’t-even-really-that-complicated-but-it-looks-mathy thing!
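(If you’re curious what that judge-versus-imitator game looks like in code, here’s a minimal sketch in PyTorch on a toy problem of my own choosing: a generator learning to imitate samples from a one-dimensional Gaussian. This is just the basic principle, not the actual project.)

```python
import torch
import torch.nn as nn

real_data = lambda n: torch.randn(n, 1) * 2.0 + 3.0  # the "behavior" to imitate

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # imitator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # judge
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    # The judge learns to tell real samples (label 1) from imitations (label 0).
    real, fake = real_data(64), G(torch.randn(64, 8))
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # The imitator learns to produce samples the judge mistakes for real ones.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())  # should drift toward the real mean, ~3.0
```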

Day 6.

What do you mean the professor of the course doesn’t like the idea? Whatever. He’s condescending anyway.

Day 8.

I found an alternative. Sort of. It’s premised on the assumption that humans will act based on instant gratification rather than plan optimally for long-term utility, which is, well, frankly probably closer to the truth anyway.

And you can differentiate it, by gum.

Day 12.

Well this is exciting! My advisor likes the idea. Can’t wait to disappoint him.

Day 15.

This requires a lot more calculus than I expected. Not the hard kind, just…a lot. Why did I put so many parameters into this thing? Ugh. I can differentiate it but I don’t want to. I really don’t. An unpleasant prospect.

Day 17.

How’d you spend your spring break? Coachella? Burning Man? [Disclaimer: I don’t actually know nor care for the purposes of this narrative when Coachella or Burning Man take place.] Ah. That’s…

Tame.

I scoured the Internet for a closed form expression of the third moment of a multivariate Gaussian while watching Little Miss Sunshine.

Day 18.

Hm. Bad news: such an expression appears not to exist, at least in publication.

Good news: I can just approximate! In statistics we don’t say, “I love you,” we say, “By the strong law of large numbers, this Monte Carlo estimate is consistent” and I think that’s beautiful.
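(Concretely, the approximation is nothing fancier than averaging over a pile of samples. A toy sketch with made-up parameters, not the project’s actual code:)

```python
import numpy as np

rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0, 0.5])
cov = np.array([[1.0, 0.3, 0.0],
                [0.3, 2.0, 0.4],
                [0.0, 0.4, 1.5]])

# Monte Carlo estimate of the third moment tensor E[X_i * X_j * X_k]:
# draw lots of samples and average the triple products. Consistent by the
# strong law of large numbers, as promised.
n = 200_000
x = rng.multivariate_normal(mean, cov, size=n)           # (n, 3) samples
third_moment = np.einsum('ni,nj,nk->ijk', x, x, x) / n   # (3, 3, 3) tensor

# Sanity check against a known entry: E[X^3] = mu^3 + 3 * mu * sigma^2.
print(third_moment[0, 0, 0], mean[0]**3 + 3 * mean[0] * cov[0, 0])  # should be close
```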

Day 20.

At long last, done with the calculus. Fortunately the cool thing about gradients is that since they’re just the change in the function when you nudge the input a bit, you can check you calculated them correctly by, well, nudging the input a bit.
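(For the curious, the check looks something like this, with a made-up function standing in for my actual model:)

```python
import numpy as np

def f(w):
    return np.sum(np.sin(w) * w**2)                       # stand-in objective

def analytic_grad(w):
    return np.cos(w) * w**2 + 2 * w * np.sin(w)           # gradient computed by hand

def numerical_grad(f, w, eps=1e-6):
    g = np.zeros_like(w)
    for i in range(w.size):
        bump = np.zeros_like(w)
        bump[i] = eps                                     # nudge one coordinate a bit
        g[i] = (f(w + bump) - f(w - bump)) / (2 * eps)    # central difference
    return g

w = np.random.default_rng(1).normal(size=5)
print(np.max(np.abs(analytic_grad(w) - numerical_grad(f, w))))  # should be tiny, ~1e-8 or less
```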

The uncool thing is that I checked them and they’re wrong.

Day 21.

So after some tweaking they were…sometimes right. You’ll never guess who spent an hour wondering why the gradients were wrong and found that a humble sign error was to blame.

But also, even after the tweaking, they varied a lot.

And you’re going to love this. I had every reason to expect they would.

See, I took a closer look at the source paper for the method I’d been using to put this little idea into practice and, well, wouldn’t you know it! There’s a solution to this problem, and in fact it’s one of their key findings that put this paper on the map.

Let’s try it out.

Huh.

Those were much easier to compute. And the variance is way down.

[Image: surprised Pikachu]

Day 23.

It’s, er, quite a bit harder to get the step of “implementing stuff that has already been invented but is necessary to simulate the ‘data’ I need for this project” done efficiently than I expected.

Long story short, I need to get an agent (by which I mean, a virtual dot that moves around) to successfully navigate toward the Good corners of a grid, and away from the Bad ones. One would think this is easy.

[Image: a grid of arrows]

Alas.

Day 24.

So, there’s a method I could’ve used for the thing I was complaining about yesterday. I could have used it, but I decided against it because it relied on information on this grid that, yes, I technically had, but I didn’t need to have in principle. There might be harder problems out there where I can’t fall back on such info, and I’d feel pretty silly then, now wouldn’t I?

Perhaps.

However, it just didn’t seem to occur to me, in my infinite wisdom, that this was part of the aforementioned “implementing stuff that has already been invented but is necessary to simulate the ‘data’ I need for this project” step. Now, if something’s already invented, and you want to test how good something you invented is, is it really worth your time to fine-tune that first something? Is it?

No. Not when using the method you stubbornly refused to try before cuts the runtime down about five-fold.
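(I’ll spare you the details of the exact method, but as one classic example of a planner that exploits full knowledge of a grid’s layout and rewards, here’s a minimal value-iteration sketch on a made-up 5-by-5 grid with one Good corner and one Bad one. An illustration of the flavor, not my actual code.)

```python
import numpy as np

size, gamma = 5, 0.95
rewards = np.zeros((size, size))
rewards[0, 0], rewards[size - 1, size - 1] = 1.0, -1.0   # the Good and Bad corners
moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]

V = np.zeros((size, size))
for _ in range(200):                                     # sweep until values settle
    V_new = np.copy(V)
    for i in range(size):
        for j in range(size):
            candidates = []
            for di, dj in moves:
                ni = min(max(i + di, 0), size - 1)       # walking into a wall keeps you in place
                nj = min(max(j + dj, 0), size - 1)
                candidates.append(rewards[ni, nj] + gamma * V[ni, nj])
            V_new[i, j] = max(candidates)                # act greedily, using the known dynamics
    V = V_new

print(np.round(V, 2))  # values fall off smoothly with distance from the Good corner
```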

Day 25.

The iron law of making machine learning work in practice is: “Every for-loop must go.”

Shame I was too lazy to take that to heart before.
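(To make the law concrete, here’s a toy before-and-after on a computation that has nothing to do with my project: pairwise squared distances between the rows of two matrices. Same math, but one version refuses to make Python do the looping.)

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(300, 64)), rng.normal(size=(400, 64))

def pairwise_sq_dists_loop(A, B):
    # Slow: explicit Python for-loops over every pair of rows.
    out = np.empty((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        for j in range(B.shape[0]):
            out[i, j] = np.sum((A[i] - B[j]) ** 2)
    return out

def pairwise_sq_dists_vec(A, B):
    # Fast: the same quantity, expanded as |a|^2 + |b|^2 - 2 a.b and pushed into array ops.
    return (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T

assert np.allclose(pairwise_sq_dists_loop(A, B), pairwise_sq_dists_vec(A, B))
```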

Day 27.

I’m running out of Greek letters.

Day 28.

Deleting old functions I don’t need anymore is inhumanly satisfying. Like when you close 20 tabs when you’re finished with some project that required Sources and References.

Day 29.

If I’m not mistaken, I once mentioned on this blog that a course in my undergrad days taught me that computers are not magic. They can only store numbers with so much precision. As you’d expect, xkcd explains this well.

This fact has come back to bite me in the ass.

I’m getting infinities out of things that, well, should definitely be finite. This seems to be an instance of dividing by garbage minus garbage.

Day 30.

Never mind, there’s a trick to get around that. Amazing that just doing the same math in a different order can save you from certain doom.
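(I’ll spare you my particular trick, but the textbook example of “same math, different order” is the log-sum-exp identity, which turns overflowing exponentials into tame arithmetic:)

```python
import numpy as np

def naive_logsumexp(x):
    return np.log(np.sum(np.exp(x)))           # exp(1000) overflows, so this returns inf

def stable_logsumexp(x):
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))   # the same value, computed in a safe order

x = np.array([1000.0, 1001.0, 1002.0])
print(naive_logsumexp(x))    # inf (plus an overflow warning)
print(stable_logsumexp(x))   # ~1002.41
```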

The next obstacle, however, is that I seem to have injected so much noise into the simulations that this poor algorithm can’t make sense of it. Oops. I think this is what’s going to cause the robot uprising: humans demanding the impossible.

Day 31.

Well, now that I’ve successfully given this thing a task that isn’t too hard, it seems it’s too easy. I mean, look at that grid I annotated with my Michelangelean MS Paint skills. It’s pretty small. Let’s kick it up a notch.

Now the grids look like the eyes of the undead:

[Image: the enlarged grid]

Day 32.

I need a few benchmarks to see how well this program is working. There’s some information it’s trying to reconstruct, and if it does that reconstruction right, it should be able to get lots of points on some tasks. So naturally you’d test it against a version of the program that already knows this hidden information.

And naturally, you’d find that your algorithm doesn’t come close to matching that.

So you’d try making a weaker baseline, but one that should at least be better than random. Let’s say, one that uses a similar model as your pet algorithm, but a more-wrong model.

And then, purely hypothetically of course, you’d discover that the more-wrong model, which required about 1/10 as much math to get right, works better most of the time.

Day 33.

One possibility is that the more complex algo sucks because of a classic case of local minima. (This is cliché, I know; I’m sorry to any readers familiar with this field.) There’s a landscape of pits, and we want our ball to roll into the lowest one, but if it’s already heading into a higher pit, it has no way of knowing that a lower one exists elsewhere. It just can’t see that far ahead. So what do we do? We keep a memory of the lowest pit our humble little ball has seen so far, shake it out if it hasn’t decreased to our liking recently, let it roll down again, and repeat.
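(In code, the recipe is about as simple as it sounds. A toy sketch on a made-up bumpy function, not my actual model:)

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.1 * (x - 2.0) ** 2    # lots of pits, one lowest
df = lambda x: 3 * np.cos(3 * x) + 0.2 * (x - 2.0)    # its slope

def roll_downhill(x, lr=0.01, steps=2000):
    """Plain gradient descent: the ball rolls into whatever pit it starts above."""
    for _ in range(steps):
        x -= lr * df(x)
    return x

best_x = roll_downhill(0.0)
for restart in range(20):
    x = roll_downhill(best_x + rng.normal(scale=2.0))  # shake the ball out, let it roll again
    if f(x) < f(best_x):                               # keep a memory of the lowest pit so far
        best_x = x

print(best_x, f(best_x))
```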

Let’s try that.

Oh no.

It made things worse.

Day 34.

Because I feel like escaping these problems by creating new ones, let’s see what happens when the data come from sims who actually do optimize for the long term, contra the assumption built into our model. I figure our hapless hero will do worse on that kind of data, but hopefully have some robustness. That would be neat.

Oh.

Huh.

It works better on that kind of data.

Much better. Okay. I guess.

Day 35.

git commit -m”fixed huge very dumb bug in evaluate functions”

Long story short, it seems I was accidentally initializing the algorithm with…the right answer. Which it’s not supposed to know.

That’s one way to do it.

Day 36.

Please work.

Please work.

Please work.

Day 37.

It didn’t work. But it wasn’t supposed to! I set a value such that the algorithm had to basically tell the difference between a rational person successfully pursuing their goal, and an anti-rational person pursuing the opposite of their goal. This is impossible. (Without extra assumptions.)

Classic.

Day 38.

git commit -m”fixed a colosally dumb typo that ruined all my data hahahhahahahah life is pain”

Day 39.

Please don’t work.

Please don’t work.

Please don’t work.

Day 40.

Great. It* both worked and didn’t work. Precisely the result that gives me the least information.

*something that wasn’t supposed to work because it was trained on random “data”

Day 46.

Okay. At last, it’s time. It’s all working, time at last to do the final tests.

All 200 of them. Each of which seems to take half an hour. It seems I’ll need to put two laptops to work on this nonsense, and I know they’re going to fall asleep and put these tests to a screeching halt when I fall asleep, no matter how much I tell them not to.

Hey, in my defense, the slow part is not mine, it’s the necessary evil step that is a mere means to checking how well my baby works.

Day 51.

Well that was pleasant. This tweet summarizes my feelings quite well.

Now for the finishing touch: Googling matplotlib syntax every 3 minutes to make pretty graphs of the results.

Day 55.

Complete at last, and you’ll never guess what the most formidable head of this hydra I had to slay was:

Rewriting damn near all of my code to fix issues with a runtime calculator not liking my use of global variables.

After all of this you’re probably convinced that I despised this project, but honestly, far from it. No challenge of mine has felt as stimulating, and yet possibly useful (emphasis on “possibly”), as this one, aside from what I’m working on now. Hopefully someone can learn from my embarrassing mistakes.

Dopamine Morals: A Critique of Value Pluralism

I recently listened to episodes of Stephen West’s “Philosophize This” series. It’s good stuff! But I have a bone to pick with the episodes on pluralism and Isaiah Berlin, because I think there are some seriously basic errors in the ideas advocated therein. (Rhyme unintentional.)

[A clarification: pluralism may be defensible under other conceptions. I am simply addressing pluralism as it’s presented by Berlin, or at least by West’s-summary-of-Berlin. Also, this paper by Ole Martin Moen gives a more philosophically rigorous argument for monism than the one here.]

The heart of the critique in these episodes is leveled at monism, an umbrella term for any belief that within some context—be it the sources of knowledge, politics, or ethics, but here I’ll mostly focus on the last of these—there is one principle or value that should be the basis for our judgments in that context. For any new readers who don’t know, since I think that reducing suffering is the only moral value that matters for its own sake, this makes me a monist. Some other examples of monism might be (I say “might” because people who hold these views could dispute this, but it seems reasonable to me to classify them as such): the belief that all morals derive from the categorical imperative, that government exists solely to secure human rights, that doing God’s will is the only good, that all justified beliefs are empirically verifiable, and so on.

So what’s wrong with monism? According to [my understanding of] Berlin [based on an hour’s worth of audio summary], it supposedly doesn’t account for people’s ability to both (a) have conflicting prioritization of certain values over others, and (b) be completely rational while doing so. One of Berlin’s examples is mercy versus justice. The argument goes: no one entirely promotes mercy over justice, or justice over mercy. To make moral choices, we have to accept that these two values are “incommensurable,” that is, they can’t be compared on the same scale and judged that one outweighs the other. So, this apparently leads us to view both as intrinsically valuable.

This line of reasoning is bizarre to me. When I look at an example like this, where these multiple values come into conflict, this is exactly the sort of problem in moral deliberation where a monistic principle would shine. The fact that pluralistic views can never tell you how much mercy versus justice there should be in a given situation, because they’re treated as apples and oranges, is a bug, not a feature. When you have to make a decision in your own life about a balance of mercy and justice, say, voting on some criminal justice policy, do you throw up your hands and say these two ideals can’t be compared or weighed? Do you just flip a coin to choose between them? Do you declare that adjudicating between them by some coherent standard would be the height of monomaniacal closed-mindedness?

Of course not. You consider which balance would have the best outcomes, where “best” is (hopefully) measured in effects on people’s well-being. If you don’t do this, well, you’re welcome to show me how your decision-making process is superior.

To be sure, I don’t think this is what Berlin thought the implications of his claims were, and I’ve listened to the charitable characterization of his views as carefully as I could. But these are the implications, and I simply don’t see a way around this. Another clarification: yes, I agree with Berlin’s observation that human values are complex. There are billions of people on this planet, and obviously our preferences are going to conflict. What I reject is proposing that the solution to this complexity is to accept such conflicts. It is literally impossible for everyone to get what they desire, so our response to this should be to find a way to at least put the relative strengths of those desires (or, more to the point, strengths of experiences that come with them) on a comparable scale. If we refuse to do this, the alternative is aimless at best, negligent at worst.

It gets worse. Monism is supposedly responsible for totalitarianism in the twentieth century! Indeed responsible for all political violence. Let the strength of that extraordinary claim sink in, and ask yourself if it seems justified.

Berlin’s logic is that when people are convinced that there is one good that should be achieved, their thinking becomes monolithic. In particular, they’re willing to do whatever it takes to serve this good, even if that means trampling on large groups of people systematically.

I would be sympathetic to this concern if monistic philosophies tended to cluster, in such a way that someone who believes one monistic principle will drift toward others. In other words, if utilitarianism, for example, inevitably slid down some slope toward Nazism, Stalinism, imperialism, or what have you. But this is very evidently not the case. It’s precisely a monistic philosophy that makes someone less vulnerable to drifting into some form of totalitarianism, provided they aren’t already focused on the sort of principle that considers absolute power a part of the good. If the principle in question is upholding the well-being of sentient beings, making them miserable through oppression makes no sense. The pluralist, by contrast, has at best shaky recourse for judging the values of totalitarians as invalid. To say that such values aren’t really human values and hence aren’t worth upholding is just special pleading.

But the more crucial point is that monism is itself not a monolith, any more than, say, religion is. To decry Buddhism, Judaism, Jainism, or Taoism as oppressive and antithetical to human progress because they happen to be in the same class “religion” as the fundamentalist variants of Christianity or Islam (which are both of those things) would be horribly naive. No less naive is Berlin’s move to do the same for monism—there is a universe of difference in value-space between “the one good is to relieve human and animal suffering” and “the one good is to promote the power or ‘purity’ of the white race.” Someone who holds the former can and will condemn totalitarianism as a source of tremendous suffering.

What’s especially perplexing here is that West claims any attempt to distill what is good into a single monistic principle “ultimately fails in the long run.” But “fails” in what sense? By what standard? Certainly not a monistic one, as far as he or Berlin is concerned. If it’s by the standards of a plurality of values, which ones dominate and why?

West’s explanation of Berlin’s answer to this is that he was not a relativist, but rather an “objective pluralist.” The plurality of values is supposedly a collection of different means toward the same ends shared by all humans: “All men seek food and drink, shelter and security; all men want to procreate; all men seek social intercourse, justice, a degree of liberty, means of self-expression, and the like.”

There are a number of problems with this. I am evidence that “all men want to procreate” is false, and identifying it as false is not a mere nitpick, because it shows that Berlin’s pretense of objectively identifying universal human values is quite questionable. Further, all of these things are themselves means! Imagine that you had no biological need for food or drink whatsoever, and that you didn’t enjoy consuming these things. Would you still consider them worth consuming? If you’re rational, then no, because by assumption we’ve subtracted away the ends toward which food and drink are means: avoiding the pangs of starvation and thirst, and enjoying flavor that removes the blandness from the experience. If “justice” causes prisoners to suffer needlessly just for retribution’s sake, justice fails to serve as the means toward the end of flourishing. And so on. It’s exactly the fact that these conflicting values are means, not ends, that allows us to make decisions about which ones are most important to promote in certain circumstances. An activist may go on hunger strike to promote justice, for instance.

Most fundamentally, I think this objective pluralism mistakes what we often do value (abstractions) for what we ought to value (our bedrock experience). I don’t see how a system of ethics that boils down to prescribing that we just do whatever it is we already wanted to do (perhaps in an enlightened sense of “want”) in the first place can be called ethics at all. (It happens that this is where I diverge from moral antirealists, as I’ve discovered through recent conversations, but that’s beyond the scope of this post.) Many of the things we value, in the sense that Berlin uses this term, are contingencies of our psychology, evolutionary history, and culture. This doesn’t imply they’re actually worth pursuing. Nor does it imply they aren’t, either, but the decision one way or another has to be made with reference to some external standard.

This is worth highlighting because I see this disconnect so often in my disagreements with people on moral philosophy. It usually goes something like this:

Me:  Right and wrong are determined by [blah blah blah].

Them:  But [blah blah blah] implies [unsavory conclusion]. You don’t seriously believe that, do you?

Me:  Well, why exactly is that conclusion unsavory to you in the first place? No, really, why? This isn’t a rhetorical question.

Them:  It’s self-evidently awful! This is concerning.

Me:  It’s at least as concerning and self-evidently awful to reject [blah blah blah], because of [even more unsavory conclusion]. But the problem with Battle of the Unsavory Conclusions is that we have to be very careful that we’re not just smuggling in our preconceptions about what’s “right,” which came not from a genuine inspection of the facts about well-being but from social expedience and tribalism.

Them:  Our preconceptions are in some sense the best we’ve got. They’re just the things people want. What more could you want other than what you want?

Me:  That’s just where my spade is turned, I suppose. I deny that what we “want” is really the relevant object here. As any adult knows from experience, pursuing what we want often leads to consequences we don’t like. That’s what matters: what we (dis)like. It’s the difference between dopamine – the chemical of want – and, well, whatever causes good feelings. Anyway, believe me, I don’t want the right answer to be [unsavory conclusion]! It disgusts me. I had to be dragged kicking and screaming to my principles. The fact that I came to that answer anyway, because I decided to be consistent and follow the only larger standard that conceivably makes sense to me, should tell you something.

And everyone clapped. Apologies for the over-the-top “dialogue,” but I hope you get the picture. The whole project of ethics consists in critiquing what you want for the sake of the good, and it’s unfortunate that Berlin and West so scathingly decry attempts to do so according to a consistent principle.

Meditations on Meditation

My friends have peer pressured me into making a habit of meditation. Successfully.

I’m way too much of a beginner to pretend I can offer any wisdom here. The point of this post is just to dump my thoughts and feelings about my experience with this practice. Particularly the things I’ve learned about it that weren’t at all apparent to me for years, during which I’d looked at meditation as a neat curiosity that was just not for me.

I actually tried to start this sometime in high school. I had read that it was surprisingly effective at making people happier on a consistent basis, and was even–nay, especially–endorsed by secular folk, despite the association with Buddhism and Hinduism (the former can be “secular” too, it’s just generally considered a religion). My curiosity piqued, I tried reading and following along with the exercises in Mindfulness.

But it didn’t stick for some reason. My best guess is that the problem (for me at least; I’m sure many people have found that book helpful) was that it didn’t include auditory guidance in the moment, someone there to nudge me into doing, or not doing, what I was supposed to. I’ll say more on that in a sec.

My attempts didn’t stop there. I tried getting back into this meditation business again around a year ago. By then, the practice had the approval of very smart people who shared my basic outlook on living ethically. Still I couldn’t manage to do it consistently.

Fast forward to a month ago, when the power of a social pact brought me back into the fray.

For the first time, instead of just reading instructions and trying it in silence, I did guided meditation. In a sense this felt like cheating, like it’s not how you’re supposed to do it. If so, well, I’m very glad I cheated. The guide (not going to name names because I’m not advertising here, but feel free to ask me if you’re curious) was quite effective at their job, especially because these guided sessions taught me just how many different facets there are to meditation. It’s far more than focusing on your breath and trying not to think.

For one, you don’t have to take deep breaths every time. This was how I’d tried meditating before, and consequently it was more exhausting than it needed to be. Indeed, forcing the breath instead of just paying attention to it misses the point.

The point isn’t quite “trying not to think” either. Trying too hard to do any of the sub-practices of meditation is apparently counterproductive. Instead, the guide encouraged me to simply notice each thought as a thought, as a mental object that enters and leaves the field of consciousness just like a sound or scent.

Doing this makes it all too obvious that your thoughts aren’t really something you choose, though you can of course make an effort to guide your thoughts, exercising some executive control. My meditation guide described this as noticing that the illusion of free will is itself an illusion: we don’t even subjectively experience free will, much less have objective evidence for it. Many people find this unsettling, but I’d argue there’s a tremendous comfort in this. It offers hope for a world without the blind instinct for retribution. Hating others, or yourself, makes little sense when you realize that our actions are products of prior causes. And no, this doesn’t mean that morality is baseless. Clearly your actions can still be motivated by your understanding of the value or disvalue they could cause. You can still motivate other people’s behavior by rewarding it when it’s good and discouraging it when it’s bad–indeed it’s precisely because human actions are determined by causes that this is possible. But getting even for its own sake, with no rehabilitative purpose whatsoever, is baseless.

Where was I? Ah, right. Meditation. I’m supposed to be writing about that.

Focusing on the breath is an important component to start with, but it’s just that: a start. Our minds’ default mode is not to feel raw sensations as they are, but rather to keep as many of them in the background as possible while we pursue abstract goals. (Personally I have nothing against abstract goals, as my choice to work on abstractions for a living suggests.) So it’s only sensible that a total beginner should put the breath at the center, as an intuitive gateway into observing the contents of consciousness. As the days went on, my guide recommended exercises that went beyond the breath, including paying attention to the feeling of every body part, its contact with other objects or the air.

Following that exercise led me a few weeks ago to my most profound meditative experience so far, though I’m sure it pales in comparison to those of people with much more of a history with this. It’s hard to describe in a way that won’t sound hippie-dippy, but let me give it a try.

As I let myself just feel each mundane sensation that normally lies humbly in the background, the pendulum swung both ways. First every such sensation became much clearer, a distinct piece of my consciousness: my hands folded into each other, my weight on the chair, the shape of my face, my toes resting on the ground. (Yes, I fully realize the irony of using the word “my” so frequently when the whole point of meditation is to break the illusion of the ego.)

But then, paradoxically, these sensations grew indistinct. The guide described it as a cloud, and that is all too accurate. You feel a tingling, a sort of numbness, a subtle dissolution into your surroundings like a Cézanne painting. Then something like an aura floods your skin, you breathe a bit deeper, you float slightly. It’s honestly shockingly pleasant, considering it comes just from orienting your mind in the right way. And there’s a touch of mania in it. A brief sense of, “Wait, is this what they were talking about? Is this oneness?”

Although I definitely wouldn’t describe it as addictive, this experience does instill in you a desire to return to it every time you meditate. This makes you try to find it: you struggle to get yourself into this state of melding your senses into a homogeneous cloud, and that struggle only takes you farther away from your intended destination. I suppose one lesson is learning to treat this as something nice to be grateful for when it happens, but not something I should always expect–as with all good things in life. The other is that the mind is far more powerful at shaping experiences, without any external stimulus beyond the ones already there, than I’d known before. I’ve never heard anyone who meditates report distress or frustration from this, and I wouldn’t say I feel these either (any negativity is as trivial as when I watch a new show I like and am disappointed when an episode doesn’t match my hyped expectations). But it’s worth addressing, especially because I’d expected meditation to monotonically decrease the amount of desire in my life.

I haven’t yet fully internalized what it means to say there is no self, as the guide has insisted in these sessions. Taken in some ways, this just seems obviously false. My consciousness is fundamentally limited to whatever my brain’s senses can perceive, and I can never truly access the subjectivity of others even though I have very good reasons to believe it exists. That’s the sense in which I’m a distinct self from other selves. I’m familiar with the arguments that the notion of personal identity is illusory, and I find them compelling, but they don’t seem to contradict the ineffable observation that I don’t feel everything that is felt. As far as I can tell, the self that meditation rejects is the homunculus of the homunculus fallacy: the idea that there’s some “me” watching the movie of my senses and thoughts, when really the only me is those senses and thoughts.

That said, I don’t need to believe that I have no self in order to recognize that my lack of access to other people’s suffering doesn’t make their suffering any less important than mine. As David Pearce says, any tendency toward egoism that I or anyone else has is due to a limitation on our knowledge of the experience that is out there. It’s not a genuine reason to privilege myself. Every moment of pain is as real as another. When you fully accept this, you question the groundless assumption that doing whatever is in your self-interest is “rational” by default. I haven’t heard meditation gurus or Buddhists explicitly say this is what no-self is supposed to mean, so take this as just an interpretation. But no matter the connection to meditation per se, I can’t deny it.

Now, there’s something important I should admit here as well: I haven’t really noticed any improvement in my life due to meditation outside of the experience itself. I haven’t been doing it for very long, so hopefully it comes with practice. But it’s important to manage one’s expectations. After hearing promises that it turns you into a better person, the truth is a bit underwhelming. In its own right, meditation does provide me a few minutes in my day where I give myself permission to not follow thoughts wherever they drag me, like a hyperactive dog on a leash. It could be that that’s all it’s supposed to do. I don’t know.

When Are We Ever Gonna Use This?

A strange tendency I noticed as I went through college-level math was that people in this field celebrated math not in spite of its uselessness (at least in some subfields), but because of it.

No, really. “Applied” is a dirty word in math departments. The “purity” of a field is a badge of prestige, and after you go through enough proof-y math courses, it pretty much never occurs to you to ask what the point of some lecture material is. You don’t prove it because it’s useful, but because it’s beautiful, or not obvious.

I think this is profoundly unfortunate.

Let the record show that I’m as much a fan of theory as the next grad student. Theoretical is not the antonym of useful. As far as I understand it, and this understanding will probably grow a lot more mature over my PhD, the purpose of theory is not just to develop rigor for its own sake–although having certainty about some things in life is nice, and the only corner of the universe I’ve found that provides such certainty is mathematical theorems. Its purpose is to clarify why things work a certain way, when they do in fact work, so you know when to expect those things to work in new territory.

This is the point in the essay where I suppose an example would be in order. It’s tricky to offer one here simply because, at least in my field, no one teaches you the historical context of theory beyond a basic level. So one is left to speculate which came first, the theory chicken or the application egg. Even explicitly Googling for informed opinions on this issue didn’t help much.

As far as speculations go, I can say that my pet interest–reinforcement learning–is a case where the theory apparently came first, at least concerning the basics. People reasoned that the key to artificial intelligence, beyond the dead-end of hard-coded logic, might lie in modeling a being as pursuing states that tend to give it rewards (motivated by studies of animal behavior). It’s hard to imagine that without this foundation, someone would have just stumbled upon something like the deep Q-network and wondered, “Huh, why did that work?” But at the same time, the hare of practice has absolutely blown past the humble theory tortoise. Machine learning is full of things working despite us having no reason to think they should. On the other hand, there are methods that often don’t work so well even though you’d naively expect them to, and later theory along with a better method explained this: In reinforcement learning, updating your strategy based on the actual long-term results of your actions tends to work worse (for long-term goals!) than updating based on the difference between your short-term prediction and the short-term outcome.
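(For readers who want that contrast spelled out: the first approach is usually called Monte Carlo, the second temporal-difference learning. Here’s a bare-bones sketch of the two value-update rules with made-up states and rewards; the real versions update a strategy, but the value-estimation form is where the difference is easiest to see.)

```python
import numpy as np

n_states, alpha, gamma = 5, 0.1, 0.99
V_mc = np.zeros(n_states)   # values updated from full observed returns
V_td = np.zeros(n_states)   # values updated from one-step predictions

def monte_carlo_update(episode):
    """episode: list of (state, reward) pairs in the order they occurred."""
    G = 0.0
    for state, reward in reversed(episode):
        G = reward + gamma * G                        # the actual long-term result
        V_mc[state] += alpha * (G - V_mc[state])

def td_update(state, reward, next_state):
    target = reward + gamma * V_td[next_state]        # short-term outcome plus current prediction
    V_td[state] += alpha * (target - V_td[state])     # nudge toward the "TD error"

# Toy usage: one little episode through states 0 -> 1 -> 2, with a reward at the end.
monte_carlo_update([(0, 0.0), (1, 0.0), (2, 1.0)])
td_update(0, 0.0, 1); td_update(1, 0.0, 2); td_update(2, 1.0, 2)
```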

A reasonable question to ask is, why should you care? If something works consistently, is it really such a good use of your time to figure out why?

My first reply is that it’s not even evident in many cases that the “something” in question really does work consistently. I argued that neural networks generalize better than you’d expect them to, but that’s unfortunately not a very high bar. Ninety-nine times out of a hundred, an algorithm that seems appropriate for the task at hand doesn’t work until you tweak some knobs (called “hyperparameters” because that sounds cooler) just right. A good theory of how those knobs influence learning, and under what circumstances, would save engineers a lot of time, although we have no guarantee from the gods that such a theory exists.

Even for something as seemingly simple as linear regression (line-fitting), how do you know which of the massive number of combinations of predictors, interactions, and variants of the algorithm to try, without a theory to point you in the right direction? If your answer is “just do what has worked best historically,” not only does this assume future data will behave like past data (which you need theory to justify, if you even can justify this), there doesn’t even seem to be a readily understandable database of what has worked. By all means, if you know of a good meta-study that checks the design of studies against the success of implementations of their ideas, send it my way.

Okay, suppose your application of interest really is reliable. The catch is that it’s only reliable so far. If you expect that the circumstances in which you’ll use that application will never change, the complexity of the problem will never increase, and you’ll never want to extend that application to something harder, then I can’t sell theory to you. But this pretty much never happens in practice, so ironically the fetishism of practice over theory is itself based on an abstract idealization.

I’ve been thinking about this question not just because I like scolding purist math nerds, but because I’m starting to seriously try to narrow down what kind of research I want to do for the coming years. A certain taste for the inherent coolness of statistics is necessary for someone who has committed to five or six years of training in it. Yet the danger is that I might get so wrapped up in that coolness that I forget to ask that critical question: Why should you care?

My Plato fanfiction, “What If I’m Wrong?”, was a first pass at justifying the general research direction I’m taking, among other things. And I stand by those remarks after several months of preoccupation with technical details in coursework, which distracted me from those sorts of big-picture considerations. I think that if I could make a substantial contribution to the Project of “design artificially intelligent systems that will be able and ‘willing’ to optimize good objectives,” this would be my best shot at improving the trajectory of the future. (And yes, if “good objectives” sounds vague to you, that’s because it turns out that even defining the Project itself is really hard; basically I mean objectives that avoid seriously harmful unintended consequences.) Not just by averting the downsides that Bostrom, Russell, Amodei, and others have argued are worthy of attention, but perhaps more importantly by enabling our species to solve enormously complex problems (both natural and social) that we simply haven’t had the tools to solve so far.

That is, of course, a big if to grapple with.

Even before getting into the nitty-gritty of original research per se, reading the technical literature on approaches to that Project has given me a more visceral sense of how hard it will be–every solution that comes to mind has holes in it. Not just hard for me, mind you. These are people with daunting CVs and breathtakingly creative solutions to related problems, and yet their work, as impressive as it is and as much as I wish I could be just 10% as prolific in my career, has massive fundamental limitations when it comes to solving the Project. If this seems easy to you, Robert Miles’ videos are excellent explanations of why your solution probably isn’t as airtight as you think it is.

The subset of the Project I’ve thought hardest about is, basically: How can an algorithm learn what humans want–”learn,” because specifying that by hand is notoriously difficult–to a sufficient degree of fidelity that we would trust this algorithm to throw absurdly strong computational resources into achieving its idea of what we want? “Want” doesn’t quite capture the full essence of what I mean here, but “need” or “what is good for us” strikes people as paternalistic. Still, the disconnect between what we think we want, and what we would actually prefer to happen if we thought through the full consequences, is colossal, and this disconnect is part of what makes the Project harder than I’d thought before I went down the arXiv rabbit hole.

Figure 1. My browser tabs as of late.

Even if that part of the Project is resolved, though, the next step ahead of us is to understand how to coordinate these systems that do what “humans” want, given that human desires are obviously not in perfect harmony (even though they’re relatively similar in the space of all possible things agents could pursue). Game theory was developed to try to answer this sort of question for human conflicts, where incentive structures and incomplete access to information lead people/states to take supposedly rational actions that make everyone worse off. The scale of that problem seems like it will be even larger when powerful optimizing machines are in the mix. In the long term this is what I’d prefer to prioritize, especially because it’s plausible that here is where fates worse than death lie for humanity. But my impression is that I need a solid basis in the technicalities of the single-agent cases before I can make useful progress on more complex problems, and in practice some of the more promising proposals for the “learning what humans want” goal involve prediction of the behavior of other agents–something integral to the multi-agent coordination problem, as I see it.

An unfortunate side effect of trying to do research that aims at the correct objective, rather than developing better ways of fulfilling the usual objectives statisticians consider, is that this type of research can’t be purely “technical.” Some degree of philosophical baggage necessarily slips in.

I don’t think this is inherently bad. Far from it–a bad idea well-implemented or well-pursued is a waste of everyone’s time and resources.

The problem is that, for all the hype around interdisciplinary research in academia these days, the other disciplines that math folks would prefer to mingle with are, in practice, limited to the sciences. Philosophy is supposedly too fuzzy, subjective, political even. For the purposes of playing the game well enough to have any long-term influence, I’ll have to follow some of academia’s perverse incentives.

That won’t stop me from emphatically noting on this non-technical blog, however, that those incentives are perverse. Technical sophistication isn’t valuable in its own right, if the goals are wrong. Notice that this is a stronger claim than that research should be “useful” in the sense that first comes to mind when you imagine useful research. Something like this paper, whose experimental results are about backflipping simulated noodles, is probably more useful than a paper demonstrating the application of a narrow, non-generalizable technique to a real robot. Because the former gets at the heart of enabling future robots to do what we want, without us needing to specify detailed instructions for “what we want” that are safe against loopholes. The former is addressing something closer to the right goal. It’s asking the right questions.

To be a bit more concrete about the philosophical baggage, the Project seems to require answers to questions like:

  • How do you tell what someone wants without directly asking them (until natural language understanding becomes so advanced that AIs can just ask us)?
  • What should an AI do when extrapolation of what we think we want into the future leads it to conclude that the best action is something that we intuitively find strange, or dangerous? Would our refusal of such an action be justified, or just a reflection of our irrationality and short-sightedness? I’m actually a bit more sympathetic to the latter interpretation than I think most people are, since I think moral false negatives are no less egregious than false positives. But regardless, the appropriate answers to these questions will require nuance. At the very least we’d like some justification of the thought process leading this AI to its counterintuitive conclusion, before we trust it to go hog wild.
  • How should an AI handle cases in which “what we want” or what we think is right changes substantially over time? Moral standards have obviously changed over history, and it seems hard to deny that in many if not most ways this has been progress. Indeed, if this AI predicts those sorts of changes, how can we know when to trust them?
  • If the AI discovers that common human values are fundamentally inconsistent or violate logic, like transitivity for instance, should it override the weakest links in our chain of contradictions? I think yes, but I know of several smart people who (inexplicably as far as I’m concerned) don’t care if their values are logically consistent.
  • Should an AI’s ethical model include strict constraints, rules that it can’t break no matter what? As you can imagine if you’ve read any of my philosophical ramblings, I’d say no, at least in the long term. There’s an argument to be made that these constraints can be useful training wheels for AI in its toddler years; however, respecting such rules too rigidly could lead to serious moral negligence, and I don’t think advocates of this approach are as concerned about these false-negative failures as they should be. Also, for any safety measure, we need to be cognizant of the costs it imposes on the capability of the system. Unfortunately, myopic economic pressures to have the latest and greatest uber-optimizing technology are going to march on, so our solutions need to be able to catch up.

I’m convinced that theory will be necessary to approach these questions, and to construct the technical implementations of their answers. And others who have thought for quite some time on this issue seem to agree; no less than Stuart Russell has called for a rethinking of the objective of AI, as a necessary precondition for the useful and beautiful applications of that technology that every starry-eyed futurist dreams of.

The Cosmic Sinking Ship: Questions on the Value of the Future

[Update: As of December 2020, I don’t entirely endorse the thesis of this post. I think I understated the risks that human space colonization could increase future suffering. Still, I stand by the importance of the considerations explored here.]

I’ve described myself as an antinatalist “with some asterisks.”

These are the asterisks.

This is a term one can Google effortlessly, but in brief, antinatalism is the belief that procreation is wrong, generally because it imposes harms upon a child that they never would have experienced if they hadn’t been born. If your child might experience something traumatic in their lifetime, that harm simply can’t be justified by any prospects of happiness, so the argument goes.

To understand where I’m coming from here, it would be best to take a step back and consider the universal picture. Modest, I know.

Douglas Adams famously remarked, “In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.” This is satire, yes, but if we fill in the gaps a bit we can see that this claim has some serious teeth to it. Before the emergence of sentience, the cosmos was a sea of stars doing star things, including serving as nuclear reactors that generated the heavier elements that could accrete into planets. I’ve heard astronomers call this phase in prehistory “violent” in an attempt to be poetic, but in the sense that no actual beings were harmed in the making of this film, this was a perfectly peaceful state of affairs. No need existed.

Then came DNA.

Still not an inherently violent chapter in the book of the past, as long as it was limited to the kinds of life so basic that they couldn’t feel pain or emotions. However, chemistry stumbled upon a way to not just replicate (arguably stars did that already, as did fire), but replicate through self-contained vessels that could react to their environments.

In an entirely secular sense, this was the Fall, more or less. Matter was no longer content to bounce around according to gradients of entropy. Now, matter did something cruel. It created a hole in life, the hole of need. Of awareness that something was missing, something had to be changed, even if just in short bursts. Matter didn’t want to do this. It was just the sort of thing that you’d expect to emerge in a system where the replicating vessels faced limited resources, because the capacity to regard destructive stimuli as urgently bad conferred a competitive advantage on creatures with that capacity.

This point is crucial, because it relates to the probability that our planet was not the only one to have produced need. I admit I’m an amateur on the question of extraterrestrial sentience – sure, there’s the Drake equation and Fermi paradox, but even if we resolved the latter, the existence of life doesn’t necessarily imply the existence of feeling. Still, given the unfathomable number of planets out there and what we know about the conditions for them to produce life, it would be an extremely strange coincidence if none of them sustained life forms with this mercilessly adaptive mechanism we call suffering.

I say this even though, on some days, I find myself wondering why exactly the forces of biology (on Earth) needed to produce experiences per se, rather than just negative feedback loops and instinctive responses to avoid things that threaten gene replication. Clearly my cells and organs do just fine at adapting to changing circumstances and fulfilling their prime genetic directives, without any need for awareness of these internal changes on my part.

This is the kind of staring-into-the-abyss question that makes the abyss stare back at you, in my experience. On one hand, sentience proper seems like it could just as easily not have come to be, some subjective icing on a physical cake. This intuition implies that it should be very difficult to predict which sorts of materials and processes generate subjective experience, besides our own brains. It isn’t clear whether brains produce sentience because of their particular chemical composition, or because they’re information processing systems, or because of some other property. On the other hand, it would be rather arbitrary and weird if carbon or neurons in particular, as opposed to any other element or arrangement of atoms, were necessary for sentience.

This leads us to suspect the universe could be full of sentience, in corners we simply can’t access objectively. The best we can do apparently is reason by analogy, checking the extent to which a given being (currently, just other animals) has the neural structures we know are at least correlated with feeling. But just how far does the analogy go? Am I a monster for forcing reinforcement learning algorithms to pursue some objective over and over again? While drawing the line just at our own species seems mistaken, indeed horribly so as this line caused the atrocity of animal farming, full-blown panpsychism is hard to swallow.

In any case, scientific parsimony confronts us with the conclusion that if consciousness, or more properly experiences of “goodness” and “badness,” are the sorts of things that evolved systems are pressured to produce, then somewhere out there an E.T. is probably longing for something.

What does this have to do with antinatalism, you ask?

The essence of antinatalism is the recognition that humans’ capacity for need is insatiable, and this inevitably leads at least some of us to suffer greatly in ways that we can’t simply excuse as worth it for some amusement park. Since I’ve argued for that view at length elsewhere, I won’t belabor it again. But if this is compelling to you, and you also reject speciesism, the logical consequence is that a voluntary exodus only of humanity from the valley of tears won’t be enough. That would be nothing less than an abandonment of nonhuman animals to the sinking ship that is our Darwinian planet.

To see what I mean by this, imagine that you and a thousand children have been placed in a torture chamber. There’s an exit, but it’s too high up to be within reach of any of the children. If you were to hop out through this exit, you’d be able to escape safely, but once you’d escaped, you couldn’t reach back in to help any of the kids out. The only way for them to get out is for you to lift each one to the exit, one at a time, before you leave. Every second you spend in there is, by hypothesis, torture. But it is so for the kids as well.

Would it be right to escape without helping the children get out first? I can’t in good conscience say so. [One reader’s response: “if you stay in to try and help, there’s a high probability that you won’t be sufficiently concerned about suffering and will end up stuffing trillions more kids into the torture chamber, even while letting a few of them out.”]

To every other animal, we’re the adults in this scenario. If we exit the picture, they can’t save themselves. And make no mistake, they have much to be saved from. To be sure, the farmed animals need saving from us, but as immensely tragic as that institution is, I’m somewhat optimistic that within this century humans will replace animal farming with a more humane system. If that happens, it will be impossible to deny the weight of our species’s power (and responsibility) to help the rest of the biosphere escape the slavery of their genes, which demand perpetuation at all costs.

And beyond! If any of those E.T.s I speculated about above are in the unfortunate Goldilocks zone of just complex enough a life form to be sentient, yet not so complex they can free themselves from the worst that sentience has to offer, then to the extent that we physically could help them in the future, we apparently should.

Yet this isn’t a destiny we should take lightly. While of course the torture chamber analogy is hyperbolic, and life usually isn’t that bad, there is so much blood, sweat, and tears that will be demanded of us if we set out to do a global (if not cosmic) rescue. And the part that makes me particularly uneasy is that “us” includes not just the currently living, but future generations. The very same instinct that inspires my sympathy for every nonhuman creature capable of pain, and my mission to help them, makes me very wary of shoving new people into the arena to achieve that mission. It’s basically breeding soldiers for a war. Creepy, to say the least.

Let’s not fool ourselves, though. Every other reason people give for having kids sounds no less manipulative or careless, indeed often more so. Creating a new life for your own happiness, as a way to try to fill an existential hole or, worse, to fix a broken relationship, is breathtakingly selfish. And as a practical matter, humans are so far from reaching a complete consensus not to have kids that any argument for procreation on the margin, based on our imperative to help nonhumans, isn’t compelling. I know that for the foreseeable future, many people will keep procreating, and it seems like a far more efficient use of my time and energy to invest in my own efforts to improve the world than to raise a child in the hopes of contributing to some sort of altruist army.

Rationally speaking, there doesn’t seem to be a contradiction between the antinatalist sentiment, at its core, and recognition that humans owe our support to beings that can’t help themselves. Having said that, thinking about this apparent tension has made me a little more understanding of critics of my opinions. My beliefs about what we should do are wildly sensitive to certain parameters, or hinges, so to speak. Case in point, this whole observation that nature is hostile to wildlife and we have moral duties to such wildlife is something I only really internalized a little less than a year ago, and it literally makes the difference between near-term human extinction being immensely good or catastrophic.

This sort of volatility seems to make people uncomfortable. My reply is that we don’t really have a reason to expect the real world to conform to the narrow ranges of possibility we’ve constructed. The abortion debate might be the best example of this: whether you think abortion is permissible and attempts to restrict access to it are a fundamental violation of pregnant people’s autonomy (for the sake of unconscious cells), or that it’s equivalent to genocide, depends on beliefs and values that can change over a lifetime, often quite suddenly. That’s to be expected. Shockingly, conclusions depend on premises.

So no: in the face of this uncertainty, of the tyranny of nature that will go unchallenged if humans go on vacation into oblivion, I can’t say I support human extinction. (Gasp, how contrarian!) But my reasons for that are very different from those who think that our survival is an end in itself, that it would be us who would be (primarily) harmed. And this difference is crucial, because I do think there will come a day when our descendants will have done as much as they could for other species. Dark energy sends other planets farther from our reach every day, so eventually the dictum “ought implies can” will excuse us; we won’t be obligated to do the impossible. Perhaps even more sobering is the possibility that humans veer down a terribly mistaken ethical path that makes us more trouble than we’re worth, a parasite whose attempts to help only make things worse, as far as the powerless are concerned. As long as factory farming exists, that might well describe us currently, no speculation about the future required. It’s not a permanent description, however, which gives me enough hope for humanity’s redemption arc that I tentatively stand on the anti-extinction side. We have a lot of mess to clean up if we’re going to stick around.

Neural Networks and Occam’s No-Shave November

[Note from the future: Since NeurIPS 2019, this interpretation has become somewhat outdated, although my understanding is that the theory I’m citing here is still valid in the regime of very wide, but not necessarily deep, networks. See this paper if you’re an ML wonk.]

I’ve been reading a bit lately on a bizarre mystery in machine learning lore, one that flies in the face of some sweeping claims I’ve made in previous posts here. There’s basically no math (at least no math notation) here, if that’s a concern for you. 🙂

To put things glibly, gigantic neural networks with designs so convoluted that some of them literally have “convolutional” in their names (disclaimer: this is not actually the reason they’re called convolutional, lest Yann LeCun kick my ass for saying so) should not work. Not according to the conventional statistical wisdom. (Or at least one common interpretation of that wisdom—I’ll come back to this.)

Why? Because there’s a rule of thumb in this field about how simplicity helps generalization. According to this rule, while more complex models can effortlessly explain the data you train them on, they might not predict new data very well. This is the mathematical analogue of the fuzzy Occam’s razor often used for qualitative arguments. Getting an intuition for why exactly this principle works can take some time, but the gist of at least one argument used in ML theory 101 is this: The larger the set of models you allow a computer to pick from, the more likely it is that at least one of these models could “fool” you by working well on training data, but sucking elsewhere. In other words, more complex model spaces have more opportunities to produce models that just happen to fit noise.
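
To make that concrete, here’s a tiny numpy sketch of the rule of thumb, my own toy example rather than anything from a textbook: fit polynomials of a few different degrees to noisy data and compare training error with error on fresh data from the same process. The sine curve, noise level, and degrees are all arbitrary choices.

```python
# Toy illustration of "richer model class = more ways to fit noise."
# The data-generating process and the degrees tried are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(0, 0.3, n)  # true signal plus noise
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

for degree in (1, 3, 15):
    coefs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Typically the highest-degree fit wins on the training points and loses on the fresh ones, though the exact numbers depend on the noise draw.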

This is actually really similar to a problem faced by modern geneticists. In the interest of finding out which genes are linked to nasty health outcomes, they could look at every gene that can be practically sequenced, and see what associations are hiding there. But since inferring associations isn’t an exact science, there’s some risk of a false positive every time they test a gene. And the chance of making it through that sequence of tests without any false positives shrinks the longer the sequence gets.
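
The arithmetic behind that shrinking chance is simple enough to sketch, assuming (unrealistically) that the tests are independent and that none of the tested genes is truly associated, so every “hit” would be a fluke at significance level 0.05:

```python
# Chance of at least one false positive among m independent tests at level alpha,
# assuming none of the tested genes is truly associated with the outcome.
alpha = 0.05
for m in (1, 10, 100, 1000):
    p_any = 1 - (1 - alpha) ** m
    print(f"{m:5d} tests -> P(at least one false positive) = {p_any:.3f}")
```

By a hundred tests you’re all but guaranteed at least one spurious association, which is why geneticists correct for multiple comparisons.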

It’s of course possible that when a learner searches a large space of hypotheses and picks the best one they find, that hypothesis won’t fool them, i.e. it explains the data they’ve seen so far because it captures the real pattern. But here we’re just being cautious, and caution leads to the conclusion that choosing from a simpler, more constrained space of hypotheses reduces the risk of a happy coincidence that doesn’t hold up elsewhere. Learning theorists have proven guarantees of the form “most likely, the error outside the training set won’t get worse than [this]”, where [this] is something that decreases with simplicity.
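
One textbook example of such a guarantee, for the simple case of a finite menu of hypotheses and errors measured between 0 and 1 (this is standard learning theory, not anything from the paper I’ll get to below): with probability at least 1 − δ over the draw of the m training examples,

\[
\mathrm{err}_{\text{new}}(h) \;\le\; \mathrm{err}_{\text{train}}(h) + \sqrt{\frac{\ln|\mathcal{H}| + \ln(1/\delta)}{2m}} \quad \text{for every } h \in \mathcal{H}.
\]

The ln|H| term is where simplicity enters: the fewer hypotheses you allow yourself to search, the smaller the cushion you have to add to the training error.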

Hopefully that’s a more intelligible and less handwavey explanation than the one in my older post. We’ve established that neural networks, colossally complex as they are, shouldn’t work.

But work they do!

Though unsolved problems in the out-of-training performance of these behemoths still abound, they can hold their own at tasks like image classification on entirely new data—a key indicator that they aren’t just lazily “learning” the data they’re trained on without picking up on genuine signals. By “hold their own,” I mean sometimes even better than humans.

There seems to be a sort of doublethink in machine learning education, at least at the more basic levels. One of the first lessons you learn from a finger-wagging guru on Coursera or in your classroom is, “Don’t overfit!” Which is a very good lesson in most contexts. But then this very same guru starts teaching you about neural networks, and you notice in the seminal papers that these networks look like they’re going out of their way to commit statistical sins that would result in overfitting, if there were any karma in this universe. Yet they don’t! For the most part.

So what’s going on?

If you read my little justification for Occam’s razor above closely, you’d notice I said the learner “searches a large space of hypotheses.” This wasn’t just empty metaphor. When it comes to models whose knobs can be turned to infinitely many different numbers, like neural networks or even your friendly neighborhood linear regression, we can’t coherently compare the numbers of possible models in two different classes. Contrasted with something like decision trees, where you can count the number of possible trees of some size, it’s not clear how many “more” neural nets with three layers there are versus two layers. Each connection in the network can be any real number.

So in the interest of both grappling with this infinity weirdness, and bringing things back to the notion of ability to fit noise, we need a measure of complexity that deals with the size of a search space. The idea behind the explanation we’re leading up to is this: if you want to understand how well a neural network can do on new data, it’s overly conservative to lump that network in with literally all possible neural networks. Instead, we want to consider the space of networks that actually would be searched while training on real data.

With that in mind, let’s look at just what the job of a neural network is in the first place.

Each of those lines/connections has some number associated with it, and at each node, some boring math happens based on the line-numbers and the input. This network is tasked with making sure the output of this whole mess matches some target output for as much of the data as possible. It can do this by nudging those line-numbers up or down according to repeated feedback.
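
If you’d rather see the nudging as code, here’s a minimal sketch in PyTorch; the layer sizes, fake data, learning rate, and number of steps are all arbitrary choices of mine, just to show the loop.

```python
# A toy version of "nudge the line-numbers according to repeated feedback."
# Everything here (sizes, data, learning rate) is made up for illustration.
import torch

torch.manual_seed(0)
x = torch.randn(100, 4)                    # 100 fake data points with 4 input features
y = (x[:, 0] - 2 * x[:, 1]).unsqueeze(1)   # a target pattern the network should recover

model = torch.nn.Sequential(               # the nodes, and the "lines" between them
    torch.nn.Linear(4, 8),                 # each connection carries one number (a weight)
    torch.nn.ReLU(),                       # the "boring math" at each hidden node
    torch.nn.Linear(8, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

for step in range(200):
    loss = torch.mean((model(x) - y) ** 2)  # how far the outputs are from the targets
    optimizer.zero_grad()
    loss.backward()                         # feedback: which way to nudge each weight
    optimizer.step()                        # nudge every line-number a little
print(f"final training loss: {loss.item():.4f}")
```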

Now, imagine that two of these networks are competing to nudge their respective line-numbers in ways that explain some data. One of them is allowed to nudge the lines to literally whatever numbers its little old abstract heart desires. A -6,349 here, a 0.000000007 there, an 89,000,135,594,003 on the top. Neural anarchy. The other one is pretty heavily restricted. It only gets to nudge each line between, say, 0 and 1. If that makes it harder to explain all the data, well, tough cookies! Life, alas, isn’t fair even for these humble math webs.

Which one do you think is going to have to seek out an actual pattern in the data to explain it, and be less likely to find a solution that is good by coincidence? Which one, meanwhile, is free to just keep looking around until it gets to a dumb-luck solution?

Even though both of them can technically choose among infinitely many numbers, one infinity is much larger than the other one, in much the same way as the universe is larger than a beetle. The restriction on the second network’s allowed search space keeps it from accidentally fitting noise.
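
Here’s a rough sketch of that competition, again in PyTorch and again with arbitrary sizes and hyperparameters of my own choosing: two identical little networks trained on pure noise, except one has its weights clamped to [0, 1] after every nudge.

```python
# Two identical networks trained on pure noise (no real pattern to find);
# one is free to set its weights to anything, the other is clamped to [0, 1]
# after every update. All sizes and hyperparameters are arbitrary.
import torch

def train_on_noise(clamp_weights, steps=2000):
    torch.manual_seed(0)
    x = torch.randn(50, 10)
    y = torch.randn(50, 1)                 # random labels: any good fit is dumb luck
    model = torch.nn.Sequential(
        torch.nn.Linear(10, 50), torch.nn.ReLU(), torch.nn.Linear(50, 1)
    )
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    for _ in range(steps):
        loss = torch.mean((model(x) - y) ** 2)
        opt.zero_grad()
        loss.backward()
        opt.step()
        if clamp_weights:
            with torch.no_grad():
                for p in model.parameters():
                    p.clamp_(0.0, 1.0)     # neural anarchy, curtailed
    return loss.item()

print("unrestricted net, final loss on noise:", train_on_noise(clamp_weights=False))
print("clamped net, final loss on noise:", train_on_noise(clamp_weights=True))
```

You’d expect the unrestricted network to memorize the noise far more thoroughly, precisely because it has so much more room to wander.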

This doesn’t yet explain the magic of neural networks, but I think it’s worth stepping back to appreciate this point in its own right, because honestly when I realized that this was the heart of such magic, it was one of the biggest “a-ha” moments I can remember. I’d been so used to thinking of machine learning in terms of “more parameters [dials you can turn to define the model] = less generalization power.” Reading this paper made me more skeptical of that rough heuristic, but it didn’t really provide an intuitive alternative. So I went on just accepting that deep learning is alchemy and there’s nothing we mere mortals can do to understand its underpinnings. In hindsight, my first clue should’ve been that some classic techniques used to make linear models more generalizable work by restricting some numbers (those used to define the line) to a bounded set. But in my defense, I don’t recall ever having been explicitly taught that this is the practical sense in which these techniques make the models “simpler.” Also, based on the fact that papers addressing this mystery frame it as “neural networks have a lot of dials they can turn, ergo they’re complex, so how do they generalize so well?”, clearly I’m not the only one this insight wasn’t obvious to.
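
For the linear-model version of that point, here’s a small scikit-learn comparison (synthetic data, arbitrary penalty strength) showing how ridge regression, one of those classic techniques, keeps the coefficients in a smaller ball than ordinary least squares does:

```python
# Ridge regression penalizes large coefficients, which amounts to restricting
# them to a bounded set; compare coefficient sizes against plain least squares.
# The synthetic data and penalty strength are arbitrary.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))                  # 30 points, 20 candidate features
y = X[:, 0] + rng.normal(scale=0.5, size=30)   # only the first feature matters

plain = LinearRegression().fit(X, y)
shrunk = Ridge(alpha=10.0).fit(X, y)
print("least-squares coefficient norm:", np.linalg.norm(plain.coef_))
print("ridge coefficient norm:", np.linalg.norm(shrunk.coef_))
```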

Okay. So. We have to explain the magic. Because even though simplicity in the sense we care about is possible for massive networks—if they only make minor nudges from wherever they started—it still remains to be seen (1) whether neural networks can achieve that level of simplicity through some natural constraints, and (2) whether they can do so without losing the ability to explain even the training data.

This is where things get a bit more speculative, I have to confess. I’ve tried my best to interpret the mathematical mumbo-jumbo in this paper, “The role of over-parametrization in generalization of neural networks,” which was the impetus for this whole post. But this interpretation hasn’t gone through any peer or mentor review yet, though I’ll definitely pick the brain of the professor who recommended similar papers to me in due time.

Disclaimers aside, point 1 and to some extent 2 fall out of the following thought experiment. Imagine that the first gray layer in that diagram above is arbitrarily wide, practically infinite. The connections between the input nodes and that layer essentially throw the data from their original form, which can be large (say, 256 x 256 pixels) but still a manageable number of dimensions (AKA “features”), into a new representation in very, very, very high-dimensional space. The Twilight Zone of data. These connections have picked out an unfathomably large number of features of the data, which don’t exactly have human-interpretable meaning in the same way that pixel intensities do, but they’re no less valid as data features than pixels are. They’re just a translation of information the data already contain into a stranger language.

So far, so good. How can the network use this new representation of the data to label them all in a way that’s both consistent with the true labels, and doesn’t require a search of such a massive space that it risks fitting noise? Well basically, it can start off the training round by setting the weight (i.e. importance in the final labeling decision) of each feature to some infinitesimally small—but nonzero—number. The final label ends up being derived from the weighted sum of all those features. (To be more precise, the network assigns a score to each label, i.e. those weighted sums, and it guesses the label with the highest score.)

Here’s the kicker: because there’s such an astronomical number of these features to choose from, there is bound to be some combination of them that the network can pick out as the relevant ones that perfectly determine the training data labels. The network can just nudge down to zero the weights for whichever features it discovers aren’t necessary to explain the data, and the remaining weights, combined with their features, will give the correct output labels.

Let that sink in. The network can achieve perfect consistency, without needing to search a very large space in nudging the last layer of connections because each connection is so weak to begin with, and without needing to search any space at all for the first layer. That last point is really key—the sheer size of this layer means that the data are projected into a new representation of such fine-grained detail, that these representations don’t need to change in order for the network to learn effectively. Really, that first layer’s connections don’t need to be learned at all. Also, the argument for why the network doesn’t need to search much space in changing the second layer only works if the data are “real.” If there’s basically no signal or pattern in the data, so that labels are effectively random, the network needs to keep changing itself significantly to learn this junk data. This could explain why these networks can generalize on genuine data but not pure noise, even though they can achieve perfect training performance on both.
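
Here’s a back-of-the-envelope version of that thought experiment, with every concrete choice (the digits dataset, 4,096 random features, the particular scaling) mine rather than the paper’s: freeze a wide random first layer, train only the readout, and use the size of the learned readout as a stand-in for how far it moved from its all-zeros start.

```python
# Thought-experiment sketch: a wide, random, *frozen* first layer plus a trained
# readout. Dataset, width, and scalings are arbitrary choices for illustration.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)
X = X / 16.0                                   # scale pixel intensities to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Frozen random first layer: project 64 pixel features into 4096 random ReLU features.
W = rng.normal(size=(X.shape[1], 4096)) / np.sqrt(X.shape[1])
feats_train = np.maximum(X_train @ W, 0)
feats_test = np.maximum(X_test @ W, 0)

# Trained readout on top of the untrained features; its coefficients start at zero.
readout = LogisticRegression(max_iter=2000).fit(feats_train, y_train)
print("test accuracy with a random, untrained first layer:",
      readout.score(feats_test, y_test))
print("distance of the readout from its all-zeros start:",
      np.linalg.norm(readout.coef_))
```

Refitting after shuffling y_train would be one crude way to probe the junk-data claim: with no real pattern to lean on, you’d expect the readout to have to move farther from its start to memorize the labels.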

If this all seems too abstract, it helps to think about how having more “natural” features for each data point makes learning easier. Say you want to classify songs into different genres, given some increasingly specific information about them. First you’re told the BPM of each song. That sort of helps; probably the average pop song has a higher BPM than the average classical piece. But there’s still a lot of variance. Pop most likely isn’t much faster or slower than death metal. So next maybe you’re told the most common chords used in the song. This would help a lot, but it’s still not airtight. There’s a whole website (warning: this site might steal hours of your life) dedicated to examples of song pairs with similar chord progressions, but many of them are very different genres. Then you’re told the most-used nouns in the lyrics. And so on. More features mean more axes of difference, so you can come closer to defining rules that cleanly separate the genres.

The same thing holds for the features that the network creates simply by multiplying a matrix by the raw data. This is why machine learning folk like to describe neural networks as automatically defining features, no hand-engineering required. Of course, if you have a reason a priori to think that some natural feature has a causal relationship to the outcome you want to predict, it doesn’t hurt to have those natural features. The structure of image classification networks was specially designed to exploit the fact that images are two-dimensional grids of color intensities, which can be broken down into certain shapes and such.

There are two more things I want to address here before I leave you to bask in the brilliance of this theory (one that I’m jealous I didn’t think of first, but that’s life).

For one, where am I getting all this from? How do you know I’m not just making up a just-so story? Let me draw a distinction between my sort-of philosophical interpretations and the scientific/mathematical results. I should say that as far as I can remember, I haven’t yet found a formal explanation of this problem that directly connects (a) the ability of a network to find a coincidentally good (but poorly generalizable) model, with (b) the size of the space of models the network searches. This just seems to be the natural extension of that same connection made for finite “spaces,” lists of models of a certain form that you can just count. It’s the least-vague concept of “complexity” I’m aware of in this context, and it’s one that fits the math presented in the source paper. The details in that source aren’t worth belaboring here, but the graphs and tables give some useful clues: just as I’ve said here, the key term that controls test error is the distance of the trained network from its starting point, a measure of its search space. And that term decreases as the network gets larger, for reasons suggested in the last paragraph of the second page.

Second, even once you wrap your head around this idea of distance-from-the-starting-point as a measure of complexity, it seems hard to apply it to some other settings. Many machine learning techniques don’t use an explicit search-with-feedback procedure to learn at all; they just solve some equations based on the data, settling on a unique solution. This isn’t even limited to simple linear regression either; apparently arbitrarily complex functions can be learned this way through something called a kernel machine. So by our “search space” criterion, it would seem these models are perfectly generalizable, no? But empirically this definitely isn’t the case for all of them. A computer can find a squiggly line that passes through every data point perfectly, in no time flat, without search, but suspiciously well-fitted squiggly lines are textbook cases of bad generalization performance. Kernel machines, though, can do pretty well on some classic datasets like handwritten digits.
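
As a concrete example of that no-search learning, here’s a short scikit-learn sketch of a kernel machine (kernel ridge regression with an RBF kernel; the data and hyperparameters are arbitrary choices of mine) fit by solving a single linear system:

```python
# Kernel ridge regression: a closed-form fit (one linear system, no iterative
# search) that, with a narrow RBF kernel and tiny regularization, can pass
# almost exactly through every training point. Data and settings are arbitrary.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, (20, 1))
y_train = np.sin(3 * x_train[:, 0]) + rng.normal(0, 0.3, 20)
x_test = rng.uniform(-1, 1, (200, 1))
y_test = np.sin(3 * x_test[:, 0]) + rng.normal(0, 0.3, 200)

model = KernelRidge(kernel="rbf", gamma=50.0, alpha=1e-6)  # tiny alpha: near-interpolation
model.fit(x_train, y_train)
print("train MSE:", np.mean((model.predict(x_train) - y_train) ** 2))
print("test MSE:", np.mean((model.predict(x_test) - y_test) ** 2))
```

Whether the test error here looks respectable or awful depends heavily on the kernel width and the regularization, which is part of what makes the “no search, yet sometimes it overfits and sometimes it doesn’t” picture so puzzling.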

Sadly I don’t have a good answer to this new mystery. It’s something I’ll want to look into, but who knows how long it will take to solve!

Brief postscript on why anyone should care about this, since we could just accept that the universe gave us a lucky break and not think too hard about it: For one, if we don’t understand why NNs can sometimes generalize, we can’t predict when they won’t, after some changes to the architecture that will inevitably happen as engineering progresses. That would be bad news if we put neural network-based systems in charge of important societal infrastructure of any sort. Also, sometimes neural networks just aren’t a computationally feasible tool for some job, so when we have to bring in other methods, it would be nice to have an idea what the salient features of NNs for generalization are (to copy them). Finally, it’s pretty important to remember that this entire line of reasoning has been based on the assumption that the data in question will always follow the same distribution. In non-jargon, this is more or less the assumption that the same underlying physical, biological, social, whatever processes produce the data every time. Trusting that that assumption will always hold is a recipe for disaster, especially because a model can trick you by kicking ass even on test error, but breaking down when the test data come from a new (but related) process. So if we can get a firm grasp on the basis of generalization in this simpler setting, hopefully that will equip us to design systems that are robust to changes in the data source.