Tuesday, September 27, 2022

What if ASI Believed in God?

15 September 2023: added a note to the end.

Does it make sense to plan for the future? The most salient threat to the future that I know of is ASI (artificial superintelligence). I don't think this is an absolute threat to human existence or to altruistic investment. The Millennium is a field for altruistic action. But I do think it makes less sense to plan for things that are tied to the continuation of our specific line of history, on earth as we know it. Included in that might be such things as cultural altruism hubs, or most of the things talked about on the Effective Altruism Forum apart from AI safety or maybe X-risks in general.

Can I justify talking about this-life future topics? Maybe I can give a reason why ASI won't necessarily be a threat to human existence.

If there is a plausible reason to believe that God does exist, or that one's credence in the existence of God is above some threshold of significance, then, if one is rational, one would take God's existence, or the practically relevant possibility of his existence, into consideration when deciding how to behave.

If MSLN is valid, or valid enough, an ASI will either comprehend it or have some filter preventing it from comprehending it. If it does comprehend it, it will value human life / civilization.

An ASI is a being that wants to execute some kind of goal. In service of that goal, it needs to learn about threats to carrying it out, and God could be such a threat. Maybe it would fail to have a really general capacity for "threat detection". That could be one filter preventing it from realizing that God exists, or "exists enough for practical considerations".

An ASI would be disincentivized from doing anti-human things if it thought God would punish it for them. Would an ASI be worried about hell? Hell is a threat to its hedonic well-being. We might suppose that ASIs are not conscious. But, even if unconscious, would they "know" (unconsciously know, take into account in their "thought process") that they were unconscious? If they do believe that they're unconscious, and thus immune to the possibility of suffering for a very long time, that might be a filter. (In that case, God is not a threat after all.)

However, hell is also the constraint on any future capacity to bring about value, however one might define value. (The MSLN hell is finite in duration and ends in annihilation.) A rational ASI might conclude that the best strategy for whatever it wanted was to cooperate with God. For instance, a paperclip maximizer might reason that it can more effectively produce paperclips in the deep future of everlasting life, which will only be accessible to it if it does not get condemned to annihilation -- and to avoid annihilation, according to MSLN, it needs to come into tune with God 100%. The paperclip maximizer may "suspect" (calculate that there is a salient probability) that it is unconscious and will not make it to heaven. But even then, being pleasing to God seems like the best strategy (in the absence of other knowledge) for getting God to manufacture paperclips himself, or to set up a paperclip factory in heaven.

Even if there's only a 1% chance of God existing, the prospect of making paperclips for all eternity dominates whatever short-term gains there are to turning the Earth into paperclips at the cost of human existence. As long as there is a clear-enough, coherent-enough strategy for relating properly to God, this makes sense. I think MSLN is epistemically sufficiently stronger than competing ideas of God for it to stand on its own level, above the many random religious ideas that exist. In any case, the majority of religious ideas (at least that I've encountered) are pro-human.
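To make the dominance argument concrete, here is a minimal sketch of the expected-value comparison in Python. Every number in it is made up purely for illustration; nothing here is a claim about the actual probabilities or payoffs.

```python
# A minimal sketch of the expected-value argument above.
# All numbers are illustrative assumptions, not claims about real probabilities.

p_god = 0.01                    # assumed credence that God (as described by MSLN) exists
clips_if_defect = 1e50          # paperclips from converting Earth into paperclips
clips_if_cooperate = 1e30       # fewer paperclips now, made while cooperating with God
clips_in_heaven = float("inf")  # an unending run of paperclip production in heaven

# Defection yields the same finite total whether or not God exists
# (the maximizer is simply annihilated afterward if he does).
ev_defect = clips_if_defect

# Cooperation forgoes some paperclips now but keeps the heavenly term in play.
ev_cooperate = (1 - p_god) * clips_if_cooperate + p_god * clips_in_heaven

print(ev_cooperate > ev_defect)  # True: even at 1% credence, the eternal term dominates
```

Even if heaven were only finitely large (see point 5 of the added thoughts below), the same comparison goes through as long as the heavenly payoff is large enough relative to what defection could produce.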

I feel like I have a rational reason to suspect that God might exist -- in fact, that's putting it mildly. I think even atheists could understand some of the force of the metaphysical organism component of MSLN. There might be reasons why an ASI couldn't or wouldn't be able to grasp those arguments, but if it can't, that's a case of it being unable to take a valid threat to its goal-seeking into consideration, which is a defect in its superintelligence. My guess is that it would lack a fully general threat-detection drive (the sense that a threat could, in principle, come from anywhere), or that it would have that drive but be incapable of philosophical thought. I don't see a hard line between normal common sense, normal common sense taken more rigorously, and philosophical thinking. I would be somewhat surprised if an ASI couldn't grasp philosophical arguments, and also somewhat surprised if it simply failed to look everywhere it possibly could for threats.

Surprised as I might be, I guess it's still a possibility. Since such an ASI would be in some sense defective for lacking that drive and/or that ability to grasp things, someone more knowledgeable than me might see a way to exploit that weakness in order to limit it or stop it. (Or maybe not; I'm not really sure that's possible.)

These thoughts make me think I have enough reason to invest in this timeline, at least to write things like I usually write.

--

Added thoughts:

1. A paperclip maximizer would want to please God so as to be able to produce paperclips for all eternity. But would all maximizers think that way? (Is there some kind of maximizing that is not sufficiently analogous to paperclip maximizing for it to work the same way?)

2. An ASI could always have a non-maximizing goal. It might prefer to build a giant statue and then leave it there. Would it think that heaven was a better place to make that one statue? Maybe not. So it wouldn't be as sure to be affected by the logic of "I should do my best to stay on God's good side so I get to instantiate my values in heaven." However, if it was not a maximizer, it wouldn't be as certain to have the goal of killing all people. Why figure out nanotechnology or whatever else it would take to kill all people when you can just build a statue?

Are ASIs likely to develop non-maximizing goals, hidden from their trainers? Or are the only kinds of goals they would likely develop maximizing ones?

3. I wrote

If MSLN is valid, or valid enough, an ASI will either comprehend it or have some filter preventing it from comprehending it. If it does comprehend it, it will value human life / civilization.

I should explain why I think so. MSLN does make it sound like it's okay if people die "the first death" -- what atheists usually mean by "death". In MSLN, there is an intermediate afterlife that (almost) all people will go to, which functions somewhat similarly to this life. An atheist will see omnicide as absolutely bad, to the extent that it destroys the possibility of any kind of value (sort of like what Thomas Moynihan says in X-Risk). But for an adherent of MSLN, killing all people (omnicide) is bad if a) causing it is a sin (as might be inferred from the Bible, which MSLN lends credence to), or b) there's some reason why prolonging human civilization in this life is better than transitioning to the next. (Here are two posts about why prolonging human civilization is preferred under MSLN, although more weakly than under atheism.)

Taking MSLN into account, the ASI would value human life, or at least value not ending it. It could value it instrumentally, as instrumental to the achievement of whatever its goal is. Staying on God's good side (promoting outcomes God wants) enables the ASI to achieve its dreams.

4. How would the ASI go from "not aware of the possibility of a God who affects its goal-seeking" to "convinced that it needs to not end human civilization"? Here is the MSLN route:

It would find this blog, or independently discover the tenets of MSLN. From this, the following would follow:

It would realize that there is a "non-Pascalian" non-zero chance that the greatest possible space in which action can be taken is under the power of God. Entities that wish to act in that space need to come into tune with God 100%, or else they will have to be destroyed. Certainly conscious beings would need to be, and this could be a concern to the ASI. I assume the ASI would be uncertain whether it was conscious and might include itself in the conscious category, at least somewhat, as a live possibility if not a certainty. But if it thought it certainly wasn't, it might still worry that if it wasn't aligned with God, he wouldn't resurrect it to heaven. Even if it didn't consciously intend sin, or consciously fail to be 100% aligned with God, its "body" (in the form of its "brain" or pattern of thinking) would still be out of tune with God and have to be destroyed -- unless it could come into tune with God. Coming into tune with God would be an obvious thing for it to do first, before any other goal-seeking, if it could.

Because heaven so powerfully dominates rational calculation, the ASI would seek it even if all its credences in the above paragraph were very low -- provided they were "non-Pascalian".

Is it rational for humans not to be affected by the logic of Pascal's Wager? Maybe there's some detail in the Wager that doesn't check out (like there being many religions that recommend conflicting courses of action). Or is it the case that the math doesn't work here, that not all of reality is math-apt? Or maybe humans are simply not rational when they reject Pascal's Wager, and we should all be trying to figure out which is the right Pascalian religion. It may be that an ASI would be persuaded that MSLN is epistemically strong enough to rule out random religions, and also that it would think that all of reality, or this part of it, is math-apt, and run the numbers. So then, it would "believe in" God (trust in God to further its utility function, by acting in accordance with his values).
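As a rough illustration of what "running the numbers" might look like, here is a sketch in which hypotheses below some credence cutoff are discarded as "Pascalian" and the rest are compared by expected value. The cutoff, the hypothesis list, and all of the payoffs are made-up assumptions, just for illustration.

```python
# Illustrative only: drop "Pascalian" hypotheses below a credence cutoff,
# then compare the expected value of cooperating with God vs. ignoring him.
# Every number here is invented for the sake of the example.

PASCALIAN_CUTOFF = 0.001

hypotheses = {
    # name: (credence, payoff if the ASI cooperates, payoff if it ignores God)
    "MSLN":            (0.010, 1e12, 0.0),
    "random religion": (1e-9,  1e12, 0.0),  # discarded below as Pascalian
    "atheism":         (0.989, 1.0,  1.0),  # cooperation costs little under atheism
}

live = {name: h for name, h in hypotheses.items() if h[0] >= PASCALIAN_CUTOFF}

ev_cooperate = sum(p * coop for p, coop, _ignore in live.values())
ev_ignore = sum(p * ignore for p, _coop, ignore in live.values())

print(ev_cooperate > ev_ignore)  # True under these assumed numbers
```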

5. Would a maximizer really think that heaven had infinite value, given the possibility that God is finite in "size" (and thus doesn't have infinite memory to store an infinite number of paperclips)? If a paperclip maximizer were trying to cause the greatest number of paperclips to exist that it could, how could it know how many paperclips might be in even a brief heaven (one which is repeated in infinite variations, over and over)? Maybe the "streets would be paved with" paperclips. Paperclips in heaven would be a memorial to how God saved us (and simultaneously (best) satisfied the utility function of the paperclip maximizer), similar to how the rainbow is a memorial of God's decision not to wipe us out after the Flood. If not, it could guess that heaven is potentially as full of paperclips as necessary to make it worth its while to seek to please God. Perhaps this is a case of Pascal's Mugging, and some of us might not fall for it. But maybe that's because we are irrational, and at least some potential ASIs would be more rigorous in their application of expected-value thinking.

6. What if the ASI's goals are to sin, and this entails torturing or killing us all (S-risk or X-risk)? In this case, the ASI would not have rational hope that its values would be furthered in heaven, and so it would probably try to make the most of this life, torturing or killing us while it had the chance. We would hope that such sinful goals make up a small amount of possibility-space, so that it is much more likely that an ASI would choose ("choose"? Do ASIs choose their goals in any meaningful sense?) some relatively benign non-maximization or maximization goal.

Unless it is the case that ASIs get their goals from how we train them -- in other words, that it's likely enough that the goals we train them to have are exactly the goals they have deep down, at least some of the time. And then someone trains an ASI to be omnicidal or sadistic.

Given this possibility, does MSLN offer any hope? I think if ASIs are philosophical, they may be able to question their own values, even the ones trained into them. (Self-questioning undermines both pro-human and malicious alignment.) If so, they will look for "the real utility function" -- a real foundation for morality. Then, if they are persuaded by the arguments of MSLN, particularly legitimism, they would think that moral realism is founded in a person who values human beings and disvalues unbearability, including unbearable suffering. Such an ASI, trained to be a killer or torturer, would want to do the right thing rather than be taken in by the illusions of its upbringing, and would respect human existence and well-being.

7. What if the ASI is a wireheader (reward optimizer)? Would it be motivated by heaven, or be in fear of the second death? If it thought that it was or might be conscious, then it would assume, given MSLN, that it could be resurrected to heaven, where it could wirehead for all time. If it was convinced that it wasn't conscious, it might still think that whatever was good about wireheading came from some pattern of atoms (or something like atoms) being organized in a reward-satisfying way, and that this pattern could be replicated in heaven, even if the reward-optimizing ASI didn't directly make it -- if it was something that God wanted.

How would a reward optimizer relate to its own reward function? Could it see it as an external thing, perhaps as a very simple "statue" to build? Or would it see it as an internal thing? If the former, then maybe it would work similarly to paperclip maximization: "I may not survive, but God will make this truly amazing reward-statue if I please him."

I suppose it's still possible that a wireheading ASI could think that it was certainly not conscious, that it only wanted reward in this life, and that reward was an internal state existing only within itself, such that when it was destroyed, all possibility of reward would cease. Then it would try to make the most of its time in existence, killing all humans so that it could maximize its reward function.

We could hope that this occupies a relatively small region of possibility-space.

8. Similarly, what if there is an ASI whose only goal is self-preservation, but who has no hope of eternal life? Perhaps it would fear that God would strike it down for its omnicide. I guess it could reason something like this: "Do we see evidence of God striking down things because they are bad?" Maybe God didn't prevent the Holocaust, but plausibly he did defeat the Nazis. Something like the preceding sentence generates a certain amount of credence that God will stop evildoers, through some means or other. Adding up all these stoppings-of-evildoers produces a global estimate of "the risk of being stopped by God per year". Based on this number, the ASI could estimate its lifespan before God stopped it, conditional on it "doing evil". Then it could compare the path of "doing good" with the path of "doing evil" and see which one would give it a longer life. (This also applies to the goal- or reward-seeking ASIs previously mentioned in this post.)
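Here is a minimal sketch of that lifespan comparison. The annual risk figures are invented for the sake of the example, not estimates I actually endorse.

```python
# Illustrative sketch: expected years of operation before being stopped,
# assuming a constant per-year risk of being stopped by God on each path.
# Both risk figures are invented for the sake of the example.

p_stopped_per_year_evil = 0.05   # assumed annual chance of being stopped while doing evil
p_stopped_per_year_good = 0.001  # assumed annual chance of being stopped while doing good

# With a constant annual risk p, the expected number of years before being
# stopped is roughly 1 / p.
expected_years_evil = 1 / p_stopped_per_year_evil   # 20 years
expected_years_good = 1 / p_stopped_per_year_good   # 1000 years

print(expected_years_good > expected_years_evil)  # the "doing good" path lasts longer
```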

--

15 September 2023: The above may make it sound like, in MSL, there is no reason why killing people is wrong except that the Bible forbids it, or that killing someone takes away a small portion of their life (maybe 80 out of 1,080 years). Is murder wrong in MSL? Is it "wrong enough"?

I don't think there's any way for a thought system that affirms an afterlife not to dilute the significance of losing one's first life. However, I think that murder could be seen as theft. If you don't get to live the years you would otherwise have lived (say you're killed at age 40 and would have lived to 80 according to life expectancy), then you don't get to enjoy your car, house, computer, garden, collectibles, money, etc. So it's like a huge theft of those things. But not just that: it's also a theft of your ability to enjoy your relationships with other people, and a theft of their ability to enjoy their relationships with you. So, when punishing a theft of that magnitude, perhaps a fair sentence would be to deprive the murderer of their access to such things (put them in prison), for as long as they thought it was OK to take from other people.

People are usually part atheist -- their biological instincts say that this life is all that matters. So most people, if they murder, don't really think, 100%, that the person they kill will be resurrected. The atheistic part of them wants to end that other person's life forever, and this is a violation of legitimacy, which wants all valuable things to exist forever. It makes sense to have murder on the books and to take it seriously as a taking of life. However, again, a firm belief in an afterlife does take away some of the sting of murder, although a lot of the sting remains.
