When the environment is unknown, Bayes-optimal agents may fail to act optimally even asymptotically. A Bayesian agent acting in a multi-agent environment learns to predict the other agents' policies if its prior assigns positive probability to them (in other words, its prior contains a grain of truth). Only small classes of policies are known to have a grain of truth, and the literature contains several related impossibility results. Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the grain of truth problem.
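The grain-of-truth property can be illustrated with a toy example (not from the paper): a Bayesian predictor whose hypothesis class contains the true data-generating process. Because the prior puts positive weight on the truth, the posterior predictions converge to the true predictive probabilities. All names and parameter values below are illustrative assumptions.

```python
import random

random.seed(0)

# Hypothesis class: Bernoulli(p) coins. The true coin is in the class,
# and the uniform prior gives it positive weight -- a "grain of truth".
hypotheses = [0.1, 0.3, 0.5, 0.7, 0.9]
true_p = 0.7
posterior = [1.0 / len(hypotheses)] * len(hypotheses)

for _ in range(2000):
    x = 1 if random.random() < true_p else 0  # observe one sample
    # Bayes update: weight each hypothesis by its likelihood of x.
    posterior = [w * (p if x else 1 - p) for w, p in zip(posterior, hypotheses)]
    z = sum(posterior)
    posterior = [w / z for w in posterior]

# Posterior predictive probability of observing a 1 next:
pred = sum(w * p for w, p in zip(posterior, hypotheses))
print(round(pred, 2))  # close to true_p
```

If the prior assigned zero weight to every hypothesis near the truth, no amount of data could make the predictions accurate; that failure mode is exactly what the grain-of-truth condition rules out.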
In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of policies that contains all computable policies as well as Bayes-optimal policies for every lower semicomputable prior over the class.
Moreover, agents based on Thompson sampling converge to play ε-Nash equilibria in arbitrary unknown computable multi-agent environments. While these results are purely theoretical, we show that they can be computationally approximated arbitrarily closely.

What does optimal reasoning look like for resource-bounded agents in the physical world? For safety purposes, a mathematical equation defining general intelligence is more desirable than an impressive but poorly understood code kludge. MIRI focuses on AI approaches that can be made transparent (e.g., precisely specified decision algorithms, not genetic algorithms), so that humans can understand why AI systems behave as they do. Much of our research is therefore aimed at putting theoretical foundations under AI robustness work. We consider settings where traditional decision and probability theory frequently break down: settings where computation is expensive, there is no sharp agent/environment boundary, multiple agents exist, or self-reference is admitted.
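The Thompson-sampling rule behind the convergence result above can be sketched in a single-agent setting (the paper's result concerns multi-agent environments; this Bernoulli-bandit toy, with assumed arm means and horizon, only shows the sampling mechanism): maintain a posterior per action, sample a parameter from each posterior, and act greedily with respect to the samples.

```python
import random

random.seed(1)

true_means = [0.2, 0.5, 0.8]   # unknown to the agent
alpha = [1.0] * 3              # Beta(1, 1) conjugate priors, one per arm
beta = [1.0] * 3
plays = [0] * 3

for _ in range(5000):
    # Sample a plausible mean for each arm from its Beta posterior...
    samples = [random.betavariate(a, b) for a, b in zip(alpha, beta)]
    arm = samples.index(max(samples))  # ...and act greedily on the samples.
    reward = 1 if random.random() < true_means[arm] else 0
    alpha[arm] += reward               # conjugate Beta update
    beta[arm] += 1 - reward
    plays[arm] += 1

print(plays)  # the best arm (index 2) receives most of the plays
```

Because the agent keeps sampling from its posterior rather than committing to a point estimate, it continues to explore suboptimal arms at a vanishing rate, which is the mechanism behind the asymptotic guarantees.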