Kaelbling, L. P. et al. Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1), 99–134 (1998).
Maia, T. V. Reinforcement learning, conditioning, and the brain: Successes and challenges. Cogn. Affect. Behav. Neurosci. 9(4), 343–364 (2009).
Braun, D. A. et al. Structure learning in action. Behav. Brain Res. 206(2), 157–165 (2010).
Gershman, S. J. et al. Context, learning, and extinction. Psychol. Rev. 117(1), 197–209 (2010).
Wilson, R. C. & Niv, Y. Inferring relevance in a changing world. Front. Hum. Neurosci. 5, 189 (2012).
Dayan, P. & Berridge, K. C. Model-based and model-free Pavlovian reward learning: revaluation, revision, and revelation. Cogn. Affect. Behav. Neurosci. 14(2), 473–492 (2014).
Wilson, R. C. et al. Orbitofrontal cortex as a cognitive map of task space. Neuron 81(2), 267–279 (2014).
Gershman, S. J. et al. Discovering latent causes in reinforcement learning. Curr. Opin. Behav. Sci. 5, 43–50 (2015).
Tervo, D. G. R. et al. Toward the neural implementation of structure learning. Curr. Opin. Neurobiol. 37, 99–105 (2016).
Niv, Y. Learning task-state representations. Nat. Neurosci. 22(10), 1544–1553 (2019).
Salzman, C. D. & Fusi, S. Emotion, cognition, and mental state representation in amygdala and prefrontal cortex. Annu. Rev. Neurosci. 33, 173–202 (2010).
Zaidi, Q. Visual inferences of material changes: Color as clue and distraction. Wiley Interdiscip. Rev. Cogn. Sci. 2(6), 686–700 (2011).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015).
Sutton, R. S. & Barto, A. G. Reinforcement learning: an introduction. In Adaptive Computation and Machine Learning, Vol. xviii (MIT Press, 1998).
Dayan, P. & Daw, N. D. Decision theory, reinforcement learning, and the brain. Cogn. Affect. Behav. Neurosci. 8(4), 429–453 (2008).
Lee, D. et al. Neural basis of reinforcement learning and decision making. Annu. Rev. Neurosci. 35(1), 287–308 (2012).
Tolman, E. C. Cognitive maps in rats and men. Psychol. Rev. 55(4), 189 (1948).
Hemmi, J. M. & Menzel, C. R. Foraging strategies of long-tailed macaques, Macaca fascicularis: Directional extrapolation. Anim. Behav. 49(2), 457–464 (1995).
Wilson, R. C. et al. Humans use directed and random exploration to solve the explore—exploit dilemma. J. Exp. Psychol. Gen. 143(6), 2074 (2014).
Kolling, N. et al. Neural mechanisms of foraging. Science 336(6077), 95–98 (2012).
Kaplan, R. et al. The neural representation of prospective choice during spatial planning and decisions. PLoS Biol. 15(1), e1002588 (2017).
Kolling, N. et al. Prospection, perseverance, and insight in sequential behavior. Neuron 99(5), 1069-1082.e7 (2018).
Meder, B. et al. Stepwise versus globally optimal search in children and adults. Cognition 191, 103965 (2019).
Bromberg-Martin, E. S. & Hikosaka, O. Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron 63(1), 119–126 (2009).
Bromberg-Martin, E. S. & Hikosaka, O. Lateral habenula neurons signal errors in the prediction of reward information. Nat. Neurosci. 14(9), 1209–1216 (2011).
Blanchard, T. C. et al. Orbitofrontal cortex uses distinct codes for different choice attributes in decisions motivated by curiosity. Neuron 85(3), 602–614 (2015).
Iigaya, K. et al. The modulation of savouring by prediction error and its effects on choice. Elife 5, e13747 (2016).
Wang, M. Z. & Hayden, B. Y. Monkeys are curious about counterfactual outcomes. Cognition 189, 1–10 (2019).
White, J. K. et al. A neural network for information seeking. Nat. Commun. 10(1), 1–19 (2019).
Foley, N. C. et al. Parietal neurons encode expected gains in instrumental information. Proc. Natl. Acad. Sci. 114(16), E3315–E3323 (2017).
Horan, M. et al. Parietal neurons encode information sampling based on decision uncertainty. Nat. Neurosci. 22(8), 1327–1335 (2019).
Stephens, D. W. & Krebs, J. R. Foraging Theory (Princeton University Press, 1986).
Hills, T. T. et al. Optimal foraging in semantic memory. Psychol. Rev. 119(2), 431 (2012).
Metcalfe, J. & Jacobs, W. J. People’s study time allocation and its relation to animal foraging. Behav. Proc. 83(2), 213–221 (2010).
Pirolli, P. L. T. Information Foraging Theory: Adaptive Interaction with Information (Oxford University Press, 2009).
Charnov, E. L. Optimal foraging, the marginal value theorem. Theor. Popul. Biol. 9(2), 129–136 (1976).
McNamara, J. Optimal patch use in a stochastic environment. Theor. Popul. Biol. 21(2), 269–288 (1982).
Davidson, J. D. & El Hady, A. Foraging as an evidence accumulation process. PLoS Comput. Biol. 15(7), e1007060 (2019).
Stephens, D. W. et al. Foraging: Behavior and Ecology (University of Chicago Press, 2007).
McNamara, J. M. & Houston, A. I. Optimal foraging and learning. J. Theor. Biol. 117(2), 231–249 (1985).
Fu, W.-T. & Pirolli, P. SNIF-ACT: A cognitive model of user navigation on the World Wide Web. Hum. Comput. Interact. 22(4), 355–412 (2007).
Osu, R. et al. Practice reduces task relevant variance modulation and forms nominal trajectory. Sci. Rep. 5(1), 1–17 (2015).
Gallistel, C. R. et al. The rat approximates an ideal detector of changes in rates of reward: Implications for the law of effect. J. Exp. Psychol. Anim. Behav. Process. 27(4), 354 (2001).
Inclan, C. & Tiao, G. C. Use of cumulative sums of squares for retrospective detection of changes of variance. J. Am. Stat. Assoc. 89(427), 913–923 (1994).
Todd, P. M. & Hills, T. T. Foraging in mind. Curr. Dir. Psychol. Sci. 29(3), 309–315 (2020).
Pirolli, P. L. T. Information Foraging Theory: Adaptive Interaction with Information (Oxford University Press, 2007).
Giraldeau, L.-A. & Caraco, T. Social foraging theory. In Social Foraging Theory (Princeton University Press, 2000).
Rescorla, R. A. & Wagner, A. R. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In Classical Conditioning II: Current Research and Theory (eds Black, A. H. & Prokasy, W. F.) (Appleton-Century-Crofts, 1972).
Sutton, R. S. et al. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif. Intell. 112(1–2), 181–211 (1999).
Momennejad, I. Learning structures: Predictive representations, replay, and generalization. Curr. Opin. Behav. Sci. 32, 155–166 (2020).
Littman, M. L. A tutorial on partially observable Markov decision processes. J. Math. Psychol. 53(3), 119–125 (2009).
Kaelbling, L. P. et al. Reinforcement learning: A survey. J. Artif. Intell. Res. 4, 237–285 (1996).
Shannon, C. E. & Weaver, W. The Mathematical Theory of Communication (University of Illinois Press, 1963).
Crupi, V. et al. Generalized information theory meets human cognition: Introducing a unified framework to model uncertainty and information search. Cogn. Sci. 42(5), 1410–1456 (2018).
Miller, G. Informavores. In The Study of Information: Interdisciplinary Messages (eds Machlup, F. & Mansfield, U.) 111–113 (Wiley, 1983).
Coenen, A. et al. Asking the right questions about the psychology of human inquiry: Nine open challenges. Psychon. Bull. Rev. 26(5), 1548–1587 (2019).
Gureckis, T. & Markant, D. Active learning strategies in a spatial concept learning game. In Proceedings of the Annual Meeting of the Cognitive Science Society (2009).
Markant, D. & Gureckis, T. Does the utility of information influence sampling behavior? In Proceedings of the Annual Meeting of the Cognitive Science Society (2012).
Oaksford, M. & Chater, N. A rational analysis of the selection task as optimal data selection. Psychol. Rev. 101(4), 608 (1994).
Oaksford, M. & Chater, N. Rationality in an Uncertain World: Essays on the Cognitive Science of Human Reasoning (Psychology Press, 1998).
Nelson, J. D. Finding useful questions: On Bayesian diagnosticity, probability, impact, and information gain. Psychol. Rev. 112(4), 979 (2005).
Oaksford, M. & Chater, N. Bayesian Rationality: The Probabilistic Approach to Human Reasoning (Oxford University Press, 2007).
Nelson, J. D. et al. Experience matters: Information acquisition optimizes probability gain. Psychol. Sci. 21(7), 960–969 (2010).
Nelson, J. D. et al. Children’s sequential information search is sensitive to environmental probabilities. Cognition 130(1), 74–80 (2014).
Schmidhuber, J. Curious model-building control systems. In 1991 IEEE International Joint Conference on Neural Networks (IEEE, 1991).
Thrun, S. & Möller, K. Active exploration in dynamic environments. In Advances in Neural Information Processing Systems (1992).
Thrun, S. Exploration in active learning. In Handbook of Brain Science and Neural Networks 381–384 (1995).
Settles, B. Active learning. Synth. Lect. Artif. Intell. Mach. Learn. 6(1), 1–114 (2012).
Markant, D. B. & Gureckis, T. M. Is it better to select or to receive? Learning via active and passive hypothesis testing. J. Exp. Psychol. Gen. 143(1), 94 (2014).
Griffiths, T. L. & Tenenbaum, J. B. Structure and strength in causal induction. Cogn. Psychol. 51(4), 334–384 (2005).
Kemp, C. & Tenenbaum, J. B. Structured statistical models of inductive reasoning. Psychol. Rev. 116(1), 20 (2009).
Koechlin, E. Prefrontal executive function and adaptive behavior in complex environments. Curr. Opin. Neurobiol. 37, 1–6 (2016).
Wason, P.C. Reasoning. In New Horizons in Psychology (eds Foss, B.) 135–151 (1966).
Wason, P. C. Reasoning about a rule. Q. J. Exp. Psychol. 20(3), 273–281 (1968).
Gregory, R. On how little information controls so much behaviour. Ergonomics 13(1), 25–35 (1970).
Snyder, M. & Swann, W. B. Hypothesis-testing processes in social interaction. J. Pers. Soc. Psychol. 36(11), 1202 (1978).
Trope, Y. & Bassok, M. Confirmatory and diagnosing strategies in social information gathering. J. Pers. Soc. Psychol. 43(1), 22 (1982).
Klayman, J. & Ha, Y.-W. Confirmation, disconfirmation, and information in hypothesis testing. Psychol. Rev. 94(2), 211 (1987).
Siskind, J. M. A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition 61(1–2), 39–91 (1996).
Trope, Y. & Liberman, A. Social hypothesis testing: Cognitive and motivational mechanisms (1996).
Poletiek, F. H. Hypothesis-Testing Behaviour (Psychology Press, 2013).
Markant, D. B. et al. Self-directed learning favors local, rather than global, uncertainty. Cogn. Sci. 40(1), 100–120 (2016).
Pirolli, P. & Card, S. Information foraging. Psychol. Rev. 106(4), 643 (1999).
Najemnik, J. & Geisler, W. S. Optimal eye movement strategies in visual search. Nature 434(7031), 387–391 (2005).
Vergassola, M. et al. ‘Infotaxis’ as a strategy for searching without gradients. Nature 445(7126), 406 (2007).
Johnson, A. et al. The hippocampus and exploration: dynamically evolving behavior and neural representations. Front. Hum. Neurosci. 6, 216 (2012).
Manohar, S. G. & Husain, M. Attention as foraging for information and value. Front. Hum. Neurosci. 7, 711 (2013).
Good, I. J. Weight of evidence, corroboration, explanatory power, information and the utility of experiments. J. R. Stat. Soc. Ser. B (Methodol.) 22(2), 319–331 (1960).
Myung, J. I. & Pitt, M. A. Optimal experimental design for model discrimination. Psychol. Rev. 116(3), 499 (2009).
Markant, D. & Gureckis, T. Category learning through active sampling. In Proceedings of the Annual Meeting of the Cognitive Science Society (2010).
Markant, D. & Gureckis, T. Modeling information sampling over the course of learning. In Proceedings of the Annual Meeting of the Cognitive Science Society (2011).
Tsividis, P., et al. Information selection in noisy environments with large action spaces. In Proceedings of the Annual Meeting of the Cognitive Science Society (2014).
Rich, A. S. & Gureckis, T. M. Exploratory choice reflects the future value of information. Decision 5, 177 (2017).
Nelson, J. & Movellan, J. Active inference in concept learning. Adv. Neural Inf. Process. Syst. 13 (2000).
Steyvers, M. et al. Inferring causal networks from observations and interventions. Cogn. Sci. 27(3), 453–489 (2003).
Schulz, L. E. et al. Preschool children learn about causal structure from conditional interventions. Dev. Sci. 10(3), 322–332 (2007).
Najemnik, J. & Geisler, W. S. Eye movement statistics in humans are consistent with an optimal search strategy. J. Vis. 8(3), 4 (2008).
Gopnik, A. The Philosophical Baby: What Children’s Minds Tell Us About Truth, Love & the Meaning of Life (Random House, 2009).
Bonawitz, E. B. et al. Just do it? Investigating the gap between prediction and action in toddlers’ causal inferences. Cognition 115(1), 104–117 (2010).
Cook, C. et al. Where science starts: Spontaneous experiments in preschoolers’ exploratory play. Cognition 120(3), 341–349 (2011).
Bramley, N. R. et al. Conservative forgetful scholars: How people learn causal structure through sequences of interventions. J. Exp. Psychol. Learn. Mem. Cogn. 41(3), 708 (2015).
Ruggeri, A. & Lombrozo, T. Children adapt their questions to achieve efficient search. Cognition 143, 203–216 (2015).
McCormack, T. et al. Children’s use of interventions to learn causal structure. J. Exp. Child Psychol. 141, 1–22 (2016).
Rothe, A. et al. Do people ask good questions?. Comput. Brain Behav. 1(1), 69–89 (2018).
Meier, K. M. & Blair, M. R. Waiting and weighting: Information sampling is a balance between efficiency and error-reduction. Cognition 126(2), 319–325 (2013).
Yang, S.C.-H. et al. Active sensing in the categorization of visual patterns. Elife 5, e12215 (2016).
Nelson, J.D., et al. Towards a theory of heuristic and optimal planning for sequential information search (2018).
Badre, D. et al. Frontal cortex and the discovery of abstract action rules. Neuron 66(2), 315–326 (2010).
Wu, C. M. et al. Generalization guides human exploration in vast decision spaces. Nat. Hum. Behav. 2(12), 915–924 (2018).
Schulz, E. et al. Finding structure in multi-armed bandits. Cogn. Psychol. 119, 101261 (2020).
Schapiro, A. C. et al. Neural representations of events arise from temporal community structure. Nat. Neurosci. 16(4), 486–492 (2013).
Collins, A. & Koechlin, E. Reasoning, learning, and creativity: frontal lobe function and human decision-making. PLoS Biol. 10(3), e1001293 (2012).
Collins, A. G. & Frank, M. J. Cognitive control over learning: Creating, clustering, and generalizing task-set structure. Psychol. Rev. 120(1), 190 (2013).
Collins, A. G. et al. Human EEG uncovers latent generalizable rule structure during learning. J. Neurosci. 34(13), 4677–4685 (2014).
Donoso, M. et al. Foundations of human reasoning in the prefrontal cortex. Science 344(6191), 1481–1486 (2014).
Collins, A. G. The cost of structure learning. J. Cogn. Neurosci. 29(10), 1646–1655 (2017).
Xia, L. & Collins, A. G. Temporal and state abstractions for efficient learning, transfer, and composition in humans. Psychol. Rev. 128, 643 (2021).
Hills, T. T. Animal foraging and the evolution of goal-directed cognition. Cogn. Sci. 30(1), 3–41 (2006).
Viswanathan, G. M. et al. The Physics of Foraging: An Introduction to Random Searches and Biological Encounters (Cambridge University Press, 2011).
Hills, T. T. et al. Adaptive Lévy processes and area-restricted search in human foraging. PLoS ONE 8(4), e60488 (2013).
Hills, T. T. et al. The central executive as a search process: Priming exploration and exploitation across domains. J. Exp. Psychol. Gen. 139(4), 590 (2010).
Cain, M. S. et al. A Bayesian optimal foraging model of human visual search. Psychol. Sci. 23, 0956797612440460 (2012).
Wolfe, J. M. When is it time to move to the next raspberry bush? Foraging rules in human visual search. J. Vis. 13(3), 1–17 (2013).
Calhoun, A. J. et al. Maximally informative foraging by Caenorhabditis elegans. Elife 3, e04220 (2014).
Rothe, A., et al. Asking and evaluating natural language questions. In CogSci (2016).
Huberman, B. A. et al. Strong regularities in world wide web surfing. Science 280(5360), 95–97 (1998).
Church, K. & Hanks, P. Word association norms, mutual information, and lexicography. Comput. Linguist. 16(1), 22–29 (1990).
Payne, S. J. et al. Discretionary task interleaving: Heuristics for time allocation in cognitive foraging. J. Exp. Psychol. Gen. 136(3), 370 (2007).
Wilke, A. et al. Fishing for the right words: Decision rules for human foraging behavior in internal search tasks. Cogn. Sci. 33(3), 497–529 (2009).
Payne, S. & Duggan, G. Giving up problem solving. Mem. Cognit. 39(5), 902–913 (2011).
Hills, T. T. et al. Foraging in semantic fields: How we search through memory. Top. Cogn. Sci. 7(3), 513–534 (2015).
Turrin, C. et al. Social resource foraging is guided by the principles of the Marginal Value Theorem. Sci. Rep. 7(1), 11274 (2017).
Saraiya, P., et al. Effective features of algorithm visualizations. In Proceedings of the 35th SIGCSE Technical Symposium on Computer Science Education (2004).
Lam, H. A framework of interaction costs in information visualization. IEEE Trans. Vis. Comput. Graph. 14(6), 1149–1156 (2008).
Ye, W. & Damian, M. F. Exploring task switch costs in a color-shape decision task via a mouse tracking paradigm. J. Exp. Psychol. Hum. Percept. Perform. 48(1), 8 (2022).
Araujo, C. et al. Eye movements during visual search: The costs of choosing the optimal path. Vis. Res. 41(25–26), 3613–3625 (2001).
Baloh, R. W. et al. Quantitative measurement of saccade amplitude, duration, and velocity. Neurology 25(11), 1065–1065 (1975).
van Beers, R. J. The sources of variability in saccadic eye movements. J. Neurosci. 27(33), 8757–8770 (2007).
Hoppe, D. & Rothkopf, C. A. Multi-step planning of eye movements in visual search. Sci. Rep. 9(1), 1–12 (2019).
Callaway, F. et al. Fixation patterns in simple choice reflect optimal information sampling. PLoS Comput. Biol. 17(3), e1008863 (2021).
Wedel, M., et al. Modeling eye movements during decision making: A review. Psychometrika 1–33 (2022).
Oaten, A. Optimal foraging in patches: A case for stochasticity. Theor. Popul. Biol. 12(3), 263–285 (1977).
Ollason, J. Learning to forage—optimally?. Theor. Popul. Biol. 18(1), 44–56 (1980).
Wyckoff, L. B. Jr. The role of observing responses in discrimination learning. Part I. Psychol. Rev. 59(6), 431 (1952).
Wyckoff, L. Toward a quantitative theory of secondary reinforcement. Psychol. Rev. 66(1), 68 (1959).
Blanchard, R. The effect of S− on observing behavior. Learn. Motiv. 6(1), 1–10 (1975).
Dinsmoor, J. A. Observing and conditioned reinforcement. Behav. Brain Sci. 6(4), 693–704 (1983).
Roper, K. L. & Zentall, T. R. Observing behavior in pigeons: The effect of reinforcement probability and response cost using a symmetrical choice procedure. Learn. Motiv. 30(3), 201–220 (1999).
Vasconcelos, M. et al. Irrational choice and the value of information. Sci. Rep. 5(1), 1–12 (2015).
Prokasy, W. F. Jr. The acquisition of observing responses in the absence of differential external reinforcement. J. Comp. Physiol. Psychol. 49(2), 131 (1956).
Kreps, D. M. & Porteus, E. L. Temporal resolution of uncertainty and dynamic choice theory. Econom. J. Econom. Soc. 185–200 (1978).
Beierholm, U. R. & Dayan, P. Pavlovian-instrumental interaction in ‘observing behavior’. PLoS Comput. Biol. 6(9), e1000903 (2010).
Basile, B. M. & Hampton, R. R. Monkeys recall and reproduce simple shapes from memory. Curr. Biol. 21(9), 774–778 (2011).
Gottlieb, J. & Oudeyer, P.-Y. Towards a neuroscience of active sampling and curiosity. Nat. Rev. Neurosci. 19(12), 758–770 (2018).
Calhoun, A. J. & Hayden, B. Y. The foraging brain. Curr. Opin. Behav. Sci. 5, 24–31 (2015).
Barack, D. L. & Platt, M. L. Engaging and exploring: Cortical circuits for adaptive foraging decisions. In Impulsivity 163–199 (Springer, 2017).