I thought I would begin the year with some thoughts. Recently my focus has turned to psychology.
I am trying to rethink psychological issues from a more mathematically informed perspective. I am considering several ideas.
1 - Let us think of decision-making as a continuous time stochastic process. This could translate into an equation that relates accumulated evidence, drift rate (the quality of the evidence), noise, and a Wiener process. A decision occurs when the function hits a boundary.
2 - Thinking of learning as a continuous process equates it to something like minimizing a prediction error over time. The learning rate becomes a dynamical parameter, not a constant. This suggests an approach to rapid versus slow learners and to pathological learning such as addiction and depression.
3 - Ultimately a relational ontology gets me to something like a neural field model of cognition. Thus the question, how do distributed populations of neurons give rise to stable thoughts, memories, or percepts? If it is a continuous dynamical system, then individual psychological states become attractors in a continuous process; this suggests a dynamic approach to working memory, hallucinations, and various kinds of breakdowns of cognitive ability.
4 - Fechner's law could be a way of looking at subjective experience in quantitative form. Quantitative change can relate to qualitative experience, which implies that mental states should be treated as temporally extended processes.
5 - Consider the Rescorla-Wagner (RW) model: applying it predicts rapid early learning, diminishing returns, and the persistence of early priors. These phenomena are exactly what we see. Overall: psychological systems behave as if they were following steepest descent on an error landscape. Mental content is not static. Meaning arises from trajectories, not representations. The mind is defined by how it changes, not by what it contains. Learning is a differential equation about the self.
----
DDMs // drift-diffusion models of decision-making // decisions emerge from continuous accumulation rather than discrete comparisons. The same formalism handles both perceptual decisions (is that a face?) and value-based choices (should I take this job?). The drift rate acts as a "signal clarity" parameter that explains individual differences and contextual effects.
What happens when boundaries themselves are dynamic or learned? This gets at impulsivity, patience, and how depression might alter decision thresholds.
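As a concrete illustration, here is a minimal simulation of the drift-diffusion process sketched above (function names and parameter values are my own, purely illustrative): evidence accumulates with a drift term plus Gaussian noise until it crosses one of two boundaries.

```python
import random

def ddm_trial(drift, boundary, noise=1.0, dt=0.001, max_t=10.0, rng=None):
    """Simulate one drift-diffusion trial.

    Evidence x accumulates as dx = drift*dt + noise*sqrt(dt)*N(0,1)
    until it crosses +boundary (choice A) or -boundary (choice B).
    Returns (choice, reaction_time).
    """
    rng = rng or random.Random()
    x, t = 0.0, 0.0
    sd = noise * dt ** 0.5
    while t < max_t:
        x += drift * dt + rng.gauss(0.0, sd)
        t += dt
        if x >= boundary:
            return "A", t
        if x <= -boundary:
            return "B", t
    return "timeout", t

# With a positive drift (clear evidence for A), most trials should
# terminate at the upper boundary, and quickly.
rng = random.Random(0)
trials = [ddm_trial(drift=1.5, boundary=1.0, rng=rng) for _ in range(200)]
accuracy = sum(c == "A" for c, _ in trials) / len(trials)
```

Raising `boundary` in this sketch trades speed for accuracy, which is one way to model the impulsivity/patience question below.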
Learning rates as dynamic parameters//The Rescorla-Wagner model with constant α misses something fundamental about adaptive systems—they need to modulate how much they update based on uncertainty, volatility, and context. Bayesian approaches formalize this through precision-weighting of prediction errors.
Addiction might involve excessive learning rates for reward prediction errors in specific circuits.
Depression could involve learned helplessness through overly rigid priors that resist updating.
Anxiety might be hyperactive learning about threats.
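A minimal sketch of precision-weighted updating under Gaussian assumptions (the Kalman-gain form; names and values are illustrative): the effective learning rate is the belief's uncertainty relative to total uncertainty, so it is high when the prior is weak and falls as evidence accumulates.

```python
def precision_weighted_update(belief, belief_var, obs, obs_var):
    """One Bayesian update of a Gaussian belief. The gain (effective
    learning rate) is the belief's uncertainty relative to the total
    uncertainty, so uncertain beliefs update fast and confident ones
    update slowly."""
    gain = belief_var / (belief_var + obs_var)      # dynamic learning rate
    new_belief = belief + gain * (obs - belief)     # precision-weighted PE
    new_var = (1.0 - gain) * belief_var
    return new_belief, new_var, gain

belief, var = 0.0, 4.0          # weak, uncertain prior
gains = []
for obs in [1.0, 1.0, 1.0, 1.0, 1.0]:
    belief, var, g = precision_weighted_update(belief, var, obs, obs_var=1.0)
    gains.append(g)
# gains shrink trial by trial as the belief's precision accumulates
```

In this frame, the hypotheses above become statements about the gain: addiction as an inflated gain for reward errors, depression as a gain pinned near zero for prior-contradicting evidence.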
Neural field dynamics and attractors // If mental states are attractors in a high-dimensional neural state space, then "stability" of a thought or percept means the system has settled into a basin. Working memory becomes actively maintaining a state near an attractor against noise. Hallucinations could be spurious attractors that the system falls into. Cognitive flexibility versus rigidity maps onto the depth and breadth of attractor basins. Creativity might involve noise-driven exploration of state space, while rumination is getting trapped in a tight attractor basin.
Binding problems // How do we maintain a coherent percept of, say, "red ball moving leftward"? If it's a single attractor state rather than separate features that need binding, the problem dissolves.
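A one-dimensional toy model of the attractor picture (illustrative only, not a neural field model proper): a bistable system whose two fixed points play the role of stable percepts. A small initial cue determines which basin the state settles into; a nonzero noise level can occasionally kick it across, which is the flexibility/rumination contrast in miniature.

```python
import random

def settle(x0, steps=2000, dt=0.01, noise=0.0, rng=None):
    """Integrate dx/dt = x - x**3, a bistable system with attractors
    at +1 and -1 and an unstable point at 0. Any nonzero start settles
    into the nearer basin; a noise term (Langevin-style) can at higher
    levels kick the state over the barrier."""
    rng = rng or random.Random(0)
    x = x0
    for _ in range(steps):
        x += (x - x ** 3) * dt + noise * rng.gauss(0.0, dt ** 0.5)
    return x

stable_high = settle(0.2)    # small positive cue -> settles near +1
stable_low = settle(-0.2)    # small negative cue -> settles near -1
```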
Psychophysics, temporal extension // Fechner's law (and Stevens' power law) reveals that subjective experience is a compressed transform of physical intensity. The logarithmic or power-law relationship ensures we're sensitive across vast dynamic ranges. Treating mental states as temporally extended processes rather than instantaneous snapshots changes the problem-set. Consciousness is ‘intentional’ as Husserl stated: awareness is about trajectories and transitions. Mathematically, this is closer to a vector than a scalar. We have escaped the world of frozen moments. This connects to the "specious present" in phenomenology.
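A minimal rendering of Fechner's law (the constants here are arbitrary): each doubling of physical intensity adds the same fixed increment of sensation, which is exactly the compressive property that buys sensitivity across a vast dynamic range.

```python
import math

def fechner_sensation(intensity, threshold=1.0, k=1.0):
    """Fechner's law: perceived magnitude grows with the logarithm of
    physical intensity relative to the detection threshold."""
    return k * math.log(intensity / threshold)

# Each doubling of intensity adds the same subjective increment (k*ln 2),
# compressing many orders of physical magnitude into a narrow range.
increments = [fechner_sensation(2 ** (n + 1)) - fechner_sensation(2 ** n)
              for n in range(5)]
```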
Learning and priors//The Rescorla-Wagner asymptotic learning curve shows that early experiences have outsized influence – not because they're "special" – but because that's when uncertainty is at its highest peak and learning rates also achieve a maximum. This explains some of the power of cultural imprinting, attachment patterns, and why early trauma is so persistent. It's literally foundational to the error landscape you're building on.
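The asymptotic curve is easy to exhibit (the standard RW update; parameter values are arbitrary): trial-by-trial gains shrink geometrically, so the earliest trials move the association the most — the formal core of the "outsized influence" claim above.

```python
def rescorla_wagner(alpha, lam, n_trials, v0=0.0):
    """Classic Rescorla-Wagner update: V <- V + alpha * (lambda - V).
    Returns the association strength after each trial."""
    v, curve = v0, []
    for _ in range(n_trials):
        v += alpha * (lam - v)
        curve.append(v)
    return curve

curve = rescorla_wagner(alpha=0.3, lam=1.0, n_trials=20)
# per-trial change: largest on trial 1, shrinking geometrically after
deltas = [b - a for a, b in zip([0.0] + curve, curve)]
```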
Synthesis//The mind is defined by how it changes, not by what it contains – this is the key.
The predictive processing/active inference approach is distinct from representationalist cognitive science. Mental content is trajectories through state space, not stored symbols. Meaning isn't a mapping between representations and world. Meaning is a pattern of transitions and their consequences for prediction error.
The steepest descent perspective is that thermodynamic systems minimize free energy, which for perception and learning means minimizing prediction error.
Questions … extensions:
• How do multiple timescales interact? You have fast perceptual dynamics, medium learning rates, and slow developmental/cultural priors. How do they couple?
• What role does embodiment play? If meaning is trajectories, then sensorimotor contingencies matter—the same "internal" trajectory might mean different things depending on what actions it affords. The environment/organism boundary blurs.
• Can this framework handle social cognition? Other minds might be attractors we simulate, or empathy could be synchronization of trajectories across coupled dynamical systems.
• How does language fit? Words might be attractor labels or controllers that shape trajectories, rather than symbols that denote representations.
Let’s rebuild psychology on a foundation of continuous dynamical systems theory – since brains are continuous, embodied, dynamical systems, evolved for prediction and control in an environment. The mathematics reveals structural properties common to adaptive systems.
//The model suggests that learning is path dependent. History matters, not just outcomes.
The problem is that an organism must change itself in response to the world. Yet it only has access to prediction errors.
Learning is not about storing facts but about continuous self-modification under uncertainty.
But then the Bayesian approach suggests that people get trapped in early "weight spaces" and lose the ability to grow ... the early environment has an outsized impact. This leads to skepticism ...
// If learning is gradient descent on an error landscape, and early learning carves deep channels in that landscape, then later experience flows along paths already established. The system becomes increasingly constrained by its own history. Early priors aren't just influential—they're constitutive of the space in which later learning happens. You learn how to learn early, and that meta-structure may be nearly impossible to escape. This is an empirical question …
This isn't just about content ("I learned X when young"). It's structural: the very dimensionality of the hypothesis space, the features one is capable of extracting, the errors one is capable of detecting—all are shaped by early experience. A child raised without language during critical periods doesn't just lack vocabulary; they may lack the neural architecture for certain syntactic operations. Cognition is the full development of a capacity, and that development is merely contingent …
The Bayesian formalization makes this stark: Strong early priors that get reinforced become increasingly difficult to overcome. As you accumulate experience, your effective learning rate for anything contradicting those priors approaches zero. The precision-weighting means you discount evidence that doesn't fit. This is adaptive—it prevents you from being blown around by noise—but it also means you can become trapped in a locally optimal but globally sub-optimal configuration.
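The entrenchment claim can be made concrete with the same Gaussian machinery (an illustrative sketch, not a claim about any specific neural implementation): after many confirming observations, the gain on a single strongly contradictory observation is nearly zero, so the belief barely moves.

```python
def bayes_update(mean, var, obs, obs_var=1.0):
    """Gaussian belief update; the gain is the effective learning rate,
    which shrinks as the belief's precision (1/var) accumulates."""
    gain = var / (var + obs_var)
    return mean + gain * (obs - mean), (1.0 - gain) * var, gain

mean, var = 0.0, 1.0
for _ in range(100):                    # 100 observations confirming "0"
    mean, var, _ = bayes_update(mean, var, 0.0)
# A single strongly contradictory observation now has almost no effect:
new_mean, _, late_gain = bayes_update(mean, var, 10.0)
```

Here `late_gain` is on the order of 1/100: the contradictory datum is discounted almost entirely, which is the "effective learning rate approaches zero" point in code.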
Epistemically: If you're trapped in your weight space, how do you know your current mind isn't just an artifact of your learning history? Your confidence in any belief might just reflect that you've settled into a deep attractor, not that you've found truth. You can't "see far enough" because your very perceptual/cognitive apparatus was shaped by contingent early experience.
Practically: This has dark implications for therapy, education, social change. If early trauma or poverty or cultural programming fundamentally shapes the architecture of learning, then later interventions are trying to work within a constrained space. You're not just overcoming false beliefs—you're trying to reshape the system that generates beliefs. How does one fight this?
Existentially: The continuous self-modification means there's no stable "you" evaluating the process from outside. The thing doing the learning IS the learning process. You can't step outside your trajectory to assess it objectively. You are the learning process up to this point …
However
Metaplasticity and multiple timescales//Neural systems have plasticity of plasticity. Learning rates themselves can be learned. The Bayesian framework assumes fixed priors, but real brains can detect when their model is systematically failing and increase uncertainty, reopening the learning process. This is what happens in "insight" and "cognitive restructuring"—a meta-level change that allows ground-level updating. It's rare, difficult, and possible.
Noise and exploration// Pure gradient descent gets stuck in local minima. But biological systems have noise—from neural stochasticity, from neuromodulators, from sleep, from drugs, from stress.
Noise can kick you out of stable attractors and allow exploration of different regions of state space.
Creativity, psychedelics, meditation, trauma = ways the system escapes its own stability.
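A toy illustration of noise-driven escape (the double-well function and all constants are my own): pure gradient descent started in the shallow well of f(x) = (x² − 1)² + 0.3x stays there, while adding Langevin noise that anneals to zero typically lets the walk hop the barrier early and then settle in the deeper well.

```python
import random

def f(x):
    """Double-well error landscape: shallow minimum near x = +1,
    deeper minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def grad(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(x, noise0=0.0, steps=5000, dt=0.01, seed=0):
    """Gradient descent with annealed Langevin noise: the noise level
    decays linearly to zero, so the walk can cross barriers early and
    then freezes into whichever basin it occupies."""
    rng = random.Random(seed)
    for i in range(steps):
        sigma = noise0 * (1.0 - i / steps)
        x += -grad(x) * dt + sigma * rng.gauss(0.0, dt ** 0.5)
    return x

x_pure = descend(1.0)                   # noiseless: trapped in shallow well
x_noisy = min((descend(1.0, noise0=1.5, seed=s) for s in range(20)), key=f)
```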
Social and cultural scaffolding//You're not learning in isolation. Other minds, institutions, practices can provide structure that your individual system can't generate alone. Language, mathematics, scientific method—these are cultural technologies that extend the hypothesis space beyond what any individual could construct. They're external ratchets preventing complete path-dependence. These are prized gearworks in the toolkit of thinking.
Predictive error itself as the escape//The system is optimizing for prediction error minimization, but the world is complex and non-stationary. Persistent large prediction errors—especially in domains that matter—can force architectural change.
Suffering is the signal that your model is inadequate.
The question is whether organisms can tolerate enough sustained prediction error to reshape themselves rather than just avoiding or explaining it away …
Objection: mechanisms exist, but they're partial, difficult, and constrained by the very history they're trying to escape. The mathematician who learns to "think differently" is still constrained by having learned mathematics within one particular cultural and cognitive framework.
Result? -- a kind of pragmatic fallibilism?
• The current understanding is path-dependent
• The learner can't fully escape his or her learning history
• But one can still work toward less inadequate models
• Self-modification under uncertainty is the only game in town
• Skepticism is part of the trajectory—a recognition of constraint that might slightly loosen the constraint—the impossible possibility
The fact that we can formulate this problem, recognize the trap, feel the vertigo—that's already a form of meta-level cognitive flexibility that the pure path-dependence story appears to make impossible.
Or// This very thought is just my own weight space trying to avoid its own vertigo.
The thing doing the learning IS the learning process.
//One addition – a lemma? -- The Rescorla-Wagner model with a dynamic α models the plasticity of the organism in response to its environment.
In the standard Rescorla-Wagner model, α (learning rate) is a fixed parameter representing the "associability" of a stimulus or the general learning capacity of the organism. It's treated as a property of the stimulus or the organism that stays constant across learning.
But if α becomes dynamic—if it varies based on uncertainty, surprise, context, volatility—then it's no longer just a parameter in a learning rule. It becomes a measure of plasticity itself. The organism's ability to modify itself in response to prediction errors is now part of what's being tracked – this is the thing we are trying to optimize.
This shifts the ontology:
• Static α: The organism has a fixed capacity to learn from experience
• Dynamic α: Plasticity is itself adaptive—the organism learns how much to learn
Mathematically, dynamic α might depend on:
• Uncertainty about the environment (higher uncertainty → higher α)
• Volatility (rapid environmental change → higher α to track it)
• Prediction error history (persistent errors → increase α; accuracy → decrease α)
• Neuromodulatory state (dopamine, norepinephrine modulating plasticity)
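One way to sketch such a dynamic α in code, close in spirit to the Pearce-Hall associability rule (the update constants and clamps here are illustrative choices): α drifts toward the recent absolute prediction error, so it decays in stable environments and rebounds at change-points.

```python
def dynamic_rw(rewards, alpha0=0.2, eta=0.3, alpha_min=0.01, alpha_max=0.9):
    """Rescorla-Wagner with a dynamic learning rate. Alpha rises when
    recent absolute prediction errors are large (the model is failing)
    and decays toward a floor when predictions are accurate; eta sets
    how fast alpha itself adapts -- plasticity of plasticity."""
    v, alpha = 0.0, alpha0
    alphas = []
    for r in rewards:
        pe = r - v
        v += alpha * pe
        # nudge alpha toward recent |PE| (Pearce-Hall style), clamped
        alpha += eta * (min(abs(pe), 1.0) - alpha)
        alpha = max(alpha_min, min(alpha_max, alpha))
        alphas.append(alpha)
    return v, alphas

# A stable environment followed by a sudden reversal: alpha decays
# during the stable phase and reopens after the change-point.
rewards = [1.0] * 30 + [0.0] * 10
v, alphas = dynamic_rw(rewards)
```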
Psychologically, this captures:
• Developmentally sensitive periods (high α early, decreasing with age)
• Attention (α higher for attended stimuli)
• Emotional arousal (strong emotions temporarily increase α)
• Freezing (α approaching zero for domains with entrenched priors)
A learning system with dynamic learning rates is a system that regulates its own plasticity. The trajectory of α over time maps the organism's capacity for self-modification—where it remains flexible, where it has rigidified, and under what conditions it can reopen learning.
This makes α(t) a kind of meta-parameter—a window into the organism's relationship with its own changeability—approaching a formalism for expressing the problem of moral philosophy.