Cognitive Architecture Overview
A layered approach to cognitive architecture design, separating oversight, intelligence, reasoning, and execution into composable components with specialized engines.
The Core Problem
Most AI systems are built as monoliths. A single model handles everything: understanding intent, reasoning about problems, taking actions, and evaluating results. This works for demos but breaks down in production systems that need reliability, debuggability, and graceful degradation.
The alternative is layered architecture with specialized engines: decomposing cognitive functions into distinct layers, where each layer contains purpose-built components for specific tasks. This pattern has a long history in cognitive science—from Newell's unified theories of cognition through modern architectures like ACT-R, Soar, CLARION, and LIDA.
The Layer Model
Drawing from decades of cognitive architecture research, a practical system separates concerns across four primary layers:
┌─────────────────────────────────────────────────────────────────┐
│                         OVERSIGHT LAYER                         │
│  Monitoring, safety evaluation, metacognition, self-correction  │
├─────────────────────────────────────────────────────────────────┤
│                       INTELLIGENCE LAYER                        │
│    Planning, learning, multi-model coordination, uncertainty    │
├─────────────────────────────────────────────────────────────────┤
│                         REASONING LAYER                         │
│          Generation, routing, validation, aggregation           │
├─────────────────────────────────────────────────────────────────┤
│                         EXECUTION LAYER         	              │
│         Tool registry, orchestration, sandboxed runtime         │
└─────────────────────────────────────────────────────────────────┘
Each layer has distinct responsibilities. Information flows both up (results, observations, state) and down (goals, instructions, context). This mirrors the hierarchical control structures found in biological cognition and has proven effective in systems like Soar and LIDA.
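The bidirectional flow can be made concrete with a small message type. This is an illustrative Python sketch; the `Layer` ordering and field names are invented for this example, not part of any particular framework:

```python
from dataclasses import dataclass
from enum import IntEnum

class Layer(IntEnum):
    """Layers ordered bottom-up, so numeric comparison gives direction."""
    EXECUTION = 0
    REASONING = 1
    INTELLIGENCE = 2
    OVERSIGHT = 3

@dataclass
class Message:
    source: Layer
    target: Layer
    payload: dict

    @property
    def direction(self) -> str:
        # Goals and instructions flow down; results and observations flow up.
        return "down" if self.target < self.source else "up"

# Example: oversight pushes a constraint down to the reasoning layer.
msg = Message(Layer.OVERSIGHT, Layer.REASONING, {"constraint": "max_depth=3"})
```

Typing the direction explicitly, rather than inferring it from call sites, makes the flow inspectable in logs and telemetry.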
Oversight Layer
The oversight layer provides meta-level monitoring and evaluation. Drawing from global workspace theory (as implemented in LIDA) and metacognitive research, this layer doesn't generate outputs directly—it monitors, evaluates, and guides lower layers.
Key Functions
- Safety evaluation — Checking outputs against constraints before delivery
- Quality monitoring — Detecting errors, inconsistencies, and degraded performance
- Confidence calibration — Estimating uncertainty and knowing when to abstain
- Self-correction — Triggering revision when problems are detected
- Resource governance — Preventing runaway computation or infinite loops
Opinion: Oversight should be structurally separate from generation. When the same component both generates and evaluates outputs, it tends to approve its own work. This separation creates the kind of healthy tension that catches errors before they propagate.
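One way to make that separation structural is a gate that only releases output a distinct evaluator has approved, feeding rejection reasons back for revision. A minimal sketch, assuming the `generate` and `evaluate` callables and the `Verdict` shape are supplied by the surrounding system (all names here are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    approved: bool
    reason: str

def oversight_gate(generate: Callable[[str], str],
                   evaluate: Callable[[str], Verdict],
                   prompt: str,
                   max_revisions: int = 2) -> str:
    """Generate, then evaluate with a *separate* component; revise on rejection."""
    output = generate(prompt)
    for _ in range(max_revisions):
        verdict = evaluate(output)
        if verdict.approved:
            return output
        # Self-correction: feed the rejection reason back to the generator.
        output = generate(f"{prompt}\n[revise: {verdict.reason}]")
    # Abstention beats shipping an unapproved output.
    raise RuntimeError("oversight rejected all candidates; abstaining")
```

The bounded revision loop doubles as resource governance: the gate cannot spin forever approving its own work, because it never approves anything.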
Intelligence Layer
The intelligence layer handles complex reasoning that goes beyond simple generation. This maps to what ACT-R calls the "procedural module" and what Soar implements through its decision cycle and chunking mechanisms.
Key Functions
- Multi-model coordination — Multiple models (or invocations) deliberating, critiquing, and synthesizing. Similar to ensemble methods but with structured interaction.
- Planning — Goal decomposition, dependency analysis, plan generation. Breaking complex objectives into achievable steps.
- Learning — Runtime adaptation from experience. Not weight updates, but structured learning that persists across sessions (similar to Soar's chunking).
- Uncertainty quantification — Calibrated confidence estimates. Distinguishing epistemic uncertainty (lack of knowledge) from aleatoric uncertainty (inherent randomness).
Opinion: Multi-model deliberation is underrated. Single-model inference produces confident-sounding outputs even when wrong. Structured disagreement—models arguing positions and critiquing each other—surfaces errors that single inference misses.
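A toy version of structured deliberation, purely illustrative: models are plain callables here, and a real system would exchange critiques and arguments rather than just revised answers.

```python
from collections import Counter
from typing import Callable, Sequence

def deliberate(models: Sequence[Callable[[str], str]],
               question: str,
               rounds: int = 2) -> str:
    """Each model answers; in later rounds each sees the other positions
    and may revise. The final answer is the majority position."""
    answers = [m(question) for m in models]
    for _ in range(rounds - 1):
        context = f"{question}\nOther positions: {sorted(set(answers))}"
        answers = [m(context) for m in models]
    winner, _ = Counter(answers).most_common(1)[0]
    return winner
```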
Reasoning Layer
The reasoning layer is a pipeline: generate candidates, route to appropriate handlers, validate outputs, aggregate results. This corresponds to what FORR calls the "advisor" pattern—multiple specialized reasoners contributing to decisions.
Pipeline Components
- Generator — Produces candidate outputs. May invoke multiple models or strategies in parallel.
- Router — Directs requests to appropriate specialists. Pattern-based with learned routing preferences (similar to mixture-of-experts).
- Validator — Checks outputs against constraints. Schema validation, safety checks, consistency verification.
- Aggregator — Combines multiple outputs into coherent responses. Handles conflict resolution and synthesis.
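The four components above can be sketched as a single pass. This is a schematic Python sketch, not a reference implementation; the callable signatures are invented for the example:

```python
from typing import Callable

def reasoning_pipeline(request: str,
                       generators: dict[str, Callable[[str], list[str]]],
                       route: Callable[[str], str],
                       validate: Callable[[str], bool],
                       aggregate: Callable[[list[str]], str]) -> str:
    """Route the request to a specialist generator, drop candidates that
    fail validation, and aggregate the survivors into one response."""
    specialist = generators[route(request)]
    candidates = [c for c in specialist(request) if validate(c)]
    if not candidates:
        raise ValueError("no candidate passed validation")
    return aggregate(candidates)
```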
Opinion: The reasoning layer should be as simple as possible while being as sophisticated as necessary. Most requests don't need multi-model deliberation or complex planning—they need fast, validated generation. The reasoning layer is the common path; upper layers are invoked selectively based on complexity and risk.
Execution Layer
Where the system touches the world. The execution layer is intentionally constrained—it can only do what it's explicitly allowed to do. This draws from principles of least privilege and defense in depth.
Components
- Tool Registry — Catalog of available capabilities. Tools are registered with schemas, permissions, and constraints.
- Orchestrator — Coordinates multi-tool operations. Handles sequencing, parallelism, and error propagation.
- Sandboxed Runtime — Isolated execution environment. Strict resource limits, controlled access, ephemeral state.
Opinion: The execution layer should be maximally paranoid. Whitelists beat blacklists. If a capability isn't explicitly registered and approved, it doesn't exist. Default-deny is essential for systems that can take real-world actions.
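Default-deny can be encoded directly in the registry: registration and approval are separate steps, and an unapproved tool is indistinguishable from a nonexistent one at invocation time. An illustrative sketch:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Tool:
    fn: Callable[..., Any]
    allowed: bool = False  # default-deny: tools start unapproved

class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = Tool(fn)      # registered but NOT yet approved

    def approve(self, name: str) -> None:
        self._tools[name].allowed = True  # the explicit whitelist step

    def invoke(self, name: str, *args: Any) -> Any:
        tool = self._tools.get(name)
        if tool is None or not tool.allowed:
            # Unknown and unapproved look the same from the outside.
            raise PermissionError(f"tool {name!r} is not approved")
        return tool.fn(*args)
```

A production registry would also carry schemas, resource limits, and per-caller permissions; the point here is only the two-step register/approve flow.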
Cross-Cutting Concerns
Memory
Memory isn't a layer—it's a capability that every layer needs. This aligns with how LIDA and ACT-R treat memory as multiple interacting systems (working memory, declarative memory, procedural memory) rather than a single store.
Opinion: Context windows are caches, not databases. Critical state—user preferences, task progress, accumulated knowledge—should live in durable storage with explicit read/write operations. Relying solely on context leads to forgotten information and inconsistent behavior.
Event-Driven Coordination
Layers communicate through events, not just function calls. This enables loose coupling while maintaining coordination—similar to how LIDA's "codelets" broadcast to a global workspace.
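A minimal publish/subscribe bus is enough to show the coupling property: layers react to events they care about without holding direct references to each other. Illustrative only; a real system would add topic hierarchies, delivery guarantees, and error isolation:

```python
from collections import defaultdict
from typing import Any, Callable

class EventBus:
    """Broadcast events to whoever subscribed; publishers know no subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        for handler in self._subscribers[topic]:
            handler(payload)
```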
Lifecycle Management
Complex systems need explicit initialization and shutdown sequences. State needs to be persisted. In-flight operations need to complete or be cleanly cancelled. Resources need to be released in the right order.
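The ordering requirement is the crux: start components in dependency order, stop them in reverse, so nothing outlives something it depends on. A sketch (the `start`/`stop` protocol is assumed, not prescribed):

```python
class Lifecycle:
    """Start components in dependency order; stop them in reverse order."""

    def __init__(self) -> None:
        self._started: list = []

    def start(self, components) -> None:
        for c in components:
            c.start()
            self._started.append(c)  # record order for reverse shutdown

    def shutdown(self) -> None:
        for c in reversed(self._started):
            c.stop()  # in-flight work completes or is cancelled here
        self._started.clear()
```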
Practical Trade-offs
Complexity vs. Capability
More layers and components means more coordination overhead. The architecture needs to justify its complexity through improved reliability, debuggability, or capability. Each component should exist because its absence caused problems.
Latency vs. Thoroughness
Full-stack processing—oversight evaluation, multi-model deliberation, validated generation, sandboxed execution—takes longer than a single model call. The architecture needs escape valves: fast paths for simple requests, tiered processing based on complexity and risk.
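The escape valve amounts to a cheap scoring step in front of the stack: low-complexity, low-risk requests skip straight to validated generation. A sketch, with the scoring function and both paths left as hypothetical callables:

```python
from typing import Callable

def tiered_dispatch(request: str,
                    score_risk: Callable[[str], float],
                    fast_path: Callable[[str], str],
                    full_stack: Callable[[str], str],
                    threshold: float = 0.5) -> str:
    """Route by estimated complexity/risk: cheap path below the threshold,
    full oversight-deliberation-validation stack above it."""
    if score_risk(request) < threshold:
        return fast_path(request)
    return full_stack(request)
```

The scorer itself must be cheap, or the fast path stops being fast; in practice it is often a small classifier or a handful of heuristics.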
Explicitness vs. Emergence
This approach explicitly engineers cognitive functions rather than hoping they emerge from training. This is a bet: that explicit, inspectable systems are more reliable than emergent ones. The bet may be wrong, but at least it's testable.
What This Enables
- Debuggability — When something goes wrong, you can trace through the layers to locate the failing component.
- Testability — Components can be tested in isolation.
- Gradual improvement — Upgrade one component without touching others.
- Graceful degradation — If a component fails, the system can continue with reduced capability rather than complete failure.
- Observability — Each component can emit telemetry. You can see how the system processes requests, not just what it outputs.
Open Questions
- What's the right granularity? — When does splitting into more components help versus hurt? Too few and you lose benefits; too many and coordination dominates.
- Should architecture be learned? — Current architectures are hand-designed. Could the structure itself be learned or evolved?
- How do we validate oversight? — If the oversight layer approves an output, how do we know the approval is correct?
- What's the right interface between layers? — Function calls, events, shared memory? Each has trade-offs.
- How does this scale? — Does this architecture work for systems with hundreds of specialists, or do fundamentally different approaches become necessary?
Conclusion
Cognitive architecture is the art of decomposition: breaking complex behavior into pieces that can be built, tested, and improved independently. The specific architecture presented here—four layers with specialized components—is one approach drawing on decades of cognitive science research.
What matters more than the specific structure is the commitment to explicit design. If you can't explain how your system makes decisions, you can't trust those decisions. If you can't test components in isolation, you can't improve them reliably. Architecture is the foundation that makes everything else possible.
Further Reading
Foundational Works
- Newell, A. (1990). Unified Theories of Cognition — The foundational framing of cognitive architectures
- Anderson, J. R. (1983). The Architecture of Cognition — Precursor to ACT-R; understanding production systems
- Laird, J. E. (2012). The Soar Cognitive Architecture — Definitive book on goal-driven, rule-based architecture
Modern Overviews
- Kotseruba, I. & Tsotsos, J. K. (2025). The Computational Evolution of Cognitive Architectures — Comprehensive modern survey
- Ferreira, M. I. A. (ed.). Cognitive Architectures (Springer) — Focus on natural and artificial cognition
Key Survey Papers
- Kotseruba & Tsotsos (2020). "40 Years of Cognitive Architectures: Core Cognitive Abilities and Practical Applications" — Artificial Intelligence Review
- Duch, Oentaryo & Pasquier (2008). "Cognitive Architectures: Where Do We Go From Here?" — AGI Conference
Specific Architectures
- ACT-R — Production-rule based architecture modeling human cognition
- Soar — Goal-driven architecture with learning via chunking
- LIDA — Biologically inspired with global workspace emphasis
- CLARION — Hybrid symbolic/subsymbolic with explicit/implicit learning
- FORR — Weighted advisor systems and decision heuristics