Robo-Cortex: A Self-Evolving Embodied Agent via Dual-Grain Cognitive Memory and Autonomous Knowledge Induction

Abstract

The ability to navigate and interact with complex environments is central to real-world embodied agents, yet navigation in unseen environments remains challenging due to “experiential amnesia,” where existing trajectory-driven or reactive policies fail to synthesize generalizable strategies from past interactions. We propose Robo-Cortex, a self-evolving framework that enables robots to autonomously induce navigation heuristics and refine cognitive strategies through a continuous reflection-adaptation loop. By abstracting success patterns and failure pitfalls into natural-language heuristics, Robo-Cortex enables a transition from passive execution to active strategy evolution. Our core innovation is an Autonomous Knowledge Induction (AKI) mechanism that distills multimodal trajectories into a structured Navigation Heuristic Library for knowledge generalization. The architecture further incorporates a Dual-Grain Cognitive Memory system, comprising a Short-term Reflective Memory (SRM) for real-time local progress analysis, and a Long-term Principle Memory (LPM) that abstracts past trajectories into reusable guiding and cautionary principles. To ensure robust decision-making, we introduce a multimodal Imagine-then-Verify loop, where a world model simulates potential outcomes and a VLM-based evaluator validates action plans. Extensive evaluations on IGNav, AR, and AEQA show that Robo-Cortex consistently outperforms strong baselines in both task success and exploration efficiency, with gains of up to +4.16% SPL over the strongest prior method and up to +15.30% SPL under heuristic transfer to unseen environments. Preliminary real-world robotic experiments further support the effectiveness of Robo-Cortex in physical settings.
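The gains above are reported in SPL (Success weighted by Path Length). The page does not show its evaluation code, so the following is only a minimal sketch of the commonly used definition, where each successful episode contributes the ratio of the shortest-path length to the longer of the shortest and actual path lengths. The function name and argument names are illustrative assumptions.

```python
from typing import Sequence


def spl(successes: Sequence[bool],
        shortest: Sequence[float],
        taken: Sequence[float]) -> float:
    """Average Success weighted by Path Length over a set of episodes.

    successes[i]: whether episode i reached the goal
    shortest[i]:  shortest-path length from start to goal (assumed > 0)
    taken[i]:     length of the path the agent actually travelled
    """
    if not (len(successes) == len(shortest) == len(taken)):
        raise ValueError("all inputs must have the same length")
    if len(successes) == 0:
        raise ValueError("at least one episode is required")

    total = 0.0
    for ok, l, p in zip(successes, shortest, taken):
        if ok:
            # A failed episode contributes 0; a successful one is
            # discounted by how much longer the path was than optimal.
            total += l / max(p, l)
    return total / len(successes)


# Three episodes: optimal success, success with a 2x detour, failure.
example = spl([True, True, False], [10.0, 5.0, 8.0], [10.0, 10.0, 8.0])
```

In this example the three episodes contribute 1.0, 0.5, and 0.0, so the average is 0.5.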

Overview

Overview of Robo-Cortex. Robo-Cortex is a self-evolving embodied navigation framework with three components: an Imagine-then-Verify planning loop for closed-loop decision making, Dual-Grain Cognitive Memory for reflection at two temporal scales, and Autonomous Knowledge Induction for distilling transferable navigation heuristics from experience. Together, they form an interaction-reflection-conceptualization-adaptation loop for continual strategy evolution.
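One way to picture the Imagine-then-Verify loop is as a propose, simulate, score, and select cycle. The sketch below is an assumption-laden illustration rather than the authors' implementation: `world_model` and `evaluator` are hypothetical callables standing in for the world model and the VLM-based evaluator, and the acceptance `threshold` is an invented parameter.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Candidate:
    action: str      # proposed action plan
    imagined: str    # outcome predicted by the (hypothetical) world model
    score: float     # evaluator's confidence that the plan helps the task


def imagine_then_verify(
    observation: str,
    candidates: Sequence[str],
    world_model: Callable[[str, str], str],
    evaluator: Callable[[str, str, str], float],
    threshold: float = 0.5,
) -> Optional[Candidate]:
    """Simulate each candidate action, score it, and keep the best one.

    Returns None when no candidate clears the threshold, which a caller
    could treat as a signal to re-plan.
    """
    scored = []
    for action in candidates:
        imagined = world_model(observation, action)        # imagine
        score = evaluator(observation, action, imagined)   # verify
        scored.append(Candidate(action, imagined, score))
    if not scored:
        return None
    best = max(scored, key=lambda c: c.score)
    return best if best.score >= threshold else None


# Toy stand-ins so the sketch runs without any model.
def toy_world_model(obs: str, action: str) -> str:
    return f"{obs}; then {action}"


def toy_evaluator(obs: str, action: str, imagined: str) -> float:
    return 0.9 if action == "go to doorway" else 0.2


chosen = imagine_then_verify(
    "hallway view", ["turn around", "go to doorway"],
    toy_world_model, toy_evaluator,
)
```

With these toy stand-ins, `chosen.action` is `"go to doorway"`. Raising `threshold` above 0.9 makes the function return `None` instead.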

Paradigm

Comparison of prior embodied-agent paradigms and Robo-Cortex. Memory-augmented navigation mainly preserves scene-level and trajectory-level context for decision making, but does not explicitly form strategy-level memory. Self-improving agents can refine behavior through critique, abstraction, or skill/model updates, yet are typically not grounded in closed-loop embodied interaction or do not induce transferable embodied heuristics. Robo-Cortex unifies reflective memory, principle memory, and heuristic induction in a self-evolving embodied framework that distills multimodal interaction experience into transferable navigation heuristics for future reflection, planning, and strategy evolution.

Method

Internal Workflow of Robo-Cortex. Robo-Cortex integrates four modules through a shared memory graph: (a) the Imagine-then-Verify Planning Loop, (b) Short-term Reflective Memory (SRM), (c) Long-term Principle Memory (LPM), and (d) Autonomous Knowledge Induction (AKI). During execution, SRM analyzes recent subtasks for local progress and failure patterns, while LPM retrieves related past experiences as principle-level guidance. In parallel, AKI continually abstracts accumulated trajectories into reusable navigation heuristics and feeds them back into future reflection and planning, enabling continual strategy evolution over time.
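The two memory grains can be sketched as a bounded window of recent subtasks plus a searchable store of principles. This is a simplified illustration under stated assumptions, not the system's shared memory graph: the class name, the word-overlap retrieval, and the window size are all hypothetical, and a real system would more plausibly use learned embeddings and a VLM for reflection.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass
class Principle:
    text: str
    kind: str  # "guiding" or "cautionary"


class DualGrainMemory:
    """Toy short-term window plus long-term principle store."""

    def __init__(self, window: int = 3) -> None:
        # Short-term grain: only the most recent subtasks are kept.
        self.recent = deque(maxlen=window)
        # Long-term grain: principles abstracted from past trajectories.
        self.principles: List[Principle] = []

    def record_subtask(self, description: str, succeeded: bool) -> None:
        self.recent.append((description, succeeded))

    def local_failure_rate(self) -> float:
        """Fraction of failed subtasks inside the short-term window."""
        if not self.recent:
            return 0.0
        failures = sum(1 for _, ok in self.recent if not ok)
        return failures / len(self.recent)

    def add_principle(self, text: str, kind: str) -> None:
        self.principles.append(Principle(text, kind))

    def retrieve(self, query: str, top_k: int = 2) -> List[Principle]:
        """Return principles sharing the most words with the query."""
        q = set(query.lower().split())
        scored = [(len(q & set(p.text.lower().split())), p)
                  for p in self.principles]
        scored = [sp for sp in scored if sp[0] > 0]
        scored.sort(key=lambda sp: sp[0], reverse=True)  # stable sort
        return [p for _, p in scored[:top_k]]


memory = DualGrainMemory(window=2)
memory.record_subtask("leave bedroom", True)
memory.record_subtask("cross corridor", False)
memory.record_subtask("enter kitchen", False)
memory.add_principle("check the doorway before entering a corridor", "guiding")
memory.add_principle("avoid a dead-end corridor without goal cues", "cautionary")
memory.add_principle("kitchens usually contain a sink", "guiding")
hints = memory.retrieve("long corridor ahead")
```

Here the window holds only the last two subtasks, both failures, so a planner reading `local_failure_rate()` would see a strong signal to reflect, and `hints` contains the two corridor-related principles.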

Real World Experiments

Real-world benefit of short-term reflection. In an image-goal navigation task, the robot without SRM drifts away from the target after losing goal-relevant cues at a critical step. With SRM, Robo-Cortex detects the misalignment, reflects on the failure, and recovers by returning toward the last known goal-consistent region, leading to successful completion.
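The recovery behavior described here can be approximated by remembering the last position where the view still matched the goal and returning to it after several consecutive mismatches. The sketch below is a hypothetical simplification: the similarity scores, the `sim_threshold`, and the `patience` parameter are assumptions introduced for illustration, whereas the actual system reflects with a VLM rather than a fixed rule.

```python
from typing import List, Optional, Tuple

Position = Tuple[float, float]


def recovery_target(
    trace: List[Tuple[Position, float]],
    sim_threshold: float = 0.5,
    patience: int = 2,
) -> Optional[Position]:
    """Return the position to go back to, or None if no recovery is needed.

    trace holds (position, goal_similarity) pairs in time order, where
    goal_similarity is an assumed score in [0, 1] for how well the
    current view matches the goal image.
    """
    last_good: Optional[Position] = None
    misses = 0
    for position, similarity in trace:
        if similarity >= sim_threshold:
            last_good = position   # still goal-consistent here
            misses = 0
        else:
            misses += 1            # goal-relevant cues are missing
    if last_good is not None and misses >= patience:
        return last_good
    return None


drifting = [((0.0, 0.0), 0.8), ((1.0, 0.0), 0.7),
            ((2.0, 0.0), 0.3), ((3.0, 0.0), 0.2)]
target = recovery_target(drifting)
```

For the `drifting` trace, the last two steps fall below the threshold, so `target` is `(1.0, 0.0)`, the last goal-consistent position.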

Real World Demo

Success Case

Living Room
Supermarket
Bedroom

Failure Case

Living Room
Supermarket
Bedroom