Project Overview

M.I.R.A. — Modular Interactive Robotic Agent

Embodied desktop AI interface with expressive behavior, modular cognition, and local action execution.

Status: Active software prototype

M.I.R.A. is a software-first robotic assistant prototype designed around the idea that an intelligent interface should not only answer, but also appear attentive, reactive, and embodied. The current implementation focuses on a desktop companion interface: a PySide6 application with animated eyes, conversational input, session memory, intent recognition, local actions, and a debug panel for tuning expressive behavior in real time.

The long-term objective is to use this software core as the interaction layer for a future physical assistant. Instead of starting from motors and sensors, the project first develops the agent architecture: perception of user input, cognition, action selection, UI feedback, expressive state transitions, and a modular path toward voice, vision, and robotics integration.


Why M.I.R.A.

The previous naming separated the cognitive core and the physical robot into two identities. M.I.R.A. simplifies the concept into a single project name that can scale from the current desktop prototype to a future embodied robotic platform.

Modular

Independent components for cognition, actions, UI, state management, and expression.

Interactive

Text-based interaction today, with a clear path toward voice, camera input, and multimodal arbitration.

Robotic Agent

A software agent designed to become the behavioral layer of a future physical assistant.

Main Features

Expressive Face

Animated eyes with blinking, idle motion, emotional states, eyelid deformation, asymmetry, and smooth transitions.
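To illustrate the flavor of this procedural animation, the sketch below shows one way blink timing and eyelid motion could be driven: a jittered interval between blinks and a triangular close/open curve during each blink. The function names and constants are illustrative, not the project's actual API.

```python
import random

def next_blink_delay(base_s: float = 4.0, jitter: float = 0.5) -> float:
    """Randomized gap between blinks so idle motion avoids a metronome feel."""
    return base_s * random.uniform(1.0 - jitter, 1.0 + jitter)

def eyelid_openness(t: float, blink_duration: float = 0.15) -> float:
    """Triangular close/open curve over one blink; t in seconds from blink start.
    Returns 1.0 for fully open, 0.0 for fully closed."""
    if t < 0 or t > blink_duration:
        return 1.0
    half = blink_duration / 2
    return abs(t - half) / half
```

In a Qt application, a timer armed with `next_blink_delay()` would trigger each blink, and `eyelid_openness()` would be sampled every frame to deform the eyelid geometry.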

Cognitive Core

Intent inference, response generation, session memory, action mapping, and event-driven state transitions.

Local Actions

Action registry and executor for system-level tasks such as opening URLs, launching apps, showing notifications, and retrieving system information.

Rule / LLM Intent Engine

Rule-based interpretation for deterministic behavior, plus an optional local LLM-backed parser through Ollama.
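A minimal sketch of the deterministic side of such an engine, assuming an ordered first-match-wins rule table (the rule names and patterns here are invented for illustration):

```python
import re

# Ordered rule table: the first matching pattern wins, so behavior is deterministic.
INTENT_RULES = [
    (re.compile(r"\bopen\s+(?P<url>\S+\.\S+)", re.I), "open_url"),
    (re.compile(r"\b(launch|start|open)\s+(?P<app>\w+)", re.I), "launch_app"),
    (re.compile(r"\bwhat time\b", re.I), "get_time"),
]

def infer_intent(text: str) -> dict:
    """Map free text to a structured intent with extracted slots."""
    for pattern, name in INTENT_RULES:
        match = pattern.search(text)
        if match:
            return {"intent": name, "slots": match.groupdict()}
    return {"intent": "chat", "slots": {}}  # fall through to conversation
```

The same `{"intent": ..., "slots": ...}` shape can be produced by an LLM-backed parser, which keeps the rest of the backend indifferent to which engine ran.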

Live Debug Panel

Runtime tuning of facial profiles, blink timing, idle animation, speaking pulse, thinking drift, and asymmetry controls.

Event-Driven Interaction

Centralized event bus connecting user input, processing states, inferred intents, action results, and visual feedback.
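As a rough sketch of the pattern (the real implementation presumably uses Qt signals; this `EventBus` class and its topic names are hypothetical):

```python
from collections import defaultdict
from typing import Any, Callable

class EventBus:
    """Minimal publish/subscribe hub connecting UI, cognition, and actions."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any = None) -> None:
        for handler in self._subscribers[topic]:
            handler(payload)

bus = EventBus()
states = []
bus.subscribe("state_changed", states.append)  # e.g. the face widget listens here
bus.publish("state_changed", "listening")
bus.publish("state_changed", "thinking")
```

Decoupling publishers from subscribers this way is what lets the face, chat panel, and debug panel react to the same state changes without referencing each other directly.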

At a Glance

M.I.R.A. is an embodied AI interface prototype that explores how a desktop assistant can feel more present through motion, expression, and stateful interaction. The system is built around a face widget, a chat panel, a debug panel, and a modular backend that coordinates cognition and behavior.

The assistant receives text input, stores the interaction in session memory, infers an intent, optionally builds and executes an action request, generates a response, and maps the result to a visible face state. This creates a tight loop between cognition and embodiment: listening, thinking, speaking, confusion, happiness, tiredness, and other expressive states are reflected visually instead of remaining hidden in the backend.
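The loop described above can be sketched as a single handler; every name here (`handle_input`, the injected callables) is illustrative rather than the project's actual API:

```python
def handle_input(text, memory, infer_intent, execute_action, respond, set_face):
    """One pass of the cognition-to-embodiment loop."""
    memory.append({"role": "user", "text": text})      # store the interaction
    set_face("thinking")                               # visible processing state
    intent = infer_intent(text)                        # structured intent
    result = execute_action(intent) if intent["intent"] != "chat" else None
    reply = respond(intent, result)                    # response generation
    memory.append({"role": "assistant", "text": reply})
    set_face("speaking")                               # visible response state
    return reply

# Wiring with stubbed components to show the data flow:
memory, faces = [], []
reply = handle_input(
    "what time is it",
    memory,
    infer_intent=lambda t: {"intent": "get_time"},
    execute_action=lambda i: "12:00",
    respond=lambda i, r: f"It is {r}.",
    set_face=faces.append,
)
```

The key property is that every internal step leaves a visible trace: the face state changes are part of the pipeline, not an afterthought.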

System Architecture

M.I.R.A. software architecture (diagram): user input is processed by the interaction layer, cognition layer, action system, and expressive face controller through an event-driven architecture. The diagram's components: User Input (chat · focus · text events) → Event Bus (signals · state changes) → Brain (intent · memory · response), Actions (apps · URLs · OS), State Manager (idle · listening · thinking), Embodied Behavior (decay · micro-reactions) → Face UI (eyes · motion · tuning).

Implementation Highlights

PySide6 Desktop App

Main window layout with a large expressive face on the left and a chat/debug control panel on the right.

Interaction Pipeline

User input triggers listening, processing, intent inference, action execution, response generation, and visual state updates.

Session Memory

Recent user and assistant messages are stored in memory, together with the latest inferred intent and contextual metadata.
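A bounded rolling transcript is one natural way to implement this; the class below is a sketch under that assumption, with invented field names:

```python
from collections import deque

class SessionMemory:
    """Rolling transcript plus the latest inferred intent and metadata."""

    def __init__(self, max_messages: int = 20) -> None:
        self.messages: deque = deque(maxlen=max_messages)  # oldest entries drop off
        self.last_intent: dict | None = None

    def add(self, role: str, text: str) -> None:
        self.messages.append({"role": role, "text": text})

    def recent(self, n: int = 5) -> list:
        return list(self.messages)[-n:]

mem = SessionMemory(max_messages=3)
mem.add("user", "open example.com")
mem.add("assistant", "Opening example.com")
mem.last_intent = {"intent": "open_url"}
```

`deque(maxlen=...)` keeps the memory bounded without manual pruning, which matters for a long-running desktop process.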

Action System

Actions are registered through a central registry and executed through a dedicated executor with success/failure events.
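A minimal registry/executor sketch illustrating the pattern (class and action names are hypothetical; the real executor would publish success/failure on the event bus instead of returning a dict):

```python
from datetime import datetime
from typing import Callable

class ActionRegistry:
    """Central registry of named actions plus a uniform executor."""

    def __init__(self) -> None:
        self._actions: dict[str, Callable[..., str]] = {}

    def register(self, name: str):
        def decorator(fn):
            self._actions[name] = fn
            return fn
        return decorator

    def execute(self, name: str, **kwargs) -> dict:
        """Run an action, wrapping the outcome as a success/failure result."""
        if name not in self._actions:
            return {"ok": False, "error": f"unknown action: {name}"}
        try:
            return {"ok": True, "result": self._actions[name](**kwargs)}
        except Exception as exc:
            return {"ok": False, "error": str(exc)}

registry = ActionRegistry()

@registry.register("get_time")
def get_time() -> str:
    return datetime.now().strftime("%H:%M")
```

Routing every action through one `execute` call gives a single place to emit events, enforce timeouts, or log failures.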

Expressive Controller

Each face state maps to an expression profile with size, offsets, corner radius, eyelid deformation, blink timing, and animation flags.
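A state-to-profile mapping of this kind could look like the sketch below; the field names and tuning values are invented to show the shape, not the project's actual profiles:

```python
from dataclasses import dataclass

@dataclass
class ExpressionProfile:
    """Per-state tuning parameters consumed by the eye renderer."""
    eye_scale: float = 1.0
    offset_x: int = 0
    offset_y: int = 0
    corner_radius: int = 12
    eyelid_droop: float = 0.0     # 0.0 = fully open, 1.0 = closed
    blink_interval_s: float = 4.0
    idle_motion: bool = True

PROFILES = {
    "idle": ExpressionProfile(),
    "thinking": ExpressionProfile(offset_y=-6, eyelid_droop=0.2, blink_interval_s=6.0),
    "tired": ExpressionProfile(eye_scale=0.9, eyelid_droop=0.6, idle_motion=False),
}

def profile_for(state: str) -> ExpressionProfile:
    return PROFILES.get(state, PROFILES["idle"])  # unknown states fall back to idle
```

Keeping expression as plain data is also what makes the live debug panel possible: tuning a profile at runtime just mutates one of these records.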

Local LLM Path

Optional Ollama integration converts natural language into structured intents and action requests while preserving the same backend contract.
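One way to wire this up is through Ollama's local HTTP endpoint, falling back to the rule engine whenever the model is unavailable so deterministic behavior is preserved. The prompt, model name, and fallback wiring below are assumptions for illustration:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

PROMPT = (
    "Convert the user request into JSON with keys 'intent' and 'slots'. "
    "Reply with JSON only.\nRequest: {text}"
)

def parse_intent_llm(text: str, model: str = "llama3", fallback=None) -> dict:
    """Ask a local Ollama model for a structured intent; fall back on failure."""
    payload = json.dumps({
        "model": model,
        "prompt": PROMPT.format(text=text),
        "stream": False,
        "format": "json",  # ask Ollama to constrain the output to valid JSON
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = json.load(resp)
        return json.loads(body["response"])
    except Exception:
        # Server down, timeout, or malformed output: keep deterministic behavior.
        return fallback(text) if fallback else {"intent": "chat", "slots": {}}
```

Because both paths return the same `{"intent": ..., "slots": ...}` structure, the backend contract mentioned above stays unchanged regardless of which parser ran.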

Technologies

Python · PySide6 / Qt · Event-driven architecture · Rule-based intent engine · Local LLM / Ollama · Action registry · Desktop automation · Session memory · Procedural animation · Human-machine interaction

Use Cases & Demos

Current and planned demonstrations for the embodied assistant prototype.

Conversational Companion

Text interaction with visual feedback: listening while the user types, thinking during processing, and speaking when responding.

Local Desktop Assistant

Open applications, navigate to websites, show notifications, and retrieve basic system information from natural language commands.

Expressive UI Sandbox

Live tuning of expression profiles for designing eye behavior, emotional states, and micro-animations.

Roadmap

Software Core Prototype (Current)

PySide6 UI, animated face, chat panel, debug panel, event bus, state manager, brain, session memory, and rule-based intent engine.

Action Layer (Current)

Registry/executor pattern for local actions such as time/date, memory introspection, URL opening, app launching, notifications, and system info.

Local LLM Integration (In progress)

Use a local model through Ollama to parse natural language into structured intents and actions while keeping rule-based fallback behavior.

Voice & Perception (Next)

Add microphone input, speech-to-text, wake interaction, webcam-based presence detection, and multimodal arbitration.

Physical Embodiment (Future)

Extend the software core into a physical robotic assistant with sensors, lights, audio, mechanical expression, and low-level hardware control.

Status & Scope

Software prototype roughly 35% complete

The current milestone is not a hardware MVP yet. The project is deliberately focused on the interaction and cognition layer first: a working desktop prototype that can express state, interpret basic requests, execute local actions, and provide a foundation for more advanced multimodal and robotic behavior.