Mimir - Development Roadmap

Platform-agnostic, BYOK (bring your own key) AI coding agent CLI. TypeScript, test-driven, cross-platform.

Priority: Configuration/Teams → Tools → Agent Orchestration → Core Features → Polish


Phase 1: Foundation & Infrastructure

Goal: Core project structure, CI/CD, platform abstractions, infrastructure

Project Setup

  • Initialize TypeScript project with yarn
  • Configure tsconfig.json (strict mode)
  • Set up ESLint + Prettier
  • Configure Vitest for testing
  • Set up project directory structure
  • Create .gitignore with .mimir/ entries
  • Initialize Git repository

Core Infrastructure

  • Logging: Winston/Pino, log rotation, .mimir/logs/, context-aware logging
  • Error Handling: Custom error classes, global handler, Sentry integration
  • Monitoring: Performance hooks, metrics collection, health checks
  • Security: npm audit, Snyk, input validation, secrets management, rate limiting
  • Database: SQLite schema, migrations, connection pooling, backup strategy
  • Configuration: Zod validation, env configs, secrets encryption, migration system
  • Caching: In-memory cache for tokens/files, invalidation, size limits
  • Build: tsup bundling, binary compilation, multi-platform builds
  • Development: VSCode settings, debug configs, git hooks (pre-commit, pre-push)

CI/CD Pipeline

  • GitHub Actions: test.yml, build.yml, release.yml
  • Code coverage (Codecov)
  • Automated linting and type checking

Installation Scripts

  • install.ps1 for Windows (PowerShell)
  • install.sh for Unix (bash/zsh)
  • Test on Windows 10/11, macOS, Linux (Ubuntu, Debian, Fedora) - via GitHub Actions test-installation.yml

Platform Abstraction Layer

  • IFileSystem interface + cross-platform implementation (fs/promises + globby)
  • IProcessExecutor interface + cross-platform implementation (execa)
  • IDockerClient interface skeleton
  • Path utilities (normalize, resolve, join)
  • Unit tests for all abstractions
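
As an illustration of the abstraction layer above, here is a minimal sketch of what IFileSystem could look like. The method names (readFile, writeFile, exists) and the normalizePath helper are assumptions made for this example; the roadmap does not specify them. An in-memory implementation of this kind could serve as a test double in the unit tests.

```typescript
// Hypothetical shape for the IFileSystem abstraction; method names are assumptions.
interface IFileSystem {
  readFile(path: string): Promise<string>;
  writeFile(path: string, content: string): Promise<void>;
  exists(path: string): Promise<boolean>;
}

// Normalize separators so Windows- and Unix-style paths map to the same key.
function normalizePath(path: string): string {
  return path.replace(/\\/g, "/").replace(/\/+/g, "/");
}

// In-memory implementation, useful as a test double.
class InMemoryFileSystem implements IFileSystem {
  private files = new Map<string, string>();

  async readFile(path: string): Promise<string> {
    const content = this.files.get(normalizePath(path));
    if (content === undefined) throw new Error(`File not found: ${path}`);
    return content;
  }

  async writeFile(path: string, content: string): Promise<void> {
    this.files.set(normalizePath(path), content);
  }

  async exists(path: string): Promise<boolean> {
    return this.files.has(normalizePath(path));
  }
}
```

Code that depends only on the interface can then run against the real fs/promises implementation in production and the in-memory one in tests.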

Phase 2: Configuration System & Teams/Enterprise Support

Goal: Robust configuration with Teams integration, storage abstraction, policy enforcement

See: docs/contributing/plan-enterprise-teams.md for detailed architecture

Configuration Schema

  • Define Zod schemas for all config types
  • Generate TypeScript types from schemas
  • Default configuration values
  • Configuration documentation

Configuration Loader (Enhanced)

  • YAML parser for .mimir/config.yml
  • Global config (~/.mimir/config.yml)
  • Project config (.mimir/config.yml)
  • .env file support
  • Environment variable overrides
  • CLI flag overrides
  • Configuration merge/priority system
  • Zod validation
  • NEW: Teams/Enterprise config source (enforced; cannot be overridden by local sources)
  • NEW: Config hierarchy refactor (merge order: default → teams → global → project → env → cli; enforced Teams values always win)
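
The merge/priority system above can be sketched roughly as follows. This is only one possible reading: later layers override earlier ones, and the Teams layer is re-applied at the end so that local sources cannot override it. The type and function names (MimirConfig, ConfigLayers, resolveConfig) are hypothetical, and the sketch treats every Teams value as enforced, which the real design may refine.

```typescript
// Hypothetical config shape; real keys would come from the Zod schemas.
type MimirConfig = Record<string, unknown>;

interface ConfigLayers {
  defaults: MimirConfig;
  teams?: MimirConfig;   // organization-level config from the Teams API
  global?: MimirConfig;  // ~/.mimir/config.yml
  project?: MimirConfig; // .mimir/config.yml
  env?: MimirConfig;     // environment variable overrides
  cli?: MimirConfig;     // CLI flag overrides
}

function isPlainObject(value: unknown): value is MimirConfig {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `override` into `base`; arrays and scalars are replaced.
function deepMerge(base: MimirConfig, override: MimirConfig): MimirConfig {
  const result: MimirConfig = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return result;
}

// Later layers win; Teams values are then re-applied so local layers cannot override them.
function resolveConfig(layers: ConfigLayers): MimirConfig {
  const ordered = [
    layers.defaults,
    layers.teams,
    layers.global,
    layers.project,
    layers.env,
    layers.cli,
  ];
  let merged: MimirConfig = {};
  for (const layer of ordered) {
    if (layer) merged = deepMerge(merged, layer);
  }
  return layers.teams ? deepMerge(merged, layers.teams) : merged;
}

const resolvedExample = resolveConfig({
  defaults: { llm: { provider: "deepseek", temperature: 0.7 } },
  teams: { llm: { provider: "anthropic" } },
  project: { llm: { temperature: 0.2 } },
  cli: { llm: { provider: "openai" } },
});
console.log(resolvedExample);
```

In this example the CLI flag tries to switch the provider, but the Teams value is restored in the final step, while the project-level temperature still applies because Teams does not set it.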

Storage Abstraction

  • IStorageBackend interface
  • LocalSQLiteStorage implementation (existing DB code)
  • TeamsCloudStorage implementation (API-based)
  • HybridStorage implementation (local-first with background sync)
  • Conversation storage (save/load/list/delete)
  • Message storage (append/load)
  • Tool call recording
  • Permission/audit logging
  • Config caching (for offline Teams mode)

Teams API Client

  • TeamsAPIClient interface and implementation
  • Authentication (login/logout/status)
  • Config endpoints (GET /orgs/{orgId}/config)
  • Tools endpoints (GET/POST /orgs/{orgId}/tools)
  • Custom commands endpoints (GET /orgs/{orgId}/commands)
  • Allowlist endpoints (GET /orgs/{orgId}/allowlist)
  • Conversation sync (POST /orgs/{orgId}/conversations/sync)
  • Audit log sync (POST /orgs/{orgId}/audit/sync)
  • Sandbox execution (POST /sandbox/execute)
  • LLM proxy (POST /llm/chat, POST /llm/chat/stream)
  • Quota management (GET/POST /orgs/{orgId}/users/{userId}/quota)

Authentication System

  • mimir teams login command
  • mimir teams logout command
  • mimir teams status command
  • Token storage (~/.mimir/auth.json)
  • Token encryption
  • Token refresh logic
  • OAuth flow (browser-based)
  • Enterprise mode detection

Sync Manager

  • Background batch sync (configurable interval)
  • Sync queue (conversation, audit, tools, commands)
  • Conflict resolution strategies
  • Offline mode handling
  • Force sync command (mimir teams sync)
  • Sync status tracking

Policy Enforcement

  • Enforce Teams config (cannot be overridden locally)
  • Allowed models enforcement
  • Allowed sub-agents enforcement
  • Forced sub-agents (e.g., security agent)
  • Model selection per sub-agent enforcement
  • Docker sandbox mode enforcement (local/cloud/auto)
  • Budget limits enforcement (daily/weekly/monthly)

Permission & Security Configuration

  • Command Allowlist
    • Glob patterns and regex matching
    • Default allowlist (safe commands)
    • Custom allowlist in config
    • Team-shared allowlist templates (.mimir/allowlist.yml)
  • Auto-Accept Configuration
    • autoAccept: true/false/ask
    • alwaysAcceptCommands list
    • Command-specific auto-accept rules
  • Risk Assessment Levels
    • Risk levels: low, medium, high, critical
    • Command classification by risk
    • acceptRiskLevel config setting
    • Auto-block above configured risk level
  • NEW: Teams shared allowlist integration
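
A minimal sketch of how the allowlist, risk levels, and acceptRiskLevel could combine into a single decision. The function names (classifyRisk, decidePermission) and the classification rules are illustrative assumptions, not a complete or recommended policy, and the glob support is limited to "*".

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical";
const RISK_ORDER: RiskLevel[] = ["low", "medium", "high", "critical"];

interface PermissionConfig {
  allowlist: string[];        // glob patterns, e.g. "git status*"
  acceptRiskLevel: RiskLevel; // commands above this level are blocked
}

// Convert a simple glob (only "*" is supported here) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Hypothetical classifier: illustrative rules only.
function classifyRisk(command: string): RiskLevel {
  if (/\brm\s+-rf\s+\/(\s|$)/.test(command)) return "critical";
  if (/\b(rm|sudo|chmod|chown)\b/.test(command)) return "high";
  if (/\b(npm|yarn|pip)\s+(install|add)\b/.test(command)) return "medium";
  return "low";
}

type PermissionDecision = "allow" | "ask" | "block";

// Block anything above the accepted risk level, auto-allow allowlisted
// commands, and ask the user about everything else.
function decidePermission(
  command: string,
  config: PermissionConfig,
): PermissionDecision {
  const risk = classifyRisk(command);
  if (RISK_ORDER.indexOf(risk) > RISK_ORDER.indexOf(config.acceptRiskLevel)) {
    return "block";
  }
  const allowed = config.allowlist.some((p) => globToRegExp(p).test(command));
  return allowed ? "allow" : "ask";
}
```

Checking the risk level before the allowlist means a broad allowlist pattern cannot accidentally approve a command that exceeds the configured risk ceiling.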

Keyboard Shortcuts Configuration

  • Define default keyboard shortcuts
  • Allow customization in config (keyBindings section)
  • Support for Ctrl, Alt, Shift combinations
  • Platform-specific defaults (Cmd on macOS, Ctrl on Windows/Linux)
  • Configurable shortcuts for:
    • Interrupt/cancel (default: Ctrl+C)
    • Mode switching (default: Shift+Tab)
    • Accept command (default: Enter)
    • Quick reject (default: Escape)
    • Alternative instruction (default: Ctrl+E)
    • Help overlay (default: ?)
  • Load custom shortcuts from CLAUDE.md (KeyBindingsManager)

Configuration Storage

  • Create ~/.mimir/ on first run
  • Generate default config.yml template
  • Create .mimir/ on mimir init
  • Auto-add .mimir/ to .gitignore
  • Example configurations and templates
  • Example custom commands (test-coverage.md, commit.md, doctor.md)

Testing

  • Test loading from all sources
  • Test priority/override system
  • Test validation with invalid configs
  • Test permission system
  • Test risk assessment
  • Test allowlist loader
  • Test keyboard bindings (platform-specific)
  • Test Teams config source (mocked API)
  • Test storage abstraction implementations
  • Test sync manager (batch sync, conflict resolution)
  • Test authentication flow
  • Test enforcement (allowed models, sub-agents, etc.)

Phase 3: LLM Provider Abstraction

Goal: Provider-agnostic LLM integration (7+ providers + local models)

Base Provider Architecture

  • ILLMProvider interface (chat, streamChat, countTokens, calculateCost)
  • BaseLLMProvider abstract class
  • Common HTTP request logic (APIClient)
  • Retry logic with exponential backoff
  • Error handling for API failures
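
Retry with exponential backoff, as listed above, might look like the following sketch. The option names (maxAttempts, baseDelayMs, maxDelayMs) and the isRetryable callback are assumptions for illustration.

```typescript
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Delay before the next retry: base * 2^attempt, capped at maxDelayMs (attempt starts at 0).
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retry `operation` until it succeeds, a non-retryable error occurs,
// or maxAttempts is reached; the last error is rethrown.
async function withRetry<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
  isRetryable: (error: unknown) => boolean = () => true,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === opts.maxAttempts - 1;
      if (isLastAttempt || !isRetryable(error)) break;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
  throw lastError;
}
```

A provider would typically pass an isRetryable predicate that accepts rate-limit and network errors but rejects authentication failures. Adding random jitter to the delay is a common refinement that this sketch leaves out.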

Provider Implementations

  • DeepSeek: API integration, tiktoken, model selection, cost calc, streaming (OpenAI-compatible)
  • Anthropic: API integration, tiktoken approximation, model selection, cost calc, streaming
  • OpenAI: Coming later (similar to DeepSeek, OpenAI-compatible)
  • Google/Gemini: Coming later (requires Gemini SDK)
  • Qwen: Coming later (OpenAI-compatible, similar to DeepSeek)
  • Ollama (Local): Coming later (local API, no cost)

Provider Factory

  • ProviderFactory.create()
  • API key loading from config/env
  • Graceful handling of missing keys
  • Provider-specific configurations
  • NEW: Proxied provider for Teams mode (route through Teams API)

Shared Utilities

  • Pricing data (hybrid: API + static fallback, 24h cache)
  • Tool formatters (OpenAI ↔ Anthropic format conversion)
  • Stream parsers (SSE for OpenAI/Anthropic formats)
  • API client wrapper (axios-based with error mapping)

Testing

  • Mock HTTP requests (MSW)
  • Test DeepSeek provider (chat, streaming, tools, errors)
  • Test Anthropic provider (chat, streaming, tools, errors)
  • Test error scenarios (rate limits, network)
  • Test token counting accuracy
  • Test cost calculations
  • Test proxied provider (Teams mode)

Phase 4: Tool System

Goal: Comprehensive tool system with built-in tools, custom tools, MCP, and Teams integration

See: docs/contributing/plan-tools.md for detailed architecture

Tool Architecture

  • Tool interface (name, description, schema, enabled, tokenCost, execute)
  • ToolContext interface (platform, config, conversation, logger, llm, permissions)
  • ToolRegistry class
  • Tool discovery system
  • Tool execution wrapper
  • Tool result formatting
  • Token cost estimation
  • Enable/disable tools in config
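
A minimal sketch of the Tool interface and ToolRegistry described above. The field names follow the list (name, description, schema, enabled, tokenCost, execute); the ToolResult type and the registry methods are assumptions, and the ToolContext parameter is omitted for brevity.

```typescript
interface ToolResult {
  success: boolean;
  output: string;
}

interface Tool {
  name: string;
  description: string;
  schema: Record<string, unknown>; // JSON-Schema-like description of the arguments
  enabled: boolean;
  tokenCost: number;               // estimated tokens the tool adds to the system prompt
  execute(args: Record<string, unknown>): Promise<ToolResult>;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  setEnabled(name: string, enabled: boolean): void {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    tool.enabled = enabled;
  }

  enabledTools(): Tool[] {
    return Array.from(this.tools.values()).filter((t) => t.enabled);
  }

  // Total system prompt cost of all enabled tools.
  totalTokenCost(): number {
    return this.enabledTools().reduce((sum, t) => sum + t.tokenCost, 0);
  }
}
```

Tracking tokenCost per tool is what would let a command such as /tools tokens report how much of the system prompt each enabled tool consumes.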

Built-in Tools

  • FileOperationsTool
    • Read, write (with backup), edit (find/replace, line-based)
    • List directory, create directories
    • Delete (with confirmation)
    • Check existence, get metadata
  • FileSearchTool
    • grep/ripgrep integration
    • Glob pattern matching
    • Regex search
    • Include/exclude patterns
    • Formatted results with line numbers
  • BashExecutionTool
    • Execute commands in project directory
    • Capture stdout/stderr
    • Timeout handling
    • Failure handling
    • Windows (PowerShell) and Unix (bash/zsh) support
    • Command allowlist/blocklist
    • Permission prompt system
  • GitTool
    • git status, diff, log, branch, commit, checkout
    • Detect git repository
    • Parse git output

Tool Configuration

  • Tool enable/disable in config.yml
  • Per-tool settings (timeout, permissions, etc.)
  • Token cost tracking
  • System prompt size calculation
  • Teams-enforced tools (cannot be disabled)

/tools Command (In-Chat Management)

  • /tools - list all tools with status and token costs
  • /tools enable <name> - enable a tool
  • /tools disable <name> - disable a tool (if not enforced)
  • /tools info <name> - show tool details (description, schema, cost)
  • /tools tokens - show token cost breakdown (visual bar chart)
  • Update config.yml when enabling/disabling tools
  • Display total system prompt token cost

Custom Tools (TypeScript Runtime)

  • YAML tool definition loader (.mimir/tools/*.yml)
  • JSON Schema to Zod conversion
  • TypeScript code compilation (esbuild)
  • Docker sandbox execution (isolated context)
  • Context injection (platform, config, conversation, logger, llm)
  • Permission system integration (inherit allowlist)
  • Tool-specific permissions (allowlist, autoAccept, riskLevel)
  • Error handling and timeout enforcement
  • Example custom tools (run_tests, analyze_dependencies)

Sandbox Runtime for Custom Tools

  • Docker image for tool execution (mimir/tool-sandbox:node18)
  • Sandbox runtime library (safe platform abstractions)
  • IPC/API for host communication (file ops, command execution)
  • Resource limits (CPU, memory, timeout)
  • Security: prevent escaping working directory

Model Context Protocol (MCP) Support

  • MCP Client: stdio/HTTP transports, server lifecycle, tool parsing
  • MCP Server Management: Discovery, auto-start, health checks, failure handling
  • MCP Configuration: Config schema, server definitions (command, args, env)
  • MCP Tool Registry: Dynamic registration, namespacing (server/tool), conflict handling
  • Built-in MCP Servers: Filesystem server, Git server (optional dependencies)
  • MCP Tool Adapter: Wrap MCP tools as Tool interface

Teams Tool Integration

  • Load tools from Teams API (GET /orgs/{orgId}/tools)
  • Teams tools override local tools (if conflict)
  • Teams tools are enforced (cannot be disabled)
  • Display Teams tools in /tools command
  • Execute Teams tools (may route to cloud sandbox)

Syntax Highlighting

  • Integrate highlighter (Shiki/highlight.js)
  • Support major languages (TS, JS, Python, Go, Rust, .NET, etc.)
  • Apply to file content and code blocks

Testing

  • Unit tests per built-in tool
  • Integration tests with real operations
  • Mock filesystem
  • Platform-specific behavior tests
  • Custom tool loading and execution
  • Docker sandbox tests
  • MCP client with mock servers
  • Permission prompt system tests
  • Teams tool sync tests
  • /tools command interactions

Phase 5: Docker Sandbox

Goal: Secure, isolated code execution

Docker Client

  • Complete DockerClient class (dockerode)
  • Detect Docker installation
  • Handle Windows Docker Desktop and Unix daemon
  • Connection error handling

Sandbox Images

  • Dockerfile.base (Alpine/Ubuntu)
  • Dockerfile.node (Node.js)
  • Dockerfile.python (Python)
  • Dockerfile.tool-sandbox (for custom tools)
  • Multi-arch builds (amd64, arm64)

Container Management

  • Build custom images
  • Run containers with commands
  • Resource limits (CPU, memory, timeout)
  • Mount project directory (read-only option)
  • Capture output (stdout/stderr)
  • Cleanup and timeout handling
  • Result caching

Code Execution Tool

  • Integrate Docker sandbox
  • Execute in sandboxed environment
  • Multiple runtimes (Node, Python, etc.)
  • Return results to agent
  • Error handling

Cloud Sandbox Integration

  • Detect Teams mode and dockerMode config
  • Route to cloud sandbox when configured
  • Execute via Teams API (POST /sandbox/execute)
  • Fallback to local on failure (if allowed)

Testing

  • testcontainers for integration tests
  • Container creation/cleanup tests
  • Resource limits enforcement
  • Timeout handling
  • Multi-platform Docker support
  • Cloud sandbox routing tests

Phase 6: ReAct Agent Loop

Goal: Core agent reasoning, action loop, interrupt handling

Agent Architecture

  • Agent class
  • ReAct loop (Reason → Act → Observe)
  • LLM provider integration
  • Tool registry integration
  • Max iteration limit
  • Early stopping conditions
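
The loop above can be sketched as follows. The Reasoner and ToolRunner types are hypothetical stand-ins for the LLM provider and tool registry, and the "finish" action is the only early stopping condition modeled here.

```typescript
type AgentAction =
  | { type: "tool"; tool: string; args: Record<string, unknown> }
  | { type: "finish"; answer: string };

interface ToolObservation {
  tool: string;
  result: string;
}

// Hypothetical minimal interfaces standing in for the LLM provider and tool registry.
type Reasoner = (task: string, history: ToolObservation[]) => Promise<AgentAction>;
type ToolRunner = (tool: string, args: Record<string, unknown>) => Promise<string>;

interface AgentResult {
  answer: string | null; // null when the iteration limit was reached
  iterations: number;
  history: ToolObservation[];
}

async function runReActLoop(
  task: string,
  reason: Reasoner,
  act: ToolRunner,
  maxIterations = 10,
): Promise<AgentResult> {
  const history: ToolObservation[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    // Reason: ask the model for the next action.
    const action = await reason(task, history);
    if (action.type === "finish") {
      return { answer: action.answer, iterations: i, history };
    }
    // Act: run the requested tool; errors become observations so the model can recover.
    let result: string;
    try {
      result = await act(action.tool, action.args);
    } catch (error) {
      result = `Error: ${error instanceof Error ? error.message : String(error)}`;
    }
    // Observe: record the result for the next reasoning step.
    history.push({ tool: action.tool, result });
  }
  return { answer: null, iterations: maxIterations, history };
}
```

Injecting the reasoner and tool runner as functions keeps the loop testable with scripted responses, which matches the "Mock LLM responses" item in this phase's testing list.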

Interrupt & Control System

  • Cancel/Interrupt
    • Graceful SIGINT handling (Ctrl+C)
    • Save agent state before interruption
    • Resume from interruption point
    • Show partial results on cancel
    • Resource cleanup (containers, temp files)
  • Mode Switching During Execution
    • Pause agent, show mode menu
    • Switch between plan/act/discuss modes
    • Preserve context when switching
    • Resume in new mode
  • Alternative Instructions
    • On permission prompt, allow typing alternative
    • Offer an “edit” option in addition to “always accept”
    • Parse alternative instruction
    • Update agent plan
    • Show updated plan before proceeding

Reasoning

  • Format messages for LLM (system prompt, history, tools)
  • Parse LLM response for actions
  • Handle tool calling format
  • Handle “finish” action
  • Handle malformed responses

Acting

  • Execute tool based on LLM action
  • Pass arguments to tool
  • Handle tool errors
  • Format tool results
  • Permission & Risk Assessment
    • Assess command risk before execution
    • Check against allowed commands (local + Teams)
    • Prompt user if not auto-accepted
    • Block high-risk if configured
    • Log all permission decisions

Observing

  • Store tool results in history
  • Update agent state
  • Log actions and observations
  • Track token usage per iteration

Error Handling

  • LLM API errors
  • Tool execution errors
  • Retry logic
  • Helpful error messages
  • Failure recovery

Testing

  • Mock LLM responses
  • Test complete agent loops
  • Test error recovery
  • Test max iteration handling
  • Test permission system
  • Test interrupt handling
  • Test mode switching
  • Test alternative instruction parsing

Phase 7: Conversation History & Memory

Goal: Persistent conversation storage with Teams sync

Storage

  • SQLite schema (conversations, messages, tool_calls, permissions)
  • Database initialization
  • Conversation CRUD operations
  • Message append operations
  • Permission decision audit trail
  • NEW: Use IStorageBackend abstraction (supports local + cloud)

Memory Management

  • ConversationMemory class
  • Load history on resume
  • Append new messages
  • Context window management
  • Message truncation strategies
  • Export to JSON/Markdown
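
One simple truncation strategy for the context window management above is to keep all system messages and then as many of the most recent messages as fit. The sketch below assumes a rough character-based token estimate in place of a real tokenizer; the names (ChatMessage, truncateToBudget) are hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Rough stand-in for a real tokenizer such as tiktoken: about 4 characters per token.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

// Keep all system messages, then as many of the most recent messages as fit in the budget.
function truncateToBudget(
  messages: ChatMessage[],
  maxTokens: number,
  countTokens: (text: string) => number = approxTokens,
): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((sum, m) => sum + countTokens(m.content), 0);

  const kept: ChatMessage[] = [];
  // Walk backwards from the newest message and stop at the first one that does not fit.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = countTokens(rest[i].content);
    if (used + cost > maxTokens) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

Summarizing the dropped messages instead of discarding them is a natural next step and is covered by the context compaction work in Phase 11.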

History Management (CLI commands)

  • mimir history list - list recent conversations
  • mimir history resume <id> - continue conversation
  • mimir history export <id> - export to file
  • mimir history clear - delete conversation history

Teams Sync Integration

  • Background sync to Teams API (if enabled)
  • Load conversations from Teams API
  • Conflict resolution (cloud vs local)

Testing

  • SQLite operations
  • Conversation persistence
  • Resume functionality
  • Export formats
  • Teams sync workflows

Phase 8: Token Counting & Cost Analysis

Goal: Real-time token and cost tracking with Teams quota enforcement

Token Counting

  • Integrate tiktoken
  • Count input tokens before LLM call
  • Extract output tokens from response
  • Count per message
  • Track cumulative per session

Cost Calculation

  • Pricing data structure (per provider/model)
  • Load pricing from config
  • Calculate cost per message
  • Calculate cumulative cost per session
  • Store costs in database
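
A sketch of the per-message and per-session cost calculation above, assuming prices are expressed in USD per one million tokens. The names (ModelPricing, TokenUsage, SessionCostTracker) are hypothetical, and any prices used with it would come from the pricing data rather than from this example.

```typescript
// Prices in USD per one million tokens (assumed unit).
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

function calculateCost(usage: TokenUsage, pricing: ModelPricing): number {
  return (
    (usage.inputTokens / 1_000_000) * pricing.inputPerMillion +
    (usage.outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Accumulates per-message usage into a session total.
class SessionCostTracker {
  private totalTokens = 0;
  private totalCost = 0;

  // Records one message and returns its cost.
  record(usage: TokenUsage, pricing: ModelPricing): number {
    const cost = calculateCost(usage, pricing);
    this.totalTokens += usage.inputTokens + usage.outputTokens;
    this.totalCost += cost;
    return cost;
  }

  summary(): { tokens: number; cost: number } {
    return { tokens: this.totalTokens, cost: this.totalCost };
  }
}
```

Passing the pricing with each call, rather than fixing it per tracker, lets a session keep a correct total when the model changes mid-conversation.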

Real-Time Display

  • Show token count after each message
  • Show cost after each message
  • Show session total (tokens + cost)
  • Color-code warnings (80%, 90% of budget)

Budget Management

  • Budget limit in config
  • Check budget before LLM calls
  • Warn when approaching limit
  • Stop when budget exceeded
  • Allow override with flag
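
The budget checks above could reduce to a small status function, sketched here with the 80% and 90% warning thresholds from the Real-Time Display list. The names (budgetStatus, canCallLLM) and the handling of a non-positive limit are assumptions.

```typescript
type BudgetStatus = "ok" | "warning" | "critical" | "exceeded";

// Thresholds follow the 80% / 90% warning levels mentioned under Real-Time Display.
function budgetStatus(spent: number, limit: number): BudgetStatus {
  // Assumption: a non-positive limit is treated as already exhausted.
  if (limit <= 0) return "exceeded";
  const ratio = spent / limit;
  if (ratio >= 1) return "exceeded";
  if (ratio >= 0.9) return "critical";
  if (ratio >= 0.8) return "warning";
  return "ok";
}

// Gate to run before each LLM call; `override` models the hypothetical override flag.
function canCallLLM(spent: number, limit: number, override = false): boolean {
  return override || budgetStatus(spent, limit) !== "exceeded";
}
```

The same status value can drive both the color-coded display and the decision to stop, so the two never disagree.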

Teams Quota Enforcement

  • Check quota via Teams API before expensive operations
  • Display org-level quota usage
  • Enforce daily/weekly/monthly limits
  • Show quota in mimir teams status

Cost Analytics (CLI commands)

  • mimir cost today - today’s spending
  • mimir cost week - weekly spending
  • mimir cost month - monthly spending
  • mimir cost compare - compare providers
  • mimir cost export - export to CSV
  • In-chat display: show cost after each message

Cost Comparison Dashboard

  • Comparison table (DeepSeek vs others)
  • Calculate savings
  • Show historical trends
  • Recommend cheaper alternatives

Testing

  • Token counting accuracy
  • Cost calculations
  • Budget enforcement
  • Analytics queries
  • Teams quota checks

Phase 9: CLI & Terminal UI

Goal: Polished terminal interface with modes and shortcuts

Command Structure

CLI Commands (repo/session management):

  • mimir - start interactive chat (main command)
  • mimir init - initialize project (create .mimir/)
  • mimir history list - list conversations
  • mimir history resume <id> - resume conversation
  • mimir history export <id> - export conversation
  • mimir history clear - delete old conversations
  • mimir cost today - today’s spending
  • mimir cost week - weekly spending
  • mimir cost month - monthly spending
  • mimir cost compare - compare providers
  • mimir doctor - run diagnostics
  • mimir permissions list - show allowlist
  • mimir permissions add <pattern> - add to allowlist
  • mimir permissions remove <pattern> - remove from allowlist
  • mimir checkpoint list - list checkpoints
  • mimir checkpoint restore <id> - restore checkpoint
  • NEW: mimir teams login/logout/status/sync - Teams commands
  • mimir --version - show version
  • mimir --help - show help

Slash Commands (in-chat, context-specific):

  • /discuss <topic> - switch to architect/discuss mode
  • /plan <description> - create execution plan
  • /act - switch to autonomous execution mode
  • /mode <plan|act|discuss> - change mode
  • /compact - manually compact context
  • /model <provider> - switch LLM provider
  • /models - list available models
  • /checkpoint - create checkpoint now
  • /undo - undo last operation
  • NEW: /tools [list|enable|disable|info|tokens] - manage tools
  • /help - show slash commands
  • Custom commands loaded from .mimir/commands/

Interactive Chat UI (Ink)

  • Display user/assistant messages (streaming)
  • Display tool calls with spinners
  • Display tool results (formatted)
  • Show token/cost info
  • Status indicators (thinking, executing, etc.)
  • User input with autocomplete
  • Slash command support
  • Permission Prompts
    • Display command and risk level (color-coded)
    • Options: y/n/a(always)/never/edit/view
    • On “edit”: show command, allow alternative, replan, show updated plan
    • “Always accepted” indicator for auto-approved
    • Show Teams allowlist status
  • Keyboard Shortcuts
    • Implement configured shortcuts
    • Help overlay (show shortcuts)
    • Customizable per user config

Plan, Act, Discuss Modes

  • Plan Mode: Create task breakdown, get approval, allow editing
  • Act Mode: Execute autonomously, show progress, checkpoints, allow interruption
  • Architect/Discuss Mode: Interactive planning
    • Agent asks clarifying questions
    • Multi-turn Q&A
    • Present approaches with pros/cons
    • Discuss trade-offs
    • Let user guide decisions
    • Generate architecture plan
    • Questions: scale, preferences, performance vs maintainability, existing patterns
    • Switch to Act mode after approval

Mode Switching

  • Smooth transitions between modes
  • Mode indicator in UI
  • Commands: /mode plan, /mode act, /mode discuss
  • Keyboard shortcut (configurable, default Shift+Tab)
  • Preserve agent state
  • Cancel current operation with confirmation

Task Display

  • Todo list (checkboxes)
  • Progress bars
  • Tree view for nested tasks
  • Status updates (pending, in-progress, done, failed)

Syntax Highlighting

  • Code in chat messages
  • File diffs
  • Command output

Logging

  • Structured logger (Winston/Pino)
  • Log levels (debug, info, warn, error)
  • Write to .mimir/logs/
  • --verbose flag for debug
  • --quiet flag for minimal output

Testing

  • CLI commands with mocked dependencies
  • Ink components (ink-testing-library)
  • User interactions
  • Permission prompt flows
  • Interrupt/cancel handling
  • Mode switching
  • Alternative instruction system
  • Discuss mode Q&A flow
  • Keyboard shortcuts
  • Tools command UI

Phase 10: Agent Orchestration & Multi-Agent System

Goal: Task decomposition, specialized agents, parallel execution

See: docs/contributing/plan-agent-orchestration.md for detailed architecture

Agent Orchestrator Core

  • AgentOrchestrator interface and implementation
  • Task complexity detection (single vs multi-agent)
  • Task decomposition (LLM-based parallel task planning)
  • Dependency graph construction
  • Topological sort (execution order)
  • Parallel execution engine
  • Result aggregation
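
Dependency graph construction and topological sorting, as listed above, can be combined into a function that groups tasks into parallel batches. This is a sketch using a level-by-level variant of Kahn's algorithm; the PlannedTask shape and the function name are hypothetical.

```typescript
interface PlannedTask {
  id: string;
  dependsOn: string[];
}

// Group tasks into batches: every task in a batch depends only on tasks from
// earlier batches, so tasks within the same batch can run in parallel.
// Throws on unknown dependencies or dependency cycles.
function planExecutionBatches(tasks: PlannedTask[]): string[][] {
  const remainingDeps = new Map<string, Set<string>>();
  for (const task of tasks) remainingDeps.set(task.id, new Set(task.dependsOn));
  for (const task of tasks) {
    for (const dep of task.dependsOn) {
      if (!remainingDeps.has(dep)) {
        throw new Error(`Unknown dependency "${dep}" for task "${task.id}"`);
      }
    }
  }

  const batches: string[][] = [];
  while (remainingDeps.size > 0) {
    // Tasks with no unmet dependencies are ready to run.
    const ready: string[] = [];
    remainingDeps.forEach((deps, id) => {
      if (deps.size === 0) ready.push(id);
    });
    if (ready.length === 0) throw new Error("Dependency cycle detected");
    for (const id of ready) remainingDeps.delete(id);
    // Mark the ready tasks as satisfied for everything still waiting.
    remainingDeps.forEach((deps) => {
      for (const id of ready) deps.delete(id);
    });
    batches.push(ready);
  }
  return batches;
}
```

The parallel execution engine could then run each batch concurrently and wait for it to finish before starting the next one.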

Sub-Agent Management

  • Sub-agent creation (createSubAgent)
  • Nested agent creation (createNestedAgent)
  • Agent lifecycle management (start, pause, resume, stop)
  • Agent status tracking (pending, running, waiting, completed, failed, interrupted)
  • Budget enforcement per agent (tokens, cost, duration)
  • Nesting depth limits (configurable, default: 2 levels)

Specialized Agent Roles

  • Main Agent: Orchestrator, full tool access
  • Finder Agent: Quick searches, file discovery (Haiku/Qwen, read-only tools)
  • Oracle Agent: Deep reasoning, complex bugs (o3/GPT-5, full tools)
  • Librarian Agent: API/docs research (Sonnet 4.5, read-only)
  • Refactoring Agent: Code refactoring (Sonnet 4.5, write tools)
  • Reviewer Agent: Code review, security (Sonnet 4.5/o3, read + git)
  • Tester Agent: Test generation (Sonnet 4.5, write + bash)
  • Rush Agent: Quick targeted loops (Haiku, limited tools, 3-5 iterations)

Role-Based Tool Restrictions

  • Map agent roles to allowed tools
  • Enforce tool restrictions per agent
  • Override with Teams configuration (if enforced)

Interactive Agent Plan UI

  • Display parallel task plan to user
  • Show recommended role, model, estimated cost per task
  • Allow user to edit task descriptions
  • Allow user to change models (if not enforced)
  • Options: approve, cancel, edit, manual configuration
  • Configurable auto-approval mode

Multi-Agent Execution UI

  • Display all agents in one pane (stacked vertically)
  • Show status icon per agent (○ pending, ● running, ✓ completed, ✗ failed, ◌ waiting)
  • Show elapsed time, cost, tokens per agent
  • Show compact todo list per agent (first 3 items + “… +N more”)
  • Keyboard shortcut to expand agent details
  • Real-time updates (500ms refresh)

Communication & Context

  • Centralized message queue (orchestrator manages all messages)
  • Inter-agent messaging (sendMessage, broadcastMessage)
  • Shared context (working directory, findings, decisions)
  • Agent-to-orchestrator communication
  • Result sharing between agents

Teams Enforcement for Agents

  • Enforce allowed models per agent
  • Enforce allowed sub-agent roles
  • Forced sub-agents (e.g., security agent on every write)
  • Model selection per sub-agent enforcement
  • Nesting depth limits (enterprise policy)

Testing

  • Task decomposition logic (mocked LLM)
  • Dependency graph construction
  • Topological sort
  • Parallel execution
  • Budget enforcement
  • Model selection/fallbacks
  • Inter-agent communication
  • Nested agent creation
  • Role-based tool restrictions
  • Teams enforcement for agents

Phase 11: Model Switching & Context Management

Goal: Dynamic model switching, intelligent context pruning

Model Switching (slash commands in chat)

  • /model <provider> - switch provider mid-conversation
  • /models - list available models
  • Context transfer when switching
  • Preserve conversation history
  • Adjust token limits per model
  • Check against allowed models (Teams enforcement)

Context Compaction

  • Context window monitoring
  • Detect approaching token limit
  • Summarize old messages
  • Clear tool results
  • Adaptive pruning strategies
  • Manual compaction: /compact command

Smart Context Management

  • Relevance scoring for messages
  • Keep important context (system prompts, recent)
  • Prune low-relevance old messages
  • Preserve critical info (file paths, decisions)

Local Model Support

  • Detect Ollama installation
  • List available Ollama models
  • Pull models if missing
  • Handle model loading time
  • Optimize prompts for smaller models

Testing

  • Model switching
  • Context compaction strategies
  • Various context sizes
  • Teams allowed models enforcement

Phase 12: Custom Commands & Checkpoints

Goal: User extensibility, code safety

Custom Slash Commands

  • Command file format (Markdown with frontmatter)
  • Load from .mimir/commands/ and ~/.mimir/commands/
  • Parse arguments ($1, $2, $ARGUMENTS)
  • Bash execution support (!command)
  • Register with agent
  • /help shows custom commands
  • Example commands
  • Permissions in Commands
    • Specify required permissions
    • Inherit risk level from definition
    • Support auto-accept: true in frontmatter
  • Teams custom commands (loaded from Teams API)
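
Argument parsing for custom commands ($1, $2, $ARGUMENTS) could work roughly like this sketch. The function name is hypothetical, and expanding missing positional arguments to an empty string is an assumption.

```typescript
// Replace $ARGUMENTS with the full argument string and $1, $2, ... with
// positional arguments. A single pass is used so that placeholder-like text
// inside the arguments themselves is not expanded again.
function expandCommandTemplate(template: string, args: string[]): string {
  return template.replace(
    /\$(ARGUMENTS|\d+)/g,
    (_match: string, token: string) => {
      if (token === "ARGUMENTS") return args.join(" ");
      // Positional arguments are 1-based; missing ones become "".
      return args[Number(token) - 1] ?? "";
    },
  );
}

console.log(
  expandCommandTemplate("Review $1 against $2. Notes: $ARGUMENTS", [
    "src/a.ts",
    "main",
  ]),
);
```

The template body would come from the Markdown file after its frontmatter has been stripped; frontmatter parsing is outside the scope of this sketch.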

Checkpoint System

  • Auto-create before file changes
  • Store git diff and file backups in .mimir/checkpoints/
  • CLI Commands:
    • mimir checkpoint list - show all checkpoints
    • mimir checkpoint restore <id> - restore checkpoint
  • Slash Commands:
    • /checkpoint - create checkpoint now
    • /undo - undo last operation
  • Show diff before restore
  • Confirm destructive operations
  • Auto cleanup (keep last N)

Doctor Command (CLI)

  • mimir doctor - run full diagnostics
    • Node.js version check
    • Docker installation check
    • API keys configured
    • File permissions
    • Network connectivity
    • LLM provider connection test
    • MCP server health
    • Teams connection test (if authenticated)
  • Suggest fixes
  • Auto-fix when possible

MIMIR.md Support

  • Load context from MIMIR.md (or custom file via config)
  • Include in system prompt
  • Hierarchical MIMIR.md (global + project)
  • Template generation with mimir init

Testing

  • Custom command loading
  • Checkpoint creation/restoration
  • Doctor diagnostics
  • Command permission system
  • Teams custom commands sync

Phase 13: Polish & Launch

Goal: Production-ready release

Code Quality

  • 85%+ test coverage
  • Fix all linting errors
  • Address TypeScript strict mode issues
  • Optimize performance bottlenecks
  • Memory leak detection and fixes

Documentation

  • README with quickstart
  • Installation guide
  • Configuration guide
  • Custom commands guide
  • Custom tools guide
  • MCP integration guide
  • Permission system guide
  • Teams/Enterprise guide
  • Agent orchestration guide
  • API documentation
  • Example projects

Examples

  • Simple task automation
  • Code review workflow
  • Multi-file refactoring
  • Custom command for team
  • Custom tool example
  • Multi-agent collaboration
  • MCP server integration
  • Permission templates
  • Teams deployment example

Error Messages & UX

  • Review error messages for clarity
  • Add helpful suggestions
  • Improve onboarding
  • Interactive setup wizard

Performance

  • Benchmark common operations
  • Optimize slow paths
  • Implement caching
  • Profile memory usage

Security Audit

  • User input handling
  • Command injection vulnerabilities
  • Path traversal vulnerabilities
  • Docker sandbox security
  • Permission system security
  • Dependency audit (npm audit)
  • Teams API security (auth, encryption)

Release

  • Version 1.0.0
  • Build binaries (all platforms)
  • Publish to npm
  • GitHub release

Success Metrics

MVP (Phases 1-9)

  • CLI on Windows, macOS, Linux
  • 2 LLM providers (DeepSeek, Anthropic) - more coming
  • Tool system (built-in + custom + MCP)
  • Permission & security system
  • Interrupt/cancel support
  • Architect/discuss mode
  • Docker sandbox
  • Conversation history
  • Cost tracking
  • 80%+ test coverage

v1.0 (Phases 1-13)

  • All MVP features
  • Teams/Enterprise support
  • Agent orchestration & multi-agent
  • Custom tools (TypeScript)
  • Custom commands
  • Alternative instruction system
  • Mode switching
  • Keyboard shortcuts
  • 85%+ test coverage
  • Comprehensive documentation
  • Example projects

Implementation Priority

  1. Config/Teams Foundation (Phase 2) - Prepare for enterprise features
  2. Tool System (Phase 4) - Core functionality for agents
  3. Agent Orchestration (Phase 10) - Enable multi-agent workflows
  4. Core features and polish (Phases 5-9, 11-13) - Complete remaining MVP and v1.0 features