Tool System - Implementation Plan
Overview
Comprehensive tool system for Mimir, including built-in tools, MCP integration, and custom tools with TypeScript code execution.
Goals
- Built-in Tools: Core tools (file ops, bash, git, search)
- Tool Configuration: Enable/disable tools, track token costs
- Custom Tools: User-defined tools with TypeScript code
- MCP Integration: Dynamic tool loading from MCP servers
- Permission System: Inherit permission checks for all tools
- Token Tracking: Show system prompt size impact per tool
- /tools Command: Manage tools interactively
Architecture
1. Tool Interface
interface Tool {
name: string;
description: string;
schema: z.ZodObject<any>; // Zod schema for argument validation
enabled: boolean;
tokenCost: number; // Estimated tokens this tool adds to system prompt
source: 'built-in' | 'custom' | 'mcp' | 'teams';
execute(args: any, context: ToolContext): Promise<ToolResult>;
}
interface ToolContext {
// Platform abstractions
platform: {
fs: IFileSystem;
executor: IProcessExecutor;
docker: IDockerClient;
};
// Configuration
config: Config;
// Conversation context
conversation?: {
id: string;
messages: Message[];
workingDirectory: string;
};
// Utilities
logger: Logger;
llm: ILLMProvider; // Allow tools to call LLM if needed
// Permissions
permissions: PermissionChecker;
}
interface ToolResult {
success: boolean;
output: string; // Formatted output for LLM
metadata?: {
duration: number;
tokensUsed?: number;
cost?: number;
};
error?: string;
}
2. Tool Registry
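The registry in this section loads built-in, Teams, MCP, and local custom tools in that order, and registerTool() skips any name that is already taken, so earlier sources win name collisions. A minimal, dependency-free sketch of that rule follows; MiniRegistry and SketchTool are illustrative names that do not appear in the plan, and the real Tool interface carries more fields.

```typescript
// Illustrative types only; not the plan's full Tool interface.
type Source = 'built-in' | 'teams' | 'mcp' | 'custom';

interface SketchTool {
  name: string;
  source: Source;
}

class MiniRegistry {
  private tools = new Map<string, SketchTool>();

  // First registration wins, mirroring registerTool() in the plan.
  // Returns false when the name was already taken.
  register(tool: SketchTool): boolean {
    if (this.tools.has(tool.name)) {
      return false;
    }
    this.tools.set(tool.name, tool);
    return true;
  }

  get(name: string): SketchTool | undefined {
    return this.tools.get(name);
  }

  names(): string[] {
    return Array.from(this.tools.keys());
  }
}

const registry = new MiniRegistry();
registry.register({ name: 'git', source: 'built-in' });
registry.register({ name: 'run_tests', source: 'teams' });
// A local tool that reuses a Teams tool's name is rejected,
// so the Teams version stays in place.
const accepted = registry.register({ name: 'run_tests', source: 'custom' });
```

One consequence of this ordering is that a local definition cannot shadow a built-in or Teams tool of the same name; if local overrides are wanted, the load order or the collision rule would have to change.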
class ToolRegistry {
private tools: Map<string, Tool> = new Map();
private config!: Config; // assigned in loadAll()
constructor(
private fs: IFileSystem,
private teamsClient?: TeamsAPIClient
) {}
async loadAll(config: Config): Promise<void> {
this.config = config;
// 1. Load built-in tools
await this.loadBuiltInTools();
// 2. Load Teams/Enterprise tools (if authenticated)
if (config.teams?.enabled && config.teams.features.sharedTools) {
await this.loadTeamsTools();
}
// 3. Load MCP tools (if configured)
if (config.mcp?.enabled) {
await this.loadMCPTools();
}
// 4. Load custom local tools (if allowed)
if (!config.enforcement?.disableLocalTools) {
await this.loadCustomTools();
}
// Filter by enabled status
this.filterDisabledTools();
}
private async loadBuiltInTools(): Promise<void> {
const builtInTools = [
new FileOperationsTool(),
new FileSearchTool(),
new BashExecutionTool(),
new GitTool(),
];
for (const tool of builtInTools) {
this.registerTool(tool);
}
}
private async loadTeamsTools(): Promise<void> {
const teamsTools = await this.teamsClient!.listTools(
this.config.teams!.orgId
);
for (const toolDef of teamsTools) {
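const tool = await this.buildCustomTool(toolDef);
// buildCustomTool() tags every tool as 'custom'; re-tag Teams tools so that
// source-based checks such as isEnforced() can recognize them
this.registerTool({ ...tool, source: 'teams' });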
}
}
private async loadMCPTools(): Promise<void> {
const mcpClient = new MCPClient(this.config.mcp!);
await mcpClient.connect();
const mcpTools = await mcpClient.listTools();
for (const mcpTool of mcpTools) {
this.registerTool(new MCPToolAdapter(mcpTool, mcpClient));
}
}
private async loadCustomTools(): Promise<void> {
const toolsDir = '.mimir/tools'; // no trailing slash; the glob below adds the separator
if (!(await this.fs.exists(toolsDir))) {
return;
}
const toolFiles = await this.fs.glob(`${toolsDir}/*.yml`);
for (const file of toolFiles) {
try {
const toolDef = await this.loadToolDefinition(file);
const tool = await this.buildCustomTool(toolDef);
this.registerTool(tool);
} catch (error) {
console.warn(`Failed to load tool from ${file}:`, error);
}
}
}
private async loadToolDefinition(
file: string
): Promise<CustomToolDefinition> {
const content = await this.fs.readFile(file, 'utf-8');
const parsed = yaml.parse(content);
// Validate schema
const schema = z.object({
name: z.string(),
description: z.string(),
enabled: z.boolean().default(true),
tokenCost: z.number().optional(),
schema: z.object({
type: z.literal('object'),
properties: z.record(z.any()),
required: z.array(z.string()).optional(),
}),
runtime: z.enum(['node', 'typescript']).default('typescript'),
code: z.string(),
permissions: z.object({
allowlist: z.array(z.string()).optional(),
autoAccept: z.boolean().optional(),
riskLevel: z.enum(['low', 'medium', 'high', 'critical']).optional(),
}).optional(),
});
return schema.parse(parsed);
}
private async buildCustomTool(
def: CustomToolDefinition
): Promise<Tool> {
// Build Zod schema from JSON schema
const zodSchema = this.jsonSchemaToZod(def.schema);
// Compile and sandbox TypeScript code
const executor = await this.createToolExecutor(def);
return {
name: def.name,
description: def.description,
schema: zodSchema,
enabled: def.enabled ?? true,
tokenCost: def.tokenCost ?? this.estimateTokenCost(def),
source: 'custom',
execute: async (args: any, context: ToolContext) => {
return await executor.execute(args, context);
},
};
}
private async createToolExecutor(
def: CustomToolDefinition
): Promise<CustomToolExecutor> {
if (def.runtime === 'typescript') {
return new TypeScriptToolExecutor(def, this.fs);
}
throw new Error(`Unsupported runtime: ${def.runtime}`);
}
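private filterDisabledTools(): void {
const toolsConfig = this.config.tools ?? {};
for (const [name, tool] of this.tools.entries()) {
// Flag tools that are disabled in config instead of deleting them, so that
// /tools can still list them and /tools enable can still find them.
// getToolsForLLM() and getTotalTokenCost() already skip disabled tools.
if (toolsConfig[name]?.enabled === false) {
tool.enabled = false;
}
}
}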
registerTool(tool: Tool): void {
if (this.tools.has(tool.name)) {
console.warn(`Tool ${tool.name} already registered, skipping`);
return;
}
this.tools.set(tool.name, tool);
}
getTool(name: string): Tool | undefined {
return this.tools.get(name);
}
getAllTools(): Tool[] {
return Array.from(this.tools.values());
}
getToolsForLLM(): ToolDefinitionForLLM[] {
return this.getAllTools()
.filter(tool => tool.enabled)
.map(tool => ({
name: tool.name,
description: tool.description,
parameters: zodToJsonSchema(tool.schema),
}));
}
getTotalTokenCost(): number {
return this.getAllTools()
.filter(tool => tool.enabled)
.reduce((sum, tool) => sum + tool.tokenCost, 0);
}
private estimateTokenCost(def: CustomToolDefinition): number {
// Estimate based on description + schema size
const descriptionTokens = Math.ceil(def.description.length / 4);
const schemaTokens = Math.ceil(JSON.stringify(def.schema).length / 4);
return descriptionTokens + schemaTokens;
}
private jsonSchemaToZod(schema: any): z.ZodObject<any> {
// Convert JSON schema to Zod schema
// (simplified implementation, full version would handle all JSON schema features)
const shape: Record<string, z.ZodTypeAny> = {};
for (const [key, prop] of Object.entries(schema.properties)) {
const propSchema = prop as any;
if (propSchema.type === 'string') {
shape[key] = z.string();
} else if (propSchema.type === 'number') {
shape[key] = z.number();
} else if (propSchema.type === 'boolean') {
shape[key] = z.boolean();
} else if (propSchema.type === 'array') {
shape[key] = z.array(z.any());
} else {
shape[key] = z.any();
}
if (propSchema.description) {
shape[key] = shape[key].describe(propSchema.description);
}
if (!schema.required?.includes(key)) {
shape[key] = shape[key].optional();
}
}
return z.object(shape);
}
}
3. Custom Tool Execution (TypeScript)
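The executor in this section receives the tool's outcome as a one-line JSON envelope of the form {"result": ..., "error": ...} on stdout. Because the wrapper script catches exceptions and still exits normally, the exit code alone does not say whether the tool failed. The sketch below shows one way to turn stdout plus exit code into a ToolResult; parseSandboxOutput is a hypothetical helper, and the ToolResult here is a trimmed local copy of the plan's interface.

```typescript
// Trimmed local copy of the plan's ToolResult so the sketch is self-contained.
interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

// Hypothetical helper: combine the sandbox exit code with the JSON envelope.
function parseSandboxOutput(stdout: string, exitCode: number): ToolResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return { success: false, output: '', error: 'Tool produced non-JSON output' };
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return { success: false, output: '', error: 'Tool output is not a JSON object' };
  }

  const envelope = parsed as { result?: unknown; error?: unknown };
  const error = typeof envelope.error === 'string' ? envelope.error : undefined;
  const output = typeof envelope.result === 'string' ? envelope.result : '';

  // Success requires both a clean exit and no error reported in the envelope.
  return { success: exitCode === 0 && error === undefined, output, error };
}

const ok = parseSandboxOutput('{"result":"3 tests passed","error":null}', 0);
const failed = parseSandboxOutput('{"result":null,"error":"boom"}', 0);
const garbage = parseSandboxOutput('Segmentation fault', 139);
```

Anything the tool itself writes to stdout would corrupt the envelope, so a real implementation would probably route tool logging to stderr or use a separate channel for the result.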
interface CustomToolDefinition {
name: string;
description: string;
enabled: boolean;
tokenCost?: number;
schema: JSONSchema;
runtime: 'typescript' | 'node';
code: string;
permissions?: {
allowlist?: string[];
autoAccept?: boolean;
riskLevel?: 'low' | 'medium' | 'high' | 'critical';
};
}
class TypeScriptToolExecutor {
private compiledCode?: string;
private sandboxDocker: IDockerClient;
constructor(
private definition: CustomToolDefinition,
private fs: IFileSystem
) {
this.sandboxDocker = new DockerClient(); // For isolated execution
}
async execute(args: any, context: ToolContext): Promise<ToolResult> {
// 1. Compile TypeScript to JavaScript (if not already compiled)
if (!this.compiledCode) {
this.compiledCode = await this.compile();
}
// 2. Execute in Docker sandbox (isolated context)
return await this.executeInSandbox(args, context);
}
private async compile(): Promise<string> {
// Option 1: Use esbuild for fast compilation
const result = await esbuild.build({
stdin: {
contents: this.definition.code,
loader: 'ts',
},
bundle: true,
platform: 'node',
target: 'node18',
format: 'cjs',
write: false,
});
return result.outputFiles[0].text;
// Option 2: Use tsc or tsx (slower but more compatible)
// const tempFile = await this.fs.writeTemp('tool.ts', this.definition.code);
// await this.executor.execute({ command: 'npx', args: ['tsc', tempFile] });
// return await this.fs.readFile(tempFile.replace('.ts', '.js'));
}
private async executeInSandbox(
args: any,
context: ToolContext
): Promise<ToolResult> {
// Create sandbox environment with limited context
const sandboxContext = this.createSandboxContext(context);
// Prepare execution script
const script = this.buildExecutionScript(args, sandboxContext);
// Run in Docker container (isolated)
const result = await this.sandboxDocker.runContainer({
image: 'mimir/tool-sandbox:node18',
command: ['node', '-e', script],
env: {
ARGS: JSON.stringify(args),
CONTEXT: JSON.stringify(sandboxContext),
},
timeout: 30000, // 30 second timeout
memory: '256m',
cpus: 1,
});
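// Parse result
try {
const output = JSON.parse(result.stdout);
return {
// The wrapper script exits 0 even when the tool throws, so also check the reported error
success: result.exitCode === 0 && !output.error,
output: output.result ?? '',
metadata: {
duration: result.duration,
},
error: output.error ?? undefined,
};
} catch (error) {
// `error` is `unknown` under strict TypeScript, so narrow before reading .message
const message = error instanceof Error ? error.message : String(error);
return {
success: false,
output: '',
error: `Tool execution failed: ${message}`,
};
}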
}
private createSandboxContext(context: ToolContext): SandboxContext {
// Provide limited, serializable context to sandbox
return {
config: {
// Only expose non-sensitive config
workingDirectory: context.conversation?.workingDirectory,
// Don't expose API keys, tokens, etc.
},
conversation: context.conversation
? {
id: context.conversation.id,
workingDirectory: context.conversation.workingDirectory,
// Don't include full message history (could be huge)
}
: undefined,
};
}
private buildExecutionScript(
args: any,
sandboxContext: SandboxContext
): string {
return `
const { platform, logger } = require('./sandbox-runtime');
// User's compiled tool code
${this.compiledCode}
// Execute tool
(async () => {
try {
const args = JSON.parse(process.env.ARGS);
const context = {
platform,
logger,
config: JSON.parse(process.env.CONTEXT).config,
conversation: JSON.parse(process.env.CONTEXT).conversation,
};
const result = await execute(args, context);
console.log(JSON.stringify({
result: result,
error: null
}));
} catch (error) {
console.log(JSON.stringify({
result: null,
error: error.message
}));
}
})();
`;
}
}
// Sandbox runtime (injected into Docker container)
// Provides safe platform abstractions
class SandboxRuntime {
platform = {
fs: {
async readFile(path: string): Promise<string> {
// Validate path (prevent escaping working directory)
// Call host via IPC/API
},
async writeFile(path: string, content: string): Promise<void> {
// Validate and call host
},
// ... other safe fs operations
},
executor: {
async execute(command: string, args: string[]): Promise<any> {
// Require permission check on host
// Execute on host (not in sandbox)
},
},
};
logger = {
info: (msg: string) => console.log(`[INFO] ${msg}`),
warn: (msg: string) => console.warn(`[WARN] ${msg}`),
error: (msg: string) => console.error(`[ERROR] ${msg}`),
};
}
4. Example Custom Tool Definition
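The definition that follows restricts the tool to an allowlist of command patterns such as "yarn test*". The plan does not say how those patterns are matched. The sketch below assumes the simplest reading, where * matches any run of characters and the whole command line must match; patternToRegExp and isAllowed are hypothetical helpers.

```typescript
// Escape regex metacharacters (everything except "*"), then turn each "*" into ".*".
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
}

// A command line is allowed when at least one pattern matches it in full.
function isAllowed(commandLine: string, allowlist: string[]): boolean {
  const normalized = commandLine.trim().replace(/\s+/g, ' ');
  return allowlist.some((pattern) => patternToRegExp(pattern).test(normalized));
}

const allowlist = ['yarn test*', 'npm test*', 'vitest*'];

const runsTests = isAllowed('yarn test auth/** --coverage', allowlist);
const installsPackage = isAllowed('yarn add left-pad', allowlist);
const prefixedCommand = isAllowed('rm -rf . ; yarn test', allowlist);
const chainedCommand = isAllowed('yarn test && rm -rf .', allowlist);
```

Under this reading a trailing * also matches chained commands such as "yarn test && rm -rf ." (chainedCommand is true above), so string matching of this kind is only safe when the command is executed without a shell, or when matching is done on the parsed argument list instead.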
# .mimir/tools/run_tests.yml
name: run_tests
description: Run project tests and analyze failures. Returns test results with failure details.
enabled: true
tokenCost: 450 # Estimated tokens for description + schema
schema:
  type: object
  properties:
    pattern:
      type: string
      description: Test file pattern to match (e.g., "*.test.ts", "auth/**")
    coverage:
      type: boolean
      description: Include coverage report in output
    watch:
      type: boolean
      description: Run in watch mode (not recommended for LLM)
  required: [pattern]
runtime: typescript
permissions:
  # Commands this tool is allowed to run
  allowlist:
    - "yarn test*"
    - "npm test*"
    - "vitest*"
  autoAccept: false # Prompt user before running
  riskLevel: medium
code: |
  interface TestArgs {
    pattern: string;
    coverage?: boolean;
    watch?: boolean;
  }
  interface TestResult {
    passed: number;
    failed: number;
    failures: Array<{
      file: string;
      test: string;
      error: string;
    }>;
    coverage?: {
      lines: number;
      branches: number;
      functions: number;
    };
  }
  export async function execute(
    args: TestArgs,
    context: ToolContext
  ): Promise<string> {
    const { platform, logger } = context;
    logger.info(`Running tests matching: ${args.pattern}`);
    // Build test command
    const testCmd = ['test', args.pattern];
    if (args.coverage) {
      testCmd.push('--coverage');
    }
    // Execute tests
    const result = await platform.executor.execute({
      command: 'yarn',
      args: testCmd,
      cwd: context.conversation?.workingDirectory,
    });
    // Parse test output (Vitest format)
    const testResult = parseTestOutput(result.stdout);
    // Format result for LLM
    if (testResult.failed === 0) {
      return `✓ All ${testResult.passed} tests passed!${
        args.coverage ? `\n\nCoverage: ${testResult.coverage?.lines}% lines` : ''
      }`;
    } else {
      let output = `✗ ${testResult.failed} tests failed (${testResult.passed} passed)\n\n`;
      output += 'Failures:\n';
      for (const failure of testResult.failures) {
        output += `\n${failure.file} > ${failure.test}\n`;
        output += ` ${failure.error}\n`;
      }
      return output;
    }
  }
  function parseTestOutput(stdout: string): TestResult {
    // Parse Vitest output format
    const lines = stdout.split('\n');
    const result: TestResult = {
      passed: 0,
      failed: 0,
      failures: [],
    };
    // Example parsing (simplified)
    for (const line of lines) {
      if (line.includes('PASS')) {
        result.passed++;
      } else if (line.includes('FAIL')) {
        result.failed++;
        // Parse failure details...
      }
    }
    return result;
  }
5. Configuration Schema (Extended)
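The configuration in this section lists per-tool entries (each with an enabled flag) next to global settings such as showTokenCosts. Assuming both kinds of key live in the same tools map, which the listing suggests but does not confirm, a loader has to tell them apart before looking up a tool by name. The sketch below is one way to do that; splitToolsConfig and GLOBAL_KEYS are hypothetical names.

```typescript
// Keys treated as global settings; taken from the example config in this section.
const GLOBAL_KEYS: readonly string[] = [
  'showTokenCosts',
  'autoLoadCustomTools',
  'maxExecutionTime',
];

interface SplitConfig {
  perTool: Record<string, { enabled: boolean }>;
  globals: Record<string, unknown>;
}

// Hypothetical helper: separate per-tool entries from global settings.
function splitToolsConfig(section: Record<string, unknown>): SplitConfig {
  const out: SplitConfig = { perTool: {}, globals: {} };
  for (const key of Object.keys(section)) {
    const value = section[key];
    if (GLOBAL_KEYS.indexOf(key) !== -1) {
      out.globals[key] = value;
    } else if (typeof value === 'object' && value !== null) {
      const enabled = (value as { enabled?: unknown }).enabled;
      // A missing "enabled" counts as true, matching the default in the custom tool schema.
      out.perTool[key] = { enabled: enabled !== false };
    }
  }
  return out;
}

const split = splitToolsConfig({
  git: { enabled: true },
  analyze_dependencies: { enabled: false },
  run_tests: {},
  showTokenCosts: true,
  maxExecutionTime: 60000,
});
```

Keeping the global settings under a separate key would avoid the need for this split entirely, at the cost of changing the config layout shown here.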
# .mimir/config.yml
tools:
  # Built-in tools
  file_operations:
    enabled: true
  file_search:
    enabled: true
  bash_execution:
    enabled: true
  git:
    enabled: true
  # Custom tools (can be disabled individually)
  run_tests:
    enabled: true
  analyze_dependencies:
    enabled: false # Disable if not needed
  # Global tool settings
  showTokenCosts: true # Show token cost in /tools command
  autoLoadCustomTools: true # Auto-load from .mimir/tools/
  maxExecutionTime: 60000 # Max execution time per tool (ms)
6. /tools Command (In-Chat Management)
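class ToolsCommand {
// registry and configManager are used throughout this class but were never declared.
// ConfigManager is an assumed type that exposes updateToolConfig().
constructor(
private registry: ToolRegistry,
private configManager: ConfigManager
) {}
async execute(args: string[], config: Config): Promise<void> {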
const subcommand = args[0];
switch (subcommand) {
case 'list':
await this.listTools();
break;
case 'enable':
await this.enableTool(args[1]);
break;
case 'disable':
await this.disableTool(args[1]);
break;
case 'info':
await this.showToolInfo(args[1]);
break;
case 'tokens':
await this.showTokenCosts();
break;
default:
await this.listTools();
}
}
private async listTools(): Promise<void> {
const tools = this.registry.getAllTools();
const totalTokens = this.registry.getTotalTokenCost();
console.log('Available Tools:\n');
for (const tool of tools) {
const status = tool.enabled ? '✓' : '✗';
const source = this.formatSource(tool.source);
console.log(
`${status} ${tool.name.padEnd(25)} ${source.padEnd(12)} ~${tool.tokenCost} tokens`
);
}
console.log(`\nTotal system prompt cost: ~${totalTokens} tokens`);
console.log('\nUsage:');
console.log(' /tools enable <name> - Enable a tool');
console.log(' /tools disable <name> - Disable a tool');
console.log(' /tools info <name> - Show tool details');
console.log(' /tools tokens - Show token breakdown');
}
private async showTokenCosts(): Promise<void> {
const tools = this.registry.getAllTools();
console.log('Token Cost Breakdown:\n');
const sorted = tools
.filter(t => t.enabled)
.sort((a, b) => b.tokenCost - a.tokenCost);
for (const tool of sorted) {
const bar = this.createBar(tool.tokenCost, 50);
console.log(`${tool.name.padEnd(25)} ${bar} ${tool.tokenCost}`);
}
const total = this.registry.getTotalTokenCost();
console.log(`\nTotal: ${total} tokens (~$${this.estimateCost(total)})`);
}
private async enableTool(name: string): Promise<void> {
const tool = this.registry.getTool(name);
if (!tool) {
console.error(`Tool not found: ${name}`);
return;
}
if (tool.enabled) {
console.log(`Tool already enabled: ${name}`);
return;
}
// Update config
await this.configManager.updateToolConfig(name, { enabled: true });
console.log(`✓ Enabled tool: ${name} (+${tool.tokenCost} tokens)`);
}
private async disableTool(name: string): Promise<void> {
const tool = this.registry.getTool(name);
if (!tool) {
console.error(`Tool not found: ${name}`);
return;
}
if (!tool.enabled) {
console.log(`Tool already disabled: ${name}`);
return;
}
// Cannot disable if enforced by Teams
if (this.isEnforced(tool)) {
console.error(
`Cannot disable tool: ${name} (enforced by enterprise policy)`
);
return;
}
// Update config
await this.configManager.updateToolConfig(name, { enabled: false });
console.log(`✗ Disabled tool: ${name} (-${tool.tokenCost} tokens)`);
}
private async showToolInfo(name: string): Promise<void> {
const tool = this.registry.getTool(name);
if (!tool) {
console.error(`Tool not found: ${name}`);
return;
}
console.log(`\nTool: ${tool.name}`);
console.log(`Description: ${tool.description}`);
console.log(`Source: ${this.formatSource(tool.source)}`);
console.log(`Enabled: ${tool.enabled ? 'Yes' : 'No'}`);
console.log(`Token Cost: ~${tool.tokenCost} tokens`);
console.log('\nParameters:');
const schema = zodToJsonSchema(tool.schema);
console.log(JSON.stringify(schema, null, 2));
}
private formatSource(source: string): string {
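// Typed as Record<string, string> so that indexing by an arbitrary string type-checks
const badges: Record<string, string> = {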
'built-in': '[Built-in]',
'custom': '[Custom]',
'mcp': '[MCP]',
'teams': '[Teams]',
};
return badges[source] || `[${source}]`;
}
private createBar(value: number, maxWidth: number): string {
const max = 1000; // Max expected token cost
const width = Math.min(Math.round((value / max) * maxWidth), maxWidth);
return '█'.repeat(width) + '░'.repeat(maxWidth - width);
}
private estimateCost(tokens: number): string {
// Rough estimate: $0.001 per 1000 tokens (varies by provider)
return (tokens / 1000 * 0.001).toFixed(4);
}
private isEnforced(tool: Tool): boolean {
// Check if tool is enforced by Teams config
return tool.source === 'teams';
}
}
Example output:
$ /tools
Available Tools:
✓ file_operations           [Built-in]   ~320 tokens
✓ file_search               [Built-in]   ~280 tokens
✓ bash_execution            [Built-in]   ~350 tokens
✓ git                       [Built-in]   ~420 tokens
✓ run_tests                 [Custom]     ~450 tokens
✗ analyze_dependencies      [Custom]     ~380 tokens
✓ security_scan             [Teams]      ~520 tokens
Total system prompt cost: ~2340 tokens
Usage:
/tools enable <name> - Enable a tool
/tools disable <name> - Disable a tool
/tools info <name> - Show tool details
/tools tokens - Show token breakdown
$ /tools tokens
Token Cost Breakdown:
security_scan ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 520
run_tests ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 450
git █████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 420
bash_execution ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 350
file_operations ███████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 320
file_search ██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 280
Total: 2340 tokens (~$0.0023)
Built-in Tools
1. File Operations Tool
class FileOperationsTool implements Tool {
name = 'file_operations';
description = 'Read, write, edit, list, and delete files';
enabled = true;
tokenCost = 320;
source = 'built-in' as const;
schema = z.object({
operation: z.enum(['read', 'write', 'edit', 'list', 'delete', 'exists']),
path: z.string().describe('File or directory path'),
content: z.string().optional().describe('Content to write (for write/edit)'),
pattern: z.string().optional().describe('Search pattern for edit operation'),
replacement: z.string().optional().describe('Replacement text for edit'),
backup: z.boolean().optional().default(true).describe('Create backup before changes'),
});
async execute(args: any, context: ToolContext): Promise<ToolResult> {
const { operation, path } = args;
switch (operation) {
case 'read':
return await this.read(path, context);
case 'write':
return await this.write(path, args.content, args.backup, context);
case 'edit':
return await this.edit(path, args.pattern, args.replacement, args.backup, context);
case 'list':
return await this.list(path, context);
case 'delete':
return await this.delete(path, context);
case 'exists':
return await this.exists(path, context);
default:
return { success: false, output: '', error: 'Unknown operation' };
}
}
private async read(path: string, context: ToolContext): Promise<ToolResult> {
try {
const content = await context.platform.fs.readFile(path, 'utf-8');
return {
success: true,
output: `Content of ${path}:\n\n${content}`,
};
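} catch (error) {
// `error` is `unknown` under strict TypeScript, so narrow before reading .message
const message = error instanceof Error ? error.message : String(error);
return {
success: false,
output: '',
error: `Failed to read ${path}: ${message}`,
};
}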
}
// ... other operations
}
2. File Search Tool
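The class in this section leaves execute() unimplemented and only notes "ripgrep or native search". As a rough sketch of the ripgrep route, the function below maps the tool's arguments to an rg argument list. The mapping of modes is an assumption: grep is treated as a literal-text search, regex as a regular-expression search, and glob as a file listing. buildRipgrepArgs is a hypothetical helper; the flags themselves (--files, --glob, --line-number, --fixed-strings, --regexp) are standard ripgrep options.

```typescript
// Local copy of the argument shape from the tool's schema.
interface FileSearchArgs {
  mode: 'grep' | 'glob' | 'regex';
  query: string;
  path?: string;
  include?: string[];
  exclude?: string[];
}

// Hypothetical helper: translate tool arguments into an argv for `rg`.
function buildRipgrepArgs(args: FileSearchArgs): string[] {
  const argv: string[] = [];

  if (args.mode === 'glob') {
    // List files whose path matches the glob instead of searching contents.
    argv.push('--files', '--glob', args.query);
  } else {
    argv.push('--line-number');
    if (args.mode === 'grep') {
      argv.push('--fixed-strings'); // treat the query as literal text
    }
    argv.push('--regexp', args.query);
  }

  for (const pattern of args.include ?? []) {
    argv.push('--glob', pattern);
  }
  // ripgrep treats a glob prefixed with "!" as an exclusion.
  for (const pattern of args.exclude ?? []) {
    argv.push('--glob', `!${pattern}`);
  }

  // "--" ends option parsing so a path starting with "-" is not read as a flag.
  argv.push('--', args.path ?? '.');
  return argv;
}

const grepArgv = buildRipgrepArgs({
  mode: 'grep',
  query: 'TODO',
  exclude: ['node_modules/**'],
});
const globArgv = buildRipgrepArgs({ mode: 'glob', query: '*.test.ts', path: 'src' });
```

Passing the result as an argument array to the process executor, rather than joining it into a shell string, keeps the query from being interpreted by a shell.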
class FileSearchTool implements Tool {
name = 'file_search';
description = 'Search files using grep, glob patterns, or regex';
enabled = true;
tokenCost = 280;
source = 'built-in' as const;
schema = z.object({
mode: z.enum(['grep', 'glob', 'regex']),
query: z.string().describe('Search query or pattern'),
path: z.string().optional().describe('Directory to search (default: current)'),
include: z.array(z.string()).optional().describe('File patterns to include'),
exclude: z.array(z.string()).optional().describe('File patterns to exclude'),
});
async execute(args: any, context: ToolContext): Promise<ToolResult> {
// Implementation using ripgrep or native search
throw new Error('FileSearchTool.execute is not implemented yet');
}
}
3. Bash Execution Tool
class BashExecutionTool implements Tool {
name = 'bash_execution';
description = 'Execute shell commands with permission checks';
enabled = true;
tokenCost = 350;
source = 'built-in' as const;
schema = z.object({
command: z.string().describe('Command to execute'),
args: z.array(z.string()).optional().describe('Command arguments'),
timeout: z.number().optional().default(30000).describe('Timeout in ms'),
});
async execute(args: any, context: ToolContext): Promise<ToolResult> {
// 1. Check permissions
const allowed = await context.permissions.checkPermission(
// Join command and arguments without leaving a trailing space when there are no arguments
[args.command, ...(args.args ?? [])].join(' '),
context.config
);
if (!allowed) {
return {
success: false,
output: '',
error: 'Permission denied',
};
}
// 2. Execute
try {
const result = await context.platform.executor.execute({
command: args.command,
args: args.args,
timeout: args.timeout,
});
return {
success: result.exitCode === 0,
output: result.stdout,
error: result.exitCode !== 0 ? result.stderr : undefined,
};
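} catch (error) {
// `error` is `unknown` under strict TypeScript, so narrow before reading .message
return {
success: false,
output: '',
error: error instanceof Error ? error.message : String(error),
};
}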
}
}
4. Git Tool
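The Git tool in this section is meant to apply permission checks to destructive operations, but the plan does not say which operations count as destructive. The sketch below assumes one plausible split of the six operations in the schema: status, diff, and log are read-only, and everything else needs a check. needsPermission is a hypothetical helper.

```typescript
// The operations listed in the tool's schema.
type GitOperation = 'status' | 'diff' | 'log' | 'commit' | 'branch' | 'checkout';

// Assumed read-only set; the plan does not define it.
const READ_ONLY_OPERATIONS: GitOperation[] = ['status', 'diff', 'log'];

// Hypothetical helper: decide whether an operation should go through the permission system.
function needsPermission(operation: GitOperation, args: string[] = []): boolean {
  if (READ_ONLY_OPERATIONS.indexOf(operation) !== -1) {
    return false;
  }
  // "git branch" with no arguments only lists branches.
  if (operation === 'branch' && args.length === 0) {
    return false;
  }
  // commit, checkout, and branch with arguments can change the repository or working tree.
  return true;
}

const statusCheck = needsPermission('status');
const commitCheck = needsPermission('commit', ['-m', 'fix tests']);
const listBranches = needsPermission('branch');
const deleteBranch = needsPermission('branch', ['-D', 'old-feature']);
```

A classification by operation name is coarse: some flags turn a normally read-only command into one that writes files, so a stricter version would also inspect the arguments of the read-only operations.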
class GitTool implements Tool {
name = 'git';
description = 'Execute git operations (status, diff, log, commit, etc.)';
enabled = true;
tokenCost = 420;
source = 'built-in' as const;
schema = z.object({
operation: z.enum(['status', 'diff', 'log', 'commit', 'branch', 'checkout']),
args: z.array(z.string()).optional(),
});
async execute(args: any, context: ToolContext): Promise<ToolResult> {
// Git operations with permission checks for destructive operations
throw new Error('GitTool.execute is not implemented yet');
}
}
MCP Integration
class MCPToolAdapter implements Tool {
constructor(
private mcpTool: MCPToolDefinition,
private mcpClient: MCPClient
) {}
get name(): string {
return `mcp_${this.mcpTool.server}_${this.mcpTool.name}`;
}
get description(): string {
return this.mcpTool.description;
}
// Plain field rather than a getter, so the enabled flag can be toggled from config
enabled = true;
get tokenCost(): number {
// Estimate based on MCP tool description
return Math.ceil(this.description.length / 4) + 100;
}
get source(): 'mcp' {
return 'mcp';
}
get schema(): z.ZodObject<any> {
// Convert MCP schema to Zod
return this.convertMCPSchemaToZod(this.mcpTool.schema);
}
async execute(args: any, context: ToolContext): Promise<ToolResult> {
// Forward to MCP server
const result = await this.mcpClient.callTool(
this.mcpTool.server,
this.mcpTool.name,
args
);
return {
success: result.success,
output: result.content,
error: result.error,
};
}
}
Testing Strategy
Unit Tests
- Tool registry loading (built-in, custom, MCP, teams)
- Tool execution (mocked context)
- Permission checking
- Token cost estimation
- Config updates (/tools enable/disable)
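As an illustration of the "tool execution (mocked context)" item, the sketch below exercises a read operation shaped like FileOperationsTool.read against an in-memory file system. InMemoryFs, readOperation, and runReadTests are hypothetical names, the interfaces are trimmed local copies, and no test framework is assumed.

```typescript
// Trimmed stand-ins for the plan's interfaces; only what the test touches.
interface IFileSystem {
  readFile(path: string, encoding: string): Promise<string>;
}

interface ToolResult {
  success: boolean;
  output: string;
  error?: string;
}

// In-memory file system used as the mock.
class InMemoryFs implements IFileSystem {
  constructor(private files: Record<string, string>) {}

  async readFile(path: string): Promise<string> {
    if (!Object.prototype.hasOwnProperty.call(this.files, path)) {
      throw new Error(`ENOENT: ${path}`);
    }
    return this.files[path];
  }
}

// Same shape as the read operation in the plan, taking the file system directly.
async function readOperation(path: string, fs: IFileSystem): Promise<ToolResult> {
  try {
    const content = await fs.readFile(path, 'utf-8');
    return { success: true, output: `Content of ${path}:\n\n${content}` };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, output: '', error: `Failed to read ${path}: ${message}` };
  }
}

// Returns the names of failed checks; an empty array means everything passed.
async function runReadTests(): Promise<string[]> {
  const fs = new InMemoryFs({ 'README.md': '# Mimir' });
  const failures: string[] = [];

  const existing = await readOperation('README.md', fs);
  if (!existing.success || existing.output !== 'Content of README.md:\n\n# Mimir') {
    failures.push('read existing file');
  }

  const missing = await readOperation('nope.txt', fs);
  if (missing.success || !(missing.error ?? '').includes('nope.txt')) {
    failures.push('read missing file');
  }

  return failures;
}
```

Because the tool only talks to the IFileSystem abstraction, the same test can run without touching disk, which is the main benefit of routing all file access through ToolContext.platform.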
Integration Tests
- Custom tool compilation (TypeScript → JavaScript)
- Sandbox execution (Docker container)
- MCP tool loading and execution
- Teams tool sync
End-to-End Tests
- Full tool lifecycle (load → execute → result)
- /tools command interactions
- Permission prompt flows
Security Considerations
- Sandboxed Execution: Custom tools run in Docker containers
- Permission Inheritance: All tools go through permission system
- Path Validation: Prevent directory traversal attacks
- Timeout Enforcement: Prevent infinite loops
- Resource Limits: CPU, memory limits for sandbox
- Code Review: Teams tools reviewed by admin before deployment
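The path validation item can be sketched as a pure function that resolves a requested path against the working directory and rejects anything that ends up outside it. This is a minimal illustration assuming absolute POSIX-style roots; resolveInside and splitSegments are hypothetical helpers, and symbolic links are not handled, so a real implementation would also need to resolve them on the host.

```typescript
// Split a POSIX-style path into segments, collapsing ".", "..", and repeated slashes.
function splitSegments(path: string): string[] {
  const segments: string[] = [];
  for (const part of path.split('/')) {
    if (part === '' || part === '.') {
      continue;
    }
    if (part === '..') {
      segments.pop(); // going above "/" stays at "/"
    } else {
      segments.push(part);
    }
  }
  return segments;
}

// Returns the normalized absolute path if it lies inside root, otherwise null.
function resolveInside(root: string, requested: string): string | null {
  const rootSegments = splitSegments(root);
  // Absolute requests are taken as-is; relative ones are joined onto the root.
  const target = requested.startsWith('/')
    ? splitSegments(requested)
    : splitSegments(`${root}/${requested}`);

  // Compare whole segments so that "/work/project-evil" is not accepted
  // as being inside "/work/project".
  const inside = rootSegments.every((segment, index) => target[index] === segment);
  return inside ? '/' + target.join('/') : null;
}

const root = '/work/project';
const normalFile = resolveInside(root, 'src/index.ts');
const traversal = resolveInside(root, '../../etc/passwd');
const siblingPrefix = resolveInside(root, '/work/project-evil/secrets.txt');
const roundTrip = resolveInside(root, 'a/../../project/ok.txt');
```

Comparing segment by segment, rather than checking whether the path string starts with the root string, is what rules out sibling directories that merely share a name prefix with the working directory.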
Implementation Phases
Phase 1: Core Tool System
- Tool interface and registry
- Built-in tools (file ops, search, bash, git)
- Tool configuration schema
- Token cost tracking
Phase 2: /tools Command
- List tools
- Enable/disable tools
- Show token costs
- Tool info display
Phase 3: Custom Tools
- YAML definition loader
- TypeScript compilation (esbuild)
- Docker sandbox execution
- Context injection
- Permission system integration
Phase 4: MCP Integration
- MCP tool adapter
- Dynamic MCP tool loading
- MCP server lifecycle management
Phase 5: Teams Integration
- Teams tools loading
- Enforced tools (cannot be disabled)
- Teams allowlist integration
Next Steps
- Implement core tool registry
- Build built-in tools
- Add /tools command
- Create custom tool system
- Integrate with agent loop