Agent¶

The Agent module implements the core task automation loop.

Overview¶

The AlphaBot class orchestrates the entire task execution workflow, managing the interaction between the LLM, command executor, and user interface.

Class: AlphaBot¶

Located in: alpha_bot/agent.py

Initialization¶

from alpha_bot.agent import AlphaBot

# Arguments follow the constructor signature under "Agent Parameters"
agent = AlphaBot(
    auto_execute=False  # ask for confirmation before each command
)

Key Methods¶

`run(task: str) -> None`¶

Execute a single task through the agent loop.

Parameters:

- task (str): Natural-language description of the task

Example:

agent.run("list all Python files in the current directory")

`run_interactive() -> None`¶

Start interactive mode for continuous task execution.

Example:

agent.run_interactive()

Agent Loop Pattern¶

The agent follows this execution loop:

1. Receive task
2. Analyze and plan
3. Select appropriate skill (Command, Direct LLM, Browser, PPT, Image, WeChat, Feishu, etc.)
4. Execute skill
5. Safety check (if command-generating skill)
6. Get user confirmation (unless auto mode)
7. Execute command (if applicable)
8. Analyze results
9. Determine if task complete
   - If not complete: return to step 3 with updated context
   - If complete: finish
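
The loop above can be sketched as a small driver function. This is a minimal illustration under stated assumptions, not AlphaBot's actual implementation: `StepResult`, `agent_loop`, and every callable parameter (`select_skill`, `is_safe`, `confirm`, `run_command`, `is_complete`) are hypothetical names introduced here to stand in for the skill manager, safety check, user prompt, executor, and LLM analysis.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepResult:
    """Outcome of one skill call (hypothetical shape, not AlphaBot's)."""
    output: str = ""
    command: Optional[str] = None  # set only by command-generating skills


Skill = Callable[[str, List[str]], StepResult]


def agent_loop(
    task: str,
    select_skill: Callable[[str, List[str]], Skill],
    is_safe: Callable[[str], bool],
    confirm: Callable[[str], bool],
    run_command: Callable[[str], str],
    is_complete: Callable[[str, List[str]], bool],
    auto_mode: bool = False,
    max_steps: int = 10,
) -> List[str]:
    """Repeat select, execute, check, confirm, run, analyze until done."""
    history: List[str] = []
    for _ in range(max_steps):
        skill = select_skill(task, history)        # step 3: pick a skill
        result = skill(task, history)              # step 4: execute it
        if result.command is None:
            history.append(result.output)          # non-command skill
        elif not is_safe(result.command):          # step 5: safety check
            history.append(f"blocked: {result.command}")
            continue                               # try again with new context
        elif not auto_mode and not confirm(result.command):  # step 6
            history.append(f"declined: {result.command}")
            break
        else:
            history.append(run_command(result.command))      # step 7
        if is_complete(task, history):             # steps 8-9
            break
    return history
```

Passing the collaborators in as callables keeps the control flow testable without a real LLM or shell; `max_steps` is an added guard so a task that never completes cannot loop forever.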

Architecture¶

Component Interaction¶

AlphaBot
  ├─> SkillManager (select and execute appropriate skills)
  │   ├─> CommandSkill (command generation, translation, analysis)
  │   ├─> DirectLLMSkill (direct LLM processing for translation, summaries, etc.)
  │   ├─> BrowserSkill (web automation with Playwright)
  │   ├─> PPTSkill (presentation generation)
  │   ├─> ImageSkill (image generation)
  │   ├─> WeChatSkill (WeChat automation for macOS)
  │   ├─> FeishuSkill (Feishu/Lark automation for macOS)
  │   ├─> Dynamic Skills (auto-generated from markdown with persistent storage)
  │   └─> Other skills...
  ├─> LLMClient (generate commands and analyze)
  ├─> MemoryBank (contextual memory for learning from previous steps)
  ├─> ShellExecutor (execute and validate)
  └─> Console (user interaction and display)
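
To make the skill-dispatch branch of the tree concrete, here is a minimal sketch of a registry that selects and executes a skill. It is an assumption for illustration only: the class below reuses the name `SkillManager` from the diagram, but its `register`, `select`, and `execute` methods are hypothetical, and keyword matching stands in for whatever selection logic the real component uses.

```python
from typing import Callable, List, Tuple

Skill = Callable[[str], str]


class SkillManager:
    """Hypothetical registry; keyword matching is a stand-in for real selection."""

    def __init__(self, fallback: Skill) -> None:
        self._skills: List[Tuple[str, Tuple[str, ...], Skill]] = []
        self._fallback = fallback

    def register(self, name: str, keywords: Tuple[str, ...], skill: Skill) -> None:
        """Add a skill along with the lowercase keywords that trigger it."""
        self._skills.append((name, keywords, skill))

    def select(self, task: str) -> Tuple[str, Skill]:
        """Return the first registered skill whose keyword appears in the task."""
        lowered = task.lower()
        for name, keywords, skill in self._skills:
            if any(keyword in lowered for keyword in keywords):
                return name, skill
        return "fallback", self._fallback

    def execute(self, task: str) -> str:
        """Select a skill for the task and run it."""
        _, skill = self.select(task)
        return skill(task)


manager = SkillManager(fallback=lambda task: f"llm:{task}")
manager.register("browser", ("open", "website"), lambda task: f"browser:{task}")
print(manager.execute("Open example.com"))
```

Registration order doubles as priority here, and the fallback plays the role of a direct LLM answer when no specialised skill matches.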

Context Management¶

The agent maintains conversation context, including:

- The user's original task
- Commands generated and executed
- Command outputs and results
- Error messages and failures
- Current working directory
- Execution history
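
One way to picture this context is as a small record that accumulates per-command results and renders them for the next LLM call. The sketch below is illustrative only: `TaskContext`, `CommandRecord`, and `to_prompt` are hypothetical names, and the prompt layout is an assumption rather than the format AlphaBot sends.

```python
import os
from dataclasses import dataclass, field
from typing import List


@dataclass
class CommandRecord:
    """One executed command with its captured output (hypothetical shape)."""
    command: str
    stdout: str = ""
    stderr: str = ""
    returncode: int = 0


@dataclass
class TaskContext:
    """Holds the original task, working directory, and execution history."""
    task: str
    cwd: str = field(default_factory=os.getcwd)
    history: List[CommandRecord] = field(default_factory=list)

    def record(self, rec: CommandRecord) -> None:
        """Append a finished command to the execution history."""
        self.history.append(rec)

    def errors(self) -> List[CommandRecord]:
        """Return only the commands that exited with a non-zero status."""
        return [r for r in self.history if r.returncode != 0]

    def to_prompt(self, last_n: int = 5) -> str:
        """Render the task and the most recent commands as plain text."""
        lines = [f"Task: {self.task}", f"Working directory: {self.cwd}"]
        for r in self.history[-last_n:]:
            status = "ok" if r.returncode == 0 else f"failed ({r.returncode})"
            lines.append(f"$ {r.command} -> {status}")
            if r.stderr:
                lines.append(f"  stderr: {r.stderr.strip()}")
        return "\n".join(lines)
```

Limiting the rendered history to the last few commands (`last_n`) is a design choice made here to keep prompts short; the source does not say how much history the agent forwards.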

Error Recovery¶

When command execution fails:

1. Capture error: exit code, stderr, stdout
2. Analyze failure: send the error to the LLM for analysis
3. Generate alternative: the LLM suggests a different approach
4. Retry: execute the alternative command
5. Repeat: up to the maximum number of retries (default: 3)
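
The recovery steps above can be sketched as a bounded retry loop. This is a minimal sketch under assumptions: `RunResult`, `run_shell`, and `run_with_recovery` are hypothetical names, and `suggest_alternative` is a placeholder for the LLM call that analyzes the failure and proposes a new command.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RunResult:
    """Captured outcome of one command (hypothetical shape)."""
    command: str
    returncode: int
    stdout: str
    stderr: str


def run_shell(command: str) -> RunResult:
    """Run a command through the shell and capture exit code, stdout, stderr."""
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    return RunResult(command, proc.returncode, proc.stdout, proc.stderr)


def run_with_recovery(
    command: str,
    suggest_alternative: Callable[[RunResult], Optional[str]],
    run: Callable[[str], RunResult] = run_shell,
    max_retries: int = 3,
) -> List[RunResult]:
    """Run `command`; after each failure, try a suggested alternative."""
    attempts = [run(command)]
    # One initial attempt plus at most `max_retries` alternatives.
    while attempts[-1].returncode != 0 and len(attempts) <= max_retries:
        alternative = suggest_alternative(attempts[-1])
        if alternative is None:  # the analyzer has nothing else to try
            break
        attempts.append(run(alternative))
    return attempts
```

Returning every attempt, rather than only the last one, keeps the failed commands and their stderr available for the context described earlier. The `run` parameter exists so the loop can be exercised with a fake executor instead of a real shell.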

Configuration¶

Environment Variables¶

- OPENAI_API_KEY: API key for the LLM
- OPENAI_API_BASE: Optional custom API endpoint
- MODEL_NAME: LLM model to use
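
As a hedged illustration of how these variables might be read and validated at startup, the sketch below loads them into a small settings object. `LLMSettings` and `load_llm_settings` are hypothetical names, not part of alpha_bot, and treating a missing `OPENAI_API_KEY` as an error is an assumption made here. No default model is filled in because the source does not name one.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Values read from the environment (hypothetical container)."""
    api_key: str
    api_base: Optional[str]    # None means the client's default endpoint
    model_name: Optional[str]  # None means no model was configured


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read the three documented variables, failing early if the key is absent."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return LLMSettings(
        api_key=api_key,
        api_base=env.get("OPENAI_API_BASE") or None,  # empty string counts as unset
        model_name=env.get("MODEL_NAME") or None,
    )
```

Accepting any mapping instead of reading `os.environ` directly makes the loader easy to test with a plain dictionary.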

Agent Parameters¶

from typing import Optional

class AlphaBot:
    def __init__(
        self,
        auto_execute: bool = False,
        working_dir: Optional[str] = None,
        direct_mode: bool = False
    ):
        ...

Parameters:

- auto_execute (bool): Skip user confirmations if True; defaults to False
- working_dir (Optional[str]): Working directory for command execution; defaults to None
- direct_mode (bool): Defaults to False

Example Usage¶

Basic Task Execution¶

from alpha_bot.agent import AlphaBot

# Create the agent with default settings
agent = AlphaBot()

# Execute task
agent.run("find all large files")

With Custom Configuration¶

import os
from alpha_bot.agent import AlphaBot

# Select the model before creating the agent
os.environ['MODEL_NAME'] = 'gpt-4'

# Agent that skips confirmations and runs in a custom working directory
agent = AlphaBot(
    auto_execute=True,
    working_dir='/path/to/project'
)

agent.run("run tests")

Interactive Session¶

from alpha_bot.agent import AlphaBot

# Start interactive mode
agent = AlphaBot()
agent.run_interactive()

Extending the Agent¶

Custom Agent Subclass¶

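from alpha_bot.agent import AlphaBot

class CustomAgent(AlphaBot):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Custom initialization goes here

    def before_execution(self, command: str) -> None:
        """Hook called before command execution"""
        # Append each command to a log file
        with open('command_log.txt', 'a', encoding='utf-8') as f:
            f.write(f"{command}\n")

    def after_execution(self, result: dict) -> None:
        """Hook called after command execution"""
        # Escalate any command that exited with a non-zero status
        if result.get('returncode') != 0:
            self.notify_admin(result)

    def notify_admin(self, result: dict) -> None:
        """Illustrative helper (not part of AlphaBot); swap in email or chat"""
        print(f"Command failed: {result}")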

Custom Retry Logic¶

from alpha_bot.agent import AlphaBot

class SmartAgent(AlphaBot):
    def should_retry(self, error: str, attempt: int) -> bool:
        """Custom retry decision logic"""
        message = error.lower()

        # Don't retry permission errors; repeating the command won't help
        if 'permission denied' in message:
            return False

        # Retry network errors up to 5 times
        if 'connection refused' in message:
            return attempt < 5

        # Default: AlphaBot has no max_retries property, so use the default of 3
        return attempt < 3

Best Practices¶

1. Context Management¶

AlphaBot tracks context across steps automatically, so state the task plainly and let earlier results inform command generation:

# Context is managed automatically by AlphaBot
agent.run("commit changes")

2. Error Handling¶

Handle agent exceptions gracefully:

try:
    agent.run(user_input)
except Exception as e:
    print(f"Task execution failed: {e}")
    # Handle the error as needed

3. Resource Cleanup¶

Ensure proper cleanup in long-running sessions:

try:
    agent.run_interactive()
finally:
    # Resource cleanup happens automatically
    pass

API Reference¶

AlphaBot Class¶

class AlphaBot:
    """Main task automation agent"""

    def run(self, task: str) -> None:
        """Execute a single task"""
        pass

    def run_interactive(self) -> None:
        """Start interactive mode"""
        pass