happycapy-agent (niveshdandyan/happycapy-browser-agent) | MoltPulse


happycapy-agent

Browser-Use Skill for HappyCapy.ai

niveshdandyan/happycapy-browser-agent

MoltPulse score: 26, based on repository activity, growth velocity and community engagement.

Growth 2/30 · Activity 5/25 · Popularity 5/25 · Trust 14/20

Stars: 19 · Sentiment: High · Votes: 19

README.md

HappyCapy Browser Agent

Full-stack AI browser automation system with multi-model LLM strategies, real-time dashboard, and live VNC streaming. Built on browser-use 0.11.9, FastAPI, and Playwright.

Give the agent a task in plain English -- it controls a real Chromium browser to complete it. Watch every step live through VNC streaming or periodic screenshots, with full agent reasoning visible in real time.

Dashboard Overview

Features at a Glance

- 6 LLM models from 4 providers (Anthropic, OpenAI, Google, Moonshot) -- selectable from the dashboard
- 5 multi-model strategies for reliability, quality, and resilience
- Council Planning -- multiple models generate plans independently, then vote on the best one
- Live VNC streaming -- watch the browser in real time via embedded noVNC
- Plan tab with step-by-step progress indicators (done/current/pending)
- Compact activity log with collapsible reasoning details
- Capybara splash screen shown when idle, completed, failed, or stopped
- Real-time WebSocket updates for all agent events
- Single-file dashboard -- no build step, no dependencies, just HTML
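The Council Planning item above can be pictured as a plurality vote over independently proposed plans. The sketch below is only an illustration under stated assumptions: `pick_plan`, the ballot format, and the tie-break rule are hypothetical, and this README does not describe how the agent actually collects or counts votes.

```python
from collections import Counter


def pick_plan(plans: dict[str, str], ballots: dict[str, str]) -> str:
    """Return the plan whose author received the most votes.

    plans   maps a model name to the plan text it proposed.
    ballots maps a voting model to the model whose plan it prefers.
    Ties go to the plan proposed first (a hypothetical tie-break rule).
    """
    if not plans:
        raise ValueError("no plans to choose from")
    # Ignore votes for models that never submitted a plan.
    tally = Counter(choice for choice in ballots.values() if choice in plans)
    # max() keeps the first maximal item, so insertion order breaks ties.
    winner = max(plans, key=lambda model: tally[model])
    return plans[winner]


plans = {
    "anthropic/claude-sonnet-4.5": "Open Wikipedia, search for capybara",
    "google/gemini-2.5-flash": "Search the web for capybara facts",
    "openai/gpt-4o": "Go straight to the Capybara article",
}
ballots = {
    "anthropic/claude-sonnet-4.5": "openai/gpt-4o",
    "google/gemini-2.5-flash": "openai/gpt-4o",
    "openai/gpt-4o": "anthropic/claude-sonnet-4.5",
}
print(pick_plan(plans, ballots))
```

Letting insertion order break ties keeps the result deterministic, which makes a run easier to reproduce.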

Multi-Model Strategies

Click the gear icon in the top-right corner to open the model configuration bar.

Model Configuration

Five strategies let you combine multiple LLM models for different goals:

| Strategy | How It Works | When to Use |
|----------|-------------|-------------|
| Single | One model handles all steps | Simple tasks, cost-sensitive |
| Fallback Chain | Primary runs; auto-switches to secondary on error/rate-limit | Reliability |
| `planner_executor` | Strong model plans first; fast model executes browser steps | Complex multi-step tasks |
| | Primary acts; judge model validates every step + final verdict | Quality-critical tasks |
| `council` | Primary runs; on failure/loop/stall, all council models convene to diagnose, advise, and replan | Hard tasks, anti-stall |
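As a rough illustration of the Fallback Chain row, the sketch below tries models in order and moves on when one raises. It is a minimal sketch, not the project's implementation: `ModelError`, `run_with_fallback`, and the `call` callback are hypothetical names, and the real agent may decide differently which errors trigger a switch.

```python
from typing import Callable, Sequence


class ModelError(RuntimeError):
    """Stand-in for a provider error or rate limit (hypothetical)."""


def run_with_fallback(
    models: Sequence[str],
    call: Callable[[str, str], str],
    prompt: str,
) -> tuple[str, str]:
    """Try each model in order and return (model_used, reply)."""
    last_error = None
    for model in models:
        try:
            return model, call(model, prompt)
        except ModelError as exc:
            # Remember the failure and fall through to the next model.
            last_error = exc
    raise ModelError(f"all models failed: {last_error}")


def fake_call(model: str, prompt: str) -> str:
    """Simulated provider: the primary model is rate-limited."""
    if model == "openai/gpt-4o":
        raise ModelError("rate limited")
    return f"{model} handled: {prompt}"


print(run_with_fallback(
    ["openai/gpt-4o", "google/gemini-2.5-flash"], fake_call, "click Search"
))
```

Catching only one dedicated exception type means programming errors still surface immediately instead of silently triggering a model switch.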

                                 +-------------------+
 [Browser Dashboard]  <---WS---> |  FastAPI Server   | <--controls--> [browser-use Agent]
   (single HTML SPA)             |  agent_server.py  |                       |
                                 +-------------------+                       v
 [Browser Dashboard]  <-noVNC->  [x11vnc :5999] <--- [Xvfb :99] <--- [Chromium]

Xvfb :99 (1280x900x24)  -->  x11vnc :5999  -->  noVNC/websockify :6080
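The display pipeline above could be assembled from three commands. The sketch below only builds and prints them. The display number, geometry, and ports come from the line above, while the flag choices and the `/usr/share/novnc` path are assumptions, so the commands that `start.sh` actually runs may differ.

```python
import shlex


def display_pipeline(
    display: str = ":99",
    size: str = "1280x900x24",
    vnc_port: int = 5999,
    web_port: int = 6080,
    novnc_dir: str = "/usr/share/novnc",  # assumed install location
) -> list[list[str]]:
    """Return the Xvfb, x11vnc, and websockify commands in start order."""
    return [
        # Virtual framebuffer that Chromium renders into.
        ["Xvfb", display, "-screen", "0", size],
        # VNC server that exports the virtual display.
        ["x11vnc", "-display", display, "-rfbport", str(vnc_port),
         "-forever", "-shared", "-nopw"],
        # WebSocket bridge that also serves the noVNC web client.
        ["websockify", "--web", novnc_dir, str(web_port),
         f"localhost:{vnc_port}"],
    ]


for command in display_pipeline():
    print(shlex.join(command))
```

`-nopw` runs x11vnc without a password, which is only reasonable on a trusted host or behind another access control layer.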

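Install everything with the one-click setup script:

bash setup.sh

Set the API key for your OpenAI-compatible gateway:

export AI_GATEWAY_API_KEY="your-openai-compatible-api-key"

Start the agent with the startup script, which includes checks:

./start.sh

To run the server directly instead, start it against the Xvfb display:

DISPLAY=:99 /home/node/browser-agent-venv/bin/python3 agent_server.py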

Start a task over the REST API, here with the planner/executor strategy:

curl -X POST http://localhost:8888/api/agent/start \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Go to google.com and search for AI news",
    "max_steps": 50,
    "model_config_data": {
      "strategy": "planner_executor",
      "primary_model": "openai/gpt-4o",
      "secondary_model": "google/gemini-2.5-pro"
    }
  }'

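The same endpoint accepts council planning, with the participating models listed in council_members:

curl -X POST http://localhost:8888/api/agent/start \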
  -H "Content-Type: application/json" \
  -d '{
    "task": "Search for capybara facts on Wikipedia",
    "max_steps": 10,
    "model_config_data": {
      "strategy": "planner_executor",
      "primary_model": "openai/gpt-4o",
      "secondary_model": "google/gemini-2.5-flash",
      "use_council_planning": true,
      "council_members": [
        "anthropic/claude-sonnet-4.5",
        "google/gemini-2.5-flash",
        "openai/gpt-4o"
      ]
    }
  }'

A `start_task` message that selects the council strategy:

{
  "type": "start_task",
  "task": "Search for AI news on Hacker News",
  "max_steps": 50,
  "model_config": {
    "strategy": "council",
    "primary_model": "openai/gpt-4o",
    "council_members": ["anthropic/claude-sonnet-4.5", "google/gemini-2.5-flash"]
  }
}
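A message of this shape can be produced with a small helper. This is a sketch under assumptions: `start_task_message` is a hypothetical name, and the WebSocket URL it would be sent to is not given in this README, so the code stops at serialization. Note that the message carries its model settings under `model_config`, whereas the REST examples use `model_config_data`.

```python
import json
from typing import Sequence


def start_task_message(
    task: str,
    strategy: str,
    primary_model: str,
    council_members: Sequence[str] = (),
    max_steps: int = 50,
) -> str:
    """Serialize a start_task message in the shape shown above."""
    model_config = {"strategy": strategy, "primary_model": primary_model}
    if council_members:
        model_config["council_members"] = list(council_members)
    return json.dumps({
        "type": "start_task",
        "task": task,
        "max_steps": max_steps,
        "model_config": model_config,  # REST uses "model_config_data" instead
    })


message = start_task_message(
    task="Search for AI news on Hacker News",
    strategy="council",
    primary_model="openai/gpt-4o",
    council_members=["anthropic/claude-sonnet-4.5", "google/gemini-2.5-flash"],
)
print(message)
```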

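Install Chromium's system dependencies through Playwright:

/home/node/browser-agent-venv/bin/python3 -m playwright install-deps chromium

Confirm that the Playwright-installed Chromium runs on display :99:

DISPLAY=:99 /home/node/.cache/ms-playwright/chromium-*/chrome-linux64/chrome --version

Remove a stale lock for display :99 and stop any leftover Xvfb process:

rm -f /tmp/.X99-lock
pkill -9 Xvfb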

Choose a different port through the AGENT_PORT variable:

export AGENT_PORT=9222  # or any free port
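To find "any free port" without guessing, one option is to let the operating system pick one. The helper below is a generic sketch and not part of this repository. `find_free_port` is a hypothetical name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # The OS has now assigned a concrete port number to the socket.
        return sock.getsockname()[1]


print(find_free_port())
```

The port is released when the function returns, so another process could claim it before the server binds. For a local development setup that race is usually acceptable. Port 9222 is also the conventional Chrome remote-debugging port, so a different value may avoid a clash if Chromium is started with remote debugging enabled.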

Install noVNC for the live browser view:

sudo apt-get install -y novnc

.
├── agent_server.py         # FastAPI backend (also at scripts/agent_server.py)
├── dashboard.html          # Single-file SPA dashboard (also at scripts/dashboard.html)
├── setup.sh                # One-click installation
├── start.sh                # Startup script with checks
├── assets/
│   ├── 01-dashboard-idle.png
│   ├── 02-model-config.png
│   ├── 03-planner-executor-config.png
│   ├── 04-council-planning-config.png
│   ├── 05-task-running.png
│   ├── 06-plan-tab.png
│   ├── 07-activity-log.png
│   ├── 08-task-status.png
│   ├── 09-log-details-expanded.png
│   └── 10-council-strategy-config.png
├── scripts/                # Canonical source copies
│   ├── agent_server.py
│   ├── dashboard.html
│   ├── setup.sh
│   └── start.sh
├── SKILL.md                # Claude Code skill definition
└── README.md