AI-Powered Browser Automation for Everyone
Natural language commands. No coding required. Works anywhere.
Developer Automation
Smarter than Playwright
Skip brittle CSS selectors and complex scripting. Use natural language to automate testing, scraping, and workflows. The agent re-reads the page before every step, so it adapts to layout changes instead of failing on a stale selector.
Example command:
"Go to GitHub, search for React hooks, and open the first result"
Elderly Assistant
Simplifying the web
Help seniors navigate complex websites, fill out forms, book appointments, and find information, all through simple, conversational commands.
Example command:
"Go to my pharmacy website and refill my prescription"
Use Cases
🔍 Research Automation
Automatically search multiple sources, navigate documentation, and gather information.
"Go to Wikipedia and search for quantum computing"
🛒 E-commerce Tasks
Search products, compare prices, and navigate shopping sites autonomously.
"Search for wireless headphones on Amazon"
📝 Form Filling
Intelligent form detection and filling with context understanding.
"Fill out the contact form with my details"
🔗 Multi-step Workflows
Chain complex actions across multiple pages and sites.
"Go to GitHub and search for React components"
Ready to Get Started?
Semantic Fetch Intelligence v1.0.0
Free to use • MIT Licensed • Commercial use allowed
Available on Millpond.ai • Chrome & Edge compatible • Requires Gemini API key
Quick Start Guide
Get the Extension
- Visit Millpond.ai and get fetch-extension-v1.0.0.zip
- Extract the ZIP file
- Open Chrome/Edge and navigate to chrome://extensions/
- Enable "Developer mode" (toggle in top right)
- Click "Load unpacked" and select the extracted folder
Configure API Key
- Get a free Gemini API key from ai.google.dev
- Click the extension icon in your browser toolbar
- Click the settings (⚙️) icon
- Paste your API key and save
Example API key format:
AIzaSyDxxxxxxxxxxxxxxxxxxxxxxxxxxx
Start Automating
- Open the extension side panel
- Type a natural language command
- Press Enter or click the send button
- Watch Fetch autonomously complete the task!
💡 Try these examples:
- → "Go to Wikipedia and search for artificial intelligence"
- → "Search for wireless mouse on amazon.com"
- → "Navigate to reddit.com and search for machine learning"
Common Issues
Extension won't load?
Make sure you extracted the ZIP and selected the dist folder when loading.
API quota exceeded?
The free tier has daily limits. Wait 24 hours or upgrade your API plan at ai.google.dev/pricing.
Agent not starting?
The extension requires a regular webpage to be active. It won't work on chrome:// or edge:// pages. Try opening google.com first.
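The restriction on chrome:// and edge:// pages can be checked before the agent starts. The helper below is a minimal sketch, not the extension's actual code: the name isAutomatableUrl is hypothetical, and it assumes that only http and https pages count as regular webpages.

```typescript
// Hypothetical helper: decide whether the agent can run on a tab's URL.
// Assumption: only http(s) pages are treated as automatable.
function isAutomatableUrl(url: string | undefined): boolean {
  if (!url) return false;
  try {
    const { protocol } = new URL(url);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // the string is not a parseable URL
  }
}
```

A caller could use the result to decide between running on the current tab and opening a new one.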
Technical Specifications
Core Technology
- Extension Type: Chrome Extension MV3
- Framework: React 19 + TypeScript
- AI Model: Gemini 2.5 Flash
- Styling: Tailwind CSS
- Build Tool: Vite
Required Permissions
- ✓ sidePanel: Display extension UI in browser side panel
- ✓ activeTab: Access current tab for automation
- ✓ scripting: Inject scripts to interact with pages
- ✓ storage: Store API key and settings
- ✓ tabs: Create and navigate tabs
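For orientation, the permission list above corresponds to a handful of Manifest V3 fields. The object below is only a sketch written as a TypeScript constant: the extension name and the side panel file name are assumptions, and the real manifest.json in the package may differ.

```typescript
// Sketch of the manifest fields implied by the permission list.
// The name and the side panel file name are assumptions, not taken from the package.
const manifestSketch = {
  manifest_version: 3,
  name: 'Fetch', // assumed display name
  version: '1.0.0',
  permissions: ['sidePanel', 'activeTab', 'scripting', 'storage', 'tabs'],
  side_panel: { default_path: 'index.html' }, // assumed file name
  background: { service_worker: 'background.js' }, // module listed under Core Modules
} as const;
```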
Key Features
- → Semantic DOM analysis
- → Autonomous agent loop
- → Natural language commands
- → Multi-step task execution
- → Error recovery & retry logic
- → Real-time status updates
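The page does not describe how the retry logic is implemented. One common approach is retrying a failed step with exponential backoff, sketched below. The function name withRetry, the attempt count, and the delays are illustrative assumptions.

```typescript
// Hypothetical retry helper; attempt count and delays are illustrative defaults.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Exponential backoff: the wait doubles after every failed attempt.
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A call such as withRetry(() => service.getNextAction(dom, goal, history)) would wrap a single model request, where service stands in for whatever object exposes that method.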
Requirements
- Browser: Chrome/Edge (MV3 support)
- API Key: Free Gemini API key from Google
- Network: Internet connection for AI inference
Version 1.0.0
Release Date: January 2026
License: MIT (Free to use)
Package Size: 113 KB
Availability
Framework: Cormorant Foraging v1.0
System Architecture
Autonomous Agent Loop
           ┌─────────────────┐
           │   User Input    │
           └────────┬────────┘
                    │
                    ▼
           ┌─────────────────┐
           │   Start Agent   │
           └────────┬────────┘
                    │
                    ▼
             ┌────────────┐
             │ Valid Tab? │───── No ────▶ ┌──────────────────┐
             └──────┬─────┘               │  Create New Tab  │
                    │                     └────────┬─────────┘
                   Yes                             │
                    │                              │
                    ├──────────────────────────────┘
                    │
                    ▼
           ┌──────────────────┐
           │   Analyze DOM    │◀────────────┐
           └────────┬─────────┘             │
                    │                       │
                    ▼                       │
           ┌──────────────────┐             │
           │     Semantic     │             │
           │    Anchoring     │             │
           └────────┬─────────┘             │
                    │                       │
                    ▼                       │
           ┌──────────────────┐             │
           │    Gemini AI     │             │
           │     Thinking     │             │
           └────────┬─────────┘             │
                    │                       │
                    ▼                       │
               ┌─────────┐                  │
               │Decision?│                  │
               └────┬────┘                  │
                    │                       │
      ┌─────────────┼─────────────┐         │
      │             │             │         │
  Navigate        Click         Type        │
      │             │             │         │
      ▼             │             │         │
 ┌──────────┐       │             │         │
 │Update URL│       │             │         │
 └────┬─────┘       │             │         │
      │             │             │         │
      ▼             ▼             ▼         │
 ┌──────────┐  ┌──────────┐  ┌──────────┐   │
 │Wait Load │  │ Execute  │  │ Execute  │   │
 └────┬─────┘  │  Click   │  │   Type   │   │
      │        └────┬─────┘  └────┬─────┘   │
      └─────────────┴─────────────┴─────────┘

   Decision? also yields:
   Submit ──▶ Execute Submit ──▶ back to Analyze DOM
   Finish ──▶ Task Complete
                    │
                    ▼
           ┌─────────────────┐
           │ Display Results │
           └─────────────────┘
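The loop in the diagram can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in App.tsx: runAgentLoop, AgentDeps, and the maxSteps safety limit are hypothetical, and the three stages are injected as functions so the sketch does not depend on Chrome APIs.

```typescript
type Action = 'navigate' | 'click' | 'type' | 'submit' | 'finish';

interface Step {
  thought_process: string;
  action: Action;
  selector?: string;
  value?: string;
  url?: string;
}

// The three stages of the diagram, injected so they can be swapped or mocked.
interface AgentDeps {
  analyzeDom: () => Promise<string>;
  think: (dom: string, goal: string, history: string[]) => Promise<Step>;
  execute: (step: Step) => Promise<string>;
}

// Runs analyze -> think -> execute until the model answers "finish"
// or the (assumed) step limit is reached. Returns a log of what happened.
async function runAgentLoop(
  goal: string,
  deps: AgentDeps,
  maxSteps = 15,
): Promise<string[]> {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const dom = await deps.analyzeDom();
    const step = await deps.think(dom, goal, history);
    if (step.action === 'finish') {
      history.push('finish');
      return history;
    }
    const outcome = await deps.execute(step);
    history.push(`${step.action}: ${outcome}`);
  }
  history.push('stopped: step limit reached');
  return history;
}
```

Passing the accumulated history back into think on every iteration matches the HISTORY field the model receives, and the step limit keeps a confused model from looping forever.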
Component Architecture
 ┌──────────────────┐
 │  Side Panel UI   │
 └────────┬─────────┘
          │
          ▼
     ┌─────────┐
     │ App.tsx │
     └────┬────┘
          │
          ├────────────────┬───────────────┬───────────────────┐
          │                │               │                   │
          ▼                ▼               ▼                   ▼
  ┌────────────────┐  ┌──────────┐  ┌──────────────┐  ┌──────────────────┐
  │ GeminiService  │  │   DOM    │  │ ChatMessage  │  │   Settings UI    │
  │                │  │ Observer │  │     List     │  │                  │
  └───────┬────────┘  └────┬─────┘  └──────────────┘  └────────┬─────────┘
          │                │                                   │
          ▼                ▼                                   ▼
  ┌────────────────┐  ┌──────────┐                    ┌──────────────────┐
  │   Gemini API   │  │ Content  │                    │   Storage API    │
  │                │  │  Script  │                    │                  │
  └────────────────┘  └────┬─────┘                    └──────────────────┘
                           │
                           ▼
                      ┌──────────┐
                      │ Web Page │
                      └──────────┘

 ┌──────────────────┐
 │    Background    │
 │  Service Worker  │
 └────────┬─────────┘
          │
          ▼
 ┌──────────────────┐
 │  Tab Management  │
 └──────────────────┘
Core Modules
App.tsx
Main React component managing agent state, UI, and orchestration of the autonomous loop.
GeminiService.ts
AI inference layer communicating with Gemini API for decision-making and action planning.
domObserver.ts
Semantic DOM extraction using anchoring techniques to identify actionable elements.
background.js
Service worker handling tab lifecycle, navigation events, and extension lifecycle.
Code Examples
Basic Agent Invocation
// User types in natural language
const userInput = "Go to Wikipedia and search for artificial intelligence";
// Agent starts autonomous loop
startAgent(userInput);
// Agent will:
// 1. Navigate to wikipedia.org
// 2. Find search input
// 3. Type the query
// 4. Submit the form
// 5. Report completion
Semantic DOM Extraction
// domObserver.ts - Semantic anchoring
export function getSemanticDOM(): string {
  const elements = document.querySelectorAll('a, button, input, textarea, select');
  const anchors = Array.from(elements).map((el, idx) => {
    const tag = el.tagName.toLowerCase();
    const text = el.textContent?.trim().slice(0, 50);
    const id = el.id;
    const name = el.getAttribute('name');
    const type = el.getAttribute('type');
    return {
      index: idx,
      tag,
      text,
      id,
      name,
      type,
      selector: generateSelector(el)
    };
  });
  return JSON.stringify(anchors, null, 2);
}
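The snippet above calls generateSelector, which is not shown on this page. The version below is one plausible sketch, not the extension's implementation: it prefers an id, then a name attribute, then a positional nth-child path. It is written against a small structural interface (ElementLike, a name introduced here) that real DOM elements satisfy, so it can also run outside a browser.

```typescript
// Minimal shape needed from a DOM element; browser Element objects match it.
interface ElementLike {
  id: string;
  tagName: string;
  parentElement: ElementLike | null;
  children: ArrayLike<ElementLike>;
  getAttribute(name: string): string | null;
}

// Hypothetical selector builder: id first, then name, then a positional path.
function generateSelector(el: ElementLike): string {
  if (el.id) return `#${el.id}`;
  const tag = el.tagName.toLowerCase();
  const name = el.getAttribute('name');
  if (name) return `${tag}[name="${name}"]`;

  // Walk up the tree, recording each node's position among its siblings.
  const parts: string[] = [];
  let node: ElementLike = el;
  while (node.parentElement) {
    const parent: ElementLike = node.parentElement;
    const position = Array.from(parent.children).indexOf(node) + 1;
    parts.unshift(`${node.tagName.toLowerCase()}:nth-child(${position})`);
    if (parent.id) {
      // An ancestor with an id anchors the path, so stop climbing here.
      parts.unshift(`#${parent.id}`);
      return parts.join(' > ');
    }
    node = parent;
  }
  parts.unshift(node.tagName.toLowerCase()); // reached the root element
  return parts.join(' > ');
}
```

Two caveats apply to this sketch: a name attribute is not guaranteed to be unique on a page (radio buttons share one), and ids or names containing special characters would need escaping, for example with CSS.escape in the browser.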
AI Decision Making
// GeminiService.ts - Get next action from AI
async getNextAction(dom: string, goal: string, history: string): Promise<AgentAction> {
  const response = await this.ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: `
      GOAL: ${goal}
      HISTORY: ${history}
      CURRENT DOM: ${dom}
      What is the next step?
    `,
    config: {
      systemInstruction: SYSTEM_PROMPT,
      // Ask for JSON that follows the schema below instead of free text
      responseMimeType: "application/json",
      responseSchema: {
        type: Type.OBJECT,
        properties: {
          thought_process: { type: Type.STRING },
          action: { type: Type.STRING },
          selector: { type: Type.STRING },
          value: { type: Type.STRING },
          url: { type: Type.STRING }
        }
      }
    }
  });
  // response.text can be empty, for example when the request was blocked
  if (!response.text) {
    throw new Error('Gemini returned an empty response');
  }
  return JSON.parse(response.text) as AgentAction;
}
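Even with a response schema, it is prudent to check the parsed object before acting on it. The function below is a hedged sketch of such a check. parseAgentAction and ParsedAction are names introduced here (ParsedAction mirrors the AgentAction interface from types.ts), and treating null as "not provided" is an assumption based on the schema examples on this page.

```typescript
const ACTIONS = ['navigate', 'click', 'type', 'submit', 'finish'] as const;
type ActionName = (typeof ACTIONS)[number];

interface ParsedAction {
  thought_process: string;
  action: ActionName;
  selector?: string;
  value?: string;
  url?: string;
}

// Parse the model's JSON text and reject anything that is not a known action.
function parseAgentAction(raw: string): ParsedAction {
  const data: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (typeof data !== 'object' || data === null) {
    throw new Error('Model response is not a JSON object');
  }
  const record = data as Record<string, unknown>;
  const action = record.action;
  if (typeof action !== 'string' || !(ACTIONS as readonly string[]).includes(action)) {
    throw new Error(`Unsupported action: ${String(action)}`);
  }
  // Treat null and non-string values as "not provided".
  const optional = (key: string): string | undefined => {
    const v = record[key];
    return typeof v === 'string' ? v : undefined;
  };
  return {
    thought_process: optional('thought_process') ?? '',
    action: action as ActionName,
    selector: optional('selector'),
    value: optional('value'),
    url: optional('url'),
  };
}
```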
Executing Actions on Page
// domObserver.ts - Execute action in page context
export function executeAction(
  action: string,
  selector: string,
  value: string
): { success: boolean; message: string } {
  try {
    const element = document.querySelector(selector);
    if (!element) {
      return { success: false, message: `Element not found: ${selector}` };
    }
    switch (action) {
      case 'click':
        (element as HTMLElement).click();
        return { success: true, message: 'Clicked element' };
      case 'type':
        if (element instanceof HTMLInputElement || element instanceof HTMLTextAreaElement) {
          element.value = value;
          // Fire 'input' so page scripts listening for changes see the new value
          element.dispatchEvent(new Event('input', { bubbles: true }));
          return { success: true, message: `Typed "${value}"` };
        }
        // The selector matched something that cannot receive text
        return { success: false, message: `Cannot type into <${element.tagName.toLowerCase()}>` };
      case 'submit':
        if (element instanceof HTMLFormElement) {
          element.submit();
        } else {
          // Not a form: treat the element as a submit button
          (element as HTMLElement).click();
        }
        return { success: true, message: 'Submitted form' };
      default:
        return { success: false, message: `Unknown action: ${action}` };
    }
  } catch (error) {
    // In strict TypeScript the caught value is `unknown`, so narrow it first
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, message };
  }
}
Agent Action Schema
{
  "thought_process": "The user wants to search Wikipedia. I need to navigate to wikipedia.org first.",
  "action": "navigate",
  "url": "https://www.wikipedia.org",
  "selector": null,
  "value": null
}
// After navigation completes...
{
  "thought_process": "I'm on Wikipedia homepage. I can see a search input with id 'searchInput'. I'll type the query.",
  "action": "type",
  "selector": "#searchInput",
  "value": "artificial intelligence",
  "url": null
}
// After typing...
{
  "thought_process": "Query typed. Now I'll submit the search form.",
  "action": "submit",
  "selector": "form[role='search']",
  "value": null,
  "url": null
}
// After results load...
{
  "thought_process": "Search completed successfully. Results are displayed.",
  "action": "finish",
  "selector": null,
  "value": null,
  "url": null
}
API Reference
Agent Actions
navigate
Navigate to a specified URL
{ "action": "navigate", "url": "https://example.com" }
click
Click an element by selector
{ "action": "click", "selector": "button.submit" }
type
Type text into an input element
{ "action": "type", "selector": "#search", "value": "query text" }
submit
Submit a form by selector
{ "action": "submit", "selector": "form#login" }
finish
Mark task as complete
{ "action": "finish", "thought_process": "Task completed successfully" }
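Each action above needs a different set of fields. The table below captures that as data so an incomplete action can be caught before it is executed. It is a sketch: REQUIRED_FIELDS, ActionLike, and missingFields are names introduced here, and the required sets are read off the examples in this section rather than from the extension's source.

```typescript
type ActionName = 'navigate' | 'click' | 'type' | 'submit' | 'finish';
type FieldName = 'selector' | 'value' | 'url';

// Required fields per action, inferred from the examples in this section.
const REQUIRED_FIELDS: Record<ActionName, ReadonlyArray<FieldName>> = {
  navigate: ['url'],
  click: ['selector'],
  type: ['selector', 'value'],
  submit: ['selector'],
  finish: [],
};

interface ActionLike {
  action: ActionName;
  selector?: string | null;
  value?: string | null;
  url?: string | null;
}

// Returns the names of required fields that are absent (undefined or null).
function missingFields(a: ActionLike): string[] {
  return REQUIRED_FIELDS[a.action].filter(
    (field) => a[field] === undefined || a[field] === null,
  );
}
```

An empty result means the action is complete; anything else can be reported back to the model as an error in the next turn.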
Agent Status States
IDLE
Agent waiting for user input
ANALYZING_DOM
Extracting page structure
THINKING
AI deciding next action
EXECUTING
Performing action on page
WAITING_FOR_NAV
Waiting for page navigation
FINISHED
Task completed successfully
ERROR
Error occurred, agent stopped
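The states above form a small state machine. The transition table below is an assumption inferred from the agent loop diagram, not something this page specifies. In particular, the return to IDLE after FINISHED or ERROR is a guess.

```typescript
type Status =
  | 'IDLE'
  | 'ANALYZING_DOM'
  | 'THINKING'
  | 'EXECUTING'
  | 'WAITING_FOR_NAV'
  | 'FINISHED'
  | 'ERROR';

// Assumed transitions; ERROR is reachable from every active state.
const NEXT: Record<Status, Status[]> = {
  IDLE: ['ANALYZING_DOM'],
  ANALYZING_DOM: ['THINKING', 'ERROR'],
  THINKING: ['EXECUTING', 'FINISHED', 'ERROR'],
  EXECUTING: ['WAITING_FOR_NAV', 'ANALYZING_DOM', 'ERROR'],
  WAITING_FOR_NAV: ['ANALYZING_DOM', 'ERROR'],
  FINISHED: ['IDLE'],
  ERROR: ['IDLE'],
};

function canTransition(from: Status, to: Status): boolean {
  return NEXT[from].includes(to);
}
```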
TypeScript Interfaces
// types.ts
export enum AgentStatus {
  IDLE = 'Idle',
  ANALYZING_DOM = 'Analyzing page...',
  THINKING = 'Thinking...',
  EXECUTING = 'Executing action...',
  WAITING_FOR_NAV = 'Navigating...',
  FINISHED = 'Complete',
  ERROR = 'Error'
}
export interface AgentAction {
  thought_process: string;
  action: 'navigate' | 'click' | 'type' | 'submit' | 'finish';
  // The schema examples send null for fields an action does not use
  selector?: string | null;
  value?: string | null;
  url?: string | null;
}
export interface ChatMessage {
  role: 'user' | 'model' | 'system';
  content: string;
  timestamp: number;
  type: 'text' | 'action_log';
  metadata?: AgentAction;
}
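As an illustration of how these interfaces fit together, the helper below turns an executed action into an action_log chat message. It is a sketch with assumptions: toActionLog is a hypothetical name, the 'model' role for log entries is a guess, and the two interfaces are repeated here (with null accepted, as in the schema examples) only so the snippet stands alone.

```typescript
interface AgentAction {
  thought_process: string;
  action: 'navigate' | 'click' | 'type' | 'submit' | 'finish';
  selector?: string | null;
  value?: string | null;
  url?: string | null;
}

interface ChatMessage {
  role: 'user' | 'model' | 'system';
  content: string;
  timestamp: number;
  type: 'text' | 'action_log';
  metadata?: AgentAction;
}

// Hypothetical helper: build a log entry describing one executed action.
function toActionLog(action: AgentAction, now: number = Date.now()): ChatMessage {
  // Show the URL for navigation, otherwise the selector, otherwise nothing.
  const target = action.url ?? action.selector ?? '';
  return {
    role: 'model', // assumed role for agent-generated log entries
    content: target ? `${action.action} → ${target}` : action.action,
    timestamp: now,
    type: 'action_log',
    metadata: action,
  };
}
```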