LocalMind

A private AI research agent running Gemma entirely in your browser via WebGPU. Tool calling, persistent memory, and web search — all on-device.

Only your search queries touch the network (and only when you choose to). All reasoning stays on your device.

Models are cached after first download — future visits load instantly.

Requirements: Chrome 113+, Edge 113+, or Firefox 130+ with WebGPU.

Available Models

Gemma 3 1B (~760 MB) — fast, text-only chat
Gemma 4 E2B (~1.5 GB) — multimodal + agent
Gemma 4 E4B (~4.9 GB) — multimodal + agent, best quality

Gemma 4 Agent Tools

calculate — math, percentages, conversions
get_current_time — date/time with timezone
store_memory — save facts to persistent memory
search_memory — recall from stored memories
web_search — search the web (requires API key)
fetch_page — read a web page’s content
set_reminder — browser notification after N minutes
list_memories — show what’s stored in memory
delete_memory — forget specific memories

Translation works directly — Gemma 4 speaks 140+ languages natively, no tool needed.

Gemma 3 1B works as a simple chatbot — no agent tools.

Document Upload

Text — .txt, .md, .json, .csv
PDF — extracted via PDF.js, auto-summarized
DOCX — extracted via mammoth.js, auto-summarized

Documents are chunked, embedded, and stored as searchable knowledge. A summary is generated on upload.

Multimodal (Gemma 4)

📎 Attach — images, audio, MP4 video, or documents
📷 Camera / 🎤 Mic / Paste / Drag & drop

Document Upload

Text — .txt, .md, .json, .csv
PDF — extracted via PDF.js, auto-summarized
DOCX — extracted via mammoth.js, auto-summarized
Folder — open a local folder to ingest all .md/.txt/.pdf files at once; re-open to sync only changed files (incremental)

Conversations

New Chat — archives to History + starts fresh
Clear — deletes without saving
History — sidebar, click to resume any past chat
Share — generate an encrypted or plain link to share any conversation; recipient opens the URL to load it

Memory browser

Category pills — filter by fact / preference / finding / document / conversation with live counts
Source grouping — document chunks grouped by file; bulk “Delete all” per source
Audit — flags stale (>60 days), near-duplicates (cosine sim ≥0.92), and outliers (low avg similarity to category); bulk or per-item delete; auto-reruns after each deletion

Output & Export

Save as MD — download any response as Markdown (or write directly to open folder if one is active)
Code download — hover code blocks for download button
Export / Import — in Memory panel, full data as JSON
Auto-backup — toggle in Settings to download on New Chat

Batch Prompts

Enter one prompt per line in the Batch panel — they run sequentially through the full agent loop
{{previous}} — explicit placeholder substituted with the previous response text
Auto-inject — checkbox (on by default) appends the previous response as context even without a placeholder; disabled for any prompt that already contains {{previous}}
Stop — halts after the current generation finishes; progress shown live

Other

Web Search — Settings → provider + API key → 🌐 button
Thinking Mode — see chain-of-thought (collapses when done)
Cache management — view/clear cached models in Settings
Response badges — On-device / Agent / Web-enriched

Math & Conversions

What is 15% of 2450? Convert 72 Fahrenheit to Celsius Compound interest: $10K at 7% for 5 years?

Time & Reminders

What time is it in Tokyo? Remind me in 5 minutes to check the oven

Memory

Remember: I'm a software engineer on Dashboard Pro What do you know about me? Forget my preferences

Translation

Translate "Good morning" to Japanese, French, Hindi Train station directions in Spanish & German

Writing & Analysis

Write a polite meeting decline email Microservices vs monolith: pros and cons Explain WebGPU in 3 simple sentences

Documents (attach a PDF, DOCX, or text file)

Summarize the uploaded document Main conclusions from my document

Multimodal (attach an image first)

Describe this image in detail Transcribe text from this image

Web Research (requires API key)

Top tech news today Latest WebGPU browser support status AI in the browser: recent articles with sources

Coding

Sieve of Eratosthenes in Python async/await vs Promises explained

Click any prompt to paste it. Web search needs a provider in Settings. Multimodal needs a Gemma 4 model + an attached image.

Initializing...

Drop image, audio, or video here

🧠

Private AI, right in your browser

Gemma runs entirely on your device. Nothing is sent to any server.

Switch to a Gemma 4 model for tool calling, memory, and multimodal input.
Web search requires an API key (Tavily, Brave) or a self-hosted SearXNG instance.

Preparing to download model...

System prompt (optional) Show reasoning (thinking mode)

Web search provider

API key

SearXNG instance URL

Auto-download backup on New Chat

Model cache

Memory 0 memories

Batch Prompts 0 prompts

Auto-inject {{previous}}

History 0 conversations

LocalMind

Available Models

Gemma 4 Agent Tools

Document Upload

Multimodal (Gemma 4)

Document Upload

Conversations

Memory browser

Output & Export

Batch Prompts

Other

Math & Conversions

Time & Reminders

Memory

Translation

Writing & Analysis

Documents (attach a PDF, DOCX, or text file)

Multimodal (attach an image first)

Web Research (requires API key)

Coding

Private AI, right in your browser

Share Conversation