Computer Use
An AI capability that lets the model control a computer — moving the cursor, clicking buttons, typing, and reading the screen — to complete tasks autonomously.
Computer Use is an agentic capability where an AI model interacts with a computer interface the same way a human would: by looking at the screen, moving the mouse, clicking, and typing. Rather than calling a structured API, the model observes pixel-level screen state and decides what actions to take.
Anthropic introduced Computer Use for Claude in late 2024. Key characteristics:
- The model receives screenshots of the current screen state
- It decides which action to take (click coordinates, keystroke, scroll)
- Actions are executed in a sandboxed VM environment
- The loop continues until the task is complete or the model asks for input
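The observe-decide-act loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Anthropic's API: `Action`, `FakeSandbox`, `run_loop`, and `scripted_policy` are hypothetical names, the "model" is a hard-coded script, and the sandbox only records actions instead of executing them. The `max_steps` cap is an added safeguard that the description above does not mention.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step chosen by the model (hypothetical shape, for illustration)."""
    kind: str  # "click", "type", "scroll", "done", or "ask_user"
    coords: Optional[Tuple[int, int]] = None
    text: Optional[str] = None


class FakeSandbox:
    """Stand-in for a sandboxed VM: records actions instead of executing them."""

    def __init__(self) -> None:
        self.log: List[Action] = []

    def screenshot(self) -> bytes:
        # A real sandbox would return image bytes; here the "screen" just
        # encodes how many actions have been executed so far.
        return f"screen-{len(self.log)}".encode()

    def execute(self, action: Action) -> None:
        self.log.append(action)


def run_loop(
    choose_action: Callable[[bytes], Action],
    sandbox: FakeSandbox,
    max_steps: int = 20,
) -> str:
    """Screenshot -> decide -> execute, until the policy stops or steps run out."""
    for _ in range(max_steps):
        shot = sandbox.screenshot()
        action = choose_action(shot)
        if action.kind in ("done", "ask_user"):
            return action.kind
        sandbox.execute(action)
    return "step_limit"


def scripted_policy(shot: bytes) -> Action:
    """Hard-coded stand-in for the model's decision step."""
    script = {
        b"screen-0": Action("click", coords=(120, 80)),
        b"screen-1": Action("type", text="hello"),
    }
    return script.get(shot, Action("done"))


sandbox = FakeSandbox()
result = run_loop(scripted_policy, sandbox)
```

Here `result` is `"done"` after two recorded actions. In a real deployment the policy would be a call to the model with the screenshot attached, and `execute` would drive the mouse and keyboard inside the VM.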
Use cases: automating repetitive browser workflows, filling in forms, navigating legacy software that has no API, scraping web pages with dynamic content, and software QA testing.
Limitations: it is slower than API-based automation, since every step requires a fresh screenshot and a model decision; a wrong action early in a task can cascade into later steps; and it needs a sandboxed environment to run safely.
Example
An operator uses Computer Use to have Claude log into a supplier portal, download invoice PDFs, and rename them by order number — a task with no API available.