OpenAI’s Computer Use Agent (openai-cua) leverages GPT models with vision and reasoning capabilities for screenshot-based browser automation.

Overview

OpenAI Computer Use provides:
  • Screenshot-based interactions for visual understanding of web pages
  • Advanced reasoning powered by GPT models
  • High accuracy for complex web interactions
  • Structured outputs for reliable data extraction

Supported Models

| Model | Model ID | Best For |
| --- | --- | --- |
| Computer Use Preview | computer-use-preview | Screenshot-based automation (default) |

Code Example

import Anchorbrowser from 'anchorbrowser';

const anchorClient = new Anchorbrowser({
  apiKey: process.env.ANCHORBROWSER_API_KEY
});

const response = await anchorClient.agent.task(
  'Find the pricing information and extract the plan details',
  {
    taskOptions: {
      url: 'https://example.com/pricing',
      agent: 'openai-cua',
      // model: 'computer-use-preview',  // Default model
      maxSteps: 25,
      outputSchema: {
        type: 'object',
        properties: {
          plans: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                name: { type: 'string' },
                price: { type: 'string' },
                features: { type: 'array', items: { type: 'string' } }
              }
            }
          }
        }
      }
    }
  }
);

console.log(response);
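
The outputSchema in the example describes an object with a plans array, but the exact shape of the value returned by agent.task is not shown here. A caller may therefore want to check the structured result before using it. The following is a minimal, SDK-independent sketch of such a check. The names Plan, PricingResult, isPlan, and isPricingResult are illustrative and are not part of the anchorbrowser package. The guard requires name, price, and features to be present, which is stricter than the schema itself, since the schema declares no required fields.

```typescript
// Hypothetical types mirroring the outputSchema from the example.
interface Plan {
  name: string;
  price: string;
  features: string[];
}

interface PricingResult {
  plans: Plan[];
}

// True when the value is an array whose elements are all strings.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

// True when the value has a string name, a string price, and a string[] features.
function isPlan(value: unknown): value is Plan {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === 'string' &&
    typeof candidate.price === 'string' &&
    isStringArray(candidate.features)
  );
}

// True when the value is an object whose plans property is an array of Plan.
function isPricingResult(value: unknown): value is PricingResult {
  if (typeof value !== 'object' || value === null) return false;
  const plans = (value as Record<string, unknown>).plans;
  return Array.isArray(plans) && plans.every((plan) => isPlan(plan));
}
```

Where the structured result sits inside response is an assumption to verify against the SDK. Pass whichever field carries it to isPricingResult, and handle the false case instead of trusting the shape.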

Configuration Options

| Parameter | Type | Description |
| --- | --- | --- |
| agent | string | Must be openai-cua |
| model | string | OpenAI model to use (default: computer-use-preview) |
| url | string | Starting URL for the task |
| max_steps | integer | Maximum actions the agent can take |
| output_schema | object | JSON Schema for structured output |
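
The table lists max_steps and output_schema in snake_case, while the SDK example passes maxSteps and outputSchema in camelCase. Assuming these refer to the same options (this page does not say whether the SDK accepts both spellings), a small helper can rename keys when options are kept in the table's form. The helper names snakeToCamel and toCamelCaseOptions below are hypothetical and not part of the anchorbrowser package.

```typescript
// Convert one snake_case key to camelCase, e.g. "max_steps" -> "maxSteps".
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Shallowly rename top-level keys only. Nested values such as the JSON Schema
// are copied as-is, so schema property names are never altered.
function toCamelCaseOptions(options: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(options)) {
    result[snakeToCamel(key)] = value;
  }
  return result;
}

const taskOptions = toCamelCaseOptions({
  agent: 'openai-cua',
  url: 'https://example.com/pricing',
  max_steps: 25,
  output_schema: { type: 'object' },
});

// Resulting keys: agent, url, maxSteps, outputSchema
console.log(Object.keys(taskOptions));
```

The conversion is deliberately shallow: a recursive rename would also rewrite property names inside the output schema, which would change the data the agent is asked to return.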