Get DOM

POST /tools-api/chrome/get-dom

Description

Retrieves the complete Document Object Model (DOM) of the current page in the Chrome browser. This endpoint provides the HTML structure of the web page, allowing LLM agents to analyze and understand the page content and structure for further interaction.

The endpoint is useful for:

For large pages, the DOM is automatically trimmed so that the result does not exceed prompt size limits when it is sent to an LLM. Trimming keeps the most relevant parts of the page structure.

Note: Chrome must be started with the Open Chrome action first, otherwise this endpoint will return an error.
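
A failed call still returns a JSON body, so a caller may prefer to turn it into an exception early. The following is a minimal sketch, assuming failures are reported through `success: false` with the reason in `message`, as described in the Response section. The names `ChromeActionError` and `require_dom` are hypothetical helpers, not part of the API.

```python
class ChromeActionError(RuntimeError):
    """Raised when a response body reports success == False (hypothetical helper)."""


def require_dom(payload: dict) -> str:
    """Return the DOM string from a get-dom response body, or raise on failure.

    Assumes the documented shape: {"success": bool, "message": str, "timestamp": str}.
    """
    if not payload.get("success", False):
        # On failure, "message" carries the error text, for example when
        # Chrome has not been started with the Open Chrome action.
        raise ChromeActionError(payload.get("message", "unknown error"))
    return payload["message"]
```

With this in place, calling code can wrap `require_dom(response.json())` in a `try`/`except ChromeActionError` block and decide whether to start Chrome and retry.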

Request

This endpoint accepts a POST request with an empty JSON object. No request parameters are required.

{}

Response

The response contains the HTML content of the current page in the Chrome browser.

Response Structure

Property   Type      Description
success    Boolean   Indicates whether the operation was successful.
message    String    Contains the HTML content of the page if successful, or an error message if failed.
timestamp  DateTime  The time when the action was executed.
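
The three properties above map naturally onto a small typed container. This is a sketch only: the `ActionResponse` name mirrors the class used in the C# example, the `from_dict` method is a hypothetical helper, and the timestamp is assumed to be an ISO 8601 string like the one in the example response.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ActionResponse:
    success: bool
    message: str
    timestamp: datetime

    @classmethod
    def from_dict(cls, data: dict) -> "ActionResponse":
        # The example timestamp ends in "Z"; older versions of
        # datetime.fromisoformat() reject that suffix, so normalise it first.
        raw_ts = data["timestamp"].replace("Z", "+00:00")
        return cls(
            success=bool(data["success"]),
            message=str(data["message"]),
            timestamp=datetime.fromisoformat(raw_ts),
        )


parsed = ActionResponse.from_dict(
    {"success": True, "message": "<html>...</html>", "timestamp": "2023-11-15T14:30:22.567Z"}
)
print(parsed.success, parsed.timestamp.isoformat())
```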

Example Response (Simplified)

{
  "success": true,
  "message": "Example Page......",
  "timestamp": "2023-11-15T14:30:22.567Z"
}
Note: The complete HTML DOM is included in the "message" field of the response, ensuring consistency with other Chrome endpoints. All JSON keys in the response are in camelCase format.
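
Since `message` is a plain HTML string, it can be inspected with any HTML parser. The sketch below uses Python's standard `html.parser` module to pull out the page title and link targets. The `OutlineParser` and `summarize_dom` names and the sample markup are made up for illustration; a real (possibly trimmed) DOM may be less tidy, although `HTMLParser` tolerates incomplete markup.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects the page title and link targets from an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is accumulated.
        if self._in_title:
            self.title += data


def summarize_dom(html: str) -> dict:
    """Return the title and the href values found in the given HTML."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}


# Stand-in for the "message" field of a successful response.
sample = '<html><head><title>Example Page</title></head><body><a href="/docs">Docs</a></body></html>'
print(summarize_dom(sample))
```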

Code Examples

Python

import requests

url = "http://localhost:54321/tools-api/chrome/get-dom"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer your-api-key-here"
}

response = requests.post(url, headers=headers, json={})

if response.status_code == 200:
    result = response.json()
    if result["success"]:
        dom_content = result["message"]
        print("DOM retrieved successfully")
        # Process the DOM content here
    else:
        print(f"Error: {result['message']}")
else:
    print(f"Request failed with status code: {response.status_code}")

TypeScript

import axios from 'axios';

interface DomResponse {
  success: boolean;
  message: string;
  timestamp: string;
}

async function getDom(): Promise<string | null> {
  try {
    const response = await axios.post<DomResponse>(
      'http://localhost:54321/tools-api/chrome/get-dom',
      {},
      {
        headers: {
          'Content-Type': 'application/json',
          'Authorization': 'Bearer your-api-key-here'
        }
      }
    );
    
    if (response.data.success) {
      const domContent = response.data.message;
      console.log('DOM retrieved successfully');
      // Process the DOM content here
      return domContent;
    } else {
      console.error(`Error: ${response.data.message}`);
      return null;
    }
  } catch (error) {
    console.error('Failed to get DOM:', error);
    return null;
  }
}

// Usage
getDom().then(dom => {
  if (dom) {
    // Work with the DOM content
    console.log('DOM length:', dom.length);
  }
});

C#

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public class ChromeApiClient
{
    private readonly HttpClient _httpClient;
    private readonly string _baseUrl;
    private readonly string _apiKey;

    public ChromeApiClient(string baseUrl, string apiKey)
    {
        _baseUrl = baseUrl;
        _apiKey = apiKey;
        _httpClient = new HttpClient();
        _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {apiKey}");
    }

    public async Task<string> GetDomAsync()
    {
        var url = $"{_baseUrl}/tools-api/chrome/get-dom";
        var content = new StringContent("{}", Encoding.UTF8, "application/json");
        
        var response = await _httpClient.PostAsync(url, content);
        var jsonResponse = await response.Content.ReadAsStringAsync();
        
        // Case-insensitive matching maps the camelCase JSON keys to the PascalCase properties.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var result = JsonSerializer.Deserialize<ActionResponse>(jsonResponse, options);
        
        if (result.Success)
        {
            Console.WriteLine("DOM retrieved successfully");
            return result.Message;
        }
        else
        {
            Console.WriteLine($"Error: {result.Message}");
            return null;
        }
    }
}

public class ActionResponse
{
    public bool Success { get; set; }
    public string Message { get; set; }
    public DateTime Timestamp { get; set; }
}

Usage Notes

1. Browser Dependency: This endpoint requires an active Chrome browser instance. Make sure to use the Open Chrome action before calling this endpoint.

2. DOM Size Management: For very large webpages, the DOM content is automatically trimmed to avoid exceeding token limits when it is passed to an LLM. The most important structural elements are preserved.

3. Common Use Cases:

4. Alternative: For text-only content extraction, consider using the chromeGetText endpoint instead, which returns just the visible text content.
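
Regarding note 2, the trimmed DOM may still be large relative to a particular model's context window, so a client-side size check can be useful before building a prompt. This is a rough sketch: the ratio of four characters per token is an assumed heuristic, not something the API defines, and `estimate_tokens` and `fits_prompt_budget` are hypothetical helper names.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the characters-per-token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def fits_prompt_budget(dom: str, max_tokens: int) -> bool:
    """True if the DOM's estimated token count is within max_tokens."""
    return estimate_tokens(dom) <= max_tokens
```

If the check fails, one option is to fall back to the text-only endpoint mentioned in note 4 rather than truncating the HTML, which could cut it off mid-tag.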