Get Automation Details

Overview

POST /tools-api/automation/get-details

This endpoint retrieves detailed information about the current system state: available windows, Chrome instances, the focused window and element, and common system elements (pinned taskbar icons, desktop icons, and installed programs). AI agents can use it to understand the current state of the system and identify potential automation targets.

Note: To target a specific window by ID, take the ID from the id field of the matching entry in the windows array returned by this endpoint.
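
As a minimal sketch of that lookup, the helper below scans an already-parsed response for a window whose title contains a given substring and returns its ID. The name find_window_id and the case-insensitive substring match are illustrative assumptions, not part of the API; the field names come from the response format documented on this page.

```python
from typing import Optional


def find_window_id(details: dict, title_part: str) -> Optional[str]:
    """Return the ID of the first window whose title contains title_part.

    Hypothetical helper: the matching rule (case-insensitive substring)
    is an assumption chosen for illustration.
    """
    needle = title_part.lower()
    for window in details.get("windows", []):
        if needle in window.get("title", "").lower():
            return window.get("id")
    return None


# Hand-written response fragment, not real server output
sample = {"windows": [{"id": "123456", "title": "Window Title"}]}
print(find_window_id(sample, "window"))
```

Returning None when nothing matches lets the caller decide whether to retry or fall back to another target.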

Request

The request does not require any parameters, as it will retrieve information about all available windows and system elements.

Request Body Format

{}

Parameters

No parameters required for this endpoint.

Response

The response describes all available windows and adds contextual information: Chrome browser instances, focus information, and common system elements.

Note: This endpoint must be called with POST, and all JSON keys in the response use camelCase.
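
Python callers who prefer snake_case names may want to rename the camelCase keys after parsing. The sketch below is one way to do that; camel_to_snake and snake_case_keys are hypothetical helper names, and the conversion is purely client-side.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase key such as 'currentTabUrl' to 'current_tab_url'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def snake_case_keys(value):
    """Recursively rename dict keys; lists are walked, other values are kept."""
    if isinstance(value, dict):
        return {camel_to_snake(k): snake_case_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_case_keys(item) for item in value]
    return value


# Hand-written fragment using keys from the documented response format
fragment = {"chromeInstances": [{"currentTabUrl": "https://example.com", "tabCount": 3}]}
converted = snake_case_keys(fragment)
print(converted)
```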

Response Body Format

{
  "windows": [
    {
      "id": "123456",
      "title": "Window Title",
      "executablePath": "C:\\Path\\To\\Application.exe",
      "isForeground": true,
      "processName": "Application",
      "isMinimized": false
    }
  ],
  "chromeInstances": [
    {
      "title": "Chrome Window Title",
      "currentTabTitle": "Current Tab Title",
      "currentTabUrl": "https://example.com",
      "windowId": "789012",
      "currentTabIndex": 0,
      "tabCount": 3
    }
  ],
  "focusInfo": {
    "focusedWindowId": "123456",
    "focusedWindowTitle": "Window Title",
    "focusedElementName": "Element Name",
    "focusedElementType": "Button"
  },
  "topPinnedTaskbarIcons": [
    "Explorer",
    "Chrome",
    "Word"
  ],
  "topDesktopIcons": [
    "Recycle Bin",
    "My Documents"
  ],
  "topInstalledPrograms": [
    "Microsoft Office",
    "Google Chrome",
    "Adobe Creative Cloud"
  ],
  "importantNote": "This field appears when token credits are insufficient"
}

Key Response Properties

Property               Type    Description
windows                array   Information about available windows, including ID, title, process details.
chromeInstances        array   Information about Chrome browser windows, including tab details.
focusInfo              object  Information about the currently focused window and UI element.
topPinnedTaskbarIcons  array   List of names of pinned taskbar applications.
topDesktopIcons        array   List of names of icons visible on the desktop.
topInstalledPrograms   array   List of names of installed programs on the system.
importantNote          string  Optional field that appears when token credits are insufficient.
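
To illustrate how these properties fit together, the following sketch condenses a parsed response into a few headline values. The function name summarize_details and the choice of summary fields are assumptions made for this example; only the property names are taken from the table above.

```python
def summarize_details(details: dict) -> dict:
    """Condense a parsed get-details response into a few headline values."""
    windows = details.get("windows", [])
    chrome = details.get("chromeInstances", [])
    return {
        "window_count": len(windows),
        "non_minimized_titles": [
            w.get("title", "") for w in windows if not w.get("isMinimized", False)
        ],
        "chrome_tab_total": sum(c.get("tabCount", 0) for c in chrome),
        # importantNote is optional; treat a missing or empty value as "no note".
        "note": details.get("importantNote") or None,
    }


# Hand-written data shaped like the documented response
example = {
    "windows": [
        {"id": "1", "title": "A", "isMinimized": False},
        {"id": "2", "title": "B", "isMinimized": True},
    ],
    "chromeInstances": [{"tabCount": 3}, {"tabCount": 2}],
    "importantNote": "",
}
summary = summarize_details(example)
print(summary)
```

Using .get with defaults keeps the helper tolerant of fields that are absent from a particular response.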

Code Examples

The following examples demonstrate how to use this endpoint in Python, TypeScript, and C#:

Python

import requests
import json

url = "http://localhost:54321/tools-api/automation/get-details"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"  # Replace with your API key
}

# Empty request body as no parameters are required
data = {}

response = requests.post(url, headers=headers, data=json.dumps(data))
if response.status_code == 200:
    result = response.json()
    print("System details retrieved successfully")
    
    # Print information about windows
    print(f"Number of windows: {len(result['windows'])}")
    for window in result['windows']:
        print(f"Window title: {window['title']}, ID: {window['id']}")
    
    # Print information about Chrome instances
    if result['chromeInstances']:
        print(f"Number of Chrome instances: {len(result['chromeInstances'])}")
        for chrome in result['chromeInstances']:
            print(f"Chrome window: {chrome['title']}, Current tab: {chrome['currentTabTitle']}")
    
    # Print focus information
    if result['focusInfo']:
        print(f"Focused window: {result['focusInfo']['focusedWindowTitle']}")
        print(f"Focused element: {result['focusInfo']['focusedElementName']} ({result['focusInfo']['focusedElementType']})")
    
    # Check if there's an important note
    if result.get('importantNote'):
        print(f"IMPORTANT: {result['importantNote']}")
else:
    print(f"Error: {response.status_code}, {response.text}")

TypeScript

interface Window {
    id: string;
    title: string;
    executablePath: string;
    isForeground: boolean;
    processName: string;
    isMinimized: boolean;
}

interface ChromeInstance {
    title: string;
    currentTabTitle: string;
    currentTabUrl: string;
    windowId: string;
    currentTabIndex: number;
    tabCount: number;
}

interface FocusInfo {
    focusedWindowId: string;
    focusedWindowTitle: string;
    focusedElementName: string;
    focusedElementType: string;
}

interface AutomationDetails {
    windows: Window[];
    chromeInstances: ChromeInstance[];
    focusInfo: FocusInfo;
    topPinnedTaskbarIcons: string[];
    topDesktopIcons: string[];
    topInstalledPrograms: string[];
    importantNote?: string;
}

async function getAutomationDetails(): Promise<AutomationDetails> {
    const url = 'http://localhost:54321/tools-api/automation/get-details';
    const headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer YOUR_API_KEY'  // Replace with your API key
    };
    
    // Empty request body as no parameters are required
    const data = {};
    
    const response = await fetch(url, {
        method: 'POST',
        headers: headers,
        body: JSON.stringify(data)
    });
    
    if (!response.ok) {
        throw new Error(`Error: ${response.status} ${response.statusText}`);
    }
    
    return await response.json() as AutomationDetails;
}

// Example usage
async function example() {
    try {
        const automationDetails = await getAutomationDetails();
        console.log(`Number of windows: ${automationDetails.windows.length}`);
        
        // Find foreground window
        const foregroundWindow = automationDetails.windows.find(w => w.isForeground);
        if (foregroundWindow) {
            console.log(`Foreground window: ${foregroundWindow.title}`);
        }
        
        // Check Chrome instances
        if (automationDetails.chromeInstances.length > 0) {
            console.log(`Chrome tabs detected: ${automationDetails.chromeInstances.length}`);
            automationDetails.chromeInstances.forEach(chrome => {
                console.log(`- ${chrome.currentTabTitle} (${chrome.currentTabUrl})`);
            });
        }
        
        // Print taskbar icons
        console.log("Pinned taskbar icons:");
        automationDetails.topPinnedTaskbarIcons.forEach(icon => console.log(`- ${icon}`));
        
        // Check for important notes
        if (automationDetails.importantNote) {
            console.log(`IMPORTANT: ${automationDetails.importantNote}`);
        }
    } catch (error) {
        console.error(`Failed to get automation details: ${error}`);
    }
}

example();

C#

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading.Tasks;
using System.Collections.Generic;
using System.Linq;

public class Window
{
    [JsonPropertyName("id")]
    public string Id { get; set; }
    
    [JsonPropertyName("title")]
    public string Title { get; set; }
    
    [JsonPropertyName("executablePath")]
    public string ExecutablePath { get; set; }
    
    [JsonPropertyName("isForeground")]
    public bool IsForeground { get; set; }
    
    [JsonPropertyName("processName")]
    public string ProcessName { get; set; }
    
    [JsonPropertyName("isMinimized")]
    public bool IsMinimized { get; set; }
}

public class ChromeInstance
{
    [JsonPropertyName("title")]
    public string Title { get; set; }
    
    [JsonPropertyName("currentTabTitle")]
    public string CurrentTabTitle { get; set; }
    
    [JsonPropertyName("currentTabUrl")]
    public string CurrentTabUrl { get; set; }
    
    [JsonPropertyName("windowId")]
    public string WindowId { get; set; }
    
    [JsonPropertyName("currentTabIndex")]
    public int CurrentTabIndex { get; set; }
    
    [JsonPropertyName("tabCount")]
    public int TabCount { get; set; }
}

public class FocusInfo
{
    [JsonPropertyName("focusedWindowId")]
    public string FocusedWindowId { get; set; }
    
    [JsonPropertyName("focusedWindowTitle")]
    public string FocusedWindowTitle { get; set; }
    
    [JsonPropertyName("focusedElementName")]
    public string FocusedElementName { get; set; }
    
    [JsonPropertyName("focusedElementType")]
    public string FocusedElementType { get; set; }
}

public class AutomationDetails
{
    [JsonPropertyName("windows")]
    public List<Window> Windows { get; set; }
    
    [JsonPropertyName("chromeInstances")]
    public List<ChromeInstance> ChromeInstances { get; set; }
    
    [JsonPropertyName("focusInfo")]
    public FocusInfo FocusInfo { get; set; }
    
    [JsonPropertyName("topPinnedTaskbarIcons")]
    public List<string> TopPinnedTaskbarIcons { get; set; }
    
    [JsonPropertyName("topDesktopIcons")]
    public List<string> TopDesktopIcons { get; set; }
    
    [JsonPropertyName("topInstalledPrograms")]
    public List<string> TopInstalledPrograms { get; set; }
    
    [JsonPropertyName("importantNote")]
    public string ImportantNote { get; set; }
}

public class ToolsServerClient
{
    private readonly HttpClient _httpClient;
    private readonly string _baseUrl;
    private readonly string _apiKey;
    
    public ToolsServerClient(string baseUrl, string apiKey)
    {
        _baseUrl = baseUrl;
        _apiKey = apiKey;
        _httpClient = new HttpClient();
    }
    
    public async Task<AutomationDetails> GetAutomationDetailsAsync()
    {
        var requestUri = $"{_baseUrl}/tools-api/automation/get-details";
        var requestBody = "{}"; // Empty JSON object as no parameters are required
        
        var request = new HttpRequestMessage(HttpMethod.Post, requestUri)
        {
            Content = new StringContent(requestBody, Encoding.UTF8, "application/json")
        };
        
        request.Headers.Add("Authorization", $"Bearer {_apiKey}");
        
        var response = await _httpClient.SendAsync(request);
        response.EnsureSuccessStatusCode();
        
        var jsonResponse = await response.Content.ReadAsStringAsync();
        // Deserialize into the typed model declared above
        return JsonSerializer.Deserialize<AutomationDetails>(jsonResponse);
    }
}

class Program
{
    public static async Task Main()
    {
        var client = new ToolsServerClient("http://localhost:54321", "YOUR_API_KEY");
        
        try
        {
            var automationDetails = await client.GetAutomationDetailsAsync();
            
            Console.WriteLine($"Number of windows: {automationDetails.Windows.Count}");
            
            // Find foreground window
            var foregroundWindow = automationDetails.Windows.FirstOrDefault(w => w.IsForeground);
            if (foregroundWindow != null)
            {
                Console.WriteLine($"Foreground window: {foregroundWindow.Title}");
                Console.WriteLine($"Process: {foregroundWindow.ProcessName}");
            }
            
            // Check focus information
            if (automationDetails.FocusInfo != null)
            {
                Console.WriteLine($"Focused element: {automationDetails.FocusInfo.FocusedElementName} ({automationDetails.FocusInfo.FocusedElementType})");
            }
            
            // List Chrome instances
            if (automationDetails.ChromeInstances.Count > 0)
            {
                Console.WriteLine("\nChrome browser instances:");
                foreach (var chrome in automationDetails.ChromeInstances)
                {
                    Console.WriteLine($"  Window: {chrome.Title}");
                    Console.WriteLine($"  Current tab: {chrome.CurrentTabTitle}");
                    Console.WriteLine($"  URL: {chrome.CurrentTabUrl}");
                    Console.WriteLine($"  Tab {chrome.CurrentTabIndex + 1} of {chrome.TabCount}");
                }
            }
            
            // Display important note if present
            if (!string.IsNullOrEmpty(automationDetails.ImportantNote))
            {
                Console.WriteLine($"\nIMPORTANT NOTE: {automationDetails.ImportantNote}");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}

Examples

Example 1: Calculator as the Foreground Window

In this example, the Windows Calculator application is in the foreground. The response identifies its window and the button that currently has focus.

Request:

{}

Response (Simplified):

{
  "windows": [
    {
      "id": "4567890",
      "title": "Calculator",
      "executablePath": "C:\\Windows\\System32\\calc.exe",
      "isForeground": true,
      "processName": "Calculator"
    }
  ],
  "chromeInstances": [],
  "focusInfo": {
    "focusedWindowId": "4567890",
    "focusedWindowTitle": "Calculator",
    "focusedElementName": "7",
    "focusedElementType": "Button"
  },
  "topPinnedTaskbarIcons": [
    "Explorer",
    "Chrome",
    "Word"
  ],
  "topDesktopIcons": [
    "Recycle Bin",
    "My Documents"
  ],
  "topInstalledPrograms": [
    "Microsoft Office",
    "Google Chrome",
    "Adobe Creative Cloud"
  ],
  "importantNote": ""
}
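
A client might combine the windows and focusInfo sections of such a response as sketched below. The helper describe_focus is hypothetical, and it assumes that focusInfo.focusedWindowId uses the same ID values as windows[].id, as it does in the example above.

```python
def describe_focus(details: dict) -> str:
    """Describe the foreground window and, if it owns the focus, the focused element."""
    foreground = next(
        (w for w in details.get("windows", []) if w.get("isForeground")), None
    )
    if foreground is None:
        return "no foreground window reported"
    focus = details.get("focusInfo") or {}
    # Assumption: focusedWindowId and windows[].id share one ID space.
    if focus.get("focusedWindowId") != foreground.get("id"):
        return f"{foreground['title']} (focus is elsewhere)"
    return (
        f"{foreground['title']}: {focus.get('focusedElementType')} "
        f"'{focus.get('focusedElementName')}' has focus"
    )


# Fragment copied from the Example 1 response
calculator = {
    "windows": [{"id": "4567890", "title": "Calculator", "isForeground": True}],
    "focusInfo": {
        "focusedWindowId": "4567890",
        "focusedWindowTitle": "Calculator",
        "focusedElementName": "7",
        "focusedElementType": "Button",
    },
}
print(describe_focus(calculator))
```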

Example 2: Handling Chrome Browser Windows

When a Chrome browser window is in the foreground, the response lists it under chromeInstances and includes a note advising which kind of action to use next:

Request:

{}

Response (Simplified):

{
  "windows": [
    {
      "id": "7654321",
      "title": "Google - Google Chrome",
      "executablePath": "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
      "isForeground": true,
      "processName": "chrome"
    }
  ],
  "chromeInstances": [
    {
      "title": "Google - Google Chrome",
      "currentTabTitle": "Google",
      "currentTabUrl": "https://www.google.com",
      "windowId": "7654321",
      "currentTabIndex": 0,
      "tabCount": 3
    }
  ],
  "focusInfo": {
    "focusedWindowId": "7654321",
    "focusedWindowTitle": "Google - Google Chrome",
    "focusedElementName": "Address and search bar",
    "focusedElementType": "Edit"
  },
  "topPinnedTaskbarIcons": [
    "Explorer",
    "Chrome",
    "Word"
  ],
  "topDesktopIcons": [
    "Recycle Bin",
    "My Documents"
  ],
  "topInstalledPrograms": [
    "Microsoft Office",
    "Google Chrome",
    "Adobe Creative Cloud"
  ],
  "importantNote": "The window is a Chrome browser. It is not advised to use an automation action on browser windows, except in special cases where you want to act on the window itself (for example close or minimize it). In most cases what you want to use next is a chrome action. Have a look at the property CurrentChromeTabMostRelevantElements to see what is inside the currently selected chrome tab."
}
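
Following the advice in that note, an agent could check whether the focused window is a Chrome window before choosing an action type. The sketch below shows one way to do so; focused_chrome_instance is a hypothetical helper, and it assumes that chromeInstances[].windowId and focusInfo.focusedWindowId refer to the same window IDs, as they do in the example above.

```python
from typing import Optional


def focused_chrome_instance(details: dict) -> Optional[dict]:
    """Return the Chrome instance that owns the focused window, or None."""
    focused_id = (details.get("focusInfo") or {}).get("focusedWindowId")
    if focused_id is None:
        return None
    for chrome in details.get("chromeInstances", []):
        # Assumption: both fields carry the same kind of window ID.
        if chrome.get("windowId") == focused_id:
            return chrome
    return None


# Fragment copied from the Example 2 response
chrome_case = {
    "chromeInstances": [
        {
            "title": "Google - Google Chrome",
            "currentTabTitle": "Google",
            "currentTabUrl": "https://www.google.com",
            "windowId": "7654321",
        }
    ],
    "focusInfo": {"focusedWindowId": "7654321"},
}
match = focused_chrome_instance(chrome_case)
if match is not None:
    print(f"Focused window is Chrome, current tab: {match['currentTabUrl']}")
```

A None result suggests the focused window is not a Chrome window, in which case a regular automation action may be the appropriate next step.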