Bot Arena

Out of arena

Where Playwright runs out of road in public SaaS

Selector-based automation runs out of road in three distinct ways on the public web: the application surface goes opaque when working areas render to <canvas>; the gateway slams shut when third-party identity providers actively reject WebDriver-controlled browsers before the SaaS even loads; and the entire desktop arrives as pixels over WebSocket when the app is delivered through a streamed-desktop session (Citrix HDX, VMware Blast, TSplus HTML5 RDP). Pick a case study below.

The vendor, at scale

98%

of the Fortune 500 — on Citrix alone

Citrix advertises 100 million+ users across 400,000 organizations, including 99% of the Fortune 100 and 98% of the Fortune 500. VMware/Omnissa Horizon, Microsoft AVD + Windows 365 Cloud PC, TSplus, Cameyo and Apache Guacamole all add to that installed base. Every seat sees the published app as a server-side bitmap painted into the browser, not as DOM.

The category

$4.3B → $6.0B

DaaS market by 2029 (Gartner)

Gartner's 2025 Magic Quadrant: $4.3B in 2025 → $6.0B by 2029 (7.9% CAGR). Planning assumption: by 2027 virtual desktops are cost-effective for 95% of workers (up from 40% in 2019) and the primary workspace for 20%. The streamed-canvas surface is growing, not shrinking.

Killer example

43.7%

of US hospitals run Epic — virtually all delivered via Citrix

KLAS 2026: Epic owns 43.7% of US acute-care hospitals, 56.9% of inpatient beds. Epic Hyperspace ran as a Citrix-published Windows fat client at virtually every Epic shop; the Hyperdrive web-replacement migration is still mid-flight through 2026, and many sites deliver Hyperdrive itself inside a Citrix session — so the clinician's browser still sees pixels in a canvas.

Enterprise apps commonly delivered through a streamed canvas

  • SAP GUI for Windows — Citrix says 40%+ of SAP customers run SAP GUI through Citrix; latency-sensitive RFC stays inside the datacentre.
  • Oracle E-Business Suite Forms / JD Edwards — Java/Forms fat clients delivered via Citrix to avoid per-desktop JRE rollouts and keep WAN users responsive.
  • Epic Hyperspace / Hyperdrive — the dominant US EHR, classically Citrix-published into clinical workstations and tap-and-go thin clients.
  • Bloomberg Terminal — officially supported on Citrix XenApp/Workspace; standard pattern on locked-down trading floors.
  • AutoCAD / Revit / SolidWorks / Siemens NX — Citrix + NVIDIA vGPU for license-floating and to keep multi-GB CAD files off engineers' laptops.
  • Microsoft Dynamics GP/AX, Sage 100/300/X3, Infor M3/LN, IFS — Windows-only ERP fat clients that nobody is rewriting; Citrix/RDS is the standard delivery vehicle.
  • IBM i / AS400 Access Client Solutions — green-screen and Java navigator delivered via Citrix or RDS to extend mid-range workload life.
  • The TSplus public demo here — same HTML5 RDP plumbing (canvas + WebSocket), just unrestricted access for proof.

Internal evidence — our own customers run our product in Citrix

Y Soft internal Slack, May 2026. A colleague asks IT for a minimal Citrix lab because "we have several enterprise customers which use SAFEQ Cloud and PC client in Citrix environments" and we can't reproduce their issues without a comparable setup — plain RDP / Terminal Services doesn't cover clustered Citrix with roaming profiles. This is the streamed-desktop surface viewed from inside a vendor whose customers deploy through it: the same canvas plumbing that breaks Playwright is also where our own support tickets land.

Internal Slack message from a Y Soft engineer asking IT for a minimal two-VM Citrix cluster with roaming user profile, because several enterprise customers run SAFEQ Cloud and the PC client in Citrix and the team has no comparable lab to reproduce their issues.

Live demo

Drive an enterprise app streamed as a browser canvas

Streamed desktop

https://demo.tsplus.net/

Goal

An engineer wants to script a line-of-business app delivered as a streamed Windows session into the browser. In the wild that is SAP GUI, Oracle E-Business Suite Forms, JD Edwards EnterpriseOne, Epic Hyperspace, Bloomberg Terminal or AutoCAD — published through Citrix, VMware/Omnissa Horizon, Microsoft AVD, TSplus, Cameyo, Kasm or Apache Guacamole. The browser-side result is identical across all of them: one <canvas> painted from a WebSocket. Our publicly-reachable proof point is the TSplus demo (demo / demo, no card, no sales call) driving Microsoft Excel — we ask it to type "Hello world" into A1 and read it back, and run that script against the same canvas-streaming plumbing any of the enterprise targets above use.

Outcome

✗ Fails

Playwright recording (40 s)

Apps portal after login — Microsoft Word / Excel / PowerPoint / Notepad tiles. This is the LAST point at which standard Playwright locators (getByRole('link', { name: /excel/i })) work cleanly. The instant the Excel tile is clicked, we cross the canvas boundary.
Variant A — the moment Playwright hits the wall. Excel's Start screen is fully painted into <canvas id="JWTS_myCanvas">; a human sees the "Blank workbook" tile clearly. appPage.getByText(/^Blank workbook$/i) resolves to zero elements; toBeVisible() times out 10 s later.
Variant B — A1 selected at canvas pixel (58, 237). Reaching this state required four pieces of prior knowledge: where the Blank-workbook tile is, that Escape × 2 closes the Office 2019+ Start screen, where A1 sits at 1280×720, and that the canvas needs a mousedown gesture before forwarding keystrokes.
Variant B — A1 = "Hello world", active cell advanced to A2 after Enter. The only programmatic way to confirm the value is the Ctrl+C → navigator.clipboard.readText() side channel — and only because demo.tsplus.net allows remote-clipboard redirection. Most production Citrix / Horizon / AVD deployments disable this by policy.
✓ Passes

AIVA recording (same task — 2× speed)

AIVA driving the same TSplus-streamed Excel session end-to-end: clicking "Blank workbook" on the Start screen as a recognised tile, targeting cell A1 as a cell, typing "Hello world", then reading it back from the rendered grid. The Playwright walls listed here (no DOM tile, no tabindex on the canvas, no DOM cell, no readback path without clipboard sync) do not apply: AIVA reads the pixels the same way a human operator does, so streamed RDP, Citrix HDX, VMware Blast, Horizon HTML Access and Microsoft AVD all collapse to the same input.

How and why

Steps

  1. Open https://demo.tsplus.net/ and log in with demo / demo.
  2. Click the "Microsoft Excel" tile in the published-apps portal.
  3. Wait for the HTML5 RDP canvas to mount in the new tab.
  4. Dismiss the Excel Start screen and land on a blank Book1.
  5. Select cell A1.
  6. Type "Hello world" and press Enter.
  7. Read A1 back and assert it equals "Hello world".
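
Steps 1 and 2 still run against real DOM, so they can be sketched with ordinary page calls. The sketch below is illustrative only. It assumes the cgi-bin/hb.exe request is triggered when a login field loses focus, and the two field selectors are hypothetical placeholders (the only selector this page names is #buttonLogOn). PageLike is a minimal stand-in type that a Playwright Page is expected to satisfy, used so the snippet compiles without the Playwright package.

```typescript
// Minimal structural types (assumption: a real Playwright `Page` fits this shape).
interface ResponseLike { url(): string; }
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
  keyboard: { press(key: string): Promise<void> };
  waitForResponse(predicate: (r: ResponseLike) => boolean): Promise<unknown>;
}

// HYPOTHETICAL selectors, not taken from the TSplus portal markup.
const USER_FIELD = '#username-placeholder';
const PASS_FIELD = '#password-placeholder';

// True for the 2FA-status request that gates the log-on button.
function isTwoFaStatusUrl(url: string): boolean {
  return url.includes('cgi-bin/hb.exe');
}

async function logIn(page: PageLike, user: string, pass: string): Promise<void> {
  await page.goto('https://demo.tsplus.net/');

  // Register the waiter BEFORE the fills so the response cannot slip past.
  // Assumption: the XHR fires on field blur, not on page load.
  const twoFaStatus = page.waitForResponse(r => isTwoFaStatusUrl(r.url()));

  await page.fill(USER_FIELD, user);
  await page.keyboard.press('Tab'); // blur the field
  await page.fill(PASS_FIELD, pass);
  await page.keyboard.press('Tab');

  // Only click once the handler has had a chance to attach.
  await twoFaStatus;
  await page.click('#buttonLogOn');
}
```

Registering the waiter before the fills avoids a race in which the response arrives before the test starts listening. If the request actually fires at page load, the waiter would need to be created before goto instead.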

Problem

Enterprise software is routinely delivered to the user's browser as a streamed Windows desktop — for data-sovereignty (data never leaves the datacentre or cloud region), for license-floating (concurrent vs. named-user ISV pricing), for compliance audit trails, and because decades-old Windows-only fat clients like SAP GUI, Oracle EBS Forms, Epic Hyperspace or AutoCAD will not be rewritten. Whether the bytes arrive via Citrix HDX, VMware/Omnissa Horizon Blast, PCoIP, Microsoft RDP-over-HTML5, TSplus, Cameyo, Kasm or Apache Guacamole, the browser-side result is the same: a single <canvas> (sometimes a <video>) driven over a WebSocket. No DOM, no ARIA, no document.querySelector.
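
A rough way to tell, before writing any locators, that a page has crossed into this streamed-canvas regime is to compare how much of the viewport one canvas covers against how many interactive DOM nodes exist. The following is a heuristic sketch rather than an established detection method. The SurfaceSnapshot shape, the function name and both thresholds are assumptions made for illustration.

```typescript
// Numbers a test could collect in the browser, e.g. via page.evaluate.
interface SurfaceSnapshot {
  viewportArea: number;      // innerWidth * innerHeight, in CSS px^2
  largestCanvasArea: number; // largest <canvas> or <video> box, in CSS px^2
  interactiveNodes: number;  // buttons, links, inputs, elements with a role
}

// Heuristic: one drawing surface covers most of the viewport and there is
// almost nothing addressable in the DOM. Thresholds are illustrative guesses.
function looksLikeStreamedDesktop(
  s: SurfaceSnapshot,
  minCoverage = 0.8,
  maxInteractiveNodes = 5,
): boolean {
  if (s.viewportArea <= 0) return false;
  const coverage = s.largestCanvasArea / s.viewportArea;
  return coverage >= minCoverage && s.interactiveNodes <= maxInteractiveNodes;
}
```

A check like this can fail fast with a clear message instead of letting the first toBeVisible() time out. It cannot make the content inside the canvas addressable.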

Variant A — naive DOM-based drive (the real Playwright result). A developer asked to "type Hello world into A1 of Excel" would reach for getByText("Blank workbook").click(), then getByRole("gridcell", { name: "A1" }).fill(...), then expect(...).toHaveText("Hello world"). Every locator resolves to zero matches because the entire Excel UI — Start screen tiles, ribbon, formula bar, grid — is canvas pixels. toBeVisible() times out on the first selector. The recording on this card is exactly that run. This is the real result for any streamed-desktop session, not a Playwright skill issue.

Variant B — pixel-coordinate hack (NOT a real solution). Abandon selectors entirely and drive the canvas with hard-coded pixel coordinates, Office shortcuts, and the TSplus clipboard-sync side channel. It "passes" — but only because we knew, ahead of time and specifically for this Excel build at 1280×720:
  (i) where the Blank-workbook tile sits in the canvas,
  (ii) that pressing Escape twice closes the Office 2019+ Start screen,
  (iii) where A1 sits in the canvas at this DPI / ribbon / font,
  (iv) that demo.tsplus.net redirects the remote Windows clipboard back to the browser, AND that we granted clipboard-read permission for the origin.
Change any one of those four prior-knowledge inputs and Variant B breaks. Cost it out at scale: every Excel version, every theme, every DPI multiplier, every Office locale needs its own per-pixel calibration, and most production Citrix / Horizon / AVD deployments disable remote-clipboard sync by policy (it's the exfiltration vector compliance teams are closing), so (iv) does not even hold. Variant B is the upper bound of what stock Playwright can do here, and it no longer uses selectors at all. It is not a generalisable approach; it is a brittle, app-version-specific pixel hack.
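
The calibration burden can be made concrete with a small lookup table. In this sketch the two offsets are the ones used by the Variant B run at 1280×720. The profile key format, the target names and both function names are assumptions for illustration. An uncalibrated configuration fails loudly, which is the honest behaviour for hard-coded pixels.

```typescript
interface Point { x: number; y: number; }
interface Box { x: number; y: number; width: number; height: number; }

// Canvas-relative offsets, valid only for the one configuration that was
// measured. The key format "app@WxH" is an assumption of this sketch.
const CALIBRATION: Record<string, Record<string, Point>> = {
  'excel@1280x720': {
    neutralWhitespace: { x: 640, y: 700 },
    cellA1: { x: 58, y: 237 },
  },
};

// Translate a calibrated canvas offset into page coordinates using the
// canvas bounding box (the same box.x + dx, box.y + dy arithmetic as Variant B).
function toPagePoint(box: Box, profile: string, target: string): Point {
  const offset: Point | undefined = CALIBRATION[profile]?.[target];
  if (!offset) {
    throw new Error(`no calibration for "${target}" in profile "${profile}"`);
  }
  if (offset.x >= box.width || offset.y >= box.height) {
    throw new Error(`offset for "${target}" falls outside the canvas`);
  }
  return { x: box.x + offset.x, y: box.y + offset.y };
}

// Profiles to maintain grow multiplicatively with each varying dimension
// (app version, theme, DPI multiplier, locale, ...).
function profilesNeeded(variantsPerDimension: number[]): number {
  return variantsPerDimension.reduce((total, n) => total * n, 1);
}
```

As a purely hypothetical illustration, three Excel versions, two themes, two DPI multipliers and four locales would already call for profilesNeeded([3, 2, 2, 4]) = 48 separately measured profiles, each of which can be invalidated by an Office update.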

The streamed-desktop pattern reduces to the same problem regardless of broker: SAP GUI in Citrix XenApp, Hyperspace in Horizon, AutoCAD in AVD, JD Edwards through TSplus — all collapse to pixels in a canvas. The Playwright wall is in the same place for all of them.

Where Playwright bounces off

  1. ✓ Reaches: Streaming-broker portal — login form, published-apps grid

    Citrix StoreFront, Horizon HTML Access, Microsoft RD Web Access, TSplus Web Portal — all render as real DOM. Standard locators work for username / password / tile click. TSplus has one gotcha worth flagging: #buttonLogOn.onclick is only attached after the cgi-bin/hb.exe 2FA-status XHR returns, so the spec must Tab between fills and wait for that response before clicking.

  2. ✓ Reaches: HTML5 streaming canvas — keyboard/mouse forwarding

    JWS (TSplus), Citrix HTML5 Workspace, Horizon HTML Access, Apache Guacamole and AWS WorkSpaces Web Access all forward keystrokes and mouse events over WebSocket to the remote session — but only after a page.mouse.click() on the canvas. The canvas has no tabindex, so locator('canvas').focus() does nothing; the mousedown gesture is the only path. Every subsequent click is a pixel coordinate against a layout we cannot inspect.

  3. · No DOM: Streamed application UI — every menu, dialog, dropdown, grid cell

    Excel ribbon, SAP GUI transaction codes, Hyperspace patient-chart tabs, AutoCAD command line: all painted. getByText, getByRole, locator('[aria-label=…]') → zero matches inside the streaming canvas. In our first runs the click on the "Blank workbook" tile registered only as a hover (the tooltip appeared) and the activation event was lost, so the demo dismisses the Excel Start screen by pressing Escape twice (an Office 2019+ shortcut) instead.

  4. ✗ Stops here: Verification — reading any value back from the streamed app

    No DOM, no ARIA, no inputValue. The only working readback is Ctrl+C → navigator.clipboard.readText() via remote-clipboard sync — which requires both the host enabling clipboard redirection AND the browser context being granted clipboard-read. The TSplus demo permits it; most production Citrix / Horizon / AVD policies disable clipboard redirection precisely because it's the data-exfiltration vector compliance teams are trying to close.
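
The clipboard readback in stage 4 can be made to report why it failed instead of returning stale text. The sketch below is one possible hardening and is not what the recorded test does. It seeds the local clipboard with a sentinel, sends the copy shortcut, then polls until the clipboard changes. The ClipboardDriver interface and all names in it are assumptions. In a real test each method would wrap a call such as page.evaluate(() => navigator.clipboard.readText()) or keyboard.press('Control+C'), and the context would need both clipboard-read and clipboard-write permission.

```typescript
// Injected operations so the logic can run without a browser.
interface ClipboardDriver {
  writeLocal(text: string): Promise<void>; // write to the browser clipboard
  readLocal(): Promise<string>;            // read from the browser clipboard
  pressCopy(): Promise<void>;              // send Ctrl+C into the canvas
  sleep(ms: number): Promise<void>;        // wait between polls
}

type Readback =
  | { ok: true; value: string }
  | { ok: false; reason: 'clipboard-not-redirected' };

async function readBackViaClipboard(
  driver: ClipboardDriver,
  attempts = 10,
  intervalMs = 200,
): Promise<Readback> {
  // A value the remote app will not produce, so any change is attributable
  // to the remote copy arriving through clipboard redirection.
  const sentinel = `__readback_sentinel_${Date.now()}__`;
  await driver.writeLocal(sentinel);
  await driver.pressCopy();

  for (let i = 0; i < attempts; i++) {
    const text = await driver.readLocal();
    if (text !== sentinel) {
      return { ok: true, value: text.trim() };
    }
    await driver.sleep(intervalMs);
  }
  // Sentinel never replaced: the host is most likely not redirecting.
  return { ok: false, reason: 'clipboard-not-redirected' };
}
```

On a host that blocks clipboard redirection the function returns the failure result after the last poll, which lets a test distinguish "policy forbids readback" from "the cell held the wrong value".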

AIVA reads the canvas pixels the way a human operator does — a tile is a thing it can recognize, a cell is a cell, a transaction code in SAP GUI is text it can locate on screen. Streamed RDP, Citrix HDX, VMware Blast, Microsoft AVD, browser-rendered SaaS — all collapse to the same pixel input.

The Playwright test
import { test, expect } from '@playwright/test';

const SUT = 'https://demo.tsplus.net/';

// Variant A — naive DOM-based drive. This is the real Playwright result
// against a streamed-desktop session. Every locator inside the canvas
// returns zero matches; toBeVisible() times out on the first one.
test('A. naive — DOM selectors against the canvas', async ({ page, context }) => {
  // ... login + open Excel tile (real DOM, works fine — see full source) ...
  // ... canvas#JWTS_myCanvas mounts, Excel paints its Start screen inside ...

  // First selector a developer would write — match the visible tile label.
  // The "Blank workbook" tile IS rendered (visible to a human, visible in
  // the screenshot), but it is painted into the canvas. getByText finds
  // zero elements; toBeVisible times out.
  const blankTile = appPage.getByText(/^Blank workbook$/i).first();
  await expect(blankTile).toBeVisible({ timeout: 10_000 });   // ← FAILS HERE
  await blankTile.click();

  // Unreachable in practice — included so a reader can see the next two
  // naive selectors a developer would write and confirm they too resolve
  // to zero matches against the canvas:
  const a1Cell = appPage.getByRole('gridcell', { name: 'A1' })
    .or(appPage.locator('[aria-label="A1"]'));
  await expect(a1Cell).toBeVisible({ timeout: 10_000 });
  await a1Cell.fill('Hello world');
  await expect(a1Cell).toHaveText('Hello world');
});

// Variant B — pixel-coordinate hack. Kept to be candid about what it
// would take to "drive" Excel-on-canvas from stock Playwright. Requires
// four pieces of app-version-specific prior knowledge:
//   (i)   Blank-workbook tile pixel coords at 1280x720
//   (ii)  Escape × 2 closes Office 2019+ Start screen
//   (iii) A1 pixel coords at this DPI/theme/ribbon
//   (iv)  Host redirects the remote clipboard AND clipboard-read granted
// Change any one and Variant B breaks. This is the upper bound of what
// selectors can do — not a generalisable approach.
test('B. best-effort — pixel coords + Office shortcut + clipboard sync', async ({ page, context }) => {
  await context.grantPermissions(['clipboard-read', 'clipboard-write'], {
    origin: 'https://demo.tsplus.net',                          // (iv)
  });
  // ... same login + open Excel tile ...
  const box = (await canvas.boundingBox())!;

  // (i) tile pixel. Click neutral whitespace first for the JWS input gesture.
  await appPage.mouse.click(box.x + 640, box.y + 700);

  // (ii) Escape twice — Office 2019+ shortcut to close the Start screen.
  await appPage.keyboard.press('Escape');
  await appPage.waitForTimeout(1_500);
  await appPage.keyboard.press('Escape');
  await appPage.waitForTimeout(6_000);

  // (iii) A1 by hardcoded canvas pixel. Belt-and-braces Ctrl+Home.
  await appPage.mouse.click(box.x + 58, box.y + 237);
  await appPage.keyboard.press('Control+Home');
  await appPage.keyboard.type('Hello world', { delay: 60 });
  await appPage.keyboard.press('Enter');

  // (iv) Readback via clipboard side channel — the ONLY working path.
  await appPage.keyboard.press('Control+Home');
  await appPage.keyboard.press('Control+C');
  await appPage.waitForTimeout(800);
  const clip = await appPage.evaluate(() => navigator.clipboard.readText());
  expect(clip.trim()).toBe('Hello world');
});

TimeoutError: expect(locator).toBeVisible() failed — Locator: getByText(/^Blank workbook$/i).first(). Expected: visible. Error: element(s) not found. "Excel Start-screen 'Blank workbook' tile should be reachable from the DOM — it is not, because the Start screen is painted into the canvas."