[FR] Document Object Model Integration #70

AdamSobieski · 2025-01-09T04:22:11Z

Introduction

What if, in addition to text-string prompts, DOM documents could be used as prompts?

This would enable model-independent multimodal prompting in a manner intuitive to Web developers.

As envisioned, such multimodal prompts could use a subset of HTML5 elements, including those for sections and paragraphs of text, source code, mathematics, lists, tables, images, audio, video, and embedded files and data.

The prompt() and promptStreaming() functions on sessions could distinguish between provided arguments of types string and Document.
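
As a rough illustration of that distinction, here is a minimal sketch of how an argument might be classified before dispatch. The helper name `classifyPromptInput` is hypothetical and not part of the Prompt API; it duck-types on `nodeType` (9 is `Node.DOCUMENT_NODE` in the DOM standard) rather than using `instanceof Document`, on the assumption that documents may come from other realms such as iframes.

```javascript
// Hypothetical helper, not part of the Prompt API: decide whether a prompt
// argument should be handled as plain text or as a DOM document.
const DOCUMENT_NODE = 9; // value of Node.DOCUMENT_NODE in the DOM standard

function classifyPromptInput(input) {
  if (typeof input === 'string') {
    return 'text';
  }
  // Checking nodeType works for documents created in another realm,
  // where `input instanceof Document` could be false.
  if (input !== null && typeof input === 'object' && input.nodeType === DOCUMENT_NODE) {
    return 'document';
  }
  throw new TypeError('Prompt must be a string or a Document');
}
```

An implementation could branch on the returned label and route `'document'` inputs to a multimodal path, leaving the existing string path untouched.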

Text

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const p = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'p');
p.append('This is some prompt content.');
html.append(p);
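// createDocument() already created <prompt> as the root element, so nest <html> inside it.
the_prompt.documentElement.append(html);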
const result = await session.prompt(the_prompt);

Mathematics

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const math = the_prompt.createElementNS('http://www.w3.org/1998/Math/MathML', 'math');
// ...
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
html.append(math);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Lists

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const ol = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'ol');
// ...
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
html.append(ol);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Tables

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const table = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'table');
// ...
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
html.append(table);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Images

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const img = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'img');
// HTML attributes such as src are in no namespace, so setAttribute() is used.
img.setAttribute('src', 'data:image/png;base64,...');
html.append(img);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const img = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'img');
img.setAttribute('src', 'https://example.org/media/picture-123.png');
html.append(img);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Audio

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const audio = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'audio');
audio.setAttribute('src', 'https://example.org/media/audio-123.mp3');
html.append(audio);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Video

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const video = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'video');
video.setAttribute('src', 'https://example.org/media/video-123.mpeg');
html.append(video);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Embedding Files and Data

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const embed = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'embed');
embed.setAttribute('type', 'text/csv');
embed.setAttribute('src', 'https://example.org/data/data-123.csv');
html.append(embed);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Metadata

const the_prompt = document.implementation.createDocument('...', 'prompt', null);
const html = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'html');
const head = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'head');
const body = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'body');
const meta = the_prompt.createElementNS('http://www.w3.org/1999/xhtml', 'meta');
meta.setAttribute('name', 'author');
meta.setAttribute('content', 'Bob Smith');
head.append(meta);
html.append(head);
body.append('This is some prompt content.');
html.append(body);
the_prompt.documentElement.append(html);
const result = await session.prompt(the_prompt);

Prompt Markup Language

<prompt xmlns="..." version="1.0">
  <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      <meta name="author" content="Bob Smith" />
    </head>
    <body>
      <p>This is some prompt content.</p>
      <img src="https://example.org/media/picture-123.png" />
      <embed src="https://example.org/data/data-123.csv" />
    </body>
  </html>
</prompt>

Prompt Templates

<prompt xmlns="..." version="1.0">
  <html xmlns="http://www.w3.org/1999/xhtml" xmlns:promptml="...">
    <head>
      <meta name="author" content="Bob Smith" />
    </head>
    <body>
      <p>This is some prompt content.</p>
      <img src="https://example.org/media/picture-123.png" />
      <embed src="https://example.org/data/data-123.csv" />
      <p>Templates could be <promptml:template promptml:key="t1" />.</p>
    </body>
  </html>
</prompt>

const result = await session.prompt(the_prompt, { templates: { t1: "useful" } });

<prompt xmlns="..." version="1.0">
  <html xmlns="http://www.w3.org/1999/xhtml" xmlns:promptml="...">
    <head>
      <meta name="author" content="Bob Smith" />
    </head>
    <body>
      <p>This is some prompt content.</p>
      <img src="https://example.org/media/picture-123.png" />
      <embed src="https://example.org/data/data-123.csv" />
      <p>Templates could be useful.</p>
    </body>
  </html>
</prompt>
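
One way the substitution shown above might work is sketched here on serialized markup rather than on the DOM tree: each `promptml:template` element is replaced by the matching entry of the `templates` option. The names `fillTemplates` and `escapeXml`, the XML-escaping of values, and the choice to throw on a missing key are all assumptions for illustration; the proposal does not specify any of them.

```javascript
// Hypothetical helpers, not part of the proposal or the Prompt API.

// Escape characters that would otherwise be parsed as markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Replace every <promptml:template promptml:key="..." /> with its value.
function fillTemplates(markup, templates) {
  const pattern = /<promptml:template\s+promptml:key="([^"]*)"\s*\/>/g;
  return markup.replace(pattern, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(templates, key)) {
      throw new Error(`No template value provided for key "${key}"`);
    }
    return escapeXml(templates[key]);
  });
}

const filled = fillTemplates(
  '<p>Templates could be <promptml:template promptml:key="t1" />.</p>',
  { t1: 'useful' }
);
console.log(filled); // <p>Templates could be useful.</p>
```

A DOM-based variant would instead locate the template elements in the parsed document and swap each for a text node, which avoids relying on a particular serialization.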

Prompt Events

<prompt xmlns="..." version="1.0" onenter="..." onexit="...">
  <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      <meta name="author" content="Bob Smith" />
    </head>
    <body>
      <p>This is some prompt content.</p>
      <img src="https://example.org/media/picture-123.png" />
      <embed src="https://example.org/data/data-123.csv" />
    </body>
  </html>
</prompt>

<prompt xmlns="..." version="1.0">
  <html xmlns="http://www.w3.org/1999/xhtml" xmlns:promptml="...">
    <head>
      <meta name="author" content="Bob Smith" />
    </head>
    <body>
      <p>This is some prompt content.</p>
      <img src="https://example.org/media/picture-123.png" />
      <embed promptml:onenter="..." promptml:onexit="..." src="..." />
    </body>
  </html>
</prompt>

Exchange Markup Language

<exchange xmlns="..." version="1.0">
  <provide type="text/plain">This is some prompt content.</provide>
  <expect type="text/plain" />
</exchange>

<exchange xmlns="..." version="1.0">
  <provide type="text/plain">This is some prompt content.</provide>
  <expect type="application/json">
    <schema type="application/schema+json" src="..." />
  </expect>
</exchange>

<exchange xmlns="..." version="1.0">
  <provide type="application/promptml+xml">
    <prompt xmlns="..." version="1.0">
      <html xmlns="http://www.w3.org/1999/xhtml">
        <p>This is some prompt content.</p>
      </html>
    </prompt>
  </provide>
  <expect type="application/json">
    <schema type="application/schema+json" src="..." />
  </expect>
</exchange>


AdamSobieski · 2025-03-29T15:41:44Z

Here is the current version of the slideshow that I hope to present at the 03-31 meeting, time permitting: Prompt-API-Plus-DOM.pptx.

AdamSobieski · 2025-03-31T20:33:32Z

@domenic, as asked during the meeting: what are some of the benefits and use-case scenarios that would be enabled or simplified by having markup-based prompts in addition to text-based prompts?

A less steep learning curve for Web developers to get started with multimodal prompts.
The portability of multimodal prompts across models.
Web developers could create, store, load, share, and reuse multimodal prompts as files or resources.
1. Prompts and chat histories could be stored as files, served from servers, and stored within EPUB containers.

const response = await fetch('https://example.org/prompts/prompt-123.promptml');
const text = await response.text();
const parser = new DOMParser();
const the_prompt = parser.parseFromString(text, 'application/xml');
const result = await session.prompt(the_prompt);

Prompt-related templating features.
Prompt-related JavaScript events.

A related question is: which features could not be provided – either at all or in the same way – using a JavaScript library which uses or encapsulates the Prompt API?

The portability of multimodal prompts across models.
1. The transformation or transpiling of hypertext-based prompts and their components into those formats and styles preferred by individual models.
Prompt-related JavaScript events.
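
The transformation of hypertext-based prompts mentioned in the list above could look roughly like the following: a walk over the prompt document that flattens it into an ordered list of typed parts, which a per-model adapter could then re-encode. The `toParts` name and the `{ type, value }` part shape are assumptions for illustration only. The function reads nothing beyond standard DOM properties (`nodeType`, `childNodes`, `localName`, `nodeValue`, `getAttribute`).

```javascript
// Hypothetical sketch, not part of the proposal: flatten a prompt document
// into an ordered list of parts that a model-specific adapter could consume.
const ELEMENT_NODE = 1; // Node.ELEMENT_NODE
const TEXT_NODE = 3;    // Node.TEXT_NODE
const MEDIA_TAGS = new Set(['img', 'audio', 'video', 'embed']);

function toParts(node, parts = []) {
  if (node.nodeType === TEXT_NODE) {
    // Skip whitespace-only text such as indentation between elements.
    const text = node.nodeValue.trim();
    if (text) parts.push({ type: 'text', value: text });
    return parts;
  }
  if (node.nodeType === ELEMENT_NODE && MEDIA_TAGS.has(node.localName)) {
    // Media elements become a reference to their source; fetching is left to the adapter.
    parts.push({ type: node.localName, value: node.getAttribute('src') });
    return parts;
  }
  // Documents and other elements: recurse into children in document order.
  for (const child of node.childNodes) {
    toParts(child, parts);
  }
  return parts;
}
```

In a browser this would be called as `toParts(the_prompt)`. Because the output preserves document order, interleaved text and media keep their relative positions, which is the property a model-specific encoder would most likely depend on.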

A new question is: should a Prompt Markup Language be able to express, in addition to user prompts, system prompts and their components, e.g., tool-definition sections, and/or chat histories, e.g., sequences of prompts?

domenic added a commit that referenced this issue Jan 20, 2025

Add image and audio prompting API

ff96dc3

Closes #40. Somewhat helps with #70.

domenic added a commit that referenced this issue Jan 20, 2025

Add image and audio prompting API

2a9f391

Closes #40. Somewhat helps with #70.

domenic mentioned this issue Jan 20, 2025

Add image and audio prompting API #71

Merged

domenic added the enhancement New feature or request label Jan 23, 2025

domenic added a commit that referenced this issue Feb 25, 2025

Add image and audio prompting API

331914a

Closes #40. Somewhat helps with #70.

anssiko mentioned this issue Mar 28, 2025

Prompt API: consider local use cases for RAG and Agentic RAG webmachinelearning/proposals#8

Open

yorkie mentioned this issue Jun 6, 2025

API: implement the Prompt API M-CreativeLab/jsar-runtime#76

Open
