Commit | Line | Data |
---|---|---|
9a1dfe2b KL |
1 | Terminal Emulator Images Standard - Proposed Design - Simplified |
2 | ================================================================ | |
3 | ||
4 | Version: 1 | |
5 | ||
6 | ||
7 | ||
8 | Purpose | |
9 | ------- | |
10 | ||
11 | See the [original proposal](images.md) for purpose, design goals, and | |
12 | definitions. | |
13 | ||
14 | This document is an updated proposal to address feedback on the first | |
15 | proposal, which included: "overengineered", "hopelessly | |
16 | overengineered", and "unnecessarily complex." | |
17 | ||
18 | I perceive this feedback as a positive: it is far easier to imagine a | |
19 | feature and remove it, than to fail to picture it and need it later. | |
20 | The original proposal was a superset of every image format referenced, | |
21 | and generalized beyond to multimedia. This proposal is sharply | |
22 | reduced from that to: "put this pixel rectangle from the image, into | |
23 | that cell-based rectangle with specific scaling policy". It is mostly | |
24 | a subset of the iTerm2 protocol, with specifications for what happens | |
25 | to the cursor, and more precise definitions of the | |
26 | "preserveAspectRatio" equivalent options. | |
27 | ||
28 | ||
29 | ||
30 | Tradeoffs | |
31 | --------- | |
32 | ||
33 | Simplifying the original proposal significantly reduces | |
34 | complexity, but also eliminates features. The major tradeoffs offered | |
35 | in this revised proposal are: | |
36 | ||
37 | 1. Elimination of the layers feature, and with it the ability to place | |
38 | images behind text. In this proposal, a Cell on the screen will | |
39 | show either a (part of a) visible image, or a (part of a) text | |
40 | glyph, but never both. | |
41 | ||
42 | 2. Elimination of the "url" option, and with it the ability for an | |
43 | application to specify a filename or other method for the terminal | |
44 | to find the file data on the local machine. Image data must always | |
45 | be passed inline with the sequences. | |
46 | ||
47 | 3. Elimination of response codes, and with it: | |
48 | ||
49 | - The ability for multiplexers to blindly pass on the sequences to | |
50 | their host terminal. | |
51 | ||
52 | - The ability for applications to reliably detect success or | |
53 | failure of image display operations. | |
54 | ||
55 | 4. Elimination of pixel-oriented image placement operations, and with | |
56 | it the ability of applications to pass on image calculations to the | |
57 | terminal. An application which requires pixel-perfect rendering | |
58 | must generate the pixels it needs, aligned so that they are displayed at | |
59 | the top-left corner of the text Cell rectangle. | |
60 | ||
61 | ||
62 | ||
63 | Summary | |
64 | ------- | |
65 | ||
66 | This revised document proposes two independent new features: | |
67 | ||
68 | 1. A method to transfer image data for immediate display within the | |
69 | screen Cell grid ("Direct Images"). | |
70 | ||
71 | 2. A method to transfer image data to a terminal-managed cache, and | |
72 | later display that data within the screen Cell grid ("Cached | |
73 | Images"). | |
74 | ||
75 | The only difference between the first and second feature is the | |
76 | presence of an ID key. Direct images do not use an ID key, while | |
77 | cached images use a store operation with ID key followed by one or | |
78 | more display operations with ID key. | |
79 | ||
80 | Images are applied to text Cells, and once set are handled the same way | |
81 | text Cells are handled: erasing a line erases the image Cells on that | |
82 | line, inserting a character will shift image Cells on that row over, | |
83 | scrolling will shift the image up, and so on. Therefore, terminals | |
84 | will need to be prepared for the scenario that every Cell on the | |
85 | display is a separate image, with a separate display scaling option | |
86 | that will need to be re-applied automatically if font metrics change. | |
87 | ||
88 | ||
89 | ||
90 | All Features - Detection | |
91 | ------------------------ | |
92 | ||
93 | Applications can detect support for these features using Primary | |
94 | Device Attributes (DA) and DECID (ESC Z, or 0x9A). | |
95 | ||
96 | Terminals that support this standard will respond with additional | |
97 | parameter(s): "224" for direct images and "225" for cached images. A | |
98 | recap of the parameters xterm supports is listed below, with these new | |
99 | feature responses included: | |
100 | ||
101 | | VT220 (and higher) Response | Description | | |
102 | |-----------------------------|--------------------------------------------| | |
103 | | 1 | 132-columns | | |
104 | | 2 | Printer | | |
105 | | 3 | ReGIS graphics | | |
106 | | 4 | Sixel graphics | | |
107 | | 6 | Selective erase | | |
108 | | 8 | User-defined keys | | |
109 | | 9 | National Replacement Character sets | | |
110 | | 1 5 | Technical characters | | |
111 | | 1 6 | Locator port | | |
112 | | 1 7 | Terminal state interrogation | | |
113 | | 1 8 | User windows | | |
114 | | 2 1 | Horizontal scrolling | | |
115 | | 2 2 | ANSI color, e.g., VT525 | | |
116 | | 2 8 | Rectangular editing | | |
117 | | 2 9 | ANSI text locator (i.e., DEC Locator mode) | | |
118 | | 2 2 4 | Direct Images Version 1 | | |
119 | | 2 2 5 | Cached Images Version 1 | | |
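As a rough illustration of the detection step, the sketch below parses a Primary DA response in Python. The function names are hypothetical. It assumes the 7-bit `ESC [ ? ... c` response form, and it assumes the first parameter is the conformance level rather than a feature, so that parameter is skipped.

```python
import re

# Parameter values taken from the table above.
DIRECT_IMAGES = 224
CACHED_IMAGES = 225

# Assumption: 7-bit CSI form of the response, e.g. ESC [ ? 6 2 ; 4 ; 2 2 4 c
_DA_RE = re.compile(r"\x1b\[\?([0-9;]*)c")


def parse_da_features(response):
    """Return the set of feature parameters found in a Primary DA response."""
    match = _DA_RE.search(response)
    if match is None:
        return set()
    params = [int(p) for p in match.group(1).split(";") if p]
    # Assumption: the first parameter is the conformance level, not a feature.
    return set(params[1:])


def supports_direct_images(response):
    return DIRECT_IMAGES in parse_da_features(response)


def supports_cached_images(response):
    return CACHED_IMAGES in parse_da_features(response)
```

An application would write the DA request to the terminal, read the reply, and pass the reply string to these helpers.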
120 | ||
121 | ||
122 | ||
123 | Direct Images - Summary | |
124 | ----------------------- | |
125 | ||
126 | Non-text data (images) can be sent to the terminal for immediate | |
127 | display in a rectangular region of text Cells. Image data is | |
128 | transmitted to the terminal using a wire format described later in | |
129 | this document. | |
130 | ||
131 | Setting a Cell to image is a destructive operation: the Cell's | |
132 | original text is lost. Similarly, setting a Cell (or multiple Cells | |
133 | for fullwidth glyphs or grapheme clusters) to text is a destructive | |
134 | operation: the image in the Cell(s) is lost. | |
135 | ||
136 | Setting any part of a multi-Cell Tile to image also "breaks up" the | |
137 | Tile into a range of single Cells. In other words, image data can | |
138 | only be carried by a Cell, not a Tile. | |
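The "either image or text, never both" rule can be pictured with a very small Cell model. This is only a sketch: the `Cell` class and its field names are hypothetical, and multi-Cell Tiles are deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """Hypothetical screen Cell holding either text or image data."""
    text: Optional[str] = " "
    image: Optional[object] = None  # stand-in for this Cell's slice of pixels

    def set_text(self, glyph):
        # Setting text is destructive: the image in the Cell is lost.
        self.text = glyph
        self.image = None

    def set_image(self, image_slice):
        # Setting an image is destructive: the Cell's original text is lost.
        self.image = image_slice
        self.text = None
```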
139 | ||
140 | ||
141 | ||
142 | Direct Images - New Sequences | |
143 | ----------------------------- | |
144 | ||
145 | A terminal with direct images feature must support the following new | |
146 | sequences: | |
147 | ||
148 | | Sequence | Description | | |
149 | |--------------------------------------|-------------------------| | |
150 | | OSC 1 3 3 8 ; F i l e = {args} : {data} BEL | Display image at (x, y) | | |
151 | | OSC 1 3 3 8 ; F i l e = {args} : {data} ST | Display image at (x, y) | | |
152 | ||
153 | ||
154 | ||
155 | For the OSC 1 3 3 8 sequence: | |
156 | ||
157 | * The {args} is a set of key-value pairs (each pair separated by | |
158 | semicolon (';')), followed by a colon (':'), followed by a base-64 | |
159 | encoded string ({data}). | |
160 | ||
161 | * A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z', | |
162 | 'a' - 'z'). | |
163 | ||
164 | * A value is any printable ASCII string not containing whitespace, | |
165 | colon, or semicolon ('!' - '9', '<' - '~'). | |
166 | ||
167 | * Any alpha-numeric key may be specified. A key that is not supported | |
168 | by the terminal is ignored without error. | |
169 | ||
170 | * The image is processed as shown below: | |
171 | ||
172 | - The pixels are drawn starting at the upper-left corner of the text | |
173 | cursor position. | |
174 | ||
175 | - If scroll is specified as 1 (enabled), then: | |
176 | ||
177 | a. The screen is scrolled up if the image overflows into the | |
178 | bottom text row. | |
179 | ||
180 | b. The cursor's final position is on the same column as the | |
181 | starting cursor position, and on the row immediately below the | |
182 | image. | |
183 | ||
184 | - If scroll is omitted or specified as 0 (disabled), then: | |
185 | ||
186 | a. The screen is never scrolled. | |
187 | ||
188 | b. Pixels that would be drawn below the visible region on screen | |
189 | are discarded. | |
190 | ||
191 | c. The cursor's final position is at the same column and row as | |
192 | the starting cursor position, i.e. the cursor does not move at | |
193 | all. | |
194 | ||
195 | - Pixels that would be drawn to the right of the visible region on | |
196 | screen are discarded. | |
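The wire format above can be sketched from the application side as follows. The helper name `build_direct_image` is hypothetical, and joining each key to its value with `=` is an assumption carried over from the iTerm2 protocol; this document only says "key-value pairs".

```python
import base64
import re

OSC = "\x1b]"   # ESC ]
ST = "\x1b\\"   # ESC \
BEL = "\x07"

_KEY_RE = re.compile(r"[0-9A-Za-z]+")   # alpha-numeric ASCII
_VALUE_RE = re.compile(r"[!-9<-~]+")    # printable, no space, ':' or ';'


def build_direct_image(args, data, terminator=ST):
    """Build an OSC 1338 sequence from a dict of args and raw image bytes."""
    pairs = []
    for key, value in args.items():
        value = str(value)
        if not _KEY_RE.fullmatch(key):
            raise ValueError("invalid key: %r" % key)
        if not _VALUE_RE.fullmatch(value):
            raise ValueError("invalid value: %r" % value)
        # Assumption: key and value are joined with '=' as in iTerm2.
        pairs.append(key + "=" + value)
    payload = base64.b64encode(data).decode("ascii")
    return OSC + "1338;File=" + ";".join(pairs) + ":" + payload + terminator
```

For example, `build_direct_image({"type": "image/png", "width": 10, "height": 5, "scroll": 1}, png_bytes)` would produce a sequence asking for a PNG to be shown in a 10x5 Cell rectangle at the cursor.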
197 | ||
198 | ||
199 | ||
200 | The keys for the key-value pairs that must be supported by the | |
201 | terminal are listed below: | |
202 | ||
203 | | Key | Default Value | Description | | |
204 | |--------------|---------------|----------------------------------------------| | |
205 | | type | "image/rgb" | mime-type describing data field | | |
206 | | width | 1 | Number of Cells wide to display in | |
207 | | height | 1 | Number of Cells high to display in | |
208 | | scale | "none" | Scale/zoom option, see below | | |
209 | | sourceX | 0 | Media source X position to display | | |
210 | | sourceY | 0 | Media source Y position to display | | |
211 | | sourceWidth | "auto" | Media width in pixels to display | | |
212 | | sourceHeight | "auto" | Media height in pixels to display | | |
213 | | scroll | 0 | If 1, scroll the display if needed | |
214 | ||
215 | A terminal may support additional keys. If a key is specified but not | |
216 | supported by the terminal, then it is ignored without error. | |
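On the terminal side, the defaults and the "ignored without error" rule might be applied roughly like this. `parse_args` is a hypothetical name, values are kept as strings for simplicity, and the `key=value` pair syntax is again an assumption borrowed from iTerm2.

```python
# Defaults copied from the key table above (kept as strings).
DEFAULTS = {
    "type": "image/rgb",
    "width": "1",
    "height": "1",
    "scale": "none",
    "sourceX": "0",
    "sourceY": "0",
    "sourceWidth": "auto",
    "sourceHeight": "auto",
    "scroll": "0",
}


def parse_args(arg_string):
    """Parse 'key=value;key=value' into a dict, filling in defaults."""
    result = dict(DEFAULTS)
    for pair in arg_string.split(";"):
        key, sep, value = pair.partition("=")
        if not sep or key not in DEFAULTS:
            continue  # unsupported or malformed pair: ignored without error
        result[key] = value
    return result
```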
217 | ||
218 | ||
219 | ||
220 | The "type" value is a mime-type string describing the format of the | |
221 | base64-encoded binary data. The terminal must support at minimum these | |
222 | mime-types: | |
223 | ||
224 | | Type String | Description | | |
225 | |---------------|--------------------------------------------------------------| | |
226 | | "image/rgb" | Big-endian-encoded 24-bit red, green, blue values | | |
227 | | "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values | | |
228 | | "image/png" | PNG file data as described by (reference to PNG format) | | |
229 | ||
230 | A terminal may support additional types. An application can detect | |
231 | terminal support for a format by: | |
232 | ||
233 | 1. Attempt to draw image, with "scroll" set to 1. | |
234 | ||
235 | 2. Check cursor position DSR 6. | |
236 | ||
237 | 3. If cursor has moved, then the terminal supports this image type. | |
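The probe above relies on the standard cursor position report: the application sends `CSI 6 n` and the terminal answers `CSI row ; column R`. A minimal sketch of the comparison step follows; the function names are hypothetical, and it assumes the probe is issued where drawing does not scroll the screen, since a scroll could leave the cursor on the same row number.

```python
import re

DSR_CURSOR_POSITION = "\x1b[6n"            # request sent by the application
_CPR_RE = re.compile(r"\x1b\[(\d+);(\d+)R")  # reply: CSI row ; column R


def parse_cursor_report(response):
    """Return (row, column) from a cursor position report, or None."""
    match = _CPR_RE.search(response)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cursor_moved(report_before, report_after):
    """True if both reports parse and the cursor position changed."""
    before = parse_cursor_report(report_before)
    after = parse_cursor_report(report_after)
    return before is not None and after is not None and before != after
```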
238 | ||
239 | ||
240 | ||
241 | The "width" and "height" values are positive integers describing the | |
242 | number of Cells the image will be placed in. | |
243 | ||
244 | ||
245 | ||
246 | The "scale" value can take the following values: | |
247 | ||
248 | | Value | Meaning | | |
249 | |------------|---------------------------------------------------------------| | |
250 | | "none" | No scaling along either axis. | | |
251 | | "scale" | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping | | |
252 | | "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area | | |
253 | | "crop" | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit | |
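The four scale options reduce to a small amount of arithmetic. The sketch below computes the size of the image after scaling and before any clipping or cropping to the target area. The function name and the use of rounding are assumptions; the target size in pixels is assumed to be the Cell rectangle multiplied by the terminal's Cell size.

```python
def scaled_size(mode, src_w, src_h, dst_w, dst_h):
    """Return (width, height) in pixels of the image after scaling."""
    if mode == "none":
        return src_w, src_h          # drawn 1:1; overflow is discarded
    if mode == "stretch":
        return dst_w, dst_h          # fills the area, aspect ratio distorted
    if mode == "scale":
        factor = min(dst_w / src_w, dst_h / src_h)   # largest size that fits
    elif mode == "crop":
        factor = max(dst_w / src_w, dst_h / src_h)   # smallest size that covers
    else:
        raise ValueError("unknown scale value: %r" % mode)
    # Assumption: round to the nearest whole pixel.
    return round(src_w * factor), round(src_h * factor)
```

For a 100x50 pixel source and a 40x40 pixel target, "scale" gives 40x20 (letterboxed), while "crop" gives 80x40, of which only a 40x40 region is kept.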
254 | ||
255 | ||
256 | ||
257 | "sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the | |
258 | rectangle of pixels from the media that will be displayed on the | |
259 | screen. The ranges for these values are shown below: | |
260 | ||
261 | | Key | Minimum Value | Maximum Value | Default Value | | |
262 | |--------------|---------------|-------------------------------|---------------| | |
263 | | sourceX | 0 | Media's full width - 1 | 0 | | |
264 | | sourceY | 0 | Media's full height - 1 | 0 | | |
265 | | sourceWidth | 1 | Media's full width - sourceX | "auto" | | |
266 | | sourceHeight | 1 | Media's full height - sourceY | "auto" | | |
267 | ||
268 | If any of these values are specified and outside the range, no image | |
269 | is displayed, and the cursor does not move. "sourceWidth" and | |
270 | "sourceHeight" can be "auto", which means use the maximum available | |
271 | width/height (given sourceX/sourceY) from the media's inherent | |
272 | dimensions. | |
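The range rules and the meaning of "auto" can be checked with a few comparisons. This is a sketch with a hypothetical function name; it returns `None` for the "no image is displayed" case.

```python
def resolve_source_rect(media_w, media_h, source_x=0, source_y=0,
                        source_width="auto", source_height="auto"):
    """Return (x, y, width, height) in pixels, or None if out of range."""
    if not 0 <= source_x <= media_w - 1:
        return None
    if not 0 <= source_y <= media_h - 1:
        return None
    max_w = media_w - source_x
    max_h = media_h - source_y
    # "auto" means the maximum available size given sourceX/sourceY.
    width = max_w if source_width == "auto" else source_width
    height = max_h if source_height == "auto" else source_height
    if not 1 <= width <= max_w or not 1 <= height <= max_h:
        return None
    return source_x, source_y, width, height
```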
273 | ||
274 | ||
275 | ||
276 | Cached Images - Summary | |
277 | ----------------------- | |
278 | ||
279 | Non-text data (images) can be sent to the terminal for later display in | |
280 | a rectangular region of text Cells. Image data is transmitted to the | |
281 | terminal using the CSTORE command described below, and displayed on | |
282 | screen using the CDISPLAY command. A single CSTORE command can | |
283 | support many CDISPLAY commands. | |
284 | ||
285 | Upon display, setting a Cell to image is a destructive operation: the | |
286 | Cell's original text is lost. Similarly, setting a Cell (or multiple | |
287 | Cells for fullwidth glyphs or grapheme clusters) to text is a | |
288 | destructive operation: the image in the Cell(s) is lost. | |
289 | ||
290 | Setting any part of a multi-Cell Tile to image also "breaks up" the | |
291 | Tile into a range of single Cells. In other words, image data can | |
292 | only be carried by a Cell, not a Tile. | |
293 | ||
294 | ||
295 | ||
296 | Cached Images - Cache/Memory Management | |
297 | --------------------------------------- | |
298 | ||
299 | The terminal manages a cache of multimedia data on behalf of the | |
300 | application. The application requests media be stored in the cache | |
301 | and provides an ID. This ID is later used to request display on the | |
302 | screen. | |
303 | ||
304 | The amount of memory and retention/eviction strategy for the cache is | |
305 | wholly managed by the terminal, with the following restrictions: | |
306 | ||
307 | * The terminal may not remove items from the cache that have any | |
308 | portion being actively displayed on the primary or alternate | |
309 | screens. | |
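One way a terminal could honor this restriction is to skip displayed entries during eviction. The sketch below uses a least-recently-used order and a byte budget, both of which are assumptions: this proposal leaves the strategy to the terminal. The class and method names are hypothetical.

```python
from collections import OrderedDict


class ImageCache:
    """Hypothetical LRU cache that never evicts images that are on screen."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self._items = OrderedDict()   # id -> data, least recently used first
        self._displayed = set()       # ids shown on primary/alternate screen

    def store(self, image_id, data):
        self._items.pop(image_id, None)
        self._items[image_id] = data
        self._evict()

    def get(self, image_id):
        data = self._items.get(image_id)
        if data is not None:
            self._items.move_to_end(image_id)
        return data

    def pin(self, image_id):
        self._displayed.add(image_id)      # some Cell now shows this image

    def unpin(self, image_id):
        self._displayed.discard(image_id)  # no Cell shows this image anymore

    def _evict(self):
        total = sum(len(d) for d in self._items.values())
        for image_id in list(self._items):
            if total <= self.max_bytes:
                break
            if image_id in self._displayed:
                continue  # actively displayed: may not be removed
            total -= len(self._items.pop(image_id))
```

Note that the cache can exceed its budget when everything in it is on screen, which follows directly from the restriction.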
310 | ||
311 | The scrollback buffer is permitted, and recommended, to contain only a | |
312 | few (or zero) multimedia images. Terminals should consider retaining | |
313 | only the last 2-5 screens' worth of pixel data in the scrollback | |
314 | buffer. | |
315 | ||
316 | Applications have no control over when images are removed from the | |
317 | cache, and no provision is made to generate/ensure unique IDs. | |
318 | ||
319 | A terminal multiplexer that passes all CSTORE/CDISPLAY commands to the | |
320 | host terminal will need to parse the CSTORE and CDISPLAY sequences for | |
321 | the "id" field and rewrite it to be unique for all of its inner | |
322 | terminals. | |
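A multiplexer's rewriting step might look something like the following. The names are hypothetical, the `key=value` pair syntax is assumed as in iTerm2, and the sketch only rewrites an explicit "id" key: it does not insert one when the application relied on the default.

```python
class IdRewriter:
    """Map (inner terminal, inner id) pairs to unique host-terminal ids."""

    MAX_ID = 999999  # upper bound of the "id" value in this proposal

    def __init__(self):
        self._map = {}
        self._next = 0

    def outer_id(self, pane, inner_id):
        key = (pane, inner_id)
        if key not in self._map:
            if self._next > self.MAX_ID:
                raise RuntimeError("host id space exhausted")
            self._map[key] = self._next
            self._next += 1
        return self._map[key]


def rewrite_id_arg(args, rewriter, pane):
    """Rewrite the 'id' pair in a 'key=value;...' string for the host."""
    pairs = []
    for pair in args.split(";"):
        key, sep, value = pair.partition("=")
        if key == "id" and sep and value.isdigit():
            value = str(rewriter.outer_id(pane, int(value)))
        pairs.append(key + sep + value)
    return ";".join(pairs)
```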
323 | ||
324 | ||
325 | ||
326 | Cached Images - New Sequences | |
327 | ----------------------------- | |
328 | ||
329 | A terminal with cached images feature must support the following new | |
330 | sequences: | |
331 | ||
332 | | Sequence | Command | Description | | |
333 | |--------------------------------------|-----------|-------------------------| | |
334 | | OSC 1 3 4 0 ; F i l e = {args} : {data} BEL | CSTORE | Store media in the cache under an ID | |
335 | | OSC 1 3 4 1 ; Pi ; {args} ST | CDISPLAY | Display media at (x, y) | | |
336 | ||
337 | ||
338 | ||
339 | Cached Images - CSTORE | |
340 | ---------------------- | |
341 | ||
342 | For the CSTORE command: | |
343 | ||
344 | * The {args} is a set of key-value pairs (each pair separated by | |
345 | semicolon (';')), followed by a colon (':'), followed by a base-64 | |
346 | encoded string ({data}). | |
347 | ||
348 | * A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z', | |
349 | 'a' - 'z'). | |
350 | ||
351 | * A value is any printable ASCII string not containing whitespace, | |
352 | colon, or semicolon ('!' - '9', '<' - '~'). | |
353 | ||
354 | ||
355 | ||
356 | The keys for the key-value pairs that must be supported by the | |
357 | terminal are listed below: | |
358 | ||
359 | | Key | Default Value | Description | | |
360 | |--------------|---------------|----------------------------------------------| | |
361 | | id | 0 | ID to refer to the image | | |
362 | | type | "image/rgb" | mime-type describing data field | | |
363 | ||
364 | ||
365 | ||
366 | The "id" value is a non-negative integer between 0 and 999999. | |
367 | ||
368 | ||
369 | ||
370 | The "type" value is a mime-type string describing the format of the | |
371 | base64-encoded binary data. The terminal must support at minimum these | |
372 | mime-types: | |
373 | ||
374 | | Type String | Description | | |
375 | |---------------|--------------------------------------------------------------| | |
376 | | "image/rgb" | Big-endian-encoded 24-bit red, green, blue values | | |
377 | | "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values | | |
378 | | "image/png" | PNG file data as described by (reference to PNG format) | | |
379 | ||
380 | A terminal may support additional types. An application can detect | |
381 | terminal support for a format by: | |
382 | ||
383 | 1. Store image in cache. | |
384 | ||
385 | 2. Attempt to draw image, with "scroll" set to 1. | |
386 | ||
387 | 3. Check cursor position DSR 6. | |
388 | ||
389 | 4. If cursor has moved, then the terminal supports this image type. | |
390 | ||
391 | ||
392 | ||
393 | Cached Images - CDISPLAY | |
394 | ------------------------ | |
395 | ||
396 | For the CDISPLAY command: | |
397 | ||
398 | * Pi - a non-negative integer ID that was used in a previous CSTORE | |
399 | command. | |
400 | ||
401 | * The {args} is a set of key-value pairs (each pair separated by | |
402 | semicolon (';')). Unlike CSTORE, no colon or base-64 encoded data | |
403 | follows: the image data comes from the cache. | |
404 | ||
405 | * A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z', | |
406 | 'a' - 'z'). | |
407 | ||
408 | * A value is any printable ASCII string not containing whitespace, | |
409 | colon, or semicolon ('!' - '9', '<' - '~'). | |
410 | ||
411 | * Any alpha-numeric key may be specified. A key that is not supported | |
412 | by the terminal is ignored without error. | |
413 | ||
414 | * The image pixels are processed as shown below. | |
415 | ||
416 | - The pixels are drawn starting at the upper-left corner of the text | |
417 | cursor position. | |
418 | ||
419 | - If scroll is specified as 1 (enabled), then: | |
420 | ||
421 | a. The screen is scrolled up if the image overflows into the | |
422 | bottom text row. | |
423 | ||
424 | b. The cursor's final position is on the same column as the | |
425 | starting cursor position, and on the row immediately below the | |
426 | image. | |
427 | ||
428 | - If scroll is omitted or specified as 0 (disabled), then: | |
429 | ||
430 | a. The screen is never scrolled. | |
431 | ||
432 | b. Pixels that would be drawn below the visible region on screen | |
433 | are discarded. | |
434 | ||
435 | c. The cursor's final position is at the same column and row as | |
436 | the starting cursor position, i.e. the cursor does not move at | |
437 | all. | |
438 | ||
439 | - Pixels that would be drawn to the right of the visible region on | |
440 | screen are discarded. | |
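The scroll and cursor rules above can be expressed as a small calculation over rows. This is one reading of the text, not a definitive one: it assumes that with scroll enabled the screen scrolls just far enough for the row below the image to be on screen. The function name is hypothetical and rows are 0-based.

```python
def place_image(screen_rows, cursor_row, image_rows, scroll):
    """Return (lines_scrolled, visible_image_rows, final_cursor_row)."""
    if scroll:
        below = cursor_row + image_rows            # row just under the image
        scrolled = max(0, below - (screen_rows - 1))
        # The final cursor row needs one line, so at most screen_rows - 1
        # image rows can remain visible after scrolling.
        visible = min(image_rows, screen_rows - 1)
        return scrolled, visible, below - scrolled
    # No scrolling: rows below the visible region are discarded and the
    # cursor does not move.
    visible = min(image_rows, screen_rows - cursor_row)
    return 0, visible, cursor_row
```

For a 24-row screen with the cursor on row 20 and a 5-row image, scroll=1 scrolls 2 lines and leaves the cursor on row 23, while scroll=0 shows only 4 rows of the image and leaves the cursor on row 20.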
441 | ||
442 | ||
443 | ||
444 | The keys for the key-value pairs that must be supported by the | |
445 | terminal are listed below: | |
446 | ||
447 | | Key | Default Value | Description | | |
448 | |--------------|---------------|----------------------------------------------| | |
449 | | id | 0 | ID to refer to the image | | |
450 | | width | 1 | Number of Cells wide to display in | |
451 | | height | 1 | Number of Cells high to display in | |
452 | | scale | "none" | Scale/zoom option, see below | | |
453 | | sourceX | 0 | Media source X position to display | | |
454 | | sourceY | 0 | Media source Y position to display | | |
455 | | sourceWidth | "auto" | Media width in pixels to display | | |
456 | | sourceHeight | "auto" | Media height in pixels to display | | |
457 | | scroll | 0 | If 1, scroll the display if needed | | |
458 | ||
459 | A terminal may support additional keys. If a key is specified but not | |
460 | supported by the terminal, then it is ignored without error. | |
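To show how one CSTORE can back several CDISPLAY commands, here is a sketch of both sequences from the application side. The helper names are hypothetical, the `key=value` pair syntax is assumed as in iTerm2, and the terminators follow the sequence table (BEL for CSTORE, ST for CDISPLAY).

```python
import base64

OSC = "\x1b]"   # ESC ]
ST = "\x1b\\"   # ESC \
BEL = "\x07"
MAX_ID = 999999


def _check_id(image_id):
    if not 0 <= image_id <= MAX_ID:
        raise ValueError("id must be between 0 and 999999")


def cstore(image_id, data, mime="image/png"):
    """Build a CSTORE (OSC 1340) sequence that caches data under image_id."""
    _check_id(image_id)
    payload = base64.b64encode(data).decode("ascii")
    return "%s1340;File=id=%d;type=%s:%s%s" % (OSC, image_id, mime, payload, BEL)


def cdisplay(image_id, **args):
    """Build a CDISPLAY (OSC 1341) sequence for a previously stored id."""
    _check_id(image_id)
    pairs = ";".join("%s=%s" % (k, v) for k, v in args.items())
    return "%s1341;%d;%s%s" % (OSC, image_id, pairs, ST)
```

An application could emit `cstore(7, png_bytes)` once and then `cdisplay(7, width=4, height=2)` at several cursor positions, sending the image data only a single time.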
461 | ||
462 | ||
463 | ||
464 | The "width" and "height" values are positive integers describing the | |
465 | number of Cells the image will be placed in. | |
466 | ||
467 | ||
468 | ||
469 | The "scale" value can take the following values: | |
470 | ||
471 | | Value | Meaning | | |
472 | |------------|---------------------------------------------------------------| | |
473 | | "none" | No scaling along either axis. | | |
474 | | "scale" | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping | | |
475 | | "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area | | |
476 | | "crop" | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit | |
477 | ||
478 | ||
479 | ||
480 | "sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the | |
481 | rectangle of pixels from the media that will be displayed on the | |
482 | screen. The ranges for these values are shown below: | |
483 | ||
484 | | Key | Minimum Value | Maximum Value | Default Value | | |
485 | |--------------|---------------|-------------------------------|---------------| | |
486 | | sourceX | 0 | Media's full width - 1 | 0 | | |
487 | | sourceY | 0 | Media's full height - 1 | 0 | | |
488 | | sourceWidth | 1 | Media's full width - sourceX | "auto" | | |
489 | | sourceHeight | 1 | Media's full height - sourceY | "auto" | | |
490 | ||
491 | If any of these values are specified and outside the range, no image | |
492 | is displayed, and the cursor does not move. "sourceWidth" and | |
493 | "sourceHeight" can be "auto", which means use the maximum available | |
494 | width/height (given sourceX/sourceY) from the media's inherent | |
495 | dimensions. |