Commit | Line | Data |
---|---|---|
9a1dfe2b KL |
1 | Terminal Emulator Images Standard - Proposed Design - Simplified |
2 | ================================================================ | |
3 | ||
4 | Version: 1 | |
5 | ||
6 | ||
7 | ||
8 | Purpose | |
9 | ------- | |
10 | ||
11 | See the [original proposal](images.md) for purpose, design goals, and | |
12 | definitions. | |
13 | ||
14 | This document is an updated proposal to address feedback on the first | |
15 | proposal, which included: "overengineered", "hopelessly | |
466f45f3 KL |
16 | overengineered", and "unnecessarily complex." I perceive this |
17 | feedback as a positive: it is far easier to imagine a feature and | |
18 | remove it, than to fail to picture it and need to shoehorn it in | |
19 | later. | |
9a1dfe2b | 20 | |
9a1dfe2b KL |
21 | The original proposal was a superset of every image format referenced, |
22 | and generalized beyond to multimedia. This proposal is sharply | |
23 | reduced from that to: "put this pixel rectangle from the image, into | |
24 | that cell-based rectangle with specific scaling policy". It is mostly | |
466f45f3 KL |
25 | a subset of the iTerm2 protocol, with: |
26 | ||
27 | * Specifications for what happens to the cursor. | |
28 | ||
29 | * More precise definitions of the "preserveAspectRatio" equivalent | |
30 | options. | |
31 | ||
32 | * Explicit restriction to a Cell-based target region. | |
33 | ||
34 | * Definition that pixels not covered by image are set to the current | |
35 | background color. | |
9a1dfe2b KL |
36 | |
37 | ||
38 | ||
39 | Tradeoffs | |
40 | --------- | |
41 | ||
42 | Simplifying the original proposal will significantly reduce | |
43 | complexity, but also eliminates features. The major tradeoffs offered | |
44 | in this revised proposal are: | |
45 | ||
46 | 1. Elimination of the layers feature, and with it the ability to place | |
47 | images behind text. In this proposal, a Cell on the screen will | |
48 | show either a (part of a) visible image, or a (part of a) text | |
49 | glyph, but never both. | |
50 | ||
51 | 2. Elimination of the "url" option, and with it the ability for an | |
52 | application to specify a filename or other method for the terminal | |
53 | to find the file data on the local machine. Image data must always | |
54 | be passed inline with the sequences. | |
55 | ||
56 | 3. Elimination of response codes, and with it: | |
57 | ||
58 | - The ability for multiplexers to blindly pass on the sequences to | |
466f45f3 KL |
59 | their host terminal (because unique IDs are not generated by the |
60 | terminal). | |
9a1dfe2b KL |
61 | |
62 | - The ability for applications to reliably detect success or | |
63 | failure of image display operations. | |
64 | ||
65 | 4. Elimination of pixel-oriented image placement operations, and with | |
66 | it the ability of applications to pass on image calculations to the | |
67 | terminal. An application which requires pixel-perfect rendering | |
68 |    must generate the pixels it needs, aligned so that they display at | |
69 | the top-left corner of the text Cell rectangle. | |
70 | ||
71 | ||
72 | ||
73 | Summary | |
74 | ------- | |
75 | ||
76 | This revised document proposes two independent new features: | |
77 | ||
78 | 1. A method to transfer image data for immediate display within the | |
79 | screen Cell grid ("Direct Images"). | |
80 | ||
81 | 2. A method to transfer image data to a terminal-managed cache, and | |
82 | later display that data within the screen Cell grid ("Cached | |
83 | Images"). | |
84 | ||
85 | The only difference between the first and second feature is the | |
86 | presence of an ID key. Direct images do not use an ID key, while | |
87 | cached images use a store operation with ID key followed by one or | |
88 | more display operations with ID key. | |
89 | ||
90 | Images are applied to text Cells, and once set are handled the same way | |
91 | text Cells are handled: erasing a line erases the image Cells on that | |
92 | line, inserting a character will shift image Cells on that row over, | |
93 | scrolling will shift the image up, and so on. Therefore, terminals | |
94 | will need to be prepared for the scenario that every Cell on the | |
95 | display is a separate image, with a separate display scaling option | |
96 | that will need to be re-applied automatically if font metrics change. | |
97 | ||
98 | ||
99 | ||
100 | All Features - Detection | |
101 | ------------------------ | |
102 | ||
103 | Applications can detect support for these features using Primary | |
104 | Device Attributes (DA) and DECID (ESC Z, or 0x9A). | |
105 | ||
106 | Terminals that support this standard will respond with additional | |
107 | parameter(s): "224" for direct images and "225" for cached images. A | |
108 | recap of the parameters xterm supports is listed below, with these new | |
109 | feature responses included: | |
110 | ||
111 | | VT220 (and higher) Response | Description | | |
112 | |-----------------------------|--------------------------------------------| | |
113 | | 1 | 132-columns | | |
114 | | 2 | Printer | | |
115 | | 3 | ReGIS graphics | | |
116 | | 4 | Sixel graphics | | |
117 | | 6 | Selective erase | | |
118 | | 8 | User-defined keys | | |
119 | | 9 | National Replacement Character sets | | |
120 | | 1 5 | Technical characters | | |
121 | | 1 6 | Locator port | | |
122 | | 1 7 | Terminal state interrogation | | |
123 | | 1 8 | User windows | | |
124 | | 2 1 | Horizontal scrolling | | |
125 | | 2 2 | ANSI color, e.g., VT525 | | |
126 | | 2 8 | Rectangular editing | | |
127 | | 2 9 | ANSI text locator (i.e., DEC Locator mode) | | |
128 | | 2 2 4 | Direct Images Version 1 | | |
129 | | 2 2 5 | Cached Images Version 1 | | |
130 | ||
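As a minimal sketch of the detection step above, the following Python parses a Primary DA reply and checks for the two new parameters. It assumes the reply has the usual xterm shape `CSI ? Ps ; Ps ; ... c`, where the first parameter is the terminal class and the rest are feature codes. The function name `parse_da_features` is hypothetical, and reading the reply from the terminal is left out.

```python
import re

DIRECT_IMAGES = 224  # feature code proposed in this document
CACHED_IMAGES = 225  # feature code proposed in this document


def parse_da_features(response: str) -> set[int]:
    """Return the feature codes of a DA reply such as ESC [ ? 62 ; 4 ; 224 c.

    The first parameter (terminal class) is skipped; an unrecognized
    reply yields an empty set.
    """
    match = re.fullmatch(r"\x1b\[\?([0-9;]*)c", response)
    if match is None:
        return set()
    params = [int(p) for p in match.group(1).split(";") if p]
    return set(params[1:])


features = parse_da_features("\x1b[?62;4;22;224;225c")
supports_direct = DIRECT_IMAGES in features
supports_cached = CACHED_IMAGES in features
```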
131 | ||
132 | ||
133 | Direct Images - Summary | |
134 | ----------------------- | |
135 | ||
136 | Non-text data (images) can be sent to the terminal for immediate | |
137 | display in a rectangular region of text Cells. Image data is | |
138 | transmitted to the terminal using a wire format described later in | |
139 | this document. | |
140 | ||
141 | Setting a Cell to image is a destructive operation: the Cell's | |
142 | original text is lost. Similarly, setting a Cell (or multiple Cells | |
143 | for fullwidth glyphs or grapheme clusters) to text is a destructive | |
144 | operation: the image in the Cell(s) is lost. | |
145 | ||
146 | Setting any part of a multi-Cell Tile to image also "breaks up" the | |
147 | Tile into a range of single Cells. In other words, image data can | |
148 | only be carried by a Cell, not a Tile. | |
149 | ||
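The destructive behaviour described above can be pictured with a small Python sketch. The `Cell` class and its fields are hypothetical and only illustrate that a Cell holds either text or image pixels, never both; Tile handling is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """One screen Cell: holds text or image pixels, never both."""
    text: Optional[str] = " "
    image: Optional[bytes] = None

    def set_image(self, pixels: bytes) -> None:
        # Setting an image discards the Cell's text.
        self.image = pixels
        self.text = None

    def set_text(self, text: str) -> None:
        # Setting text discards the Cell's image.
        self.text = text
        self.image = None


cell = Cell(text="A")
cell.set_image(b"\xff\x00\x00")
```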
150 | ||
151 | ||
152 | Direct Images - New Sequences | |
153 | ----------------------------- | |
154 | ||
155 | A terminal with direct images feature must support the following new | |
156 | sequences: | |
157 | ||
158 | | Sequence | Description | | |
159 | |--------------------------------------|-------------------------| | |
160 | | OSC 1 3 3 8 ; F i l e = {args} : {data} BEL | Display image at (x, y) | | |
161 | | OSC 1 3 3 8 ; F i l e = {args} : {data} ST | Display image at (x, y) | | |
162 | ||
163 | ||
164 | ||
165 | For the OSC 1 3 3 8 sequence: | |
166 | ||
167 | * The {args} is a set of key-value pairs (each pair separated by | |
168 | semicolon (';')), followed by a colon (':'), followed by a base-64 | |
169 | encoded string ({data}). | |
170 | ||
171 | * A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z', | |
172 | 'a' - 'z'). | |
173 | ||
174 | * A value is any printable ASCII string not containing whitespace, | |
175 | colon, or semicolon ('!' - '9', '<' - '~'). | |
176 | ||
177 | * Any alpha-numeric key may be specified. A key that is not supported | |
178 | by the terminal is ignored without error. | |
179 | ||
180 | * The image is processed as shown below: | |
181 | ||
182 | - The pixels are drawn starting at the upper-left corner of the text | |
183 | cursor position. | |
184 | ||
466f45f3 KL |
185 | - All pixels in the target Cell rectangle that are not covered by |
186 |     the image itself are set to the current background color (like | |
187 | sixel raster attributes). | |
188 | ||
9a1dfe2b KL |
189 | - If scroll is specified as 1 (enabled), then: |
190 | ||
191 | a. The screen is scrolled up if the image overflows into the | |
192 | bottom text row. | |
193 | ||
194 | b. The cursor's final position is on the same column as the | |
195 | starting cursor position, and on the row immediately below the | |
196 | image. | |
197 | ||
198 | - If scroll is omitted or specified as 0 (disabled), then: | |
199 | ||
200 | a. The screen is never scrolled. | |
201 | ||
202 | b. Pixels that would be drawn below the visible region on screen | |
203 | are discarded. | |
204 | ||
205 | c. The cursor's final position is at the same column and row as | |
206 | the starting cursor position, i.e. the cursor does not move at | |
207 | all. | |
208 | ||
209 | - Pixels that would be drawn to the right of the visible region on | |
210 | screen are discarded. | |
211 | ||
466f45f3 KL |
212 | - If scale is "none", then pixels that would be drawn outside the |
213 | target Cell rectangle are discarded. | |
214 | ||
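The cursor rules above might be sketched as follows. This is one possible reading, not a definitive one: rows are 0-based, and with scroll enabled the screen is assumed to scroll just far enough that the row below the image stays on screen. The function name `final_cursor` and its parameters are hypothetical.

```python
def final_cursor(row: int, col: int, height: int, screen_rows: int,
                 scroll: int = 0) -> tuple[int, int, int]:
    """Return (final_row, final_col, lines_scrolled) after drawing an image.

    row/col are the 0-based starting cursor position, height is the
    image height in Cell rows.
    """
    if scroll == 1:
        below = row + height                       # row just under the image
        scrolled = max(0, below - (screen_rows - 1))
        return below - scrolled, col, scrolled
    # scroll omitted or 0: the cursor does not move and nothing scrolls;
    # rows past the bottom of the screen are simply not drawn.
    return row, col, 0
```

For a 24-row screen, an image 3 rows high drawn from row 22 with scroll enabled would, under this reading, scroll the screen by 2 lines and leave the cursor on the last row.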
9a1dfe2b KL |
215 | |
216 | ||
217 | The keys for the key-value pairs that must be supported by the | |
218 | terminal are listed below: | |
219 | ||
a945b890 KL |
220 | | Key | Default Value | Description | |
221 | |--------------|---------------|---------------------------------------| | |
222 | | type | "image/rgb" | mime-type describing data field | | |
223 | | width | 1 | Number of Cell columns to display in | | |
224 | | height       | 1             | Number of Cell rows to display in     | |
225 | | scale | "none" | Scale/zoom option, see below | | |
226 | | sourceX | 0 | Media source X position to display | | |
227 | | sourceY | 0 | Media source Y position to display | | |
228 | | sourceWidth | "auto" | Media width in pixels to display | | |
229 | | sourceHeight | "auto" | Media height in pixels to display | | |
230 | | scroll       | 0             | If 1, scroll the display if needed    | |
9a1dfe2b KL |
231 | |
232 | A terminal may support additional keys. If a key is specified but not | |
233 | supported by the terminal, then it is ignored without error. | |
234 | ||
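To make the sequence layout concrete, here is a minimal Python sketch that assembles an OSC 1338 sequence from the key-value rules above, terminated with ST. The function name `direct_image_sequence` is hypothetical; the key and value checks follow the character ranges given earlier in this section.

```python
import base64


def direct_image_sequence(data: bytes, **args) -> str:
    """Build OSC 1338 ; File= {args} : {data} ST for the given image bytes."""
    pairs = []
    for key, value in args.items():
        text = str(value)
        # Keys: ASCII alpha-numeric only.
        if not (key.isascii() and key.isalnum()):
            raise ValueError(f"invalid key: {key!r}")
        # Values: printable ASCII without whitespace, colon, or semicolon.
        if not text or any(not "!" <= c <= "~" or c in ":;" for c in text):
            raise ValueError(f"invalid value: {text!r}")
        pairs.append(f"{key}={text}")
    payload = base64.b64encode(data).decode("ascii")
    return "\x1b]1338;File=" + ";".join(pairs) + ":" + payload + "\x1b\\"


seq = direct_image_sequence(b"\x00", type="image/png", width=2)
```

Keys that are left out simply take the defaults from the table above.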
235 | ||
236 | ||
237 | The "type" value is a mime-type string describing the format of the | |
238 | base64-encoded binary data. The terminal must support at minimum these | |
239 | mime-types: | |
240 | ||
241 | | Type String | Description | | |
242 | |---------------|--------------------------------------------------------------| | |
243 | | "image/rgb" | Big-endian-encoded 24-bit red, green, blue values | | |
244 | | "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values | | |
245 | | "image/png" | PNG file data as described by (reference to PNG format) | | |
246 | ||
247 | A terminal may support additional types. An application can detect | |
248 | terminal support for a format by: | |
249 | ||
250 | 1. Attempt to draw image, with "scroll" set to 1. | |
251 | ||
252 | 2. Check cursor position DSR 6. | |
253 | ||
254 | 3. If cursor has moved, then the terminal supports this image type. | |
255 | ||
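The probe above could be sketched in Python as below. It assumes the standard cursor position report `CSI row ; col R` in reply to DSR 6 (`CSI 6 n`); the helper names are hypothetical and terminal I/O is left out. One caveat under this sketch's assumptions: if the probe starts on the bottom row, the screen may scroll and the reported row may not change, so probing from a higher row is safer.

```python
import re
from typing import Optional, Tuple

DSR_CURSOR_POSITION = "\x1b[6n"  # request sent before and after the draw


def parse_cursor_report(response: str) -> Optional[Tuple[int, int]]:
    """Parse a cursor position report ESC [ row ; col R into (row, col)."""
    match = re.fullmatch(r"\x1b\[(\d+);(\d+)R", response)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def image_type_supported(before: str, after: str) -> bool:
    """True if the cursor moved between the two reports."""
    start, end = parse_cursor_report(before), parse_cursor_report(after)
    return start is not None and end is not None and start != end
```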
256 | ||
257 | ||
258 | The "width" and "height" values are positive integers describing the | |
259 | number of Cells the image will be placed in. | |
260 | ||
261 | ||
262 | ||
263 | The "scale" value can take the following values: | |
264 | ||
265 | | Value | Meaning | | |
266 | |------------|---------------------------------------------------------------| | |
267 | | "none" | No scaling along either axis. | | |
268 | | "scale" | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping | | |
269 | | "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area | | |
270 | | "crop"     | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit | |
271 | ||
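The four scale options can be expressed as a short calculation. In this sketch the target area is given in pixels (Cell count multiplied by the Cell's pixel size, which is an assumption about how a terminal would derive it), and the function name `scaled_size` is hypothetical. It returns the size of the image after scaling; any part larger than the target area would then be cropped.

```python
def scaled_size(src_w: int, src_h: int, dst_w: int, dst_h: int,
                scale: str = "none") -> tuple[int, int]:
    """Return the (width, height) in pixels of the image after scaling."""
    if scale == "none":
        return src_w, src_h                      # drawn 1:1, excess discarded
    if scale == "stretch":
        return dst_w, dst_h                      # aspect ratio not preserved
    if scale == "scale":
        factor = min(dst_w / src_w, dst_h / src_h)   # largest size that fits
    elif scale == "crop":
        factor = max(dst_w / src_w, dst_h / src_h)   # smallest size that fills
    else:
        raise ValueError(f"unknown scale option: {scale!r}")
    return round(src_w * factor), round(src_h * factor)
```

For a 100x50 source and a 200x200 target, "scale" gives 200x100 and "crop" gives 400x200, of which only the 200x200 target area is shown.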
272 | ||
273 | ||
274 | "sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the | |
275 | rectangle of pixels from the media that will be displayed on the | |
276 | screen.  The ranges for these values are shown below: | |
277 | ||
278 | | Key | Minimum Value | Maximum Value | Default Value | | |
279 | |--------------|---------------|-------------------------------|---------------| | |
280 | | sourceX | 0 | Media's full width - 1 | 0 | | |
281 | | sourceY | 0 | Media's full height - 1 | 0 | | |
282 | | sourceWidth | 1 | Media's full width - sourceX | "auto" | | |
283 | | sourceHeight | 1 | Media's full height - sourceY | "auto" | | |
284 | ||
285 | If any of these values are specified and outside the range, no image | |
286 | is displayed, and the cursor does not move. "sourceWidth" and | |
287 | "sourceHeight" can be "auto", which means use the maximum available | |
288 | width/height (given sourceX/sourceY) from the media's inherent | |
289 | dimensions. | |
290 | ||
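The range table above translates directly into a validation step. The following is a sketch with a hypothetical function name; it returns `None` where the text says no image is displayed.

```python
def resolve_source_rect(media_w: int, media_h: int, x: int = 0, y: int = 0,
                        w="auto", h="auto"):
    """Return (x, y, width, height) of the source rectangle, or None if invalid."""
    if not (0 <= x <= media_w - 1 and 0 <= y <= media_h - 1):
        return None
    # "auto" means the maximum available extent from (x, y).
    if w == "auto":
        w = media_w - x
    if h == "auto":
        h = media_h - y
    if not (1 <= w <= media_w - x and 1 <= h <= media_h - y):
        return None
    return x, y, w, h
```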
291 | ||
292 | ||
293 | Cached Images - Summary | |
294 | ----------------------- | |
295 | ||
296 | Non-text data (image) can be sent to the terminal for later display in | |
297 | a rectangular region of text Cells. Image data is transmitted to the | |
298 | terminal using the CSTORE command described below, and displayed on | |
299 | screen using the CDISPLAY command. A single CSTORE command can | |
300 | support many CDISPLAY commands. | |
301 | ||
302 | Upon display, setting a Cell to image is a destructive operation: the | |
303 | Cell's original text is lost. Similarly, setting a Cell (or multiple | |
304 | Cells for fullwidth glyphs or grapheme clusters) to text is a | |
305 | destructive operation: the image in the Cell(s) is lost. | |
306 | ||
307 | Setting any part of a multi-Cell Tile to image also "breaks up" the | |
308 | Tile into a range of single Cells. In other words, image data can | |
309 | only be carried by a Cell, not a Tile. | |
310 | ||
311 | ||
312 | ||
313 | Cached Images - Cache/Memory Management | |
314 | --------------------------------------- | |
315 | ||
316 | The terminal manages a cache of multimedia data on behalf of the | |
317 | application. The application requests media be stored in the cache | |
318 | and provides an ID. This ID is later used to request display on the | |
319 | screen. | |
320 | ||
321 | The amount of memory and retention/eviction strategy for the cache is | |
322 | wholly managed by the terminal, with the following restrictions: | |
323 | ||
324 | * The terminal may not remove items from the cache that have any | |
325 | portion being actively displayed on the primary or alternate | |
326 | screens. | |
327 | ||
328 | The scrollback buffer is permitted, and recommended, to contain only a | |
329 | few (or zero) multimedia images. Terminals should consider retaining | |
330 | only the last 2-5 screens' worth of pixel data in the scrollback | |
331 | buffer. | |
332 | ||
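One way a terminal might honour the restriction above is a least-recently-stored cache that skips any image still on screen when evicting. This is only a sketch: the `ImageCache` class, its item-count capacity, and the `displayed` set (which the terminal would maintain as Cells are drawn and erased) are all hypothetical.

```python
from collections import OrderedDict


class ImageCache:
    """Cache that never evicts images currently shown on screen."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[int, bytes]" = OrderedDict()
        self.displayed: set = set()   # IDs with any portion on screen

    def store(self, image_id: int, data: bytes) -> None:
        self._items[image_id] = data
        self._items.move_to_end(image_id)
        self._evict()

    def get(self, image_id: int):
        return self._items.get(image_id)

    def _evict(self) -> None:
        # Walk oldest-first, dropping undisplayed images until within capacity.
        for image_id in list(self._items):
            if len(self._items) <= self.capacity:
                break
            if image_id not in self.displayed:
                del self._items[image_id]


cache = ImageCache(capacity=2)
cache.store(1, b"a")
cache.store(2, b"b")
cache.displayed.add(1)
cache.store(3, b"c")   # evicts 2, because 1 is still displayed
```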
333 | Applications have no control over when images are removed from the | |
334 | cache, and no provision is made to generate/ensure unique IDs. | |
335 | ||
336 | A terminal multiplexer that passes all CSTORE/CDISPLAY commands to the | |
337 | host terminal will need to parse the CSTORE and CDISPLAY sequences for | |
338 | the "id" field and rewrite it to be unique for all of its inner | |
339 | terminals. | |
340 | ||
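The rewriting a multiplexer needs could look like the sketch below: each (inner terminal, inner ID) pair is mapped to a host ID that is unique across all inner terminals. The `IdRewriter` class is hypothetical, and it assumes host IDs stay within the 0 to 999999 range this document gives for "id"; reclaiming IDs is not handled.

```python
class IdRewriter:
    """Map per-terminal image IDs to IDs unique on the host terminal."""

    MAX_ID = 999999

    def __init__(self) -> None:
        self._host_ids: dict = {}
        self._next = 0

    def host_id(self, terminal: str, inner_id: int) -> int:
        key = (terminal, inner_id)
        if key not in self._host_ids:
            if self._next > self.MAX_ID:
                raise RuntimeError("host ID space exhausted")
            self._host_ids[key] = self._next
            self._next += 1
        return self._host_ids[key]


rewriter = IdRewriter()
a = rewriter.host_id("pane-1", 7)
b = rewriter.host_id("pane-2", 7)   # same inner ID, different pane
```

The same mapping must be applied to both the CSTORE and the later CDISPLAY sequences so that they still refer to the same image.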
341 | ||
342 | ||
343 | Cached Images - New Sequences | |
344 | ----------------------------- | |
345 | ||
346 | A terminal with cached images feature must support the following new | |
347 | sequences: | |
348 | ||
15383c0a KL |
349 | | Sequence | Command | Description | |
350 | |--------------------------------------|-----------|--------------------------| | |
351 | | OSC 1 3 4 0 ; F i l e = {args} : {data} BEL | CSTORE | Store image in cache | | |
352 | | OSC 1 3 4 0 ; F i l e = {args} : {data} ST | CSTORE | Store image in cache | | |
353 | | OSC 1 3 4 1 ; Pi ; {args} BEL | CDISPLAY | Display image at (x, y) | | |
354 | | OSC 1 3 4 1 ; Pi ; {args} ST | CDISPLAY | Display image at (x, y) | | |
9a1dfe2b KL |
355 | |
356 | ||
357 | ||
358 | Cached Images - CSTORE | |
359 | ---------------------- | |
360 | ||
361 | For the CSTORE command: | |
362 | ||
363 | * The {args} is a set of key-value pairs (each pair separated by | |
364 | semicolon (';')), followed by a colon (':'), followed by a base-64 | |
365 | encoded string ({data}). | |
366 | ||
367 | * A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z', | |
368 | 'a' - 'z'). | |
369 | ||
370 | * A value is any printable ASCII string not containing whitespace, | |
371 | colon, or semicolon ('!' - '9', '<' - '~'). | |
372 | ||
373 | ||
374 | ||
375 | The keys for the key-value pairs that must be supported by the | |
376 | terminal are listed below: | |
377 | ||
378 | | Key | Default Value | Description | | |
379 | |--------------|---------------|----------------------------------------------| | |
380 | | id | 0 | ID to refer to the image | | |
381 | | type | "image/rgb" | mime-type describing data field | | |
382 | ||
383 | ||
384 | ||
385 | The "id" value is a non-negative integer between 0 and 999999. | |
386 | ||
387 | ||
388 | ||
389 | The "type" value is a mime-type string describing the format of the | |
390 | base64-encoded binary data.  The terminal must support at minimum these | |
391 | mime-types: | |
392 | ||
393 | | Type String | Description | | |
394 | |---------------|--------------------------------------------------------------| | |
395 | | "image/rgb" | Big-endian-encoded 24-bit red, green, blue values | | |
396 | | "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values | | |
397 | | "image/png" | PNG file data as described by (reference to PNG format) | | |
398 | ||
399 | A terminal may support additional types. An application can detect | |
400 | terminal support for a format by: | |
401 | ||
402 | 1. Store image in cache. | |
403 | ||
404 | 2. Attempt to draw image, with "scroll" set to 1. | |
405 | ||
406 | 3. Check cursor position DSR 6. | |
407 | ||
408 | 4. If cursor has moved, then the terminal supports this image type. | |
409 | ||
410 | ||
411 | ||
412 | Cached Images - CDISPLAY | |
413 | ------------------------ | |
414 | ||
415 | For the CDISPLAY command: | |
416 | ||
417 | * Pi - a non-negative integer ID that was used in a previous CSTORE | |
418 | command. | |
419 | ||
420 | * The {args} is a set of key-value pairs (each pair separated by | |
421 |   semicolon (';')).  Unlike CSTORE, the CDISPLAY sequence carries no | |
422 |   base-64 encoded data. | |
423 | ||
424 | * A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z', | |
425 | 'a' - 'z'). | |
426 | ||
427 | * A value is any printable ASCII string not containing whitespace, | |
428 | colon, or semicolon ('!' - '9', '<' - '~'). | |
429 | ||
430 | * Any alpha-numeric key may be specified. A key that is not supported | |
431 | by the terminal is ignored without error. | |
432 | ||
433 | * The image pixels are processed as shown below. | |
434 | ||
435 |   - The pixels are drawn starting at the upper-left corner of the text | |
436 | cursor position. | |
437 | ||
438 | - If scroll is specified as 1 (enabled), then: | |
439 | ||
440 | a. The screen is scrolled up if the image overflows into the | |
441 | bottom text row. | |
442 | ||
443 | b. The cursor's final position is on the same column as the | |
444 | starting cursor position, and on the row immediately below the | |
445 | image. | |
446 | ||
447 | - If scroll is omitted or specified as 0 (disabled), then: | |
448 | ||
449 | a. The screen is never scrolled. | |
450 | ||
451 | b. Pixels that would be drawn below the visible region on screen | |
452 | are discarded. | |
453 | ||
454 | c. The cursor's final position is at the same column and row as | |
455 | the starting cursor position, i.e. the cursor does not move at | |
456 | all. | |
457 | ||
458 | - Pixels that would be drawn to the right of the visible region on | |
459 | screen are discarded. | |
460 | ||
461 | ||
462 | ||
463 | The keys for the key-value pairs that must be supported by the | |
464 | terminal are listed below: | |
465 | ||
a945b890 KL |
466 | | Key | Default Value | Description | |
467 | |--------------|---------------|---------------------------------------| | |
468 | | id | 0 | ID to refer to the image | | |
469 | | width | 1 | Number of Cell columns to display in | | |
470 | | height       | 1             | Number of Cell rows to display in     | |
471 | | scale | "none" | Scale/zoom option, see below | | |
472 | | sourceX | 0 | Media source X position to display | | |
473 | | sourceY | 0 | Media source Y position to display | | |
474 | | sourceWidth | "auto" | Media width in pixels to display | | |
475 | | sourceHeight | "auto" | Media height in pixels to display | | |
476 | | scroll | 0 | If 1, scroll the display if needed | | |
9a1dfe2b KL |
477 | |
478 | A terminal may support additional keys. If a key is specified but not | |
479 | supported by the terminal, then it is ignored without error. | |
480 | ||
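A store followed by a display might be assembled as in this Python sketch. It follows the sequence table in "Cached Images - New Sequences": CSTORE carries the ID as an "id" key plus the base-64 data, while CDISPLAY carries the ID as Pi followed by the key-value pairs and no data. The function names are hypothetical, and ST is used as the terminator.

```python
import base64

ST = "\x1b\\"


def cstore(image_id: int, data: bytes, mime: str = "image/png") -> str:
    """Build OSC 1340 ; File= id=..;type=.. : {data} ST."""
    payload = base64.b64encode(data).decode("ascii")
    return f"\x1b]1340;File=id={image_id};type={mime}:{payload}{ST}"


def cdisplay(image_id: int, **args) -> str:
    """Build OSC 1341 ; Pi ; {args} ST for a previously stored image."""
    pairs = ";".join(f"{key}={value}" for key, value in args.items())
    return f"\x1b]1341;{image_id};{pairs}{ST}"


store_seq = cstore(7, b"\x00")
show_seq = cdisplay(7, width=4, height=2, scale="scale")
```

One CSTORE can be followed by any number of CDISPLAY sequences with different arguments, for example to show different source rectangles of the same image.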
481 | ||
482 | ||
483 | The "width" and "height" values are positive integers describing the | |
484 | number of Cells the image will be placed in. | |
485 | ||
486 | ||
487 | ||
488 | The "scale" value can take the following values: | |
489 | ||
490 | | Value | Meaning | | |
491 | |------------|---------------------------------------------------------------| | |
492 | | "none" | No scaling along either axis. | | |
493 | | "scale" | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping | | |
494 | | "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area | | |
495 | | "crop"     | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit | |
496 | ||
497 | ||
498 | ||
499 | "sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the | |
500 | rectangle of pixels from the media that will be displayed on the | |
501 | screen.  The ranges for these values are shown below: | |
502 | ||
503 | | Key | Minimum Value | Maximum Value | Default Value | | |
504 | |--------------|---------------|-------------------------------|---------------| | |
505 | | sourceX | 0 | Media's full width - 1 | 0 | | |
506 | | sourceY | 0 | Media's full height - 1 | 0 | | |
507 | | sourceWidth | 1 | Media's full width - sourceX | "auto" | | |
508 | | sourceHeight | 1 | Media's full height - sourceY | "auto" | | |
509 | ||
510 | If any of these values are specified and outside the range, no image | |
511 | is displayed, and the cursor does not move. "sourceWidth" and | |
512 | "sourceHeight" can be "auto", which means use the maximum available | |
513 | width/height (given sourceX/sourceY) from the media's inherent | |
514 | dimensions. | |
15383c0a KL |
515 | |
516 | ||
517 | ||
518 | Miscellaneous Items | |
519 | ------------------- | |
520 | ||
521 | "image/rgb" and "image/rgba" also need width/height fields. Propose | |
522 | to specify them as 16-bit unsigned ints, followed by 24-bit or 32-bit | |
523 | data. If data is short, then the rest of the image is assumed to be | |
524 | current background color (like sixel raster attributes). |
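
The proposed "image/rgb" layout could be sketched as follows. Two details are assumptions rather than part of the proposal: the header is taken to be width then height, and both are encoded big-endian to match the pixel data. The function names are hypothetical.

```python
import struct


def encode_rgb(width: int, height: int, pixels: bytes) -> bytes:
    """Prefix raw 24-bit RGB pixel data with 16-bit width and height."""
    if len(pixels) > width * height * 3:
        raise ValueError("more pixel data than width * height allows")
    return struct.pack(">HH", width, height) + pixels


def decode_rgb(payload: bytes, background=(0, 0, 0)):
    """Return (width, height, pixels), padding short data with background."""
    width, height = struct.unpack(">HH", payload[:4])
    needed = width * height * 3
    body = payload[4:4 + needed]
    body = body[: len(body) - len(body) % 3]   # drop a trailing partial pixel
    missing = (needed - len(body)) // 3
    return width, height, body + bytes(background) * missing


payload = encode_rgb(2, 1, bytes([255, 0, 0]))        # one of two pixels sent
decoded = decode_rgb(payload, background=(1, 2, 3))   # second pixel is filled
```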