1Terminal Emulator Images Standard - Proposed Design - Simplified
2================================================================
3
4Version: 1
5
6
7
8Purpose
9-------
10
11See the [original proposal](images.md) for purpose, design goals, and
12definitions.
13
14This document is an updated proposal to address feedback on the first
15proposal, which included: "overengineered", "hopelessly
16overengineered", and "unnecessarily complex." I perceive this
17feedback as a positive: it is far easier to imagine a feature and
18remove it, than to fail to picture it and need to shoehorn it in
19later.
20
21The original proposal was a superset of every image format referenced,
22and generalized beyond to multimedia. This proposal is sharply
23reduced from that to: "put this pixel rectangle from the image, into
24that cell-based rectangle with specific scaling policy". It is mostly
25a subset of the iTerm2 protocol, with:
26
27* Specifications for what happens to the cursor.
28
29* More precise definitions of the "preserveAspectRatio" equivalent
30 options.
31
32* Explicit restriction to a Cell-based target region.
33
34* Definition that pixels not covered by image are set to the current
35 background color.
36
37
38
39Tradeoffs
40---------
41
42Simplifying the original proposal will significantly reduce
43complexity, but will also eliminate features. The major tradeoffs offered
44in this revised proposal are:
45
461. Elimination of the layers feature, and with it the ability to place
47 images behind text. In this proposal, a Cell on the screen will
48 show either a (part of a) visible image, or a (part of a) text
49 glyph, but never both.
50
512. Elimination of the "url" option, and with it the ability for an
52 application to specify a filename or other method for the terminal
53 to find the file data on the local machine. Image data must always
54 be passed inline with the sequences.
55
563. Elimination of response codes, and with it:
57
58 - The ability for multiplexers to blindly pass on the sequences to
59 their host terminal (because unique IDs are not generated by the
60 terminal).
61
62 - The ability for applications to reliably detect success or
63 failure of image display operations.
64
654. Elimination of pixel-oriented image placement operations, and with
66 it the ability of applications to pass on image calculations to the
67 terminal. An application which requires pixel-perfect rendering
68 must generate the pixels it needs, aligned so that they display at
69 the top-left corner of the text Cell rectangle.
70
71
72
73Summary
74-------
75
76This revised document proposes two independent new features:
77
781. A method to transfer image data for immediate display within the
79 screen Cell grid ("Direct Images").
80
812. A method to transfer image data to a terminal-managed cache, and
82 later display that data within the screen Cell grid ("Cached
83 Images").
84
85The only difference between the first and second feature is the
86presence of an ID key. Direct images do not use an ID key, while
87cached images use a store operation with ID key followed by one or
88more display operations with ID key.
89
90Images are applied to text Cells and, once set, are handled the same way
91text Cells are handled: erasing a line erases the image Cells on that
92line, inserting a character will shift image Cells on that row over,
93scrolling will shift the image up, and so on. Therefore, terminals
94will need to be prepared for the scenario that every Cell on the
95display is a separate image, with a separate display scaling option
96that will need to be re-applied automatically if font metrics change.
97
98
99
100All Features - Detection
101------------------------
102
103Applications can detect support for these features using Primary
104Device Attributes (DA) and DECID (ESC Z, or 0x9A).
105
106Terminals that support this standard will respond with additional
107parameter(s): "224" for direct images and "225" for cached images. A
108recap of the parameters xterm supports is listed below, with these new
109feature responses included:
110
111| VT220 (and higher) Response | Description |
112|-----------------------------|--------------------------------------------|
113| 1 | 132-columns |
114| 2 | Printer |
115| 3 | ReGIS graphics |
116| 4 | Sixel graphics |
117| 6 | Selective erase |
118| 8 | User-defined keys |
119| 9 | National Replacement Character sets |
120| 1 5 | Technical characters |
121| 1 6 | Locator port |
122| 1 7 | Terminal state interrogation |
123| 1 8 | User windows |
124| 2 1 | Horizontal scrolling |
125| 2 2 | ANSI color, e.g., VT525 |
126| 2 8 | Rectangular editing |
127| 2 9 | ANSI text locator (i.e., DEC Locator mode) |
128| 2 2 4 | Direct Images Version 1 |
129| 2 2 5 | Cached Images Version 1 |
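
The detection step above can be sketched in Python as follows. This is a minimal illustration, not part of the proposal: the helper names `parse_da_params` and `image_support` are hypothetical, and the sketch assumes the reply arrives as a complete Primary DA string of the form `CSI ? Ps ; Ps ; ... c`.

```python
import re

DIRECT_IMAGES = 224  # "Direct Images Version 1" parameter from the table above
CACHED_IMAGES = 225  # "Cached Images Version 1" parameter from the table above


def parse_da_params(response: str) -> set[int]:
    """Return the numeric parameters of a Primary DA reply, or an empty set."""
    # Accept both the 7-bit (ESC [) and 8-bit (0x9B) forms of CSI.
    match = re.fullmatch(r"(?:\x1b\[|\x9b)\?([0-9;]*)c", response)
    if match is None:
        return set()
    return {int(p) for p in match.group(1).split(";") if p}


def image_support(response: str) -> tuple[bool, bool]:
    """Return (direct_images_supported, cached_images_supported)."""
    params = parse_da_params(response)
    return (DIRECT_IMAGES in params, CACHED_IMAGES in params)
```

Reading the reply from the terminal (raw mode, timeouts) is left out; only the parsing is shown.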
130
131
132
133Direct Images - Summary
134-----------------------
135
136Non-text data (images) can be sent to the terminal for immediate
137display in a rectangular region of text Cells. Image data is
138transmitted to the terminal using a wire format described later in
139this document.
140
141Setting a Cell to image is a destructive operation: the Cell's
142original text is lost. Similarly, setting a Cell (or multiple Cells
143for fullwidth glyphs or grapheme clusters) to text is a destructive
144operation: the image in the Cell(s) is lost.
145
146Setting any part of a multi-Cell Tile to image also "breaks up" the
147Tile into a range of single Cells. In other words, image data can
148only be carried by a Cell, not a Tile.
149
150
151
152Direct Images - New Sequences
153-----------------------------
154
155A terminal with direct images feature must support the following new
156sequences:
157
158| Sequence | Description |
159|--------------------------------------|-------------------------|
160| OSC 1 3 3 8 ; F i l e = {args} : {data} BEL | Display image at (x, y) |
161| OSC 1 3 3 8 ; F i l e = {args} : {data} ST | Display image at (x, y) |
162
163
164
165For the OSC 1 3 3 8 sequence:
166
167* The {args} is a set of key-value pairs (each pair separated by
168 semicolon (';')), followed by a colon (':'), followed by a base-64
169 encoded string ({data}).
170
171* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
172 'a' - 'z').
173
174* A value is any printable ASCII string not containing whitespace,
175 colon, or semicolon ('!' - '9', '<' - '~').
176
177* Any alpha-numeric key may be specified. A key that is not supported
178 by the terminal is ignored without error.
179
180* The image is processed as shown below:
181
182 - The pixels are drawn starting at the upper-left corner of the text
183 cursor position.
184
185 - All pixels in the target Cell rectangle that are not covered by
186 the image itself are set to the current background color (like
187 sixel raster attributes).
188
189 - If scroll is specified as 1 (enabled), then:
190
191 a. The screen is scrolled up if the image overflows into the
192 bottom text row.
193
194 b. The cursor's final position is on the same column as the
195 starting cursor position, and on the row immediately below the
196 image.
197
198 - If scroll is omitted or specified as 0 (disabled), then:
199
200 a. The screen is never scrolled.
201
202 b. Pixels that would be drawn below the visible region on screen
203 are discarded.
204
205 c. The cursor's final position is at the same column and row as
206 the starting cursor position, i.e. the cursor does not move at
207 all.
208
209 - Pixels that would be drawn to the right of the visible region on
210 screen are discarded.
211
212 - If scale is "none", then pixels that would be drawn outside the
213 target Cell rectangle are discarded.
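
One possible reading of the cursor and scroll rules above, sketched in Python. The function name `place_image`, the 0-based row numbering, and the exact number of lines scrolled are assumptions made for illustration; the proposal itself only states when scrolling happens and where the cursor ends up.

```python
def place_image(cursor_row: int, cursor_col: int, image_rows: int,
                screen_rows: int, scroll: bool) -> tuple[int, int, int]:
    """Return (final_row, final_col, lines_scrolled) using 0-based rows."""
    if not scroll:
        # scroll omitted or 0: the screen never scrolls, the cursor stays put.
        return (cursor_row, cursor_col, 0)
    # scroll=1: the cursor ends on the row immediately below the image,
    # in the starting column.
    target_row = cursor_row + image_rows
    # Assumption: scroll just enough lines to keep that row on screen.
    overflow = max(0, target_row - (screen_rows - 1))
    return (target_row - overflow, cursor_col, overflow)
```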
214
215
216
217The keys for the key-value pairs that must be supported by the
218terminal are listed below:
219
220| Key | Default Value | Description |
221|--------------|---------------|----------------------------------------------|
222| type | "image/rgb" | mime-type describing data field |
223| width | 1 | Number of Cells wide to display in |
224| height | 1 | Number of Cells high to display in |
225| scale | "none" | Scale/zoom option, see below |
226| sourceX | 0 | Media source X position to display |
227| sourceY | 0 | Media source Y position to display |
228| sourceWidth | "auto" | Media width in pixels to display |
229| sourceHeight | "auto" | Media height in pixels to display |
230| scroll | 0 | If 1, scroll the display if needed |
231
232A terminal may support additional keys. If a key is specified but not
233supported by the terminal, then it is ignored without error.
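
As an illustration of the wire format, here is a minimal Python sketch that assembles an OSC 1338 sequence. The helper names `format_args` and `direct_image` are hypothetical. The proposal does not spell out the separator between a key and its value; the sketch assumes `key=value`, as in the iTerm2 protocol this proposal is derived from.

```python
import base64
import re

OSC = "\x1b]"   # Operating System Command introducer
ST = "\x1b\\"   # String Terminator
BEL = "\x07"    # alternative terminator

_KEY_RE = re.compile(r"[0-9A-Za-z]+")   # alpha-numeric ASCII keys
_VALUE_RE = re.compile(r"[!-9<-~]+")    # printable ASCII, no space, ':' or ';'


def format_args(args: dict) -> str:
    """Join key-value pairs with ';' after validating their character sets."""
    pairs = []
    for key, value in args.items():
        text = str(value)
        if not _KEY_RE.fullmatch(key):
            raise ValueError(f"invalid key: {key!r}")
        if not _VALUE_RE.fullmatch(text):
            raise ValueError(f"invalid value for {key}: {text!r}")
        pairs.append(f"{key}={text}")  # assumption: '=' joins key and value
    return ";".join(pairs)


def direct_image(data: bytes, terminator: str = ST, **args) -> str:
    """Build OSC 1338 ; File={args} : {base64 data} followed by ST or BEL."""
    payload = base64.b64encode(data).decode("ascii")
    return f"{OSC}1338;File={format_args(args)}:{payload}{terminator}"
```

For example, `direct_image(pixels, type="image/rgb", width=4, height=2, scroll=1)` would produce a sequence asking for a 4x2 Cell target with scrolling enabled.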
234
235
236
237The "type" value is a mime-type string describing the format of the
238base64-encoded binary data. The terminal must support at minimum these
239mime-types:
240
241| Type String | Description |
242|---------------|--------------------------------------------------------------|
243| "image/rgb" | Big-endian-encoded 24-bit red, green, blue values |
244| "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values |
245| "image/png" | PNG file data as described by (reference to PNG format) |
246
247A terminal may support additional types. An application can detect
248terminal support for a format by:
249
250 1. Attempt to draw image, with "scroll" set to 1.
251
252 2. Check cursor position DSR 6.
253
254 3. If cursor has moved, then the terminal supports this image type.
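
The probing steps above could look roughly like this in Python. The sketch covers only the comparison of two cursor position reports (the `CSI row ; col R` reply to DSR 6, i.e. `CSI 6 n`); the function names are hypothetical, and sending the sequences and reading the replies is left out.

```python
import re


def parse_cursor_report(reply: str):
    """Parse a cursor position report 'ESC [ row ; col R' into (row, col)."""
    match = re.fullmatch(r"\x1b\[(\d+);(\d+)R", reply)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def type_supported(reply_before: str, reply_after: str) -> bool:
    """True if the cursor moved between the two reports."""
    before = parse_cursor_report(reply_before)
    after = parse_cursor_report(reply_after)
    if before is None or after is None:
        return False
    return before != after
```

An application would presumably probe with the cursor away from the bottom row, since a draw that scrolls the screen could leave the cursor on the same row it started on.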
255
256
257
258The "width" and "height" values are positive integers describing the
259number of Cells the image will be placed in.
260
261
262
263The "scale" value can take the following values:
264
265| Value | Meaning |
266|------------|---------------------------------------------------------------|
267| "none" | No scaling along either axis. |
268| "scale" | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping |
269| "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area |
270| "crop" | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit |
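
To make the four options concrete, the following Python sketch computes the pixel size an image would be drawn at for each "scale" value. It is an illustration under assumptions: the function name `scaled_size` is hypothetical, the target size is taken to be the Cell rectangle already converted to pixels, and the rounding behaviour is a choice of this sketch rather than something the proposal defines.

```python
from fractions import Fraction


def scaled_size(mode: str, src_w: int, src_h: int,
                dst_w: int, dst_h: int) -> tuple[int, int]:
    """Return the drawn (width, height) in pixels before any cropping."""
    if mode == "none":
        return (src_w, src_h)          # no scaling along either axis
    if mode == "stretch":
        return (dst_w, dst_h)          # fill the target, aspect ratio distorted
    ratio_x = Fraction(dst_w, src_w)
    ratio_y = Fraction(dst_h, src_h)
    if mode == "scale":
        factor = min(ratio_x, ratio_y)  # largest size that fits without cropping
    elif mode == "crop":
        factor = max(ratio_x, ratio_y)  # smallest size that covers the target
    else:
        raise ValueError(f"unknown scale mode: {mode!r}")
    return (max(1, round(src_w * factor)), max(1, round(src_h * factor)))
```

With a 200x100 source and an 80x80 pixel target, "scale" yields 80x40 and "crop" yields 160x80, of which only the part inside the target rectangle is kept.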
271
272
273
274"sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the
275rectangle of pixels from the media that will be displayed on the
276screen. The ranges for these values are shown below:
277
278| Key | Minimum Value | Maximum Value | Default Value |
279|--------------|---------------|-------------------------------|---------------|
280| sourceX | 0 | Media's full width - 1 | 0 |
281| sourceY | 0 | Media's full height - 1 | 0 |
282| sourceWidth | 1 | Media's full width - sourceX | "auto" |
283| sourceHeight | 1 | Media's full height - sourceY | "auto" |
284
285If any of these values are specified and outside the range, no image
286is displayed, and the cursor does not move. "sourceWidth" and
287"sourceHeight" can be "auto", which means use the maximum available
288width/height (given sourceX/sourceY) from the media's inherent
289dimensions.
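
The range rules and the "auto" default can be expressed as a small validation routine. This Python sketch is illustrative only; the name `resolve_source_rect` is hypothetical, and returning `None` stands in for "no image is displayed, and the cursor does not move".

```python
def resolve_source_rect(media_w: int, media_h: int,
                        source_x: int = 0, source_y: int = 0,
                        source_width="auto", source_height="auto"):
    """Return (x, y, width, height) in pixels, or None if out of range."""
    # sourceX / sourceY must lie inside the media.
    if not (0 <= source_x <= media_w - 1 and 0 <= source_y <= media_h - 1):
        return None
    # "auto" means the maximum available extent from the source position.
    width = media_w - source_x if source_width == "auto" else int(source_width)
    height = media_h - source_y if source_height == "auto" else int(source_height)
    if not (1 <= width <= media_w - source_x):
        return None
    if not (1 <= height <= media_h - source_y):
        return None
    return (source_x, source_y, width, height)
```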
290
291
292
293Cached Images - Summary
294-----------------------
295
296Non-text data (image) can be sent to the terminal for later display in
297a rectangular region of text Cells. Image data is transmitted to the
298terminal using the CSTORE command described below, and displayed on
299screen using the CDISPLAY command. A single CSTORE command can
300support many CDISPLAY commands.
301
302Upon display, setting a Cell to image is a destructive operation: the
303Cell's original text is lost. Similarly, setting a Cell (or multiple
304Cells for fullwidth glyphs or grapheme clusters) to text is a
305destructive operation: the image in the Cell(s) is lost.
306
307Setting any part of a multi-Cell Tile to image also "breaks up" the
308Tile into a range of single Cells. In other words, image data can
309only be carried by a Cell, not a Tile.
310
311
312
313Cached Images - Cache/Memory Management
314---------------------------------------
315
316The terminal manages a cache of multimedia data on behalf of the
317application. The application requests media be stored in the cache
318and provides an ID. This ID is later used to request display on the
319screen.
320
321The amount of memory and retention/eviction strategy for the cache is
322wholly managed by the terminal, with the following restrictions:
323
324* The terminal may not remove items from the cache that have any
325 portion being actively displayed on the primary or alternate
326 screens.
327
328The scrollback buffer is permitted, and recommended, to contain only a
329few (or zero) multimedia images. Terminals should consider retaining
330only the last 2-5 screens' worth of pixel data in the scrollback
331buffer.
332
333Applications have no control over when images are removed from the
334cache, and no provision is made to generate/ensure unique IDs.
335
336A terminal multiplexer that passes all CSTORE/CDISPLAY commands to the
337host terminal will need to parse the CSTORE and CDISPLAY sequences for
338the "id" field and rewrite it to be unique for all of its inner
339terminals.
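
A multiplexer's ID rewriting could be sketched as below. This is one possible approach, not something the proposal prescribes: the class name `IdRewriter`, the use of a pane label to identify an inner terminal, and the sequential allocation strategy are all assumptions. The 0 to 999999 limit comes from the "id" definition in the CSTORE section.

```python
class IdRewriter:
    """Map (inner terminal, application id) pairs to unique host ids."""

    MAX_ID = 999999  # upper bound of the "id" value

    def __init__(self) -> None:
        self._host_ids: dict[tuple[str, int], int] = {}
        self._next_id = 0

    def host_id(self, pane: str, app_id: int) -> int:
        """Return the host-side id for an id used inside one inner terminal."""
        key = (pane, app_id)
        if key not in self._host_ids:
            if self._next_id > self.MAX_ID:
                # A real multiplexer would need to recycle ids here.
                raise RuntimeError("host id space exhausted")
            self._host_ids[key] = self._next_id
            self._next_id += 1
        return self._host_ids[key]
```

The same mapping would be applied to both CSTORE and CDISPLAY so that a display request reaches the image stored by the same inner terminal.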
340
341
342
343Cached Images - New Sequences
344-----------------------------
345
346A terminal with cached images feature must support the following new
347sequences:
348
349| Sequence | Command | Description |
350|--------------------------------------|-----------|-------------------------|
351| OSC 1 3 4 0 ; F i l e = {args} : {data} BEL | CSTORE | Store media in the cache |
352| OSC 1 3 4 1 ; Pi ; {args} ST | CDISPLAY | Display media at (x, y) |
353
354
355
356Cached Images - CSTORE
357----------------------
358
359For the CSTORE command:
360
361* The {args} is a set of key-value pairs (each pair separated by
362 semicolon (';')), followed by a colon (':'), followed by a base-64
363 encoded string ({data}).
364
365* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
366 'a' - 'z').
367
368* A value is any printable ASCII string not containing whitespace,
369 colon, or semicolon ('!' - '9', '<' - '~').
370
371
372
373The keys for the key-value pairs that must be supported by the
374terminal are listed below:
375
376| Key | Default Value | Description |
377|--------------|---------------|----------------------------------------------|
378| id | 0 | ID to refer to the image |
379| type | "image/rgb" | mime-type describing data field |
380
381
382
383The "id" value is a non-negative integer between 0 and 999999.
384
385
386
387The "type" value is a mime-type string describing the format of the
388base64-encoded binary data. The terminal must support at minimum these
389mime-types:
390
391| Type String | Description |
392|---------------|--------------------------------------------------------------|
393| "image/rgb" | Big-endian-encoded 24-bit red, green, blue values |
394| "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values |
395| "image/png" | PNG file data as described by (reference to PNG format) |
396
397A terminal may support additional types. An application can detect
398terminal support for a format by:
399
400 1. Store image in cache.
401
402 2. Attempt to draw image, with "scroll" set to 1.
403
404 3. Check cursor position DSR 6.
405
406 4. If cursor has moved, then the terminal supports this image type.
407
408
409
410Cached Images - CDISPLAY
411------------------------
412
413For the CDISPLAY command:
414
415* Pi - a non-negative integer ID that was used in a previous CSTORE
416 command.
417
418* The {args} is a set of key-value pairs (each pair separated by
419 semicolon (';')). Unlike CSTORE, no colon or base-64 encoded data
420 follows the arguments.
421
422* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
423 'a' - 'z').
424
425* A value is any printable ASCII string not containing whitespace,
426 colon, or semicolon ('!' - '9', '<' - '~').
427
428* Any alpha-numeric key may be specified. A key that is not supported
429 by the terminal is ignored without error.
430
431* The image pixels are processed as shown below.
432
433 - The pixels are drawn starting at the upper-left corner of the text
434 cursor position.
435
436 - If scroll is specified as 1 (enabled), then:
437
438 a. The screen is scrolled up if the image overflows into the
439 bottom text row.
440
441 b. The cursor's final position is on the same column as the
442 starting cursor position, and on the row immediately below the
443 image.
444
445 - If scroll is omitted or specified as 0 (disabled), then:
446
447 a. The screen is never scrolled.
448
449 b. Pixels that would be drawn below the visible region on screen
450 are discarded.
451
452 c. The cursor's final position is at the same column and row as
453 the starting cursor position, i.e. the cursor does not move at
454 all.
455
456 - Pixels that would be drawn to the right of the visible region on
457 screen are discarded.
458
459
460
461The keys for the key-value pairs that must be supported by the
462terminal are listed below:
463
464| Key | Default Value | Description |
465|--------------|---------------|----------------------------------------------|
466| id | 0 | ID to refer to the image |
467| width | 1 | Number of Cells wide to display in |
468| height | 1 | Number of Cells high to display in |
469| scale | "none" | Scale/zoom option, see below |
470| sourceX | 0 | Media source X position to display |
471| sourceY | 0 | Media source Y position to display |
472| sourceWidth | "auto" | Media width in pixels to display |
473| sourceHeight | "auto" | Media height in pixels to display |
474| scroll | 0 | If 1, scroll the display if needed |
475
476A terminal may support additional keys. If a key is specified but not
477supported by the terminal, then it is ignored without error.
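
A store-then-display round trip might be assembled as in the Python sketch below. The function names `cstore` and `cdisplay` are hypothetical, and the `key=value` form of each pair is an assumption borrowed from the iTerm2 protocol; the sequence numbers, terminators, and the Pi parameter follow the sequence table above.

```python
import base64

OSC = "\x1b]"   # Operating System Command introducer
BEL = "\x07"    # terminator listed for CSTORE
ST = "\x1b\\"   # terminator listed for CDISPLAY


def _check_id(image_id: int) -> None:
    if not 0 <= image_id <= 999999:
        raise ValueError("id must be between 0 and 999999")


def cstore(image_id: int, data: bytes, mime_type: str = "image/rgb") -> str:
    """Build OSC 1340 ; File=id=..;type=.. : {base64 data} BEL."""
    _check_id(image_id)
    payload = base64.b64encode(data).decode("ascii")
    return f"{OSC}1340;File=id={image_id};type={mime_type}:{payload}{BEL}"


def cdisplay(image_id: int, **args) -> str:
    """Build OSC 1341 ; Pi ; {args} ST for a previously stored image."""
    _check_id(image_id)
    pairs = ";".join(f"{key}={value}" for key, value in args.items())
    return f"{OSC}1341;{image_id};{pairs}{ST}"
```

One `cstore(5, data)` can then be followed by any number of `cdisplay(5, ...)` calls with different target sizes or source rectangles.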
478
479
480
481The "width" and "height" values are positive integers describing the
482number of Cells the image will be placed in.
483
484
485
486The "scale" value can take the following values:
487
488| Value | Meaning |
489|------------|---------------------------------------------------------------|
490| "none" | No scaling along either axis. |
491| "scale" | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping |
492| "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area |
493| "crop" | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit |
494
495
496
497"sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the
498rectangle of pixels from the media that will be displayed on the
499screen. The ranges for these values are shown below:
500
501| Key | Minimum Value | Maximum Value | Default Value |
502|--------------|---------------|-------------------------------|---------------|
503| sourceX | 0 | Media's full width - 1 | 0 |
504| sourceY | 0 | Media's full height - 1 | 0 |
505| sourceWidth | 1 | Media's full width - sourceX | "auto" |
506| sourceHeight | 1 | Media's full height - sourceY | "auto" |
507
508If any of these values are specified and outside the range, no image
509is displayed, and the cursor does not move. "sourceWidth" and
510"sourceHeight" can be "auto", which means use the maximum available
511width/height (given sourceX/sourceY) from the media's inherent
512dimensions.