Terminal Emulator Images Standard - Proposed Design - Simplified
================================================================

Version: 1



Purpose
-------

See the [original proposal](images.md) for purpose, design goals, and
definitions.

This document is an updated proposal to address feedback on the first
proposal, which included: "overengineered", "hopelessly
overengineered", and "unnecessarily complex." I perceive this
feedback as a positive: it is far easier to imagine a feature and
remove it than to fail to picture it and need to shoehorn it in
later.

The original proposal was a superset of every image format referenced,
and generalized beyond them to multimedia. This proposal is sharply
reduced from that to: "put this pixel rectangle from the image into
that cell-based rectangle with a specific scaling policy". It is mostly
a subset of the iTerm2 protocol, with:

* Specifications for what happens to the cursor.

* More precise definitions of the "preserveAspectRatio" equivalent
  options.

* Explicit restriction to a Cell-based target region.

* Definition that pixels not covered by the image are set to the
  current background color.



Tradeoffs
---------

Simplifying the original proposal significantly reduces complexity,
but also eliminates features. The major tradeoffs offered in this
revised proposal are:

1. Elimination of the layers feature, and with it the ability to place
   images behind text. In this proposal, a Cell on the screen will
   show either (part of) a visible image or (part of) a text glyph,
   but never both.

2. Elimination of the "url" option, and with it the ability for an
   application to specify a filename or other method for the terminal
   to find the file data on the local machine. Image data must always
   be passed inline with the sequences.

3. Elimination of response codes, and with it:

   - The ability for multiplexers to blindly pass on the sequences to
     their host terminal (because unique IDs are not generated by the
     terminal).

   - The ability for applications to reliably detect success or
     failure of image display operations.

4. Elimination of pixel-oriented image placement operations, and with
   it the ability of applications to pass on image calculations to the
   terminal. An application which requires pixel-perfect rendering
   must generate the pixels it needs, aligned so that they are
   displayed starting at the top-left corner of the text Cell
   rectangle.



Summary
-------

This revised document proposes two independent new features:

1. A method to transfer image data for immediate display within the
   screen Cell grid ("Direct Images").

2. A method to transfer image data to a terminal-managed cache, and
   later display that data within the screen Cell grid ("Cached
   Images").

The only difference between the first and second feature is the
presence of an ID key. Direct images do not use an ID key, while
cached images use a store operation with an ID key followed by one or
more display operations with that ID key.

Images are applied to text Cells, and once set are handled the same
way text Cells are handled: erasing a line erases the image Cells on
that line, inserting a character shifts image Cells on that row over,
scrolling shifts the image up, and so on. Therefore, terminals need
to be prepared for the scenario in which every Cell on the display is
a separate image, with a separate display scaling option that must be
re-applied automatically if font metrics change.



All Features - Detection
------------------------

Applications can detect support for these features using Primary
Device Attributes (DA) and DECID (ESC Z, or 0x9A).

Terminals that support this standard will respond with additional
parameter(s): "224" for direct images and "225" for cached images. A
recap of the parameters xterm supports is listed below, with these new
feature responses included:

| VT220 (and higher) Response | Description                                |
|-----------------------------|--------------------------------------------|
| 1                           | 132-columns                                |
| 2                           | Printer                                    |
| 3                           | ReGIS graphics                             |
| 4                           | Sixel graphics                             |
| 6                           | Selective erase                            |
| 8                           | User-defined keys                          |
| 9                           | National Replacement Character sets        |
| 1 5                         | Technical characters                       |
| 1 6                         | Locator port                               |
| 1 7                         | Terminal state interrogation               |
| 1 8                         | User windows                               |
| 2 1                         | Horizontal scrolling                       |
| 2 2                         | ANSI color, e.g., VT525                    |
| 2 8                         | Rectangular editing                        |
| 2 9                         | ANSI text locator (i.e., DEC Locator mode) |
| 2 2 4                       | Direct Images Version 1                    |
| 2 2 5                       | Cached Images Version 1                    |
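
As an illustration of the detection step, the sketch below parses a Primary DA reply and checks for the two proposed parameters. It assumes the usual reply shape `ESC [ ? Ps ; ... c` (or the 8-bit CSI equivalent); the function names are hypothetical and not part of this proposal.

```python
import re

# Parameter values proposed in this document.
DIRECT_IMAGES_V1 = 224
CACHED_IMAGES_V1 = 225


def parse_da_params(response: str) -> set:
    """Return the numeric parameters of a Primary DA reply.

    Assumes the reply looks like ESC [ ? Ps ; Ps ; ... c (7-bit) or
    0x9B ? Ps ; ... c (8-bit CSI). Anything else yields an empty set.
    """
    match = re.fullmatch(r"(?:\x1b\[|\x9b)\?([0-9;]*)c", response)
    if match is None:
        return set()
    return {int(p) for p in match.group(1).split(";") if p}


def image_support(response: str) -> tuple:
    """Return (direct_images, cached_images) booleans for a DA reply."""
    params = parse_da_params(response)
    return (DIRECT_IMAGES_V1 in params, CACHED_IMAGES_V1 in params)
```

An application would send the DA request, read the reply from the terminal, and pass the raw reply text to `image_support`.
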



Direct Images - Summary
-----------------------

Non-text data (images) can be sent to the terminal for immediate
display in a rectangular region of text Cells. Image data is
transmitted to the terminal using a wire format described later in
this document.

Setting a Cell to image is a destructive operation: the Cell's
original text is lost. Similarly, setting a Cell (or multiple Cells
for fullwidth glyphs or grapheme clusters) to text is a destructive
operation: the image in the Cell(s) is lost.

Setting any part of a multi-Cell Tile to image also "breaks up" the
Tile into a range of single Cells. In other words, image data can
only be carried by a Cell, not a Tile.



Direct Images - New Sequences
-----------------------------

A terminal with the direct images feature must support the following
new sequences:

| Sequence                                    | Description             |
|---------------------------------------------|-------------------------|
| OSC 1 3 3 8 ; F i l e = {args} : {data} BEL | Display image at (x, y) |
| OSC 1 3 3 8 ; F i l e = {args} : {data} ST  | Display image at (x, y) |



For the OSC 1 3 3 8 sequence:

* The {args} is a set of key-value pairs (each pair separated by
  semicolon (';')), followed by a colon (':'), followed by a base-64
  encoded string ({data}).

* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
  'a' - 'z').

* A value is any printable ASCII string not containing whitespace,
  colon, or semicolon ('!' - '9', '<' - '~').

* Any alpha-numeric key may be specified. A key that is not supported
  by the terminal is ignored without error.

* The image is processed as shown below:

  - The pixels are drawn starting at the upper-left corner of the text
    cursor position.

  - All pixels in the target Cell rectangle that are not covered by
    the image itself are set to the current background color (like
    sixel raster attributes).

  - If scroll is specified as 1 (enabled), then:

    a. The screen is scrolled up if the image overflows into the
       bottom text row.

    b. The cursor's final position is on the same column as the
       starting cursor position, and on the row immediately below the
       image.

  - If scroll is omitted or specified as 0 (disabled), then:

    a. The screen is never scrolled.

    b. Pixels that would be drawn below the visible region on screen
       are discarded.

    c. The cursor's final position is at the same column and row as
       the starting cursor position, i.e. the cursor does not move at
       all.

  - Pixels that would be drawn to the right of the visible region on
    screen are discarded.

  - If scale is "none", then pixels that would be drawn outside the
    target Cell rectangle are discarded.



The keys for the key-value pairs that must be supported by the
terminal are listed below:

| Key          | Default Value | Description                           |
|--------------|---------------|---------------------------------------|
| type         | "image/rgb"   | mime-type describing data field       |
| width        | 1             | Number of Cell columns to display in  |
| height       | 1             | Number of Cell rows to display in     |
| scale        | "none"        | Scale/zoom option, see below          |
| sourceX      | 0             | Media source X position to display    |
| sourceY      | 0             | Media source Y position to display    |
| sourceWidth  | "auto"        | Media width in pixels to display      |
| sourceHeight | "auto"        | Media height in pixels to display     |
| scroll       | 0             | If 1, scroll the display if needed    |

A terminal may support additional keys. If a key is specified but not
supported by the terminal, then it is ignored without error.
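
To make the wire format concrete, here is a minimal sketch that assembles a direct image sequence from keys like those in the table above. It assumes each pair is written as `key=value` (as in the iTerm2 protocol this proposal derives from; the text above does not spell out the separator) and uses the 7-bit forms of OSC and ST. The function name is hypothetical.

```python
import base64

OSC = "\x1b]"  # 7-bit OSC introducer (ESC ])
BEL = "\x07"
ST = "\x1b\\"  # 7-bit String Terminator (ESC \)


def _is_value_char(ch: str) -> bool:
    # Printable ASCII except whitespace, colon, and semicolon.
    return "!" <= ch <= "9" or "<" <= ch <= "~"


def direct_image_sequence(data: bytes, terminator: str = ST, **args) -> str:
    """Build OSC 1338 ; File= {args} : {data} with the given terminator.

    Assumption: pairs are joined as key=value;key=value.
    """
    pairs = []
    for key, value in args.items():
        text = str(value)
        if not (key.isascii() and key.isalnum()):
            raise ValueError(f"invalid key: {key!r}")
        if not all(_is_value_char(ch) for ch in text):
            raise ValueError(f"invalid value for {key}: {text!r}")
        pairs.append(f"{key}={text}")
    payload = base64.b64encode(data).decode("ascii")
    return f"{OSC}1338;File={';'.join(pairs)}:{payload}{terminator}"
```

For example, `direct_image_sequence(png_bytes, type="image/png", width=10, height=4, scroll=1)` would request a 10x4 Cell placement, where `png_bytes` stands for file data the application has already loaded.
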



The "type" value is a mime-type string describing the format of the
base64-encoded binary data. The terminal must support at minimum these
mime-types:

| Type String   | Description                                                  |
|---------------|--------------------------------------------------------------|
| "image/rgb"   | Big-endian-encoded 24-bit red, green, blue values            |
| "image/rgba"  | Big-endian-encoded 32-bit red, green, blue, alpha values     |
| "image/png"   | PNG file data as described by (reference to PNG format)      |

A terminal may support additional types. An application can detect
terminal support for a format as follows:

  1. Attempt to draw the image, with "scroll" set to 1.

  2. Check the cursor position with DSR 6.

  3. If the cursor has moved, then the terminal supports this image
     type.
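
The probe above can be sketched as follows. The terminal I/O is injected as two hypothetical callables, `write` and `read`, so the logic stays independent of how the application talks to its tty; DSR 6 is `ESC [ 6 n` and the reply is assumed to be the standard cursor position report `ESC [ row ; col R`.

```python
import re

DSR_CURSOR_POSITION = "\x1b[6n"                  # DSR 6: request cursor position
CPR_PATTERN = re.compile(r"\x1b\[(\d+);(\d+)R")  # reply: ESC [ row ; col R


def query_cursor(write, read) -> tuple:
    """Send DSR 6 and return the reported (row, column)."""
    write(DSR_CURSOR_POSITION)
    match = CPR_PATTERN.search(read())
    if match is None:
        raise RuntimeError("no cursor position report received")
    return int(match.group(1)), int(match.group(2))


def terminal_supports(write, read, image_sequence: str) -> bool:
    """Draw a probe image (built with scroll=1) and see whether the cursor moved."""
    before = query_cursor(write, read)
    write(image_sequence)
    return query_cursor(write, read) != before
```

One caveat worth checking in a real implementation: if the probe starts on the bottom row, the screen may scroll and leave the cursor on the same row number, so it seems safer to position the cursor higher up before probing.
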



The "width" and "height" values are positive integers describing the
number of Cells the image will be placed in.



The "scale" value can take the following values:

| Value      | Meaning                                                       |
|------------|---------------------------------------------------------------|
| "none"     | No scaling along either axis.                                 |
| "scale"    | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping |
| "stretch"  | Stretch along both axes, distorting aspect ratio, to fill the target area |
| "crop"     | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit |
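
The four options differ only in the scale factor applied to the source rectangle. The sketch below computes the scaled pixel size for each option, assuming the target area's pixel size is already known (Cell count times the terminal's Cell size in pixels, which this document leaves to the terminal). The function name and the rounding choice are illustrative assumptions.

```python
def scaled_size(scale: str, src_w: int, src_h: int,
                target_w: int, target_h: int) -> tuple:
    """Return the (width, height) in pixels of the image after scaling.

    Pixels that end up outside the target area are discarded by the
    terminal; this function only reports the size before that clipping.
    """
    if scale == "none":
        return src_w, src_h
    if scale == "stretch":
        return target_w, target_h
    if scale == "scale":
        # Largest size that fits entirely inside the target area.
        factor = min(target_w / src_w, target_h / src_h)
    elif scale == "crop":
        # Smallest size that covers the target area completely.
        factor = max(target_w / src_w, target_h / src_h)
    else:
        raise ValueError(f"unknown scale option: {scale!r}")
    return round(src_w * factor), round(src_h * factor)
```

For a 100x50 pixel source and an 80x80 pixel target, "scale" gives 80x40 (blank rows remain), while "crop" gives 160x80 (the excess width is cut off).
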



"sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the
rectangle of pixels from the media that will be displayed on the
screen. The ranges for these values are shown below:

| Key          | Minimum Value | Maximum Value                 | Default Value |
|--------------|---------------|-------------------------------|---------------|
| sourceX      | 0             | Media's full width - 1        | 0             |
| sourceY      | 0             | Media's full height - 1       | 0             |
| sourceWidth  | 1             | Media's full width - sourceX  | "auto"        |
| sourceHeight | 1             | Media's full height - sourceY | "auto"        |

If any of these values are specified and outside the range, no image
is displayed, and the cursor does not move. "sourceWidth" and
"sourceHeight" can be "auto", which means use the maximum available
width/height (given sourceX/sourceY) from the media's inherent
dimensions.
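
A terminal-side check of these ranges might look like the following sketch. The function name and the use of `None` to signal "display nothing" are assumptions made for illustration.

```python
def resolve_source_rect(media_w: int, media_h: int,
                        source_x=0, source_y=0,
                        source_width="auto", source_height="auto"):
    """Return (x, y, width, height) in pixels, or None if out of range.

    None corresponds to "no image is displayed, and the cursor does not move".
    """
    if not 0 <= source_x <= media_w - 1:
        return None
    if not 0 <= source_y <= media_h - 1:
        return None
    # "auto" means everything from the source origin to the media's edge.
    width = media_w - source_x if source_width == "auto" else source_width
    height = media_h - source_y if source_height == "auto" else source_height
    if not 1 <= width <= media_w - source_x:
        return None
    if not 1 <= height <= media_h - source_y:
        return None
    return source_x, source_y, width, height
```
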



Cached Images - Summary
-----------------------

Non-text data (images) can be sent to the terminal for later display
in a rectangular region of text Cells. Image data is transmitted to
the terminal using the CSTORE command described below, and displayed
on screen using the CDISPLAY command. A single CSTORE command can
support many CDISPLAY commands.

Upon display, setting a Cell to image is a destructive operation: the
Cell's original text is lost. Similarly, setting a Cell (or multiple
Cells for fullwidth glyphs or grapheme clusters) to text is a
destructive operation: the image in the Cell(s) is lost.

Setting any part of a multi-Cell Tile to image also "breaks up" the
Tile into a range of single Cells. In other words, image data can
only be carried by a Cell, not a Tile.



Cached Images - Cache/Memory Management
---------------------------------------

The terminal manages a cache of multimedia data on behalf of the
application. The application requests media be stored in the cache
and provides an ID. This ID is later used to request display on the
screen.

The amount of memory and retention/eviction strategy for the cache is
wholly managed by the terminal, with the following restrictions:

* The terminal may not remove items from the cache that have any
  portion being actively displayed on the primary or alternate
  screens.

The scrollback buffer is permitted, and recommended, to contain only a
few (or zero) multimedia images. Terminals should consider retaining
only the last 2-5 screens' worth of pixel data in the scrollback
buffer.

Applications have no control over when images are removed from the
cache, and no provision is made to generate/ensure unique IDs.

A terminal multiplexer that passes all CSTORE/CDISPLAY commands to the
host terminal will need to parse the CSTORE and CDISPLAY sequences for
the "id" field and rewrite it to be unique for all of its inner
terminals.
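
One way a multiplexer could keep IDs unique is a per-pane mapping table, sketched below. The class and method names are hypothetical, and the simple counter allocation is an assumption; the upper bound is the 0 to 999999 range given for "id" in the CSTORE section.

```python
class IdRewriter:
    """Map (inner terminal, application ID) pairs to unique host IDs."""

    MAX_ID = 999999  # upper bound of the "id" range in this proposal

    def __init__(self):
        self._host_ids = {}
        self._next_id = 0

    def host_id(self, pane, inner_id: int) -> int:
        """Return the host-side ID for an ID used inside one inner terminal."""
        key = (pane, inner_id)
        if key not in self._host_ids:
            if self._next_id > self.MAX_ID:
                raise RuntimeError("host ID space exhausted")
            self._host_ids[key] = self._next_id
            self._next_id += 1
        return self._host_ids[key]
```

The multiplexer would call `host_id` for the ID found in each CSTORE and CDISPLAY sequence before forwarding it. This sketch never reclaims IDs; a real multiplexer would likely want to release them when an inner terminal closes.
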



Cached Images - New Sequences
-----------------------------

A terminal with the cached images feature must support the following
new sequences:

| Sequence                                    | Command   | Description              |
|---------------------------------------------|-----------|--------------------------|
| OSC 1 3 4 0 ; F i l e = {args} : {data} BEL | CSTORE    | Store image in cache     |
| OSC 1 3 4 0 ; F i l e = {args} : {data} ST  | CSTORE    | Store image in cache     |
| OSC 1 3 4 1 ; Pi ; {args} BEL               | CDISPLAY  | Display image at (x, y)  |
| OSC 1 3 4 1 ; Pi ; {args} ST                | CDISPLAY  | Display image at (x, y)  |



Cached Images - CSTORE
----------------------

For the CSTORE command:

* The {args} is a set of key-value pairs (each pair separated by
  semicolon (';')), followed by a colon (':'), followed by a base-64
  encoded string ({data}).

* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
  'a' - 'z').

* A value is any printable ASCII string not containing whitespace,
  colon, or semicolon ('!' - '9', '<' - '~').



The keys for the key-value pairs that must be supported by the
terminal are listed below:

| Key          | Default Value | Description                                  |
|--------------|---------------|----------------------------------------------|
| id           | 0             | ID to refer to the image                     |
| type         | "image/rgb"   | mime-type describing data field              |



The "id" value is a non-negative integer between 0 and 999999.



The "type" value is a mime-type string describing the format of the
base64-encoded binary data. The terminal must support at minimum these
mime-types:

| Type String   | Description                                                  |
|---------------|--------------------------------------------------------------|
| "image/rgb"   | Big-endian-encoded 24-bit red, green, blue values            |
| "image/rgba"  | Big-endian-encoded 32-bit red, green, blue, alpha values     |
| "image/png"   | PNG file data as described by (reference to PNG format)      |

A terminal may support additional types. An application can detect
terminal support for a format as follows:

  1. Store the image in the cache.

  2. Attempt to draw the image, with "scroll" set to 1.

  3. Check the cursor position with DSR 6.

  4. If the cursor has moved, then the terminal supports this image
     type.



Cached Images - CDISPLAY
------------------------

For the CDISPLAY command:

* Pi - a non-negative integer ID that was used in a previous CSTORE
  command.

* The {args} is a set of key-value pairs (each pair separated by
  semicolon (';')). As shown in the sequence table, CDISPLAY carries
  no colon or base-64 {data} field.

* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
  'a' - 'z').

* A value is any printable ASCII string not containing whitespace,
  colon, or semicolon ('!' - '9', '<' - '~').

* Any alpha-numeric key may be specified. A key that is not supported
  by the terminal is ignored without error.

* The image pixels are processed as shown below:

  - The pixels are drawn starting at the upper-left corner of the text
    cursor position.

  - If scroll is specified as 1 (enabled), then:

    a. The screen is scrolled up if the image overflows into the
       bottom text row.

    b. The cursor's final position is on the same column as the
       starting cursor position, and on the row immediately below the
       image.

  - If scroll is omitted or specified as 0 (disabled), then:

    a. The screen is never scrolled.

    b. Pixels that would be drawn below the visible region on screen
       are discarded.

    c. The cursor's final position is at the same column and row as
       the starting cursor position, i.e. the cursor does not move at
       all.

  - Pixels that would be drawn to the right of the visible region on
    screen are discarded.



The keys for the key-value pairs that must be supported by the
terminal are listed below:

| Key          | Default Value | Description                           |
|--------------|---------------|---------------------------------------|
| id           | 0             | ID to refer to the image              |
| width        | 1             | Number of Cell columns to display in  |
| height       | 1             | Number of Cell rows to display in     |
| scale        | "none"        | Scale/zoom option, see below          |
| sourceX      | 0             | Media source X position to display    |
| sourceY      | 0             | Media source Y position to display    |
| sourceWidth  | "auto"        | Media width in pixels to display      |
| sourceHeight | "auto"        | Media height in pixels to display     |
| scroll       | 0             | If 1, scroll the display if needed    |

A terminal may support additional keys. If a key is specified but not
supported by the terminal, then it is ignored without error.
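
The store-once, display-many flow can be sketched as below. As assumptions, pairs are written as `key=value`, the ID in CDISPLAY is passed in the Pi position shown in the sequence table, and the 7-bit forms of OSC and ST are used. The function names are hypothetical.

```python
import base64

OSC = "\x1b]"  # 7-bit OSC introducer (ESC ])
ST = "\x1b\\"  # 7-bit String Terminator (ESC \)


def _check_id(image_id: int) -> int:
    if not 0 <= image_id <= 999999:
        raise ValueError("id must be between 0 and 999999")
    return image_id


def cstore_sequence(image_id: int, data: bytes,
                    mime_type: str = "image/rgb") -> str:
    """Build a CSTORE sequence: OSC 1340 ; File= {args} : {data} ST."""
    payload = base64.b64encode(data).decode("ascii")
    args = f"id={_check_id(image_id)};type={mime_type}"
    return f"{OSC}1340;File={args}:{payload}{ST}"


def cdisplay_sequence(image_id: int, **args) -> str:
    """Build a CDISPLAY sequence: OSC 1341 ; Pi ; {args} ST."""
    pairs = ";".join(f"{key}={value}" for key, value in args.items())
    return f"{OSC}1341;{_check_id(image_id)};{pairs}{ST}"
```

An application would emit one `cstore_sequence(7, data, "image/png")` and then any number of `cdisplay_sequence(7, width=10, height=4)` calls at different cursor positions.
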



The "width" and "height" values are positive integers describing the
number of Cells the image will be placed in.



The "scale" value can take the following values:

| Value      | Meaning                                                       |
|------------|---------------------------------------------------------------|
| "none"     | No scaling along either axis.                                 |
| "scale"    | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping |
| "stretch"  | Stretch along both axes, distorting aspect ratio, to fill the target area |
| "crop"     | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit |



"sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the
rectangle of pixels from the media that will be displayed on the
screen. The ranges for these values are shown below:

| Key          | Minimum Value | Maximum Value                 | Default Value |
|--------------|---------------|-------------------------------|---------------|
| sourceX      | 0             | Media's full width - 1        | 0             |
| sourceY      | 0             | Media's full height - 1       | 0             |
| sourceWidth  | 1             | Media's full width - sourceX  | "auto"        |
| sourceHeight | 1             | Media's full height - sourceY | "auto"        |

If any of these values are specified and outside the range, no image
is displayed, and the cursor does not move. "sourceWidth" and
"sourceHeight" can be "auto", which means use the maximum available
width/height (given sourceX/sourceY) from the media's inherent
dimensions.



Miscellaneous Items
-------------------

"image/rgb" and "image/rgba" also need width/height fields. The
proposal is to specify them as 16-bit unsigned integers, followed by
the 24-bit or 32-bit pixel data. If the data is short, then the rest
of the image is assumed to be the current background color (like
sixel raster attributes).
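
A possible layout for the "image/rgb" payload is sketched below. The field order (width, then height) and big-endian byte order for the two 16-bit fields are assumptions made for this example, since the text above only proposes that the fields exist. The function names are hypothetical.

```python
import struct


def encode_rgb(width: int, height: int, pixels: bytes) -> bytes:
    """Prefix raw RGB pixel data with 16-bit width and height fields.

    Assumption: width first, then height, both big-endian unsigned.
    """
    if len(pixels) % 3 or len(pixels) > width * height * 3:
        raise ValueError("pixel data does not fit the stated dimensions")
    return struct.pack(">HH", width, height) + pixels


def decode_rgb(payload: bytes, background=(0, 0, 0)) -> tuple:
    """Return (width, height, pixels), padding short data with background."""
    width, height = struct.unpack_from(">HH", payload)
    expected = width * height * 3
    body = payload[4:4 + expected]
    body = body[: len(body) - len(body) % 3]  # drop a trailing partial pixel
    missing_pixels = (expected - len(body)) // 3
    return width, height, body + bytes(background) * missing_pixels
```

The same approach would extend to "image/rgba" with four bytes per pixel.
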