Terminal Emulator Images Standard - Proposed Design - Simplified
================================================================

Version: 1



Purpose
-------

See the [original proposal](images.md) for purpose, design goals, and
definitions.

This document is an updated proposal to address feedback on the first
proposal, which included: "overengineered", "hopelessly
overengineered", and "unnecessarily complex."

I perceive this feedback as a positive: it is far easier to imagine a
feature and remove it than to fail to picture it and need it later.
The original proposal was a superset of every image format referenced,
and generalized beyond that to multimedia. This proposal is sharply
reduced to: "put this pixel rectangle from the image into that
cell-based rectangle with a specific scaling policy". It is mostly a
subset of the iTerm2 protocol, with specifications for what happens
to the cursor, and more precise definitions of the
"preserveAspectRatio" equivalent options.



Tradeoffs
---------

Simplifying the original proposal significantly reduces complexity,
but also eliminates features. The major tradeoffs offered in this
revised proposal are:

1. Elimination of the layers feature, and with it the ability to place
   images behind text. In this proposal, a Cell on the screen will
   show either (part of) a visible image or (part of) a text glyph,
   but never both.

2. Elimination of the "url" option, and with it the ability for an
   application to specify a filename or other method for the terminal
   to find the file data on the local machine. Image data must always
   be passed inline with the sequences.

3. Elimination of response codes, and with it:

   - The ability for multiplexers to blindly pass on the sequences to
     their host terminal.

   - The ability for applications to reliably detect success or
     failure of image display operations.

4. Elimination of pixel-oriented image placement operations, and with
   it the ability of applications to pass on image calculations to the
   terminal. An application which requires pixel-perfect rendering
   must generate the pixels it needs, aligned so that they are
   displayed at the top-left corner of the text Cell rectangle.



Summary
-------

This revised document proposes two independent new features:

1. A method to transfer image data for immediate display within the
   screen Cell grid ("Direct Images").

2. A method to transfer image data to a terminal-managed cache, and
   later display that data within the screen Cell grid ("Cached
   Images").

The only difference between the first and second feature is the
presence of an ID key. Direct images do not use an ID key, while
cached images use a store operation with ID key followed by one or
more display operations with ID key.

Images are applied to text Cells and, once set, are handled the same
way text Cells are handled: erasing a line erases the image Cells on
that line, inserting a character will shift image Cells on that row
over, scrolling will shift the image up, and so on. Therefore,
terminals will need to be prepared for the scenario that every Cell on
the display is a separate image, with a separate display scaling
option that will need to be re-applied automatically if font metrics
change.



All Features - Detection
------------------------

Applications can detect support for these features using Primary
Device Attributes (DA) and DECID (ESC Z, or 0x9A).

Terminals that support this standard will respond with additional
parameter(s): "224" for direct images and "225" for cached images. A
recap of the parameters xterm supports is listed below, with these new
feature responses included:

| VT220 (and higher) Response | Description                                |
|-----------------------------|--------------------------------------------|
| 1                           | 132-columns                                |
| 2                           | Printer                                    |
| 3                           | ReGIS graphics                             |
| 4                           | Sixel graphics                             |
| 6                           | Selective erase                            |
| 8                           | User-defined keys                          |
| 9                           | National Replacement Character sets        |
| 1 5                         | Technical characters                       |
| 1 6                         | Locator port                               |
| 1 7                         | Terminal state interrogation               |
| 1 8                         | User windows                               |
| 2 1                         | Horizontal scrolling                       |
| 2 2                         | ANSI color, e.g., VT525                    |
| 2 8                         | Rectangular editing                        |
| 2 9                         | ANSI text locator (i.e., DEC Locator mode) |
| 2 2 4                       | Direct Images Version 1                    |
| 2 2 5                       | Cached Images Version 1                    |
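
As an illustration of the detection step, the following Python sketch
parses a Primary DA response of the usual form `ESC [ ? Ps ; ... c` and
checks for the two proposed parameters. The function name
`parse_da_features` and the sample response string are assumptions made
for this example; reading the response from the terminal is left out.

```python
import re

# Parameters proposed in this document.
DIRECT_IMAGES = 224
CACHED_IMAGES = 225

# Primary DA response: ESC [ ? Ps ; Ps ; ... c
_DA_RE = re.compile(r"\x1b\[\?([0-9;]*)c")


def parse_da_features(response: str) -> set[int]:
    """Return every numeric parameter found in a Primary DA response.

    The first parameter is the terminal's conformance level rather than
    a feature, but it is harmless to keep it in the set when only
    224 and 225 are being looked up.
    """
    match = _DA_RE.search(response)
    if match is None:
        return set()
    return {int(p) for p in match.group(1).split(";") if p}


# Hypothetical response from a terminal that supports both features.
features = parse_da_features("\x1b[?62;1;6;22;224;225c")
supports_direct = DIRECT_IMAGES in features
supports_cached = CACHED_IMAGES in features
```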



Direct Images - Summary
-----------------------

Non-text data (images) can be sent to the terminal for immediate
display in a rectangular region of text Cells. Image data is
transmitted to the terminal using a wire format described later in
this document.

Setting a Cell to image is a destructive operation: the Cell's
original text is lost. Similarly, setting a Cell (or multiple Cells
for fullwidth glyphs or grapheme clusters) to text is a destructive
operation: the image in the Cell(s) is lost.

Setting any part of a multi-Cell Tile to image also "breaks up" the
Tile into a range of single Cells. In other words, image data can
only be carried by a Cell, not a Tile.



Direct Images - New Sequences
-----------------------------

A terminal with the direct images feature must support the following
new sequences:

| Sequence                                    | Description             |
|---------------------------------------------|-------------------------|
| OSC 1 3 3 8 ; F i l e = {args} : {data} BEL | Display image at (x, y) |
| OSC 1 3 3 8 ; F i l e = {args} : {data} ST  | Display image at (x, y) |



For the OSC 1 3 3 8 sequence:

* The {args} is a set of key-value pairs (each pair separated by
  semicolon (';')), followed by a colon (':'), followed by a base-64
  encoded string ({data}).

* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
  'a' - 'z').

* A value is any printable ASCII string not containing whitespace,
  colon, or semicolon ('!' - '9', '<' - '~').

* Any alpha-numeric key may be specified. A key that is not supported
  by the terminal is ignored without error.

* The image is processed as shown below:

  - The pixels are drawn starting at the upper-left corner of the text
    cursor position.

  - If scroll is specified as 1 (enabled), then:

    a. The screen is scrolled up if the image overflows into the
       bottom text row.

    b. The cursor's final position is on the same column as the
       starting cursor position, and on the row immediately below the
       image.

  - If scroll is omitted or specified as 0 (disabled), then:

    a. The screen is never scrolled.

    b. Pixels that would be drawn below the visible region on screen
       are discarded.

    c. The cursor's final position is at the same column and row as
       the starting cursor position, i.e. the cursor does not move at
       all.

  - Pixels that would be drawn to the right of the visible region on
    screen are discarded.
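
The key and value character rules above can be checked mechanically.
The following is a minimal Python sketch, not part of the proposal:
`parse_payload` is a hypothetical helper that takes the text between
`File=` and the terminator and splits it into arguments and decoded
data. Accepting an empty value is an assumption; the rules above do not
say whether that is allowed.

```python
import base64
import re

# Keys: alpha-numeric ASCII.  Values: '!'-'9' and '<'-'~', which
# excludes whitespace, ':' and ';'.
_KEY_RE = re.compile(r"[0-9A-Za-z]+")
_VALUE_RE = re.compile(r"[!-9<-~]*")  # assumption: empty values accepted


def parse_payload(payload: str) -> tuple[dict[str, str], bytes]:
    """Split '{args}:{data}' into a dict of arguments and raw bytes."""
    # Values cannot contain ':', so the first colon ends the arguments.
    args_text, colon, data_text = payload.partition(":")
    if not colon:
        raise ValueError("missing ':' between args and data")

    args: dict[str, str] = {}
    for pair in args_text.split(";"):
        if not pair:
            continue
        # Keys cannot contain '=', so the first '=' ends the key.
        key, equals, value = pair.partition("=")
        if not equals or not _KEY_RE.fullmatch(key) \
                or not _VALUE_RE.fullmatch(value):
            raise ValueError(f"malformed key-value pair: {pair!r}")
        args[key] = value

    return args, base64.b64decode(data_text, validate=True)
```

A terminal would then look up only the keys it supports in the returned
dict and ignore the rest, as required above.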



The keys for the key-value pairs that must be supported by the
terminal are listed below:

| Key          | Default Value | Description                        |
|--------------|---------------|------------------------------------|
| type         | "image/rgb"   | mime-type describing data field    |
| width        | 1             | Number of Cells wide to display in |
| height       | 1             | Number of Cells high to display in |
| scale        | "none"        | Scale/zoom option, see below       |
| sourceX      | 0             | Media source X position to display |
| sourceY      | 0             | Media source Y position to display |
| sourceWidth  | "auto"        | Media width in pixels to display   |
| sourceHeight | "auto"        | Media height in pixels to display  |
| scroll       | 0             | If 1, scroll the display if needed |

A terminal may support additional keys. If a key is specified but not
supported by the terminal, then it is ignored without error.
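
From the application side, composing the sequence is mostly string
formatting. The sketch below is one possible way to do it in Python;
`direct_image_sequence` is a name invented for this example, and the
ST-terminated form is chosen arbitrarily over the BEL form.

```python
import base64

OSC = "\x1b]"   # Operating System Command introducer
ST = "\x1b\\"   # String Terminator


def direct_image_sequence(data: bytes, **args) -> str:
    """Build an OSC 1338 direct-image sequence.

    `data` is the raw image data in the format named by the "type"
    key; keyword arguments become the key-value pairs, in the order
    given.
    """
    pairs = ";".join(f"{key}={value}" for key, value in args.items())
    encoded = base64.b64encode(data).decode("ascii")
    return f"{OSC}1338;File={pairs}:{encoded}{ST}"


# Placeholder bytes stand in for real PNG file data here.
sequence = direct_image_sequence(
    b"abc", type="image/png", width=10, height=4, scale="scale", scroll=1
)
```

Writing `sequence` to the terminal would request that the image be
drawn in a 10x4 Cell rectangle at the cursor, scrolling if needed.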



The "type" value is a mime-type string describing the format of the
base64-encoded binary data. The terminal must support at minimum these
mime-types:

| Type String  | Description                                              |
|--------------|----------------------------------------------------------|
| "image/rgb"  | Big-endian-encoded 24-bit red, green, blue values        |
| "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values |
| "image/png"  | PNG file data as described by (reference to PNG format)  |

A terminal may support additional types. An application can detect
terminal support for a format as follows:

  1. Attempt to draw an image of that type, with "scroll" set to 1.

  2. Check the cursor position with DSR 6.

  3. If the cursor has moved, then the terminal supports this image
     type.
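
The comparison in step 3 can be sketched in Python as below. The
terminal answers DSR 6 with a cursor position report of the form
`ESC [ row ; column R`; the helper names `parse_cpr` and `cursor_moved`
are made up for this example, and the code that writes the probe and
reads the replies from the terminal is omitted.

```python
import re

# Cursor position report sent in reply to DSR 6: ESC [ row ; col R
_CPR_RE = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cpr(response: str) -> tuple[int, int]:
    """Return (row, column) from a cursor position report."""
    match = _CPR_RE.search(response)
    if match is None:
        raise ValueError(f"not a cursor position report: {response!r}")
    return int(match.group(1)), int(match.group(2))


def cursor_moved(before: str, after: str) -> bool:
    """Compare the reports taken before and after the draw attempt."""
    return parse_cpr(before) != parse_cpr(after)
```

One caveat with this probe: if it is issued on the bottom row, the
screen may scroll and leave the reported row unchanged, so an
application would probably want to run it away from the bottom of the
screen.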



The "width" and "height" values are positive integers describing the
number of Cells the image will be placed in.



The "scale" value can take the following values:

| Value     | Meaning |
|-----------|---------|
| "none"    | No scaling along either axis. |
| "scale"   | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping |
| "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area |
| "crop"    | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit |
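
One way to read the four options is as a choice of scale factor. The
Python sketch below computes the size at which the source rectangle
would be drawn before any cropping to the target area. It assumes the
target area is known in pixels (Cells wide/high multiplied by the
terminal's cell size); `scaled_size` and its parameter names are
illustrative only.

```python
def scaled_size(src_w: int, src_h: int,
                dst_w: int, dst_h: int, mode: str) -> tuple[int, int]:
    """Return the (width, height) in pixels the image is drawn at.

    src_* is the source rectangle, dst_* the target area, both in
    pixels.  Anything falling outside the target area is cropped
    by the caller.
    """
    if mode == "none":
        return src_w, src_h            # drawn 1:1
    if mode == "stretch":
        return dst_w, dst_h            # each axis scaled independently
    if mode == "scale":
        # Largest uniform factor that still fits inside the target.
        factor = min(dst_w / src_w, dst_h / src_h)
    elif mode == "crop":
        # Smallest uniform factor that covers the whole target.
        factor = max(dst_w / src_w, dst_h / src_h)
    else:
        raise ValueError(f"unknown scale value: {mode!r}")
    return round(src_w * factor), round(src_h * factor)
```

For a 400x200 pixel source and a 100x100 pixel target, "scale" uses
the factor 0.25 and leaves part of the target empty, while "crop" uses
the factor 0.5 and discards the pixels that overflow horizontally.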



"sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the
rectangle of pixels from the media that will be displayed on the
screen. The ranges for these values are shown below:

| Key          | Minimum Value | Maximum Value                 | Default Value |
|--------------|---------------|-------------------------------|---------------|
| sourceX      | 0             | Media's full width - 1        | 0             |
| sourceY      | 0             | Media's full height - 1       | 0             |
| sourceWidth  | 1             | Media's full width - sourceX  | "auto"        |
| sourceHeight | 1             | Media's full height - sourceY | "auto"        |

If any of these values are specified and outside the range, no image
is displayed, and the cursor does not move. "sourceWidth" and
"sourceHeight" can be "auto", which means use the maximum available
width/height (given sourceX/sourceY) from the media's inherent
dimensions.
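
The range table translates directly into a validation routine. The
following Python sketch is one possible reading of it; the function
name `resolve_source_rect` and the use of `None` to signal "display
nothing" are choices made for this example.

```python
def resolve_source_rect(media_w, media_h, x=0, y=0, w="auto", h="auto"):
    """Resolve "auto" and validate the source rectangle.

    Returns (x, y, width, height) in pixels, or None when any value
    is out of range, in which case no image is displayed.
    """
    if not (0 <= x <= media_w - 1 and 0 <= y <= media_h - 1):
        return None
    # "auto" means: everything from (x, y) to the media's edge.
    if w == "auto":
        w = media_w - x
    if h == "auto":
        h = media_h - y
    if not (1 <= w <= media_w - x and 1 <= h <= media_h - y):
        return None
    return x, y, w, h
```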



Cached Images - Summary
-----------------------

Non-text data (images) can be sent to the terminal for later display
in a rectangular region of text Cells. Image data is transmitted to
the terminal using the CSTORE command described below, and displayed
on screen using the CDISPLAY command. A single CSTORE command can
support many CDISPLAY commands.

Upon display, setting a Cell to image is a destructive operation: the
Cell's original text is lost. Similarly, setting a Cell (or multiple
Cells for fullwidth glyphs or grapheme clusters) to text is a
destructive operation: the image in the Cell(s) is lost.

Setting any part of a multi-Cell Tile to image also "breaks up" the
Tile into a range of single Cells. In other words, image data can
only be carried by a Cell, not a Tile.



Cached Images - Cache/Memory Management
---------------------------------------

The terminal manages a cache of multimedia data on behalf of the
application. The application requests media be stored in the cache
and provides an ID. This ID is later used to request display on the
screen.

The amount of memory and retention/eviction strategy for the cache is
wholly managed by the terminal, with the following restriction:

* The terminal may not remove items from the cache that have any
  portion being actively displayed on the primary or alternate
  screens.

The scrollback buffer is permitted, and recommended, to contain only a
few (or zero) multimedia images. Terminals should consider retaining
only the last 2-5 screens' worth of pixel data in the scrollback
buffer.

Applications have no control over when images are removed from the
cache, and no provision is made to generate/ensure unique IDs.

A terminal multiplexer that passes all CSTORE/CDISPLAY commands to the
host terminal will need to parse the CSTORE and CDISPLAY sequences for
the "id" field and rewrite it to be unique for all of its inner
terminals.
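
The ID rewriting a multiplexer needs could be as simple as a lookup
table keyed by inner terminal and inner ID. The Python sketch below is
only one possible approach: `IdRewriter`, the integer `pane` handle,
and the sequential allocation policy are all assumptions, and releasing
IDs when an inner terminal closes is not shown. The upper bound comes
from the 0 to 999999 range this document gives for "id".

```python
class IdRewriter:
    """Map (inner terminal, inner ID) pairs to unique host IDs."""

    MAX_ID = 999999  # upper bound of the "id" value

    def __init__(self) -> None:
        self._host_ids: dict[tuple[int, int], int] = {}
        self._next_id = 0

    def host_id(self, pane: int, inner_id: int) -> int:
        """Return the host-side ID to use for this inner ID.

        The same (pane, inner_id) pair always maps to the same host
        ID, so a CDISPLAY is rewritten consistently with its CSTORE.
        """
        key = (pane, inner_id)
        if key not in self._host_ids:
            if self._next_id > self.MAX_ID:
                raise RuntimeError("host ID space exhausted")
            self._host_ids[key] = self._next_id
            self._next_id += 1
        return self._host_ids[key]
```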



Cached Images - New Sequences
-----------------------------

A terminal with the cached images feature must support the following
new sequences:

| Sequence                                    | Command  | Description              |
|---------------------------------------------|----------|--------------------------|
| OSC 1 3 4 0 ; F i l e = {args} : {data} BEL | CSTORE   | Store media in the cache |
| OSC 1 3 4 1 ; Pi ; {args} ST                | CDISPLAY | Display media at (x, y)  |



Cached Images - CSTORE
----------------------

For the CSTORE command:

* The {args} is a set of key-value pairs (each pair separated by
  semicolon (';')), followed by a colon (':'), followed by a base-64
  encoded string ({data}).

* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
  'a' - 'z').

* A value is any printable ASCII string not containing whitespace,
  colon, or semicolon ('!' - '9', '<' - '~').



The keys for the key-value pairs that must be supported by the
terminal are listed below:

| Key  | Default Value | Description                     |
|------|---------------|---------------------------------|
| id   | 0             | ID to refer to the image        |
| type | "image/rgb"   | mime-type describing data field |



The "id" value is a non-negative integer between 0 and 999999.
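
A CSTORE sequence can be assembled as in the Python sketch below. The
helper name `cstore_sequence` is invented for this example, and treating
999999 as an allowed value is an assumption about the stated range. The
BEL terminator is used because it is the form listed for CSTORE.

```python
import base64

OSC = "\x1b]"   # Operating System Command introducer
BEL = "\x07"


def cstore_sequence(image_id: int, data: bytes,
                    mime_type: str = "image/rgb") -> str:
    """Build an OSC 1340 (CSTORE) sequence for the given image data."""
    # Assumption: the 0..999999 range is inclusive at both ends.
    if not 0 <= image_id <= 999999:
        raise ValueError("id must be between 0 and 999999")
    encoded = base64.b64encode(data).decode("ascii")
    return f"{OSC}1340;File=id={image_id};type={mime_type}:{encoded}{BEL}"
```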



The "type" value is a mime-type string describing the format of the
base64-encoded binary data. The terminal must support at minimum these
mime-types:

| Type String  | Description                                              |
|--------------|----------------------------------------------------------|
| "image/rgb"  | Big-endian-encoded 24-bit red, green, blue values        |
| "image/rgba" | Big-endian-encoded 32-bit red, green, blue, alpha values |
| "image/png"  | PNG file data as described by (reference to PNG format)  |

A terminal may support additional types. An application can detect
terminal support for a format as follows:

  1. Store an image of that type in the cache.

  2. Attempt to draw the image, with "scroll" set to 1.

  3. Check the cursor position with DSR 6.

  4. If the cursor has moved, then the terminal supports this image
     type.



Cached Images - CDISPLAY
------------------------

For the CDISPLAY command:

* Pi - a non-negative integer ID that was used in a previous CSTORE
  command.

* The {args} is a set of key-value pairs (each pair separated by
  semicolon (';')). Unlike CSTORE, the sequence carries no {data}
  field.

* A key can be any alpha-numeric ASCII string ('0' - '9', 'A' - 'Z',
  'a' - 'z').

* A value is any printable ASCII string not containing whitespace,
  colon, or semicolon ('!' - '9', '<' - '~').

* Any alpha-numeric key may be specified. A key that is not supported
  by the terminal is ignored without error.

* The image pixels are processed as shown below:

  - The pixels are drawn starting at the upper-left corner of the text
    cursor position.

  - If scroll is specified as 1 (enabled), then:

    a. The screen is scrolled up if the image overflows into the
       bottom text row.

    b. The cursor's final position is on the same column as the
       starting cursor position, and on the row immediately below the
       image.

  - If scroll is omitted or specified as 0 (disabled), then:

    a. The screen is never scrolled.

    b. Pixels that would be drawn below the visible region on screen
       are discarded.

    c. The cursor's final position is at the same column and row as
       the starting cursor position, i.e. the cursor does not move at
       all.

  - Pixels that would be drawn to the right of the visible region on
    screen are discarded.



The keys for the key-value pairs that must be supported by the
terminal are listed below:

| Key          | Default Value | Description                        |
|--------------|---------------|------------------------------------|
| id           | 0             | ID to refer to the image           |
| width        | 1             | Number of Cells wide to display in |
| height       | 1             | Number of Cells high to display in |
| scale        | "none"        | Scale/zoom option, see below       |
| sourceX      | 0             | Media source X position to display |
| sourceY      | 0             | Media source Y position to display |
| sourceWidth  | "auto"        | Media width in pixels to display   |
| sourceHeight | "auto"        | Media height in pixels to display  |
| scroll       | 0             | If 1, scroll the display if needed |

A terminal may support additional keys. If a key is specified but not
supported by the terminal, then it is ignored without error.
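
A CDISPLAY sequence for a previously stored image might be composed as
in the Python sketch below. `cdisplay_sequence` is a hypothetical
helper; it places the ID in the Pi position of
`OSC 1 3 4 1 ; Pi ; {args} ST` and does not repeat it as an "id" key,
which is one interpretation of the sequence definition rather than
something this document settles.

```python
OSC = "\x1b]"   # Operating System Command introducer
ST = "\x1b\\"   # String Terminator


def cdisplay_sequence(image_id: int, **args) -> str:
    """Build an OSC 1341 (CDISPLAY) sequence.

    `image_id` fills the Pi parameter; keyword arguments become the
    key-value pairs, in the order given.
    """
    pairs = ";".join(f"{key}={value}" for key, value in args.items())
    return f"{OSC}1341;{image_id};{pairs}{ST}"


# Display cached image 7 in a 4x2 Cell rectangle, scrolling if needed.
sequence = cdisplay_sequence(7, width=4, height=2, scroll=1)
```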



The "width" and "height" values are positive integers describing the
number of Cells the image will be placed in.



The "scale" value can take the following values:

| Value     | Meaning |
|-----------|---------|
| "none"    | No scaling along either axis. |
| "scale"   | Stretch image, preserving aspect ratio, to maximum size in the target area without cropping |
| "stretch" | Stretch along both axes, distorting aspect ratio, to fill the target area |
| "crop"    | Stretch along both axes, preserving aspect ratio, to completely fill the target area, cropping pixels that will not fit |



"sourceX", "sourceY", "sourceWidth", and "sourceHeight" define the
rectangle of pixels from the media that will be displayed on the
screen. The ranges for these values are shown below:

| Key          | Minimum Value | Maximum Value                 | Default Value |
|--------------|---------------|-------------------------------|---------------|
| sourceX      | 0             | Media's full width - 1        | 0             |
| sourceY      | 0             | Media's full height - 1       | 0             |
| sourceWidth  | 1             | Media's full width - sourceX  | "auto"        |
| sourceHeight | 1             | Media's full height - sourceY | "auto"        |

If any of these values are specified and outside the range, no image
is displayed, and the cursor does not move. "sourceWidth" and
"sourceHeight" can be "auto", which means use the maximum available
width/height (given sourceX/sourceY) from the media's inherent
dimensions.