sixels: last line cut/truncated on terminal emulators with "correct" text cursor placement #192

Open
dnkl opened this issue Feb 29, 2024 · 61 comments
Labels
compatibility Compatibility (e.g. terminal quirks) research Research & discussion

Comments

@dnkl
Contributor

dnkl commented Feb 29, 2024

Sixel capable terminal emulators have gotten cursor placement (after emitting the sixel) wrong since the beginning. They usually put the cursor on a new line under the sixel. This means the terminal content may scroll, if a sixel is printed on the last row.

However, it's not how the VT340 did it. The simplified explanation is that it places the cursor on the last line of the sixel. Thus, if you want to print text under the sixel, you first have to print a newline.

The real algorithm is slightly more complex than that. A sixel is 6 pixels tall. This means it can cover two text rows. The DEC cursor placement algorithm puts the text cursor where the top pixel is. This means there are times when two newlines are required to print text under the sixel.
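The difference between the two placements can be sketched with a little arithmetic. This is only an illustration under stated assumptions: the image starts at the top of a text row, the whole last six-pixel band counts as part of the image, and `cursor_row_after_sixel` and its parameters are hypothetical names, not code from any terminal.

```python
import math

SIXEL_BAND_HEIGHT = 6  # every sixel band is six pixels tall


def cursor_row_after_sixel(image_height_px, cell_height_px, policy):
    """Return (cursor_row, newlines_needed), rows counted from the image's first text row.

    policy "top":    row holding the top pixel of the last sixel band (DEC-style).
    policy "bottom": row holding the bottom pixel of the last sixel band.
    """
    if image_height_px <= 0 or cell_height_px <= 0:
        raise ValueError("heights must be positive")
    bands = math.ceil(image_height_px / SIXEL_BAND_HEIGHT)
    last_band_top = (bands - 1) * SIXEL_BAND_HEIGHT
    last_band_bottom = last_band_top + SIXEL_BAND_HEIGHT - 1
    if policy == "top":
        row = last_band_top // cell_height_px
    elif policy == "bottom":
        row = last_band_bottom // cell_height_px
    else:
        raise ValueError("policy must be 'top' or 'bottom'")
    # First text row that no pixel of the last band touches.
    first_free_row = last_band_bottom // cell_height_px + 1
    return row, first_free_row - row


if __name__ == "__main__":
    for height in (12, 24, 120):
        for policy in ("top", "bottom"):
            print(height, policy, cursor_row_after_sixel(height, 20, policy))
```

With a 20-pixel cell, a 24-pixel image has its last band straddling two text rows, so the "top" policy needs two newlines while the "bottom" policy always needs one.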

A number of terminals have started to implement the correct behavior. Terminals that implement the DEC placement algorithm are foot, contour, DomTerm and WezTerm. There may be more that I'm not aware of. XTerm is close to correct, but last time I checked, it placed the cursor on the bottom pixel (i.e. you always need a single newline).

Right now, running chafa <image> && echo "XXXXXX" will look something like this in e.g. foot:

chafa-last-line-cut
(picture shows a part of my dog's paw...)

A bit more information here:

@hackerb9

hackerb9 commented Mar 1, 2024

Hi @dnkl. The VT340's algorithm should not be followed too strictly in modern terminals. Despite what the documentation implied, it uses a fast heuristic which relies on the character cell being 20 pixels tall. That algorithm was faster but included a glitch which should not be copied.

If you are designing a terminal that uses characters that are not 20 pixels tall, the algorithm does not apply and will have to be adapted in one of two ways:

  1. I suggest using the simpler algorithm which I believe you referred to as "bottom pixel". Such a terminal will be compatible with the way that programmers presumed the VT340 always worked and that DEC's documentation would lead a reasonable person to believe. All known original programs and sixel images for the VT340 will work correctly with that algorithm. Additionally, it is easily programmed for as one knows exactly how to draw text under any graphic: just send a newline.
  2. j4james has suggested limiting the sixel image resolution so that, regardless of the font resolution, each character cell shows only 10x20 pixels. This might be useful for someone who wants to run hypothetical VT340 software from thirty years ago which knows about and works around the VT340's cursor positioning quirk. One major downside is that graphical resolution is limited and the only way to increase it is to make the font so tiny it is unreadable.

I strongly believe the first method is the correct one for most modern terminals. It lets programmers easily create software that integrates graphics with character cell text interfaces, which is to me what makes sixels useful.


If you've read my discussion with j4james about whether this VT340 behavior is a "glitch", you'll see that even though he believes it is the historical behavior and thus correct for any terminal that claims to emulate a VT340, neither of us could come up with an easy solution for application programmers who want to just splat a sixel image on the screen and show some text underneath it. Since a workaround requires the application to model the internal state of the VT340, no sane program will ever intentionally use this odd behavior, whether it is technically a glitch or not.

@dnkl
Contributor Author

dnkl commented Mar 2, 2024

@hackerb9 I don't mind changing foot to always put the cursor on the last row touched by the sixel (i.e. the bottom pixel of the last sixel).

What I don't want is slightly different behavior in modern terminals, and I was under the impression that the other "correct" terminals also followed the DEC algorithm? If not, I'd be more than happy to update foot.

@dnkl
Contributor Author

dnkl commented Mar 2, 2024

That said, it looks like chafa isn't emitting a newline at all, so even with the tweaked cursor placement (always put it on the last row touched by the sixel), the image is sometimes cut off.

@dnkl
Contributor Author

dnkl commented Mar 3, 2024

@PerBothner @christianparpart @wez I was hoping we could all agree on how to implement cursor placement after emitting a sixel. As far as I can tell, foot, DomTerm, Contour and WezTerm all place the cursor on the same row as the last sixel. But do you follow the DEC algorithm, and place it on the same row as the upper pixel of the last sixel, or do you place it on the last row touched by the sixel (i.e. the row containing the bottom pixel of the last sixel)?

I know at least some of you have been following the discussions between @hackerb9 and j4james, but I don't know what you ended up implementing. From an application point of view, I think it would be beneficial if we all implemented the same cursor placement algorithm...

Foot currently implements the DEC algorithm, but I think it would be easier for applications if I changed it to just place the cursor on the last row. Then, to print text under the sixel, you know all that's needed is (always) a single newline. Not one or two.

But, I think it's a bad idea to change foot if all other sixel terminals implement the DEC algorithm, and don't want to change.

@PerBothner

I agree putting the cursor on the row containing the bottom sixel row makes more sense, and I can certainly change it if that's the consensus. I prefer to match xterm.js for various reasons. https://github.com/jerch - what do you think?

@jerch

jerch commented Mar 3, 2024

@PerBothner Imho xterm.js currently keeps the text cursor at the row of the bottom-most pixel drawn from last sixel band. Means if the last band contains only "fiftel" (6th pixel never set), the 5th pixel would be the last one, not the sixth anymore.
I did this to allow printing pictures whose height is not a multiple of 6 px and still align them properly at the bottom, without a nonsense excess row or excess space at the bottom.
(There is still a bug attached to it, where empty sixel bands at the end might get truncated - jerch/node-sixel#58)

@wez

wez commented Mar 3, 2024

I'm open to tweaking wezterm to be more sane, assuming that there are a couple of test cases with examples of where the cursor should end up.

FWIW, I think the current cursor placement in wezterm may well be a bit of a fluke arising from re-using the iterm2 image protocol logic that preceded it rather than a conscious effort to implement the vt340 algorithm.

wezterm's logic for this (shared by iterm2, kitty and sixel handling) can be found here:
https://github.com/wez/wezterm/blob/22424c3280cb21af43317cb58ef7bc34a8cbcc91/term/src/terminalstate/image.rs#L65

the vertical position:
https://github.com/wez/wezterm/blob/22424c3280cb21af43317cb58ef7bc34a8cbcc91/term/src/terminalstate/image.rs#L166-L170

the horizontal position:
https://github.com/wez/wezterm/blob/22424c3280cb21af43317cb58ef7bc34a8cbcc91/term/src/terminalstate/image.rs#L233-L246

@hpjansson
Owner

It may be a good idea to get @arakiken and the other mlterm developers on board too. I've been testing with it, since it had one of the first implementations, and is still one of the fastest. It currently (as of version 3.9.3) places the cursor on the row immediately after (that is, the first character row not touched by any sixel, transparent or not).

My main concerns as an application writer are a) consistency between terminals and b) simplicity of design. I'll happily support any consensus terminal developers arrive at.

Imho xterm.js currently keeps the text cursor at the row of the bottom-most pixel drawn from last sixel band. Means if the last band contains only "fiftel" (6th pixel never set), the 5th pixel would be the last one, not the sixth anymore.

I favored this approach at first, but it has the minor annoyance of deliberate image transparency being cut off. It also means applications must inspect the image data in order to know where the cursor'll end up, which is a slightly bigger problem. Correct me if I'm wrong and there's a way around this.

@hpjansson hpjansson added research Research & discussion compatibility Compatibility (e.g. terminal quirks) labels Mar 4, 2024
@dnkl
Contributor Author

dnkl commented Mar 4, 2024 via email

@dankamongmen

following along for notcurses, good to see this effort taking place

@AnonymouX47

Sorry to intrude....

I just want to add that if it's possible to also consider the horizontal cursor position, it'd be really good (from the perspective of an application developer).

A unified vertical position is good enough for aligning images with text or other images vertically but not horizontally (i.e side-by-side).

Yes, it's probably possible to work around this using absolute cursor positioning or save/restore, but these are not always viable options. Besides, I believe the purpose of a consensus includes eliminating the need for workarounds in applications anyway.

Thank you all.

@dnkl
Contributor Author

dnkl commented Mar 4, 2024 via email

@dnkl
Contributor Author

dnkl commented Mar 4, 2024 via email

@dankamongmen

My understanding is the text cursor's horizontal position isn't changed at all. It only moves vertically. Put another way, it is positioned "at the beginning of the sixel", i.e in the bottom left corner of the sixel.

i went and looked at what we do in notcurses, and we do a hard cursor position after emission of any sixel. i imagine any application wanting to be portably correct will have to do the same thing, no? since they might be dealing with old terminals, or noncompliant ones, and it's not indicated via term queries? i don't want to disrupt unification, but from an app/toolkit author's perspective, i don't see how this helps...?

@dnkl
Contributor Author

dnkl commented Mar 4, 2024 via email

@jerch

jerch commented Mar 4, 2024

@hpjansson

I favored this approach at first, but it has the minor annoyance of deliberate image transparency being cut off. It also means applications must inspect the image data in order to know where the cursor'll end up, which is a slightly bigger problem. Correct me if I'm wrong and there's a way around this.

No, you are right. This "bottom-most colored pixel" behavior cuts a fully transparent line of pixels at the bottom as not being part of the original image. If an image has that line intentionally, it will get stripped. That's for level 1 sixel.

Correct me if I'm wrong and there's a way around this.

Well, I put another warning into the docs not to use level 1 sixel on the encoder side anymore, but to go with level 2 with explicit raster attributes denoting the width and height extent. DEC STD 070 also tells us that the graphics extents in the raster attributes should never be exceeded by encoders, thus my decoder uses these to trim the graphics, which also solves the issue of non-multiple-of-6 image heights in a more deterministic way.
We already had several discussions about the worth of the sixel chapter in DEC STD 070 and how it deviates even from DEC's own machines. Imho DEC STD 070 is the only lengthy source from DEC that tries to sound normative, e.g. by implying certain limits on the sixel format, like height and width, or the 256 color slots rule. Maybe they did that to get it in line with other industry standards of that time (I guess 256 color support was at the top end of 80s hardware capabilities), but it kinda never came to life as they soon stopped the whole sixel line.
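As a rough illustration of why explicit raster attributes help, here is a minimal sketch that reads the declared size from a sixel stream. It assumes a 7-bit DCS introducer (ESC P), raster attributes placed directly after the `q` in the form `"Pan;Pad;Ph;Pv`, and a known cell height; `raster_size` and `rows_covered` are hypothetical helper names, not part of any decoder mentioned here.

```python
import math
import re

# ESC P, optional numeric parameters, 'q', then raster attributes: " Pan ; Pad ; Ph ; Pv
_RASTER_RE = re.compile(r'\x1bP[0-9;]*q"(\d+);(\d+);(\d+);(\d+)')


def raster_size(sixel):
    """Return (width_px, height_px) declared by the raster attributes, or None.

    None means the stream carries no usable raster attributes (a level 1
    style stream, or a zero width/height), so the size is only known after
    decoding the pixel data.
    """
    match = _RASTER_RE.match(sixel)
    if match is None:
        return None
    width, height = int(match.group(3)), int(match.group(4))
    if width == 0 or height == 0:
        return None
    return width, height


def rows_covered(height_px, cell_height_px):
    """Number of text rows an image of the given pixel height touches."""
    return math.ceil(height_px / cell_height_px)


if __name__ == "__main__":
    stream = '\x1bPq"1;1;100;50#0~~\x1b\\'
    size = raster_size(stream)
    print(size, rows_covered(size[1], 20))
```

With a declared height, both the terminal and the application can compute the same row count without inspecting the pixel data.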

@AnonymouX47

I just want to add that if it's possible to also consider the horizontal cursor position, it'd be really good (from the perspective of an application developer).

That's not possible with sixel level 1, as it has no idea of width. Every sixel band can have a different sixel cursor width to the right (an image might be ragged to the right) - which one to choose from?
Sixel level 2 brings width & height with its raster attributes, so yes, that could be used for a right border.
To support both conformance levels, only the start cursor offset in a line is determined, which basically leads to the VT340 cursor mode.

Btw xterm.js also uses the VT340 cursor for IIP as the only supported cursor mode to level out image sequence differences. While it is more annoying to deal with that cursor mode as app dev, if you want to place text right of the image, its handling is always the same:

  • know the initial cursor position, either by tracking it in your own buffer state or by doing an explicit CPR
  • place an image of x*y pixels, either with sixel or IIP
  • deduce the image output size in cols;rows from the TE's grid resolution (to get the grid resolution of the TE, either do an ioctl or CSI 14/18 t)
  • move the text cursor by the image rows/cols up/right
  • write your text
  • write your text
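
The steps above can be sketched roughly as follows. This assumes the terminal leaves the cursor on the image's last text row at the starting column, and that the cell size in pixels is already known; `image_cells` and `move_right_of_image` are hypothetical names. The only escape sequences used are CUU (`CSI n A`) and CUF (`CSI n C`).

```python
import math


def image_cells(width_px, height_px, cell_w_px, cell_h_px):
    """Return (cols, rows) of text cells touched by an image of the given pixel size."""
    cols = math.ceil(width_px / cell_w_px)
    rows = math.ceil(height_px / cell_h_px)
    return cols, rows


def move_right_of_image(cols, rows):
    """Build the escape sequence that moves the cursor to the right of the image.

    Assumes the cursor sits on the image's last text row, in the column
    where the image started. The result puts it on the image's first row,
    in the first column to the right of the image.
    """
    seq = ""
    if rows > 1:
        seq += f"\x1b[{rows - 1}A"  # CUU: cursor up
    if cols > 0:
        seq += f"\x1b[{cols}C"  # CUF: cursor forward
    return seq


if __name__ == "__main__":
    cols, rows = image_cells(100, 50, 10, 20)
    print(cols, rows, repr(move_right_of_image(cols, rows)))
```

On a terminal that uses the DEC "top pixel" placement the cursor may sit one row higher than assumed here, so the row offset would need adjusting.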

@dnkl
Contributor Author

dnkl commented Mar 4, 2024 via email

@jerch

jerch commented Mar 4, 2024

I'd be more than happy to change it, to instead truncate the image to the width/height specified in the raster attributes. It'd just make everything simpler on the terminal side.

Yepp, it reduces code complexity a lot, and on the perf side it is actually ~40% faster during sixel decoding because the upper bounds are known beforehand in my decoder.

@AnonymouX47

AnonymouX47 commented Mar 4, 2024

Thanks (@dnkl, @dankamongmen and @jerch) for the clarifications and suggestions. I guess I can work with those.

EDIT: ... as regards cursor horizontal placement.

@hpjansson
Owner

@dnkl

related, but perhaps worth its own issue; chafa currently ends the sixel with a GNL ('-'). Is this intentional? It adds an extra, empty, graphical row. I think it would be better to use a textual newline instead. Fwiw, this behavior has a (very) minor performance impact on foot, for sixels with an explicit width/height, as we're forced to reallocate and enlarge the backing image buffer, and then initialize it to the background color. I'm not really bothered by it, but thought it might be worth mentioning at least. Be happy to move this to a separate issue if you'd prefer that.

I don't remember exactly how intentional it was, but when I wrote most of the encoder back in 2018 I had to work around issues in existing decoders. For instance, I specify the raster dimensions but still make sure to pad every sixel row to the full width, since I noticed a case where the terminal would have garbage in the image buffer otherwise. It's possible the GNL was required by a decoder at some point.

That said, after testing it again now, it seems e.g. mlterm behaves the same with and without the GNL; I think it opens a new sixel row only when its pixel data starts arriving. I don't know of anything that needs the final GNL anymore, so I'll remove it.

I'm also partial to the idea that raster attributes should preempt dynamic resizing. It makes things more predictable for everyone.

@hackerb9

hackerb9 commented Mar 6, 2024

I'm glad to see all the terminal developers here working together!

If I can summarize, it sounds like everyone is in agreement that modern terminals should allow what I will refer to as splat-nl-print: Applications may send sixels to a screen and simply send a newline before any text if they do not wish to overlap the graphics. Although VT340 compatibility is not the highest priority, I can add that my tests show splat-nl-print as the algorithm of choice even on a real VT340 as the occasional glitch is vanishingly rare in actual usage.

Additional points brought up:

  • Should the width and height specified in the Raster Attributes (RA) be used as a clipping box despite DEC's documentation explicitly stating RA does not limit the size of the image? Personally, I think, "Yes". It is a reasonable optimization for modern terminals when there is exactly one RA present in the sixel data stream. However, I would also hope modern terminals would be robust enough to fall back to unoptimized rendering when necessary — for example, no RA in the image, multiple RAs, an RA with zero width / height, or data where the program doesn't know the size ahead of time. (Sidenote: I do not expect any modern emulator to be able to handle @jerch's endless scrolling sixels.)

  • Should the text after a new line overwrite transparent pixels at the bottom of the graphic? I believe so unless the RA width and height specify otherwise.

  • Should Graphic New Line scroll the screen immediately before pixel data is received? Yes, I think that is correct. And applications encoding sixels should not output a final Graphic New Line at the end of the stream.

  • How can positioning of text to the right of a sixel image be made easier for application developers? I agree that a new issue should be created to discuss this. (If someone does, please @ me in the discussion as I'm curious about possible solutions.)
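
A minimal sketch of splat-nl-print from the application side, under the assumptions that the sixel stream ends with a 7-bit String Terminator (ESC \) and that a `-` directly before it is a trailing Graphic New Line. `strip_trailing_gnl` and `splat_nl_print` are hypothetical names used only for illustration.

```python
ST = "\x1b\\"  # 7-bit String Terminator that ends the sixel DCS string


def strip_trailing_gnl(sixel):
    """Drop one Graphic New Line ('-') that sits directly before the terminator.

    Only a single GNL is removed, so deliberately empty bands before it
    are left alone. Streams without the expected terminator are returned
    unchanged.
    """
    if not sixel.endswith(ST):
        return sixel
    body = sixel[: -len(ST)]
    if body.endswith("-"):
        body = body[:-1]
    return body + ST


def splat_nl_print(sixel, text):
    """Emit the image, then a single text newline, then the text."""
    return strip_trailing_gnl(sixel) + "\n" + text


if __name__ == "__main__":
    stream = "\x1bPq#0~~-\x1b\\"
    print(repr(splat_nl_print(stream, "XXXXXX")))
```

Whether the text lands fully below the image still depends on the terminal's cursor placement, which is the point under discussion in this thread.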

@j4james

j4james commented Mar 6, 2024

If you're going to define your own version of Sixel, can you please make it something that apps can opt into or out of with a mode. Worst case, if you don't want to implement both standard and non-standard cursor placement, you could still report the mode as permanently set, and then apps can at least tell what behavior to expect from the terminal.

@PerBothner

Have you tested recent versions of xterm? I think it is desirable to be compatible with xterm. It may be a good idea to contact Thomas E. Dickey, the maintainer of xterm. He has tweaked the handling of Sixels in the past, and may be open to (if necessary) doing so to match the "saner" behavior.

@PerBothner

@j4james I don't believe there is a "standard version" of Sixel. That is part of the problem: Different implementations act differently. Is "standard Sixel" whatever DEC implemented in their terminals? Are all such terminals consistent? What about the specifications (manuals) from DEC? What about corner cases not covered in the manuals? What about xterm - and which version of xterm? If all of these were consistent, I'd consider that as "standard sixel" - but I'm pretty certain that is not the case.

@hackerb9

hackerb9 commented Mar 6, 2024

Have you tested recent versions of xterm? I think it is desirable to be compatible with xterm. It may be a good idea to contact Thomas E. Dickey, the maintainer of xterm. He has tweaked the handling of Sixels in the past, and may be open to (if necessary) doing so to match the "saner" behavior.

UPDATE I have determined that I was mistaken about Xterm's behavior regressing. In fact, it is now almost precisely correct. The one thing it is missing, however, is moving the text cursor down on Graphic New Lines, which just happens to be the default output from ImageMagick's convert tool. @ThomasDickey.

@hackerb9

hackerb9 commented Mar 6, 2024

Here is a new script, textcursor2.sh, which shows how a TEXT NEW LINE (or, equivalently, CURSOR DOWN) separates a sixel image from any following text on a VT340 with its 20 pixel high character cell.

textcursor2

It also shows what happens when GRAPHIC NEW LINE is used; the most important feature of which seems to be that it acts exactly like a single text new line whenever the image height is a multiple of the character cell height.

@hpjansson
Owner

hpjansson commented Mar 8, 2024

That's an interesting proposal, though ideally I'd wish for something that can be used without returning to the first column, and which doesn't require a blank row. Chafa (by request) has a --relative switch that's supposed to print output at the current cursor location, leaving it immediately below the bottom-left corner. This works for symbol graphics, but the promise is hard to uphold for other image protocols.


@hackerb9

hackerb9 commented Mar 8, 2024

As I understand it, the controversial issue is: How should applications and terminals handle this common case: How to print an image (with no mixing of text and image on the same line), and then move the cursor to the first line below the image (that does not overlap any of the pixels)?

Fortunately, there is not much genuine controversy on that point: just send a single newline, '\n'. Easypeasy.

Click to read Hackerb9's humble opinion

The reason there seems to be controversy is that the VT340 can have an extremely rare quirk where the text will overwrite a few pixels at the bottom of the image. Depending upon your goal for the terminal emulator you are writing, this may or may not be important. It's not a major graphical problem and it almost never happens. Still, a terminal which wants to be a faithful clone of the VT340's behaviours would of course care about this nuance. The cost of attempting to replicate it is high, causing other design trade-offs and adding complexity not just for the terminal developers but also for application programmers. Terminals which aim to be useful in modern times would be well advised to skip the quirk.

I believe it was not a design goal but a compromise. The VT340's "top pixel" heuristic is a quick approximation for a calculation that was too expensive at the time: "bottom pixel". Fortunately -- or perhaps by design -- the VT340's character cell height of 20 pixels makes that heuristic work just like "bottom pixel" in nearly every case.

I had thought this glitch was a bug in the VT340 but, after looking into it deeply, I am actually extremely impressed with the engineers from DEC. They came up with a clever solution that nobody noticed at the time was any different from the correct calculation. DEC's lack of documentation on this point would be surprising given how thorough the manuals are until one realizes it was probably omitted on purpose. If people knew the trick the VT340 was using, they might start relying on the quirky behaviour and future terminals would be obligated to support it.

Modern terminals have no need to approximate the calculation of the bottom-most opaque pixel as processors are not as limited as they were in the 1980s.

Even if they wanted to, there is no benefit to trying to extrapolate what the VT340 heuristic would be in modern times. Whatever it is, it is certainly not just picking the top pixel as that doesn't work for other character cell heights. Trying to salvage it by presuming all character cells are 10x20 regardless of the font size causes a cascade of other problems, the worst being that high res images require making the font size imperceptibly small.

And, even if some terminal did implement a heuristic that worked at any font size, it would be useless to application programmers. Calculating when to send two newlines is unnecessarily complicated and sometimes not even possible.

Consider the case where a program wants to display files that contain sixel screen dumps, perhaps captured by the VT340's MediaCopy. Since each file can contain an arbitrarily sized region, the program doesn't know ahead of time how high the image is in pixels. The only sane thing to do would be to send a single newline and presume one is enough to get the text cursor to a free line. This works on a VT340 so close to always that it isn't worth it to try to work around the occasional glitch.

In summary: We're talking about a very minor and rare graphical glitch that can occur on the VT340. While interesting from a historical perspective, only a precise VT340 emulator needs to care about such quirks. There is no benefit to copying this behaviour of the VT340 to modern terminals and much harm.


@PerBothner: Although not appropriate for sixel graphics, I could see your fresh-line proposal being useful for other situations, such as to make sure the prompt is located correctly after a program dies abruptly.

@hpjansson: To not return to the first column after displaying sixels, use IND, '\eD', instead of newline. If a terminal has newline working correctly, then IND should work, too.

@hpjansson
Owner

hpjansson commented Mar 8, 2024

@hpjansson: To not return to the first column after displaying sixels, use IND, '\eD', instead of newline. If you have newline working correctly, then IND should work, too.

Right - but assuming a DEC-faithful TE, I would have to emit IND once or twice, depending - or rely on some extension such as @PerBothner's suggestion. The central question is "can we conserve DEC sixels but do something else to obviate the need to know where the last sixel band fell in relation to text cells?"

@hackerb9

hackerb9 commented Mar 8, 2024

Right - but assuming a DEC-faithful TE, I would have to emit IND once or twice, depending - or rely on some extension such as @PerBothner's suggestion. The central question is "can we conserve DEC sixels but do something else to obviate the need to know where the last sixel band fell in relation to text cells?"

Just emit IND once, same as a newline. This conserves DEC's sixel design.

@hpjansson
Owner

Just emit IND once, same as a newline. This conserves DEC's sixel design.

Okay - I'll do that (and unless I've misunderstood something, accept that a few pixels may get cut off). I'll get out of your hair now so you can discuss the other aspects (e.g. should raster attributes define a clipping rectangle? :-) Enjoying the conversation.

hpjansson added a commit that referenced this issue Mar 8, 2024
The final GNL could cause extra space to be emitted in some
circumstances.

Also fix an issue causing more bands to be padded than necessary when
multithreaded.

See #192 (GitHub).
hpjansson added a commit that referenced this issue Mar 8, 2024
This positions the cursor correctly ~everywhere.

See #192 (GitHub).
@dankamongmen

Right - but assuming a DEC-faithful TE, I would have to emit IND once or twice, depending - or rely on some extension such as @PerBothner's suggestion. The central question is "can we conserve DEC sixels but do something else to obviate the need to know where the last sixel band fell in relation to text cells?"

maybe i'm misunderstanding the need, but in notcurses i handle what i believe to be your problem by getting the terminal size in pixels, dividing that out by the number of rows and cols, and using those as the cell pixel dimensions. doesn't this provide you enough?
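
For illustration, the division described above might look like this on a POSIX system. It is a sketch, not notcurses code: it assumes the terminal fills in the pixel fields of `TIOCGWINSZ` (some report zero), and `cell_size_px` and `query_cell_size_px` are hypothetical names.

```python
import fcntl
import struct
import sys
import termios


def cell_size_px(rows, cols, width_px, height_px):
    """Return (cell_width_px, cell_height_px), or None if any value is unusable."""
    if rows <= 0 or cols <= 0 or width_px <= 0 or height_px <= 0:
        return None
    return width_px // cols, height_px // rows


def query_cell_size_px(fd=None):
    """Ask the terminal driver for its size via TIOCGWINSZ and derive the cell size."""
    if fd is None:
        fd = sys.stdout.fileno()
    # struct winsize: ws_row, ws_col, ws_xpixel, ws_ypixel (four unsigned shorts)
    packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    rows, cols, width_px, height_px = struct.unpack("HHHH", packed)
    return cell_size_px(rows, cols, width_px, height_px)


if __name__ == "__main__":
    try:
        print(query_cell_size_px())
    except OSError:
        print("stdout is not a terminal")
```

When the pixel fields come back as zero, an application would have to fall back to another method, such as the CSI 14/18 t queries mentioned earlier in the thread.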

@hackerb9

hackerb9 commented Mar 9, 2024

should raster attributes define a clipping rectangle?

Good question. I've already said I think it's a reasonable, if not ideal, optimization even though it clearly violates both DEC's documentation and actual hardware behaviour.

I should ask, though, does anyone have a good hypothesis for why DEC repeatedly stated that sixel images can extend beyond the rectangle defined by RA? What is lost by taking this optimization?

My working theory had been that DEC probably wanted RA to define a clipping box but their hardware wasn't up to the task. However, that kinda falls apart when I look into it as their "GPU" (DRAGON) actually featured multiple viewports that might have done the job in no time. And, if jerch's results apply, using clipping could have actually made the VT340 run quite a bit faster, not slower. But do they apply? Would the VT340 have seen a significant speed benefit?

@jerch, when you say you get a 40% speed boost, what exactly was the bottleneck? Memory pressure from dynamic allocation of large rectangles?

@dnkl
Contributor Author

dnkl commented Mar 9, 2024

@hackerb9 how do trailing GNLs interact with last transparent rows being clipped? One way of looking at final, trailing GNL, is that it is a completely transparent sixel row (and thus that it should be removed). But perhaps it's more correct to say that a GNL should be treated as a fully opaque row, until you start printing sixels; then you start tracking the bottom-most opaque pixel.

when you say you get a 40% speed boost, what exactly was the bottleneck? Memory pressure from dynamic allocation of large rectangles?

I obviously cannot speak for @jerch, and I, too, am very curious. However, for me, there's no 40% speed boost just from allowing the raster attributes to act like a clipping region. Foot allocates the entire backing memory when the raster attributes are set. We still have to check for "overflows" (either increase image size if the sixel cursor goes beyond the raster attributes, or ignore the sixel). Thus, it makes very little difference while processing sixel characters.

There would be a small performance gain, in that we wouldn't have to reallocate the backing image when we encounter "sloppy" encoders that emit a trailing GNL, that triggers a vertical resize.

Treating it as a clipping region does simplify things though. And, almost removes the need to scan for last-opaque sixel row ;)

But, I'm fine with either way.

@j4james

j4james commented Mar 9, 2024

does anyone have a good hypothesis for why DEC repeatedly stated that sixel images can extend beyond the rectangle defined by RA? What is lost by taking this optimization?

Infinite scroll would be the most obvious example (I'm sure we discussed this somewhere before but I can't find it in your repo right now). You'd also lose some bandwidth saving tricks that could be beneficial when working with non-rectangular output. You can see the sort of thing I mean in the raster dimension tests.

@hackerb9

hackerb9 commented Mar 10, 2024

@hackerb9 how do trailing GNLs interact with last transparent rows being clipped?

Before I get into the weeds about a trailing graphic newline, I do want to say that I think GNL is not as important as getting the text newline behaviour consistent across modern terminals.

One way of looking at final, trailing GNL, is that it is a completely transparent sixel row (and thus that it should be removed). But perhaps it's more correct to say that a GNL should be treated as a fully opaque row, until you start printing sixels; then you start tracking the bottom-most opaque pixel.

Click to see hackerb9's pondering of GNL

@j4james is most knowledgeable of precise VT340 behaviour and may even know the exact algorithm for 20 pixel tall fonts off-hand.

For modern terminals, I think perhaps a better question would be why did DEC choose the algorithm they did for the VT340? We've already seen that sometimes they developed fast but inexact algorithms to overcome hardware limitations, so what benefits did the algorithm they chose for the VT340's GNL provide to programmers and users at that time?

With the caveat that I haven't thought this out as deeply as I have text newlines, here's my current take on GNL:

EFFECT OF A TRAILING GRAPHIC NEW LINE ON TEXT CURSOR POSITION

| Previous image height | Behaviour |
|---|---|
| Exact multiple of text height | Cursor is moved to the blank line under the image |
| Anything else | Has no effect (usually) |

It seems that a trailing GNL is practically useless to current programmers as the following text will almost always overlap. The one case it is sure to give a fresh line is not terribly useful since a text newline works the same and is more general.

I don't know the design parameters DEC was constrained by, but it looks an awful lot like an attempt at backwards compatibility. Historically, sixels were designed for printers and teletypewriters in which GNL represented advancing the paper by a fraction of the usual line height.

Excerpt from LJ250 Printer Programmer's Reference Manual

6.3.2.4 Graphic New Line (-) The graphic new line (GNL) control code (2/13) sets the active column to the [graphic] left margin and advances the paper by the current sixel height.

Since the fractions can add up, it makes sense that some programmers may have relied on printing images at a multiple of the line height and sending a final GNL to move the printhead to the next (whole) text line instead of using an explicit LF. Perhaps this was a common programming idiom and DEC wanted to make sure it still worked on video terminals.

A possible critique and response

One problem with this theory is that printer-terminals, unlike video-terminals, might have been able to print a fraction of a line down so sizing images to a multiple of the line-height might not matter. Response: It's also possible that being aligned to whole lines was important if not 100% necessary. For example, the manual for the DEC LA100 printer has a caution about using Partial Line Down:

The PLD sequence does not modify the active line. To avoid losing the top of form reference send an equal number of PLU sequences to the terminal.

Another possible reason aligning to whole lines may have been important back in those days was that green bar paper was common, but that seems weak to me.


Even if my above theory is correct, one thing I don't get is why not always advance the text cursor? What, if any, benefit is there to have a trailing GNL stay on the same line?

My first thought was that perhaps the fractional page motion was saved and would be used to align any following sixel images, but no, they overwrite the previous image just like text does. Speed of calculation is likely part of it, but what exactly were they trying to calculate? I suspect this is a historical mystery which won't be solved until someone documents the actual behaviour of something even older than a VT340, perhaps a DECwriter IV printer-terminal.

@dnkl
Contributor Author

dnkl commented Mar 11, 2024

Alright, I now have three open PRs for foot, addressing the following:

  • Place cursor on the last character row touched by the sixel: this is the one I started out this ticket with.

  • limit image size to the one specified in the raster attributes: stops foot from allowing images to grow beyond the dimensions in the raster attributes. This is mostly to sync with @jerch. In the end, it didn't really offer any major benefits, and I would be just as happy to continue supporting dynamically growing the image beyond the raster attributes' dimensions. If I were to decide all on my own, I wouldn't merge this PR, and instead continue supporting dynamic resizes. Note that even with this PR, dynamically sized images are still supported, as long as they omit the raster attributes.

  • trim trailing, fully transparent sixel rows: does what it says. We haven't really discussed the nitty gritty details on this one, so, I chose to do this for all sixels, regardless of the background color mode (i.e. the P2 parameter), and regardless of whether there are any raster attributes present or not.

Is this something you all (though I guess it's pretty clear where @j4james stands on this) would consider implementing in your TEs?

Just to make it clear: I don't intend to merge any of the above (1640 being the exception) unless we can reach at least some level of consensus here.

@hackerb9 thanks again for your detailed explanation. What I ended up doing (in 1640), is to let trailing GNLs move the text cursor as if you had at least one fully opaque 6-pixel sixel on that row, but as soon as you start printing sixels, I switch to tracking whatever the actual bottom pixel is. In other words, a trailing GNL will not be trimmed out when we remove trailing, transparent sixel rows.
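A rough sketch of that height tracking, for illustration only. This is not foot's code: the `bands` representation and the function name are assumptions. Each band is a list of 6-bit sixel values (bit 0 is the top pixel), and an empty list stands for a band that was reached by a GNL but never drawn into.

```python
def effective_height(bands):
    """Image height in pixels after trimming trailing transparent rows.

    bands -- one list per sixel band; each entry is a 6-bit sixel value
             (bit 0 = top pixel). An empty list means a GNL moved to that
             band but no sixels were printed in it.
    """
    height = 0
    for index, band in enumerate(bands):
        if not band:
            # Band reached by a GNL with nothing printed: count it as if it
            # held a fully opaque sixel. The first band is not reached via a
            # GNL, so an empty first band contributes nothing.
            bottom = 6 if index > 0 else 0
        else:
            # Sixels were printed: track the actual bottom-most set pixel.
            mask = 0
            for sixel in band:
                mask |= sixel
            bottom = mask.bit_length()  # 0 when the band is fully transparent
        if bottom:
            height = max(height, index * 6 + bottom)
    return height
```

With this model, `[[63], [3]]` is 8 pixels tall, `[[63], [0]]` is trimmed to 6, and `[[63], []]` (a trailing GNL) counts as 12.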

@hpjansson
Owner

@dankamongmen

maybe i'm misunderstanding the need, but in notcurses i handle what i believe to be your problem by getting the terminal size in pixels, dividing that out by the number of rows and cols, and using those as the cell pixel dimensions. doesn't this provide you enough?

It's sufficient, but not ideal (click to expand summary):
  1. The final sixel band can fall entirely within the final cell row, or it can fall in multiple cell rows (most likely the final two - but technically if your cells are <= 4px tall, it could be more). When the terminal leaves the cursor at the row containing the topmost pixel of the final sixel band, it means you'll have to move the cursor down by one or two (or perhaps more) rows to get clear of the image. This seems more complex than necessary.
  2. You need an interactive terminal session that can report its pixel size (ioctl or control sequences). A tool like convert can't produce a sixel image occupying a consistent cell extent.
  3. There's probably a race condition where the application gets the terminal dimensions, and while it prepares the image, the terminal's cell size changes. "Zoom" accelerators like the ones VTE implements (C-S-+ and C-S--) allow the user to trigger this easily during animations.
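To make points 1 and 2 concrete, here is a minimal sketch of the workaround. It assumes a POSIX system where the terminal fills in the pixel fields of `TIOCGWINSZ`; the function names are made up for illustration, and `image_height` means whatever height the terminal ends up using for the image.

```python
import fcntl
import struct
import termios


def query_cell_size(fd=1):
    """Return (cell_width, cell_height) in pixels for the terminal on fd."""
    buf = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    rows, cols, xpixel, ypixel = struct.unpack("HHHH", buf)
    if 0 in (rows, cols, xpixel, ypixel):
        # Many terminals (and all pipes/files) do not report a pixel size.
        raise OSError("terminal did not report its pixel size")
    return xpixel // cols, ypixel // rows


def rows_to_clear_image(image_height, cell_height, band_height=6):
    """Rows to move down when the cursor is left on the row that holds the
    topmost pixel of the final sixel band."""
    last_band_top = ((image_height - 1) // band_height) * band_height
    cursor_row = last_band_top // cell_height
    bottom_row = (image_height - 1) // cell_height
    return bottom_row - cursor_row + 1
```

With 20 pixel cells, a 20 pixel image needs one newline and a 24 pixel image needs two; with 2 pixel cells a single band already needs three. The cell size can still change between `query_cell_size()` and the moment the image is displayed, which is the race in point 3.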

To be clear, I'm not asking for anything in particular to be done about this, just that it's taken into account if terminal maintainers are making changes anyway. IMO, a broad consensus is more important than any of these concerns. Also, as @hackerb9 suggested, we should probably leave points 2 and 3 for a separate issue :-)

@AnonymouX47

AnonymouX47 commented Mar 11, 2024

@dnkl

Considering images that actually have trailing transparent rows, I have a couple of concerns/questions regarding trimming them:

  1. Won't it affect vertical cursor placement?
  2. Won't it affect drawing an image with P2=0 over another?

@dankamongmen

@hpjansson thanks for the explanation. i work around these three issues, but they're all valid concerns.

@dnkl
Contributor Author

dnkl commented Mar 14, 2024

@AnonymouX47

Won't it affect vertical cursor placement?

Yes, and that's kind of the whole idea. If we don't trim, all images will be forced to have a height that is a multiple of 6.

If we choose to truncate images with raster attributes, we could also choose to not trim trailing transparent rows. But if we don't truncate the image, I think trimming should be done regardless of whether the image has raster attributes or not. Otherwise, an image with raster attributes would still be forced to have a height that is a multiple of 6.
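A quick worked example of the "multiple of 6" problem, as a sketch with hypothetical helper names: sixel data is always sent in bands of 6 pixel rows, so without trimming a 20 pixel tall image occupies 24 pixel rows and spills into a second 20 pixel text row.

```python
import math

SIXEL_BAND_HEIGHT = 6  # a sixel band is always 6 pixel rows tall


def untrimmed_height(image_height):
    """Height the image occupies when trailing transparent rows are kept."""
    return math.ceil(image_height / SIXEL_BAND_HEIGHT) * SIXEL_BAND_HEIGHT


def rows_touched(height, cell_height):
    """Number of text rows covered by an image of the given pixel height."""
    return math.ceil(height / cell_height)
```

Here `untrimmed_height(20)` is 24, so `rows_touched(24, 20)` is 2, whereas the trimmed image gives `rows_touched(20, 20)`, which is 1.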

Won't it affect drawing an image with P2=0 over another?

That's a valid question. Not sure if @hackerb9 has any insights on what the real VT340 does? It would kind of make sense to only trim when P2=1.

@AnonymouX47

If we choose to truncate images with raster attributes, we could also choose to not trim trailing transparent rows.

Honestly, I think this approach results in the most reliable/consistent/predictable behaviour and is technically the most straightforward and efficient to implement... both for TE and app developers.

@hackerb9

hackerb9 commented Mar 15, 2024

Won't it affect drawing an image with P2=0 over another?

That's a valid question. Not sure if @hackerb9 has any insights on what the real VT340 does? It would kind of make sense to only trim when P2=1.

Definitely a good question, though straying a bit from the issue nominally at hand (newlines: graphical and otherwise).

I just ran a test of p2 effects on overlaying graphics and the results surprised me.

The rules for overlaying graphics seem to be:

  1. If transparency is on (P2 = 1), everything is composited as expected regardless of the setting of RA.
  2. If transparency is off (P2 = 0) and a size is specified in RA, a rectangle of that size is cleared and the cursor is moved back to the starting corner (top left of image) before drawing sixels. This does not affect the final text cursor.
  3. If transparency is off (P2 = 0) and a size is not specified by RA, then a rectangle of the background color is cleared from the cursor position to the bottom right corner of the screen. This also does not affect the text cursor.
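The three rules can be summarised as a function that returns the rectangle to clear before any sixels are drawn. This is a sketch of the observations above, not of any terminal's code: the name, the pixel-coordinate parameters and the lack of clamping to the screen in rule 2 are assumptions.

```python
def background_clear_rect(p2, ra_size, cursor_px, screen_px):
    """Rectangle (x, y, width, height) filled with the background color
    before drawing, or None when nothing is cleared.

    p2         -- the P2 parameter of the sixel DCS (1 = transparent)
    ra_size    -- (width, height) from the raster attributes, or None
    cursor_px  -- (x, y) pixel position of the image's top-left corner
    screen_px  -- (width, height) of the screen in pixels
    """
    if p2 == 1:
        # Rule 1: transparency on, nothing is cleared, RA or not.
        return None
    x, y = cursor_px
    if ra_size is not None:
        # Rule 2: clear exactly the RA rectangle (not clamped here).
        width, height = ra_size
        return (x, y, width, height)
    # Rule 3: no size known, clear to the bottom right corner of the screen.
    screen_width, screen_height = screen_px
    return (x, y, screen_width - x, screen_height - y)
```

In none of the three cases does the cleared rectangle influence where the text cursor ends up.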

№ 3 was the most surprising to me, but I guess it makes sense for a sixel parser: if you don't have any guess what size the graphics actually are, but you know there's an opaque background that must be cleared first, set the RA size to maximum.

This behaviour also fits with how the documentation talks about the RA size parameter not being the actual geometry of the sixel image but rather an easy way to clear a rectangle. (You can see that in my test because I made the 20x20 image have a 60x60 RA size, which matters when transparency is off, P2=0).
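For anyone who wants to poke at this without the script, here is a minimal sketch that builds such a sequence: a solid red image with an optional, deliberately larger RA size. It is not taken from p2effect.sh; the function name and the choice of color register 1 are assumptions.

```python
def solid_sixel(width, height, p2=0, ra=None):
    """Build a sixel DCS string for a solid red width x height image.

    p2 -- background mode (0 = opaque background, 1 = transparent)
    ra -- optional (width, height) for the raster attributes; it may
          differ from the real image size
    """
    parts = [f"\x1bP0;{p2};0q"]              # DCS P1;P2;P3 q
    if ra is not None:
        parts.append(f'"1;1;{ra[0]};{ra[1]}')  # raster attributes: Pan;Pad;Ph;Pv
    parts.append("#1;2;100;0;0#1")           # define color 1 as RGB red, select it
    bands = []
    for top in range(0, height, 6):
        rows = min(6, height - top)
        mask = (1 << rows) - 1               # low bits are the top pixels
        bands.append(f"!{width}{chr(63 + mask)}")  # repeat one sixel `width` times
    parts.append("-".join(bands))            # GNL between bands, none trailing
    parts.append("\x1b\\")                   # ST
    return "".join(parts)
```

Printing `solid_sixel(20, 20, p2=0, ra=(60, 60))` and then the same image with `p2=1` over existing text or graphics should show whether a terminal clears the 60x60 rectangle only in the first case.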

It was also interesting to me that the Raster Attribute size had no effect on the final cursor position. I'm not sure what the benefit is, but I think perhaps it makes sense since multiple Raster Attributes are allowed in a single sixel DCS string.

If you want to test your terminal emulator of choice, you can get my script from here: https://raw.githubusercontent.com/hackerb9/vt340test/main/sixeltests/p2effect.sh . I'm curious to know the results.


Footnote: I think of the VT340 as lacking the rectangle operations that existed in later terminals like the VT4x0. I'm not sure if I ever quite grasped before that there is actually an easy way to clear rectangles on the VT340. (And the rectangle doesn't even have to align to the character cell! --- not sure if that's a bug or a feature.)

@AnonymouX47

Wow! Ain't that something... Now, i kinda regret asking.

@dnkl
Contributor Author

dnkl commented Mar 15, 2024

@hackerb9 thanks! Those are some interesting results. I'll be making a couple of changes in foot to better match the VT340. I'm also inclined not to make RA truncate images, but instead to continue allowing images to extend beyond their RA, combined with trimming trailing transparent sixel rows.

@wez

wez commented May 5, 2024

I ran the p2effect.sh script on wezterm and xterm.

xterm:
image

wezterm:
image

Looks a bit wonky in wezterm(!)

@j4james

j4james commented May 6, 2024

@wez I think it's your system that is a bit wonky! Maybe an incompatibility with the shell? Because even the Xterm image has a whole bunch of mistakes that I'm not seeing when I run the script myself. For example, where are all those $ characters coming from? Why are the red blocks offset one column to the right? And why is the cursor text not lined up with the character? Xterm does have a few issues, but it's not nearly as bad as it appears in your screenshot.

@hackerb9

hackerb9 commented May 9, 2024

Hey @wez, I think @j4james is right about your shell. Did you get it figured out? If not, please let me know the output of bash --version as my script should definitely not be doing that. If I recall correctly, MacOS comes with an ancient version of bash.

@wez

wez commented May 10, 2024

Ah, I think I was lazy and didn't chmod the script and just ran it with sh. Running with bash explicitly gives:

xterm:

image

wezterm -n:

image

@j4james

j4james commented May 10, 2024

@wez Your Xterm screenshot still looks wrong to me. Are you sure you're using the latest version? This is what I get:

XTerm screenshot

image

It doesn't get the cursor position right when raster attributes are set, and it doesn't set the opaque background correctly when raster attributes are not set, but otherwise it seems OK to me.

And note that you need to use a 10x20 font if you want to fully emulate the VT340, otherwise the cursor position tests are going to be misleading.

hpjansson added a commit that referenced this issue Jun 11, 2024
The final GNL could cause extra space to be emitted in some
circumstances.

Also fix an issue causing more bands to be padded than necessary when
multithreaded.

See #192 (GitHub).
hpjansson added a commit that referenced this issue Jun 11, 2024
This positions the cursor correctly ~everywhere.

See #192 (GitHub).
lbrayner pushed a commit to lbrayner/foot that referenced this issue Jul 19, 2024