Erlang 26 breaks command line on Windows #7621

Closed
Ilyushya opened this issue Sep 3, 2023 · 36 comments
Labels: bug (Issue is reported as a bug), team:VM (Assigned to OTP team VM)

Comments

@Ilyushya

Ilyushya commented Sep 3, 2023

Describe the bug

  • After the further-actions dialog appears (the one shown after ^C), some keys (such as the arrow keys and Alt) start producing an incoherent mess, similar to the special character sequences normally used for coloured text in Windows CMD and PowerShell.

  • In June (June 26, to be precise), when I first experienced the issue, even Clipdiary, a clipboard manager, would break.

  • The backspace key starts deleting whole words.

Attachments: backspace.by.word.mp4, broken.clipdiary.mp4, normal.clipdiary.example.mp4

  • bad: The Dialog is used and the console is now broken
  • ok: The Dialog is skipped with another Ctrl-C, fine
  • point of crash: The problem is definitely caused by that menu

To Reproduce
Run any BEAM app and pick a further action

Expected behavior
The console doesn't break

Affected versions
26

Notes
I suspect the disappearance of the GUI app and this flaw are linked.

@Ilyushya Ilyushya added the bug Issue is reported as a bug label Sep 3, 2023
@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Sep 4, 2023
@frazze-jobb
Contributor

Related #7548

garazdawi added a commit to garazdawi/otp that referenced this issue Sep 6, 2023
This is needed for cmd.exe to have the correct settings after
it has been terminated by Ctrl-C. Powershell does not need this,
but it does not hurt either.

Closes erlang#7548
Closes erlang#7621
@user20230119

Was this fixed in 26.1.2?

@garazdawi
Contributor

The fix is to be released in 26.2.

Since GitHub does not notify when issues are updated, I never noticed that this issue had been updated with more bug reports, so not all of the bugs listed above have been fixed yet. I’m therefore re-opening this issue.

@garazdawi garazdawi reopened this Oct 26, 2023
@thojanssens

Thank you for working on this. This has been a pain.

May I ask, does this happen because... Windows sucks? Would this kind of problem ever happen on Linux? Or MacOS?

@garazdawi
Contributor

The same issue has been present on most other platforms. Here is an example where we fixed the same issue on MacOS: #3796. So in this particular case, Windows is not worse than most others 😄

One of the areas where Windows is not as good as the other platforms is that it is very difficult, if not impossible, to write automated tests for the console. So it is very easy for us to change something and not know that it breaks on Windows until someone reports it. We do some manual testing, but it is hard to manually cover everything that our automated tests cover on other platforms.

@user20230119

@thojanssens I switched to PowerShell as it doesn't have this issue.

@frazze-jobb frazze-jobb assigned garazdawi and unassigned frazze-jobb Dec 11, 2023
@Ilyushya
Author

Ilyushya commented Jan 4, 2024

@thojanssens I switched to PowerShell as it doesn't have this issue.

I have also done that, but the issue is still there on cmd

@joeapearson

I'm finding that OTP 26.2.1 (latest at the time of writing) + Elixir 1.16.0 will hang on Windows 11. The behaviour is intermittent but reproduces more often than not. I assume that the issue I'm encountering is related to those described on this issue hence posting here.

The application continues running, but nothing is echoed back to me on the console when I attempt to interact with it. Pressing Ctrl+C a couple of times exits as expected, (perhaps) interestingly printing the BREAK message several times (more times than I would otherwise expect).

It does seem like I can cause the command prompt to stay alive and continue accepting input as expected by (and I know this sounds silly) pressing return a bunch of times as soon after startup as I can.

I've reproduced in these combinations:

  • PowerShells v1 and v7 via Windows terminal
  • cmd.exe via Windows terminal
  • PowerShells v1 and v7 via the Windows console host
  • cmd.exe via the Windows console host

I can revert to OTP 25 of course but the promise of being able to work without werl.exe is a good one.

It seems to be very difficult to get any useful information out of iex when it is stuck. Can anyone suggest a good way of extracting useful information that I could add to this issue and/or a means of resolving it?

@garazdawi
Contributor

If you do "Ctrl+Break" then "A" and then "Enter" you will create a crash dump of your system called erl_crash.dump. You can then use crashdump viewer to view the dump and dig around to see what is going on. If nothing is private in the dump, feel free to post it here and I can have a look.
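
A side note on the dump itself: erl_crash.dump is a plain-text file, and its header normally includes a line starting with "Slogan:" that states why the dump was written. Below is a minimal Erlang sketch for pulling that line out before opening the full crashdump viewer. The module name dump_slogan and its function names are made up for illustration and are not part of OTP.

```erlang
-module(dump_slogan).
-export([from_file/1, from_binary/1]).

%% Hypothetical helper, not part of OTP: read a crash dump from disk
%% and return its slogan line, if one is present.
from_file(Path) ->
    {ok, Bin} = file:read_file(Path),
    from_binary(Bin).

%% Split the dump into lines (accepting both \r\n and \n endings)
%% and look for the first line that starts with "Slogan: ".
from_binary(Bin) ->
    Lines = binary:split(Bin, [<<"\r\n">>, <<"\n">>], [global]),
    find(Lines).

find([<<"Slogan: ", Slogan/binary>> | _]) -> {ok, Slogan};
find([_ | Rest]) -> find(Rest);
find([]) -> not_found.
```

From an Erlang shell started in the directory that contains the dump, this could be called as dump_slogan:from_file("erl_crash.dump").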

@joeapearson

@garazdawi thanks for the offer.

Here's an erl_crash.dump as requested.

erl_crash.zip

@joeapearson

@garazdawi In the most polite recognising-that-you-must-have-plenty-of-other-stuff-to-do way, is there any chance that you might have found anything useful in that crash log that I attached on the previous reply to this message? Thanks for any help you're able to provide.

@garazdawi
Contributor

Thanks for the ping, I had forgotten that you added a dump. From what I can tell Elixir has not started the reading end of the tty (that is shell:start_interactive/0 has not been called). Maybe @josevalim can shed some light on when that may happen?

@josevalim
Contributor

@joeapearson which command (elixir, mix, iex, etc) do you run when you see the hanging?

@garazdawi
Contributor

garazdawi commented Feb 8, 2024

Seems to be iex -S mix.

If you open the dump using crash dump viewer you will see the command line options in the stacktrace of the init process.

@josevalim
Contributor

Thank you @garazdawi! Looking at the user_drv process in the crash dump, I can see IEx.Shell as part of a group, which makes me think we did call shell:start_interactive/1, no?

@joeapearson if you are running iex -S mix, it will first attempt to compile the current project. Has compilation already finished by the time you notice it is stuck?

In order to support both Erlang/OTP 26 and earlier, our boot process goes like this.

  1. First create a file called custom_user.erl with the following:

     -module(custom_user).
     -compile(export_all).
     start() -> user_drv:start(#{initial_shell => noshell}).

  2. Compile it: erlc custom_user.erl

  3. Now run: erl -noshell -user custom_user -eval "spawn(fun() -> shell:start_interactive() end)"

@joeapearson can you try reproducing the issue with the steps above? You can try executing the third step several times.

@garazdawi
Contributor

Thank you @garazdawi! Looking at the user_drv process in the crash dump, I can see IEx.Shell as part of a group, which makes me think we did call shell:start_interactive/1, no?

Yes, you are right. I thought I checked there but apparently not. Seems like this is an issue in our code. Sorry for the noise.

For some reason the reader process has crashed... I wonder why...

@josevalim
Contributor

No worries, I am also glad to help trim down any issue that involves Elixir so you have to debug only Erlang (and not Elixir).

@garazdawi
Contributor

@joeapearson when GitHub Actions has finished with #8103, it will produce a Windows installer containing an Erlang build with some extra logging for when the tty reader crashes. If you could have a go and see whether you can reproduce the behaviour with that version of Erlang, that would be great. If you can reproduce it, please attach the crash dump again so that I can view it.

@garazdawi
Contributor

ping @joeapearson, do you have some time to reproduce the error or should I close this issue?

@simonmcconnell

simonmcconnell commented May 3, 2024

I thought I'd give OTP 26 another go, but I'm getting an intermittent error running the latest versions (OTP 26.2.5, Elixir 1.16.2-otp-26) when using Windows PowerShell, Command Prompt or PowerShell Core (aka PowerShell).

R:\>erl -noshell -user -custom_user -eval "spawn(fun() -> shell:start_interactive() end)"
Erlang/OTP 26 [erts-14.2.5] [source] [64-bit] [smp:48:32] [ds:48:32:10] [async-threads:1] [jit:ns]

Eshell V14.2.5 (press Ctrl+G to abort, type help(). for help)
=CRASH REPORT==== 3-May-2024::13:36:47.186000 ===
  crasher:
    initial call: prim_tty:reader/1
    pid: <0.68.0>
    registered_name: user_drv_reader
    exception error: no case clause matching {error,
                                              {'GetOverlappedResult',
                                               'The I/O operation has been aborted because of either a thread exit or an application request.\r\n'}}
      in function  prim_tty:reader_loop/6 (prim_tty.erl, line 476)
    ancestors: [user_drv,<0.64.0>,kernel_sup,<0.47.0>]
    message_queue_len: 0
    messages: []
    links: [<0.65.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 376
    stack_size: 28
    reductions: 156
  neighbours:

in iex.bat -S mix

2024-05-03 13:32:45.297 [error] Process :user_drv_reader (#PID<0.68.0>) terminating
** (CaseClauseError) no case clause matching: {:error, {:GetOverlappedResult, :"The I/O operation has been aborted because of either a thread exit or an application request.\r\n"}}
    (kernel 9.2.4) prim_tty.erl:476: :prim_tty.reader_loop/6
    (stdlib 5.2.3) proc_lib.erl:241: :proc_lib.init_p_do_apply/3
Initial Call: :prim_tty.reader/1
Ancestors: [:user_drv, #PID<0.64.0>, :kernel_sup, #PID<0.47.0>]
Message Queue Length: 0
Messages: []
Links: [#PID<0.66.0>]
Dictionary: []
Trapping Exits: false
Status: :running
Heap Size: 376
Stack Size: 28
Reductions: 180 ~
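
For readers unfamiliar with this error class: "no case clause matching" means a case expression received a value that none of its clauses handles, here an {error, {'GetOverlappedResult', ...}} tuple returned while reading from the console. The sketch below is not the prim_tty code and not the eventual fix. The module case_clause_demo and both functions are hypothetical and only reproduce the shape of the failure, plus one way a loop could tolerate such a value.

```erlang
-module(case_clause_demo).
-export([handle/1, handle_tolerant/1]).

%% Hypothetical stand-in for one step of a reader loop.
%% Only {ok, Data} and eof are handled, so any {error, _} result
%% raises a case_clause error and the calling process crashes.
handle(Result) ->
    case Result of
        {ok, Data} -> {data, Data};
        eof -> stop
    end.

%% Same step with an explicit clause for error tuples, so the caller
%% can decide to retry or shut down instead of crashing.
handle_tolerant(Result) ->
    case Result of
        {ok, Data} -> {data, Data};
        eof -> stop;
        {error, Reason} -> {retry, Reason}
    end.
```

Calling case_clause_demo:handle({error, aborted}) fails with a case_clause error carrying the unmatched value, which is the same pattern the crash report shows for prim_tty:reader_loop/6.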

@garazdawi
Contributor

Strange, what you are doing "works on my machine". How often do you get that error?

@simonmcconnell

I think it was almost every time with Windows Powershell and Command Prompt and less often with Powershell Core. Originally I thought Powershell Core worked because it ran fine a few times.

@nixxquality
Contributor

This issue remains in OTP 27 and seems to happen more frequently when running rebar3 shell, at least on my machine.

When running Simon's command from above (#7621 (comment)) I had one crash after 13 runs. Maybe run a check in the source code for unlucky numbers? ;)

@josevalim
Contributor

A user on ElixirForum mentioned that running chcp 65001 in the terminal beforehand addresses this issue. Can you folks confirm that's the case?

@nixxquality
Contributor

I ran chcp 65001 and then repeatedly ran rebar3 shell. It still crashes about a fifth of the time.
Are there some debug flags I can enable to help pin this issue down?

@simonmcconnell

I have chcp 65001 set in my powershell profile so it isn't that. Would a unicode issue cause intermittent crashes? There would need to be something like random string generation going on.

I tried it in the developer command prompt, with and without chcp 65001 and the error persists.

@simonmcconnell

iex.bat seems to work fine in 26.2.5.2 and 27.0.1 for me on Elixir 1.17.2.

Still getting the overlapped IO crash intermittently when running erl -noshell -user -custom_user -eval "spawn(fun() -> shell:start_interactive() end)"

@nixxquality @joeapearson @Ilyushya want to check on the latest versions?

@Ilyushya
Author

Ilyushya commented Jul 22, 2024

@simonmcconnell I've updated Erlang & Elixir. Luckily, I am not having that particular problem, but I do have a problem of faulty character printing when a recursive function in a Mix app grabs input from console and prints it back if it's not abusive, and prints "!@#$" in case it is.

Microsoft Windows [Version 10.0.19045.4651]
(c) Корпорация Майкрософт (Microsoft Corporation). Все права защищены.

C:\Users\Ilyushya\Documents\chat_prof_filter>mix

13:59:25.326 [info] Plug now running on localhost:4000
>hi
hi

>привет
�ਢ��

>bad word
bad word

>плохое слово
���宥 ᫮��

>

The backspace key removes whole words. That's a problem I've experienced before.
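
A possible reading of the garbled echo above, inferred from the characters shown and not confirmed by anyone in this thread: the output matches "привет" encoded in code page 866 (bytes AF E0 A8 A2 A5 E2) being decoded as UTF-8. Three of those bytes happen to form one valid UTF-8 sequence (E0 A8 A2 is U+0A22, "ਢ") and the others are invalid on their own, which yields one stray character surrounded by replacement characters. The sketch below, with a hypothetical module name cp866_demo, performs that lossy decoding so the result can be compared with the console output.

```erlang
-module(cp866_demo).
-export([lossy_utf8/1]).

%% Decode a binary as UTF-8 and return a list of code points.
%% Every byte that cannot start a valid UTF-8 sequence is replaced
%% with U+FFFD, the replacement character terminals usually show.
lossy_utf8(Bin) -> lossy_utf8(Bin, []).

lossy_utf8(<<>>, Acc) ->
    lists:reverse(Acc);
lossy_utf8(<<C/utf8, Rest/binary>>, Acc) ->
    lossy_utf8(Rest, [C | Acc]);
lossy_utf8(<<_Bad, Rest/binary>>, Acc) ->
    lossy_utf8(Rest, [16#FFFD | Acc]).
```

Passing the six CP866 bytes, cp866_demo:lossy_utf8(<<16#AF,16#E0,16#A8,16#A2,16#A5,16#E2>>), gives the code points FFFD, 0A22, FFFD, FFFD, the same four-character pattern as the echoed line.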

@nixxquality
Contributor

rebar3 shell still crashes intermittently on 27.0.1 for me.

@garazdawi
Contributor

It is still on my todo list. Hope to have it done by 27.1. A PR would help speed things along.

@nixxquality
Contributor

If I had any clue what was causing the issue I would attempt to contribute a PR :)

I ran a simple autohotkey script to repeatedly open and close an erl shell and after more than 10 minutes it hadn't crashed.
For some reason, it's much more likely to happen with rebar3 shell, and I don't know if the extra logging information is present there.

> rebar3 shell
===> Verifying dependencies...
Erlang/OTP 27 [erts-15.0.1] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Failed to write log message to stdout, trying stderr
=CRASH REPORT==== 13-Aug-2024::10:31:51.140000 ===
  crasher:
    initial call: prim_tty:reader/1
    pid: <0.68.0>
    registered_name: user_drv_reader
    exception error: no case clause matching {error,
                                              {'GetOverlappedResult',
                                               'The I/O operation has been aborted because of either a thread exit or an application request.\r\n'}}
      in function  prim_tty:reader_loop/6 (prim_tty.erl, line 522)
    ancestors: [user_drv,<0.64.0>,kernel_sup,<0.47.0>]
    message_queue_len: 0
    messages: []
    links: [<0.65.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 233
    stack_size: 29
    reductions: 52
  neighbours:

Failed to write log message to stdout, trying stderr
=ERROR REPORT==== 13-Aug-2024::10:31:51.147000 ===
Reader crashed ({{case_clause,
                     {error,
                         {'GetOverlappedResult',
                             'The I/O operation has been aborted because of either a thread exit or an application request.\r\n'}}},
                 [{prim_tty,reader_loop,6,[{file,"prim_tty.erl"},{line,522}]},
                  {proc_lib,init_p_do_apply,3,
                      [{file,"proc_lib.erl"},{line,329}]}]})
Failed to write log message to stdout, trying stderr
=ERROR REPORT==== 13-Aug-2024::10:31:51.153000 ===
Error in process <0.153.0> with exit value:
{terminated,[{io,fwrite,
                 ["Warning! The slogan \"~p\" could not be printed.\n",
                  [[69,115,104,101,108,108,32,86,"15.0.1"]]],
                 [{file,"io.erl"},
                  {line,198},
                  {error_info,#{cause => {io,terminated},
                                module => erl_stdlib_errors}}]},
             {shell,server,1,[{file,"shell.erl"},{line,289}]}]}

@garazdawi
Contributor

PR #8774 is an attempt to address the GetOverlappedResult error. When GitHub Actions has finished, please try the Windows installer and see if it solves the issue.

@nixxquality
Contributor

It seems very promising! I didn't see a single crash despite a lot of attempts.
This was using rebar3 (which I had to compile from source) in exactly the same way which used to crash before.

@garazdawi
Contributor

Great! Thanks for testing. The fix will be included in 27.1.

From what I can tell this is a race that is triggered if the user types anything or resizes the console window as rebar3 is starting the shell. It may also trigger on the mouse moving, but I don't think it should.

@nixxquality
Contributor

I don't think I've been doing any of that, but I'm willing to blame PowerShell if you are.

@garazdawi
Contributor

I think that 27.1 will fix all issues mentioned in this issue. If you continue to have problems with the new shell on Windows please open new issues describing what is happening.
