Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic if unable to write to event handler files #2314

Open
emoscardini opened this issue Aug 9, 2024 · 7 comments
Open

Panic if unable to write to event handler files #2314

emoscardini opened this issue Aug 9, 2024 · 7 comments

Comments

@emoscardini
Copy link
Contributor

If the controller doesn't have permissions to write to a file configured in the event handler, the follow panic happens:

panic: cannot write to log file path: /var/log/ziti/utilization-events.log
goroutine 1 [running]:
github.com/openziti/ziti/controller.(*Controller).Run(0xc000c5dc20)
github.com/openziti/ziti/controller/controller.go:381 +0xcce
{0xc000c4a130, 0x1, 0x3ad5116?})
github.com/openziti/ziti/ziti/controller/run.go:95 +0xabf
{0xc000c4a100, 0x1, 0x1})
github.com/spf13/cobra@v1.8.1/command.go:989 +0xab1
github.com/spf13/cobra.(*Command).ExecuteC(0x5b33880)
github.com/spf13/cobra@v1.8.1/command.go:1117 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.8.1/command.go:1041
github.com/openziti/ziti/ziti/cmd.Execute()
github.com/openziti/ziti/ziti/cmd/cmd.go:81 +0x1a
main.main()
github.com/openziti/ziti/ziti/main.go:51 +0xf

The process should log that it's unable to access the file & continue.

@dovholuknf
Copy link
Member

Interesting. As a user, I would rather it panic and not start so that I would be aware of the situation before it ran for "however long" and then i discovered a problem.

I'm curious why you'd rather have it operating in a mode where it's degraded. Should there be events emitted indicating a problem?

@michaelquigley
Copy link
Contributor

I also tend to think I'd rather it panic, rather than not do what I configured it to do.

@emoscardini
Copy link
Contributor Author

Not sure how you're going to emit an event if the event handler can't output...

I'm suggesting we fix the panic & log the process doesn't have access to write to the events handler file specified. It can log an ERROR for every time it tries to write.

IMHO, the process shouldn't ever panic. Even switching to exiting the process gracefully after it logs that it can't write would be better.

@r-caamano
Copy link
Member

r-caamano commented Aug 9, 2024

Seems to me that it would be more important to have a network stay up than to panic that it cant write to an event log. As Edward mentioned we log that we can't access the file in journald.

@michaelquigley
Copy link
Contributor

michaelquigley commented Aug 9, 2024

It only panics at startup, not during normal execution. It's panic-ing because you've specified an invalid configuration... you're asking for the controller to consider some parts of the configuration as "optional" or "best effort"... but for a lot of configurations, not having functional events for any period of time (zrok) is a broken state, even if the rest of the network is up.

@michaelquigley
Copy link
Contributor

Maybe a compromise might be to mark that event handler as optional: true or abort-on-error: startup, abort-on-error: anytime... or some other way to mark that event handler as non-critical.

@r-caamano
Copy link
Member

if the controller crashes it could possibly not close the file properly which could result in the file becoming "stale locked" so when it attempts to open the file at next start it can't. Since this is an automated restart no one is going to be there to immediately see that the file is locked an it will just continue to cycle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants