Enabled localityLbSetting with outlierDetection results in outgoing connection timeouts #52750

Closed
2 tasks done
apsega opened this issue Aug 19, 2024 · 1 comment
Labels
area/networking feature/Multi-cluster issues related with multi-cluster support

Comments

apsega commented Aug 19, 2024

Is this the right place to submit this?

  • This is not a security vulnerability or a crashing bug
  • This is not a question about how to use Istio

Bug Description

We are running Istio in a multi-primary setup.

When localityLbSetting is enabled together with outlierDetection, traffic is routed to the local cluster only, but some requests fail with timeout errors.

When either a) outlierDetection is removed and localityLbSetting is enabled, or b) outlierDetection is present and localityLbSetting is disabled, traffic is balanced between the two clusters.
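
For readers trying to reproduce this, a DestinationRule that combines the two settings might look roughly like the sketch below. The report does not include the actual resource, so the name, namespace, host, and all outlierDetection thresholds are illustrative assumptions; only the `localityLbSetting` and `outlierDetection` field names come from the Istio API. Istio's documentation notes that locality failover relies on outlier detection being configured, which appears consistent with case a) above, where removing outlierDetection leaves traffic spread across both clusters.

```yaml
# Illustrative sketch only: name, namespace, host, and thresholds are
# assumptions, not the configuration from this report.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: example-service            # hypothetical
  namespace: example               # hypothetical
spec:
  host: example-service.example.svc.cluster.local   # hypothetical
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true              # prefer endpoints in the caller's locality
    outlierDetection:
      consecutive5xxErrors: 5      # assumed: eject a host after 5 consecutive 5xx responses
      interval: 30s                # assumed: how often hosts are evaluated
      baseEjectionTime: 30s        # assumed: minimum ejection duration
```

Removing the `outlierDetection` block gives case a), and setting `enabled: false` gives case b).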

Screenshots of the application's connection timeout error logs and the sidecar proxy metrics show the correlation: when outlierDetection is removed, the errors disappear, but traffic is balanced between the two clusters.

[image: application connection timeout error logs] [image: sidecar proxy metrics]

Interestingly, enabling localityLbSetting adds the following to the Envoy proxy cluster config:

  commonLbConfig:
    localityWeightedLbConfig: {}

Adding outlierDetection, in turn, causes healthyPanicThreshold to be added to the Envoy proxy config under the hood:

  commonLbConfig:
    healthyPanicThreshold: {}
    localityWeightedLbConfig: {}

Additionally, all endpoints have a priority assigned to them.

I'd like to understand what the combination of outlierDetection and localityLbSetting does under the hood that causes outgoing requests to start timing out.

Version

$ istioctl version
client version: 1.19.0
control plane version: 1.22.2
data plane version: 1.22.2 (2306 proxies)

$ kubectl version --short
Client Version: v1.25.10
Kustomize Version: v4.5.7
Server Version: v1.26.7

Additional Information

No response

@istio-policy-bot istio-policy-bot added area/networking feature/Multi-cluster issues related with multi-cluster support labels Aug 19, 2024

apsega commented Aug 22, 2024

It seems that the issue is application-specific. When the application itself enforces timeouts and handles retries, the Envoy proxy applies its own timeout and retry behavior on top of that. Turning off the application's timeouts and retries and configuring them in a VirtualService instead solved our issue.
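
As a rough illustration of that fix, timeouts and retries can be declared on the route in a VirtualService so that the sidecar, rather than the application, owns them. This is a minimal sketch: the resource name, namespace, host, and every timeout and retry value are assumptions chosen for illustration, not the values used in this deployment.

```yaml
# Illustrative sketch only: name, namespace, host, and all values are
# assumptions, not the configuration from this report.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: example-service            # hypothetical
  namespace: example               # hypothetical
spec:
  hosts:
    - example-service.example.svc.cluster.local     # hypothetical
  http:
    - route:
        - destination:
            host: example-service.example.svc.cluster.local
      timeout: 10s                 # assumed: overall request timeout, including retries
      retries:
        attempts: 3                # assumed: maximum number of retries
        perTryTimeout: 3s          # assumed: timeout for each attempt
        retryOn: connect-failure,refused-stream,5xx
```

Keeping `attempts` times `perTryTimeout` within the overall `timeout` leaves room for every retry to run before the request as a whole is cut off.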

@apsega apsega closed this as completed Aug 22, 2024