rook-ceph-osd pod fails after reboot: Block already exists #14469
It seems like the first reboot is OK, but if you wait 15 minutes and then reboot, the problem happens. Deleting one folder and rebooting again fixes it.
Are you setting
Where are you seeing the failure exactly? Are the pods in CLBO? What does
mgr describe pod:
osd describe pod:
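The describe outputs themselves did not survive extraction. Commands along these lines would regenerate them (assuming the default rook-ceph namespace; the pod name in the last command is a placeholder):

```sh
kubectl -n rook-ceph describe pod -l app=rook-ceph-mgr
kubectl -n rook-ceph describe pod -l app=rook-ceph-osd
kubectl -n rook-ceph logs <rook-ceph-osd-pod> --previous   # logs from the crashed container
```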
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
Is this a bug report or feature request?
Bug Report
Deviation from expected behavior:
After a node reboot, the rook-ceph-osd pod fails to start with a "Block already exists" error.
Expected behavior:
OSD pod is running.
How to reproduce it (minimal and precise):
I'm testing on a cluster with 1 node. Reboot the node. This seems to happen 100% of the time on RHEL and sometimes on Ubuntu.
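Combining this with the earlier comment that the first reboot is fine, a hedged reproduction sketch (timing approximate, label selector assumed from Rook defaults):

```sh
sudo reboot                                          # first reboot: OSD recovers
# wait ~15 minutes after the node is back up, then:
sudo reboot                                          # second reboot: OSD fails
kubectl -n rook-ceph get pods -l app=rook-ceph-osd   # expect CrashLoopBackOff
```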
File(s) to submit:
cluster.yaml, if necessary
Logs to submit:
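The files themselves were not attached here. A sketch of commands that would collect them (default rook-ceph namespace and standard deployment names assumed):

```sh
kubectl -n rook-ceph get cephcluster -o yaml > cluster.yaml
kubectl -n rook-ceph logs deploy/rook-ceph-operator > operator.log
kubectl -n rook-ceph logs -l app=rook-ceph-osd --tail=-1 > osd.log
```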
Cluster Status to submit:
unable to read
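"unable to read" suggests the status could not be collected directly. It can usually be read from the toolbox pod, where the admin keyring is available (assuming the standard rook-ceph-tools deployment is installed):

```sh
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status
```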
Environment:
OS: This happens much more frequently on RHEL. I used RHEL 9.2, but saw it on other RHEL versions too.
NAME="Red Hat Enterprise Linux"
VERSION="9.2 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.2"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux 9.2 (Plow)"
Kernel: Linux 5.14.0-284.48.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 4 03:49:47 EST 2024 x86_64 x86_64 x86_64 GNU/Linux
Cloud provider or hardware configuration: I test on AWS EC2, but customers who aren't on the cloud have the problem too.
Rook version (use rook version inside of a Rook Pod; see the sketch below): 14.8
Storage backend version: quay.io/ceph/ceph:v18.2.2
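For reference, a sketch of running rook version inside a Rook pod (assuming the default operator deployment name):

```sh
kubectl -n rook-ceph exec deploy/rook-ceph-operator -- rook version
```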
Kubernetes version (use kubectl version): 1.29.5
Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Kubernetes installed on a VM with kURL.
Storage backend status (e.g. for Ceph use ceph health): auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
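This keyring error just means ceph health was run somewhere without an /etc/ceph keyring (e.g. directly on the host). Running it from the toolbox pod, which has the admin keyring, avoids that (again assuming the standard rook-ceph-tools deployment):

```sh
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph health
```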