View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000462 | AlmaLinux-8 | kernel | public | 2024-04-12 20:57 | 2024-07-16 02:45 |
Reporter | zpalmerlw | Assigned To | |||
Priority | urgent | Severity | crash | Reproducibility | always |
Status | new | Resolution | open | ||
Platform | Alma Linux 8 | OS Version | 8.9 | ||
Summary | 0000462: aacraid driver causes SCSI Hang, followed by I/O Spike, and usually a server reboot is triggered | ||||
Description | The aacraid driver is faulty. The server runs fine, then hangs; load and I/O spike, and it either crashes and reboots or freezes solid for several minutes before recovering. Every time this happens, /var/log/messages contains a "SCSI hang" line: "host kernel: aacraid: Host adapter reset request. SCSI hang ?" So far the only fixes are either to update to the latest mainline kernel, which breaks some backup software (Acronis) because the mainline kernel is too new to be supported, or to revert to the following kernel: 4.18.0-477.27.1.el8_8.x86_64 | ||||
Steps To Reproduce | Run an affected kernel with an affected Adaptec card, then wait for the bug to occur, usually within 15-20 minutes of system boot. | ||||
Additional Information | There was an update to the aacraid driver in kernel 6.4.0 that has been backported and is causing this breakage. From what I have found, these are the affected kernels: 4.18.0-513.11.1.el8_9.x86_64, 4.18.0-513.9.1.el8_9.x86_64 (possibly more; unknown to me if so). This issue was also reported on Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059624 There is also a kernel Bugzilla entry for this issue: https://bugzilla.kernel.org/show_bug.cgi?id=217599 Here are a couple of affected cards: Controller Model : Adaptec ASR8805E; Controller Model : Adaptec ASR8405E | ||||
Tags | No tags attached. | ||||
abrt_hash | |||||
URL | |||||
|
> So far the only fix is to either update to the latest mainline kernel, which breaks some backup software (Acronis) due the mainline kernel being too new to be supported, or ...

If the mainline kernel is too new, you might want to give elrepo's kernel-lt (1) a try. It is currently at 5.4.273.el8.

(1) https://elrepo.org/wiki/doku.php?id=kernel-lt |
|
I have built the kmod-aacraid package using the patch referenced in https://bugzilla.kernel.org/show_bug.cgi?id=217599 (comment c63) and released it to the elrepo testing repository. If you have elrepo enabled, you can install it by running:

sudo dnf --enablerepo=elrepo-testing install kmod-aacraid

Or you can download the kmod rpm: https://elrepo.org/linux/testing/el8/x86_64/RPMS/kmod-aacraid-1.2.1-11.1.el8_10.elrepo.x86_64.rpm |
|
@zpalmerlw Once I get a positive response, I will move the kmod package to the main repository. |
|
The kmod-aacraid package has now been moved to the elrepo main repository. |
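
For anyone trying to confirm whether a host matches this report, the following is a minimal sketch, not an official diagnostic. It compares the running kernel release against the two kernels the reporter listed as affected (the reporter notes that list may be incomplete) and counts occurrences of the log line quoted in the description. The function names `is_listed_kernel` and `count_resets` are hypothetical helpers introduced here for illustration, and reading /var/log/messages typically requires root.

```python
import platform
import re

# Kernel releases the reporter listed as affected; the report says
# there may be more, so a miss here does not prove a kernel is safe.
AFFECTED_KERNELS = {
    "4.18.0-513.11.1.el8_9.x86_64",
    "4.18.0-513.9.1.el8_9.x86_64",
}

# Log line quoted in the report:
#   "aacraid: Host adapter reset request. SCSI hang ?"
RESET_PATTERN = re.compile(r"aacraid: Host adapter reset request\. SCSI hang \?")


def is_listed_kernel(release: str) -> bool:
    """Return True if the release string is one of the kernels named in the report."""
    return release in AFFECTED_KERNELS


def count_resets(lines) -> int:
    """Count lines containing the aacraid adapter reset message."""
    return sum(1 for line in lines if RESET_PATTERN.search(line))


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    print(f"kernel {release}: listed as affected = {is_listed_kernel(release)}")
    try:
        with open("/var/log/messages", errors="replace") as log:
            print(f"aacraid reset messages: {count_resets(log)}")
    except OSError as exc:
        print(f"could not read /var/log/messages: {exc}")
```

A non-zero reset count on a listed kernel is consistent with the behaviour described above, but it is only an indication; other kernels or other causes are not ruled out by this check.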