Dmesg full of I/O errors, smart ok, four disks affected

I'm working on a remote server (Dell PowerEdge) that was a new install. It has four 2 TB mechanical drives and two 250 GB SSDs. One SSD contains the OS (RHEL 7), and the four mechanical disks are eventually going to hold an Oracle database.

Trying to create a software RAID array led to disks constantly being marked as faulty. dmesg shows a slew of errors like the following:

[127491.711407] blk_update_request: I/O error, dev sde, sector 3907026080
[127491.719699] sd 0:0:4:0: [sde] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[127491.719717] sd 0:0:4:0: [sde] Sense Key : Aborted Command [current]
[127491.719726] sd 0:0:4:0: [sde] Add. Sense: Logical block guard check failed
[127491.719734] sd 0:0:4:0: [sde] CDB: Read(32)
[127491.719742] sd 0:0:4:0: [sde] CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[127491.719750] sd 0:0:4:0: [sde] CDB[10]: e8 e0 7c a0 e8 e0 7c a0 00 00 00 00 00 00 00 08
[127491.719757] blk_update_request: I/O error, dev sde, sector 3907026080
[127491.719764] Buffer I/O error on dev sde, logical block 488378260, async page read
[127497.440222] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[127497.440240] sd 0:0:5:0: [sdf] Sense Key : Aborted Command [current]
[127497.440249] sd 0:0:5:0: [sdf] Add. Sense: Logical block guard check failed
[127497.440258] sd 0:0:5:0: [sdf] CDB: Read(32)
[127497.440266] sd 0:0:5:0: [sdf] CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[127497.440273] sd 0:0:5:0: [sdf] CDB[10]: 00 01 a0 00 00 01 a0 00 00 00 00 00 00 00 00 08
[127497.440280] blk_update_request: I/O error, dev sdf, sector 106496
[127497.901432] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[127497.901449] sd 0:0:5:0: [sdf] Sense Key : Aborted Command [current]
[127497.901458] sd 0:0:5:0: [sdf] Add. Sense: Logical block guard check failed
[127497.901467] sd 0:0:5:0: [sdf] CDB: Read(32)
[127497.901475] sd 0:0:5:0: [sdf] CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[127497.901482] sd 0:0:5:0: [sdf] CDB[10]: e8 e0 7c a0 e8 e0 7c a0 00 00 00 00 00 00 00 08
[127497.901489] blk_update_request: I/O error, dev sdf, sector 3907026080
[127497.911003] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[127497.911019] sd 0:0:5:0: [sdf] Sense Key : Aborted Command [current]
[127497.911029] sd 0:0:5:0: [sdf] Add. Sense: Logical block guard check failed
[127497.911037] sd 0:0:5:0: [sdf] CDB: Read(32)
[127497.911045] sd 0:0:5:0: [sdf] CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[127497.911052] sd 0:0:5:0: [sdf] CDB[10]: e8 e0 7c a0 e8 e0 7c a0 00 00 00 00 00 00 00 08
[127497.911059] blk_update_request: I/O error, dev sdf, sector 3907026080
[127497.911067] Buffer I/O error on dev sdf, logical block 488378260, async page read
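
The failing command in the log can be decoded by hand. As a minimal sketch, assuming the usual 32-byte READ(32) layout from the SCSI block command set (service action in bytes 8-9, LBA in bytes 12-19, expected reference tag in bytes 20-23, transfer length in bytes 28-31), the following pulls those fields out of the CDB bytes logged for sde. The function and key names are hypothetical, chosen only for illustration. READ(32) is normally issued to disks formatted with protection information, and "Logical block guard check failed" reports that a per-block guard checksum did not match the data.

```python
import struct

# CDB bytes copied from the two "CDB[00]" / "CDB[10]" log lines for sde.
SDE_CDB = (
    "7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00 "
    "e8 e0 7c a0 e8 e0 7c a0 00 00 00 00 00 00 00 08"
)

def decode_read32_cdb(hex_bytes: str) -> dict:
    """Decode the main fields of a 32-byte variable-length READ(32) CDB.

    Field offsets are an assumption based on the standard READ(32) layout.
    """
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 32 or cdb[0] != 0x7F:
        raise ValueError("expected a 32-byte CDB starting with opcode 0x7f")
    (service_action,) = struct.unpack_from(">H", cdb, 8)   # 0x0009 = READ(32)
    (lba,) = struct.unpack_from(">Q", cdb, 12)             # 64-bit big-endian LBA
    (ref_tag,) = struct.unpack_from(">I", cdb, 20)         # expected reference tag
    (length,) = struct.unpack_from(">I", cdb, 28)          # blocks to transfer
    return {
        "service_action": service_action,
        "rdprotect": cdb[10] >> 5,  # top three bits of byte 10
        "lba": lba,
        "expected_ref_tag": ref_tag,
        "transfer_length": length,
    }

fields = decode_read32_cdb(SDE_CDB)
for name, value in fields.items():
    print(f"{name}: {value}")
```

The decoded LBA is 3907026080, which matches the sector in the accompanying blk_update_request line, and the transfer length is 8 blocks.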

These errors occur on all four mechanical disks (sdc, sdd, sde, sdf). smartctl passed all four disks on both the short and long self-tests. I'm currently running badblocks (write-mode test, ~35 hours in, probably another 35 to go).

These are the causes I have considered so far:

- Failed HDDs: it seems unlikely that four "refurbished" disks would all be DOA, doesn't it?
- Storage controller issue (bad cable?): it seems like that would affect the SSDs too?
- Kernel issue: the only change to the stock kernel was the addition of kmod-oracleasm. I really don't see how it would cause these faults; ASM isn't set up at all.

Another noteworthy event occurred when I tried to zero the disks (part of early troubleshooting). Running dd if=/dev/zero of=/dev/sdX against each disk yielded these errors:

dd: writing to ‘/dev/sdc’: Input/output error
106497+0 records in
106496+0 records out
54525952 bytes (55 MB) copied, 1.70583 s, 32.0 MB/s
dd: writing to ‘/dev/sdd’: Input/output error
106497+0 records in
106496+0 records out
54525952 bytes (55 MB) copied, 1.70417 s, 32.0 MB/s
dd: writing to ‘/dev/sde’: Input/output error
106497+0 records in
106496+0 records out
54525952 bytes (55 MB) copied, 1.71813 s, 31.7 MB/s
dd: writing to ‘/dev/sdf’: Input/output error
106497+0 records in
106496+0 records out
54525952 bytes (55 MB) copied, 1.71157 s, 31.9 MB/s
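
The dd figures can be cross-checked against the kernel log with a little arithmetic. This is a rough sketch, assuming dd ran with its default 512-byte block size and that the disks expose 512-byte logical sectors; the variable names are made up for illustration.

```python
# Assumption: dd's default block size, taken to equal the logical sector size.
SECTOR_SIZE = 512

# Bytes successfully written before the I/O error, from the dd output.
dd_bytes_copied = {
    "sdc": 54525952,
    "sdd": 54525952,
    "sde": 54525952,
    "sdf": 54525952,
}

# The first sector that could not be written is the one right after
# the last complete record.
first_failed_sector = {
    disk: copied // SECTOR_SIZE for disk, copied in dd_bytes_copied.items()
}

for disk, sector in first_failed_sector.items():
    print(f"{disk}: write failed at sector {sector}")

# True when every disk stopped at exactly the same sector.
same_everywhere = len(set(first_failed_sector.values())) == 1
print("identical on all disks:", same_everywhere)
```

Every disk stops at sector 106496, which is also the sector reported for sdf in one of the dmesg read errors.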

If anyone here could share some insight into what might be causing this, I'd be grateful. I'm inclined to follow Occam's razor and go straight for the HDDs; my only doubt stems from the unlikelihood of four HDDs failing out of the box.

I will be driving to the site tomorrow for a physical inspection & to report my assessment of this machine to the higher ups. If there's something I should physically inspect (beyond cables/connections/power supply) please let me know.

Thanks.

asked Jun 17 at 11:52

Scu11y

When you say SMART "ok", do you just mean the overall health? Are any individual raw counters for reallocated or pending sectors non-zero? Drives don't immediately declare themselves failed on the first bad sector, even though it is unreadable. Use smartctl -x /dev/sda or something. But it's highly suspicious that it's the same LBA on all disks.

– Peter Cordes
Jun 18 at 6:42

redhat hard-drive io

1 Answer

Your dd tests show all four disks failing at the same LBA. Since it is extremely improbable that four disks would all fail at exactly the same location, I strongly suspect a controller or cabling issue.

answered Jun 17 at 12:18

shodanshok
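
One way to test the same-LBA observation is to read the suspect sectors directly on each disk. The sketch below is only an illustration under stated assumptions: 512-byte logical sectors, root access to the block devices, and hypothetical helper names `read_sectors` and `probe`. Reads go through the page cache here, so a result of "ok" on a block that was read recently does not prove that the disk itself returned the data.

```python
import os

def read_sectors(device: str, lba: int, count: int, sector_size: int = 512) -> bytes:
    """Read `count` sectors starting at `lba` from a device or file."""
    fd = os.open(device, os.O_RDONLY)
    try:
        # pread reads at an absolute byte offset without moving the file position.
        return os.pread(fd, count * sector_size, lba * sector_size)
    finally:
        os.close(fd)

def probe(devices, lba: int, count: int = 8) -> dict:
    """Try the same read on every device and record success or the OS error."""
    results = {}
    for dev in devices:
        try:
            read_sectors(dev, lba, count)
            results[dev] = "ok"
        except OSError as exc:
            results[dev] = f"error: {exc.strerror}"
    return results
```

For example, `probe(["/dev/sdc", "/dev/sdd", "/dev/sde", "/dev/sdf"], 106496)` would attempt the 8-sector read seen in the log on all four disks. If every disk reports an error at that LBA while neighbouring LBAs read fine, a shared component is a more likely culprit than four independent media defects.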

1

It's difficult to tell without further testing. Anyway, the first thing I would check or replace is the cables connecting the controller to the backplane.

– shodanshok
Jun 17 at 12:33

4

High data-rate cables, such as 6/12 Gb/s SATA/SAS ones, are not only about electrical continuity but mainly about signal integrity and low noise. Try physically cleaning the connectors and reseating the cables. If the error persists, try replacing the cables and, finally, try a different controller.

– shodanshok
Jun 17 at 13:21

2

Same-LBA seems unlikely to be a cabling issue. Unless the data in that sector just happens to be some worst-case bit-sequence for some scrambling (to prevent extended runs of all-zero defeating self-clocking) or ECC over the SATA/SAS link. I'm not sure what encoding that link uses. Controller is plausible though; same LBA on each of multiple disks needs some kind of common factor explanation.

– Peter Cordes
Jun 18 at 6:40

3

@djsmiley2k It is unlikely that all four dd runs ended up cached at the same failing RAM address. Moreover, the PERC's DRAM is ECC protected and, while ECC RAM can also fail, that is relatively uncommon. That said, the controller can be the source of the issue, so if changing the cables does not help, the OP should try swapping the controller.

– shodanshok
Jun 18 at 12:55

2

Well my friends, you were right. Cables + controllers swapped and now 600GB into a dd zeroing process and no errors thus far. Looks like everything's working correctly now. Thanks again for all the knowledge you've shared. I'm always grateful to this community for your expertise and willingness to share it. :)

– Scu11y
Jun 19 at 21:27
