Faulty RAID Disks


newpylong
07-07-2004, 12:37 PM
Hello everyone,

Every time I get my software RAID up and running, the disks are marked faulty overnight and removed from the array. I think this is a bogus message because there is nothing wrong with the disks. Prior to this, the RAID-1 array was synced and running just fine. Here are the errors from my log. What gives?


Jul 6 16:53:40 ns2 kernel: SCSI disk error : host 0 channel 0 id 6 lun 0 return code = 8000002
Jul 6 16:53:40 ns2 kernel: Current sd08:12: sense key Recovered Error
Jul 6 16:53:40 ns2 kernel: Additional sense indicates Failure prediction threshold exceeded
Jul 6 16:53:40 ns2 kernel: I/O error: dev 08:12, sector 14632
Jul 6 16:53:40 ns2 kernel: ^IOperation continuing on 1 devices
Jul 6 16:53:40 ns2 kernel: md: recovery thread got woken up ...
Jul 6 16:53:40 ns2 kernel: md: updating md1 RAID superblock on device
Jul 6 16:53:40 ns2 kernel: md: scsi/host0/bus0/target5/lun0/part2 [events: 00000012]<6>(write) scsi/host0/bus0/$
Jul 6 16:53:40 ns2 kernel: md: (skipping faulty scsi/host0/bus0/target6/lun0/part2 )
Jul 6 16:53:40 ns2 kernel: md: recovery thread finished ...
Jul 6 17:13:49 ns2 -- MARK --

Jul 7 05:13:49 ns2 -- MARK --
Jul 7 05:33:49 ns2 -- MARK --
Jul 7 05:53:49 ns2 -- MARK --
Jul 7 06:25:05 ns2 kernel: SCSI disk error : host 0 channel 0 id 6 lun 0 return code = 8000002
Jul 7 06:25:05 ns2 kernel: Current sd08:11: sense key Recovered Error
Jul 7 06:25:05 ns2 kernel: Additional sense indicates Failure prediction threshold exceeded
Jul 7 06:25:05 ns2 kernel: I/O error: dev 08:11, sector 546
Jul 7 06:25:05 ns2 kernel: ^IOperation continuing on 1 devices
Jul 7 06:25:05 ns2 kernel: md: recovery thread got woken up ...
Jul 7 06:25:05 ns2 kernel: md: updating md0 RAID superblock on device
Jul 7 06:25:05 ns2 kernel: md: scsi/host0/bus0/target5/lun0/part1 [events: 00000012]<6>(write) scsi/host0/bus0/$
Jul 7 06:25:05 ns2 kernel: md: (skipping faulty scsi/host0/bus0/target6/lun0/part1 )
Jul 7 06:25:05 ns2 kernel: md: recovery thread finished ..
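
A rough sketch of how the entries above could be grouped per failure, so the SCSI ID, the sense text, and the failing device and sector sit side by side. The regular expressions only cover the message formats quoted in this post, the function name `summarize` is made up for illustration, and reading the return code as hexadecimal is an assumption.

```python
import re

# These patterns only cover the kernel messages quoted in the post above.
SCSI_ERR = re.compile(
    r"SCSI disk error : host (\d+) channel (\d+) id (\d+) lun (\d+) "
    r"return code = ([0-9a-fA-F]+)"
)
ADDITIONAL = re.compile(r"Additional sense indicates (.+)$")
IO_ERR = re.compile(r"I/O error: dev ([0-9a-fA-F]+:[0-9a-fA-F]+), sector (\d+)")


def summarize(lines):
    """Group each SCSI error with the sense text and I/O error that follow it."""
    events = []
    for line in lines:
        line = line.rstrip()
        m = SCSI_ERR.search(line)
        if m:
            events.append({
                "scsi_id": int(m.group(3)),
                # Assumption: the kernel prints the return code in hex.
                "return_code": int(m.group(5), 16),
                "additional": None,
                "dev": None,
                "sector": None,
            })
            continue
        if not events:
            continue  # ignore lines that appear before the first SCSI error
        m = ADDITIONAL.search(line)
        if m:
            events[-1]["additional"] = m.group(1)
            continue
        m = IO_ERR.search(line)
        if m:
            events[-1]["dev"] = m.group(1)
            events[-1]["sector"] = int(m.group(2))
    return events


# Lines copied from the log excerpt in this post.
SAMPLE = """\
Jul 6 16:53:40 ns2 kernel: SCSI disk error : host 0 channel 0 id 6 lun 0 return code = 8000002
Jul 6 16:53:40 ns2 kernel: Current sd08:12: sense key Recovered Error
Jul 6 16:53:40 ns2 kernel: Additional sense indicates Failure prediction threshold exceeded
Jul 6 16:53:40 ns2 kernel: I/O error: dev 08:12, sector 14632
Jul 7 06:25:05 ns2 kernel: SCSI disk error : host 0 channel 0 id 6 lun 0 return code = 8000002
Jul 7 06:25:05 ns2 kernel: Current sd08:11: sense key Recovered Error
Jul 7 06:25:05 ns2 kernel: Additional sense indicates Failure prediction threshold exceeded
Jul 7 06:25:05 ns2 kernel: I/O error: dev 08:11, sector 546
""".splitlines()

for event in summarize(SAMPLE):
    print(event)
```

Both failures come from the same SCSI ID (6) with the same additional sense text, which is worth mentioning when asking for help.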

bradfordgd
07-07-2004, 12:52 PM
What software-based RAID are you using? Look up the return code 8000002 that was reported. According to the log you posted, there does appear to be an I/O error.
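
As a starting point for that lookup, here is a minimal sketch of how the return code could be split into its parts. It assumes the value is printed in hexadecimal and uses the usual Linux SCSI layout of four one-byte fields (status, message, host, driver). The name tables are deliberately partial, and `decode_scsi_result` is an illustrative helper, not a kernel or library function.

```python
# Partial lookup tables; only a few well-known values are listed.
STATUS_NAMES = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}
HOST_NAMES = {0x00: "DID_OK"}
DRIVER_NAMES = {0x00: "DRIVER_OK", 0x08: "DRIVER_SENSE"}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four one-byte fields."""
    return {
        "status": result & 0xFF,          # SCSI status byte from the target
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": (result >> 16) & 0xFF,    # host adapter (DID_*) code
        "driver": (result >> 24) & 0xFF,  # mid-level driver flags
    }


# Assumption: "8000002" in the log is hexadecimal.
fields = decode_scsi_result(int("8000002", 16))

print("status:", STATUS_NAMES.get(fields["status"], hex(fields["status"])))
print("host:  ", HOST_NAMES.get(fields["host"], hex(fields["host"])))
print("driver:", DRIVER_NAMES.get(fields["driver"], hex(fields["driver"])))
```

Under those assumptions, the code decodes to a CHECK CONDITION status with sense data available. That fits the "sense key" and "Additional sense" lines the kernel logged immediately afterwards, so those lines are where the real detail is.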

newpylong
07-07-2004, 01:39 PM
Thanks for the reply.

I am using Raidtools 2 to create and build the RAID, and mdadm to show diagnostics afterwards.

OS is Debian 3.0 Woody
Kernel 2.4.18
Initrd with EXT3, Adaptec, RAID modules, etc.


I have been reading and some people mention upgrading the kernel. We will see?

It's just weird that an I/O error comes up, since the drive reformats, installs, and runs fine. Then, just about an hour after I leave the office, the RAID goes faulty. Seems like a software problem to me.

Thanks for your help.