December 9, 2019 System hard crashed overnight while trying to preclear disks. No idea why, but the syslog was cleared out when I started up. Dockers didn't start, so I restarted, the dockers started, and I mentally moved on. Then I started seeing errors in sabnzbd about SQL commands failing, and while researching that I tried to clear out a history1.db file and got:
mv: cannot move 'history1.db' to 'history1.db.old': Read-only file system
even though the file system is not RO. Then I tried to play something in Plex, and that failed. It seems anything that needs to read from or write to the cache is having issues. Diag attached. vortex-diagnostics-20191209-1611.zip
December 9, 2019 Community Expert There's a hardware problem with one of the cache devices (cache1):
Dec 9 08:20:06 vortex kernel: BTRFS info (device sdd1): bdev /dev/sdp1 errs: wr 640215651, rd 458159560, flush 0, corrupt 0, gen 0
With SSDs this is usually a cable problem, see here for more info. Try running a scrub first, but I'm seeing some transid errors, and these are usually fatal, requiring a re-format.
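For anyone who wants to pull those counters out of a syslog themselves, here is a minimal Python sketch. It only assumes the line format quoted in the post above; the function names `parse_btrfs_errs` and `has_errors` are made up for illustration and are not part of Unraid or btrfs-progs.

```python
import re

# Matches the per-device error summary the kernel prints for btrfs,
# in the form seen in the log line quoted above.
BTRFS_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a btrfs 'errs:' line, or None if no match."""
    m = BTRFS_ERRS.search(line)
    if m is None:
        return None
    counters = {k: int(v) for k, v in m.groupdict().items() if k != "dev"}
    return m.group("dev"), counters


def has_errors(counters):
    """True if any counter is non-zero; a healthy device reports all zeros."""
    return any(v > 0 for v in counters.values())


sample = (
    "Dec 9 08:20:06 vortex kernel: BTRFS info (device sdd1): "
    "bdev /dev/sdp1 errs: wr 640215651, rd 458159560, flush 0, corrupt 0, gen 0"
)
dev, counters = parse_btrfs_errs(sample)
print(dev, counters, has_errors(counters))
```

Any non-zero write or read count, as here, means the device failed I/O requests at some point, which is why it is worth watching these numbers.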
December 9, 2019 Author Can I safely just remove cache1 from the pool, then try to get a working system before dealing with the other drive?
December 9, 2019 Community Expert Difficult to say because of the filesystem corruption, but unlikely. Make sure to back up anything important before trying anything.
December 9, 2019 Author OK, I'll back it all up and then try. I should probably start backing up cache anyway (stuff that is never moved).
December 9, 2019 Author OK, I made a copy, more or less, with lots of errors just doing that. When I stopped the array, the disk that was known to be questionable disappeared. As for the remaining one, if I start the array with just it, I get "Unmountable: No file system". Not sure what I can do with this at this point.
December 9, 2019 Community Expert Reformat the pool and restore the data, and don't forget the link above for better pool monitoring.
December 9, 2019 Author Is there no way to use the pool drive that wasn't bad? It seems to defeat the whole purpose: if one drive goes bad, I lose everything.
December 9, 2019 Community Expert Depends on the damage, like I mentioned:
3 hours ago, johnnie.black said: "I'm seeing some transid errors, and these are usually fatal"
December 9, 2019 Community Expert There are some recovery options here that might help: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490
December 9, 2019 Author To update everyone: I formatted the "working" cache drive, assigned it back to cache, copied the backup back into place, and restarted. All is well. Waiting for a parity build to finish, then I'll reseat that original cache drive and install a bunch of new disks.
December 10, 2019 Author I lied.. now the other disk is bad?
Dec 9 19:31:59 vortex kernel: print_req_error: I/O error, dev sdp, sector 77578192
Dec 9 19:31:59 vortex kernel: BTRFS error (device sdp1): bdev /dev/sdp1 errs: wr 205, rd 961, flush 0, corrupt 0, gen 0
Dec 9 19:32:00 vortex kernel: sd 12:0:0:0: [sdp] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Dec 9 19:32:00 vortex kernel: sd 12:0:0:0: [sdp] tag#30 CDB: opcode=0x28 28 00 0d 74 2d 68 00 00 08 00
Dec 9 19:32:00 vortex kernel: print_req_error: I/O error, dev sdp, sector 225717608
Fresh diag attached. vortex-diagnostics-20191210-0032.zip
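As a side note on reading those lines: opcode 0x28 is the SCSI READ(10) command, and the ten CDB bytes encode which sectors were being read. The short Python sketch below decodes the CDB from the log above using the standard READ(10) layout (bytes 2-5 are the big-endian logical block address, bytes 7-8 the transfer length). The helper name `decode_read10` is only illustrative.

```python
def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Returns (logical_block_address, number_of_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: starting block
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return lba, blocks


lba, blocks = decode_read10("28 00 0d 74 2d 68 00 00 08 00")
print(lba, blocks)  # 225717608 8
```

The decoded address, 225717608, matches the sector in the I/O error reported right after it, so the failed command was an 8-block read starting at that sector.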
December 10, 2019 Community Expert SSD dropped offline again:
Dec 9 18:54:40 vortex kernel: ata9: limiting SATA link speed to 3.0 Gbps
Dec 9 18:54:41 vortex kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
Dec 9 18:55:11 vortex kernel: ata9.00: qc timeout (cmd 0xec)
Dec 9 18:55:11 vortex kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 9 18:55:11 vortex kernel: ata9.00: revalidation failed (errno=-5)
Dec 9 18:55:11 vortex kernel: ata9.00: disabled
Did you do what I suggested in the link above, i.e., replacing the cables on that SSD? You should also try connecting the SSD to a different controller. Marvell controllers are known to drop disks without a reason, and for that and other reasons they are not recommended for Unraid use.
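To spot this kind of drop in a long syslog, something like the following Python sketch could be used. It assumes only the message format shown in the lines above (an `ataN` or `ataN.MM` prefix followed by the message, with `disabled` as the final state); the function name `disabled_ports` is hypothetical.

```python
import re

# Captures the ATA port (e.g. "ata9") and the message text from a kernel log line.
ATA_EVENT = re.compile(r"kernel: (?P<port>ata\d+)(?:\.\d+)?: (?P<msg>.*)$")


def disabled_ports(lines):
    """Return the ATA ports the kernel reported as disabled, in order seen."""
    ports = []
    for line in lines:
        m = ATA_EVENT.search(line)
        if m and m.group("msg").strip() == "disabled":
            if m.group("port") not in ports:
                ports.append(m.group("port"))
    return ports


log = [
    "Dec 9 18:54:40 vortex kernel: ata9: limiting SATA link speed to 3.0 Gbps",
    "Dec 9 18:55:11 vortex kernel: ata9.00: qc timeout (cmd 0xec)",
    "Dec 9 18:55:11 vortex kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)",
    "Dec 9 18:55:11 vortex kernel: ata9.00: disabled",
]
print(disabled_ports(log))  # ['ata9']
```

A port showing up here means the kernel gave up on the device attached to it, which is consistent with a bad cable or a controller dropping the disk.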
December 10, 2019 Author I had a bunch of new drives come in and was waiting for some screws lol... everything is arriving today (I guess I just figured one drive was loose or something). It's a Norco 24-drive system, so it'll take me a bit to check all the cables, but I'll do that and then try to move the SSDs onto one of the cards instead of the onboard controller.