Cannot write to cache: btrfs pool issues?

leon · December 9, 2019

System hard crashed overnight while trying to preclear disks. No idea why, and the syslog was cleared out when I started up.

Dockers didn't start, so I restarted. Then the dockers started and I mentally moved on.

Then I started seeing errors in sabnzbd about SQL commands failing, and while researching that I tried to clear out a history1.db file and got:

mv: cannot move 'history1.db' to 'history1.db.old': Read-only file system

That's even though the file system is not RO. Then I tried to play something in Plex, and that failed too. It seems anything that needs to read or write the cache is having issues.

Diag attached.

vortex-diagnostics-20191209-1611.zip

JorgeB · December 9, 2019

There's a hardware problem with one of the cache devices (cache1):

Dec  9 08:20:06 vortex kernel: BTRFS info (device sdd1): bdev /dev/sdp1 errs: wr 640215651, rd 458159560, flush 0, corrupt 0, gen 0

With SSDs this is usually a cable problem, see here for more info.

Try running a scrub first, but I'm seeing some transid errors, and these are usually fatal requiring a re-format.
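For anyone who wants to watch those counters themselves, here is a minimal sketch that pulls the per-device error counts out of a BTRFS kernel log line like the one quoted above. The function name `parse_btrfs_errs` and the regular expression are illustrative assumptions based only on the format of that one line, not an official parser.

```python
import re

# Assumed format, taken from the log line quoted above:
#   bdev <device> errs: wr N, rd N, flush N, corrupt N, gen N
BTRFS_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

def parse_btrfs_errs(line):
    """Return (device, counters) for a matching line, or None otherwise."""
    m = BTRFS_ERRS.search(line)
    if m is None:
        return None
    counters = {k: int(v) for k, v in m.groupdict().items() if k != "dev"}
    return m.group("dev"), counters

line = ("Dec  9 08:20:06 vortex kernel: BTRFS info (device sdd1): "
        "bdev /dev/sdp1 errs: wr 640215651, rd 458159560, "
        "flush 0, corrupt 0, gen 0")
dev, counters = parse_btrfs_errs(line)
# Any non-zero counter means the device has logged errors and deserves a closer look.
nonzero = {name: count for name, count in counters.items() if count}
print(dev, nonzero)
```

On a healthy pool all five counters should stay at zero, so any non-zero write or read count like the ones above is worth chasing down.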

leon · December 9, 2019

Can I safely just remove cache1 from the pool, then try to get a working system before dealing with the other drive?

JorgeB · December 9, 2019

Difficult to say because of the filesystem corruption, but unlikely. Make sure to back up anything important before trying anything.

leon · December 9, 2019

OK, I'll back it all up and then try. I should probably start backing up cache anyway (the stuff that is never moved).

leon · December 9, 2019

OK, I made a copy, more or less.. lots of errors just doing that. When I stopped the array, the disk that was known to be questionable disappeared.. then the remaining one.. if I start the array with just it:

Unmountable: No file system

Not sure what I can do with this at this point.

JorgeB · December 9, 2019

Reformat the pool and restore the data, and don't forget the link above for better pool monitoring.

leon · December 9, 2019

Is there no way to use the pool drive that wasn't bad? It seems to defeat the whole purpose if one drive goes bad and I lose everything.

JorgeB · December 9, 2019

Depends on the damage, like I mentioned:

3 hours ago, johnnie.black said:

I'm seeing some transid errors, and these are usually fatal

JorgeB · December 9, 2019

There are some recovery options here that might help:

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490

leon · December 9, 2019

To update everyone: formatted the "working" cache drive, assigned it back to cache, copied the backup back into place, restarted, and all is well. Waiting for a parity build to finish, then I'll reseat that original cache drive and install a bunch of new disks.

leon · December 10, 2019

I lied.. now the other disk is bad?

Dec 9 19:31:59 vortex kernel: print_req_error: I/O error, dev sdp, sector 77578192
Dec 9 19:31:59 vortex kernel: BTRFS error (device sdp1): bdev /dev/sdp1 errs: wr 205, rd 961, flush 0, corrupt 0, gen 0
Dec 9 19:32:00 vortex kernel: sd 12:0:0:0: [sdp] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Dec 9 19:32:00 vortex kernel: sd 12:0:0:0: [sdp] tag#30 CDB: opcode=0x28 28 00 0d 74 2d 68 00 00 08 00
Dec 9 19:32:00 vortex kernel: print_req_error: I/O error, dev sdp, sector 225717608

Fresh diag attached.

vortex-diagnostics-20191210-0032.zip

JorgeB · December 10, 2019

SSD dropped offline again:

Dec  9 18:54:40 vortex kernel: ata9: limiting SATA link speed to 3.0 Gbps
Dec  9 18:54:41 vortex kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
Dec  9 18:55:11 vortex kernel: ata9.00: qc timeout (cmd 0xec)
Dec  9 18:55:11 vortex kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec  9 18:55:11 vortex kernel: ata9.00: revalidation failed (errno=-5)
Dec  9 18:55:11 vortex kernel: ata9.00: disabled

Did you do what I suggested in the link above, i.e., replacing the cables on that SSD?

You should also try connecting the SSD to a different controller. Marvell controllers are known to drop disks for no apparent reason, and for that and other reasons they are not recommended for Unraid use.
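To spot this kind of drop without reading the whole syslog, something like the following rough sketch could help. It assumes the libata message format seen in the excerpt above, where a dropped device ends with a line such as `ata9.00: disabled`; the helper name `dropped_ports` is made up for illustration.

```python
import re

# Assumed format: "... kernel: ataN[.MM]: <message>", as in the excerpt above.
ATA_MSG = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")

def dropped_ports(lines):
    """Return the set of ATA ports that logged a 'disabled' message."""
    dropped = set()
    for line in lines:
        m = ATA_MSG.search(line)
        if m and m.group(2).strip() == "disabled":
            dropped.add(m.group(1))
    return dropped

sample = [
    "Dec  9 18:54:40 vortex kernel: ata9: limiting SATA link speed to 3.0 Gbps",
    "Dec  9 18:55:11 vortex kernel: ata9.00: qc timeout (cmd 0xec)",
    "Dec  9 18:55:11 vortex kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)",
    "Dec  9 18:55:11 vortex kernel: ata9.00: revalidation failed (errno=-5)",
    "Dec  9 18:55:11 vortex kernel: ata9.00: disabled",
]
print(dropped_ports(sample))
```

The port number only says which SATA link dropped; matching it to a physical disk, cable, or controller still has to be done by hand.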

leon · December 10, 2019

I had a bunch of new drives come in and was waiting for some screws lol... everything is arriving today (I guess I just figured one drive was loose or something). It's a Norco 24-drive system, so it'll take me a bit to check all the cables, but I'll do that and then try to move the SSDs onto one of the cards instead of the onboard controller.
