Everything posted by JorgeB

  1. Diskspeed had a bug that was multiplying the LBAs written by 4K instead of 512 bytes; IIRC this has been fixed in the latest update.
  2. HBA problems:
       Aug 4 12:28:12 NUCLEAR-WINTER kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Unraid ends up losing connection with all disks. When this happens you'll get as many disabled disks as there are parity devices, and which disk(s) get disabled is a crapshoot. The issue started here:
       Aug 4 12:28:12 NUCLEAR-WINTER kernel: pcieport 0000:00:05.0: AER: Uncorrected (Fatal) error received: 0000:00:05.0
       Aug 4 12:28:12 NUCLEAR-WINTER kernel: pcieport 0000:00:05.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
       Aug 4 12:28:12 NUCLEAR-WINTER kernel: pcieport 0000:00:05.0: device [8086:340c] error status/mask=00000020/00318000
       Aug 4 12:28:12 NUCLEAR-WINTER kernel: pcieport 0000:00:05.0: [ 5] SDES (First)
     This is a fatal error, but I can't tell if it's a problem with the board or the HBA. Try the HBA in a different slot if available and make sure it's sufficiently cooled. Also note that you're having multiple hardware errors logged, e.g.:
       Aug 4 02:32:24 NUCLEAR-WINTER kernel: mce: [Hardware Error]: Machine check events logged
  3. There could be a way, but it's not implemented because with corruption on multiple devices you would end up corrupting more disks. More info here: https://forums.unraid.net/topic/46170-unraid-server-release-620-beta20-available/page/10/?tab=comments#comment-456693
  4. See here on how to reset current error count and monitor for new ones (you'll need to adjust the path), then post new diags if/when there are more errors (before rebooting).
  5. Looks like a BIOS issue, look for an update.
  6. No, they would still link @ 6Gb/s, also more than enough bandwidth for spinners.
  7. Most likely, though it's strange for the superblock to get damaged out of the blue. What part was replaced on the server?
  8. Now try copying to a disk share directly: enable disk shares first in Settings -> Global Share Settings, then copy for example to \\tower\cache
  9. There are constant errors like these on multiple devices:
       Jul 28 00:08:15 Diskstation kernel: sd 7:0:6:0: Power-on or device reset occurred
       Jul 28 00:08:21 Diskstation kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
       ### [PREVIOUS LINE REPEATED 1 TIMES] ###
       Jul 28 00:08:22 Diskstation kernel: sd 7:0:0:0: Power-on or device reset occurred
       Jul 28 00:08:25 Diskstation kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
       ### [PREVIOUS LINE REPEATED 7 TIMES] ###
       Jul 28 00:08:26 Diskstation kernel: sd 7:0:6:0: Power-on or device reset occurred
       Jul 28 00:08:45 Diskstation kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
       ### [PREVIOUS LINE REPEATED 1 TIMES] ###
       Jul 28 00:08:46 Diskstation kernel: sd 7:0:1:0: Power-on or device reset occurred
       Jul 28 00:08:54 Diskstation kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
       ### [PREVIOUS LINE REPEATED 4 TIMES] ###
     Most likely a connection issue, could be a SATA or power problem, so check/replace all cables. It could also be a failing PSU.
  10. Disk is not being seen by Unraid, most likely the 3.3v issue.
  11. There's also an update but probably best to leave them alone for now.
  12. I'm sorry, so what is the problem exactly? If you want to attempt to recover data from an unmountable btrfs disk I need diags after trying to mount it. What is crashing, the server? I thought this was about an unmountable disk. Again, I'm sorry I didn't read the entire thread, so please do a quick summary if there are other issues. Also, this is not a good sign:
       Aug 3 13:26:08 unRAID kernel: BTRFS info (device md2): bdev /dev/md2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
       ...
       Aug 3 13:26:21 unRAID kernel: BTRFS info (device md3): bdev /dev/md3 errs: wr 0, rd 0, flush 0, corrupt 435370, gen 0
     Both disks 2 and 3 are showing corruption errors. This is usually data corruption caused by a hardware problem, like bad RAM, so it would be a good idea to run memtest.
  13. Nothing as long as the data fits on disk1; if it doesn't you'll need to use another disk with more space. It's also a good idea to cancel/pause the parity check for now.
  14. To see in what state the filesystem is I need diags after attempting to mount that disk.
  15. The bug is that only data is redundant, metadata isn't. Metadata takes very little space, but if a device fails or is missing the whole pool is lost/won't mount.
  16. If a btrfs balance is running you'll get an inhibited Stop button with a reason why.
     It is, but your pool wasn't redundant and that is why it was unmountable the first time, possibly the result of being created on v6.7.x due to a bug:
       Aug 3 17:09:29 CubeZero kernel: BTRFS warning (device sdd1): devid 1 uuid 82740355-53fc-4d7a-8aaf-0ec4de6f38ce is missing
       ### [PREVIOUS LINE REPEATED 1 TIMES] ###
       Aug 3 17:09:29 CubeZero kernel: BTRFS warning (device sdd1): chunk 402246860800 missing 1 devices, max tolerance is 0 for writeable mount
       Aug 3 17:09:29 CubeZero kernel: BTRFS warning (device sdd1): writeable mount is not allowed due to too many missing devices
     You then formatted the pool, and yes, by doing that it wiped both devices. It still might be possible to recover the pool using a backup superblock, see here, but I can't really help with this since I've never used it. You can ask for help on IRC #btrfs like mentioned in that thread.
  17. Difficult to say what exactly happened without the diags. If the pool was raid1 and the array was started with a single device, it would auto-balance to single mode; as long as you let that finish first, adding the other device afterwards should have rebalanced it to raid1.
  18. Enable turbo write and copy directly to the array: is the speed the same, better or worse?
  19. Like mentioned, the best bet is the btrfs restore option (2nd one).
  20. Most times it's the result of a bad shutdown or bad hardware.
  21. Start by running a single stream iperf test to rule out any LAN issue.
  22. No need to back up the docker image, and I believe that CA appdata backup already has the option to back up libvirt.
  23. Still most likely a hardware issue. If you can, try other RAM, or another board/CPU/RAM combo.
  24. Yes, unless you had duplicate appdata on more than one device, it would still all be under /mnt/user
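
For the btrfs error-counter lines quoted in item 12, a rough sketch of how the non-zero counters could be pulled out of a syslog follows. The regular expression is an assumption built only from the two lines quoted there, so other kernel versions may word the line differently, and the function names (parse_btrfs_errs, devices_with_errors) are made up for illustration and are not part of Unraid or btrfs-progs.

```python
import re

# Pattern derived from the line format quoted in item 12, e.g.
# "BTRFS info (device md2): bdev /dev/md2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0"
# Assumption: the five counters always appear in this order.
BTRFS_ERRS = re.compile(
    r"BTRFS info \(device (?P<device>[^)]+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_btrfs_errs(line):
    """Return (bdev, {counter: value}) for a matching line, else None."""
    match = BTRFS_ERRS.search(line)
    if match is None:
        return None
    counters = {name: int(match.group(name)) for name in COUNTERS}
    return match.group("bdev"), counters


def devices_with_errors(lines):
    """Map each device to its non-zero counters.

    If a device is reported more than once, the last non-zero report wins.
    """
    result = {}
    for line in lines:
        parsed = parse_btrfs_errs(line)
        if parsed is None:
            continue  # not a btrfs error-counter line
        bdev, counters = parsed
        nonzero = {name: value for name, value in counters.items() if value}
        if nonzero:
            result[bdev] = nonzero
    return result
```

Fed the two lines from item 12, this would report a corrupt count for both /dev/md2 and /dev/md3. It only summarises what the kernel already logged and says nothing about the cause, so memtest is still the next step.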