JorgeB

Moderators
  • Posts

    67,652

  • Days Won

    707

Everything posted by JorgeB

  1. If it's completely full, that is possibly one of the reasons it's crashing; a COW filesystem should never be completely full. See here for some more recovery options, then re-format.
  2. You can copy super.dat, but that only contains the array assignments; the cache still needs to be re-assigned.
  3. shfs segfaulted, rebooting will bring the shares back.
  4. You just need to re-assign all the previous pool members; order is not important. Also, there can't be an "all data on this device will be deleted at array start" or similar warning next to any of the pool devices.
  5. If there's no data there you just need to re-format.
  6. You can see below for a list of recommended controllers:
  7. You'll need to recreate the pool, but if there's still important data there you can try to recover with this: First create a temp dir:
       mkdir /x
     then try to mount with skip balance:
       mount -o degraded,skip_balance /dev/sdf1 /x
     If that doesn't work try read-only:
       mount -o degraded,ro /dev/sdf1 /x
     If either works you can browse /x and copy any important data to the array.
  8. Please post the complete syslog to see the error/crash.
  9. It's possible, but the VM will be in a crash-consistent state, i.e., the same thing as pulling the plug, so it can be done but there's always a risk. For example, I make daily snapshots of my VMs while they are online (with btrfs, but the principle is the same), but I also take snapshots with the VMs shut down at least once a week so I have more options.
  10. See if the pool starts with only the existing device; if yes, post new diags after array start.
  11. If split level is 0 it will move to the next disk once it hits the minimum free space set for the share; if not, it's like trurl posted, since rsync creates all the folders at the beginning of the transfer.
  12. Total devices means the pool has 2 devices, and 2 devices are assigned (num devices), but the most important part is this one:
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumFound: 1
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumMissing: 1
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumMisplaced: 0
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumExtra: 1
     This means only 1 pool device was found, there's 1 missing and 1 extra (new device). Yes, it can't convert to raid 1 because of the missing device.
  13. That is a serious error, but unlikely to be the result of trim, unless the device has a buggy firmware.
  14. It's perfectly fine to run trim on btrfs filesystems. This is just a warning, unrelated to trim, and not a reason to make a filesystem read-only; something else must have happened.
  15. It's not normal for sure. Does it still fail the SMART test after the clearing?
  16. mdX would be to restore from an array disk; to restore from a cache disk you always use sdX.
  17. You're using a SAS2LP controller, it dropped 2 disks at the same time, these controllers are not recommended for Unraid v6, you should use an LSI instead if possible:
       May 24 16:53:45 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1415:mvs_I_T_nexus_reset for device[0]:rc= 0
       May 24 16:53:45 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1415:mvs_I_T_nexus_reset for device[1]:rc= 0
       May 24 16:53:45 Tower kernel: ata15.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
       May 24 16:53:45 Tower kernel: ata15.00: revalidation failed (errno=-5)
       May 24 16:53:46 Tower kernel: sas: sas_form_port: phy0 belongs to port3 already(1)!
       May 24 16:53:46 Tower kernel: sas: sas_form_port: phy1 belongs to port4 already(1)!
       May 24 16:53:50 Tower kernel: ata14.00: qc timeout (cmd 0xec)
       May 24 16:53:50 Tower kernel: ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
       May 24 16:53:50 Tower kernel: ata14.00: revalidation failed (errno=-5)
       May 24 16:53:51 Tower kernel: sas: sas_form_port: phy0 belongs to port3 already(1)!
       May 24 16:53:53 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1415:mvs_I_T_nexus_reset for device[0]:rc= 0
       May 24 16:53:53 Tower kernel: ata14.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
       May 24 16:53:53 Tower kernel: ata14.00: revalidation failed (errno=-5)
       May 24 16:53:56 Tower kernel: ata15.00: qc timeout (cmd 0xec)
       May 24 16:53:56 Tower kernel: ata15.00: failed to IDENTIFY (I/O error, err_mask=0x4)
       May 24 16:53:56 Tower kernel: ata15.00: revalidation failed (errno=-5)
       May 24 16:53:56 Tower kernel: sas: sas_form_port: phy1 belongs to port4 already(1)!
       May 24 16:53:58 Tower kernel: ata14.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
       May 24 16:53:58 Tower kernel: ata14.00: revalidation failed (errno=-5)
       May 24 16:53:58 Tower kernel: ata14.00: disabled
       May 24 16:53:58 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1415:mvs_I_T_nexus_reset for device[1]:rc= 0
       May 24 16:54:09 Tower kernel: ata15.00: qc timeout (cmd 0xec)
       May 24 16:54:09 Tower kernel: ata15.00: failed to IDENTIFY (I/O error, err_mask=0x4)
       May 24 16:54:09 Tower kernel: ata15.00: revalidation failed (errno=-5)
       May 24 16:54:09 Tower kernel: ata15.00: disabled
       May 24 16:54:09 Tower kernel: sas: sas_form_port: phy1 belongs to port4 already(1)!
       May 24 16:54:11 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1415:mvs_I_T_nexus_reset for device[1]:rc= 0
       May 24 16:54:11 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 2 tries: 1
     P.S. some other ATA errors on different disks that suggest cable/connection issues.
  18. Type "diagnostics" on the console and attach the zip here.
  19. According to the log there's a new cache device but there's also a device missing:
       May 24 20:25:31 Tower emhttpd: /mnt/cache TotDevices: 2
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumDevices: 2
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumFound: 1
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumMissing: 1
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumMisplaced: 0
       May 24 20:25:31 Tower emhttpd: /mnt/cache NumExtra: 1
       May 24 20:25:31 Tower emhttpd: /mnt/cache LuksState: 0
       May 24 20:25:31 Tower emhttpd: shcmd (332): mount -t btrfs -o noatime,space_cache=v2,discard=async,degraded -U 8af1fbf7-4e95-4aa6-aa41-91cf4fcabeec /mnt/cache
       May 24 20:25:31 Tower kernel: BTRFS info (device sdf1): turning on async discard
       May 24 20:25:31 Tower kernel: BTRFS info (device sdf1): allowing degraded mounts
       May 24 20:25:31 Tower kernel: BTRFS info (device sdf1): using free space tree
       May 24 20:25:31 Tower kernel: BTRFS info (device sdf1): has skinny extents
       May 24 20:25:31 Tower kernel: BTRFS warning (device sdf1): devid 2 uuid 34efdbf3-fad8-42bb-acdc-bba1ed3bcaf4 is missing
       May 24 20:25:31 Tower kernel: BTRFS info (device sdf1): enabling ssd optimizations
       May 24 20:25:31 Tower kernel: BTRFS error (device sdf1): balance: invalid convert data profile raid1
       May 24 20:25:31 Tower kernel: BTRFS warning (device sdf1): Skipping commit of aborted transaction.
     Do you still have the missing device?
  20. No, they are not; it looks like I was looking at different diags, sorry about that. Ignore everything I posted before: the disk dropped offline, and that looks to be the result of a bad SATA cable, though you still didn't post the SMART report.
  21. I can point you to the user scripts plugin thread. There are some examples there, but you'll need to adapt the scripts to your needs; Google is another good place to start.
  22. Parity is for replacing one or more failed disks; it can't help with filesystem corruption. That is why parity (and Unraid or any other RAID array) isn't a backup, it just adds redundancy.
  23. That is a connection problem, usually a bad SATA cable, and consistent with how the disk dropped, so you should replace that before re-syncing parity.
  24. Yes, assuming you were fixing the filesystem on disk6 (md6). No, as mentioned, what you see on the emulated disk is what you'll see after a rebuild.
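
The emhttpd pool counters quoted in posts 12 and 19 (NumFound, NumMissing, NumMisplaced, NumExtra) can also be pulled out of a syslog mechanically. Below is a minimal Python sketch of that idea, not anything that ships with Unraid: the names `parse_pool_counters` and `describe_pool` are hypothetical, and the line format is assumed only from the log excerpts quoted in those posts.

```python
import re

# Matches lines such as "... emhttpd: /mnt/cache NumFound: 1".
# The format is an assumption based on the excerpts quoted above.
COUNTER_RE = re.compile(
    r"emhttpd: (\S+) "
    r"(TotDevices|NumDevices|NumFound|NumMissing|NumMisplaced|NumExtra): (\d+)"
)


def parse_pool_counters(log_text):
    """Return {mount_point: {counter_name: value}} for each counter line found."""
    pools = {}
    for match in COUNTER_RE.finditer(log_text):
        mount_point, name, value = match.groups()
        pools.setdefault(mount_point, {})[name] = int(value)
    return pools


def describe_pool(counters):
    """Summarise one pool's counters in plain words (hypothetical helper)."""
    notes = []
    if counters.get("NumMissing", 0):
        notes.append(f"{counters['NumMissing']} missing")
    if counters.get("NumExtra", 0):
        notes.append(f"{counters['NumExtra']} extra (new)")
    if counters.get("NumMisplaced", 0):
        notes.append(f"{counters['NumMisplaced']} misplaced")
    return ", ".join(notes) if notes else "all pool devices found"


# Counter lines copied from the log quoted in post 19.
sample = """\
May 24 20:25:31 Tower emhttpd: /mnt/cache TotDevices: 2
May 24 20:25:31 Tower emhttpd: /mnt/cache NumDevices: 2
May 24 20:25:31 Tower emhttpd: /mnt/cache NumFound: 1
May 24 20:25:31 Tower emhttpd: /mnt/cache NumMissing: 1
May 24 20:25:31 Tower emhttpd: /mnt/cache NumMisplaced: 0
May 24 20:25:31 Tower emhttpd: /mnt/cache NumExtra: 1
"""

pools = parse_pool_counters(sample)
print(describe_pool(pools["/mnt/cache"]))  # prints "1 missing, 1 extra (new)"
```

A non-zero NumMissing alongside a non-zero NumExtra is the same situation described in post 12: one original pool member is absent and a new device has been assigned in its place.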