Everything posted by JorgeB

  1. Your cache isn't mounting; was it ever formatted? Shfs is then crashing, possibly because of that, though it shouldn't, so possibly a bug.
  2. Errors are logged as an actual disk error, but on LBA 0, which is weird, and SMART looks OK. Run an extended SMART test; if it's successful, swap cables with another disk to rule those out and keep monitoring.
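     If you prefer the console for this, a minimal sketch, assuming the disk shows up as /dev/sdX (replace with the actual device):

         smartctl -t long /dev/sdX   # start the extended (long) self-test
         smartctl -a /dev/sdX        # check progress and, once done, the result

     The extended test can take several hours on a large disk; it runs inside the drive itself, so you can keep using the array in the meantime.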
  3. There are errors on multiple disks, which suggests a controller/connection/power problem. Rebooting should bring the unmountable disks back online, at least most of them; some might need a filesystem check, but don't try that before rebooting. Also try to at least get the syslog before rebooting, even if you need to do it manually: cp /var/log/syslog /boot/syslog.txt
  4. Nothing I can see on the Unraid side; it's as if the disk is not connected. Since the LSI has a BIOS installed, does the disk show up in the BIOS when this happens? P.S. you have a lot of NerdTools apps installed, make sure you only install the ones you actually need.
  5. Only if you format the device manually.
  6. As most probably know, when all disks are valid and there are errors on multiple disks, e.g. when there's a controller or power problem, Unraid only disables as many disks as there are parity devices. This is a very good thing. But if, for example, a user with single parity is rebuilding a disabled disk and a write fails to another disk, Unraid disables that disk too, leaving the user in a precarious situation. Maybe there's a reason for this, maybe it's an oversight. If it's the latter, I would suggest the behavior be similar to the one described above, so that the user is never left with more invalid disks than parity can handle. I would also suggest pausing the rebuild if that happens, so the user can decide whether to abort, check the problem and retry the rebuild, or proceed knowing that there will be corruption on the rebuilt disk.
  7. Yes, if reads are the main concern, it should help. I don't know for sure, but I believe it's related to the GUI stats, i.e., when they are updated. I've noticed that, at least with btrfs (the fs I use, it might be different with others), it takes a few seconds for the stats to update after a write finishes, and if for example I start another write right away it goes to the same disk until the stats are updated. So basically it depends mostly on the size of the files you're writing: with very small files you will likely see multiple files going to the same disk before it switches.
  8. Not sure what happened, but the partitions are gone, and nothing you described explains that. You're not the first to have something like this happen, though not to all disks as far as I can remember. You can try testdisk to see if it can find and recover the partition on each disk (see the sketch below); if yes, try again to mount the disks with UD. If that doesn't work and it's just the partition that's missing, recreating the partition exactly as it should be might bring the data back. Other than that, a file recovery utility is probably your best bet, something like UFS Explorer.
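     A minimal testdisk sketch, assuming the disk shows up as /dev/sdX (run it against the whole disk, not a partition):

         testdisk /dev/sdX   # interactive: pick the partition table type,
                             # then Analyse -> Quick Search to look for the
                             # lost partition, and Write to restore it

     testdisk is interactive, so the comments just describe the usual menu path; don't use Write until you're confident the partition it found matches the original one.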
  9. You can do it now; you'll need to restart the rebuild, but at least you'll know whether it's worth it, since whatever is on the emulated disk will be on the actual disk after the rebuild is done. https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
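     If you'd rather do the check from the console instead of the webGUI, a minimal sketch, assuming the disk is XFS and assigned as disk1 (start the array in maintenance mode first so /dev/md1 exists):

         xfs_repair -n /dev/md1   # -n = check only, makes no changes
         xfs_repair /dev/md1      # actual repair, only after reviewing the check output

     Running against /dev/md1 instead of the raw device keeps parity in sync; adjust the number to match the disk slot.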
  10. If you do a new config without preserving assignments, all disks will be unassigned and will appear in the UD section. See if they all have an option to mount, i.e. there's still a valid partition (they need to show a plus sign next to them).
  11. According to previous diags you assigned old disk1 to the parity slot, and that's a shame since you had the diags posted here, so you just needed to check them if you weren't sure which disk was which. Since the parity sync ran for a few minutes, that disk will now also be unmountable. You can force rebuilding that disk using the invalid slot command, but note that since the array was started read/write parity won't be 100% valid, so there may be at least some fs corruption. Follow the instructions below carefully and ask if there's any doubt.
      - Tools -> New Config -> Retain current configuration: All -> Apply
      - Check all assignments and assign any missing disk(s) if needed; assign old parity as parity, and current parity as disk2 (or disk1 if you want, data disk order doesn't matter with single parity)
      - Important: after checking the assignments leave the browser on that page, the "Main" page
      - Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):
        mdcmd set invalidslot 2 29
        (assuming you assigned that disk as disk2; if it's now disk1 replace 2 with 1)
      - Back in the GUI and without refreshing the page, just start the array; do not check the "parity is already valid" box (the GUI will still show that data on the parity disk(s) will be overwritten, this is normal as it doesn't account for the invalid slot command, but it won't be as long as the procedure was done correctly). Disk2 will start rebuilding; the disk should mount immediately (possibly not in this case), but if it's unmountable don't format, wait for the rebuild to finish (or cancel it) and then run a filesystem check.
  12. What you say happened doesn't explain the invalid partition errors; did you leave anything out? Also, do a new config to unassign all disks: do they have the option to mount in UD, or do they appear without a partition? Note: if they have the option to mount and you want to try it, do it only in read-only mode for now.
  13. If you don't have users created you can click on the flash drive on the Main page and change the export security to "public", then change it back afterwards for better security.
  14. Do you know for sure which disk was parity? If not, make sure you don't assign any disk to the parity slot. Also please post diags.
  15. Please post current diags: Tools -> Diagnostics
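      If the webGUI isn't reachable you can also grab them from the console/SSH, a minimal sketch:

          diagnostics   # saves a zip to the logs folder on the flash drive (/boot/logs)

      Then pull the zip off the flash drive and attach it to your next post.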
  16. It's possible; there's a cache device missing and Unraid is failing to remove it because there's filesystem corruption, and since there are filesystem issues they can then also cause other problems. In any case this needs to be fixed; your best bet is to back up and then reformat the cache (see the sketch below).
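      A minimal backup sketch before the reformat, assuming disk1 has enough free space (the destination folder is just an example, adjust as needed):

          rsync -av /mnt/cache/ /mnt/disk1/cache_backup/   # copy everything, preserving attributes

      After the cache is reformatted, run the same command with source and destination swapped to restore the data.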
  17. Yes, but let me repeat that it's very difficult to diagnose most hardware issues remotely, especially when the issue locks up the server. If there's something on the syslog server, that might help, but most times the way to find the problem is for the user to start swapping some of the hardware around.
  18. Yes, the problem appears to have been caused by the controller, not the disk.
  19. Just because the error mentions BUG doesn't mean it's an Unraid bug; in fact it most likely isn't, and when it's a hardware problem it's very difficult to diagnose remotely, especially if there's nothing logged in the syslog. If this happens again, make sure to use the syslog server feature linked above and then post that syslog; it might catch the beginning of the error, while the screenshots you posted only show the end.
  20. This has happened before and IIRC it was a flash drive problem; copy all the bz* files from the Unraid install zip to the flash drive, overwriting the existing ones, and reboot.
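      A minimal sketch, assuming you extracted the zip for your exact Unraid version to /tmp/unraid and the flash drive is mounted at /boot (both paths are examples, adjust as needed):

          cp /tmp/unraid/bz* /boot/   # overwrite the existing bz* files
          sync                        # flush writes to the flash before rebooting

      You can also do this from another computer with the flash drive plugged in directly.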
  21. Sorry, I sometimes forget not everyone is aware of everything related to Unraid. UD is the Unassigned Devices plugin; you can install it from the Apps tab. Check the first couple of posts on its support thread; that and the included help should give you an idea of how to use it, and if you still have doubts, ask. Note that you need to unassign the disabled disk and start the array to make Unraid "forget" it before it can be used with UD. Yes, new config, preserve everything, re-assign the original 10TB if needed; the array should look the same as it did before the upgrade, then start it to re-sync parity. No, the new config is to rebuild parity based on the old disk6 (assuming disk13 also looks correct), to bring the array back to the same state it was in before the upgrade; after that is done and parity is valid you repeat the disk upgrade.
  22. That's expected; once a disk is disabled it stays disabled until rebuilt. Disk13 looks fine, but it's still a good idea to check that it mounts correctly with UD in read-only mode; if yes, do a new config with old disk6 and re-sync parity, then repeat the replacement. You should also replace the SATA cable for disk14, since there are a few CRC errors in the previous diags. Also a good idea to replace that SASLP ASAP.
  23. Disk13 dropped offline; this looks more like a controller problem, and the SASLP hasn't been recommended for a long time. Since there's no SMART report, reboot and post new diags, but note that the rebuilt disk will be mostly corrupt due to those errors.
  24. Run an extended SMART test; if that also passes, the disk is OK for now, though it's more likely to have more issues in the near future.