jazzysmooth

Members
  • Posts

    135

  • Gender
    Undisclosed

jazzysmooth's Achievements

Apprentice (3/14)

1

Reputation

  1. File system check did find and apparently fix issues. Will see what happens in 6 days... Thanks!
  2. Been running this system for years, with a few upgrades here and there. It was very stable until recently, when every 6+ days I'd notice the Docker service was no longer running. Restarting would fix the issue until another 6+ days had passed. Finally bothered to look at the diagnostics, and the crashing seems to be related to:
      WARNING: CPU: 0 PID: 16956 at fs/btrfs/extent-tree.c:3061 __btrfs_free_extent+0x466/0xc02
      ...
      Workqueue: events_unbound btrfs_preempt_reclaim_metadata_space
      ...
      BTRFS error (device sdh1): unable to find ref byte nr 2845564928 parent 0 root 5 owner 40359587 offset 0
      Jul 13 07:04:47 Storage kernel: ------------[ cut here ]------------
      Jul 13 07:04:47 Storage kernel: BTRFS: Transaction aborted (error -2)
      etc.
      I see this in both the 6.12.2 and 6.12.3 logs (attached). I'm going to try a BTRFS file system check next; the 2 SSDs that make up the cache drive are definitely old, but I never had an issue until 6.12.2. Thoughts?
      storage-diagnostics-20230715-1605.zip storage-diagnostics-20230810-1055.zip
  3. I was having random reboots as well, with nothing ever in the logs (like yours). The frequency of them increased over time, but running Common Problems in troubleshooting mode appeared to extend the time between reboots. Ultimately it seems to have been my power supply (a 7+ year old Corsair 550 watt). I replaced that with a new EVGA 500 watt and, so far at least, the reboots have stopped.
  4. Ultimately the problem appears to have been related to the power supply. After swapping out several components with no success, changing the power supply seems to have fixed the random reboots.
  5. I'm curious about this - I have a SuperMicro X8SIL-F motherboard with 2x 8 GB ECC REG quad rank DIMMs. There are 4 memory slots, 2 per channel. If I put the DIMMs in Slot 1 Channel A and Slot 2 Channel A (blue slots), the BIOS screen shows 8 GB and UnRAID shows basically the same as what tmoran000 posted. If however I put the DIMMs in Slot 2 Channel A and Slot 2 Channel B, the BIOS shows 16 GB and UnRAID shows 16 GB in the "allocated" box. Of course the memory is no longer interleaved in this config, and the speed drops from 1066 to 800. But it appears I get access to all the RAM. I'm wondering if the BIOS automatically enables memory sparing when 2 DIMMs of the same size / type are on the same channel? If so, I see no means to disable it like I do on Dell servers. As an aside, the system won't boot with only 1 DIMM, or if I populate Slot 1 Channel A and Slot 1 Channel B.
  6. Been running unraid for years on older hardware with no issues. About a year ago I got a newer (old) SuperMicro board with 16GB so I could run dockers. Periodically the system will reboot, which in turn causes a parity check. The interesting thing is the frequency of the reboots is considerably reduced if I have the Common Problems plugin running its diagnostics. Finally however, it rebooted while diagnostics were running - but I don't see anything in the logs which points me to the issue. Hopefully someone else will? FCPsyslog_tail.txt storage-diagnostics-20180425-0439.zip
  7. Back to the original topic, what file system were the drives formatted with, reiserfs or xfs? You may be able to rebuild the directory tree and recover data: https://lime-technology.com/wiki/Check_Disk_Filesystems (see the XFS or ReiserFS section, depending on which applies).
  8. Did you have the ntfs formatted drives assigned to slots in Unraid? The first thing it has to do is format them, which makes unraid unavailable (that's why most of us use Joe L's preclear script to prep and stress test the drives prior to adding them). So if you didn't have them assigned to slots, you should be fine. If you did, then I'd suggest taking them out and mounting them in a Windows machine and see if you can access the data, or run repair tools on them to try and recover.
  9. OK I have tried all the network tests you suggested. 1. Yes no problem but was only running at less than 9Mb/s which seems slow (copied a 8Gb file between disks) 2. Yes they seem to copy including ones that were 50-60Mb 3. As soon as I tried to copy large video files it seems to crash the system Streaming from unRAID is faultless and has never crashed when watching movies etc, it's ONLY when writing files to the array. Do people think it's still a network problem please? Is there a way to capture the log files onto the USB stick so that after the reboot I can see what errors are occurring as they get wiped when I reboot the box? Thanks!
      You have a Realtek nic (same as I did) and you have the lockup when writing larger files (same as I did). I also never had an issue reading files, only writes. I'd try a dedicated nic.
  10. I had a similar issue some time ago; it turned out to be due to the onboard nic sharing an IRQ with the secondary SATA controller. A few questions - have you made any hardware changes recently? Are you using the onboard nic? If so, what chipset is it? (mine was Realtek) Are you using the onboard SATA connectors? What size are the files you're attempting to copy? Some tests: 1) Can you successfully copy 1+ GB of data from disk to disk using Midnight Commander (taking the network out of the equation)? 2) Can you successfully copy small (2-5 MB) files across the network? 3) Can you successfully copy a 1+ GB file across the network? If test 1 works and you are using the onboard nic, try adding a dedicated nic (most recommend Intel) and see if that makes any difference.
  11. Performing the initconfig as mentioned in that thread won't remove your shares, but it will remove your parity protection (until it rebuilds it). As long as you're comfortable the rest of your drives are in good shape, you should be fine. Side note - you don't have to replace that drive if you're going to do the initconfig. You can just remove the erroring drive, run the initconfig and it will rebuild parity based on the drives that remain. You could then add the replacement drive later. If you don't need the additional space at this moment, that might be a better way to go.
  12. Best to take the problems 1 at a time... You stated the web interface would take a long time to come up and moving from page to page is slow. Have you tried it again with the cache drive removed? Concerning the preclears, while it can take a long time (over 24 hours) to check a 2 TB drive, it's best to find out at the beginning if they are suspect, rather than after you have data on them. How long did it take to run your preclears? Do you have the jumper on or off on those EARS drives? As far as problems running Unraid, the only one I had was when I added drives 5 & 6 (onboard). That was because the controller for those ports shared an IRQ with the onboard nic and the server would lock up during large file transfers. As the other poster stated, adding a dedicated Intel nic solved that issue.
  13. Your 80 GB drive is only connected at 1.5 Gbps (SATA 1). Is that all the drive supports, or is there a jumper on it limiting the speed?
      ---------------------------------------------
      Sep 13 19:58:51 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
      Sep 13 19:58:51 Tower kernel: ata7.00: ATA-7: WDC WD800ADFS-75SLR2, 21.07Q21
      ---------------------------------------------
      There also appears to be a problem with that drive, as was mentioned previously:
      ---------------------------------------------
      Sep 13 19:58:51 Tower kernel: sd 8:0:0:0: [sdf] 156250000 512-byte logical blocks: (80.0 GB/74.5 GiB)
      Sep 13 19:58:51 Tower kernel: ata8: link is slow to respond, please be patient (ready=-19)
      Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
      Sep 13 19:58:51 Tower kernel: ata8: link is slow to respond, please be patient (ready=-19)
      Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
      Sep 13 19:58:51 Tower kernel: ata8: link is slow to respond, please be patient (ready=-19)
      Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
      Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
      Sep 13 19:58:51 Tower kernel: ata8: reset failed, giving up
      Sep 13 19:58:51 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Sep 13 19:58:51 Tower kernel: ata7.00: BMDMA2 stat 0x686c0009
      Sep 13 19:58:51 Tower kernel: ata7.00: failed command: READ DMA
      Sep 13 19:58:51 Tower kernel: ata7.00: cmd c8/00:20:c8:2e:50/00:00:00:00:00/ e9 tag 0 dma 16384 in
      Sep 13 19:58:51 Tower kernel: res 51/04:20:c8:2e:50/00:00:00:00:00/ e9 Emask 0x1 (device error)
      Sep 13 19:58:51 Tower kernel: ata7.00: status: { DRDY ERR }
      Sep 13 19:58:51 Tower kernel: ata7.00: error: { ABRT }
      Sep 13 19:58:51 Tower kernel: ata7.00: configured for UDMA/100
      Sep 13 20:07:01 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      Sep 13 20:07:01 Tower kernel: ata7.00: BMDMA2 stat 0x686d0009
      Sep 13 20:07:01 Tower kernel: ata7.00: failed command: READ DMA
      Sep 13 20:07:01 Tower kernel: ata7.00: cmd c8/00:00:60:44:04/00:00:00:00:00/ e0 tag 0 dma 131072 in
      Sep 13 20:07:01 Tower kernel: res 51/04:00:60:44:04/00:00:00:00:00/ e0 Emask 0x1 (device error)
      Sep 13 20:07:01 Tower kernel: ata7.00: status: { DRDY ERR }
      Sep 13 20:07:01 Tower kernel: ata7.00: error: { ABRT }
      Sep 13 20:07:01 Tower kernel: ata7.00: configured for UDMA/100
      Sep 13 20:07:01 Tower kernel: ata7: EH complete
      Sep 13 20:07:01 Tower kernel: ata7: drained 32768 bytes to clear DRQ.
      Sep 13 20:07:01 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      Sep 13 20:07:01 Tower kernel: ata7.00: failed command: READ DMA
      Sep 13 20:07:01 Tower kernel: ata7.00: cmd c8/00:00:60:46:04/00:00:00:00:00/ e0 tag 0 dma 131072 in
      Sep 13 20:07:01 Tower kernel: res ff/ff:ff:ff:ff:ff/00:00:00:00:00/ ff Emask 0x2 (HSM violation)
      Sep 13 20:07:01 Tower kernel: ata7.00: status: { Busy }
      Sep 13 20:07:01 Tower kernel: ata7.00: error: { ICRC UNC IDNF ABRT }
      Sep 13 20:07:01 Tower kernel: ata7: hard resetting link
      Sep 13 20:07:01 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
      Sep 13 20:07:01 Tower kernel: ata7.00: configured for UDMA/100
      ---------------------------------------------
      I'd definitely disconnect that 80 GB drive and try it again.
  14. I don't see any issues, UNRaid supports a pretty wide variety of hardware. The only thing that potentially raises a flag for me is the Marvell controller on the NIC - but that's only because I don't know if there have been issues with them or not, so you may want to do a search on Marvell. However, you can easily (and cheaply) add an Intel based nic and disable the onboard one if it does come up as a potential problem area.
  15. Both questions would require you to send an email to support (Tom). He should have a record of the GUIDs you registered, or that you purchased a 2 pak and only registered 1. He can then provide you with the license key. As far as the 2nd question, while he isn't under any obligation to provide a replacement key for a failed flash drive, he has typically done so in the past - again, just send an email to support explaining the situation.
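
The kind of syslog reading done by hand in post 13 (spotting a port stuck at 1.5 Gbps, repeated COMRESET failures, failed READ DMA commands) could be roughly automated. What follows is only a minimal sketch, not an Unraid tool: the function name `summarize_ata`, the two regular expressions, and the `SAMPLE` excerpt (a few lines copied from that post) are all assumptions made for illustration, and it assumes one kernel message per line.

```python
import re
from collections import Counter, defaultdict

# Hypothetical patterns covering only the message types quoted in post 13.
LINK_UP = re.compile(r"kernel: (ata\d+): SATA link up ([\d.]+ Gbps)")
PORT_EVENT = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: "
    r"(COMRESET failed|failed command: [A-Z ]+|reset failed)"
)


def summarize_ata(log_text):
    """Group link speed and error events by ATA port (ata7, ata8, ...)."""
    summary = defaultdict(lambda: {"link": None, "events": Counter()})
    for line in log_text.splitlines():
        match = LINK_UP.search(line)
        if match:
            # Remember the most recent negotiated link speed for the port.
            summary[match.group(1)]["link"] = match.group(2)
            continue
        match = PORT_EVENT.search(line)
        if match:
            # Count each error type once per matching log line.
            summary[match.group(1)]["events"][match.group(2).strip()] += 1
    return dict(summary)


# Short excerpt taken from the log lines quoted in post 13.
SAMPLE = """\
Sep 13 19:58:51 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
Sep 13 19:58:51 Tower kernel: ata8: reset failed, giving up
Sep 13 19:58:51 Tower kernel: ata7.00: failed command: READ DMA
Sep 13 20:07:01 Tower kernel: ata7.00: failed command: READ DMA
"""

if __name__ == "__main__":
    for port, info in sorted(summarize_ata(SAMPLE).items()):
        print(port, info["link"], dict(info["events"]))
```

A port that shows errors but never reports a link speed (ata8 in the excerpt) is likely worth checking first, since it suggests the link never came up at all. A summary like this only narrows down where to look; it does not tell you whether the drive, cable, or controller is at fault.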