jazzysmooth

Members
  • Posts

    135

Everything posted by jazzysmooth

  1. File system check did find and apparently fix issues. Will see what happens in 6 days... Thanks!
  2. Been running this system for years, with a few upgrades here and there. It was very stable until recently, when I started noticing every 6+ days that the Docker service was no longer running. Restarting would fix the issue until another 6+ days passed. Finally bothered to look at the diagnostics, and the crashing seems to be related to:
     WARNING: CPU: 0 PID: 16956 at fs/btrfs/extent-tree.c:3061 __btrfs_free_extent+0x466/0xc02
     ...
     Workqueue: events_unbound btrfs_preempt_reclaim_metadata_space
     ...
     BTRFS error (device sdh1): unable to find ref byte nr 2845564928 parent 0 root 5 owner 40359587 offset 0
     Jul 13 07:04:47 Storage kernel: ------------[ cut here ]------------
     Jul 13 07:04:47 Storage kernel: BTRFS: Transaction aborted (error -2)
     etc.
     I see this in the 6.12.2 and 6.12.3 logs (attached). I'm going to try a BTRFS file system check next; the 2 SSDs that make up the cache drive are definitely old, but never had an issue until 6.12.2. Thoughts?
     storage-diagnostics-20230715-1605.zip storage-diagnostics-20230810-1055.zip
  3. I was having random reboots as well, with nothing ever in the logs (like yours). The frequency of them increased over time, but running Common Problems in troubleshooting mode appeared to extend the time between reboots. Ultimately it seems to have been my power supply (7+ yr old Corsair 550 watt). Replaced that with a new EVGA 500 watt and, so far at least, the reboots have stopped.
  4. Ultimately the problem appears to have been related to the power supply. After swapping out several components with no success, changing the power supply seems to have fixed the random reboots.
  5. I'm curious about this - I have a SuperMicro X8SIL-F motherboard with 2x 8 GB ECC REG quad rank DIMMs. There are 4 memory slots, 2 per channel. If I put the DIMMs in Slot 1 Channel A and Slot 2 Channel A (blue slots), the BIOS screen shows 8 GB and Unraid shows basically the same as what tmoran000 posted. If, however, I put the DIMMs in Slot 2 Channel A and Slot 2 Channel B, the BIOS shows 16 GB and Unraid shows 16 GB in the "allocated" box. Of course the memory is no longer interleaved in this config, and the speed drops from 1066 to 800. But it appears I get access to all the RAM. I'm wondering if the BIOS automatically enables memory sparing when 2 DIMMs of the same size / type are on the same channel? If so, I see no means to disable it like I do on Dell servers. As an aside, the system won't boot with only 1 DIMM, or if I populate Slot 1 Channel A and Slot 1 Channel B.
  6. Been running unraid for years on older hardware with no issues. About a year ago I got a newer (old) SuperMicro board with 16GB so I could run dockers. Periodically the system will reboot, which in turn causes a parity check. The interesting thing is the frequency of the reboots is considerably reduced if I have the Common Problems plugin running its diagnostics. Finally however, it rebooted while diagnostics were running - but I don't see anything in the logs which points me to the issue. Hopefully someone else will? FCPsyslog_tail.txt storage-diagnostics-20180425-0439.zip
  7. Back to the original topic: what file system were the drives formatted with, ReiserFS or XFS? You may be able to rebuild the directory tree and recover data. See https://lime-technology.com/wiki/Check_Disk_Filesystems (the XFS or ReiserFS section, depending on which you have).
  8. Did you have the NTFS-formatted drives assigned to slots in Unraid? The first thing it has to do is format them, which makes Unraid unavailable (that's why most of us use Joe L's preclear script to prep and stress test the drives prior to adding them). So if you didn't have them assigned to slots, you should be fine. If you did, then I'd suggest taking them out, mounting them in a Windows machine, and seeing if you can access the data, or running repair tools on them to try to recover it.
  9. OK I have tried all the network tests you suggested.
     1. Yes no problem but was only running at less than 9Mb/s which seems slow (copied a 8Gb file between disks)
     2. Yes they seem to copy including ones that were 50-60Mb
     3. As soon as I tried to copy large video files it seems to crash the system
     Streaming from unRAID is faultless and has never crashed when watching movies etc, it's ONLY when writing files to the array. Do people think it's still a network problem please? Is there a way to capture the log files onto the USB stick so that after the reboot I can see what errors are occurring as they get wiped when I reboot the box? Thanks!

     You have a Realtek nic (same as I did) and you have the lockup when writing larger files (same as I did). I also never had an issue reading files, only writes. I'd try a dedicated nic.
  10. I had a similar issue some time ago; it turned out to be due to the onboard nic sharing an IRQ with the secondary SATA controller. A few questions:
     - Have you made any hardware changes recently?
     - Are you using the onboard nic? If so, what chipset is it? (Mine was Realtek.)
     - Are you using the onboard SATA connectors?
     - What size are the files you're attempting to copy?
     Some tests:
     1) Can you successfully copy 1+ GB of data from disk to disk using Midnight Commander (taking the network out of the equation)?
     2) Can you successfully copy small (2-5 MB) files across the network?
     3) Can you successfully copy a 1+ GB file across the network?
     If test 1 works and you are using the onboard nic, try adding a dedicated nic (most recommend Intel) and see if that makes any difference.
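One way to check for the IRQ sharing described above is to look for lines in /proc/interrupts that list more than one device. The following is a minimal sketch, assuming the usual layout where the device names form the last column and are separated by commas; the function name `shared_irqs` and the sample text (including its counts and driver names) are made up for illustration and are not taken from any poster's system.

```python
import re


def shared_irqs(interrupts_text):
    """Return {irq_number: [devices]} for numbered IRQ lines listing more than one device."""
    shared = {}
    for line in interrupts_text.splitlines():
        # Numbered IRQ lines look like " 19:  482113  IO-APIC-fasteoi  sata_sil, eth0"
        match = re.match(r"\s*(\d+):\s+(.*)$", line)
        if not match:
            continue  # skips the CPU header row and named rows such as NMI or LOC
        irq, rest = match.groups()
        # Assumption: the device list is the last column, set off by 2+ spaces
        last_column = re.split(r"\s{2,}", rest.strip())[-1]
        devices = [name.strip() for name in last_column.split(",")]
        if len(devices) > 1:
            shared[int(irq)] = devices
    return shared


# Hypothetical sample in the style of /proc/interrupts on a single-CPU box
SAMPLE = """\
           CPU0
  0:        125   IO-APIC-edge      timer
 19:     482113   IO-APIC-fasteoi   sata_sil, eth0
 22:      90211   IO-APIC-fasteoi   ehci_hcd:usb1
"""

if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On a live server the same function could be fed the real file, e.g. `shared_irqs(open("/proc/interrupts").read())`. A shared IRQ is not proof of a problem by itself; it only shows whether the nic and a SATA controller are candidates for the conflict described in the post.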
  11. Performing the initconfig as mentioned in that thread won't remove your shares, but it will remove your parity protection (until it rebuilds it). As long as you're comfortable the rest of your drives are in good shape, you should be fine. Side note - you don't have to replace that drive if you're going to do the initconfig. You can just remove the erroring drive, run the initconfig and it will rebuild parity based on the drives that remain. You could then add the replacement drive later. If you don't need the additional space at this moment, that might be a better way to go.
  12. Best to take the problems 1 at a time... You stated the web interface would take a long time to come up and moving from page to page is slow. Have you tried it again with the cache drive removed? Concerning the preclears, while it can take a long time (over 24 hours) to check a 2 TB drive, it's best to find out at the beginning if they are suspect, rather than after you have data on them. How long did it take to run your preclears? Do you have the jumper on or off on those EARS drives? As far as problems running Unraid, the only one I had was when I added drives 5 & 6 (onboard). That was because the controller for those ports shared an IRQ with the onboard nic and the server would lock up during large file transfers. As the other poster stated, adding a dedicated Intel nic solved that issue.
  13. Your 80 GB drive is only connected at 1.5 Gbps (SATA 1). Is that all the drive supports, or is there a jumper on it limiting the speed?
     ---------------------------------------------
     Sep 13 19:58:51 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
     Sep 13 19:58:51 Tower kernel: ata7.00: ATA-7: WDC WD800ADFS-75SLR2, 21.07Q21
     ---------------------------------------------
     There also appears to be a problem with that drive, as was mentioned previously:
     ---------------------------------------------
     Sep 13 19:58:51 Tower kernel: sd 8:0:0:0: [sdf] 156250000 512-byte logical blocks: (80.0 GB/74.5 GiB)
     Sep 13 19:58:51 Tower kernel: ata8: link is slow to respond, please be patient (ready=-19)
     Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
     Sep 13 19:58:51 Tower kernel: ata8: link is slow to respond, please be patient (ready=-19)
     Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
     Sep 13 19:58:51 Tower kernel: ata8: link is slow to respond, please be patient (ready=-19)
     Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
     Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)
     Sep 13 19:58:51 Tower kernel: ata8: reset failed, giving up
     Sep 13 19:58:51 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
     Sep 13 19:58:51 Tower kernel: ata7.00: BMDMA2 stat 0x686c0009
     Sep 13 19:58:51 Tower kernel: ata7.00: failed command: READ DMA
     Sep 13 19:58:51 Tower kernel: ata7.00: cmd c8/00:20:c8:2e:50/00:00:00:00:00/ e9 tag 0 dma 16384 in
     Sep 13 19:58:51 Tower kernel: res 51/04:20:c8:2e:50/00:00:00:00:00/ e9 Emask 0x1 (device error)
     Sep 13 19:58:51 Tower kernel: ata7.00: status: { DRDY ERR }
     Sep 13 19:58:51 Tower kernel: ata7.00: error: { ABRT }
     Sep 13 19:58:51 Tower kernel: ata7.00: configured for UDMA/100
     Sep 13 20:07:01 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
     Sep 13 20:07:01 Tower kernel: ata7.00: BMDMA2 stat 0x686d0009
     Sep 13 20:07:01 Tower kernel: ata7.00: failed command: READ DMA
     Sep 13 20:07:01 Tower kernel: ata7.00: cmd c8/00:00:60:44:04/00:00:00:00:00/ e0 tag 0 dma 131072 in
     Sep 13 20:07:01 Tower kernel: res 51/04:00:60:44:04/00:00:00:00:00/ e0 Emask 0x1 (device error)
     Sep 13 20:07:01 Tower kernel: ata7.00: status: { DRDY ERR }
     Sep 13 20:07:01 Tower kernel: ata7.00: error: { ABRT }
     Sep 13 20:07:01 Tower kernel: ata7.00: configured for UDMA/100
     Sep 13 20:07:01 Tower kernel: ata7: EH complete
     Sep 13 20:07:01 Tower kernel: ata7: drained 32768 bytes to clear DRQ.
     Sep 13 20:07:01 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
     Sep 13 20:07:01 Tower kernel: ata7.00: failed command: READ DMA
     Sep 13 20:07:01 Tower kernel: ata7.00: cmd c8/00:00:60:46:04/00:00:00:00:00/ e0 tag 0 dma 131072 in
     Sep 13 20:07:01 Tower kernel: res ff/ff:ff:ff:ff:ff/00:00:00:00:00/ ff Emask 0x2 (HSM violation)
     Sep 13 20:07:01 Tower kernel: ata7.00: status: { Busy }
     Sep 13 20:07:01 Tower kernel: ata7.00: error: { ICRC UNC IDNF ABRT }
     Sep 13 20:07:01 Tower kernel: ata7: hard resetting link
     Sep 13 20:07:01 Tower kernel: ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
     Sep 13 20:07:01 Tower kernel: ata7.00: configured for UDMA/100
     ---------------------------------------------
     I'd definitely disconnect that 80 GB drive and try it again.
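When a syslog is long, it can help to tally which ATA port the trouble lines belong to before deciding which drive to pull. Below is a rough sketch of that idea. The sample lines are copied from the excerpt above; the function name `ata_trouble_counts` and the list of phrases treated as "trouble" are my own assumptions, not an official list of kernel messages.

```python
import re
from collections import Counter

# Captures the port ("ata7") and the message from lines such as
# "... kernel: ata7.00: failed command: READ DMA"
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Assumption: phrases worth counting; adjust to taste
TROUBLE = ("failed command", "COMRESET failed", "reset failed", "hard resetting link")


def ata_trouble_counts(syslog_lines):
    """Count trouble messages per ATA port."""
    counts = Counter()
    for line in syslog_lines:
        match = ATA_LINE.search(line)
        if match and any(phrase in match.group(2) for phrase in TROUBLE):
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the syslog excerpt in the post
SAMPLE = [
    "Sep 13 19:58:51 Tower kernel: ata8: COMRESET failed (errno=-16)",
    "Sep 13 19:58:51 Tower kernel: ata8: reset failed, giving up",
    "Sep 13 19:58:51 Tower kernel: ata7.00: failed command: READ DMA",
    "Sep 13 20:07:01 Tower kernel: ata7: hard resetting link",
    "Sep 13 20:07:01 Tower kernel: ata7.00: configured for UDMA/100",
]

if __name__ == "__main__":
    for port, count in ata_trouble_counts(SAMPLE).items():
        print(port, count)
```

The tally only points at a port; matching a port back to a physical drive still means reading the identification lines in the syslog (for example the `ata7.00: ATA-7: WDC WD800ADFS-75SLR2` line).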
  14. I don't see any issues; Unraid supports a pretty wide variety of hardware. The only thing that potentially raises a flag for me is the Marvell controller on the NIC - but that's only because I don't know if there have been issues with them or not, so you may want to do a search on Marvell. However, you can easily (and cheaply) add an Intel-based nic and disable the onboard one if it does come up as a potential problem area.
  15. Both questions would require you to send an email to support (Tom). He should have a record of the GUIDs you registered, or that you purchased a 2 pak and only registered 1. He can then provide you with the license key. As far as the 2nd question, while he isn't under any obligation to provide a replacement key for a failed flash drive, he has typically done so in the past - again, just send an email to support explaining the situation.
  16. What are the drive sizes and how much have you written to the 1st drive? High water is calculated at 1/2 the size of the largest disk, so it will write to the first drive until it's 50% utilized. Then it will move to drive 2, drive 3, etc, each time writing until 50% full (assuming they are all the same size). Then it halves the mark again, so it would write to the 1st drive until 75% full, then drive 2, etc. So until you get the first drive 50% full, you won't see anything on the 2nd drive.
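The high-water behaviour described above can be sketched in a few lines. This is only a model of the description in the post, not Unraid's actual code: the function name `pick_disk` is hypothetical, sizes are in arbitrary units (think GB), and the size of the file being written is ignored.

```python
def pick_disk(sizes, used):
    """Return the index of the disk the next write would go to, or None if all are full.

    sizes and used are parallel lists in the same unit.
    """
    mark = max(sizes) / 2              # first high-water mark: half the largest disk
    while mark >= 1:
        for index, (size, taken) in enumerate(zip(sizes, used)):
            if size - taken > mark:    # this disk still has free space above the mark
                return index
        mark /= 2                      # every disk has reached the mark: halve it and retry
    return None


if __name__ == "__main__":
    sizes = [100, 100, 100]
    print(pick_disk(sizes, [0, 0, 0]))     # nothing written yet
    print(pick_disk(sizes, [50, 0, 0]))    # first disk has reached 50%
    print(pick_disk(sizes, [50, 50, 50]))  # all at 50%, so the mark drops to 25
```

With three equal disks, the first call picks disk 0, the second moves on to disk 1, and the third returns to disk 0, which is then filled to 75% before disk 1 is used again.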
  17. Think of your shared folder as level 0 and the split level you allocate is the 1st level you DON'T want to split. So some examples: If you have "TV Shows" as the root folder (/TV Shows)
     /TV Shows (level 0)
       -Weeds (level 1)
         -Season 1 (level 2)
         -Season 2 (level 2)
     If you assign Split level 1 (and don't exclude any drives) then the /TV Shows folder could appear on all the drives, while Weeds (and any subfolders) would only be written to 1 drive. If you assigned split level 2, /TV Shows/Weeds could appear on multiple drives, but the contents of each season would be on 1 drive (NOTE, Season 1 and Season 2 could be on different drives in this scenario).
     If you use a more complex folder structure then just change the split level accordingly:
     /Media (Level 0)
       -TV (Level 1) Choosing split level 1 could cause TV and all subfolders to be written to the same drive
         -Action (Level 2)
         -Comedies (Level 2) Level 2 would cause /Media/TV to be on multiple drives, with Comedies and all subfolders to be on the same drive
           -Weeds (Level 3) Level 3 would cause /Media/TV/Comedies to potentially be on multiple drives, with Weeds and all subs only on 1 drive
             - Season 1 (Level 4) Level 4 would cause /Media/TV/Comedies/Weeds to potentially be on multiple drives, with Season 1 to be only on 1 drive
             - Season 2 (Level 4)
     Again, remember that the split level you choose is the first level that you don't want to split. Also note that changing a split level after data is in the folder won't move existing data - it only applies to new data. You'd have to manually adjust existing data, or move it out and copy it back after changing the split level.
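The split-level rule above ("the first level you don't want to split") can be expressed as a small helper that, for a given file path, returns the folder whose contents must stay together on one drive. This is a sketch of the rule as the post describes it, not of Unraid's implementation; `pinned_folder` is a hypothetical name, and the paths are assumed to be absolute, to start at the share (level 0), and to end in a file name.

```python
from pathlib import PurePosixPath


def pinned_folder(path, split_level):
    """Return the folder that has to stay on a single drive for this file.

    Returns None when the file sits above the split level, i.e. nothing pins it.
    """
    parts = PurePosixPath(path).parts   # ('/', 'TV Shows', 'Weeds', 'Season 1', 'e01.avi')
    folders = parts[1:-1]               # drop the leading '/' and the file name
    if len(folders) <= split_level:
        return None
    # Folder at index split_level is the first one that must not be split
    return "/" + "/".join(folders[: split_level + 1])


if __name__ == "__main__":
    episode = "/TV Shows/Weeds/Season 1/e01.avi"
    print(pinned_folder(episode, 1))
    print(pinned_folder(episode, 2))
    print(pinned_folder("/Media/TV/Comedies/Weeds/Season 1/e01.avi", 3))
```

Two files whose `pinned_folder` results are equal would land on the same drive under this rule; files with different results are free to end up on different drives, which is why Season 1 and Season 2 can be separated at split level 2.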
  18. Am a little confused by your stream of consciousness typing, but if your ultimate question is "Is there a way to use my current array set with the new motherboard/cpu combo", the answer is yes. Unraid doesn't really care about the hardware - as long as you can see all the drives with the motherboard BIOS or add-in cards, you should be fine. As far as your particular situation, what is the issue: that the network card died and so you can't reach the webpage to manage the array? Or that you want to get a copy of that jpg? If the former, I guess you'll need to add a dedicated nic (I recommend Intel). If the latter, you should be able to copy the file to the flash drive via Midnight Commander (MC) while logged into the console. This is assuming that the array is available.
  19. It sees it as a new disk:
     Jun 26 21:38:37 Tower kernel: md: import disk4: [8,80] (sdf) SAMSUNG_HD155UI_S2HEJ1DB303167 size: 1465138552
     Jun 26 21:38:37 Tower kernel: md: disk4 new disk
     And is clearing it:
     Jun 26 21:38:38 Tower emhttp: clearing disk4...
     This process can take hours, during which time Unraid shares aren't available. If you want to avoid the extended downtime when adding a new disk, then use Joe's preclear script to clear the drive outside of the array. Then when you add it in, all Unraid has to do is format it, which should take less than a minute. As to why you didn't see this behavior when you added the previous drives, I can't say. But as best I can tell from your syslog, everything is working normally.
  20. Your previous key is tied to the GUID of the old flash drive, so even if you copied that key file over, it doesn't match the new drive and so Unraid defaults to the free (3 drive) version. Since your old drive didn't die, you'll most likely have to purchase a new key. Tom has been known to generate a new key for a replacement drive for free; however, he's under no obligation to do so - and in this case, there's the risk that you could use both drives, thereby getting a license for free. For the immediate term, use the old drive. You can also send an email to Tom explaining what you want to do; perhaps he'll oblige you with a replacement key.
  21. Any reason you can't use Joe's preclear script? http://lime-technology.com/forum/index.php?topic=13054.0 That will test the entire drive (multiple times if desired) and write the proper signature to it so Unraid can bring it online almost immediately.
  22. I'll withhold my comments on randomly changing cables... In any case, when you moved the disk inside the machine, did you connect it to the same cable / SATA port as it was connected to when outside the box? If not, it's quite possible the disk order changed when you moved it inside. Are you CERTAIN the parity disk is assigned correctly? If it isn't, and you started Unraid, you just wiped whatever data was on that drive. If you are comfortable that parity has been correctly assigned, and the data on the other drives is intact, you can type initconfig at the server prompt, which will remove the current configuration, allow you to reassign the drives, start the array, and it will recalculate parity. NOTE: I'd strongly suggest posting a syslog before going this route if at all possible.
  23. I added an Intel Pro 100/1000 (sorry, don't have the exact model # offhand) PCIx card and didn't have to do anything other than disable the onboard nic in the BIOS. Do you see the card being initialized in the syslog? Here's the network portion of mine:
     Apr 23 14:42:17 Storage kernel: e1000 0000:02:16.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 (Network)
     Apr 23 14:42:17 Storage kernel: e1000: 0000:02:16.0: e1000_probe: (PCI:33MHz:32-bit) 00:02:b3:b1:d0:ac (Network)
     Apr 23 14:42:17 Storage kernel: e1000: eth0: e1000_probe: Intel® PRO/1000 Network Connection (Network)
     Apr 23 14:42:17 Storage logger: /etc/rc.d/rc.inet1: /sbin/ifconfig lo 127.0.0.1 (Network)
     Apr 23 14:42:17 Storage logger: /etc/rc.d/rc.inet1: /sbin/route add -net 127.0.0.0 netmask 255.0.0.0 lo (Network)
     Apr 23 14:42:17 Storage ifplugd(eth0)[1249]: ifplugd 0.28 initializing. (Network)
     Apr 23 14:42:17 Storage ifplugd(eth0)[1249]: Using detection mode: SIOCETHTOOL (Network)
     Apr 23 14:42:17 Storage ifplugd(eth0)[1249]: Initialization complete, link beat not detected. (Network)
     Apr 23 14:42:17 Storage kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX (Network)
     Apr 23 14:42:18 Storage ifplugd(eth0)[1249]: Link beat detected. (Network)
     Apr 23 14:42:19 Storage ifplugd(eth0)[1249]: Executing '/etc/ifplugd/ifplugd.action eth0 up'. (Network)
     Apr 23 14:42:19 Storage logger: /etc/rc.d/rc.inet1: /sbin/ifconfig eth0 hw ether 00:50:8D:91:70:39 (Network)
     Apr 23 14:42:19 Storage logger: /etc/rc.d/rc.inet1: /sbin/ifconfig eth0 192.168.10.143 broadcast 192.168.10.255 netmask 255.255.255.0 (Network)
     Apr 23 14:42:19 Storage logger: /etc/rc.d/rc.inet1: /sbin/route add default gw 192.168.10.1 metric 1 (Network)
     Apr 23 14:42:19 Storage ifplugd(eth0)[1249]: client: Starting NTP daemon: /usr/sbin/ntpd -g (Network)
     Apr 23 14:42:19 Storage ntpd[1983]: Listening on interface #0 wildcard, 0.0.0.0#123 Disabled (Network)
     Apr 23 14:42:19 Storage ntpd[1983]: Listening on interface #1 lo, 127.0.0.1#123 Enabled (Network)
     Apr 23 14:42:19 Storage ntpd[1983]: Listening on interface #2 eth0, 192.168.10.143#123 Enabled (Network)
     Apr 23 14:42:19 Storage ifplugd(eth0)[1249]: Program executed successfully. (Network)
  24. I may have missed it while quickly reading through the thread, but does it hard reboot if you are copying data within the server itself? I.e. using Midnight Commander to move data from one drive to another? If I read your initial issue correctly, the reboot occurs when writing to user shares, as in over the network, correct? If so, the issue may be caused by the onboard nic sharing resources with the SATA controller - that is what was going on with my server (similar problem), and it was resolved by adding a dedicated nic. Edit - I notice you have a Realtek onboard nic; I did as well. If the SATA controller is being managed by that same chipset, it's possible they are using the same IRQ.
  25. Starting at the beginning, when you say 2 drives failed because the files were no longer accessible from the network - what was their status in Unraid? Based on what you've stated so far, it's possible the drives / data are salvageable, but you need to back up a few steps... replacing the drives should be done only after all attempts to fix them are exhausted. Put the system back the way it was originally, with the "failed" drives. Start it up, see what Unraid says, and post the syslog. If it sees the drives, then you'll get guidance on running SMART tests, as well as checking the file system on them.