Everything posted by Dmtalon

  1. Look back about 3 posts on how to roll back. Something in the latest release(s) removed the unpack portion of the settings. Also, for those that want to roll back: DO NOT copy/paste that tag into Repository, type it in. Copy/paste (at least for me) broke this docker. Clearly something extra was getting captured in the copy.
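To illustrate what typing the tag in (rather than pasting it) looks like, here is a hypothetical example; the repository name and version tag below are placeholders, not the actual ones for this container.

# Hypothetical Repository field value for pinning a container to an older release;
# both the author/name and the tag are placeholders.
#   someauthor/somecontainer:2.3.1
# The equivalent manual pull from the command line:
docker pull someauthor/somecontainer:2.3.1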
  2. I actually do for the HDDs, but I don't have an enclosure for my SSD to slide it into one of them, so it's inside the case.
  3. OK, this is why I hate opening my unRAID case. It appears that this was just cabling. I have again swapped a cable and double/triple checked that everything was fully seated, re-seated the HDDs, and booted up. Been up for about 40 minutes w/o any SATA errors/issues. Let's hope this continues. Thanks for the insight/help.
  4. Oh wait... I guess I might be confused here about what is what (looking at my own diagnostics I see my SSD is ATA10).
  5. I moved the SATA cable to the empty port on the ASMedia controller. Attached latest diagnostics: nas1-diagnostics-20181204-1052.zip
  6. Ack, I didn't even notice I had ATA9 in there. I was copying/pasting and managed to overlook that. The current connectors have fat heads and are pushing on the release spring of the bottom cable, so I swapped in two of the original ones for the bottom, replaced the SSD's cable with a third brand, and moved it to the ASMedia port that was open. I think I got a clean boot!! I also took this time to look up and see there was a newer BIOS for my MB, and updated that while I was at it. Why not? The highest-severity messages I got on this boot were yellow/orange warnings, no errors, and not related to SATA.

Dec 4 05:12:32 NAS1 kernel: ACPI: Early table checksum verification disabled
Dec 4 05:12:32 NAS1 kernel: ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20170728/tbfadt-658)
Dec 4 05:12:32 NAS1 kernel: acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
Dec 4 05:12:32 NAS1 kernel: floppy0: no floppy controllers found
Dec 4 05:12:32 NAS1 kernel: random: 7 urandom warning(s) missed due to ratelimiting
Dec 4 05:12:33 NAS1 rpc.statd[1659]: Failed to read /var/lib/nfs/state: Success
Dec 4 05:12:37 NAS1 avahi-daemon[2809]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!

AND... there it is again. While typing up this message, ATA1 (SSD) just barked again. <sigh>

Dec 4 10:18:53 NAS1 kernel: ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
Dec 4 10:18:53 NAS1 kernel: ata1.00: irq_stat 0x08000000, interface fatal error
Dec 4 10:18:53 NAS1 kernel: ata1: SError: { Handshk }
Dec 4 10:18:53 NAS1 kernel: ata1.00: failed command: WRITE DMA EXT
Dec 4 10:18:53 NAS1 kernel: ata1.00: cmd 35/00:40:d0:a6:37/00:01:04:00:00/e0 tag 16 dma 163840 out
Dec 4 10:18:53 NAS1 kernel: res 50/00:00:cf:a6:37/00:00:04:00:00/e0 Emask 0x10 (ATA bus error)
Dec 4 10:18:53 NAS1 kernel: ata1.00: status: { DRDY }
Dec 4 10:18:53 NAS1 kernel: ata1: hard resetting link
Dec 4 10:18:53 NAS1 kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 4 10:18:53 NAS1 kernel: ata1.00: configured for UDMA/133
Dec 4 10:18:53 NAS1 kernel: ata1: EH complete
  7. Attached! Thanks. nas1-diagnostics-20181203-1848.zip
  8. Last week I took advantage of the low price of a Samsung 860 1TB SSD so I could replace my quite old WD Black 1TB cache drive and an older 128GB Samsung SSD used as an apps drive (from SNAP days). The cutover to the new SSD went very smoothly and I was able to move my VMs/dockers off my app drive without any issue. My end result was removing two drives and replacing them with the new SSD. The problem is I keep getting errors at boot and in the log every so often. I *thought* it had something to do with NCQ (which is forced off), but it's still erroring hours later. Below, notice the SATA link up at 1.5 Gbps. Everything seems to work; I've used the Plex docker quite a bit and have a Windows VM running on this drive too. No user-noticeable issues. Also, I'm on SATA cable #3. The last time I opened the case, last night, I replaced all 7 of them with brand new Monoprice 18" cables. Any help would be greatly appreciated. Let me know if I should attach diagnostics.

Dec 3 15:19:36 NAS1 kernel: ata5.00: exception Emask 0x10 SAct 0x7fffefff SErr 0x0 action 0x6 frozen
Dec 3 15:19:36 NAS1 kernel: ata5.00: irq_stat 0x08000000, interface fatal error
~ <lots of entries>
Dec 3 15:19:36 NAS1 kernel: ata5.00: status: { DRDY }
Dec 3 15:19:36 NAS1 kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Dec 3 15:19:36 NAS1 kernel: ata5.00: cmd 61/10:e8:70:4d:1c/00:00:1d:00:00/40 tag 29 ncq dma 8192 out
Dec 3 15:19:36 NAS1 kernel: res 40/00:68:70:46:1c/00:00:1d:00:00/40 Emask 0x10 (ATA bus error)
Dec 3 15:19:36 NAS1 kernel: ata5.00: status: { DRDY }
Dec 3 15:19:36 NAS1 kernel: ata5.00: failed command: WRITE FPDMA QUEUED
Dec 3 15:19:36 NAS1 kernel: ata5.00: cmd 61/a8:f0:c0:4d:1c/00:00:1d:00:00/40 tag 30 ncq dma 86016 out
Dec 3 15:19:36 NAS1 kernel: res 40/00:68:70:46:1c/00:00:1d:00:00/40 Emask 0x10 (ATA bus error)
Dec 3 15:19:36 NAS1 kernel: ata5.00: status: { DRDY }
Dec 3 15:19:36 NAS1 kernel: ata5: hard resetting link
Dec 3 15:19:36 NAS1 kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 3 15:19:36 NAS1 kernel: ata5.00: supports DRM functions and may not be fully accessible
Dec 3 15:19:36 NAS1 kernel: ata5.00: supports DRM functions and may not be fully accessible
Dec 3 15:19:36 NAS1 kernel: ata5.00: configured for UDMA/133
Dec 3 15:19:36 NAS1 kernel: ata5: EH complete
Dec 3 15:19:36 NAS1 kernel: ata5.00: Enabling discard_zeroes_data

Here's one from earlier today (notice the 6.0 Gbps):

Dec 2 20:45:33 NAS1 kernel: ata9.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
Dec 2 20:45:33 NAS1 kernel: ata9.00: irq_stat 0x08000000, interface fatal error
Dec 2 20:45:33 NAS1 kernel: ata9: SError: { Handshk }
Dec 2 20:45:33 NAS1 kernel: ata9.00: failed command: WRITE DMA EXT
Dec 2 20:45:33 NAS1 kernel: ata9.00: cmd 35/00:40:b8:2e:89/00:05:14:00:00/e0 tag 10 dma 688128 out
Dec 2 20:45:33 NAS1 kernel: res 50/00:00:b7:2e:89/00:00:14:00:00/e0 Emask 0x10 (ATA bus error)
Dec 2 20:45:33 NAS1 kernel: ata9.00: status: { DRDY }
Dec 2 20:45:33 NAS1 kernel: ata9: hard resetting link
Dec 2 20:45:34 NAS1 kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 2 20:45:34 NAS1 kernel: ata9.00: configured for UDMA/133
Dec 2 20:45:34 NAS1 kernel: ata9: EH complete

No errors shown here, and the drive passes a SMART test. During boot-up things SEEM to come up OK, then ~23 seconds later it errors out, comes back up, errors out, comes up, gets limited to 3.0 Gbps, errors out.
At one point over night it made it like 4 hours w/o erroring (no activity I guess).

Dec 2 20:10:22 NAS1 kernel: EDAC MC0: Giving out device to module amd64_edac controller F15h: DEV 0000:00:18.3 (INTERRUPT)
Dec 2 20:10:22 NAS1 kernel: EDAC PCI0: Giving out device to module amd64_edac controller EDAC PCI controller: DEV 0000:00:18.2 (POLLED)
Dec 2 20:10:22 NAS1 kernel: AMD64 EDAC driver v3.5.0
Dec 2 20:10:22 NAS1 kernel: ata7: SATA link down (SStatus 0 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 2 20:10:22 NAS1 kernel: ata5.00: supports DRM functions and may not be fully accessible
Dec 2 20:10:22 NAS1 kernel: ata9.00: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4E0ELT95A, 82.00A82, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata9.00: 7814037168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata1.00: ATA-9: WDC WD20EZRX-00DC0B0, WD-WCC1T0586104, 80.00A80, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata1.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata9.00: configured for UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata5.00: ATA-11: Samsung SSD 860 EVO 1TB, S3Z8NB0KB64216A, RVT02B6Q, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata5.00: 1953525168 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata1.00: configured for UDMA/133
Dec 2 20:10:22 NAS1 kernel: scsi 1:0:0:0: Direct-Access ATA WDC WD20EZRX-00D 0A80 PQ: 0 ANSI: 5
Dec 2 20:10:22 NAS1 kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0
Dec 2 20:10:22 NAS1 kernel: sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 2 20:10:22 NAS1 kernel: sd 1:0:0:0: [sdb] 4096-byte physical blocks
Dec 2 20:10:22 NAS1 kernel: sd 1:0:0:0: [sdb] Write Protect is off
Dec 2 20:10:22 NAS1 kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Dec 2 20:10:22 NAS1 kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 20:10:22 NAS1 kernel: ata5.00: supports DRM functions and may not be fully accessible
Dec 2 20:10:22 NAS1 kernel: ata4.00: ATA-8: WDC WD20EARS-00MVWB0, WD-WMAZA3795145, 51.0AB51, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata4.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata3.00: ATA-8: WDC WD20EARS-00MVWB0, WD-WMAZA3812777, 51.0AB51, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata3.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata2.00: ATA-8: WDC WD20EARS-00MVWB0, WD-WMAZA3745610, 51.0AB51, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata2.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata5.00: configured for UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata6.00: ATA-8: WDC WD20EARX-00PASB0, WD-WCAZAC344236, 51.0AB51, max UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata6.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 20:10:22 NAS1 kernel: ata4.00: configured for UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata3.00: configured for UDMA/133
Dec 2 20:10:22 NAS1 kernel: ata2.00: configured for UDMA/133
Dec 2 20:10:22 NAS1 kernel: scsi 2:0:0:0: Direct-Access ATA WDC WD20EARS-00M AB51 PQ: 0 ANSI: 5
Dec 2 20:10:22 NAS1 kernel: sd 2:0:0:0: Attached scsi generic sg2 type 0
Dec 2 20:10:22 NAS1 kernel: sd 2:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 2 20:10:22 NAS1 kernel: sd 2:0:0:0: [sdc] Write Protect is off
Dec 2 20:10:22 NAS1 kernel: sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Dec 2 20:10:22 NAS1 kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 20:10:22 NAS1 kernel: scsi 3:0:0:0: Direct-Access ATA WDC WD20EARS-00M AB51 PQ: 0 ANSI: 5
Dec 2 20:10:22 NAS1 kernel: sd 3:0:0:0: Attached scsi generic sg3 type 0
Dec 2 20:10:22 NAS1 kernel: sd 3:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 2 20:10:22 NAS1 kernel: sd 3:0:0:0: [sdd] Write Protect is off
Dec 2 20:10:22 NAS1 kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Dec 2 20:10:22 NAS1 kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 20:10:22 NAS1 kernel: scsi 4:0:0:0: Direct-Access ATA WDC WD20EARS-00M AB51 PQ: 0 ANSI: 5
Dec 2 20:10:22 NAS1 kernel: sd 4:0:0:0: [sde] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 2 20:10:22 NAS1 kernel: sd 4:0:0:0: Attached scsi generic sg4 type 0
Dec 2 20:10:22 NAS1 kernel: sd 4:0:0:0: [sde] Write Protect is off
Dec 2 20:10:22 NAS1 kernel: sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
Dec 2 20:10:22 NAS1 kernel: sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 20:10:22 NAS1 kernel: scsi 5:0:0:0: Direct-Access ATA Samsung SSD 860 2B6Q PQ: 0 ANSI: 5
Dec 2 20:10:22 NAS1 kernel: ata5.00: Enabling discard_zeroes_data
Dec 2 20:10:22 NAS1 kernel: sd 5:0:0:0: [sdf] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Dec 2 20:10:22 NAS1 kernel: sd 5:0:0:0: [sdf] Write Protect is off
Dec 2 20:10:22 NAS1 kernel: sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
Dec 2 20:10:22 NAS1 kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 20:10:22 NAS1 kernel: sd 5:0:0:0: Attached scsi generic sg5 type 0
Dec 2 20:10:22 NAS1 kernel: ata5.00: Enabling discard_zeroes_data
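For reference, since the post above mentions NCQ being "forced off": a minimal sketch of the two usual ways to do that on a Linux/unRAID box, with the port number and device name below as placeholders rather than values confirmed for this system.

# Option 1 (boot-time): add a libata.force entry to the append line in
# /boot/syslinux/syslinux.cfg, e.g. to disable NCQ on ATA port 5:
#   append libata.force=5.00:noncq initrd=/bzroot
#
# Option 2 (runtime): drop the queue depth to 1 for the SSD
# (replace sdX with the actual device name):
echo 1 > /sys/block/sdX/device/queue_depth
cat /sys/block/sdX/device/queue_depth   # verify it now reports 1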
  9. Thanks, that is my opinion as well. After reseating the connectors, the parity sync did complete w/o errors. Fingers crossed it was a one-time issue.
  10. At about 9am this morning I got an email saying that my parity drive was down.

Event: unRAID Parity disk error
Subject: Alert [NAS1] - Parity disk in error state (disk dsbl)
Description: WDC_WD40EFRX-68WT0N0_WD-WCC4E0ELT95A (sdh)
Importance: alert

I'm home today, so I logged in to see a red X over parity. I downloaded diagnostics and tried looking at SMART / running smartctl from the command line, but no luck.

root@NAS1:/boot# smartctl -a -A -T permissive /dev/sdh > /boot/paritySMART.txt
root@NAS1:/boot# cat paritySMART.txt
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.9.30-unRAID] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
Short INQUIRY response, skip product id
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C
Read defect list: asked for grown list but didn't get it
Error Counter logging not supported
Device does not support Self Test logging

Not getting anything from SMART. I then shut the array down, walked down and opened/reseated the connectors to the parity drive (and all drives), looked around not seeing anything obviously wrong, and started it back up. The drive powered up and appears to be error free from a SMART perspective; smartctl runs and I can see attributes in the UI. Because of this, my opinion was to just unassign and reassign the drive and let it rebuild (though that didn't seem to be working at first), and that's what I'm doing now. But I'm curious if anyone has any opinion on what might have happened? Parity rebuild is just over 30% now and no errors detected. unRAID had been up 31 days, with 0 errors found on the 8/1 parity check. One thing I forgot to check was whether the drive was spinning/powered up. I am wondering if somehow it was not spinning, which is why smartctl wasn't working, and even trying through the UI (spin up drive) was failing. nas1-diagnostics-20180814-0906_preboot.zip nas1-diagnostics-20180814-0952_rebooted.zip
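On the "was it even spinning" question, a quick check looks roughly like this (a sketch; /dev/sdh matches the device name in the alert above, but confirm it before running anything):

# Report the drive's current power state: "active/idle" means spinning,
# "standby" means it is spun down.
hdparm -C /dev/sdh

# A small read will force a spun-down drive to spin back up.
dd if=/dev/sdh of=/dev/null bs=512 count=1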
  11. Should RSS be able to auto-download? I thought I had this setup working, but now it doesn't seem to. Torrents come in via RSS with a status of "RSS" and I have to manually click on them to load them.
  12. So, I upgraded to 6.5.0 a little while ago with no immediate issues (had to mess with some config to get VMs to start, but otherwise no big deal). I upgraded from 6.3.5. First, I have the Plex docker and two VMs (Windows 7 and Windows 10). Fast forward to issues: I started noticing that my VM which has uTorrent on it (and RSS feeds) was having issues, or at least that was the symptom. It seemed that downloads were not finishing, but uTorrent thought they were and was running WinRAR to extract, which was either failing or producing a choppy/corrupt file. I just noticed the missing files or corruption. If I did a *check* on the torrent it would find it was only 97% complete and finish it, and THEN the extraction would work. That was my first symptom, just issues with extracting torrents. To make a long story hopefully shorter, I started digging around and looking in logs, and saw some corruption in the btrfs file system. I actually posted over here: Over there I attached some screenshots, log files, etc. (diagnostic report) showing lots of corruption on the cache drive and, I'm guessing, my app drive too (cache is a 1TB WD Black, app is a 128GB Samsung SSD). So, push came to shove, and I decided to try going back to 6.3.5. I copied the entire backup I make before upgrading (a full zip of the flash drive), and put my cache and app drives back into their original slots (part of my troubleshooting in the other thread was pulling the app/cache drives and replacing them with a 512GB SSD to do both cache/app). And booted back up. From there I took the data off the app drive, copied it to the array, formatted it, and then copied over my one Windows VM and my Plex docker. Got everything back up and running last weekend. It's now been about a week and I'm not seeing any issues. During the process of troubleshooting the issues I was having on 6.5.0 I formatted that cache/app drive a couple times. I started fresh with dockers and even a fresh VM install of Windows 7. The install happened "ok" w/o errors, but then when installing SP1, corruption started appearing in the logs and ended up causing the VM manager to stop. I did run memtest86 for 4 passes (19 hours) with no errors, and I had reseated all the cables and RAM. So my only conclusion is there is an issue with maybe btrfs/images and 6.5.0 (at least with my hardware). Hopefully there is enough information in the other thread (attached files) to help diagnose this and/or maybe prove I'm wrong. I'd really like to get back on the latest unRAID, but not if it doesn't work for me. Thanks.
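For anyone unsure what the "full zip of the flash drive" backup looks like in practice, a minimal sketch; the destination path is a placeholder, and newer unRAID releases can also produce a flash backup from the webGUI.

# Archive the boot flash (mounted at /boot on unRAID) to a share on the array
# before upgrading. The destination path is just an example; if zip is not
# available on your build, tar -czf works the same way.
mkdir -p /mnt/user/backups
cd /boot && zip -r /mnt/user/backups/flash-backup-$(date +%Y%m%d).zip .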
  13. So, as an update: I let memtest run 4 passes (about 19 hours) and it found no errors. I've decided to try going back to my previous build of unRAID (6.3.5) (did a full restore of the zip I took before upgrading). Put my app drive and cache spinner back in, and rebooted. I copied off the data from the app drive, cleaned it up by reformatting it, then moved my VMs/dockers back on. Things seem back up, and no errors in the log (yet). I'm about 20% through a parity check now. Once that completes I'll try some load tests within the VM to see if the problem returns. I'm starting to wonder if it's a 6.5.0 issue, but time will tell I guess.
  14. So, I cleaned up the cache drive, formatted it clean and deleted the libvirt image and started fresh. Started installing windows again fresh. While trying to install a service pack I see this pop up in the logs. No indication there were any issues until just now. (no errors in log) I shut everything down, and reseated the ram, and am running memtest86 on it. We'll see what happens I guess. MB/Build is from 9/2014:

ASUS SABERTOOTH 990FX R2.0 AMD BOX
AMD FX-8320 BLACK ED
CRUCIAL 8GB D3 1333 ECC x3
XFX HD5450 1GB D3 DVH PCIE

Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): Metadata CRC error detected at xfs_buf_ioend+0x49/0x9c [xfs], xfs_bmbt block 0xdc5be0
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): Unmount and run xfs_repair
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): First 64 bytes of corrupted metadata buffer:
Apr 1 14:48:10 NAS1 kernel: ffff8805eaa44000: 42 4d 41 33 00 00 00 fb 00 00 00 00 02 04 96 d8 BMA3............
Apr 1 14:48:10 NAS1 kernel: ffff8805eaa44010: 00 00 00 00 02 04 85 2e 00 00 00 00 00 dc 5b e0 ..............[.
Apr 1 14:48:10 NAS1 kernel: ffff8805eaa44020: 00 00 00 01 00 03 40 4c 87 94 a5 6c 31 ae 4b 35 ......@L...l1.K5
Apr 1 14:48:10 NAS1 kernel: ffff8805eaa44030: 9c 4c 12 72 8e 1b 67 1c 00 00 00 00 00 00 00 65 .L.r..g........e
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): metadata I/O error: block 0xdc5be0 ("xfs_trans_read_buf_map") error 74 numblks 8
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): xfs_do_force_shutdown(0x1) called from line 315 of file fs/xfs/xfs_trans_buf.c. Return address = 0xffffffffa0254bea
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): I/O Error Detected. Shutting down filesystem
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): Please umount the filesystem and rectify the problem(s)
Apr 1 14:48:10 NAS1 kernel: XFS (sdh1): writeback error on sector 246627368
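The "Unmount and run xfs_repair" message above maps roughly to this procedure (a sketch only; sdh1 is taken from the log, so confirm the device, and on unRAID the filesystem check is easiest from Maintenance Mode):

# With the array stopped (or started in Maintenance Mode) so the filesystem
# is not mounted, do a dry run first to see what xfs_repair would change:
xfs_repair -n /dev/sdh1

# If the dry run looks sane, run the actual repair:
xfs_repair /dev/sdh1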
  15. Build is a few years old now and has been rock solid, until recently. I have 2 VMs (Windows 7 and Windows 10) and a Plex docker. The Windows 7 VM runs my "house audio" via a pass-through audio PCI card, and also runs uTorrent. The first sign of issues was extracted RAR files either being empty (extraction failed) or corrupt (or what I thought was corruption). I had 5 data drives, 1 parity, a cache drive, and an app drive. The cache drive is a fairly old 1TB WD Black, the app drive is a Samsung 128GB SSD. VMs/dockers live on the app drive, and downloads would download to the VM, then copy to a directory (through cache). I assumed my cache drive was dying even though there was no red ball, and decided to move into 2018, dump the separate cache/app drives, and put in a 500GB SSD as my cache/app drive. That all went fairly smoothly; however, one of my VMs didn't want to come up. I messed with it for a while, but eventually plugged the old app drive into another PC (booted an Ubuntu USB live session) and tried copying over the VM again. I did this and boom, the VM came up. Fast forward: uTorrent (or some process) was still having issues. Since I had a nice new big cache/app drive, I decided to install the ruTorrent docker. Got that up and running and added some existing recent torrents to it, pointing at their already-downloaded location. It was here I discovered that uTorrent didn't seem to be completing the downloads. ruTorrent was finding them at like 96-97% complete and then finishing them. So I assumed that this was just some kind of uTorrent issue, and moved on with life using the ruTorrent docker vs. uTorrent on my VM. It was around this same time I discovered Pi-hole and the Pi-hole docker, so I installed and got that working/set up too. So I'm feeling all good and things are working, etc. Yesterday my VMs/dockers crashed, and I had all kinds of I/O errors in my logs. So I SSH in and try to look in /mnt/cache, but no go, I/O errors. OK, let's shut down the array, go into maintenance mode, and do some checks. BUT, I can't get /mnt/cache unmounted. I let it sit, I tried lsof/fuser etc., but the kernel was the only process that seemed to have it locked, so I pulled the plug and rebooted. Everything came up, but a parity check wanted to run. I stopped it, stopped the array, and put it in maintenance mode; I wanted to do some checks on the cache drive. The check seemed to find some errors it couldn't fix (and Google didn't help me a whole lot, honestly): "Metadata CRC error detected at xfs_bmbt block"

Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
Metadata CRC error detected at xfs_bmbt block 0xec53a00/0x1000
Metadata CRC error detected at xfs_bmbt block 0xec53a00/0x1000
btree block 1/451648 is suspect, error -74
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 2
- agno = 0
- agno = 3
- agno = 1
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

After trying to run xfs_repair without -n and ultimately with -L, those errors stay, so I restarted the array and let it run a parity check through the night. It found 2.4 million errors, and when I just took the array offline (stopped, restarted in maintenance mode to get the errors above) it told me I had an unclean shutdown and needed another parity check. I'm attaching some screenshots and a diagnostics report from just now, after the parity check. Before I keep digging in this hole I'm in, I'm hoping someone can maybe help me climb out. Thanks, sorry for the rambling. nas1-diagnostics-20180401-1056.zip
  16. Thanks @trurl... Sorry for the trouble. My initial 'issue' matched the existing post, which put me in the other thread, and I was just trying to find out if my docker was the issue.
  17. I might have a couple things going on... <sigh> I have what looks like a dorked-up cache drive (XFS):

root@NAS1:~# xfs_repair -v /dev/sdi1
Phase 1 - find and verify superblock...
- block cache size set to 2277800 entries
Phase 2 - using internal log
- zero log...
zero_log: head block 182199 tail block 181661
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

And since this was a cache drive I didn't care enough to troubleshoot a LOT, and tried to just clear the log:

root@NAS1:~# xfs_repair -L /dev/sdi1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
Invalid block length (0x0) for buffer
Log inconsistent (didn't find previous header)
empty log check failed
fatal error -- failed to clear log

I guess next I'm just going to attempt to reformat it.
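For completeness, the route that ERROR message suggests before resorting to -L looks roughly like this (a sketch; /dev/sdi1 comes from the output above and the mount point is a placeholder):

# Mount the filesystem once so the XFS journal gets replayed,
# then unmount and re-run the repair without -L.
mkdir -p /tmp/xfscheck
mount -t xfs /dev/sdi1 /tmp/xfscheck
umount /tmp/xfscheck
xfs_repair -v /dev/sdi1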
  18. Sorry, I'm not the OP, so I kind of hijacked his post... probably a party foul.
  19. How do you know loop2 is the docker image? (Is this just the default?) I'm getting these errors for loop3:

Mar 24 16:27:16 NAS1 kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
Mar 24 16:27:16 NAS1 kernel: loop: Write error at byte offset 91684864, length 4096.
Mar 24 16:27:16 NAS1 kernel: print_req_error: I/O error, dev loop3, sector 179072
Mar 24 16:27:16 NAS1 kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
Mar 24 16:27:16 NAS1 kernel: BTRFS: error (device loop3) in btrfs_commit_transaction:2257: errno=-5 IO failure (Error while writing out transaction)
Mar 24 16:27:16 NAS1 kernel: BTRFS info (device loop3): forced readonly
Mar 24 16:27:16 NAS1 kernel: BTRFS warning (device loop3): Skipping commit of aborted transaction.
Mar 24 16:27:16 NAS1 kernel: BTRFS: error (device loop3) in cleanup_transaction:1877: errno=-5 IO failure
Mar 24 16:27:16 NAS1 kernel: BTRFS info (device loop3): delayed_refs has NO entry
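One way to answer the loop2-vs-loop3 question directly is to ask the kernel which file backs each loop device (a minimal sketch; the backing-file paths in real output will differ per system):

# List every loop device and the file backing it; the docker image and the
# libvirt image each show up as one of these entries.
losetup -a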
  20. Finally upgrading from 6.1.9 to 6.3.3, so I went through the procedures for my VMs; however, neither fs0 nor fs1 works. After posting this and reading/searching more, I learned I could type exit here, but that takes me to the BIOS and doesn't allow me to boot into Windows. This was a working Windows 7 install. Here is the VM page.
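If the install really is UEFI-bootable, poking around from the OVMF shell looks roughly like this (a sketch only; the fsN mapping and the bootloader path below are the usual defaults, not confirmed for this VM, and a Windows 7 image created under SeaBIOS often is not UEFI-bootable at all, in which case the VM needs to stay on SeaBIOS):

Shell> fs0:                              # switch to the first mapped filesystem
FS0:\> dir                               # look for an EFI directory
FS0:\> cd EFI\Microsoft\Boot             # standard Windows UEFI boot path
FS0:\EFI\Microsoft\Boot\> bootmgfw.efi   # launch the Windows boot manager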
  21. If you switch, your old image will still be there; it'll just start a new one. I don't remember having to redo any settings (are they saved in Plex's cloud?). I am a Plex Pass user though, so not sure if that is why. I no longer use needo's; I use this one: https://registry.hub.docker.com/u/linuxserver/plex/
  22. Just FYI, nowadays I use this Plex docker: https://registry.hub.docker.com/u/linuxserver/plex/ And you log in once it's up and running vs. in the configuration. Notice the highlighted part to pull the plexpass version. I also force my transcode directory onto my SSD app drive.
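For readers who can't see the highlighted screenshot, a rough command-line equivalent is sketched below; the host paths are placeholders and the VERSION value is from memory of the linuxserver/plex docs, so check the container's README before relying on it.

# Illustrative only -- the host paths are placeholders and VERSION=latest
# (Plex Pass builds) is an assumption taken from the linuxserver/plex docs.
docker run -d --name=plex --net=host \
  -e VERSION=latest \
  -v /mnt/cache/appdata/plex:/config \
  -v /mnt/user/Media:/media \
  -v /mnt/cache/appdata/plex-transcode:/transcode \
  linuxserver/plex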
  23. 07:00.0 PCI bridge: Creative Labs [SB X-Fi Xtreme Audio] CA0110-IBG PCIe to PCI Bridge
08:00.0 Audio device: Creative Labs [SB X-Fi Xtreme Audio] CA0110-IBG

08 is what I'm passing through to my Windows 7 VM. This is (I believe) a custom version of the SB X-Fi Xtreme card. It is a PCI card, and is for my whole-house audio (CasaTunes). It's a 6-channel audio card with an aux in and IR out.

/usr/local/sbin/vfio-bind 0000:08:00.0
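As a quick sanity check after the vfio-bind call (a sketch; the 08:00.0 address comes from the lspci output above), you can confirm which kernel driver now owns the device:

# Show the device; the "Kernel driver in use" line should read vfio-pci
# rather than the host's audio driver once the bind has taken effect.
lspci -nnk -s 08:00.0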
  24. Not sure how common this will be, but I had to copy my .xml's back to /etc/libvirt/qemu after the upgrade. I was a couple versions behind on 14b, but the upgrade didn't move my .xml's. I had them backed up (which I'd suggest you do first).
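A minimal sketch of that backup-and-restore, with a placeholder VM name and backup path rather than the exact commands used:

# Before upgrading: copy the VM definitions somewhere persistent (the flash drive).
mkdir -p /boot/vm-xml-backup
cp /etc/libvirt/qemu/*.xml /boot/vm-xml-backup/

# After upgrading: copy them back and re-register each VM with libvirt.
cp /boot/vm-xml-backup/*.xml /etc/libvirt/qemu/
virsh define /etc/libvirt/qemu/Windows7.xml   # repeat per VM; the name is a placeholder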