fysmd

Members · 95 posts
Everything posted by fysmd

  1. All working again - smashing! Thank you for the help & confirmation!
  2. Thanks for such a prompt reply, maybe I'm just being too cautious, wanted to check before I mucked something up. I'll give it a whirl now..
  3. Hi all, Happy New Year!! A little help please: I had an HDD fail, so I replaced it with an already-prepared replacement HDD (I once had multiple fail and no spares to hand!). After powering down to replace that drive, my cache SSD died totally, not even spotted in the BIOS! I don't have a spare, so I'm running for a while without a cache drive. I have recreated the appdata share, restored it from ca.backup (took an eternity!) and rebooted, but no docker containers have reappeared after the reboot. My container config is still there:

     root@Server:/boot/config/plugins/dockerMan/templates-user# ls
     my-Cacti.xml             my-EmbyServer.xml  my-HDDTemp.xml   my-QDirStat.xml      my-matrix.xml     my-tasmobackup.xml
     my-CloudBerryBackup.xml  my-FileBot.xml     my-Influxdb.xml  my-duplicati.xml     my-nextcloud.xml  my-tautulli.xml
     my-DiskSpeed.xml         my-FileZilla.xml   my-Minio.xml     my-ferdi-server.xml  my-plex.xml       my-telegraf.xml
     my-Duplicacy.xml         my-Grafana.xml     my-Netdata.xml   my-glances.xml       my-plex22.xml

     Unraid Pro 6.8.3, diag attached. I could just recreate my containers manually, but shouldn't they have recovered after the steps I've taken? server-diagnostics-20210101-1652.zip
  4. Getting the same here on 6.8.3; network traffic looks like a (very) brief burst of traffic every 30 s or so.
  5. So I've been stable since changing to force vers=1.0. Is this a bug? As there's nothing logged, how do we report it (usefully!)?
  6. I have seen the same on an Ubuntu 18 host AND a 2nd Unraid server using Unassigned Devices to mount. Spent ages thinking it was a problem with the Unassigned Devices plugin... I notice that the web UI on the server machine runs VERY slowly once this issue begins. Nothing obvious in any logs anywhere. Switching to vers=1.0..
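For reference, forcing SMB 1.0 is done with the vers=1.0 mount option. A minimal sketch of the equivalent manual mount follows; the share path, mount point, and extra options are assumptions based on this thread, and Unassigned Devices normally builds this command for you from its per-share settings:

```shell
# Sketch only: force the CIFS protocol version to 1.0 on a manual mount.
# SHARE and MNT are example values from this thread, not verified paths.
SHARE="//SERVER/music"
MNT="/mnt/disks/SERVER_music"
OPTS="vers=1.0,rw,iocharset=utf8"

# Print the command rather than running it (mounting needs root and a live server):
echo mount -t cifs "$SHARE" "$MNT" -o "$OPTS"
```

With Unassigned Devices, the same effect should come from adding vers=1.0 to that share's mount options rather than mounting by hand.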
  7. OK, so Sunday update: the music share has gone offline again this morning. vh1-diagnostics-20200503-1035.zip I don't see anything obvious in syslog; what do you think?

     Server machine:

     eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
           ether bc:5f:f4:d2:db:d4  txqueuelen 1000  (Ethernet)
           RX packets 2477677922  bytes 3071357019049 (2.7 TiB)
           RX errors 5  dropped 533  overruns 0  frame 5
           TX packets 3155609402  bytes 4322561158105 (3.9 TiB)
           TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
           device interrupt 20  memory 0xf0700000-f0720000

     eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
           ether bc:5f:f4:d2:db:d4  txqueuelen 1000  (Ethernet)
           RX packets 130740224  bytes 17696208835 (16.4 GiB)
           RX errors 41312  dropped 100132  overruns 0  frame 41312
           TX packets 333590388  bytes 442915979439 (412.4 GiB)
           TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

     Client machine:

     docker@VH1:~$ ifconfig eth0
     eth0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
           ether d8:cb:8a:14:10:d9  txqueuelen 1000  (Ethernet)
           RX packets 315521459  bytes 408413770038 (380.3 GiB)
           RX errors 0  dropped 2992  overruns 0  frame 0
           TX packets 119869768  bytes 23267646022 (21.6 GiB)
           TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
           device interrupt 19

     docker@VH1:~$ ethtool -S eth0
     NIC statistics:
          rx_packets: 315619207
          rx_bcast_packets: 396272
          rx_mcast_packets: 588699
          rx_pause_packets: 0
          rx_ctrl_packets: 2995
          rx_fcs_errors: 0
          rx_length_errors: 0
          rx_bytes: 408481320459
          rx_runt_packets: 0
          rx_fragments: 0
          rx_64B_or_less_packets: 1757734
          rx_65B_to_127B_packets: 1000264
          rx_128B_to_255B_packets: 39357056
          rx_256B_to_511B_packets: 6591231
          rx_512B_to_1023B_packets: 4971266
          rx_1024B_to_1518B_packets: 261944652
          rx_1519B_to_mtu_packets: 0
          rx_oversize_packets: 0
          rx_rxf_ov_drop_packets: 0
          rx_rrd_ov_drop_packets: 0
          rx_align_errors: 0
          rx_bcast_bytes: 45617284
          rx_mcast_bytes: 184470451
          rx_address_errors: 0
          tx_packets: 119944390
          tx_bcast_packets: 1902
          tx_mcast_packets: 9153
          tx_pause_packets: 0
          tx_exc_defer_packets: 0
          tx_ctrl_packets: 0
          tx_defer_packets: 0
          tx_bytes: 23281052097
          tx_64B_or_less_packets: 67518
          tx_65B_to_127B_packets: 60869104
          tx_128B_to_255B_packets: 49266597
          tx_256B_to_511B_packets: 3280260
          tx_512B_to_1023B_packets: 154673
          tx_1024B_to_1518B_packets: 6306238
          tx_1519B_to_mtu_packets: 0
          tx_single_collision: 0
          tx_multiple_collisions: 0
          tx_late_collision: 0
          tx_abort_collision: 0
          tx_underrun: 0
          tx_trd_eop: 0
          tx_length_errors: 0
          tx_trunc_packets: 0
          tx_bcast_bytes: 161174
          tx_mcast_bytes: 1320031
          tx_update: 0
     docker@VH1:~$

     The drops look to be control packets on the client machine, but there are many more (numerically) on eth1 on the server. I'm going to disable the eth1 port on the server machine.. [guess] Also, as I rebooted the server, I upgraded it to 6.8.3 (same as the client machine).
  8. They auto-update, but I did it manually earlier. For info, how would containers impact (host) system mounts? The containers do map to these CIFS mounts (with the slave option). Syslog:

     May 2 12:51:41 VH1 kernel: CIFS VFS: Close unmatched open
     May 2 12:53:11 VH1 kernel: CIFS VFS: Close unmatched open
     May 2 12:58:14 VH1 kernel: CIFS VFS: Close unmatched open
     May 2 12:59:09 VH1 kernel: CIFS VFS: Autodisabling the use of server inode numbers on \\SERVER\downloads. This server doesn't seem to support them properly. Hardlinks will not be recognized on this mount. Consider mounting with the "noserverino" option to silence this message.
     May 2 12:59:10 VH1 kernel: CIFS VFS: Autodisabling the use of server inode numbers on \\SERVER\movies. This server doesn't seem to support them properly. Hardlinks will not be recognized on this mount. Consider mounting with the "noserverino" option to silence this message.
     May 2 12:59:14 VH1 kernel: CIFS VFS: Close unmatched open
     May 2 12:59:45 VH1 kernel: CIFS VFS: Close unmatched open
     May 2 13:01:46 VH1 kernel: CIFS VFS: Close unmatched open
     May 2 13:02:46 VH1 kernel: CIFS VFS: Close unmatched open
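The "Autodisabling the use of server inode numbers" kernel message above names its own workaround: the noserverino mount option, which makes the client generate its own inode numbers (at the cost of hardlink detection on that mount). A sketch, with share path and mount point assumed from this thread:

```shell
# Sketch: add "noserverino" as the kernel message suggests, so the CIFS
# client fabricates inode numbers instead of trusting the server's.
# SHARE and MNT are example values, not verified paths.
SHARE="//SERVER/downloads"
MNT="/mnt/disks/SERVER_downloads"
OPTS="vers=1.0,noserverino"

# Print the command rather than running it (mounting needs root and a live server):
echo mount -t cifs "$SHARE" "$MNT" -o "$OPTS"
```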
  9. All put back to 1500 last night as reported. One CIFS share is down again this morning in "Main" on the web UI; UD shows "music" as mounted, but with zero size, used and free. Linux 4.19.107-Unraid.

     docker@VH1:~$ df -h
     Filesystem       Size  Used  Avail  Use%  Mounted on
     <snip>
     //SERVER/TV       66T   45T    21T   69%  /mnt/disks/SERVER_TV
     //SERVER/music    66T   45T    21T   69%  /mnt/disks/SERVER_music
     //SERVER/movies   66T   45T    21T   69%  /mnt/disks/SERVER_movies
     docker@VH1:~$ ls /mnt/disks/SERVER_music
     /bin/ls: cannot access '/mnt/disks/SERVER_music': Stale file handle
     docker@VH1:~$

     And still lots of "unmatched open" in syslog. vh1-diagnostics-20200502-0915.zip
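Note that df still reports a stale mount as healthy, so an ls probe is the reliable check. A hedged sketch of detect-and-report, printing (not running) the usual recovery of a lazy unmount followed by a remount; the paths, share naming, and vers=1.0 option are assumptions from this thread, and UD normally handles the remount itself:

```shell
# Sketch: probe a CIFS mount point and print recovery commands if it is stale.
# Actually running umount/mount needs root; here we only echo them.
check_mount() {
    mnt="$1"
    if ls "$mnt" > /dev/null 2>&1; then
        echo "ok: $mnt"
    else
        echo "stale: $mnt"
        # derive the share name from the mount dir (hypothetical naming scheme)
        echo "would run: umount -l $mnt && mount -t cifs //SERVER/${mnt##*_} $mnt -o vers=1.0"
    fi
}

check_mount /tmp    # a readable directory reports ok
```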
  10. Yes, the server machine is on 9k (jumbo frames). I've not changed the client machine.
  11. So the machine ran overnight, and this morning I have a different CIFS share in the broken state. Could it be that they fail due to inactivity? I'm going to script an ls of a large directory every five mins to see if that prevents it. For info, I'm doing this because my cherished NUC has passed. I was running Ubuntu 14 LTS on there and had zero CIFS mount issues; they were rock solid. vh1-diagnostics-20200430-1059.zip
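In case the five-minute keep-alive idea helps anyone else, a sketch of what I mean; the mount paths are examples from this thread, and the schedule line is ordinary cron syntax:

```shell
# keepalive.sh, a sketch: list each CIFS mount so there is SMB traffic
# at least every five minutes. Paths below are examples, not verified.
keepalive() {
    for d in "$@"; do
        if ls "$d" > /dev/null 2>&1; then
            echo "alive: $d"
        else
            echo "dead: $d"   # a stale file handle lands here
        fi
    done
}

keepalive /mnt/disks/SERVER_music /mnt/disks/SERVER_movies

# crontab entry, every five minutes:
# */5 * * * * /boot/keepalive.sh
```

As a side effect, the "dead" branch doubles as an early warning that a share has gone stale before anything tries to use it.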
  12. LOL! I thought exactly what you say, and that's the same as in the help. In reality, though, it seems to fill the cache and then complain/fail. Wondering if I need to combine it with a different min-free value or something... or maybe the machine was in a funk at the time and just needed restarting. Yes, I'm a very long-time user; the "server" is my (now) very stable box, but I'm testing out another one to host some containers and virtual machines. It's on an eval license and uses old disks which, while working, were removed from my live array due to the SMART warnings.
  13. So a reboot fixed the time discrepancies (both the web UI and date on the CLI reported the correct local time, but syslog didn't), and the cache disk is no longer full ("prefer" cache on the share doesn't seem to do what I expected!). So far the CIFS mount is stable (and so my cache disk is no longer full). Syslog is full of:

     Apr 30 15:11:30 VH1 kernel: CIFS VFS: Close unmatched open
     Apr 30 15:13:00 VH1 kernel: CIFS VFS: Close unmatched open
     Apr 30 15:17:01 VH1 kernel: CIFS VFS: Close unmatched open
     Apr 30 15:17:01 VH1 kernel: CIFS VFS: Close unmatched open
     Apr 30 15:19:01 VH1 kernel: CIFS VFS: Close unmatched open
     Apr 30 15:20:02 VH1 kernel: CIFS VFS: Close unmatched open
     Apr 30 15:20:33 VH1 kernel: CIFS VFS: Close unmatched open

     Are they relevant?
  14. Sorry in advance if this was answered in this (LONG) thread, but I've searched and can't find the same issue: two Unraid servers running 6.8.2 [server] and 6.8.3 [client], both on physical boxes; diag attached for the client machine. I run UD on both, but I want to mount my media shares using CIFS with UD on the client server. Many work well and are stable, but my movies share is not. Sometimes it mounts and works for a period, but once it fails it still shows as mounted in the GUI and in the CLI; navigating into the share in the GUI takes a long time and shows it empty. In the CLI:

     root@VH1:~# ls /mnt/disks/SERVER_movies
     /bin/ls: cannot access '/mnt/disks/SERVER_movies': Stale file handle

     syslog reports a successful mount despite it not being mounted. I notice that the time shown in syslog must be in a different timezone; the machine reports the time correctly, but the logs are -7h. (I know my cache disk is full at the mo, BTW! And a couple of HDDs are reporting SMART errors. I'm on an eval license at the mo while I try to get this puppy working how I want.) vh1-diagnostics-20200430-1059.zip
  15. Unraid v6.7.0, diag attached and drive logs below. This drive is not very old, with only 6.3 weeks of runtime. Nothing logged (via telegraf) into Grafana from that drive seems to indicate it going wrong. Could this be a simple HDD failure and nothing to worry about?

     May 18 12:15:22 Server kernel: sd 7:0:4:0: [sdi] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
     May 18 12:15:22 Server kernel: sd 7:0:4:0: [sdi] 4096-byte physical blocks
     May 18 12:15:22 Server kernel: sd 7:0:4:0: [sdi] Write Protect is off
     May 18 12:15:22 Server kernel: sd 7:0:4:0: [sdi] Mode Sense: 9b 00 10 08
     May 18 12:15:22 Server kernel: sd 7:0:4:0: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
     May 18 12:15:22 Server kernel: sdi: sdi1
     May 18 12:15:22 Server kernel: sd 7:0:4:0: [sdi] Attached SCSI disk
     May 18 12:15:47 Server emhttpd: ST4000DM004-2CV104_ZFN1RHKA (sdi) 512 7814037168
     May 18 12:15:47 Server kernel: mdcmd (2): import 1 sdi 64 3907018532 0 ST4000DM004-2CV104_ZFN1RHKA
     May 18 12:15:47 Server kernel: md: import disk1: (sdi) ST4000DM004-2CV104_ZFN1RHKA size: 3907018532
     May 18 12:15:50 Server emhttpd: shcmd (27): /usr/local/sbin/set_ncq sdi 1
     May 18 12:15:50 Server root: set_ncq: setting sdi queue_depth to 1
     May 18 12:15:50 Server emhttpd: shcmd (28): echo 128 > /sys/block/sdi/queue/nr_requests
     May 21 03:14:25 Server kernel: sd 7:0:4:0: [sdi] Synchronizing SCSI cache
     May 21 03:14:25 Server kernel: sd 7:0:4:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
     May 21 03:14:25 Server kernel: sd 7:0:4:0: [sdi] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 6a 24 de 70 00 00 01 40 00 00
     May 21 03:14:25 Server kernel: print_req_error: I/O error, dev sdi, sector 1780801136
     May 21 03:14:25 Server kernel: sd 7:0:4:0: [sdi] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
     May 21 03:14:25 Server kernel: sd 7:0:4:0: [sdi] tag#1 CDB: opcode=0x88 88 00 00 00 00 00 6a 24 df b0 00 00 00 10 00 00
     May 21 03:14:25 Server kernel: print_req_error: I/O error, dev sdi, sector 1780801456
     May 21 03:14:25 Server kernel: sd 7:0:4:0: [sdi] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00

     server-diagnostics-20190527-0933.zip
  16. Hi all. Well, a week later, after a full rebuild and much frustration, I think I can confirm that the SATA power cable with capacitor was my issue. No issues reported since! Thanks for the help and advice, all!
  17. So, after more investigation I found the only component common to the misbehaving drives was one of these puppies: https://www.scan.co.uk/products/silverstone-cp06-1-to-4-sata-power-adaptor-cable-with-capacitor Removed it, and I'm now over three hours into the rebuild with no faults so far (everything's crossed).
  18. During the rebuild, two otherwise happy drives now report a >>LOT<< of read errors. I'm replacing the PSU which supplies all the drives affected so far. Is there anything lurking in the logs I don't see? server-diagnostics-20190224-0822.zip
  19. The drives are split over two 550W PSUs, but I will tot up how many/which are on each.
  20. Machine reboot rather than stop and restart the array? Damn, I'm in the middle of preclears.
  21. Just had another drive go, disk2 now; diag attached. server-diagnostics-20190223-1439.zip
  22. Sorry, that's not what I meant. I now have four drives which I removed from the array, likely cable-related. Is there any reason to suspect them faulty? Once back at full strength, I may run preclear on them a few times to test them, but is there anything else?
  23. Thanks, I noticed disk18 and have a replacement now - preclearing. While starting a preclear, disk2 just went offline!! It does seem that all these drives were on the same cable. Bit of rearrangement and that cable's out of play now. REALLY glad I added a two-drive parity now! Other than drive18 with errors, would I be right in thinking that they're likely OK?
  24. So I've just had my third drive in a fortnight go bad. Last week I took the precaution of adding a 2nd parity, but have had another go unmountable since then. Now, the failing drives have been old, but I'm wondering if there is something going on here which I'm not noticing. Posted a diag from today and one from a very poorly state last week. Any advice/comments much appreciated.. [off to buy another couple of drives!] server-diagnostics-20190223-0956.zip server-diagnostics-20190216-1045.zip