bmrowe

November 12, 2023

An update - I changed several things:

-I updated my DHCP server to hand out IPs only in the .200-.250 range, thinking that perhaps there was an IP address conflict despite having IPs reserved for most clients, including unraid.

-I have had issues with dockers talking to each other and had added a script to check whether the unraid shim was in place and, if not, add it. I disabled this script.

-I disabled netdata, cadvisor, and the prometheus exporter for adguard. I did this primarily because there was so much network chatter that it was hard to troubleshoot with wireshark.
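
For anyone wanting to sanity-check the DHCP change in the first item, here is a minimal Python sketch that tests whether any static address falls inside the new pool and whether any address is used twice. The pool bounds assume a 192.168.1.x network, the static addresses are the ones mentioned elsewhere in this thread, and the unraid address (.198) is an assumption taken from the `host:` field of the nginx log line. The function names are illustrative only.

```python
import ipaddress

# Assumption: 192.168.1.x LAN with a DHCP pool of .200-.250.
POOL_START = ipaddress.ip_address("192.168.1.200")
POOL_END = ipaddress.ip_address("192.168.1.250")

# Static/reserved addresses mentioned in this thread.
# The unraid entry is an assumption based on the nginx log's host field.
STATIC_HOSTS = {
    "router": "192.168.1.1",
    "adguard": "192.168.1.3",
    "unbound": "192.168.1.13",
    "pihole": "192.168.1.24",
    "unraid": "192.168.1.198",
}


def in_pool(addr, start=POOL_START, end=POOL_END):
    """Return True if addr falls inside the inclusive DHCP pool."""
    ip = ipaddress.ip_address(addr)
    return start <= ip <= end


def find_problems(hosts):
    """Return (overlaps, duplicates).

    overlaps: sorted names whose static address sits inside the DHCP pool.
    duplicates: {address: [names]} for addresses assigned more than once.
    """
    overlaps = sorted(name for name, addr in hosts.items() if in_pool(addr))
    seen = {}
    for name, addr in hosts.items():
        seen.setdefault(addr, []).append(name)
    duplicates = {addr: names for addr, names in seen.items() if len(names) > 1}
    return overlaps, duplicates


overlaps, duplicates = find_problems(STATIC_HOSTS)
print("static hosts inside DHCP pool:", overlaps or "none")
print("duplicate static addresses:", duplicates or "none")
```

This only checks the addresses that are written down. It cannot see a device that has quietly kept an old lease or a container that has been given an address by hand.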

None of these changes solved the problem. However, I noticed CPU use was pretty high and the top two processes were firefox-bin being launched from the /tmp/ directory. I had no clue what this process was, so I decided to restart.
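
In case it helps someone who sees the same thing, below is a minimal Python sketch for listing running processes whose executable lives under /tmp/, which is how the firefox-bin processes stood out. It assumes a Linux host with /proc mounted and enough permissions to read other processes' `exe` links. The helper names are made up for illustration.

```python
import os

DELETED_SUFFIX = " (deleted)"


def exe_under(exe_path, prefix="/tmp/"):
    """Return True if an executable path lives under prefix.

    The kernel appends ' (deleted)' to the link target when the binary
    has been removed from disk, so that suffix is stripped first.
    """
    if exe_path.endswith(DELETED_SUFFIX):
        exe_path = exe_path[: -len(DELETED_SUFFIX)]
    return exe_path.startswith(prefix)


def processes_from(prefix="/tmp/", proc_root="/proc"):
    """Return a list of (pid, exe_path) for processes running from prefix."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        # No /proc on this system (not Linux) or it is unreadable.
        return found
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            exe = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            # Kernel threads, exited processes, or insufficient permissions.
            continue
        if exe_under(exe, prefix):
            found.append((int(entry), exe))
    return found


for pid, exe in processes_from():
    print(pid, exe)
```

A binary running out of /tmp/ is not proof of anything on its own, but it is unusual enough to be worth identifying before rebooting, since the reboot clears the evidence.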

Since restarting, this issue has not recurred once. I'm not sure which change, if any, fixed this problem - but I wanted to share the latest.

November 10, 2023

2 hours ago, dlandon said:

Start by setting up a DNS server for your Unraid server. This is normally your router. You don't have a DNS server defined.

Thanks for chiming in. I might be missing a setting, but I have this configured in network:

I'll do a basic rundown of my network too:

192.168.1.1 is my router
192.168.1.3 is an adguard home docker running as DNS
192.168.1.24 is a pi hole docker running as secondary DNS
192.168.1.13 is an unbound docker running as upstream DNS for both adguard and the pi hole.

Here is the config in the router:

Finally, I will also add that when unraid is unreachable on one pc, it may still be reachable on another.

November 9, 2023

I have a super weird issue where my unraid server will frequently become inaccessible from any computer on the network. The web gui will not be accessible and pings will fail. All dockers are still accessible; only unraid itself is impacted. This is only resolved by either removing the entry in the arp table for the ip address of the unraid server, or turning wifi off and back on for a client. I have also verified that the router cannot ping unraid during these outages.
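
Since clearing the arp entry is what restores access, one thing that could be checked is whether the MAC address a client has cached for the unraid ip changes between a working period and an outage. Below is a minimal Python sketch of that comparison. It assumes a Linux client, where the table is readable at /proc/net/arp (on Windows or macOS the output of `arp -a` would need its own parser), and the function names are illustrative.

```python
def parse_arp(text):
    """Parse the contents of /proc/net/arp into {ip: mac}.

    Columns are: IP address, HW type, Flags, HW address, Mask, Device.
    """
    table = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) >= 6:
            table[fields[0]] = fields[3].lower()
    return table


def changed(before, after):
    """Return {ip: (old_mac, new_mac)} for IPs whose MAC differs between snapshots."""
    common = before.keys() & after.keys()
    return {ip: (before[ip], after[ip]) for ip in common if before[ip] != after[ip]}


def read_arp(path="/proc/net/arp"):
    """Read and parse the live ARP table (Linux only)."""
    with open(path) as f:
        return parse_arp(f.read())
```

Calling `read_arp()` once while unraid is reachable and again during an outage, then passing both results to `changed()`, would show whether the cached MAC for the server's ip has flipped. If it has, more than one interface is answering for that address. That would be a lead to follow up on, not a confirmed cause.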

I've made a few changes lately - adding nginx as an internal proxy to avoid memorizing ip/port combos, and sonarr/radarr/prowlarr/overseerr. I don't see how either would be impacting this.

Thoughts?

Attaching diagnostics.

unraid-diagnostics-20231107-1714.zip

November 8, 2023

I am having a weird issue with what had been a working Adguard Home install on unraid. The symptom is that computers using adguard home as their DNS provider will occasionally lose access to every app hosted on docker within unraid, including unraid itself, until I turn their wifi off and back on. While a device has no access, I can ping adguard home but get no response when I ping unraid. I also see some weird connectivity issues within unraid:

-I cannot access the console of adguard home docker from the web ui. I get this error:

Nov  8 11:45:29 unraid nginx: 2023/11/08 11:45:29 [error] 7285#7285: *696028 connect() to unix:/var/tmp/AdGuard-Home.sock failed (111: Connection refused) while connecting to upstream, client: 192.168.1.184, server: , request: "GET /logterminal/AdGuard-Home/ws HTTP/1.1", upstream: "http://unix:/var/tmp/AdGuard-Home.sock:/ws", host: "192.168.1.198"

Adguard-exporter will work for hours and then randomly:

2023/11/08 10:05:10 An error has occurred during login to AdguardGet "http://192.168.1.3:80/control/status": dial tcp 192.168.1.3:80: connect: no route to host

I have made a couple changes to the system recently. Primarily adding nginx as an internal reverse proxy (basically to give friendlier names to things that you had to memorize the port on). I also added radarr/sonarr/prowlarr/overseerr. Those all are working fine.

Edit: Some updates on testing - if I remove the entry for the unraid server ip from the arp table on the client, things go back to normal. I'm way beyond my depth with networking at this point - what could be causing this to be required?

Attaching diags as well. Appreciate the help in advance.

unraid-diagnostics-20231108-0952.zip

October 30, 2023

Reporting back. Nothing changed here. I was fixing up some VMs and thought to retry connecting. The media share showed up. Super weird. The only thing I can think of is that there were file not found errors being thrown because I lost a couple .iso files that were stored on the array. I replaced those files as part of fixing up a couple VMs. I don't see how that could have fixed it, but sharing for potential future searches.

October 30, 2023

7 hours ago, itimpi said:

I could not spot an obvious reason in the diagnostics. They definitely show that the media share has files on both cache pool and on disk1.

I would suggest running a check filesystem on disk1 as sometimes file system corruption has stopped shares showing up.

I can do that. That would be a weird occurrence for a brand new drive, right?

October 30, 2023

3 minutes ago, itimpi said:

You are likely to get better informed feedback if you attach your system’s diagnostics zip file to your next post in this thread.

Sure.

unraid-diagnostics-20231029-2147.zip

October 29, 2023

I replaced a hard drive and now only two of my three shared SMB folders (all three are user shares) are showing up. 'media' is the missing share. I see appdata and isos. I have done the following so far:

Double checked global share settings:

Double checked user share setting to make sure export is on for the media share:

Despite this, I only see appdata and isos but not media:

I thought maybe permissions related, but I'm a bit out of my depth there. Things seem (?) fine:
image.jpeg.17ad0f6a2a49ac9f34597091e0ea8424.jpeg

October 25, 2023

Following up on this. Replaced motherboard - same issue. However, I tried the new power supply and it worked! I'm super confused how a power supply could fail at the exact time I was replacing other parts. Has to somehow be connected, right? I thought maybe cable related and a cable got pinched while doing the install, but I used the old cables with the new power supply.

October 24, 2023

I've got a replacement motherboard coming tomorrow and I ordered an extra power supply to test. I can't imagine it's the CPU, but that would be the last possible thing.

I don't have a way to boot in legacy since this motherboard needs a graphics card to enable legacy boot and I just use the igpu.

October 24, 2023

I updated a working system to an ASUS TUF B760M D4 and an Intel 13500 (from an 11500 and a B560M motherboard). Things look fine in the BIOS, and I can leave the machine in the BIOS forever.

However, once I boot to unraid, the system restarts after exactly 15 seconds and does this over and over. I think I've eliminated most potential hardware issues - I've simplified the system down to just the motherboard and CPU. I've tried GUI mode and safe mode, and both restart in 15s. The memtest sends it back to the BIOS splash screen.

I tried a random distro of linux on a different USB. Same issue occurred once I booted into it. I also tried a fresh download of unraid, same problem.

Any ideas on things to try or what might be happening?

January 28, 2022

I don't think so. This is the manual: https://images-na.ssl-images-amazon.com/images/I/A1kRDk8X1iL.pdf It mentions: "USB port of your PC must support power-off function so that the device would go to sleeping mode. Setting up motherboard’s (power management ) in S3 is strongly recommended. For more details, please refer to user guide of motherboard BIOS setting."

I'm not sure if that is referencing the functionality it has called 'sync' where it sleeps when the pc sleeps. I have that turned off.

January 28, 2022

Or would an e-sata to usb cable get around this issue?

January 28, 2022

Yikes. My HP ProDesk 400 G6 doesn't have a SATA port. Can I achieve the same with a serial port? Options look like:

January 28, 2022

2 minutes ago, trurl said:

Looks like all of your disks disconnected. How are these attached?

It's a small form factor pc connected to a mediasonic 4 bay enclosure via usb-c.

January 27, 2022

I ordered a new 10TB WD drive (a shucked WD Elements external) and replaced my existing 8TB parity drive. I also moved that existing parity drive to become another data drive. However, building parity on the new hard drive has failed twice and not had a successful run yet. The first time, it made it to around 20% and the second time close to 95%. Both times, I've gotten warnings about read errors on the other drives and then the parity drive becomes disabled and I can't bring it back. The array is connected via USB-C and is a Mediasonic hf2-su31c.

The log for the drive right before failure shows:

Jan 27 15:42:13 Tower kernel: usb 2-4.1: Failed to set U1 timeout to 0x0,error code -19
Jan 27 15:42:13 Tower kernel: usb 2-4.1: Set SEL for device-initiated U1 failed.
Jan 27 15:42:13 Tower kernel: usb 2-4.1: Set SEL for device-initiated U2 failed.
Jan 27 15:42:13 Tower kernel: usb 2-4.1: usb_reset_and_verify_device Failed to disable LPM
Jan 27 15:42:13 Tower kernel: sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00 cmd_age=0s
Jan 27 15:42:13 Tower kernel: sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x8a 8a 00 00 00 00 04 50 f6 f3 48 00 00 08 00 00 00
Jan 27 15:42:13 Tower kernel: blk_update_request: I/O error, dev sdb, sector 18538230600 op 0x1:(WRITE) flags 0x4000 phys_seg 256 prio class 0
Jan 27 15:42:13 Tower kernel: md: disk0 write error, sector=18538230536
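
As a cross-check on the log above, the CDB bytes can be decoded by hand. Opcode 0x8a is SCSI WRITE(16), which carries the logical block address in bytes 2-9 and the transfer length in bytes 10-13, both big-endian. Below is a minimal Python sketch of that decoding. The function name is just for illustration.

```python
def decode_write16(cdb_hex):
    """Decode a SCSI WRITE(16) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks to transfer.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13
    return lba, blocks


lba, blocks = decode_write16("8a 00 00 00 00 04 50 f6 f3 48 00 00 08 00 00 00")
print(lba, blocks)  # 18538230600 2048
```

The decoded address matches the sector in the `blk_update_request` line, so the failed command was a 2048-block write at that location. The md layer reports sector 18538230536, which is exactly 64 lower, as would be expected if the partition starts 64 sectors into the disk. With the USB reset messages immediately before it, this looks like the link dropping mid-write, not a bad sector on the new drive, though the log alone does not prove that.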

After the first time this happened, I reseated the drive thinking that perhaps that was the issue. And while it did get further, it did not complete.

Diagnostics attached. Ideas on what to try next?

tower-diagnostics-20220127-1751.zip

January 17, 2022

5 hours ago, JorgeB said:

It might help, look also for a BIOS/firmware update.

Thanks. Did the BIOS update. Any tricks on firmware updates for an nvme m.2 drive? The tool provided by ADATA is a windows app. You don't really have to pull the drive and install it in a windows machine to update the firmware, do you?

January 16, 2022

I just added:

nvme_core.default_ps_max_latency_us=0

as is mentioned in a few posts and here: https://johnespiritu.dev/blog/random-nvme-crash-manjaro/. Not sure if that's the root cause but wanted to update the thread.
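
To confirm the option actually took effect after a reboot, something like the following Python sketch could be used. It checks the kernel command line for the parameter and then reads the value the nvme_core module is currently using. It assumes a Linux host with /proc and /sys mounted, and the helper names are illustrative.

```python
PARAM = "nvme_core.default_ps_max_latency_us"
SYSFS_PATH = "/sys/module/nvme_core/parameters/default_ps_max_latency_us"


def kernel_param(cmdline, name):
    """Return the value of name=value in a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def read_text(path):
    """Return the stripped contents of path, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


cmdline = read_text("/proc/cmdline")
if cmdline is not None:
    print("on command line:", kernel_param(cmdline, PARAM))
print("live module value:", read_text(SYSFS_PATH))
```

A value of 0 in both places would mean the setting is in effect, which should keep the drive out of its low-power states. Whether that is the root cause here is still an open question.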

January 16, 2022

Hey all,

I upgraded my nvme drive on a HP Prodesk 400 G6 mini pc to a 1TB XPG S70. Since doing so, the drive will be fine for a bit, but then starts cluttering the logs with errors and becomes unavailable. Upon stopping the array, the drive vanishes. A reboot "fixes" things until the errors start again. I've tried reseating the drive once, thinking maybe that was the problem. I've also done a BIOS update, hoping that would help. Thoughts?

Jan 16 13:24:42 Tower kernel: nvme nvme0: I/O 115 QID 2 timeout, aborting
Jan 16 13:24:50 Tower kernel: nvme nvme0: I/O 480 QID 1 timeout, aborting
Jan 16 13:24:50 Tower kernel: nvme nvme0: I/O 481 QID 1 timeout, aborting
Jan 16 13:24:50 Tower kernel: nvme nvme0: I/O 482 QID 1 timeout, aborting
Jan 16 13:25:12 Tower kernel: nvme nvme0: I/O 115 QID 2 timeout, reset controller
Jan 16 13:25:42 Tower kernel: nvme nvme0: I/O 2 QID 0 timeout, reset controller
Jan 16 13:27:05 Tower kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jan 16 13:27:05 Tower kernel: nvme nvme0: Abort status: 0x371
### [PREVIOUS LINE REPEATED 3 TIMES] ###
Jan 16 13:27:55 Tower kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jan 16 13:27:55 Tower kernel: nvme nvme0: Removing after probe failure status: -19
Jan 16 13:28:46 Tower kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 1001944288 op 0x1:(WRITE) flags 0x1000 phys_seg 4 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 1002150864 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 1001530688 op 0x1:(WRITE) flags 0x1000 phys_seg 4 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 1001196896 op 0x1:(WRITE) flags 0x1000 phys_seg 4 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 25776 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 1000205688 op 0x1:(WRITE) flags 0x1000 phys_seg 2 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 1000205666 op 0x1:(WRITE) flags 0x1000 phys_seg 1 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 500657648 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 500818448 op 0x1:(WRITE) flags 0x1000 phys_seg 4 prio class 0
Jan 16 13:28:46 Tower kernel: blk_update_request: I/O error, dev nvme0n1, sector 25768 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 1075631154, offset 1212416, sector 1002150872
Jan 16 13:28:46 Tower kernel: XFS (nvme0n1p1): metadata I/O error in "xfs_buf_ioend+0x12d/0x284 [xfs]" at daddr 0x3bb86ce0 len 32 error 5
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 537586352, offset 0, sector 500788960
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 537403597, offset 0, sector 500657656
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 537586346, offset 0, sector 500832176
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 537586348, offset 0, sector 500788960
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 1075631154, offset 1212416, sector 1002150864
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 1611570215, offset 0, sector 1501284352
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 1611570215, offset 16384, sector 1501284392
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 537586350, offset 0, sector 500786856
Jan 16 13:28:46 Tower kernel: nvme0n1p1: writeback error on inode 232, offset 761856, sector 25760
Jan 16 13:28:46 Tower kernel: XFS (nvme0n1p1): log I/O error -5
Jan 16 13:28:46 Tower kernel: XFS (nvme0n1p1): xfs_do_force_shutdown(0x2) called from line 1196 of file fs/xfs/xfs_log.c. Return address = 00000000b5b54af3
Jan 16 13:28:46 Tower kernel: XFS (nvme0n1p1): Log I/O Error Detected. Shutting down filesystem
Jan 16 13:28:46 Tower kernel: XFS (nvme0n1p1): Please unmount the filesystem and rectify the problem(s)
Jan 16 13:28:46 Tower kernel: XFS (nvme0n1p1): log I/O error -5
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Jan 16 13:28:46 Tower kernel: nvme nvme0: failed to set APST feature (-19)
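
The log above is long, so here is a minimal Python sketch that sorts kernel lines like these into a few failure stages and counts them, which makes the order of events (timeouts, failed resets, controller removal, then filesystem errors) easier to see. The category names and regular expressions are my own choices based on the messages in this post, not an official classification.

```python
import re
from collections import Counter

# Assumption: these patterns cover only the message types seen in this post.
PATTERNS = [
    ("io_timeout", re.compile(r"nvme nvme\d+: I/O \d+ QID \d+ timeout")),
    ("reset_failed", re.compile(r"Device not ready; aborting reset")),
    ("controller_removed", re.compile(r"Removing after probe failure")),
    ("block_io_error", re.compile(r"blk_update_request: I/O error")),
    ("xfs_shutdown", re.compile(r"xfs_do_force_shutdown")),
]


def summarize(lines):
    """Count how many lines fall into each failure category (first match wins)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS:
            if pattern.search(line):
                counts[label] += 1
                break
    return counts


# A few lines copied from the log above.
SAMPLE = [
    "Jan 16 13:24:42 Tower kernel: nvme nvme0: I/O 115 QID 2 timeout, aborting",
    "Jan 16 13:25:12 Tower kernel: nvme nvme0: I/O 115 QID 2 timeout, reset controller",
    "Jan 16 13:27:05 Tower kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1",
    "Jan 16 13:27:55 Tower kernel: nvme nvme0: Removing after probe failure status: -19",
]

for label, count in summarize(SAMPLE).items():
    print(label, count)
```

The same function accepts any iterable of lines, so an open syslog file can be passed to it directly.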

tower-diagnostics-20220116-1600.zip

December 25, 2021

So I did a new config and plugged the drive back in. I also checked that parity was already valid. Things appear to be working, but unsure whether I will run into this again. So happy to try any other recommendations to keep this thing in a healthy state.

December 25, 2021

3 hours ago, itimpi said:

It looks as if the serial number information being reported for that disk has changed. How is it connected?

It's plugged into a mediasonic array. That hasn't been touched or changed. Should I just try clearing and restoring from parity? It's a brand new shucked drive that has been in the array for maybe 3 weeks at this point.

December 25, 2021

Hey all,

New to the unraid ecosystem and loving it. I was cleaning up a series of empty directories and after doing so, I couldn't start any of the docker images (403 error). At the same time, I also had my Plex storage moving to cache accidentally (had set Cache to prefer), so I set Cache to 'yes' and ran the mover. That seemed to work just fine, but wanted to mention it as well.

I thought a reboot might solve it, but after reboot, one of the disks is showing up unassigned. When I try to assign it, it warns that the disk is 'wrong' and that assigning it will erase all content. Curious if that is my only path forward, or if the diagnostics tell you guys any easier steps.

Posts posted by bmrowe

Can't ping or access unraid unless I remove entry from arp table

[support] Siwat's Docker Repository

Replaced drive, now not all SMB shares showing up

Updated CPU and motherboard, now constant restarts

Upgraded parity drive now can't finish a parity rebuild

New nvme drive having issues. Diagnostics included

Disk unassigned after reboot
