PeteBa
-
Hi, my Ubiquiti UniFi router is complaining that it sees two MAC addresses referencing the same IP address. The MAC addresses belong to my Unraid server's eth0 and vhost0. Note that I am using macvlan rather than ipvlan for my Docker containers so that I can better see/manage them within the UniFi interface. Is there any way to have different IP addresses for eth0 and vhost0? At the moment I am getting a daily "nag" email from UniFi that it would be good to quiet. Thanks.

root@Server:~# ifconfig
...
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.10.10  netmask 255.255.255.0  broadcast 10.10.10.255
        ether 70:85:c2:49:49:72  txqueuelen 1000  (Ethernet)
        RX packets 1312049842  bytes 1715808516599 (1.5 TiB)
        RX errors 0  dropped 43501  overruns 4  frame 0
        TX packets 592446388  bytes 64135910321 (59.7 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device memory 0xdf200000-df21ffff
...
vhost0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.10.10  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 02:2f:25:90:90:5d  txqueuelen 500  (Ethernet)
        RX packets 42872774  bytes 19317802710 (17.9 GiB)
        RX errors 0  dropped 15410  overruns 0  frame 0
        TX packets 42247302  bytes 33452512600 (31.1 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
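For anyone checking the same thing, a quick one-liner can list interface/IPv4 pairs so duplicate addresses stand out. This is a sketch run against a trimmed sample of the output above; on the server you would pipe the real output (`ifconfig | awk ...`) instead:

```shell
# List interface name and inet address pairs from ifconfig-style output.
# "sample" stands in for live output purely so the snippet is self-contained.
sample='eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.10.10  netmask 255.255.255.0  broadcast 10.10.10.255
vhost0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.10.10  netmask 255.255.255.0  broadcast 0.0.0.0'
# Interface headers start in column one; inet lines are indented under them.
pairs=$(printf '%s\n' "$sample" | awk '/^[a-z]/ {iface=$1} /inet / {print iface, $2}')
printf '%s\n' "$pairs"
```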
-
@JorgeB, I haven't rebooted the server, so I'm hoping the attached diagnostics are useful? Many thanks. server-diagnostics-20231120-0951.zip
-
@itimpi Thanks for the response. I recall the CRC errors occurred many years ago due to a faulty SATA cable that was replaced at the time. However, the drive has just started showing a "yellow thumbs down" on the Dashboard page and I'm getting daily emails quoting read error messages as below. I took a closer look at the SMART log attached above and it suggests the only recent error was a UNC error. I'm not sure what that is or whether it is terminal. I have run a few extended SMART tests and they have all said PASSED, so this is a bit confusing 8(. Do you think I can just click acknowledge on the "thumbs down" icon and monitor? Thanks again.

Most recent error from the SMART log:

Error 90 [1] occurred at disk power-on lifetime: 21475 hours (894 days + 19 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 04 f8 00 00 30 05 f9 28 00 00  Error: UNC at LBA = 0x3005f928 = 805697832

Recent notification emails from Unraid:

Event: Unraid array errors
Subject: Warning [SERVER] - array has errors
Description: Array has 1 disk with read errors
Importance: warning
Disk 2 - HGST_HTS721010A9E630_JG40006PG3RD0C (sdd) (errors 159)

Event: Unraid Parity-Check
Subject: Notice [SERVER] - Parity-Check finished (1 errors)
Description: Duration: 3 hours, 10 minutes, 47 seconds. Average speed: 87.4 MB/s
Importance: warning

Event: Unraid Status
Subject: Notice [SERVER] - array health report [FAIL]
Description: Array has 6 disks (including parity & pools)
Importance: warning
Parity - HGST_HTS721010A9E630_JR1020BN0J90DE (sdg) - active 21 C [OK]
Disk 1 - HGST_HTS721010A9E630_JR1000D31PL5WE (sdf) - active 26 C [OK]
Disk 2 - HGST_HTS721010A9E630_JG40006PG3RD0C (sdd) - active 26 C (disk has read errors) [NOK]
Disk 3 - HGST_HTS721010A9E630_JG40006PG6PP7C (sde) - active 27 C [OK]
Disk 4 - HGST_HTS721010A9E630_JS1020620JGW6W (sdc) - active 26 C [OK]
Cache - Samsung_SSD_960_EVO_250GB_S3ESNX1JB30291E (nvme0n1) - active 35 C [OK]
Parity is valid
Last checked on Sun 12 Nov 2023 03:10:48 AM GMT (yesterday), finding 1 error.
Duration: 3 hours, 10 minutes, 47 seconds. Average speed: 87.4 MB/s
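As a side note, the error line reports the failing sector in both hex and decimal, and the two can be cross-checked in any shell (just a sanity check on the numbers, not an analysis of the error itself):

```shell
# Confirm that hex LBA 0x3005f928 from the SMART log is the same sector
# as the decimal value 805697832 printed alongside it.
lba=$(printf '%d\n' 0x3005f928)
echo "$lba"   # -> 805697832
```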
-
Hi, I have a small Unraid server made up of 5 fairly old 2.5" HGST 7K1000 1TB drives (4 x data and 1 x parity). I also have a 500GB NVMe cache drive. This has worked great for over 5 years now, mainly as a media server and backup storage. But one of the drives is reporting errors (SMART test attached), so I have been looking for a replacement. Unfortunately, my searching has struggled to find a modern equivalent drive for a reasonable price. I think it has to do with CMR vs SMR technology, and that CMR is no longer well supported at what is now a small capacity. I think the WD Red Plus drive is roughly equivalent but seems very expensive (£100) compared to the various SMR drives (£30-40) that come up in an Amazon search. So I guess I'm after some advice on a good 7K1000 replacement at a reasonable price. My thoughts at the moment include:

1) I don't need any additional storage capacity, so going for larger drives doesn't seem necessary;
2) having said that, if 1TB drives are going obsolete, do I bite the bullet and upgrade the whole system to two 4TB drives? That seems excessive for one failing drive;
3) ideally, I would replace the drive with a 1TB SSD at about the same price point, but I see very conflicting TRIM messages on the forums; and
4) maybe I'm unfairly concerned about SMR drives, and given I have a reasonably sized NVMe cache drive it won't make any material difference.

Appreciate any thoughts/recommendations. server-smart-20231118-1554.zip
-
@JorgeB, many thanks for the suggestion. I ran memtest a few times and got clean passes, so it doesn't seem to be a persistent memory problem at least. I restarted Unraid, the cache filesystem looks fine, and I can access the files that were previously erroring out, which is great. The experience does leave me a bit wary, though, and I wonder what could have caused the problem. Is there any other testing tool I can use, or any way to monitor this going forward? Thanks again for the help.
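One lightweight monitoring idea (assuming the cache pool is btrfs mounted at /mnt/cache, as the earlier syslog suggests) is to poll btrfs's per-device error counters from a cron job and alert when any counter goes nonzero. A sketch, using sample output in place of the live `btrfs device stats /mnt/cache` call so it is self-contained:

```shell
# Cron-able check on btrfs error counters. "stats" is sample output here;
# on the server you would use: stats="$(btrfs device stats /mnt/cache)"
stats='[/dev/nvme0n1p1].write_io_errs    0
[/dev/nvme0n1p1].read_io_errs     0
[/dev/nvme0n1p1].flush_io_errs    0
[/dev/nvme0n1p1].corruption_errs  0
[/dev/nvme0n1p1].generation_errs  0'
# Sum the last column; any nonzero total means errors have accumulated.
errs=$(printf '%s\n' "$stats" | awk '{ sum += $NF } END { print sum }')
if [ "$errs" -eq 0 ]; then
    echo "btrfs device counters clean"
else
    echo "btrfs device errors: $errs"   # e.g. pipe this line into a notification
fi
```

A periodic `btrfs scrub` on the pool complements this by actively verifying checksums rather than waiting for errors to surface during normal reads.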
-
Yesterday evening I updated to 6.12.0. Everything came back online fine and I went to bed. I woke up this morning to find that all my Docker containers had stopped, as well as the Docker engine. Looking through syslog I can see a ton of "BTRFS error" reports on my NVMe cache drive that holds /appdata and /system. The main array looks OK. I note that just prior to these errors was a scheduled TRIM task, but I'm not sure if that is related.

Sample of the syslog:

Jun 19 04:56:22 Server kernel: BTRFS critical (device nvme0n1p1): corrupt leaf: root=2 block=156609576960 slot=44, invalid key objectid, have 18446612688409594768 expect to be aligned to 4096
Jun 19 04:56:22 Server kernel: BTRFS info (device nvme0n1p1): leaf 156609576960 gen 15881817 total ptrs 105 free space 8093 owner 2
Jun 19 04:56:22 Server kernel: item 0 key (1956950016 168 45056) itemoff 16230 itemsize 53
Jun 19 04:56:22 Server kernel: extent refs 1 gen 13815472 flags 1
Jun 19 04:56:22 Server kernel: ref#0: extent data backref root 5 objectid 265178326 offset 24576 count 1
...
Jun 19 04:56:26 Server kernel: I/O error, dev loop2, sector 31924224 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
Jun 19 04:56:26 Server kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 36, rd 0, flush 0, corrupt 0, gen 0
Jun 19 04:56:26 Server kernel: loop: Write error at byte offset 20109533184, length 4096.
...
Jun 19 07:10:50 Server kernel: BTRFS error (device nvme0n1p1: state EA): parent transid verify failed on logical 156753166336 mirror 1 wanted 15881817 found 15881803
### [PREVIOUS LINE REPEATED 7 TIMES] ###
Jun 19 07:11:05 Server kernel: verify_parent_transid: 3 callbacks suppressed

I noticed that the timing of the syslog errors corresponded to my nightly rsync job that copies /mnt/user to a standby server. The rsync log from that job reports that most files copied across to the standby server, but around ~800 failed due to read errors, so I now have a mixed set of files on the standby as well.

I do have a full archive from a week ago, so I should be able to reconstruct a reasonable set of /appdata and /system files. Thankfully only around 50 of the 800 impacted files are what I would call important (e.g. influxdb data, Home Assistant storage, etc.) and should be recoverable from what I have. The rest are things like Plex metadata files and the like. I haven't yet rebooted the server or tried to repair the NVMe drive. I have, however, taken a copy of the remaining files from the cache drive onto a spare unassigned drive, so that is positive at least. I think my options are either:

1) try a Scrub/Repair of the cache drive from the Unraid GUI, or
2) reboot the machine and see what happens.

If both of those fail then I assume I need to reformat the drive and start the reconstruction process. Any advice on the best way forward? I have attached my diagnostic files as I would really like to know what caused this issue in the first place. Is it coincidental with the OS upgrade? Appreciate any insight. server-diagnostics-20230619-1635.zip
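For anyone triaging something similar, it can help to summarize which BTRFS message severities appear before deciding between a scrub and a reboot. A sketch using sample lines from above; on a live server you would grep the real syslog (e.g. `grep BTRFS /var/log/syslog`) instead:

```shell
# Count BTRFS messages by severity word (critical/error/info/...).
# "log" stands in for real syslog content so the snippet is self-contained.
log='Jun 19 04:56:22 Server kernel: BTRFS critical (device nvme0n1p1): corrupt leaf
Jun 19 04:56:26 Server kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 36
Jun 19 07:10:50 Server kernel: BTRFS error (device nvme0n1p1: state EA): parent transid verify failed'
summary=$(printf '%s\n' "$log" | grep -o 'BTRFS [a-z]*' | sort | uniq -c)
printf '%s\n' "$summary"
```

The distinction matters because "critical" corrupt-leaf messages point at on-disk metadata damage, while the loop2 write errors are a downstream symptom (the docker image file living on the damaged pool).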
-
I am trying to use one of those dashboard/home page apps (Organizr in this case) that uses iframes to provide "one-stop-shop" access to all of your self-hosted app front-ends. Unfortunately, the browser throws an exception when trying to view the Unraid GUI from within an iframe. Specifically, I'm getting the following error in the Chrome developer tools:

Refused to frame 'http://10.10.10.10:80/' because an ancestor violates the following Content Security Policy directive: "frame-ancestors 'self' https://connect.myunraid.net/".

What is strange is that the error still occurs even after removing the Connect plugin from Unraid, so I'm not sure why it is still referencing connect.myunraid.net? I'm on Unraid 6.12.0 if it matters. Any help in resolving this would be much appreciated. [Let me know if I should move this to another area of the forum.]
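For context, that directive arrives in the webGUI's Content-Security-Policy response header, and it whitelists the only origins the browser will allow to embed the page, which is why Organizr's iframe is refused. You can inspect the live header with something like `curl -sI http://10.10.10.10/ | grep -i content-security`; the sketch below just splits the directive quoted in the error to make the allowed ancestors explicit:

```shell
# The frame-ancestors directive from the error message: every token after
# the directive name is an origin permitted to frame the Unraid GUI.
csp="frame-ancestors 'self' https://connect.myunraid.net/"
allowed=$(printf '%s\n' "$csp" | cut -d' ' -f2-)
echo "$allowed"   # -> 'self' https://connect.myunraid.net/
```

So any fix has to change what the server sends in that header; nothing configured on the Organizr side can override it.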
-
Hi, I bought a replacement drive on eBay that was advertised as new. I got it today and the initial SMART data suggested a new drive, with 3 power cycles and an hour of powered-on time. I then ran an extended SMART test and it came back OK. But I do notice a very odd Reported_Uncorrect value of 9796820402176! Is this cause for concern on what should be a new device, or indicative of something else? BTW, is it possible to reset SMART data if someone wanted to give the appearance of a new drive? Attached SMART test output: server-smart-20211123-1749.zip
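One possible explanation for such an enormous number (an assumption about vendor-specific behaviour, not something documented for this drive) is that some firmwares pack multiple 16-bit counters into the 48-bit SMART raw field, so smartctl's plain decimal rendering looks absurd. Splitting 9796820402176 into 16-bit words shows the low 32 bits are actually zero:

```shell
# Split the 48-bit SMART raw value into three 16-bit words, high to low
# (assumes vendor packing of sub-counters; not confirmed for this model).
v=9796820402176
words=$(printf '%d %d %d\n' $(( (v >> 32) & 0xFFFF )) $(( (v >> 16) & 0xFFFF )) $(( v & 0xFFFF )))
echo "$words"   # -> 2281 0 0
```

If that reading is right, the count in the low word is 0, which would fit a genuinely new drive.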