YourNightmar3 Posted December 7, 2022 (edited)

Today I noticed the LOG on the dashboard of my Unraid server shows 100%.

I used this command to find out which log file is taking up all this space:

    du -sm /var/log/*

And this is the result:

    0   /var/log/btmp
    1   /var/log/btmp.1
    0   /var/log/cron
    0   /var/log/debug
    1   /var/log/dmesg
    1   /var/log/docker.log
    0   /var/log/faillog
    1   /var/log/lastlog
    1   /var/log/libvirt
    0   /var/log/maillog
    0   /var/log/mcelog
    0   /var/log/messages
    0   /var/log/nfsd
    0   /var/log/nginx
    0   /var/log/packages
    1   /var/log/pkgtools
    0   /var/log/plugins
    0   /var/log/pwfail
    0   /var/log/removed_packages
    0   /var/log/removed_scripts
    0   /var/log/removed_uninstall_scripts
    0   /var/log/sa
    3   /var/log/samba
    1   /var/log/scan
    0   /var/log/scripts
    0   /var/log/secure
    0   /var/log/setup
    0   /var/log/spooler
    0   /var/log/swtpm
    4   /var/log/syslog
    14  /var/log/syslog.1
    108 /var/log/syslog.2
    0   /var/log/vfio-pci
    1   /var/log/wtmp

When I run cat syslog.2 I see a bunch of BTRFS errors. I have no idea what's going on. I don't know what's causing this or what I should do. Any support with this is much appreciated.

I have attached my diagnostics, which I checked also contain the logs. Thanks in advance.

tower-diagnostics-20221207-1923.zip

Edited December 7, 2022 by YourNightmar3
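The du listing above is easier to scan when it is sorted by size. A minimal sketch, assuming a shell with du, sort and head available; the function name largest_logs is made up for illustration and is not an Unraid command:

```shell
# Hypothetical helper: list the biggest entries in a log directory, largest first.
# Sizes are in MiB, matching the `du -sm` call used in the post.
largest_logs() {
    dir="${1:-/var/log}"     # directory to inspect (defaults to /var/log)
    count="${2:-5}"          # how many entries to print
    du -sm "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Usage on the server:
#   largest_logs /var/log 5
```

With the numbers from the post, syslog.2 (108 MiB) would come out on top, followed by syslog.1 and syslog.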
JorgeB Posted December 7, 2022 (Solution)

Problems with the cache2 device:

    Dec 5 04:40:02 Tower kernel: BTRFS warning (device sdb1): lost page write due to IO error on /dev/sdc1 (-5)
    Dec 5 04:40:02 Tower kernel: BTRFS error (device sdb1): error writing primary super block to device 2
    Dec 5 04:40:02 Tower kernel: BTRFS warning (device sdb1): lost page write due to IO error on /dev/sdc1 (-5)

It dropped offline, check/replace cables then run a scrub on the pool.
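To see how often each device reported failed writes, messages like the ones quoted above can be tallied straight from the syslog files. A rough sketch, assuming the exact message wording shown in this post; btrfs_io_errors is a hypothetical name:

```shell
# Hypothetical helper: count BTRFS "IO error on <device>" messages per device.
# Takes one or more syslog files and prints "<count> <device>", busiest device first.
btrfs_io_errors() {
    grep -h 'BTRFS.*IO error on' "$@" |
        sed 's/.*IO error on \([^ ]*\).*/\1/' |   # keep only the device path
        sort | uniq -c | sort -rn
}

# Usage on the server:
#   btrfs_io_errors /var/log/syslog.2
```

A single device dominating the count points at that drive, its cable or its port rather than at the pool as a whole.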
YourNightmar3 (Author) Posted December 7, 2022 (edited)

12 hours ago, JorgeB said:

    Problems with the cache2 device:

    Dec 5 04:40:02 Tower kernel: BTRFS warning (device sdb1): lost page write due to IO error on /dev/sdc1 (-5)
    Dec 5 04:40:02 Tower kernel: BTRFS error (device sdb1): error writing primary super block to device 2
    Dec 5 04:40:02 Tower kernel: BTRFS warning (device sdb1): lost page write due to IO error on /dev/sdc1 (-5)

    It dropped offline, check/replace cables then run a scrub on the pool.

Thanks a lot! Can this cause any serious data issues, or is cache 1 my saviour here? I'll check the cable ASAP. How do I perform a scrub on the pool?

Edited December 8, 2022 by YourNightmar3
ChatNoir Posted December 8, 2022

7 hours ago, YourNightmar3 said:

    I'll check the cable asap and how do i perform a scrub on the pool?

Click on 'cache', then scroll down to the scrub section.
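A scrub can usually also be started from a terminal. A sketch only, assuming the btrfs command-line tool is installed and the pool is mounted at /mnt/cache (the mount point that appears in the lsblk output further down this thread); scrub_pool is a hypothetical wrapper, not an Unraid command:

```shell
# Hypothetical wrapper: scrub a btrfs pool, then print its per-device error counters.
scrub_pool() {
    pool="${1:-/mnt/cache}"   # assumed mount point of the cache pool
    if ! command -v btrfs >/dev/null 2>&1; then
        echo "btrfs command not found" >&2
        return 127
    fi
    # -B keeps the scrub in the foreground and prints a summary when it finishes.
    btrfs scrub start -B "$pool"
    rc=$?
    # Counters such as write_io_errs and corruption_errs, listed per device.
    btrfs device stats "$pool"
    return "$rc"
}

# Usage on the server:
#   scrub_pool /mnt/cache
```

Non-zero counters for one device after the scrub match the picture of a single drive dropping out of the pool.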
YourNightmar3 (Author) Posted December 11, 2022 (edited)

OK, so I first unplugged and replugged the cables on that particular SSD, but when I booted up unRAID again it said the entire SSD was missing this time. I then swapped its power and SATA cables with those from the other SSD that is still working, to rule out a cable problem, and the same SSD still showed as missing.

I then plugged it into my Windows computer and formatted it to exFAT, because otherwise I couldn't do anything with it. I did a quick read/write test on it and got a normal 400 MB/s, then ran chkdsk on it from cmd with this result:

    The type of the file system is exFAT.
    Volume Serial Number is D4AA-EF58
    Windows is verifying files and folders...
    Volume label is New Volume.
    Windows is verifying file allocations...
    File and folder verification is complete.
    Windows is verifying free space...
    3815278 free clusters processed.
    Free space verification is complete.
    Bad sectors were found and tested while examining free space on the volume.
    Windows has made corrections to the file system.
    No further action is required.

    976712704 KB total disk space.
    256 KB in 1 files.
    512 KB in 2 indexes.
    1536 KB in bad sectors.
    768 KB in use by the system.
    976711168 KB available on disk.

    262144 bytes in each allocation unit.
    3815284 total allocation units on disk.
    3815272 allocation units available on disk.

It seems to be working fine on my Windows PC, but it does report bad sectors. I then plugged it back into my unRAID server to see if it would show up, but the server still cannot find the disk at all.

Is that normal? How come my Windows PC can see and use it seemingly fine (even though it may be dying) but unRAID doesn't see it at all? And does this mean the SSD is just dead?

Edited December 11, 2022 by YourNightmar3
JorgeB Posted December 12, 2022

Try a different port/controller; it would be strange that it works on Windows and not in Unraid.
YourNightmar3 (Author) Posted December 12, 2022 (edited)

13 hours ago, JorgeB said:

    Try a different port/controller, it would be strange that it works on Windows and not in Unraid.

Something I noticed when I type the "lsblk" command in the terminal is that (if I'm reading this right) I get three 1TB drives as a result:

    root@Tower:~# lsblk
    NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
    loop0      7:0    0 116.8M  1 loop /lib/firmware
    loop1      7:1    0  19.4M  1 loop /lib/modules
    loop2      7:2    0    80G  0 loop /var/lib/docker/btrfs
                                       /var/lib/docker
    loop3      7:3    0     1G  0 loop /etc/libvirt
    sda        8:0    1   7.5G  0 disk
    └─sda1     8:1    1   7.5G  0 part /boot
    sdb        8:16   0 931.5G  0 disk
    └─sdb1     8:17   0 931.5G  0 part
    sdc        8:32   0 931.5G  0 disk
    └─sdc1     8:33   0 931.5G  0 part /mnt/cache
    sdd        8:48   0 931.5G  0 disk
    └─sdd1     8:49   0 931.5G  0 part
    sde        8:64   0 465.8G  0 disk
    └─sde1     8:65   0 465.8G  0 part
    sdf        8:80   0   3.6T  0 disk
    └─sdf1     8:81   0   3.6T  0 part
    sdg        8:96   0   3.6T  0 disk
    └─sdg1     8:97   0   3.6T  0 part
    sdh        8:112  0   7.3T  0 disk
    └─sdh1     8:113  0   7.3T  0 part /mnt/disks/VDK2E7KK
    sdi        8:128  0   7.3T  0 disk
    └─sdi1     8:129  0   7.3T  0 part
    sdj        8:144  0   7.3T  0 disk
    └─sdj1     8:145  0   7.3T  0 part
    md1        9:1    0   7.3T  0 md   /mnt/disk1
    md2        9:2    0   3.6T  0 md   /mnt/disk2
    md3        9:3    0   3.6T  0 md   /mnt/disk3
    md4        9:4    0 931.5G  0 md   /mnt/disk4
    md5        9:5    0 465.8G  0 md   /mnt/disk5
    sr0       11:0    1  1024M  0 rom
    nvme0n1  259:0    0 931.5G  0 disk
    nvme1n1  259:1    0 931.5G  0 disk

Is that correct? I see sdb, sdc, and sdd as 1TB drives. sdd is a 1TB array drive, sdc is my working cache SSD that's left, and sdb doesn't show up anywhere on the unRAID main page. Could that be the other 1TB cache SSD? If so, why is it not showing up in the GUI? If not, then what is it?

I have attached my server's diagnostics (once again) in case this helps answer the question or solve the problem.

tower-diagnostics-20221213-0059.zip

Edited December 13, 2022 by YourNightmar3
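Picking out whole disks of one size from lsblk output can be scripted, and lsblk can also print model and serial columns to tie a name like sdb to a physical drive. A small sketch, assuming the default lsblk column order shown in the post (NAME, MAJ:MIN, RM, SIZE, RO, TYPE, MOUNTPOINTS); disks_of_size is a hypothetical name:

```shell
# Hypothetical helper: print the names of whole disks with a given size.
# Reads default-format `lsblk` output on stdin; partitions and md devices are skipped
# because their TYPE column is not "disk".
disks_of_size() {
    awk -v size="$1" '$6 == "disk" && $4 == size { print $1 }'
}

# Usage on the server:
#   lsblk | disks_of_size 931.5G
#   lsblk -d -o NAME,SIZE,MODEL,SERIAL   # model and serial identify the physical drive
```

Comparing the serial numbers from the second command with the ones on the Unraid Main page shows which device name belongs to which drive.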
JorgeB Posted December 13, 2022

sdb was a 1TB Sandisk device that dropped offline:

    Dec 12 20:59:19 Tower kernel: ata4.00: disable device
    Dec 12 20:59:19 Tower kernel: sd 5:0:0:0: [sdb] tag#24 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=90s
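Drop-offline events like this one can be searched for in the syslog. A minimal sketch, assuming the kernel message wording quoted above; dropped_devices is a hypothetical name:

```shell
# Hypothetical helper: print kernel log lines where an ATA device was disabled.
# Takes one or more syslog files as arguments.
dropped_devices() {
    grep -hE 'kernel: ata[0-9]+(\.[0-9]+)?: disable device' "$@"
}

# Usage on the server:
#   dropped_devices /var/log/syslog /var/log/syslog.1
```

The timestamps on the matching lines show when the drive disappeared, which helps when comparing against cable or port changes.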