Max

Members
  • Posts

    271

Everything posted by Max

  1. @JorgeB update: after switching to stick 2 it's running stable now, with no more BTRFS or input/output errors. After 5-6 days of stable running I added the last RAM stick and it's still stable. No BTRFS or I/O errors, except that I got segfault errors when I tried to play an episode of a TV series over Plex. So far it has only happened with that one particular file, so I just replaced the file.
     Jul 28 22:41:35 Unraid kernel: PMS GTP[10402]: segfault at 77 ip 00001500349a0e6c sp 000015002b925720 error 4 in Plex Media Server[1500347fd000+d08000] likely on CPU 7 (core 3, socket 0)
     Jul 28 22:41:57 Unraid kernel: PMS GTP[14390]: segfault at 77 ip 0000148ea43a0e6c sp 0000148e9d38c720 error 4 in Plex Media Server[148ea41fd000+d08000] likely on CPU 4 (core 0, socket 0)
     Jul 28 22:44:38 Unraid kernel: PMS GTP[15524]: segfault at 77 ip 000015190f7a0e6c sp 0000151908fd4720 error 4 in Plex Media Server[15190f5fd000+d08000] likely on CPU 2 (core 2, socket 0)
     Jul 28 22:45:00 Unraid kernel: PMS GTP[22766]: segfault at 77 ip 000014b912da0e6c sp 000014b908ce6720 error 4 in Plex Media Server[14b912bfd000+d08000] likely on CPU 5 (core 1, socket 0)
     So my thinking so far is that the 1st of the three sticks was causing the BTRFS corruption, and at some point that file may have become corrupt, which then caused these segfault errors when I tried to read it. But you are the expert here, I'm not..😅
  2. @JorgeB So with just one stick installed, I deleted the docker image and slowly started reinstalling dockers through the Community Applications plugin. It was going fine up until the 18th docker. The 19th docker failed with the following error:
     Error: failed to register layer: read /var/lib/docker/tmp/GetImageBlob3662119572: input/output error
     And syslog is showing these same BTRFS errors again:
     Jul 22 19:35:18 Unraid kernel: BTRFS warning (device loop2): csum failed root 5 ino 2846 off 250462208 csum 0x59363698 expected csum 0x508941a2 mirror 1
     Jul 22 19:35:18 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
     Jul 22 19:35:19 Unraid kernel: BTRFS warning (device loop2): csum failed root 5 ino 2846 off 250462208 csum 0x59363698 expected csum 0x508941a2 mirror 1
     Jul 22 19:35:19 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
     Jul 22 19:35:19 Unraid kernel: BTRFS warning (device loop2): csum failed root 5 ino 2846 off 250462208 csum 0x59363698 expected csum 0x508941a2 mirror 1
     Jul 22 19:35:19 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
     Jul 22 19:35:19 Unraid kernel: BTRFS warning (device loop2): csum failed root 5 ino 2846 off 250462208 csum 0x59363698 expected csum 0x508941a2 mirror 1
     Jul 22 19:35:19 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
     So I guess it's time to swap in another stick and start the whole process again😭
  3. unraid-diagnostics-20240722-1528.zip
  4. @JorgeB hey, I booted my system with just 1 stick to see how it goes, but now syslog is showing these errors:
     Jul 22 15:08:02 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 15:08:02 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 16, gen 0
     Jul 22 15:08:19 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 15:08:19 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 17, gen 0
     Jul 22 15:08:44 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 15:08:44 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 18, gen 0
     Jul 22 15:09:15 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 15:09:15 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 19, gen 0
     Any ideas which drive device loop2 is? Is it the cache?
  5. 😭 That's pretty much what I did last time around: I bought a new 8 GB stick, used it for a week, and it ran stable, so afterwards I replaced the older 2x8 GB sticks I had been using. So I guess a year later I'm back to square one then😭 But thanks, as always you have been a big help. I just hope it lasts until I can upgrade. (I have been thinking of upgrading my CPU ever since the Ryzen 5000 series came out, and Ryzen 9000 is almost out 😅.)
  6. I have tested my RAM with the Live mem tester plugin multiple times and it always comes out without any errors. Last time I had this kind of error, you suggested that RAM is generally the first thing to test in these scenarios, so I replaced all my RAM sticks, since these Kingston HyperX sticks come with a lifetime warranty...😅 Shortly after that I got an APC UPS for my server and an LSI 9207-8i, and also replaced my SMPS, and it had been running quite stably since. So I had thought maybe it was the RAM, even though memtest didn't find any errors last time either, but now the errors are back. Surely I can't be that unlucky when it comes to RAM😅
     And this morning the appdata backup failed again on tar verification of the Plex appdata, and I noticed that syslog is again showing similar errors:
     Jul 22 03:01:26 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 03:01:26 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 12, gen 0
     Jul 22 03:12:59 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 03:12:59 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 13, gen 0
     Jul 22 03:43:30 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 03:43:30 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 14, gen 0
     Jul 22 04:14:00 Unraid kernel: BTRFS warning (device loop2): csum failed root 26983 ino 3290 off 5459968 csum 0x851cd069 expected csum 0xa5ae22ba mirror 1
     Jul 22 04:14:00 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 15, gen 0
     Jul 22 11:19:27 Unraid nginx: 2024/07/22 11:19:27 [error] 19942#19942: *1010208 open() "/usr/local/emhttp/plugins/dynamix.file.manager/javascript/ace/mode-log.js" failed (2: No such file or directory) while sending to client, client: 10.0.0.23, server: , request: "GET /plugins/dynamix.file.manager/javascript/ace/mode-log.js HTTP/1.1", host: "10.0.0.3", referrer: "http://10.0.0.3/Shares/Browse?dir=%2Fmnt%2Fuser%2FBackup%2FAppdata%2Fab_20240722_021002-failed"
     Could this also be caused by the same hardware issue?
  7. Deleted that file, and the scrub ran without any errors this time:
     UUID: 8206daa0-8850-4ac1-8149-65d3d6d92f27
     Scrub started: Sun Jul 21 16:52:22 2024
     Status: finished
     Duration: 0:11:20
     Total to scrub: 216.22GiB
     Rate: 325.61MiB/s
     Error summary: no errors found
     But why does this corruption keep happening? I forgot to mention this earlier, but when I first noticed these errors a week ago, I was installing a docker container through Community Applications and the install failed with an input/output error. That's how I noticed them. Any thoughts on why these errors could be occurring?
  8. UUID: 8206daa0-8850-4ac1-8149-65d3d6d92f27
     Scrub started: Sun Jul 21 15:41:00 2024
     Status: finished
     Duration: 0:11:29
     Total to scrub: 239.14GiB
     Rate: 355.41MiB/s
     Error summary: csum=1
     Corrected: 0
     Uncorrectable: 1
     Unverified: 0
     ☝️ Scrub results
     Jul 21 15:41:00 Unraid kernel: BTRFS info (device sdd1): scrub: started on devid 1
     Jul 21 15:49:17 Unraid kernel: BTRFS warning (device sdd1): checksum error at logical 4283911495680 on dev /dev/sdd1, physical 248789721088, root 5, inode 46457306, offset 11439251456, length 4096, links 1 (path: Media/All Movies/Movies/X2 (2003)/X2 (2003) Bluray-2160p.mkv)
     Jul 21 15:49:17 Unraid kernel: BTRFS error (device sdd1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 33, gen 0
     Jul 21 15:49:17 Unraid kernel: BTRFS error (device sdd1): unable to fixup (regular) error at logical 4283911495680 on dev /dev/sdd1
     Jul 21 15:52:29 Unraid kernel: BTRFS info (device sdd1): scrub: finished on devid 1 with status: 0
     Same file that tdarr was using earlier.
  9. Hey guys, I noticed some corruption errors (BTRFS csum errors and input/output errors) in the server's syslog today. All were on the cache drive, and I think both times I was running tdarr. Please suggest how I can fix these errors. Could input/output errors cause corruption, or vice versa? Almost a week ago similar input/output errors occurred on the cache drive, and when I ran a scrub on it, it showed 2-3 Plex metadata files as corrupt, so I just deleted those files. At that time my APC UPS was also having some battery-related issues and my server did go through one or two unclean shutdowns, so I didn't think much of it, but the errors are back again. Could it be my cache drive going bad? unraid-diagnostics-20240721-1326.zip
  10. Hey everyone, I just noticed that my parity drive has come up with a SMART error. It's a UDMA CRC error, and so far there is only one. Should I be worried and replace the drive ASAP? The drive in question is an 8TB Seagate IronWolf and it's only 9 months old. unraid-diagnostics-20231204-2155.zip
  11. Thanks. Unraid's main page shows zero errors. Should I go ahead with the parity check that failed on Dec 2nd? (That was the scheduled parity check.)
  12. @JorgeB okay, something weird is happening again. First off, replacing the PSU somehow cleared the BIOS, which put it back on optimized defaults, meaning no more proper IOMMU group separation, but I thought I would figure that out later. Now the important stuff: when I checked last night it was going fine, I think almost 80 percent done without any errors. But this morning when I logged on to the webgui it showed a notification saying the data rebuild finished with 310912 errors !!! The disk has returned to normal operation though, the data is there and accessible, and the logs don't show any errors or warnings. Somehow all this also changed my Unraid server's name and messed up the time, so the timestamps in the logs are off. tower-diagnostics-20231204-0829.zip
  13. So far it looks like it was a bad SATA power splitter. I tried again after reseating the cables (I reseated them again because I realized that last time I had mistakenly reseated the cables of a completely different drive😅), and now the data rebuild started throwing errors on disk 2. Disks 2 and 3 are the ones connected through the SATA power splitter, so I finally decided to pull the plug on SATA power splitters and bought myself a Gigabyte P750GM, which comes with 8 SATA connectors. I don't know if it's just my bad luck or what, but my history with SATA cables and power splitters has been quite troublesome. I found myself reseating or replacing these cables every 2-3 months. About 6 or 7 months ago I invested in an LSI 9207-8i, and since then I haven't encountered a single issue related to SATA link speed or any other connectivity problem. Here's hoping this new power supply has a similar positive impact on my SATA power issues.
  14. Yeah, actually that's what I meant: it mainly talks about an ungraceful shutdown caused by too short a timeout. But honestly, as this wasn't my first unclean shutdown, I have manually tested it many times, for example by deliberately pulling out the cord to see whether the server stays on under UPS power and whether it shuts down properly according to the set rules. I have tried different outlets on the back of the UPS and changed the power cords as well, and still haven't figured out what the issue is.
     @JorgeB ahh, looks like we are back to square one. Disk3 is throwing read errors again, and it's disabled and currently being emulated. unraid.local-diagnostics-20231202-1923.zip
  15. Just gave it a read, but the thing is, it mainly talks about an ungraceful shutdown after a power failure, and there was no power failure.
  16. Phewww!!! The rebuild has finally finished, fortunately without any errors or weird notifications this time, and disk3 is back up and running in normal operation. I don't know what went wrong the first time. BTW, any guesses on what could have caused the unclean shutdown in the beginning?
  17. Okay, I have uninstalled the parity check tuning plugin and started rebuilding again. So far it looks good, and it started from zero percent this time. It's gonna take much longer this time 😅
  18. here's the latest log unraid.local-diagnostics-20231201-0025.zip
  19. Okay, once again I removed the drive from the pool and started the array normally. It now says disk3 is missing, but just like earlier, disk3's contents are being emulated and I can access any content that is supposed to be there. Can I try rebuilding again? I'm not too sure about the earlier rebuild process. I mean, how could it get through 71-72 percent of the rebuild in a couple of minutes??
  20. I used maintenance mode because the Unraid docs recommend it for better speeds. And earlier, when the array was in normal mode, the emulated disk was mountable and I was able to access any data that was supposed to be on disk 3.
  21. After rebuilding the data, I stopped the array (as it was in maintenance mode) and started it normally, and now disk 3 says "Unmountable: Unsupported or no file system". unraid.local-diagnostics-20231130-1056.zip
  22. Okay, so after reseating the cables I have started rebuilding the data, but I'm not sure whether everything is going fine here. Firstly, the moment I clicked sync the webgui became unresponsive for almost a minute, and when it came back the rebuild had paused on its own. Secondly, it's already 73-ish percent done. I'm not sure whether this is how it works or whether I'm doing something wrong here. unraid.local-diagnostics-20231129-2208.zip
  23. Hey guys, for some unknown reason my server detected an unclean shutdown this morning. It is connected to an APC UPS, so this should not have happened, and I don't know why it did. As usual after an unclean shutdown, the server started an automatic parity check. It was all running fine until just now, when disk 3 started throwing read and write errors. Disk 3 is now disabled and its contents are being emulated. Please suggest how I should proceed. unraid.local-diagnostics-20231129-2100.zip
  24. been using since 6.12. this was mentioned in the changelog of 6.12.
  25. unraid 6.12.3 breaks it again for me, was working on 6.12.2.
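
The syslog excerpts pasted in the posts above are easier to compare once they are summarised. Below is a minimal Python sketch, not an Unraid or btrfs tool: the regular expressions and the names `csum_failures`, `segfault_offsets` and `SAMPLE` are assumptions written against the exact lines quoted in the posts, and other kernel versions may format these messages differently. It counts BTRFS csum failures per (device, root, inode, offset) and computes where inside the binary each segfault happened (instruction pointer minus load base).

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in the posts (an assumption;
# other kernels may word these messages differently).
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<dev>\S+)\): csum failed root (?P<root>\d+) "
    r"ino (?P<ino>\d+) off (?P<off>\d+)"
)
SEGV_RE = re.compile(
    r"segfault at \S+ ip (?P<ip>[0-9a-f]+) .*? in "
    r"(?P<name>.+?)\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def csum_failures(lines):
    """Count csum failures per (device, root, inode, byte offset)."""
    counts = Counter()
    for line in lines:
        m = CSUM_RE.search(line)
        if m:
            key = (m["dev"], int(m["root"]), int(m["ino"]), int(m["off"]))
            counts[key] += 1
    return counts


def segfault_offsets(lines):
    """Return the set of (binary name, ip - load base) pairs."""
    offsets = set()
    for line in lines:
        m = SEGV_RE.search(line)
        if m:
            offsets.add((m["name"], int(m["ip"], 16) - int(m["base"], 16)))
    return offsets


# A few lines copied verbatim from the posts above.
SAMPLE = [
    "Jul 22 19:35:18 Unraid kernel: BTRFS warning (device loop2): csum failed root 5 ino 2846 off 250462208 csum 0x59363698 expected csum 0x508941a2 mirror 1",
    "Jul 22 19:35:18 Unraid kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0",
    "Jul 22 19:35:19 Unraid kernel: BTRFS warning (device loop2): csum failed root 5 ino 2846 off 250462208 csum 0x59363698 expected csum 0x508941a2 mirror 1",
    "Jul 28 22:41:35 Unraid kernel: PMS GTP[10402]: segfault at 77 ip 00001500349a0e6c sp 000015002b925720 error 4 in Plex Media Server[1500347fd000+d08000] likely on CPU 7 (core 3, socket 0)",
    "Jul 28 22:41:57 Unraid kernel: PMS GTP[14390]: segfault at 77 ip 0000148ea43a0e6c sp 0000148e9d38c720 error 4 in Plex Media Server[148ea41fd000+d08000] likely on CPU 4 (core 0, socket 0)",
]
```

Calling `csum_failures(SAMPLE)` groups the repeated warnings under a single key, which makes it obvious when every failure hits the same block of the same file. `segfault_offsets(SAMPLE)` collapses both Plex crashes to one offset, `0x1a3e6c`. A crash at the same spot in the code each time would be consistent with one particular input file triggering it, though that alone does not rule out a hardware cause.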