RobJ Posted August 13, 2015

Quote: "Two kernel panics within 24 hours of installing v6.1-rc2... Is there a way to revert to my previous 6.0.x stable? Cannot save logs as the server completely locks up... errors were all similar to something like btrfs_run_delayed. The VM file is stored on the cache drive, which is btrfs format... my array drives are ReiserFS."

I suspect you have BTRFS corruption, so I doubt the unRAID version matters. Try repairing it (Maintenance mode, click on the disk, go to Check filesystems).
hernandito Posted August 13, 2015

Thank you Rob... I stopped the array, clicked the Maintenance mode button in Main, and started the array. I clicked on the cache drive from Main, but I cannot figure out where to go to check filesystems... I have all the SMART test buttons, but none is "check filesystem". Balance and Scrub are greyed out in Maintenance mode. The hard drives in the cache pool are less than a month old, regular spinners. Thanks again, H.
RobJ Posted August 13, 2015

Hmmm... have I misled you? Normally, 'Check filesystems' ONLY appears in Maintenance mode. But I have no experience fixing BTRFS. I *think* you want that Scrub option, whatever way you can get to it.
limetech Posted August 13, 2015 (Author)

For btrfs, the file system check is called 'Scrub' and is available when the array is started in Normal mode, meaning Scrub only runs while the target device is mounted. This is unlike other file systems such as reiserfs and xfs, which require the target device to not be mounted; that is accomplished by starting the array in Maintenance mode.

tl;dr: for btrfs, the file system check ("scrub") is available after the array is Started; for xfs and reiserfs, the file system check is available after the array is started in Maintenance mode.
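Tom's mounted-vs-unmounted rule can be sketched as a small shell helper. This is hypothetical, not part of unRAID; the command forms are the standard btrfs-progs, reiserfsprogs, and xfsprogs check invocations, and the helper only prints what you would run rather than running it:

```shell
#!/bin/sh
# Sketch: map a filesystem type to its check command, following the rule
# above: btrfs is scrubbed while MOUNTED (array Started normally),
# reiserfs and xfs are checked while UNMOUNTED (array in Maintenance mode).
# fs_check_cmd <fstype> <target> -- prints the command, does not run it.
fs_check_cmd() {
    fstype=$1
    target=$2
    case "$fstype" in
        btrfs)
            # scrub runs against the mount point; -B stays in the foreground
            echo "btrfs scrub start -B $target"
            ;;
        reiserfs)
            # read-only check against the raw device
            echo "reiserfsck --check $target"
            ;;
        xfs)
            # -n = no-modify (check only); device must be unmounted
            echo "xfs_repair -n $target"
            ;;
        *)
            echo "unknown filesystem: $fstype" >&2
            return 1
            ;;
    esac
}

fs_check_cmd btrfs /mnt/cache
fs_check_cmd xfs /dev/md1
```

The `-B` flag on `btrfs scrub start` keeps it in the foreground so you see the summary when it finishes; without it the scrub runs in the background and you poll with `btrfs scrub status`.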
hernandito Posted August 13, 2015

Thank you guys... I ran a scrub and it found 2 UNCORRECTABLE errors... Googling how to fix them gives very complicated and unclear answers. Running a second scrub, but any ideas how to fix this? Thanks again.
BRiT Posted August 13, 2015

Nuke it and recreate seems to be the only uncomplicated way of fixing BTRFS issues. But if you do attempt the complicated and lengthy process of trying to fix btrfs, please let us know your experience and results.
hernandito Posted August 13, 2015 Share Posted August 13, 2015 On second scrub, it stops in the middle... here is the log: Aug 13 06:49:32 Tower kernel: __readpage_endio_check: 68 callbacks suppressed Aug 13 06:49:32 Tower kernel: BTRFS warning (device sdh1): csum failed ino 321081 off 2055118848 csum 3689655997 expected csum 2830925743 Aug 13 06:49:32 Tower kernel: BTRFS warning (device sdh1): csum failed ino 321081 off 2055122944 csum 1711389019 expected csum 3761795661 Aug 13 06:49:32 Tower kernel: BTRFS warning (device sdh1): csum failed ino 321081 off 2055118848 csum 3689655997 expected csum 2830925743 Aug 13 06:49:32 Tower kernel: BTRFS warning (device sdh1): csum failed ino 321081 off 2055122944 csum 1711389019 expected csum 3761795661 Aug 13 06:49:34 Tower kernel: BTRFS: read error corrected: ino 321081 off 2055118848 (dev /dev/sdh1 sector 315101136) Aug 13 06:49:34 Tower kernel: BTRFS: read error corrected: ino 321081 off 2055122944 (dev /dev/sdh1 sector 315101144) Aug 13 06:50:01 Tower sSMTP[12695]: Connection lost in middle of processing Aug 13 06:50:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:50:55 Tower kernel: BTRFS (device sdh1): parent transid verify failed on 539501920256 wanted 33363 found 9349 Aug 13 06:50:55 Tower kernel: BTRFS (device sdh1): parent transid verify failed on 539501920256 wanted 33363 found 9349 Aug 13 06:50:55 Tower kernel: BTRFS: checksum error at logical 626360905728 on dev /dev/sdj1, sector 308963960, root 5, inode 321081, offset 28978675712, length 4096, links 1 (path: vms/nZEDb/nZEDb/nzedb.img) Aug 13 06:50:55 Tower kernel: BTRFS: bdev /dev/sdj1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0 Aug 13 06:50:55 Tower kernel: BTRFS: unable to fixup (regular) error at logical 626360905728 on dev /dev/sdj1 Aug 13 06:51:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:51:16 Tower sSMTP[14307]: Connection lost in middle of 
processing Aug 13 06:52:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:52:14 Tower sSMTP[15538]: Connection lost in middle of processing Aug 13 06:53:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:53:10 Tower in.telnetd[30480]: connect from 192.168.0.190 (192.168.0.190) Aug 13 06:53:10 Tower login[30481]: ROOT LOGIN on '/dev/pts/1' from '192.168.0.190' Aug 13 06:53:15 Tower sSMTP[16865]: Connection lost in middle of processing Aug 13 06:53:16 Tower kernel: BTRFS (device sdh1): parent transid verify failed on 539501920256 wanted 33363 found 9349 Aug 13 06:53:16 Tower kernel: BTRFS (device sdh1): parent transid verify failed on 539501920256 wanted 33363 found 9349 Aug 13 06:53:16 Tower kernel: BTRFS: checksum error at logical 626360905728 on dev /dev/sdh1, sector 309002872, root 5, inode 321081, offset 28978675712, length 4096, links 1 (path: vms/nZEDb/nZEDb/nzedb.img) Aug 13 06:53:16 Tower kernel: BTRFS: bdev /dev/sdh1 errs: wr 91885372, rd 60386823, flush 793487, corrupt 2605287, gen 24522 Aug 13 06:53:16 Tower kernel: BTRFS: unable to fixup (regular) error at logical 626360905728 on dev /dev/sdh1 Aug 13 06:54:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:54:15 Tower sSMTP[18173]: Connection lost in middle of processing Aug 13 06:55:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:55:16 Tower sSMTP[19525]: Connection lost in middle of processing Aug 13 06:56:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:56:14 Tower emhttp: /usr/local/emhttp/webGui/scripts/tail_log syslog 2>&1 Aug 13 06:56:16 Tower sSMTP[20836]: Connection lost in middle of processing Aug 13 06:56:59 Tower kernel: usb 4-1.2: new low-speed USB device number 8 using uhci_hcd Aug 13 06:56:59 Tower kernel: input: Microsoft Microsoft Basic 
Optical Mouse v2.0 as /devices/pci0000:00/0000:00:1a.1/usb4/4-1/4-1.2/4-1.2:1.0/0003:045E:00CB.0007/input/input10 Aug 13 06:56:59 Tower kernel: hid-generic 0003:045E:00CB.0007: input,hidraw3: USB HID v1.11 Mouse [Microsoft Microsoft Basic Optical Mouse v2.0 ] on usb-0000:00:1a.1-1.2/input0 Aug 13 06:57:01 Tower crond[1560]: exit status 127 from user root /usr/local/sbin/monitor &> /dev/null Aug 13 06:57:02 Tower kernel: usb 4-1.2: USB disconnect, device number 8 Aug 13 06:57:15 Tower sSMTP[22154]: Connection lost in middle of processing The scrub screen reads: scrub status for f2580c11-36f2-4582-8d65-340a7c1a13d9 scrub started at Thu Aug 13 06:37:17 2015 and was aborted after 00:15:59 total bytes scrubbed: 268.25GiB with 2 errors error details: csum=2 corrected errors: 0, uncorrectable errors: 2, unverified errors: 0 Quote Link to comment
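As an aside, the counts in a summary like that can be pulled out mechanically, e.g. to alert on uncorrectable errors from a cron script. A minimal sketch, with the summary text hard-coded below as sample data:

```shell
#!/bin/sh
# Sketch: extract the corrected/uncorrectable error counts from a
# "btrfs scrub status" summary like the one posted above.
# The summary here is a hard-coded sample for illustration.
scrub_summary='scrub status for f2580c11-36f2-4582-8d65-340a7c1a13d9
        scrub started at Thu Aug 13 06:37:17 2015 and was aborted after 00:15:59
        total bytes scrubbed: 268.25GiB with 2 errors
        error details: csum=2
        corrected errors: 0, uncorrectable errors: 2, unverified errors: 0'

# Split on the label text, then take everything up to the next comma.
corrected=$(printf '%s\n' "$scrub_summary" |
    awk -F'corrected errors: ' '/corrected errors/ {split($2, a, ","); print a[1]}')
uncorrectable=$(printf '%s\n' "$scrub_summary" |
    awk -F'uncorrectable errors: ' '/uncorrectable/ {split($2, a, ","); print a[1]}')

echo "corrected: $corrected"
echo "uncorrectable: $uncorrectable"
if [ "$uncorrectable" -gt 0 ]; then
    echo "scrub could not repair all blocks; restore affected files from backup"
fi
```

Uncorrectable means both mirror copies of the block fail their checksum, which is why a scrub alone cannot fix it; the affected file (here the nzedb.img named in the log) has to come from a backup.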
hernandito Posted August 13, 2015

How do I nuke it? I will attempt a backup first.

update: I nuked it by stopping the array and un-assigning the second cache drive from the cache pool. It then prompted me to re-format. I guess it had to do with the RAID striping. Then I re-assigned the second drive and now I am clean. Copying my backup back to cache... long process. Thanks everyone!
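For anyone repeating this, the backup/restore halves of "nuke and recreate" boil down to a plain recursive copy. A minimal sketch (the backup_cache helper and the paths are illustrative, and the re-format itself is done from the unRAID GUI as described above):

```shell
#!/bin/sh
# Sketch: copy the cache pool's contents to an array disk before
# re-formatting, then copy them back afterwards.
backup_cache() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    if command -v rsync >/dev/null 2>&1; then
        # -a preserves permissions, times, and symlinks; the trailing
        # slash copies the directory's contents, not the directory itself
        rsync -a "$src"/ "$dest"/
    else
        # fallback for systems without rsync
        cp -a "$src"/. "$dest"/
    fi
}

# Typical invocation on an unRAID box (paths illustrative):
#   backup_cache /mnt/cache /mnt/disk1/cache_backup
# ...re-format the pool from the GUI, then restore:
#   backup_cache /mnt/disk1/cache_backup /mnt/cache
```

Stop Docker containers and VMs first so nothing is writing to the pool mid-copy; an image file copied while in use will be inconsistent.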
Bungy Posted August 13, 2015

Quote: "Upgraded to rc2... went to the docker page and it did not load. Received this error on my console redirect: 'unregister_netdevice: waiting for lo to become free'. Any ideas?"

I'm having this same issue, although I haven't noticed any negative effects from it. However, I may not be looking closely enough. I'm upgrading to RC3 and hoping it solves the problem.
RobJ Posted August 13, 2015

Thanks for reporting your experience; it's helpful. I've updated the Check Disk Filesystems wiki page based on your experience and Tom's comments above. Thank you both for that. In addition, I've finally added a Rebuilding the ReiserFS superblock section. I'm quite sure others could have done better!
jonp Posted August 13, 2015

I had this and reported it up to Tom as well, but never saw any negative effects either. There are plenty of semi-random things that may appear in a log, but if you don't notice anything to indicate there is a problem, it's not something I would even recommend submitting a defect report against. We need something to break, and break consistently, for us to recreate the issue so we can fix it. Nothing breaking? Nothing getting fixed.
Squid Posted August 13, 2015

The problem with this one is that the message shows up on any and all open sessions, including PuTTY. Very annoying when you're in the middle of nano. That being said, it was very intermittent, and since RC3 I haven't noticed it at all.
hernandito Posted August 14, 2015

I have gotten it with RC3... As I said, it's intermittent; there was an instance while I was in PuTTY when it kept popping up about 4 times in a row, freezing my PuTTY session for about 15 seconds...
limetech Posted August 14, 2015 (Author)

Those seeing this: are you accessing any shares via NFS?
Squid Posted August 14, 2015

Never have, never will.
Bungy Posted August 14, 2015

Me either. Samba all the way.
PeterB Posted August 14, 2015

I haven't been aware of this issue, and I use NFS (almost) exclusively.
hernandito Posted August 14, 2015

I have NFS enabled... but I am not sure if I have anything actually accessing via NFS. I may have OpenELEC using Zeroconf... or Samba.