KRiSX

Members

  • Posts: 21

About KRiSX

  • Birthday: June 21
  • Gender: Male
  • Location: Sydney, Australia


KRiSX's Achievements

  • Rank: Noob (1/14)
  • Reputation: 1
  • Community Answers: 2

Recent Posts

  1. Updated my backup server this morning from RC8, all good! Bit more nervous about my main server, but I'll give it a go later in the week. It's on 6.9.2 (I keep it on stable) and running an Intel 11th gen w/ Plex, so I want to be sure there are no issues there before I go for it. Thanks for the release!
  2. Drives are recognised by serial number, so as long as the drives are reported the same way it should be fine.
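     (If it helps anyone confirm this, the serials the OS sees are easy to list from the console - a minimal sketch, with device names as placeholders:)

        # List drives with the serial numbers the OS reports; unRAID keys
        # array slots off these identifiers, not the /dev/sdX names
        lsblk -o NAME,MODEL,SERIAL,SIZE
        ls -l /dev/disk/by-id/ | grep -v part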
  3. I have since replaced my Adaptec controller with an LSI 9207-8i, and so far so good. I had to move all my data and format all my disks again, and I've received absolutely no errors so far, including on the 10tb that gave me issues. I really feel this was a case of my Adaptec not properly handling 10tb disks, but they are on the support list, so who knows... either way I'm in a so-far-so-good situation and will continue to monitor. The only weird thing I'm seeing now is the Fix Common Problems plugin saying write cache is disabled on all my disks connected to this controller, but speeds seem fine to me (150-250MB/s). I had 6 drives running flat out doing moves this past week and was hitting 550-650MB/s across them, so I think it's a false report or it simply doesn't matter.
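     (For anyone wanting to verify the write-cache state rather than take the plugin's word for it, something like this should work from the console - /dev/sdX is a placeholder, and sdparm may need installing separately:)

        hdparm -W /dev/sdX          # SATA view: "write-caching = 1 (on)" means enabled
        sdparm --get=WCE /dev/sdX   # SCSI view through an HBA: WCE 1 = cache on
        hdparm -W1 /dev/sdX         # enable the (volatile) write cache if it really is off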
  4. So I went ahead and stopped the array and restarted it to see if that would change anything - it didn't. After finding some other threads referencing some of the errors I'm seeing, I tried sgdisk -v, which led me to running sgdisk -e. Neither seemed to do a whole lot. I fired up storman (the Adaptec docker) to check the status from there and noticed the drive was "Ready" instead of JBOD like it should have been. Rather than adjusting this, as I was pretty confident I would destroy the data doing so, I connected the drive directly to my motherboard instead. Tested whether the drive was mountable, and it is. Tried including it in my array, but it was unmountable and wanted to format, so I removed it again and am now moving the data off it via Unassigned Devices. I've managed to move off a couple hundred gigs without issue at this point, so I still don't know for sure what the issue is, but at least data loss appears to be minimal (if any). After I get the data off I'll remove the drive and test it elsewhere.
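     (For anyone following along, this is roughly what those two sgdisk calls do - /dev/sdX standing in for the affected disk; -e only relocates the backup GPT structures, it doesn't touch the data partition:)

        sgdisk -v /dev/sdX   # verify the GPT; flags a backup header that isn't at the disk's end
        sgdisk -e /dev/sdX   # move the backup header/table to the actual end of the disk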
  5. Hi all, I woke up this morning to 46 read errors on a brand new 10tb Ironwolf drive that successfully passed a full preclear. I'm not running parity, but I used preclear on the drive to give it a good test/workout before putting data on it. Due to some shuffling of data and drives I'm doing using unBALANCE, this particular drive is currently 98% full, and I now seem to be getting errors with it. My first suspicion is a controller issue, as it wouldn't be the first time I've had a bad time with my Adaptec and unRAID, and I do have an LSI controller on the way, but right now I want to work out whether I've got a faulty drive or whether it's my controller as I suspect. Diagnostics attached. Highlights are:

        kernel: blk_update_request: I/O error, dev sdt, sector 6497176416 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
        kernel: md: disk16 read error, sector=6497176352
        kernel: XFS (md16): metadata I/O error in "xfs_da_read_buf+0x9e/0xfe [xfs]" at daddr 0x183430b20 len 8 error 5
        kernel: sd 1:1:27:0: [sdt] tag#520 access beyond end of device
        kernel: blk_update_request: I/O error, dev sdt, sector 10754744120 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
        kernel: md: disk16 read error, sector=10754744056
        kernel: XFS (md16): metadata I/O error in "xfs_da_read_buf+0x9e/0xfe [xfs]" at daddr 0x281085ef8 len 8 error 5
        kernel: sd 1:1:27:0: [sdt] tag#568 access beyond end of device
        kernel: blk_update_request: I/O error, dev sdt, sector 19327352984 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
        kernel: md: disk16 read error, sector=19327352920
        kernel: md: disk16 read error, sector=19327352928
        kernel: md: disk16 read error, sector=19327352936
        kernel: md: disk16 read error, sector=19327352944
        kernel: XFS (md16): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x480000058 len 32 error 5

     I pulled down the diagnostics initially after I'd triggered an extended SMART test. I then noticed the SMART test was stopped by the host ("host reset", it says) and more errors appeared, so I re-downloaded the logs. New items are:

        rc.diskinfo[11028]: SIGHUP received, forcing refresh of disks info.
        kernel: sd 1:1:27:0: [sdt] tag#678 access beyond end of device
        kernel: print_req_error: 4 callbacks suppressed
        kernel: blk_update_request: I/O error, dev sdt, sector 11491091352 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
        kernel: md: disk16 read error, sector=11491091288
        kernel: md: disk16 read error, sector=11491091296
        kernel: md: disk16 read error, sector=11491091304
        kernel: md: disk16 read error, sector=11491091312
        kernel: XFS: metadata IO error: 21 callbacks suppressed
        kernel: XFS (md16): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x2acec2358 len 32 error 5
        (the line above repeats six times)
        kernel: sd 1:1:27:0: [sdt] 19532873728 512-byte logical blocks: (10.0 TB/9.10 TiB)
        kernel: sdt: detected capacity change from 0 to 10000831348736
        kernel: GPT:Primary header thinks Alt. header is not at the end of the disk.
        kernel: GPT:19524464639 != 19532873727
        kernel: GPT:Alternate GPT header not at the end of the disk.
        kernel: GPT:19524464639 != 19532873727
        kernel: GPT: Use GNU Parted to correct GPT errors.
        kernel: sdt: sdt1

     I'm going to try triggering an extended test again now. Hopefully I'm right and I'll have these issues solved by replacing the controller, but I need to work it out for sure if possible. Thanks.

     UPDATE: Second extended test failed, same message as above ("kernel: GPT:Primary header thinks Alt. header is not at the end of the disk."). I still strongly suspect it's a controller issue, but would love some feedback.

     newbehemoth-diagnostics-20220307-0811.zip
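     (The GPT mismatch above - 19524464639 != 19532873727 - is the interesting bit: the backup GPT header sits about 8.4 million sectors short of the disk's end, consistent with the partition table having been written while the Adaptec presented a truncated capacity. A hedged way to cross-check from the console, using this system's sdt:)

        blockdev --getsz /dev/sdt   # capacity in 512-byte sectors as the kernel sees it now;
                                    # the log above says a healthy drive reports 19532873728
        smartctl -t long /dev/sdt   # re-run the extended self-test
        smartctl -a /dev/sdt        # results; repeated "host reset" aborts point at the
                                    # link/controller rather than the platters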
  6. Found this thread as I'm trying to decide on buying an LSI HBA to replace my Adaptec 6805, but I've got 12 x ST8000VN004's (and another 5 x 6tb Ironwolf drives)... so I'm starting to think I'll just stay put. Everything has been pretty solid for me except the other day, when my controller had a fit and killed my parity (it happened while I was extracting a bunch of zip files and copying large amounts of data to the array). I'm currently running without parity until I decide what I'm doing (I ran without for years on DrivePool, so I'm not bothered ultimately). That said, this seems to only really be a problem for people that spin down drives? I've generally got all my drives spinning 24/7, so perhaps it won't be an issue?
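     (On the spin-down angle: the workaround usually discussed for these Ironwolfs on LSI HBAs is disabling EPC and low-current spinup with Seagate's SeaChest utilities. A sketch only - the /dev/sg5 handle is a placeholder, and the flag spellings should be checked against each tool's --help before running anything:)

        SeaChest_PowerControl -s                                     # scan for drive handles
        SeaChest_PowerControl -d /dev/sg5 --EPCfeature disable       # turn off Extended Power Conditions
        SeaChest_Configure -d /dev/sg5 --lowCurrentSpinup disable    # turn off low-current spinup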
  7. Yeah, I believe I can get an HP H220, but I'll have to flash it myself. Shouldn't be too hard.
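     (For reference, the H220 is an LSI SAS2308 board, so the flash is the usual sas2flash routine. A rough sketch assuming the stock 9207-8i IT firmware package - the .bin/.rom filenames come from that package, and some HP cards need extra recovery steps, so follow a proper guide:)

        sas2flash -listall                          # confirm the card and current firmware
        sas2flash -o -e 6                           # erase flash; do NOT power-cycle until reflashed
        sas2flash -o -f 2118it.bin -b mptsas2.rom   # write IT-mode firmware and boot ROM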
  8. After my adventure yesterday with my Adaptec having a heart attack, I think I'm going to swap it out for an LSI, and I'm aiming for a 9207-8i... however, googling now has me paranoid about fakes - any opinions on these listings?
     https://www.ebay.com.au/itm/175133386946?epid=1385797553&hash=item28c6c368c2:g:pEwAAOSwPP1h9NgJ&frcectupt=true
     https://www.ebay.com.au/itm/185266212380?epid=25025684119&hash=item2b22ba0e1c:g:Y6gAAOSwLHFh6-tx&frcectupt=true
     https://www.ebay.com.au/itm/143372493429?hash=item2161aaa275:g:1TAAAOSwAUZhb8B2&frcectupt=true
     Either that or I play it safe and go for something like the SilverStone ECS04, which I believe I'd then have to flash, but at least I'd know it's legit... didn't realise this was a problem in the world, but there seems to be plenty of talk about it!
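     (Whatever the listing, one sanity check once a card arrives: genuine boards report a sensible board name, SAS address, and firmware via sas2flash, while many fakes show odd board assemblies or refuse stock LSI firmware. Not a guarantee either way, just a first filter:)

        sas2flash -list    # board name, SAS address, firmware/BIOS versions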
  9. Yeah, that seems to be the go-to from what I've seen on here. I've had this Adaptec for years and it's served me well... I guess if it keeps giving me trouble I can look into an LSI. Thanks for the replies.
  10. Is there anything I can do? Would it be worth adjusting those timeouts, or is that not relevant these days?
  11. Hey all, bit freaked out right now. I have been fast approaching finishing my build and transferring everything across from my old setup when I've just hit a whole boatload of errors on both parity drives and 2 of the drives that had data copying to them. I've stopped doing any transfers and don't want to touch anything until I know what to do next. Logs attached. Essentially I'm seeing a whole heap of "md: diskX write error" messages. I am now also seeing a lot of "Failing async write on buffer block" messages. For the record, all of the drives that now have errors showing have been in service for years without issue. I didn't preclear them; I just let unRAID do its clear and then formatted them, as it seems that is acceptable with known-good drives. Hopefully this isn't the end of the world and I can simply resolve it with a parity rebuild or something along those lines.
      UPDATE #1: The array appears to not be allowing any writes at this point either. Going to stop all my docker containers; have kicked off extended tests on the 2 x 6tb's that have kicked up errors. Have done a short test on the 2 x 8tb parity drives and they are showing as OK - maybe I need to do an extended test, though?
      UPDATE #2: I've stopped the array as it was non-stop generating the same lines in the logs ("Failing async write on buffer block") for 10 different blocks. When stopping the array I also noticed "XFS (md13): I/O Error Detected. Shutting down filesystem" and "XFS (md13): Please unmount the filesystem and rectify the problem(s)" - so perhaps disk 13 really isn't as good as I thought?
      UPDATE #3: Restarted the array to see what would happen. The array started, appears to be writable now, and no errors are being produced in the logs - parity is offline. Going to keep everything else (docker) shut down until the SMART tests are complete on the 2 x 6tb's, unless someone advises me otherwise.
      UPDATE #4: Looking at the logs a bit harder, it seems my controller (Adaptec 6805) had a bit of a meltdown, which is why I think the errors occurred. I've since restarted the server, which has cleared all the errors, but parity is still disabled. I'm going to continue running without parity until after the extended SMART tests finish on the 2 x 6tb's, and at this point may just keep it disabled until I've finished moving data across anyway. I also ran xfs checks on each disk to be sure they were all healthy. Not sure there is much else to do apart from waiting for the scans to finish and then rebuilding parity. Would still appreciate any feedback anyone may have.
      I also found this article... it seems old, but I confirmed the timeout is set to 60. Would changing it per drive as instructed cause any issue? https://ask.adaptec.com/app/answers/detail/a_id/15357/~/error%3A-aacraid%3A-host-adapter-abort-request
      newbehemoth-diagnostics-20220218-1114.zip
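      (Regarding the timeout in that article: the knob it's describing is the per-device SCSI command timeout in sysfs. Reading it is harmless; raising it is the judgment call, and it resets on reboot unless scripted, e.g. from unRAID's go file. /dev/sdX is a placeholder:)

        cat /sys/block/sdX/device/timeout          # current timeout in seconds (60 here)
        echo 180 > /sys/block/sdX/device/timeout   # raise it so slow aacraid aborts don't
                                                   # escalate to resets; not persistent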
  12. Fair points - too many variables to assume it's just safe to go ahead... oh well, I'm 8 hours in with another 8 hours to go on the correcting check. Will the logs show what files (if any) are affected, or is it a case of "if the files are there then you're all good"? I'm assuming the latter, and right now anything on here could be re-obtained without any hassle; I just want my system healthy overall.
  13. OK thanks, I'll trigger that off now before I move more data over. Am I correct in saying all I need to do is hit the Check button on Main with the "write corrections" box ticked, or is there more to it? Is there a way to make it do this by default in the event this happens again, so I don't have to tie up my disks for 16+ hours twice?
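      (For the terminal-inclined, the same checks can also be started with unRAID's mdcmd - the Main-page button is the supported route, and this is just how I understand it on 6.9.x:)

        mdcmd check CORRECT           # correcting check, same as Check with the box ticked
        mdcmd check NOCORRECT         # read-only check; only counts sync errors
        mdcmd status | grep -i sync   # rough progress/position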
  14. Hey all, I had an unclean shutdown yesterday due to a power cut. The UPS options either didn't work or weren't configured correctly - I'm not sure yet and will be testing and fixing that soon - but upon powering the server back on I of course had an unclean shutdown and a parity check, which resulted in 116 errors. I just want to check whether I need to do anything else in this situation. Looking at the syslog I see "Parity Check Tuning: automatic Non-Correcting Parity Check finished (116 errors)". Should I run the parity check again with "Write corrections to parity" enabled, or will this be fine? I didn't select or do anything to influence the parity check, and everything seems to be working. I'm just in the middle of my transfer from DrivePool, so I want to be sure I'm good to keep going with something like this occurring.
  15. Just installed for the first time on 6.9.2 (new user) and it was going crazy after signing in. Have removed it for now based on this thread, oh well! Will keep an eye out for updates.