wishie

Members
  • Posts

    108

Everything posted by wishie

  1. Ok, I've attached the new replacement disk, but won't assign it or start the array. Give me a few minutes.
  2. I didn't have a disk to replace it with at the time, and I was hoping that rebooting the system would bring the failing disk back online (it's happened before with other disks, ages ago), but on reboot, ALL disks being unassigned wasn't a good sign. So you want me to boot and get diagnostics now?
  3. Correct me if I am wrong, but if I HADN'T ticked that, it would have wiped out my parity and I would have lost all that data, no?
  4. Nothing writes directly to the array (it's all via the cache) and the cache disk has no pending data to write.
  5. Hey all, last night I had a sudden drive failure (Seagate ST2000DM001), so I shut down the machine to put a new drive in. On booting the system, ALL drive assignments were GONE. Every slot said 'unassigned'. Thankfully I had taken a screenshot of my drive assignments a few weeks ago, so I set all the other drives back as usual, ticked the 'Parity is valid' box and started the array (still with a missing disk). I now have a replacement disk for it, and want to plug it in and assign it to the slot. My question is, will a rebuild of the replaced disk start automatically, or is there something special I have to do? I know that NORMALLY when replacing a disk it will start a sync/rebuild, but in this case it's sort of a 'new config', because all drive assignments were lost for some reason. Any advice appreciated. At this moment, the machine is powered off, but all drives are assigned (except the failed/replaced disk, which is still on my desk).
  6. Ok, new power supply is in, parity check started, and so far so good. It's been running a few minutes and hasn't disabled the disk yet. I'll let it do this pass without corrections (I unticked 'write corrections to disk') and if it succeeds, I'll run a check again with it ticked.
  7. My new power supply hasn't arrived yet, but the disk just got disabled overnight (2:21am). I've just woken up (8:30am) and grabbed Diagnostics. Can someone see if they can spot anything, please? wishie-diagnostics-20180115-0846.zip
  8. I did not reboot since those notifications. I'm thinking now, maybe those backups were custom scripts I made years ago. I'll check when I get home. Either way, they used to dismiss just fine.
  9. I should mention, if I refresh the page, or navigate away to another page, the notifications are gone.
  10. Attached. wishie-diagnostics-20180114-1512.zip
  11. When trying to dismiss notifications, they just keep re-appearing. This seems to be a new bug since the upgrade to 6.4.0. Video attached to show what I'm talking about. notifications-issue-6.4.0.mp4
  12. I've ordered a 700W Thermaltake PSU, so I hope that will fix the problems.
  13. From a little reading, I've determined that the Sense Key, ASC, and ASCQ point to the error code "Not Ready - Cause not reportable". Sadly, this doesn't shed any more light on the situation.
  14. ..on start of the parity check
      [451898.927735] mdcmd (45): check correct
      [451898.927760] md: recovery thread: check P ...
      [451898.931876] md: using 1536k window, over a total of 2930266532 blocks.
      [451899.163482] md: recovery thread: P corrected, sector=0
      [451903.311373] mpt2sas_cm0: log_info(0x31110d01): originator(PL), code(0x11), sub_code(0x0d01)
      [451903.311382] mpt2sas_cm0: log_info(0x31110d01): originator(PL), code(0x11), sub_code(0x0d01)
      [451903.319258] sd 7:0:6:0: [sdi] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
      [451903.319271] sd 7:0:6:0: [sdi] tag#0 Sense Key : 0x2 [current]
      [451903.319274] sd 7:0:6:0: [sdi] tag#0 ASC=0x4 ASCQ=0x0
      [451903.319277] sd 7:0:6:0: [sdi] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 00 01 0c 40 00 00 04 00 00 00
      [451903.319280] print_req_error: I/O error, dev sdi, sector 68672
  15. Ok, so, within 1-2 seconds after pressing "Check" to start a parity sync, it aborted with a 'drive failure'. Diagnostics are attached. I suspect power issues. wishie-diagnostics-20180111-0101.zip
  16. I'll set it to not write corrections to parity, and try a sync shortly. If it fails almost instantly, I'm happy to assume it's the PSU and replace it.
  17. The case is very clean, and there are dust filters on almost every entrance point. It's all housed in a Fractal Design Define R5. I'm now wondering if it's the power supply ageing, or perhaps those cheap Molex -> 2 x SATA power cable splitter things.
  18. Quick specs:
      Motherboard: Gigabyte F2A68HM-DS2
      CPU: AMD A4-7300
      RAM: 8GB DDR3
      SAS controller: LSI 9211-8i (LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon])
      Power supply: 500W (Thermaltake TT-500NL2NK-A) - I've had this for a few years now, I believe.
      Hard disks are as follows:
      Cache: 128GB Kingston SSD (SV300)
      Parity: 3TB Seagate NAS (ST3000VN000)
      Disk 1: 2TB Seagate (ST2000DM001)
      Disk 2: 2TB Western Digital Purple (WDC_WD20PURX)
      Disk 3: 2TB Western Digital Purple (WDC_WD20PURZ)
      Disk 4: 3TB Seagate NAS (ST3000VN000)
      Disk 5: 2TB Seagate (ST2000DM001)
      Disk 6: 2TB Western Digital Purple (WDC_WD20PURX)
  19. It's been the same machine for many years, and it's always been stable. The only change is that I recently swapped out 2 x 2TB drives (1 parity, 1 data) for 2 x 3TB drives, so perhaps power usage is the issue? Is the power supply struggling under load?
  20. Ok, so pretty much, every drive passes SMART extended tests, without issue. I can write files to the array on each disk, without issue.. but twice in a row now, as soon as I start a parity check, a disk gets upset, spits errors and gets disabled. I can't explain what is causing it. I'll try to trigger it again soon, and get Diagnostics straight away... unless anyone thinks there is a better course of action?
  21. I will do that, if it happens again.. what I've done for the time being, is swap which power cable goes to which drive, to see if a different drive fails.. will keep you posted.
  22. Here is the dmesg output of the error.. errors.rtf
  23. So, a week or so ago, I had a failed/disabled disk. I ran extended SMART tests, which came back with no issues. I removed the drive and replaced it with a brand new 3TB drive, zeroed it, and ran extended SMART tests. Everything seemed fine. A few days later, it was time for a parity check, and the same 'slot' (disk 4) showed errors and was disabled. At this point, I started to suspect a bad SATA cable. Since I use an LSI SAS controller, I swapped out the SFF-8087 to SATA cable and re-enabled the drive. Everything seemed fine again, until today. Once again, the same 'slot' has a failed drive in it. Should I be suspecting the power supply, or the LSI card, or something else?
  24. Worked perfectly. Thanks again, have a good Christmas.
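
For anyone reading the kernel log in post 14, here is a minimal Python sketch of how the sense data and the CDB on those lines can be decoded by hand. It is not part of the original posts. The lookup tables are deliberately tiny subsets of the standard SCSI sense key and ASC/ASCQ tables, and the function names (`decode_sense`, `decode_read16`) are made up for illustration. The sketch assumes the failed command is a READ(16) (opcode 0x88), which is what the logged CDB shows.

```python
import re

# Small subset of the standard SCSI sense key names (illustrative, not complete).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}

# Only the one ASC/ASCQ pair seen in the log; a real table has hundreds of entries.
ASC_ASCQ = {
    (0x04, 0x00): "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE",
}

# The relevant lines copied from the dmesg output in post 14.
LOG = """\
[451903.319271] sd 7:0:6:0: [sdi] tag#0 Sense Key : 0x2 [current]
[451903.319274] sd 7:0:6:0: [sdi] tag#0 ASC=0x4 ASCQ=0x0
[451903.319277] sd 7:0:6:0: [sdi] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 00 01 0c 40 00 00 04 00 00 00
[451903.319280] print_req_error: I/O error, dev sdi, sector 68672
"""


def decode_sense(log: str) -> tuple[str, str]:
    """Return (sense key name, ASC/ASCQ description) found in the log text."""
    key = int(re.search(r"Sense Key : (0x[0-9a-fA-F]+)", log).group(1), 16)
    m = re.search(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)", log)
    asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
    return (
        SENSE_KEYS.get(key, "unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "not in this small table"),
    )


def decode_read16(log: str) -> tuple[int, int]:
    """Return (starting LBA, number of blocks) from a logged READ(16) CDB."""
    hex_bytes = re.search(
        r"CDB: opcode=0x[0-9a-fA-F]+ ((?:[0-9a-fA-F]{2} ?)+)", log
    ).group(1)
    cdb = bytes.fromhex(hex_bytes.strip())
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


if __name__ == "__main__":
    print(decode_sense(LOG))
    lba, blocks = decode_read16(LOG)
    print(f"READ(16) starting at LBA {lba} for {blocks} blocks")
```

The decoded LBA comes out equal to the `sector 68672` value on the `print_req_error` line, so the CDB and the kernel's error message refer to the same read. The sense data only says the drive reported "not ready" partway through that read. It does not say why, which is consistent with the power and cabling suspicions in the other posts rather than proof of either.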