[SOLVED] Double Data Drive Failure During Parity Check (Dual Parity)



Dear UNRAID Community,

 

I finally have an issue with UNRAID where I feel it is best to open my first topic. I've been using UNRAID without any major issues since 2014. Sure, I've had drives fail in the past, but nothing like my current situation. I've seen a few older topics that generally discuss these issues, but nothing that I could find where I am comfortable proceeding before I get some expert opinions. So here goes....

 

Over this past weekend I decided to kick off a Parity Check while I was on vacation (it had been 88 days... yeah, I know 😂). To my horror, I received an email from UNRAID that one of my 8TB disks failed with 2,000+ errors. No worries, "I have dual parity", I thought, and immediately drop shipped a new 14TB drive to my door. The parity check continued on... A few hours later, I got another horrific email that a second 8TB drive threw 3 errors and is now also disabled. So now I'm worried. I drop shipped a second 14TB drive to my door, and could not wait to get home and sort this out. Needless to say, the parity check ended in error due to the double failure.

 

Welp, now I am at home and am currently in the process of pre-clearing both drives.

 

Note: ALL data is backed up both onsite and offsite. Even so, I'm still unsettled.

 

 

QUESTION: Though it sounds like it is possible to REBUILD 2 DRIVES SIMULTANEOUSLY via DUAL PARITY, should I? Or would it be safer to get 1 drive up and running first to restore some fault protection, and then get the second up after that? Right now, if a third drive dies, I'm actually going to lose data.

 

My goal here is to get back up and running with the LEAST amount of reads/writes to the existing disks so I can once again have fault tolerance. My assumption is that rebuilding both drives at the same time would involve roughly half the overall I/O of rebuilding them separately, since each rebuild pass has to read every other disk in the array, and one combined pass reads them once instead of twice. Rebuilding both would also presumably save time, and I would be back to dual fault protection in one shot. The trade-off is that rebuilding one drive first would restore single-drive fault tolerance sooner.
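To make that halving argument concrete, here is a rough back-of-envelope sketch. The drive count and sizes are hypothetical numbers for illustration, not pulled from my actual array:

```shell
# Back-of-envelope I/O comparison (hypothetical: assumes ~22 other 8TB
# data/parity disks must be read end-to-end per rebuild pass).
DISK_TB=8
OTHER_DISKS=22

one_at_a_time=$(( DISK_TB * OTHER_DISKS * 2 ))  # two separate full passes over the array
both_at_once=$(( DISK_TB * OTHER_DISKS ))       # one pass reconstructs both disks

echo "Sequential: ~${one_at_a_time} TB read; Simultaneous: ~${both_at_once} TB read"
```

Either way, each surviving disk takes the same wear per pass; the simultaneous rebuild just does one pass instead of two.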

 

So to sum up, I'm trying to tread carefully and reduce the risk of a third drive failing. I'll probably open a separate topic (or we can discuss here later) about these 2 disks (shucked Seagate Backup Plus) and why they may have failed at the same time. I'm just hoping these drives aren't cursed.

 

[Attached image: Array_Scrubbed.png]

 

 

Just wanted to get some thoughts on this and pick your collective brains. Since there was not a ton of material on the subject, I'm hoping this discussion will help not only me, but others in the future. Finally, I wrote some custom scripts to pull the entire drive structure with and without folders, just so I have a complete log of what specifically is on each of these drives, in the event I actually have to restore from backups. I usually run these scripts prior to parity checks and they can be disk specific or the entire array.
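For anyone curious, the scripts boil down to something like this. This is a simplified sketch, not my exact code; it assumes Unraid's standard /mnt/diskN mount points and a writable output directory:

```shell
#!/bin/bash
# Simplified sketch of a per-disk file audit: dump sorted listings so the
# contents can be diffed against backups after a rebuild or restore.

audit_disk() {
    local disk_root=$1 out_dir=$2
    local name
    name=$(basename "$disk_root")
    mkdir -p "$out_dir"
    find "$disk_root" | sort > "$out_dir/$name-full.txt"           # files and folders
    find "$disk_root" -type f | sort > "$out_dir/$name-files.txt"  # files only
}

# Run it disk-specific, or across the entire array:
# audit_disk /mnt/disk4 /boot/logs/audit
# for d in /mnt/disk*; do audit_disk "$d" /boot/logs/audit; done
```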

 

Thanks so much for your help!

 

 

Edited by falconexe
For Clarity Sake
Link to comment

Johnnie, thanks for responding. Nothing personal, but I'd rather not post my diagnostics to the open forum. I'm a business user, and even the sanitized version of my diags has some sensitive info in it. I DID save them off prior to rebooting just now, so I have them if anyone needs something specific. SMART reports on these drives were perfect... but both suddenly threw read/write errors across numerous sectors during the parity check.

 

I'm not looking to solve a specific problem/disk issue per se, just looking for some advice on best practices. I consider both of these drives DEAD and will be upgrading them to 14TB as soon as they pre-clear. I'll keep the old 04/22 disks on hand to pull data if necessary.

 

That being said, do you or anyone else have GENERAL thoughts as to my question above regarding rebuilding both data drives at the same time?

Edited by falconexe
Link to comment

Very unlikely that two disks failed at the same time, but as to your question: since you already have two disabled disks, the risk of rebuilding one or two is the same. If another disk fails during the rebuild (of either 1 or 2 disks) you might lose data either way, so IMHO if you're going to rebuild, do both at the same time.

  • Like 1
Link to comment
15 minutes ago, johnnie.black said:

Very unlikely that two disks failed at the same time, but as to your question: since you already have two disabled disks, the risk of rebuilding one or two is the same. If another disk fails during the rebuild (of either 1 or 2 disks) you might lose data either way, so IMHO if you're going to rebuild, do both at the same time.

That is what I figured. So do you think there is any chance of actually recovering from the faults (especially the disk with only 3 errors)? Both drives are disabled, but are being emulated. Is there something that can be done? At this point I just figured they were trash disks and was going to move on. It does bug me though that they only have 11,000 hours (about 450 days or so) of life. Both were purchased and installed at the same time.

 

I'm down for whatever we can try, but it is not critical. If you think they are salvageable in any way, I may rebuild them as 14TB with the new disks and reuse the old 8TB in different slots (after another round of pre-clears of course). Finally, I am positive there was no hardware issue when this happened and I am on a PSU with battery backup. The power did not go out and there were no brown-outs. No issues with controllers or HBA cards, no correlation of disk location, and no loose wires. Aside from ambient air, nothing touched this server over the weekend. I'm on brand new hardware. See the posts below:

 

 

This all being said, clearly there was some type of issue with reads/writes. I would like to get to the bottom of it if you think it is worth it, but my priority is getting stability at this point.

Edited by falconexe
Typos
Link to comment

Quick Update: Both of my 14TB replacement disks successfully passed the preclear process. Thank goodness.

 

If anyone is wondering how long it takes UNRAID to preclear a 14TB disk, it is 59 HOURS (Pre-Read/Zeroing/Post-Read). I averaged 197 MB/s and ran both preclears simultaneously. Interestingly, each step in the 3-step process ended within 1 min of each other, so reading/writing across the entire disk was pretty much the same speed.
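That 59-hour figure lines up with the average speed. A quick sanity check (just arithmetic, nothing Unraid-specific):

```shell
# Three full passes (pre-read, zero, post-read) over 14 TB at ~197 MB/s.
CAPACITY_TB=14
SPEED_MBS=197
PASSES=3

hours=$(awk -v c="$CAPACITY_TB" -v s="$SPEED_MBS" -v p="$PASSES" \
    'BEGIN { printf "%.0f", (c * 1e12) / (s * 1e6) * p / 3600 }')
echo "Estimated preclear time: ${hours} hours"
```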

 

Now I am running automated scripts to audit my entire array disk by disk, and share by share. That way I have a complete record of every single file and their locations prior to starting my rebuilds.

 

Next Steps: SIMULTANEOUS 14TB DOUBLE DATA DISK REBUILDS. 🙏😬

Link to comment

My scripts finished and I have a full accounting of all files. I just loaded the new 14TB drives into the failed 04/22 slots. I'm about to assign the drives and start the rebuild process.

 

While I have the old 8TB drives out, I just ran an Error Scan (quick) with HD Tune Pro in Windows and every sector came back GREEN. I am now running the FULL test, and in 20 hours or so I will have a report for both drives. The SMART reports are below for both drives, and they look normal to me. Please note: the high temps of 52C & 53C on both drives were due to my old NORCO case with craptastic airflow. These were peak temps, not sustained, and not long term. Since changing over to the 45 Drives Storinator, I've been rock solid in the mid 20s C. Could it be a factor? Sure, but I've had no issues with any other drives since switching hardware. Other than those high temps, the disks have been solid for me.

 

DISK 04:

[Attached image: Smart04.PNG]

 

DISK 22:

[Attached image: Smart22.PNG]

 

IF these both come back clean, then I would tend to agree with Johnnie that something else happened and that these drives did not actually fail. Via the UNRAID diagnostics, I can see that both read and write errors occurred on both drives. Very odd. I'll be scratching my head if these drives actually come back clean.

 

@johnnie.black (Or Anyone Else) If I can send these diags to you directly, would you still be cool with taking a look at this and letting me know what your thoughts are? In the end, I'll probably throw these 2 drives back into my server if they pass the HD Tune Pro Full Error Scan and UNRAID preclears, but until then, I have both the drives intact and the diagnostics of when this happened. Thanks for your help in advance.

Edited by falconexe
Typo
Link to comment

Here is the sanitized SYSLOG that shows the issues. I would love to get someone's feedback on the order of events. Is there a smoking gun that indicates what happened here? Aside from the READ/WRITE errors on both disks 04/22, I see some odd shutdown entries that are concerning (Marked in RED). I also see some odd apcupsd entries that I assume are from my UPS (Marked in PURPLE).

 

*** BEGIN SANITIZED SYSLOG ***

 

Jan  3 12:48:18 MassEffect emhttpd: req (17): clearStatistics=true&startState=STARTED&csrf_token=****************
Jan  3 12:48:18 MassEffect kernel: mdcmd (453): clear 
Jan  3 12:48:24 MassEffect emhttpd: req (18): startState=STARTED&file=&cmdCheck=Check&optionCorrect=correct&csrf_token=****************
Jan  3 12:48:24 MassEffect kernel: mdcmd (454): check 
Jan  3 12:48:24 MassEffect kernel: md: recovery thread: check P Q ...
Jan  3 12:48:32 MassEffect kernel: mdcmd (455): set md_write_method 1
Jan  3 12:48:32 MassEffect kernel: 
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: PQ corrected, sector=1698969552
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: PQ corrected, sector=1698969560
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: PQ corrected, sector=1698969568
PQ CORRECTIONS CONTINUE FOR MANY MORE SECTORS...
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: PQ corrected, sector=1698970328
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: PQ corrected, sector=1698970336
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: PQ corrected, sector=1698970344
Jan  3 15:23:09 MassEffect kernel: md: recovery thread: stopped logging
Jan  4 03:00:09 MassEffect Recycle Bin: Scheduled: Files older than 30 days have been removed
Jan  4 12:01:59 MassEffect root: /etc/libvirt: 923.5 MiB (968314880 bytes) trimmed on /dev/loop3
Jan  4 12:01:59 MassEffect root: /var/lib/docker: 14.4 GiB (15480115200 bytes) trimmed on /dev/loop2
Jan  4 12:01:59 MassEffect root: /mnt/cache: 906.4 GiB (973236932608 bytes) trimmed on /dev/sdb1
Jan  4 12:48:53 MassEffect kernel: sd 12:0:10:0: attempting task abort! scmd(000000001f69d96a)
Jan  4 12:48:53 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1373 CDB: opcode=0x88 88 00 00 00 00 03 96 dd d8 d8 00 00 04 00 00 00
Jan  4 12:48:53 MassEffect kernel: scsi target12:0:10: handle(0x0023), sas_address(0x300062b203fe85d2), phy(18)
Jan  4 12:48:53 MassEffect kernel: scsi target12:0:10: enclosure logical id(0x500062b203fe85c0), slot(8) 
Jan  4 12:48:53 MassEffect kernel: scsi target12:0:10: enclosure level(0x0000), connector name(     )
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: task abort: SUCCESS scmd(000000001f69d96a)
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd d8 d8 00 00 04 00 00 00
Jan  4 12:48:57 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416023256
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416023192
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416023208
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416023216
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd dc d8 00 00 04 00 00 00
Jan  4 12:48:57 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416024280
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416024216
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416024224
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416024232
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd e0 d8 00 00 04 00 00 00
Jan  4 12:48:57 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416025304
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416025240
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416025248
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416025256
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416026256
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd e4 d8 00 00 04 00 00 00
Jan  4 12:48:57 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416026328
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416026264
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416026272
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416026280
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416027272
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416027280
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd e8 d8 00 00 04 00 00 00
Jan  4 12:48:57 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416027352
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416027288
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416027296
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416028296
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416028304
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:57 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd ec d8 00 00 04 00 00 00
Jan  4 12:48:57 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416028376
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416028312
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416028320
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416028328
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416029320
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416029328
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd f0 d8 00 00 04 00 00 00
Jan  4 12:48:58 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416029400
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416029336
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416029344
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416029352
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416030344
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416030352
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x88 88 00 00 00 00 03 96 dd f4 d8 00 00 04 00 00 00
Jan  4 12:48:58 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416030424
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416030360
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416030368
READ ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416031360
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416031368
Jan  4 12:48:58 MassEffect kernel: md: disk22 read error, sector=15416031376
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 Sense Key : 0x2 [current] 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1879 CDB: opcode=0x8a 8a 00 00 00 00 03 96 dd d8 d8 00 00 04 00 00 00
Jan  4 12:48:58 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416023256
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416023192
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416023200
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416023208
WRITE ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416024200
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416024208
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1374 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1374 Sense Key : 0x2 [current] 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1374 ASC=0x4 ASCQ=0x0 
Jan  4 12:48:58 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1374 CDB: opcode=0x8a 8a 00 00 00 00 03 96 dd dc d8 00 00 04 00 00 00
Jan  4 12:48:58 MassEffect kernel: print_req_error: I/O error, dev sdab, sector 15416024280
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416024216
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416024224
WRITE ERRORS CONTINUE FOR MANY MORE SECTORS...
Jan  4 12:48:59 MassEffect kernel: md: disk22 write error, sector=15416031360
Jan  4 12:48:59 MassEffect kernel: md: disk22 write error, sector=15416031368
Jan  4 12:48:59 MassEffect kernel: md: disk22 write error, sector=15416031376
Jan  4 12:49:22 MassEffect kernel: scsi_io_completion_action: 6 callbacks suppressed
Jan  4 12:49:22 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1432 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=0x00
Jan  4 12:49:22 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1432 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Jan  4 12:49:22 MassEffect kernel: mpt3sas_cm1: log_info(0x31110e03): originator(PL), code(0x11), sub_code(0x0e03)
Jan  4 12:49:26 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1432 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=0x00
Jan  4 12:49:26 MassEffect kernel: sd 12:0:10:0: [sdab] tag#1432 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
Jan  4 12:49:26 MassEffect kernel: mpt3sas_cm1: log_info(0x31110e03): originator(PL), code(0x11), sub_code(0x0e03)
Jan  4 12:49:26 MassEffect kernel: mdcmd (456): set md_write_method 0
Jan  4 12:49:26 MassEffect kernel: 
Jan  4 12:49:33 MassEffect kernel: sd 12:0:10:0: device_block, handle(0x0023)
Jan  4 12:49:44 MassEffect kernel: sd 12:0:10:0: device_unblock and setting to running, handle(0x0023)
Jan  4 12:49:44 MassEffect kernel: sd 12:0:10:0: [sdab] Synchronizing SCSI cache
Jan  4 12:49:44 MassEffect kernel: sd 12:0:10:0: [sdab] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Jan  4 12:49:44 MassEffect kernel: mpt3sas_cm1: removing handle(0x0023), sas_addr(0x300062b203fe85d2)
Jan  4 12:49:44 MassEffect kernel: mpt3sas_cm1: enclosure logical id(0x500062b203fe85c0), slot(8) 
Jan  4 12:49:44 MassEffect kernel: mpt3sas_cm1: enclosure level(0x0000), connector name(     )
Jan  4 12:49:44 MassEffect rc.diskinfo[16863]: SIGHUP received, forcing refresh of disks info.
Jan  4 12:49:44 MassEffect rc.diskinfo[16863]: SIGHUP ignored - already refreshing disk info.
Jan  4 12:49:45 MassEffect kernel: scsi 12:0:12:0: Direct-Access     ATA      ST8000DM004-2CX1 0001 PQ: 0 ANSI: 6
Jan  4 12:49:45 MassEffect kernel: scsi 12:0:12:0: SATA: handle(0x0023), sas_addr(0x300062b203fe85d2), phy(18), device_name(0x0000000000000000)
Jan  4 12:49:45 MassEffect kernel: scsi 12:0:12:0: enclosure logical id (0x500062b203fe85c0), slot(8) 
Jan  4 12:49:45 MassEffect kernel: scsi 12:0:12:0: enclosure level(0x0000), connector name(     )
Jan  4 12:49:45 MassEffect kernel: scsi 12:0:12:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: Power-on or device reset occurred
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: Attached scsi generic sg27 type 0
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: [sdad] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: [sdad] 4096-byte physical blocks
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: [sdad] Write Protect is off
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: [sdad] Mode Sense: 9b 00 10 08
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: [sdad] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan  4 12:49:45 MassEffect kernel: sdad: sdad1
Jan  4 12:49:45 MassEffect kernel: sd 12:0:12:0: [sdad] Attached SCSI disk
Jan  4 12:49:46 MassEffect unassigned.devices: Disk with serial 'ST8000DM004-2CX188_***DISK22', mountpoint 'ST8000DM004-2CX188_***DISK22' is not set to auto mount and will not be mounted.
Jan  4 12:49:46 MassEffect rc.diskinfo[16863]: SIGHUP received, forcing refresh of disks info.
Jan  5 00:00:01 MassEffect crond[3203]: exit status 126 from user root /boot/config/plugins/dynamix.file.integrity/integrity-check.sh &> /dev/null
Jan  5 00:01:45 MassEffect apcupsd[6330]: apcupsd exiting, signal 15
Jan  5 00:01:45 MassEffect apcupsd[6330]: apcupsd shutdown succeeded
Jan  5 00:01:48 MassEffect apcupsd[17915]: apcupsd 3.14.14 (31 May 2016) slackware startup succeeded
Jan  5 00:01:48 MassEffect apcupsd[17915]: NIS server startup succeeded

Jan  5 03:00:10 MassEffect Recycle Bin: Scheduled: Files older than 30 days have been removed
Jan  5 03:00:21 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4264 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  5 03:00:21 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4264 Sense Key : 0x5 [current] 
Jan  5 03:00:21 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4264 ASC=0x24 ASCQ=0x0 
Jan  5 03:00:21 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4264 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
Jan  5 03:00:21 MassEffect kernel: print_req_error: 6 callbacks suppressed
Jan  5 03:00:21 MassEffect kernel: print_req_error: critical target error, dev sdj, sector 0
Jan  5 03:00:21 MassEffect kernel: mpt3sas_cm0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4955 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4955 Sense Key : 0x4 [current] 
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4955 ASC=0x44 ASCQ=0x0 
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4955 CDB: opcode=0x8a 8a 08 00 00 00 00 ae d2 e0 f8 00 00 00 08 00 00
Jan  5 03:00:22 MassEffect kernel: print_req_error: critical target error, dev sdj, sector 2933055736
Jan  5 03:00:22 MassEffect kernel: md: disk4 write error, sector=2933055672
Jan  5 03:00:22 MassEffect kernel: md: recovery thread: exit status: -4
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 Sense Key : 0x4 [current] 
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 ASC=0x44 ASCQ=0x0 
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 CDB: opcode=0x8a 8a 08 00 00 00 00 ae d2 e0 f0 00 00 00 08 00 00
Jan  5 03:00:22 MassEffect kernel: print_req_error: critical target error, dev sdj, sector 2933055728
Jan  5 03:00:22 MassEffect kernel: md: disk4 write error, sector=2933055664
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 Sense Key : 0x4 [current] 
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 ASC=0x44 ASCQ=0x0 
Jan  5 03:00:22 MassEffect kernel: sd 5:0:7:0: [sdj] tag#4956 CDB: opcode=0x8a 8a 08 00 00 00 00 ae d2 e1 00 00 00 00 08 00 00
Jan  5 03:00:22 MassEffect kernel: print_req_error: critical target error, dev sdj, sector 2933055744
Jan  5 03:00:22 MassEffect kernel: md: disk4 write error, sector=2933055680
Jan  5 06:36:40 MassEffect apcupsd[17915]: UPS Self Test switch to battery.
Jan  5 06:36:48 MassEffect apcupsd[17915]: UPS Self Test completed: Battery OK
Jan  6 08:00:02 MassEffect root: Fix Common Problems Version 2019.12.29
Jan  6 08:00:03 MassEffect root: Fix Common Problems: Error: disk4 (ST8000DM004-2CX188_***DISK22) is disabled
Jan  6 08:00:03 MassEffect root: Fix Common Problems: Error: disk22 (ST8000DM004-2CX188_***DISK04) is disabled
Jan  6 08:00:03 MassEffect root: Fix Common Problems: Error: disk4 (ST8000DM004-2CX188_***DISK22) has read errors
Jan  6 08:00:03 MassEffect root: Fix Common Problems: Error: disk22 (ST8000DM004-2CX188_***DISK04) has read errors

 

*** END SANITIZED SYSLOG ***
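If anyone wants to tally errors like these from their own saved syslog, a small helper like this works. It's shown here against a tiny inline sample; point it at a saved copy of /var/log/syslog on a real server:

```shell
# Count md read/write errors per disk in a syslog file.
count_md_errors() {
    grep -oE 'md: disk[0-9]+ (read|write) error' "$1" | sort | uniq -c
}

# Tiny sample to demonstrate the output format:
cat > /tmp/sample-syslog <<'EOF'
Jan  4 12:48:57 MassEffect kernel: md: disk22 read error, sector=15416023192
Jan  4 12:48:58 MassEffect kernel: md: disk22 write error, sector=15416023192
Jan  5 03:00:22 MassEffect kernel: md: disk4 write error, sector=2933055672
EOF

count_md_errors /tmp/sample-syslog
```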

 

Edited by falconexe
Link to comment

Based on the log snippet it looks like a connection/power issue: disk22 dropped offline and then reconnected as a different device, and disk4 completely stopped responding. But I'm still missing a lot of info, like the controller used, its firmware, the full SMART reports for the disks as provided in the diags, etc. And no, please don't send diags by PM. As mentioned here, any help should be done in the forums, not by PM. Diagnostics have an anonymize option; if you believe some sensitive info is still present, it would be best to make a feature request to also anonymize that, so you can post them in the future if you need help again.

Link to comment

Thanks johnnie. Very odd that the disks would power down like that randomly. Thanks for taking a look, and thanks for the info about logs and PM/forum rules; I was not aware of that. I did use the anonymize option, but I still found A LOT of info sensitive to our organization within. That is why I just sent a snippet.

 

The other thing that got me a tad worried in the log is that the RECYCLE BIN app emptied files due to a scheduled cron job. From everything I know/have read over the years, reads/writes during a parity check should only slow it down, not actually cause issues, since the parity math still accounts for those changes.

 

Let me ask this last question. If it was simply a power issue, then I assume my data was intact and the parity check freaked out. That being said, is there a way to force UNRAID to just use the disks as-is and get the array back up and running (and then perform another parity check)? Or is that a very bad idea? Not that I am going to try this; my rebuilds are already running on the new disks. Just curious...

 

I will most likely be throwing these drives back into my array after pre-clears pass based on your thoughts. I'll be watching the new 04/22 disks closely for any more power issues. Thanks!

Edited by falconexe
Typos
Link to comment
11 minutes ago, falconexe said:

Very odd that the disks would power down like that randomly

Power or connection problem, they show up similarly on the logs, so could be either. 

 

15 minutes ago, falconexe said:

That being said, is there a way to simply force UNRAID to just use the disks as is and get the array back up and running (and then perform another parity check)?

You can do that by doing a new config and checking "parity is already valid" before array start, btw were all those sync errors expected?

  • Thanks 1
Link to comment
Just now, johnnie.black said:

Power or connection problem, they show up similarly on the logs, so could be either. 

 

You can do that by doing a new config and checking "parity is already valid" before array start, btw were all those sync errors expected?

No, it had been 88 days since my last parity check. I don't schedule them due to the amount of dockers and services running. I manually stop everything first, then perform the parity check, then start everything back up. We've been so busy lately that it was hard to find a good time. At some point you just have to do it. I'll be going back to monthly for sure after this event.

 

Again though, everything is backed up on and offsite, but finding corrupted files (if any) is going to be a pain in the butt. I may not notice them for years. Luckily we have version history on our BACKBLAZE account and I run yearly snapshots to cold storage backups.

Link to comment
17 minutes ago, johnnie.black said:

You need to see what caused that; the only acceptable number of sync errors is 0. A few sync errors are normal after an unclean shutdown, but that's about the only situation where a few errors are normal.

Agreed. And I'm seriously stumped. Brand new hardware. Many brand new disks. This is only the 4th time since 2016 on this FLASH between 2 hardware configs that I have had any parity check errors > 0. And never anything over 3. But this last one had 2,654 before it crapped out and the disks failed/shutdown.

 

Here is the full history:

 

Date | Duration | Speed | Status | Errors
2020-01-05, 03:00:22 | 1 day, 14 hr, 11 min, 58 sec | Unavailable | Canceled | 2654
2019-10-13, 10:12:10 | 1 day, 6 hr, 41 min, 40 sec | 126.7 MB/s | OK | 0
2019-09-23, 09:30:28 | 1 day, 11 hr, 18 min, 15 sec | 110.2 MB/s | OK | 0
2019-09-21, 10:21:10 | 20 hr, 24 min, 12 sec | 190.6 MB/s | OK | 0
2019-08-31, 17:43:25 | 1 day, 2 hr, 23 min, 18 sec | 147.4 MB/s | OK | 0
2019-08-30, 08:09:20 | 1 day, 8 hr, 26 min, 8 sec | 119.9 MB/s | OK | 0
2019-08-28, 22:15:26 | 1 day, 6 hr, 42 min, 2 sec | 126.7 MB/s | OK | 0
2019-08-23, 13:26:11 | 21 hr, 29 min, 15 sec | 103.4 MB/s | OK | 0
2019-08-22, 08:39:20 | 19 hr, 34 min, 19 sec | 113.6 MB/s | OK | 0
2019-07-31, 06:45:20 | 19 hr, 13 min, 37 sec | 115.6 MB/s | OK | 0
2019-07-30, 07:01:25 | 19 hr, 27 min, 21 sec | 114.2 MB/s | OK | 0
2019-07-27, 21:31:52 | 19 hr, 10 min, 29 sec | 115.9 MB/s | OK | 0
2019-05-10, 22:52:22 | 19 hr, 22 min, 23 sec | 114.7 MB/s | OK | 0
2019-05-09, 21:53:04 | 19 hr, 26 min, 55 sec | 114.3 MB/s | OK | 3
2019-02-28, 22:10:19 | 19 hr, 22 min, 15 sec | 114.7 MB/s | OK | 0
2019-01-08, 21:01:34 | 20 hr, 54 min, 16 sec | 106.3 MB/s | OK | 0
2018-11-22, 17:09:55 | 20 hr, 43 min, 11 sec | 107.3 MB/s | OK | 1
2018-09-15, 19:04:45 | 1 day, 7 hr, 21 min, 2 sec | 70.9 MB/s | OK | 0
2018-09-09, 14:13:04 | 19 hr, 51 min, 6 sec | 112.0 MB/s | OK | 0
2018-09-08, 16:00:07 | 21 hr, 59 min, 57 sec | 101.0 MB/s | OK | 0
2018-09-07, 16:02:58 | 22 hr, 6 min, 22 sec | 100.5 MB/s | OK | 0
2018-09-06, 17:33:51 | 22 hr, 10 min, 51 sec | 100.2 MB/s | OK | 0
2018-09-05, 16:13:51 | 22 hr, 44 min, 32 sec | 97.7 MB/s | OK | 0
2018-09-04, 15:53:07 | 22 hr, 14 min, 11 sec | 100.0 MB/s | OK | 0
2018-09-03, 12:46:18 | 1 day, 1 hr, 46 min, 33 sec | 86.2 MB/s | OK | 3
2018-08-14, 11:47:06 | 23 hr, 25 min, 50 sec | 94.9 MB/s | OK | 0
2018-08-10, 18:06:51 | 23 hr, 23 min, 44 sec | 95.0 MB/s | OK | 0
2018-08-06, 04:42:37 | 23 hr, 31 min, 38 sec | 94.5 MB/s | OK | 0
2018-08-05, 02:38:44 | 1 day, 1 hr, 42 min, 10 sec | 86.5 MB/s | OK | 0
2018-08-03, 20:48:51 | 22 hr, 47 min, 1 sec | 97.6 MB/s | OK | 0
2018-08-02, 17:49:38 | 23 hr, 12 min, 43 sec | 95.8 MB/s | OK | 0
2018-08-01, 17:50:51 | 21 hr, 48 min, 51 sec | 101.9 MB/s | OK | 0
2018-06-19, 02:40:37 | 22 hr, 57 min, 3 sec | 96.8 MB/s | OK | 0
2018-03-13, 19:24:15 | 1 day, 2 hr, 22 min, 50 sec | 84.3 MB/s | OK | 0
2018-03-10, 08:36:09 | 8 hr, 16 min, 9 sec | 268.8 MB/s | OK | 0
2017-12-02, 04:15:46 | 2 day, 4 hr, 15 min, 45 sec | 42.5 MB/s | OK | 0
2017-11-07, 10:22:33 | 23 hr, 12 min, 47 sec | 95.8 MB/s | OK | 0
2017-11-05, 23:27:33 | 1 day, 10 hr, 13 min, 23 sec | 64.9 MB/s | OK | 0
2017-11-01, 22:45:20 | 1 day, 22 hr, 45 min, 19 sec | 47.5 MB/s | OK | 0
2017-10-01, 17:32:51 | 1 day, 17 hr, 32 min, 50 sec | 53.5 MB/s | OK | 0
2017-08-31, 13:30:39 | 23 hr, 42 min, 56 sec | 93.7 MB/s | OK | 0
2017-08-14, 14:19:40 | 1 day, 57 min, 41 sec | 89.0 MB/s | OK | 0
2017-07-29, 07:43:43 | 1 day, 4 hr, 14 min, 59 sec | 78.7 MB/s | OK | 0
2017-07-27, 19:00:20 | 19 hr, 18 min, 28 sec | 115.1 MB/s | OK | 0
2017-07-21, 03:20:47 | 1 day, 5 hr, 37 min, 57 sec | 75.0 MB/s | OK | 0
2017-07-17, 15:41:08 | 18 hr, 58 min, 40 sec | 117.1 MB/s | OK | 0
2017-07-16, 10:30:29 | 22 hr, 58 min, 2 sec | 96.8 MB/s | OK | 0
2017-07-15, 10:47:11 | 22 hr, 31 min, 3 sec | 98.7 MB/s | OK | 0
2017-07-14, 06:18:22 | 15 hr, 30 min, 30 sec | 107.5 MB/s | OK | 0
2017-06-30, 18:32:50 | 18 hr, 32 min, 49 sec | 89.9 MB/s | OK | 0
2017-05-31, 17:28:52 | 17 hr, 28 min, 51 sec | 95.4 MB/s | OK | 0
2017-05-26, 16:22:13 | 17 hr, 39 min, 16 sec | 94.4 MB/s | OK | 0
2017-03-27, 17:02:21 | 17 hr, 2 min, 20 sec | 97.8 MB/s | OK | 0
2017-03-06, 04:05:00 | 1 day, 1 hr, 46 min, 29 sec | 64.7 MB/s | OK | 0
2017-02-16, 19:21:36 | 18 hr, 15 min, 9 sec | 91.3 MB/s | OK | 0
2017-02-13, 16:27:36 | 18 hr, 13 min, 50 sec | 91.4 MB/s | OK | 0
2017-01-31, 00:16:08 | 1 day, 16 min, 7 sec | 68.7 MB/s | OK |
2017-01-16, 09:55:23 | 21 hr, 26 min, 41 sec | 77.7 MB/s | OK |
2016-12-22, 13:58:52 | 20 hr, 17 min, 30 sec | 82.2 MB/s | OK |
2016-11-28, 19:04:37 | 19 hr, 4 min, 36 sec | 87.4 MB/s | OK |
2016-11-11, 17:21:36 | 19 hr, 48 min, 16 sec | 84.2 MB/s | OK |
2016-11-07, 13:08:25 | 18 hr, 42 min, 42 sec | 89.1 MB/s | OK |

 

 


Also, my HD Tune Pro full error scans completed successfully on both of the "failed" data drives. 100% healthy. Once the rebuilds finish on the new drives, I'm going to drop these back into the array and run preclears. If they pass, I get back 16TB. Or, if I look at it the other way, I gained 28TB unnecessarily ha ha.

 

[Screenshot: HD Tune Pro status, 2020-01-11]


Double Data Rebuild Completed SUCCESSFULLY! It took just over 43 hours to rebuild both data disks onto the 14TB drives under dual parity.

 

[Screenshot: Parity status, 2020-01-11]

 

I'm super happy this all worked out as discussed. Now I am going to drop these old 8TB data disks back into the array in slots 26/27 and run some preclears. I'll post another update soon.


OK, so there is something really odd going on. I have never seen this before and have no idea what it means, or what is causing it.

 

So as you know, I've been preclearing the previously "failed" data disks 04/22.

 

Both passed HD Tune Pro Full Error scans. So I threw them back into the server and have been running Preclears on them.

 

Both disks passed the PreRead Process.

Both disks passed the Zeroing Process.

 

However, immediately after the Zeroing finished on the old Disk 22 with ID "sdae" (the one that threw 2,000+ errors during the parity check), the preclear process failed and the disk literally disappeared from my array. It is nowhere to be found in the GUI. The error I received, "Invalid unRAID's MBR signature", is shown below in RED (SNs have been sanitized). The old Disk 4 with ID "sdad" (the one that threw only 3 errors during the parity check) is still going through the post-read process.

 

[Screenshot: Preclear status, 2020-01-13]

 

Furthermore, something odd happened with the "sdad" disk, where it is now showing up as missing under Historical Devices. However, it is not missing and is shown in the preclear area above. (Perhaps it was always there and I forgot?)

 

[Screenshot: Preclear status and Historical Devices, 2020-01-13]

 

I am now thinking that the old Disk 22 "sdae" is REALLY messed up for some reason. No issues with SMART, but something is going on.

 

 

Can anyone tell me what that error means and what you think is going on with this disk that disappeared? Also, this disk is now in a totally different slot than before (was 22, now 27), so I really think it's NOT my UNRAID hardware but the disk itself.

 

 

It appears to have powered down again after it failed the post-read. The log is below:

 

Jan 13 03:43:16 MassEffect preclear_disk_OLDDISK22[11017]: Zeroing: progress - 100% zeroed
Jan 13 03:43:16 MassEffect preclear_disk_OLDDISK22[11017]: Zeroing: dd - wrote 8001563222016 of 8001563222016.
Jan 13 03:43:17 MassEffect preclear_disk_OLDDISK22[11017]: Zeroing: dd exit code - 0
Jan 13 03:43:20 MassEffect preclear_disk_OLDDISK22[11017]: Writing signature:    0   0   2   0   0 255 255 255   1   0   0   0 255 255 255 255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: Failed test 1: MBR signature is not valid, byte 3 [00000] != [00170]
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: array 'sectors'
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 0 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 1 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 2 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 3 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 4 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 5 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 6 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 7 -> 00002
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 8 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 9 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 10 -> 00255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 11 -> 00255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 12 -> 00255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 13 -> 00001
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 14 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 15 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 16 -> 00000
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 17 -> 00255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 18 -> 00255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 19 -> 00255
Jan 13 03:43:21 MassEffect preclear_disk_OLDDISK22[11017]: 20 -> 00255
Jan 13 03:43:24 MassEffect preclear_disk_OLDDISK22[11017]: error encountered, exiting...
Jan 13 03:43:24 MassEffect preclear_disk_OLDDISK22[11017]: cat: 15853: No such file or directory
Jan 13 03:50:17 MassEffect kernel: mpt3sas_cm1: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
Jan 13 03:50:19 MassEffect kernel: scsi 12:0:14:0: Direct-Access     ATA      ST8000DM004-2CX1 0001 PQ: 0 ANSI: 6
Jan 13 03:50:19 MassEffect kernel: scsi 12:0:14:0: SATA: handle(0x0026), sas_addr(0x300062b203fe85d7), phy(23), device_name(0x0000000000000000)
Jan 13 03:50:19 MassEffect kernel: scsi 12:0:14:0: enclosure logical id (0x500062b203fe85c0), slot(13) 
Jan 13 03:50:19 MassEffect kernel: scsi 12:0:14:0: enclosure level(0x0000), connector name(     )
Jan 13 03:50:19 MassEffect kernel: scsi 12:0:14:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: Power-on or device reset occurred
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: Attached scsi generic sg30 type 0
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: [sdae] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: [sdae] 4096-byte physical blocks
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: [sdae] Write Protect is off
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: [sdae] Mode Sense: 9b 00 10 08
Jan 13 03:50:19 MassEffect kernel: sd 12:0:14:0: [sdae] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 13 03:50:20 MassEffect rc.diskinfo[15953]: SIGHUP received, forcing refresh of disks info.
Jan 13 03:50:20 MassEffect kernel: sd 12:0:14:0: [sdae] Attached SCSI disk
Jan 13 04:26:48 MassEffect rc.diskinfo[15953]: SIGHUP received, forcing refresh of disks info.
Jan 13 04:26:48 MassEffect rc.diskinfo[15953]: SIGHUP ignored - already refreshing disk info.
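From what I can tell, that "MBR signature" test works roughly like this: after zeroing, the preclear script writes a special signature into the disk's first sector (a single partition entry starting at sector 1, or sector 64 on 4K-aligned disks) and then reads it back to verify it stuck. In the log above, the check expected a marker byte of 170 (0xAA) and read back 0, so the signature it had just written did not survive the read-back, which fits the disk dropping offline moments later (note the mpt3sas reset and re-attach that follows). Below is a rough, hypothetical Python sketch of that kind of byte-level check, not the actual preclear code:

```python
# Hypothetical sketch of a preclear-style MBR signature check.
# A precleared disk's first sector is all zeros except for one
# partition table entry; this illustrates the idea only and is
# not the real preclear script.

def looks_precleared(sector0: bytes) -> bool:
    """Rough check of a 512-byte MBR against a preclear-style signature."""
    if len(sector0) != 512:
        return False
    # Bytes 0-445 (the boot code area) must be zeroed
    if any(sector0[:446]):
        return False
    entry = sector0[446:462]  # first 16-byte partition table entry
    lba_start = int.from_bytes(entry[8:12], "little")
    # Preclear starts the partition at sector 1 (or 64 on 4K-aligned disks)
    return lba_start in (1, 64)
```

If the disk silently drops off the bus mid-write, the read-back comes up as zeros and a check like this fails in exactly the way the log above shows.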


After searching the forums, I noticed some others having issues with the November 2019 version of the PreClear plugin. I just updated to the latest version and will rerun both drives to see if I receive similar results.

 

Even so, I'm still at a loss as to why the disk unmounted and power-cycled.

 

More to come...

55 minutes ago, trurl said:

If you have some suggestions for further anonymizing diagnostics let us know

@trurl Will do. I might be on the more security-conscious side, so my diagnostic anonymizing might be overkill for home users, but it would be really cool to see an extra check box that incorporates further sanitization for business purposes or "extremely paranoid" 🥺 people. Ha ha. 😂

 

One of my main concerns is suppressing network info, file paths like those found in "ps.txt", and drive serial numbers in all diagnostic files. I've been scripting changes of drive SNs to something more generic like "DRIVE22". Maybe we are just paranoid, but with business, you can never be too careful.
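The SN scripting I mentioned boils down to something like this (a rough sketch; the serials and aliases below are made up for illustration):

```python
def scrub_serials(text: str, serial_map: dict) -> str:
    """Replace every known drive serial number with a generic alias."""
    for serial, alias in serial_map.items():
        text = text.replace(serial, alias)
    return text

# Example mapping only; these serials are invented, not real drives
SERIAL_MAP = {
    "ZCH0XXXX": "DRIVE04",
    "ZCH0YYYY": "DRIVE22",
}
```

Point it at each text file in the diagnostics zip before posting; the same idea extends to hostnames and IPs with a few more entries in the map.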

 

Where should I post such requests? And thanks for even entertaining this thought. I'm happy to sanitize to my needs manually. I just know for us, we will never post a full diag on the forum for the above and other various security reasons.

