mdumont1

  1. Yes, I'm just letting the parity2 build complete since there's no point stopping it until I know it will be bad. There's about 8 hours left on the parity2 build/sync, with only the data past 8TB left to construct. If there's no reason not to trust the parity2 sync currently running then I'll rebuild disk3 on the old parity drive and then test the disabled disk3 to see if it was just a connection issue. Thanks.
  2. The new parity2 drive is shucked from a WD easystore. The old parity2 was placed in the easystore enclosure and connected to the system via USB3. My plan was to preclear that drive before adding it back to the array since it previously had errors. The SMART report for the old parity2 is available in the previously attached diagnostics zip under the WD_easystore name. I have attached it directly for convenience. WD_easystore_264D_5A38343051325A34-0-0-20210327-1824 (sdo).txt
  3. Attaching diagnostics zip. The syslog attached to the original post is from this zip file. The system has not been rebooted yet since the drive failure. tower-diagnostics-20210327-1824.zip
  4. The other day I replaced my parity2 drive with a new 14TB drive. During the parity-sync process one of my data drives had write errors and was disabled. The disabled drive is 8TB and the parity1 drive is 8TB. The parity-sync process was less than 8TB in when the drive failed. From what I can tell in the syslog, the failure occurred during the nightly mover process. I'm looking for some guidance on the best way to rebuild the failed drive and not lose any data. Some questions: 1. The parity-sync continued after the drive failed and is ongoing. Since the parity1 drive was valid just before the swap of parity2, I'm assuming the disk that is disabled is being emulated to construct parity2. Is this assumption correct? Can I trust this parity-sync? 2. If the parity-sync result can be trusted, is it reasonable to get back to a good state by rebuilding the disabled disk onto the old parity2 drive? syslog.txt
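To make the "emulated disk" idea concrete, here is a toy Python sketch of single-parity reconstruction. It assumes parity1 is a plain XOR across the data disks, which is how single parity is commonly described; it is not Unraid's actual code, the disk contents are made up, and the parity2 calculation is different and not shown.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte stripe (made-up data).
disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# Single parity (assumed to be plain XOR) computed while all disks were healthy.
parity = xor_blocks(disks)

# Pretend the disk at index 1 has been disabled: its contents are "emulated"
# from the surviving data disks plus parity.
survivors = [d for i, d in enumerate(disks) if i != 1]
emulated = xor_blocks(survivors + [parity])
```

As long as parity1 and every other data disk read back correctly, `emulated` equals the contents of the disabled disk, which is why a valid parity1 matters for whatever is computed from the emulated disk afterwards.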
  5. What's the best way to find out which files are unreadable before doing the copy, so that I can skip them?
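One possible approach, sketched in Python: try to read every file end to end and record the ones that raise an I/O error. The function name `find_unreadable` and the mount point used in the usage note are assumptions for illustration, and a full read pass puts extra load on a drive that is already failing.

```python
import os

def find_unreadable(root, chunk_size=1024 * 1024):
    """Return (path, error) pairs for files under root that cannot be read in full."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read the whole file in chunks; the data itself is discarded.
                    while fh.read(chunk_size):
                        pass
            except OSError as exc:
                # Any I/O failure (for example a read error from a bad sector) lands here.
                bad.append((path, str(exc)))
    return bad
```

Calling `find_unreadable("/mnt/disk9")` (assuming the disk is mounted there) would return the list of files to skip during the copy.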
  6. Does the new disk1 need to be exactly the same as the old disk1? Will keeping any newly added files on the disk affect the rebuild of disk9? I can probably use the old 4TB parity drive for this.
  7. Does the order matter? Can I copy the data from disk9 to disk1 (now 8TB) and then copy the data from old disk1 to new disk1? I'm assuming the parity drive won't help me at all at this point?
  8. The rebuild did complete and I suspect that ~1TB of the rebuild completed before the drive failed. Unfortunately, I don't have a backup. At this point I'm trying to find the optimal way to maximize the data that can be recovered. If I were to copy the files from old disk1 onto rebuilt disk1 and then rebuild disk9, is there any chance that would work? I could then try and copy files off of the bad disk9 but at this point I don't really trust it. Here is the smart report from the old disk1 gathered with CrystalDiskInfo on my Win7 machine connected with a sata-to-usb adapter:

     ----------------------------------------------------------------------------
      (4) WDC WD20EARX-00PASB0
     ----------------------------------------------------------------------------
            Enclosure : Seagate USB USB Device (V=0BC2, P=A0A4, sa1) - wd
                Model : WDC WD20EARX-00PASB0
             Firmware : 51.0AB51
        Serial Number : WD-WMAZA8438694
            Disk Size : 2000.3 GB (8.4/137.4/2000.3/2000.3)
          Buffer Size : Unknown
          Queue Depth : 32
         # of Sectors : 3907029168
        Rotation Rate : Unknown
            Interface : USB (Serial ATA)
        Major Version : ATA8-ACS
        Minor Version : ----
        Transfer Mode : SATA/600
       Power On Hours : 40369 hours
       Power On Count : 218 count
          Temparature : 32 C (89 F)
        Health Status : Good
             Features : S.M.A.R.T., 48bit LBA, NCQ
            APM Level : ----
            AAM Level : ----

     -- S.M.A.R.T. --------------------------------------------------------------
     ID Cur Wor Thr RawValues(6) Attribute Name
     01 200 200 _51 000000000001 Read Error Rate
     03 170 166 _21 000000001942 Spin-Up Time
     04 _92 _92 __0 00000000203A Start/Stop Count
     05 200 200 140 000000000000 Reallocated Sectors Count
     07 100 253 __0 000000000000 Seek Error Rate
     09 _45 _45 __0 000000009DB1 Power-On Hours
     0A 100 100 __0 000000000000 Spin Retry Count
     0B 100 100 __0 000000000000 Recalibration Retries
     0C 100 100 __0 0000000000DA Power Cycle Count
     C0 200 200 __0 000000000045 Power-off Retract Count
     C1 _10 _10 __0 00000008B9F6 Load/Unload Cycle Count
     C2 118 103 __0 000000000020 Temperature
     C4 200 200 __0 000000000000 Reallocation Event Count
     C5 200 200 __0 000000000000 Current Pending Sector Count
     C6 200 200 __0 000000000000 Uncorrectable Sector Count
     C7 200 200 __0 000000000002 UltraDMA CRC Error Count
     C8 200 200 __0 000000000000 Write Error Rate
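The RawValues column in the report above is hexadecimal, which makes it hard to read at a glance. A small Python sketch that converts a few of those values to decimal is shown here. Treating each raw value as a single integer is an assumption that happens to match the header fields of this report (Power On Hours, Power On Count, temperature); some drives pack several fields into one raw value.

```python
# Raw values copied from the S.M.A.R.T. table above (hexadecimal strings).
raw_values = {
    "Power-On Hours": "000000009DB1",
    "Power Cycle Count": "0000000000DA",
    "Load/Unload Cycle Count": "00000008B9F6",
    "Temperature": "000000000020",
    "UltraDMA CRC Error Count": "000000000002",
}

# Interpret each raw value as one base-16 integer (an assumption, see above).
decoded = {name: int(raw, 16) for name, raw in raw_values.items()}

for name, value in decoded.items():
    print(f"{name}: {value}")
```

The decoded Power-On Hours (40369), Power Cycle Count (218) and Temperature (32) agree with the summary fields at the top of the report.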
  9. Sorry about that, you're right it's actually disk1 that I replaced and not disk2. I got confused and was thinking that disk1 was the parity disk. So, in my previous posts, when I say disk2 I mean disk1. Attached is the latest diagnostics. tower-diagnostics-20170113-2148.zip
  10. The checksum idea sounds like a good one, I'll use md5deep for that. Is it possible to replace any bad files with the good ones from old disk2 and then rebuild disk9 as-is? When parity was built there were no SMART errors from disk9, so is it possible that the existing parity will work? The mover emptied the cache drive to the new disk2 during the rebuild, so this sounds out of the question. I thought I did have notifications set up; I started receiving emails of the SMART failures about 6 hours into the rebuild. Are there other notifications that I'm unaware of that I should be enabling? I have emails enabled for all notification levels: status, notices, warnings, and alerts. I have attached the complete diagnostics zip (minus some mover entries from the syslog), can you take a look to see if I have other drive issues? Thanks. tower-diagnostics-20170112-1918.zip
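For illustration, here is roughly what the checksum comparison amounts to, written as a Python sketch rather than as md5deep itself. The function names and the two directory arguments are hypothetical; the idea is to hash every file on the old disk and compare it with the file at the same relative path on the rebuilt disk.

```python
import hashlib
import os

def md5_of(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(old_root, new_root):
    """Map relative path -> problem for files under old_root that are
    missing, different, or unreadable under new_root."""
    problems = {}
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            rel = os.path.relpath(old_path, old_root)
            new_path = os.path.join(new_root, rel)
            if not os.path.isfile(new_path):
                problems[rel] = "missing"
                continue
            try:
                if md5_of(old_path) != md5_of(new_path):
                    problems[rel] = "mismatch"
            except OSError as exc:
                # A read error on either copy is reported instead of aborting the run.
                problems[rel] = f"unreadable: {exc}"
    return problems
```

Files reported as "mismatch" or "unreadable" would be the candidates to replace with the copies from the old disk. Files that exist only on the rebuilt disk (for example anything the mover wrote during the rebuild) are not examined by this sketch.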
  11. Hello, Here's my situation. I have 2 new 8TB drives that have been precleared and I'm using them to upgrade my system. I have done the following:

      1. Did a parity check before changing any drives.
      2. Replaced existing 4TB parity drive with new 8TB drive and did a rebuild.
      3. Did another parity check which resulted in no errors.
      4. Replaced disk2 (2TB) with the other 8TB drive and started the rebuild.

      After about 6 hours of rebuilding, the rebuild rate dropped to 100KB/s and I started getting the following errors in the syslog:

      Jan 12 02:01:31 Tower kernel: sd 4:0:4:0: [sdl] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
      Jan 12 02:01:31 Tower kernel: sd 4:0:4:0: [sdl] tag#0 Sense Key : 0xb [current]
      Jan 12 02:01:31 Tower kernel: sd 4:0:4:0: [sdl] tag#0 ASC=0x47 ASCQ=0x3
      Jan 12 02:01:31 Tower kernel: sd 4:0:4:0: [sdl] tag#0 CDB: opcode=0x28 28 00 5e 64 4e a0 00 00 20 00
      Jan 12 02:01:31 Tower kernel: blk_update_request: I/O error, dev sdl, sector 1583632032
      Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631968
      Jan 12 02:01:31 Tower kernel: Buffer I/O error on dev md9, logical block 197953996, lost async page write
      Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631976
      Jan 12 02:01:31 Tower kernel: Buffer I/O error on dev md9, logical block 197953997, lost async page write
      Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631984
      Jan 12 02:01:31 Tower kernel: Buffer I/O error on dev md9, logical block 197953998, lost async page write
      Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631992
      Jan 12 02:01:31 Tower kernel: Buffer I/O error on dev md9, logical block 197953999, lost async page write

      I have the existing 4TB and 2TB drives intact and there have been writes to the array since the parity drive upgrade. The drive that is being rebuilt was full (less than 50MB free) and the drive with the failure is 2TB and has ~36GB free. Also, the error count showing up in the WebUI for the failing drive is less than 50 and the errors shown in the syslog seem to be the same blocks over and over again. The rebuild has made it past the size of the failing drive and is continuing at a normal rate. I'm only using single parity with Unraid 6.2.1. Attached is the syslog and smart report for the failing drive. I understand that this failing drive is likely toast but what are the various options I have to recover my data? TIA unraid_failure_2017-01-12.zip
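To check the impression that the same blocks keep failing, the syslog can be tallied per sector. The Python sketch below works on a short excerpt of the lines quoted in the post; the regular expression and the function name are my own. Mapping a sector to a logical block by dividing by 8 assumes 512-byte sectors and 4 KiB blocks, which is consistent with the sector and block numbers in the excerpt.

```python
import re
from collections import Counter

# Excerpt of the syslog lines quoted in the post.
SYSLOG_EXCERPT = """\
Jan 12 02:01:31 Tower kernel: blk_update_request: I/O error, dev sdl, sector 1583632032
Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631968
Jan 12 02:01:31 Tower kernel: Buffer I/O error on dev md9, logical block 197953996, lost async page write
Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631976
Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631984
Jan 12 02:01:31 Tower kernel: md: disk9 read error, sector=1583631992
"""

READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")

def count_read_errors(text):
    """Count md read errors per (disk number, sector)."""
    return Counter((int(disk), int(sector)) for disk, sector in READ_ERROR.findall(text))

counts = count_read_errors(SYSLOG_EXCERPT)

# Assumed mapping: 8 sectors of 512 bytes per 4 KiB logical block.
blocks = sorted({sector // 8 for (_disk, sector) in counts})
```

Run against the full syslog instead of the excerpt, any sector with a count above 1 is one that was retried, and a short `blocks` list relative to the total error count would support the "same blocks over and over" reading.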
  12. Check out this thread: http://lime-technology.com/forum/index.php?topic=45249.0 It seems multiple dockers have the logging problem. Another user had a similar issue with the BTSync docker and was able to slow down the log growth with an extra BTSync option.
  13. This solution works great, Thanks! I was able to clear up 11GB of space by reinstalling the sonarr docker. I wonder if these log files can be removed directly without reinstalling the docker? If so, a cron job to remove the files would keep things fairly clean.
  14. I have this 1TB Seagate Drive. I backed up my data, and some of it was already corrupt so I lost some of it, and then ran the preclear script to see if the drive is failing. I ran it the first time and there seem to be some strange numbers. I'm a little concerned about the "a change of -509 in the number of sectors pending re-allocation". I have attached the results from the first run. After getting this not-so-happy result I decided to run the script again, but this time with 3 cycles. I have attached the results from the second run. I'm not quite sure how to understand these results. It seems like the drive is continuously having a different number of sectors pending reallocation yet it always ends with During the preclear there were read errors that were displayed in syslog. Also, this drive was previously being used in a cheap external enclosure before being moved to my unRaid box. Is this drive failing? Should I replace it? If so, how can I tell based on the preclear results? Thanks Seagate-preclear-2ndRun.txt Seagate-preclear-1stRun.txt