dvaldez

Members
  • Content Count

    17

Community Reputation

0 Neutral

About dvaldez

  • Rank
    Member
  1. Is there any way to re-enable the disk or rebuild onto the same disk, or do I need to insert another disk? I tried changing ports and cables on this disk and it still shows as disabled.
  2. The sequence of events from my perspective (this is what I saw; I may have missed something):
       1. I get an email alert from my Unraid server.
       2. I log into Unraid and see that the disk has been disabled.
       3. I shut down the array and the server, because I am going on a long trip and don't want anything to happen while I'm away.
       4. I power the server back up and take this diagnostic.
     I have gotten 'read errors' before, and the disk was automatically disabled that time as well.
  3. Hey, I have 'read errors' on one of my disks again, but it passes the SMART check. I've read that sometimes this is just a cable issue or something similar. I plan on moving to new server hardware ASAP (I have it all ready). Should I try to resolve this read error and get all of the disks green again first, or should I move the disks into the new hardware and do a 'new config' or a rebuild? If I should resolve the issue first, what is the recommended course of action? Last time I had this issue, another one of my disks became unreadable and then unrecognized, so maybe I did not proceed correctly. Thanks. The diagnostic file is attached: mainstore-diagnostics-20190831-2011.zip
  4. Finally, a positive update: I ran the hdparm -N command again on the 2 unmountable disks, as well as on the parity disk that was "not the largest disk in the array", this time while they were plugged into a DIFFERENT storage controller, and it succeeded. I then did a new config with the 4 data disks in their proper positions (2 of the disks had been showing 'wrong') and was able to see all of the data. After that I assigned the parity disk, and it seems to have been recognized, as it DIDN'T show the message "all data on this disk will be erased when array started". Parity was showing as invalid (as expected, given the 18 hours of downtime on disk 3), and it is now running a parity sync/data rebuild.
  5. How do I use the UD (Unassigned Devices) plugin to copy the data off the 'unmountable' disks? I have it installed, but if I unassign the 2 'unmountable' disks, I cannot start the array, even in maintenance mode.
  6. Here are the outputs of hdparm -I for both of the unmountable disks:

     root@mainstore:~# hdparm -I /dev/sde

     /dev/sde:

     ATA device, with non-removable media
         Model Number:       WDC WD40EFRX-68N32N0
         Serial Number:      WD-WCC7K5JCVK5E
         Firmware Revision:  82.00A82
         Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
     Standards:
         Used: unknown (minor revision code 0x006d)
         Supported: 10 9 8 7 6 5
         Likely used: 10
     Configuration:
         Logical         max     current
         cylinders       16383   0
         heads           16      0
         sectors/track   63      0
         --
         LBA    user addressable sectors:   268435455
         LBA48  user addressable sectors:  7814035055
         Logical  Sector size:                   512 bytes
         Physical Sector size:                  4096 bytes
         Logical Sector-0 offset:                  0 bytes
         device size with M = 1024*1024:     3815446 MBytes
         device size with M = 1000*1000:     4000785 MBytes (4000 GB)
         cache/buffer size  = unknown
         Form Factor: 3.5 inch
         Nominal Media Rotation Rate: 5400
     Capabilities:
         LBA, IORDY(can be disabled)
         Queue depth: 32
         Standby timer values: spec'd by Standard, with device specific minimum
         R/W multiple sector transfer: Max = 16  Current = 16
         DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
              Cycle time: min=120ns recommended=120ns
         PIO: pio0 pio1 pio2 pio3 pio4
              Cycle time: no flow control=120ns  IORDY flow control=120ns
     Commands/features:
         Enabled Supported:
            *    SMART feature set
                 Security Mode feature set
            *    Power Management feature set
            *    Write cache
            *    Look-ahead
            *    Host Protected Area feature set
            *    WRITE_BUFFER command
            *    READ_BUFFER command
            *    NOP cmd
            *    DOWNLOAD_MICROCODE
                 Power-Up In Standby feature set
            *    SET_FEATURES required to spinup after power up
                 SET_MAX security extension
            *    48-bit Address feature set
            *    Device Configuration Overlay feature set
            *    Mandatory FLUSH_CACHE
            *    FLUSH_CACHE_EXT
            *    SMART error logging
            *    SMART self-test
            *    General Purpose Logging feature set
            *    64-bit World wide name
            *    IDLE_IMMEDIATE with UNLOAD
            *    WRITE_UNCORRECTABLE_EXT command
            *    {READ,WRITE}_DMA_EXT_GPL commands
            *    Segmented DOWNLOAD_MICROCODE
            *    Gen1 signaling speed (1.5Gb/s)
            *    Gen2 signaling speed (3.0Gb/s)
            *    Gen3 signaling speed (6.0Gb/s)
            *    Native Command Queueing (NCQ)
            *    Host-initiated interface power management
            *    Phy event counters
            *    Idle-Unload when NCQ is active
            *    NCQ priority information
            *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
                 DMA Setup Auto-Activate optimization
                 Device-initiated interface power management
            *    Software settings preservation
            *    SMART Command Transport (SCT) feature set
            *    SCT Write Same (AC2)
            *    SCT Error Recovery Control (AC3)
            *    SCT Features Control (AC4)
            *    SCT Data Tables (AC5)
                 unknown 206[12] (vendor specific)
                 unknown 206[13] (vendor specific)
            *    DOWNLOAD MICROCODE DMA command
            *    WRITE BUFFER DMA command
            *    READ BUFFER DMA command
     Security:
         Master password revision code = 65534
                 supported
         not     enabled
         not     locked
         not     frozen
         not     expired: security count
                 supported: enhanced erase
         480min for SECURITY ERASE UNIT. 480min for ENHANCED SECURITY ERASE UNIT.
     Logical Unit WWN Device Identifier: 50014ee21119cfe8
         NAA             : 5
         IEEE OUI        : 0014ee
         Unique ID       : 21119cfe8
     Checksum: correct

     root@mainstore:~# hdparm -I /dev/sdd

     /dev/sdd:

     ATA device, with non-removable media
         Model Number:       WDC WD40EFRX-68N32N0
         Serial Number:      WD-WCC7K5XR9FEE
         Firmware Revision:  82.00A82
         Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
     Standards:
         Used: unknown (minor revision code 0x006d)
         Supported: 10 9 8 7 6 5
         Likely used: 10
     Configuration:
         Logical         max     current
         cylinders       16383   0
         heads           16      0
         sectors/track   63      0
         --
         LBA    user addressable sectors:   268435455
         LBA48  user addressable sectors:  7814035055
         Logical  Sector size:                   512 bytes
         Physical Sector size:                  4096 bytes
         Logical Sector-0 offset:                  0 bytes
         device size with M = 1024*1024:     3815446 MBytes
         device size with M = 1000*1000:     4000785 MBytes (4000 GB)
         cache/buffer size  = unknown
         Form Factor: 3.5 inch
         Nominal Media Rotation Rate: 5400
     Capabilities:
         LBA, IORDY(can be disabled)
         Queue depth: 32
         Standby timer values: spec'd by Standard, with device specific minimum
         R/W multiple sector transfer: Max = 16  Current = 16
         DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
              Cycle time: min=120ns recommended=120ns
         PIO: pio0 pio1 pio2 pio3 pio4
              Cycle time: no flow control=120ns  IORDY flow control=120ns
     Commands/features:
         Enabled Supported:
            *    SMART feature set
                 Security Mode feature set
            *    Power Management feature set
            *    Write cache
            *    Look-ahead
            *    Host Protected Area feature set
            *    WRITE_BUFFER command
            *    READ_BUFFER command
            *    NOP cmd
            *    DOWNLOAD_MICROCODE
                 Power-Up In Standby feature set
            *    SET_FEATURES required to spinup after power up
                 SET_MAX security extension
            *    48-bit Address feature set
            *    Device Configuration Overlay feature set
            *    Mandatory FLUSH_CACHE
            *    FLUSH_CACHE_EXT
            *    SMART error logging
            *    SMART self-test
            *    General Purpose Logging feature set
            *    64-bit World wide name
            *    IDLE_IMMEDIATE with UNLOAD
            *    WRITE_UNCORRECTABLE_EXT command
            *    {READ,WRITE}_DMA_EXT_GPL commands
            *    Segmented DOWNLOAD_MICROCODE
            *    Gen1 signaling speed (1.5Gb/s)
            *    Gen2 signaling speed (3.0Gb/s)
            *    Gen3 signaling speed (6.0Gb/s)
            *    Native Command Queueing (NCQ)
            *    Host-initiated interface power management
            *    Phy event counters
            *    Idle-Unload when NCQ is active
            *    NCQ priority information
            *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
                 DMA Setup Auto-Activate optimization
                 Device-initiated interface power management
            *    Software settings preservation
            *    SMART Command Transport (SCT) feature set
            *    SCT Write Same (AC2)
            *    SCT Error Recovery Control (AC3)
            *    SCT Features Control (AC4)
            *    SCT Data Tables (AC5)
                 unknown 206[12] (vendor specific)
                 unknown 206[13] (vendor specific)
            *    DOWNLOAD MICROCODE DMA command
            *    WRITE BUFFER DMA command
            *    READ BUFFER DMA command
     Security:
         Master password revision code = 65534
                 supported
         not     enabled
         not     locked
         not     frozen
         not     expired: security count
                 supported: enhanced erase
         490min for SECURITY ERASE UNIT. 490min for ENHANCED SECURITY ERASE UNIT.
     Logical Unit WWN Device Identifier: 50014ee2bb3783a5
         NAA             : 5
         IEEE OUI        : 0014ee
         Unique ID       : 2bb3783a5
     Checksum: correct
     root@mainstore:~#
  7. So I went for an xfs_repair on disk 3, which (if I remember correctly) is the disk with the least amount of data:

     root@mainstore:~# xfs_repair -v /dev/md3
     Phase 1 - find and verify superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     couldn't verify primary superblock - attempted to perform I/O beyond EOF !!!

     attempting to find secondary superblock...
     .found candidate secondary superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     unable to verify superblock, continuing...
     .found candidate secondary superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     unable to verify superblock, continuing...
     .found candidate secondary superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     unable to verify superblock, continuing...
     ..found candidate secondary superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     unable to verify superblock, continuing...
     .found candidate secondary superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     unable to verify superblock, continuing...
     .found candidate secondary superblock...
     error reading superblock 4 -- seek to offset 4000785907712 failed
     unable to verify superblock, continuing...
     ......................................................................................................................................

     The dots have kept coming for at least 40 minutes and are still going. Should I cancel at this point? There hasn't been any other output.
  8. Should I try the xfs_repair method on the 2 disks?
  9. I have a 6TB drive coming in the mail to use as a replacement parity disk, but in the meantime I tried to start the array without parity, using just the remaining 4 disks. At this point the last 2 disks are unmountable. I didn't make any changes to disk 3, but I previously tried to remove the HPA on disk 4, and that attempt failed. How can I fix this? I have attached an updated diagnostic in case it matters: mainstore-diagnostics-20190726-0225.zip
  10. Well, luckily I was never able to properly remove the HPA on the 'read errors' data disk, so it should be in the same state it was in when the read errors occurred, unless the HPA was changed automatically afterwards...
  11. OK, I tried the HDAT2 method. When I used the "set max" command it gave an error; I think my motherboard may not support these commands. I then tried the "Auto set max" command, and it said it was successful. I booted back into Unraid and assigned all the disks to their proper slots, and it still says the parity disk is not the biggest disk. I know I ran HDAT2 against the correct disk because I disconnected all of the other disks to be sure. At this point, if possible, I'll just buy a larger disk (6TB or 8TB) and put it in the parity slot. Will this be OK? If the data on the "read errors" disk was never corrupted or damaged, and it was in fact just a cable issue, will the disk retain all of the data that was on it? What should I do?
  12. I put the BIOS back to the original config and did a New Config. Now I have all of the original disks assigned to their original slots, but I cannot start the array because "The parity drive is not the biggest". I tried to remove the HPA using hdparm, but I am getting an error:

     root@mainstore:~# hdparm -N p7814037168 /dev/sdb

     /dev/sdb:
      setting max visible sectors to 7814037168 (permanent)
     SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 10 51 40 01 21 00 00 00 a0 af 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      max sectors   = 7814035055/7814037168, HPA is enabled

     I guess the next step is to try the 2nd HPA removal method. If I can say that all of the data disks are good and have the data that I want, is it possible to just buy a larger disk (6TB or so) and put that in the parity slot instead? Would I retain all of the data? How could I check whether the original "read errors" disk is good? I definitely was not doing any rebuilding when it failed, but there may have been a parity check running. I have not written any new data to the array to my knowledge, but I do have a few things that automatically mount the array via NFS, so I'm not completely sure.
  13. Disk 3 should be fine, but it was out of the array for at least 18 hours after the 'read errors' started, and I wrote backups to the array during that time. Is it still OK to do a New Config with it? Also, yes, HPA was originally enabled; I didn't know. My motherboard has 2 options for this in the BIOS, HPA or BIOS Backup, and it seems I cannot disable the feature entirely. I changed it from HPA to BIOS Backup.
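
As a cross-check of the figures quoted in posts 6 and 12 above, the sector counts can be turned into sizes with a little arithmetic. This is only a rough sketch: the function names are made up for illustration, and it assumes the 512-byte logical sector size that hdparm -I reports in post 6. It parses the "max sectors = visible/native" line from the hdparm -N output and works out how many sectors the HPA is hiding.

```python
import re

# Assumption: 512-byte logical sectors, as reported by hdparm -I in post 6.
SECTOR_BYTES = 512


def parse_max_sectors(line):
    """Return (visible, native) sector counts from an hdparm -N status line."""
    match = re.search(r"max sectors\s*=\s*(\d+)/(\d+)", line)
    if match is None:
        raise ValueError("no 'max sectors = a/b' field found")
    return int(match.group(1)), int(match.group(2))


def hpa_summary(line):
    """Summarise how much of the disk is hidden behind the HPA."""
    visible, native = parse_max_sectors(line)
    hidden = native - visible
    return {
        "visible_sectors": visible,
        "native_sectors": native,
        "hidden_sectors": hidden,
        "hidden_bytes": hidden * SECTOR_BYTES,
        "visible_bytes": visible * SECTOR_BYTES,
    }


# Status line copied from the hdparm -N output in post 12.
line = "max sectors = 7814035055/7814037168, HPA is enabled"
info = hpa_summary(line)

print("hidden sectors:", info["hidden_sectors"])
print("hidden bytes:  ", info["hidden_bytes"])
# These two values should match the "device size" lines in post 6.
print("visible MB (M = 1000*1000):", info["visible_bytes"] // (1000 * 1000))
print("visible MB (M = 1024*1024):", info["visible_bytes"] // (1024 * 1024))
```

With the numbers from the posts, the HPA hides 2113 sectors (a little over 1 MB), and the visible size agrees with the 4000785 MBytes and 3815446 MBytes that hdparm -I prints. A difference that small is still enough for an affected disk to report a different size from an otherwise identical disk without an HPA.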