apandey

Members
  • Posts: 461
  • Joined
  • Last visited

apandey's Achievements

Enthusiast (6/14)
Reputation: 54

Community Answers (41)

  1. One other thing you should know is that Unraid recreates these networks on reboot, so the network ID will change and break the compose-managed containers unless you have the “recreate containers on startup” setting enabled. If not, you will need to run compose down/up after each reboot (see the sketch after this list).
  2. The pool was created pre-6.12, using the ZFS plugin; I don't remember the exact steps for how I imported it when 6.12 came out. Anyway, I replaced the drive, resilvered, and then reimported as per your instructions, this time in the same order as zdb. All good now (see the sketch after this list).
  3. There seems to be some odd behaviour with the slots in the Unraid UI vs the disk order shown by zdb or zpool status. zdb shows the failed disk as the 8th drive, but that's a different order than the slots Unraid assigned. Unraid then seems to be doing operations based on drive index rather than drive IDs (which is why it was trying to target sdo). I'll see how it looks after the resilver and reimport, but if the drives don't line up, this may be an issue in the future too (see the sketch after this list).
  4. I created partitions on sdv using sgdisk and did the replace as above; it's resilvering now (see the sketch after this list).
  5. Same result after the reboot: it is still somehow trying to run wipefs on /dev/sdo. I had 2 slots empty on my server (hot-swap drive bays). What I did: I plugged in the new drive and it showed up under Unassigned Devices. I stopped the array, changed the drive next to zdata 5 to the new drive, rebooted, and started the array. The drive in slot 8 showed a red mark, but I found it showing online in the zpool status command. How should I do a manual replacement? Would it be: zpool replace zdata /dev/sdq /dev/sdv (see the sketch after this list)
  6. The new drive is showing as sdv, so I am not sure why it tried to run wipefs on sdo. sdo is the existing drive from the pool in slot 8 (which is showing a red cross); I did not touch it, yet it is somehow in play. Slot 5 is the new drive, and it is showing as sdv; the drive in slot 5 before the replacement was showing as sdq. I did not touch slot 8, that was and is sdo, and it became red after I started the array after the replace. zpool status is showing this:

         # zpool status -xv
           pool: zdata
          state: DEGRADED
         status: One or more devices could not be used because the label is
                 missing or invalid. Sufficient replicas exist for the pool to
                 continue functioning in a degraded state.
         action: Replace the device using 'zpool replace'.
            see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
           scan: scrub repaired 0B in 02:36:08 with 0 errors on Tue Oct 3 09:36:09 2023
         config:

                 NAME                      STATE     READ WRITE CKSUM
                 zdata                     DEGRADED     0     0     0
                   raidz2-0                DEGRADED     0     0     0
                     sdk                   ONLINE       0     0     0
                     sds                   ONLINE       0     0     0
                     sdl                   ONLINE       0     0     0
                     sdo                   ONLINE       0     0     0
                     sdg                   ONLINE       0     0     0
                     sdn                   ONLINE       0     0     0
                     sdh                   ONLINE       0     0     0
                     10871034331009088735  UNAVAIL      0     0     0  was /dev/sdq1
                     sdr                   ONLINE       0     0     0

     Notice the ordering difference: the missing drive is in the same position as the red drive in the UI, but sdo is showing elsewhere, in the 4th slot, and sdv is nowhere. I will run wipefs on sdv and see what happens (see the sketch after this list).
  7. One of my drives in a ZFS raidz2 pool started throwing an increasing number of pending sectors, so I decided to replace it. I followed the instructions here: basically stopped the array, changed the bad drive in the dropdown to the new one, and started the array. Unlike the instructions, I don't see the replace happening automatically. Status shows the following:

         status: One or more devices could not be used because the label is
                 missing or invalid. Sufficient replicas exist for the pool to
                 continue functioning in a degraded state.

     I also noticed another issue. The failing drive was zdata5, and that slot now shows the new drive's ID. However, zdata8, a different drive which seems healthy, is showing a red cross next to it with the message "device is disabled, contents emulated". In the ZFS pool status, there is this in the 8th row:

         10871034331009088735  UNAVAIL  0  0  0  was /dev/sdq1

     But sdq was the previous identifier for the failed drive (it shows as unassigned now), and it was in slot 5 before. The drive in slot 8 is sdo, which shows online in the ZFS status. Is this just a UI glitch? Anyway, how should I proceed? Do I need to run the replace command, or has something else gone wrong? Diags attached: godaam-diagnostics-20231011-2146.zip
  8. How do you know the HBA is overheating? I have the same one, and I don't think it has a temperature sensor. If it's just CRC errors, it may just be the cables. SATA cables and connectors are unfortunately among the worst when it comes to PC connectivity, and even a bit of unsettling from vibration or other activity can dislodge or wear them out (see the sketch after this list).
  9. The scrub finished and corrected all verify + csum errors. Thanks for the pointers.
  10. I have switched the affected drive to the other motherboard SATA controller; so far so good. I started a scrub on the tsdb pool, and it's reporting some errors. I did not check the "fix errors" box:

          Scrub started:   Sat Jun 24 11:12:36 2023
          Status:          running
          Duration:        0:13:38
          Time left:       3:06:22
          ETA:             Sat Jun 24 14:32:36 2023
          Total to scrub:  2.48TiB
          Bytes scrubbed:  173.21GiB (6.82%)
          Rate:            216.83MiB/s
          Error summary:   verify=2 csum=120266
            Corrected:     0
            Uncorrectable: 0
            Unverified:    0

      How do I go about fixing the errors? Should I run with the "fix errors" checkbox? Do I need to wait for the initial scrub to finish, or can I just cancel it and do the fix steps (see the sketch after this list)?

      EDIT: started a correcting scrub
  11. Arrrrgggh. Sorry, and thanks for spotting this. I mistakenly moved the other 2TB drive to the LSI controller; no wonder the CRC errors didn't stop. I have a spare port on the motherboard SATA, so I will rewire to that and report back.
  12. OK, I will swap things out once the current parity check finishes. I have also increased the shutdown timeout in disk settings for now to avoid this on the next shutdown. I did move the drive from motherboard SATA to my LSI controller when the CRC errors first showed up, and the drive bays use different cables too. I will also examine the drive-side connectors this time when I swap it; I have one more spare slot where I can try to put the disk. If this continues, is there a way I can downgrade tsdb to a single-drive pool temporarily? I would like to take the disk out and test it outside the system if swapping cables etc. doesn't work (see the sketch after this list).
  13. Nice, I didn't know a diag is automatically saved in such a case. Attached is the one from the shutdown; it seems the troubled btrfs pool drive had problems unmounting. Diags attached: godaam-diagnostics-20230623-1255.zip

      I am still not clear why this should mark the array dirty; it's a bit scary if issues with a cache pool can affect array operations. So what should my next steps be? I am not very familiar with btrfs recovery. I ran a scrub before the reboot and it seemed OK, and I'm not sure how to see the same issue that the logs show. If the disk needs replacing, how do I replace it (see the sketch after this list)?
  14. I was getting CRC count error warnings for one of my disks (sdi in the attached diagnostics), so I was moving it around to different drive bays (so as to use different SATA ports and cables), but somehow it's always the same disk that shows CRC errors. Nothing was wrong with data / filesystems during this, except for the latest reboot. After the reboot, a parity check started automatically.

      1. Can I find out what caused this? I assume an unclean shutdown or a timeout on something being exceeded, but I can't see what.
      2. The troubled disk was in a btrfs pool of 2 HDDs. Why did the array get affected?
      3. Is there anything else I am missing? Is there another underlying issue?

      I upgraded from 6.11.5 to 6.12.1 yesterday, and it went smoothly, including importing a ZFS pool. I don't believe this is the culprit, but it's still worth a mention. The log is full of disk errors, but I cannot pinpoint which disk; I am assuming sdi based on the creeping CRC errors (see the sketch after this list). Diagnostics attached: godaam-diagnostics-20230623-1312.zip
  15. The specifications say:

          Interface: SATA to SATA, SAS to SAS
          Data Connector: SATA / SAS 6G Data Port x 5
          https://www.sg-norco.com/pdfs/SS-500_5_bay_Hot_Swap_Module.pdf

      It says 6G, but I doubt it matters; see below. The cable is simply pass-through. The SAS controller decides which protocol to speak to the drives (SATA vs SAS) depending on what is connected. Since a SATA controller cannot speak the SAS protocol, SAS drives have an extra plastic piece on the connector that blocks a SATA connector from being plugged in. Beyond that, the cables don't matter as long as they are rated for the throughput.

      Ideally, you would want a SAS-to-SAS breakout cable. Practically, SAS backplanes would be SAS to SAS without any breakout involved. The Norco cage, it seems, will take both connector types in its ports and simply pass them through, so it really just depends on the controller. In theory, this allows physically connecting a simple SATA controller to SAS drives, but that will not work, since the controller will fail to talk to any connected SAS drive. A SAS controller should work fine. Note that it is not recommended to mix SATA and SAS on the same controller / backplane.

      Coming to practical implications, I think you are worrying about nothing. None of the drives you have can saturate 6G bandwidth, let alone 12G. Those ratings only come into the picture when using SAS expanders, where multiple drives have to share that 12G bandwidth.
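
Command sketches for the answers above

For item 1 (compose-managed containers after a reboot): a minimal sketch of the down/up workaround, assuming a compose project directory under appdata; the path and project name are hypothetical, not taken from the post.

    # Hypothetical project location; substitute your own compose project directory
    cd /mnt/user/appdata/compose/mystack

    # Tear down containers still bound to the stale network ID, then recreate
    # them so they attach to the network Unraid rebuilt at boot
    docker compose down
    docker compose up -d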
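
For item 2 (reimporting the pool after the resilver): a rough sketch of the export/import step, assuming the pool name zdata used elsewhere in these answers; importing via /dev/disk/by-id keeps the member names stable even if the sdX letters shuffle again.

    # Export, then reimport using stable by-id device names
    zpool export zdata
    zpool import -d /dev/disk/by-id zdata

    # Confirm the members line up with what zdb reported
    zpool status zdata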
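
For item 3 (comparing Unraid slot order with the pool's own ordering): a sketch of lining up the two views; note that zdb -C reads the cached pool config and may need -U <cachefile> depending on how the pool was imported.

    # Pool layout with full device paths, in vdev order
    zpool status -P zdata

    # Vdev config (paths and GUIDs) as zdb sees it
    zdb -C zdata

    # Map kernel names (sdX) to stable by-id names
    ls -l /dev/disk/by-id/ | grep -v part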
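
For item 4 (preparing the new disk with sgdisk): a sketch of copying the partition layout from a healthy pool member onto the new disk before the replace; sdo as the source and sdv as the target follow the posts above, but double-check the letters on your own system before running anything destructive.

    # Replicate the partition table of /dev/sdo (source) onto /dev/sdv (target),
    # then randomize the GUIDs so the copy does not clash with the original
    sgdisk --replicate=/dev/sdv /dev/sdo
    sgdisk --randomize-guids /dev/sdv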
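
For item 5 (the manual replacement): roughly what the replace would look like; using the numeric GUID printed by zpool status for the missing member avoids any ambiguity from shuffled sdX letters, and sdv is the new disk as described in the post.

    # Replace the missing member (identified by its GUID) with the new disk
    zpool replace zdata 10871034331009088735 /dev/sdv

    # Watch the resilver
    zpool status -v zdata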
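
For item 6 (running wipefs on the new disk): wipefs without options only lists signatures, which is a safe first step; -a erases them. The device name is the one from the post, but verify it first.

    # List any leftover filesystem or RAID signatures on the new disk
    wipefs /dev/sdv

    # Erase all signatures from the new disk (destructive; check the letter!)
    wipefs -a /dev/sdv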
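
For item 8 (telling cable trouble apart from controller trouble): a quick check of the interface CRC counter; /dev/sdX is a placeholder. SMART attribute 199 (UDMA_CRC_Error_Count) counts link-level errors between drive and controller, so a rising raw value points at cables or connectors rather than the disk surface.

    # Check the CRC error attribute on a suspect drive
    smartctl -A /dev/sdX | grep -i -e crc -e '^199'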
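
For item 10 (the correcting scrub): the command-line equivalent, assuming the pool is mounted at /mnt/tsdb (the mount point is a guess). A scrub started without -r repairs any block it can rebuild from the redundant copy; -r would make it read-only.

    # Stop the read-only scrub if it is still running, then start a correcting one
    btrfs scrub cancel /mnt/tsdb
    btrfs scrub start /mnt/tsdb

    # Check progress and error counts
    btrfs scrub status /mnt/tsdb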
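
For item 12 (temporarily shrinking tsdb to a single drive): a rough sketch, assuming the pool is mounted at /mnt/tsdb and the drive being pulled is /dev/sdi1 (both placeholders). Converting the profiles first means the pool no longer requires two devices, at the cost of losing redundancy while the suspect disk is out.

    # Convert data to single and metadata to dup so one device is enough
    btrfs balance start -dconvert=single -mconvert=dup /mnt/tsdb

    # Remove the suspect drive from the pool, then confirm what is left
    btrfs device remove /dev/sdi1 /mnt/tsdb
    btrfs filesystem show /mnt/tsdb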
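
For item 13 (replacing a disk in the btrfs pool): btrfs can replace a member online; a sketch, assuming /mnt/tsdb as the mount point, /dev/sdi1 as the failing member and /dev/sdx1 as the new disk (all placeholders). With -r, data is read from the other mirror where possible, sparing the failing drive.

    # Start an online replacement and watch its progress
    btrfs replace start -r /dev/sdi1 /dev/sdx1 /mnt/tsdb
    btrfs replace status /mnt/tsdb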
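
For item 14 (pinpointing which disk is logging the errors): the kernel reports link errors against ataN names rather than sdX names; a sketch of mapping one to the other, using the syslog location found on Unraid. Drives behind a SAS HBA have no ataN link, which the fallback makes visible.

    # Recent ata-level errors in the log
    grep -iE 'ata[0-9]+.*(error|reset|fail)' /var/log/syslog | tail -n 20

    # Map each sdX device to its ataN link number
    for d in /dev/sd?; do
        dev=$(basename "$d")
        link=$(readlink "/sys/block/$dev" | grep -o 'ata[0-9]\+' | head -n1)
        echo "$dev -> ${link:-not-on-an-ata-port}"
    done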