gustovier Posted September 27, 2021

All,

Need some big time help. I was trying to repair a drive, replaced a few SATA cables and did a few reboots, and all of a sudden one of my drives (/dev/sdb) showed up as "wrong" although I didn't do anything to it. I then unplugged the data cable to that drive and rebooted. Following that, another drive (/dev/sdc) showed up as "wrong". I then plugged the /dev/sdb SATA cable back in and both drives continued to show as "wrong".

At this point I figured something was just up with the array config, so I decided to do a "New Configuration", preserved all assignments, and restarted the array. Both drives were not seen by the array and I can't even mount them with Unassigned Devices (it only gives a Format option).

I tried running xfs_repair, and that went nowhere:

root@Tower:/dev# xfs_repair -v /dev/md8
Phase 1 - find and verify superblock...
xfs_repair: error - read only 0 of 512 bytes

I also tried fdisk -l:

root@Tower:/dev# fdisk -l /dev/sdb
Disk /dev/sdb: 3.65 TiB, 4000785948160 bytes, 7814035055 sectors
Disk model: WDC WD40EZRZ-00G
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start        End    Sectors Size Id Type
/dev/sdb1           1 4294967295 4294967295   2T ee GPT

I also tried running gdisk (see below), but at this point I've decided not to do anything more until advised by the experts here.

root@Tower:/dev# gdisk /dev/sdb
GPT fdisk (gdisk) version 1.0.4

Warning! Disk size is smaller than the main header indicates! Loading
secondary header from the last sector of the disk! You should use 'v' to
verify disk integrity, and perhaps options on the experts' menu to repair
the disk.
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

Warning! One or more CRCs don't match. You should repair the disk!
Main header: OK
Backup header: ERROR
Main partition table: OK
Backup partition table: ERROR

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: damaged

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************

Attachment: tower-diagnostics-20210926-2047.zip
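A note on the fdisk listing in the post above: the single /dev/sdb1 entry with Id ee is most likely the protective MBR placeholder that GPT disks carry, not a real data partition. fdisk reporting "Disklabel type: dos" suggests it did not accept the GPT and fell back to showing the MBR. An MBR entry stores its length in a 32-bit sector field, which would explain why the entry sits at exactly 4294967295 sectors. A minimal Python sketch of that arithmetic, assuming the 512-byte logical sectors fdisk reports (the variable names are only illustrative):

```python
# Values copied from the fdisk output in the post above.
SECTOR_BYTES = 512
mbr_entry_sectors = 4294967295

# An MBR partition entry stores its length in a 32-bit field,
# so the largest sector count it can hold is 2**32 - 1.
mbr_max_sectors = 2**32 - 1

# True when the entry is pinned at the field's maximum value.
entry_is_capped = mbr_entry_sectors == mbr_max_sectors

# Size of the placeholder entry in TiB (1 TiB = 2**40 bytes).
entry_tib = mbr_entry_sectors * SECTOR_BYTES / 2**40

print(f"entry capped at 32-bit maximum: {entry_is_capped}")
print(f"entry size: {entry_tib:.2f} TiB")
```

If that reading is right, the "2T" partition is not evidence that the data partition shrank; the real partition layout lives in the GPT, which is what gdisk is complaining about.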
itimpi Posted September 27, 2021

I have not looked at the diagnostics, but one of the xfs_repair commands you gave is incorrect. When run from the command line, the device has to be /dev/sdb1 (i.e. include the partition number) if using ‘sd’ type devices. You do not need the partition number when using ‘md’ type devices.
gustovier Posted September 27, 2021

8 hours ago, itimpi said:
    I have not looked at the diagnostics, but one of the xfs_repair commands you gave is incorrect. When run from the command line, the device has to be /dev/sdb1 (i.e. include the partition number) if using ‘sd’ type devices. You do not need the partition number when using ‘md’ type devices.

The problem is that there is no partition. It’s like they disappeared.
gustovier Posted September 28, 2021

Anyone else have some ideas? Desperate for some help.
JorgeB Posted September 28, 2021

You can try repairing the GPT partition table with gdisk, but I would first clone the disk with dd and then try that on the clone.
gustovier Posted September 29, 2021

On 9/28/2021 at 1:52 AM, JorgeB said:
    You can try repairing the GPT partition table with gdisk, but I would first clone the disk with dd and then try that on the clone.

How would I repair the GPT with gdisk so the partition shows up again?
JorgeB Posted September 29, 2021

On 9/27/2021 at 2:49 AM, gustovier said:
    You should use 'v' to verify disk integrity, and perhaps options on the experts' menu to repair the disk.

I can't help much more, as I can't duplicate that and have never needed to use those options. Look at the man page for gdisk: https://linux.die.net/man/8/gdisk
gustovier Posted September 29, 2021

After some sleuthing, something is telling me I'm being hit by this bug on the rather old Gigabyte motherboard I'm using, which is incorrectly resetting the drive size on the disks: http://www.users.on.net/~fzabkar/HDD/HDD_Capacity_FAQ.html
gustovier Posted September 29, 2021

To add, I ran these commands, which really makes me think the BIOS caused this problem, as described in the message above. As you can see, disks sdb and sdc have HPA enabled, and these are the 2 disks with the problems.

/dev/sdb:
 max sectors = 7814035055/7814037168, HPA is enabled
root@Tower:/dev# hdparm -N /dev/sdc
/dev/sdc:
 max sectors = 15628051055/15628053168, HPA is enabled
root@Tower:/dev# hdparm -N /dev/sdd
/dev/sdd:
 max sectors = 19532873728/19532873728, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sde
/dev/sde:
 max sectors = 19532873728/19532873728, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sdf
/dev/sdf:
 max sectors = 937703088/937703088, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sdg
/dev/sdg:
 max sectors = 11721045168/11721045168, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sdh
/dev/sdh:
 max sectors = 19532873728/19532873728, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sdi
/dev/sdi:
 max sectors = 11721045168/11721045168, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sdj
/dev/sdj:
 max sectors = 19532873728/19532873728, HPA is disabled
root@Tower:/dev# hdparm -N /dev/sdk
/dev/sdk:
 max sectors = 19532873728/19532873728, HPA is disabled
root@Tower:/dev#
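The "max sectors" lines in the post above read as visible/native sector counts, so the size of each hidden area is simply the difference between the two numbers. A minimal Python sketch of that check follows; the `hidden_sectors` helper and the regular expression are illustrative, not part of hdparm, and the byte figures assume 512-byte logical sectors as in the fdisk output earlier in the thread.

```python
import re

# Matches the "max sectors = visible/native" line that `hdparm -N` prints.
MAX_SECTORS_RE = re.compile(r"max sectors\s*=\s*(\d+)/(\d+)")


def hidden_sectors(hdparm_output: str) -> int:
    """Return native minus visible sector count (0 means nothing is hidden)."""
    match = MAX_SECTORS_RE.search(hdparm_output)
    if match is None:
        raise ValueError("no 'max sectors' line found")
    visible, native = (int(group) for group in match.groups())
    return native - visible


# Lines copied from the hdparm output in the post above.
samples = {
    "/dev/sdb": "max sectors = 7814035055/7814037168, HPA is enabled",
    "/dev/sdc": "max sectors = 15628051055/15628053168, HPA is enabled",
    "/dev/sdd": "max sectors = 19532873728/19532873728, HPA is disabled",
}

for device, line in samples.items():
    hidden = hidden_sectors(line)
    # Assumption: 512-byte logical sectors.
    print(f"{device}: {hidden} hidden sectors ({hidden * 512} bytes)")
```

Both affected drives come out with the same number of hidden sectors even though they differ in capacity, which fits a fixed-size reserved area better than random corruption.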
JorgeB Posted September 29, 2021

You can remove the HPA, but might still need some more fixing:
gustovier Posted September 29, 2021

Yeah, I saw that post. I have a GA-EP35-DS3R motherboard, and from what I can tell I will need to actually downgrade the BIOS. Of course Gigabyte ships the BIOS in an .exe file (presumably self-extracting) and of course I'm on a Mac, so I've got to figure out a way to extract the BIOS.

I was able to remove the HPA from one of the disks and I can now see all my data. On the other drive I'm getting the following error (apparently sd<letter> assignments can change, as the impacted drive went from sdb to sdg):

root@Tower:~# hdparm -N /dev/sdg
/dev/sdg:
 max sectors = 7814035055/7814037168, HPA is enabled
root@Tower:~# hdparm -N p7814037168 /dev/sdg
/dev/sdg:
 setting max visible sectors to 7814037168 (permanent)
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 10 51 40 01 21 00 00 00 a0 af 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 max sectors = 7814035055/7814037168, HPA is enabled
JorgeB Posted September 30, 2021

Is that drive still on the onboard SATA? Disabling HPA doesn't work with some controllers, it should with onboard Intel.
gustovier Posted September 30, 2021

No, I took it off the onboard SATA and put it on my SATA add-on card. I still can't figure out how to fix this other drive that still has an HPA on it. The hdparm command is failing as shown in the previous post. Does anyone have some guidance on how to resolve it? My research has not turned up much.
JorgeB Posted September 30, 2021

1 hour ago, gustovier said:
    No, I took it off the onboard SATA and put it on my SATA add-on card.

You need to put it back on the onboard SATA.
gustovier Posted September 30, 2021

53 minutes ago, JorgeB said:
    You need to put it back on the onboard SATA.

Sorry. I understood. The drive that I was able to remove the HPA from is not using one of the onboard SATA controllers (this MB has 2 on board). The other drive, where I keep getting the SG_IO error, was still using the onboard SATA controller.
JorgeB Posted September 30, 2021

Then try connecting that drive to the same controller where removing the HPA worked.
gustovier Posted September 30, 2021

21 minutes ago, JorgeB said:
    Then try connecting that drive to the same controller where removing the HPA worked.

Yup, I was thinking the same thing, and that worked. I did a power cycle and I can see my data once again.

I also went ahead and disconnected all other drives completely but left my cache SSD hooked up to the onboard SATA controller, hoping that the BIOS would pick it for creating the HPA, but upon boot it did not. So now I'm a little worried the BIOS will pick another data drive, or worse the parity drive, to install the HPA on.

I'm about to try plugging all my other drives back in, and if everything is good, then the only thing left should be to reset the array config and let parity rebuild. I'm going to need to get a new motherboard. I've had this box for 10+ years now and never ran into this problem.

Edited September 30, 2021 by gustovier
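Why clearing the HPA made the data reappear can be sketched with the numbers posted earlier in the thread. GPT keeps its backup header in the last sector of the disk, so a header written while the disk exposed its native size would point past the end of the HPA-shortened disk, which is consistent with gdisk's "Disk size is smaller than the main header indicates" warning. The Python sketch below assumes the GPT on the 4 TB drive was created at the native size; the thread does not show the header contents, so treat it as an illustration rather than a diagnosis.

```python
# Sector counts for the 4 TB drive, from the hdparm -N output in the thread.
visible_sectors = 7814035055   # what the OS could see while the HPA was set
native_sectors = 7814037168    # full size reported by the drive

# LBAs are zero-based, so the last addressable sector is count - 1.
last_visible_lba = visible_sectors - 1

# Assumption: the GPT was written while the full disk was visible,
# which puts the backup header in the last native sector.
backup_header_lba = native_sectors - 1

# The backup header is only readable if it lies within the visible range.
backup_reachable = backup_header_lba <= last_visible_lba

print(f"backup GPT header expected at LBA {backup_header_lba}")
print(f"last LBA visible with HPA:        {last_visible_lba}")
print(f"backup header reachable: {backup_reachable}")
```

Under that assumption the partition data was never damaged; the end of the disk was simply out of reach until the visible size was restored.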
JorgeB Posted October 1, 2021

You should be able to disable that function in the BIOS; if not, replace the board or don't use the onboard SATA controller.