JorgeB Posted July 18, 2023 Share Posted July 18, 2023 If the emulated disk is you can try again, still have issues with the cache pool, don't forget that extended SMART test. Quote Link to comment
Nanuk_ Posted July 18, 2023 Author Share Posted July 18, 2023 (edited) Score! You da man @itimpi Though before the rebuild should I still run Preclear or only if it prompts me? Edited July 18, 2023 by Nanuk_ Quote Link to comment
Nanuk_ Posted July 18, 2023 Author Share Posted July 18, 2023 (edited) Ah, never mind. The moment I hit the last step you gave, it started the data-rebuild automatically =D I'm surprised it didn't require a Preclear first. Is that a 6.12 thing? In the previous version it required it. Edited July 18, 2023 by Nanuk_ Quote Link to comment
Nanuk_ Posted July 18, 2023 Author Share Posted July 18, 2023 Darn, I only ran the short SMART test @JorgeB, it passed though! Thanks so much for the help guys! I now know what to do in that situation and that really helps! You both rock @JorgeB @itimpi! Quote Link to comment
itimpi Posted July 18, 2023 Share Posted July 18, 2023 1 minute ago, Nanuk_ said: it passed though! Unfortunately the short test is nowhere near a thorough test of the drive. The long test will live up to its name, taking hours per TB. Also, progress only updates at 10% intervals. Quote Link to comment
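For anyone following along: because progress only moves in 10% steps, there is little point in polling the drive often. Below is a minimal sketch of reading that progress from a script. It assumes the drive's `smartctl -c` output contains the usual "NN% of test remaining" phrase while a self-test is running; the function name and the sample string are made up for illustration and are not taken from this thread's diagnostics.

```python
import re
from typing import Optional


def self_test_percent_done(smartctl_output: str) -> Optional[int]:
    """Return how much of a running self-test is complete, in percent.

    Returns None when the text has no "NN% of test remaining" phrase,
    which is taken here to mean that no self-test is in progress.
    """
    match = re.search(r"(\d{1,3})% of test remaining", smartctl_output)
    if match is None:
        return None
    return 100 - int(match.group(1))


# Hypothetical excerpt, shaped like the status text smartctl prints
# while an extended test is running.
sample = "Self-test routine in progress...\n\t90% of test remaining."

print(self_test_percent_done(sample))
print(self_test_percent_done("no self-test status here"))
```

Feeding it captured output every half hour or so is plenty, given how coarse the reported steps are.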
Nanuk_ Posted July 18, 2023 Author Share Posted July 18, 2023 Damn, should I stop the data rebuild and run the test? Quote Link to comment
JorgeB Posted July 18, 2023 Share Posted July 18, 2023 You can do both at the same time, since that disk is not in the array. Quote Link to comment
Nanuk_ Posted July 18, 2023 Author Share Posted July 18, 2023 Are you sure @JorgeB? Because it looks like the data-rebuild is writing data to it. Quote Link to comment
JorgeB Posted July 18, 2023 Share Posted July 18, 2023 3 hours ago, JorgeB said: Also still errors on cache1, assuming cables were replaced run an extended SMART test and post the results. Not that disk. Quote Link to comment
Nanuk_ Posted July 18, 2023 Author Share Posted July 18, 2023 (edited) Ah! Thank you for clearing that up! Starting the extended test now =D Though my future goal is to retire this HBA and move the cache to two onboard NVMe drives, since this HBA has bricked 3 SSDs and possibly a 10TB. But for some reason these 4 old 1TB drives are safe. That's what I get for buying from the local Amazon ripoff here in the Philippines. I've learned my lesson. I plan to get a replacement in the future from "The Art of the Server" and just ship it here. At least it'll be from a trusted brand. Edited July 19, 2023 by Nanuk_ Quote Link to comment
Nanuk_ Posted July 19, 2023 Author Share Posted July 19, 2023 Result of the long SMART test. Sorry, I shut down the desktop that had the browser with the results open. Quote Link to comment
JorgeB Posted July 19, 2023 Share Posted July 19, 2023 Did you replace the cables for that disk? It was still showing issues in the last diags Quote Link to comment
Nanuk_ Posted July 19, 2023 Author Share Posted July 19, 2023 (edited) Yes, I replaced the cable. I've actually replaced it 3 times, but the CRC errors still persist, so I'm going to assume it's the HBA, which I plan to eventually replace. But since the only reputable HBA vendor I know is in the US and I'm in the PH, I'll have to be patient, buy 2 NVMe drives as a replacement cache when I can afford it, and retire this HBA. Also, the HDDs for the cache are really old 1TB drives; I only used them because the SSDs got bricked. They're pretty much on their last legs. Edited July 19, 2023 by Nanuk_ Quote Link to comment
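A note on reading the CRC counter: SMART attribute 199 (UDMA_CRC_Error_Count) is cumulative and does not reset when a cable is swapped, so what matters is whether the raw value keeps climbing after the change. Here is a rough sketch of that comparison. It assumes the standard ten-column `smartctl -A` attribute table with the raw value in the last column; the helper names and the two sample lines are invented for illustration and are not taken from the diagnostics posted in this thread.

```python
from typing import Optional


def crc_error_count(smartctl_attrs: str) -> Optional[int]:
    """Return the raw value of attribute 199, or None if it is not listed."""
    for line in smartctl_attrs.splitlines():
        fields = line.split()
        # Attribute rows have ten columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[-1])
    return None


def crc_still_increasing(before: str, after: str) -> bool:
    """True if the CRC counter grew between two captures of `smartctl -A`."""
    old, new = crc_error_count(before), crc_error_count(after)
    if old is None or new is None:
        raise ValueError("attribute 199 not found in one of the captures")
    return new > old


# Hypothetical captures taken before and after a cable swap.
before = "199 UDMA_CRC_Error_Count  0x003e   200   200   000    Old_age   Always       -       120"
after = "199 UDMA_CRC_Error_Count  0x003e   200   200   000    Old_age   Always       -       135"

print(crc_still_increasing(before, after))
```

A counter that stays flat after the swap points at the old cable; one that keeps growing points further up the chain, for example at the controller or backplane.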
Nanuk_ Posted July 22, 2023 Author Share Posted July 22, 2023 New HDD is throwing errors again. I'm starting a SMART extended self-test, then I'll post the diag after. Not sure why it's throwing errors again. I really hope it's not a dud. Quote Link to comment
Nanuk_ Posted July 23, 2023 Author Share Posted July 23, 2023 Here are the logs. I really hope that it's something normal or can be fixed; this was the replacement HDD for the one that had to be RMA'd. Attachments: trojancarabao-smart-20230723-0854.zip, trojancarabao-diagnostics-20230723-1245.zip Quote Link to comment
Nanuk_ Posted July 23, 2023 Author Share Posted July 23, 2023 Apparently my Fix Common Problems is now complaining about it. Quote Link to comment
Nanuk_ Posted July 23, 2023 Author Share Posted July 23, 2023 Followed the Fix Common Problems advice and rebooted. Quote Link to comment
JorgeB Posted August 14, 2023 Share Posted August 14, 2023 Are you rebuilding disk2? Quote Link to comment
Nanuk_ Posted August 14, 2023 Author Share Posted August 14, 2023 (edited) Yes, I'm trying to follow the steps you all taught me:
1.) First I replaced the SATA cable. (DONE)
2.) Afterwards I disabled spindown. (DONE)
3.) Placed it into maintenance mode. (DONE)
4.) Ran an extended SMART test. (DONE)
5.) Then I ran an XFS repair. (DONE)
6.) And now I'm rebuilding. (IN PROGRESS)
Attachments: trojancarabao-smart-20230814-1328.zip, trojancarabao-diagnostics-20230814-1719.zip
Edited August 14, 2023 by Nanuk_ Quote Link to comment
JorgeB Posted August 14, 2023 Share Posted August 14, 2023 59 minutes ago, Nanuk_ said: (IN PROGRESS) Then that error (disk invalid) is normal, it will change to OK if/when it finishes rebuilding. Quote Link to comment
Nanuk_ Posted September 20, 2023 Author Share Posted September 20, 2023 Hi, so I bought 2 new HDDs and pre-cleared both, but after a day of use one of them got an error. I have a feeling it might be because that drive is hooked up to my HBA, which I think might be faulty. I set the server to maintenance mode and I'm running an extended test. I'll run XFS repair after and send both the test and diag here when I'm done. Quote Link to comment
Nanuk_ Posted September 20, 2023 Author Share Posted September 20, 2023

ATA Error Count: 44315 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 44315 occurred at disk power-on lifetime: 30 hours (1 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------  --------------------
  60 e0 08 a8 bc b9 40 00  1d+06:57:46.041  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  1d+06:57:46.041  READ LOG EXT
  60 00 50 b8 c9 b9 40 00  1d+06:57:45.929  READ FPDMA QUEUED
  60 40 70 78 c7 b9 40 00  1d+06:57:45.928  READ FPDMA QUEUED
  60 18 68 60 c7 b9 40 00  1d+06:57:45.928  READ FPDMA QUEUED

Error 44314 occurred at disk power-on lifetime: 30 hours (1 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------  --------------------
  60 80 18 c8 bb b9 40 00  1d+06:57:45.898  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  1d+06:57:45.898  READ LOG EXT
  61 00 a0 50 ae b9 40 00  1d+06:57:45.892  WRITE FPDMA QUEUED
  61 c0 00 90 ad b9 40 00  1d+06:57:45.891  WRITE FPDMA QUEUED
  60 c0 a0 90 ad b9 40 00  1d+06:57:45.883  READ FPDMA QUEUED

Error 44313 occurred at disk power-on lifetime: 30 hours (1 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------  --------------------
  60 c0 38 90 ad b9 40 00  1d+06:57:45.882  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  1d+06:57:45.882  READ LOG EXT
  60 c0 b0 d0 bd b9 40 00  1d+06:57:45.865  READ FPDMA QUEUED
  60 48 a8 88 bd b9 40 00  1d+06:57:45.864  READ FPDMA QUEUED
  60 e0 b8 a8 bc b9 40 00  1d+06:57:45.862  READ FPDMA QUEUED

Error 44312 occurred at disk power-on lifetime: 30 hours (1 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------  --------------------
  60 28 68 c0 ac b9 40 00  1d+06:57:45.856  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  1d+06:57:45.856  READ LOG EXT
  61 38 78 a0 ab b9 40 00  1d+06:57:45.847  WRITE FPDMA QUEUED
  60 08 c8 c0 bb b9 40 00  1d+06:57:45.841  READ FPDMA QUEUED
  60 08 c8 98 4c 45 40 00  1d+06:57:45.841  READ FPDMA QUEUED

Error 44311 occurred at disk power-on lifetime: 30 hours (1 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------  --------------------
  60 38 60 a0 ab b9 40 00  1d+06:57:45.789  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  1d+06:57:45.789  READ LOG EXT
  60 08 10 68 b2 b9 40 00  1d+06:57:45.778  READ FPDMA QUEUED
  60 08 10 b0 46 45 40 00  1d+06:57:45.778  READ FPDMA QUEUED
  60 08 10 a0 45 45 40 00  1d+06:57:45.778  READ FPDMA QUEUED

Quote Link to comment
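For readers trying to make sense of the log above, here is a small sketch of how two of its fields can be decoded. It assumes the usual ATA error-register bit assignments (0x80 ICRC, 0x40 UNC, 0x10 IDNF, 0x04 ABRT) and only names those four bits; the function names are hypothetical, and this is an illustration rather than how smartctl itself is implemented.

```python
import re
from datetime import timedelta
from typing import List

# Error-register bits named by this sketch, highest bit first.
ERROR_REGISTER_BITS = {
    0x80: "ICRC",  # interface CRC error: data damaged between drive and controller
    0x40: "UNC",   # uncorrectable data error on the media
    0x10: "IDNF",  # requested sector ID not found
    0x04: "ABRT",  # command aborted
}


def decode_error_register(er_hex: str) -> List[str]:
    """Turn an ER value such as "84" into a list of flag names."""
    value = int(er_hex, 16)
    names = [name for bit, name in ERROR_REGISTER_BITS.items() if value & bit]
    leftover = value & ~sum(ERROR_REGISTER_BITS)
    if leftover:
        names.append(f"other bits 0x{leftover:02x}")
    return names


def parse_powered_up_time(stamp: str) -> timedelta:
    """Parse the DDd+hh:mm:SS.sss format described in the log header."""
    match = re.fullmatch(r"(\d+)d\+(\d{2}):(\d{2}):(\d{2})\.(\d{3})", stamp)
    if match is None:
        raise ValueError(f"unexpected Powered_Up_Time format: {stamp!r}")
    days, hours, minutes, seconds, millis = map(int, match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


# Values copied from the first error entry in the log above.
print(decode_error_register("84"))
print(parse_powered_up_time("1d+06:57:46.041").total_seconds())

# The "wraps after 49.710 days" remark matches a 32-bit millisecond counter.
print(round(2**32 / 86_400_000, 3))
```

All five entries carry ER = 84, i.e. ICRC plus ABRT, so every logged error is an interface CRC failure rather than a media error, which fits the suspicion about the cable or HBA path raised earlier in the thread.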
JorgeB Posted September 20, 2023 Share Posted September 20, 2023 Please post the diagnostics. Quote Link to comment
Nanuk_ Posted September 20, 2023 Author Share Posted September 20, 2023 Sorry, was waiting for the extended SMART test to finish. Attachments: trojancarabao-diagnostics-20230920-1900.zip, trojancarabao-smart-20230920-1900.zip Quote Link to comment