JorgeB

Moderators
  • Posts: 67,504
  • Days Won: 706

Everything posted by JorgeB

  1. That suggests more of a hardware problem. I would try running the server as a basic NAS for a few days, booted in safe mode and without any dockers/VMs. If it still locks up like that, it's almost certainly hardware; if not, start enabling the other services one at a time.
  2. Not with ECC RAM, unless you can disable ECC in the BIOS.
  3. Yep, but it hasn't been recommended since the first Unraid v6 release; it worked OK with older releases.
  4. New config is not used to rebuild disks, see here: https://wiki.unraid.net/Troubleshooting#Re-enable_the_drive
  5. SMART looks fine, since the emulated disk mounted correctly you can rebuild on top, though I would recommend replacing/swapping cables first (if not yet done) to rule them out if it happens again to the same disk.
  6. Then it's not the revision, but trim doesn't work on some Asmedia 1062 controllers, example.
  7. It's not. Did you check the connections like asked? And by checked I mean power down the server and physically remove and re-connect the cables. If you already did that, replace the cables.
  8. That part is OK, normalized value is remaining life, raw value is spent life, this is from an almost new MX500:

     ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
       1 Raw_Read_Error_Rate     POSR-K   100   100   000    -    0
       5 Reallocate_NAND_Blk_Cnt -O--CK   100   100   010    -    0
       9 Power_On_Hours          -O--CK   100   100   000    -    267
      12 Power_Cycle_Count       -O--CK   100   100   000    -    4
     171 Program_Fail_Count      -O--CK   100   100   000    -    0
     172 Erase_Fail_Count        -O--CK   100   100   000    -    0
     173 Ave_Block-Erase_Count   -O--CK   100   100   000    -    4
     174 Unexpect_Power_Loss_Ct  -O--CK   100   100   000    -    0
     180 Unused_Reserve_NAND_Blk PO--CK   000   000   000    -    28
     183 SATA_Interfac_Downshift -O--CK   100   100   000    -    0
     184 Error_Correction_Count  -O--CK   100   100   000    -    0
     187 Reported_Uncorrect      -O--CK   100   100   000    -    0
     194 Temperature_Celsius     -O---K   068   053   000    -    32 (Min/Max 0/47)
     196 Reallocated_Event_Count -O--CK   100   100   000    -    0
     197 Bogus_Current_Pend_Sect -O--CK   100   100   000    -    0
     198 Offline_Uncorrectable   ----CK   100   100   000    -    0
     199 UDMA_CRC_Error_Count    -O--CK   100   100   000    -    0
     202 Percent_Lifetime_Remain ----CK   100   100   001    -    0
     206 Write_Error_Rate        -OSR--   100   100   000    -    0
     210 Success_RAIN_Recov_Cnt  -O--CK   100   100   000    -    0
     246 Total_LBAs_Written      -O--CK   100   100   000    -    2476107088
     247 Host_Program_Page_Count -O--CK   100   100   000    -    19671852
     248 FTL_Program_Page_Count  -O--CK   100   100   000    -    33437715
  9. Disk dropped offline so there's no SMART, check connections and post new diags.
  10. Most likely, we really don't recommend using Marvell controllers with Unraid, especially the 9230.
  11. It should be there. You can try installing mcelog, it might show some more info.
  12. Looks like a firmware bug to me, 20TBW is nowhere close to the expected life for that SSD. As a comparison, here's one from an MX500 500GB; with around 90TB written it's at 50% expected life:

      === START OF INFORMATION SECTION ===
      Model Family:     Crucial/Micron MX500 SSDs
      Device Model:     CT500MX500SSD1
      Serial Number:    1849E1DAED7F

      ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
        1 Raw_Read_Error_Rate     POSR-K   100   100   000    -    0
        5 Reallocate_NAND_Blk_Cnt -O--CK   100   100   010    -    0
        9 Power_On_Hours          -O--CK   100   100   000    -    11725
       12 Power_Cycle_Count       -O--CK   100   100   000    -    38
      171 Program_Fail_Count      -O--CK   100   100   000    -    0
      172 Erase_Fail_Count        -O--CK   100   100   000    -    0
      173 Ave_Block-Erase_Count   -O--CK   050   050   000    -    762
      174 Unexpect_Power_Loss_Ct  -O--CK   100   100   000    -    3
      180 Unused_Reserve_NAND_Blk PO--CK   000   000   000    -    38
      183 SATA_Interfac_Downshift -O--CK   100   100   000    -    0
      184 Error_Correction_Count  -O--CK   100   100   000    -    0
      187 Reported_Uncorrect      -O--CK   100   100   000    -    0
      194 Temperature_Celsius     -O---K   064   038   000    -    36 (Min/Max 0/62)
      196 Reallocated_Event_Count -O--CK   100   100   000    -    0
      197 Bogus_Current_Pend_Sect -O--CK   100   100   000    -    0
      198 Offline_Uncorrectable   ----CK   100   100   000    -    0
      199 UDMA_CRC_Error_Count    -O--CK   100   100   000    -    0
      202 Percent_Lifetime_Remain ----CK   050   050   001    -    50
      206 Write_Error_Rate        -OSR--   100   100   000    -    0
      210 Success_RAIN_Recov_Cnt  -O--CK   100   100   000    -    0
      246 Total_LBAs_Written      -O--CK   100   100   000    -    174404886673
      247 Host_Program_Page_Count -O--CK   100   100   000    -    3003552877
      248 FTL_Program_Page_Count  -O--CK   100   100   000    -    4939020470

      I would just ignore that.
  13. Please post the diagnostics: Tools -> Diagnostics
  14. There are two revisions of the Asmedia 1062 controller, and IIRC trim works on rev 02 but not on rev 01:

      03:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 02)
  15. I missed you were using adapters, those could be the problem, try to use a miniSAS to SAS cable.
  16. Still nothing, if you hear them spinning up it could be a cable issue, try a different cable if possible.
  17. The 3ware controller is failing to initialize:

      Aug 6 10:13:47 Tower kernel: 3w-9xxx: scsi1: ERROR: (0x06:0x000E): Controller Queue Error: clearing.
      Aug 6 10:13:47 Tower kernel: 3w-9xxx: scsi1: ERROR: (0x06:0x0015): No valid response during init connection.
      Aug 6 10:13:47 Tower kernel: 3w-9xxx: scsi1: ERROR: (0x06:0x0007): Initconnection failed while checking SRL.
      Aug 6 10:13:47 Tower kernel: 3w-9xxx: scsi1: ERROR: (0x06:0x0021): Compatibility check failed during reset sequence.
      Aug 6 10:13:47 Tower kernel: 3w-9xxx: probe of 0000:06:00.0 failed with error -12

      I don't see any SAS devices on the LSI. If there are no SAS devices connected there, connect at least one and post new diags.
  18. Yes, if there's nothing on the old cache you want to keep.
  19. No, at least not on the LSI. Yes. Diags might give a clue.
  20. Probably a hardware problem. Diags taken after rebooting won't help much for this, but you can try enabling the syslog server.
  21. This is known to fill up the log:

      Aug 6 10:42:12 Thunderbrew root: Installing atop-2.2 package...

      Also, make sure you only install the apps you need from NerdPack, it appears you have everything installed.
  22. Start here, also always post the complete diags instead (tools -> diagnostics)
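
As a rough cross-check of the two MX500 reports in posts 8 and 12, the raw Total_LBAs_Written value (attribute 246) can be converted to terabytes. This is only a sketch: it assumes one LBA is 512 bytes, which the reports themselves don't state, and that wear is linear in data written. The function names are made up for illustration.

```python
# Assumption: one LBA is 512 bytes (not stated in the SMART reports).
LBA_BYTES = 512


def terabytes_written(total_lbas: int, lba_bytes: int = LBA_BYTES) -> float:
    """Convert the raw value of attribute 246 (Total_LBAs_Written) to decimal TB."""
    return total_lbas * lba_bytes / 1e12


def projected_endurance_tb(tb_written: float, percent_used: int) -> float:
    """Extrapolate total TB at 100% wear, assuming wear is linear in writes.

    percent_used is the raw value of attribute 202, i.e. spent life.
    """
    if percent_used <= 0:
        raise ValueError("need a non-zero spent-life percentage to extrapolate")
    return tb_written / percent_used * 100


# Raw values copied from the two reports.
almost_new_tb = terabytes_written(2476107088)
half_worn_tb = terabytes_written(174404886673)

print(f"almost new drive: {almost_new_tb:.2f} TB written")
print(f"half-worn drive:  {half_worn_tb:.1f} TB written")
print(f"linear projection: {projected_endurance_tb(half_worn_tb, 50):.1f} TB")
```

Under the 512-byte assumption the second drive comes out at about 89.3 TB, which lines up with the "around 90TB written" figure in post 12.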
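
For the controller revision mentioned in post 14, the ID and revision can be pulled out of a line like the one quoted there (the format `lspci -nn` prints). A minimal sketch; the helper name `parse_lspci_line` is hypothetical, and it only extracts the fields, it doesn't decide whether trim will work.

```python
import re
from typing import Optional, Tuple

# Matches the trailing "[vendor:device] (rev xx)" part of an lspci -nn line.
# The revision is optional because lspci omits it when it is 00.
_ID_REV = re.compile(
    r"\[([0-9a-f]{4}):([0-9a-f]{4})\](?: \(rev ([0-9a-f]{2})\))?\s*$"
)


def parse_lspci_line(line: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (vendor_id, device_id, revision) or None if the line doesn't match."""
    match = _ID_REV.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


line = (
    "03:00.0 SATA controller [0106]: ASMedia Technology Inc. "
    "ASM1062 Serial ATA Controller [1b21:0612] (rev 02)"
)
print(parse_lspci_line(line))
```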