buly Posted October 25, 2021

I just installed a 1TB Crucial P5 NVMe SSD and I lose access to the disk whenever Unraid reads its SMART data (for example, when I click on the disk in the web interface). I know the SMART read is the trigger, because I can reproduce the same error from the console with smartctl.

In syslog:

Oct 25 14:09:52 UNBuly kernel: DMAR: DRHD: handling fault status reg 2
Oct 25 14:09:52 UNBuly kernel: DMAR: [DMA Read] Request device [03:00.0] PASID ffffffff fault addr ffbf0000 [fault reason 06] PTE Read access is not set
Oct 25 14:10:31 UNBuly kernel: nvme nvme0: I/O 193 QID 23 timeout, aborting
Oct 25 14:10:52 UNBuly kernel: nvme nvme0: I/O 29 QID 0 timeout, reset controller
Oct 25 14:11:01 UNBuly kernel: nvme nvme0: I/O 193 QID 23 timeout, reset controller

The disk disappears from the system (I can't even see /dev/nvme0n1) and I don't get it back until I power the machine off and on again.

smartctl displays this information before freezing:

# smartctl -a /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.10.28-Unraid] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       CT1000P5SSD8
Serial Number:                      21xxxx
Firmware Version:                   P4CR311
PCI Vendor/Subsystem ID:            0x1344
IEEE OUI Identifier:                0x00a075
Controller ID:                      0
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 013084ec4c
Local Time is:                      Mon Oct 25 14:09:52 2021 CEST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     78 Celsius
Critical Comp. Temp. Threshold:     81 Celsius
Namespace 1 Features (0x08):        No_ID_Reuse

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.25W       -        -    0  0  0  0        0       0
 1 +     3.00W       -        -    1  1  1  1        0       0
 2 +     1.90W       -        -    2  2  2  2        0       0
 3 -   0.0800W       -        -    3  3  3  3    10000    2500
 4 -   0.0050W       -        -    4  4  4  4    12000   35000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        42 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    0%
Data Units Read:                    2,535,487 [1.29 TB]
Data Units Written:                 709,366 [363 GB]
Host Read Commands:                 2,922,852
Host Write Commands:                2,890,773
Controller Busy Time:               76
Power Cycles:                       12
Power On Hours:                     5
Unsafe Shutdowns:                   11
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               42 Celsius
Temperature Sensor 2:               47 Celsius
Thermal Temp. 1 Transition Count:   1

After displaying the last line, the command hangs and I lose access to the disk.

Thanks in advance.

un-diagnostics-20211025-1554.zip
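For anyone comparing their own logs against the ones in this post, here is a minimal Python sketch that picks out the two kinds of kernel messages quoted above: the DMAR DMA fault and the nvme I/O timeouts. The function name find_nvme_events and the two regular expressions are illustrative assumptions written to match only the lines shown in this thread; other kernels or drives may word these messages differently.

```python
import re

# Patterns modelled on the syslog lines quoted in the post; they are
# assumptions and are not guaranteed to match every kernel version.
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (Read|Write)\] Request device \[([0-9a-f:.]+)\]"
)
NVME_TIMEOUT = re.compile(
    r"nvme (nvme\d+): I/O (\d+) QID (\d+) timeout, (aborting|reset controller)"
)


def find_nvme_events(lines):
    """Return (kind, detail) tuples for DMAR faults and NVMe timeouts."""
    events = []
    for line in lines:
        match = DMAR_FAULT.search(line)
        if match:
            # Detail is the PCI address of the faulting device.
            events.append(("dmar_fault", match.group(2)))
            continue
        match = NVME_TIMEOUT.search(line)
        if match:
            detail = (
                f"{match.group(1)} qid={match.group(3)} "
                f"action={match.group(4)}"
            )
            events.append(("nvme_timeout", detail))
    return events
```

To use it, pass the lines of the syslog file (on Unraid this is normally /var/log/syslog) to find_nvme_events. A DMAR fault for the drive's PCI address followed shortly by nvme timeouts would match the sequence reported here.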
JorgeB Posted October 25, 2021

If the same thing happens with smartctl, it's not an Unraid issue. Check whether it has been reported to smartmontools.
Elton Posted November 3, 2021

I have the same problem. I also modified the Syslinux configuration:

kernel /bzimage
append nvme_core.default_ps_max_latency_us=0 initrd=/bzroot

but it has no effect. The disk only comes back after I cut the power completely and turn the machine on again.

a-diagnostics-20211103-1414.zip

Edited November 3, 2021 by Elton
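The nvme_core.default_ps_max_latency_us option caps the latency the Linux NVMe driver accepts for autonomous power state transitions, and a value of 0 disables those transitions. Before concluding that the option does not help, it may be worth confirming that the running kernel actually received it. The sketch below is one way to check, assuming a Linux system where /proc/cmdline and the nvme_core module parameter file in sysfs are readable; the helper names cmdline_value and report are made up for illustration.

```python
from pathlib import Path

PARAM = "nvme_core.default_ps_max_latency_us"
# Where the nvme_core module exposes its current setting on Linux.
SYSFS_PARAM = Path(
    "/sys/module/nvme_core/parameters/default_ps_max_latency_us"
)


def cmdline_value(cmdline, name=PARAM):
    """Return the value of name=value on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def report():
    """Print what the kernel was booted with and what the driver uses."""
    cmdline = Path("/proc/cmdline").read_text()
    print("boot command line:", cmdline_value(cmdline))
    if SYSFS_PARAM.exists():
        print("driver setting:", SYSFS_PARAM.read_text().strip())
    else:
        print("driver setting: parameter file not found")
```

Calling report() on the affected server after a reboot prints both values. If the boot command line shows None, the append line in the Syslinux configuration was not picked up, which is a different problem from the option being ineffective on this drive.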
buly Posted November 3, 2021 (Author)

Solved by replacing the 1TB Crucial P5 NVMe SSD with a 1TB Samsung 970 Evo Plus. However, the same Crucial P5 in the same computer works fine with Ubuntu 21.04, so I think something is wrong with the Unraid 6.9.2 kernel.

I found another post about the same problem with another NVMe drive:

Edited November 3, 2021 by buly
Elton Posted November 3, 2021

I have the same model, the "1TB Crucial P5 NVME SSD". I replaced it with a new unit, but that didn't help. So sad...

Edited November 3, 2021 by Elton
D_gate Posted November 22, 2021

I have the same issue with the 1TB P5. Has anyone tried this on the 6.10 kernel?