[6.11.1] NVMe Cache Controller (970 PRO)

Kltc · October 17, 2022

Well ... it did not work. Even with the suggestion, the cache just turns read-only instantly. Temps on it look fine as well.

(The NVMe below is the 970 PRO.)

Now I'm completely lost and out of options ... any help is appreciated.

image.png.793b4b9fef90f30e7d56bc47d26192b4.png

New diagnostics, taken after the error, are attached in case they help.

cronos-diagnostics-20221017-1808.zip

JorgeB · October 17, 2022

As suggested by the kernel, try adding the options to syslinux.cfg: on the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

Reboot and see if it makes a difference.
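After the reboot, it can be worth confirming that the two options actually reached the running kernel before drawing conclusions. The following is a minimal sketch, assuming the options show up in /proc/cmdline (the standard Linux location for the boot command line). The function names `parse_cmdline`, `missing_params` and `check_running_kernel` are made up for this illustration, and the simple whitespace split does not handle quoted values.

```python
from pathlib import Path

# Options from the post above; adjust if you use different values.
REQUIRED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
}


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a {name: value} mapping.

    Flags without '=' are stored with a value of None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline: str, required: dict = REQUIRED) -> list:
    """Return the required 'name=value' options that are absent or differ."""
    params = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in required.items() if params.get(k) != v]


def check_running_kernel(path: str = "/proc/cmdline") -> list:
    """Check the command line of the currently running kernel (Linux only)."""
    return missing_params(Path(path).read_text())
```

Calling `check_running_kernel()` on the server returns an empty list when both options are active. Anything it returns was not picked up, which usually means the line was added to a boot entry other than the default one.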

Kltc · October 17, 2022

Hi. I did.... as stated in the previous comment.

Even after that change, the problem repeats.

When I try to access the cache:

image.png.8b18ab122ae2ea8dc5e64a195d4df3ef.png

I'll try for the 3rd time, and get back here.

JorgeB · October 17, 2022

2 minutes ago, Kltc said:

I did.... as stated in the previous comment.

Sorry, missed that part. If that doesn't help, it could be a board/BIOS/kernel issue. I use the same device with v6.11.1 without any issues, and didn't need to disable the low-power options.

Kltc · October 17, 2022

6 minutes ago, JorgeB said:

Sorry, missed that part. If that doesn't help, it could be a board/BIOS/kernel issue. I use the same device with v6.11.1 without any issues, and didn't need to disable the low-power options.

No problem, I appreciate the help. I'm completely lost on what I could do now, as I saw no particular changes from 6.10.3 (but I might be mistaken). I didn't change the BIOS and it's the same board (ASUS Prime X470 PRO), so I'm even considering whether I could / should reinstall the flash (my backup is already of 6.11.1).

JorgeB · October 17, 2022

The kernel is a lot newer. Look for a BIOS update for the board; it might help.

Kltc · October 17, 2022

Well, thanks but ... No luck so far (still with the pcie_aspm option off as well as the latency one).

I just updated the BIOS from the 2018 version to the latest one, tried another restore on the cache, and got the same errors, even though it took more time to throw an error.

I find it even weirder that I can't even see what's on the drive when this happens.

image.png.d38b222ed27d646a72a4042b90f051fa.png

I've got no idea if I can do a simple downgrade from 6.11.1 to 6.10.3, but that might be the best option here. I'm completely out of ideas.

JorgeB · October 18, 2022

7 hours ago, Kltc said:

I find it even weirder that I can't even see what's on the drive when this happens.

The device is dropping offline, so that part is normal. The question is why it's dropping with just a kernel change. Since we know it's not a device problem, I suspect some board compatibility/BIOS issue.

Kltc · October 22, 2022

Hi. So ... here's the status update, and a few more comments.

TLDR: NVMe controller on the motherboard is gone. Not dead, but worse, as it even corrupts the filesystem. The fix? A PCIe NVMe adapter.

Starting from where we left off, the first thing I did was revert to 6.10.3. Good idea, but ... same errors. Very, very odd. I did a few searches, and most of them referred to a bogus motherboard error, which looked the same. The difference here is that their motherboards were new and mine isn't (4 years old). Still, I decided to get a PCIe to NVMe adapter board, and ... it works. No errors, all fine. The system still needed some repairs, but I managed to repair it without major issue. And no more corrupted blocks.

Now I'm trying to reach ASUS to find out why I can only see one NVMe and not the two on the adapter (already went to the BIOS to check). Might even get my W11 VM 100% back, but let's see. I do see a light at the end now, just hope it's not a car or a train.

Again, thanks for the help.

bigbangus · October 28, 2022

What's your motherboard and BIOS version?

Kltc · October 28, 2022

Hey,

It's the ASUS PRIME X470-PRO. As for the BIOS, it's the latest, 6042.

Since I'm writing this, I'll give another status update. It still isn't over. At least it might help someone that might come across this.

So I got the PCIe adapter, but it didn't work. The 970 Evo worked, but the 860 Evo wasn't recognized. I decided to get another NVMe to check (WD Black SN750). Turns out ... the problem was on the 860 Evo. Ok, problem solved! (Or so I thought.)

Later, thinking everything was fixed, I went and tried out a game (Insurgency). "Odd ... the game is slow." Turned on the Riva Statistics ... "The clock is all over the place ... Is the GPU gone as well?" This GPU is a GTX 970, old, I know, but it should still get the job done. Still, I tested it in another PC. The GPU is fine. *sigh* Now I'm wondering if my "marvellous" energy grid messed up my PSU. It already fried a microwave and a monitor so ... now I've bought a PSU, and I'm waiting for it to arrive.

Guess it was a train after all, let's see how heavy it is 😅

Kltc · November 17, 2022

Hey,

Final post. After a lot of tests and rebuilds, I've nailed down the messed up part. It wasn't the motherboard, it was ... the CPU (Ryzen 2700X).

Apparently (and I've got no idea how), the energy grid managed to fry part of the CPU. Meanwhile, with all the parts I've replaced, I've got a new computer. Even the motherboard is working normally now, with another CPU.

Again, thanks for the help, and hope all of this might help someone out.
