NVMe drives dying



Hello,

I bought a completely new server in December for Unraid use, with some Docker containers and mainly as network storage.

I included two ADATA XPG 256GB drives as cache drives. Due to a shipping mistake I got two different models, one ADATA ASX8200PNP-256GT-C XPG SX8200 and one ADATA XPG Gammix, but since the main difference is the heatspreader, I just installed them rather than complaining to get the correct model.

I finished building it at the beginning of January.

 

In the first week of March, the first drive failed. I returned it via the seller and got a new one to install, no problems there. I thought I just had bad luck with a bad unit.

Today, the other drive failed! So I am assuming Unraid is doing something the drives do not like? Which configuration settings would be important for diagnosing that?

Could you please help me diagnose this issue, so I do not have to swap drives out so often? ;) Especially once the warranty is over, they should last a bit longer.

 

Hardware-wise, one of the drives is installed on the mainboard with the included heatspreader. It was running at about 40°C at idle and 50-55°C during large transfers. This one died today.

The other drive is installed on a riser card and runs at about 30°C at idle and 40°C under load. The drive in that slot died in March and has been replaced. Attached is the SMART report of that disk. I tried to get a report from the other one as well by swapping it to a different slot to get it running for a few minutes again, but sadly that did not work.

Given those temperatures, I do not think heat is what is killing the drives. So I am assuming software is the problem here?

 

Thanks for your help,

Alex

Attachment: ADATA_SX8200PNP_2J0420041579-20190427-1036.txt

Edited by alexhalbi
Updated Information

I guess all the models you listed are from very cheap vendors.

Usually I'm not the kind of person who says you need to buy better brands, but it seems like that's the case here.

I have had good success with the Intel p600 (and they are cheap, actually about the cheapest NVMe drives).

Normally your warranty starts anew when the drive dies from the same fault over and over again (at least in the EU).

 

Then I saw your log:

Data Units Written:                 14,029,848 [7.18 TB]


That isn't really much, I guess? My Samsung SSD 850 EVO reports 325290353990 LBAs written, which is around 151.48 TiB at 512 bytes per LBA.
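For anyone who wants to compare the two counters directly, here is a rough Python sketch of the unit conversions. It relies on the NVMe SMART log counting "Data Units Written" in units of 1000 × 512 bytes, and it assumes the SATA drive reports 512-byte LBAs (an assumption about the 850 EVO, not something shown in this thread). The function names are made up for illustration.

```python
# Sketch: convert SMART write counters to bytes so NVMe and SATA values can be compared.

NVME_DATA_UNIT_BYTES = 1000 * 512  # NVMe SMART log: one data unit = 1000 x 512 bytes
SATA_LBA_BYTES = 512               # assumption: 512-byte logical sectors


def nvme_units_to_bytes(data_units: int) -> int:
    """Bytes written, given the NVMe 'Data Units Written' counter."""
    return data_units * NVME_DATA_UNIT_BYTES


def sata_lbas_to_bytes(lbas: int) -> int:
    """Bytes written, given a SATA 'Total LBAs Written' counter."""
    return lbas * SATA_LBA_BYTES


def to_tb(num_bytes: int) -> float:
    """Decimal terabytes (10**12 bytes)."""
    return num_bytes / 10**12


def to_tib(num_bytes: int) -> float:
    """Binary tebibytes (2**40 bytes)."""
    return num_bytes / 2**40


nvme_bytes = nvme_units_to_bytes(14_029_848)       # value from the posted log
sata_bytes = sata_lbas_to_bytes(325_290_353_990)   # value quoted for the 850 EVO

print(f"NVMe: {to_tb(nvme_bytes):.2f} TB")
print(f"SATA: {to_tb(sata_bytes):.2f} TB = {to_tib(sata_bytes):.2f} TiB")
```

Under those assumptions the NVMe figure comes out at 7.18 TB, matching the log, while the SATA figure is about 166.55 TB in decimal units, or 151.48 in binary units. Either way the SATA drive has written roughly twenty times more.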

 

Heat shouldn't be the problem; in the worst case the drives just throttle so they don't get damaged. I couldn't find any datasheets for your drives, but normally they can run at around 55°C without problems, so a bit higher wouldn't be a problem either, I guess.

 

MAYBE the problem is the mainboard?

 

Edited by nuhll

The log I posted is from the drive that is still alive, obviously. It was swapped in during March, so unlike the two dead drives it was not in use for the initial transfer of all my files to the server (~10TB).

But I think its usage over the last 1.5 months is representative of the usage before that, except for the initial setup.
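To put that write rate in perspective, here is a back-of-the-envelope sketch in Python. It takes the 7.18 TB from the posted log, assumes "1.5 months" means roughly 45 days, and extrapolates linearly. The function name and the 45-day figure are assumptions for illustration, not values reported by the drive.

```python
# Sketch: linear extrapolation of SSD writes from a short observation window.

def projected_writes_tb(written_tb: float, elapsed_days: float,
                        horizon_days: float = 365.0) -> float:
    """Extrapolate total writes over horizon_days from an observed window."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return written_tb / elapsed_days * horizon_days


# 7.18 TB written (from the log) over an assumed 45 days of use
per_year = projected_writes_tb(7.18, 45)
print(f"Projected writes: {per_year:.1f} TB per year")
```

With those inputs the projection lands at roughly 58 TB per year, which can then be compared against whatever endurance rating the manufacturer publishes for the drive.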

 

If I can get a refund for the dead drive instead of a swap, I will buy another brand for sure.

Do you have any specific settings on your Unraid that are optimized for NVMe/SSD drives?


Nope. I have 2 SSDs as cache, with a VM and many plugins and Docker containers, and all downloads go to the SSDs first... Both have around 150TB written, which (according to SMART) is only around 60-70% of their lifetime.

 

But I might have read somewhere that NVMe (or M.2) is not very good for cache. I can recommend the 850 EVO, for my part.

Edited by nuhll
42 minutes ago, nuhll said:

But I might have read somewhere that NVMe (or M.2) is not very good for cache. I can recommend the 850 EVO, for my part.

Not sure where you would have read that! I would have thought exactly the opposite, since NVMe SSDs tend to be higher performance than SATA-based SSDs.

They have higher performance, that's true, but they don't last as long as normal SSDs (as far as I know).

Would be curious to see a link where you read this. I have never heard of NVMe drives having a shorter lifetime. My understanding is that they use the same underlying storage (flash), just accessed through a faster protocol.

Yes, I'm no pro, but most NVMe drives use cheaper Q-something flash, which shouldn't last for as many TB written. That's what I wrote. You have to be picky about which one to use.
 
But since you're online, I really need help here (I couldn't find anything about this in the help or on Google):
 
Replied in that other thread. I really need to see where you got this info about NVMe drives. All flash storage can vary in the method or type of memory used. NVMe is just a different protocol that allows raw speeds that would otherwise be impossible over SATA, and by itself it shouldn't have any overall impact on device endurance.


As jonp said, the protocol doesn't matter: whether it's SATA, SATA-2, SATA-3 or NVMe, it doesn't affect write endurance. What matters is how they implement the flash cells (QLC, TLC, MLC, or SLC), and whether they have an SLC write cache and how large it is. If they use QLC, the write endurance will be lower than with TLC, all else being equal. If they don't have any SLC write cache then they're fools.
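To illustrate the cell types: each one stores a different number of bits per cell, and a cell holding n bits has to distinguish 2^n charge levels. That narrower margin between levels is the usual explanation for why denser cells tolerate fewer write cycles. The Python sketch below only shows that relationship. It does not model actual endurance figures, and the names in it are made up for illustration.

```python
# Sketch: bits per cell and the number of charge levels each NAND cell type must distinguish.

BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_levels(cell_type: str) -> int:
    """Number of distinct charge levels a cell of this type has to hold apart."""
    return 2 ** BITS_PER_CELL[cell_type.upper()]


for name, bits in BITS_PER_CELL.items():
    print(f"{name}: {bits} bit(s) per cell, {voltage_levels(name)} levels")
```

So a QLC cell has to hold 16 levels apart where a TLC cell holds 8, which fits the point that QLC endurance is lower when all else is equal.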

Edited by BRiT