stealth82 Posted December 16, 2015

My server has undergone a lot of changes lately, mostly because of the opportunity to virtualize. I got myself an SSD cache disk, and for the first few days everything went well. It was attached directly to the motherboard controller. Then I decided to make it hot-swappable by using an SSD converter, and I slapped everything into my trayless 5x3 HDD cage, where the array disks have been for some time now. When I put everything back together and started the server again, the cache disk was detected, the array started, and everything seemed fine.

I tinkered with my virtual machine, installing software and so on, until the day before yesterday. Yesterday I wanted to use my VM, and although the Windows desktop was visible I could not interact with anything. The VM seemed half frozen. I took a look at the syslog and found it flooded with error messages about my cache disk. I decided to turn the VM off and try to understand what went wrong (see diag1.zip). After many minutes the VM still had not shut down gracefully, so I killed it. I stopped the array and turned the server off.

I thought the problem was the SSD converter the cache disk was in. I removed the disk and reattached it directly to the motherboard. I started the server again. The SSD was there, but the file system was now corrupt, as I had expected. I removed it as the cache drive, since it showed as unmountable and unRAID offered to format it. I was able to get it back with btrfsck. I then mounted it manually and copied everything to my array (it worked, and the data seemed OK to me). I unmounted the disk and then ran btrfs check --repair. It worked again, so I set the disk as the cache drive again. unRAID read it correctly.

At this point I really wanted to make sure it was the SSD converter, so I stopped the server and put the disk back in the converter and in the cage. I booted the server, and unRAID was OK with it. I decided to test the cache disk by converting a raw image file on the array to qcow2, writing the output to the cache disk. It wrote 3.5GB and then froze: errors again. I thought it definitely was the SSD converter. I stopped everything again, attached the disk directly to the motherboard, and tried the image conversion again. This time it didn't even start: errors again (see diag2.zip).

So now I don't know what to think. Is it the file system? Has the disk gone bad? The SATA cable? The motherboard port? Can you tell from the logs? I attached 2 sets of logs.

Attachments: diag1.zip, diag2.zip
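For anyone triaging a similar syslog flood, here is a minimal Python sketch that tallies suspicious kernel ATA messages per port, which can help show whether the errors follow one port or cable. It assumes the messages carry a port tag such as "ata3:" or "ata3.00:", and the keyword list is only illustrative. The function names and the log path in the usage note are hypothetical and are not taken from the attached diagnostics.

```python
import re
from collections import Counter

# Assumption: kernel ATA messages carry a port tag such as "ata3:" or "ata3.00:".
ATA_TAG = re.compile(r"\b(ata\d+)(?:\.\d+)?:")

# Illustrative keywords only; adjust to whatever actually appears in your log.
KEYWORDS = ("error", "exception", "failed", "resetting link")


def count_ata_errors(lines):
    """Return a Counter mapping an ATA port tag (e.g. 'ata3') to the number
    of lines that mention that port together with one of the keywords."""
    counts = Counter()
    for line in lines:
        match = ATA_TAG.search(line)
        if match and any(word in line.lower() for word in KEYWORDS):
            counts[match.group(1)] += 1
    return counts


def summarize(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_ata_errors(handle)
```

Calling `summarize("/var/log/syslog")` (the path is an assumption) returns the per-port counts. If one port dominates, swapping the cable or moving the drive to another port is a reasonable next test.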
stealth82 Posted December 20, 2015 Author

I don't know why, but it turns out the SSD is unstable when attached to the primary 6Gb/s motherboard SATA controller. If I move the SSD to one of the 8 ports of my Supermicro 3Gb/s controller, it works OK. If I put a regular HD on the same channel where the SSD was throwing a gazillion errors, the HD works just fine. So I can't use my SSD at its full speed with unRAID. Why? It makes me lose confidence in the overall stability of the system. I'm too old for this crap.
interwebtech Posted December 20, 2015

I had unexplained SSD cache disk errors, enough that the disk was taken offline. I swapped to a new SATA cable and have had no problems since.
stealth82 Posted December 20, 2015 Author

I swapped the cable and did every possible test. As I wrote, what I know is that a platter drive has no problem using the same channel and cable. Could a cable really make the difference when it comes to an SSD, even though the cables have been tested and used for years?
interwebtech Posted December 20, 2015

All I can say is that my cable had worked for a long time, and had likely worked before on a different disk, as I tend to hoard them and only toss one if it is proven faulty. If you have a brand-new cable, try that.
lionelhutz Posted December 21, 2015

I'd suggest trying XFS instead of BTRFS and seeing what happens. A number of people have had issues using BTRFS on their cache disk.
stealth82 Posted December 21, 2015 Author

OK, I bought new cables and I will test everything again. If it is still unstable I will try XFS as a last attempt, after which I will keep the SSD on the 3Gb/s controller and wait for better times, sigh.
stealth82 Posted December 28, 2015 Author

I can confirm it definitely was the SATA cable. I had 4 different SATA cables and all of them proved to be unreliable, at least for a SATA 3.1 device. With a new cable, all that remained were a couple of errors that, elsewhere in this forum, were considered negligible. However, since I was still in time to have the SSD replaced, I swapped it for a SanDisk Extreme Pro. The log is now clean with this one.