Cache drive read-only issue [Solved]


Nihil


I keep running into a read-only issue with my cache drive(s). At first I assigned an Intenso 128GB SSD to the cache, and after a while read-only errors started to happen. Thinking it was a cheap SSD that might have developed some bad sectors, I replaced it with a Samsung EVO 120GB SSD. Everything ran smoothly until a week ago, when read-only errors started to appear again. Restarting the array and rebooting the system does not help; the only way to make the drive usable again is to reformat it, but that's a very short-term solution.

 

The drive is plugged directly into one of the motherboard's SATA ports. I've tried switching the ports and cables.

 

I've found multiple topics on this issue, but did not find a solution in any of them. One of the topics said that btrfs is not the best choice for a single-drive cache, so I formatted the drive as XFS. Same issue two days later.

Running the xfs_check outputs:

Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
[after a while]
...Sorry, could not find valid secondary superblock
Exiting now.

 

My cache drive usually contains:

- appdata

- docker.img

- Games share with a single active game I play

- Downloads share

 

I've included the diagnostics. Can anyone help me find a solution?

triglav-diagnostics-20161011-2349.zip

Link to comment

It would seem that I misspoke: rebooting did help this time. I'm not sure whether that was thanks to the xfs_check or to XFS itself, but up until now rebooting didn't help with btrfs.

 

But still, I believe the issue will return pretty soon, so if anyone has any ideas on what to do, I'd appreciate it.

Link to comment

Running the xfs_check outputs:

Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

 

This usually happens when running xfs_repair on the device instead of the partition. Post the command you're using.

 

Your docker.img is corrupt and needs to be deleted and rebuilt.

 

You're getting lots of what look like interface errors on both your SSDs. Did you replace both the power and SATA cables? Are they in some sort of enclosure?
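
One common way to look for this kind of interface error is the SMART CRC error counter, which on most SATA drives is attribute 199. The following is only a minimal sketch: crc_errors is a hypothetical helper name, and it assumes smartctl (from smartmontools) is installed and that the raw value is the tenth column of the `smartctl -A` attribute table. A counter that keeps rising usually points at cabling or connectors rather than at the drive itself.

```shell
# Hypothetical helper (not from this thread): reads `smartctl -A` output on
# stdin and prints the raw value of SMART attribute 199 (the CRC error counter).
# Assumption: the raw value is the 10th whitespace-separated column.
crc_errors() {
    awk '$1 == 199 { print $10 }'
}

# Intended usage (device name taken from this thread, adjust to your system):
#   smartctl -A /dev/sdl | crc_errors
```

Running it again after a few days and comparing the two numbers shows whether new errors are still being logged.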

Link to comment

I see, I've run "xfs_repair -v /dev/sdl".

 

Yes, the docker image keeps getting corrupted, which I believe goes hand in hand with these read-only issues.

 

No, I did not replace the SATA power connectors, I just swapped the cables. The SATA power connectors are extended using Y-splitters, and there's also a bay fan controller powered from the same extension. Could that be an issue? Another thing is that I've bundled the SATA cables tightly together. Could that actually cause behavior like this?

Link to comment

I see, I've run "xfs_repair -v /dev/sdl".

That is an incorrect command.  When using a physical device you also need to specify the partition to be operated on (e.g. /dev/sdl1).
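
As a rough illustration of the difference, here is a minimal shell sketch that refuses a whole-disk path and otherwise prints a check-only command. The helper names partition_name_ok and repair_cmd are hypothetical, the name test assumes plain sdX/sdXN device naming, and nothing is actually executed. The -n flag is xfs_repair's no-modify mode, and the filesystem has to be unmounted before running the real command.

```shell
# Hypothetical helper: true if the path ends in a partition number (/dev/sdl1),
# false for a whole disk (/dev/sdl). Assumes sdX-style names only.
partition_name_ok() {
    case "$1" in
        /dev/sd[a-z]*[0-9]) return 0 ;;
        *)                  return 1 ;;
    esac
}

# Hypothetical helper: prints the xfs_repair command instead of running it.
repair_cmd() {
    if ! partition_name_ok "$1"; then
        echo "refusing: $1 looks like a whole disk, try ${1}1" >&2
        return 1
    fi
    # -n: no-modify mode, only reports what it would fix
    echo "xfs_repair -n $1"
}

repair_cmd /dev/sdl1
```

Once the dry run looks sane, dropping -n lets xfs_repair actually make repairs.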

No, I did not replace the SATA power connectors, I just swapped the cables. The SATA power connectors are extended using Y-splitters, and there's also a bay fan controller powered from the same extension. Could that be an issue? Another thing is that I've bundled the SATA cables tightly together. Could that actually cause behavior like this?

Bad power connections can cause all sorts of intermittent issues, particularly if you get momentary disconnects.

 

Bundling SATA cables together is contra-indicated as that can increase the risk of cross-talk.  The exact opposite is the normal recommendation - keep SATA cables as far apart as possible.

Link to comment
  • 2 weeks later...

OK, to leave some feedback on this topic for any fellow user with the same issue: I've been running the system for two weeks without any issues after unbundling the SATA cables (and replacing the SSD ones just to be sure). The interface disk errors are also gone, and everything has been running smoothly.

 

Ending with a picture of the neat cable management that caused the issue:

 

20160806_164223_zpsziab1fgo.jpg~original

 

 

Link to comment

I'm not positive that the issue is crosstalk; another very likely candidate is the poor design of the SATA data connector. Bundling the cables can put bending stress on the junction between the cable and the drive, so that either the top-facing or bottom-facing contacts are not fully seated. I'm fairly sure bad connections are the problem in 99% of the drive errors that aren't caused by a failing drive.
Link to comment

That explanation makes more sense. I've been running the system these two weeks with the back cover of the case open and the cables dangling freely outside it. I guess it could also be that I put too much stress on the connectors by bundling the cables this way.

Link to comment
