Jaybau Posted April 21, 2022 (edited)
Can bits on a hard disk unintentionally change after the data has been written to disk? If yes: How does consumer-level hardware handle this? How does Unraid handle this? Will Btrfs detect and easily solve this problem? Does Unraid have tools to manage this?
Edited April 22, 2022 by Jaybau: Refine my question.
trurl Posted April 22, 2022
https://www.karlstechnology.com/blog/hard-drive-error-correcting-code-ecc/
Jaybau Posted April 22, 2022
If hard disk ECC was all that was needed, then why do we have checksums via Btrfs, ZFS, Dynamix File Integrity, SnapRAID? There's even an Unraid Wiki document for checking disk/filesystems (https://wiki.unraid.net/index.php/Check_Disk_Filesystems). This leads me to believe bit errors can silently occur: a problem that neither hard drive ECC nor Unraid handles, and one that requires additional maintenance to prevent, detect, and correct (hence the checksum tools mentioned and provided by 3rd parties).
trurl Posted April 22, 2022
14 minutes ago, Jaybau said: why do we have checksums via Btrfs, ZFS, Dynamix File Integrity, SnapRAID? There's even an Unraid Wiki document for checking disk/filesystems
These problems are not typically the result of bitrot. There have been several recent examples on this forum where bad RAM caused corruption, and there are other causes. Filesystems can become corrupt from incomplete writes due to power loss, for example. And of course, malware can result in data not being as it should be.
Jaybau Posted April 22, 2022
10 minutes ago, trurl said: These problems are not typically the result of bitrot. Several recent examples on this forum where bad RAM caused corruption, and there are other causes. Filesystems can become corrupt from incomplete writes due to power loss, for example. And of course, malware can result in data not as it should be.
To be more clear regarding my original question: Can bits on a hard disk unintentionally change after the data has been written to disk?
trurl Posted April 22, 2022
The link I already gave explains how the disk hardware and firmware deal with this problem so that the actual data returned is reliable. Of course, disks can fail in various ways, but this is also not typically "bit rot".
Michael_P Posted April 22, 2022
13 minutes ago, Jaybau said: To be more clear regarding my original question: Can bits on a hard disk unintentionally change after the data has been written to disk?
Yes, but it is unlikely (a random cosmic ray blasting a bit on the drive, for instance), and the drive would still likely report a read error. Data "decaying" in any reasonable amount of time is really unlikely.
Jaybau Posted April 22, 2022
When people define bitrot, they are mostly concerned about data decaying/degrading (rotting) over time (decades). These two comments address how "rotting" could be possible, and the probability/risks:
1) Random cosmic ray blasts a bit on the drive.
2) Data "decaying" in any reasonable amount of time, really unlikely.
Hard disk ECC is not 100% reliable. For example:
Seagate IronWolf drives have 1 unrecoverable read error per 10E14 bits read.
Seagate IronWolf Pro drives have 1 unrecoverable read error per 10E15 bits read.
I do not know what the above really means for a home data hoarder. 1 per 10E14 = 1 URE per 12.5 terabytes? 1 per 10E15 = 1 URE per 125 terabytes? Is that the probability within the MTBR time frame?
I'm not sure how many bits would be rotted if such an event were to occur (one bit over hundreds of years?), nor how many rotted/decayed bits would make a difference, especially since the source is not perfect (movies, audio, photos, my eyes, my ears, headphones, TV display). I'm not sure what 1 bit off is going to look or sound like (I presume the file will still be read as normal).
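The unit conversion asked about in this post can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes, as the post does, that "1 per 10E14" means one error per 10^14 bits read, and that a terabyte is 10^12 bytes. The function name is made up for illustration.

```python
def terabytes_per_ure(bits_per_error: float) -> float:
    """Convert a rated 'bits read per unrecoverable read error' figure
    into decimal terabytes read per expected error (assumes 1 TB = 1e12 bytes)."""
    bytes_per_error = bits_per_error / 8  # 8 bits per byte
    return bytes_per_error / 1e12

# Assumption: the spec-sheet figures are read as 10^14 and 10^15 bits.
for label, bits in (("1 per 10^14 bits", 1e14), ("1 per 10^15 bits", 1e15)):
    print(f"{label}: about {terabytes_per_ure(bits):.1f} TB read per expected URE")
```

Under those assumptions the 12.5 TB and 125 TB figures in the post are the right conversions. The rating is a per-bits-read figure, so by itself it says nothing about a time frame.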
Michael_P Posted April 23, 2022
That's not rot, that's an error. UREs would still be logged (in theory) as an error by the drive and can be immediately recognized if monitored by the OS. The theory is that this will pretty much guarantee the death of a normal RAID implementation, since drives are well above ~12 TB now: a read failure during, for example, a RAID 5 rebuild will drop a disk, and thus the array will be lost. It's only theory though, as the MTBURE is not set in stone. It's a guess as to the chance, and even then the drive is likely to recover from the error anyway. IMHO, bit rot and URE are WAAAAAAY less important to worry about than just keeping backups of your important data, and verifying your backups.
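The "RAID 5 rebuild" theory mentioned here can be put into numbers, with heavy caveats. The sketch below assumes errors are independent and occur at exactly the rated worst-case rate, which, as the post says, is a guess and not set in stone. The function name and the four-disk, 12 TB array are hypothetical and chosen only for illustration.

```python
import math

def p_at_least_one_ure(terabytes_read: float, bits_per_error: float) -> float:
    """Chance of one or more UREs while reading the given amount of data,
    ASSUMING independent errors at exactly the rated rate (a simplification)."""
    bits_read = terabytes_read * 1e12 * 8
    # 1 - (1 - 1/rate)^bits_read, via log1p/expm1 so tiny probabilities
    # are not lost to floating-point rounding.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))

# Hypothetical array: four 12 TB disks in RAID 5; a rebuild reads the
# three surviving disks in full.
surviving_tb = 3 * 12
for rate in (1e14, 1e15):
    p = p_at_least_one_ure(surviving_tb, rate)
    print(f"rated 1 per {rate:.0e} bits: P(at least one URE) = {p:.2f}")
```

Real drives often do much better than the rated figure, so this is an upper-bound style estimate under the stated assumptions, not a prediction.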
JorgeB Posted April 23, 2022
On 4/22/2022 at 12:22 AM, Jaybau said: Will Btrfs detect and easily solve this problem?
It will detect it, but since each array disk is an individual filesystem it can't fix it; you can restore the affected file(s) from backups. It can detect and fix it for redundant pools. As mentioned, bitrot is way down the list of things to worry about; data corruption due to bad RAM or other hardware issues is much more common.
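The "detect but not fix" behaviour described here is the same idea that file-level checksum tools rely on. The sketch below is a generic illustration in Python, not how Btrfs or any tool named in this thread is implemented; all function names are hypothetical. It records a SHA-256 hash per file, then later reports files whose contents no longer match.

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its current SHA-256 hash."""
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def find_mismatches(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries that are now missing or hash differently."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or file_sha256(target) != expected:
            bad.append(rel)
    return bad
```

A manifest built with `build_manifest` right after writing the data can be checked later with `find_mismatches`. A mismatch only says the file changed; it cannot tell corruption from an intentional edit, and repairing the file still needs a backup or a redundant copy, which is the point made in the post above.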