RobJ

Members

Joined
March 27, 2007
Last visited
May 18, 2017

cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up

RobJ replied to Joe L.'s topic in User Customizations

Gary, you seem very certain that this would be useful, so I assume you have either thought it through or seen tests that showed an advantage to disabling CacheDirs? I've tried thinking it through (no tests done), and I can't see any advantage at all. In fact, it looks to me as if it could actually slow the parity check down a tiny bit. Can you explain your logic and help me see what I'm missing here?

As far as I can see, a parity check and CacheDirs are 2 independent operations. The parity check heavily uses the physical I/O buses and a little CPU time, but does not use the file system, and I don't think it even uses any memory caching (could be wrong though). A properly set up CacheDirs does not use any physical I/O, does use some CPU, and does heavily access the buffered file system entries (the dentry cache, as was stated above) in order to keep marking each dir entry as too important to drop. Both need CPU time, but I doubt either needs enough to impact the other.

An enabled CacheDirs should keep the parity check humming along without interruption. Disabling CacheDirs would remove that protection, and allow other file system accesses to force the parity check to pause while the heads move elsewhere to service requests that are no longer cached. My SageTV polls the video folders every 5 minutes, and without CacheDirs running that would require physical access to those folders.

A small point, but as to drives still spinning, that only applies to those users whose drives are all the same size. Most of us have many sizes, and many of our smaller disks have spun down well before the end of the check. In my array, 8 of the 10 drives spin down before the end. They would all have to spin up on re-enabling CacheDirs, but that's not a big deal.
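To make the "keep touching the directory entries" idea concrete, here is a minimal Python sketch. It is not the cache_dirs script itself; the function names `touch_directory_entries` and `keep_warm` and the default interval are assumptions made up for illustration. It only shows the general pattern of repeatedly walking a tree and stat-ing each entry so that the kernel keeps seeing those entries as recently used. Whether the entries actually stay in RAM still depends on memory pressure.

```python
import os
import time


def touch_directory_entries(root):
    """Walk `root` and lstat every entry; return how many were visited."""
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                # lstat only asks for metadata; it never reads file contents.
                os.lstat(os.path.join(dirpath, name))
                visited += 1
            except OSError:
                # The entry vanished between the listing and the stat.
                pass
    return visited


def keep_warm(roots, interval_seconds=10.0, passes=1):
    """Scan every root `passes` times, sleeping between passes.

    Returns the total number of entries visited across all passes.
    """
    total = 0
    for i in range(passes):
        for root in roots:
            total += touch_directory_entries(root)
        if i + 1 < passes:
            time.sleep(interval_seconds)
    return total
```

If the entries are already cached, a pass like this should be served from memory and cost only CPU time, which is the behaviour described above.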
- June 22, 2013
- 1209 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

As has often been said, don't look at the RAW numbers, especially for the error rate attributes. Only Seagate reports the raw numbers; the other manufacturers keep them hidden. Just look at the VALUE, and how it compares with the THRESH (threshold) number. Your drives look near perfect.
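The VALUE-versus-threshold comparison can be sketched in a few lines of Python. The `SmartAttribute` class, the `assess` and `margin` helpers, and the sample numbers are all hypothetical, invented for illustration; they are not taken from any report in this thread. The rule used is the usual one for SMART tables: an attribute counts as failing when its normalized VALUE is at or below THRESH.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int   # normalized VALUE column
    worst: int   # lowest VALUE ever recorded
    thresh: int  # THRESH column
    raw: int     # vendor-specific RAW_VALUE, deliberately ignored below


def assess(attr):
    """Classify an attribute using only VALUE, WORST and THRESH."""
    if attr.value <= attr.thresh:
        return "failing now"
    if attr.worst <= attr.thresh:
        return "failed in the past"
    return "ok"


def margin(attr):
    """How far the normalized VALUE sits above the threshold."""
    return attr.value - attr.thresh


# Made-up sample: a huge RAW number, but VALUE is far above THRESH.
sample = SmartAttribute("Raw_Read_Error_Rate", value=117, worst=99,
                        thresh=6, raw=148575432)
print(sample.name, assess(sample), "margin:", margin(sample))
```

Note that `raw` never enters the decision, which is the point being made about error rate attributes.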
- June 5, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

I never knew that! I had always thought that the PreRead was a completely linear, sequential read. That of course explains why it is a little slower than the write zeroes phase.
- March 23, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

Looks like you ran a short test just before obtaining this SMART report. Unlike the long test, the short test does not check the entire media surface (that would take hours), so it seems somewhat significant that even the short test found a bad sector. That should be statistically extremely rare, since it only spot checks a relatively small number of tracks. Short and long SMART tests seem to stop at the first error they find, so it's probably not stuck here.

Assuming you hadn't rebooted, what is significant is that you had no trouble accessing the drive and its SMART report. My previous guess from the syslog entries above was that the drive had been disabled for some reason, and that the errors were just being spewed by a stubborn and persistent module ignorant of that fact. This shows the drive is still operational in this session. So unless you are able to access the very first part of that huge syslog, especially where the very first drive errors occurred, we don't yet have a good clue as to what is wrong with it. The syslog errors look like interface issues, yet there are also bad sectors, a completely different and generally unrelated problem.

The SMART report otherwise looked pretty good, relatively speaking. It shows 3 Pending sectors, which aren't good and will have to be dealt with, but they cannot in any way explain the lack of progress on the PreRead, or the very long wait up until now, or the errors from that syslog piece above (or its 2.5GB size).

I'd shut down and reboot, then grab a fresh copy of the syslog and a SMART report for this drive, then start a tail of the syslog in a separate console/Telnet/PuTTY session, then start a Preclear of this drive again. When/if the tail shows drive errors again, grab another copy of the syslog and another SMART report and post them here.

I'm also bothered by the rather high sector number (5,860,520,752) repeated in your syslog entries. It's a Preclear, so there is no ReiserFS on the drive to explain a large seek. You report it is still at 0% of the PreRead, so there has been no progress beyond the first few gigabytes of the drive, and the SMART report and test show no sectors beyond 52,167,120, which is still near the start of the drive. So why are there errors accessing something so far out on the drive? Who or what is requesting that sector?
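To see how far apart those two sector numbers are, they can be converted to byte offsets. This is a small sketch that assumes 512-byte logical sectors, which the post does not state; `SECTOR_BYTES` and `sector_to_bytes` are names made up here. If the drive reports a different logical sector size, the offsets scale accordingly.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def sector_to_bytes(sector, sector_bytes=SECTOR_BYTES):
    """Byte offset of the start of the given sector (LBA)."""
    return sector * sector_bytes


smart_sector = 52_167_120       # highest sector seen in the SMART report/test
syslog_sector = 5_860_520_752   # sector repeated in the syslog errors

for label, lba in (("SMART", smart_sector), ("syslog", syslog_sector)):
    offset = sector_to_bytes(lba)
    print(f"{label}: sector {lba:,} starts at byte {offset:,} "
          f"(about {offset / 1e9:.1f} GB)")
```

Under that assumption the syslog sector lies more than a hundred times further into the drive than the SMART one, which is why the question of who is requesting it matters.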
- March 22, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

Please see the Troubleshooting page, Obtaining a SMART report section.
- March 22, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

High_Fly_Writes is not a critical attribute, and is currently (and has been for some time) just an experimental item. If you examine numerous SMART reports to see how High_Fly_Writes is handled, you can clearly see that they have attached a dummy routine to it. When a High_Fly_Write event occurs, they increment the RAW value and decrement the VALUE until VALUE reaches one; after that they leave VALUE alone, so it never reaches zero. Obviously just a dummy routine. The VALUE and WORST for Spin_Retry_Count are both 100, essentially perfect, so nothing to worry about.
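The counting behaviour described above can be modelled in a few lines. This is only a sketch of what the reports appear to show, not actual drive firmware; `apply_events` and the starting VALUE of 100 are assumptions for illustration.

```python
def apply_events(value, raw, events, floor=1):
    """Model the observed rule: each event bumps RAW by one and lowers
    VALUE by one, but VALUE never drops below `floor`."""
    for _ in range(events):
        raw += 1
        if value > floor:
            value -= 1
    return value, raw


# Assumed starting point: VALUE 100, RAW 0.
print(apply_events(100, 0, 3))    # VALUE still tracks the event count
print(apply_events(100, 0, 150))  # VALUE is pinned at the floor, RAW keeps counting
```

Once VALUE is pinned at the floor it carries no further information, so only RAW still reflects the event count.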
- March 11, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

Drive looks essentially perfect.
- March 11, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

All 4 drives look fine, no issues at all.
- March 7, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

That is where they started as a brand new drive; notice that the VALUE and WORST are 100, essentially perfect. The problem is that on some of these newest attributes the 'SMART' developers have set the THRESH very close to the starting value. I suppose the thinking is that with Spin_Retry_Count you get 3 strikes and you're out, while with End-to-End_Error you only get one strike and you're dead. Keep in mind though that in both of these cases, they have not yet decided (for whatever reason) that either of these is a critical attribute, one that fails the drive, even though both are probably predictive of impending drive failure.
- March 6, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

It's probably getting bogged down, waiting for the drive to deal with all of the bad sectors. Current count of reallocated sectors is 3168, and there are 4904 bad sectors still to be dealt with, and as you said, it's only gotten through about a third of the drive. Not good.
- March 3, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

I agree with Joe, that combo of drive model and firmware is not confidence inspiring. Any chance of a firmware update for it?
- February 28, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

The error line with the BadCRC flag set indicates corrupted packets across the cabling. That is almost always a bad SATA cable, easily replaced. It could also be a power issue to the drive, but if it were a power issue you would probably have had more BadCRC errors on other drives too. Since you didn't indicate any others, it's probably a bad cable to this drive.

Both drives, after the bad sector handling, indicated "configured for UDMA/100", so I figure you must have them connected to an older disk controller, with UDMA/100 speed only. That is going to significantly impact their read and write speeds.
- February 28, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

"Strangely though, it looks like later in the log those errors are showing for a different drive: Advice? Are these drives hosed?"

Not hosed, but you do have some bad sectors on 2 drives, sdd and sdb. You also probably have a bad SATA cable to the drive sdb.
- February 28, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

The number 65535 is not 65 thousand something, it's the unsigned integer representation of the 2 byte signed integer value of -1, so I wouldn't put too much significance in it. A return of -1 is usually a flag value, indicating a possible error, or the current unavailability of the true number, or something else, but not a valid return value. If I saw a value of 65535, I would grab a few more SMART reports if possible, to see if it would change to a valid value. If not, I'd reboot and check again. If still not, then I would have to assume a bug in the SMART functions of the drive's firmware.

As to the End-to-End error, I'm not sure we should give that much significance either, because it is a relatively new attribute, and looks to me to be still experimental. That does not mean it is not significant, it IS informational, but the SMART reports where both of these occurred (this one and one a week or 2 ago) both indicate that this was flagged 'Old_age', not 'Pre-fail', and therefore it is NOT considered a critical attribute. Since it is not considered critical, it is only informational. As such it may be useful, but I'm not sure it should carry much weight in your decision making.

There are many critics of the SMART system, and both of these situations just give them more ammo. Some don't trust SMART reports at all, but I feel that once you understand and have some experience with the vagaries and inconsistencies of SMART numbers and report info, there is useful info there. I just wish they would get their act together a little better.
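The 65535 versus -1 relationship is plain two's complement arithmetic on a 16-bit field, and can be checked with a short sketch. The helper name `as_signed_16` is made up for illustration.

```python
def as_signed_16(unsigned):
    """Reinterpret a 16-bit unsigned value as a two's complement signed one."""
    if not 0 <= unsigned <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    # If the top bit is set, the signed reading is the value minus 2**16.
    return unsigned - 0x10000 if unsigned & 0x8000 else unsigned


print(as_signed_16(65535))
print(int.from_bytes((65535).to_bytes(2, "little"), "little", signed=True))
```

Both lines print the same number, because the same two bytes (0xFF 0xFF) are simply being read under the signed interpretation.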
- February 28, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

Both drives listed in your attached file look fine, no issues, brand new.
- February 25, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

The Maxtor looks fine, no issues at all. It's not that old either, with less than 20000 hours on it. At 250GB, it's a bit small, but that depends on what you want to use it for. The Samsung is a little older, with 28000 hours on it, but it too looks fine.
- February 24, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

Looks great!
- February 21, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

"You are probably experiencing resource contention of some kind. They are probably each waiting on some resource the other has. Since you did not attach a syslog, I can assume you've already looked there for clues and found nothing. Joe L."

"Thanks - no I didn't look - actually didn't even think to attach the log, thinking that since the array isn't on, what could the log show.. sorry that was nooby of me. Anyhoo - after a little while, it took off again, and is in the 90MB/s range - I guess knowing it was an EARX, I panicked too quickly. Chugging along now. The parity check also finished on my other VM. Glad that the machine didn't croak with all these disks chugging at the same time. *whew*."

OK, are you saying you had a parity check running in another VM on the SAME physical machine? That would be major resource contention! UnRAID, especially during a parity check/build, is I/O bound, meaning it will be making maximum use of the available I/O bandwidth, and everything else has to wait its turn. A VM is great for sharing unused resources, so multiple VMs can use idle CPU time and unused RAM, but NOT any unused I/O, because there isn't any! Running 2 VMs will not double your available I/O capabilities! So of course it sped up close to normal once the parity check finished.
- February 17, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

These increases in Current Pending Sectors *after* a Preclear are not a good sign! I would follow Joe's advice in this post.
- February 13, 2013
- 2842 replies
cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up

RobJ replied to Joe L.'s topic in User Customizations

Unfortunately that disables all daily cron jobs. It would be better to examine /etc/cron.daily and determine what may be running that is causing the disk usage. Also, if cron was set to run daily jobs at 4:40am and your spindown delay is one hour, then the drives would spin down after 5:40am, not before that. What is your default spindown delay? If one hour, then it looks like something is running at 4:30am and accessing for several minutes Disk 3 and Disk 4, and once in a while Disk 2. It ran for 4 to 7 minutes last week, but only about 3 and a half minutes this week.
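The spin-down timing argument is simple clock arithmetic: a drive can only spin down once the spindown delay has elapsed since its last access. A minimal sketch follows; `expected_spindown` is a made-up helper and the calendar date is arbitrary, only the times of day come from the post.

```python
from datetime import datetime, timedelta


def expected_spindown(job_start, job_minutes, delay_minutes):
    """Earliest spin-down time: last access (job start plus run time)
    plus the configured spindown delay."""
    last_access = job_start + timedelta(minutes=job_minutes)
    return last_access + timedelta(minutes=delay_minutes)


# Arbitrary date; a job starting at 4:40am with a one hour spindown delay.
start = datetime(2013, 2, 8, 4, 40)
print(expected_spindown(start, 0, 60).strftime("%H:%M"))  # instantaneous job
print(expected_spindown(start, 7, 60).strftime("%H:%M"))  # job that runs 7 minutes
```

Comparing the observed spin-down times in the log against this kind of estimate is one way to work back to when the last disk access actually happened.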
- February 8, 2013
- 1209 replies
cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up

RobJ replied to Joe L.'s topic in User Customizations

That looks like one hour after the mover runs, so you probably have the spindown timer set to one hour. So the question for you becomes - why are all of my drives except Disk 2 spinning up for the mover?
- January 30, 2013
- 1209 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

This should perhaps be a FAQ item! Short answer is that the pre-read is a straight linear read, the post-read is much more complicated, with extra testing. Better answer is provided by Joe earlier in this thread, not too long ago I think.
- January 30, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

This drive is not off to a good start, with numerous bad sectors right off the bat. This drive wasn't drop-shipped to you, it was drop-kicked to you! You may want to *try* to capture a SMART report, but if you can't - RMA it. On the bright side, it's always good to find out these things *before* you trust your data to it.
- January 20, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

This is a new disk, and it appears there was a small region with sectors not correctly initialized. Perhaps there had been an electrical spike that scrambled them? I don't know. But the zeroing phase appears to have had no problems rewriting them, so they are correct now. Why not preclear the disk one more time, to reassure yourself that the disk is fine? I really doubt you will see any further issues with this disk.
- January 14, 2013
- 2842 replies
Preclear.sh results - Questions about your results? Post them here.

RobJ replied to jbuszkie's topic in User Customizations

I don't see any connection with memory here. I'll just add though that if I have any suspicions at all about the memory of a computer, then I consider that computer to be completely unusable! Period. When you get the new memory sticks, test them with memtest overnight, until you are completely confident in them.

That pretty well rules out bandwidth differences, unless there was an issue with that specific port or cable. You might try preclearing the slow drive one more time, connected to the cable and port used by one of the faster drives. And if you have the syslog from the slow drive's preclear, check it for any drive-related errors/exceptions.
- January 12, 2013
- 2842 replies

Everything posted by RobJ
