Posts posted by UhClem

  1. With 20+ data drives, and upgrading from one-parity to two-parity, there could be a prior/intervening performance bottleneck. Do you have sufficient (single-core/thread) CPU power? I.e., can you generate enough parity data (fast enough) to justify/validate your stated question?
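
    A rough way to check (a sketch, assuming a Linux console and that the kernel's raid6/xor helper libraries are loaded; unRAID's own parity code may report differently): the kernel benchmarks its parity/syndrome routines at boot and leaves the results in dmesg.

        dmesg | grep -iE 'raid6|xor:'
        # typical lines look something like:
        #   xor: using function: ...  NNNNN MB/sec
        #   raid6: using algorithm ... gen() NNNNN MB/s
        # compare that single-core gen() figure against (number of data drives) x (per-drive streaming rate)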

     

    -- "If you push something hard enough, it will fall over."

     

     

  2. It looks like you are going in a very sub-optimal direction. You are putting 2 x 550 MB/sec SSDs (860 evo) onto a PCIe v2 x1 (ASM1061) interface (max 350-400 MB/sec total). Why not use 2 of these mSata-to-Sata adapters [

    https://www.amazon.com/Sabrent-2-5-Inch-Aluminum-Enclosure-EC-MSSA/dp/B01MS6669V

    or

    https://www.amazon.com/ELUTENG-Enclosure-3050mm-Adapter-Compatible/dp/B07258BJJF

    ] for your mSata SSDs, connecting them to 2 of your Mobo's Sata ports (for full Sata3/6Gbps speed), and get any ASM1061/2-port card for the two "replaced" Mobo-port HDDs.
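
    Once the SSDs are on the Mobo ports, it's easy to confirm you actually got the full 6Gbps link (a quick sketch; /dev/sdX is whichever device an 860 lands on):

        smartctl -i /dev/sdX | grep -i 'sata version'   # should show "current: 6.0 Gb/s"
        hdparm -t /dev/sdX                              # raw sequential read; expect roughly 500+ MB/sec, not 350-400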

     

    Surely, you can find a place to Velcro (or duct tape) the two adapter-ed mSata's, and deal with the added cables. (It's a computer, not jewelry--so Function >> Form ...kluges are cool.)

     

     

  3. Quote

    Micron 5100 Max 1.92TB - Around $200 to $220

     ...

    The Amazon page says it's MLC, though according to Micron brochures it is eTLC NAND.

    Well, that "Amazon page" is, of course, the responsibility of the seller, GoHardDrive.  Are they incompetent, or dishonest? [Remember, drives are their specialty--they should be held accountable for correctness.]

    Quote

    Samsung SM863 1.92TB - Around $215-229

     ...

    Probably a bona fide MLC NAND drive.

    Yes. From a press release of 4 years ago [Link]:

    Quote

    [ The 3-bit MLC V-NAND-based PM863 is developed for mixed pattern applications and ideal for use in content delivery networks and streaming or Web servers. ] Alternatively, the write-intensive SM863 based on 2-bit MLC V-NAND is an optimal choice for online transaction processing (OLTP) and serves as an ideal choice for email and database servers.

    [ That SM863 link on AMZN is also sold by GoHardDrive.] I have no evidence, or direct experience, but my gut tells me to question their integrity.  Keep in mind that, while I (and probably you) am (are) not able to modify/reset SMART data, it is definitely possible. A perusal of Google results for <<goharddrive honest>> is enlightening (though not ALL bad).
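
    If you do gamble on one of these used "enterprise" SSDs, at least eyeball the usage counters yourself when it arrives (a sketch; attribute names vary by vendor, and, as noted above, a determined seller could have reset them):

        smartctl -A /dev/sdX | grep -Ei 'power_on_hours|wear|lba.*written|percent.*used'
        # a genuinely new drive: a handful of hours and near-zero writes
        # a datacenter pull: tens of thousands of hours and hundreds of TB written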

     

    Who did you buy from on ebay?

     

    Good luck with your new toys.

     

  4. 5 hours ago, LammeN3rd said:

    Better yet, get your tech info first-hand: From LSI (white paper on Databolt)

    [to borrow an old Unix joke: "Use the source, Luke."]

    This paper also gives a good overview on expanders.

     

    Oh, as for the queried controller in the OP ... if Chinese fakes weren't bad enough, this one must also be avoided on moral grounds. To make such a denigrating reference/inference to the most innovative and impactful OS is pure blasphemy ... Unicaca ... feh!!!! :):)

    Quote

    Just wondering if anyone has had any luck running a Unicaca SAS 3008 ...

     

  5. 5 hours ago, RobJ said:

    I spent some time going through the linked papers, and I do appreciate their provision, did learn a little.  But I'm afraid I have yet to find one bit of evidence against the strength of the ECC bits to preserve data integrity. 

    Thank you for your summary. (I hadn't bothered -- once I saw the CERN paper was one of the references, I lost all confidence ["...baby...bath water." :) ]).

    5 hours ago, RobJ said:

    The CERN paper was frustrating, and I don't know why several sources are quoting it or referring to it. 

    ....  [ (excellent synopsis of the flaws) ] ...

    I don't think this paper should be cited at all.

    A+ for you.

     

    I'm surprised it was even published ... but I appreciate CERN's openness (or was it ignorance?). Personally, I would be totally embarrassed to admit that I had purchased, and deployed into production, 600 RAID controllers and 3000 drives, without first getting 3-4 controllers & 15-20 drives and beating the sh*t out of it all for a week or two (and not just 2GB every 2 hours). But, why should they care ... it's just the(ir) taxpayers' money. [And, in 2006, that probably represented ~US$750,000+ (in 2006 euros).] Did they even get competitive bids? [Make that $1M+]

     

    5 hours ago, RobJ said:

    The NEC paper is good, and worth reading by everyone for the ideas about other sources of silent data corruption.  At no point, does it implicate a weakness in ECC.  Rather, it points out other possibilities.

      ...

    That leaves the 'data path corruption' issue, the interesting one here, and the one that is consistent with our own experience.  It's about the many sources of corruption between the drive interface and the media surface and back again ...

    Those data path issues were formally addressed in 2007 when they were added to SMART, but had probably been implemented in drive firmware even earlier by the competent manufacturer(s).

     

    --UhClem   (almost accepted a job offer from CERN in 1968 ... then my draft deferment came through)

     

  6. 6 hours ago, c3 said:

    Many years ago the BER number from drive manufacturers were confirmed by CERN. Pretty much everyone agrees, there is no guarantee that data will be written and read without corruption. CERN found 500 errors in reading 5x10^15 bits.

    Not that CERN "study" again. c3, did you actually read it? (not just casually)

     

    I invite everyone who has participated in this thread to read it (this version is only 7 pages). See if you can find the numerous flaws in his presentation, and conclusions. Extra credit if you deduce the overall premise/motivation.

     

    -- UhClem        "Gird your grid for a big one ..."

     

  7. 11 hours ago, c3 said:

    Unfortunately, I do see drives returning corrupt data on a regular basis.

    Please convince me (not being argumentative--I'm sincere!). But, to convince me, you'll need a rigorous presentation, using a valid evidence trail. I appreciate your time/effort.

    11 hours ago, c3 said:

    Enterprise storage uses drives which offer variable (512/520/524/528 bytes per sector) configurations for additional checksum storage.

    Oh yeah, I do recall seeing that, but always in a casual perusal of one of the HGST Ultrastar manuals. Thanks for pointing it out to me in the context of a technical discussion, where I'm motivated to dig into it deeper. My first two minutes of digging has already added a few crumbs of new knowledge to my quest to understand the factory/low-level formatting.
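
    For anyone else digging: on a SAS drive, the sg3_utils package will show whether those extra per-sector protection bytes are actually in play (a sketch; assumes sg3_utils is installed and /dev/sdX is a SCSI/SAS device):

        sg_readcap -l /dev/sdX
        # reports the logical block length (512, or 520/528 on reformatted drives) plus the prot_en / p_type flags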

    11 hours ago, c3 said:

    They don't do this just because they want you to buy more drives.

    Agreed ... since the reduction in drive data capacity when going from 4k sectors to (4k+128) sectors is only 3.5%. I believe they do it so you'll buy (the same # of) (much!) more expensive drives. After all, these are the same execs/bureaucrats that spent ~$500Billion on the Y2K scam; why not soak them for a measly extra $5-10B to protect their data (and cover their hiney). Remember, fear, and lawyers, (and fear of lawyers) are great motivators in such finagles.

     

    Presently, I'm about 75% serious in the above. But I'm waiting, and very open, to be convinced otherwise.

     

    -- UhClem

     

  8. 18 hours ago, c3 said:

    ...

    ECC has a very limited ability to detect errors, and a more limited ability to correct. What a reported uncorrected error means, is the drive has data it knows is bad because it failed ECC, but was unable to correct. But a collision (easy with ECC) means you can not really trust data just because it passes ECC.

    ...

     

    Untrue! Rather than (try to) get into the theory of error detection and correction [at the level implemented in HDD firmware] (which I doubt ANY reader of this forum is competent to do--I'm surely not!), consider this: If a collision was easy (Hell! forget easy; if a collision was even *possible*), hard drives would only be used by careless hobbyists.

     

    (Regarding HDDs) I agree with RobJ (and I've written the same 2+ times on this board in the last few years):

    On 3/13/2017 at 10:12 PM, RobJ said:

    Because of sector ECC, the drive knows whether the data read is correct or not, and even tries to correct it if it only has a few wrong bits.  But it never returns the data to ANY reader (including the file system code) if the data cannot be read perfectly, corrected or not.  This means you CANNOT get corrupted data back, you only get perfect data or an error code.

    Note that "few" is pretty large (8-12+, I think) for a 512/4096-byte sector. And the firmware will make many retry attempts to get a good read; I've seen evidence of 20. And then the OS driver will usually retry several times. Only then does the OS throw a UCE.

     

    As for johnnie.black's  original experience/report, I'm intrigued/disturbed. Whose controller does Sandisk use?

     

    [

    Added: Note that in the original post there appear (I don't know unRAID's logging methodology) to have been 26 UCEs Reported by this drive over its lifetime prior to 12Mar2017:2253 (that is SMART attribute 187), and 1 Reallocated sector (SMART attribute 5; the one "somebody" is labeling `retired'). Do I assume, because of the way this logging is done, and the way that you are monitoring it, that you *know* that all of this bogosity (the 26 & 1) happened very recently? And that there was no sign of it in dmesg etc.? If so, and if this SSD's firmware implements SMART correctly, where were those 26 errors REPORTED?

     

    As I understand it, there is a different category of SMART error for logging "implicit" errors (from self-diagnosis, Trim, etc.) that can't be "reported". Speaking of which ... what does a "smartctl -a /dev/sdX" show in the Error_Log? Or does Sandisk (mis)behave like Western Digital and not bother to Log Errors? (Hitachi/HGST has spoiled me--they do [almost] everything right.) (Hey, who remembers the Samsung fiasco when they had a serious firmware bug in the F4 series (HD204, etc.)? As if the bug wasn't bad enough, when they released a fixed firmware, they HAD NOT CHANGED THE FIRMWARE VERSION/REVISION #.)

    ]
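
    For anyone wanting to pull just those bits (a sketch):

        smartctl -l error /dev/sdX                                                  # the drive's error log, if the firmware bothers to keep one
        smartctl -A /dev/sdX | grep -Ei 'reallocated_sector|reported_uncorrect'     # attributes 5 and 187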

     

    -- UhClem

     

  9. In fact, the bad blocks don't give me headaches.

    I'm more worried about these log entries:

    Error 418 occurred at disk power-on lifetime: 55 hours (2 days + 7 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 00 08 ff ff ff 4f 00   1d+02:05:45.236  READ FPDMA QUEUED
      27 00 00 00 00 00 e0 00   1d+02:05:45.209  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
      ec 00 00 00 00 00 a0 00   1d+02:05:45.207  IDENTIFY DEVICE
      ef 03 46 00 00 00 a0 00   1d+02:05:45.194  SET FEATURES [set transfer mode]
      27 00 00 00 00 00 e0 00   1d+02:05:45.166  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3] 

     

    Meanwhile I tested a second drive - the same model - and it logs the same errors as this one.

     

    Those ARE the bad blocks, in more detail and only the last 5.  UNC is short for UNCorrectable, so "Error: UNC at LBA = 0x0fffffff = 268435455" roughly means "bad block at 268435455".

    Not this time ... Look at the fuller "picture"-- first, always be a little suspicious of numbers that are "all ones" (ie, 0x0fffffff); then look carefully at the preceding commands in the error log for the conclusive clue.

     

    "The devil is in the details."

     

    --UhClem

     

  10. Specifically, the Pre Read Time speed: 209 MB/s as an AVERAGE for the entire drive is fantastic.

    I was amazed myself as well.

    There are many reasons for a disk to perform below its specs. But there are hardly any reasons for a disk to perform above its specs, especially by the amount indicated in your report. Not only that, but it is not just the one disk, but both! (The other disk is only 20-25% over spec [vs ~30% for the quoted one].)
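
    An easy spot-check, if you're curious (a sketch; assumes a 3TB drive on /dev/sdX with nothing else using it): read a chunk at the very start and another near the very end, bypassing the cache, and compare against the preclear averages.

        dd if=/dev/sdX of=/dev/null bs=1M count=1024 iflag=direct                  # outer tracks (fastest zone)
        dd if=/dev/sdX of=/dev/null bs=1M count=1024 skip=2860000 iflag=direct     # inner tracks, near the end of a 3TB drive (slowest zone)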

     

    Additionally, both 3rd passes are notably faster than the average for all 3 passes.

     

    Far-fetched as it seems, I've got to ask ... Did you set the system clock (back by about 1 hour) sometime Saturday morning? :) [via date -s XXX]

     

     

  11. Since he had run multiple cycles, the odds are some of what was "read" was from the linux buffer cache and not from the physical disk.

    Those odds are 0.00 (at least, with Powerball, you have > 0 odds :))

    The buffer cache keeps the most recent data. Doesn't preclear.sh do strictly increasing-LBA sequential operations (for all buffered I/O)?

     

    [Regardless, as you noted, (even in the most optimally perverse case,) a few "fast" GB out of 3TB would have a negligible (immeasurable) effect.]
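
    (And if anyone wants to take the cache completely out of the equation when timing reads, a sketch:)

        sync; echo 3 > /proc/sys/vm/drop_caches                      # flush the page cache first (as root)
        dd if=/dev/sdX of=/dev/null bs=1M count=1024 iflag=direct    # O_DIRECT bypasses the cache entirely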

     

     

     

  12. I just finished my first preclear runs on my first 2 disks.

    ...

    Your disks looked perfectly normal to me.

    Except, maybe, the Read performance--it's too fast :).

    ST3000DM001

    == Last Cycle's Pre Read Time  : 3:58:51 (209 MB/s)

    == Last Cycle's Zeroing time  : 5:16:21 (158 MB/s)

    == Last Cycle's Post Read Time : 13:56:02 (59 MB/s)

    == Last Cycle's Total Time    : 19:13:23

    Specifically, the Pre Read Time speed: 209 MB/s as an AVERAGE for the entire drive is fantastic.

     

    I'd have expected something closer to 150-160.

     

    "There's something going on here, but I don't know what it is ..."

     

    Ideas?

     

     

     

     

  13. Has anyone upgraded their N54L with the modified Bios for the improved SATA options?

    There are numerous reports (on other forums' MicroServer threads [homeservershow.com & overclockers.com.au]) of people successfully installing one or another unlocked ("modified") BIOS (initially intended, and used, on the N36L/N40L) on the new N54L.

     

    If you stay with the stock BIOS, SATA ports 4 & 5 are locked in IDE mode and limited to SATA I speeds. Also they are incapable of hot-swap usage and port-multiplier controllability (both of which require the port to be in AHCI mode).
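
    You can see the difference from Linux, before and after flashing (a sketch; the exact wording varies by kernel and BIOS):

        lspci | grep -iE 'sata|ide'      # the locked ports typically show up behind an "IDE interface"; unlocked, everything sits on the AHCI controller
        dmesg | grep -i 'SATA link up'   # the locked ports negotiate at 1.5 Gbps; the AHCI ports at 3.0 Gbps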

     

    Just Do It!! :)

     

  14. Note that I "stuck my nose in" here because it seemed like both RobJ and JoeL agreed that the time added to the script/cycle run (by the "feature" you described) was of some significance. It was only in the process of composing my previous (ie 2nd) response that I made an effort to quantify it (~0.5%). I'll bet neither of you realized it was so negligible--if I had, I wouldn't have bothered ... but then, look at all the fun we'd have missed.

     

    I still contend that a more apt rationale for adding the "feature" would be "I didn't want the drive to get bored." :)

     

    [Those 6 seeks every ~20 seconds are not going to affect the temperature. Here's a little experiment I just did. I had a spinning, but idle, drive at 30C. I did 5 minutes worth of flat-out reading (similar to your pre-read w/o the dance). When it finished, the drive was at 31C. I let the drive rest for 15 minutes; back to 30C. Then did 5 minutes of flat-out seeking (seektest); when finished, the drive was at 35C (that was ~1400 seeks per 20 sec.).]

     

    If the disk has difficulty in seeking, but recovers internally without posting an error externally, it does show in the speed of the preclear process and in the SMART report.

    No, it does not. In even the most perverse case, where each and every seek resulted in a read-retry, it would not even have doubled that overhead. Ie, instead of ~0.5% extra, it would have been <~1.0%. How is that going to "show in the speed of the preclear process"?

     

    Also, it does not show in a SMART report. (In anticipation of a misguided reply ... Seek_Error_Rate is not only undocumented, but also looks to not even be "implemented" on most drives (only Seagate))

    There have been plenty of disks RMA'd because their preclear speed was a fraction of other equivalent drives. 

    Huh? 995/1000 is a fraction, right? :) Seriously, any measurable speed difference was not caused in any way by the dislocation dance.

     

    ==========

    And, now for something completely different ...

     

    Here's a challenge:

     

    Add 10-20 lines to the preclear script that will cause the post-read phase to run just as fast as the pre-read phase for many/most users. For the rest, no change (but there would be such clamoring that they could easily/quickly join the party).

     

    [same exact functionality/results as now.]

     

    Who wants to be the hero???

     

    Think about it-- ~5-10 hours saved per cycle for a 2TB drive.

     

    [No questions ... just think :)]

     

    --UhClem

     

  15. I had given it a lot of thought.

    I know you did, and I respect that. (I've been in that position many times, and) That is precisely why I said "take a step back"--meaning to try to get a different view/perspective.

    The intent is to put the disk through a bit of a torture test.  I purposely wanted the disk to do something other than a linear read.  I knew it would take a bit more time.  If the disk heads cannot be positioned to tracks, then I do not want the disk in my server.

    But, here's the problem (as I see it): For a marginal drive, that little head-fake (:)) might just cause the subsequent read attempt to be off-track and/or un-settled, BUT, if so, the drive will detect that; also, even if the drive doesn't detect that it hasn't settled sufficiently, and proceeds with the read, it will obviously get ECC failure. In all of those cases, the drive will merely RETRY the read (but this time with no/negligible prior head motion), and succeed. That little dislocation-dance (every 200 "cylinders") is just a (very minor) "waste of time" [looks like only about 0.5% extra], but has no chance of leading to any "feedback". Remember, a drive will RETRY a read 10-20 times before giving up and returning UNC to the driver (and the driver will RETRY 4-5 times before giving up and returning error to the calling program).

     

    However, I definitely agree that a new drive should really get a mechanical pummeling!! (But, instead of a couple of gnats, how about a swarm of horseflies.) I give my drives 5+ minutes of constant seeking, which also serves to verify that the drive's seek time is within spec. [seektest -n 20000 /dev/sdX -- my own little hack; don't know if Linux has something like it]. If any relevant component is sub-par, or inclined to be, now's the time to find out.  To quote Crocodile Dundee: "Now, that's a torture test." (compared to the little twitches in preclear). I'll repeat this test occasionally (at least a few times per year).
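
    For anyone without such a tool, here's a crude stand-in using nothing but bash and dd (a sketch, not my seektest; single-sector direct reads at random LBAs, so per-invocation dd startup overhead inflates the numbers a little):

        #!/bin/bash
        # crude random-seek exerciser:  ./seekloop.sh /dev/sdX 20000
        # assumes 512-byte logical sectors (for O_DIRECT alignment)
        DEV=$1
        NSEEKS=${2:-20000}
        SECTORS=$(blockdev --getsz "$DEV")        # device size in 512-byte sectors
        time for ((i = 0; i < NSEEKS; i++)); do
            LBA=$(( ( (RANDOM << 30) | (RANDOM << 15) | RANDOM ) % SECTORS ))
            dd if="$DEV" of=/dev/null bs=512 count=1 skip="$LBA" iflag=direct 2>/dev/null
        done
        # NSEEKS / elapsed-seconds ~= seeks per second; invert for average seek + rotational latency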

     

    Following that initial torture session, I do a thorough surface integrity test (xfrtest ... /dev/sdX -- another personal hack). I try to repeat this once/twice per year, tracking the (very quantitative) results.

     

    --UhClem

     

  16. In addition to linearly reading blocks of 200 "cylinders" from the first sector to the last, both the pre-read and the post-read ALSO intersperse reading the very first sector, the very last sector, and three other random sectors in between for every 200 sectors read.

     

    basically, it works like this:

    Looping start

        # read a random block.

        # read the first block, bypassing the buffer cache by use of iflag=direct

        # read a random block.

        # read the last block, bypassing the buffer cache by use of iflag=direct

        # read a random block.

        # Then, read the next set of blocks linearly, from start to end, 200 "cylinders" at a time.

    Looping end

    Joe, I understand what you intend the above dislocation enhancement to accomplish, but I'd suggest that you (take a step back and really) think about it.

     

    What it does do is cause a slight (probably undetectable) seek noise, and increase the elapsed time of those phases (apparently noticeably). [Neither of which were your actual intent (but unavoidable side effects of your intent).]
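
    (For concreteness, each loop iteration boils down to roughly the following dd calls; just a sketch of the shape of it, not the actual script, with BS = one "cylinder" and END/NEXT/RAND1-3 as placeholder offsets:)

        dd if=$DEV of=/dev/null bs=$BS count=1   skip=$RAND1                 # random block
        dd if=$DEV of=/dev/null bs=$BS count=1   skip=0     iflag=direct     # very first block, uncached
        dd if=$DEV of=/dev/null bs=$BS count=1   skip=$RAND2                 # random block
        dd if=$DEV of=/dev/null bs=$BS count=1   skip=$END  iflag=direct     # very last block, uncached
        dd if=$DEV of=/dev/null bs=$BS count=1   skip=$RAND3                 # random block
        dd if=$DEV of=/dev/null bs=$BS count=200 skip=$NEXT                  # the next 200-"cylinder" linear chunk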

     

    --UhClem

     

     

  17. WOW that is excellent news.

    Two months older news now, than when I tried telling you ...

    For those interested, I just purchased a N54L.

    ...

    I'm using an addonics SIL3132 card for external eSATA access to a Sans Digital PM Unit.

    Have you tried plugging the SansDigital into the MicroServer's eSATA port? [surprise!! :)]

    I'll have to test if unRAID can support the PMP capability.

    Probably will, but performance (parity check, etc.) might be yucky.

    At the very least a SIL3132 or SIL3134 with eSATA should do nicely.

    Even there, you will be limited to about 120 MB/s total bandwidth with the PM enclosure. The SiI3132 has a bogus transfer rate limit of ~120 MB/s (even though it is PCIe x1 v1, and should be able to get 180-200 MB/s).
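
    Easy enough to verify once it's hooked up (a sketch; substitute the actual device names of the drives sitting in the PM enclosure):

        # read from every PM-attached drive at once, then add up the MB/s figures dd reports
        for d in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
            dd if=$d of=/dev/null bs=1M count=2048 iflag=direct &
        done
        wait
        # if the total is stuck near ~120 MB/s no matter how many drives join in, that's the SiI3132 ceiling, not the drives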

     

  18. It's not just the BIOS though,

    Agreed. But I am a software person, and a seemingly analogous motherboard's BIOS would be an easy starting point for comparison.

    I am studying the datasheets of the SB820M, there could be a chance that pins are held high or low on the board which enables/disables certain functionality,

    I had done that too, but only within my limitations (I studied EE [almost 50 years ago] but wasn't good at it). One thing that caught my eye (in the DataSheet) was Table 61 (pg 112) -- Performance mode. But then there is the last entry in Table 28 (on pg 77) for AZ_SDOUT which implies (to me) that  Performance mode is always available.

     

    Isn't it possible that HP decided to omit/remove a 6Gb/s setting from their BIOS so as to avoid that slight bump in TDP power draw (5.3W vs 4.9W--ref Table 61)? Or, as you surmised, maybe just to cripple the MicroServer market-segment-wise, relative to its more macho Proliant brethren?

     

    Hence, that is why I suggested checking another (SB820M-motherboard) BIOS--to see if anything jumps out at you.

     

    For example, in the MicroServer BIOS, can you tell me what effect the (SATA) 1.5G setting has, relative to the 3G choice?

     

    --UhClem

     

  19. For those interested, I just purchased a N54L.

    ...

    I'm using an addonics SIL3132 card for external eSATA access to a Sans Digital PM Unit.

    Have you tried plugging the SansDigital into the MicroServer's eSATA port? [surprise!! :)]

    I plan to swap it out with a SYBA SD-PEX40031 so I can use the internal SATA ports for the upper two bays.

    Seems a pity to put a PCIe v1 card (possibly further constrained by a PCI dependency) into a PCIe v2 slot, though I empathize with the attraction for that card's connectivity [2 SATA + 2 eSATA].

    In the mean time I am preclearing 4 drives simultaneously. They seem to be running at max speed.

     

    Two of the drives are 3TB 7200 RPM Seagates - ST3000DM001

    I was getting 170MB/s on the first 1TB of the drive, now I'm at 155MB/s.

    For the Hitachi 3TB 5400 RPM drives HDS5C3030ALA630, I started at 130MB/s and I'm at 115MB/s after the first 1TB.

     

    Nifty lil machine.

    Yes, they are. The MicroServer's 6 port SATA subsystem (via the SB820M SouthBridge) is pretty decent. You're still ~100 MB/s shy of saturation [(2x170)+(2x130) = 600]. I measure about a 675-700 MB/s ceiling (on a N40L).
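
    (If you want to watch the aggregate while the preclears run, a sketch, assuming the sysstat package is available:)

        iostat -d -m -x 5     # per-drive stats every 5 seconds; sum the rMB/s column across the four preclearing drives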

     

    Envying your 50% faster CPU ...

     

     

  20.  

    Well I am currently preclearing the disk again using preclear_disk.sh -w 65536 -r 65536 -b 200 /dev/sdX

    and it took more than 24 hours just for the pre-read. Now it is writing zeros (11% done). Interestingly, this step seems to be going a lot faster than the pre-read. I did not disable any of my plugins so we'll see if it crashes again but so far I can still get to the webgui.

     

    Here is fdisk -l output:

     ...
    Units = cylinders of 16065 * 512 = 8225280 bytes
    ... 

    Thanks for that info. Unfortunately, it does not support my theory. Since the only reports of the "out of memory" condition seemed to occur with 3TB+ drives, (and since drives > 2TB might not fit into fdisk's notion of "geometry"), I was expecting to see a "Units=" value much greater than that 8225280, which would have caused preclear to suck up a correspondingly greater amount of memory. But, alas ...

     

    It is still a fact that no good will come from preclear using that "Units=" value for determining the block-size it will pass to dd using the bs=N argument. The whole notion of disk geometry is meaningless in any post-DOS (& CP/M :)) OpSys.

     

    Now, as for your result using the override parameters. By specifying a 64KB block-size, you will definitely reduce dd's memory usage, and it should have only a negligible impact on the performance of dd itself. *BUT*, because you were advised to use "-b 200" (in conjunction with that 64KB block-size), that has caused your (overall) read performance to deteriorate. That is because each invocation of dd (by preclear) is only transferring ~12.8 MB (200*64KB) instead of the previous 1.6GB (200 * 8225280). That is the (household) equivalent of emptying your bathtub with a tablespoon (1/2 oz) instead of a pitcher (~64oz). The end result is the same (finished read cycle/empty tub), but not the time to perform it.

     

    The fact that the write cycle did not seem to suffer also makes sense, but is harder (for me) to explain (to you) [my shortcoming!]. If I were talking to a colleague, I would say "the write cycle doesn't suffer because sufficient write-behind data is pending to absorb the added [dd] invocation overheads."

     

    You can get the safety of the (reduced) "-w 65536 -r 65536" and not have any performance penalty by increasing the -b parameter to  compensate. Ie, -b (block-count) should be increased to ((8225280 / 65536) * 200), which is (~) 25600 (ie, 128x).

    So, "-w 65536 -r 65536 -b 25600" should do it. (Then you'll be emptying by the pitcherful again.)
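
    To spell out the arithmetic (bytes moved per dd invocation = block-size x block-count):

        echo $(( 8225280 * 200 ))      # default          : 1645056000  (~1.6 GB per dd call -- the pitcher)
        echo $((   65536 * 200 ))      # -b 200, as run   :   13107200  (~13 MB per dd call -- the tablespoon)
        echo $((   65536 * 25600 ))    # -b 25600         : 1677721600  (~1.6 GB per dd call again)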

     

     

  21. There was a discussion over on another thread about running out of memory while running preclear. This happened to me yesterday. It is my first time trying preclear on a 3TB disk. ...

    Could you identify that 3TB disk (by device-name: /dev/sdX)

    and report the output of:

    fdisk -l /dev/sdX
    

    replace X appropriately, and that -l is  "dash (lower-case) ell"

    [Feel free to mask out the (reported) Serial #]

     

    I suspect this whole problem derives from a misconception about disk geometry, and (maybe) the fix is simple, and efficient.

     
