Barziya

Members • Posts: 79

Everything posted by Barziya

  1. That is not quite true. That issue was never really fixed. I can still reproduce it in v5.0 on my server: I watch the network graph in Task Manager on the Windows client, and when the server has more than 4GB of memory, the graph at times drops all the way to zero for long periods, then network traffic resumes for a while, then it drops to zero again, and so on. When it stays down for a longer period, windoze complains that it's lost the connection, or that the path is no longer available. Restart the same server with the mem=4095M option, and the network transfer graph is smooth and never drops to zero while the files are copying.
  2. That syslog shows that it booted in 6GB, so we've messed something up in #4. I was probably not very clear there. The syslinux.cfg should look something like this:

       label unRAID OS
         menu default
         kernel bzimage
         append initrd=bzroot mem=4095M

     (That's not the whole syslinux.cfg, just the relevant part of it.)

     Note for anybody else reading this thread: don't use that 'mem=' option if you have less than 4GB of RAM, as that will crash your server.
  3. If you have a dying parity drive, it can't corrupt the data on the other disks, but it can leave you unprotected until you replace the failed disk. Here are some other things that could be causing your problems:

     1. A loose or bad cable could be causing the errors seen in your syslog. Check the cables and make sure they are firmly attached. Better yet, swap the cable on the disk in question with another cable and see if you get the same errors on the same disk. Also, get a SMART report on that disk; the disk may indeed be failing.

     2. I see that you have SATA disks, but for some reason they show up as IDE disks (hdX instead of sdX). Go into the BIOS settings and check the SATA setting: it is probably set to "compatible" or something similar. It should be set to AHCI.

     3. After you set your BIOS to AHCI, if the disks still show up as hdX instead of sdX, you should blacklist the obsolete IDE driver altogether. (I don't know why that obsolete IDE driver is still compiled into the unRAID kernel: every other Linux distro I know of switched entirely to the new libata driver years ago.) To blacklist that driver, add the following to the "append" line of the syslinux.cfg file on your USB flash disk:

          piix.blacklist=yes ide_gd_mod.blacklist=yes ide-generic.blacklist=yes i2c_piix4.blacklist=yes

     4. Some systems have shown serious problems with the 32-bit unRAID when run with 4GB of physical RAM or more. I see that you have 6GB of RAM installed. You can tell the kernel to boot in 3.99GB of RAM by adding mem=4095M to the "append" line of the syslinux.cfg file on your USB flash disk. Make that line look something like this:

          append initrd=bzroot mem=4095M

     Notice that it is 4095M, and NOT 4096M! This particular setting "cured" the timeout and disconnect problems I was having with my server.

     Let us know how it all goes.
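     A minimal sketch of what the relevant part of syslinux.cfg might look like with both suggested changes applied at once (the parameter names are the ones listed above; adjust for your own setup):

       label unRAID OS
         menu default
         kernel bzimage
         # blacklist the legacy IDE modules and cap usable RAM just below 4GB
         append initrd=bzroot piix.blacklist=yes ide_gd_mod.blacklist=yes ide-generic.blacklist=yes i2c_piix4.blacklist=yes mem=4095M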
  4. Implementing that "Upgrade webgui" option in 5.0-i386 was a bit premature. What's on GitHub at the moment is broken in many ways, and is by no means a stable release. So that is not really an upgrade. What the stock webGui should be pulling from GitHub is "webGui-latest-stable"; what it is pulling now is, in effect, "webGui-latest-beta".
  5. My server, too, has slow writes and freezes unless I boot it with a "mem=4095M" boot option.
  6. Or, it may be simpler if we just think of it as adding an extra 2640 bytes per stripe:

          (disks * 4096 * num_stripes) + (num_stripes * 2640)

     ... where disks = (highest slot used + the parity disk). Or simpler:

          (disks * 4096 + 2640) * num_stripes

          ( 4 * 4096 + 2640) * 1280 =  24350720 bytes =  23780K
          (19 * 4096 + 2640) * 6256 = 503382784 bytes = 491584K

     Wonky math indeed!
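     A quick way to double-check those numbers, as a small Bash sketch (the formula and the two example cases are the ones quoted above; the function name is just for illustration):

       #!/bin/bash
       # (disks * 4096 + 2640) * num_stripes, printed in bytes and KiB
       stripe_mem() {
         local disks=$1 num_stripes=$2
         local bytes=$(( (disks * 4096 + 2640) * num_stripes ))
         printf '%2d disks, %4d stripes: %9d bytes = %6dK\n' \
           "$disks" "$num_stripes" "$bytes" $(( bytes / 1024 ))
       }
       stripe_mem  4 1280    # ->  24350720 bytes =  23780K
       stripe_mem 19 6256    # -> 503382784 bytes = 491584K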
  7. You know, in early April a limited series of Seagates shipped with a live cricket inside. Not clear if it's a bug or a feature.
  8. There's faulty logic in it: mentioning how many years he's been using unRAID is somehow supposed to be proof that what he's saying is correct. The fact remains: simple parity can only detect an odd number of bits in error. It cannot determine the position of the error, and therefore it cannot correct errors. http://www.raid-recovery-guide.com/raid5-write-hole.aspx
  9. "Sorry - yes, a non-correcting check to verify integrity... If an error is found, examine the file with the error to determine if it is a defect on the parity drive or the data drive, and take the appropriate action..."

     Yeah, but you have no way of knowing that. Let's say cosmic rays flipped one bit on one of your disks. With a single-parity setup you can only detect the existence of a parity error; you have no way of knowing which disk it came from. So unless you keep separate checksums for everything (which could have been the case if we were using ZFS instead of reiserfs), you'll probably just go ahead and run unRAID's correcting parity check. It will assume that the error came from the parity disk, when with greater probability it came from one of the data disks. It will then make a "correcting" modification to the parity disk, which will in effect make the damage permanent.
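     A tiny Bash illustration of the point -- a toy example with made-up byte values, not anything unRAID actually runs:

       #!/bin/bash
       # Three "data disks" each hold one byte; the parity "disk" stores their XOR.
       d1=0xA5; d2=0x3C; d3=0x0F
       parity=$(( d1 ^ d2 ^ d3 ))

       d2=$(( d2 ^ 0x10 ))                  # a cosmic ray flips one bit on disk 2

       syndrome=$(( d1 ^ d2 ^ d3 ^ parity ))
       printf 'syndrome = 0x%02X\n' "$syndrome"   # non-zero: *something* is wrong...
       # ...but the syndrome alone cannot say whether d1, d2, d3 or the parity byte
       # is the bad one, so "correcting" parity to match the data just bakes in the error.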
  10. "That boot option is no longer presented -- it was removed in RC15. Clearly you could still add a mem=4095 parameter line manually, but at least according to limetech this issue was resolved long ago (RC15). ... and the issue is marked as "Solved" in the OS 5.x issue list: http://lime-technology.com/forum/index.php?topic=27788.0"

     You don't know what you're talking about. What you're calling solved is when Tom deleted half the posts in that thread, including Barzija's whole account, and called the issue "solved". Nice way of solving things! I have two such Supermicro boards, and I can see for myself whether the issue is solved or not.
  11. The 4GB option was removed not because the issue was fixed, but because it was a half-assed workaround, and it causes a crash if you have less than 4GB of physical RAM and you tell the kernel to boot in 4GB. The issue for the affected systems is still exactly the same with the latest v5rc16c. Using that boot option is still "the cure" on such a machine if it has 4GB or more of physical RAM in it.
  12. Interesting: a post mentioning that Tom's been working on v5 for four and a half years gets moved to Bilge very quickly. Tom may not be here, but censorship on this board is alive and well. Can we know which mod is doing that?
  13. Yes, it is. I am talking about exactly the same shakes, rattles, and rolls as Joe's script is doing. Only I don't really need 2266 lines of Bash code to accomplish that if I can do it in about 50 lines.
  14. Don't worry, people. Every time Tom disappears, he always comes back within eight months or so.
  15. Not only will it not help, it will make matters worse: it will use up more lowmem to track the highmem. I've had some serious problems when running the server with more than 4GB of RAM. As WeeboTech said, the solution is 64-bit, which at the current rate we may get sometime in 2017.
  16. No need for that, really. The random seeks are just there to add some extra stress to the heads while badblocks is doing the real job of writing/verifying. The random seeks are not speed-critical, and they can be done from a simple Bash script. What I do is: my "preclear" script spawns a simple child shell that does random reads from the disk every second or so. Then the preclear script invokes the badblocks command, and once badblocks is done, the script just kills the child shell. There's no need for anything more complicated than that.
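      A rough sketch of that arrangement, assuming a placeholder device argument and badblocks flags similar to the ones discussed later in this thread (this is not the poster's actual script):

        #!/bin/bash
        # Minimal "preclear" skeleton: stress the heads with random reads in the
        # background while badblocks does the real write/verify work.
        DEV=${1:?usage: $0 /dev/sdX}

        size_bytes=$(blockdev --getsize64 "$DEV")

        # Child shell: read 4KiB from a random spot on the disk every second or so.
        (
          while :; do
            offset=$(( RANDOM * RANDOM % (size_bytes / 4096) ))
            dd if="$DEV" of=/dev/null bs=4096 count=1 skip="$offset" 2>/dev/null
            sleep 1
          done
        ) &
        seeker_pid=$!

        # The real job: a destructive write-mode test over the whole device.
        badblocks -vsw -b 4096 -c 1024 "$DEV"

        # badblocks is done; kill the child shell that was doing the random seeks.
        kill "$seeker_pid"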
  17. I don't really care if it's one person or a thousand people. When people pay you money, you hire as many people as needed to meet the demand. And the problems with communication are inexcusable. It takes five minutes of that one person's time to make a post saying "hi, I'm still alive and working on so-and-so."
  18. "You do realize that Joe L is making preclear do random seeks in addition to the compares to stress out the disks more with additional seeks?"

      You do realize that I do realize that? For homework, patch that script to do the `sum` with -s, and watch it do the job twice as fast -- which is still nowhere near as fast as other methods, especially on older CPUs or on ULV CPUs.
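      For anyone curious what that homework looks like, a quick timing comparison of the two checksum modes (the file path here is just a placeholder):

        time sum    /mnt/disk1/bigfile      # default (BSD) checksum algorithm
        time sum -s /mnt/disk1/bigfile      # System V algorithm (-s / --sysv)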
  19. It turned out that weird -c (blocks-at-a-time) values don't cause badblocks to ignore any blocks: if the last batch of blocks is smaller than the -c value, it still gets tested. So we need not worry about what -c is. The only thing we need to ensure is that the total size in bytes of the disk/partition/file is evenly divisible by the -b block size; then every single byte of it will be tested. Once I had that verified to my satisfaction, I did some speed tests on a couple of disks. I noticed that bumping the default -b block size of 1024 up to values higher than 4096 doesn't bring any additional speed improvement. The command I liked best in the end was:

          badblocks -vsw -b 4096 -c 1024 -d 1 /dev/sdX

      While the above was running, I had a separate background task doing a read from a random place on the disk once every few seconds. That's a preclear setup I am happy with. I am glad we sorted this out. Thanks Weebo!
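      A small pre-flight check along those lines, as a sketch (the device argument is a placeholder):

        #!/bin/bash
        # Refuse to run badblocks unless the device size is an exact multiple of -b,
        # so that every byte of the device falls inside a tested block.
        DEV=${1:?usage: $0 /dev/sdX}
        BS=4096

        size_bytes=$(blockdev --getsize64 "$DEV")
        if (( size_bytes % BS != 0 )); then
          echo "Size $size_bytes is not a multiple of $BS; the last $(( size_bytes % BS )) bytes would go untested." >&2
          exit 1
        fi

        badblocks -vsw -b "$BS" -c 1024 -d 1 "$DEV"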
  20. No, badblocks doesn't do any calculations about what the -b and -c values should be. If you don't supply -b and -c, it just uses the default values of 1024 and 64 respectively, which coincidentally work for (most?) disks. The only calculation, apparently, is the total number of bytes integer-divided by the -b block size to get the number of blocks it will work on. Now that we know that badblocks cuts off along the -b block-size line, I just need to do some more investigation to see whether it also cuts off along the -c blocks-at-a-time line. If that turns out to be true, then there's also a potential for garbage resulting from improper -c values.
  21. The "here and there" was a poor choice of words, I appologize. How I discovered this issue was, I wanted to "preclear" the disk with a single write-pass of badblocks with a zero pattern. Then I hexdumped the disk for peace of mind, and I saw the garbage at the end. Trying to reproduce the thing, I zeroed out the whole disk with dd and just wrote a few non-zero bytes near the end. The single read-pass of badblocks passed that, and that freaked me out, and I reported it here. So, the non-zero bytes weren't really "here and there", they were mostly near the end of the disk. Now I understand what's happened. Still, that's not how I would expect badblocks to work. I guess I'll always have to make careful calculations myself before invoking badblocks. That's certainly not an intuitive behavior.
  22. You're still confusing things. The OP used "badblocks -b 65536". There's your test case, in the first post of this thread: if you use the exact same command that the OP did, then badblocks won't touch whatever's in the last 24576 bytes of that disk.
  23. "My prior comment may have been in haste, I believe an incomplete -c (block size) is ignored. -b option is how many -c's to read in one instance."

      You've mixed up the -b and the -c. It's actually the other way around: -b is the block size, and -c is the number of such blocks to be tested at a time. That led you to completely misinterpret my previous post. The numbers there still hold: 2000398934016 is not evenly divisible by 65536, so any garbage you have in the last 24576 bytes of that disk will not be detected by the command Joe.L used in the OP.
  24. "Well, dd also uses blocks, but it doesn't ignore stuff at the end. So it kind of is a problem with the program itself."
      "If you tell dd to read x number of blocks, it will ignore anything after x number of blocks. It's the same thing."

      No, no, I wasn't talking about the `count=` number of blocks. I was talking about dd's `bs=`, the size of its block. dd won't ignore bytes that don't constitute a full block. Badblocks' equivalent is the -b option, and badblocks seems to ignore bytes that don't constitute a full block. We can use Joe.L's example from the original post. The disk he used it on was:

          User Capacity: 2,000,398,934,016 bytes

      2000398934016 is not evenly divisible by 65536. So for his badblocks test he had 30523665 blocks of 65536 bytes each, plus 24576 bytes left over. For all he knew, he tested every single byte on his disk. In reality, 24576 bytes on his disk were never written to or tested; they can contain any random garbage.

      Furthermore, he used the -c 1024 option, so he was doing 1024 such blocks at a time. 30523665 is not evenly divisible by 1024. So if badblocks acts the same way with the blocks-at-a-time option, that would mean Joe.L had 29808 passes of 1024 65536-byte blocks, with a leftover of 273 such blocks. Which could mean that he had a total of 17,915,904 bytes at the end that were never written to or tested.

      I don't think the user of badblocks should burden himself with such calculations. I would expect every single byte of the disk to be tested, or at least a warning given otherwise. You've mentioned twice already that this tool has been around for many, many years. While I see how it's tempting to bring that up, it's not a logical argument for anything.
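      The arithmetic above, as a quick Bash check (the capacity and the -b/-c values are the ones from Joe.L's original post):

        #!/bin/bash
        capacity=2000398934016     # User Capacity of the OP's disk, in bytes
        bs=65536                   # -b block size
        batch=1024                 # -c blocks tested at a time

        blocks=$(( capacity / bs ))                  # 30523665 full blocks
        tail_bytes=$(( capacity % bs ))              # 24576 bytes outside any full block
        batches=$(( blocks / batch ))                # 29808 full batches of 1024 blocks
        leftover_blocks=$(( blocks % batch ))        # 273 blocks in a final short batch
        worst_case=$(( leftover_blocks * bs + tail_bytes ))   # 17915904 bytes

        echo "full blocks:        $blocks"
        echo "untested tail:      $tail_bytes bytes"
        echo "full batches:       $batches"
        echo "short final batch:  $leftover_blocks blocks"
        echo "worst case skipped, if short batches were dropped: $worst_case bytes"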
  25. Well, dd also uses blocks, but it doesn't ignore stuff at the end. So it kind of is a problem with the program itself.