Darkguy

Members
  • Posts

    41
  • Joined

  • Last visited


  1. Currently running a Dell PERC H310 and a Dell H200 (both PCIe 2.0 x8) in IT Mode on my current setup. Each allows for 8 drives; I have a total of 7 SATA drives connected to each right now, which is the maximum possible in the case I use. I'm thinking about switching over to a case with 24 bays (plus 2x 2.5" internal drives), so I would either have to get another HBA (the board I'm looking at has 3x PCIe 4.0 x16 slots, so it would fit) or use a SAS expander. The question I'm asking myself is which option is best for me here:
     1.) H310 AND H200 AND another HBA; connect 8 drives to each?
     2.) H310 AND H200 AND a SAS expander; connect 16 drives to one HBA and 8 to the other?
     3.) H310 OR H200 AND a SAS expander; connect all 24 drives to one HBA and ditch the other one? (Or do I just needlessly sacrifice bandwidth here, given that I have a second HBA on hand anyway?)
     Which SAS expander should I be looking into, especially if I were to run all 24 drives off a single Dell H200/H310? The drives are currently a mix of old and new 2.5" and 3.5" models (all SATA3), ranging from 500 GB to 8 TB. The CPU would be a Ryzen APU, so there's an integrated iGPU and no need for a video card to use up a slot.
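A rough way to compare the three options is to work out the worst-case per-drive bandwidth when every drive is busy at once. The numbers below are back-of-the-envelope assumptions (theoretical PCIe 2.0 throughput of ~500 MB/s per lane and ~600 MB/s per SAS2 lane, ignoring protocol overhead), not measurements:

```python
# Back-of-the-envelope per-drive bandwidth for the three options.
# Assumed round numbers (not measured): PCIe 2.0 x8 ~ 8 * 500 MB/s per HBA,
# and one 4-lane SFF-8087 uplink to a SAS2 expander ~ 4 * 600 MB/s.
pcie2_x8 = 8 * 500          # MB/s: host-side ceiling of one HBA slot
sas2_wide_link = 4 * 600    # MB/s: one 4-lane expander uplink

options = {
    "1) three HBAs, 8 drives each":   pcie2_x8 / 8,
    "2) expander: 16 on one HBA":     min(pcie2_x8, 2 * sas2_wide_link) / 16,
    "3) expander: all 24 on one HBA": min(pcie2_x8, 2 * sas2_wide_link) / 24,
}
for name, mb_s in options.items():
    print(f"{name}: ~{mb_s:.0f} MB/s per drive (all drives streaming at once)")
```

Even in option 3 the PCIe 2.0 x8 slot, not the expander, is the ceiling (assuming a dual-link uplink), and spinning disks rarely all stream at full speed simultaneously, so the bandwidth sacrifice may be smaller than it looks.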
  2. Restarting this thread: I had to replace my flash drive again this weekend (the previous one I used showed read/write errors). I opted for a new, factory-sealed SanDisk USB 2.0 drive this time around. There still seems to be some problem with the super.dat file/storage of disk slots: if I merely stop and restart the array, everything is fine; disks keep their assignments and I can start the array without problems. As soon as I change anything in the list of disks (right now, I'd like to replace disk5 as described here and set disk5 to "no device"), all disk assignments are lost. There is a corresponding entry in the log:
     Sep 26 08:49:53 Tower kernel: read_file: error 2 opening /boot/config/super.dat
     Sep 26 08:49:53 Tower kernel: md: could not read superblock from /boot/config/super.dat
     The super.dat file is fine up to this point and can also be read and restored from backup. What I am trying to achieve is: replace disk5 by stopping the array, setting disk5 to be emulated by parity, inserting a new disk, preclearing the new disk using the Unassigned Devices Preclear script, then using the new disk in the disk5 slot; then repeat the whole procedure for disk3. Log attached; anything else I can try? I'm slowly running out of space, and I have disks lying around that would increase storage space by a total of 4.5 TB which I currently cannot use. darkrack-diagnostics-20220926-1006.zip
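Since the failure mode is always the same read error on super.dat, a small script run before touching disk assignments can confirm the file is readable and fall back to a backup copy if it is not. This is a minimal sketch; the paths are examples, and the backup location is an assumption to adjust to wherever your flash backups actually live:

```python
# Sketch: verify super.dat is readable before changing disk assignments,
# restoring it from a backup copy if the read fails.
# Paths are illustrative assumptions, not taken from the diagnostics.
import shutil
from pathlib import Path

SUPER = Path("/boot/config/super.dat")        # disk-assignment superblock on the flash drive
BACKUP = Path("/mnt/user/backups/super.dat")  # hypothetical backup location

def ensure_superblock(super_path, backup_path):
    """Return True if super_path is readable, restoring it from backup if needed."""
    try:
        super_path.read_bytes()               # same read the md driver attempts at boot
        return True
    except OSError:                           # e.g. "error 2" = file not found
        if backup_path.is_file():
            shutil.copy2(backup_path, super_path)
            return super_path.is_file()
        return False
```

Running something like `ensure_superblock(SUPER, BACKUP)` before stopping the array would at least tell you whether the flash drive has already dropped the file again.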
  3. That's basically what I did to create the current flash drive a few days ago.
  4. Seems I fixed the error: I unplugged all SATA cables to the bay in question and reattached them, and the error seems to be gone now. I copied the whole contents of the ddrescue image of the original disk back over to the new disk, with no more UDMA CRC errors. There still seems to be a problem with the super.dat file though - all drives lose their assignments on a reboot (but no longer on a stop/start array action). There also is no new super.dat created in /boot. There was one the first time I started off the new flash drive, but I still had the error about it in the syslog; I deleted it and it doesn't get created now. Current diagnostics attached. darkrack-diagnostics-20220622-1925.zip
  5. New diags attached, thanks darkrack-diagnostics-20220619-1146.zip
  6. Update: I'm starting to think there is some underlying issue with the server, either the drive bay, a cable, or the LSI controller. First of all, the flash drive was damaged to some extent, which explained the super.dat issue. I could not reboot off of it, but was able to copy the config folder onto another flash drive on my Windows workstation, boot off it, replace the license, and set up the array. Parity is rebuilding now, BUT I get a ton of UDMA CRC errors (about 15,000 over the past 12 hours) for the new replacement drive in the old slot of disk6. I also popped the old disk6 (the one with all the read errors) into an external bay on my Windows workstation and cloned it in an Ubuntu VM (in Hyper-V) using ddrescue. The drive was cloned in around 9 hours without a single error; I mounted the image (in the VM) and all data seems to be there. Any suggestions on the order in which I should check components? Cable, bay, controller?
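When swapping one component at a time (cable, then bay, then controller), the useful signal is whether the raw UDMA CRC counter keeps climbing afterwards - the counter never resets, so only the delta matters. A small sketch for pulling that counter out of `smartctl -A` output so before/after values can be compared; the sample line below is illustrative, not from the actual server:

```python
# Sketch: extract SMART attribute 199 (UDMA_CRC_Error_Count) from
# `smartctl -A` output, to compare the counter before and after a swap.
# The sample text is a made-up illustration, not real diagnostics.
def udma_crc_count(smartctl_output):
    """Return the raw value of UDMA_CRC_Error_Count, or None if absent."""
    for line in smartctl_output.splitlines():
        if "UDMA_CRC_Error_Count" in line:
            return int(line.split()[-1])   # RAW_VALUE is the last column
    return None

sample = "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 15213"
print(udma_crc_count(sample))  # -> 15213
```

A rising count after replacing the cable points at the bay or controller; a flat count after a cable swap means the cable was the culprit (CRC errors are almost always link-level, which matches the disk cloning cleanly in an external bay).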
  7. Presuming parity is invalid now anyway, could I just:
     remove disk 6 (and try to get data off it on another system)
     shut the system down
     fix the issue with the flash drive
     put in the new disk
     build a new config with the new disk in the slot of disk6
     rebuild parity
     then go on to replace the other two disks and rebuild them from parity?
  8. Makes sense, but I guess it's too late since the move is already underway. So probably no use in stopping the array and looking to emulate the faulty disk now?
  9. I've been aware of the SMART problems of course and have e-mail notifications turned on, but just got around to getting new disks today. I do know parity is not a substitute for backups, but since I opted for dual parity, I was under the assumption that I could wait a few weeks to replace the faulty drive. Nothing on that drive is irreplaceable, but it would still be a hassle to replace some of it. I was not aware of the error with the super.dat until today, which seems to be the major issue here, since it basically forced a new config as I understand it and damaged parity. I believe there should be a notification if any of the files on /boot cannot be read/accessed. Neither the e-mail reports, the GUI, nor Fix Common Problems has alerted me to this, or I would have tried fixing that problem back when all of my disks were OK. I even have regular backups of those files, but they only go back about a month. Sure, I should have dug deeper when the problem with the array config first came up, but since it never led to any problems, I didn't look into it.
  10. I'm currently trying to move some data (pictures) off it which I have not yet backed up, using unBALANCE. Read speeds are very slow, so this will probably take around 8-9 hours. From what was written above, changes to the emulated disk6 will have been lost (I can confirm this; some data that would have been written to that disk after it got disabled is missing now). So I presume parity is invalid now as well, especially as two tiny writes to disk6 (config files for a syncing tool I am using, which had a share on that disk) have happened in the past hour. Once I unassign it, do I click "parity is valid" or not?
  11. It also has tons of read errors sadly, but so far everything seems to work. I've enclosed the current diagnostics; the disk in question is disk 6. I'd like to fix and replace things ASAP. All in all, I have three disks to replace (disk 6 got disabled in May and I want to replace it right away; disk 5 also has read errors and I've got a second disk to replace it with). I would also replace another 500 GB disk with a larger one I have lying around. I'll patiently await further instructions. darkrack-diagnostics-20220617-1743.zip
  12. I used the old (previously disabled) disk, not the new one. Did I still somehow stick my foot in my mouth? All data on the disk seems to still be intact and accessible.
  13. No, I formatted it as an unassigned device. So what would my next steps be?
      stop the array
      shut down the server
      CHKDSK the flash drive on another machine - fix it, or potentially replace the flash drive and renew the license?
      proceed with replacing the faulty drive once the super.dat issue is fixed?
  14. Dual parity, actually. Checking the logs, I saw that the super.dat file was bad:
      Jun 17 15:27:43 tower kernel: read_file: error 2 opening /boot/config/super.dat
      Jun 17 15:27:43 tower kernel: md: could not read superblock from /boot/config/super.dat
      Jun 17 15:27:43 tower kernel: md: initializing superblock
      I re-inserted the old disk, checked "parity is valid", and started the array. Could I re-create a new super.dat after checking the flash drive for errors? The disk in question was disabled previously but got re-enabled upon restarting the array; could this lead to problems with parity?