
bobobeastie (Members · 132 posts)

Everything posted by bobobeastie

  1. Ugh, everything was fine since the 8th, but today I noticed a bunch of subfolders were missing on one of the shares. Then I saw errors on 3-4 drives and rebooted, after which there was an error on one of the drives. I ran blkid, and two of the drives have the same UUIDs again, so I ran the generate command, but the array wasn't fixed after that, or after a reboot. tower-diagnostics-20191019-0107.zip
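For anyone hitting the same symptom, blkid's output can be filtered to show only UUIDs that appear more than once. The sample output below is made up for illustration; on the server you would pipe the real `blkid` output through the same filter.

```shell
# Hypothetical blkid output with two partitions sharing a UUID.
sample='/dev/sdh1: UUID="aaaa-bbbb-cccc" TYPE="crypto_LUKS"
/dev/sdi1: UUID="aaaa-bbbb-cccc" TYPE="crypto_LUKS"
/dev/sdj1: UUID="dddd-eeee-ffff" TYPE="crypto_LUKS"'

# Pull out the UUID fields, sort them, and print only the duplicates.
printf '%s\n' "$sample" | grep -o 'UUID="[^"]*"' | sort | uniq -d
```

On a live system, `blkid | grep -o 'UUID="[^"]*"' | sort | uniq -d` runs the same check against the real devices.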
  2. GREAT! Thank you very much for all your help. That worked and parity is being rebuilt; after the rebuild is done and the replacement drive gets precleared, I will follow this: https://wiki.unraid.net/Replacing_a_Data_Drive
  3. This is what I get when I try that:
     root@Tower:~# xfs_admin -U generate /dev/md9
     xfs_admin: /dev/md9 is not a valid XFS filesystem (unexpected SB magic number 0x4c554b53)
     Use -F to force a read attempt.
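A side note on that error: the "unexpected SB magic number 0x4c554b53" is the four ASCII bytes of "LUKS", which suggests xfs_admin is looking at the raw encrypted container rather than an opened XFS filesystem. The decoding is easy to verify:

```shell
# 0x4c 0x55 0x4b 0x53 rendered as ASCII characters.
printf '\x4c\x55\x4b\x53\n'
```

which prints LUKS.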
  4. Thanks, so it looks like I have two options: fix the UUID issue and then try to rebuild the parity drive (I'm OK with not trusting it), or take out the 2TB drive I want to replace, mount it with Unassigned Devices on another Unraid system, put a precleared 8TB drive in the system in question, use New Config and build parity, then copy the contents of the 2TB drive back to the array once things have settled. If there is no danger in changing UUIDs, then I might as well try that first while I wait for the 8TB drive to preclear. I also plan to check whether I can mount the 2TB encrypted XFS drive before doing anything risky. Is that a safe plan?
  5. Assuming that means I can't change/fix the UUID, should I be able to put sdi in a spare bay on my main server and mount it with Unassigned Devices? That drive is encrypted XFS; will Unassigned Devices be okay with this and prompt me for the key? If the above works, I can then replace sdi with the drive I was trying to replace it with originally (assuming it has a different UUID), build parity, and when that's done add back the data from the 2TB sdi. ...just found a post from you about duplicate UUIDs, in which you said that "xfs_admin -U generate /dev/sdi1" can be run. Should I try that?
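Because the disk is encrypted, that suggested command would need to run against the opened LUKS mapping rather than the raw partition. A hedged sketch only; the mapping name is arbitrary, the device name is just this thread's, and these commands only make sense against the real hardware:

```shell
# Open the encrypted partition (prompts for the passphrase),
# regenerate the XFS UUID on the opened mapping, then close it.
# /dev/sdi1 itself only exposes the LUKS header, so xfs_admin
# must be pointed at the mapper device.
cryptsetup luksOpen /dev/sdi1 sdi1_crypt
xfs_admin -U generate /dev/mapper/sdi1_crypt
cryptsetup luksClose sdi1_crypt
```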
  6. Thanks, I'm attaching the results of blkid in maintenance mode and with the array started "normally", just in case. I see that sdi is not getting a PARTUUID; sdi is the drive that I attempted to replace, but it is currently the old drive while I attempt to build parity so that I can try upgrading the drive again without losing data. started in maint mode.txt started normally.txt
  7. I started in maintenance mode and ran xfs_repair. I didn't see anything useful, but I'm not sure what to look for; attached is what it reported with the -vl flags. Starting the array after that had no positive results. flags vl log.txt
  8. In the beginning, I precleared an 8TB drive that I was replacing in my main server so that I could put it in my secondary server, to replace a 2TB drive I was getting errors on. After preclearing was done I turned the secondary server off and swapped the drive, 2TB for 8TB. I changed slot 1 to the new drive and started the array. Then it asked to format the drive, and this is where I figured I did something wrong. I think I said format during the parity rebuild, then I got spooked, so I stopped and tried the parity rebuild again without formatting, which was also probably wrong. The rebuild finished and the "new" drive was still listed as unformatted.
After that, I put the old 2TB drive back in slot 1 and tried a New Config. Nothing had been written to the array, but I was going to resync parity from the data drives just in case; better than losing 2TB. When trying that, it shows disk 9 as unformatted/no filesystem. Disk 9 is 8TB, so if I had to choose, I'd rather lose the 2TB. I double-checked that the drive I removed was the "new" previously precleared drive from my main server: the serial in slot 8 is a different brand, and the one I removed matches the serial from a spreadsheet I keep for my main array for locating drives. Hopefully I'm still in a place where I can keep all my data, but after stumbling around I could really use some help on how to do that, please. Diagnostics are attached; for the boot they were generated from, I had tried New Config and am getting that drive 9 is not formatted.
Extra stuff that might not matter: I think some of my confusion comes from the secondary server being encrypted, which my main one isn't; maybe replacing drives is different in this case. I was going to move SAS cards around, but the 16e card I bought for my new server started smoking when I started it up, so I couldn't use one of the 8i cards from it that I was going to put in my secondary server. Moving things around seems to have at least temporarily eased the errors. tower-diagnostics-20191007-1806.zip
  9. So I think it was due to having shares from the opposite server mounted in Unassigned Devices: when that remote server restarted and became unavailable, Unassigned Devices became unresponsive, which I think must have messed up the whole web interface, causing a very frustrating game of alternating server issues. There's a remote share error in the log file. I removed the shares and, knock on wood, so far so good. I have the shares mounted so I can run a dupeGuru container from Unraid to find duplicate files, so if I understand the cause correctly, I would like to find a solution other than just not mounting the shares.
  10. I was able to grab a log file with "cat /var/log/syslog > /boot/log.txt" in PuTTY, which I then found in the flash share over my network. It is attached; I notice connection timeouts at the end. log.txt
  11. I have two servers, both 1st-gen Zen based, one Threadripper and one Ryzen. The first sign seems to be a loading animation on Unassigned Devices on the Main page that never finishes loading. At this point I am unable to get diagnostic files either from the GUI or from PuTTY; with PuTTY, it just hangs at "starting diagnostics collection...". After that, eventually, none of the pages load, and I am forced to tell the server to "reboot" twice in PuTTY; the first one doesn't do anything. After the second reboot, which works, and the systems boot, a parity check starts, but it is not able to finish before all the issues start, and it seems to stop updating its status at some point near the end. Sometimes I see after a reboot that the previous parity check had finished successfully. Also, mover seems to run slowly, and sometimes the transfer speeds on the array stop showing non-zero values. Originally I was updating to the most current 6.7 RCs, though I think I skipped .2. But yesterday I downgraded to 6.6.7 by replacing the bz files on the flash drives with the bz files downloaded from the Unraid web page, and I'm still having these issues. I left the other bz files from the RCs that were not replaced with 6.6.7 bz's.
  12. Second try worked. I'm assuming diagnostics are no longer useful as I restarted, but if I'm wrong, someone let me know and I will provide them.
  13. I just finished a parity swap that took all day, and when it was done, it asked if I wanted to copy parity again, not to start the array. Because of this, I rebooted, and it asked the same thing. I tried changing disk assignments, but it said too many disks, so I changed it back and it is now copying parity again. I followed the procedure very closely; I even copied the instructions and inserted drive names into them, because I found the "old drive"/"new drive" labels confusing. I'm not sure if I did something wrong, but I don't think I did. Any advice for when it is done again? I will certainly take a screenshot and grab diagnostics if it happens again.
  14. rtorrent has been much better at connecting to the web UI since I upgraded to 6.7.1 official from 6.7.1 RC2. It is still timing out frequently, though.
  15. Good idea, I have edited my post with the attached diagnostics file. Thank you.
  16. I have been having issues with rtorrentvpn, and maybe with Unraid. I can't get the web UI for rtorrent to load reliably, and when it does load it only connects briefly. My Unraid issues, which I'm mentioning in case they're related, are that pages load slowly and settings change slowly; I give up waiting and reload whichever page, and sometimes the changes took, sometimes they didn't. When I try to initiate a shutdown or restart I am used to seeing a new page with a countdown; I'm not seeing that anymore.
Here's some maybe-pertinent backstory. This is my second current Unraid box, using leftovers from upgrades, built because the first box is out of bays and running out of space. I started it with drives precleared 3 times, with encrypted XFS, and initially did not set up a parity drive. With the array running, I started to get current pending sectors on one of the drives in the array. I waited for a drive that was preclearing at the time to finish and then had it build parity with that drive, the drive with errors, and 2 others. Then I replaced the drive with errors and let it rebuild. This morning I added a cache drive and set things up to use it, including appdata (preferred) and my download directory (yes). I also ran an xfs check of the drive that replaced the drive with errors, and a btrfs check on the cache drive.
I had been using a copy of the openvpn folder from my other box, which is running the same container successfully; I just tried overwriting the contents with a fresh download from PIA, and that doesn't seem to have made a difference. Thank you for any help, log file is attached. edit: As suggested I have attached my Unraid diagnostics. supervisord-sending.log flags-diagnostics-20190622-2223.zip
  17. I have been getting the following error since I bought this 1TB NVMe SSD in November 2018:
fstrim: /mnt/cache: FITRIM ioctl failed: Remote I/O error
This has happened with whatever the newest "next" version of Unraid was at the time. I get the error in an email after the Dynamix SSD TRIM plugin runs every night. An article on AnandTech says that the drive uses a Realtek RTS5760 controller. Is there anything I can do to get TRIM to work? If not, is there a recommended budget 1TB NVMe SSD where TRIM works? Thank you.
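One low-risk check that might narrow this down: see whether the kernel exposes any discard capability for the device at all. A sketch only; the device name is a guess, so confirm yours with plain `lsblk` first:

```shell
# Show discard (TRIM) granularity and max bytes for the cache device.
# All-zero DISC-GRAN/DISC-MAX columns mean the controller/driver pair
# is not advertising TRIM support, and fstrim will keep failing.
lsblk --discard /dev/nvme0n1
```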
  18. Of course, the computer gods must hate me; I can't get screw one threaded. Foxconn...
  19. Great, thank you so much for answering. I think what you are saying is: in the drive selection screen when the array is not mounted, select none for the parity drive, then follow the instructions for checking a file system, which I just found here: https://wiki.unraid.net/Check_Disk_Filesystems. Is that correct? I have no reason to think that any of the drives were written to, but possibly there is something Unraid was doing, or maybe as a result of the SAS card errors something was incorrectly corrected? So this sounds good; I think it is pretty unlikely that any of the 99-100% full drives were written to, and the newly precleared drives were empty. Do I need to do anything to the drive which had just been precleared by the unstable system and was marked with a red X? I'm guessing I can address the check-disk results, maybe preclear the red X drive, and then add 1 or 2 parity drives after I am satisfied with the results? I have only ever added precleared drives over the course of a year, and replaced 1 drive with a larger drive, so this is new to me. Thank you.
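The wiki procedure linked above boils down to a dry-run filesystem check per data disk. A sketch, assuming the array is started in maintenance mode; the md number depends on the slot, and on an encrypted array the target would be the opened /dev/mapper device instead:

```shell
# -n = no modify: report problems without writing anything to the disk.
xfs_repair -n /dev/md1
# Repeat for each data slot (md2, md3, ...); drop -n only after
# reviewing what a real repair would change.
```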
  20. I ordered a Designare and sent back the ASRock. I am wondering what I should do with my array tomorrow when I hook it up. I think 1 of the drives maybe had an attempted write after having been precleared, and it didn't work, so Unraid emulated it from parity. Should I rebuild parity? Can I go back to before I added the precleared drives to the array? If I can do this I will just do another round of Preclear NG on all the new drives. The only thing I did on the ASRock board was preclear drives, add them to the array, and set cache to yes. After adding them to the array was when it was clear there were real issues. Thanks.
  21. Just chatted with Newegg; they will let me return it with a 15% restocking fee. So now I need to choose a replacement. My criteria are ATX, 8 SATA, and I would choose 10GbE if there were any boards that matched all 3. I'm leaning toward the Designare, but I don't see my RAM on its QVL. With the Designare's or Aorus Gaming 7's 5 PCIe slots, I might even be able to add a 10GbE NIC with 2 LSI SAS cards and 2 graphics cards.
  22. Ugh, now I'm reading that ASRock X399 boards are not compatible with LSI SAS controllers. I suppose my options are to replace the motherboard or the SAS controller. If I go with replacing the motherboard, I wonder what Newegg will do? A 15% restocking fee maybe... OK, not too bad. If I were to replace the SAS controller, are there working alternatives with this board? I would love to get some advice from someone more practiced in these things. Thanks.
  23. Update 2: I updated the firmware to 20.00.07.00, no change. The first time it wouldn't boot; the 2nd time it looked good except for the 1 disk. The array was started, and I wanted to see what my options were if I stopped the array. After I did that, the errors rolled in, and the screen where you choose drives looks like it does at times when I boot and immediately have issues. tower-diagnostics-20181007-0156.zip
  24. Small update: I wanted to get a diagnostics file from when all but the one trouble drive show up. I have been booting into GUI mode, but these couple of times I was not, and it froze during boot, so there seems to be an issue when I don't boot in GUI mode. Also, I just noticed it says I could do the following to the trouble drive: "Format will create a file system in all Unmountable disks, discarding all data currently on those disks." I don't think anything was stored on it, unless Docker put something there; is this safe to do? Of course, the next time I boot I will probably have 5 missing drives... Edit: Just remembered that the SAS config utility said I had 9200-8e cards. The cards have internal connectors, so the "e" is weird; is this something? I found that firmware version 20-20.00.07.00 exists, so should I try it? Did they flash my cards wrong? tower-diagnostics-20181006-2326.zip
  25. I am trying to get an array going in a new case/MB/CPU (everything new but the HDDs), plus I added a couple of drives, which seemed to work. But now every time I restart I'm either good and see all but one drive, which may really be bad, or I see 4-5 drives which are not found, and I am presented with the serial number of the drive that was expected. I am tearing my hair out and feeling sick from trying to figure this out, so it's time to ask for help.
I had the drives that weren't in the existing array running preclears in the new chassis before the transplant, and some didn't pass, but those did not get incorporated with the previous array's drives. So the one drive, when things are mostly working, was just precleared in the past couple of days. It says the drive is missing and that it is emulated. When I browse the contents of that drive, there is nothing. I have 2 LSI 9211-8i SAS cards which are new to me (the eBay seller says new); at first they would freeze on their BIOS screens, so I removed the BIOS using sas2flash. I had to do this in another computer, as I couldn't get sas2flash to find any cards on my NAS.
New system:
- Norco 2442 24-drive-bay chassis
- 1920X Threadripper
- Noctua NH-U9 TR4-SP3
- ASRock Fatal1ty X399 Professional Gaming
- CORSAIR Vengeance LPX 32GB (4 x 8GB) 288-pin DDR4 3000 (PC4 24000) AMD X399 compatible memory, model CMK32GX4M4C3000C15 * I initially had this inserted not all the way (I hate single-sided RAM clips); after fixing it I ran memtest for maybe 2 hours, which I know is too short, no issues
- 2x LSI 9211-8i SAS cards, plus 2 reverse-breakout SATA->SAS cables from the motherboard's 8 SATA ports
- 3 120mm and 2 80mm Noctuas
- 1 old and 1 new AMD graphics card
- SAMSUNG 970 EVO M.2 2280 500GB PCIe Gen3 x4, NVMe 1.3, 64L V-NAND 3-bit MLC internal SSD, MZ-V7E500BW
PCIe order from the top (I/O shield end) to bottom: 1 newish AMD card, 2 LSI card, 3 old AMD card, 4 LSI card.
One thing I noticed while trying to figure out what was broken (which drive bays) was that Unraid can take 3-5 minutes to boot; sometimes it doesn't boot, and sometimes it does boot, but in GUI boot mode the webpage won't load, it just says something like "localhost can't be found". During boot there are a bunch of SAS (I think) related errors, which it says it is correcting. The only thing I can think to do is try upgrading the SAS card firmware? I didn't update it in sas2flash when I cleared the BIOS. I took a picture of the config utility; it says they are both on firmware 20.00.00.00-IT. I believe there is a newer version, which I am happy to try if instructed. My primary goal is to not lose any files; secondary, to get the new hardware running as perfectly as my old hardware with 24 drive bays in use. Thank you so much to anyone who tries to help! tower-diagnostics-20181006-2215.zip
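For the firmware question at the end: updating a 9211-8i to the P20 (20.00.07.00) IT firmware is normally done with sas2flash. A hedged sketch, not a full walkthrough; the firmware file name is the one shipped in LSI's 9211-8i P20 package, and flashing the wrong file can brick a card, so double-check against the package's own readme:

```shell
# List all detected controllers with their current firmware versions.
sas2flash -listall
# Flash the P20 IT firmware to controller 0 (-o enables advanced ops).
sas2flash -o -c 0 -f 2118it.bin
```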