abhi.ko

Members
  • Posts: 350
Everything posted by abhi.ko

  1. Thank you. Will do now. I was worried about the comment "this might cause corruption".
  2. Phase 1 - find and verify superblock...
     bad primary superblock - bad CRC in superblock !!!
     attempting to find secondary superblock...
     .found candidate secondary superblock...
     verified secondary superblock...
     writing modified primary superblock
     sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
     resetting superblock root inode pointer to 128
     sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
     resetting superblock realtime bitmap inode pointer to 129
     sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
     resetting superblock realtime summary inode pointer to 130
     Phase 2 - using internal log
     - zero log...
     ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

     I got this now. The disks are unmountable, as seen earlier. So should I run xfs_repair with -L and then attempt the repair, or just ditch the repair and resync parity to the same disks? (See the xfs_repair sketch after this list.)
  3. @JorgeB As instructed, I ran the XFS filesystem check with the -nv option on both of my disabled disks (12 & 19); neither was able to run, and both came back with this:

     Phase 1 - find and verify superblock...
     bad primary superblock - bad CRC in superblock !!!
     attempting to find secondary superblock...
     .found candidate secondary superblock...
     verified secondary superblock...
     would write modified primary superblock
     Primary superblock would have been modified.
     Cannot proceed further in no_modify mode. Exiting now.

     What do you recommend? Should I even try a repair, or just resync parity? Could you please spell out the steps for resyncing parity with 2 failed drives for me? (The read-only check I ran is sketched after this list.) Thank you @Michael_P and @Vr2Io - I will distribute the Molex connections better and report back. I am planning to use one backplane per SATA/PERIF connector on the PSU.
  4. Thanks again. I do not plan to do the DIY connectors - is using SATA-to-Molex adapters not a good substitute? Where would I get additional Molex connector cables for the PSU? Would any cable work with any PSU, or do I have to buy them from the same PSU manufacturer?
  5. Thank you both for clarifying. So long term, if I get a PSU that has 6 SATA/PERIF 6-pin outputs, something like this one, and set up a one-backplane-to-one-connector ratio on the PSU, that should be ideal, right? 4 HDDs per PSU connector. Also, if I use SATA-to-Molex cables to connect these single backplanes to a SATA PSU connector, would that be okay, or do I need to get 6 Molex to 6-pin PSU connectors? Appreciate all the help, guys. Learning things I did not know, so I appreciate it very much.
  6. Thanks - but I am a little confused, because this is all still on a single 12V rail, right, irrespective of which connector we plug it into? So how does it distribute the load? Apologies if I am missing something obvious. Is 83 A enough to boot up the whole system? I thought that was the problem and that I needed a beefier PSU with more amperage on the single 12V rail. (A rough spin-up estimate is sketched after this list.) If it is enough, then I have a few of these lying around - shouldn't these do the trick: connect them to the SATA connectors from the PSU, and connect the 6 backplanes to the 4 SATA/PERIF connectors? Not 1:1, but that distribution of the load should help, right? Currently everything is on one connector to the PSU.
  7. Thank you @JorgeB, I will do it. Should I do something about the power situation in my case before that? Based on other comments here from @Michael_P and @Vr2Io - thank you both - yes, I am using power splitters to connect all 6 backplanes to a single PSU connector (picture attached), which I think might be causing all of this; please correct me if I am wrong. Should I get a different PSU? I currently have this one, which I believe is a single +12V rail PSU with an 83 A max output. I have a total of 23 disks including parity and cache (the cache is an SSD), and the majority of these HDDs are 7200 RPM ones. If I should change, do you have any recommendations? Or should I just change how they are powered?
  8. Diagnostics attached. Also attached is a picture I took from the monitor connected to the server; it seems to relate to the two disabled disks, but I attached it just in case it gives more info. All other disks mounted fine. No sounds other than the normal bootup and fan noises were noticed. Hopefully these diagnostics have enough information. tower-diagnostics-20220208-1240.zip
  9. Thank you @Michael_P. What do you mean by splitters? Like a SAS to SATA cable? I use a Norco 4224 case, which has a SAS backplane with 6 SAS connectors (1 per 4-drive tray). I have 8 SATA ports on my motherboard, which are connected using 2 SAS to 4 SATA reverse breakout cables, and 16 drives go directly to the LSI 9300-16i card, using SAS connectors similar to this. Both failed drives are on the SAS cables connected directly to the HBA card. Only one of the SATA-connected disks is showing errors. Do you mean the reverse breakout cables when you say splitters?
  10. Okay, I will turn it on, listen, and see if I hear anything. I have reconnected all the cables, re-seated the LSI card, and made sure all connections are tight. Question - I have two drives that are disabled in the array; what are the next steps when I turn it on? Do I just unassign them, start the array, stop it again, reassign the same drives, and let the parity resync run for both drives at once, or do I do one drive at a time, or should I do something else? I have dual parity, so if one more drive becomes disabled I will lose data, won't I? I just rebuilt another old Seagate drive that failed last week, not sure if that is related to this or not, so I'm concerned that one of these drives with reallocated sectors will go bad before the parity sync finishes and cause me to lose data. Any suggestions you have for next steps would be very helpful. As I had asked, I can turn the array on and run/post diagnostics as a first step, and then shut down the server, if you think more information would help and if that is safer.
  11. Thank you both. How can we make sure whether it is the drives, the PSU, or the cables? (A SMART check that helps narrow this down is sketched after this list.) I do not know why it is only those Toshiba drives and not any of the other 18 or so drives.
  12. What do you suggest? Should I start the array with the two disabled disks and run diagnostics to post here? The array is mountable, though. I am just worried about losing data. I keep getting reallocated-sector errors, and the counts keep going up on all the Toshiba disks I have in the array. It's weird that it is just those disks, since they are physically spread across different trays in the case, so it is not as if one tray/backplane has gone bad; other disks are not having the same errors. Are there any reported issues with Toshiba disks, especially with RC2? Screenshot below of all the warnings I got when I turned the server on for a few minutes. Any advice on next steps, please?
  13. I have an LSI 9300-16i controller with 4 SAS cables plugged into it, and it has a power connection from the PSU as well. The PSU is a 1000 W EVGA unit, which I thought should be plenty of power - shouldn't it be? I just reconnected all the cables to the LSI card and re-seated the card. I started the server and see the two disks are in a disabled/error state. I have dual parity; can I just rebuild them onto themselves? Attached is a diagnostics output taken without starting the array, not sure if that helps. tower-diagnostics-20220207-1833.zip
  14. Hello All - I have multiple disks in the array failing and multiple disks with errors, out of the blue. It seems like my controller or cables are causing the issue, but I'm not sure; I did check everything recently when I added some RAM, and all looked good. I have the server shut down currently since there are 2 failed disks now. The attached diagnostics are from before the second one failed. There are multiple disks with errors as well: I replaced one failed disk, and while the parity sync was running I got a log full of errors and multiple disks reporting errors; one disk failed at the beginning of the parity sync and a second one towards the end. I did recently update to 6.10 rc2, but the initial issue started while I was on 6.9 stable, referenced here; the disk now in a failed state is the same one referenced in that thread. I set up a Win 11 VM yesterday, which got added fine and everything was working well, and then this started. Please help. I have an HA virtual machine that is always running, so my home automation is not working either currently. I am trying to determine the next course of action; all hardware is pretty new. tower-diagnostics-20220207-0650.zip
  15. @Squid Disk 12 was rebuilt as we discussed and you had suggested, but I keep having errors. I have checked the cables and everything looks good, but I have no idea why the errors keep happening. Another disk failed, but that was a really old disk, so I replaced it, and now it is stuck mounting. The diagnostics I downloaded before stopping the array to restart it are attached as well. Please - all help is welcome. tower-diagnostics-20220206-1206.zip
  16. Thank you sir! When you say 'being it', do you mean just reassign the disk to the same slot, or add it to another empty slot after the new disk is rebuilt? Does the disk look okay? Also, should I wait for the Read-Check to finish, or just cancel it and start the rebuild process?
  17. Happy New Year everyone! My new year started with a minor issue on the server. I have one disk showing up as disabled/error after I had to reboot the server mid-parity check - I forgot a parity check was running. Currently the server is performing a Read-Check after the unclean shutdown was detected. I only have single-disk parity enabled and was planning to upgrade to dual soon; I have two drives pre-cleared and ready for an emergency such as this, so I can swap the failed drive and rebuild. One drive is the same size as my current parity and the other one is smaller. However, I had a few questions before I start that process. I wanted to make sure the failed drive is actually a goner and needs to be thrown away, or whether it can still be cleared and added back to the array. Diagnostics posted below - can someone please take a look and let me know what you think about Disk 12 (sdv)? Should the second parity drive be the same size as the current one? Also, is adding a second parity drive as simple as assigning the pre-cleared drive to the second parity slot in the array and restarting? Order of operations - I'm planning to first rebuild the new disk to replace the failed one, then once that is finished run a parity check with the new disk assignment, and then upgrade to dual parity? All help appreciated as always. tower-diagnostics-20220101-1441.zip
  18. Hello - installed this plugin and it is working fine - thanks for the work. I am not super knowledgeable regarding networking, but I am using "Remote access to LAN" as shown below and followed the steps laid out in the very detailed write-up. I can connect to Unraid and docker containers (e.g. Plex, Emby, etc.) fine from outside my home network, so ports are forwarded correctly and everything seems to be working well. However, I cannot access anything else connected to my LAN (e.g. Pi-hole running on an R-Pi) or my router admin page. I was using OpenVPN before this and could access all devices on the network easily. Any advice for me to try to reach everything connected to the LAN? (There is a routing sketch for this after this list.) Thanks in advance.
  19. No I did not, and that was exactly what was happening with my upgrade too. So I downgraded back to 6.8.3 stable and everything is fine, no issues whatsoever; nothing changed other than the version of unRaid. I had this happen twice - once with the 6.9 beta and once with RC1 - so I am pretty sure it is something related to the code changes, though I have no idea what. I did post my diagnostics but did not get any feedback on them. Another thing, which I believe is related, is what happened with my NVMe cache drive: the 6.9 RC showed errors related to the NVMe drive (Samsung 980 Pro 1TB in an M.2 slot), and after that it became unavailable, and all my cache content with it. So I switched to another SSD on hand and rebuilt everything, and was planning to remove the M.2 drive and RMA it, but strangely enough on the next reboot it came back as unassigned with no data loss - with all the appdata and VMs intact. I was running it on 6.8.3 until recently, and moved it to my Windows machine last week. So something is up; it seems like the 6.9 RC doesn't like my hardware/setup. @touz do you have any NVMe drives in your build - what hardware are you using?
  20. I understand the reasons for an unclean shutdown. I'm just trying to figure out why this wasn't a problem before the upgrade, on 6.8.3; I had even rebooted the server yesterday morning, and it shut down gracefully and came back up without any issues. I did not change the location of the flash drive or anything else between that reboot and the upgrade reboot. Hence the thought that this might be an issue with the upgrade. In fact, I had tried to upgrade previously (beta 35, I believe) and had the same issue while stopping and starting the array; it always comes back as unclean and starts a parity check. So I decided to go back to the stable version; now that the RC is out, I decided to try again and wanted to report it. It seems like there is at least one other user with the same issue, @ClunkClunk. Was there anything in the logs? And what is with the error related to my NVMe drive? That is also new since the upgrade.
  21. So I upgraded just now, and after the reboot via the WebGUI (I did not shut down the server physically), unRaid came back up, said it detected an unclean shutdown, and started a parity check. There are errors related to my NVMe cache disk in the log; none of this was there prior to the upgrade, I believe. Any idea what is going on? tower-syslog-20201211-0230.zip
  22. Oh, I see now. I never used this New Config option before, and I don't think it was even there when I built the server in 2011. Here is what I did: stopped the array, then Tools --> New Config --> Keep All Assignments --> Apply: done. Added Disk 15 back to the array, using the same disk as before. Started the array. Everything looks good as of now, and the parity rebuild is in process. I will report back how it goes. Thanks a ton for your help. I will try to troubleshoot the disks-not-being-detected issue soon after; hopefully I won't mess anything more up in that process. Any guidance you can provide there would be helpful. BTW, I did check the BIOS boot settings: 'Option ROM' and 'UEFI and Legacy OPROM' are selected under boot device control and CSM is enabled, but still no luck getting the LSI BIOS to show up.
  23. Thanks! Did that, and it looks good after un-assigning it from the array and mounting it in UD.

     Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
     Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] 4096-byte physical blocks
     Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Write Protect is off
     Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Mode Sense: 9b 00 10 08
     Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Write cache: enabled, read cache: enabled, supports DPO and FUA
     Dec 7 11:31:09 Tower kernel: sdq: sdq1
     Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Attached SCSI disk
     Dec 7 11:31:32 Tower emhttpd: ST8000VN004-2M2101_WKD08RSZ (sdq) 512 15628053168
     Dec 7 11:31:33 Tower kernel: mdcmd (16): import 15 sdq 64 7814026532 0 ST8000VN004-2M2101_WKD08RSZ
     Dec 7 11:31:33 Tower kernel: md: import disk15: (sdq) ST8000VN004-2M2101_WKD08RSZ size: 7814026532
     Dec 7 11:31:38 Tower emhttpd: shcmd (53): /usr/local/sbin/set_ncq sdq 1
     Dec 7 11:31:38 Tower root: set_ncq: setting sdq queue_depth to 1
     Dec 7 11:31:38 Tower emhttpd: shcmd (54): echo 128 > /sys/block/sdq/queue/nr_requests
     Dec 7 12:57:25 Tower emhttpd: ST8000VN004-2M2101_WKD08RSZ (sdq) 512 15628053168
     Dec 7 12:58:22 Tower unassigned.devices: Issue spin down timer for device '/dev/sdq'.
     Dec 7 12:59:38 Tower unassigned.devices: Adding disk '/dev/sdq1'...
     Dec 7 12:59:38 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdq1' '/mnt/disks/ST8000VN004-2M2101_WKD08RSZ'
     Dec 7 12:59:38 Tower kernel: XFS (sdq1): Mounting V5 Filesystem
     Dec 7 12:59:38 Tower kernel: XFS (sdq1): Starting recovery (logdev: internal)
     Dec 7 12:59:38 Tower kernel: XFS (sdq1): Ending recovery (logdev: internal)
     Dec 7 12:59:38 Tower unassigned.devices: Successfully mounted '/dev/sdq1' on '/mnt/disks/ST8000VN004-2M2101_WKD08RSZ'.
     Dec 7 12:59:38 Tower unassigned.devices: Issue spin down timer for device '/dev/sdq'.

     The content looks okay as well; screenshot below. I have no way of verifying whether this is exactly what was on Disk 15 before the FS corruption, but it looks good to me from a cursory look. Diagnostics attached as well. What should I do next? tower-diagnostics-20201207-1305.zip
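
A few command-line sketches referenced in the posts above.

For the read-only check mentioned in item 3: this is normally run from the console with the array started in maintenance mode. A minimal sketch, assuming disks 12 and 19 map to /dev/md12 and /dev/md19 (the device numbers are assumptions and may differ on a given system):

    # No-modify XFS check: -n makes no changes, -v is verbose.
    # Device names are assumptions for disks 12 and 19.
    xfs_repair -nv /dev/md12
    xfs_repair -nv /dev/md19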
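
For the "valuable metadata changes in a log" error quoted in item 2, the xfs_repair message itself spells out the order of operations: mount so XFS can replay its journal, unmount, re-run the repair, and only fall back to -L if the filesystem cannot be mounted at all. A hedged sketch of that sequence; the device name and mount point are assumptions:

    # 1. Try to mount so XFS replays its own log, then unmount cleanly.
    mkdir -p /mnt/check
    mount -t xfs /dev/md12 /mnt/check && umount /mnt/check

    # 2. Re-run the repair (without -n) once the log has been replayed.
    xfs_repair -v /dev/md12

    # 3. Last resort only, if the mount attempt fails: -L zeroes the log
    #    and can discard pending metadata updates (the corruption warning
    #    in the error message above).
    # xfs_repair -L /dev/md12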
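
On the +12 V question in items 6 and 7: a rough back-of-envelope spin-up estimate, assuming about 2 A per 7200 RPM drive at 12 V (an assumed typical figure; actual spin-up current varies by model):

    # Rough +12 V spin-up budget for the 22 HDDs (23 disks minus the SSD cache).
    drives=22
    amps_per_drive=2   # assumed typical 7200 RPM spin-up draw at 12 V
    echo "~$(( drives * amps_per_drive )) A peak at spin-up vs. the 83 A rail rating"

Under that assumption the single 83 A rail itself has headroom; the concern raised in the thread is how that load is split across the PSU's individual cables and connectors, which is why a one-backplane-per-connector layout is being suggested.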
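
For the drive-versus-cable question in items 11 and 12, comparing SMART attributes is one way to narrow it down: rising reallocated or pending sector counts point at the drive itself, while rising UDMA CRC error counts usually point at cabling or connections. A minimal sketch; the device name is only a placeholder:

    # Pull the relevant SMART attributes from one of the Toshiba drives.
    # /dev/sdX is a placeholder - substitute the actual device.
    smartctl -a /dev/sdX | grep -iE 'reallocated|pending|crc'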
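
For the LAN-access issue in item 18, the usual missing piece with "Remote access to LAN" is that other LAN devices (the Pi-hole, the router) have no route back to the WireGuard tunnel subnet, so their replies never reach the VPN client. A common fix is a static route on the LAN router that sends the tunnel subnet via the Unraid server's LAN address; this is normally configured in the router's own UI, but the equivalent Linux command is sketched below. The subnet 10.253.0.0/24 and the address 192.168.1.10 are assumptions and must be replaced with the actual tunnel network and server IP:

    # Route the WireGuard tunnel subnet via the Unraid server's LAN address.
    # Both addresses are assumptions - adjust to the actual tunnel network
    # and the server's LAN IP.
    ip route add 10.253.0.0/24 via 192.168.1.10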