mattiapsu

Members
  • Posts: 17


  1. Quick update on this... I had another read error, so I took the server down and rebuilt into the new case. I replaced as many cables as I had new ones for, and I'm still using the same power splitter off the Molex power cable (I'm using some shucked drives). I'm back up and running, added 2 new drives, and added 1 to the array successfully. It's been up for 2 days with no issues so far. So in the end, I had to: 1) reload my flash drive from backup, 2) replace/reconnect SATA and power cables, 3) rebuild an array drive, 4) run a scrub on my cache SSD, and 5) delete and recreate my docker and libvirt images, as these were corrupted along the way. So far so good. Thanks for the help, @JorgeB. I'm sure I'll be back in the future.
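     A minimal sketch of the cache scrub step above, assuming a btrfs cache pool mounted at the Unraid default /mnt/cache (adjust the path if the pool is named differently):

         # Run a scrub and wait for it to finish (-B prints a summary when done),
         # then review per-device error counters accumulated by btrfs.
         btrfs scrub start -B /mnt/cache
         btrfs scrub status /mnt/cache
         btrfs dev stats /mnt/cache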
  2. I started the rebuild and got horrible speeds, so I stopped immediately. I replaced the parity cable and reconnected disk 1 at both the drive and the motherboard. I got about 25 ATA errors in the first minute, but nothing since, and speeds have averaged about 140 MB/s for the rebuild. Would the errors be any indication of drive integrity, or purely a connection issue? I see some 'slow to respond' messages as well, but I don't know if that's just my hardware, as I started this for fun and never looked for top specs. I plan on replacing more cables when I move cases and add disks in the next month (or sooner if more issues pop up). Any other words of wisdom? Thanks for getting me this far. syslog.txt oldmain-diagnostics-20240330-0907.zip
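     One rough way to separate a cable problem from a drive problem: UDMA CRC errors (SMART attribute 199) normally implicate the cable or connector, while reallocated or pending sectors point at the drive itself. A sketch with assumed device names (/dev/sdf matches the log excerpts later in this thread; substitute yours):

         # SMART attributes that distinguish cabling from media problems.
         smartctl -A /dev/sdf | grep -Ei 'udma_crc|reallocat|pending|uncorrect'
         # Count ATA exceptions/resets logged since boot on the affected port (ata5 here).
         grep -Ec 'ata5.*(exception|COMRESET failed)' /var/log/syslog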
  3. Thanks, I'll get those cables replaced, hopefully soon. Probably when I move to a new case; not a lot of free time right now. Is parity at risk with those ATA errors? The disk is back up and I can read the emulated contents. It looks good from a file standpoint. There are a few files in lost+found; I'm not sure what those might be, but there are only a few and they're not critical for me. Sorry, I don't really know what the next steps are. I found your other posts on rebuilding the drive... I assume I'll just follow those steps. oldmain-diagnostics-20240329-1516.zip oldmain-syslog-20240329-1917.zip
  4. I ran it through the GUI... please check that I did it correctly. To repair, it's running Check without -n and, in my case, with the -L argument? I can follow the article and run it manually if that's the next step.
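     For reference, a sketch of running the same repair from the command line, with the array started in maintenance mode. The device node is an assumption that depends on the Unraid release: disk 1 is /dev/md1 on older versions and /dev/md1p1 on 6.12 and later.

         xfs_repair -n /dev/md1    # -n = check only, makes no changes
         xfs_repair /dev/md1       # actual repair
         xfs_repair -L /dev/md1    # only if repair refuses to run because the
                                   # log cannot be replayed; -L discards the log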
  5. Still have the red x and unmountable message across Size-Used-Free. Disk log:
         Mar 29 14:27:07 OldMain kernel: ata5: link is slow to respond, please be patient (ready=0)
         Mar 29 14:27:11 OldMain kernel: ata5: COMRESET failed (errno=-16)
         Mar 29 14:27:12 OldMain kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
         Mar 29 14:27:13 OldMain kernel: ata5.00: configured for UDMA/133
         Mar 29 14:32:07 OldMain kernel: ata5: link is slow to respond, please be patient (ready=0)
         Mar 29 14:32:11 OldMain kernel: ata5: COMRESET failed (errno=-16)
         Mar 29 14:32:12 OldMain kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
         Mar 29 14:32:13 OldMain kernel: ata5.00: configured for UDMA/133
         Mar 29 14:35:15 OldMain emhttpd: read SMART /dev/sdf
         Mar 29 14:37:36 OldMain kernel: ata5: link is slow to respond, please be patient (ready=0)
         Mar 29 14:37:40 OldMain kernel: ata5: COMRESET failed (errno=-16)
         Mar 29 14:37:41 OldMain kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
         Mar 29 14:37:42 OldMain kernel: ata5.00: configured for UDMA/133
         Mar 29 14:37:47 OldMain emhttpd: WDC_WD80EMAZ-00WJTA0_7HKHX2BJ (sdf) 512 15628053168
         Mar 29 14:37:47 OldMain kernel: mdcmd (2): import 1 sdf 64 7814026532 0 WDC_WD80EMAZ-00WJTA0_7HKHX2BJ
         Mar 29 14:37:47 OldMain kernel: md: import disk1: (sdf) WDC_WD80EMAZ-00WJTA0_7HKHX2BJ size: 7814026532
         Mar 29 14:37:53 OldMain kernel: ata5: link is slow to respond, please be patient (ready=0)
         Mar 29 14:37:57 OldMain kernel: ata5: COMRESET failed (errno=-16)
         Mar 29 14:37:58 OldMain kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
         Mar 29 14:37:59 OldMain kernel: ata5.00: configured for UDMA/133
         Mar 29 14:37:59 OldMain emhttpd: read SMART /dev/sdf
         Mar 29 14:40:49 OldMain emhttpd: shcmd (1772): echo 128 > /sys/block/sdf/queue/nr_requests
         Mar 29 14:49:20 OldMain emhttpd: read SMART /dev/sdf
         Mar 29 14:49:22 OldMain emhttpd: WDC_WD80EMAZ-00WJTA0_7HKHX2BJ (sdf) 512 15628053168
         Mar 29 14:49:22 OldMain kernel: mdcmd (2): import 1 sdf 64 7814026532 0 WDC_WD80EMAZ-00WJTA0_7HKHX2BJ
         Mar 29 14:49:22 OldMain kernel: md: import disk1: (sdf) WDC_WD80EMAZ-00WJTA0_7HKHX2BJ size: 7814026532
         Mar 29 14:49:22 OldMain emhttpd: read SMART /dev/sdf
         Mar 29 14:49:53 OldMain emhttpd: shcmd (1848): echo 128 > /sys/block/sdf/queue/nr_requests
         Mar 29 14:54:23 OldMain kernel: ata5: link is slow to respond, please be patient (ready=0)
         Mar 29 14:54:27 OldMain kernel: ata5: COMRESET failed (errno=-16)
         Mar 29 14:54:28 OldMain kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
         Mar 29 14:54:29 OldMain kernel: ata5.00: configured for UDMA/133
     oldmain-diagnostics-20240329-1450.zip oldmain-syslog-20240329-1851.zip
  6. Ran the check, without -n, then with -L as instructed. Output is below. It appears that it completed successfully. However, when I start the array (non-maintenance), the drive is still showing unmountable. Again, much appreciated for the support.
         Phase 1 - find and verify superblock...
         Phase 2 - using internal log
                 - zero log...
         ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
                 - scan filesystem freespace and inode maps...
         clearing needsrepair flag and regenerating metadata
                 - found root inode chunk
         Phase 3 - for each AG...
                 - scan and clear agi unlinked lists...
                 - process known inodes and perform inode discovery...
                 - agno = 0
                 - agno = 1
                 - agno = 2
                 - agno = 3
                 - agno = 4
                 - agno = 5
                 - agno = 6
                 - agno = 7
                 - process newly discovered inodes...
         Phase 4 - check for duplicate blocks...
                 - setting up duplicate extent list...
                 - check for inodes claiming duplicate blocks...
                 - agno = 0
                 - agno = 1
                 - agno = 5
                 - agno = 2
                 - agno = 4
                 - agno = 6
                 - agno = 7
                 - agno = 3
         Phase 5 - rebuild AG headers and trees...
                 - reset superblock...
         Phase 6 - check inode connectivity...
                 - resetting contents of realtime bitmap and summary inodes
                 - traversing filesystem ...
                 - traversal finished ...
                 - moving disconnected inodes to lost+found ...
         Phase 7 - verify and correct link counts...
         Maximum metadata LSN (1:976460) is ahead of log (1:2).
         Format log to cycle 4.
         done
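     Once the disk does mount again, it can be worth confirming the repaired filesystem is clean and seeing what got set aside. A sketch, assuming disk 1 and the older /dev/md1 device node (use /dev/md1p1 on Unraid 6.12+):

         xfs_repair -n /dev/md1          # read-only re-check; should report no problems
         ls -lah /mnt/disk1/lost+found   # files xfs_repair could not reconnect end up here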
  7. And the SMART report on the drive. oldmain-smart-20240329-1314.zip
  8. Looks like it's not mounting... unmountable / unsupported or no file system oldmain-diagnostics-20240329-1307.zip oldmain-syslog-20240329-1708.zip
  9. Back up with a flash restore to the same USB. Disk 1 emulated. Logs and diagnostics attached. I did not attempt to start the array. syslog.txt oldmain-diagnostics-20240329-1245.zip
  10. This topic comes up on the forum a few times, and most cases end with a new USB. However, my problems started last night and have cascaded to this, so I wanted to get help before I start on the USB. Logs are attached; a description of events is below.
      1) Last night, a few of my docker containers were not working. I thought it was due to a bad update, and I was able to roll a couple back and they were working.
      2) This morning at 7:42am something happened and the server became unresponsive (in the logs).
      3) Performed a bad shutdown - power button hold.
      4) I would normally start back up and see what I've got, but in prep for a new case and additional drives I opened up the case to take inventory, took some pics, closed it back up and started up.
      5) Upon restart, Disk 1 had UDMA CRC errors and went offline. As this can be a connection issue, I did a clean shutdown and made sure all the connections were good.
      6) Buttoned back up and started up, and now I'm getting a kernel panic: not syncing, VFS: unable to mount root fs.
      Note: I know I have an error on my cache drive that I have not reformatted; I had to rebuild a VM when it first errored, but it has been fine since deleting the corrupted files. I haven't taken the time to move everything off and reformat, which would hopefully correct it. Any help is appreciated, as I'm totally down now. unraid-2.log
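      On Unraid, that particular panic often points at boot files on the flash that could not be loaded, so checking the flash itself is a reasonable first step. A sketch under some assumptions: the flash is readable somewhere (on another machine, or mounted at /boot on a system that still boots), /dev/sdX1 is a placeholder for the flash's partition, and the .sha256 sidecar files from the release zip are present.

          fsck.vfat -n /dev/sdX1                          # read-only FAT check, with the flash not mounted
          sha256sum /boot/bzimage /boot/bzroot            # compare against the values shipped with the release
          cat /boot/bzimage.sha256 /boot/bzroot.sha256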
  11. I finally took the time to pull the box down and mess with power cables.
      1) Plugged a new power cable into the drive and started up... it showed as drive missing; opened it back up.
      2) Plugged the old power cable back into the drive and started up... it showed up with zero errors and is running a parity check happily.
      I'll monitor, but does this sound like 1) a random occurrence / slightly bad cable connection, 2) a power cable or PSU issue, or 3) a drive issue? Appreciate your experience here. I don't have a spare PSU or drives around, so depending on your thoughts I may get some backups. Thanks
  12. I swapped SATA cables around (a 3-way swap between 3 HDDs). The ATA error stuck with the parity drive but moved from ata1 to ata2. I didn't mess with any power cables at this point. oldmain-syslog-20220524-0241.zip
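      To confirm which controller port each disk landed on after a swap like this, the ataN numbers in the syslog can be mapped back to device names. A sketch that assumes the drives are on an AHCI controller (onboard or a SATA card); disks behind a SAS HBA won't show an ataN element in their path:

          # Print each sdX device together with the ATA port in its sysfs path.
          for d in /sys/block/sd?; do
            echo "$(basename "$d") -> $(readlink -f "$d" | grep -o 'ata[0-9]*')"
          done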
  13. A couple of days ago I checked my cables and made sure everything was well connected. I ran a parity check after I closed back up; it started slow but returned to normal parity check speed. However, it noted over 50,000 errors once finished and said parity was valid. Errors did show up and are in the attached syslog. I started a check today to see how it would react; it started slow and picked up parity errors immediately, and I canceled it without seeing if it sped up. Similar errors are being thrown today as a couple of days ago. Could it be a power supply issue? I'm using a 400W supply (I started with modest ambitions, but I've added more hardware). I can change out data cables as well to check those, but it may be very apparent to you that the PSU is undersized. Quick snapshot of hardware:
      - 2 sticks RAM
      - 1080 Ti GPU, used in a VM for CCTV only
      - USB 3.0 PCIe card
      - SATA PCIe expansion card
      - 3 HDD (2 data, 1 parity)
      - 2 SSD (1 appdata cache, 1 cache pool)
      - 2 USB HDD (old laptop drives): 1 as an unassigned drive, 1 in a cache pool
      oldmain-syslog-20220515-1852.zip
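      For a rough sense of headroom, a back-of-envelope estimate with assumed (not measured) figures; only the 1080 Ti's 250 W board power is a published spec, the rest are typical values:

          CPU under load            ~65-95 W   (assumed mid-range desktop CPU)
          GTX 1080 Ti               ~250 W     (board power, if it ever runs flat out)
          3x HDD                    ~25 W spinning, up to ~75 W briefly at spin-up
          2x SSD + 2x USB HDD       ~15 W
          Board, RAM, fans, cards   ~40-50 W
          ----------------------------------------------------
          Worst case                roughly 450-480 W against a 400 W supply

      Even if the GPU rarely peaks, that leaves little margin, so voltage sag during spin-up or a parity check is at least plausible; measuring draw at the wall would settle it.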
  14. Thanks, will take me a couple of days to get to it. But I'll report back.
  15. I did confirm that I installed the new hardware and then ran a parity check that ran normally with no errors found. The only thing I did after the hardware/parity check and before this issue was remove some docker containers (photoprism, mariadb), delete their appdata folder, and then reinstall them. I wanted fresh containers to start over. Thanks. oldmain-diagnostics-20220508-1512.zip