Posts posted by cpetro45

  1. Warming this back up as I have some clarification questions:

     

    This happened again.  I had a PSU failure and a few unclean shutdowns (the PSU would cause a reboot when the disks spun up or did much of anything).

     

    Once stability was achieved, I ran a non-correcting parity check, and it found 1 error. 

     

    I ran it again (non-correcting) and it found no errors. 

     

    This is all in maintenance mode BTW.  Here is the log:

     

    Jun 14 14:24:41 Tower kernel: mdcmd (36): check nocorrect
    Jun 14 14:24:41 Tower kernel: md: recovery thread: check P Q ...
    Jun 14 14:41:00 Tower kernel: md: recovery thread: PQ incorrect, sector=217794056
    Jun 15 06:37:38 Tower kernel: md: sync done. time=58377sec
    Jun 15 06:37:38 Tower kernel: md: recovery thread: exit status: 0
    Jun 15 09:07:11 Tower kernel: mdcmd (37): check nocorrect
    Jun 15 09:07:11 Tower kernel: md: recovery thread: check P Q ...
    Jun 16 01:20:27 Tower kernel: md: sync done. time=58396sec
    Jun 16 01:20:27 Tower kernel: md: recovery thread: exit status: 0
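
    (For my own notes: these are just the command-line equivalents of what the GUI kicked off, taken straight from the mdcmd lines in the log above.  I'm going from that log and memory of the mdcmd path, so treat this as a sketch rather than gospel.)

    # non-correcting parity check (what both runs above were)
    /usr/local/sbin/mdcmd check nocorrect
    # correcting parity check
    /usr/local/sbin/mdcmd check
    # cancel a running check
    /usr/local/sbin/mdcmd nocheck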

     

    You can see I ran these one after the other, pretty much without a reboot. 

     

    My question is more of a general computing question.  In this post and others, I see mention of a "flipped bit" and the potential for that to show up here.  With non-ECC memory, which I believe is still far more common, wouldn't this be a more widespread problem? 

     

    What does memory have to do with this process in the first place?  What does it load into memory that it needs to compare?

     

    In addition, without ECC memory, is my array always at risk of data corruption?  For example, if I'm copying 30GiB of data over my TCP/IP network to the Unraid array, we've got the memory (non-ECC in this case) of the host computer, the memory (non-ECC) of the Unraid array, and the TCP/IP protocol as well.  If there is a "bit flip" anywhere along the way, does that corrupt whatever I'm copying?  Wouldn't this be more of an issue? 
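
    (As a sanity check for big copies like that, something along these lines at least verifies end to end that what landed on the array matches the source.  The paths here are made up; the hash list gets copied to the Unraid box along with the data.)

    # on the source machine: hash every file in the folder being copied
    cd /path/to/source && find . -type f -exec sha256sum {} + > /tmp/source.sha256
    # copy the data to the array, copy /tmp/source.sha256 to the Unraid box too,
    # then verify from the destination folder:
    cd /mnt/user/backups/source && sha256sum -c /tmp/source.sha256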

     

    Do I really need to run memtest?  I didn't run it before, but I guess I'm willing to run it now since I've had the array in maintenance mode and unavailable for some time at this point. 

     

    I appreciate the insight; I am just trying to determine the best way forward, as I have some data that is 25 years old and I don't want bit rot.

     

    Should I run a 3rd parity check (I haven't rebooted, it's still in maintenance mode) or should I just ignore this? 


    Thanks

  2. Update: Replacing the power supply with a new one stopped the rebooting.  Thank goodness it was an easy fix.  Super bizarre failure.  The old power supply that introduced the instability/reboots was a Thermaltake SP-550 (550W), which it looks like I ordered around 2015 for something.  The age lines up with expected EOL.  It wasn't a particularly good one at the time of purchase.  I've replaced it with a Seasonic Focus GX-750 (totally overkill, but it was cheaper than the 550W and has a better warranty).  I've learned over the years to buy the best PSU you can afford!  Just wanted to provide the update.  Thanks!

  3. 20 minutes ago, JorgeB said:

    A server rebooting on its own is almost always a hardware issue.  If it does it mostly under load, the first things to check would be the PSU and, to a lesser extent, the temps.

     

     

    Thanks JorgeB, yeah, that's what I was suspecting as well, even more so after I downgraded.  I guess I'll throw a PSU at it and see if that resolves it.  Will update the post after I do that in a few weeks. 

  4. Hi there,

     

    I've been running Unraid for a long time (10 years).  I've had problems here and there, but this one I think is gonna be hard to nail down.  I do think it's a h/w issue due to:

     

    • the server being stable for years
    • keeping up with the latest releases
    • downgrading to 6.12.9 and hitting the same exact issue as on 6.12.10

     

    syslog shows nothing about any kernel panics or anything like that. 

     

    If I let the server sit for a few days with no disks mounted, it will stay up. 


    As soon as I start a parity check, it will reboot within 5 minutes with no indication of a problem in the syslog.  I wonder if I've got some weird power supply issue: with the disks spun down and under no load there are no issues, but with the disks spun up and under load, it reboots. 
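
    (One thing I might try in the meantime, since the syslog dies with the reboot: keep a copy somewhere persistent so whatever happens right before the crash isn't lost.  I believe Unraid has a syslog server / mirror-to-flash option under Settings, but a crude manual version would be something like the following; the path under /boot is just an example.)

    # crude persistent syslog: snapshot it to the flash drive every 30 seconds
    mkdir -p /boot/logs
    while true; do
        cp /var/log/syslog /boot/logs/syslog-snapshot.txt
        sleep 30
    done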

     

    Any thoughts on this one?  I have 6 disks: 2 parity and 4 data. 

     

    I'm thinking I should throw a new power supply in it first?

     

    Thanks.  Here are the latest diags from before I manually downgraded versions. 

     

     

    tower-diagnostics-20240607_0742.zip

  5. 6 hours ago, JorgeB said:

    Because there wasn't one; all checks were non-correcting.

    Hey Jorge,

     

    Thanks.  The last full check shown below (16hr, 20 min, 57 sec) did have correcting enabled.  I was expecting something to show up in the log like "parity check started" and "parity check completed."  You are saying there is basically no identification of this, and entries would only show up in the log if errors were found and corrected (or not corrected, as the previous run shows in the log)?

     

    Thanks,

    Chris

    Screenshot 2022-06-15 153533.png

  6. 3 hours ago, JorgeB said:

    If it was a bit flip it might be hard to catch any issues with memtest, but it's still worth running a couple of passes; if no errors are found, see how the next parity checks go.

    Alright, will give it a go, thanks.  I didn't see anything related to the full correcting parity check in the syslog... did I need to have debug turned on or something?


    Thanks

  7. 1 hour ago, JorgeB said:

    The log snippet doesn't show the full correcting check.  Assuming it did run until the end without finding any errors, it suggests the previous error found was unrelated to the unclean shutdown, and possibly something like a RAM bit flip.  Unclean-shutdown-related errors, when they exist, are mostly at the beginning of the disks, which is where the metadata is stored, assuming an XFS filesystem for the array.

    Thanks so much Jorge, let me get the whole log.  All disks are XFS.

     

    I attached the whole diagnostic stack.  I appreciate your reply and help. 

     

    Let me know if I can provide any more information.  I also provided a screenshot showing the times of the 1-sync-error run vs. the clean run. 

     

     

    Screenshot 2022-06-15 153533.png

    tower-diagnostics-20220615-1543.zip

  8. Hey there everyone,


    I'm just wondering how/why this would occur.  8-disk array, dual parity, no cache.  There was an unclean shutdown due to power issues (I have a UPS, but don't have the shutdown automation set up yet).  There were no writes going on and the disks were spun down. 


    The next day, I ran a parity check with correct disabled and it found 1 error about 60% of the way through.  I read these forums and decided to cancel and restart the parity check with correct enabled. 

     

    The parity check ran with no errors detected. 


    What could be a reason why this happened?  Should I be concerned?

     

    Here are the related lines from the log (suspect line: Jun 13 07:06:23 Tower kernel: md: recovery thread: PQ incorrect, sector=807021208):

    Jun 13 06:05:27 Tower kernel: mdcmd (42): check 
    Jun 13 06:05:27 Tower kernel: md: recovery thread: check P Q ...
    Jun 13 06:05:50 Tower kernel: mdcmd (43): nocheck Cancel
    Jun 13 06:05:50 Tower kernel: md: recovery thread: exit status: -4
    Jun 13 06:05:55 Tower kernel: mdcmd (44): check nocorrect
    Jun 13 06:05:55 Tower kernel: md: recovery thread: check P Q ...
    Jun 13 06:08:57 Tower nmbd[6513]: [2022/06/13 06:08:57.865107,  0] ../../source3/nmbd/nmbd_become_lmb.c:397(become_local_master_stage2)
    Jun 13 06:08:57 Tower nmbd[6513]:   *****
    Jun 13 06:08:57 Tower nmbd[6513]:   
    Jun 13 06:08:57 Tower nmbd[6513]:   Samba name server TOWER is now a local master browser for workgroup HOME on subnet 192.168.122.1
    Jun 13 06:08:57 Tower nmbd[6513]:   
    Jun 13 06:08:57 Tower nmbd[6513]:   *****
    Jun 13 06:08:57 Tower nmbd[6513]: [2022/06/13 06:08:57.865228,  0] ../../source3/nmbd/nmbd_become_lmb.c:397(become_local_master_stage2)
    Jun 13 06:08:57 Tower nmbd[6513]:   *****
    Jun 13 06:08:57 Tower nmbd[6513]:   
    Jun 13 06:08:57 Tower nmbd[6513]:   Samba name server TOWER is now a local master browser for workgroup HOME on subnet 172.17.0.1
    Jun 13 06:08:57 Tower nmbd[6513]:   
    Jun 13 06:08:57 Tower nmbd[6513]:   *****
    Jun 13 07:06:23 Tower kernel: md: recovery thread: PQ incorrect, sector=807021208
    Jun 13 13:03:31 Tower kernel: mdcmd (45): nocheck Cancel
    Jun 13 13:03:31 Tower kernel: md: recovery thread: exit status: -4
    Jun 13 13:03:40 Tower kernel: mdcmd (46): check 
    Jun 13 13:03:40 Tower kernel: md: recovery thread: check P Q ...
    Jun 13 13:03:58 Tower kernel: mdcmd (47): nocheck Cancel
    Jun 13 13:03:58 Tower kernel: md: recovery thread: exit status: -4
    Jun 13 13:04:02 Tower kernel: mdcmd (48): check nocorrect
    Jun 13 13:04:02 Tower kernel: md: recovery thread: check P Q ...
    Jun 13 13:04:10 Tower kernel: mdcmd (49): nocheck Cancel
    Jun 13 13:04:10 Tower kernel: md: recovery thread: exit status: -4
    Jun 13 13:04:14 Tower kernel: mdcmd (50): check 
    Jun 13 13:04:14 Tower kernel: md: recovery thread: check P Q ...
    Jun 13 13:05:29 Tower kernel: mdcmd (51): nocheck Cancel
    Jun 13 13:05:29 Tower kernel: md: recovery thread: exit status: -4
    Jun 13 13:12:29 Tower kernel: mdcmd (52): check 
    Jun 13 13:12:29 Tower kernel: md: recovery thread: check P Q ...

     

    Any thoughts much appreciated. 


    Thanks

  9. I've been running Unraid for 10 years.  I have a 6-disk array that at one time was 9 or 10.  All has been well.  As you can imagine, there have been drive failures here and there over 10 years.  I updated disks and consolidated a few years ago.  I have an 8TB parity disk and 5 data disks of 2-4 TB each.  I never updated the h/w.  This thing has been running on a Core 2 Duo E6600 or something with 2 GB of RAM... an ancient Gigabyte board and an 8800GTX, still no issues besides regular disk maintenance... til now.

     

    Log in to the WebUI the other day and see a disabled disk... I'm pretty annoyed, as nothing has changed and I don't even write to this that much.  I didn't grab the logs.  The SMART report on the disabled disk shows a UDMA CRC error count.  I read all about that all over this forum.  In the meantime, I think I stopped the array and shut down to replace the SATA cables.  I ordered new SATA cables from Amazon.  I replaced all of them (they are old).  I didn't care about the SATA plug / disk order because with Unraid 6 there's no need to worry.  I think I was concerned, so I hooked up a monitor as well to check booting.  Well, I thought it got stuck on loading bzroot since it was taking forever (more than 3-4 min).  I got sidetracked and ended up replacing the USB drive (14 years old) thinking that was bad (it wasn't).  I had to use the HP media preparation tool since this MOBO is real picky.  Did all that BS (figuring out again that it wouldn't boot unless specially formatted with a 15-year-old tool), and finally got it booting from the new USB thumb drive.  BTW, it was taking like 10 minutes to boot (first hint, since it never usually took that long)... remember, this thing was still booting up.  I didn't think it was booting and thought the USB drive was bad.  I used the new Unraid 6 USB drive prep first, obviously, but it definitely needed that HP tool (thanks to this forum... again... I was able to find that info... again, and re-download it).

     

     

    Next, the forums say: oh, just check the file system of the disabled disk, it's probably that.  I started the array in maintenance mode (after a 10-minute boot or so) and ran the file system check; nothing really came up with the -n option.  I figured, what the hell, run it without -n.  Next thing you know, I get a UDMA CRC error count warning on the PARITY disk during the file system check of the disabled disk.  I'm like, oh man, parity better not get disabled.  There were, I think, 21 writes in the parity disk column and 20 errors.  I started getting real nervous.  Somehow the retries must have been successful (I have that syslog).  I cancelled the file system check on the disabled disk.  I had screenshots of the config from before and after I moved the disks around.  It wasn't the same SATA port that threw the UDMA CRC error count on the data disk and the parity disk.  At this point, I knew something on the board was toast. 
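
    (For anyone following along, the file system check in maintenance mode is basically xfs_repair against the md device; the exact device naming depends on your Unraid version and which slot the disk is in, so this is just a sketch of the -n vs. no -n distinction I mentioned above.)

    # read-only check first (this is the -n the GUI check runs)
    xfs_repair -n /dev/md1    # md1 = disk 1 in the array; point it at the disabled disk's slot
    # actual repair, only after reviewing the -n output, with the array in maintenance mode
    xfs_repair /dev/md1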

     

    Luckily, I have another system (literally only 2 years newer) but with 8GB of RAM and a Core 2 Extreme 3.0GHz quad core ($1200 processor back in its day, bought on eBay for like $250 or something 8 years ago).  Anyway, a huge upgrade compared to the 2GB and the Core 2 Duo, LOL.  We're talking DDR2 here. 

     

    So, conveniently enough, the "new" computer has a removable MOBO tray, so right now I'm running the array off the new board, rebuilding the failed disk on top of itself.  Something went real bad in the old CPU.  Added some pics... hope this was a fun blast from the past for some, and also BACK UP YOUR DATA, because you never know...

     

    Ultimately, I do feel like a few bits may be off (I hope not).  I'm definitely going to back up all the rest of my important items once the array is back up completely. 

     

     

    20220225_220627_copy_2774x3699.jpg

    20220225_220704_copy_3699x2774.jpg

  10. 6 hours ago, quincyg said:

    Preclear post-read failed.  Any suggestions here?  I got a failure at the end of preclear.  This drive is connected via an onboard SATA connector that I don't normally use.  The WD Red drive is running pretty hot as well, 47-50C.  Log attached.

    preclear_disk_log.txt 7.42 kB · 1 download

    Jul 31 09:43:51 preclear_disk_VDHX6KUK_30153: Command: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh --notify 3 --frequency 1 --cycles 1 --no-prompt /dev/sdb
    Jul 31 09:43:51 preclear_disk_VDHX6KUK_30153: Preclear Disk Version: 1.0.16
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: S.M.A.R.T. info type: default
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: S.M.A.R.T. attrs type: default
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: Disk size: 8001563222016
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: Disk blocks: 1953506646
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: Blocks (512 bytes): 15628053168
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: Block size: 4096
    Jul 31 09:43:52 preclear_disk_VDHX6KUK_30153: Start sector: 0
    Jul 31 09:43:57 preclear_disk_VDHX6KUK_30153: Pre-Read: dd if=/dev/sdb of=/dev/null bs=2097152 skip=0 count=8001563222016 conv=notrunc,noerror iflag=nocache,count_bytes,skip_bytes
    Jul 31 10:53:15 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 10% read @ 188 MB/s
    Jul 31 12:03:47 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 20% read @ 181 MB/s
    Jul 31 13:16:54 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 30% read @ 174 MB/s
    Jul 31 14:33:20 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 40% read @ 167 MB/s
    Jul 31 15:53:54 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 50% read @ 161 MB/s
    Jul 31 17:19:43 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 60% read @ 146 MB/s
    Jul 31 18:52:27 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 70% read @ 138 MB/s
    Jul 31 20:34:15 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 80% read @ 121 MB/s
    Jul 31 22:29:41 preclear_disk_VDHX6KUK_30153: Pre-Read: progress - 90% read @ 104 MB/s
    Aug 01 00:45:46 preclear_disk_VDHX6KUK_30153: Pre-Read: dd - read 8001565319168 of 8001563222016.
    Aug 01 00:45:47 preclear_disk_VDHX6KUK_30153: Pre-Read: elapsed time - 15:01:47
    Aug 01 00:45:47 preclear_disk_VDHX6KUK_30153: Pre-Read: dd exit code - 0
    Aug 01 00:45:51 preclear_disk_VDHX6KUK_30153: Zeroing: emptying the MBR.
    Aug 01 00:45:51 preclear_disk_VDHX6KUK_30153: Zeroing: dd if=/dev/zero of=/dev/sdb bs=2097152 seek=2097152 count=8001561124864 conv=notrunc iflag=count_bytes,nocache,fullblock oflag=seek_bytes
    Aug 01 00:45:51 preclear_disk_VDHX6KUK_30153: Zeroing: dd pid [3609]
    Aug 01 02:49:23 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 10% zeroed @ 111 MB/s
    Aug 01 04:35:33 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 20% zeroed @ 146 MB/s
    Aug 01 06:13:56 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 30% zeroed @ 127 MB/s
    Aug 01 07:57:00 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 40% zeroed @ 113 MB/s
    Aug 01 09:56:06 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 50% zeroed @ 110 MB/s
    Aug 01 11:51:07 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 60% zeroed @ 117 MB/s
    Aug 01 13:41:27 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 70% zeroed @ 130 MB/s
    Aug 01 15:19:57 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 80% zeroed @ 136 MB/s
    Aug 01 17:07:26 preclear_disk_VDHX6KUK_30153: Zeroing: progress - 90% zeroed @ 113 MB/s
    Aug 01 19:13:15 preclear_disk_VDHX6KUK_30153: Zeroing: dd - wrote 8001563222016 of 8001563222016.
    Aug 01 19:13:16 preclear_disk_VDHX6KUK_30153: Zeroing: elapsed time - 18:27:23
    Aug 01 19:13:17 preclear_disk_VDHX6KUK_30153: Zeroing: dd exit code - 0
    Aug 01 19:13:18 preclear_disk_VDHX6KUK_30153: Writing signature:    0   0   2   0   0 255 255 255   1   0   0   0 255 255 255 255
    Aug 01 19:13:22 preclear_disk_VDHX6KUK_30153: Post-Read: verifying the beggining of the disk.
    Aug 01 19:13:22 preclear_disk_VDHX6KUK_30153: Post-Read: cmp /tmp/.preclear/sdb/fifo /dev/zero
    Aug 01 19:13:22 preclear_disk_VDHX6KUK_30153: Post-Read: dd if=/dev/sdb of=/tmp/.preclear/sdb/fifo count=2096640 skip=512 conv=notrunc iflag=nocache,count_bytes,skip_bytes
    Aug 01 19:13:23 preclear_disk_VDHX6KUK_30153: Post-Read: verifying the rest of the disk.
    Aug 01 19:13:23 preclear_disk_VDHX6KUK_30153: Post-Read: cmp /tmp/.preclear/sdb/fifo /dev/zero
    Aug 01 19:13:23 preclear_disk_VDHX6KUK_30153: Post-Read: dd if=/dev/sdb of=/tmp/.preclear/sdb/fifo bs=2097152 skip=2097152 count=8001561124864 conv=notrunc iflag=nocache,count_bytes,skip_bytes
    Aug 01 20:34:20 preclear_disk_VDHX6KUK_30153: Post-Read: progress - 10% verified @ 169 MB/s
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd - read 1527356166144 of 8001563222016.
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: elapsed time - 2:33:21
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd command failed, exit code [1].
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1518178664448 bytes (1.5 TB, 1.4 TiB) copied, 9128.02 s, 166 MB/s
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 725089+0 records in
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 725088+0 records out
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1520619749376 bytes (1.5 TB, 1.4 TiB) copied, 9142.07 s, 166 MB/s
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 726061+0 records in
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 726060+0 records out
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1522658181120 bytes (1.5 TB, 1.4 TiB) copied, 9154.33 s, 166 MB/s
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 727177+0 records in
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 727176+0 records out
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1524998602752 bytes (1.5 TB, 1.4 TiB) copied, 9168.43 s, 166 MB/s
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 728256+0 records in
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 728255+0 records out
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1527261429760 bytes (1.5 TB, 1.4 TiB) copied, 9182.28 s, 166 MB/s
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: dd: error reading '/dev/sdb': Input/output error
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 728299+1 records in
    Aug 01 21:46:45 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 728299+1 records out
    Aug 01 21:46:46 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1527354068992 bytes (1.5 TB, 1.4 TiB) copied, 9198.16 s, 166 MB/s
    Aug 01 21:46:46 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 728299+1 records in
    Aug 01 21:46:46 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 728299+1 records out
    Aug 01 21:46:46 preclear_disk_VDHX6KUK_30153: Post-Read: dd output: 1527354068992 bytes (1.5 TB, 1.4 TiB) copied, 9198.16 s, 166 MB/s
    Aug 01 21:46:48 preclear_disk_VDHX6KUK_30153: ssmtp: Authorization failed (535 5.7.0 (#AUTH005) Too many bad auth attempts.)
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 5    Reallocated_Sector_Ct    6
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 9    Power_On_Hours           77
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 194  Temperature_Celsius      50
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 196  Reallocated_Event_Count  6
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 197  Current_Pending_Sector   16
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 198  Offline_Uncorrectable    0
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: S.M.A.R.T.: 199  UDMA_CRC_Error_Count     0
    Aug 01 21:46:49 preclear_disk_VDHX6KUK_30153: error encountered, exiting...
     

    I'd warranty that thing; reallocated sectors.
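
    If it were my drive, I'd also grab a full SMART report to back up the RMA before wiping anything, roughly like this (sdb matches the preclear log above; adjust if the device letter has changed, and the /boot path is just an example):

    # full SMART report, saved for the RMA paperwork
    smartctl -a /dev/sdb | tee /boot/smart_VDHX6KUK.txt
    # just the attributes that matter here
    smartctl -A /dev/sdb | grep -Ei 'Reallocated|Pending|Uncorrect'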

  11. Just an update on my issues (see back a few posts) with this plugin and 6.3.5.  On my 2nd 4TB drive, I've got the array stopped and I'm using the original preclear_disk.sh script (patched for the latest version of Unraid; see the script thread) without issue.  The server hasn't locked up at all like last time when using this plugin.  Using screen is working well.  I'm not saying I don't like the idea of this plugin; it just wasn't working for me.
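
    Roughly what that looks like, in case it helps anyone (the device name and script location are just examples; use wherever you saved the patched script and double-check the drive letter):

    screen -S preclear                 # named session so it survives an SSH disconnect
    /boot/preclear_disk.sh /dev/sdX    # run the patched script against the new drive
    # detach with Ctrl-a d, reattach later with:
    screen -r preclear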

     

    Thanks

  12. On 12/2/2017 at 5:41 PM, Frank1940 said:

     

    If it locks up again, you could install the 'Tips and Tweaks' plugin.  Then go to the Tweaks page and set the Disk Cache 'vm.dirty_background_ratio' (%) parameter to 1 and the Disk Cache 'vm.dirty_ratio' (%) parameter to 2.  This will free up a big block of memory without any observable effect on performance.  You can read a bit more about these parameters by clicking on the Help function.

     

    Hey Frank, thanks.  I implemented this setting and it seems like maybe it did something.  However, it looks like the preclear job is mostly locking up in the pre-read phase, pegging 1 CPU at 100% with the timer stopping.  I've nursed it through 2 cycles and it's working on the 3rd, but for my 2nd HD I think I may spend the time to figure out how to do it via SSH.  I'm not going to spend too much time on it, as I believe my system resources are not constrained and the script is just bombing out for whatever reason.  I do appreciate the help and input!
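
    (For reference, mostly as a note to myself: I believe the command-line equivalent of those two Tweaks settings is just sysctl, e.g. the following, which only lasts until a reboot.)

    # same values Frank suggested, applied directly (not persistent across reboots)
    sysctl -w vm.dirty_background_ratio=1
    sysctl -w vm.dirty_ratio=2
    # confirm the current values
    sysctl vm.dirty_background_ratio vm.dirty_ratio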

     

    Chris

  13. 1 hour ago, Frank1940 said:

     

    The E6600 should be up to the task.  How much memory do you have installed?  It takes between 2GB and 4GB of RAM with ver-6.3.5  to have a system that is not RAM constrained if you are only running the basic NAS function, the 'usual plugins', and perhaps, a Docker or two.  The preclear function does require a fair amount of RAM so that might be an issue.  

     

    Thanks, yeah, it's got 2GB of RAM, a little light for sure.  It didn't seem like RAM usage was pegged though, about 50%, versus higher CPU usage.  It's pre-reading now; it locked up earlier (no IP, no GUI) but now it seems like it's moving again... after I reset it.  We shall see...  Thanks 

  14. Is there a minimum CPU system requirement to get this script to run successfully all the way through?

     

    I've never used it before, and I have a pretty old Core 2 Duo E6600 2.4GHz.  Anyway, I've got two new 4TB disks hooked up to a PCIe v1 x1 SATA controller.  I tried both at the same time and realized that was a big mistake, as it was saturating the PCIe link.  Now I'm trying 1 at a time, and I had preclear stop responding during the post-read with the CPUs pegged at 100%.  The WebGUI was still responding.  I uninstalled the plugin, rebooted, reinstalled it, and rebooted again, and am attempting preclear once more with 1 drive and the array stopped.  Hopefully it makes it all the way through this time; I'll definitely post back.  This is with the latest version 2017.11.14 and Unraid 6.3.5.  I'm not really using any of the new features like Docker containers yet.  Any thoughts?  If this doesn't work, I understand I should try it manually without the plugin.  Also, I could get the disks off this crappy PCIe v1 x1 controller (which I plan to do after I remove some other disks; the MOBO has 8 SATA ports), but I figured I'd be OK for the preclear.  


    Thanks !

     

    Chris
