(SOLVED) Is it normal for system to be unresponsive during disk tasks?



So, looking to jump into Unraid, as I've found myself with a good opportunity to 'start fresh'. Downloaded it and opted in for the trial to make sure it fits my purpose. One of my drives (the only SMR drive) reported a slew of SMART errors, so I thought I'd run a more intensive scan to see what's going on. Went to bed, woke up to check it via the web GUI, and it couldn't connect anymore.

 

Ok, weird, I thought, it must have crashed. It was completely unresponsive and would just do the standard timeout you get with no connectivity. So I force restarted. Everything came up fine of course; there were no pools or anything yet anyway. But the disk profile page lists the SMART scan as being interrupted. So I checked some other drives, added my first one to the pool, and started the format. Again, it became -completely- unresponsive. No other pages will load at all.

 

So I guess I'm just curious, is this normal? It doesn't give a progress bar or anything; all I have is 'started, formatting', and then I have to keep coming in and hitting refresh to see if it finishes? And if this is normal and it locks you out of the UI intentionally, follow-up question: does this mean the shares themselves go offline every time something has to be formatted to be added to the pool etc.? 8TB is the smallest drive I have, so that's a lot of downtime if so, particularly during the setup process where I'm likely to add/remove a few drives over the next few weeks.

Edited by Yekul

Yeah I thought surely this can't be right. Thing is, it has happened with two different drives. Bit reluctant to power down right now as it's mid formatting, and I can hear it actually access the drive every so often when it's quiet. What's the best course of action here?

So to clarify: I've tried two different cables, two different drives, and two different SATA ports. I could try swapping motherboards, but this board was working fine previously and it has no problems actually accessing the drives.

I can run the short SMART tests no problem for example, and track the progress through 10->100% status. The long test I walked away after it started, but it wouldn't connect to unraid via local GUI when I powered on my main PC this morning so I couldn't say if it had increased in % or was still locked to 10%.


Hrm, yeah, I wasn't sure if the format was meant to be doing a preclear type situation where it zeros the drive or not. Any ideas where to look for issues? As I said, I'm brand new to Unraid and just trialing it to make sure it's fit for purpose before jumping over. I haven't updated the BIOS in a very long time, but it's an old board (AB350M) so I doubt there'd be anything too pressing; it's not like it's an X570 etc. that is probably still receiving a lot of software-related updates that may affect one thing or the other.

5 hours ago, Yekul said:

does this mean the shares themselves go offline every time something has to be formatted to be added to the pool etc?

No, that kind of task puts very little load on the system.

 

1 hour ago, Yekul said:

The long test I walked away after it started

You should follow it. A long test (extended test) does not involve the system; it is a self-test that the drive runs on its own. Check back for anything abnormal in SMART and for the test progress.
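If you want to watch the progress from a shell instead of the GUI, here is a minimal Python sketch, assuming you have captured the text that `smartctl -c` prints for an ATA drive while a self-test is running. The sample text and the function name `percent_done` are only illustrative, and the exact wording can differ between drives and smartmontools versions.

```python
import re
from typing import Optional

# Illustrative sample of the status lines smartctl prints while a self-test
# is running on an ATA drive (values are made up for this example).
SAMPLE = """\
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
"""


def percent_done(smartctl_output: str) -> Optional[int]:
    """Return the completed percentage, or None if no test is in progress."""
    match = re.search(r"(\d+)% of test remaining", smartctl_output)
    if match is None:
        return None
    # smartctl reports what is left, so convert it to what is done.
    return 100 - int(match.group(1))


print(percent_done(SAMPLE))
```

With the sample above this reports 10, which matches the 10% the GUI shows right after an extended test starts.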

 

5 hours ago, Yekul said:

Was completely unresponsive

A disk problem can cause this behavior.

Edited by Vr2Io

If the long test fails or won't complete, it is a disk problem and the disk needs to be replaced or RMA'd.

 

To avoid the system (GUI) becoming unresponsive again (supposing you can still get in through telnet / ssh), you can try stopping the array and then running the long test.

 

Some people run a monthly long test or parity check to make sure their disks are healthy.

 

** Please disable automatic spindown in 6.9.x, otherwise it will cause the test to be aborted **
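To check afterwards whether a test was cut short, the drive's self-test log can be scanned for entries that did not finish cleanly. The following is a rough Python sketch, assuming the table layout that `smartctl -l selftest` typically prints for ATA drives. The sample rows are made up, and `problem_tests` is a hypothetical helper name.

```python
import re

# Made-up sample in the usual layout of the ATA self-test log table.
SAMPLE_LOG = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%      1234         -
# 2  Short offline       Completed without error       00%      1230         -
"""


def problem_tests(log_text):
    """Return (description, status) for every entry that did not complete cleanly."""
    results = []
    for line in log_text.splitlines():
        if not line.startswith("#"):
            continue  # skip the header and any other non-entry lines
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 4:
            continue  # not a full table row
        description, status = fields[1], fields[2]
        if status != "Completed without error":
            results.append((description, status))
    return results


for description, status in problem_tests(SAMPLE_LOG):
    print(description, "->", status)
```

An interrupted or aborted extended test shows up here, as does a test that completed with a read failure.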

 


Edited by Vr2Io

So everything was going fine, started the extended offline test again. I could navigate to different pages, no unresponsive aspects at all. Then around ~30% or so through the test, same thing as before. The system locks up and won't let me access anything. Can't SSH to the server either. It happens with both drives tested, and one of them is brand new and passed testing with no problems. Obviously not saying it's impossible for both to be faulty, but it would be a hell of a coincidence. I just can't figure out what is going on, because Unraid doesn't appear to power down or restart due to this.

1 hour ago, Yekul said:

So everything was going fine, started the extended offline test again. I could navigate to different pages, no unresponsive aspects at all. Then around ~30% or so through the test, same thing as before. The system locks up and won't let me access anything. Can't SSH to the server either. It happens with both drives tested, and one of them is brand new and passed testing with no problems. Obviously not saying it's impossible for both to be faulty, but it would be a hell of a coincidence. I just can't figure out what is going on, because Unraid doesn't appear to power down or restart due to this.

 

You are likely to get better informed feedback if you attach your system's diagnostics zip file (obtained via Tools->Diagnostics) to your NEXT post. Ideally you want this with the array started and after encountering your problem.

5 minutes ago, itimpi said:

You are likely to get better informed feedback if you attach your system's diagnostics zip file (obtained via Tools->Diagnostics) to your NEXT post. Ideally you want this with the array started and after encountering your problem.

How do I do this when the system becomes unresponsive? I thought the log files cleared on reboot. Which is the only way I can access it, by doing a forced reboot. I installed the Fix Common Problems plugin, but couldn't find the Troubleshooting mode previously mentioned in older threads (I guess this actively wrote the logs somewhere else constantly so they'd be stored on a physical drive and not RAM?).


Ok, so I tried to do this, but it doesn't seem to be persisting through restarts. I have it set like this under Settings -> Syslog Server:

local syslog server: enabled

local syslog folder: app data

mirror syslog to flash: yes (temporarily to troubleshoot, not like it writes much anyway with my current setup)

However, the diagnostics file still clears. I hooked up a monitor and noticed that when I hit the power button to trigger the 'graceful shutdown', it starts fine, then gets to 'Starting diagnostic collection...' and just seems to hang. So I suspect that's why I'm not actually getting any useful log files?

The syslog.txt file inside the diagnostics folder is only showing information since the restart, nothing before. Attached logs anyway fwiw, but really just shows recent restart and one of the failing drives (same thing happens without it attached btw, as I only just plugged it in).
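For anyone wondering why mirroring matters here: the live syslog sits in RAM, so after a hard lockup only the lines that were already flushed to persistent storage survive. The built-in mirror option is meant to handle this. The Python sketch below only illustrates the idea, and the paths `/var/log/syslog` and `/boot/logs/syslog-copy.txt` are assumptions about the layout rather than anything confirmed in this thread.

```python
import os
import time


def append_new_bytes(src_path, dst_path, offset):
    """Copy whatever was added to src since `offset` into dst and force it to disk.

    Returns the new offset to pass in on the next call.
    """
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read()
    if chunk:
        with open(dst_path, "ab") as dst:
            dst.write(chunk)
            dst.flush()
            # fsync so the data is on the device even if the box locks up next.
            os.fsync(dst.fileno())
    return offset + len(chunk)


def follow(src_path="/var/log/syslog",             # assumed live log location
           dst_path="/boot/logs/syslog-copy.txt",  # assumed flash-backed path
           interval=5.0):
    """Poll the source log forever, mirroring new content as it appears."""
    offset = 0
    while True:
        offset = append_new_bytes(src_path, dst_path, offset)
        time.sleep(interval)
```

Calling `follow()` from a shell session before starting the extended test would leave a copy of the log that survives a forced reboot. It does not handle log rotation, and frequent writes add wear to the flash drive, so it is only something to run temporarily.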

picklerick-diagnostics-20210321-1021.zip


Tentatively solved. Downgraded to confirm it wasn't a 6.9.1 issue. Still present. Then I found the C1 state enabled in the BIOS; since disabling it, the system appears to be holding steady. It likely would have stopped being an issue when I swap to a new Intel setup in a few days' time anyway, but I'm posting a reply in case others stumble on this using older hardware like mine (Ryzen 1700, AB350M).
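For others on similar hardware who want to see which idle states the kernel is using before rebooting into the BIOS, here is a small Python sketch that reads the standard Linux cpuidle sysfs directory. Whether that directory is populated on a given Unraid build and board is an assumption, the kernel's list will not necessarily mirror the BIOS option name, and `list_idle_states` is just an illustrative helper.

```python
from pathlib import Path


def list_idle_states(cpu_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return (state_dir, name, disabled) for each idle state found under cpu_dir."""
    base = Path(cpu_dir)
    states = []
    if not base.is_dir():
        return states  # cpuidle is not exposed on this system
    for state_dir in sorted(base.glob("state*")):
        name = (state_dir / "name").read_text().strip()
        disable_file = state_dir / "disable"
        # The per-state "disable" file holds "1" when the state is turned off.
        disabled = disable_file.exists() and disable_file.read_text().strip() == "1"
        states.append((state_dir.name, name, disabled))
    return states


for entry in list_idle_states():
    print(entry)
```

An empty result just means the kernel is not exposing cpuidle information there, which says nothing either way about the BIOS setting.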

  • Yekul changed the title to (SOLVED) Is it normal for system to be unresponsive during disk tasks?
