Multiple problems since last start: missing stats on dashboard, missing disks on array, no info on parity, no diagnostics download


pika


Hello!

I'm on Unraid 6.11.2 and my system went down 3 days ago (no idea why, there was no power outage). After restarting, I saw a parity check start and did not look into it further (no time at the moment). Today I opened the dashboard and several things were missing:

image.thumb.png.7285d6129c30ae4f33788d536a324739.png

 

All the information in the marked areas was missing (while I was typing the first lines of this post, it came back)...

I was also unable to download diagnostics; that is working again now.

 

I saw these errors in the syslog:

Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2: AER: Corrected error received: 0000:00:01.0
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:   device [1022:15d3] error status/mask=00000040/00006000
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:    [ 6] BadTLP                
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2: AER: Multiple Corrected error received: 0000:00:01.0
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:   device [1022:15d3] error status/mask=00001040/00006000
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:    [ 6] BadTLP                
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:    [12] Timeout               
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2: AER: Corrected error received: 0000:00:01.0
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:   device [1022:15d3] error status/mask=00000040/00006000
Jan 13 20:24:18 DataTower kernel: pcieport 0000:00:01.2:    [ 6] BadTLP                
Jan 13 20:24:48 DataTower kernel: pcieport 0000:00:01.2: AER: Corrected error received: 0000:00:01.0
Jan 13 20:24:48 DataTower kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Jan 13 20:24:48 DataTower kernel: pcieport 0000:00:01.2:   device [1022:15d3] error status/mask=00000040/00006000
Jan 13 20:24:48 DataTower kernel: pcieport 0000:00:01.2:    [ 6] BadTLP                
Jan 13 20:24:51 DataTower kernel: pcieport 0000:00:01.2: AER: Corrected error received: 0000:00:01.0
Jan 13 20:24:51 DataTower kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Jan 13 20:24:51 DataTower kernel: pcieport 0000:00:01.2:   device [1022:15d3] error status/mask=00000040/00006000
Jan 13 20:24:51 DataTower kernel: pcieport 0000:00:01.2:    [ 6] BadTLP                

 

No idea what this means. Everything seems to be working for now, but I think something is wrong with my system...

Could somebody please take a look at my diagnostics?

 

have a nice day!

datatower-diagnostics-20230114-1024.zip
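
For anyone who wants to gauge how often corrected PCIe errors like the ones above occur, here is a minimal sketch that tallies them from syslog lines. It only assumes the line layout shown in the excerpt above; the function name `tally_aer` and the syslog path mentioned afterwards are illustrative assumptions, not part of Unraid.

```python
import re
from collections import Counter

# Header line of an AER report, e.g.
# "... pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)"
BUS_ERROR = re.compile(
    r"pcieport (?P<port>[0-9a-f:.]+): PCIe Bus Error: "
    r"severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+)"
)
# Per-bit detail line, e.g. "... pcieport 0000:00:01.2:    [ 6] BadTLP"
ERROR_BIT = re.compile(
    r"pcieport (?P<port>[0-9a-f:.]+):\s+\[\s*\d+\]\s+(?P<name>\w+)"
)


def tally_aer(lines):
    """Count AER reports per (port, severity, layer) and error names per port.

    `tally_aer` is a hypothetical helper written for this thread; it relies
    only on the log format visible in the excerpt above.
    """
    events = Counter()
    names = Counter()
    for line in lines:
        m = BUS_ERROR.search(line)
        if m:
            events[(m["port"], m["severity"], m["layer"])] += 1
            continue
        m = ERROR_BIT.search(line)
        if m:
            names[(m["port"], m["name"])] += 1
    return events, names
```

To try it, pass the lines of a syslog file, for example `tally_aer(open("/var/log/syslog"))` (the path is an assumption; use whichever syslog file is in your diagnostics zip). A steadily growing count on one port may point at a particular slot, cable or device, though the counts alone do not say what the cause is.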

9 hours ago, JorgeB said:

Nothing else obvious on the logs, but you should update to latest release, that one has known issues.

Huh, I tried the update. The server didn't boot after "restart server now"; I had to press the power button... That shouldn't happen, right?

 

Edit: a parity check is running again after the manual boot, so I guess something went wrong during the update? Should I provide another set of diagnostics?

Edited by pika
6 hours ago, Brucey7 said:

I have this problem too, been having it for a while, it appears the flash drive is filling up somehow, it takes a week or so

Do you have something like mover logging enabled, or the syslog server with the option to mirror to flash? Even so, it would be a bit unusual for either of these to cause this sort of problem.

 

I expect the culprit would be fairly obvious if you examine the contents of the flash drive when it starts getting full and see what is taking up the space.
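
One way to do that check is to list the largest files on the flash drive. The following is only a rough sketch: `largest_entries` is a made-up helper name, and the `/boot` mount point in the usage note is an assumption about where the flash drive is mounted on your system.

```python
import os


def largest_entries(root, top=10):
    """Return the `top` largest files under `root` as (size_bytes, path) pairs.

    Hypothetical helper for illustration; `root` would be the flash drive's
    mount point.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    # Largest first; ties are broken by path.
    sizes.sort(reverse=True)
    return sizes[:top]
```

For example, `for size, path in largest_entries("/boot"): print(size, path)` prints the ten biggest files. Running it once when the drive is healthy and again when it is nearly full should make any growing log or mirror file stand out.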


Lots of nginx errors:

 

Jan 17 09:10:49 Tower2 nginx: 2023/01/17 09:10:49 [crit] 3371#3371: accept4() failed (24: Too many open files)
Jan 17 09:10:52 Tower2 nginx: 2023/01/17 09:10:52 [error] 3371#3371: OUTPUT:can't create output chain, file in buffer won't open
Jan 17 09:10:53 Tower2 nginx: 2023/01/17 09:10:53 [error] 3371#3371: OUTPUT:can't create output chain, file in buffer won't open
Jan 17 09:10:54 Tower2 nginx: 2023/01/17 09:10:54 [error] 3371#3371: OUTPUT:can't create output chain, file in buffer won't open
Jan 17 09:10:56 Tower2 nginx: 2023/01/17 09:10:56 [error] 3371#3371: OUTPUT:can't create output chain, file in buffer won't open
Jan 17 09:10:56 Tower2 nginx: 2023/01/17 09:10:56 [crit] 3371#3371: accept4() failed (24: Too many open files)
Jan 17 09:10:57 Tower2 nginx: 2023/01/17 09:10:57 [crit] 3371#3371: accept4() failed (24: Too many open files)

 

Try booting in safe mode
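
The `(24: Too many open files)` in the nginx lines above is the Linux EMFILE error, meaning a process reached its open-file limit. If you want to see which processes are holding the most file descriptors, a minimal sketch along these lines could help. It reads the standard Linux `/proc/<pid>/fd` and `/proc/<pid>/comm` entries; the function name `open_fd_counts` is invented for this example, and it will only see processes you have permission to inspect (run as root for a full picture).

```python
import os


def open_fd_counts(proc_root="/proc"):
    """Return {pid: (command_name, open_fd_count)} for readable processes.

    Hypothetical helper for illustration; it relies only on the standard
    Linux /proc layout.
    """
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join(proc_root, entry)
        try:
            n_fds = len(os.listdir(os.path.join(base, "fd")))
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
        except OSError:
            # Process exited or we lack permission; skip it.
            continue
        counts[int(entry)] = (name, n_fds)
    return counts
```

Sorting the result by descriptor count, e.g. `sorted(open_fd_counts().items(), key=lambda kv: kv[1][1], reverse=True)[:5]`, shows the top five consumers. If nginx is not at the top, whatever is may be worth looking at first, though this is only a starting point and not a diagnosis.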
