Power Flick? Unable to stop the array



2 minutes ago, mathomas3 said:

So you think both parity drives are hosed? 

I don't think there is anything wrong with any disks, except perhaps for disk8 since it did log a critical media error. You can run an extended self-test on disk8, after you get other hardware problems resolved and before attempting to rebuild the disabled disks.

 

Your main problem seems to be controller related.
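
If you prefer the console over the GUI for that extended self-test, something like this should work (sdX is just a placeholder, check which device disk8 actually is first):

smartctl -t long /dev/sdX    # start the extended (long) self-test; it runs inside the drive
smartctl -a /dev/sdX         # check progress, and the result once it finishes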

Link to comment
1 minute ago, trurl said:

I don't think there is anything wrong with any disks, except perhaps for disk8 since it did log a critical media error. You can run an extended self-test on disk8, after you get other hardware problems resolved and before attempting to rebuild the disabled disks.

 

Your main problem seems to be controller related.

Running the extended test now on disk 8

Link to comment
4 minutes ago, mathomas3 said:

Here is the controller I am using, which should have been flashed to IT mode: LSI SAS9207-8e

That looks OK

06:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	Subsystem: Broadcom / LSI 9207-8e SAS2.1 HBA [1000:3040]

 

But there is another in your system

05:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array G6 controllers [103c:323a] (rev 01)
	Subsystem: Hewlett-Packard Company Smart Array P410i [103c:3245]

Anything attached to that?

 

Even if all disks are on that HBA, problems with multiple disks simultaneously indicate controller or power problems, or possibly SAS cable if the problems are all on one cable.
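
A quick way to see which disks sit behind which controller and cable path, assuming the standard udev links are present:

ls -l /dev/disk/by-path/         # each disk is shown with its PCI address and SAS lane
lspci -nn | grep -Ei 'sas|raid'  # the storage controllers themselves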

Link to comment
24 minutes ago, trurl said:

That looks OK

06:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	Subsystem: Broadcom / LSI 9207-8e SAS2.1 HBA [1000:3040]

 

But there is another in your system

05:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array G6 controllers [103c:323a] (rev 01)
	Subsystem: Hewlett-Packard Company Smart Array P410i [103c:3245]

Anything attached to that?

 

Even if all disks are on that HBA, problems with multiple disks simultaneously indicate controller or power problems, or possibly SAS cable if the problems are all on one cable.

Yes. Everything is in the DAS (two controllers and HDDs), which then connects to the LSI card in the server.

 

That might be what is throwing you for a loop: the dual controllers inside the DAS that connect to the LSI card

Link to comment
2 minutes ago, mathomas3 said:

Yes. Everything is in the DAS (two controllers and HDDs), which then connects to the LSI card in the server.

 

That might be what is throwing you for a loop: the dual controllers inside the DAS that connect to the LSI card

I don't have broad hardware experience like @JorgeB

 

Port multipliers? How long does your 8TB parity check normally take?

Link to comment
3 minutes ago, trurl said:

I don't have broad hardware experience like @JorgeB

 

Port multipliers? How long does your 8TB parity check normally take?

Last check on the 15th lasted 1 day 6 h; avg speed was 73.7 MB/s

Though that was with many dockers running... If I turn those off and run it, I get up to 175 MB/s
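
As a rough sanity check on those numbers: 8 TB ÷ 73.7 MB/s ≈ 108,500 s ≈ 30 h, which matches the reported 1 day 6 h.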

Edited by mathomas3
Link to comment

About your configuration.

 

To me, it would have made more sense to put that 1TB SSD (currently assigned as disk8) as a single-disk pool for caching (XFS, no redundancy since it is going to eventually wind up on the array), and use those 2x256GB SSDs (currently assigned as cache) as another fast pool, btrfs raid1, for appdata, domains, system shares.

Link to comment
32 minutes ago, trurl said:

About your configuration.

 

To me, it would have made more sense to put that 1TB SSD (currently assigned as disk8) as a single-disk pool for caching (since it is going to eventually wind up on the array), and use those 2x256GB SSDs (currently assigned as cache) as another fast pool, raid1, for appdata, domains, system shares.

I do like that plan... just haven't taken the time to get it done... Fast pools are something new that I haven't touched yet.

 

Again thank you for the help today.

 

The UPS needs to be switched out. I suspect it's too small for the new hardware; on boot it beeps for a second or two due to excess power draw.

Edited by mathomas3
Link to comment

Controller is OK, only the LSI HBA is being used.

 

Initial issue appears to have been a power problem; multiple disks dropped at the same time.

 

Disk10 does appear to be showing some issues. Run a long test, and even if it passes, if there are any more read errors I would recommend replacing it.
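
If it helps, the counters worth watching on that disk can be pulled like this (sdX is a placeholder for disk10's actual device):

smartctl -A /dev/sdX | grep -Ei 'reallocated|pending|uncorrect'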

Link to comment

Both drives passed the extended test, and thus far zero errors on disk10... I will keep an eye on that disk...

 

Parity and disk8 are being rebuilt now... decided to replace both of the disks with the spares I had on hand, and plan to follow trurl's advice...

 

On a side note... I have never touched the docker vdisk since it was made years ago, and I would think that I can copy/cut it to a different location and update the location via the Settings page... If I were to modify the size of the vdisk, would that corrupt the disk/image?
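
Something like this is what I have in mind, with Docker disabled first (paths are only examples, mine may differ):

mv /mnt/cache/system/docker/docker.img /mnt/disk1/system/docker/docker.img

and then pointing Settings -> Docker at the new location.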

 

Again, thank you to trurl for your time and quick response... Well done... JorgeB, thank you for validating this build months ago and confirming the issues at hand...

Link to comment

Seems like I keep getting myself into trouble... 

 

While attempting to get the cache set up with the 1TB drive, I tried to let mover clear off the data, which didn't work. So I ended up unassigning one of the cache drives and assigning the 1TB in its place. I waited a bit for the btrfs operation to complete, validating that with btrfs fi usage -T /mnt/cache, and when everything was done I tried stopping the array to unassign the last small drive from the cache. But I am getting the error 'umount: /mnt/cache: target busy'. Do I have to do a hard power off again?


tower-diagnostics-20220731-1018.zip

Link to comment
21 hours ago, mathomas3 said:

tried to let mover clear off the data

Nothing can move or delete open files. You have to disable Docker and VM Manager in Settings if you want to manage files in appdata, domains, system shares.
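
Before resorting to another hard power-off, it is worth checking what is holding the mount open, for example:

lsof /mnt/cache       # list processes with files open on the pool
fuser -vm /mnt/cache  # show what is using the mount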

 

20 hours ago, mathomas3 said:

ended up doing a hard reset... everything seems fine

 

post new diagnostics

Link to comment
On 8/1/2022 at 8:08 AM, trurl said:

Nothing can move or delete open files. You have to disable Docker and VM Manager in Settings if you want to manage files in appdata, domains, system shares.

 

post new diagnostics

tower-diagnostics-20220802-1946.zip

 

Sorry for the delay. I ended up driving to a wedding and back... 1200 miles round trip.

 

When I was trying to use mover to relocate data, everything was disabled and I rebooted to ensure there were no file access issues... After leaving it overnight it was still far from where it needed to be... I ended up moving things by hand to where they needed to be, and thus far it seems OK...

 

The cache needs to be cleaned up though... I still have to pull one 256GB drive from there and add it to the new disk pool. From what I have read, this isn't something that can be done without wiping out the cache again?

 

My plan once I got back was to disable everything again and try mover (or move by hand) once again... remove the cache setup... move the 256GB to the disk pool and rebuild the cache with the single 1TB drive... and re-enable services.
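
For the 'by hand' part I am thinking of something along these lines, with Docker and VMs stopped (share and disk names are only examples):

rsync -avh /mnt/cache/appdata/ /mnt/disk1/appdata/

and verifying the copy before deleting the source from the pool.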

Link to comment

Haven't reviewed the thread yet.

 

What is the purpose of these shares?

a-----a                           shareUseCache="prefer"  Exists on s------s, disk8, cache
T-------e                         shareUseCache="only"    Exists on cache, disk13, disk14

I'm guessing that first one is your appdata, but your docker.cfg has Plex as your APPDATA share, so that other one is anonymized. That share is set to services:prefer. Mover will only move files to/from the designated pool.

 

Mover won't move duplicates, and nothing can move open files. Possibly the files on disk8 for that share are duplicates, so they won't be moved. And cache isn't the designated pool for the share, so those won't be moved either.

 

The other share is cache:only, and mover ignores only shares.
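
You can check the duplicate theory yourself with something like this (substitute the real share name for the anonymized one):

diff -rq /mnt/disk8/a-----a /mnt/cache/a-----a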

Link to comment
32 minutes ago, trurl said:

Haven't reviewed the thread yet.

 

What is the purpose of these shares?

a-----a                           shareUseCache="prefer"  Exists on s------s, disk8, cache
T-------e                         shareUseCache="only"    Exists on cache, disk13, disk14

I'm guessing that first one is your appdata, but your docker.cfg has Plex as your APPDATA share, so that other one is anonymized. That share is set to services:prefer. Mover will only move files to/from the designated pool.

 

Mover won't move duplicates, and nothing can move open files. Possibly the files on disk8 for that share are duplicates, so they won't be moved. And cache isn't the designated pool for the share, so those won't be moved either.

 

The other share is cache:only, and mover ignores only shares.

Let me think about what you have said and try to understand it better... it has raised a few questions about the operation of the cache... and while I would agree that mover might not move data from disk8 to the new services pool, I would think it would move data from the cache...

 

 

Link to comment
