Power Flick? Unable to stop the array

trurl · July 29, 2022

2 minutes ago, mathomas3 said:

So you think both parity drives are hosed?

I don't think there is anything wrong with any disks, except perhaps for disk8 since it did log a critical media error. You can run an extended self-test on disk8, after you get other hardware problems resolved and before attempting to rebuild the disabled disks.

Your main problem seems to be controller related.

mathomas3 · July 29, 2022

Here is a complete Main page. I have 3 disks attached and unassigned, waiting to be added to the array should the need arise

mathomas3 · July 29, 2022

1 minute ago, trurl said:

I don't think there is anything wrong with any disks, except perhaps for disk8 since it did log a critical media error. You can run an extended self-test on disk8, after you get other hardware problems resolved and before attempting to rebuild the disabled disks.

Your main problem seems to be controller related.

Running the extended test now on disk 8

trurl · July 29, 2022

Let's tag @JorgeB for an opinion on these controller issues, the expert on these things. May not be a good time in his time zone.

mathomas3 · July 29, 2022

5 minutes ago, trurl said:

Let's tag @JorgeB for an opinion on these controller issues, the expert on these things. May not be a good time in his time zone.

If needed I have a pair of Dell Compellent that I could try in the enclosure Disk Shelf

Here is the controller that I am using that should have been flashed into IT Mode LSI SAS9207-8e

trurl · July 29, 2022

4 minutes ago, mathomas3 said:

Here is the controller that I am using that should have been flashed into IT Mode LSI SAS9207-8e

That looks OK

06:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	Subsystem: Broadcom / LSI 9207-8e SAS2.1 HBA [1000:3040]

But there is another in your system

05:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array G6 controllers [103c:323a] (rev 01)
	Subsystem: Hewlett-Packard Company Smart Array P410i [103c:3245]

Anything attached to that?

Even if all disks are on that HBA, problems with multiple disks simultaneously indicate controller or power problems, or possibly SAS cable if the problems are all on one cable.

mathomas3 · July 29, 2022

43 minutes ago, mathomas3 said:

Running the extended test now on disk 8

The short SMART test passed and I am still waiting on the extended.

mathomas3 · July 29, 2022

24 minutes ago, trurl said:
That looks OK
06:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	Subsystem: Broadcom / LSI 9207-8e SAS2.1 HBA [1000:3040]
But there is another in your system
05:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array G6 controllers [103c:323a] (rev 01)
	Subsystem: Hewlett-Packard Company Smart Array P410i [103c:3245]
Anything attached to that?

Even if all disks are on that HBA, problems with multiple disks simultaneously indicate controller or power problems, or possibly SAS cable if the problems are all on one cable.

Yes. Everything is in the DAS(two controllers and HDDs) which then connects to the LSI card in the server.

That might be what is throwing you for a loop, the dual controllers that are inside/connected to the LSI card

trurl · July 29, 2022

12 minutes ago, mathomas3 said:

waiting on the extended

You will probably have to disable spindown on the disk to get it to complete.

trurl · July 29, 2022

2 minutes ago, mathomas3 said:

Yes. Everything is in the DAS(two controllers and HDDs) which then connects to the LSI card in the server.

That might be what is throwing you for a loop, the dual controllers that are inside/connected to the LSI card

I don't have a broad hardware experience like @JorgeB

Port multipliers? How long does your 8TB parity check normally take?

mathomas3 · July 29, 2022

3 minutes ago, trurl said:

I don't have a broad hardware experience like @JorgeB

Port multipliers? How long does your 8TB parity check normally take?

last check on the 15th lasted 1day 6h avg speed was 73.7mb

though that was with many dockers running... If I turn that off and run it I get up to 175mb

Edited July 29, 2022 by mathomas3

trurl · July 29, 2022

My 8TB parity check is under 16 hours, 140 MB/s

trurl · July 29, 2022

About your configuration.

To me, it would have made more sense to put that 1TB SSD (currently assigned as disk8) as a single disk pool for caching (XFS, no redundancy since it is going to eventually wind up on the array), and use those 2x256 SSDs (currently assigned as cache) as another fast pool, btrfs raid1, for appdata, domains, system shares.

mathomas3 · July 29, 2022

32 minutes ago, trurl said:

About your configuration.

To me, it would have made more sense to put that 1TB SSD (currently assigned as disk8) as a single disk pool for caching (since it is going to eventually wind up on the array), and use those 2x256 SSDs (currently assigned as cache) as another fast pool, raid1, for appdata, domains, system shares.

I do like that plan... Just havent took the time to get it done... The fast pools are something new that I havent touched on yet.

Again thank you for the help today.

The UPS needs to be switched out. I suspect that it's too small for the new hardware, on boot it beeps for a second or two due to excess power draw

Edited July 29, 2022 by mathomas3

mathomas3 · July 29, 2022

Disk 8 just completed with zero errors... Parity drive is still at 10%

trurl · July 30, 2022

2 hours ago, mathomas3 said:

Disk 8 just completed with zero errors... Parity drive is still at 10%

8TB HDD will take many hours for extended test, with updates every 10%. Might as well check back tomorrow.

JorgeB · July 30, 2022

Controller is OK, only the LSI HBA is being used.

Initial issue appears to have been a power problem, multiple disks dropped at the same time.

Disk10 does appear to be showing some issues, run a long test and even if it passes any more read errors I would recommend replacing it.

mathomas3 · July 31, 2022

Both drives cleared on the extended check and thus far zero errors on disk 10... I will keep an eye on that disk...

Parity and disk 8 are being rebuilt now... decided to replace both of the disk with the spares that I had on hand and plan to use trurl's advise...

On a side note... I have never touched the docker vdisk since it was made years ago, and I would think that I can copy/cut it to a different location and update the location via the settings page... If I were to modify the size of the disk would that corrupt the disk/image?

Again thank you to Trurl for your time and quick response... Well done... JorgeB, thank you for confirming this build months ago and confirming the issues at hand...

JorgeB · July 31, 2022

8 hours ago, mathomas3 said:

. If I were to modify the size of the disk would that corrupt the disk/image?

You can just delete the old one and re-create.

mathomas3 · July 31, 2022

Seems like I keep getting myself into trouble...

While attempting to get the cache setup with the 1tb drive I tried to let mover clear off the data, which didnt work. So I ended up unassigning one of the cache drives and assigning the 1tb in it's place. I waited for a bit for the btrfs process to complete and validating that by using btrfs fi usage -T /mnt/cache and when everything was complete I tried stopping the array to unassign the last small drive from the cache, but I am getting the error 'umount: /mnt/cache : target busy' do I have to do a hard power off again?

image.png.cece94dddabdc55bb565a75571c756b9.png

tower-diagnostics-20220731-1018.zip

mathomas3 · July 31, 2022

ended up doing a hard reset... everything seems fine

trurl · August 1, 2022

21 hours ago, mathomas3 said:

tried to let mover clear off the data

Nothing can move or delete open files. You have to disable Docker and VM Manager in Settings if you want to manage files in appdata, domains, system shares.

20 hours ago, mathomas3 said:

ended up doing a hard reset... everything seems fine

post new diagnostics

mathomas3 · August 3, 2022

On 8/1/2022 at 8:08 AM, trurl said:

Nothing can move or delete open files. You have to disable Docker and VM Manager in Settings if you want to manage files in appdata, domains, system shares.

post new diagnostics

tower-diagnostics-20220802-1946.zip

Sorry for the delay. I ended up driving to a wedding and back... 1200 miles round trip.

When I was trying to use mover to relocate data, everything was disabled and rebooted to ensure there were no file access issues... After leaving it over night it was still far away from where it needed to be... ended moving things by hand to where it needed to be and thus far it seems OK...

The cache needs to be cleaned up though... I still have to pull one 265gig drive from there and add it to new disk pool. From what I have read this isnt something that can be done without wiping out the cache system again?

My plan once I got back was to disable everything again and try mover(or by hand) once again... remove the cache setup... move the 256 to the disk pool and rebuild the cache with the single 1tb drive... and re-enable services

trurl · August 3, 2022

Haven't reviewed the thread yet.

What is the purpose of these shares?

a-----a                           shareUseCache="prefer"  Exists on s------s, disk8, cache
T-------e                         shareUseCache="only"    Exists on cache, disk13, disk14

I'm guessing that first is your appdata, but your docker.cfg has Plex as your APPDATA share, so that other one is anonymized. That share is set to services:prefer. Mover will only move files to/from the designated pool.

And mover won't move duplicates, and nothing can move open files. Possibly the files on disk8 for that share are duplicates so they won't be moved. And cache isn't the designated pool for the share so they won't be moved either.

The other share is cache:only, mover ignores only shares.

mathomas3 · August 3, 2022

32 minutes ago, trurl said:
Haven't reviewed the thread yet.

What is the purpose of these shares?
a-----a                           shareUseCache="prefer"  Exists on s------s, disk8, cache
T-------e                         shareUseCache="only"    Exists on cache, disk13, disk14
I'm guessing that first is your appdata, but your docker.cfg has Plex as your APPDATA share, so that other one is anonymized. That share is set to services:prefer. Mover will only move files to/from the designated pool.

And mover won't move duplicates, and nothing can move open files. Possibly the files on disk8 for that share are duplicates so they won't be moved. And cache isn't the designated pool for the share so they won't be moved either.

The other share is cache:only, mover ignores only shares.

Let me think about what you have said and try to understand it better... what you have said has raised a few questions to the operation of the cache... and while I would agree that mover might not move data from disk 8 to the new services disk pool... I would think it would move data from the cache...

Power Flick? Unable to stop the array

Recommended Posts

Link to comment

Top Posters In This Topic

Popular Days

Top Posters In This Topic

Popular Days

Popular Posts

JorgeB

mathomas3

Posted Images

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Join the conversation