put my mind at rest


Go to solution Solved by bmartino1,

Good morning, this is a question in the hope that you can put my mind at rest.

 

Background: I have 5 disks in an Unraid environment, one of them acting as the parity drive, for a total of 22TB of storage.

I run Unraid in a VM on a Proxmox server, and I have passed my HDs through with the following command:

 

qm set 103 -scsi1 /dev/disk/by-id/***-WDC_************

 

All my shares use all the HDs within Unraid. A few days ago, while I was watching Plex, one of my hard drives failed, which left Unraid hanging and not letting me in. After a bit of investigating I located the HD and removed it from the VM config, then fired up the VM with no issues. Unraid reported that a hard disc was missing, but it would still run and emulate the missing disc.

 

So far no problems, but here is the issue: upon doing a data restore (using Stellar Data Recovery) I discovered that I had lost quite a few films, TV shows and a lot of my personal photos.

 

As I understand it, all the data in my array is covered by the parity drive, so I can afford to lose one HD and all my data is still safe. From this experience, though, it looks like any data that was stored on the failed HD is lost.
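For intuition on the parity idea described above: single parity behaves like XOR, where the parity disk holds the XOR of the same position on every data disk, so one missing disk can be recomputed from the survivors plus parity. A minimal sketch with made-up byte values (the function name and the numbers are purely illustrative, not something Unraid exposes):

```shell
# xor_bytes: XOR any number of integer byte values together.
xor_bytes() {
    result=0
    for value in "$@"; do
        result=$(( result ^ value ))
    done
    echo "$result"
}

# Made-up bytes at the same offset on four data disks.
d1=23; d2=200; d3=77; d4=5

# The parity disk stores the XOR of all data disks.
parity=$(xor_bytes "$d1" "$d2" "$d3" "$d4")

# If disk 3 fails, XOR of the surviving disks and parity gives its byte back.
rebuilt_d3=$(xor_bytes "$d1" "$d2" "$d4" "$parity")
echo "parity=$parity rebuilt_d3=$rebuilt_d3"
```

This only holds while parity is in sync with the data disks. It can rebuild one missing disk, but it cannot undo corruption that was already written to the data disks and parity together.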

Link to comment
Posted (edited)

thanks for the info on Disk 1 - I

 

I just ran the check with -n and here are the results. Does this look OK? If not, what are the best steps to resolve it?

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 1/65902336, freecount 28 nfree 27
finobt ir_freecount/free mismatch, inode chunk 1/65902336, freecount 28 nfree 27
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
inode identifier 2306195160 mismatch on inode 2314583768
inode identifier 2306195160 mismatch on inode 2314583768
would have cleared inode 2314583768
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 1
        - agno = 2
inode identifier 2306195160 mismatch on inode 2314583768
would have cleared inode 2314583768
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

 

Regarding the disk I replaced: it was disk 4.

 

Edited by chris_netsmart
Link to comment

I have re-run it without the -n:

 


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

 

Does this look good?

Link to comment

Yes. With xfs_repair it's not always easy to see whether something was done or not, unless you manually check the exit code, but if an issue was detected it should now be fixed. Reboot to clear the log and keep an eye on it for more errors like the one above.
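A minimal sketch of the exit-code check mentioned above. According to the xfs_repair man page, a run with -n exits with 1 when corruption was detected and 0 when none was found. The helper name is made up, and the device path in the usage comment is an assumption that has to be replaced with the device of the disk being checked:

```shell
# explain_xfs_check: translate the exit status of "xfs_repair -n" into words.
explain_xfs_check() {
    case "$1" in
        0) echo "no corruption detected" ;;
        1) echo "corruption detected, a repair run is needed" ;;
        *) echo "unexpected exit status $1, read the xfs_repair output" ;;
    esac
}

# Usage (hypothetical device path, adjust to your disk):
#   xfs_repair -n /dev/md1
#   explain_xfs_check "$?"
```

Running the -n check again after a repair and getting status 0 is a simple way to confirm the repair took effect.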

Link to comment
Posted (edited)

This is a known Proxmox issue...

Proxmox never lets go of the disks and is still interacting with them. In Proxmox, lsblk should not see the disks! This is where your corruption is happening, as Unraid is trying to do xyz and Proxmox is trying to do xyz. Even with other OSes and disk-by-id passing, there are complications in how Proxmox keeps touching the disk!

This will kill a disk faster than anything...

When I did testing with Proxmox (I virtualized Unraid 6.11), I had to PCIe-pass the entire HBA card in Proxmox to have Unraid happy with the disks. There is still a Proxmox touch at host boot, as Proxmox can see all the drives, especially if they are ZFS. But once the VM boots, the drives disappear from the host and can't be touched!

There are Proxmox modprobe settings and other commands to keep the host from touching the device, similar to the vfio options used for graphics cards and frame buffers.
The issue still lies with how the disks were passed by Proxmox. And because of their use in that state, you killed the drive.

Check proxmox disk io...

https://forum.proxmox.com/threads/sda-has-a-holder.97771/

https://forum.proxmox.com/threads/disk-io-throttles-and-dies-inside-vm-passthrough-of-drives-encrypted.122968/
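One way to look for the kind of host-side claim discussed in the "has a holder" thread is to check the holders directory that sysfs keeps for each block device. This is only a sketch: the function name is made up, it looks at whole disks only (partitions have their own holders directories), and an empty result does not prove the host never touches a disk.

```shell
# list_held_disks: print every block device under a sysfs root that has
# at least one holder (for example LVM, device-mapper or md claiming it).
# The root defaults to /sys/block; the argument exists to make testing easy.
list_held_disks() {
    root="${1:-/sys/block}"
    for dev in "$root"/*; do
        [ -d "$dev/holders" ] || continue
        # A non-empty holders directory means something on the host uses the disk.
        if [ -n "$(ls -A "$dev/holders" 2>/dev/null)" ]; then
            basename "$dev"
        fi
    done
}

# Usage on the Proxmox host:
#   list_held_disks
#   lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
```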

Edited by bmartino1
Link to comment
Posted (edited)

@bmartino1 

Quote

When I did testing with Proxmox (I virtualized Unraid 6.11), I had to PCIe-pass the entire HBA card in Proxmox to have Unraid happy with the disks. There is still a Proxmox touch at host boot, as Proxmox can see all the drives, especially if they are ZFS. But once the VM boots, the drives disappear from the host and can't be touched!

OK. I want to keep my Unraid in a VM because I had issues with Dockers and VMs within Unraid (Home Assistant, for example). Am I reading the above statement correctly: if I replace my SATA expansion card with an HBA card (for example, an HP H221 HBA controller card) and then pass it through to Unraid from Proxmox, the discs will only be visible to Proxmox up until I fire up the Unraid VM?

 

Example card: LSI 9207-8i RAID controller card, 6Gb/s SAS, for my B450 Tomahawk.

Edited by chris_netsmart
Link to comment
Posted (edited)

Correct
Essentially, you want to use Proxmox as the hypervisor (QEMU) for the emulated motherboard, CPU and a portion of RAM.

However, the HBA needs to be in IT mode. I used this in my testing:
https://a.co/d/17I658Q

In Proxmox I also had to set up a special modprobe config to help.
 


lspci -n -s ### of your HBA

nano /etc/modprobe.d/vfio.conf

 

(if it is a graphics card, it should have this:)
options vfio-pci ids= disable_vga=1

(if usb card / HBA should have this:)
options vfio-pci ids= disable_idle_d3=1 enable_sriov disable_denylist

 

Edited by bmartino1
Link to comment
Posted (edited)

@bmartino1 thanks for the information. This is something I will be doing, as I don't want to lose any more data from a disc fault.

 

again to make sure I understand the above

 

Once I have purchased and installed the HBA card, I will then need to run the commands above to get the HBA card information and then modify the vfio.conf file.

 

here I am not sure what you are saying 

Quote

 

(if it is a graphics card, it should have this:)

options vfio-pci ids= disable_vga=1

 

(if usb card / HBA should have this:)

options vfio-pci ids= disable_idle_d3=1 enable_sriov disable_denylist

 

 

And lastly, looking at the linked webpage about "modprobe": gosh, I am so confused. Or should I just be looking at this

Would this affect Proxmox version 8?

 

PS. I have looked around. To anyone: is it easy to set up the SAS controller within my motherboard BIOS? If so, is there a video or a step-by-step procedure? This would be something very new to me, and I don't want to lose any more data.

 

Edited by chris_netsmart
Link to comment
Posted (edited)

Kinda...
https://pve.proxmox.com/wiki/PCI_Passthrough
https://pve.proxmox.com/wiki/PCI(e)_Passthrough

You want the HBA card to use the vfio driver, similar to passing a GPU via PCI passthrough. There are a lot of steps and configurations...

I have many jumbled notes with friends in my Discord server. We were using Proxmox 6 - 8 when testing this. Apart from physical hardware issues with our Threadripper (a bad memory controller: the hardware still worked, but the memory controller kept spamming the log with corrected-memory messages), we had no issues except for the first Proxmox attempt to touch the disks before the VM starts and takes the HBA PCI device. (We were working on a way to stop Proxmox from loading the ATA/HBA driver via blacklisting, so Proxmox would never touch them...) After that, Proxmox never sees them again... We had 2-HBA TrueNAS testing, Unraid 6.11 testing, vGPU with a 2080... good times... Issues with long COVID killed the project, and we took parts from it for others. The hardware has sat since...
See some of our poor documentation at https://drive.google.com/file/d/1g_tUmplm-7hAlK6Efz-BVCcj82C9NbYk/view?usp=drive_link

Proxmox 8.1 was our last test with this... Due to the known hardware failure and the project's hardware being used elsewhere, it wasn't worth the cost to troubleshoot the board/processor or pay for a replacement. Kernel 6 killed many things for ARM code. 10000 lines of code were removed from 5, and a lot of what was removed borked other things...
pve-kernel-5.15 was the last known good kernel for true full PCIe passthrough. Some of the kernel 6 changes were great, others not so much, and they broke a lot of the grub commands and workarounds... With Proxmox 8+ I recommend using the pve Proxmox kernels; with vGPU stuff you need the Linux headers of that kernel. I have attempted to download them for posterity, but with other debs and changes over the years they will soon be EOL and not worth using, as kernel 6 LTS fixes the code that broke... This is not a dealbreaker but a known vfio problem...

Anyway:
So let me help clarify the steps you need to do to prep the machine:

I assume you have a subscription; if not, you are using the Debian repository for security updates...

Step 1 set up the repositories:

https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_package_repositories

https://pve.proxmox.com/wiki/Package_Repositories

rm /etc/apt/sources.list.d/pve-enterprise.list

Make sure you have an up-to-date BIOS and that your hardware has BIOS options to support IOMMU. Enable IOMMU; some high-end hardware also has DMAR remapping settings for IOMMU and other options for memory remapping...
^ This is the staple to make things work...


Next we need to set up the kernel boot options. Since Proxmox 8 the boot entries may be managed by the proxmox-boot-tool instead of being edited directly in grub...
so editing grub directly may not apply, or can break your system ... https://forum.proxmox.com/threads/grub-parameters-when-using-proxmox-boot-tool-refresh.118649/

Don't worry, adding the options below will not brick or break anything...

We want to add something to the kernel command line to help bind and set the vfio kernel driver in use...
On a grub-booted host we do this by editing /etc/default/grub...


find 
GRUB_CMDLINE_LINUX_DEFAULT=


We will want to add this to enable IOMMU...

Intel
intel_iommu=on iommu=pt

AMD:
amd_iommu=on iommu=pt

AMD Threadripper / Epyc:
intel_iommu=on amd_iommu=on iommu=pt
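A small sketch for picking between the Intel and AMD lines above based on the vendor string in /proc/cpuinfo. The function name is made up, and the flags are simply the ones listed in this post, not a recommendation of my own:

```shell
# iommu_flags: print the IOMMU kernel options from the list above
# for a given /proc/cpuinfo vendor_id string.
iommu_flags() {
    case "$1" in
        GenuineIntel) echo "intel_iommu=on iommu=pt" ;;
        AuthenticAMD) echo "amd_iommu=on iommu=pt" ;;
        *) echo "unknown CPU vendor: $1" >&2; return 1 ;;
    esac
}

# Usage on the host:
#   vendor=$(awk -F': ' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo)
#   iommu_flags "$vendor"
```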


We also want some other grub settings:
Fix power issues with PCIe devices when moved: pci=noaer pcie_aspm=off
Allow the PCIe device to send things upstream and downstream (this also changes the IOMMU group layout...): pcie_acs_override=downstream,multifunction
And later, to add our HBA device ID to the line: vfio-pci.ids=

I delete the "quiet" option that may be there, so that boot messages are shown on screen...

I recommend commenting out (#) your working line and adding the new line under it in the config. This way, if there is a problem, you can still get into the OS by editing the entry in the grub boot menu and then re-edit this line...

GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction pci=noaer pcie_aspm=off vfio-pci.ids=###,### "


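Don't forget to apply the change: if you edited /etc/default/grub with nano, run update-grub to apply and commit the changes... Then reboot.
^ This applies when the host boots with grub.

When the host boots via systemd-boot (managed by proxmox-boot-tool), the Proxmox way is: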

nano /etc/kernel/cmdline

add to end of line:
intel_iommu=on amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction pci=noaer pcie_aspm=off

later add vfio-pci.ids=###,###


Then:
 

nano /etc/modules

===================================

vfio
vfio_iommu_type1
vfio_pci
# not needed if on kernel 6.2 or newer:
vfio_virqfd

===================================

Then run

proxmox-boot-tool refresh

and 

update-initramfs -u -k all

############################################################################### 
To check that your hardware is working and to test IOMMU, run in the Proxmox terminal:

dmesg | grep -e DMAR -e IOMMU

dmesg | grep 'remapping'

^This searches the kernel log for the IOMMU options that were applied at boot and the related info...
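Once IOMMU shows up in the log, it also helps to see which IOMMU group the HBA landed in. A common way is to walk /sys/kernel/iommu_groups; the sketch below does that with a made-up function name, and takes an optional root directory only so it can be tried against a test folder:

```shell
# list_iommu_groups: print "group <n>: <pci address>" for each device.
# The root defaults to /sys/kernel/iommu_groups; the argument is for testing.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}   # strip the root prefix
        group=${group%%/*}      # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}

# Usage on the host:
#   list_iommu_groups | sort -V
```

Ideally the HBA sits in a group of its own, or shares it only with devices you also pass to the VM. That is the layout the pcie_acs_override option mentioned earlier is meant to influence.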


In the modprobe folder we want to add some configs and fixes...

cd /etc/modprobe.d

In this folder, create each file below and add the listed content to it.

 

*Fix kvm spam of message log:

nano kvm.conf

=================================================================

options kvm ignore_msrs=1

=================================================================

 

*Your VFIO LSPCI device to pass to VM

nano vfio.conf

=================================================================

#( if g card should have this: )

#options vfio-pci ids= disable_vga=1

 

#(if usb card / HBA should have this:)

#options vfio-pci ids= disable_idle_d3=1 enable_sriov disable_denylist

=================================================================

Optional but sometimes needed:
IF YOU DON'T GET A GOOD REMAPPING:
*Allow unsafe interrupts (for platforms without interrupt remapping support)

nano iommu_unsafe_interrupts.conf

=================================================================
options vfio_iommu_type1 allow_unsafe_interrupts=1
=================================================================

The key here is to get IOMMU working, then get your PCI hardware ID for VFIO:

To list the PCI devices:
lspci -v

To get the vendor:device ID that vfio needs (replace 01:00 with the address of your HBA):

lspci -n -s 01:00
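As a rough sketch, the vendor:device IDs can be pulled out of that lspci output and joined into the comma-separated form that ids= expects. The function name is made up, 01:00 is just the example address used above, and the sample line in the comment is illustrative rather than taken from real hardware:

```shell
# pci_ids_csv: read "lspci -n" output on stdin and print the unique
# vendor:device IDs (third column) as a comma-separated list.
# A typical input line looks like: 01:00.0 0107: 1000:0097 (rev 02)
pci_ids_csv() {
    awk '{ print $3 }' | sort -u | paste -sd, -
}

# Usage on the host:
#   lspci -n -s 01:00 | pci_ids_csv
#   echo "options vfio-pci ids=$(lspci -n -s 01:00 | pci_ids_csv) disable_idle_d3=1"
```

Duplicates are dropped here, so a device that reports the same ID on two functions yields a single entry.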

If you change anything in modprobe.d, you need to rebuild the initramfs:
update-initramfs -u -k all

and reboot...

At this point you have verified IOMMU is on, grabbed the HBA hardware ID for VFIO, and used lspci to check the driver in use, which should show vfio on the card.
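A minimal sketch for that driver check. lspci -k prints a "Kernel driver in use:" line per device, and the helper below (a made-up name) just extracts it; the address 01:00 is again only the example from above:

```shell
# driver_in_use: read "lspci -k" output on stdin and print the driver
# that is currently bound to each listed device.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Usage on the host:
#   lspci -nnk -s 01:00 | driver_in_use
# After a successful bind this should print vfio-pci rather than the
# normal storage driver of the card.
```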

We now need to edit the kernel line and fill in the ###,### with the ID of the HBA. There may be 2 different numbers or the same one twice...

Update the boot configuration (update-grub or proxmox-boot-tool refresh) and reboot.

Now make a VM in Proxmox: USB-pass the Unraid USB stick, add the HBA as a PCIe device, and set the other VM settings.
Edit the options to set USB as the default boot device.
Boot into Unraid...

...kinda:

^ - First 13 mins...

 



There are many ways to do this. The premise is to get the device you want to pass to use the vfio driver.
Even a vm hook script: https://forum.proxmox.com/threads/passing-though-an-lsi-hba-to-vm.64532/

 

Edited by bmartino1
Proxmox 8 boot tool update...
Link to comment


Best I can find... A lot of my notes were collected over the years from many videos / forums / wikis and changes...
Better asked on the Proxmox forum.

While it can be done, Unraid has taken the stance that they don't want to support their OS virtualized. CPU and other bugs are bound to happen, and you're on your own.

Link to comment

@bmartino1 

Quote

"While it can be done, Unraid has taken the stance that they don't want to support their OS virtualized. CPU and other bugs are bound to happen, and you're on your own."

 

this I understand

 

I just read through the procedure you posted and I can see I have a lot of reading to do before I go ahead and do this, as, I must confess, a lot of it is over my head, and I need to understand it before I do anything.

Link to comment
Posted (edited)
In the end it's about what you have and how you want to interact with it. What service is Unraid providing you? Dockers? Docker can be run on any Linux OS...
https://docs.docker.com/get-docker/

Since you are on Proxmox, it is more a matter of finding another good Docker UI.

You can install Docker natively on Proxmox and run Dockers from another web UI such as Portainer.
If ipvlan Dockers are what you need, you can use the web UI for Docker from the ZimaBoard / IceWhale group (CasaOS). Then run them natively on the Proxmox host and its data storage.

It is easier to learn how to implement dockers:
https://docs.docker.com/network/drivers/


Run CasaOS alongside Proxmox, or in an Ubuntu Linux VM / container...

Just add the Debian repository to the apt sources list on Proxmox to get the libs needed and a trusted repository for the software:
https://medium.com/@kiena/configuring-apt-sources-in-debian-12-ensuring-reliable-software-access-a940ac2ca7f0

More info on sources / repositories:
https://wiki.debian.org/SourcesList#sources.list_format


https://casaos.io/

Run the bash script and that's it. Log in to the CasaOS web UI... Proxmox as the hypervisor and CasaOS for Docker.

I don't like to use CasaOS as it breaks macvlan Docker network setups, but it has a nice UI, a good implementation of Docker Compose / JSON / Docker setup, and a similar Docker community app store. You can use docker run and/or other docker commands, and CasaOS will detect them and add them to the web UI...

*Once installed, I make a folder and set up a folder structure on Proxmox ZFS, and symlink the CasaOS path to keep the CasaOS data and all of its Docker data in one location...
^would have to find the notes for it.  a specific terminal command to 


Portainer is ok..
https://docs.portainer.io/start/install-ce

It is harder to use and clumsier than Unraid's web UI for Docker. What Unraid has is a good template setup to construct a docker run command.
^For this reason, Unraid is Docker king.

You can migrate Unraid Dockers by editing the Docker: change an item and change it back to regenerate its docker run line.
Remove the -l options for labels and paste that line in the Proxmox terminal, then run that line, fixing the network and data paths in the docker run line.

My 2 Cents:
Just a thought I'm throwing out there. I like Unraid for its Docker setup, macvlan for LAN access and other service support (mainly how they implement vfio bind...). Unraid has a nice 1-click UI for some of this. Unfortunately, their default was macvlan... The recent 6.11 - 6.12 upgrade and the switch to ipvlan Dockers borked a great many things. Once Unraid drops macvlan completely, Unraid will be dead to me.

macvlan is hard to implement correctly, as it is driver dependent on a network card that supports promisc mode.

https://superuser.com/questions/1414628/understanding-promiscuous-mode

 

 

Edited by bmartino1
Link to comment

@bmartino1 as above, I am already using VMs within Proxmox, as I had a lot of issues with running VMs within Unraid, and I have also moved out some functions and placed them within a Docker.

I also like Unraid, as I can mix and match my discs, and for the simple way you can click on a Docker and it will install it for you. If I was able to have a stable environment with Unraid (in relation to VMs and Dockers), I wouldn't have Proxmox.

Going back a few steps, I think I will be looking at an HBA or, I hate to say it, a Linux VM running as my RAID server, but in doing this I will lose the ability to mix and match discs.

 

 

Link to comment

@bmartino1 I have been doing some reading and I think I have a basic understanding of the HBA and how to set it up on my Proxmox and then pass it through to Unraid.

The only thing I can't find any information about: will I lose the data on my hard discs when I replace my PCI SATA card with an HBA card, and then pass the card through instead of using

qm set 103 -scsi1 /dev/disk/by-id/ata-WDC_*******-*****-**-********

 

Link to comment

The point of using the HBA is to pass the PCIe HBA device itself; you would no longer use disk-by-id passthrough.

The disks are passed along when the PCIe device is passed... so get rid of any lingering scsi disk-by-id HD passthrough entries. There should be no data loss when done correctly...

The data is on the disk, not on the port it is connected to. When Unraid starts you may need to select the correct disks in the array layout in Unraid to restore Unraid functions. Data is not lost this way...
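To make the switch concrete, here is a hedged sketch that only prints the qm commands involved instead of running them. VM ID 103 comes from this thread; the PCI address 01:00.0 and the scsi slot names are assumptions, so check "qm config 103" and lspci for the real values before running anything, and do it with the VM shut down:

```shell
# print_hba_switch_cmds VMID PCI_ADDR SLOT...
# Prints (does not run) the qm commands that drop by-id disk entries
# and attach the HBA as a PCI device instead.
print_hba_switch_cmds() {
    vmid=$1; pci_addr=$2; shift 2
    for slot in "$@"; do
        echo "qm set $vmid --delete $slot"
    done
    echo "qm set $vmid -hostpci0 $pci_addr"
}

# Example: VM 103, a hypothetical HBA at 01:00.0, and two by-id disk slots.
print_hba_switch_cmds 103 01:00.0 scsi1 scsi2
```

Printing first makes it easy to review the list against the VM config before touching it.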

Link to comment
Posted (edited)

@bmartino1 one last question. I have been looking at different HBAs which can take 8 devices. Would you say I should go for a 12Gb/s card like the LSI 9300 or 9302-8i to future-proof myself, or stick for now with a 6Gb/s card like the 9211-8i? Also, do you think purchasing the card on eBay is a good idea, or should I go to a retailer like Amazon?

Edited by chris_netsmart
Link to comment
Posted (edited)

? - https://www.theserverstore.com/lsi-sas-9300-8i-sas3008-8-port-12gb-pcie-30-it-mode-freenas-unraid-hba.html

The reason for an HBA over a SATA PCIe expansion card is that HBAs have extra chips that, when passed, help with disk management... Any PCIe card with ports can be passed via PCIe passthrough.
^Some are easier to pass and support SR-IOV out of the box; others require kernel / modprobe / other parameters before the VM sees them...
 

A 12Gb/s model is fine (this is the on-card disk-to-disk transfer speed...). What you really want is a card that supports IT mode, as Unraid uses software RAID.

eBay is second hand, and it's being sold for a reason. When using eBay, double-check the description. When I use eBay, the hope is to find either a good, known technician selling from a salvage sale (buying wholesale / pallets of hardware) or a home-labber who upgraded and is trying to sell their old hardware. eBay protection isn't as good as it used to be, and there is no warranty...
eBay usually has the cheaper price... but you're paying the agreed value to a private seller for hardware in an unknown condition without warranty...

I would prefer well-known sellers and name brands, but that comes down to cost and price. It's what you are willing to purchase.

If the device is not in IT mode, you will have to go through steps to flash the firmware. That firmware can be hard to acquire and may not work, and then you're left with a bricked device.
IT mode takes away the RAID HD assignment and lets each connected hard drive pass through as if it were SATA connected. Non-IT-mode cards have another menu at boot to configure the hardware disk setup for RAID 0, 1, 5 or JBOD, and the disks are then controlled by that device. Not all RAID cards are HBAs.

Ideally, if you are future-proofing and want a disk shelf, you would want an HBA with an external fiber:
https://www.ebay.com/itm/326066742362?chn=ps&mkevt=1&mkcid=28&srsltid=AfmBOopIcxytXqVaXDoBXHdh0PI0t-1xgGkg_STfvfmd62xETOfF1V9u6O4

The fiber connects to the HBA in another box with power and a JBOD disk shelf... Note that that device is not in IT mode. That would exist in the other case and connect to the disk shelf; Unraid would use an HBA DAC to connect to that card's disk shelf.
Make sure your DAC fibers are compatible: 

Example Disk shelf setup: 
https://a.co/d/3Tf2nu5 --finally found it the it mode disk shelf PCIE card: https://a.co/d/54w3pLv
https://www.newegg.ca/norco-ds-12d/p/N82E16816133044

Server/datacenter worlds get weird fast. In that world LSI was the top brand. There are knock-offs and third-party chips using LSI firmware (AliExpress...). I have had success with Broadcom knock-offs as well. The goal is to get a board with decent firmware that runs in IT mode and has xyz SATA port connections. Example (may not be a knock-off): https://www.aliexpress.us/item/3256806551465670.html?src=google&gatewayAdapt=glo2usa

I would recommend, when purchasing one, getting one that comes with cables, as otherwise you have to fight with SAS / mini-SAS / DAC / SATA multi-split cables to use with the devices.

USA: From what I can find:

Amazon:
6Gb/s, 8 drives:
https://a.co/d/89HAs0W

12Gb/s, 16 drives:
https://a.co/d/6eSTsyc

AliExpress:
12Gb/s, 16 drives:
https://www.aliexpress.us/item/3256806546653173.html?src=google&gatewayAdapt=glo2usa

ebay:
https://www.ebay.com/itm/134338327797
https://www.ebay.com/itm/386738701327

I don't know your needs or know your plans for the future...

Edited by bmartino1
Link to comment
Posted (edited)

Gosh, many thanks @bmartino1. I wasn't expecting all that information, and you have given me something to think about for expanding my data storage in the future. But as I have a 9-disc storage bay at the moment, and all I am looking at is removing the SATA expansion card and replacing it with an HBA, I think I will be looking at something like the 9300-8i 12Gb/s.

Regarding the cost: over here in the UK I will be looking at £100 from Amazon, but I will shop around and see if I can get it cheaper before I purchase.

Question: so are you saying that the Fujitsu 9300-8i is a good board? The link above is giving me the cheapest price for the UK.

I also agree with you that buying from eBay is hit or miss, and normally it is my last resort.

Edited by chris_netsmart
Link to comment
  • Solution

Yes, it's a good card.

From my experience, AliExpress can be worse than eBay in terms of quality of goods. Fujitsu is an established, known brand for electronics. They are known for their storage and electronics (more semiconductors) in the USA. That card fits your needs if you want 12Gb/s transfer speeds and 8 disk connections.
Site Tech specs: https://www.fujitsu.com/au/products/computing/servers/primergy/components/pmod-157814.html#specs

Be sure to find one in IT mode.

With AliExpress, watch the sellers. I've gotten really good hardware and really bad hardware from AliExpress. The link I sent is from a known seller who handles their products well.

Link to comment
