Posts posted by FTW
-
-
thanks a lot for your help JorgeB
-
nope, failed again 😪
-
okay, let me reboot and try again
if it fails, to destroy the pool I need to decrypt first, and then
zpool destroy cache
??
-
root@Tower:~# cryptsetup luksOpen /dev/nvme0n1p1 nvme0n1p1 --allow-discards
Enter passphrase for /dev/nvme0n1p1:
root@Tower:~# zpool import
   pool: cache
     id: 13452574719722492999
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:
        cache        ONLINE
          nvme0n1p1  ONLINE
root@Tower:~# zpool import -F cache
it works, but it gets stuck after the import -F cache...
syslog:
Feb 22 06:26:35 Tower kernel: WARNING: Pool 'cache' has encountered an uncorrectable I/O failure and has been suspended.
Feb 22 06:26:35 Tower kernel:
-
so what would be my best option to format my cache pool? Start the Array with 'No Device' in my cache pool? or is there a special way to format it?
-
here is the syslog, and I do understand now why you don't recommend using encryption uhuh 😪😅
-
-
okay, but starting my array makes it get stuck at mounting my Cache Pools because of the suspend
my whole array is still offline, and at the moment in the terminal, zpool import -F cache seems to get stuck: no answer and nothing in the logs
-
root@Tower:~# zpool import
no pools available to import
root@Tower:~# zpool import -F cache
cannot import 'cache': no such pool available
root@Tower:~# zpool import -D cache
cannot import 'cache': no such pool available
root@Tower:~# zpool destroy -f Cache
cannot open 'Cache': no such pool
root@Tower:~# zpool destroy -f cache
cannot open 'cache': no such pool
that's before I click on Array Start
-
damn okk
so what's the correct command to try with -F??
zpool import -F my-nvme-id. ?
and to destroy?
zpool import -D my-nvme-id or /mnt/cache?
-
same thing Before and After
-
here are my diagnostics from before starting the Array
the diagnostics collection after trying to start the Array seems to get stuck at
/usr/sbin/zpool status 2>/dev/null|todos >>'/tower-diagnostics-20240221-0302/system/zfs-info.txt'
so I am unable to get better diagnostics after starting the Array
I'm also unable to attach my syslog.txt... the upload failed, but every drive is mounted successfully except the Cache drive
plus this:
Feb 21 02:29:20 Tower ntpd[1818]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
which I don't know how to fix....
zpool import output = no pools available to import, both before and after starting the Array
to summarize: the Array fails to start because it gets stuck mounting the Cache (NVMe), which is suspended
Feb 21 02:58:43 Tower kernel: WARNING: Pool 'cache' has encountered an uncorrectable I/O failure and has been suspended.
oh, and my cache pool is ZFS
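
For anyone hitting the same thing, a quick way to check whether the kernel suspended a pool is to pull the pool name out of the syslog. This is only a minimal sketch: `suspended_pools` is a hypothetical helper name, and it assumes the kernel message has exactly the wording quoted in this thread.

```shell
# Hypothetical helper (not part of Unraid or ZFS): print the name of every
# pool the kernel reported as suspended in the given syslog file.
# Assumption: the message wording matches the lines quoted in this thread.
suspended_pools() {
    # $1: path to a syslog file
    sed -n "s|.*WARNING: Pool '\([^']*\)' has encountered an uncorrectable I/O failure and has been suspended.*|\1|p" "$1" | sort -u
}
```

Usage would be something like `suspended_pools /var/log/syslog` (the path is an assumption; point it at wherever your syslog is saved). It only reads the log and does not touch the pool.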
-
12 hours ago, JorgeB said:
Post the output of:
zpool import
it said this after I started the Array
no pools available to import
-
yes I did, and it was stable (it seems to be a temperature or CPU issue, from what I found on Google)
but now, I got a bigger issue
unraid Pool 'cache' has encountered an uncorrectable I/O failure and has been suspended.
unable to mount my cache pool after a crash...
- switched the NVMe slot, same issue
the NVMe is brand new, less than 1 month old
what should I do to fix this?
I cannot scrub since the Array doesn't want to start; it gets stuck at the cache pool. all the other HDDs are mounted
I even tried downgrading the OS version (6.12.4 has been stable on my other server), since I upgraded yesterday to the newer one (6.12.8). Unsuccessful
Feb 20 03:21:17 Tower emhttpd: mounting /mnt/cache
Feb 20 03:21:17 Tower emhttpd: shcmd (256): mkdir -p /mnt/cache
Feb 20 03:21:17 Tower emhttpd: /usr/sbin/zpool import -d /dev/mapper/nvme0n1p1 2>&1
Feb 20 03:21:17 Tower emhttpd: pool: cache
Feb 20 03:21:17 Tower emhttpd: id: 13452574719722492999
Feb 20 03:21:17 Tower emhttpd: shcmd (257): /usr/sbin/zpool import -N -o autoexpand=on -d /dev/mapper/nvme0n1p1 13452574719722492999 cache
Feb 20 03:21:17 Tower kernel: WARNING: Pool 'cache' has encountered an uncorrectable I/O failure and has been suspended.
Feb 20 03:21:17 Tower kernel:
Feb 20 03:23:22 Tower shutdown[26121]: shutting down for system halt
Feb 20 03:23:22 Tower init: Switching to runlevel: 0
Feb 20 03:23:22 Tower init: Trying to re-exec init
Feb 20 03:23:28 Tower monitor: Stop running nchan processes
Feb 20 03:23:42 Tower ntpd[1803]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
is it better to format the NVMe in the cache pool? if yes, how do I do this?
I have only 1 NVMe in the cache pool at the moment; I didn't set it up as RAID yet, I was testing the new Gen 5 NVMe from Crucial...
thanks
if needed, the diagnostics and the whole syslog can be posted later after work
-
hello, after a few tests, it isn't the RAM
I attached a new diagnostics file and syslog
-> I'm using the brand new Intel i9 14900K CPU and, thanks to Google, found out that it could be the cause of all those kernel panics.... it seems to be an E-core or temperature issue, even with water cooling. I'm running with the E-cores disabled and it seems stable since then, no crashes, but there are still some kernel issues based on the logs...
if anyone can look at it and has a clue about my issue, please!
Thanks a lot for your help
tower-syslog-20240214-0808.zip tower-diagnostics-20240214-0306.zip
-
tried that too lol
found a culprit -> the nvidia plugin and the script for the keylase patch. I reinstalled all dockers and all plugins and stopped running the keylase patch -> it ran smoothly for a few days and then started crashing again
but I found out yesterday that it may be my VM. I stopped running it and it hasn't crashed since.
the VMs are the only thing I haven't reinstalled yet
so, given the BTRFS error code from the kernel and my VMs living on the cache NVMe SSD, could it really be my NVMe SSD that is failing, or could it be something else??
Am I wrong, or could it also be the RAM?
because both
MemTest = PASSED
and
SMART extended test = Completed without error
SMART error log:
Error Information (NVMe Log 0x01, 16 of 63 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc  LBA  NSID  VS  Message
  0        452     0  0xe002  0x4004  0x004    0     1   -  Invalid Field in Command
  1        451     0  0x0014  0x4004  0x028    0     0   -  Invalid Field in Command
thanks for taking the time to read this and help me, JorgeB
-
I forgot to mention that
- memtest done and PASSED
- BIOS -> I disabled the memory auto boost, the XMP boost, and whatever else in the BIOS can overclock the RAM
-
Started happening after upgrading my motherboard/CPU/RAM
UnRaid version: 6.12.6
- at first, I thought it was a plugin, so I removed and reinstalled all plugins and dockers -> it stopped crashing as soon as I start the array, but it still crashes (kernel panic) once or twice a day...
- second: lowered my new CPU's performance, disabled the DDR5 boost, and increased my fan speed at lower temps (maybe a temperature issue) -> crashes less, but still crashes after a few days
- third: I added a second NVMe to my cache pool to use raid1, did a btrfs scrub and performed a full balance (I saw some btrfs errors in the kernel panic message when it crashed) -> was hoping it would fix some issues, and I hope it isn't my cache1 NVMe that failed...
- replaced my USB key with a brand new one -> still crashes after 1-2 days
so now I need some help reading what the kernel is saying lol
here is an older diagnostics file (because I am unable to get a new one, the browser seems to freeze after some time) and my syslog
I will post a new diagnostics file as soon as I am able to get one
Thank you for your help
Best Regards
syslog-192.168.1.253.log tower-diagnostics-20240201-1753.zip
-
Hello, same issue for me with the newer version of this plugin on UnRaid 6.11.5
no files have been added; I also tried every hashing method
No Files Added (it runs when we do Build, and after a few minutes it says 0 files added)
No Export files (files at 0; we can click on Export and it runs, but nothing is exported)...
it seems to just scan all the files without hashing them
-
ok I will do it thanks
-
help please... I would like to fix that MCE error
sometimes my server crashes for an unknown reason... probably because of that
Thanks
-
-
Hi, I need some help understanding what my issue is with the Machine Check Events reported by the Fix Common Problems plugin
using Unraid 6.11.5
Thank you
Best Regards
-
My Flash Drive was read-only (don't know how that happened)
I downloaded my backup file and reinstalled it on my Flash Drive, and that fixed the issue
seems my Flash Drive was corrupted, maybe
kernel panic -> server crash every day or 2 days
in General Support
Posted
ok thanks a lot!!
and if that happens to me again, does that mean the NVMe is dead/faulty and needs replacement, or is it because the server crashed during a write to the cache pool and it should be fixed with a backup?