Airwu

Members
  • Posts

    32
Community Answers

  1. I ran memtest last night for about 12 hours and found no errors.
  2. I use memory without ECC, so maybe my setup is not suitable for ZFS.
  3. Maybe this is ZFS filesystem corruption or a ZFS bug. I put my NVMe disk into another computer running the latest Ubuntu, then used zpool to open it, and the system crashed again. I couldn't read my data. I found a GitHub issue about this crash, https://github.com/openzfs/zfs/issues/13483 , where someone hit the same crash as me, and I used this ZFS value to open the disk and read my data. Now I am changing all my disks from ZFS to btrfs.
  4. When I disable disk auto-start, I can log in to the web page and run diagnostics: tower-diagnostics-20240217-2204.zip
  5. My Unraid server has recently become unstable, and I can no longer access the system.

     Timeline:
     1 I bought the server about a year ago, and it ran Unraid without any issues until last month.
     2 About a month ago, the server began rebooting randomly (approximately every 3 days).
     3 About a week ago, several CPU cores randomly went to 100% usage (also approximately every 3 days). When this happened, the web page became inaccessible. SSH login still worked, but the system did not respond to the reboot command and could only be restarted with reboot -nf.
     4 Starting today, several cores go to 100% usage immediately after a restart.

     Attempted Solutions:
     1 Passed the memtest.
     2 The extended SMART disk test passed.
     3 Errors previously indicated by "zpool status -v" have been fixed.
     4 After executing the diagnostics command, there was no response.
     5 I can't access my data.
     6 I found this message in syslog:

     Feb 17 21:19:49 Tower root: Starting Nginx server daemon...
     Feb 17 21:19:49 Tower kernel: mdcmd (31): set md_num_stripes 1280
     Feb 17 21:19:49 Tower kernel: mdcmd (32): set md_queue_limit 80
     Feb 17 21:19:49 Tower kernel: mdcmd (33): set md_sync_limit 5
     Feb 17 21:19:49 Tower kernel: mdcmd (34): set md_write_method 1
     Feb 17 21:19:49 Tower kernel: mdcmd (35): start STOPPED
     Feb 17 21:19:49 Tower kernel: unraid: allocating 15750K for 1280 stripes (3 disks)
     Feb 17 21:19:49 Tower kernel: md1p1: running, size: 3907018532 blocks
     Feb 17 21:19:49 Tower emhttpd: shcmd (27): udevadm settle
     Feb 17 21:19:49 Tower emhttpd: Opening encrypted volumes...
     Feb 17 21:19:49 Tower emhttpd: shcmd (28): touch /boot/config/forcesync
     Feb 17 21:19:49 Tower emhttpd: Mounting disks...
     Feb 17 21:19:49 Tower emhttpd: mounting /mnt/disk1
     Feb 17 21:19:49 Tower emhttpd: shcmd (29): mkdir -p /mnt/disk1
     Feb 17 21:19:49 Tower emhttpd: /usr/sbin/zpool import -f -d /dev/md1p1 2>&1
     Feb 17 21:19:52 Tower emhttpd: pool: disk1
     Feb 17 21:19:52 Tower emhttpd: id: 9902428395924024116
     Feb 17 21:19:52 Tower emhttpd: shcmd (30): /usr/sbin/zpool import -f -N -o autoexpand=on -d /dev/md1p1 9902428395924024116 disk1
     Feb 17 21:19:58 Tower emhttpd: shcmd (31): /usr/sbin/zpool online -e disk1 /dev/md1p1
     Feb 17 21:19:58 Tower rsyslogd: action 'action-3-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2102.0 try https://www.rsyslog.com/e/2359 ]
     Feb 17 21:19:59 Tower emhttpd: /usr/sbin/zpool status -PL disk1 2>&1
     Feb 17 21:19:59 Tower emhttpd: pool: disk1
     Feb 17 21:19:59 Tower emhttpd: state: ONLINE
     Feb 17 21:19:59 Tower emhttpd: scan: scrub repaired 0B in 03:19:55 with 0 errors on Thu Feb 15 01:19:56 2024
     Feb 17 21:19:59 Tower emhttpd: config:
     Feb 17 21:19:59 Tower emhttpd: NAME STATE READ WRITE CKSUM
     Feb 17 21:19:59 Tower emhttpd: disk1 ONLINE 0 0 0
     Feb 17 21:19:59 Tower emhttpd: /dev/md1p1 ONLINE 0 0 0
     Feb 17 21:19:59 Tower emhttpd: errors: No known data errors
     Feb 17 21:19:59 Tower emhttpd: shcmd (32): /usr/sbin/zfs set mountpoint=/mnt/disk1 disk1
     Feb 17 21:19:59 Tower emhttpd: shcmd (33): /usr/sbin/zfs set atime=off disk1
     Feb 17 21:19:59 Tower emhttpd: shcmd (34): /usr/sbin/zfs mount disk1
     Feb 17 21:19:59 Tower emhttpd: shcmd (35): /usr/sbin/zpool set autotrim=off disk1
     Feb 17 21:19:59 Tower emhttpd: shcmd (36): /usr/sbin/zfs set compression=off disk1
     Feb 17 21:20:00 Tower emhttpd: mounting /mnt/nvme
     Feb 17 21:20:00 Tower emhttpd: shcmd (37): mkdir -p /mnt/nvme
     Feb 17 21:20:00 Tower emhttpd: shcmd (38): /usr/sbin/zpool import -f -N -o autoexpand=on -d /dev/nvme1n1p1 -d /dev/nvme0n1p1 7424498333111026621 nvme
     Feb 17 21:20:00 Tower kernel: VERIFY3(rs_get_end(rs, rt) >= end) failed (115970260992 >= 58546911126228992)
     Feb 17 21:20:00 Tower kernel: PANIC at range_tree.c:482:range_tree_remove_impl()
     Feb 17 21:20:00 Tower kernel: Showing stack for process 9822
     Feb 17 21:20:00 Tower kernel: CPU: 8 PID: 9822 Comm: metaslab_group_ Tainted: P O 6.1.74-Unraid #1
     Feb 17 21:20:00 Tower kernel: Hardware name: Default string Default string/MS-Terminator B660M, BIOS H3.41G 04/29/2022
     Feb 17 21:20:00 Tower kernel: Call Trace:
     Feb 17 21:20:00 Tower kernel: <TASK>
     Feb 17 21:20:00 Tower kernel: dump_stack_lvl+0x44/0x5c
     Feb 17 21:20:00 Tower kernel: spl_panic+0xd0/0xe8 [spl]
     Feb 17 21:20:00 Tower kernel: ? bt_grow_leaf+0xc3/0xd6 [zfs]
     Feb 17 21:20:00 Tower kernel: ? zfs_btree_find_in_buf+0x4c/0x94 [zfs]
     Feb 17 21:20:00 Tower kernel: ? zfs_btree_find+0x16d/0x1b0 [zfs]
     Feb 17 21:20:00 Tower kernel: range_tree_remove_impl+0x1ea/0x406 [zfs]
     Feb 17 21:20:00 Tower kernel: ? zio_wait+0x1ee/0x1fd [zfs]
     Feb 17 21:20:00 Tower kernel: space_map_load_callback+0x70/0x79 [zfs]
     Feb 17 21:20:00 Tower kernel: space_map_iterate+0x2d3/0x324 [zfs]
     Feb 17 21:20:00 Tower kernel: ? spa_stats_destroy+0x16c/0x16c [zfs]
     Feb 17 21:20:00 Tower kernel: space_map_load_length+0x93/0xcb [zfs]
     Feb 17 21:20:00 Tower kernel: metaslab_load+0x33b/0x6e3 [zfs]
     Feb 17 21:20:00 Tower kernel: ? _raw_spin_unlock_irqrestore+0x24/0x3a
     Feb 17 21:20:00 Tower kernel: ? __wake_up_common_lock+0x88/0xbb
     Feb 17 21:20:00 Tower kernel: metaslab_preload+0x4c/0x97 [zfs]
     Feb 17 21:20:00 Tower kernel: taskq_thread+0x266/0x38a [spl]
     Feb 17 21:20:00 Tower kernel: ? wake_up_q+0x44/0x44
     Feb 17 21:20:00 Tower kernel: ? taskq_dispatch_delay+0x106/0x106 [spl]
     Feb 17 21:20:00 Tower kernel: kthread+0xe4/0xef
     Feb 17 21:20:00 Tower kernel: ? kthread_complete_and_exit+0x1b/0x1b
     Feb 17 21:20:00 Tower kernel: ret_from_fork+0x1f/0x30
     Feb 17 21:20:00 Tower kernel: </TASK>
     Feb 17 21:20:02 Tower SysDrivers: SysDrivers Build Complete
  6. After I deleted /mnt/disk1/backup/ios/00008120-001135C021EB401E/7c/7cf081b7fe531b449dc5827f985bdddf11cd996a , zpool still shows: errors: Permanent errors have been detected in the following files: disk1/backup:<0x22dc2> 😂
  7. zpool status -v showed 3 errors; after I ran a scrub, 1 error remains: errors: Permanent errors have been detected in the following files: /mnt/disk1/backup/ios/00008120-001135C021EB401E/7c/7cf081b7fe531b449dc5827f985bdddf11cd996a This file can be deleted, so I'll try deleting it. When I get home, I'll try memtest.
  8. Last check completed on Friday, 2024-02-09, 19:14 (today), with no errors. How can I check the filesystem?
  9. BTW, I can't log in to the web management page, but I can log in over SSH. Is there any way I can run diagnostics over SSH?
  10. I'm away and won't be home for 7 days, so I can't connect to my server. I'll post an update in 7 days. Thank you.
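
For anyone facing a long syslog like the one quoted in post 5, here is a rough Python sketch that picks out the ZFS panic and the pool that was being imported just before it. The helper name `find_zfs_panics` is hypothetical, and the patterns are assumptions modeled only on the lines in that excerpt; other OpenZFS or Unraid versions may word these messages differently. Linking a panic to the most recent import command is a heuristic, not something the log states.

```python
import re

# Patterns modeled only on the syslog lines quoted in post 5 (assumption:
# other OpenZFS versions may phrase these messages differently).
VERIFY_RE = re.compile(r"kernel: (VERIFY\d*\(.*\) failed .*)$")
PANIC_RE = re.compile(r"kernel: PANIC at ([\w.]+):(\d+):(\w+)\(\)")
# Matches "zpool import ... <numeric pool id> <pool name>" as emhttpd logged it.
IMPORT_RE = re.compile(r"zpool import .*\s(\d+) (\S+)$")


def find_zfs_panics(lines):
    """Hypothetical helper: return one dict per PANIC line found in `lines`."""
    panics = []
    last_pool = None    # pool named by the most recent import command
    last_verify = None  # most recent failed VERIFY assertion
    for raw in lines:
        line = raw.rstrip()
        m = IMPORT_RE.search(line)
        if m:
            last_pool = m.group(2)
            continue
        m = VERIFY_RE.search(line)
        if m:
            last_verify = m.group(1)
            continue
        m = PANIC_RE.search(line)
        if m:
            panics.append({
                "pool": last_pool,  # heuristic: last import before the panic
                "assertion": last_verify,
                "file": m.group(1),
                "line": int(m.group(2)),
                "function": m.group(3),
            })
            last_verify = None
    return panics
```

Passing it the lines of a saved copy of the syslog (for example an open file object) returns a list of dicts. For the excerpt in post 5 it should report the `nvme` pool and `range_tree.c` line 482, which matches where the import of that pool failed while `disk1` mounted normally.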