mikamap

Members
  • Posts

    27
Everything posted by mikamap

  1. Yes, I have 4 sticks. Since they are dual channel, I could test each pair individually. The first pair failed almost instantly in Memtest. The second pair I let run for 48 hours and everything was fine. So at the moment I run my unraid with only 16 GB of RAM and I'm trying to RMA the other pair. At least my system is up for now.
  2. Today I'm having trouble with my docker files. I cannot update certain containers. In the Fix Common Problems plugin, I have "Unable to write to Docker Image" - "Docker Image either full or corrupted". The docker image was not full. I tried giving it more space, but the same problem persisted. I read on this forum that it could be caused by RAM, so I ran a memtest and the test failed within 15 minutes. I guess bad RAM can cause crashes and instability. I will keep the system down until I buy new RAM. I hope my dockers will be OK after that.
  3. I just disabled C-States globally and set "Power Supply Idle Control" to "Typical Current Idle". Sorry for the delay: I had to install a GPU, and it took more time than I'd like to admit to figure out that the second PCI slot was disabled because I use 2 NVMe drives. I will report back if I have the same problem in the next 30 or 40 days. Thank you
  4. Hi. I seem to have a recurring problem with my unraid setup. Every month or so, unraid becomes completely unresponsive. I cannot log into the web UI, the dockers are not accessible, and I cannot connect with SSH. I do not have a graphics card, so I cannot hook up a monitor and keyboard to check the logs. Recently, I activated the option that saves the syslog to flash for troubleshooting. Today, I powered off my unraid server. I restarted everything before noon and everything was working great. At about 17:00, unraid became unresponsive again. Attached are my diagnostics files. At the end of syslog-previous you can see what seems to happen at 16:54: a USB device disconnecting (the USB connection of my UPS), and then a general protection fault.

     Feb 5 16:54:41 Tower kernel: usb 3-4: USB disconnect, device number 3
     Feb 5 16:54:42 Tower kernel: usb 3-4: new full-speed USB device number 4 using xhci_hcd
     Feb 5 16:54:42 Tower kernel: hid-generic 0003:0764:0501.0002: hiddev96,hidraw0: USB HID v1.10 Device [CPS CP1000AVRLCDa] on usb-0000:0c:00.3-4/input0
     Feb 5 16:54:43 Tower kernel: usb 3-4: USB disconnect, device number 4
     Feb 5 16:54:44 Tower kernel: usb 3-4: new full-speed USB device number 5 using xhci_hcd
     Feb 5 16:54:44 Tower kernel: hid-generic 0003:0764:0501.0003: hiddev96,hidraw0: USB HID v1.10 Device [CPS CP1000AVRLCDa] on usb-0000:0c:00.3-4/input0
     Feb 5 16:54:48 Tower apcupsd[6577]: Communications with UPS restored.
     Feb 5 16:54:48 Tower sSMTP[2223]: Creating SSL connection to host
     Feb 5 16:54:48 Tower sSMTP[2223]: SSL connection using TLS_AES_256_GCM_SHA384
     Feb 5 16:54:50 Tower sSMTP[2223]: Sent mail for [email protected] (221 2.0.0 closing connection s18-20020a05622a019200b0042a8a626e3esm330044qtw.53 - gsmtp) uid=0 username=xxx outbytes=652
     Feb 5 16:57:27 Tower kernel: general protection fault, probably for non-canonical address 0xffff088157418fb8: 0000 [#1] PREEMPT SMP NOPTI
     Feb 5 16:57:27 Tower kernel: CPU: 12 PID: 216 Comm: kswapd0 Tainted: P O 6.1.64-Unraid #1
     Feb 5 16:57:27 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Steel Legend, BIOS P2.90 09/11/2019
     Feb 5 16:57:27 Tower kernel: RIP: 0010:remove_extent_mapping+0x3b/0x6e
     Feb 5 16:57:27 Tower kernel: Code: 0f 0b 48 89 fe 48 89 df e8 cc f9 ff ff 48 8b 43 68 a8 08 75 2a 48 8b 8b 80 00 00 00 48 8d 83 80 00 00 00 48 8b 93 88 00 00 00 <48> 89 51 08 48 89 0a 48 89 83 80 00 00 00 48 89 83 88 00 00 00 48
     Feb 5 16:57:27 Tower kernel: RSP: 0018:ffffc900009cfa28 EFLAGS: 00010246
     Feb 5 16:57:27 Tower kernel: RAX: ffff888157418fb0 RBX: ffff888157418f30 RCX: ffff088157418fb0
     Feb 5 16:57:27 Tower kernel: RDX: ffff888157418fb0 RSI: ffff88846af9bbd0 RDI: ffff88846af9b900
     Feb 5 16:57:27 Tower kernel: RBP: ffff888404e9c028 R08: ffff888404e9be40 R09: 0000000000000000
     Feb 5 16:57:27 Tower kernel: R10: 0000000000000402 R11: ffff8884180bf478 R12: 0000000000000cc0
     Feb 5 16:57:27 Tower kernel: R13: 000000000021a378 R14: ffff888157418f30 R15: 00000000019c3000
     Feb 5 16:57:27 Tower kernel: FS: 0000000000000000(0000) GS:ffff8887eeb00000(0000) knlGS:0000000000000000
     Feb 5 16:57:27 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 5 16:57:27 Tower kernel: CR2: 000014a790000010 CR3: 0000000418562000 CR4: 00000000003506e0
     Feb 5 16:57:27 Tower kernel: Call Trace:
     Feb 5 16:57:27 Tower kernel: <TASK>
     Feb 5 16:57:27 Tower kernel: ? __die_body+0x1a/0x5c
     Feb 5 16:57:27 Tower kernel: ? die_addr+0x38/0x51
     Feb 5 16:57:27 Tower kernel: ? exc_general_protection+0x30f/0x345
     Feb 5 16:57:27 Tower kernel: ? asm_exc_general_protection+0x22/0x30
     Feb 5 16:57:27 Tower kernel: ? remove_extent_mapping+0x3b/0x6e
     Feb 5 16:57:27 Tower kernel: ? remove_extent_mapping+0x1e/0x6e
     Feb 5 16:57:27 Tower kernel: try_release_extent_mapping+0x12e/0x20f
     Feb 5 16:57:27 Tower kernel: __btrfs_release_folio+0xf/0x31
     Feb 5 16:57:27 Tower kernel: shrink_folio_list+0x7ab/0x993
     Feb 5 16:57:27 Tower kernel: ? cgroup_rstat_updated+0x21/0xa5
     Feb 5 16:57:27 Tower kernel: shrink_lruvec+0x61a/0x9b5
     Feb 5 16:57:27 Tower kernel: shrink_node+0x301/0x549
     Feb 5 16:57:27 Tower kernel: balance_pgdat+0x4e9/0x6a2
     Feb 5 16:57:27 Tower kernel: ? _raw_spin_unlock+0x14/0x29
     Feb 5 16:57:27 Tower kernel: ? raw_spin_rq_unlock_irq+0x5/0x10
     Feb 5 16:57:27 Tower kernel: ? finish_task_switch.isra.0+0x140/0x218
     Feb 5 16:57:27 Tower kernel: kswapd+0x2f0/0x333
     Feb 5 16:57:27 Tower kernel: ? _raw_spin_rq_lock_irqsave+0x20/0x20
     Feb 5 16:57:27 Tower kernel: ? balance_pgdat+0x6a2/0x6a2
     Feb 5 16:57:27 Tower kernel: kthread+0xe7/0xef
     Feb 5 16:57:27 Tower kernel: ? kthread_complete_and_exit+0x1b/0x1b
     Feb 5 16:57:27 Tower kernel: ret_from_fork+0x22/0x30
     Feb 5 16:57:27 Tower kernel: </TASK>
     Feb 5 16:57:27 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp veth ipvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls edac_mce_amd edac_core intel_rapl_msr intel_rapl_common iosf_mbi kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel mvsas nvme crypto_simd i2c_piix4 cryptd ch341 wmi_bmof libsas r8168(O) rapl i2c_core usbserial k10temp ccp scsi_transport_sas nvme_core ahci libahci wmi button acpi_cpufreq unix
     Feb 5 16:57:27 Tower kernel: ---[ end trace 0000000000000000 ]---
     Feb 5 16:57:27 Tower kernel: RIP: 0010:remove_extent_mapping+0x3b/0x6e
     Feb 5 16:57:27 Tower kernel: Code: 0f 0b 48 89 fe 48 89 df e8 cc f9 ff ff 48 8b 43 68 a8 08 75 2a 48 8b 8b 80 00 00 00 48 8d 83 80 00 00 00 48 8b 93 88 00 00 00 <48> 89 51 08 48 89 0a 48 89 83 80 00 00 00 48 89 83 88 00 00 00 48
     Feb 5 16:57:27 Tower kernel: RSP: 0018:ffffc900009cfa28 EFLAGS: 00010246
     Feb 5 16:57:27 Tower kernel: RAX: ffff888157418fb0 RBX: ffff888157418f30 RCX: ffff088157418fb0
     Feb 5 16:57:27 Tower kernel: RDX: ffff888157418fb0 RSI: ffff88846af9bbd0 RDI: ffff88846af9b900
     Feb 5 16:57:27 Tower kernel: RBP: ffff888404e9c028 R08: ffff888404e9be40 R09: 0000000000000000
     Feb 5 16:57:27 Tower kernel: R10: 0000000000000402 R11: ffff8884180bf478 R12: 0000000000000cc0
     Feb 5 16:57:27 Tower kernel: R13: 000000000021a378 R14: ffff888157418f30 R15: 00000000019c3000
     Feb 5 16:57:27 Tower kernel: FS: 0000000000000000(0000) GS:ffff8887eeb00000(0000) knlGS:0000000000000000
     Feb 5 16:57:27 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 5 16:57:27 Tower kernel: CR2: 000014a790000010 CR3: 0000000418562000 CR4: 00000000003506e0
     Feb 5 16:57:27 Tower kernel: note: kswapd0[216] exited with preempt_count 1
     Feb 5 16:57:27 Tower kernel: ------------[ cut here ]------------
     Feb 5 16:57:27 Tower kernel: WARNING: CPU: 12 PID: 216 at kernel/exit.c:814 do_exit+0x87/0x923
     Feb 5 16:57:27 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp veth ipvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls edac_mce_amd edac_core intel_rapl_msr intel_rapl_common iosf_mbi kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel mvsas nvme crypto_simd i2c_piix4 cryptd ch341 wmi_bmof libsas r8168(O) rapl i2c_core usbserial k10temp ccp scsi_transport_sas nvme_core ahci libahci wmi button acpi_cpufreq unix
     Feb 5 16:57:27 Tower kernel: CPU: 12 PID: 216 Comm: kswapd0 Tainted: P D O 6.1.64-Unraid #1
     Feb 5 16:57:27 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Steel Legend, BIOS P2.90 09/11/2019
     Feb 5 16:57:27 Tower kernel: RIP: 0010:do_exit+0x87/0x923
     Feb 5 16:57:27 Tower kernel: Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 76 dd 80 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 78 dc 80 00 48 8b 83 d0 06 00 00 83
     Feb 5 16:57:27 Tower kernel: RSP: 0018:ffffc900009cfee0 EFLAGS: 00010286
     Feb 5 16:57:27 Tower kernel: RAX: 0000000080000000 RBX: ffff888102d01f80 RCX: 0000000080000000
     Feb 5 16:57:27 Tower kernel: RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff
     Feb 5 16:57:27 Tower kernel: RBP: 000000000000000b R08: 0000000000000000 R09: 0720072007200720
     Feb 5 16:57:27 Tower kernel: R10: 0720072007200720 R11: 0720072007200720 R12: ffff888102d10800
     Feb 5 16:57:27 Tower kernel: R13: ffff888102d09080 R14: 0000000000000000 R15: 0000000000000000
     Feb 5 16:57:27 Tower kernel: FS: 0000000000000000(0000) GS:ffff8887eeb00000(0000) knlGS:0000000000000000
     Feb 5 16:57:27 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 5 16:57:27 Tower kernel: CR2: 000014a790000010 CR3: 0000000418562000 CR4: 00000000003506e0
     Feb 5 16:57:27 Tower kernel: Call Trace:
     Feb 5 16:57:27 Tower kernel: <TASK>
     Feb 5 16:57:27 Tower kernel: ? __warn+0xab/0x122
     Feb 5 16:57:27 Tower kernel: ? report_bug+0x109/0x17e
     Feb 5 16:57:27 Tower kernel: ? do_exit+0x87/0x923
     Feb 5 16:57:27 Tower kernel: ? handle_bug+0x41/0x6f
     Feb 5 16:57:27 Tower kernel: ? exc_invalid_op+0x13/0x60
     Feb 5 16:57:27 Tower kernel: ? asm_exc_invalid_op+0x16/0x20
     Feb 5 16:57:27 Tower kernel: ? do_exit+0x87/0x923
     Feb 5 16:57:27 Tower kernel: make_task_dead+0x11c/0x11c
     Feb 5 16:57:27 Tower kernel: rewind_stack_and_make_dead+0x17/0x17
     Feb 5 16:57:27 Tower kernel: RIP: 0000:0x0
     Feb 5 16:57:27 Tower kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
     Feb 5 16:57:27 Tower kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
     Feb 5 16:57:27 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
     Feb 5 16:57:27 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
     Feb 5 16:57:27 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
     Feb 5 16:57:27 Tower kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
     Feb 5 16:57:27 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     Feb 5 16:57:27 Tower kernel: </TASK>
     Feb 5 16:57:27 Tower kernel: ---[ end trace 0000000000000000 ]---

     I'm not sure if the problem really is with the USB connection to my UPS. Hopefully someone smarter than me can confirm with the logs. I don't absolutely need my UPS connected over USB, so I will disconnect it for now. It was just a quick way to know when the power at my house went off. Also, in the "Fix Common Problems" settings, the only warnings I have are "Marvel Hard Drive Controller Installed" and "Syslog mirrored to flash". Thank you for your time.

     anotower-diagnostics-20240205-1853.zip
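
     One detail in the register dump above may be worth a look: RAX and RCX are almost identical. Here is a minimal Python sketch that compares the two values copied from the log. The helper name differing_bits is made up for illustration and is not part of any tool.

```python
# Register values copied from the "general protection fault" dump in the post.
RAX = 0xFFFF888157418FB0         # has the same ffff888... prefix as the other pointers
RCX = 0xFFFF088157418FB0         # almost the same value as RAX
FAULT_ADDR = 0xFFFF088157418FB8  # "non-canonical address" reported by the kernel


def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


bits = differing_bits(RAX, RCX)
print(f"RAX ^ RCX               = {RAX ^ RCX:#018x}")
print(f"differing bit positions = {bits}")
print(f"fault address - RCX     = {FAULT_ADDR - RCX}")
```

     The two registers differ in exactly one bit (bit 47), and the reported fault address is RCX + 8. A one-bit difference in a pointer is the kind of pattern a memory error can produce, but it is only a hint and proves nothing by itself. A Memtest run is the usual way to check.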
  5. Also, frigate exposes endpoints to export video for a specific time range when you record 24/7, like: http://FRIGATE_IP:5000/api/CAMERA/start/1690529400/end/1690533000/clip.mp4 But /tmp/cache was never big enough to generate the videos. So in my manual command line, I removed the tmpfs mount and added a volume for /tmp/cache, and it now works fine. Thanks
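
     For anyone scripting this, here is a minimal Python sketch that builds a URL in the same shape as the one above. FRIGATE_IP and CAMERA are the placeholders from the post, the function name clip_url is made up for illustration, and it assumes the start and end values are Unix timestamps in seconds (the two numbers in the example URL are exactly 3600 apart, which fits a one-hour clip).

```python
from datetime import datetime, timedelta, timezone


def clip_url(base_url: str, camera: str, start: datetime, end: datetime) -> str:
    """Build a clip URL in the shape quoted in the post.

    start and end must be timezone-aware; they are converted to Unix seconds.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if end <= start:
        raise ValueError("end must be after start")
    start_ts = int(start.timestamp())
    end_ts = int(end.timestamp())
    return f"{base_url}/api/{camera}/start/{start_ts}/end/{end_ts}/clip.mp4"


# One hour starting 2023-07-28 07:30 UTC, which matches the timestamps in the post.
start = datetime(2023, 7, 28, 7, 30, tzinfo=timezone.utc)
url = clip_url("http://FRIGATE_IP:5000", "CAMERA", start, start + timedelta(hours=1))
print(url)
```

     Requiring timezone-aware datetimes avoids silently mixing local time and UTC when picking the window to export.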
  6. Maybe I'm doing something wrong, but in case it helps someone in the future, I'm posting this. I was trying to install frigate on my unraid, but the docker was always failing to start. Something about getting the path . I'm not using a Coral or a GPU or anything like that in my current setup. So I blanked all the values that were labelled "Remove this if your not using it". The docker still would not start, and there was an error like "Cannot find path ." Then I realized that the command line contained this: --device='' --device='' So I started the docker directly from the command line (not sure if it's safe) without these --device options, and it worked.
  7. Thanks, I'll do everything during the weekend
  8. Thank you. I just rebooted the server and here are the diagnostics. I don't know if it's important, but before the reboot the uptime was blank in the web interface, and I could not reboot or shut down from the command line. It was not working. I had to reboot it the hard way with the power button. Thank you tower-diagnostics-20230324-1308.zip
  9. Hi, this morning I noticed my dockers were not available. I logged in to unraid and saw that the dashboard is not available. The Main tab is, though. So I tried every tab, and Dashboard, Docker and Apps stay white and nothing appears. Shares are still working. I'm using version 6.9.2 at the moment. I did not reboot unraid yet, as I still have access to diagnostics and to ssh. I uploaded the diagnostics file (anonymised). What do you guys recommend? Should I just try to reboot the server? Thank you for your help tower-diagnostics-20230324-0706.zip
  10. Happy to tell you that after adding a new sata controller and moving all data disks to the controller, I could finish the parity sync without errors. Only the cache disk is on the onboard sata controller for the moment. @JorgeB Thank you so much for your help and support.
  11. Here are the diagnostics. Just canceled the sync / stopped the array / shut down the server. Will try to find one tonight and launch it again tonight or tomorrow. Thank you tower-diagnostics-20220222-1420.zip
  12. Not sure if everything will be ok after. Parity Sync is at 67% but there are a lot of errors. All those disks seem to be connected to the onboard sata. I'll order another one to replace it quickly
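
     The manual fix above (dropping the empty --device options before running the command) can be sketched in Python. This is only an illustration: the function name drop_empty_device_flags and the image name example/frigate:latest are hypothetical, and the real command generated by the template will be longer.

```python
import shlex


def drop_empty_device_flags(command: str) -> str:
    """Remove --device options whose value is empty from a docker command line."""
    args = shlex.split(command)
    kept = []
    skip_next = False
    for i, arg in enumerate(args):
        if skip_next:
            # This is the empty value that followed a bare "--device".
            skip_next = False
            continue
        if arg == "--device=":
            # Form: --device='' (the quotes vanish once the shell syntax is parsed).
            continue
        if arg == "--device" and i + 1 < len(args) and args[i + 1] == "":
            # Form: --device '' (option and empty value as two separate words).
            skip_next = True
            continue
        kept.append(arg)
    return shlex.join(kept)


# Hypothetical command line, shortened for the example.
original = "docker run -d --name='frigate' --device='' --device='' example/frigate:latest"
print(drop_empty_device_flags(original))
```

     Options with a real value, such as --device=/dev/dri, are left untouched.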
  13. Everything worked great. The parity-Sync / Data-rebuild started. Just need to wait a few hours now. I just saw that my VM didn't start and I get the message "Libvirt Service failed to start" on this tab, but I'll check back after the process is finished. Thank you.
  14. Thank you, I will try that. I really appreciate the quick response.
  15. Here are the new diagnostics. As I said in the first post, I tried to reassign the disabled drive, so I started the array without the disk. Now 3 of them show as "new drive". I also forgot to mention in the first post that I currently have 6 disks plugged into the motherboard SATA ports, and the 3 others plus the cache plugged into a sata controller. If it's a better solution to plug all of them into sata controllers, I'll purchase more. So the plan: - Update motherboard driver - Add sata controller for the drives. And since 3 are "new", from what I have read, I need to do a "New config", right? tower-diagnostics-20220222-0743.zip
  16. These are the diagnostics from when 2 drives were disabled because of errors and 1 was INVALID because the parity sync was not successful. tower-diagnostics-20220221-1646.zip
  17. Hi, On my unraid server, I currently have 2 parity drives (6TB and 4TB) and 7 disks (3TB or 4TB). Yesterday I was getting SMART errors from a disk in my array, so I ordered 2 new 4TB Seagate IronWolf drives. I replaced the drive with the new ones. Everything seemed to work fine. Then the parity sync was halted and 2 other drives were getting errors (1 parity and 1 data drive). The 2 drives with errors were disabled and the one I tried to replace was "Emulated". My mistake, I think, was to unassign the "disabled drive" and start the array, because now I cannot start it again since it is missing 3 disks. What I'm thinking: 1- Put the drive with the error back in the array (since the data should already be there) 2- Do a "New config", since the 3 drives should have the data on them. Edit: I also plan to change the PSU and all the SATA cables, as I have read that some of these errors (CRC) are caused by bad SATA cables or fluctuating power. Is there a good chance it will work? Should I first try to back up the data I have at the moment to an external drive before doing this? Thank you
  18. I have been using the photoprism docker for a few days now. Is it normal that when I update the app, I lose everything in the database? The app does not contain any photos and I need to import them again. Maybe I need to add a variable or a path or something. I may try to reinstall it tonight to see if the template has changed. Thank you
  19. No offense taken. I know the problem is all my fault. I'm really grateful for the help.
  20. Oh, I think I know what the problem was. When I made the docker, I hosted the image on my gitlab instance (hosted on unraid). But when I changed routers, my IP addresses changed, so the IP address of unraid also changed. From that point on, the image was not available. That was a few months ago, but when I updated to 6.8.1, it broke my docker list. I uploaded the xml for the application. You'll see 192.168.2.116, but it is not my unraid IP anymore. Thank you my-SinopeApiMQTT.xml
  21. One of them was a custom docker I made. It was hosting a Spring Boot Java application, something I made to practice for a job interview. The other one was a docker I installed from a custom repo. It was a Spring Boot Admin instance. I think the repo I installed it from was removed or made private, as I could not find it anymore. Bottom line, I'll stick with CA applications from now on
  22. Thank you. I don't need to do this, as deleting the 2 dockers without images did the trick for me. But as recommended in bonienl's post, I'll put the docker image on the cache (I did not have a cache when I first installed unraid and started using docker)
  23. No it's not. It's missing 2 images. I deleted the 2 dockers missing their images and now the docker list is super fast again. Thanks. They were dockers I had not used for a long time. I'll reinstall them if I need them in the future.
  24. Just to be sure, I won't lose any settings for the dockers? I don't want to reupload everything to crashplan after that. Thanks
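
     A quick way to spot this kind of stale address is to scan a template for hardcoded private IPs. Below is a minimal Python sketch under stated assumptions: the function name private_ips_in is made up, and the sample XML line (element name, port and image path) is hypothetical, since only the IP 192.168.2.116 comes from the post. The scan is a plain text search and does not rely on the template's actual structure.

```python
import ipaddress
import re

# Loose match for dotted quads; ipaddress does the real validation below.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def private_ips_in(text: str) -> list[str]:
    """Return the distinct private IPv4 addresses found in text, in order."""
    found = []
    for candidate in IPV4_PATTERN.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            # Looked like an IP but is not valid (for example 999.1.1.1).
            continue
        if addr.is_private and candidate not in found:
            found.append(candidate)
    return found


# Hypothetical template line; only the IP address is taken from the post.
sample = "<Repository>192.168.2.116:5005/sinope-api-mqtt</Repository>"
print(private_ips_in(sample))
```

     Any address it reports is worth comparing against the server's current IP after a router change.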