
mikamap

Posts posted by mikamap

  1. Yes, I have 4 sticks.  Since they are dual channel, I could test each pair individually.  The first pair failed instantly in Memtest.

    I let the second pair run for 48 hours and everything was fine.

     

    So at the moment, I run my unraid with only 16 GB of RAM and I'm trying to RMA the other pair.

     

    At least my system is up at the moment.

  2. Today, I'm having troubles with my docker files.  I cannot update certain containers.

     

    In the fix common problems plugin, I have "Unable to write to Docker Image" - "Docker Image either full or corrupted".

     

    The docker image was not full.  I tried giving it more space but the same problems persisted.

    I read on this forum it could be caused by RAM

     

    I just did a memtest and the test failed within 15 minutes.

     

    I guess bad RAM can cause crashes and instability.

    I will keep the system down until I buy new RAM.

     

    I hope my dockers will be ok after that.

  3. I just disabled C-State globally and set "Power Supply Idle Control" to "typical current idle".

     

    Sorry for the delay, I had to install a GPU and it took more time than I'd like to admit to figure out the second PCI slot was disabled because I use 2 NVMe drives :(

     

    I will report back if I have the same problem in the next 30 or 40 days.

     

    Thank you

  4. Hi.  I seem to have a recurring problem with my unraid setup.  Every month or so unraid becomes completely unresponsive.  I cannot log into the web UI, the Docker containers are not accessible and I cannot connect with SSH.

     

    I do not have a graphics card so I cannot hook up a monitor and keyboard to check logs.

     

    Recently, I activated the option that saves the syslog on flash for troubleshooting.

     

     

    Today, I powered off my unraid server.  I restarted everything before noon and everything was working great.  At about 17h00, unraid became unresponsive again.

     

    Attached are my diagnostics files.

     

    You can see at the end of syslog-previous what seems to happen at 16h54: the USB connection of my UPS disconnects.  And then there is a general protection fault at 16h57.

     

    Feb  5 16:54:41 Tower kernel: usb 3-4: USB disconnect, device number 3
    Feb  5 16:54:42 Tower kernel: usb 3-4: new full-speed USB device number 4 using xhci_hcd
    Feb  5 16:54:42 Tower kernel: hid-generic 0003:0764:0501.0002: hiddev96,hidraw0: USB HID v1.10 Device [CPS CP1000AVRLCDa] on usb-0000:0c:00.3-4/input0
    Feb  5 16:54:43 Tower kernel: usb 3-4: USB disconnect, device number 4
    Feb  5 16:54:44 Tower kernel: usb 3-4: new full-speed USB device number 5 using xhci_hcd
    Feb  5 16:54:44 Tower kernel: hid-generic 0003:0764:0501.0003: hiddev96,hidraw0: USB HID v1.10 Device [CPS CP1000AVRLCDa] on usb-0000:0c:00.3-4/input0
    Feb  5 16:54:48 Tower apcupsd[6577]: Communications with UPS restored.
    Feb  5 16:54:48 Tower sSMTP[2223]: Creating SSL connection to host
    Feb  5 16:54:48 Tower sSMTP[2223]: SSL connection using TLS_AES_256_GCM_SHA384
    Feb  5 16:54:50 Tower sSMTP[2223]: Sent mail for [email protected] (221 2.0.0 closing connection s18-20020a05622a019200b0042a8a626e3esm330044qtw.53 - gsmtp) uid=0 username=xxx outbytes=652
    Feb  5 16:57:27 Tower kernel: general protection fault, probably for non-canonical address 0xffff088157418fb8: 0000 [#1] PREEMPT SMP NOPTI
    Feb  5 16:57:27 Tower kernel: CPU: 12 PID: 216 Comm: kswapd0 Tainted: P           O       6.1.64-Unraid #1
    Feb  5 16:57:27 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Steel Legend, BIOS P2.90 09/11/2019
    Feb  5 16:57:27 Tower kernel: RIP: 0010:remove_extent_mapping+0x3b/0x6e
    Feb  5 16:57:27 Tower kernel: Code: 0f 0b 48 89 fe 48 89 df e8 cc f9 ff ff 48 8b 43 68 a8 08 75 2a 48 8b 8b 80 00 00 00 48 8d 83 80 00 00 00 48 8b 93 88 00 00 00 <48> 89 51 08 48 89 0a 48 89 83 80 00 00 00 48 89 83 88 00 00 00 48
    Feb  5 16:57:27 Tower kernel: RSP: 0018:ffffc900009cfa28 EFLAGS: 00010246
    Feb  5 16:57:27 Tower kernel: RAX: ffff888157418fb0 RBX: ffff888157418f30 RCX: ffff088157418fb0
    Feb  5 16:57:27 Tower kernel: RDX: ffff888157418fb0 RSI: ffff88846af9bbd0 RDI: ffff88846af9b900
    Feb  5 16:57:27 Tower kernel: RBP: ffff888404e9c028 R08: ffff888404e9be40 R09: 0000000000000000
    Feb  5 16:57:27 Tower kernel: R10: 0000000000000402 R11: ffff8884180bf478 R12: 0000000000000cc0
    Feb  5 16:57:27 Tower kernel: R13: 000000000021a378 R14: ffff888157418f30 R15: 00000000019c3000
    Feb  5 16:57:27 Tower kernel: FS:  0000000000000000(0000) GS:ffff8887eeb00000(0000) knlGS:0000000000000000
    Feb  5 16:57:27 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Feb  5 16:57:27 Tower kernel: CR2: 000014a790000010 CR3: 0000000418562000 CR4: 00000000003506e0
    Feb  5 16:57:27 Tower kernel: Call Trace:
    Feb  5 16:57:27 Tower kernel: <TASK>
    Feb  5 16:57:27 Tower kernel: ? __die_body+0x1a/0x5c
    Feb  5 16:57:27 Tower kernel: ? die_addr+0x38/0x51
    Feb  5 16:57:27 Tower kernel: ? exc_general_protection+0x30f/0x345
    Feb  5 16:57:27 Tower kernel: ? asm_exc_general_protection+0x22/0x30
    Feb  5 16:57:27 Tower kernel: ? remove_extent_mapping+0x3b/0x6e
    Feb  5 16:57:27 Tower kernel: ? remove_extent_mapping+0x1e/0x6e
    Feb  5 16:57:27 Tower kernel: try_release_extent_mapping+0x12e/0x20f
    Feb  5 16:57:27 Tower kernel: __btrfs_release_folio+0xf/0x31
    Feb  5 16:57:27 Tower kernel: shrink_folio_list+0x7ab/0x993
    Feb  5 16:57:27 Tower kernel: ? cgroup_rstat_updated+0x21/0xa5
    Feb  5 16:57:27 Tower kernel: shrink_lruvec+0x61a/0x9b5
    Feb  5 16:57:27 Tower kernel: shrink_node+0x301/0x549
    Feb  5 16:57:27 Tower kernel: balance_pgdat+0x4e9/0x6a2
    Feb  5 16:57:27 Tower kernel: ? _raw_spin_unlock+0x14/0x29
    Feb  5 16:57:27 Tower kernel: ? raw_spin_rq_unlock_irq+0x5/0x10
    Feb  5 16:57:27 Tower kernel: ? finish_task_switch.isra.0+0x140/0x218
    Feb  5 16:57:27 Tower kernel: kswapd+0x2f0/0x333
    Feb  5 16:57:27 Tower kernel: ? _raw_spin_rq_lock_irqsave+0x20/0x20
    Feb  5 16:57:27 Tower kernel: ? balance_pgdat+0x6a2/0x6a2
    Feb  5 16:57:27 Tower kernel: kthread+0xe7/0xef
    Feb  5 16:57:27 Tower kernel: ? kthread_complete_and_exit+0x1b/0x1b
    Feb  5 16:57:27 Tower kernel: ret_from_fork+0x22/0x30
    Feb  5 16:57:27 Tower kernel: </TASK>
    Feb  5 16:57:27 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp veth ipvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls edac_mce_amd edac_core intel_rapl_msr intel_rapl_common iosf_mbi kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel mvsas nvme crypto_simd i2c_piix4 cryptd ch341 wmi_bmof libsas r8168(O) rapl i2c_core usbserial k10temp ccp scsi_transport_sas nvme_core ahci libahci wmi button acpi_cpufreq unix
    Feb  5 16:57:27 Tower kernel: ---[ end trace 0000000000000000 ]---
    Feb  5 16:57:27 Tower kernel: RIP: 0010:remove_extent_mapping+0x3b/0x6e
    Feb  5 16:57:27 Tower kernel: Code: 0f 0b 48 89 fe 48 89 df e8 cc f9 ff ff 48 8b 43 68 a8 08 75 2a 48 8b 8b 80 00 00 00 48 8d 83 80 00 00 00 48 8b 93 88 00 00 00 <48> 89 51 08 48 89 0a 48 89 83 80 00 00 00 48 89 83 88 00 00 00 48
    Feb  5 16:57:27 Tower kernel: RSP: 0018:ffffc900009cfa28 EFLAGS: 00010246
    Feb  5 16:57:27 Tower kernel: RAX: ffff888157418fb0 RBX: ffff888157418f30 RCX: ffff088157418fb0
    Feb  5 16:57:27 Tower kernel: RDX: ffff888157418fb0 RSI: ffff88846af9bbd0 RDI: ffff88846af9b900
    Feb  5 16:57:27 Tower kernel: RBP: ffff888404e9c028 R08: ffff888404e9be40 R09: 0000000000000000
    Feb  5 16:57:27 Tower kernel: R10: 0000000000000402 R11: ffff8884180bf478 R12: 0000000000000cc0
    Feb  5 16:57:27 Tower kernel: R13: 000000000021a378 R14: ffff888157418f30 R15: 00000000019c3000
    Feb  5 16:57:27 Tower kernel: FS:  0000000000000000(0000) GS:ffff8887eeb00000(0000) knlGS:0000000000000000
    Feb  5 16:57:27 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Feb  5 16:57:27 Tower kernel: CR2: 000014a790000010 CR3: 0000000418562000 CR4: 00000000003506e0
    Feb  5 16:57:27 Tower kernel: note: kswapd0[216] exited with preempt_count 1
    Feb  5 16:57:27 Tower kernel: ------------[ cut here ]------------
    Feb  5 16:57:27 Tower kernel: WARNING: CPU: 12 PID: 216 at kernel/exit.c:814 do_exit+0x87/0x923
    Feb  5 16:57:27 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp veth ipvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls edac_mce_amd edac_core intel_rapl_msr intel_rapl_common iosf_mbi kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel mvsas nvme crypto_simd i2c_piix4 cryptd ch341 wmi_bmof libsas r8168(O) rapl i2c_core usbserial k10temp ccp scsi_transport_sas nvme_core ahci libahci wmi button acpi_cpufreq unix
    Feb  5 16:57:27 Tower kernel: CPU: 12 PID: 216 Comm: kswapd0 Tainted: P      D    O       6.1.64-Unraid #1
    Feb  5 16:57:27 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Steel Legend, BIOS P2.90 09/11/2019
    Feb  5 16:57:27 Tower kernel: RIP: 0010:do_exit+0x87/0x923
    Feb  5 16:57:27 Tower kernel: Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 76 dd 80 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 78 dc 80 00 48 8b 83 d0 06 00 00 83
    Feb  5 16:57:27 Tower kernel: RSP: 0018:ffffc900009cfee0 EFLAGS: 00010286
    Feb  5 16:57:27 Tower kernel: RAX: 0000000080000000 RBX: ffff888102d01f80 RCX: 0000000080000000
    Feb  5 16:57:27 Tower kernel: RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff
    Feb  5 16:57:27 Tower kernel: RBP: 000000000000000b R08: 0000000000000000 R09: 0720072007200720
    Feb  5 16:57:27 Tower kernel: R10: 0720072007200720 R11: 0720072007200720 R12: ffff888102d10800
    Feb  5 16:57:27 Tower kernel: R13: ffff888102d09080 R14: 0000000000000000 R15: 0000000000000000
    Feb  5 16:57:27 Tower kernel: FS:  0000000000000000(0000) GS:ffff8887eeb00000(0000) knlGS:0000000000000000
    Feb  5 16:57:27 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Feb  5 16:57:27 Tower kernel: CR2: 000014a790000010 CR3: 0000000418562000 CR4: 00000000003506e0
    Feb  5 16:57:27 Tower kernel: Call Trace:
    Feb  5 16:57:27 Tower kernel: <TASK>
    Feb  5 16:57:27 Tower kernel: ? __warn+0xab/0x122
    Feb  5 16:57:27 Tower kernel: ? report_bug+0x109/0x17e
    Feb  5 16:57:27 Tower kernel: ? do_exit+0x87/0x923
    Feb  5 16:57:27 Tower kernel: ? handle_bug+0x41/0x6f
    Feb  5 16:57:27 Tower kernel: ? exc_invalid_op+0x13/0x60
    Feb  5 16:57:27 Tower kernel: ? asm_exc_invalid_op+0x16/0x20
    Feb  5 16:57:27 Tower kernel: ? do_exit+0x87/0x923
    Feb  5 16:57:27 Tower kernel: make_task_dead+0x11c/0x11c
    Feb  5 16:57:27 Tower kernel: rewind_stack_and_make_dead+0x17/0x17
    Feb  5 16:57:27 Tower kernel: RIP: 0000:0x0
    Feb  5 16:57:27 Tower kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
    Feb  5 16:57:27 Tower kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
    Feb  5 16:57:27 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    Feb  5 16:57:27 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    Feb  5 16:57:27 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    Feb  5 16:57:27 Tower kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    Feb  5 16:57:27 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    Feb  5 16:57:27 Tower kernel: </TASK>
    Feb  5 16:57:27 Tower kernel: ---[ end trace 0000000000000000 ]---

     

    I'm not sure if the problem really is with the USB connection to my UPS.  Hopefully someone smarter than me can confirm with the logs.
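One detail in the trace that might be worth checking: the faulting address 0xffff088157418fb8 is RCX plus 8, and RCX looks almost identical to RAX and RDX. Below is a minimal sketch that pulls the register values out of the log lines and lists the bits in which two of them differ. The helper names (`parse_registers`, `differing_bits`) are made up for illustration. A single differing bit in a pointer is often associated with a memory bit flip, but on its own it proves nothing.

```python
import re

# Matches register dumps such as "RAX: ffff888157418fb0" in the kernel log.
# Entries like "RSP: 0018:..." or "CR2: ..." are intentionally not matched.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s+([0-9a-f]{16})\b")


def parse_registers(lines):
    """Return {register name: integer value}, keeping the first value seen."""
    regs = {}
    for line in lines:
        for name, value in REG_RE.findall(line):
            regs.setdefault(name, int(value, 16))
    return regs


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


# Two lines copied from the trace above.
sample = [
    "Feb  5 16:57:27 Tower kernel: RAX: ffff888157418fb0 RBX: ffff888157418f30 RCX: ffff088157418fb0",
    "Feb  5 16:57:27 Tower kernel: RDX: ffff888157418fb0 RSI: ffff88846af9bbd0 RDI: ffff88846af9b900",
]

regs = parse_registers(sample)
print(differing_bits(regs["RAX"], regs["RCX"]))
```

Here RAX and RDX hold the same value, while RCX differs from them in exactly one bit (bit 47), which is what makes the address non-canonical.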

     

    I don't absolutely need my UPS connected over USB, so I will disconnect it for the moment.  It was just a quick way to know when the power at my house went off.

     

    Also, in the "Fix Common problems" settings, the only warnings I have are "Marvel Hard Drive Controller Installed" and "Syslog mirrored to flash".

     

    Thank you for your time.

    anotower-diagnostics-20240205-1853.zip

  5. Also, Frigate exposes endpoints to export video for a specific time range when you record 24/7.

     

    Like : http://FRIGATE_IP:5000/api/CAMERA/start/1690529400/end/1690533000/clip.mp4

    But /tmp/cache was never big enough to generate the videos.  So in my manual command line, I removed the tmpfs mount and added a volume for /tmp/cache and it now works fine.
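In case it helps, here is a small sketch for building a clip URL of the same shape as the one above from two timestamps. The function name `frigate_clip_url` is made up for illustration, and `FRIGATE_IP` and `CAMERA` are the same placeholders as in the example URL. The start and end values are Unix timestamps in seconds.

```python
from datetime import datetime, timezone


def frigate_clip_url(host, camera, start, end, port=5000):
    """Build a clip export URL shaped like the example in this post."""
    if end <= start:
        raise ValueError("end must be after start")
    start_ts = int(start.timestamp())  # Unix time in seconds
    end_ts = int(end.timestamp())
    return f"http://{host}:{port}/api/{camera}/start/{start_ts}/end/{end_ts}/clip.mp4"


# One hour of footage, given as timezone-aware UTC datetimes.
url = frigate_clip_url(
    "FRIGATE_IP",
    "CAMERA",
    datetime(2023, 7, 28, 7, 30, tzinfo=timezone.utc),
    datetime(2023, 7, 28, 8, 30, tzinfo=timezone.utc),
)
print(url)
```

Using timezone-aware datetimes avoids surprises when the server and the machine building the URL are in different time zones.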

     

    Thanks

  6. Maybe I'm doing something wrong but if it helps someone in the future, I'm posting this.

     

    I was trying to install Frigate on my unraid but the docker was always failing to start.  Something about getting the path.

    I'm not using Coral or GPU or anything like that for my current setup.

     

    But the docker was not able to start.

     

    So I blanked all the values that were labeled "Remove this if your not using it".

    And still the docker would not start and there was an error like "Cannot find path ."

     

    Then I realized that in the command line there was this --device='' --device=''

     

    So I started the container directly from the command line (not sure if it's safe) without these --device flags and it worked.
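For anyone who wants to clean up the generated command automatically, here is a minimal sketch that drops the empty --device='' flags from a docker run command line. The function name and the sample command (including the image name `example/frigate-image`) are hypothetical; the real command is whatever unraid shows for the container.

```python
import shlex


def drop_empty_device_flags(command):
    """Remove every --device flag that has an empty value."""
    args = shlex.split(command)
    # After shell-style splitting, --device='' becomes the token "--device=".
    kept = [arg for arg in args if arg != "--device="]
    return shlex.join(kept)


# Hypothetical command with the two empty flags described above.
original = "docker run -d --name='frigate' --device='' --device='' example/frigate-image"
cleaned = drop_empty_device_flags(original)
print(cleaned)
```

Flags with a real value, such as a --device pointing at an actual device path, are left untouched.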

  7. Hi, this morning, I noticed my dockers were not available.

     

    I logged into unraid and I see that the dashboard is not available.  The Main tab is, though.

    So I tried every tab: Dashboard, Dockers and Apps stay white and nothing appears.

    Shares are still working.

     

    I'm using Version 6.9.2 at the moment.

    I have not rebooted unraid yet, as I still have access to diagnostics and SSH.

     

    I uploaded the diagnostics file (anonymised).

     

    What do you guys recommend?  Should I just try to reboot the server?

     

    Thank you for your help

     

    tower-diagnostics-20230324-0706.zip

  8. Here is the new diagnostics that I have now.

     

    As I said in the first post, I tried to reassign the disabled drive, so I started the array without the disk.  Now, 3 of them show as "new drive".

     

     

    And I forgot to mention in the first post that I currently have 6 disks plugged into the motherboard SATA ports, and the 3 others and the cache are plugged into a SATA controller.

     

    If it's a better solution to plug all of them into SATA controllers, I'll purchase more.

     

     

    So the plan : 

    - Update motherboard driver

    - Add sata controller for the drives

     

     

    And since 3 of them show as new, from what I have read, I need to do a "New config", right?

    tower-diagnostics-20220222-0743.zip

  9. Hi, 

     

    On my unraid server, I currently have 2 parity drives (6TB and 4TB) and 7 data disks (3TB or 4TB).

    Yesterday I was getting SMART errors from a disk in my array.

     

    So I ordered 2 new 4TB Seagate IronWolf drives.

     

    I replaced the drive with the new ones.  Everything seemed to work fine.  Then the parity sync halted and 2 other drives started getting errors (1 parity drive and 1 other data drive).

     

    The 2 drives with errors were disabled and the one I tried to replace was "Emulated".

     

    My mistake, I think, was to unassign the "disabled drive" and start the array, because now I cannot start it again since it is missing 3 disks.

     

     

    What I'm thinking : 
    1- Put back the drive with the error on the array (since the data should already be there)

    2- Do a "New config"  Since the 3 drives should have the data on them.

     

    Edit: I also plan to change the PSU and all the SATA cables, as I have read that some of these errors (CRC) are caused by bad SATA cables or fluctuating power.

     

    Is there a good chance it will work?

     

    Should I first try to back up the data I have at the moment to an external drive before doing this?

     

     

    thank you 

  10. 2 minutes ago, Squid said:

    A bit irrelevant now, but I'll issue a fix for the next unRaid version that will lower the connect timeout to 15 seconds, rather than the combined 60 seconds for download & connect.

     

    Effectively though, the issue you're seeing only happens in your self-inflicted circumstance (no offense meant), as any URL would basically return a 404 immediately.

    No offense taken.  I know the problem is all my fault.

     

    I'm really grateful for the help.

     

  11. 4 minutes ago, Squid said:

    Can you do me a favour and post or PM me the applicable xml files from /config/plugins/dockerMan/templates-user

      Oh, I think I know what the problem was.

     

      When I made the docker, I hosted the image on my GitLab instance (hosted on unraid).

      But when I changed my router, my IP addresses changed, so the IP address of unraid also changed.  So the image was not available from that point on.  It was a few months ago, but when I updated to 6.8.1, it broke my docker list.

     

     I uploaded the xml for the application.

    You'll see 192.168.2.116, but it is not my unraid ip anymore.
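To spot this kind of leftover address in a template, a rough sketch like the one below could list every XML element that still mentions the old IP. The sample XML is invented for illustration (the element names, port and paths are assumptions, not the contents of the attached file), and `elements_mentioning` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def elements_mentioning(xml_text, needle):
    """Return (tag, text) for every element whose text contains needle."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        if elem.text and needle in elem.text:
            hits.append((elem.tag, elem.text.strip()))
    return hits


# Hypothetical template content; only the IP address comes from this post.
SAMPLE_TEMPLATE = """<Container>
  <Name>SinopeApiMQTT</Name>
  <Repository>192.168.2.116:5005/example/sinope-api-mqtt</Repository>
  <Icon>http://192.168.2.116/example/icon.png</Icon>
</Container>"""

hits = elements_mentioning(SAMPLE_TEMPLATE, "192.168.2.116")
for tag, text in hits:
    print(tag, text)
```

Any element reported this way points at a host that may no longer exist after an IP change.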

     

     Thank you

     

     

    my-SinopeApiMQTT.xml

  12. 8 minutes ago, Squid said:

    Which 2?

     

    An underlying change in 6.8+ was that icons needed to be downloaded again the first time you hit either the dashboard or the docker tab.

    One of them was a custom docker I made.  It was hosting a Spring Boot Java application.  It was something I made to practice for a job interview.

     

    And the other one was a docker I installed from a custom repo.  It was a Spring Boot Admin instance.  I think the repo I installed it from was removed or made private, as I could not find it anymore.

     

    Bottom line, I'll stick with CA applications from now on.

  13. 2 hours ago, Squid said:

    Curious when you on 6.8.x what is the output of this command?

     

    
    ls /var/lib/docker/unraid/images

    Is it showing an icon for every container you have installed?

    No it's not.  It's missing 2 images.

     

    I deleted the 2 dockers that were missing images and now the docker list is super fast again.

     

    Thanks.  They were dockers I had not used for a long time.  I'll reinstall them if I need them in the future.
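The comparison I did by eye could be sketched roughly as follows. This assumes the icon files in that directory are named after the container, like `NAME-icon.png`, which may not hold on every version. The container names and the helper name `containers_missing_icons` are only examples.

```python
def containers_missing_icons(container_names, icon_files):
    """Return the containers that have no icon file named '<name>-icon.*'."""
    missing = []
    for name in container_names:
        prefix = f"{name}-icon."  # assumed naming scheme, see note above
        if not any(filename.startswith(prefix) for filename in icon_files):
            missing.append(name)
    return missing


# Example data: names are illustrative, not taken from a real system.
containers = ["crashplan", "gitlab", "spring-boot-admin", "my-java-app"]
icons = ["crashplan-icon.png", "gitlab-icon.png"]

missing = containers_missing_icons(containers, icons)
print(missing)
```

In practice the file list would come from the ls command quoted above and the container names from the Docker tab.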

  14. 2 hours ago, bonienl said:

    Try the following

     

    1. Stop the docker service

    2. Delete the docker image

    3. Create a new docker image, highly recommended on the cache device

    4. Re-install your containers (use CA to do this)

    Just to be sure, I won't lose any settings for the dockers?  I don't want to re-upload everything to CrashPlan after that.

     

    Thanks
