GTM Posted March 29, 2022 (edited March 29, 2022 by GTM; reason: typo)
Need help ASAP. This is a production unit and I am willing to pay for support. I am attaching screenshots of the VNC/SSH session. The Ubuntu server has been running fine for months as a VM, and now it won't boot. I need help fixing it.
Squid Posted March 29, 2022
Can you post your diagnostics? They might not show anything, though, depending on where the issue is (either the OS has mounted the drive read-only, or the VM's vdisk is read-only).
GTM Posted March 29, 2022 (Author)
Thanks. Here are the diagnostics I just downloaded: server-vm-diagnostics-20220329-1911.zip
GTM Posted March 29, 2022 (Author)
To update the situation, I found this article: https://ostechnix.com/how-to-fix-busybox-initramfs-error-on-ubuntu/
At the last prompt of the last screenshot (above), I typed "exit" and pressed Enter, and got the output shown in the attached screenshot.
GTM Posted March 29, 2022 (Author)
Then I followed the article and ran fsck as described. It repaired the /dev/mapper/zimbramail--vg-root partition; I answered yes to all the prompts. When it was complete I typed "reboot" and the server came back up.
After reading about a similar problem on the web, I logged in after boot and ran df -h to see whether any partitions on the disk were full. I don't see a sector called "boot", but I may not know what I am looking at. Do you think disk space has anything to do with this? It doesn't look like it to me, but I could be wrong. Please let me know what you think, and thank you in advance for your help.
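For anyone repeating the df -h check above, here is a minimal Python sketch that reports how full a mount point is. The list of paths and the 90% warning threshold are assumptions chosen for illustration, not values from this thread. Note that shutil.disk_usage divides used space by total space, so the percentage can differ slightly from the Use% column of df, which accounts for reserved blocks.

```python
import shutil


def percent_used(used, total):
    """Return used space as a percentage of total (0.0 if total is zero)."""
    return 0.0 if total == 0 else 100.0 * used / total


def report(paths):
    """Return a list of (path, percent_used) tuples for the given mount points."""
    rows = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        rows.append((path, percent_used(usage.used, usage.total)))
    return rows


if __name__ == "__main__":
    # "/" is the only path checked here; add others (for example "/boot")
    # only if they exist as separate mount points on your system.
    for path, pct in report(["/"]):
        flag = "  <-- nearly full" if pct >= 90.0 else ""  # assumed threshold
        print(f"{path}: {pct:.1f}% used{flag}")
```

A full root filesystem usually shows up as write failures rather than as filesystem corruption, so a low percentage here mainly helps rule disk space out.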
GTM Posted March 29, 2022 (Author)
To sum it up, we are up and running again, which is good. HOWEVER, the errors shown at the beginning of this thread, which appeared in the VNC session of this Ubuntu server VM, need to be researched so I can prevent this from happening again. If anyone has any insight into what could cause them, PLEASE let me know, and thanks in advance for your help.
Sincerely, GTM
ghost82 Posted March 30, 2022 (edited March 30, 2022 by ghost82)
6 hours ago, GTM said: "I don't see a sector called "boot".. but I may not know what I am looking at.."
A boot partition is not mandatory in Ubuntu, so you probably installed everything in the root partition /.
6 hours ago, GTM said: "these errors, shown at the beginning of the post that were showing up on the VM VNC session of this ubuntu server need to be researched, so I can prevent it from happening again"
Difficult to say. If your fstab file has errors=remount-ro for the root partition, then something went wrong (an inconsistency in the filesystem) while the OS was live, or during a failed attempt to mount the root partition, or the filesystem check failed during boot, and, as your fstab instructs, the system entered read-only mode to try to save what it could. Fixing the corrupted filesystem fixed it.
Possible causes: power failures, the VM being force stopped, updates applied without rebooting, or a failing/failed hard drive.
I would bet on an improper shutdown, since you had the message that inode 7078004 had zero dtime.
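To check whether the root filesystem is actually configured with errors=remount-ro, as suggested above, something like the following Python sketch could be used. The SAMPLE text is a made-up fstab entry for illustration (only the device name is taken from this thread); on the real VM the function would be fed the contents of /etc/fstab.

```python
def root_mount_options(fstab_text):
    """Return the mount options of the '/' entry as a list, or [] if none is found."""
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # fstab fields: device, mount point, type, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/":
            return fields[3].split(",")
    return []


# Hypothetical sample entry, NOT copied from the poster's system.
SAMPLE = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/mapper/zimbramail--vg-root /  ext4  errors=remount-ro  0  1
"""

if __name__ == "__main__":
    try:
        with open("/etc/fstab") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE  # fall back to the illustrative sample
    options = root_mount_options(text)
    print("root options:", options)
    print("remounts read-only on errors:", "errors=remount-ro" in options)
```

If the option is present, a read-only root filesystem is the configured reaction to an error the kernel detected, so the thing to investigate is what caused that error.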
GTM Posted March 31, 2022 (Author)
NEW ERRORS
Everything was clean for a few days after the fix, but now I am seeing the errors below starting to creep in. Maybe there are better clues in this screenshot. We have had no power failures (the Unraid system is on a huge UPS), no force stops, and no updates; the hard drives are reporting good within Unraid; and there have been no improper shutdowns either (I always shut down this email system from the SSH terminal). Thanks in advance to everyone looking at this with me. I appreciate your help. Any more ideas?
ghost82 Posted March 31, 2022 (edited March 31, 2022 by ghost82)
It is probably the disk; if the SMART data are OK, check the connecting cables.
Moreover, looking again at the diagnostics, your hard drive sdf is dropping offline and resetting:
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: exception Emask 0x10 SAct 0x400000 SErr 0x400000 action 0x6 frozen
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: irq_stat 0x08000000, interface fatal error
Mar 7 01:27:06 SERVER-VM kernel: ata10: SError: { Handshk }
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: failed command: WRITE FPDMA QUEUED
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: cmd 61/40:b0:70:7d:e1/05:00:27:01:00/40 tag 22 ncq dma 688128 out
Mar 7 01:27:06 SERVER-VM kernel: res 50/00:40:70:7d:e1/00:05:27:01:00/40 Emask 0x10 (ATA bus error)
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: status: { DRDY }
Mar 7 01:27:06 SERVER-VM kernel: ata10: hard resetting link
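To see how often each port produces events like the ones in the log excerpt above, a syslog file can be scanned with a short script. This is a rough Python sketch: the three event strings it looks for are taken from the excerpt, the regular expression assumes the same "kernel: ataN..." line layout, and SAMPLE_LOG simply repeats a few of the quoted lines for demonstration.

```python
import re
from collections import Counter

# Matches "kernel: ata10.00: exception ..." as well as "kernel: ata10: hard resetting link".
ATA_EVENT = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: (exception|failed command|hard resetting link)"
)


def count_ata_events(log_text):
    """Count (port, event) pairs found in the given syslog text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ATA_EVENT.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Lines copied from the excerpt quoted in this thread.
SAMPLE_LOG = """\
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: exception Emask 0x10 SAct 0x400000 SErr 0x400000 action 0x6 frozen
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: irq_stat 0x08000000, interface fatal error
Mar 7 01:27:06 SERVER-VM kernel: ata10: SError: { Handshk }
Mar 7 01:27:06 SERVER-VM kernel: ata10.00: failed command: WRITE FPDMA QUEUED
Mar 7 01:27:06 SERVER-VM kernel: ata10: hard resetting link
"""

if __name__ == "__main__":
    for (port, event), n in sorted(count_ata_events(SAMPLE_LOG).items()):
        print(f"{port}: {event} x{n}")
```

If the events cluster on one port, that points at that port's cable, connector, or disk; events spread across several ports on the same controller would point more toward the controller or its power.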
GTM Posted April 4, 2022 (Author)
Thank you. I have a question: I do not know how to tie ata10.00 to a specific SATA port or piece of hardware. I do have a multi-port SATA card, and a total of 7 disks in this thing, including two cache drives. How can I figure this out? Thank you for finding this.
Sincerely, George Miller
ghost82 Posted April 4, 2022 (edited April 4, 2022 by ghost82)
22 minutes ago, GTM said: "How can I figure this out?"
I think you could try:
1. identify the port by reading the labels on your motherboard (they may be difficult to read)
2. identify the port by reading the manual of your motherboard
3. identify the port from the attached disk: sdf should be your 'parity2 disk'
Never mind, you have a PCIe SATA card? Then do the same as above, replacing 'motherboard' with 'your SATA card'.
https://delightlylinux.wordpress.com/2019/08/02/how-to-find-a-hard-drives-sata-port-in-bash/
[2:0:0:0] disk ATA Hitachi HUS72404 A5F0 /dev/sdb /dev/sg1
  dir: /sys/bus/scsi/devices/2:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.1/ata2/host2/target2:0:0/2:0:0:0]
[5:0:0:0] disk ATA SanDisk SSD PLUS 04RL /dev/sdc /dev/sg2
  dir: /sys/bus/scsi/devices/5:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.1/ata5/host5/target5:0:0/5:0:0:0]
[6:0:0:0] disk ATA SanDisk SSD PLUS 04RL /dev/sdd /dev/sg3
  dir: /sys/bus/scsi/devices/6:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.1/ata6/host6/target6:0:0/6:0:0:0]
[9:0:0:0] disk ATA Hitachi HUS72404 A5F0 /dev/sde /dev/sg4
  dir: /sys/bus/scsi/devices/9:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata9/host9/target9:0:0/9:0:0:0]
[10:0:0:0] disk ATA Hitachi HUS72404 A5F0 /dev/sdf /dev/sg5
  dir: /sys/bus/scsi/devices/10:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata10/host10/target10:0:0/10:0:0:0]
[11:0:0:0] disk ATA Hitachi HUS72404 A5F0 /dev/sdg /dev/sg6
  dir: /sys/bus/scsi/devices/11:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata11/host11/target11:0:0/11:0:0:0]
[12:0:0:0] disk ATA Hitachi HUS72404 A5F0 /dev/sdh /dev/sg7
  dir: /sys/bus/scsi/devices/12:0:0:0 [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata12/host12/target12:0:0/12:0:0:0]
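The sysfs paths in the listing above already contain the answer: the path component named ataN is the port that appears in the kernel log, and the component just before it is the PCI address of the controller that owns that port. Here is a small Python sketch that extracts both. The helper name ata_port_and_controller is made up for this example, and the live lookup assumes a Linux system where /sys/block/sdX resolves to a path of the same shape as in the listing.

```python
import glob
import os
import re


def ata_port_and_controller(sysfs_path):
    """Return (ata_port, controller_pci_address) parsed from a sysfs device path.

    Returns (None, None) if the path contains no 'ataN' component.
    """
    parts = sysfs_path.strip("/").split("/")
    for i, part in enumerate(parts):
        if re.fullmatch(r"ata\d+", part):
            controller = parts[i - 1] if i > 0 else None
            return part, controller
    return None, None


# Path for /dev/sdf, copied from the listing in this thread.
SDF_PATH = (
    "/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/"
    "0000:06:00.0/ata10/host10/target10:0:0/10:0:0:0"
)

if __name__ == "__main__":
    print("sdf (from listing):", ata_port_and_controller(SDF_PATH))
    # On a live Linux host, resolve every sdX block device the same way.
    for block in sorted(glob.glob("/sys/block/sd*")):
        real = os.path.realpath(block)
        print(os.path.basename(block), ata_port_and_controller(real))
```

In the listing, ata9 through ata12 (sde to sdh) all sit behind controller 0000:06:00.0, while sdb, sdc, and sdd sit behind 0000:01:00.1. Four ports on one controller would be consistent with a four-port add-in card, but that is an inference from the paths, not something the diagnostics state directly.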
GTM Posted April 5, 2022 (Author)
Thank you very much for the help. I need to schedule some after-hours maintenance and replace things. I am thinking of replacing the SATA cable first, and then, if the problem persists, the 4-port PCIe SATA card. Maybe both at the same time, since it's a production server... oh, the joy.
George