
Ubuntu server VM, Failed to write entry... ignoring: read-only file system errors - production unit


GTM


Need help ASAP. This is a production unit, and I am willing to pay for support. I am attaching screenshots of the VNC/SSH session.

An Ubuntu server that has been running fine for months as a VM now won't boot. It is a production unit, and I need help fixing it.

 

Capture.PNG

Capture2.PNG

Edited by GTM
typo
Link to comment

After reading about a similar problem on the web, I followed the article and ran fsck as described. It repaired the /dev/mapper/zimbramail--vg-root volume; I answered yes to each prompt, and when the repairs were complete I typed "reboot" and the server came back up.

 

After it booted, I logged in and ran df -h to see whether any partitions on the disk were full.
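For anyone who wants to script the same check, here is a minimal sketch using Python's standard `shutil.disk_usage`. The function names and the 90% threshold are illustrative assumptions, not something from the article, and the percentage can differ slightly from the `Use%` column of `df -h`, which accounts for reserved blocks.

```python
import shutil


def usage_percent(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 0.0 if total == 0 else 100.0 * used / total


def nearly_full(paths, threshold=90.0):
    """Return {path: percent_used} for mount points at or above threshold."""
    result = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        pct = usage_percent(usage.total, usage.used)
        if pct >= threshold:
            result[path] = round(pct, 1)
    return result


if __name__ == "__main__":
    # "/" is the root filesystem; add other mount points as needed.
    print(nearly_full(["/"], threshold=90.0))
```

An empty dictionary means none of the listed mount points reached the threshold.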

 

I don't see a sector called "boot".. but I may not know what I am looking at..  do you think disk space has anything to do with this?  It doesn't look like it to me, but I could be wrong. Please let me know what you think and thank you in advance for your help.

 

 

Capture6.PNG

Link to comment

To sum it up, we are up and running again which is good, HOWEVER these errors,  shown at the beginning of the post that were showing up on the VM VNC session of this ubuntu server need to be researched, so I can prevent it from happening again.

 

If anyone has any insight as to what could cause them, PLEASE let me know and thanks in advance for your help.

 

Sincerely,

GTM

Link to comment
6 hours ago, GTM said:

I don't see a sector called "boot".. but I may not know what I am looking at..

A separate /boot partition is not mandatory in Ubuntu, so you probably installed everything in the root partition (/).

 

6 hours ago, GTM said:

these errors,  shown at the beginning of the post that were showing up on the VM VNC session of this ubuntu server need to be researched, so I can prevent it from happening again

Difficult to say. If your fstab file has errors=remount-ro for the root partition, then one of these happened: the kernel hit an inconsistency in the filesystem while the OS was running, an attempt to mount the root partition failed, or the filesystem check failed during boot. In each case, following your fstab, the system switched the filesystem to read-only mode to try to save what it could.

Fixing the corrupted filesystem fixed it.
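To confirm whether the root entry really carries that option, something like the following sketch could be used. It only parses text in the standard six-field fstab layout; the `SAMPLE` line is a hypothetical entry built from the volume name mentioned in this thread, not a copy of the actual file.

```python
def root_mount_options(fstab_text: str):
    """Return the option list of the '/' entry in fstab text, or None."""
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/":
            return fields[3].split(",")
    return None


# Hypothetical example entry; the real file lives at /etc/fstab.
SAMPLE = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/mapper/zimbramail--vg-root / ext4 errors=remount-ro 0 1
"""

if __name__ == "__main__":
    opts = root_mount_options(SAMPLE)
    print("errors=remount-ro" in (opts or []))
```

On the server itself, the text of /etc/fstab would be read and passed in instead of `SAMPLE`.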

 

Possible causes: power failures, vm force stopped, updates applied without rebooting, failing/failed hd.

I would bet on an improper shutdown, since you had the message that inode 7078004 had zero dtime.

Edited by ghost82
Link to comment

NEW ERRORS

 

Everything was clean for a few days after the fix, but now I am seeing the errors below starting to creep in. Maybe there are better clues in this screenshot.

 

We have had no power failures (the Unraid system is on a huge UPS), no forced stops, and no updates. The HDs are reporting good within Unraid, and there have been no improper shutdowns either (I always shut down this email system from the SSH terminal).

 

Thanks (in advance too) to everyone looking at this with me.  I appreciate your help.

 

any more ideas?

 

vmerrorscontinued.PNG

Link to comment

It is probably the disk; if the SMART data are OK, check the connecting cables.

 

Moreover, looking again at the diagnostics, your HD sdf is dropping offline and resetting:

Mar  7 01:27:06 SERVER-VM kernel: ata10.00: exception Emask 0x10 SAct 0x400000 SErr 0x400000 action 0x6 frozen
Mar  7 01:27:06 SERVER-VM kernel: ata10.00: irq_stat 0x08000000, interface fatal error
Mar  7 01:27:06 SERVER-VM kernel: ata10: SError: { Handshk }
Mar  7 01:27:06 SERVER-VM kernel: ata10.00: failed command: WRITE FPDMA QUEUED
Mar  7 01:27:06 SERVER-VM kernel: ata10.00: cmd 61/40:b0:70:7d:e1/05:00:27:01:00/40 tag 22 ncq dma 688128 out
Mar  7 01:27:06 SERVER-VM kernel:         res 50/00:40:70:7d:e1/00:05:27:01:00/40 Emask 0x10 (ATA bus error)
Mar  7 01:27:06 SERVER-VM kernel: ata10.00: status: { DRDY }
Mar  7 01:27:06 SERVER-VM kernel: ata10: hard resetting link
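To see how often each port does this, a rough sketch like the one below could count failed commands and link resets per ATA port in a syslog excerpt. The regular expression and function name are my own assumptions about the line format shown above, so adjust them if your log looks different.

```python
import re
from collections import Counter

# Matches e.g. "kernel: ata10.00: failed command: ..." or "kernel: ata10: hard resetting link"
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")


def summarize_ata_events(log_lines):
    """Return (resets, failed) Counters keyed by ATA port name."""
    resets = Counter()
    failed = Counter()
    for line in log_lines:
        match = ATA_LINE.search(line)
        if not match:
            continue  # continuation lines such as "res 50/00:..." carry no port name
        port, message = match.groups()
        if "hard resetting link" in message:
            resets[port] += 1
        elif message.startswith("failed command:"):
            failed[port] += 1
    return resets, failed


# Three lines taken from the excerpt above.
SAMPLE_LINES = [
    "Mar  7 01:27:06 SERVER-VM kernel: ata10.00: failed command: WRITE FPDMA QUEUED",
    "Mar  7 01:27:06 SERVER-VM kernel:         res 50/00:40:70:7d:e1/00:05:27:01:00/40 Emask 0x10 (ATA bus error)",
    "Mar  7 01:27:06 SERVER-VM kernel: ata10: hard resetting link",
]

if __name__ == "__main__":
    resets, failed = summarize_ata_events(SAMPLE_LINES)
    print(dict(resets), dict(failed))
```

A port that keeps collecting resets while the others stay at zero points at that one link (disk, cable, or controller port) rather than at the whole controller.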

 

Edited by ghost82
Link to comment

Thank you.  

 

I have a question. I do not know how to tie ata10.00 to a SATA port or piece of hardware.

I do have a multi-port SATA card, and a total of 7 disks in this thing, including two cache drives.

 

How can I figure this out?

 

Thank you for finding this.

 

Sincerely,

 

George Miller

 

Link to comment
22 minutes ago, GTM said:

How can I figure this out?

I think you could try:

1. identify the port from reading labels on your motherboard (may be difficult to read)

2. identify the port from reading the manual of your motherboard

3. identify the port from the attached disk: sdf should be your 'parity2 disk'

 

Never mind, you have a PCIe SATA card? Do the same as above, replacing 'motherboard' with 'your SATA card'.

 

https://delightlylinux.wordpress.com/2019/08/02/how-to-find-a-hard-drives-sata-port-in-bash/

[2:0:0:0]    disk    ATA      Hitachi HUS72404 A5F0  /dev/sdb   /dev/sg1 
  dir: /sys/bus/scsi/devices/2:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.1/ata2/host2/target2:0:0/2:0:0:0]
[5:0:0:0]    disk    ATA      SanDisk SSD PLUS 04RL  /dev/sdc   /dev/sg2 
  dir: /sys/bus/scsi/devices/5:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.1/ata5/host5/target5:0:0/5:0:0:0]
[6:0:0:0]    disk    ATA      SanDisk SSD PLUS 04RL  /dev/sdd   /dev/sg3 
  dir: /sys/bus/scsi/devices/6:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.1/ata6/host6/target6:0:0/6:0:0:0]
[9:0:0:0]    disk    ATA      Hitachi HUS72404 A5F0  /dev/sde   /dev/sg4 
  dir: /sys/bus/scsi/devices/9:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata9/host9/target9:0:0/9:0:0:0]
[10:0:0:0]   disk    ATA      Hitachi HUS72404 A5F0  /dev/sdf   /dev/sg5 
  dir: /sys/bus/scsi/devices/10:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata10/host10/target10:0:0/10:0:0:0]
[11:0:0:0]   disk    ATA      Hitachi HUS72404 A5F0  /dev/sdg   /dev/sg6 
  dir: /sys/bus/scsi/devices/11:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata11/host11/target11:0:0/11:0:0:0]
[12:0:0:0]   disk    ATA      Hitachi HUS72404 A5F0  /dev/sdh   /dev/sg7 
  dir: /sys/bus/scsi/devices/12:0:0:0  [/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/0000:06:00.0/ata12/host12/target12:0:0/12:0:0:0]
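In the listing above, the sysfs path for /dev/sdf contains ata10, which matches the port in the kernel log. The same lookup can be sketched in Python by resolving the /sys/block symlink for a disk; the function names here are illustrative, and the sketch assumes a Linux system where the resolved path has the ataN/hostN layout shown above.

```python
import os
import re


def ata_port_from_sysfs_path(path: str):
    """Extract the 'ataN' component from a resolved sysfs device path, or None."""
    match = re.search(r"/(ata\d+)/host\d+/", path)
    return match.group(1) if match else None


def ata_port_for_disk(name: str):
    """Resolve /sys/block/<name> (e.g. 'sdf') and return its ATA port, or None."""
    return ata_port_from_sysfs_path(os.path.realpath(f"/sys/block/{name}"))


if __name__ == "__main__":
    # Path copied from the /dev/sdf entry in the listing above.
    sdf_path = (
        "/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.2/0000:02:06.0/"
        "0000:06:00.0/ata10/host10/target10:0:0/10:0:0:0"
    )
    print(ata_port_from_sysfs_path(sdf_path))
```

The PCI addresses earlier in the same path (for example 0000:06:00.0) show which controller the port belongs to, which helps separate the onboard ports from the add-in card.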

 

Edited by ghost82
Link to comment

Thank you very much for the help.  I need to schedule some after hours maintenance and replace things.

 

I am thinking of replacing the SATA cable first, then maybe the 4-port PCIe SATA card if the problem persists.

 

Maybe both at the same time, since it's a production server. Oh, the joy.

 

George

Link to comment
