One of my hard drives lost its file system (unmountable)



One of my Unraid servers was updating Plex (it was processing, updating, deleting, whatever) when I noticed it had stopped working, so I rebooted the server. That didn't fix things.

In the array, my 12TB hard drive now shows "Unmountable: No file system". It is a BTRFS setup.

How do I recover? I had an SSD cache issue a while back, but I can't remember the fix, and this is an HDD, so it may be slightly different.

What can I run to make the file system mountable again? Stopping and restarting the array does nothing.

There are no SMART errors; the drive is new, I got it around Christmas.

This Unraid server has no cache drive or parity drive, so appdata is spread across its 4 array drives.

I don't have spare disk space either: I have a couple of 6TB external drives but no single 12TB drive. Can I still fix it?

For now I stopped the array, ticked maintenance mode, started the array and clicked the Check button, so it's checking all 4 array disks.

 

 

Edited by comet424
Link to comment

A transid error with btrfs is usually fatal, and it's usually caused by writes not being completely flushed to disk. There are some recovery options here; btrfs restore is likely the best one for this.

 

There are constant ATA errors on disk3, which are possibly the reason the fs became corrupt:

 

May 10 22:48:22 MitchFlix kernel: ata6.00: status: { DRDY }
May 10 22:48:22 MitchFlix kernel: ata6.00: failed command: READ FPDMA QUEUED
May 10 22:48:22 MitchFlix kernel: ata6.00: cmd 60/40:78:20:bf:1e/05:00:31:00:00/40 tag 15 ncq dma 688128 in
May 10 22:48:22 MitchFlix kernel:         res 40/00:a0:c8:d6:1e/00:00:31:00:00/40 Emask 0x50 (ATA bus error)
May 10 22:48:22 MitchFlix kernel: ata6.00: status: { DRDY }
May 10 22:48:22 MitchFlix kernel: ata6.00: failed command: READ FPDMA QUEUED
May 10 22:48:22 MitchFlix kernel: ata6.00: cmd 60/a8:80:60:c4:1e/02:00:31:00:00/40 tag 16 ncq dma 348160 in
May 10 22:48:22 MitchFlix kernel:         res 40/00:a0:c8:d6:1e/00:00:31:00:00/40 Emask 0x50 (ATA bus error)
May 10 22:48:22 MitchFlix kernel: ata6.00: status: { DRDY }
May 10 22:48:22 MitchFlix kernel: ata6.00: failed command: READ FPDMA QUEUED
May 10 22:48:22 MitchFlix kernel: ata6.00: cmd 60/40:88:08:c7:1e/05:00:31:00:00/40 tag 17 ncq dma 688128 in
May 10 22:48:22 MitchFlix kernel:         res 40/00:a0:c8:d6:1e/00:00:31:00:00/40 Emask 0x50 (ATA bus error)
May 10 22:48:22 MitchFlix kernel: ata6.00: status: { DRDY }
May 10 22:48:22 MitchFlix kernel: ata6.00: failed command: READ FPDMA QUEUED
May 10 22:48:22 MitchFlix kernel: ata6.00: cmd 60/40:90:48:cc:1e/05:00:31:00:00/40 tag 18 ncq dma 688128 in
May 10 22:48:22 MitchFlix kernel:         res 40/00:a0:c8:d6:1e/00:00:31:00:00/40 Emask 0x50 (ATA bus error)

 

Check/replace cables.
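To see how often a given port is throwing these errors, a saved syslog can be tallied with a few lines of Python. This is only a rough sketch: the sample text is a handful of lines copied from the excerpt above, and the function name `count_failed_commands` is made up for illustration. It counts the kernel's "failed command" messages per ATA port and command.

```python
import re
from collections import Counter

# A few lines copied from the syslog excerpt above (sample data only).
SYSLOG = """\
May 10 22:48:22 MitchFlix kernel: ata6.00: failed command: READ FPDMA QUEUED
May 10 22:48:22 MitchFlix kernel: ata6.00: cmd 60/40:78:20:bf:1e/05:00:31:00:00/40 tag 15 ncq dma 688128 in
May 10 22:48:22 MitchFlix kernel:         res 40/00:a0:c8:d6:1e/00:00:31:00:00/40 Emask 0x50 (ATA bus error)
May 10 22:48:22 MitchFlix kernel: ata6.00: status: { DRDY }
May 10 22:48:22 MitchFlix kernel: ata6.00: failed command: READ FPDMA QUEUED
"""

# Matches e.g. "kernel: ata6.00: failed command: READ FPDMA QUEUED".
FAILED_CMD = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: failed command: (.+)$")


def count_failed_commands(text):
    """Count 'failed command' kernel messages per (ATA port, command)."""
    counts = Counter()
    for line in text.splitlines():
        match = FAILED_CMD.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


counts = count_failed_commands(SYSLOG)
for (port, command), n in counts.most_common():
    print(f"{port}: {n} x {command}")
```

A port that keeps accumulating failures after a cable or bay swap would point away from the cable and toward the controller, backplane, or power.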

 

 

Link to comment

So would the cables be the issue, or the backplane? I looked at SMART and nothing is highlighted in yellow to indicate errors with the hard drive.

There is also a new motherboard in there; I just never replaced the backplane or cables.

I repaired an old NAS.

So would I use this command?

"btrfs restore -v /dev/sdX1 /mnt/disk2/restore"

I know it can't restore when I don't have 12TB of disk space free, though...

Would this one be better?

"btrfs check --repair /dev/md5"

(from that list)

So what is a transid? And is BTRFS a good file system? I picked it because what I read was good, but how do I protect myself against failures like that? I do have the data backed up on my own server. This server is a 2 hour drive away at my sister's house, so I'm accessing it remotely to try to fix it at the moment.

And what does "completely flushed to disk" mean? Like fully copied to the drive?

Actually, I can't remember whether I replaced the SATA cables or not. They're in a tight-fit case and can't move.

 

Edited by comet424
Link to comment

I can't find what the /dev/mdX device would be for that hard drive, but I now have an option under the drive.

Is it OK to change readonly to repair? Is that what you would type in? I'm not hitting enter or anything as it's still doing its check; it should be done maybe by midnight tonight, it's at 37 percent complete.

 

btrfs1.PNG

btrfs2.PNG

Edited by comet424
Link to comment

Is there a way to run btrfs restore partially, since my sister has 2 external 6TB drives?

Can btrfs restore something like 5TB to one external HD, pause when it's full, and then continue once the next drive is plugged in?

Or do you need a completely blank 12 or 14TB drive to restore the 12TB drive that's having issues? I think there is only 1.3TB free on the array; I can't tell now as it's still checking.

 

Edited by comet424
Link to comment

Ah OK. What does check --repair actually do? It said to run --repair if the disk is in the array.

And is btrfs really a good file system, or should I be running XFS?

Hmm, guess I've got to try to find where I can get a 12TB drive. With everything shut down I can't buy crap.

Link to comment

I have 14TB free on my own Unraid server.

Her drive isn't part of a parity-protected setup, it's just a single disk.

If I get her hard drive, can I put that drive in my computer and run

btrfs restore -v /dev/sdX1 /mnt/disk2/restore

as an example? I'm not sure what the hard drive would show up as, whether it would be sdX or some different type of device. All my hard drives show as DISK.

But that was my idea: if I get her drive, can I put it in mine and retrieve the data?

 

Link to comment
  • 2 years later...

Hi, I had a btrfs file system stop working again. It's been 2 years since I last posted.

For my download drive I use a spinner as a cache pool, and it uses btrfs. I was trying to move files off it and it was failing. I rebooted and now it's unmountable.

Here are the logs:

Jul  2 12:18:42 mitchsserver kernel: ACPI: Early table checksum verification disabled
Jul  2 12:18:42 mitchsserver kernel: floppy0: no floppy controllers found
Jul  2 12:18:42 mitchsserver kernel: ata5.00: exception Emask 0x50 SAct 0x1000000 SErr 0x30802 action 0xe frozen
Jul  2 12:18:42 mitchsserver kernel: ata5.00: failed command: READ FPDMA QUEUED
Jul  2 12:18:42 mitchsserver kernel: ata5: hard resetting link
Jul  2 12:18:42 mitchsserver kernel: ata5: hard resetting link
Jul  2 12:18:42 mitchsserver kernel: blk_update_request: I/O error, dev sdd, sector 4160 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
Jul  2 12:18:43 mitchsserver mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
Jul  2 12:19:15 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:19:15 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:19:15 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root
Jul  2 12:19:15 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed
Jul  2 12:19:15 mitchsserver root: mount: /mnt/download_drive: wrong fs type, bad option, bad superblock on /dev/sde1, missing codepage or helper program, or other error.
Jul  2 12:19:15 mitchsserver emhttpd: /mnt/download_drive mount error: No file system
Jul  2 12:19:26 mitchsserver rc.docker: transmission: Error response from daemon: error while creating mount source path '/mnt/user/Transmission/completed': mkdir /mnt/user/Transmission: no medium found
Jul  2 12:19:26 mitchsserver rc.docker: Error: failed to start containers: transmission
Jul  2 12:29:03 mitchsserver root: Fix Common Problems: Warning: unRaids built in FTP server is currently disabled, but users are defined
Jul  2 12:36:19 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:36:19 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:36:19 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root
Jul  2 12:36:19 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed
Jul  2 12:36:43 mitchsserver kernel: BTRFS warning (device sde1): 'usebackuproot' is deprecated, use 'rescue=usebackuproot' instead
Jul  2 12:36:43 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:36:43 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:36:43 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root
Jul  2 12:36:43 mitchsserver kernel: BTRFS error (device sde1): chunk 1664330235904 has missing dev extent, have 0 expect 1
Jul  2 12:36:43 mitchsserver kernel: BTRFS error (device sde1): failed to verify dev extents against chunks: -117
Jul  2 12:36:43 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed
Jul  2 12:37:09 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:37:09 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:37:09 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root
Jul  2 12:37:09 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed
Jul  2 12:39:09 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:39:09 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:39:09 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root
Jul  2 12:39:09 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed
Jul  2 12:42:38 mitchsserver kernel: sd 1:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=DRIVER_OK
Jul  2 12:42:38 mitchsserver kernel: sd 1:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=DRIVER_OK
Jul  2 12:46:37 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:46:37 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890
Jul  2 12:46:37 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root
Jul  2 12:46:37 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed

 

 

I tried these commands, as the link I was given in here was updated for 6.10, but I get these other errors and it won't work now.

 

root@mitchsserver:~# btrfs fi show /dev/sde1
Label: none  uuid: 547e712d-e478-48a9-a84a-30ed71534b0d
        Total devices 1 FS bytes used 530.19GiB
        devid    1 size 931.51GiB used 572.02GiB path /dev/sde1

root@mitchsserver:~# mkdir /temp
root@mitchsserver:~# mount -o rescue=all,ro /dev/sde1 /temp
mount: /temp: wrong fs type, bad option, bad superblock on /dev/sde1, missing codepage or helper program, or other error.
root@mitchsserver:~# mount -o usebackuproot,ro /dev/sde1 /temp
mount: /temp: wrong fs type, bad option, bad superblock on /dev/sde1, missing codepage or helper program, or other error.
root@mitchsserver:~# mount -o degraded,rescue=all,ro /dev/sde1 /temp
mount: /temp: wrong fs type, bad option, bad superblock on /dev/sde1, missing codepage or helper program, or other error.
root@mitchsserver:~# ls /dev/sde1
/dev/sde1

 

 

 

So I'm not sure how to get it back, what's going on, or why this stuff happens. Would it be better to have an NTFS cache pool maybe?

Should I be using an SSD instead of a spinner for downloads? I use it to keep the array offline. Or should I use 2 disks in the cache pool for downloads, or would the error happen on both?

I use the same setup on a few Unraid servers: 1 drive to dump any downloads from the server, Windows, etc. But I use a single spinner, so I'm not sure what the best way to do it is.

mitchsserver-diagnostics-20220702-1255.zip

Link to comment

I got some help booting into maintenance mode, and they asked me to scrub the drive, but since it won't mount it won't let me scrub.

But there was a btrfs check option. I ran it and the log it gave is:


parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
ERROR: cannot open file system
Opening filesystem to check...

 

Link to comment
15 hours ago, comet424 said:
parent transid verify failed on 1543279591424 wanted 136563 found 130890

This means some writes to the device were lost, and it's usually a fatal error.

 

 

There are still frequent ATA errors on that device. As warned above, link resets can lead to lost writes and fs corruption.

 

Jul  2 12:18:42 mitchsserver kernel: ata5.00: exception Emask 0x50 SAct 0x1000000 SErr 0x30802 action 0xe frozen
Jul  2 12:18:42 mitchsserver kernel: ata5.00: irq_stat 0x00400000, PHY RDY changed
Jul  2 12:18:42 mitchsserver kernel: ata5: SError: { RecovComm HostInt PHYRdyChg PHYInt }
Jul  2 12:18:42 mitchsserver kernel: ata5.00: failed command: READ FPDMA QUEUED
Jul  2 12:18:42 mitchsserver kernel: ata5.00: cmd 60/08:c0:40:10:00/00:00:00:00:00/40 tag 24 ncq dma 4096 in
Jul  2 12:18:42 mitchsserver kernel:         res 40/00:c4:40:10:00/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
Jul  2 12:18:42 mitchsserver kernel: ata5.00: status: { DRDY }
Jul  2 12:18:42 mitchsserver kernel: ata5: hard resetting link

 

Link to comment

@JorgeB

So does that mean it's unmountable? The commands for a read-only mount do nothing...

Does that ATA info say whether it's the hard drive? It passes SMART. Is it because I'm running btrfs?

What link resets can lead to lost writes?

So what's the common cause of those errors? Can it be an Unraid memory problem? Does "ata5" mean cable number 5? Although it's a 4 bay hot swap... maybe the backplane?

I guess I should have taken a syslog before I rebooted; it was up for 12 days. Would that have been better? Is it better to switch to XFS, or would this happen with XFS too? Or switch to NTFS for just a download pool drive? I learned from the Discord that Unraid recommends XFS for a single drive, not the btrfs it defaults to, something about errors btrfs causes for cache pools. I dunno 😞

Maybe it's time to replace the 4 bay server?

Edited by comet424
Link to comment
7 minutes ago, comet424 said:

as the commands to for a mount in read only does nothing...

As mentioned, it's fatal. You can use btrfs restore to try to recover some data, but the fs will need to be destroyed and re-created.

 

8 minutes ago, comet424 said:

does it say  from that ata info    if its the hard drive? it passes smart ..    is it cuz im running btrfs?  

 

what link resets  can lead to lost writes..

Those ATA errors are usually a power/connection problem, and they are not caused by btrfs. But because btrfs is COW (copy-on-write), if write barriers are not being honored and some writes are lost, it's a big problem.

 

10 minutes ago, comet424 said:

is it say   "ata5"   cable number 5? 

ATA5 is the WD 2TB disk.

 

The correct fix would be to get rid of those errors, but if you cannot, switch to a more forgiving filesystem.

Link to comment

Ah OK. Yeah, that restore doesn't work, at least not now. I rebooted again and it says:

er:~# mount -o degraded,rescue=all,ro /dev/sde1 /mnt/user/restore
mount: /mnt/user/restore: wrong fs type, bad option, bad superblock on /dev/sde1, missing codepage or helper program, or other error.
root@mitchsserver:~# ^C
root@mitchsserver:~# 

 

I have a folder called restore, but why is it the wrong fs type? Or is it saying it could be any of these? Someone from the Discord thinks maybe it's the superblock.

OK, so ata5 means the drive, I guess: 4 hard drives and 1 NVMe. OK.

So what does "btrfs is COW, if write barriers are not being honored and some writes are lost" mean? If it's a power/connection problem, could that mean power from the power supply, or power from the backplane? Could it be my sister accidentally sliding out a hot-swap tray and shoving it back in? I wish I had diagnostics from before the reboot, they probably would really have helped. But I don't know if she did that; I'm just guessing at what could have gone wrong.

Oh, since those errors are appearing now after a reboot, that means there's a power/connection problem occurring now. It came up after a reboot, so it couldn't have been an accidental hot-swap pull.

And what's the best recommendation for a forgiving filesystem? I thought btrfs was forgiving. Unraid only offered XFS vs btrfs, as I read you shouldn't use the ReiserFS file system. Sometimes I get confused about which one to choose.

Is it best not to use btrfs, or to use btrfs with 2 drives?

And which RAID setting is it? I read somewhere that if you have 2 hard drives or SSDs mirrored and drive 1 corrupts, drive 2 also corrupts.

I'm always wanting to learn and improve my setups as best I can.

I'll get my sister to swap the hard drive into a different bay and see if that fixes it, and maybe then I can try the mounting. The fs type error is probably from the ata5 power/connection issue.

I'll give it a try today. Maybe it's also time to change the whole case too.

 

 

Link to comment
4 minutes ago, comet424 said:

i have a folder.  called restore.. but why is the wrong fs type?

You're not doing it right: btrfs restore is done without mounting the fs. It's option #2 in the link.

 

5 minutes ago, comet424 said:

so what does a btrfs is a COW  if write barries are not being honored and some writes are lost....   what does all that mean?.

It means btrfs is being told by the kernel/device that writes were done to disk when in fact they were not. Btrfs updates the superblock to the new generation, but those writes never reached the disk, hence the error:

 

3 hours ago, JorgeB said:
parent transid verify failed on 1543279591424 wanted 136563 found 130890

This means that according to the superblock the fs should be on generation 136563, but it's only on generation 130890. All those writes in between were lost and never reached the device.
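To put a number on that gap, the error line can be picked apart with a short Python sketch. The function name `parse_transid_error` and the field names in the returned dict are invented for illustration; the sample line is the one from the log above.

```python
import re

# Error line as it appears in the syslog above.
LINE = (
    "BTRFS error (device sde1): parent transid verify failed "
    "on 1543279591424 wanted 136563 found 130890"
)

TRANSID = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def parse_transid_error(line):
    """Return the block address and both generations, or None if no match."""
    match = TRANSID.search(line)
    if match is None:
        return None
    bytenr, wanted, found = (int(group) for group in match.groups())
    return {
        "bytenr": bytenr,        # logical address of the tree block
        "wanted": wanted,        # generation the fs expected to find
        "found": found,          # generation actually on disk
        "gap": wanted - found,   # how many generations are missing
    }


info = parse_transid_error(LINE)
print(info)
```

For this log the gap works out to 136563 - 130890 = 5673 generations, which fits the description of a long run of writes that never made it to the disk.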

 

9 minutes ago, comet424 said:

and whats the best recommendations for a forgiving filesystem? 

XFS is more forgiving: it will rebuild the log and lose the data that didn't reach the disk, but the fs usually survives.

 

I only recommend btrfs for users with stable hardware, and ideally with ECC RAM.

Link to comment

Sorry, my bad. Here is option 2, and it asks for maybe a backup super?

root@mitchsserver:~# btrfs restore -v /dev/sdd1 /mnt/user/restore
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
Could not open root, trying backup super
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
Could not open root, trying backup super
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
Could not open root, trying backup super
root@mitchsserver:~# btrfs restore -vi /dev/sdd1 /mnt/user/restore
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
Could not open root, trying backup super
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
Could not open root, trying backup super
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
parent transid verify failed on 1543279591424 wanted 136563 found 130890
Ignoring transid failure
ERROR: root [1 0] level 1 does not match 0

Couldn't read tree root
Could not open root, trying backup super
root@mitchsserver:~# 

 

 

I did a reboot and the power errors aren't there, so I retried the commands. I still get the error above. I guess it just can't be fixed.

So what actually is a superblock, and a superblock generation?

And is there any way to notify me by email, text, or browser when a btrfs failure like this happens? I know I have notifications set, but I usually just get the green OK or an issue in Fix Common Problems, so I didn't know errors were happening.

So if XFS is better, does that mean I need to change all my servers from btrfs to XFS, or just the cache? I went with btrfs because Google searches say it's better, plus there's the bit rot protection. I've been running Unraid setups on btrfs for 4 years now and have only had it fail twice, on 2 different computers: 2 years ago when I created this thread, and a couple of days ago.

I was told in the Discord that for cache it's better to use XFS with 1 disk and btrfs with 2 disks, and that my arrays are fine using btrfs. Is that because I'm using parity drives? Well, at least this server doesn't have one. I just have 1 drive and it's btrfs, so I'm guessing that's a bad idea too.

Yeah, no ECC RAM in this server. It has been running for a year with no issues, so I figured it was stable. It's just a gaming micro ITX motherboard in a small form factor 4 bay hot-swap case.

I just didn't want bit rot, as you learn online that you don't want it. But then a cache pool isn't going to rot, right, as data won't stay on there long enough?

How come they don't have NTFS as a cache pool file system? Is that not reliable either? And here I thought btrfs was the best. Always learning and still know nothing lol

 

Link to comment
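On the notification question, one possible do-it-yourself approach (an assumption on my part, not a built-in Unraid feature) is a small script that scans the syslog for BTRFS error and warning lines and summarizes them per device. The sketch below uses lines copied from the log earlier in the thread as sample input; the function name `summarize_btrfs_problems` is hypothetical, and how the summary gets delivered (email, text, etc.) is left out.

```python
import re
from collections import defaultdict

# Sample lines copied from the syslog posted earlier in the thread.
SAMPLE = [
    "Jul  2 12:19:15 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890",
    "Jul  2 12:19:15 mitchsserver kernel: BTRFS error (device sde1): parent transid verify failed on 1543279591424 wanted 136563 found 130890",
    "Jul  2 12:19:15 mitchsserver kernel: BTRFS warning (device sde1): couldn't read tree root",
    "Jul  2 12:19:15 mitchsserver kernel: BTRFS error (device sde1): open_ctree failed",
    "Jul  2 12:19:15 mitchsserver emhttpd: /mnt/download_drive mount error: No file system",
]

# Matches "kernel: BTRFS error (device sde1): <message>" and the warning form.
BTRFS_MSG = re.compile(r"kernel: BTRFS (error|warning) \(device (\w+)\): (.+)$")


def summarize_btrfs_problems(lines):
    """Group distinct BTRFS error/warning messages by device name."""
    summary = defaultdict(set)
    for line in lines:
        match = BTRFS_MSG.search(line)
        if match:
            level, device, message = match.groups()
            summary[device].add((level, message))
    return summary


summary = summarize_btrfs_problems(SAMPLE)
for device, problems in summary.items():
    for level, message in sorted(problems):
        print(f"{device}: {level}: {message}")
```

Run on a schedule against the live syslog, something like this would flag transid errors the first time they appear rather than after the pool has stopped mounting.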
