help! drives show up as unformatted?!

March 24, 2009

I'm running 4.4.2 with 3x 1.5 TB Seagate + 1 TB Samsung drives. Everything was normal ... until this morning.

I tried to shut down my unRAID server and noticed that 2 drives show up as unformatted!?! Here's the syslog.

The machine is still up and running ... I'm afraid to shut it down until someone can clear things up. Please assist, I have no idea what went wrong ....

thanks!

March 24, 2009

DO NOT PRESS THE FORMAT BUTTON.

What can happen is that if files are left open or telnet sessions are left open on a volume when you stop the array, unRAID can get confused and show the drives as unformatted. They are not really unformatted. But if you press the format button unRAID WILL format them and you will lose your data.

The safest course is just to shut down the server. On reboot the drives should be properly recognized as formatted and a part of the array.

March 24, 2009

Author

Thanks for the reply. I guessed that the format button would be the worst choice ...

I also determined the problem: I did some testing and still had a root console open directly on the server. I closed that, stopped again, started ... and everything seems back to normal.

Shocking second!! Is this a bug? At least it should be documented somewhere ... I couldn't find anything.

March 24, 2009

I would recommend using the powerdown script. See here.

March 24, 2009

Author

Thanks, I will check that out ...

March 24, 2009

This bug REALLY needs squashed immediately.

This user was one mistaken mouse click away from massive data loss.

March 24, 2009

Author

I have to agree, it's pretty confusing even if you are an experienced user!

March 24, 2009

Tom, please squash this bug...

At the very least, perhaps include the powerdown package or minimally the rc.unRAID script.

This will alleviate the issue until the root cause has been resolved.

http://lime-technology.com/wiki/index.php?title=Powerdown_script

http://code.google.com/p/unraid-powercontrol/downloads/list

March 24, 2009

paste this command into the telnet session that is open:

cd /; fuser -muvk -9 /mnt/disk* /mnt/user* /dev/loop* /dev/md*

Then use the stop button on the array.

March 24, 2009

NAS and I rarely agree on much of anything unRAID related. But this particular bug does provide a new user with a loaded weapon to easily purge their data.

Once you know why this happens, it is very easy to avoid, but if you are a new user and see this happen, it is confusing at best. Having the powerdown command certainly gives a remedy, but it would not stop the knee-jerk reaction of immediately hitting the format button.

I do want to point out that people hitting the format button in these circumstances has been quite rare. I can only remember once. But clearly reporting a drive that contains a valid filesystem as unformatted is a bug - and certainly causes stress and anxious moments to anyone that has seen it.

Few of us have a real appreciation for how hard it would be to actually fix this the right way. But even a label change from "unformatted" to "in use or unformatted" would help tremendously. A person seeing that message would NOT be inclined to hit the format button IMO. Instead, they would seek out what to do and likely find a helpful wiki.

If someone comes across this thread having just accidentally hit the format button and wondering if they are totally screwed, I want to communicate that there is hope in recovering your data. JUST DO NOT START COPYING NEW DATA TO THE DRIVE. The reiserfsck tool does a very respectable job at recovering data, and I would think it would provide very good results after an accidental format.

March 25, 2009

How about an easy workaround: the format button should have a confirmation step (like some of the others, where you need to check an "are you sure" box before it will let you press it). At least that would stop the knee-jerk reaction from killing data. A user who doesn't try a reboot or the forums first could still kill his/her data, but at least it wouldn't take just one mouse click.

April 2, 2009

Hey all,

I have the same problem, I tried to post a new topic, but it wouldn't let me. Or does it take a few minutes for the topics to appear?

I am sort of stuck at what to do next. First of all there was a power cut and when I checked the unraid server it had a red dot beside the parity drive and beside disk 1, it also said disk1 was unformatted. So after reading the other thread about this I did a proper powerdown, replaced the sata cable and checked all other connections then rebooted. The drives still showed up with the red dots.

So this is the drive with the problems - disk1 device: pci-0000:00:08.0-scsi-0:0:0:0 (sdc) ata-WDC_WD5000AAKS-00YGA0_WD-WCAS82518183

I used the reiserfsck --check /dev/sdc1 command to check the drive. Is this correct? Well, it came back and said "no corruption found".

Didn't find any problems using the smartctl command either.

Reiserfsck did find problems on the parity drive, but I didn't do the --rebuild-tree that it advised. I am just leaving it until the other problem is solved.

I have included my syslog, but I had to rar it.

Can anyone advise what I should do next? Should I try a rebuild tree on the unformatted drive even though reiserfsck says it's OK?

April 2, 2009

Stop... do not do anything... at least until we can look at your syslog.

Do NOT format your drive.

Do NOT run reiserfsck on the parity drive. It does NOT contain a file-system. To try would invalidate any chance you have of recovery if a disk has actually failed.

Before I go look at what's up with your parity drive and the data drives, I need to ask, did you run reiserfsck with the fix-fixable option on the parity drive? or just a plain check?

The ONLY command you should run on the disks at this time to test them is smartctl.

Joe L.

April 2, 2009

Your syslog shows that disk1 has some corruption on its superblock. I would expect disk1 to show as "Unformatted" because of that.

Apr  2 18:34:30 Projectx kernel: ReiserFS: md1: warning: sh-2006: read_super_block: bread failed (dev md1, block 2, size 4096)
Apr  2 18:34:30 Projectx kernel: ReiserFS: md1: warning: sh-2006: read_super_block: bread failed (dev md1, block 16, size 4096)
Apr  2 18:34:30 Projectx kernel: ReiserFS: md1: warning: sh-2021: reiserfs_fill_super: can not find reiserfs on md1
Apr  2 18:34:30 Projectx emhttp: shcmd: shcmd (13): exit status: 32

Are you CERTAIN you did not swap cables with the parity drive? It might explain why there is no superblock on disk1.

Another of your disks /dev/sdd, is having all kinds of errors. It might have a bad or a loose SATA cable.

Apr  2 18:35:38 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x380000 action 0x6
Apr  2 18:35:38 Projectx kernel: ata4.00: BMDMA stat 0x24
Apr  2 18:35:38 Projectx kernel: ata4: SError: { 10B8B Dispar BadCRC }
Apr  2 18:35:38 Projectx kernel: ata4.00: cmd 25/00:08:3f:3c:6b/00:00:26:00:00/e0 tag 0 dma 4096 in
Apr  2 18:35:38 Projectx kernel:          res 51/84:00:46:3c:6b/84:00:26:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:35:38 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:35:38 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:35:38 Projectx kernel: ata4: hard resetting link
Apr  2 18:35:38 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Apr  2 18:35:39 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:35:39 Projectx kernel: ata4: EH complete
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:35:39 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x300000 action 0x6
Apr  2 18:35:39 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:35:39 Projectx kernel: ata4: SError: { Dispar BadCRC }
Apr  2 18:35:39 Projectx kernel: ata4.00: cmd 25/00:00:af:e3:d1/00:04:19:00:00/e0 tag 0 dma 524288 in
Apr  2 18:35:39 Projectx kernel:          res 51/84:df:d0:e3:d1/84:03:19:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:35:39 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:35:39 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:35:39 Projectx kernel: ata4: hard resetting link
Apr  2 18:35:39 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Apr  2 18:35:39 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:35:39 Projectx kernel: ata4: EH complete
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:35:39 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:35:40 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x300000 action 0x6
Apr  2 18:35:40 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:35:40 Projectx kernel: ata4: SError: { Dispar BadCRC }
Apr  2 18:35:40 Projectx kernel: ata4.00: cmd 25/00:00:ef:bc:d1/00:02:19:00:00/e0 tag 0 dma 262144 in
Apr  2 18:35:40 Projectx kernel:          res 51/84:ef:00:be:d1/84:00:19:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:35:40 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:35:40 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:35:40 Projectx kernel: ata4: hard resetting link
Apr  2 18:35:40 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Apr  2 18:35:40 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:35:40 Projectx kernel: ata4: EH complete
Apr  2 18:35:40 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:35:40 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:35:40 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:35:40 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:21 Projectx kernel: ata4: limiting SATA link speed to 1.5 Gbps
Apr  2 18:36:21 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x380000 action 0x6
Apr  2 18:36:21 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:36:21 Projectx kernel: ata4: SError: { 10B8B Dispar BadCRC }
Apr  2 18:36:21 Projectx kernel: ata4.00: cmd 25/00:00:97:b5:8d/00:02:12:00:00/e0 tag 0 dma 262144 in
Apr  2 18:36:21 Projectx kernel:          res 51/84:0f:88:b7:8d/84:00:12:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:36:21 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:21 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:21 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:22 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:22 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:36:22 Projectx kernel: ata4: EH complete
Apr  2 18:36:22 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:22 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:22 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:22 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:22 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x180000 action 0x6
Apr  2 18:36:22 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:36:22 Projectx kernel: ata4: SError: { 10B8B Dispar }
Apr  2 18:36:22 Projectx kernel: ata4.00: cmd 25/00:00:9f:a1:f7/00:02:2d:00:00/e0 tag 0 dma 262144 in
Apr  2 18:36:22 Projectx kernel:          res 51/84:6f:30:a3:f7/84:00:2d:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:36:22 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:22 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:22 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:22 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:22 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:36:22 Projectx kernel: ata4: EH complete
Apr  2 18:36:22 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x100000 action 0x6
Apr  2 18:36:22 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:36:22 Projectx kernel: ata4: SError: { Dispar }
Apr  2 18:36:22 Projectx kernel: ata4.00: cmd 25/00:00:9f:a1:f7/00:02:2d:00:00/e0 tag 0 dma 262144 in
Apr  2 18:36:22 Projectx kernel:          res 51/84:6f:30:a3:f7/84:00:2d:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:36:22 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:22 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:22 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:23 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:23 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:36:23 Projectx kernel: ata4: EH complete
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:23 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x300000 action 0x6
Apr  2 18:36:23 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:36:23 Projectx kernel: ata4: SError: { Dispar BadCRC }
Apr  2 18:36:23 Projectx kernel: ata4.00: cmd 25/00:00:9f:b3:f7/00:04:2d:00:00/e0 tag 0 dma 524288 in
Apr  2 18:36:23 Projectx kernel:          res 51/84:3f:60:b6:f7/84:01:2d:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:36:23 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:23 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:23 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:23 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:23 Projectx kernel: ata4.00: configured for UDMA/133
Apr  2 18:36:23 Projectx kernel: ata4: EH complete
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:23 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:25 Projectx kernel: ata4.00: limiting speed to UDMA/100:PIO4
Apr  2 18:36:25 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x180000 action 0x6
Apr  2 18:36:25 Projectx kernel: ata4.00: BMDMA stat 0x24
Apr  2 18:36:25 Projectx kernel: ata4: SError: { 10B8B Dispar }
Apr  2 18:36:25 Projectx kernel: ata4.00: cmd c8/00:28:df:47:6f/00:00:00:00:00/e5 tag 0 dma 20480 in
Apr  2 18:36:25 Projectx kernel:          res 51/84:00:06:48:6f/84:01:2d:00:00/e5 Emask 0x10 (ATA bus error)
Apr  2 18:36:25 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:25 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:25 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:26 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:26 Projectx kernel: ata4.00: configured for UDMA/100
Apr  2 18:36:26 Projectx kernel: ata4: EH complete
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:26 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x380000 action 0x6
Apr  2 18:36:26 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:36:26 Projectx kernel: ata4: SError: { 10B8B Dispar BadCRC }
Apr  2 18:36:26 Projectx kernel: ata4.00: cmd 25/00:00:e7:55:6f/00:04:05:00:00/e0 tag 0 dma 524288 in
Apr  2 18:36:26 Projectx kernel:          res 51/84:bf:28:59:6f/84:00:05:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:36:26 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:26 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:26 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:26 Projectx in.telnetd[1553]: connect from 192.168.2.93 (192.168.2.93)
Apr  2 18:36:26 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:26 Projectx kernel: ata4.00: configured for UDMA/100
Apr  2 18:36:26 Projectx kernel: ata4: EH complete
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:26 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr  2 18:36:26 Projectx kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x300000 action 0x6
Apr  2 18:36:26 Projectx kernel: ata4.00: BMDMA stat 0x25
Apr  2 18:36:26 Projectx kernel: ata4: SError: { Dispar BadCRC }
Apr  2 18:36:26 Projectx kernel: ata4.00: cmd 25/00:00:ef:65:6f/00:04:05:00:00/e0 tag 0 dma 524288 in
Apr  2 18:36:26 Projectx kernel:          res 51/84:5f:90:69:6f/84:00:05:00:00/e0 Emask 0x10 (ATA bus error)
Apr  2 18:36:26 Projectx kernel: ata4.00: status: { DRDY ERR }
Apr  2 18:36:26 Projectx kernel: ata4.00: error: { ICRC ABRT }
Apr  2 18:36:26 Projectx kernel: ata4: hard resetting link
Apr  2 18:36:27 Projectx kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr  2 18:36:27 Projectx kernel: ata4.00: configured for UDMA/100
Apr  2 18:36:27 Projectx kernel: ata4: EH complete
Apr  2 18:36:27 Projectx kernel: sd 4:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB)
Apr  2 18:36:27 Projectx kernel: sd 4:0:0:0: [sdd] Write Protect is off
Apr  2 18:36:27 Projectx kernel: sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Apr  2 18:36:27 Projectx kernel: sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
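
For anyone wanting a quick tally of the interface CRC errors in a log like the one above without reading every line, something along these lines might help. It is only a sketch: the function name count_icrc_errors is made up for illustration and is not part of unRAID, and the syslog path you pass in is up to you.

```shell
#!/bin/sh
# Count "error: { ICRC ... }" lines per ATA device in a saved syslog file.
# count_icrc_errors is a hypothetical helper name, not an unRAID command.
count_icrc_errors() {
    # $1: path to a syslog file (for example a copy saved from the server)
    awk '/: error: [{] ICRC/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^ata[0-9]+\.[0-9]+:$/) {
                     dev = $i
                     sub(/:$/, "", dev)   # strip the trailing colon
                     count[dev]++
                 }
         }
         END { for (dev in count) print dev, count[dev] }' "$1" | sort
}
```

Calling it with a syslog path prints one line per device followed by its error count. A device that keeps accumulating ICRC errors usually points at cabling rather than the disk surface, which matches the advice in this post.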

Time to ask you what you see on the management web-page. Which disks are marked as "red"? I suspect disk3 and perhaps disk1.

RobJ is far better than I am at figuring out what is going on by reading a syslog. Hopefully he'll chime in. In the interim, try re-seating the cable to disk3. With multiple disks having problems, you do not want to do anything without expert guidance. (And I'm not expert enough to guide you myself... I can only offer advice on what NOT to do.) Do NOT press "Restore", as it just resets to an initial configuration and throws away parity unless you take specific non-standard actions.

Do NOT add or delete drives to the configuration, nor move drives around, nor should you run reiserfsck on ANY drive at this point. It can only do harm by invalidating your parity. First let's see if you can get disk3 working by swapping its cable.... Then, with only disk1 out, there is a much higher chance of getting all your data on disk1 recovered.

Before you do anything, make a copy of your "config" folder on the flash drive to some other drive, even if only on your PC. It might be needed if things get really confused to get back to a known starting point.
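
If you prefer to make that copy from a telnet session, a rough sketch follows. The helper name backup_config is invented for this example, and /boot/config as the location of the flash drive's config folder is an assumption you should verify on your own server. The destination must be an existing directory that is not on one of the troubled array disks.

```shell
#!/bin/sh
# Archive a config folder into a timestamped tarball.
# backup_config is a hypothetical helper; both paths are supplied by the caller.
backup_config() {
    src=$1     # folder to save, e.g. /boot/config (assumed flash mount point)
    dest=$2    # existing directory on some other drive or network share
    stamp=$(date +%Y%m%d-%H%M%S)
    # -C changes into the parent directory so the archive holds "config/..."
    tar -czf "$dest/config-backup-$stamp.tar.gz" \
        -C "$(dirname "$src")" "$(basename "$src")"
}
```

The archive name carries a timestamp, so repeated runs do not overwrite an earlier known-good copy.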

Joe L.

April 2, 2009

Hi Joe,

Thanks for the help so far.

I have already run reiserfsck on disk 1 and the parity disk, that's SDC1. So it's too late. But I only ran the --check option.

The cables have not been switched around or anything like that. It was working perfectly before the power cut (which was yesterday), and it has been working that way for ages now.

There are only two disks showing up with problems: the parity disk (SDJ) and disk 1 (SDC).

I will check the cables and see if there are any loose. I will leave the box off until Rob or someone else chimes in.

OK, I am just back after replacing the cables to disk 1, parity and disk 3, just to make sure. I have rebooted and got another syslog.

April 3, 2009

Changing the cabling to Disk 3 worked, there are no more errors on it, and it is able to start with a SATA link up speed of 3.0 Gbps now. There is no evidence of any problems with the parity drive, so its red ball was probably for invalid parity once Disk 1 became invalid. Since you did not note any issues with Disk 3, I'm not sure *why* parity is invalid, since only one drive failed. However, the corruption in the superblock of Disk 1 had to have happened on the previous run, so perhaps there were other problems evident at that time too. What can you remember about the previous session, prior to the boot chronicled in your first posted syslog? Especially useful would be whatever you can remember about the shutdown.

I don't understand how reiserfsck could report "no corruption found" on Disk 1. Can you try one more time, with the following command?

reiserfsck  -y  /dev/sdc1

That ends in a 'one', not an 'el'. (sorry, just trying to avoid any chance of confusion)

I'd like to note one oddity, probably just a coincidence. Disk 1 and Disk 3 were set up together, and are connected together on the second pair of onboard nForce SATA ports, on ports 3 and 4. I can't see a connection, especially since there is no known way (at least to me) that a bad cable on one drive could possibly cause file system corruption on the drive that happened to be connected right next to it. But it would be very interesting to know what actually happened on the previous run, possibly corrupted driver or file system structures? Maybe related to the communication errors on the adjacent drive?

Just thinking ahead, don't know of a reason so far to not trust the parity drive, so long as you did not run anything but the '--check' parameter, and you did not see anything 'fixed' when it ran (problems reported but not fixed). So at some point, you will use the Trust My Array procedure to restore the parity drive to green. We really hope another run of reiserfsck will fix the superblock on Disk 1. Failing that, then we have to use the '--rebuild-sb' parameter, and that may cause additional problems to fix. Rebuilding the drive is an option, but I'm not sure it will help, because if the file system did not 'know' there was something wrong previously when it saved the corrupted superblock to the disk, then the parity was probably updated along with it. So rebuilding would only restore the same corrupted superblock.

April 3, 2009

Hi Rob,

Well, I am thinking about all that I did, and I can't really say I did anything else in the session before the prior syslog. I did a reiserfsck -check and a smartctl on disk1. I also did a reiserfsck --check on the parity drive.

Another thing that is happening on the webgui, there is a problem with the "stop" button. When I press "stop" it seems to be stopping, but it doesn't. All the drives stay green with two red ones (parity and disk1) But it says all the drives are unformatted, except disk 3. And no option comes up to reboot or poweroff the server.

I do a samba stop through telnet instead. And when I try to umount, it says disk1 is already unmounted. I check that every disk is unmounted and that samba is stopped. I try to access the server through the Windows file manager and it can't access any of the shares, so I know it's offline, but if I go into the webgui it says that the array is running.

Could there be a problem with my superdat file, or maybe the 4.4.2 files got corrupted in the power cut?

I ran the reiserfsck --y /dev/sdc1

It came back the same thing, no corruption found. Here is the smartctl results.

April 3, 2009

If you press "stop" and some drives stop but others show "unformatted", that is a symptom of open files on the drives that show "unformatted".

Log in with telnet and run this command to clear out all processes with open files:

cd /; fuser -muvk -9 /mnt/disk* /mnt/user* /dev/loop* /dev/md*

April 3, 2009

Hi Rob,

Well, I am thinking about all that I did, and I can't really say I did anything else in the session before the prior syslog. I did a reiserfsck -check and a smartctl on disk1. I also did a reiserfsck --check on the parity drive.

Please don't take this the wrong way, but if you do not give us all the clues we need to help, there is no way we will ever have the big picture needed. Clearly, your Linux skills are at a beginner level, and clearly you have read lots of posts where various commands have been used to help others... I can see you are trying to learn. It is obviously difficult for you to interpret what you are seeing, and us to know what you are ignoring.

With that said, it appears as if you did no harm, as the -check option does not write to the tested file-systems. But the parity drive never has a file-system. We have no idea (because you did not give the specific command you used) if you used the reiserfsck command on /dev/md1 or on /dev/sdc1. There is a HUGE difference if actually repairing the drive, even though they both point to the same file-system. One (/dev/mdX) will properly fix the drive AND keep parity in sync, the other (/dev/sdX1) will break parity and cause errors the next time a parity check is performed.
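
One way to avoid picking the wrong device by accident is a small guard in front of any repair command. The sketch below is only an illustration: is_array_device is a made-up helper name, and it does nothing more than check that the path looks like /dev/mdN. It cannot tell whether the array is actually started or whether the device exists.

```shell
#!/bin/sh
# Return success only for paths that look like unRAID array devices (/dev/mdN).
# is_array_device is a hypothetical helper for illustration.
is_array_device() {
    case "$1" in
        /dev/md[0-9]*) return 0 ;;   # e.g. /dev/md1: writes stay in sync with parity
        *)             return 1 ;;   # e.g. /dev/sdc1: writes would bypass parity
    esac
}
```

A wrapper script could call is_array_device on its argument and refuse to continue when it fails, so that a repair is never started against a raw /dev/sdX1 partition by mistake.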

Another thing that is happening on the webgui, there is a problem with the "stop" button. When I press "stop" it seems to be stopping, but it doesn't. All the drives stay green with two red ones (parity and disk1) But it says all the drives are unformatted, except disk 3. And no option comes up to reboot or poweroff the server.

That is a huge clue as to what is happening. When you press STOP the first thing that normally happens is that the samba process is stopped, the file-systems un-mounted, and then the array stopped.

The array cannot be stopped unless all the file-systems under its control can be un-mounted. Because you have one file-system (on disk3) that was not un-mounted, the array could not stop. Therefore, you were not presented with a powerdown or reboot option because the array was not stopped.

Now, when this happens, those file-systems that have already been un-mounted are unfortunately shown as "unformatted". This has been reported as a major flaw in the user-interface logic... Hopefully it will be more clear in some future version of unRAID, but for now, we just need to understand why disk3 was not able to be un-mounted. Once it can be, the stop button will be able to stop the array and you will get the powerdown and reboot buttons.

There are two main reasons a disk cannot be un-mounted. They are:

A file on the disk is open.

A process has its "current directory" on the disk. (If you log in, "cd" to /mnt/disk3/ then your command shell's "current directory" is on disk3 and it cannot be un-mounted until you "cd" elsewhere. If you "cd" to disk3 and then start a process running, its "current directory" is disk3 and disk3 cannot be un-mounted.)

So... something has disk3 busy... very likely either an open file, or a process using it as the current directory.

Basically, to get the process that has the file open you can type:

/usr/bin/fuser -mv /mnt/disk* /mnt/user/*

The fuser command will list the process involved and the PID (numeric process ID)

To kill the specific process you can usually use the following command:

kill PID

(where PID is the numeric process ID)
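
To make the "current directory" case more concrete, here is a rough sketch that looks for processes whose working directory sits on a given mount point by reading /proc directly. It is not a replacement for fuser, which also catches open files. The function name procs_with_cwd_under is invented for this example, and it assumes a Linux /proc filesystem and a readlink command are available.

```shell
#!/bin/sh
# List "PID directory" for every process whose current directory is at or
# below the given path. procs_with_cwd_under is a hypothetical helper name.
procs_with_cwd_under() {
    # $1: directory to check, e.g. /mnt/disk3
    target=$(cd "$1" && pwd -P) || return 1
    for link in /proc/[0-9]*/cwd; do
        # Skip processes that have exited or that we may not inspect
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$target"|"$target"/*)
                p=${link#/proc/}
                p=${p%/cwd}
                echo "$p $cwd"
                ;;
        esac
    done
}
```

Any PID it prints belongs to a process that would keep the disk from being un-mounted until it exits or changes directory. Run it as root, otherwise other users' processes are silently skipped.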

I do a samba stop through telnet instead.

That will usually stop SAMBA and close files open from the LAN shares. Unfortunately, the web-interface is not aware of your action, so it is basically useless to properly manage the array from that point until it is rebooted.

And when I try to umount, it says disk1 is already unmounted.

Yup, as described above, all the disks have already been un-mounted... all except disk3.

I check that every disk is unmounted and that samba is stopped.

I'm sure samba is stopped... you stopped it.

I try to access the server through the Windows file manager and it can't access any of the shares, so I know it's offline, but if I go into the webgui it says that the array is running.

The web-interface process is NOT designed for you to perform actions behind its back. As I already said... it is simply out of sync.

could there be a problem with my superdat file or maybe the 4.4.2 files got corrupt in the power cut?

Not too likely that anything is "corrupt"... It may need to be brought back into sync, as it thinks disk1 has failed, and it will not let that same disk back into the array without specific actions that are non-standard. Fortunately, that is fairly easy.

I ran the reiserfsck --y /dev/sdc1

It came back the same thing, no corruption found. Here is the smartctl results.

Good that the file-system is sound. The smart report seems to show communications with the drive is working.

Joe L.

April 3, 2009

If you press "stop" and some drives stop but others show "unformatted", that is a symptom of open files on the drives that show "unformatted".

Log in with telnet and run this command to clear out all processes with open files:
cd /; fuser -muvk -9 /mnt/disk* /mnt/user* /dev/loop* /dev/md*

Close... all the drives that show as "unformatted" were able to be un-mounted. Those that do NOT show as un-formatted are those with the open files. In the case he described, disk3 could not be un-mounted.

The command you gave will kill all processes on all disks, so it will work regardless.

After running the "fuser" command given here, you can again use the "Stop" button on the web-interface. It should now work as appropriate. You should then reboot.

Or, you can use the "fuser" command in my prior post and kill the specific process so you know what to look out for in the future. And then use the "Stop" button.

Whatever you do, do NOT use the "Format" button when those disks show incorrectly as "Unformatted" as it will do as you ask and most of your disks will be re-formatted. (and your files gone)
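
Since the disks shown as "Unformatted" here are the ones that were un-mounted, one way to see what is really going on is to look at the kernel's mount table rather than the web-interface. The sketch below is only an illustration: is_mounted and report_disks are hypothetical helper names, and the disk count of four is an assumption.

```shell
# Sketch only: report whether a mount point is listed in the mount table.
# is_mounted is a hypothetical helper; the optional second argument lets you
# point it at a saved copy of the table instead of /proc/mounts.
is_mounted() {
    target=$1
    table=${2:-/proc/mounts}
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$table"
}

# Show the state of disk1..disk4 (the number of disks is an assumption).
report_disks() {
    for n in 1 2 3 4; do
        if is_mounted "/mnt/disk$n" "$1"; then
            echo "disk$n mounted"
        else
            echo "disk$n not mounted"
        fi
    done
}
```

Running report_disks with no argument reads /proc/mounts. A disk that is still listed there after pressing "Stop" is the one to examine with fuser.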

Joe L.

April 3, 2009

And no option comes up to reboot or poweroff the server.

There is a nice script written by WeeboTech called "powerdown" that is very useful in situations like this. It may not fix the problem you are having, but it will, as cleanly as possible, shutdown the server.

There is a link in the "Best of the Forums" unRAID Addons and Tools section (near the middle).

April 3, 2009

Hi Joe,

What clues are you talking about? I have said everything I have done and have posted my syslogs and the report from using the smartctl command. I don't know what more information I can give. I didn't do anything other than the reiserfsck --check and the smartctl. I replaced the sata cables going to the hard drives. And I did give the specific command I used. Here it is again:

reiserfsck --check /dev/sdc1

I used the exact same command for checking the parity disk.

RobJ's post below told me to try

reiserfsck --y /dev/sdc1

Which I did, but, it found no corruptions either.

Disk1 has a red light (which is /sdc1) the parity disk has an orange light.

What more information do you need Joe? I have a disk that seems to be working perfectly according to all the checks. But, it still has a red light beside it and comes up as unformatted. How do I fix this?

As for the array not stopping, I have since discovered that if I wait a few minutes after booting the unraid server then try the stop command it works perfectly. I also understand about the format bug, because I read about it earlier in this thread. I did not press format or anything like that. This is not my concern at all. All the commands work from the telnet session, so I know everything is stopped before I do a reboot or check any drive for errors.

At the moment the array is up and running, there is no data loss as far as I can see. I tested watching two 1080p HD movies on two different computers in my house and they both played fine. I can copy files to the server from my computer with write speeds of 30MB/s.

April 3, 2009

Hi Joe,

What clues are you talking about?

Specifically, that the indicator next to the parity drive is ORANGE, and that you stopped the smbd process using a telnet command and that you were unable to use the "Stop" command because you have disks that cannot be un-mounted resulting in all those that could be un-mounted showing up as un-formatted.

In fact, you had said in your first post:

when I checked the unraid server it had a red dot beside the parity drive and beside disk 1, it also said disk1 was unformatted. So after reading the other thread about this I did a proper powerdown, replaced the sata cable and checked all other connections then rebooted. The drives still showed up with the red dots.

At the moment the array is up and running, there is no data loss as far as I can see. I tested watching two 1080p HD movies on two different computers in my house and they both played fine. I can copy files to the server from my computer with write speeds of 30MB/s.

I'm glad you are running. Apparently, disk1 at one point was not able to be written to. It might have even been before your power failure... no way to tell unless you had looked at the management page just before the outage. When that write-error occurred, disk1 was taken off-line and subsequently its contents were simulated by reading the parity disk and all the other disks in your array. This is the really neat part of a raid array... you can still use it when a disk has failed.
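
To make the "simulated" disk less mysterious, here is a toy illustration of how a missing disk's contents can be recomputed from parity and the surviving disks, assuming simple XOR parity. The byte values are made up and xor_all is a hypothetical helper; a real array does the same thing bit by bit across whole disks.

```shell
# Toy illustration of single-parity reconstruction with XOR.
# The byte values are invented for the example.
xor_all() {
    acc=0
    for v in "$@"; do
        acc=$((acc ^ v))
    done
    echo "$acc"
}

disk1=165; disk2=60; disk3=15          # one byte from each data disk
parity=$(xor_all "$disk1" "$disk2" "$disk3")

# disk1 goes off-line: rebuild its byte from parity plus the surviving disks.
simulated_disk1=$(xor_all "$parity" "$disk2" "$disk3")
echo "parity=$parity simulated_disk1=$simulated_disk1"
```

The simulated value matches the original disk1 byte, which is why every other disk has to be readable while one disk is being simulated.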

Now, unRAID takes the very conservative approach once a write failure occurs. It will not use that disk again, even if it works on the next reboot. You must take one of several actions.

One way is to replace the drive with another with a different serial number. Typically, this is what you would do if a drive fails, or if you are upgrading it to one of a bigger size.

A second way to proceed is to un-assign the disk, reboot with it un-assigned, then re-assign it. The action of un-assigning it is exactly the same as it being un-available... as in broken... it will be simulated by parity and the other data drives... When you re-assign it after a reboot (which causes the array to forget the serial number of the drive you un-assigned), it would use parity and the other drives to rebuild the newly re-assigned drive. To it, it is then a "new" drive and it would be rebuilt. This with your array would take a great number of hours, during which you would not be protected from another failure until it was completely rebuilt and back online. For many years, this was the only way to get a drive to be re-recognized.

Now, for your situation, there is a third method that is better since you will still have parity protection while it checks the array. It ONLY can be used when all the disks last used to calculate parity are present and working. That third method to get a drive that has had a loose cable back on-line is described here in the wiki:
http://lime-technology.com/wiki/index.php/Make_unRAID_Trust_the_Parity_Drive,_Avoid_Rebuilding_Parity_Unnecessarily
It basically forces the array to think that no existing drive is invalid (off-line). I know it says not to use it with a disabled disk... in your situation, since you are reasonably certain disk1 is a working disk, and we are fairly sure all the other disks are OK, we can use it to force the array to come back on-line with it.

Basically, you check the checkbox next to the "Restore" button and press it, but DO NOT start the array. You should see BLUE indicators next to ALL of your drives, including disk1. If you do not, do NOT proceed. Post back here with what you do see... attaching a new syslog.
Then, at the telnet command line (or on the system console), type the command as shown in the wiki.
Then, only after seeing the response shown in the wiki you can press the start button to start the array. A full parity check will begin, and you should let it run to completion.

As I already said, since you did have a power outage, I would let the resulting parity check run to completion... It might find a few errors as a result of the abrupt power loss. If it does, run another parity check, it should come up clean. With any luck you will not see any of the errors I saw at the end of your initial syslog attachment on disk3.

If you have any questions about how to proceed, ask here first before you do anything more... Odds are all will go smoothly... Do check that you are not seeing any other errors in the syslog before you proceed with the above procedure.
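
For the syslog check just mentioned, a minimal sketch might look like the following. scan_syslog is a hypothetical helper, and both the default path /var/log/syslog and the search pattern are assumptions; a clean result from it is no guarantee that the log is free of problems.

```shell
# Sketch only: print syslog lines that mention errors or failures.
# scan_syslog is a hypothetical helper; the default path and the
# pattern are assumptions, so adjust both to suit.
scan_syslog() {
    log=${1:-/var/log/syslog}
    grep -iE 'error|fail' "$log"
}

# Possible use, showing only the most recent matches:
#   scan_syslog | tail -n 20
```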

Joe L.

April 3, 2009

Joe's mention of a power failure prompted me to go back to your first post, and I have to apologize, I completely missed your mention of a power cut, which explains the damage. That is what I was looking for, and I was unclear in how I asked, as to what had happened in the previous session when the damage occurred at shutdown, prior to the appearance of the red ball.

A power outage, and a power restart for that matter, are often accompanied by a large and possibly damaging electrical spike, and it appears you were hit pretty hard at a very bad time, during a write to the superblock area. We are still left with the problem of fixing that. Even after rebooting and running reiserfsck, the superblock still appears damaged, unable to be mounted. I really don't understand how reiserfsck can find no corruption, when the mounting of the Reiser file system on the drive fails. My advice to run reiserfsck on sdc1 was based on my (mistaken) thought that you could not start the array. Since you can, would you mind running it one more time, but this time on md1, exactly as instructed in the Check Disk Filesystems page? Unfortunately, I'm not positive this will work either, as it may try to check the virtual and reconstructed Disk 1, not the physical Disk 1.

Now that I know it is power outage related damage, you can ignore my other comments about rebuilding the drive being problematic.
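
To keep the two device names straight, here is a small sketch that builds the read-only check command for an array disk number without running it. check_cmd is a hypothetical helper; it only reflects the /dev/mdN naming used in this thread, and the Check Disk Filesystems page remains the reference for the actual procedure.

```shell
# Sketch only: build (but do not run) the read-only check command
# for array disk N. check_cmd is a hypothetical helper.
check_cmd() {
    case $1 in
        ''|*[!0-9]*) echo "usage: check_cmd DISK_NUMBER" >&2; return 1 ;;
    esac
    echo "reiserfsck --check /dev/md$1"
}

check_cmd 1
```

For disk 1 this prints the md1 form of the command, as opposed to the /dev/sdc1 form used earlier.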

April 3, 2009

We do not know if the superblock for /dev/md1 is bad at all... in fact, odds are as good as any it is OK. All we know is reiserfsck finds no problems. The only way to know for sure is to try to mount the drive... but until unRAID decides to use it, we really don't know. The first syslog he posted was before he swapped out the SATA cables. A new syslog might be in order. If it is still failing to mount the drive, that is one thing. If it is just marked as bad, another thing entirely.

I tend to think reiserfsck would find an error if one existed. Therefore, I still think the "Trust" procedure is the best route. If the drive on disk1 is still defective, it will just be removed from service once more.

The process of pressing "restore" re-names the config/super.dat file to super.old, and then rebuilds it new from the currently assigned, working, and configured drives. It would eliminate any source of corruption in config/super.dat as it would be entirely new at that point.
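
If you would like a copy of your own before the rename happens, a sketch along these lines could be used. backup_super_dat is a hypothetical helper, and /boot/config as the location of the config directory is an assumption.

```shell
# Sketch only: keep a dated copy of super.dat before pressing "Restore".
# backup_super_dat is a hypothetical helper; /boot/config as the location
# of the config directory is an assumption.
backup_super_dat() {
    dir=${1:-/boot/config}
    [ -f "$dir/super.dat" ] || { echo "no super.dat in $dir" >&2; return 1; }
    copy="$dir/super.dat.$(date +%Y%m%d-%H%M%S)"
    # Preserve timestamps and print the name of the copy on success.
    cp -p "$dir/super.dat" "$copy" && echo "$copy"
}
```

The function prints the name of the copy it made, so you know which file to look for later.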

Your thoughts RobJ? My instinct is to trust reiserfsck.

A new copy of the syslog might offer a few clues, before any new actions are taken.

Joe L.
