myndphunkie

February 19, 2010

Yep, PEBKAC it is then :-)

I think you're right about the data rebuild process. Basically, I had moved files (which would have gone to disk3, as it had the most free space), found an error, pressed restore (even though it said disk contents are not affected), and did a parity sync.

Lesson learned.

I'll check the cables out on the weekend.

As for the error message(s), dmesg said it was unable to identify the interface. This is the same message in the Wiki:

"ata7: hard resetting link

ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

ata7.00: qc timeout (cmd 0xec)

ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)

ata7.00: revalidation failed (errno=-5)

ata7: failed to recover some devices, retrying in 5 secs"
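To see which ports are affected and how often, messages like these can be tallied per ATA port. This is only a minimal sketch, assuming kernel log lines shaped like the ones quoted above; `count_identify_failures` is a made-up helper name, not something unRAID ships.

```shell
#!/bin/sh
# Tally "failed to IDENTIFY" messages per ATA port from kernel log text on stdin.
# Hypothetical helper; assumes lines shaped like
# "ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)".
count_identify_failures() {
    awk '
        /failed to IDENTIFY/ {
            for (i = 1; i <= NF; i++) {
                # Match tokens such as "ata7.00:" and keep only the port ("ata7").
                if ($i ~ /^ata[0-9]+(\.[0-9]+)?:$/) {
                    port = $i
                    sub(/[.:].*$/, "", port)
                    count[port]++
                    break
                }
            }
        }
        END { for (p in count) print p, count[p] }
    ' | sort
}
```

It could be fed with `dmesg | count_identify_failures`; a port that keeps showing up is the one whose cable and power connector are worth checking first.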

February 18, 2010

Hi Guys,

From the Wiki (somewhere, I can't find it now), I think I have a faulty power cable or SATA cable but would like some confirmation before wriggling / replacing things.

I am currently running unraid 4.4.2 on a full slackware distribution. Up until recently, I have had no real issues until my newest drive started showing errors. Unraid has marked the drive with a red circle.

Now, I've run short and long S.M.A.R.T. tests several times, and there are 0 issues. So, I pressed the restore button, did a parity sync and all was fine for a few days.

It was after this I noticed that some of my files may have disappeared and I didn't think it was PEBKAC.

A few days later, the same issue again. So, I went in the same circle again - and lost some data - again.

To make it easier, I'll post some stats:

Drive:

1TB - ata-ST31000528AS_6VP1PBAY (Disk 3)

Smart Report:

Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===

Device Model: ST31000528AS

Serial Number: 6VP1PBAY

Firmware Version: CC37

User Capacity: 1,000,204,886,016 bytes

Device is: Not in smartctl database [for details use: -P showall]

ATA Version is: 8

ATA Standard is: ATA-8-ACS revision 4

Local Time is: Fri Feb 19 04:17:18 2010 CST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status: (0x82) Offline data collection activity

was completed without error.

Auto Offline Data Collection: Enabled.

Self-test execution status: ( 245) Self-test routine in progress...

50% of test remaining.

Total time to complete Offline

data collection: ( 600) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 1) minutes.

Extended self-test routine

recommended polling time: ( 180) minutes.

Conveyance self-test routine

recommended polling time: ( 2) minutes.

SCT capabilities: (0x103f) SCT Status supported.

SCT Feature Control supported.

SCT Data Table supported.

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x000f 119 099 006 Pre-fail Always - 226697282

3 Spin_Up_Time 0x0003 097 095 000 Pre-fail Always - 0

4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 361

5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0

7 Seek_Error_Rate 0x000f 066 060 030 Pre-fail Always - 4873256

9 Power_On_Hours 0x0032 097 097 000 Old_age Always - 3028

10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0

12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 164

183 Unknown_Attribute 0x0032 099 099 000 Old_age Always - 1

184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0

187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0

188 Unknown_Attribute 0x0032 100 099 000 Old_age Always - 100

189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0

190 Airflow_Temperature_Cel 0x0022 071 059 045 Old_age Always - 29 (Lifetime Min/Max 27/29)

194 Temperature_Celsius 0x0022 029 041 000 Old_age Always - 29 (0 19 0 0)

195 Hardware_ECC_Recovered 0x001a 037 023 000 Old_age Always - 226697282

197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0

240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 144976621079867

241 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 3589836420

242 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 630453533

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short offline Self-test routine in progress 50% 3028 -

# 2 Short offline Completed without error 00% 2801 -

# 3 Extended offline Completed without error 00% 2711 -

# 4 Short offline Completed without error 00% 2699 -

SMART Selective self-test log data structure revision number 1

SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

1 0 0 Not_testing

2 0 0 Not_testing

3 0 0 Not_testing

4 0 0 Not_testing

5 0 0 Not_testing

Selective self-test flags (0x0):

After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

(This seems to be a perfect drive?)
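For a quick read of a report like this, the attributes usually checked first are 5, 197 and 198 (sector problems on the disk itself) and 199 (CRC errors on the link, which tend to point at cabling). A minimal sketch that pulls just those raw values out of the attribute table; `smart_key_attributes` is a hypothetical helper, and it assumes the 10-column table layout shown above.

```shell
#!/bin/sh
# Print the raw values of SMART attributes 5, 197, 198 and 199 from
# smartctl attribute-table text on stdin.
# Hypothetical helper; assumes the 10-column table layout shown above.
smart_key_attributes() {
    awk '
        # Attribute rows start with a numeric ID and carry a 0x... flag in
        # column 3; the raw value is column 10.
        $1 ~ /^[0-9]+$/ && NF >= 10 && $3 ~ /^0x[0-9a-f]+$/ &&
        ($1 == 5 || $1 == 197 || $1 == 198 || $1 == 199) {
            print $2 "=" $10
        }
    '
}
```

Something like `smartctl -A /dev/sdX | smart_key_attributes` would do (the device name is a placeholder). For the report above all four values come out as 0.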

/var/log/messages:

Feb 14 07:49:06 TANK kernel: ata9: hard resetting link

Feb 14 07:49:08 TANK kernel: ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

Feb 15 04:47:41 TANK kernel: sdk:md: disk3 read error

Feb 15 04:47:42 TANK kernel: pe read error: 1205131344/3, count: 1

Feb 15 04:47:43 TANK kernel: pe read error: 1205139208/3, count: 1

Feb 15 04:47:43 TANK kernel: pe read error: 1205139216/3, count: 1

Feb 15 04:47:43 TANK kernel: pe read error: 1205139248/3, count: 1

Feb 15 04:47:43 TANK kernel: pe read error: 1205139256/3, count: 1

Feb 15 04:47:43 TANK kernel: pe read error: 1205139264/3, count: 1

Feb 17 22:27:02 TANK kernel: scsi 9:0:0:0: Direct-Access ATA ST31000528AS CC37 PQ: 0 ANSI: 5

Feb 17 22:27:02 TANK kernel: sd 9:0:0:0: [sdi] 1953525168 512-byte hardware sectors (1000205 MB)

Feb 17 22:27:02 TANK kernel: sd 9:0:0:0: [sdi] Write Protect is off

Feb 17 22:27:02 TANK kernel: sd 9:0:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

October 13, 2009

Has anyone tried unmounting / stopping the filesystem as the cacher disk is moving files to the array? If my theory is correct, it would unmount all disks except the one in use.

October 6, 2009

Weird... I had an issue that was/is similar..

I'm running on a full slackware install with parity + cache. I got the disk full message even though there was plenty of free space.

The shortish story: I was visually watching files disappear after unzipping them.. I had no idea where they went until my root partition was full (100%!). All the missing files ended up in the black hole of /mnt/user0/<folders>. Each of the shares that had the issue were set to high-water with no split level anything.

I've now switched these back to 'most-free' and will see how it goes.

I think I just found another cause behind this... When I pressed 'stop' to stop the array, the cache mover was still moving files to disk8. As it couldn't 'unmount' the folder of /mnt/user0, I'm guessing that upon reboot it won't re-mount it.

This would cause my /mnt/user0 to fill up quickly - in this case, it's an actual drive and not my flash drive.
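One way to catch this before the root partition fills up is to check whether the path really is a mount point before writing to it. A rough sketch, not anything unRAID provides; `is_mount_point` is a hypothetical helper that reads `/proc/mounts` (or a saved copy passed as a second argument).

```shell
#!/bin/sh
# Succeed only if the given directory is listed as a mount point.
# Hypothetical helper; the second argument defaults to /proc/mounts and
# exists so the check can be tried against a saved copy of that file.
is_mount_point() {
    awk -v target="$1" '
        $2 == target { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}
```

For example: `is_mount_point /mnt/user0 || echo "/mnt/user0 is not mounted"`. If it is not mounted, anything written there lands on whatever filesystem holds the plain directory.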

September 8, 2009

You didn't check the cables and still don't know why the drive was taken out of service?

I believe the UDMA errors could be cables. I have one SATA drive that shows a high number after I had a cable problem.

Peter

Hi Peter,

No, I didn't check the cables. The UDMA errors were from a long time ago (see http://lime-technology.com/forum/index.php?topic=3021.0 where the count is exactly the same). I believe the drive was taken out of service because of the multiple power failures and my laziness in checking the management page afterwards as this was around the same time as the problem began.

According to the syslogs, it actually started directly after a reboot - not in the middle of a 'powered up session' (if that makes sense).

Cheers

Edit: The 'long' S.M.A.R.T. test passed without any errors.

September 7, 2009

Absolutely fantastic response, thanks bjp999.

I decided to rebuild onto the original drive and your theory does make sense. I'm confident that the errors are from a long time ago and not from recently. As I now recall from the logs, this issue started around the same day we had many power failures - each time a parity check was in progress. It has taught me to check the unraid page more often though - if it wasn't for the weird network issues, I probably wouldn't have noticed for a few more days (which is scary if I had had another failure!)

I actually do a parity check at least 2-3 times per month, so I'm confident there aren't any *real* issues.

The rebuild has now finished, and these are the results:

Last checked on 9/7/2009 7:52:08 PM, finding 0 errors.)

I went out and bought a 1TB drive to replace it anyway, so I'm going to add that to the array as I was running out of free space!

I'll run a long test now.

September 7, 2009

nb: SMART errors were from an older issue (bad cable):

http://lime-technology.com/forum/index.php?topic=3021.0

If I'm right, it only shows errors at 438 days but it has been powered on for 610 days. Hoping someone can diagnose the smart report before I press the 'rebuild' option

Cheers
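The day counts can be checked from the hour figures in the smartctl report posted in this thread (10525 hours at the last logged error, 14645 for Power_On_Hours). A small sketch; `hours_to_days` is just a hypothetical helper name.

```shell
#!/bin/sh
# Convert a SMART power-on-hours figure into "D days + H hours".
hours_to_days() {
    echo "$(( $1 / 24 )) days + $(( $1 % 24 )) hours"
}
```

`hours_to_days 10525` gives 438 days + 13 hours and `hours_to_days 14645` gives 610 days + 5 hours, so the last logged error is 4120 power-on hours (171 days + 16 hours) old.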

September 7, 2009

I feel a little bit silly right about now.

It just occurred to me that I had never powered off the server, I had only rebooted it.

I powered down cleanly, and powered back up, and now unraid says:

"Stopped. Disabled disk replaced." (which I haven't yet)

The drive is now visible, but I can see this:

root@TANK:~# smartctl -a -d ata /dev/hdb

Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===

Model Family: Seagate Barracuda 7200.10 family

Device Model: ST3500630A

Serial Number: 9QG1TV5X

Firmware Version: 3.AAE

User Capacity: 500,107,862,016 bytes

Device is: In smartctl database [for details use: -P show]

ATA Version is: 7

ATA Standard is: Exact ATA specification draft version not indicated

Local Time is: Mon Sep 7 14:29:48 2009 CST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status: (0x82) Offline data collection activity

was completed without error.

Auto Offline Data Collection: Enabled.

Self-test execution status: ( 0) The previous self-test routine completed

without error or no self-test has ever

been run.

Total time to complete Offline

data collection: ( 430) seconds.

Offline data collection

capabilities: (0x5b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

No Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 1) minutes.

Extended self-test routine

recommended polling time: ( 163) minutes.

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x000f 115 082 006 Pre-fail Always - 89052350

3 Spin_Up_Time 0x0003 092 092 000 Pre-fail Always - 0

4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2743

5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 6

7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail Always - 465126290

9 Power_On_Hours 0x0032 084 084 000 Old_age Always - 14645

10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0

12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 233

187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0

189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0

190 Airflow_Temperature_Cel 0x0022 061 048 045 Old_age Always - 39 (Lifetime Min/Max 39/39)

194 Temperature_Celsius 0x0022 039 052 000 Old_age Always - 39 (0 17 0 0)

195 Hardware_ECC_Recovered 0x001a 090 052 000 Old_age Always - 80496754

197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 102

200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0

202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0

SMART Error Log Version: 1

ATA Error Count: 101 (device log contains only the most recent five errors)

CR = Command Register [HEX]

FR = Features Register [HEX]

SC = Sector Count Register [HEX]

SN = Sector Number Register [HEX]

CL = Cylinder Low Register [HEX]

CH = Cylinder High Register [HEX]

DH = Device/Head Register [HEX]

DC = Device Command Register [HEX]

ER = Error register [HEX]

ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 101 occurred at disk power-on lifetime: 10525 hours (438 days + 13 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

84 51 00 00 00 00 e0 Error: ICRC, ABRT at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

25 00 08 c7 87 5d e0 00 08:03:17.347 READ DMA EXT

25 00 08 c7 87 5d e0 00 08:03:16.905 READ DMA EXT

10 00 3f 00 00 00 e0 00 08:03:16.905 RECALIBRATE [OBS-4]

25 00 08 c7 87 5d e0 00 08:03:16.463 READ DMA EXT

25 00 08 c7 87 5d e0 00 08:03:16.023 READ DMA EXT

Error 100 occurred at disk power-on lifetime: 10525 hours (438 days + 13 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

84 51 00 00 00 00 e0 Error: ICRC, ABRT at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

25 00 08 c7 87 5d e0 00 08:03:14.192 READ DMA EXT

10 00 3f 00 00 00 e0 00 08:03:16.905 RECALIBRATE [OBS-4]

25 00 08 c7 87 5d e0 00 08:03:16.905 READ DMA EXT

25 00 08 c7 87 5d e0 00 08:03:16.463 READ DMA EXT

c6 00 10 00 00 00 e0 00 08:03:16.023 SET MULTIPLE MODE

Error 99 occurred at disk power-on lifetime: 10525 hours (438 days + 13 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

84 51 00 00 00 00 e0 Error: ICRC, ABRT at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

25 00 08 c7 87 5d e0 00 08:03:14.192 READ DMA EXT

25 00 08 c7 87 5d e0 00 08:03:14.172 READ DMA EXT

c6 00 10 00 00 00 e0 00 08:03:14.162 SET MULTIPLE MODE

00 00 40 00 00 00 00 06 08:03:16.463 NOP [Abort queued commands]

ef 03 40 00 00 00 e0 02 08:03:16.023 SET FEATURES [set transfer mode]

Error 98 occurred at disk power-on lifetime: 10525 hours (438 days + 13 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

84 51 00 00 00 00 e0 Error: ICRC, ABRT at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

25 00 08 c7 87 5d e0 00 08:03:14.192 READ DMA EXT

c6 00 10 00 00 00 e0 00 08:03:14.172 SET MULTIPLE MODE

00 00 40 00 00 00 00 06 08:03:14.162 NOP [Abort queued commands]

ef 03 40 00 00 00 e0 02 08:03:14.152 SET FEATURES [set transfer mode]

25 00 08 c7 87 5d e0 00 08:03:16.023 READ DMA EXT

Error 97 occurred at disk power-on lifetime: 10525 hours (438 days + 13 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

84 51 00 00 00 00 e0 Error: ICRC, ABRT at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

25 00 08 c7 87 5d e0 00 08:03:14.192 READ DMA EXT

25 00 08 c7 87 5d e0 00 08:03:14.172 READ DMA EXT

10 00 3f 00 00 00 e0 00 08:03:14.162 RECALIBRATE [OBS-4]

25 00 08 c7 87 5d e0 00 08:03:14.152 READ DMA EXT

25 00 08 c7 87 5d e0 00 08:03:14.141 READ DMA EXT

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Extended offline Completed without error 00% 12818 -

# 2 Short offline Completed without error 00% 10432 -

SMART Selective self-test log data structure revision number 1

SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

1 0 0 Not_testing

2 0 0 Not_testing

3 0 0 Not_testing

4 0 0 Not_testing

5 0 0 Not_testing

Selective self-test flags (0x0):

After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

Is it worth replacing it?

September 7, 2009

Fixing this type of thing quickly is a priority!

I'm heading out the door now to buy a new drive :-)

It is unfortunate if you rebooted the server without taking a screenshot and capturing a full syslog. Every time you reboot the syslog is completely refreshed and hints of what caused events like this are lost. All we know at this point is that unRAID removed the disk from the array. The best way to know if the disk is good or bad is to look at its smart report. (For more info. go to the troubleshooting link in my sig and read about smartctl).

I'm running a full slackware install, all the syslog files are rotated so I still have access to them :-)

ls -l /dev/disk/by-id

Unfortunately, the drive doesn't show up here :-(

Opening up the magic syslog shows this:

Aug 31 11:30:24 TANK kernel: hdb: dma_timer_expiry: dma status == 0x61

Aug 31 11:30:34 TANK kernel: hdb: DMA timeout error

Aug 31 11:30:34 TANK kernel: hdb: dma timeout error: status=0xd0 { Busy }

Aug 31 11:30:34 TANK kernel: ide: failed opcode was: unknown

Aug 31 11:31:04 TANK kernel: ide0: reset: master: passed; slave: failed

Aug 31 11:31:05 TANK kernel: hdb: status error: status=0x00 { }

Aug 31 11:31:35 TANK kernel: end_request: I/O error, dev hdb, sector 401820735

Aug 31 11:31:35 TANK kernel: md: disk3 read error

Aug 31 11:31:35 TANK kernel: handle_stripe read error: 401820672/3, count: 1

Aug 31 11:31:36 TANK kernel: end_request: I/O error, dev hdb, sector 401820743

Aug 31 11:31:36 TANK kernel: end_request: I/O error, dev hdb, sector 401820759

Aug 31 11:31:36 TANK kernel: end_request: I/O error, dev hdb, sector 401820767

^^ repeated hundreds of times
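Rather than scrolling through hundreds of these, the I/O errors can be boiled down to a count and a sector range per device. A minimal sketch, assuming syslog lines shaped like the `end_request` lines above; `summarise_io_errors` is a hypothetical helper.

```shell
#!/bin/sh
# Summarise "end_request: I/O error" lines per device from syslog text on
# stdin: prints device, number of errors, lowest and highest failing sector.
# Hypothetical helper; assumes lines shaped like
# "... kernel: end_request: I/O error, dev hdb, sector 401820735".
summarise_io_errors() {
    awk '
        /end_request: I\/O error, dev / {
            dev = ""; sector = ""
            for (i = 1; i < NF; i++) {
                if ($i == "dev")    { dev = $(i + 1); sub(/,$/, "", dev) }
                if ($i == "sector") { sector = $(i + 1) }
            }
            if (dev == "" || sector == "") next
            n[dev]++
            # Compare numerically, but keep the sector text as logged.
            if (!(dev in lo) || (sector + 0) < (lo[dev] + 0)) lo[dev] = sector
            if (!(dev in hi) || (sector + 0) > (hi[dev] + 0)) hi[dev] = sector
        }
        END { for (d in n) print d, n[d], lo[d], hi[d] }
    ' | sort
}
```

It would be run as `summarise_io_errors < /var/log/syslog` (or against whichever rotated log file covers the date in question). A tight sector range suggests one bad patch or one failed burst of reads; errors spread across the whole disk look more like the drive dropping off the bus.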

September 6, 2009

disk3 device: pci-0000:06:00.1-ide-0:1 (no device)

Going from "http://lime-technology.com/forum/index.php?topic=2601.msg21033#msg21033"

"Unraid will simulate a failed disk"

Would that be the reason I can still access all of my shares?

September 6, 2009

Hi All,

I'm running Unraid 4.4.2 on a full slackware install (maybe relevant, maybe not). Anyway, I noticed my HDD light on constantly today and when I went to use the unraid server it seemed a little "off color".

Long story short, there were 1000+ errors on 'drive 3' (500gig IDE drive).

After a reboot, unraid is still working, I can still get to the disk share via \\server\disk3 and I can also see it mounted via 'df -h'. I am concerned that I will lose data on this drive and I want to replace it with a SATA 1TB drive.

Would I need to follow this process:

1) 'Stop' the array

2) 'Unassign' disk 3

3) Shutdown the server

4) Replace IDE disk with SATA disk (bigger capacity)

5) Startup server

6) Assign new SATA disk to disk 3

7) Do a parity check / rebuild (??)

Also, is it possible to assign my current parity drive as disk 3 and not lose any data?

Syslog snippet is below:

Sep 6 22:30:28 TOWER kernel: md: disk3 removed

August 27, 2009

Weird... I had an issue that was/is similar..

I'm running on a full slackware install with parity + cache. I got the disk full message even though there was plenty of free space.

The shortish story: I was visually watching files disappear after unzipping them.. I had no idea where they went until my root partition was full (100%!). All the missing files ended up in the black hole of /mnt/user0/<folders>. Each of the shares that had the issue were set to high-water with no split level anything.

I've now switched these back to 'most-free' and will see how it goes.

User Customizations · June 23, 2009

No idea :-(

I wouldn't want to go through this again until I actually know what I did the first time!

Perhaps try getting it working in a virtual first?

User Customizations · June 15, 2009

I used the slackware .config and also the menuconfig (I think menuconfig grabs the defaults from .config).

File is here: http://lime-technology.com/forum/index.php?action=dlattach;topic=3828.0;attach=1623

User Customizations · June 9, 2009

I know I copied over the md stuff but I'm not sure I copied over the fuse stuff before building the kernel. I will try that and report back.

Thanks again!

No probs... would adding my .config to a comment help you at all or make things worse?

User Customizations · June 2, 2009

from the slackware book http://www.slackware.com/config/init.php:

The first program to run under Slackware besides the Linux kernel is init. This program reads /etc/inittab file to see how to run the system. It runs the /etc/rc.d/rc.S script to prepare the system before going into your desired runlevel. The rc.S file enables your virtual memory, mounts your filesystems, cleans up certain log directories, initializes Plug and Play devices, loads kernel modules, configures PCMCIA devices, sets up serial ports, and runs System V init scripts (if found).

So I believe rc.S runs no matter what and rc.K is for single user mode and rc.M is multi-user. I looked in rc.M and there is no mention of fuse, and as I said before I disabled the rc.fuse mention in rc.S. dmesg still shows fuse loading, which I believe is the kernel.

I run my server headless so I want to capture what streams by on the screen during boot. Is this stored in a file somewhere or do I just have to connect a monitor and watch it?

Well, that's a good start that fuse is still loading before your go script or rc.local (which is what I was trying to get at before with the module thing). As for the boot screen, I thought dmesg shows this but I guess not.

I may/may not have mentioned it, but I copied all the fuse stuff and md stuff from the unraid kernel to my kernel source before compiling. This may have made a difference too.

User Customizations · June 2, 2009

What I found in /lib/modules/2.6.27.7/kernel/fs/fuse/ was fuse.ko. Is your intention to copy this to /lib/modules/2.6.27.7-unRAID/kernel/fs/fuse/? This is what I did. Then I renamed rc.fuse and removed the reference to fuse in rc.S.

Just noticed... I believe rc.S. is for single user mode, normally, Linux boots into Multi-user mode which is rc.M. Because you have renamed rc.fuse, any scripts should not load this. I believe mine loads from the kernel itself.

I'm happy to share whatever info you need from my system :-)

User Customizations · June 1, 2009

Attempted what you recommended:

1) mkdir /lib/modules/2.6.27.7-unRAID/kernel/fs/fuse
2) cp /lib/modules/2.6.27.7/kernel/fs/fuse/* /2.6.27.7-unRAID

3) mv /etc/rc.d/rc.fuse /etc/rc.d/rc.fuse.disabled

Stop array / Reboot via emhttp

Make sure you don't have any scripts that load /etc/rc.d/rc.fuse :-)

What I found in /lib/modules/2.6.27.7/kernel/fs/fuse/ was fuse.ko. Is your intention to copy this to /lib/modules/2.6.27.7-unRAID/kernel/fs/fuse/? This is what I did. Then I renamed rc.fuse and removed the reference to fuse in rc.S.

Tried a reboot and no-dice, then tried the start-stop method of getting it to work, which worked previously, and now it doesn't. Did I copy fuse.ko to the right place?

Thanks,

Phil/TW

Hi Phil,

Yes, that is the file I copied across to the same folder and it worked for me after that. I noticed that on bootup it was saying 'the fuse filesystem has already been loaded'. I had also compiled fuse support into my kernel as a module - perhaps that's why it's working for me?

Hopefully this gives you something to work on... it does work perfectly for me now, however, it took me around 12 hours (straight) of messing around with it. I do remember copying /md* to the unraid kernel and (possibly) all the fuse stuff too before compiling... perhaps that's what I did.

Someone with some more experience may be able to help :-)

User Customizations · May 31, 2009

Hey Torque..

Can you try this and let me know if it works?: http://lime-technology.com/forum/index.php?topic=3828.msg33853#msg33853

User Customizations · May 31, 2009

I've narrowed it down to this now (and I'm forgetting about NFS for a while):

Using the details in the last part of this page: http://www.thetechguide.com/howto/unraid-on-hard-drive.html, in order to get user shares working I must (after a reboot):

1) Stop the array

2) Disable User shares

3) Start the array

4) Stop the array

5) Enable user shares

6) Start the array

Surely I'm missing something?

edit (again):

I've FINALLY got it working on reboot...

For those who are interested (and this may / may not work for you):

1) mkdir /lib/modules/2.6.27.7-unRAID/kernel/fs/fuse

2) cp /lib/modules/2.6.27.7/kernel/fs/fuse/* /2.6.27.7-unRAID

3) mv /etc/rc.d/rc.fuse /etc/rc.d/rc.fuse.disabled

Stop array / Reboot via emhttp

Make sure you don't have any scripts that load /etc/rc.d/rc.fuse :-)

This is all assuming that you have compiled a kernel with Fuse support as a module, running slackware 12.2 and unraid 4.4.2. My .config is below and is setup for an ABIT AB9 PRO.
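The three steps can be sketched as one function. This is only a sketch under assumptions: the destination of the copy is taken to be the fuse directory created in step 1 (which is how the step is read later in the thread), and `enable_kernel_fuse_module` with its optional root prefix is a hypothetical helper so the steps can be rehearsed in a scratch directory before touching the live system.

```shell
#!/bin/sh
# Sketch of the three steps above as one function.
# The optional first argument is a hypothetical root prefix (empty on a live
# system) so the steps can be rehearsed against a scratch directory.
# Assumption: the copy is meant to land in the fuse directory from step 1.
enable_kernel_fuse_module() {
    root="${1:-}"
    src="$root/lib/modules/2.6.27.7/kernel/fs/fuse"
    dst="$root/lib/modules/2.6.27.7-unRAID/kernel/fs/fuse"

    # 1) Create the fuse directory under the unRAID kernel's module tree.
    mkdir -p "$dst" || return 1
    # 2) Copy the fuse module file(s) across from the other kernel's tree.
    cp "$src"/* "$dst"/ || return 1
    # 3) Rename rc.fuse so the init scripts no longer find it.
    if [ -e "$root/etc/rc.d/rc.fuse" ]; then
        mv "$root/etc/rc.d/rc.fuse" "$root/etc/rc.d/rc.fuse.disabled" || return 1
    fi
}
```

Calling it with a scratch directory first (for example `enable_kernel_fuse_module /tmp/rehearsal` after copying the relevant trees there) shows what would change; calling it with no argument as root applies it for real, after which the array still needs to be stopped and the box rebooted via emhttp as described above.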

Cheers!

User Customizations · May 30, 2009

OK, I have been working on this for the last 8.5 hours and now I'm beat.

Things I have achieved:

1) Installed Slackware 12.2 onto my primary IDE drive with 3 partitions: /dev/hda1 = cache partition, /dev/hda2 = root partition, /dev/hda3 = swap partition

2) compiled a custom kernel with/for unRaid + virtualisation and rebooted into the new kernel successfully

3) Got emhttp running, assigned my drives, all is well

4) \\tower\diskX is working - that's good

5) \\tower\<user share> is visible, but not working.

I used the default .config file from unraid, and added IDE/SATA support, SysV support (for apache) and virtualisation support (for vmware).

I can't find any syslogs at all that show any issues, and I can't find any logs that smbd / nmbd are complaining about either. I know a couple of people have had this issue before but am unsure if anyone has successfully got user-shares working with unraid on a HDD.

I've also copied these 2 files plus all the rc.<server> files to their respective directory:

/unraid/etc/rc.d/rc.fuse

/unraid/bin/fusermount

So in brief... everything works except user-shares. Has anyone got any idea on how to get this working? The kids are killing me!

Edit: I've been messing around with the kernel again... I had also tried this with CIFS but have now gone back to SAMBA instead of CIFS, as I read somewhere that this is what Tom is using in unRaid. Anyway, I enabled samba, and am now stuck on the NFS part (a small issue in the scheme of things): FATAL: Error inserting nfsd (/lib/modules/2.6.27.7-unRAID/kernel/fs/nfsd/nfsd.ko): Device or resource busy

Should NFS server be a module or * ?
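Whichever way it should be, how the option actually ended up in the build can be read back from the kernel `.config`: `=y` is built-in (the `*` in menuconfig), `=m` is a module (the `M`). A minimal sketch; `kconfig_state` is a hypothetical helper, and the path to `.config` depends on where the kernel source was unpacked.

```shell
#!/bin/sh
# Report how a kernel option is set in a .config file:
# "built-in" (=y, shown as * in menuconfig), "module" (=m, shown as M),
# or "not set".
# Hypothetical helper; the first argument is the option name without the
# CONFIG_ prefix, the second the path to the .config file.
kconfig_state() {
    value="$(sed -n "s/^CONFIG_$1=\(.*\)$/\1/p" "$2" | head -n 1)"
    case "$value" in
        y)  echo "built-in" ;;
        m)  echo "module" ;;
        "") echo "not set" ;;
        *)  echo "$value" ;;
    esac
}
```

For example `kconfig_state NFSD /path/to/kernel/source/.config` (path is a placeholder) prints `module` if the NFS server was selected as a module.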

User Customizations · May 26, 2009

Ah ha... thanks for that... it turns out the copy/paste had word-wrapped when I put it in the .conf file... every time I pressed install it would overwrite the .manual_install file

User Customizations · May 26, 2009

Does the last part of this url help you at all regarding fuse?:

http://www.thetechguide.com/howto/unraid-on-hard-drive.html

User Customizations · May 26, 2009

Awesome.. I'll give that a shot ;-)

User Customizations · May 26, 2009

Hi All,

I'm currently running unRAID v4.4.2 (due to the instructions only being available for this version) with VMWare on top. Everything is working fine except the vmware guest is really slow due to disk writes which I read would happen anyway if you run it on a drive with parity. I have thought about this quite often and I'd like to run the following:

1) Full slackware 12.2 distro installed onto a primary IDE drive (160gb)

1a) partitioned into: 40gb [Cache], 110gb [OS], 5gb [swap]

2) Install VMWare onto the OS partition

3) Still boot into the unRAID kernel with my licence file (I'd assume I boot from the USB stick?)

Is all this possible? I have set up unraid in vmware using slackware 12.2 + unraid 4.4.2, then packaged the files I needed to run vmware on my live server - so it shouldn't be a huge difference, I'd imagine.

My questions really relate to point 1a - Would this still work?

Before you ask, I'd like to run other apps etc in the future so yes, a full distro would suit me better.

Thanks!

Posts posted by myndphunkie

Faulty Cable?

Faulty Cable?

unRAID Server release 4.5-beta7 available

Cache drive not moving data to main drives???

Failed drive missing, unraid still working

Failed drive missing, unraid still working

Failed drive missing, unraid still working

Failed drive missing, unraid still working

Failed drive missing, unraid still working

Failed drive missing, unraid still working

Failed drive missing, unraid still working

Cache drive not moving data to main drives???

Running unRAID with a full Slackware distro

Running unRAID with a full Slackware distro

Running unRAID with a full Slackware distro

Running unRAID with a full Slackware distro

Running unRAID with a full Slackware distro

Running unRAID with a full Slackware distro

Running unRAID with a full Slackware distro

unRAID on HDD + VMware + Cache Drive + Swap Space

unRAID on HDD + VMware + Cache Drive + Swap Space

New unMENU package: mySQL

Running unRAID with a full Slackware distro

unRAID on HDD + VMware + Cache Drive + Swap Space

unRAID on HDD + VMware + Cache Drive + Swap Space