calypsocowboy

June 1, 2017

Thanks!

May 30, 2017

So I think I'm going to go with option 2 and do the new config. I'll then mount disk3 on another machine, see what data I can get off it, and copy it back to the server.

I believe these are the correct procedures: https://wiki.lime-technology.com/UnRAID_6_2/Storage_Management#Reset_the_array_configuration

One additional question, I'm assuming that resetting the array rebuilds the parity drive. So I'm guessing it makes sense to wait to do this until I get my new larger parity drive. Is that correct?

Once that all completes, then I'll preclear my old parity drive and add it back in.

BTW, thanks for the help. I definitely need to get notifications set up on the server.

May 30, 2017

Okay, at this point I've copied most of what I think are my non-replaceable files off the array. I haven't checked all the files to see if they are okay. What's left on the array is mostly music and movies, about 8.8TB worth, that I can replace but would prefer not to. I've shut down the array to prevent further writes to it.

My current array is a 4TB parity drive plus one 4TB and three 2TB data drives. As it sits right now, I'm using 8.8TB, so I don't have room to pull the failing 2TB drive out now. In a week, I'll have a new motherboard and/or a new controller card and a new 8TB drive I was planning on for parity. What's the best way to bring things back up? At this point, it sounds like trusting parity to rebuild the drive wouldn't be a good idea.
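To make the space problem concrete, here is a quick arithmetic sketch using the sizes from the post, expressed in GB. The function name `shortfall_gb` and the variable names are made up for illustration; it only shows why the failing 2TB drive can't simply be emptied onto the other data drives.

```shell
#!/bin/bash
# Sizes in GB, taken from the post: one 4TB and three 2TB data drives, 8.8TB used.
data_drives_gb=(4000 2000 2000 2000)
used_gb=8800
failing_gb=2000   # the drive to be pulled

# shortfall_gb USED REMOVED SIZE...
# Prints how many GB would not fit once a drive of size REMOVED is pulled
# (0 means everything still fits on the remaining drives).
shortfall_gb() {
    local used=$1 removed=$2 total=0 size
    shift 2
    for size in "$@"; do
        total=$((total + size))
    done
    local remaining=$((total - removed))
    if [ "$used" -gt "$remaining" ]; then
        echo $((used - remaining))
    else
        echo 0
    fi
}

shortfall_gb "$used_gb" "$failing_gb" "${data_drives_gb[@]}"   # prints 800
```

With 10TB of data capacity and 8.8TB in use, pulling a 2TB drive leaves 8TB, so roughly 800GB would have nowhere to go.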

My initial thought was something along the lines of clearing my 4TB parity drive, bringing up all 4 other discs, copying the data from the failing drive over to a good one, and removing it from the array. All of this would, I'm assuming, be done with the array unprotected. Then once the failing 2TB drive is out and the 4TB drive is in, put the 8TB drive in and rebuild parity. Lastly, hope that I didn't lose too much data.

May 27, 2017

I have a drive showing with a red X, device is disabled. About my system: I'm running Unraid 6.2 on an older Supermicro PDSMi board with a RR1U-ELi riser card and a SuperMicro AOC-SASLP-MV8 card. The drive that is having problems is connected to that card. It's a 4TB Seagate drive that I shucked from an enclosure I got at Costco a number of years back. I have two of those drives in the computer. I believe my mobo BIOS is up to date; I'm not sure what BIOS the card is running.

This first happened about 3 months ago. At the time, I stopped the array, powered down the server, pulled and reseated the power and data cables, restarted the server, and checked the SMART report. It came back clean on the drive, so I went through the process of adding the drive back into the array. Things seemed to be working well for a bit.

About a week ago, I noticed the same thing. I took the same steps, only this time I connected the drive to a different end of the breakout cable, since the one I had it connected to looked a little suspect. Same process: clean SMART report, added drive back in.

And now, back comes the red x, same drive.

So now I'm trying to figure out what's next. I'm not sure the drive is bad because after each reboot, it comes back clean. I've tried different ends on the breakout cable (Monoprice). I could try ordering a new cable to see if that might be the issue. I'm not sure if it's the expansion card or maybe the riser or the combo. I don't think my mobo supports just the card without the riser.

Any thoughts, or am I to the point where I need to look for a new expansion card, or maybe a mobo that supports more SATA connections, or one that supports the card I have directly instead of via a riser?

Current Diagnostics - cascade-diagnostics-20170527-1408.zip

Last Weeks Diagnostics - cascade-diagnostics-20170521-0812.zip

September 10, 2014

Any update on this plugin?

September 9, 2013

Second log

syslog20130908.txt

September 9, 2013

20130908 is the current log. At this point the web console is still responding, and parity is rebuilding right now.

20130907 is from when the server was unresponsive, just before I triggered an unclean shutdown.

syslog20130907.txt

September 8, 2013

I'm upgrading from rc12, I think, to 5 final. The upgrade seemed to go fine: I installed the key, rebooted, and for the most part I can get to the console okay on reboot.

I can get to the GUI fine without problems for a bit, then all of a sudden, nothing. I can't get to it via server name or IP. The server is still up and functioning, serving files, and I can get to it via telnet; it's just the web console that appears to be dead. I'm seeing a lot of these errors in my syslog.

Sep 7 17:27:24 CASCADE avahi-daemon[1112]: Invalid response packet from host 192.168.1.101.

Sep 7 17:29:04 CASCADE avahi-daemon[1112]: Invalid response packet from host 192.168.1.101.

Sep 7 17:30:44 CASCADE avahi-daemon[1112]: Invalid response packet from host 192.168.1.101.

Sep 7 17:32:23 CASCADE avahi-daemon[1112]: Invalid response packet from host 192.168.1.101.

Sep 7 17:34:03 CASCADE avahi-daemon[1112]: Invalid response packet from host 192.168.1.101.
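One way to see whether these messages all come from a single machine is to tally them per host. The following is a minimal sketch, assuming syslog lines in the format shown above; the function name `count_invalid_packets` is made up for illustration.

```shell
# count_invalid_packets: reads syslog text on stdin and prints
# "<count> <host>" for every host named in an avahi
# "Invalid response packet" message.
count_invalid_packets() {
    awk '/avahi-daemon/ && /Invalid response packet from host/ {
        host = $NF              # last field, e.g. "192.168.1.101."
        sub(/\.$/, "", host)    # drop the trailing sentence period
        count[host]++
    }
    END {
        for (h in count) print count[h], h
    }'
}
```

It could be run as `count_invalid_packets < /var/log/syslog`. If only one address shows up, the device at that address is probably worth a look before blaming the server.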

This is pretty much a fresh install, with no plugins and the stock 5.0 web GUI.

SMB settings are as follows:

Enable SMB: Yes (Workgroup)

Done

Workgroup: BAWDENHOME

Local master: No Yes

-Josh

June 24, 2013

I was having the same errors on my SuperMicro board. I honestly don't know what fixed it. I know Tom changed a timeout value in a later release and that helped, but it seemed to be more kernel related, as the newer kernel worked.

My last BIOS update for my board was over 5 years ago, so I'm tempted to think it's mobo related. With boards getting faster and faster and new OSes coming out, sometimes I think they just don't work as well on the older hardware.

I also got a new flashdrive and did notice it was faster.

Sorry I'm not more help.

June 15, 2013

System booted, no "USB not found" errors, configuration valid, array started, and parity is running right now. Will check again in the morning.

Tom, thanks again for all your hard work. I know we don't say it enough.

-Josh

June 15, 2013

New flash drive, a SanDisk Fit. The old drive was getting 10MB/s, the new drive 31MB/s. Same result. I'll just wait for rc15.

-Josh

June 14, 2013

Thanks Tom. I may go out and get a new flash drive just to see if that makes a difference as well.

June 14, 2013

I'm currently using a Lexar Firefly that is only a couple of years old. I've checkdisked it and it comes back fine.

I also tried another, newer USB drive I have lying around. Same thing, the system behaved the same way. Also, if the flash drive were going bad, I would not have expected rc13 to work.

It seems to be related somehow to this: http://lime-technology.com/forum/index.php?topic=25250.msg220900#msg220900 For whatever reason the USB drive isn't mounting, and it's only waiting so long.

I do see a lot of "ata5: link is slow to respond" errors on the console, but I thought those were due to ATA ports on the mobo that are not in use.

Jun 14 20:50:43 Tower kernel: sd 1:0:0:0: [sdb] Attached SCSI disk

Jun 14 20:50:43 Tower kernel: scsi 8:0:0:0: Direct-Access Lexar JD FireFly 1100 PQ: 0 ANSI: 0 CCS

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] 3915776 512-byte logical blocks: (2.00 GB/1.86 GiB)

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] Write Protect is off

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] Mode Sense: 43 00 00 00

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] No Caching mode page present

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] Assuming drive cache: write through

Jun 14 20:50:43 Tower kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0

Jun 14 20:50:43 Tower kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0

Jun 14 20:50:43 Tower kernel: sd 2:0:0:0: Attached scsi generic sg2 type 0

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: Attached scsi generic sg3 type 0

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] No Caching mode page present

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] Assuming drive cache: write through

Jun 14 20:50:43 Tower kernel: sdd: sdd1

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] No Caching mode page present

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] Assuming drive cache: write through

Jun 14 20:50:43 Tower kernel: sd 8:0:0:0: [sdd] Attached SCSI removable disk

Jun 14 20:50:43 Tower kernel: ata5: link is slow to respond, please be patient (ready=-19)

Jun 14 20:50:43 Tower kernel: ata5: COMRESET failed (errno=-16)

Jun 14 20:50:43 Tower kernel: ata5: link is slow to respond, please be patient (ready=-19)

Jun 14 20:50:43 Tower kernel: ata5: COMRESET failed (errno=-16)

Jun 14 20:50:43 Tower kernel: ata5: link is slow to respond, please be patient (ready=-19)

Jun 14 20:50:43 Tower kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

June 14, 2013

At the console, I'm seeing a number of "waiting for /dev/disk/by-label/UNRAID" messages. There are about 10 or so of them, then "/dev/disk/by-label/UNRAID not found. Mounting non-root local file systems."

If -rc15 is coming out with the 3.9.6 kernel, I'll just see how that goes, as -rc13 worked.
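The messages above are consistent with a bounded polling loop that gives up after a fixed number of tries. The following is a minimal sketch of that general pattern only; it is not unRAID's actual startup script, and the function name `wait_for_path`, the try count, and the delay are assumptions for illustration.

```shell
# wait_for_path PATH TRIES DELAY
# Polls for PATH up to TRIES times, sleeping DELAY seconds after each miss,
# then checks one last time. Returns 0 once PATH exists, 1 otherwise.
wait_for_path() {
    local path=$1 tries=$2 delay=$3 i
    for ((i = 1; i <= tries; i++)); do
        if [ -e "$path" ]; then
            return 0
        fi
        echo "waiting for $path ($i/$tries)"
        sleep "$delay"
    done
    if [ -e "$path" ]; then
        return 0
    fi
    echo "$path not found"
    return 1
}

# Hypothetical usage: about 10 tries, one second apart.
# wait_for_path /dev/disk/by-label/UNRAID 10 1 || echo "flash drive never showed up"
```

If the kernel takes longer than the total wait to enumerate the USB device, a loop like this gives up even though the drive appears moments later, which would match the drive being visible under /dev/disk once the system is up.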

-Josh

June 14, 2013

Thanks for all your hard work and awesome news!

June 14, 2013

Can someone help me understand when in the process the USB drive gets mounted as /boot or as UNRAID and is there a timing of how long it waits for the drive to be loaded before moving on?

I'm trying to troubleshoot an error. rc10 worked, 11-12 no workie, 13 worked, 14 no workie. It appears to be related to the USB drive not getting mounted or being mounted improperly. The only thing I can somewhat trace it to is the kernel.

From my limited unix knowledge it appears the drive is mounted on the computer, but something is going on. The system comes up and emhttp doesn't appear or want to load. I can start it manually, but it has problems.

root@Tower:/# ls /dev/disk/by-id/*

/dev/disk/by-id/ata-WDC_WD20EARS-00MVWB0_WD-WMAZA0310325@ /dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WMAZA2802606@

/dev/disk/by-id/ata-WDC_WD20EARS-00MVWB0_WD-WMAZA0310325-part1@ /dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WMAZA2802606-part1@

/dev/disk/by-id/ata-WDC_WD20EARS-00MVWB0_WD-WMAZA2802606@ /dev/disk/by-id/usb-Lexar_JD_FireFly_TXVSZS46RZ0JRC7V5WG1-0:0@

/dev/disk/by-id/ata-WDC_WD20EARS-00MVWB0_WD-WMAZA2802606-part1@ /dev/disk/by-id/usb-Lexar_JD_FireFly_TXVSZS46RZ0JRC7V5WG1-0:0-part1@

/dev/disk/by-id/ata-WDC_WD20EARS-00S8B1_WD-WCAVY2836928@ /dev/disk/by-id/wwn-0x50014ee00261e6e5@

/dev/disk/by-id/ata-WDC_WD20EARS-00S8B1_WD-WCAVY2836928-part1@ /dev/disk/by-id/wwn-0x50014ee00261e6e5-part1@

/dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WCAVY2836928@ /dev/disk/by-id/wwn-0x50014ee2041dd087@

/dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WCAVY2836928-part1@ /dev/disk/by-id/wwn-0x50014ee2041dd087-part1@

/dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WMAZA0310325@ /dev/disk/by-id/wwn-0x50014ee65608294e@

/dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WMAZA0310325-part1@ /dev/disk/by-id/wwn-0x50014ee65608294e-part1@

root@Tower:/# ls /dev/disk/by-label/*

/dev/disk/by-label/UNRAID@

Should I try unmounting /boot, remounting it manually, and starting emhttp?

Thanks,

Josh

syslog14a.txt

June 13, 2013

Also hook up a monitor if you can. I had issues, but I don't think they were related to the NIC. If you can get a monitor and keyboard hooked up, you can troubleshoot better: type ifconfig to see your IP, and check your syslog by doing cat /var/log/syslog.

June 13, 2013

I don't know what it is about that 3.4 kernel but my mobo doesn't like it. It never seems to find my USB drive. Mine doesn't even come up with rc14. I went back to 13 and it was fine. I've rebooted a few times with 13 and no issues with stopping the array.

I feel for you Tom, trying to find something that works for everyone.

syslog14.txt

June 4, 2013

I'm up and running as well, this solved the issues I was having with 11 and 12a with not being able to come up. I'm still seeing some udevd worker failed errors but it's running which was more than before. Running parity now.

-Josh

May 31, 2013

It's an older system. I'm using the most current BIOS, but it's still from 2008.

SuperMicro PDSMi http://www.supermicro.com/products/motherboard/pd/E7230/pdsmi.cfm

- Intel® E7230 (Mukilteo) Chipset

- 1x Intel® 82573L PCI-e Gigabit LAN

- 1x Intel® 82573V PCI-e Gigabit LAN

- SATA ICH7R Controller

- ATI RageXL Graphics

Intel Pentium D 2.8

4GB Memory

Lexar Firefly USB Drive

WD Green Drives

No addin cards installed at this time.

May 30, 2013

May want to add a note about taking a snapshot of users so you have that to restore to.

May 29, 2013

I went back through the RC versions. It looks like I can get 10 to come up, and the array starts successfully. 10 has come up for me twice, and twice it has hung with a "Waiting for USB Subsystem" message on the screen for more than 5 minutes. The two times it has come up, I've seen the same error on the screen and nothing in the log, but the system did eventually come up. I've attached the syslog from 10. Right now I'm waiting for the permission utility to do its thing before doing too much.

So I know 10 is on kernel 3.4.24 and 11a is on 3.4.26. Is my problem likely a kernel error or an unRaid error?

-Josh

syslog10_running2.txt

May 25, 2013

Yes, I did read through the Wiki article but didn't see anything that stood out relating to this. I'm using the stock go file, which matches the wiki page. As suggested on the release page, in the past I've tried just copying over the bzimage and bzroot files; that didn't work. So I've set the drive up as I mentioned above: reformatted with freshly downloaded 12a files (checksum matched), plus my Shares folder, disk.cfg, ident.cfg, network.cfg, share.cfg, and super.dat.

At this point the server isn't "coming up all the way": emhttp isn't running, so it isn't getting to the go script, and the array is showing "Stopped: no devices". I can see my three disks, just not my flash drive, which is where all the config is.

-Josh

May 25, 2013

It still seems to be related to mounting the flash drive and/or drives. I've tried different drives and had the same results, so I don't think it's my flash drive. I've also tried different USB slots on the MB, same result. I was able to get emhttp manually started and got the web server up. It shows a flash drive but no info. I was able to get another syslog after I started emhttp.

I tried 11a and got similar errors, but I only tried it briefly.

syslog05242013_512a_w_emhttp.txt

May 24, 2013

Update: I've reformatted my flash drive and put the 12a files on it, plus just the following from my old install: Shares folder, disk.cfg, ident.cfg, network.cfg, share.cfg, super.dat. I left the go file from 12a and I didn't put a secrets.tdb file on it.

Still no go, but I was able to telnet into the server and, with PuTTY, get some screenshots of the syslog. I'm seeing the following, which may be where it is stopping.

May 25 20:55:07 Tower udevd[670]: worker [688] unexpectedly returned with status 0x0100

May 25 20:55:07 Tower udevd[670]: worker [688] failed while handling '/devices/pci0000:00/0000:00:1f.2'

May 25 20:55:07 Tower udevd[670]: worker [708] unexpectedly returned with status 0x0100

May 25 20:55:07 Tower udevd[670]: worker [708] failed while handling '/devices/pci0000:00/0000:00:1c.5/0000:0e:00.0'

May 25 20:55:07 Tower udevd[670]: worker [701] unexpectedly returned with status 0x0100

May 25 20:55:07 Tower udevd[670]: worker [701] failed while handling '/devices/pci0000:00/0000:00:1e.0/0000:0f:02.0'

May 25 20:55:08 Tower udevd[670]: worker [803] unexpectedly returned with status 0x0100

May 25 20:55:08 Tower udevd[670]: worker [803] failed while handling '/devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0'

I've heard about these being plugin related, but with fresh 12a files and only the files listed above, I don't have any plugins installed. I've attached the syslog.
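To pull just the affected device paths out of a longer syslog, something like the following could help. It is a minimal sketch that assumes the udevd message format quoted above; the function name `failed_udev_devices` is made up for illustration.

```shell
# failed_udev_devices: reads syslog text on stdin and prints the sysfs path
# from each udevd "failed while handling" message, one per line.
failed_udev_devices() {
    sed -n "s/.*udevd\[[0-9]*\]: worker \[[0-9]*\] failed while handling '\(.*\)'.*/\1/p"
}
```

Running `failed_udev_devices < /var/log/syslog` on the lines above would list four paths. The last one runs through usb1 and target8:0:0, so it may well be the flash drive itself.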

PS. I copied over my 4.7 files and the server boots right up.

-Josh

syslog05242013_512a.txt

Posts posted by calypsocowboy

(SOLVED) - Drive shows with red X, device is disabled.

(SOLVED) - Drive shows with red X, device is disabled.

(SOLVED) - Drive shows with red X, device is disabled.

(SOLVED) - Drive shows with red X, device is disabled.

VDR, TVHeadend, unRAID 6 x64

Web console unavailable, avahi errors in the syslog

Web console unavailable, avahi errors in the syslog

Web console unavailable, avahi errors in the syslog

unRaid sometimes does not find /dev/disk/by-label/UNRAID

unRAID Server Release 5.0-rc15 Available

unRAID Server Release 5.0-rc14 Available

unRAID Server Release 5.0-rc14 Available

unRAID Server Release 5.0-rc14 Available

unRAID Server Release 5.0-rc14 Available

Apology and Plan Moving Forward

unRAID Server Release 5.0-rc14 Available

5.0 RC14 - Intel NIC with no network access (works fine on 4.7)

unRAID Server Release 5.0-rc14 Available

unRAID Server Release 5.0-rc13 Available

udevd[670]: worker [688] unexpectedly returned with status 0x0100

How-To: Migrate from unRAID 4.7 to unRAID 5.0

udevd[670]: worker [688] unexpectedly returned with status 0x0100

udevd[670]: worker [688] unexpectedly returned with status 0x0100

udevd[670]: worker [688] unexpectedly returned with status 0x0100

udevd[670]: worker [688] unexpectedly returned with status 0x0100