May 13, 2014
Before I start: I know the best plan is to attach a syslog file, but unfortunately I did not capture one at the time. :'(

I was running 5.0.5 with Dynamix, with valid parity from a very recent parity check. I wanted to upgrade the CPU and RAM, so I stopped the array and powered down; everything went as expected. On restart, unRAID would boot but then freeze. I tried a couple of restarts, but it froze at the same point each time.

In desperation I booted into maintenance mode and the array started OK, but without the Dynamix GUI. However, a display attached to the server indicated that sda1 had not unmounted correctly and might be corrupted. Rather than investigate this at the time, I stupidly rebooted the array in normal mode. It rebooted fine and parity was valid, but it was still indicating that sda1 (which is my USB stick) could be corrupt.

This afternoon the server became unresponsive while transferring some large files and eventually crashed. I could not access it via the GUI or telnet. I have rebooted and a parity check is now underway, which is going to take over a day. Again it is showing that sda1 may be corrupt and suggesting that I run fsck.

I recently created MD5 check files for all files on the USB stick. On checking them I found that they are all OK, except that the checksum of the super.dat file has changed. Does this sound like super.dat being corrupted?

Once the parity check has completed I intend to put the USB stick into a Windows machine and run CHKDSK. I have a copy of RFSTools but don't know how to use it to analyse the super.dat file. I do have a copy of super.dat taken a few days ago; can I use this to replace the one on the USB stick? However, from reading through the forums I understand that it might be better to restart with no super.dat file?

I would welcome comments.
May 13, 2014 (Author)
Attached is a syslog. The parity check is ongoing, but I hope that someone can advise me why sda1 is marked as not being correctly unmounted.
syslog.txt
May 13, 2014 (Author)
Can I run fsck from unRAID via telnet, or do I have to use RFSTools on a separate PC?
Sorry for asking, but I know next to nothing about Linux!
May 13, 2014
Is sda1 your USB flash drive? If so, the normal process is to remove it from your unRAID server, take it to a Mac or Windows PC and run a file system check (i.e. CHKDSK on Windows). Reiserfsck is for drives in the array and the cache drive, not the USB flash drive.
May 13, 2014 (Author)
Thanks, that confused me too. I thought that only data drives were allocated an sd letter, but on checking it's for all drives, including boot, parity and cache.
I'll give CHKDSK a go when the parity check is complete. I have noticed that the parity check is showing sync errors; I have never seen this before.
May 13, 2014
"Thanks, that confused me too. I thought that only data drives were allocated an sd letter, but on checking it's for all drives, including boot, parity and cache. I'll give CHKDSK a go when the parity check is complete. I have noticed that the parity check is showing sync errors; I have never seen this before."

If you are running a correcting check then hopefully they (the sync errors) are being fixed, and you can run another parity check after the current one finishes. That second one should come back with zero sync errors if everything is working correctly.

But an expert would potentially give better advice. I really don't feel comfortable advising about array errors except in specific cases that I've experienced myself. Thankfully they have been few and far between. The worst one I got was a recurring sync error of 128 sectors at the start of a random disk, which appears to have been fixed with a firmware upgrade to P18 on my M1015 (knock on wood, as it has only been a week). Otherwise the only sync errors I've gotten have been on failed or failing disks that I replaced ASAP.
May 14, 2014
In addition to your unRAID disk assignments, super.dat also keeps track of whether your array is started or stopped. If you made a checksum of super.dat when the array was stopped, it would not match a checksum taken when the array is started, for example.
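To make the point about super.dat concrete, here is a minimal Python sketch of re-checking a flash drive against an MD5 list while treating super.dat as a file that is expected to change. It assumes the check file uses the common `md5sum` layout (`<hash>  <relative path>` per line) and that super.dat lives at `config/super.dat` on the stick; the function names are made up for illustration and this is not the tool the original poster used.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text):
    """Parse md5sum-style lines ("<hash>  <name>") into {name: hash}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, _, name = line.partition(" ")
        # md5sum marks binary mode with a leading "*" before the name.
        entries[name.strip().lstrip("*")] = checksum.lower()
    return entries


def compare(root, manifest, volatile=("config/super.dat",)):
    """Sort every manifest entry into ok / changed / expected_change / missing.

    Files listed in `volatile` (assumption: only super.dat) are reported
    separately, because their content legitimately differs between a
    stopped and a started array.
    """
    report = {"ok": [], "changed": [], "expected_change": [], "missing": []}
    for name, expected in sorted(manifest.items()):
        path = Path(root) / name
        if not path.is_file():
            report["missing"].append(name)
        elif md5_of(path) == expected:
            report["ok"].append(name)
        elif name in volatile:
            report["expected_change"].append(name)
        else:
            report["changed"].append(name)
    return report
```

With the stick mounted on another PC, something like `compare("E:/", parse_manifest(open("flash.md5").read()))` would list any file other than super.dat whose content really differs (the drive letter and manifest name here are placeholders).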
May 14, 2014 (Author)
HELP!!! :'(

The parity check completed, finding 92 sync errors. I stopped the array and powered down; however, the shutdown hangs at "unmounting remote file system" (see attached jpg). It is also stating that zip is not installed?

I have run unRAID since V3 and never had any problems. It might be a coincidence, but this has only happened since I changed the processor and memory. However, the new processor and RAM are reported OK in the BIOS, and the RAM has been used in another PC for the last 2 years.

I would be grateful for advice as to what I should do next.
May 14, 2014
It is probably trying to unmount a disk, but that disk has a file open. This is usually caused by either a computer on the network reading a file or a plugin on the unRAID server.

Can you still access the server with telnet/PuTTY? If so, you could use a command that identifies open files and then kill the process that has them open. However, I use the unMenu GUI to tell me what files are open, so I don't know that command offhand, sorry. Hopefully someone else has better advice.
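For readers wondering what "a command that identifies open files" does under the hood: on Linux, tools such as `lsof` or `fuser` (where installed) report this, and the same information is exposed under `/proc/<pid>/fd`. The Python sketch below walks that tree and lists which process IDs hold files open beneath a mount point. It assumes Python is available on the server and that it runs as root; the function name and the `/mnt/disk1` path are only illustrative. It covers open file descriptors only, so a shell whose current directory is on the disk would not show up.

```python
import os


def open_files_under(mount_point, proc_root="/proc"):
    """Return sorted (pid, path) pairs for files held open under mount_point.

    Each /proc/<pid>/fd/<n> entry is a symlink to the file the process
    has open, so reading the links reveals who is keeping a disk busy.
    """
    # Trailing separator so "/mnt/disk1" does not also match "/mnt/disk10".
    prefix = os.path.join(os.path.abspath(mount_point), "")
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directories are processes
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            descriptors = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in descriptors:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target.startswith(prefix):
                found.append((int(entry), target))
    return sorted(found)
```

Calling `open_files_under("/mnt/disk1")` would then give the process IDs to look at before deciding whether to stop a plugin or close a client connection.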
May 14, 2014 (Author)
I don't know what happened to my first reply, but here's a second.
I can't telnet in, and I don't know what to do next. The array was stopped properly, and I was not aware of any file being open.
May 14, 2014
Wish I could advise you further, but you're past the point where I can offer any advice now. Hopefully someone else can take you further. If you get no answers here, you might try emailing Limetech; they might be able to help. I've never had to go that far myself, so I can't advise you on that route either.
May 15, 2014 (Author)
In the absence of any further replies, I powered down the server and ran CHKDSK on the USB stick, which found and corrected errors.
I then rebooted; everything came up OK and parity is valid. Just to be sure, I am currently running another parity check.
May 15, 2014
You updated your CPU and RAM? Did you run a lengthy memtest? Your issues point to bad memory.
May 15, 2014 (Author)
Thanks, I'll run Memtest.

Just to cheer me up, I now have a 2TB hard disk which has redballed. I don't know if I should replace it; given my other problems, I don't trust that the array has failed it for good reason.

I am waiting for Memtest to complete, but could faulty memory cause a disk to redball? If the failed hard disk checks out OK on diagnostic and SMART tests, how do I reinstall it?
May 16, 2014 (Author)
I have just run Memtest for 18 hours and 8 passes; no faults were found.
I am just about to pull the defective disk from the array and commence SMART and diagnostic tests.
May 16, 2014 (Author)
Just to be on the safe side, I have installed a new disk.

I am currently running a long diagnostic test on the failed drive. At one time it was my parity disk, so I guess that it has had hard use, but it is a Hitachi and only 3.5 years old, so I am very surprised.

Before booting up with the new drive I double-checked all data and power cables. I have three 5x3 enclosures.

Syslog attached.
syslog.zip
May 17, 2014 (Author)
I added a new 2TB disk, the data was reconstructed and everything was OK.

However, the array crashed on me some 24 hours later. I could not get a syslog or use telnet to access it, but a lot of REISERFS messages were shown on screen. I took the USB stick and ran CHKDSK on it; errors were found and corrected.

I have just restarted and a parity check is underway. I attach a new syslog from boot-up; I notice that it has a lot of REISERFS messages too.

I would welcome assistance. My array has been working for a long time, and the problems started when I changed the CPU and RAM. As noted in another post, I have run Memtest for 8 passes lasting 18 hours, and no fault could be found with the RAM.
syslog.txt
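When a syslog is long, it can help to pull out only the lines that mention the subsystems discussed in this thread before reading it in full. The following is a small Python sketch of that idea, run on a PC against the saved `syslog.txt`. The keyword list is just an assumption taken from the messages quoted in this thread (REISERFS, FAT-fs, Kernel panic), not an official list of what matters in an unRAID syslog.

```python
from collections import Counter

# Keywords taken from messages mentioned in this thread (an assumption,
# adjust to taste).
PATTERNS = ("REISERFS", "FAT-fs", "Kernel panic")


def scan_syslog(lines, patterns=PATTERNS):
    """Return (counts, hits) for lines containing any of the patterns.

    counts maps each pattern to the number of matching lines; hits is a
    list of (line_number, pattern, text). Matching is case-insensitive
    and each line is attributed to the first pattern it contains.
    """
    counts = Counter()
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for pattern in patterns:
            if pattern.lower() in lowered:
                counts[pattern] += 1
                hits.append((number, pattern, line.rstrip("\n")))
                break
    return counts, hits
```

For example, `counts, hits = scan_syslog(open("syslog.txt", errors="replace"))` followed by printing `hits` gives a short list of line numbers to look at first.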
May 17, 2014 (Author)
HELP!!!!!!

On carrying out a parity check, Disk4 is coming up redballed. I am really confused: the previous redballed disk has tested OK using all of Hitachi's diagnostic tools, and has been completely erased twice.

Before I consider buying yet another hard disk, I need some urgent help. A syslog is attached.
Syslog.zip
May 17, 2014 (Author)
In the syslog, disks 4 and 5 were coming up with faults, but only disk 4 was redballed.

I stopped the array and found that disks 4 and 5 were not assigned!!! I disconnected and reseated all SATA cables to the 5x3 units. After a restart, disk 5 is now connected and disk 4 is redballed.

I have now unassigned disk 4 and restarted, then stopped the array and reassigned disk 4, as I think that the problem is loose connections. Disk 4 is now undergoing a data rebuild.

I am at a loss to explain my recent problems. In the past 2 months I have replaced two redballed hard disks (which on testing were found to be OK), and replaced the CPU and RAM. Memtest has been run for 8 passes over 18 hours with no faults.

In the past 6 months I have reached 16 disks. These run from a Corsair TX650 power supply; this should be OK, shouldn't it?

The data rebuild on disk 4 has just halted, with the report "Kernel panic - not syncing: Fatal exception in interrupt". I also cannot get a syslog or telnet in.
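On the power supply question, one way to reason about it is a simple 12 V budget: all disks spinning up at once draw far more current than when idle, so the worst case is power-on. The Python sketch below only does that arithmetic. Every number passed to it is something you would have to look up yourself (the per-disk spin-up current from the drive datasheets, the 12 V rail rating from the label on the PSU); nothing here is a measured figure for the hardware in this thread, and the function name is made up for illustration.

```python
def spinup_headroom(num_disks, spinup_amps_per_disk, rail_amps, other_12v_amps=0.0):
    """Return the 12 V current (in amps) left over during simultaneous spin-up.

    demand = num_disks * spinup_amps_per_disk + other_12v_amps
    A negative result means the rail is over-committed at power-on.
    """
    demand = num_disks * spinup_amps_per_disk + other_12v_amps
    return rail_amps - demand
```

As a purely hypothetical illustration, `spinup_headroom(16, 2.0, 40.0, other_12v_amps=5.0)` returns `3.0`, meaning 3 A to spare with those made-up figures. A thin or negative margin would be a reason to look at staggered spin-up or a larger supply, but it would not by itself prove that the PSU is the cause of the redballs.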
May 18, 2014
Try rebuilding the drive in "Safe Mode" without your plugins and see if it will complete that way. Once it completes, you can restart and let your plugins load.
May 18, 2014 (Author)
Fingers crossed, I have now started data reconstruction in safe mode.

Before this I carried out SMART and Seagate diagnostics on the disk, ran the SeaTools fast fix utility and fully erased the disk; everything tested OK. I also fully formatted the USB stick and ran a USB check utility, which found that the stick was OK. I then loaded unRAID + Dynamix back onto it.

I currently have 15 disks in three 5x3 enclosures, which are served from 2 x 8-port Supermicro cards. The connecting cables only have locking connectors at the card end and not at the SATA end. I have now bought 4 new cables that have locking connectors at both ends.

The SATA connection is daft: you only have to sneeze and you have a loose connection. It was better in the PATA days!
May 18, 2014 (Author)
I got to 50% data reconstruction and the array crashed with a kernel panic again!!

I have now put the original memory back in to try and rule out memory problems; the original memory had been in for 4-5 years with no problem. I checked the USB stick using Windows, and errors were found and corrected. The stick was then formatted and a basic 5.0.5 installed, plus my config file.

I have just bought a new USB stick, and will use the $30 offer to get an extra license. This should allow me to rule out any USB problems.

I attach a syslog, in which I can't see any problems.
syslog.txt
May 18, 2014 (Author)
It just crashed at around 20%; the screen went black and I can't access the GUI or telnet in.

I am going to try putting back the original failed Hitachi disk, which passes all SMART and diagnostic tests OK. Other than that, could a faulty CPU be causing problems? In which case I will put the original back in.
May 19, 2014 (Author)
The data has been reconstructed and parity is valid.

Fingers crossed that it will work from now on, but I notice that I have this message:

FAT-fs (sda1) : Volume was not properly unmounted. Some data may be corrupted. Please run fsck.
/dev/sda1 on /boot type vfat (rw,noatime,nodiratime,umask=0,shortname=mixed)

My USB stick has not been given an sd number, but I assume that sda1 is the USB stick, as all the other hard disks have different sd numbers.

I know that I can take the USB stick and have any errors corrected via Windows, but I would like to try using fsck via telnet. I am going to search the forums, but I would be grateful if someone could indicate what commands I need to run, or how to run the above command.
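On the question of whether sda1 really is the USB stick: the second line quoted above is in the format printed by the `mount` command, and it says that `/dev/sda1` is what is mounted at `/boot`, which is where unRAID mounts its flash drive. As a minimal sketch (the helper names are made up for illustration), the Python below parses text in that `DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)` format and looks up the device for a given mount point.

```python
import re

# Matches lines such as: /dev/sda1 on /boot type vfat (rw,noatime)
MOUNT_LINE = re.compile(
    r"^(?P<device>\S+) on (?P<mount_point>\S+) type (?P<fs_type>\S+) "
    r"\((?P<options>[^)]*)\)"
)


def parse_mount_output(text):
    """Parse `mount`-style output into a list of dicts, one per mount."""
    mounts = []
    for line in text.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            info = match.groupdict()
            info["options"] = info["options"].split(",")
            mounts.append(info)
    return mounts


def device_for(mount_point, mounts):
    """Return the device mounted at mount_point, or None if not found."""
    for mount in mounts:
        if mount["mount_point"] == mount_point:
            return mount["device"]
    return None
```

Feeding it the saved output of `mount` and calling `device_for("/boot", mounts)` confirms which device the FAT-fs warning refers to. Note that the file system is in use while the server is running from it, which is one reason the usual advice earlier in this thread is to check the stick on another PC.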
This topic is now archived and is closed to further replies.