JorgeB Posted January 11, 2017

How can I use ddrescue to recover data from a failing disk?

A disk can need this kind of recovery for a variety of reasons: a disk failing while parity is invalid, two disks failing with single parity, or a failing disk with pending sectors and no way to rebuild it using parity. For those cases you can use ddrescue to salvage as much data as possible.

To install ddrescue, install the NerdTools plugin, then go to Settings -> NerdTools and install ddrescue.

You need an extra disk (same size or larger than the failing disk) to clone the old disk to. Using the console/SSH, type:

ddrescue -f /dev/sdX /dev/sdY /boot/ddrescue.log

Both source and destination disks can't be mounted. Replace X with the source disk and Y with the destination. Always triple check these: if the wrong disk is used as destination it will be overwritten, deleting all data. If this is not the first time you use ddrescue, make sure you use a different log file (/boot/ddrescue.log) (or delete the existing one), or ddrescue will resume the previous run and possibly not do anything.

It's also possible to use an array disk as destination, though only if it's the same size as the original. To maintain parity you can only clone the partition, so the existing array disk needs to be a formatted unRAID disk already, in any filesystem. Also to maintain parity you need to use the md# device, and the array needs to be started in maintenance mode, i.e., not accessible during the copy, using the command:

ddrescue -f /dev/sdX1 /dev/md# /boot/ddrescue.log

Replace X with the source disk (note the 1 in the source disk identifier) and # with the destination disk number. I recommend enabling turbo write first or it will take much longer.

Example output during the 1st pass:

GNU ddrescue 1.22
     ipos:  926889 MB, non-trimmed:    1695 kB,  current rate:  95092 kB/s
     opos:  926889 MB, non-scraped:        0 B,  average rate:  79236 kB/s
non-tried:    1074 GB,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:  925804 MB,   bad areas:        0,        run time:  3h 14m 44s
pct rescued:   46.28%, read errors:       54,  remaining time:      3h 18m
                              time since last successful read:          0s
Copying non-tried blocks... Pass 1 (forwards)

After copying all the good blocks, ddrescue will retry the bad blocks, forwards and backwards. This last part can take some time depending on how bad the disk is. Example:

GNU ddrescue 1.22
     ipos:   17878 MB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:   17878 MB, non-scraped:   362496 B,  average rate:  74898 kB/s
non-tried:        0 B,  bad-sector:    93696 B,    error rate:     102 B/s
  rescued:    2000 GB,   bad areas:      101,        run time:   7h 25m  8s
pct rescued:   99.99%, read errors:      260,  remaining time:         25m
                              time since last successful read:         10s
Scraping failed blocks... (forwards)

After the clone is complete you can mount the destination disk manually or using, for example, the UD plugin (if the cloned disk is unmountable, run the appropriate filesystem repair tool; it might also be a good idea to run a filesystem check even if it mounts OK) and copy the recovered data to the array. Some files will likely be corrupt, and if you have checksums or are using btrfs you can easily find out which ones; if not, see below.
If you don't have checksums for your files (and don't use btrfs) there's a way you can still check which files were affected:

Create a temporary text file with a text string not present in your data, e.g.:

printf "unRAID " >~/fill.txt

Then fill the bad blocks on the destination disk with that string:

ddrescue -f --fill=- ~/fill.txt /dev/sdY /boot/ddrescue.log

Replace Y with the cloned disk (not the original) and use the existing ddrescue mapfile.

Finally mount the disk, manually or for example using the UD plugin, and search for that string:

find /mnt/path/to/disk -type f -exec grep -l "unRAID" '{}' ';'

Replace /path/to/disk with the correct mount point. All files containing the string "unRAID" will be output, and those are your corrupt files. This will take some time as all files on the disk will be scanned, and output is only displayed at the end; if there's no output then the bad sectors were in areas without any files.
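For future incidents, keeping a checksum manifest of each disk makes this corrupt-file check trivial. A minimal sketch, with illustrative paths:

# Create a manifest while the disk is still healthy
cd /mnt/disk1 && find . -type f -exec md5sum {} + > /boot/disk1.md5
# After a recovery, verify the clone against it and list only the failures
cd /mnt/path/to/clone && md5sum -c /boot/disk1.md5 2>/dev/null | grep -v ': OK$'

Any file reported as FAILED did not survive the clone intact and should be restored from backup.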
JorgeB Posted January 15, 2017

I'm getting low read speeds from my unRAID server, is there a fix?

There's an issue with the Samba included with unRAID v6.2 or above that, with some hardware configurations, may give slower than normal read speeds for Windows 8/10 (and related server releases) clients. My tests indicate that the HDD brand/model used is one of the main factors. Write speed is not affected, and Windows 7 clients are also not affected. To fix the issue add the following to "Samba extra configuration" on Settings -> SMB:

max protocol = SMB2_02 *

Stop and re-start the array for changes to take effect; Windows clients may need to reboot to reconnect.

Unrelated to this, 10GbE users should make two more changes for better overall performance (reads and writes):

1 - Change NIC MTU to 9000 (unRAID server and any other computer with a 10GbE NIC)
2 - Go to Settings -> Global Share Settings -> Tunable (enable direct IO): set to Yes **

* Should not be needed/make a difference on Unraid v6.5.x or newer
** Unlikely to make much of a difference starting with Unraid v6.8.x or newer due to changes in FUSE. This last one may also improve performance for gigabit users in some hardware configurations when reading from user shares.
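To confirm the bottleneck is SMB rather than the disks themselves, it can help to compare a local read on the server with what you see over the network. A quick sketch (the file path is illustrative, pick any existing large file):

# Read a large file straight from an array disk, bypassing Samba
dd if=/mnt/disk1/path/to/large_file of=/dev/null bs=1M
# Or test the raw device read speed (replace sdX with the actual device)
hdparm -t /dev/sdX

If local reads are fast but the same file is slow over SMB from a Windows 8/10 client, the max protocol setting above is the likely fix.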
RobJ Posted February 18, 2017 (edited)

Can you explain the Cache drive option types, and what is the difference between them?

Here is a table illustrating the differences (C=Cache drive, D=Data drive(s)):

                     Cache:No   Cache:Yes   Cache:Only   Cache:Prefer
Data should be on:   D          C+D         C            C+D
New files first to:  D          C           C            C
Files overflow to:   -          D           -            D
Mover moves:         No         C to D      No           D to C
Orphaned files:      C          -           D            -

Notes:
- Orphaned files are those files located where they don't belong (e.g. files on D with Cache:Only); they won't be moved by the Mover
- Files on both C and D are still visible in shares, for all options
- Shares are all root folders on all array data and Cache drives
- New files overflow to the secondary destination when there is not enough space on the preferred destination
- Cache:Prefer is the newest option. In general, it is now preferred over Cache:Only because it behaves the same but adds overflow protection. If you fill up the Cache drive, copying to that share will continue to a data drive, and not error out, as it would if marked Cache:Only. And if the Cache drive drops out, you will still be able to continue, using a data drive for the same share. Once the Cache drive is restored, then the Mover will move the share back to the Cache drive.

Some typical usage scenarios

Cache:Yes - data is written to the Cache drive, then Mover moves it to the data drives
- This is the typical Cache drive usage for large shares, to speed up writes to the array. The data will mainly be stored on the parity protected array, but writes will be at full speed to the Cache drive, then later moved at idle times to the array.

Cache:No - keeps all data on the data drives
- This is similar to Cache:Yes, but doesn't use the Cache drive, which is fine if you don't need the speed boost when writing files to the shares.
- An alternative usage is to keep most of the data on the array drives, but manually place selected data on a fast Cache drive, in the same share folders, for faster access to that data. It is still visible in the share but won't be moved to the data drives. For example, commonly accessed metadata might be placed there. This may help keep the data drives from spinning up. (This is similar to the alternative usage of Cache:Only)

Cache:Only - keeps all data on the Cache drive or pool
- This is typically used for smaller shares or shares you want faster access to.
- An alternative usage is to write and keep new data on the Cache drive, but manually move rarely accessed older files to the same share folders on the array data drives. Both sets of files are visible in the share. This may help minimize data drive spin up. See this post. (This is similar to the alternative usage of Cache:No)

Cache:Prefer - keeps data mainly on the Cache drive or pool, but allows overflow to the array
- This is similar to Cache:Only, typically used for smaller shares or shares you want faster access to. But it has additional advantages over Cache:Only - data that won't fit on the Cache drive can overflow to the array drives. Also, if the Cache drive fails, the same share folders on the data drives will still continue working. It's also useful if you don't yet have a Cache drive, but are planning to get one. Once it is installed, the Mover will automatically (on its schedule) move all it can to the Cache drive.
And if you need to do maintenance on the Cache drive or pool, you can move all the files to the array, and they will be moved back once you are done 'maintaining'.
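To see for yourself where a share's files currently live (the C and D columns in the table above), you can list the share's folder on each device. A minimal sketch, using 'Media' as an illustrative share name:

# Each device shows its slice of the share;
# /mnt/user/Media is the merged view of all of them
ls /mnt/cache/Media /mnt/disk*/Media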
JorgeB Posted March 8, 2017

I have an unmountable BTRFS filesystem disk or pool, what can I do to recover my data?

Unlike most other file systems, btrfs fsck (check --repair) should only be used as a last resort. While it's much better in the latest kernels/btrfs-tools, it can still make things worse. So before doing that, these are the steps you should try, in this order:

Note: if using encryption you need to adjust the path, e.g., instead of /dev/sdX1 it should be /dev/mapper/sdX1

1) Mount filesystem read only (safe to use)

Create a temporary mount point, e.g.:

mkdir /temp

Now attempt to mount the filesystem read-only.

v6.9.2 and older use:
mount -o usebackuproot,ro /dev/sdX1 /temp

v6.10-rc1 and newer use:
mount -o rescue=all,ro /dev/sdX1 /temp

For a single device: replace X with the actual device, don't forget the 1 at the end, e.g., /dev/sdf1
For a pool: replace X with any of the devices from the pool to mount the whole pool (as long as there are no devices missing), don't forget the 1 at the end, e.g., /dev/sdf1. If the normal read only recovery mount doesn't work, e.g., because there's a damaged or missing device, use the option below instead.

v6.9.2 and older use:
mount -o degraded,usebackuproot,ro /dev/sdX1 /temp

v6.10-rc1 and newer use:
mount -o degraded,rescue=all,ro /dev/sdX1 /temp

Replace X with any of the remaining pool devices to mount the whole pool, don't forget the 1 at the end, e.g., /dev/sdf1. If all devices are present and it doesn't mount with the first device you tried, use the other(s); the filesystem on one of them may be more damaged than the other(s). Note that if there are more devices missing than the profile permits for redundancy it may still mount, but there will be some data missing, e.g., mounting a 4 device raid1 pool with 2 devices missing will result in missing data.

With v6.9.2 and older, these additional options might also help in certain cases (with or without usebackuproot and degraded); with v6.10-rc1 and newer, rescue=all already uses all these options and more:

mount -o ro,notreelog,nologreplay /dev/sdX1 /temp

If it mounts, copy all the data from /temp to another destination, like an array disk. You can use Midnight Commander (mc on the console/SSH) or your favorite tool. After all data is copied, format the device or pool and restore the data.

2) BTRFS restore (safe to use)

If mounting read-only fails, try btrfs restore; it will try to copy all data to another disk. You need to create the destination folder first, e.g., create a folder named restore on disk2 and then:

btrfs restore -v /dev/sdX1 /mnt/disk2/restore

For a single device: replace X with the actual device, don't forget the 1 at the end, e.g., /dev/sdf1
For a pool: replace X with any of the devices from the pool to recover the whole pool, don't forget the 1 at the end, e.g., /dev/sdf1. If it doesn't work with the first device you tried, use the other(s).

If restoring from an unmountable array device use mdX, where X is the disk number, e.g., to restore disk3:

btrfs restore -v /dev/md3 /mnt/disk2/restore

If the restore aborts due to an error you can try adding -i to the command to skip errors, e.g.:

btrfs restore -vi /dev/sdX1 /mnt/disk2/restore

If it works, check that the restored data is OK, then format the original btrfs device or pool and restore the data.
3) BTRFS check --repair (dangerous to use)

If all else fails, ask for help on the btrfs mailing list or #btrfs on libera.chat. If you don't want to do that, and as a last resort, you can try check --repair:

If it's an array disk, first start the array in maintenance mode and use mdX, where X is the disk number, e.g., for disk5:

btrfs check --repair /dev/md5

For a cache device (or pool) stop the array and use sdX:

btrfs check --repair /dev/sdX1

Replace X with the actual device (use cache1 for a pool), don't forget the 1 at the end, e.g., /dev/sdf1
Squid Posted March 16, 2017 (edited)

Why do I see csrf errors in my syslog?

Starting with 6.3.0-rc9, unRaid includes code to prevent CSRF vulnerabilities. (See here.) Some plugins may have needed to be updated in order to properly work with this security measure. There are 3 different errors that you may see logged in your syslog:

missing csrf_token - This error happens if you have plugins that have either not been updated to conform to the security system, or the version of the plugin you are running is not up to date. Should you see this error, check for and install updates for your plugins via the Plugins tab. To my knowledge, all available plugins within Community Applications have either been updated to handle csrf_tokens or were not affected in the first place. If updating your plugins does not solve your issue, post in the relevant support thread for the plugin. There will be hints on the log line as to which plugin generated the error.

wrong csrf_token - CSRF tokens are randomly generated at every boot of unRaid. You will see this error if you have one browser tab pointed at a page in unRaid and on another tab you initiate a restart of unRaid. Note that the browser in question can also be on any device on your network; this includes other computers, tablets, phones, etc. IE: close the other browser tabs. This error can also be caused by mobile apps such as ControlR checking the status of unRaid after the server has been rebooted since the app was started. Restart the application to fix.

unitialized csrf_token - Thus far the community has never once seen any report of this being logged. Presumably it is an error generated by unRaid itself during Limetech's debugging period (ie: not plugin related), and should you see this you should post your diagnostics in the release thread for the version of unRaid you are running. EDIT: There is a possibility that if your rootfs is completely full due to misconfiguration of an application, you may see this particular token error.
ken-ji Posted March 19, 2017 (edited)

Why can't I delete a file (without permissions from root/nobody/Unix user/999/etc)? My VM/Docker created some files but I can't access them from Windows?

First a primer: Unix filesystem permissions/ACLs (access control lists) in a nutshell.

There are always 3 permission groups (owner, group, other):
owner - if you own the file, these permissions apply
group - if you are a member of the group, these permissions apply
other - if you are not the owner or a member of the group, these permissions apply

Permissions are cumulative, and there is no "deny" permission, so if one group grants permission, permission is granted.

You can easily check the permissions of a file from the shell:

root@Tower:~# ls -l /mnt/user0/slackware/
total 92
-rwxr-xr-x 1 nobody users   410 Aug 10  2016 getall.sh*
-rw-r--r-- 1 nobody users  5336 Oct 29 15:20 mirror-slackware-current.conf
-rwxr-xr-x 1 nobody users 39870 Nov 30  2013 mirror-slackware-current.sh*
-rw-r--r-- 1 nobody users  5397 Oct 29 15:20 mirror-slackware.conf
lrwxrwxrwx 1 root   root     27 Jan 28  2016 mirror-slackware.sh -> mirror-slackware-current.sh*
drwxrws--- 1 root   root     56 Jan 16  2014 multilib/
-rwxr-xr-x 1 nobody users  7165 May 20  2010 rsync_slackware_patches.sh*
drwxrws--- 1 root   root   4096 Jun 11  2015 sbopkgs/
lrwxrwxrwx 1 root   root     16 Jan 28  2016 slackware64 -> slackware64-14.1/
drwxrws--- 1 nobody users  4096 May 28  2016 slackware64-14.1/
drwxr-xr-x 1 root   root   4096 Dec  5 02:00 slackware64-14.2/
drwxrws--- 1 root   root   4096 Aug 11  2016 slackware64-14.2-iso/
drwxr-xr-x 1 nobody users  4096 Dec  5 02:01 slackware64-current/
drwxrws--- 1 nobody users  4096 May  1  2015 slackwarearm-14.1/

The permissions are displayed as the 10 character string at the start of the line: [l][rwx][rwx][rwx]
- the first character just tells us the type of the file/directory/link we are working with
- the first triad are the owner permissions, these are the permissions that apply to the owner of the file/directory/etc
- the 2nd triad are the group permissions, these are the permissions that apply to the members of the group of the file/directory/etc
- the last triad are the other/else permissions, these are the permissions that apply to users who are not the owner nor members of the group of the file/directory/etc

For files:
To read a file: read permission is needed. r--
To write a file: write permission is needed. -w-
To execute a file (as a script, or binary): execute is needed. --x

For directories:
To list the contents of a directory: read and execute are needed. r-x (weird things happen otherwise)
To create/delete files in a directory: write is needed on both the file and the directory. -w-

Example: for a file /mnt/user/share/a/b

drwxrwxr-x 1 nobody users 2 Mar 15 11:57 a/
-rw-rw-rw- 1 nobody users 2 Mar 15 11:57 a/b

Other than root, nobody, or members of users, the file b would be impossible to delete, since the write permission to the directory is missing. The file, however, can be overwritten by anybody.

Now, Windows access to the files is over SMB. SMB has two modes of access to the file (samba is the app providing the access):

Public/Guest access - (unRAID default) in this mode, all access is allowed. No passwords are needed. Files and directories are created with the nobody user. Permissions are typically set to rwxrwxrwx, which grants anybody read and write access.
Private/Secure access - in this mode, users need to be defined and passwords assigned. Files and directories are owned by the user who created them.
But when a share is created, unRAID assigns it to nobody with full read, write, and execute for all (owner, group, and others), ie: drwxrwxrwx.

The problem begins when there is a VM or docker creating files. Let's say the VM is using the user backup, and user alice is trying to delete the old backups from her Windows PC. Even if the shares are public, she would hit the error about requiring permissions from backup to delete the files. Why? Because samba will be using the user nobody to delete the files made by the backup user, and typically the file permissions won't allow it. If the shares are private/secure, it can still fail because the alice user is not the same as the backup user, and thus the permission problem exists again. (There are cases where this is not true, but that's a bit outside the scope of the FAQ.)

How do we correct the issue?

The easiest way to correct the issue is to run Tools | New Permissions, which paves over all of the shares and disks so files have rwxrwxrwx permissions and ownership by nobody. But we may not want that, since our VMs and dockers are, in effect, separate OSes with their own users, which may or may not coincide with the new attributes. So we log in to the terminal (over SSH or console) and run:

root@Tower:~$ chmod 777 -Rv /mnt/user/<share1> /mnt/user/<share2> ...

This will cause all the permissions of the affected shares to be set to rwxrwxrwx, which should normally fix the issues. In case you have more complex settings or requirements, feel free to discuss them in the forums, as this requires case by case settings that might be applicable to your specific scenario.

Initial stuff, will expand as needed.
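If you'd rather fix ownership as well (closer to what New Permissions does) while limiting the change to a single share, a minimal sketch with an illustrative share name:

# Give the files back to nobody:users and open up permissions on one share only
chown -R nobody:users /mnt/user/MyShare
chmod -R 777 /mnt/user/MyShare
# Verify the result
ls -l /mnt/user/MyShare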
Squid Posted April 17, 2017 (edited)

On 4/18/2016 at 10:17 PM, RobJ said:
This thread is reserved for Frequently Asked Questions, concerning unRAID as a NAS, its setup, operation, management, and troubleshooting. Please do not ask for support here, such requests and anything off-topic will be deleted or moved, probably to the FAQ feedback topic. If you wish to comment on the current FAQ posts, or have suggestions or requests for the FAQ, please put them in the FAQ feedback topic. Thank you!

Index to common questions

Some are from the wiki FAQ, some from this thread, and some from the LimeTech web site. There are many more questions with answers on the wiki FAQ.

Getting Started
- What is unRAID?
- What are the minimum system requirements?
- How is unRAID licensed? What counts as a drive against my license storage device limit? What are the current device limits for each license?
- How does parity work?
- How hard is unRAID to use if I don't know Linux?
- How do I get started?
- How do I know if I've configured my server properly?

General Questions
- How do I get help?
- What Are Page Allocation Stalls?
- Please explain how Linux permissions work! NEW
- My VM or Docker created some files but I can't access them from Windows? NEW
- Why can't I delete a file (without permissions from root/nobody/Unix user/999/etc)? NEW
- Why is so much of my RAM being used?
- What are the requirements for the second parity disk? Does the second parity drive have to be the same size as the first?
- What is "Boot GUI mode", and how do I change to it?
- Can I reorder my drives within my array? (Link currently dead)
- Why does the WebGUI crash when changing the number of array disk slots?
- Why is the webGUI not displaying any of my shares, yet I can see them over the network?
- Why can't unRAID find or add my SAS drive? I have a new SAS drive. How do I get unRAID to recognize and assign it?
- Reformat a SAS HDD to different block sizes, mainly 512, to use in UNRAID
- I set a share to private, and created a user with read/write permissions for that share. But on my Windows PC, when I try to navigate to that share, why does it say I don't have access when I enter that user name and password?
- I have software that requires using port 80. How do I change the HTTP port that unRAID uses?
- Is there a way to create a Windows shortcut for shutting down the unRAID server?
- I have an unmountable BTRFS data disk, what can I do to recover my data?
- How do I search the forum? (No, it's not always obvious) (obsolete?)
- How Can I Stop Mover From Running? NEW
- Why does ARRAY STARTED STALE CONFIGURATION appear at the bottom of the webUI?

Cache Drive/Pool
- How do I add a disk to create a redundant cache pool?
- How do I remove a cache pool disk?
- How do I replace or upgrade a cache pool disk?
- I have two different size cache devices, why is the reported space incorrect?
- Can I change my cache pool to RAID0 or other modes?
- How do I replace/upgrade my single cache device? (unRAID v6.2 and above only)
- How do I replace my cache drive? (all versions)
- Can I replace my cache device with a smaller one?
- Why are my cache disk(s) unassigned after a reboot?
- Can you explain the Cache drive option types, and what is the difference between them?
- I have an unmountable BTRFS cache disk or pool, what can I do to recover my data?
- How can I monitor a btrfs pool for read/write errors? NEW

Plugins
- Why am I unable to install plugins?
- Is there documentation for the unRAID plugin system?

Maintenance and Troubleshooting
- I need help! What do I do?
Why do I see "Cannot open root device null" and unRaid will not boot? NEW I'm having trouble with lockups / crashes / etc. How can I see the syslog following a reboot? Why is my GUI Slow and/or unresponsive? NEW Why do I see csrf errors in my syslog? NEW How do I know if I've configured my server properly? What do I do if I get a red X next to a hard disk? Why is my disk being marked as Read-Only? I have corruption in a file system, how do I fix it? I'm getting an error message " Failed to find user 'avahi' ". What do I do? Why after upgrading to a 6.2 version is my webGUI so slow? Why is my webGUI taking so long between pages? I'm getting low read speeds from my unRAID server, is there a fix? I found a segfault error in my syslog. How do I fix it? My system is crashing randomly, and/or I'm seeing a little data corruption. How do I fix it? How do I test my RAM, my system memory? I lost my Unraid password. What can I do? Fix Common Problems is telling me write cache is disabled on a drive. What do I do? One of my XFS formatted drives is unmountable. What do I do? unRAID FAQ's and Guides - * Guides and Videos - comprehensive collection of all unRAID guides (please let us know if you find one that's missing) * FAQ for unRAID v6 on the forums, general NAS questions, not for Dockers or VM's * FAQ for unRAID v6 on the unRAID wiki - it has a tremendous amount of information, questions and answers about unRAID. It's being updated for v6, but much is still only for v4 and v5. * Docker FAQ - concerning all things Docker, their setup, operation, management, and troubleshooting * FAQ for binhex Docker containers - some of the questions and answers are of general interest, not just for binhex containers * VM FAQ - a FAQ for VM's and all virtualization issues Know of a question that ought to be here? Please suggest it in the FAQ feedback topic. ------------------------------------------------------- Suggested format for FAQ entries - clearly shape the issue as a question or as a statement of the problem to solve, then fully answer it below, including any appropriate links to related info or videos. Optionally, set the subject heading to be appropriate, perhaps the question itself. While a moderator could cut and paste a FAQ entry here, only another moderator could edit it. It's best therefore if only knowledgeable and experienced users create the FAQ posts, so they can be the ones to edit it later, as needed. Later, the author may want to add new info to the post, or add links to new and helpful info. And the post may need to be modified if a new unRAID release changes the behavior being discussed. Moderators: please feel free to edit this post. Updated the links since the ones in the OP are no longer working... (Yes I had nothing else to do ) Edited August 18, 2019 by Squid add new link 1 2 Quote Link to comment
Squid Posted April 17, 2017 (edited)

How can I stop mover from running? (Possibly unRaid 6.3.3+ only)

Since you can't stop the array while mover is running, in order to stop mover, either from an SSH terminal or from the local keyboard/monitor, enter the following:

mover stop
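If you're not sure whether mover is actually still running in the first place, a quick check from the same terminal (a sketch; the name match is deliberately broad):

# Shows any mover process if one is running; no output means it isn't
pgrep -fl mover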
Frank1940 Posted April 20, 2017 (edited)

Why is my GUI Slow and/or unresponsive?

This problem has been traced to an anti-virus program suite and its settings in several cases. The link below will take you to two posts which provide a rather complete description of the problem and its solution. While you might not be running Avast, I have no doubt that other antivirus products will have a similar issue in the future. You should definitely investigate this area if you are having any type of problem with a slow, misbehaving or unresponsive GUI.

EDIT: Keep reading in the thread as there is continuing investigation into the issues with Avast.
Squid Posted August 7, 2017

I'm having trouble with lockups / crashes / etc. How can I see the syslog following a reboot?

unRaid runs completely from RAM, so there is normally no way to view the syslog from one boot to another. However, all 3 of the methods below will continually write the syslog (as it changes) to the flash drive up to the moment the lockup / crash / reboot of the server happens.

Preferred method: ENABLE THE SYSLOG SERVER AND MIRROR THE SYSLOG TO FLASH DRIVE (SETTINGS - SYSLOG SERVER)

Method 1: Via the User Scripts plugin. Install the user scripts plugin, and add this script to it, set to run at First Array Start Only: https://forums.lime-technology.com/topic/48707-additional-scripts-for-userscripts-plugin/?page=5#comment-581595

Method 2: Via the Fix Common Problems plugin. Within Fix Common Problems settings, put it into Troubleshooting mode.

Method 3: Via a screen session or at the local keyboard & monitor:

tail -f /var/log/syslog > /boot/syslog.txt

Pros / Cons:
- Method 1 will create a new syslog file on the flash drive at every boot, so that you can compare and not lose any historical data for reference.
- Method 2 logs a ton of extra information that may (or may not) help with diagnosing any issues. This extra information being logged, however, may contribute to a crash due to the log filling up if troubleshooting mode is enabled for more than a week or so. It also requires you to re-enable it on every boot (by design).
- Method 3's information is identical to Method 1, but requires you to re-enter the command every time you want the information. Additionally, if this is not entered at the local command prompt or via a screen session, then closing the SSH (Putty) window will stop the logging from happening.

BIG NOTE: In the case of lockups, etc., it is highly advised to have a monitor connected to the server and take a picture of whatever is on it prior to rebooting the server. It is impossible for any script to capture any errors that may have been output to the local monitor.
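For Method 3, a detached screen session keeps the logging alive after you disconnect from SSH. A minimal sketch, assuming screen is installed (e.g. via NerdTools):

# Start the logger in a detached session named 'syslog'
screen -dmS syslog bash -c 'tail -f /var/log/syslog > /boot/syslog.txt'
# Reattach later to check on it
screen -r syslog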
Squid Posted September 7, 2017

Why does ARRAY STARTED STALE CONFIGURATION appear at the bottom of the webUI?

See here: https://forums.lime-technology.com/topic/59845-unraid-os-version-640-rc8q-available/?page=11&tab=comments#comment-588602
Squid Posted September 10, 2017 (edited)

What are "Page Allocation Stalls"?

While not the most technical explanation, this is as far as I can tell pretty close to the actual truth: https://forums.lime-technology.com/topic/59858-trace-error-found-635/?tab=comments#comment-587518

(Updating to unRaid 6.4.0 will most likely also solve this problem as that OS version has better memory management)
SundarNET Posted October 31, 2018 (edited)

Reformat a SAS HDD to different block sizes, mainly 512, to use in UNRAID

This took me a few hours to find and work out, but was so much needed. OK, so I have just now done this for myself by installing sg3_utils onto my UNRAID OS using installpkg, all from the terminal:

1. Download the package into a tmp dir:
wget http://slackware.cs.utah.edu/pub/slackware/slackware64-14.1/slackware64/l/sg3_utils-1.36-x86_64-1.txz

2. Run this from that tmp dir after the download to install sg3_utils:
upgradepkg --install-new sg3_utils-1.36-x86_64-1.txz

3. Use this command to show SAS HDDs:
sg_scan -i

4. Use this command to format (obviously /dev/XXX should be the HDD you wish to format - MAKE SURE IT'S THE RIGHT ONE!):
sg_format --format --size=512 -v /dev/XXX

This has been allowing me to reformat the block size and use previously non-usable drives, saving tonnes of money.

WARNING: this format will destroy the HDD if interrupted during the process, so if you can, a UPS is recommended.

Have a great day, I love UNRAID!
JorgeB Posted November 28, 2018

How can I monitor a btrfs or zfs pool for errors?

As some may have noticed, the GUI errors column for the cache pool is just for show, at least for now, as the error counter remains at zero even when there are some. I've already asked and hope LT will use the info from btrfs dev stats/zpool status in the near future, but for now, anyone using a btrfs or zfs cache or unassigned redundant pool should regularly monitor it for errors, since it's fairly common for a device to drop offline, usually from a cable/connection issue. Since there's redundancy, the user keeps working without noticing, and when the device comes back online on the next reboot it will be out of sync. For btrfs a scrub can usually fix it (though note that any NOCOW shares can't be checked or fixed, and worse than that, if you bring online an out of sync device it can easily corrupt the data on the remaining good devices, since btrfs can read from the out of sync device without knowing it contains out of sync/invalid data), but it's good for the user to know there's a problem as soon as possible so it can be corrected. For zfs, the missing device will automatically be synced when it's back online.

BTRFS

Any btrfs device or pool can be checked for read/write errors with the btrfs dev stats command, e.g.:

btrfs dev stats /mnt/cache

It will output something like this:

[/dev/sdd1].write_io_errs 0
[/dev/sdd1].read_io_errs 0
[/dev/sdd1].flush_io_errs 0
[/dev/sdd1].corruption_errs 0
[/dev/sdd1].generation_errs 0
[/dev/sde1].write_io_errs 0
[/dev/sde1].read_io_errs 0
[/dev/sde1].flush_io_errs 0
[/dev/sde1].corruption_errs 0
[/dev/sde1].generation_errs 0

All values should always be zero, and to avoid surprises they can be monitored with a script using Squid's great User Scripts plugin. Just create a script with the contents below, adjust the path and pool name as needed, and I recommend scheduling it to run hourly. If there are any errors you'll get a system notification on the GUI and/or push/email if so configured.

#!/bin/bash
if mountpoint -q /mnt/cache; then
  btrfs dev stats -c /mnt/cache
  if [[ $? -ne 0 ]]; then
    /usr/local/emhttp/webGui/scripts/notify -i warning -s "ERRORS on cache pool"
  fi
fi

If you get notified you can then check with the dev stats command which device is having issues and take the appropriate steps to fix them. Most times when there are read/write errors, especially with SSDs, it's a cable issue, so start by replacing the cables. Then, since the stats are for the lifetime of the filesystem, i.e., they don't reset with a reboot, force a reset of the stats with:

btrfs dev stats -z /mnt/cache

Finally run a scrub, make sure there are no uncorrectable errors, and keep working normally; if there are any more issues you'll get a new notification.

P.S. you can also monitor a single btrfs device or a non redundant pool, but for those any dropped device is usually quickly apparent.

ZFS

For zfs click on the pool and scroll down to the "Scrub Status" section. All values should always be zero, and to avoid surprises they can be monitored with a script using Squid's great User Scripts plugin. Just create a script with the contents below, adjust the path and pool name as needed, and I recommend scheduling it to run hourly. If there are any errors you'll get a system notification on the GUI and/or push/email if so configured.

if mountpoint -q /mnt/tank; then
  (( $(zpool status -x tank | wc -l) < 2 ))
  if [[ $? -ne 0 ]]; then
    /usr/local/emhttp/webGui/scripts/notify -i warning -s "ERRORS on tank pool"
  fi
fi

If you get notified you can then check in the GUI which device is having issues and take the appropriate steps to fix them. Most times when there are read/write errors, especially with SSDs, it's a cable issue, so start by replacing the cables. zfs stats clear after an array start/stop or reboot, but if that option is available you can also clear them using the GUI by clicking on "ZPOOL CLEAR" below the pool stats. Then run a scrub, make sure there are no more errors, and keep working normally; if there are any more issues you'll get a new notification.

P.S. you can also monitor a single zfs device or a non redundant pool, but for those any dropped device is usually quickly apparent.

Thanks to @golli53 for a script improvement so errors are not reported if the pool is not mounted.
Squid Posted January 20, 2019 (edited)

Why do I see "Cannot open root device null" and unRaid will not boot?

See this thread here: https://forums.unraid.net/topic/74419-tried-to-upgrade-from-653-to-66-and-wont-boot-up-after-reboot/

And in particular read from this post onwards: https://forums.unraid.net/topic/74419-tried-to-upgrade-from-653-to-66-and-wont-boot-up-after-reboot/?tab=comments#comment-710968
Squid Posted June 30, 2019 (edited)

Fix Common Problems is telling me that Write Cache is disabled on a drive. What do I do?

This test has nothing to do with any given unRaid version. For some reason, hard drive manufacturers sometimes disable write cache on their drives (in particular shucked drives) by default. This is not a problem per se, but you will see better performance by enabling the write cache on the drive in question.

To do this, first make a note of the drive letter, which you can get from the Main tab. Then, from unRaid's terminal, enter the following (changing the sdX accordingly):

hdparm -W 1 /dev/sdm

You should get a response similar to this:

/dev/sdm:
 setting drive write-caching to 1 (on)
 write-caching =  1 (on)

If write caching stays disabled, then either the drive is a SAS drive, in which case you will need to utilize the sdparm commands (google is your friend), or the drive may be connected via USB, in which case you may not be able to do anything about this.

99% of the time, this command will permanently set write caching to be on. In some rare circumstances, this change is not permanent, and you will need to add the appropriate command to either the "go" file (/config/go on the flash drive) or execute it via the user scripts plugin (with it set to run at first array start only).

It should be noted that even with write-caching disabled this is not a big deal. Only performance will suffer; no other ill-effects will happen.

NOTE: If this does not work for you, then you will either need to contact the drive manufacturer as to why, or simply ignore the warning from Fix Common Problems.
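If the setting doesn't stick across reboots, the go file approach looks roughly like this. A minimal sketch: /dev/sdm is illustrative (device letters can change between boots, so verify on the Main tab), and the emhttp line is the stock content already in the file:

#!/bin/bash
# /boot/config/go - runs at every boot, before the webGUI starts
# Re-enable write cache on the affected drive (sdm is illustrative)
hdparm -W 1 /dev/sdm
# Start the Management Utility (stock line, keep it last)
/usr/local/sbin/emhttp &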
Frank1940 Posted October 16, 2019

How can I calibrate my UPS, silence alarms, change battery dates using apcupsd?

Unraid has apcupsd built in to manage UPSs, and questions have arisen about calibration, alarm silencing, and battery replacement date changes. @hpka did some extensive research and found that there is a utility included which gives the root user the ability to adjust many parameters. Exactly which ones can be adjusted will depend on the manufacturer and the model of UPS that you are using. Here is a link to @hpka's post:

By the way, the sudo command is not required when using the Unraid terminal session, as is shown below:

root@Tower:~# apctest
2019-10-16 11:52:33 apctest 3.14.14 (31 May 2016) slackware
Checking configuration ...
sharenet.type = Network & ShareUPS Disabled
cable.type = USB Cable
mode.type = USB UPS Driver
Setting up the port ...
Doing prep_device() ...
Frank1940 Posted October 16, 2019 (edited)

How do I use the Syslog Server?

Beginning with release 6.7.0, a syslog server functionality has been added to Unraid. This can be a very powerful diagnostic tool when you are confronted with a situation where the regular tools cannot or do not capture information about a problem because the server has become non-responsive, has rebooted, or spontaneously powered down. However, getting it set up has been confusing to many, so let's see if we can clarify setting it up for use.

Begin by going to Settings >>> Syslog Server. This is the basic Syslog Server page. You can click on the 'Help' icon on the toolbar and get more information for all three options.

The first option to be considered for use is Mirror syslog to flash. This one is the simplest to set up: select 'Yes' from the dropdown box, click on the 'Apply' button, and the syslog will be mirrored to the logs folder/directory of the flash drive. There is one principal disadvantage to this method: if the condition that you are trying to troubleshoot takes days to weeks to occur, it can do a lot of writes to the flash drive. Some folks are hesitant to use the flash drive in this manner as it may shorten the life of the flash drive. This is how the setup screen looks when the Syslog Server is set up to mirror to the flash drive.

The second option is to use an external Syslog Server. This can be another Unraid server, or virtually any other computer; you can find the necessary software by googling for syslog server <operating system>. After you have set up the computer/server, fill in the computer/server name or the IP address. (I prefer to use the IP address, as there is never any confusion about what it is.) Then click on the 'Apply' button and your syslog will be mirrored to the other computer. The principal disadvantage to this system is that the other computer has to be left on continuously until the problem occurs.

The third option uses a bit of trickery, in that we use the Unraid server with the problem as the Local syslog server. Let's begin by setting up the Local syslog server. After changing the Local syslog server dropdown to 'Enabled', a new menu option appears: Local syslog folder. This will be a share on your server, but choose it with care. Ideally, it will be a 'cache only' or a 'cache preferred' share; this will minimize the spinning up of disks due to the continuous writing of new lines to the syslog. A cache SSD drive would be the ideal choice here. (The folder that you see above is a 'cache preferred' share. The syslog will be in the root of that folder/share.) If you click the 'Apply' button at this point, you will have this server set up to serve as a Remote Syslog Server. It can now capture syslogs from several computers if the need should arise.

Now, we add the IP address of this server as the Remote syslog server. (Remember the mention of trickery: basically, you send data out of the server and it comes right back in.) As soon as you click on 'Apply', the logging of your syslog will start to a file named (in this case) syslog-192.168.1.242.log in the root of the selected folder (in this case, Folder_Tree). One very neat feature is that new entries are appended onto this file every time a line is added to the syslog.
This should mean that if you have a reboot of the server after a week of collecting the syslog, you will have everything from before the reboot and after the reboot in one file!

Thanks @bonienl for both writing this utility and the guidance in putting this together.
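A quick way to confirm the whole chain is working is to write a test line to the syslog and check that it lands in the capture file. A minimal sketch (the log filename follows the pattern from the example above and is illustrative):

# Write a recognizable test entry to the syslog
logger "syslog server test $(date)"
# Confirm it arrived in the captured file
grep "syslog server test" /mnt/user/Folder_Tree/syslog-192.168.1.242.log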
Squid Posted December 21, 2019

What are the causes of unclean shutdowns? (Why does a parity check happen whenever I power on my server?)

https://forums.unraid.net/topic/86385-docker-containers-have-the-be-off-when-you-power-off-the-array/?tab=comments#comment-801379
JorgeB Posted February 4, 2020

What can I do to keep my Ryzen based server from crashing/locking up with Unraid?

Ryzen on Linux can lock up due to issues with c-states, and while this should mostly affect 1st gen Ryzen, there are reports that 2nd and even 3rd gen can be affected in some cases. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "typical current idle" (or similar). If there's no such setting in the BIOS, try instead to disable C-States globally. Also note that there have been some reports that with some boards the setting above is not enough and only completely disabling C-States brings stability.

Also, many of these servers seem to be running overclocked RAM. This is known to cause stability issues and even data corruption on some Ryzen/Threadripper systems, even if no errors are detected during memtest. Server and overclock don't go well together: respect the max RAM speed for your configuration and CPU as listed in the tables below.

Note: Ryzen based APUs don't follow the same generation convention as regular desktop CPUs and are generally one generation behind, so for example the Ryzen 3200G is a 2nd gen CPU.

1st gen Ryzen:
2nd gen Ryzen:
3rd gen (3xxx) and Zen3 (5xxx) Ryzen:
Threadripper 1st gen:
Threadripper 2nd gen:
Threadripper 3rd gen:
JorgeB Posted April 6, 2021

Why are files not being moved by the Mover?

These are some common reasons the Mover is not working as expected:

- If using the mover tuning plugin, the first thing to do is to check its settings, or remove it and try without it just to rule it out.
- The "use cache pool" option for the share(s) is not correctly set. See here for more details, but basically cache=yes moves data from pool to array, cache=prefer moves data from array to pool, and the cache=only and cache=no options are not touched by the Mover.
- Files are open, they already exist in the destination, or there's not enough space in the destination. Enable Mover logging (Settings -> Scheduler -> Mover Settings) and the syslog will show what the error is.
- If it's a not enough space error, note that split level overrides allocation method; also, minimum free space for the share(s) must be correctly set. The usual recommendation is to set it to twice the max file size you expect to copy/move to that share.

If none of these help, enable Mover logging, run the Mover, download the diagnostics and please attach them to a new thread in the general support forum.
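To check the "files are open" case yourself before digging further, a minimal sketch, assuming the pool is mounted at /mnt/cache:

# List any processes holding files open under the cache pool
lsof +D /mnt/cache 2>/dev/null

No output means nothing has files open there, so open files aren't what's blocking the Mover.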
Squid Posted January 29, 2022

Where can I see what folders are taking up my RAM?

If you think (or have been told on the forum) that something somewhere is filling up your RAM (rootfs etc), then this might help in diagnosing exactly where, to help you find out why.

From the Plugins tab, install a plugin and enter in this URL:

https://raw.githubusercontent.com/Squidly271/misc-stuff/master/memorystorage.plg

NOTE: this does not actually install anything, but is simply a useful way to run a script.

You will see where all the memory in your RAM is being consumed. Pay particular attention to the last few lines (where it will detail /mnt). If you have anything listed under /mnt, that would mean that (most likely) a docker app is directly referencing a disk or pool that doesn't actually exist (ie: any actual disks and pools existing will not be listed). Other common areas for trouble would be /tmp and /var/log.

This script (while hopefully being useful) can potentially take a number of minutes to run, especially if you have bypassed the OS or Unassigned Devices (eg: rclone) and are making your own mount points anywhere in the system. Because it's impossible for this script to know that you are making your own mountpoints manually out of system control, it will think that this is in RAM and calculate the space taken accordingly.

Also, do not be deceived by some of the entries in this list. Many of the folders listed will consume a couple of hundred meg. It's the folders which take up gigabytes that you would be most concerned about.
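If you'd rather poke around manually from the terminal, du can give a rough equivalent. A sketch, not the plugin's actual script; note that -x stops du from descending into mounted disks and shares, and /var/log is its own tmpfs mount so it's checked separately:

# Largest directories on the RAM-backed root filesystem, two levels deep
du -xh --max-depth=2 / 2>/dev/null | sort -h | tail -n 20
# Check the log tmpfs on its own
du -sh /var/log/* 2>/dev/null | sort -h | tail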
Squid Posted January 29, 2022

My server won't properly wake up after S3 Sleep

Note that the OS does not officially support S3 sleep; this is handled by an auxiliary plugin (Dynamix S3 Sleep). But for some users, the following seems to allow them to wake up. Your mileage may vary. Read from here down
Squid Posted October 20, 2022

My Proxy isn't working in Unraid 6.11.x

See this post for Unraid as a whole: https://forums.unraid.net/topic/69785-trial-cannot-connect-through-web-proxy/#comment-1182405

And for the apps tab, see this post: https://forums.unraid.net/topic/38582-plug-in-community-applications/page/122/#comment-848436
JorgeB Posted April 27

Can I use a cache, log, special, spare and/or dedup vdev with my zfs pool?

At this time (Unraid v6.12) they cannot be added to a pool using the GUI, but you can add them manually and have Unraid import the pool. A few notes:

- currently zfs must be on partition #1; for better future compatibility (though not guaranteed) I recommend partitioning the devices first with UD
- the main pool should be created with Unraid, then add the extra vdev(s) using the CLI and re-import the pool
- the available vdev types and what they do are beyond the scope of this entry; you can for example see here for more information
- please note that since the GUI doesn't support this, it might give unpredictable results if you then try to replace one of the pool devices, so if you plan to use this I recommend for now doing any needed device replacement with the CLI

How to:

First create the main pool using Unraid; in this example I've created a 4 device raidz pool. Start the array, format the pool if it's a new one, and with the array running partition and then add the extra vdev(s) using the command line. To partition the devices with UD you need to format them, but there's no need to use zfs; I usually format with xfs since it's faster. Just format the device and leave it unmounted.

To add a vdev to the pool, use the CLI (you need to use -f to overwrite the existing filesystem; always double check that you are specifying the correct devices, and note the 1 at the end for the partition). A few examples:

- add a 2-way mirror special vdev:
zpool add tank -f special mirror /dev/sdr1 /dev/sds1

- add a 2-way mirror log:
zpool add tank -f log mirror /dev/sdt1 /dev/sdu1

- add a striped cache vdev:
zpool add tank -f cache /dev/sdv1 /dev/sdw1

- add a 2-way mirror dedup vdev:
zpool add tank -f dedup mirror /dev/sdx1 /dev/sdy1

- add a couple of spares:
zpool add tank -f spare /dev/sdb1 /dev/sde1

When all the vdev(s) are added to the pool, stop the array. Now you need to re-import the pool:
- unassign all pool devices
- start the array (check the "Yes I want to do this" box)
- stop the array
- re-assign all pool devices, including the new vdev(s); the main pool should be assigned to the top slots, order doesn't matter for the remaining devices, but assign all devices sequentially, i.e., don't leave empty slots in the middle of the assigned devices
- start the array, and the existing pool will be imported with the new vdev(s)
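Before stopping the array for the re-import, you can confirm the new vdevs were accepted, using the pool name 'tank' from the examples above:

# Show the pool layout; the new special/log/cache/dedup/spare
# vdevs should appear as their own sections under the pool
zpool status tank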