johnm160 Posted June 20, 2011 I have a 500GB 7200 RPM IDE drive sitting around that seems like it might be useful as a cache drive. I am running over gigabit Ethernet; would I see a performance hit writing to the server using an IDE drive, or is the drive still likely to be faster than the network?
lionelhutz Posted June 20, 2011 Jason & others: unRAID will automatically look after your files as they go to the cache disk and then get transferred to the array disks. You set each share to use or not use the cache disk, and you set the schedule. Data copied to the shares will first go to the cache and then be moved to the array disks when the mover runs. You do not have to manually copy to the cache disk, and you do not have to manually handle the transfers from the cache disk to the array disks. If the cache disk is full, then data goes directly to the array until the mover runs and empties it again. Too many people are over-thinking the use of the cache disk and trying to manually manage it. You shouldn't even have to pay any attention to it or look at the files on it. Peter
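Conceptually, the mover is just a scheduled script that walks the cache disk and moves each top-level share directory onto the array. This is an illustration of the idea, not the actual unRAID mover script; the `/mnt/cache` and `/mnt/user0` paths in the usage comment follow unRAID's conventions:

```shell
#!/bin/sh
# Sketch of a mover-style pass: move completed top-level share
# directories from the cache disk to the array. Hidden (dot-prefixed)
# directories are never matched by the glob, so they are left alone --
# the real mover skips those too.
move_off_cache() {
    cache=$1
    array=$2
    for share in "$cache"/*; do
        [ -d "$share" ] || continue
        name=$(basename "$share")
        mkdir -p "$array/$name"
        # copy preserving attributes, then remove the cache copy
        cp -a "$share/." "$array/$name/" && rm -rf "$share"
    done
}

# On a real server this would run from cron against unRAID's paths:
# move_off_cache /mnt/cache /mnt/user0
```

This is why a full cache is harmless: writes simply bypass it until a pass like this empties it again.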
Heretic Posted June 20, 2011 I'm using the cache disk only for Transmission. If I want to put files on the server, I always choose disk shares to copy straight to the selected drive. Using the cache was the easiest option to use a drive outside the array.
jortan Posted June 21, 2011 I'm using the cache disk only for Transmission. If I want to put files on the server, I always choose disk shares to copy straight to the selected drive. Using the cache was the easiest option to use a drive outside the array. I would write to the share, which would utilise the cache, for a couple of reasons: 1) It might prevent additional disk/s from spinning up (unless the disk/s you're writing to are also being seeded from). 2) It prevents some fragmentation of downloaded files, as the full contiguous files are written to your disks each night. But to each their own!
Heretic Posted June 21, 2011 I would write to the share, which would utilise the cache, for a couple of reasons: 1) It might prevent additional disk/s from spinning up (unless the disk/s you're writing to are also being seeded from). 2) It prevents some fragmentation of downloaded files, as the full contiguous files are written to your disks each night. But to each their own! I'm not sure I get what you mean. All torrents are on the cache drive in a folder that starts with a ".", so it is invisible to the mover script. All files remain on the cache until they are manually moved to the disk of choice, so there is no fragmentation when the files are moved. The only disk that is active 24/7 is the cache drive. Personally I see no point in using a drive inside the protected array for torrents; torrents already have protection -> verify.
dgaschk Posted June 21, 2011 Personally I see no point in using a drive inside the protected array for torrents; torrents already have protection -> verify. This is a completely different kind of protection. The array does nothing like "verify". It protects only from a total disk failure. There is no individual file protection or "verify". This can be added with the md5deep package. If the cache drive fails, everything on it is lost, but you can just download it again. If a file within unRAID becomes corrupt, there is no way to fix it. It will fail when accessed.
Heretic Posted June 22, 2011 That is very true. For critical data it's always best to have (offsite) backups. Apart from knowing that a file is corrupted, md5 doesn't really protect, does it? Creating parity (.par) files can, to a certain extent (it just seems like a lot of work). Having the entire array active to seed some of the files is, for various reasons, not an option for me. As you said, if the drive fails most files can be downloaded again.
jortan Posted June 22, 2011 Having the entire array active to seed some of the files sabnzbd/sickbeard. You'll never look back.
dgaschk Posted June 22, 2011 That is very true. For critical data it's always best to have (offsite) backups. Apart from knowing that a file is corrupted, md5 doesn't really protect, does it? Creating parity (.par) files can, to a certain extent (it just seems like a lot of work). Having the entire array active to seed some of the files is, for various reasons, not an option for me. As you said, if the drive fails most files can be downloaded again. You're right. I use md5deep in case I ever get a parity error. Then I can determine if the error is in parity or on disk. If none of my data files have errors, then I update parity. If I have corrupt data files, then I can rebuild that disk.
Heretic Posted June 23, 2011 Although we're going a bit off-topic (cache), this is quite interesting. Basically you're saying that under normal conditions you don't know if you can trust parity?
defected07 Posted June 23, 2011 You're right. I use md5deep in case I ever get a parity error. Then I can determine if the error is in parity or on disk. If none of my data files have errors, then I update parity. If I have corrupt data files, then I can rebuild that disk. How is this process configured? Do you have a cron job to scan all of your data drives and calculate the md5 of each file (if it's been changed or wasn't already calculated)? I'd like to do something like this, too.
dgaschk Posted June 23, 2011 A parity error could be the result of a corrupt data disk or corrupt parity. It's most likely parity, but this is not certain. I have just done it manually and saved copies of the hashes for all data disks in a .hashes directory at the top level of each drive. For my full drives it never has to change. I occasionally update the hashes for a drive as it fills. A cron job is a good idea, though. It doesn't take very long to compute for an entire 2TB drive, so a nightly job for all non-full data drives should work. Since I'm filling my media drives one at a time, I only have a single drive to compute. I don't worry about disks that hold backups, because if I have a parity error and my media drives are OK, then I can just delete the backups and recompute parity. Backups are easy to replace.
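A minimal sketch of that workflow: `md5deep -r /mnt/disk1` does the recursive hashing in one step, but plain `find` + coreutils `md5sum` produces the same kind of manifest, so the idea is visible. The `/mnt/disk1` and `.hashes` paths follow the post above; everything else is an illustration:

```shell
#!/bin/sh
# Build, and later verify, a per-disk hash manifest.
# md5deep -r <disk> would do the recursion in one command; this
# sketch uses find + md5sum so it works with plain coreutils.
build_manifest() {
    disk=$1
    manifest=$2
    # hash every regular file on the disk, skipping the .hashes dir
    # so the manifest never hashes itself
    find "$disk" -path "$disk/.hashes" -prune -o -type f -print0 |
        xargs -0 md5sum > "$manifest"
}

verify_manifest() {
    manifest=$1
    # --quiet prints only files that FAIL verification; a parity
    # error plus a clean verify here points the finger at parity
    md5sum -c --quiet "$manifest"
}

# Example use against one data disk (paths as in the post above):
# mkdir -p /mnt/disk1/.hashes
# build_manifest /mnt/disk1 /mnt/disk1/.hashes/disk1.md5
# verify_manifest /mnt/disk1/.hashes/disk1.md5
```

For the nightly cron job suggested above, a crontab entry pointing at a script that loops over the non-full disks would do; exact paths depend on your setup.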
Joe L. Posted June 23, 2011 A parity error could be the result of a corrupt data disk or corrupt parity. It's most likely parity, but this is not certain. True, or neither, if the bit was corrupted in memory. It is one chance out of N (where N = the total number of drives in your system + the number of other hardware items involved). It could be ANY disk, or any part of the I/O hardware, from memory to motherboard chipset.
defected07 Posted June 23, 2011 So how does the md5deep package work? Does it create one file with a hash in it for each file? Or does it use a hash => value scheme, where it has one file with each line containing the file name and its md5 value? Just want to figure out the best way to do this and let it be as automated as possible.
KellyVB Posted June 25, 2011 I'm confused about the different drives that have been talked about. I currently am using 2 WD 1TB Greens for data and a 1TB WD Blue for parity. These are all SATA drives and run at 3Gbps. I now have the Pro license, so I am thinking of adding a cache drive. The WD Blue is capable of 6Gbps, same as the WD Black, BUT only if it's connected to the new SATA 3 ports. Since these were not available until now, how can a WD Black be any faster than my Greens or Blue, which all run at 3Gbps on SATA 2 ports? I do understand that the Black has a 64MB cache, the Greens have 32MB, and I think the Blue is 16MB.
Joe L. Posted June 25, 2011 I'm confused about the different drives that have been talked about. I currently am using 2 WD 1TB Greens for data and a 1TB WD Blue for parity. These are all SATA drives and run at 3Gbps. I now have the Pro license, so I am thinking of adding a cache drive. The WD Blue is capable of 6Gbps, same as the WD Black, BUT only if it's connected to the new SATA 3 ports. Since these were not available until now, how can a WD Black be any faster than my Greens or Blue, which all run at 3Gbps on SATA 2 ports? I do understand that the Black has a 64MB cache, the Greens have 32MB, and I think the Blue is 16MB. Don't get sucked in by the marketing. Today's SATA disks, regardless of who they are made by, can basically attain a sustained max read speed of between 120 and 150 MB/s (multiply by 8 to get bps): 150 * 8 = 1200 Mbps, or 1.2Gbps. It does not make a bit of difference if the SATA link to it can theoretically transfer bits faster; it is not going to happen. As has been frequently said, a spinning disk can barely saturate a SATA-1 link. The BIGGEST factors for any disk are the areal density of the bits on the platters and the rotational speed of the platters. The cache on the disk is nearly useless when playing music or movies. (When was the last time you watched a movie that was less than 64MB in size?) Joe L.
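Joe's arithmetic is easy to verify: even at the top of that range, a spinning disk uses well under half of a 3Gbps SATA-2 link, let alone 6Gbps. The 150 MB/s figure below is his generous upper bound, not a measured number:

```shell
#!/bin/sh
# Sustained platter read speed vs. SATA link speed.
mb_per_s=150                    # generous sustained read for a 2011 disk
mbps=$((mb_per_s * 8))          # megabytes/s -> megabits/s
echo "disk sustained read: ${mbps} Mbps"   # 1200 Mbps = 1.2 Gbps
echo "SATA-2 link:         3000 Mbps"      # link is not the bottleneck
# To measure a real drive's own sustained read speed (needs root):
# hdparm -t /dev/sdX
```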
johnm160 Posted June 26, 2011 Would I notice a performance hit using a 500GB 7200 RPM IDE drive as a cache disk? Just trying to decide if I can utilize this disk or if I would benefit from buying another SATA drive for cache.
defected07 Posted June 27, 2011 Well, you should see a performance increase, as I believe your other drives are 5900 RPM, correct? Why do you think you'd see a performance "hit"? I have a 500GB 7200 RPM Hitachi cache drive, with my array drives all being green--of mixed Hitachi and WD sizes. I never did a benchmark comparison between the two, but, according to those specs, performance should be increased. Plus, I use the cache drive for running YAMJ.
Heretic Posted June 27, 2011 It also depends on the data density. With higher data density, a drive can spin slower to get the same data throughput as a drive with lower data density.
SSD Posted June 27, 2011 Depends on usage. For sequential read or write access, a slower, more dense surface can yield better performance than a higher-RPM disk with lower density. But for random access, the higher-RPM drive can perform better due to faster access time. Due to the way unRAID writes to the array, a higher-RPM drive will provide better performance than a higher-density, slower-RPM drive for array disks.
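The reason RPM matters for array writes is unRAID's read-modify-write: each write first reads the old data block and the old parity block, then writes the new data and updated parity, so every write pays extra rotational latency. The parity update itself is plain XOR; a sketch of the arithmetic with made-up byte values, not unRAID code:

```shell
#!/bin/sh
# Parity update arithmetic behind a read-modify-write:
#   new_parity = old_parity XOR old_data XOR new_data
# Four disk operations per write (read old data, read old parity,
# write new data, write new parity) -- hence RPM matters on the array.
old_data=0xA5     # placeholder byte read from the data disk
old_parity=0x3C   # placeholder byte read from the parity disk
new_data=0xF0     # placeholder byte being written
new_parity=$(( old_parity ^ old_data ^ new_data ))
printf 'new parity byte: 0x%02X\n' "$new_parity"   # prints 0x69
```

The cache disk sits outside this scheme: writes to it are single plain writes, which is part of why it is fast regardless.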
Heretic Posted June 27, 2011 Due to the way unRAID writes to the array, a higher-RPM drive will provide better performance than a higher-density, slower-RPM drive for array disks. Is this valid for the cache disk?
johnm160 Posted June 30, 2011 Well, you should see a performance increase, as I believe your other drives are 5900 RPM, correct? Why do you think you'd see a performance "hit"? I have a 500GB 7200 RPM Hitachi cache drive, with my array drives all being green--of mixed Hitachi and WD sizes. I never did a benchmark comparison between the two, but, according to those specs, performance should be increased. Plus, I use the cache drive for running YAMJ. Correct, my data drives are 5900 RPM. My concern about a performance hit is because the 500GB drive I am thinking about using as a cache drive has an IDE interface, not SATA.
defected07 Posted June 30, 2011 Correct, my data drives are 5900 RPM. My concern about a performance hit is because the 500GB drive I am thinking about using as a cache drive has an IDE interface, not SATA. Do you expect to put more than 500GB worth of data on your server per day? I don't see why that'd be an issue--actually, a smaller cache drive would be better. Use a cache drive that's as small as the daily amount of data you'd transfer, plus the size of whatever applications will permanently reside on your cache drive... And yes, you should get a SATA-interface cache drive...
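That sizing rule is simple arithmetic; the numbers below are made-up placeholders, not recommendations:

```shell
#!/bin/sh
# Cache-drive sizing per the rule above:
#   minimum cache >= daily transfer + permanently resident app data
daily_gb=40       # placeholder: typical daily writes, in GB
apps_gb=10        # placeholder: YAMJ/other apps living on the cache
min_cache_gb=$((daily_gb + apps_gb))
echo "minimum cache size: ${min_cache_gb} GB"
```

By that rule, a 500GB drive is far more cache than most home servers will ever fill in a day.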