[Reading Notes] MogileFS, a Distributed File Storage System
Original article: http://blogread.cn/it/article/4311?f=sa
Developed by Danga Interactive and written in Perl.
Source code: https://github.com/mogilefs/
Related wiki: http://code.google.com/p/mogilefs/wiki/Start?tm=6
Notes on its main features:
- Application level -- no special kernel modules required. Note: it runs entirely at the application level and needs no special kernel modules.
- No single point of failure -- all three components of a MogileFS setup (storage nodes, trackers, and the tracker's database(s)) can be run on multiple machines, so there's no single point of failure. (you can run trackers on the same machines as storage nodes, too, so you don't need 4 machines...) A minimum of 2 machines is recommended. Note: there is no single point of failure because each of the three main components (trackers, storage nodes, and the tracker database) can run on more than one machine; at least two machines are recommended.
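- Automatic file replication -- files, based on their "class", are automatically replicated between enough different storage nodes as to satisfy the minimum replica count as requested by their class. For instance, for a photo hosting site you can make original JPEGs have a minimum replica count of 3, but thumbnails and scaled versions only have a replica count of 1 or 2. If you lose the only copy of a thumbnail, the application can just rebuild it. In this way, MogileFS (without RAID) can save money on disks that would otherwise be storing multiple copies of data unnecessarily. Note: automatic replication. Each file belongs to a class, and the class sets a minimum replica count; MogileFS copies the file to enough distinct storage nodes to meet that minimum. ("Enough different storage nodes" refers to the number of nodes, not to nodes that have enough free space, which is how I first read it.) For a photo site, original JPEGs might keep at least 3 copies while thumbnails and scaled versions keep only 1 or 2; if the only copy of a thumbnail is lost, the application can regenerate it from the original. The disk saving, which puzzled me at first, follows from this: RAID applies the same redundancy to everything on the array, whereas per-class replica counts spend extra copies only on data that cannot be rebuilt.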
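- "Better than RAID" -- in a non-SAN RAID setup, the disks are redundant, but the host isn't. If you lose the entire machine, the files are inaccessible. MogileFS replicates the files between devices which are on different hosts, so files are always available. Note: with RAID the disks are redundant but the host is not, so once the whole machine is down the files on its disks cannot be reached. MogileFS places the replicas of a file on devices attached to different hosts, so a file whose class keeps more than one replica stays available when a single host fails.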
- Flat Namespace -- Files are identified by named keys in a flat, global namespace. You can create as many namespaces as you'd like, so multiple applications with potentially conflicting keys can run on the same MogileFS installation. Note: the namespace has a single level. Each file is identified by a key that is unique within its namespace, and the same key may be reused in different namespaces.
- Shared-Nothing -- MogileFS doesn't depend on a pricey SAN with shared disks. Every machine maintains its own local disks. Note: each machine owns and maintains its own local disks; no disks are shared between machines.
- No RAID required -- Local disks on MogileFS storage nodes can be in a RAID, or not. It's cheaper not to, as RAID doesn't buy you any safety that MogileFS doesn't already provide. Note: RAID is optional for MogileFS.
- Local filesystem agnostic -- Local disks on MogileFS storage nodes can be formatted with your filesystem of choice (ext3, XFS, etc..). MogileFS does its own internal directory hashing so it doesn't hit filesystem limits such as "max files per directory" or "max directories per directory". Use what you're comfortable with. Note: MogileFS implements its own directory hashing internally, so it does not depend on which local filesystem is chosen.
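
The per-class replica counts and the "different hosts" rule from the replication and "Better than RAID" items can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MogileFS code: the `Device` type, the `CLASS_MIN_REPLICAS` table, both function names, and the file counts are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    dev_id: int
    host: str


# Hypothetical per-class minimum replica counts (illustrative values).
CLASS_MIN_REPLICAS = {"original": 3, "thumbnail": 1}


def pick_devices(devices, file_class):
    """Pick at most one device per host until the class minimum is met."""
    needed = CLASS_MIN_REPLICAS[file_class]
    chosen = []
    used_hosts = set()
    for dev in devices:
        if dev.host in used_hosts:
            # A second copy on the same host adds no protection
            # against losing that host.
            continue
        chosen.append(dev)
        used_hosts.add(dev.host)
        if len(chosen) == needed:
            return chosen
    raise RuntimeError(
        f"only {len(chosen)} distinct hosts available, need {needed}"
    )


def total_copies(file_counts, replicas_by_class):
    """Total number of stored file copies for a given replication policy."""
    return sum(n * replicas_by_class[cls] for cls, n in file_counts.items())


devices = [
    Device(1, "hostA"),
    Device(2, "hostA"),
    Device(3, "hostB"),
    Device(4, "hostC"),
]
originals = pick_devices(devices, "original")
thumbs = pick_devices(devices, "thumbnail")

# Invented workload: 1000 originals, 4000 thumbnails.
file_counts = {"original": 1000, "thumbnail": 4000}
per_class = total_copies(file_counts, CLASS_MIN_REPLICAS)
uniform = total_copies(file_counts, {"original": 3, "thumbnail": 3})
print(per_class, uniform)
```

With these invented numbers, keeping thumbnails at one copy stores 7000 file copies, against 15000 if every file were kept three times. That gap is the disk saving the feature list describes.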
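
The flat namespace item can be pictured as a single lookup table keyed by the pair (namespace, key). The sketch below is a toy in-memory model; the `FlatStore` class, its methods, and the namespace and key names are hypothetical and are not part of any MogileFS client API.

```python
class FlatStore:
    """Toy model of a flat key space partitioned into namespaces."""

    def __init__(self):
        # (namespace, key) -> file contents; there are no directories.
        self._files = {}

    def put(self, namespace, key, data):
        self._files[(namespace, key)] = data

    def get(self, namespace, key):
        return self._files[(namespace, key)]


store = FlatStore()
# The same key in two namespaces refers to two unrelated files.
store.put("photos", "user/42/avatar", b"jpeg-bytes")
store.put("blog", "user/42/avatar", b"png-bytes")
print(store.get("photos", "user/42/avatar"))
print(store.get("blog", "user/42/avatar"))
```

The slashes in the key are ordinary characters here. Nothing is nested, which is what "flat" means in this context.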
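
The directory hashing mentioned in the last item can be illustrated by mapping a numeric file id to a fixed-depth path. This is a generic sketch of the technique and should not be read as MogileFS's actual on-disk layout: the `hashed_path` function, the 10-digit width, and the `.fid` suffix are assumptions made for the example.

```python
def hashed_path(file_id: int, width: int = 10) -> str:
    """Spread files over nested directories derived from a zero-padded id.

    Assumed layout: <digit 1>/<digits 2-4>/<digits 5-7>/<full id>.fid
    """
    if file_id < 0 or file_id >= 10 ** width:
        raise ValueError("file_id does not fit in the assumed width")
    padded = f"{file_id:0{width}d}"
    return f"{padded[0]}/{padded[1:4]}/{padded[4:7]}/{padded}.fid"


print(hashed_path(1234567))
print(hashed_path(9876543210))
```

Because each directory level is derived from at most three digits, with the default width no directory ever holds more than 1000 entries, regardless of which local filesystem sits underneath.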