Cluster Technology and File Systems
I came across a good article introducing clusters and file systems, so I am reposting it here. The original is at:
http://www.stalker.com/notes/SFS.html
Clustering technology is useful for large computing systems with strict uptime requirements. By deploying groups of servers and other resources together in a cluster, organizations can increase performance, balance traffic and create high availability.
The key to a successful implementation is that the cluster appears as a single highly available system to the end users. For the administrators also, the cluster system should be as easy to manage as a single-server system. This view of the entire Cluster as one large system is a feature called Single System Image or Single Service Image.
Many cluster designs require a Shared File System, so cluster members can work with the same data files at the same time. The most popular and well-known implementation of a Shared File System is a file server, also called NAS (network attached storage). These days an alternative architecture, called SAN (storage area network), is becoming more and more popular, and many companies want to implement Shared File Systems using SAN.
This document provides a brief overview of these technologies, explains why SAN is not a Shared File System, and provides an introduction to Cluster File Systems that can be used to build Shared File Systems using SAN.
This article covers the following:
1. File Systems
The disk devices used today are "dumb" devices from the user and application point of view. Each device has some number of blocks - fixed-size data segments, for example 1K (1024 bytes) in size. When the disk device is connected to a computer, it can process only very simple requests, like:
- READBLOCK(12345): read block number 12345 and send the block data to the computer.
- WRITEBLOCK(765645): receive the data from the computer and store it in block number 765645.
Disks can be connected to computers using IDE, SCSI, or FDDI interfaces. These interfaces are used to send commands and data to the disks, and to retrieve the data and command completion codes from the disks.
Disks themselves do not create any other structure, meaning that a disk device cannot create "files" or "file directories". All the disks work with are blocks, and all they can do is read and write those blocks.
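As a rough illustration of how little a disk device offers, the block interface described above can be modeled in a few lines of Python. The class name `BlockDevice` and its method names are made up for this sketch; only the READBLOCK/WRITEBLOCK idea and the 1K block size come from the text.

```python
BLOCK_SIZE = 1024  # 1K blocks, as in the example above


class BlockDevice:
    """A 'dumb' disk: it only knows numbered, fixed-size blocks."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._blocks = {}  # block number -> data, only for blocks written so far

    def read_block(self, number):
        """READBLOCK(n): return the data stored in block n."""
        self._check(number)
        # Blocks that were never written read back as zero bytes in this model.
        return self._blocks.get(number, bytes(BLOCK_SIZE))

    def write_block(self, number, data):
        """WRITEBLOCK(n): store exactly one block of data in block n."""
        self._check(number)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self._blocks[number] = bytes(data)

    def _check(self, number):
        if not 0 <= number < self.num_blocks:
            raise IndexError(f"no such block: {number}")


disk = BlockDevice(num_blocks=1_000_000)
disk.write_block(765645, b"x" * BLOCK_SIZE)
first_byte = disk.read_block(765645)[:1]   # b"x"
unwritten = disk.read_block(12345)         # all zero bytes
```

Note that nothing in this interface mentions files or directories; everything beyond numbered blocks has to be built on top of it.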
2. Single OS File Systems
Every modern Operating System (OS) has a component called a File System. That component is part of the OS kernel and it implements things like "files" and "file directories".
There are many different File Systems, and they use various methods and algorithms, but the same basic functions are present in most File Systems:
- The File System maintains some sort of FAT (File Allocation Table) - information that associates logical files with physical disk block numbers.
For example, it can say that "File1" is stored in 5 disk blocks with numbers 123400, 123405, 123401, 177777, 123456 and "File2" is stored in 6 disk blocks with numbers 323400, 323405, 323401, 377777, 323456, 893456.
- The File System maintains a list of all unused disk blocks. It automatically allocates new disk blocks when a file grows in size, and returns blocks to the list of unused blocks when a file decreases in size or is deleted.
- The File System processes application requests that need to read from or write to logical files. The File System converts these requests into one or several disk block read and write operations, using the information in the File Allocation Table.
- The File System maintains special files called "file directories" and stores the information about other files in these directories.
- The File System maintains the "file cache." When new information is written to a file, the File System stores it on disk and also copies it into the File System "cache buffers." When file information is read from disk, the File System passes it to the application program and also copies it into the "cache buffers." When the same (or another) application needs to read the same portion of the file again, the File System simply retrieves that information from its cache buffers instead of re-reading it from the disk.
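The first two functions in the list above (the allocation table and the list of unused blocks) can be sketched as a small Python data structure. This is only an illustration: the class `AllocationTable` and its methods are hypothetical names, and a real File System keeps this information on disk in a far more compact form. The block numbers are the ones used in the example above, plus the assumed unused blocks 13477 and 13478.

```python
class AllocationTable:
    """Maps each file name to the ordered list of disk blocks that hold it."""

    def __init__(self, free_blocks):
        self.files = {}                # file name -> list of disk block numbers
        self.free = list(free_blocks)  # unused disk block numbers

    def grow(self, name):
        """Take one unused block and append it to the file."""
        block = self.free.pop(0)
        self.files.setdefault(name, []).append(block)
        return block

    def shrink(self, name):
        """Drop the file's last block and return it to the unused list."""
        block = self.files[name].pop()
        self.free.append(block)
        return block

    def disk_block(self, name, file_block):
        """Translate a 1-based file block number into a disk block number."""
        return self.files[name][file_block - 1]


table = AllocationTable(free_blocks=[13477, 13478])
table.files["File1"] = [123400, 123405, 123401, 177777, 123456]
table.files["File2"] = [323400, 323405, 323401, 377777, 323456, 893456]

fifth = table.disk_block("File1", 5)   # 123456
new_block = table.grow("File2")        # 13477; File2 now has 7 blocks
```

Because every allocation goes through the single `free` list, a block can belong to at most one file at a time. That property is what the later sections rely on.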
The following figure illustrates how a File System works:
In this example, the File System serves requests from two applications.
Application 1 asks the File System to read block number 5 from File1.
The File System finds the information for File1 in the File Allocation Table, and detects that this file has 5 blocks allocated and that file block number 5 is stored in disk block number 123456.
The File System uses the disk interface (IDE, SCSI, or any other one) to send the READBLOCK(123456) command to the disk.
The disk device sends the information from the specified block to the computer.
The File System places the read information into its cache buffers, and sends it to the application.
Application 2 asks the File System to write block number 7 into File2.
The File System finds the information for File2 in the File Allocation Table, and detects that this file has 6 blocks allocated. It checks the list of unused disk blocks, and finds the unused block number 13477. It removes the block number from the list of unused blocks and adds it as the 7th block to the File2 information in the File Allocation Table, so now File2 is 7 blocks in size.
The File System uses the disk interface (IDE, SCSI, or any other one) to send the WRITEBLOCK(13477) command to the disk, and sends the block data that the application program has composed.
The disk device writes the block data into the specified disk block, and confirms the operation.
The File System copies the block data into its cache buffers.
If any application now tries to read block 5 from File1 or block 7 from File2, the File System will retrieve the information from its cache buffers, and it will not perform any disk operation.
All applications running on this operating system use the same File System. The File System guarantees the data consistency. If the disk block 13477 is allocated to File2, it will not be allocated to any other file - until File2 is deleted or is decreased in size to less than 7 blocks.
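The read and write sequences just described can be traced with a toy model. The sketch below is not how any particular File System is implemented; `Disk` and `FileSystem` are hypothetical classes, the disk counts the commands it receives so the effect of the cache is visible, and a write is only supported for an existing block or for the block immediately after the end of the file.

```python
class Disk:
    """Stand-in for the disk device; counts the commands it receives."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return self.blocks.get(n, b"")

    def write_block(self, n, data):
        self.writes += 1
        self.blocks[n] = data


class FileSystem:
    def __init__(self, disk, table, free):
        self.disk = disk
        self.table = table   # file name -> list of disk block numbers
        self.free = free     # list of unused disk block numbers
        self.cache = {}      # disk block number -> block data

    def read(self, name, file_block):
        disk_block = self.table[name][file_block - 1]
        if disk_block not in self.cache:           # cache miss: go to the disk
            self.cache[disk_block] = self.disk.read_block(disk_block)
        return self.cache[disk_block]

    def write(self, name, file_block, data):
        blocks = self.table[name]
        if file_block == len(blocks) + 1:          # the file grows by one block
            blocks.append(self.free.pop(0))
        disk_block = blocks[file_block - 1]
        self.disk.write_block(disk_block, data)
        self.cache[disk_block] = data              # keep a copy in the cache


disk = Disk()
disk.blocks[123456] = b"five"
fs = FileSystem(
    disk,
    table={
        "File1": [123400, 123405, 123401, 177777, 123456],
        "File2": [323400, 323405, 323401, 377777, 323456, 893456],
    },
    free=[13477],
)

fs.read("File1", 5)               # Application 1: one READBLOCK(123456)
fs.write("File2", 7, b"seven")    # Application 2: one WRITEBLOCK(13477)
again_1 = fs.read("File1", 5)     # served from the cache
again_2 = fs.read("File2", 7)     # served from the cache
```

After these calls `disk.reads` is 1 and `disk.writes` is 1: the two repeated reads never reach the disk.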
3. Network File System (NAS)
When server computers need to use the same data, a Network File System (also called NAS, or Network Attached Storage) can be used.
The Network File System is implemented using a File Server and a network. The File Server is a regular computer or specialized OS that has a regular File System and regular disk devices controlled with this File System.
The Network File System "stubs" running inside the OS kernel on "client" computers are "dummy" File Systems that relay application file requests to the File Server, using the network.
In this example, the File System on the File Server serves requests from several applications running on server "client" computers.
The only difference from the single OS case is in the request delivery: instead of internal communication between an application and the File System running inside the OS kernel, the "stub" sends the requests via the network, receives the responses, and passes them to the application. All "real work" (File Allocation Table and cache maintenance) is done on the File Server computer.
Since only the File Server computer has direct access to the physical disk, all applications running on server systems use the same File System - the File System running on the File Server. That File System guarantees the data consistency. If the disk block 13477 is allocated to File2, it will not be allocated to any other file - until File2 is deleted or is decreased in size to less than 7 blocks.
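The division of labour between the "stub" and the File Server can be sketched as follows. Everything here is an assumption made for illustration: the JSON message format, the class names `FileServer` and `Stub`, and the use of a plain function call in place of a real network. Actual protocols such as NFS or SMB are considerably more involved.

```python
import json


class FileServer:
    """Owns the only real File System (reduced here to a dict of file blocks)."""

    def __init__(self):
        self.files = {}   # file name -> list of block contents (strings)

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        blocks = self.files.setdefault(request["file"], [])
        index = request["block"] - 1          # requests use 1-based block numbers
        if request["op"] == "write":
            if index == len(blocks):          # append one block to the file
                blocks.append(request["data"])
            else:                             # overwrite an existing block
                blocks[index] = request["data"]
            reply = {"ok": True}
        else:                                 # "read"
            reply = {"ok": True, "data": blocks[index]}
        return json.dumps(reply).encode()


class Stub:
    """Client-side 'dummy' File System: it only forwards requests."""

    def __init__(self, send):
        self.send = send   # callable standing in for the network

    def read(self, name, block):
        return self._call({"op": "read", "file": name, "block": block})["data"]

    def write(self, name, block, data):
        self._call({"op": "write", "file": name, "block": block, "data": data})

    def _call(self, request):
        return json.loads(self.send(json.dumps(request).encode()))


server = FileServer()
client_a = Stub(server.handle)   # stub on the first "client" computer
client_b = Stub(server.handle)   # stub on the second "client" computer

client_a.write("File2", 1, "hello")
seen_by_b = client_b.read("File2", 1)   # "hello"
```

The stubs hold no allocation table and no cache, so there is nothing on the clients that could drift out of step with the server.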
4. Storage Area Network
A Storage Area Network (SAN) is a special type of network that connects computers and disk devices, in the same way as SCSI cables connect disk devices to one computer.
Any computer connected to a SAN can send disk commands to any disk device connected to the same SAN. On the physical level, a SAN can be implemented using FDDI, Ethernet, or other types of networks.
Some disk drives or arrays have "dual-channel" SCSI controllers and can be connected to two computers using regular SCSI cables. Since both computers can send disk read/write commands to that shared disk, this configuration has the same functionality as a one-disk SAN.
SAN provides Shared Disks, but SAN itself does not provide a Shared File System. If you have several computers that have access to a Shared Disk (via SAN or dual-channel SCSI), and try to use that disk with a regular File System, the logical structure of the disk will be damaged very quickly.
There are two main problems with Shared Disks and regular File Systems:
(1) Disk Space Allocation inconsistency
If computer X and computer Y have both connected ("mounted") a shared disk, their File Systems have each loaded the File Allocation Table into that computer's memory. Now, if some program running on computer X tries to write a new block to some file, the File System running on that computer will check its File Allocation Table and free blocks list, and it will allocate a new block, number 13477, to that file.
The File System running on that computer will modify its File Allocation Table, but this will have no effect on the File Allocation Tables loaded on other computers. If an application running on some other computer Y needs to expand a file, the File System running on that computer may allocate the same block 13477 to that other file, since it has no idea that this block has already been allocated by computer X.
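The allocation conflict can be reproduced with a very small simulation. The class `MountedFileSystem`, the file names and the initial block numbers are hypothetical; the point is only that each computer works on its own in-memory copy of the table and free list.

```python
import copy

# State of the shared disk at mount time (illustrative numbers).
on_disk = {
    "table": {"FileA": [100, 101], "FileB": [200]},
    "free": [13477, 13478],
}


class MountedFileSystem:
    """A regular File System: it works on its own in-memory copy."""

    def __init__(self, disk_state):
        self.table = copy.deepcopy(disk_state["table"])
        self.free = list(disk_state["free"])

    def append_block(self, name):
        block = self.free.pop(0)        # looks unused from this computer's view
        self.table[name].append(block)
        return block


fs_x = MountedFileSystem(on_disk)   # computer X mounts the shared disk
fs_y = MountedFileSystem(on_disk)   # computer Y mounts the same disk

block_x = fs_x.append_block("FileA")
block_y = fs_y.append_block("FileB")
conflict = block_x == block_y       # True: both picked block 13477
```

Two different files now claim disk block 13477, and whichever computer writes last silently overwrites the other file's data.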
(2) File Data inconsistency
If a program running on computer X has read block 5 from some File1, that block is copied into the computer X File System cache. If the same or another program running on computer X tries to read the same block 5 from the same file, the computer X File System will simply copy data from its cache.
A program running on some other computer Y can modify the information in block 5 of File1. Since the File System running on computer X is not aware of this fact, it will continue to use its cache, providing computer X applications with data that is no longer valid.
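The stale-cache problem is just as easy to reproduce. In this sketch the shared disk is a plain dictionary and `CachingFileSystem` is a hypothetical stand-in for a regular File System with a private cache on each computer.

```python
shared_disk = {123456: b"old data"}   # disk block number -> contents


class CachingFileSystem:
    """A regular File System with its own private block cache."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def read(self, block):
        if block not in self.cache:          # only the first read hits the disk
            self.cache[block] = self.disk[block]
        return self.cache[block]

    def write(self, block, data):
        self.disk[block] = data
        self.cache[block] = data             # updates this computer's cache only


fs_x = CachingFileSystem(shared_disk)   # computer X
fs_y = CachingFileSystem(shared_disk)   # computer Y

first_read = fs_x.read(123456)     # b"old data", now cached on X
fs_y.write(123456, b"new data")    # Y updates the shared disk
second_read = fs_x.read(123456)    # still b"old data": X's cache is stale
on_disk_now = shared_disk[123456]  # b"new data"
```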
These problems make it impossible to use Shared Disks with regular File Systems as Shared File Systems. Shared Disks can still be used for fail-over systems or in any other configuration where only one computer is actually using the disk at any given time. In such a configuration, the File System on computer Y starts to process the Shared Disk only when computer X has been shut down, or has stopped using the Shared Disk.
5. Cluster File System
Cluster File Systems are software products designed to solve the problems outlined above. They allow you to build multi-computer systems with Shared Disks, solving the inconsistency problems.
Cluster File Systems are usually implemented as a "wrapper" around some regular File System. Cluster File Systems use some kind of inter-server network to talk to each other and to synchronize their activities. That inter-server "interconnect" can be implemented using regular Ethernet networks, using the same SAN that connects computers and disks, or using special fast, low-latency "cluster interconnect" devices.
In this example, the Cluster File System is installed on several computers and serves requests from applications running on these computers.
Application 1 running on the first computer asks the Cluster File System to read block number 5 from File1.
The Cluster File System passes the request to the regular File System serving the Shared Disk, and the data block is read in the same way it is read on a single-server system.
Application 2 running on a different system asks the Cluster File System to write block number 7 into File2.
The Cluster File System uses the inter-server network to notify the Cluster File Systems on other computers that this block is being modified. The Cluster File Systems on those computers remove the old, obsolete copy of the block data from their caches.
The Cluster File System passes the request to the regular File System. It finds the information for File2 in the File Allocation Table, and detects that this file has 6 blocks allocated. It checks the list of unused disk blocks, and finds unused block number 13477. It removes the block number from the list of unused blocks and adds it as the 7th block to the File2 information in the File Allocation Table, so now File2 is 7 blocks in size.
The Cluster File System uses the inter-server network to notify the Cluster File Systems on other computers about the File Allocation Table modification. The Cluster File Systems on those computers update their File Allocation Tables to keep them in sync.
The File System uses the disk interface to send the WRITEBLOCK(13477) command to the Shared Disk, and sends the block data that the application program has composed.
The disk device writes the block data into the specified disk block, and confirms the operation.
The Cluster File System solves the inconsistency problems and allows several computers to use Shared Disk(s) as a Shared File System.
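The notification scheme described above can be sketched by giving each node a list of peers and two callbacks. This is a deliberately simplified model, not the design of any real product: the names `ClusterFileSystem`, `on_allocated` and `on_modified` are hypothetical, the "interconnect" is a direct method call, the order of the two notifications is simplified, and there is no locking, so two nodes allocating at exactly the same moment are not handled.

```python
class SharedDisk:
    def __init__(self):
        self.blocks = {}   # disk block number -> contents


class ClusterFileSystem:
    def __init__(self, disk, table, free):
        self.disk = disk
        self.table = {name: list(blocks) for name, blocks in table.items()}
        self.free = list(free)
        self.cache = {}
        self.peers = []    # other cluster members, reached via the interconnect

    def read(self, name, file_block):
        disk_block = self.table[name][file_block - 1]
        if disk_block not in self.cache:
            self.cache[disk_block] = self.disk.blocks[disk_block]
        return self.cache[disk_block]

    def write(self, name, file_block, data):
        blocks = self.table[name]
        if file_block == len(blocks) + 1:       # the file grows by one block
            new_block = self.free.pop(0)
            blocks.append(new_block)
            for peer in self.peers:             # keep allocation tables in sync
                peer.on_allocated(name, new_block)
        disk_block = blocks[file_block - 1]
        for peer in self.peers:                 # peers drop obsolete cached copies
            peer.on_modified(disk_block)
        self.disk.blocks[disk_block] = data
        self.cache[disk_block] = data

    def on_allocated(self, name, disk_block):
        self.free.remove(disk_block)
        self.table[name].append(disk_block)

    def on_modified(self, disk_block):
        self.cache.pop(disk_block, None)


disk = SharedDisk()
disk.blocks[123456] = b"five"
table = {
    "File1": [123400, 123405, 123401, 177777, 123456],
    "File2": [323400, 323405, 323401, 377777, 323456, 893456],
}
node1 = ClusterFileSystem(disk, table, free=[13477, 13478])
node2 = ClusterFileSystem(disk, table, free=[13477, 13478])
node1.peers, node2.peers = [node2], [node1]

node1.read("File1", 5)                 # cached on node1
node2.write("File2", 7, b"seven")      # node1 learns that 13477 is taken
node2.write("File1", 5, b"FIVE")       # node1 drops its cached copy
fresh = node1.read("File1", 5)         # b"FIVE", re-read from the Shared Disk
```

With the notifications in place, both problems from the previous section disappear in this model: node1 can no longer hand out block 13477, and it no longer serves the outdated copy of block 5.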
Cluster File System products are available for several Operating Systems.