[Operating System] {ud923} P4L2: Distributed File Systems

 

 

 Visual Metaphor

  

VFS: virtual file system

 

 

 

 Distributed File Systems

 

multiple machines involved in the delivery of the file system service together form a distributed file system.

client => local cache

server => data storage 

1 client + 1 server => a mini distributed file system

 

 

 

 

 DFS Models

 

in this lesson, the focus is "1 client + 1 server on different machines"

 Google/Facebook... use "both" (7th line)

 

 

 Remote File Service: Extremes

 

 

 

 

 Remote File Service: A Compromise

 

 

 

 

 Stateless vs. Stateful File Server

 

state information examples:

which clients access which file,

how many different clients are serviced
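
As a rough illustration of the two state examples above, here is a minimal sketch of the bookkeeping a stateful server might keep. The class and method names (`StatefulFileServer`, `clients_of`, `client_count`) are hypothetical and not taken from any real file server.

```python
from collections import defaultdict


class StatefulFileServer:
    """Toy server that remembers which clients access which files."""

    def __init__(self):
        # file name -> set of client ids currently accessing that file
        self.open_by_file = defaultdict(set)

    def open(self, client_id, filename):
        self.open_by_file[filename].add(client_id)

    def close(self, client_id, filename):
        self.open_by_file[filename].discard(client_id)
        if not self.open_by_file[filename]:
            # drop the entry once no client is using the file
            del self.open_by_file[filename]

    def clients_of(self, filename):
        # which clients access this file
        return set(self.open_by_file.get(filename, set()))

    def client_count(self):
        # how many different clients are currently serviced
        return len(set().union(*self.open_by_file.values()))


server = StatefulFileServer()
server.open("c1", "a.txt")
server.open("c2", "a.txt")
server.open("c1", "b.txt")
```

A stateless server would keep none of this; every request would have to carry all the information needed to serve it.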

 

 

 

 Caching State in a DFS

 

 

 

 

 

 

 

File Sharing Semantics on a DFS 

 

Transactional guarantees

=> the file system needs to export some interfaces/APIs

=> so that clients can specify which collection of files, or which collection of operations, should be treated as a single transaction

=> the file system can then guarantee that all those changes are atomically committed and atomically made visible in the file system.
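
The idea above can be sketched as a tiny in-memory store. This is only an illustration under assumptions: `TransactionalStore`, `begin`, `write`, `commit`, and `abort` are hypothetical names, not the API of any actual DFS, and a real system would also need durability and concurrency control.

```python
class TransactionalStore:
    """Toy file store whose visible state changes only on commit."""

    def __init__(self):
        self.files = {}  # committed, visible state: file name -> contents

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.pending = {}  # staged writes, not yet visible to readers

    def write(self, name, data):
        self.pending[name] = data

    def commit(self):
        # build the new state, then publish it with a single assignment,
        # so readers see either none or all of the staged writes
        self.store.files = {**self.store.files, **self.pending}
        self.pending = {}

    def abort(self):
        # discard the staged writes; the visible state is untouched
        self.pending = {}


store = TransactionalStore()
txn = store.begin()
txn.write("a.txt", "1")
txn.write("b.txt", "2")
visible_before_commit = dict(store.files)
txn.commit()
```

Before `commit()` the store still looks empty to readers; afterwards both files appear together.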

 

 

 

 

 

 

 

 

 File vs Directory Service

 

 

 

 

Replication and Partitioning 

these two techniques can be combined: files are partitioned into different groups or volumes,

and each group is then replicated, potentially with a different degree of replication.

For instance, you can separate read-only files from files that are also written to, and replicate the read-only partition to a greater degree.

Or you can keep frequently accessed files in smaller partitions and less frequently accessed files in larger partitions that hold more files.

Then you can use a higher degree of replication for the partition that holds the more frequently accessed files,

so that overall each machine receives approximately the same number of expected client requests.
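
One way to picture the load-balancing goal above is to choose each partition's replication degree from its request rate. The sketch below is an assumption-laden illustration: the function name `replicas_needed`, the request rates, and the per-machine capacity are all made up for the example.

```python
import math


def replicas_needed(partition_request_rates, per_machine_capacity):
    """Map each partition to a replica count that keeps every replica
    at or below the given per-machine request capacity."""
    return {
        name: max(1, math.ceil(rate / per_machine_capacity))
        for name, rate in partition_request_rates.items()
    }


# hypothetical request rates (requests per second) for two partitions
rates = {"hot": 900, "cold": 300}
plan = replicas_needed(rates, per_machine_capacity=300)

# expected load on each machine holding a replica of the partition
load = {name: rates[name] / plan[name] for name in rates}
```

With these made-up numbers the frequently accessed partition gets three replicas and the other gets one, so every machine sees the same expected load.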

 

 

 

 

 

Total files formula:

  • files_stored_per_machine * number_of_machines

Percentage lost formula:

  • (files_lost_per_single_failure / total_files) * 100
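
The two formulas can be checked with a short calculation. The numbers here (3 machines, 100 files each) are hypothetical. The example also assumes pure partitioning with no replication, so a failed machine loses every file it stores.

```python
def total_files(files_stored_per_machine, number_of_machines):
    # total files formula for a partitioned system
    return files_stored_per_machine * number_of_machines


def percentage_lost(files_lost_per_single_failure, total):
    # percentage lost formula
    return files_lost_per_single_failure / total * 100


# hypothetical setup: 3 machines, each storing 100 distinct files
total = total_files(100, 3)
lost = percentage_lost(100, total)
print(total, round(lost, 2))
```

If every machine instead held a full replica of the same files, a single failure would lose no files, and the percentage lost would be 0.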

 

 

 

 Network File System (NFS) Design

 

 

 

 

 

 

 

 

 

 NFS Versions

 

NFS v4 is stateful, which allows it by design to support operations like client caching and file locking

 

 

 

 

 

NFS allows files to be modified => NOT immutable

Distributed system => no guarantee that an update to a file will immediately be visible => NOT Unix

 

for both session and periodic, there are elements of the sharing semantics NFS supports that are session-like or periodic-like,

and whether it behaves with session or periodic semantics really depends on how NFS is configured.

So by default, NFS is really neither => not purely session-based or periodic-based

 

 

 

 

 Sprite Distributed File System 

 

 

 

 

 Sprite DFS Access Pattern Analysis

 

based on these observations, they first made the following decision.

write-back on close, which is what happens in session semantics, is not really necessary.

there are not many sharing situations, and most of the data will get deleted anyway.

so forcing the data to be written back to the server when the file is closed does not seem useful.

 

 the decisions are not really friendly to concurrency, but they observed that file sharing is very rare => that's okay. no need to optimize for concurrent situations.

 

 

 

 Sprite DFS From Analysis to Design

 

 

 

 

 File Access Operations in Sprite

 

 

 

 

 

 

posted @ 2019-05-30 06:00  ecoflex