In this lab you
will build a server and client that cache extents at the client, reducing the load
on the server and improving client performance. The main challenge is to ensure
consistency of extents cached at different clients. To achieve consistency we
will use the caching lock service from Lab 5.
First you'll add a local
write-back extent cache to each extent client. The extent client will
serve all extent operations from this cache; the extent client will contact the
extent server only to fetch an extent that is not present on the client. Then
you'll make the caches at different clients consistent by forcing write-back of
the dirty cached extent associated with a filehandle (and deletion of the clean
extent) when you release the lock on that filehandle.
Your client will be
a success if it manages to operate out of its local extent and lock cache when
reading/writing files and directories that other hosts aren't looking at, but
maintains correctness when the same files and directories are concurrently read
and updated on multiple hosts.
Begin by initializing
your Lab 6 branch with your implementation from Lab 5.
There is no new code for this lab, except for an update to the makefile to
build this lab.
% cd ~/lab
% git commit -am 'my solution to lab5'
Created commit ...
% git pull
remote: Generating pack...
...
% git checkout -b lab6 origin/lab6
Branch lab6 set up to track remote branch refs/remotes/origin/lab6.
Switched to a new branch "lab6"
% git merge lab5
Our measure of performance
is the number of put and get RPCs that your extent server receives. You can
tell the extent server to print out a line every 25 RPCs telling you the
current totals as you did for the lock server in Lab 5, by setting RPC_COUNT to 25.
Then you can start the
servers, run the test-lab-4-c script, and look in extent_server.log to see how many RPCs have
been received.
% export RPC_COUNT=25
% ./start.sh
% ./test-lab-4-c ./yfs1 ./yfs2
Create/delete in separate directories: tests completed OK
% grep "RPC STATS" extent_server.log
...
RPC STATS: 6001:801 6002:1402 6003:797
% ./stop.sh
The RPC STATS line indicates the number
of put, get and getattr RPCs received by the extent server. The above line is the output
of our solution for Lab 5. Your goal is to reduce those numbers to about a
dozen puts and at most a few hundred gets.
In Step One you'll add caching to your extent client, without
cache consistency. This cache will make your server fast but incorrect. (You
can simply modify extent_client.cc and extent_client.h, or if you'd like, you can add the code to a sub-class in a
separate file. Remember to git
add any new files you create.)
get() should check if the extent is cached, and if
so return the cached copy. Otherwise get() should fetch the extent from the extent server, put it in the
local cache, and then return it to the YFS client. put() should just replace the cached copy, and not
send it to the extent server. You'll find it helpful for the next section if
you keep track of which cached extents have been modified by put() (i.e., are "dirty"). remove() should delete the extent
from the local cache.
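The Step One behavior described above can be sketched roughly as follows. This is only an illustration, not the lab's actual code: fake_server, caching_extent_client, is_dirty(), and the local extentid_t and status definitions are hypothetical stand-ins for the RPC stub and the types in the provided extent_protocol header, and a real extent client would also need a mutex around the cache because YFS is multithreaded.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef unsigned long long extentid_t;  // assumed stand-in for the lab's extent id type
enum status { OK, NOENT };              // assumed stand-in for the lab's status codes

// Hypothetical stand-in for the RPC stub to the extent server; counts get RPCs.
struct fake_server {
  std::map<extentid_t, std::string> store;
  int gets = 0;
  status get(extentid_t id, std::string &buf) {
    ++gets;
    auto it = store.find(id);
    if (it == store.end()) return NOENT;
    buf = it->second;
    return OK;
  }
};

class caching_extent_client {
  struct entry {
    std::string data;
    bool dirty;  // true once put() has modified the cached copy
  };
  std::map<extentid_t, entry> cache_;
  fake_server &srv_;

 public:
  explicit caching_extent_client(fake_server &s) : srv_(s) {}

  // Serve from the cache; contact the server only on a miss.
  status get(extentid_t id, std::string &buf) {
    auto it = cache_.find(id);
    if (it == cache_.end()) {
      std::string fetched;
      status r = srv_.get(id, fetched);
      if (r != OK) return r;
      it = cache_.insert(std::make_pair(id, entry{fetched, false})).first;
      assert(!it->second.dirty);  // a freshly fetched extent starts clean
    }
    buf = it->second.data;
    return OK;
  }

  // Replace the cached copy and mark it dirty; nothing is sent to the server.
  status put(extentid_t id, const std::string &buf) {
    cache_[id] = entry{buf, true};
    return OK;
  }

  // Step One only: drop the extent from the local cache.
  status remove(extentid_t id) {
    cache_.erase(id);
    return OK;
  }

  bool is_dirty(extentid_t id) const {
    auto it = cache_.find(id);
    return it != cache_.end() && it->second.dirty;
  }
};
```

With a cache like this, repeated get() calls for the same extent reach the server once, and put() never reaches it at all, which is why the puts count should drop to zero in this step.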
When you're done, set RPC_COUNT and run test-lab-4-c giving the same directory
twice, and watch the statistics printed by the extent server. You should see
zero puts and somewhere between zero and a few hundred gets (or perhaps no numbers at all, if the value
of RPC_COUNT is more than the number of
gets). Your server should pass test-lab-4-a.pl and test-lab-4-b if you give it the same
directory twice, but it will probably fail test-lab-4-b with two different directories because it has no cache
consistency.
In Step Two you'll ensure
that each get() sees the latest put(), even when the get() and put() are from different YFS clients. You'll arrange this by ensuring
that your extent client writes a file's modified (dirty) cached extents back to
the extent server before the client releases the lock on that file.
Similarly, your server should delete extents from its cache when it releases
the lock on the relevant file.
You will need to add a
method to the extent client to eliminate an extent from the cache. This flush() method should first check whether the extent
is dirty in the cache, in which case it sends it to the extent server. Extents
that your server has removed (with the extent client's remove() method) should also be
removed from the extent server (if the extent server knows about them).
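One possible shape for flush() is sketched below. It assumes, as a design choice of this sketch and not something the lab prescribes, that remove() leaves a "removed" marker in the cache so that flush() knows to delete the extent at the server. counting_server, flushing_cache, and cached() are hypothetical names; the real code would issue RPCs through the extent client's existing stub.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef unsigned long long extentid_t;  // assumed stand-in for the lab's extent id type

// Hypothetical stand-in for the extent server; counts put and remove RPCs.
struct counting_server {
  std::map<extentid_t, std::string> store;
  int puts = 0;
  int removes = 0;
  void put(extentid_t id, const std::string &buf) { ++puts; store[id] = buf; }
  void remove(extentid_t id) { ++removes; store.erase(id); }
};

struct cache_entry {
  std::string data;
  bool dirty;    // modified locally, not yet written back
  bool removed;  // removed locally, server not yet told (assumption of this sketch)
};

class flushing_cache {
  std::map<extentid_t, cache_entry> cache_;
  counting_server &srv_;

 public:
  explicit flushing_cache(counting_server &s) : srv_(s) {}

  void put(extentid_t id, const std::string &buf) {
    cache_[id] = cache_entry{buf, true, false};
  }

  // Keep a marker instead of erasing, so flush() can propagate the removal.
  void remove(extentid_t id) { cache_[id] = cache_entry{"", false, true}; }

  bool cached(extentid_t id) const { return cache_.count(id) != 0; }

  // Write back or remove at the server if needed, then evict the entry.
  void flush(extentid_t id) {
    auto it = cache_.find(id);
    if (it == cache_.end()) return;  // nothing cached, nothing to do
    assert(!(it->second.dirty && it->second.removed));  // a removal marker is never dirty
    if (it->second.removed) {
      srv_.remove(id);
    } else if (it->second.dirty) {
      srv_.put(id, it->second.data);
    }
    cache_.erase(it);  // clean extents are simply dropped
  }
};
```

A clean extent costs no RPC at flush time; only dirty or removed extents generate traffic to the extent server.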
Your server will need to
call flush() just before releasing a lock
back to the lock server. You could just add flush() calls to yfs_client.cc before each release(). However, now that your
lock client handles the caching of locks, flushing the extents on every
release is overkill; what you really want is to flush the extents only when the
client is forced to give the lock back to the lock server.
We provide an interface for
this in the form of the lock_release_user class, defined in lock_client_cache.h. This is a base class
with a single virtual method: dorelease(std::string lockname). Your job is to subclass lock_release_user and implement that
subclass's dorelease method to call flush() on your extent client for whatever data is
about to lose its lock. Then, create an instance of this class and pass it into
the lock_client_cache object constructed in yfs_client.cc. Finally, your lock_client_cache must call the dorelease() method of its lu object before it releases a lock back to the
lock server. (Note that lu was defined and initialized
in the code we provided you for Lab 5.) Overall, this will ensure that any
dirty extents are flushed back to the extent server before the lock is released, so
that when the next client gets the lock and fetches the extent, it will see
consistent data.
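A minimal sketch of such a subclass follows. The lock_release_user declaration here is only a stand-in that mirrors the description above; use the one from lock_client_cache.h in your code. extent_flusher and recording_extent_client are hypothetical names, and the sketch assumes that the lock name is the decimal form of the extent id, which may not match how your yfs_client names its locks.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

typedef unsigned long long extentid_t;  // assumed stand-in for the lab's extent id type

// Stand-in mirroring the class described in the text (see lock_client_cache.h).
class lock_release_user {
 public:
  virtual void dorelease(std::string lockname) = 0;
  virtual ~lock_release_user() {}
};

// Hypothetical minimal extent client that just records which extents were flushed.
struct recording_extent_client {
  std::vector<extentid_t> flushed;
  void flush(extentid_t id) { flushed.push_back(id); }
};

// Flushes the extent guarded by a lock just before that lock is given back.
class extent_flusher : public lock_release_user {
  recording_extent_client *ec_;

 public:
  explicit extent_flusher(recording_extent_client *ec) : ec_(ec) {
    assert(ec_ != nullptr);
  }

  void dorelease(std::string lockname) override {
    // ASSUMPTION: the lock name is the decimal extent id, e.g. "42".
    extentid_t id = std::strtoull(lockname.c_str(), nullptr, 10);
    ec_->flush(id);
  }
};
```

In yfs_client.cc you would create one such object next to your extent client and hand it to the lock_client_cache when constructing it; the exact constructor arguments are those of the code provided for Lab 5.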
You should also keep extent
meta-data cached along with the extents, and flush dirty meta-data back to the
extent server along with the extents. If an extent is cached, then any calls
that set attributes should change the meta-data in the cache, and need not
propagate to the extent server until flush() is called.
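One way to keep meta-data alongside the data is sketched below. The attr struct is an assumed stand-in for the attribute type in the provided extent_protocol header (its field names may differ), and attr_cache is a hypothetical name. The sketch covers only the cached path; on a miss a real client would fetch the attributes from the extent server.

```cpp
#include <cassert>
#include <ctime>
#include <map>
#include <string>

typedef unsigned long long extentid_t;  // assumed stand-in for the lab's extent id type

// Assumed stand-in for the lab's attribute struct.
struct attr {
  unsigned int atime;
  unsigned int mtime;
  unsigned int ctime;
  unsigned int size;
};

struct cached_extent {
  std::string data;
  attr a;
  bool dirty;  // covers both the data and the meta-data in this sketch
};

class attr_cache {
  std::map<extentid_t, cached_extent> cache_;

 public:
  // Update data and meta-data together in the cache; no RPC is sent.
  void put(extentid_t id, const std::string &buf) {
    unsigned int now = static_cast<unsigned int>(std::time(nullptr));
    cached_extent &e = cache_[id];
    e.data = buf;
    e.a.size = static_cast<unsigned int>(buf.size());
    assert(e.a.size == buf.size());  // the size must fit in the 32-bit field
    e.a.mtime = now;
    e.a.ctime = now;
    e.dirty = true;  // flush() would later send data and attributes together
  }

  // Serve attributes from the cache; returns false on a miss.
  bool getattr(extentid_t id, attr &out) const {
    auto it = cache_.find(id);
    if (it == cache_.end()) return false;
    out = it->second.a;
    return true;
  }
};
```

Keeping a single dirty flag for data and attributes is the simplest option; separate flags would let flush() skip writing back unchanged data.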
When you're done with Step
Two your server should pass all the correctness tests (test-lab-4-a.pl, test-lab-4-b, and test-lab-4-c should execute correctly
with two separate YFS directories), and you should see a dramatic drop in the
number of puts and gets received by the extent server.
You must write all the code you hand in for the programming
assignments, except for code that we give you as part of the assignment. You
are not allowed to look at anyone else's solution (and you're not allowed to
look at solutions from previous years). You may discuss the assignments with
other students, but you may not look at or copy each others' code.
You will need to email your completed code (without binaries) as a
gzipped tar file to ds-assignment@mpi-sws.org by the deadline stated at the top of the page. To do this, switch
to the source directory and execute these commands:
% tar czvf MATR1-MATR2-lab6.tgz lab/
That should produce a file called MATR1-MATR2-lab6.tgz in that directory, where MATR1
and MATR2 are the
matriculation numbers of the team members. Attach that file to an email and
send it to the address above with the subject "Assignment 6 - LastName1
LastName2".
You will receive full credit
if your software passes the same tests we gave you when we run your software on
our machines.
Questions or comments regarding this
course? Please use the general
course mailing list or the teaching
staff mailing list.