Peer-to-peer systems: Bittorrent This lecture's case study bittorrent Usages: static bulk content Songs and videos Linux distributions ... Usage model: cooperative user downloads file from someone using simple user interface while downloding, bittorrent serves file also to others bittorrent keeps running for a little while after download completes Goal: get file out to many users quickly Encourage everyone to upload file Approach 1: source sends to all Source's bandwidth cost proportional to # receivers Approach 2: IP multicast IP distributes content along a spanning tree connecting receivers and source not available across the general Internet unclear how to bill Approach 3: application-level multicast with a single tree only interior nodes contribute bandwidth download rate limited by lowest upload bandwidth leaves contribute no bandwidth at all Approach 4: application-level multicast with multiple, interior-node disjoint trees all nodes contribute churn requires tree repair Approach 5: swarming (e.g. BitTorrent) each node connected to a random set of other peers neighbors swap pieces of the content random choice of which piece to download next (critical for performance!) can utlilize available upload bandwidth in any amount robust to churn, bandwidth fluctuations higher latency than trees BitTorrent Challenges: Tracking which peer has what Handling high churn rates Download rate proportional to upload rate Approach: Publisher a .torrent file on a Web server (e.g., suprnova.org) URL of tracker file name, length, secure hash Seed posts the URL for .torrent with tracker Seed must have complete copy of file Every peer that is online and has copy of a file becomes a seed Peer asks tracker for list of peers to download from Tracker returns list with random selection of peers Peers contact peers to learn what parts (128-512KB) of the file they have etc. Download from other peers Transport Peers pipeline on top of TCP divide a file part further in 16Kbyte subpieces keep typically 5 requests in flight Which pieces to download strict order? rarest first? ensures that every piece is widely available also helps with the seed and bootstrapping rapidly random? avoid overloading seed when starting download parallel download of same pieces? avoid waiting on slowest final algorithm random for first piece, then rarest-first, parallel for last piece Fairness Goal: avoid peers that don't upload Approach: peers maximize download rate, but uploader can choke peers reciprocrate uploading to peers that upload to them Which peers to unchoke? the ones who provide the highest download rate pick best 4 rate is calculated over 20s average best are recalculated every 10s every 30s unchoke one unused peer to see if it gives a good download rate What if you are choked by your peers? choke those peers if yon't receive a piece in 1 min What if download finished? use upload rate as the metric How well does BitTorrent in practice? Anecdotally it works great---many happy users Difficult to answer precisely, but let's check the measurement paper figures 1, 2, 3, 5, and 6 Efficiency How long does it take for a high-upload capacity find other high-performance peers? If there are a few high-upload peers and many low-upload peers, does Bittorrent perform at its best? (Best = the total download time is at its lowest.) Fairness Do high-upload peers receive download rates proportional to their upload rate? What does BitTyrant do? upload to peer who reciprocate the most maximize those peers lower upload bandwidth as long as peer reciprocates http://www.cs.washington.edu/homes/piatek/papers/BitTyrant.pdf