Naming
========================

Problem: associate names with objects, which allows systems and users to
identify, locate, or share objects.

Examples: URLs, DNS names, phone numbers, ...

The concept of a name is analogous to that of an identifier, though the
latter is usually restricted to names that are not human-readable.

Key operation: resolution -- maps a name to an object.

In practice, resolution handles the most critical part of this mapping,
typically from the object's name to the object's location. Fetching the
object itself is considered to be outside the resolution process.

Locating an object might involve multiple resolution steps. Example: from a
DNS name you can find an IP address, which at some point is used to find an
Ethernet address.

A few relevant concepts:

- Namespace - Set of possible names.

- Naming authority - Entity that governs the naming system. Usually
  responsible for defining rules regarding the namespace and assigning
  names. Examples?
  One of the issues that naming authorities usually have to address is how
  to avoid name collisions, i.e., ensure that there is at most one object
  that has a certain name. How is this ensured?

- Directory - (Logically) centralized system that keeps the mapping from
  names to objects and performs name resolution.

Names have several distinguishing properties that constrain several aspects
of system design. Some of the most important ones:

- Scope (local vs. global)
  Does a name have the same meaning irrespective of the context in which it
  is used? Local names simplify some aspects of system design (e.g. avoiding
  name collisions or providing efficient resolution) but require some form
  of translation when used across different contexts.
  E.g., IP addresses. Normally these are global, but if you consider NAT,
  the answer might differ. Why are NATs necessary? What problems do they
  introduce?

- Flat vs. hierarchical
  A flat name allows for more flexibility and makes it easy to handle
  reconfigurations in the system, but it makes the lookup process more
  difficult, especially at a large scale.

- Persistent vs. non-persistent
  A persistent name is one that is invariant, even when the object that it
  references moves or is replicated. To achieve this, a persistent name must
  not be tied to any administrative domain or entity, which implies that an
  object can change its domain without having to change its name. Examples?

- Semantic-freedom
  A semantic-free name neither embeds information about the organization,
  administrative domain, or network provider in which it originated or is
  currently located, nor is human-friendly.

Scalable name resolution
------------------------

Many simple proposals work well at a small scale, e.g.:

- HOSTS.TXT (see assigned reading)
- ARP, which broadcasts resolution requests

But other techniques are needed to implement scalable resolution.

Let's look at the example of DNS (more details in the paper).

Architecture: name servers (repositories of partial information) and
resolvers (front-ends that implement the client-side protocols). Both
functions can be co-located, e.g. an ISP may offer both as a service
implemented by the same server.

How is scalability achieved? Two important mechanisms: zones and caching.

Zones - partitioning of the namespace. Different zones correspond to
different subsets of the name tree, and are managed by different naming
authorities and different name servers. The top-level domains are managed by
ICANN, which delegates the authority over sub-domains to other entities.

For each domain there is an authoritative name server, which is responsible
for maintaining up-to-date information about the domain and can delegate
sub-domains to other name servers. The answers provided by the authoritative
name server have been configured by an original source, e.g. a sysadmin. The
authoritative name server is replicated for fault tolerance.
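
The partitioning into zones can be sketched as a longest-suffix match: a
name belongs to the most specific zone whose subtree contains it. This is
only an illustration, not how any real name server is implemented; the
zone table and the server labels in it are made up for the example.

```python
# Hypothetical zone table: zone name -> label of the name server that is
# authoritative for it. All entries are invented for illustration.
ZONES = {
    ".": "root-server",
    "edu": "edu-server",
    "mit.edu": "mit-server",
}


def responsible_zone(name, zones=ZONES):
    """Return the most specific zone whose subtree contains `name`."""
    labels = name.rstrip(".").lower().split(".")
    # Try suffixes from most to least specific,
    # e.g. csail.mit.edu, then mit.edu, then edu.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    # No delegated zone matched, so the root zone is responsible.
    return "."


if __name__ == "__main__":
    for n in ("csail.mit.edu", "stanford.edu", "www.example.com"):
        zone = responsible_zone(n)
        print(n, "->", zone, "served by", ZONES[zone])
```

Delegating a sub-domain corresponds to adding a more specific entry to the
table: names under it stop being answered by the parent zone's server.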
Address resolution - Host contacts a DNS resolver or recurser (a server
running at the host's ISP that iteratively resolves the query at the host's
request).

Go through example: the resolver contacts a root server (at a well-known
address) to ask for csail.mit.edu; the root server answers with the address
of the edu name server. The edu name server is contacted and answers with
the mit.edu name server. The final answer is obtained from there.

This works correctly and keeps the information that each server has to keep
track of small. But it does not scale with the number of queries near the
root.

Solution: Caching. Each record has an associated TTL value (could be days or
even weeks). Drawback: changes are not propagated immediately; need to wait
for all cached entries to expire. This is a form of weak consistency, as
seen in the first few lectures of the course.

Each DNS resolver has an associated cache. This avoids most contacts for
resolving the higher levels of the domain hierarchy.

Issues with DNS: due to the commercial value of DNS names, "profit has
replaced pragmatism as the dominant force shaping DNS."

In some settings, relying on a centralized entity (ICANN in this case) is
not an option, e.g. peer-to-peer systems. How to achieve scalable name
resolution in that case?

Next week: Scalable lookups in a flat namespace.
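
Sketch for the DNS walk-through above: a toy resolver that follows
referrals starting at the root and caches every record with its TTL. It is
a simplified model, not the real DNS protocol. The server table, the server
labels, the address 192.0.2.10, and the TTL values are all assumptions made
for the example, and names are assumed to be lowercase with no trailing dot.

```python
import time

# Hypothetical name servers: server label -> {suffix: (kind, value, ttl_seconds)}.
# "NS" is a referral to another server; "A" is a final answer for that exact name.
SERVERS = {
    "root": {"edu": ("NS", "edu-ns", 86400)},
    "edu-ns": {"mit.edu": ("NS", "mit-ns", 86400)},
    "mit-ns": {"csail.mit.edu": ("A", "192.0.2.10", 3600)},
}


class Resolver:
    def __init__(self, servers, clock=time.monotonic):
        self.servers = servers
        self.clock = clock
        self.cache = {}    # suffix -> (kind, value, expiry_time)
        self.contacts = 0  # number of queries sent to name servers

    @staticmethod
    def _suffixes(name):
        labels = name.split(".")
        return [".".join(labels[i:]) for i in range(len(labels))]

    def _cached(self, suffix):
        entry = self.cache.get(suffix)
        if entry is None:
            return None
        kind, value, expiry = entry
        if self.clock() >= expiry:  # TTL elapsed: drop the stale record
            del self.cache[suffix]
            return None
        return kind, value

    def _query(self, server, name):
        self.contacts += 1
        for suffix in self._suffixes(name):
            record = self.servers[server].get(suffix)
            if record is None:
                continue
            kind, value, ttl = record
            if kind == "A" and suffix != name:
                continue  # an address only answers an exact match
            self.cache[suffix] = (kind, value, self.clock() + ttl)
            return kind, value
        return None

    def resolve(self, name, max_steps=10):
        server = "root"
        # Start from the most specific record that is still cached.
        for suffix in self._suffixes(name):
            hit = self._cached(suffix)
            if hit is None:
                continue
            kind, value = hit
            if kind == "A":
                if suffix == name:
                    return value
                continue
            server = value
            break
        # Follow referrals until a final answer (bounded to avoid loops).
        for _ in range(max_steps):
            answer = self._query(server, name)
            if answer is None:
                return None
            kind, value = answer
            if kind == "A":
                return value
            server = value
        return None


if __name__ == "__main__":
    r = Resolver(SERVERS)
    print(r.resolve("csail.mit.edu"), "contacts:", r.contacts)
    print(r.resolve("csail.mit.edu"), "contacts:", r.contacts)
```

In this model the first lookup contacts three servers and the repeated
lookup contacts none. Once the address record expires, only the mit.edu
server is contacted again, because the referrals for edu and mit.edu are
still cached; this is how caching keeps load away from the root.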