Naming
========================

Problem: associate names with objects, which allows systems and users
to identify, locate, or share objects.

Examples: URLs, DNS names, phone numbers, ...

The concept of a name is analogous to that of an identifier, though the
latter term is usually restricted to names that are not human-readable.

Key operation: resolution -- maps name to object.

In reality resolution handles the most critical part of this mapping,
typically from the object's name to the object's location. Fetching
the object itself is considered to be outside the resolution process.

Locating an object might involve multiple resolution steps.
Example: From a DNS name you can find an IP address, which at
some point is used to find an Ethernet address.
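
The chain of resolution steps can be sketched as two table lookups. This
is only an illustration: the tables, names, and addresses are made up,
and it assumes the target host sits on the same LAN (otherwise the
Ethernet address looked up would be that of the next-hop router).

```python
# Toy tables; every name and address below is made up for illustration.
DNS_TABLE = {"www.example.org": "192.0.2.10"}      # DNS name -> IP address
ARP_TABLE = {"192.0.2.10": "02:00:00:aa:bb:cc"}    # IP address -> Ethernet address


def resolve(name, table):
    """One resolution step: map a name to what it refers to in `table`."""
    if name not in table:
        raise KeyError(f"cannot resolve {name!r}")
    return table[name]


def locate(dns_name):
    """Chain two resolution steps: DNS name -> IP address -> Ethernet address."""
    ip = resolve(dns_name, DNS_TABLE)
    mac = resolve(ip, ARP_TABLE)
    return ip, mac


print(locate("www.example.org"))
```

Note that each step uses a different namespace and a different
resolution mechanism; the output of one step is the name fed to the next.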

A few relevant concepts:

- Namespace - Set of possible names.

- Naming authority - Entity that governs the naming system. Usually
responsible for defining rules regarding the namespace and assigning
names. Examples?
One of the issues that naming authorities usually have to address is
how to avoid name collision, i.e., ensure that there is at most one
object that has a certain name. How is this ensured?

- Directory - (Logically) centralized system that keeps mapping from
names to objects and performs name resolution.

Names have several distinguishing properties that constrain
system design. Some of the most important ones:

- Scope (local vs global)
Does a name have the same meaning irrespective of the
context in which it is used?  Local names simplify some aspects of
system design (e.g. avoiding name collisions or providing efficient
resolution) but require some form of translation when being used
across different contexts.
E.g., IP addresses. Normally these are global, but if you consider
NATing, the answer might differ. Why are NATs necessary? What problems
do they introduce?

- Flat vs. hierarchical
A flat name allows for more flexibility and makes it easy to handle
reconfigurations in the system, but it makes the lookup process more
difficult, especially at a large scale.

- Persistent versus non-persistent
A persistent name is one that is invariant, even when the object that
it references moves or is replicated. To achieve this, a persistent
name must not be tied to any administrative domain or entity, which
implies that an object can change its domain without having to change
the name.  Examples?

- Semantic-freedom
A semantic-free name neither embeds information about the
organization, administrative domain, or network provider
it originated in or in which it is currently located, nor is
human-friendly.


Scalable name resolution
------------------------

Many simple proposals work well at a small scale, e.g.:

- HOSTS.TXT (see assigned reading)
- ARP, which broadcasts resolution requests

But need other techniques to implement scalable resolution. Let's look
at the example of DNS (more details in the paper).

Architecture: name servers (repositories of partial information)
and resolvers (front-ends that implement the client-side protocols).
Both functions can be co-located, e.g. an ISP may offer both as a
service implemented by the same server.

How is scalability achieved? Two important mechanisms: zones and caching.

Zones - partitioning of the namespace. Different zones correspond to
different subsets of the name tree, and are also managed by different
naming authorities and different name servers. The top-level domains
are managed by ICANN, which delegates the authority over sub-domains
to other entities.

For each domain there is an authoritative name server, which is
responsible for maintaining up-to-date information about the domain
but can delegate sub-domains to other name servers. The answers provided
by the authoritative name server have been configured by an original
source, e.g. a sysadmin. The authoritative name server is replicated
for fault tolerance.

Address resolution - the host contacts a DNS resolver or recursor (a
server running at the host's ISP that iteratively resolves the query at
the host's request).

Go through an example: the resolver contacts a root server (at a
well-known address) to ask for csail.mit.edu, and the root server answers
with the address of the edu nameserver.

The edu nameserver is contacted, and answers with the mit.edu nameserver.

The final answer is obtained from there.
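
The walk from the root down to the answer can be sketched as a loop that
follows referrals. This is a toy model, not the DNS wire protocol: the
server names, the SERVERS table, and the address 192.0.2.53 are all
hypothetical, and real servers are contacted over the network rather
than through a dictionary.

```python
# Hypothetical server data. Each server either knows the final answer for a
# name or refers the resolver to a server responsible for a longer suffix.
SERVERS = {
    "root-server": {"referrals": {"edu": "edu-server"}, "answers": {}},
    "edu-server": {"referrals": {"mit.edu": "mit-server"}, "answers": {}},
    "mit-server": {"referrals": {}, "answers": {"csail.mit.edu": "192.0.2.53"}},
}


def query(server, name):
    """Ask one server about `name`.

    Returns ("answer", address) or ("referral", next_server).
    """
    data = SERVERS[server]
    if name in data["answers"]:
        return "answer", data["answers"][name]
    # Try suffixes of the name from longest to shortest and use the first
    # one for which this server has delegated authority.
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in data["referrals"]:
            return "referral", data["referrals"][suffix]
    raise LookupError(f"{server} knows nothing about {name}")


def resolve_iteratively(name, start="root-server"):
    """Follow referrals from the root until some server returns an answer."""
    server, path = start, []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value  # a referral: ask the next server down the tree


address, path = resolve_iteratively("csail.mit.edu")
print(address, path)
```

The returned path makes the cost visible: without any caching, every
lookup starts with a query to the root.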

This works correctly and keeps the amount of information that each
server has to track small, but the query load on the servers near the
root does not scale.

Solution: Caching.

Each record has an associated TTL value (could be days or even weeks).

Drawback: changes are not propagated immediately, need to wait for
all cached entries to expire. This is a form of weak consistency, as seen
in the first few lectures of the course.

Each DNS resolver has an associated cache, which avoids most contacts
with the servers for the higher levels of the domain hierarchy.
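
A minimal sketch of a resolver-side cache with TTL-based expiry. The
class name TTLCache and its interface are made up for illustration; the
clock is injectable only so that expiry can be exercised without
waiting.

```python
import time


class TTLCache:
    """Resolver-side cache whose entries expire after their TTL (in seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        """Store a record that stays valid for `ttl` seconds from now."""
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return value


cache = TTLCache()
cache.put("edu", "edu-server", ttl=86400)  # hypothetical one-day TTL
print(cache.get("edu"))
```

Until an entry expires, get() keeps returning the old value even if the
authoritative server has changed the record, which is exactly the weak
consistency described above.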

Issues with DNS: due to the commercial value of DNS names "profit has replaced
pragmatism as the dominant force shaping DNS."

In some settings, relying on a centralized entity (ICANN in this case) is
not an option, e.g. peer-to-peer systems. How to achieve scalable
name resolution in that case?

Next week: Scalable lookups in a flat namespace.