Introduction

The intended audience of this document is members of the MIT Community
wishing to know more about the technical implementation of Athena. Much
of this document is taken verbatim from /mit/ghudson/info/athena,
written by Greg Hudson, former Release Engineer for Athena.

This is not a complete list. New sections to be written include Zephyr, Printing, the Global MOTD system, and the Lert system.

Kerberos

Many Athena services use a security system called Kerberos. Kerberos can be thought of as a service for negotiating shared secrets between unfamiliar parties.

A central server called a KDC (Key Distribution Center) has a pre-shared secret with each user and with each service. The secrets shared with users are conventionally called "passwords"; the secrets shared with services are conventionally called "keytabs" (or "srvtabs", in older jargon). Together, users and services are called "principals".

When one principal requests to negotiate a shared key with another principal, the KDC makes up a random new key (called a "session key"), encrypts it once in each principal's key (along with a bunch of other information), and sends both pieces of ciphertext back to the first principal, which will in turn send the appropriate part to the second principal when it is ready to talk. Since both principals can get at the session key by decrypting their bit of ciphertext, they now have a
shared secret which they can use to communicate securely. Kerberos clients record these bits of information in "credential caches" (or "ticket files" in older jargon; neither term is particularly correct since the file is not strictly a cache and stores more than just tickets).
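
As a rough illustration, here is a toy Python sketch of the KDC's part
of this exchange. It is a sketch only: the XOR-keystream "cipher" below
is a stand-in for real Kerberos encryption types, the secrets are made
up, and real tickets also carry lifetimes, nonces, and principal names.

    # Toy model of a KDC handing out a session key. NOT real cryptography.
    import hashlib, os

    def toy_encrypt(key: bytes, data: bytes) -> bytes:
        # XOR the data with a keystream derived from the key (toy cipher).
        stream = b""
        counter = 0
        while len(stream) < len(data):
            stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    toy_decrypt = toy_encrypt  # XOR is its own inverse

    # Pre-shared secrets: the user's "password", the service's "keytab".
    user_key = hashlib.sha256(b"users password").digest()
    service_key = hashlib.sha256(b"services keytab").digest()

    # The KDC invents a fresh session key and encrypts one copy in each
    # principal's key; the service's copy is (part of) the "ticket".
    session_key = os.urandom(32)
    for_user = toy_encrypt(user_key, session_key)
    for_service = toy_encrypt(service_key, session_key)

    # Each principal recovers the same shared secret from its own copy.
    assert toy_decrypt(user_key, for_user) == session_key
    assert toy_decrypt(service_key, for_service) == session_key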

There are two versions of the Kerberos protocol in use on Athena, 4 and 5. The Kerberos 5 protocol supports more features and different types of cryptographic algorithms, but is also a great deal more complicated. Kerberos 4 is being actively phased out, and is expected to be completely retired by early 2010.

See http://web.mit.edu/kerberos/www for more complete and precise
information about Kerberos. Athena services which use Kerberos include
AFS, discuss, zephyr, olc, moira, and remote login and FTP (when both
parties support it).

AFS

Athena workstations use a filesystem called AFS. Running AFS allows
workstations to access files under the /afs hierarchy. Of particular
interest are the MIT parts of this hierarchy: /afs/athena.mit.edu,
/afs/dev.mit.edu, /afs/sipb.mit.edu, /afs/net.mit.edu,
/afs/ops.mit.edu, and /afs/zone.mit.edu.

Unlike NFS, AFS includes two layers of indirection which shield a
client from having to know what hostname a file resides on in order to
access it.

The first layer of indirection is "cells", such as athena.mit.edu.
Each workstation has a directory of cells in /usr/vice/etc/CellServDB,
which it can use to look up the database servers for a cell name. If a
cell's database servers change, each client's CellServDB has to be
updated, but the canonical paths to files in that cell do not change.
Athena workstations update their CellServDB files periodically (at boot
and reactivate time) from /afs/athena.mit.edu/service/CellServDB.
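
For illustration, here is a minimal Python sketch of reading this file.
It assumes the conventional CellServDB layout, in which a ">cellname"
line introduces each cell and each following line gives one database
server address (with the server's hostname in a trailing "#" comment):

    # Map each cell name to its list of database server addresses.
    def parse_cellservdb(path="/usr/vice/etc/CellServDB"):
        cells = {}
        current = None
        for line in open(path):
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # e.g. ">athena.mit.edu  #MIT/Athena cell"
                current = line[1:].split("#")[0].strip()
                cells[current] = []
            elif current is not None:
                addr = line.split("#")[0].strip()
                if addr:
                    cells[current].append(addr)
        return cells

    # e.g. parse_cellservdb()["athena.mit.edu"] -> the DB servers the
    # client will ask about volumes in the athena.mit.edu cell.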

The second layer of indirection is the volume location database, or
VLDB. Each AFS cell's contents are divided into named volumes of files
which are stored together; volumes refer to other volumes using
mountpoints within their directory structure. When a client wishes to
access a file in a volume, it uses the VLDB servers to find out which
file server the volume lives on. Volumes can move around from one file
server to another and clients will track them without the user noticing
anything other than a slight slowdown.
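
One way to watch this machinery from a client is the OpenAFS "vos"
utility, which queries the VLDB directly. The sketch below just shells
out to "vos examine" (assuming OpenAFS is installed); the volume name
is hypothetical:

    import subprocess

    def volume_location(volume, cell="athena.mit.edu"):
        # "vos examine" reports, among other things, which file server
        # and partition hold each site of the volume.
        result = subprocess.run(
            ["vos", "examine", volume, "-cell", cell],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    print(volume_location("user.jruser"))  # hypothetical volume name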

AFS has several advantages over traditional filesystems:

  • Volumes can be moved around between servers without causing an
    outage.
  • Volumes can be replicated so that they are accessible from several
    servers. (Only read-only copies of a volume can be replicated;
    read/write replication is a difficult problem.)
  • It is more secure than traditional NFS. (Secure variants of NFS
    are not widely implemented outside of Solaris.)
  • AFS clients cache data, reducing load on the servers and improving
    access speed in some cases.
  • Permissions can be managed in a (not strictly) more flexible
    manner than in other filesystems.

AFS has several unusual properties which sometimes cause software to
behave poorly in relation to it:

  • AFS uses a totally different permissions system from most other
    Unix filesystems; instead of assigning meanings to a file's status
    bits for the group owner and the world, AFS stores an access
    control list in each directory and applies that list to all files
    in the directory. As a result, programs that copy files and
    directories will usually not automatically copy the permissions
    along with them, and programs that use file status bits to
    determine in advance whether they have permission to perform an
    operation will often get the wrong answer.
  • It is not possible to make a hard link between files in two
    different AFS directories even if they are in the same volume, so
    programs which try to do so will fail.
  • It is possible to lose permissions on an AFS file because of
    changing ACLs or expired or destroyed tokens. This is not possible
    for a local filesystem and some programs don't behave gracefully
    when it happens in AFS.
  • It is possible for close() to fail in AFS for a file which was
    open for writing, either because of reaching quota or because of
    lost permissions. This is also not possible for a local
    filesystem. (See the sketch after this list.)
  • AFS is a lot slower than local filesystem access, so software
    which performs acceptably on local disk may not perform acceptably
    when run out of AFS. Some software may even perform unacceptably
    simply because a user's home directory is in AFS, even though the
    software itself comes from local disk.
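
To make the close() pitfall concrete, here is a minimal Python sketch
(the AFS path is hypothetical). The point is that a program must treat
close() as an operation that can fail and report the error, rather
than assuming that only write() can:

    import os

    path = "/afs/athena.mit.edu/user/j/r/jruser/notes"  # hypothetical
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, b"important data\n")
    finally:
        try:
            os.close(fd)  # on AFS: may fail over quota or lost tokens
        except OSError as err:
            print("close() failed; data may not be stored:", err)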

AFS uses Kerberos 5 to authenticate. Since it is not reasonable for
AFS kernel code to read Kerberos credential caches directly,
AFS-specific credentials are stored into the kernel as "tokens". The
kernel looks up tokens using a "process authentication group" or PAG,
which is stored in the user's group list. If there is no PAG in the
user's group list, the kernel falls back to looking up tokens by uid,
which would mean that two separate logins would use the same tokens and
that a user who does an "su" no longer uses the same tokens. Athena
workstations do their best to ensure that each login gets a fresh PAG.

See http://www.openafs.org/ for more information about AFS.
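
As an illustrative sketch, a process can look for its own PAG in its
group list. This assumes the common OpenAFS convention of encoding the
PAG as a single gid in the 0x41000000 range; older implementations used
a pair of group IDs instead:

    import os

    def current_pag():
        # PAG gids conventionally fall in 0x41000000..0x41ffffff.
        for gid in os.getgroups():
            if 0x41000000 <= gid <= 0x41FFFFFF:
                return gid
        return None  # no PAG: the kernel would fall back to uid lookup

    print("PAG:", current_pag())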

Hesiod

Hesiod is a simple string lookup service built on top of the Domain
Name System. Conceptually, the service translates a pair of strings
(the "name" and "type") into a set of result strings. This lookup is
done very simply; a DNS lookup is done for name.type.ns.athena.mit.edu
and the strings in the resulting TXT records are returned. Athena uses
Hesiod to store user account information (see section 6), locker
information (see section 4), post office box information (see section
11), workstation cluster information (see section 7), and printer
information.
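
A minimal sketch of such a lookup, using the third-party dnspython
package; the username is hypothetical, and a real Hesiod client would
take the domain suffix from its configuration rather than hard-coding
it:

    import dns.resolver  # pip install dnspython

    def hesiod_lookup(name, hes_type, suffix="ns.athena.mit.edu"):
        answer = dns.resolver.resolve(f"{name}.{hes_type}.{suffix}", "TXT")
        # A TXT record may be split into several character-strings; rejoin.
        return [b"".join(rdata.strings).decode() for rdata in answer]

    print(hesiod_lookup("jruser", "filsys"))  # hypothetical username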

Lockers

Athena imposes a filesystem-independent layer of indirection on file
storage called "lockers". Because most Athena lockers currently live in
AFS, lockers may seem a little inconvenient and pointless, but the
concept may come in handy if Athena ever moves to a different
filesystem.

Operationally, a locker is represented by a Hesiod entry with type
"filsys". The value of the filsys record is a string which usually
looks like "AFS <pathname> <mode> <mountpoint> <pref>", where AFS is
the filesystem type, <pathname> is the AFS path of the locker, <mode>
determines whether tokens are desirable or required for the locker,
<mountpoint> determines where the locker should appear on the local
workstation, and <pref> is used to order filsys entries when there is
more than one. If the filesystem type is something other than AFS,
different fields may be present.
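
As a small sketch, parsing an AFS-type filsys record in the format just
described is a matter of splitting on whitespace (the sample record is
hypothetical):

    def parse_afs_filsys(record):
        fstype, pathname, mode, mountpoint, pref = record.split()
        if fstype != "AFS":
            raise ValueError("non-AFS records carry different fields")
        return {"pathname": pathname, "mode": mode,
                "mountpoint": mountpoint, "pref": int(pref)}

    print(parse_afs_filsys(
        "AFS /afs/athena.mit.edu/user/j/r/jruser w /mit/jruser 1"))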

Users can make lockers visible on an Athena workstation using the
setuid "attach" program. The "add" alias from the standard Athena
dotfiles attaches a locker and places the appropriate binary and manual
directories in the user's PATH and MANPATH. A loose convention,
documented in the lockers(7) man page, governs how software lockers
should be organized. Not all lockers are for software; in particular,
user home directories are also lockers, and generally do not contain
any software.

Mail infrastructure

To send mail, Athena machines use a mostly unmodified version of
sendmail. Outgoing mail is sent through the MIT mailhubs, although it
may be queued temporarily on local workstations if the MIT mailhubs
won't accept it immediately.

When mail is received by an MIT mailhub for username@mit.edu, it is
normally delivered to a storage area on a PO server. PO servers can
speak either IMAP (see RFC 2060) or a modified version of the POP
protocol (see RFC 1725) which uses Kerberos 4 instead of passwords to
authenticate.

The supported Athena mail client is a modified version of nmh which
uses KPOP to retrieve mail and store it in files in the user's home
directory. Many users use alternative mail programs; most use KPOP
and store into the user's homedir in some format. Some users use
netscape, which speaks IMAP using SSL and which generally leaves mail
on the PO server so that it can be accessed from non-Athena machines.
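
For illustration, the IMAP-over-SSL access pattern looks like this with
the Python standard library; the PO server host name and credentials
are hypothetical, since actual post office assignments come from
Hesiod:

    import imaplib

    conn = imaplib.IMAP4_SSL("po-server.mit.edu")  # hypothetical host
    conn.login("jruser", "password")               # hypothetical creds
    conn.select("INBOX", readonly=True)
    status, data = conn.search(None, "ALL")
    print("messages left on the PO server:", len(data[0].split()))
    conn.logout()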

Moira

Moira is a database and primary information repository for:

  • Workstation cluster information
  • Locker filsys entries, quotas, and server locations
  • Lists, which encompass mailing lists and filesystem access
    groups
  • Host and network configuration
  • Kerberized NFS server configuration
  • Printer configurations
  • User information
  • Zephyr ACLs
  • "Generic ACLs" which can be used by any service which can
    be made to understand the ACL file format

and probably a few other things.

Production systems never (at least, ideally) retrieve information from
moira as part of regular operation; instead, a periodic process called
a DCM (Data Control Manager) pushes out new versions of information
from the Moira database to the affected servers. For instance, the
Hesiod DNS servers are periodically updated with a new zone file
containing new cluster, filsys, printer, and user information.
Currently, the DCM runs several times a day. A few kinds of changes
to the Moira database are propagated immediately to the affected
servers via incremental update; an example is changes to AFS groups
resulting from changes to Moira list membership.

The Moira server is implemented as an Oracle database with a
surrounding layer of C code. The Moira clients for Unix live in the
moira locker (the Athena release contains scripts which attach the
moira locker and run the actual programs), and use Kerberos 4 to
authenticate to the Moira server.

Larvnet

Larvnet is the cluster monitoring system which gathers the data
returned by the "cview" command--a list of free machines of each type
in the Athena clusters, and a list of the number of current jobs
pending on Athena printers.

When a user logs in or out of an Athena machine, or when an Athena
machine starts up the login system, the machine sends a status packet
to the Larvnet server. The status packet gives the machine's name,
host type, and a determination of whether any user is logged into the
machine at the console. Workstations can also be queried for the same
status information using the "busyd" UDP service, which runs out of
inetd. The Larvnet server separates machine names into clusters
according to a configuration file and produces a data file once per
minute containing counts of the free machines in each cluster of each
type.
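
As a sketch, querying busyd might look like the following. The request
and reply formats are not documented here, so this just sends an empty
datagram and prints the raw response; it also assumes busyd is
registered in the workstation's /etc/services, and the host name is
hypothetical:

    import socket

    def query_busyd(host, timeout=2.0):
        # Look the port up rather than hard-coding it; raises OSError
        # if "busyd" is not listed in /etc/services.
        port = socket.getservbyname("busyd", "udp")
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(timeout)
        s.sendto(b"", (host, port))
        reply, _ = s.recvfrom(2048)
        s.close()
        return reply

    print(query_busyd("m11-116-1.mit.edu"))  # hypothetical workstation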

The Larvnet server also queries the print spooler for each Athena
printer once per minute, using an "lpq" query. (Sadly, the output
returned by "lpq" is not standardized well enough to be robustly
machine-readable, so the mechanism here sometimes requires maintenance
when changes are made to the printing system.)

Athinfo

Athinfo is a TCP service which runs out of inetd on Athena machines.
It allows anyone to remotely run one of a specified set of named
commands and view the output. "athinfo machinename queries" will
generally give a list of the commands which can be run.
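
A minimal athinfo client is just a TCP connection, one line naming the
query, and a read to end-of-file. In this sketch the host name is
hypothetical, and the port number is the conventional athinfo port
(49155), stated here as an assumption:

    import socket

    def athinfo(host, query="queries", port=49155):
        with socket.create_connection((host, port), timeout=5) as s:
            s.sendall(query.encode() + b"\n")
            chunks = []
            while True:
                data = s.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode(errors="replace")

    print(athinfo("some-athena-host.mit.edu"))  # hypothetical host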

Software License Wrapper

Some of the commercial third-party software used at MIT is
"license-wrapped". This means the binary which lives in the locker
has been corrupted by DES-encrypting a chunk of the binary in some
key. A front-end script invokes a program which contacts the slw
server to retrieve the key which can decrypt the chunk of the binary
so that it can be run. The server will refuse the request for the key
if the client machine does not appear to be on an MIT network.

The license software currently lives in the slw locker, although it
may move into the Athena release.