h1. Queue system administration

h2. To do list

The following is a list of items requiring attention, in order of priority:

# Reinstall OS on Pegasus.
# Measure usage on each machine using PBS/Torque accounting tools.
# Clean up our junk in NE47-181 and E40-008: throw away boxes, remove old Cyrus2 cluster.
# Find a better way to manage users' data after they leave the group.

h2. Backup services

Backup service for our four clusters is provided by [MIT TSM|http://ist.mit.edu/services/backup/tsm].  This comes at a cost of $65 per month per system.
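
As a rough budgeting aid, the arithmetic behind that rate can be sketched as follows. It assumes the flat $65 rate applies to all four clusters for a full twelve months; the variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: backup cost for the group, assuming a flat per-system rate
# billed for all four clusters, twelve months a year (an assumption).
clusters=4
cost_per_system=65   # dollars per month per system, from the text above

monthly=$((clusters * cost_per_system))
yearly=$((monthly * 12))

echo "monthly: \$$monthly, yearly: \$$yearly"
```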


h3. Restore procedures

There are two cases for restoring backed-up data:

# The cluster is accessible and data was accidentally deleted. The lost data is to be restored to its original location.
# The cluster is inaccessible and the data is to be restored to a new location, i.e., a drive connected to your computer.


TSM works best for the first case. The TSM software for Linux is already installed on our clusters.

For case 1, follow these steps:

First, establish a connection to the TSM server.
{code}
$ sudo dsmc
{code}
Next, check which versions of the file are stored on the TSM server. This command is typed at the tsm> prompt that dsmc opens:
{code}
tsm> query backup /path/filename
{code}
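
The same query can also be issued straight from the shell, without entering the interactive prompt. The sketch below wraps that in a small helper; query_versions is a hypothetical name, and the dry-run default is an assumption made so that the example can be tried on a machine without TSM installed.

```shell
#!/bin/sh
# Sketch: list the backed-up versions of a file non-interactively.
# query_versions is a hypothetical helper, not part of TSM.
# With DRY_RUN=1 (the default here) the command is printed, not run.
DRY_RUN="${DRY_RUN:-1}"

query_versions() {
    target="$1"
    # -inactive asks dsmc to list older (inactive) versions as well
    # as the current active one.
    set -- sudo dsmc query backup -inactive "$target"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

query_versions /path/filename
```

Set DRY_RUN=0 on a cluster node to actually run the query.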

You can restore the file to its original location, or restore it to a new location. To restore folders together with their subdirectories, use the option -subdir=yes.

{code}
tsm> restore /path/filename
OR
tsm> restore "/path/filename" /newpath/newfilename
{code}
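
The choice between the two restore forms, and when to add the subdirectory option, can be sketched as a small command builder. build_restore_cmd is a hypothetical helper, and treating a trailing slash as "this is a directory" is an assumption of this sketch. It only prints the command so that it can be reviewed before anything is restored.

```shell
#!/bin/sh
# Sketch: assemble a dsmc restore command line for review.
# build_restore_cmd is a hypothetical helper, not part of TSM.
build_restore_cmd() {
    src="$1"
    dest="${2:-}"          # optional new location
    cmd="dsmc restore"
    case "$src" in
        # Assumption: a trailing slash marks a directory, so the
        # subdirectory option is added.
        */) cmd="$cmd -subdir=yes" ;;
    esac
    cmd="$cmd \"$src\""
    if [ -n "$dest" ]; then
        cmd="$cmd \"$dest\""
    fi
    echo "$cmd"
}

build_restore_cmd /path/filename
build_restore_cmd /path/dir/ /newpath/dir/
```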

For case 2, the procedure is a little more involved, and there are a few important things to take into account. The exact procedure is given below:

1. The backup can only be restored using a Linux machine. This is because the downloading machine has to masquerade as the original cluster in order to retrieve the data, and all our clusters run some form of Linux.

2. The filesystem of the disk to which you are writing the retrieved data should be the same as the filesystem used on the cluster. For example, if the files on the cluster are stored on a drive formatted as xfs, the disk on the third (restoring) machine must also be formatted as xfs.

3. TSM software for Linux can be downloaded from the IS&T website. The installation procedure for a new Ubuntu version is described [here|http://kb.mit.edu/confluence/display/istcontrib/TSM+for+32+or+64+bit+debathena+or+Ubuntu+-+Install%2C+Configure%2C+Set+Up+and+Confirm+the+Scheduler+for+TSM].

4. The older version of the software is built for RHEL 5 and is available as an rpm. Install the "ksh" and "alien" packages. ksh is needed because several of the scripts included with TSM use it. More important is "alien", which allows users to install RPM packages on Ubuntu and other Debian-based distributions.
{code}
$ sudo apt-get install ksh alien
{code}

5. The next step is to use alien to install the appropriate RPMs:
{code}
$ sudo alien -i --scripts TIVsm-API.i386.rpm TIVsm-BA.i386.rpm
{code}

6. TSM requires several other libraries, such as libstdc++.so.5. Download and install the required files using apt-get or from some other source.

7. Change the Nodename, backup server, and errorlog file location in dsm.sys. This file is located in the /opt/tivoli/tsm/client/ba/bin/ folder. Settings for each cluster are given on this page.

8. Follow the instructions from case 1 above to restore the files to a new location. Look up the TSM documentation for related commands such as restart restore and cancel restore.
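
Before starting a case 2 restore, the prerequisites from the steps above (a Linux host, a matching filesystem on the target disk, and a dsm.sys file in place) can be checked with a short script. This is a minimal sketch: preflight and fs_type are hypothetical helper names, and the mount point in the usage note is a placeholder.

```shell
#!/bin/sh
# Sketch: pre-flight check for restoring to a new location (case 2).
# preflight and fs_type are hypothetical helpers, not part of TSM.
DSM_SYS="${DSM_SYS:-/opt/tivoli/tsm/client/ba/bin/dsm.sys}"

# Print the type of the filesystem that holds a path (e.g. xfs, ext4).
# -P keeps each entry on one line; -T adds the type column (Linux df).
fs_type() {
    df -PT "$1" | awk 'NR == 2 { print $2 }'
}

# preflight TARGET_DIR EXPECTED_FS
# Returns 0 and prints OK only if every check passes.
preflight() {
    target_dir="$1"
    expected_fs="$2"
    status=0
    if [ "$(uname -s)" != "Linux" ]; then
        echo "FAIL: the restoring host must run Linux"
        status=1
    fi
    actual_fs="$(fs_type "$target_dir")"
    if [ "$actual_fs" != "$expected_fs" ]; then
        echo "FAIL: $target_dir is $actual_fs, but the cluster uses $expected_fs"
        status=1
    fi
    if [ ! -r "$DSM_SYS" ]; then
        echo "FAIL: $DSM_SYS is missing or unreadable"
        status=1
    fi
    [ "$status" -eq 0 ] && echo "OK: ready to restore"
    return "$status"
}
```

For example, `preflight /mnt/restore xfs` checks a disk mounted at the placeholder path /mnt/restore against a cluster that uses xfs.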

h3. Installing TSM backup software

The TSM 5.4 software has been installed in accordance with the instructions on the TSM page. Older libraries, namely libstdc++.so.5, also need to be installed. On Darius1 this was done as follows: the compat-libstdc++ packages were downloaded from [here|http://rpm.pbone.net/index.php3/stat/4/idpl/13943754/dir/centos_5/com/compat-libstdc++-33-3.2.3-61.x86_64.rpm.html] and [here|http://rpm.pbone.net/index.php3/stat/4/idpl/13943753/dir/centos_5/com/compat-libstdc++-33-3.2.3-61.i386.rpm.html], and then installed using the "yum localinstall" command:
{code}
sudo yum localinstall compat-libstdc++-33-3.2.3-61.x86_64.rpm
sudo yum localinstall compat-libstdc++-33-3.2.3-61.i386.rpm
sudo yum localinstall TIVsm-API.i386.rpm
sudo yum localinstall TIVsm-BA.i386.rpm
{code}

The next steps are to edit the dsm.opt and dsm.sys files as described in the instructions. Those files include the default locations of the backup logs:
{code}
/opt/tivoli/tsm/client/ba/bin/dsmsched.log
/opt/tivoli/tsm/client/ba/bin/dsmerror.log
{code}
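
A quick way to see whether recent backups hit problems is to count error messages in the error log. The sketch below assumes that TSM client error messages carry a code of the form ANSnnnnE (for example ANS1017E); count_tsm_errors is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count error-level messages in a TSM client log.
# Assumption: error messages carry a code of the form ANSnnnnE.
ERROR_LOG="${ERROR_LOG:-/opt/tivoli/tsm/client/ba/bin/dsmerror.log}"

count_tsm_errors() {
    # grep -c prints the number of matching lines; "|| true" keeps a
    # zero count from being treated as a failure.
    grep -c 'ANS[0-9][0-9]*E' "$1" || true
}
```

For example, `count_tsm_errors "$ERROR_LOG"` prints the number of error lines in the default log.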

Finally, running the dsmc program as root lets the user enter the initial password. Next, a line can be added to /etc/inittab to start the dsmc scheduler automatically at boot. To start it right after installing, the root user can simply run the dsmc command with the "sched" argument:
{code}
# nohup /usr/bin/dsmc sched > /dev/null 2>&1 &
{code}
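
The /etc/inittab entry mentioned above is not spelled out on this page, so the following is only a sketch of one plausible form. inittab lines have the layout id:runlevels:action:process. The id "tsm", the runlevels 2345, and the action "once" are assumptions, and add_scheduler_entry is a hypothetical helper. The demonstration writes to a scratch file, not to the real /etc/inittab.

```shell
#!/bin/sh
# Sketch: add a scheduler entry to an inittab-style file exactly once.
# Assumptions: id "tsm", runlevels 2345, and action "once".
INITTAB_LINE='tsm:2345:once:/usr/bin/dsmc sched > /dev/null 2>&1'

add_scheduler_entry() {
    inittab="$1"
    # Skip the append if a dsmc scheduler entry is already there.
    if grep -q '/usr/bin/dsmc sched' "$inittab" 2>/dev/null; then
        echo "already present in $inittab"
    else
        printf '%s\n' "$INITTAB_LINE" >> "$inittab"
        echo "added to $inittab"
    fi
}

# Demonstration against a scratch copy; the second call changes nothing.
scratch="$(mktemp)"
add_scheduler_entry "$scratch"
add_scheduler_entry "$scratch"
rm -f "$scratch"
```

To apply it for real, run the function as root with /etc/inittab as the argument, after checking that the runlevels suit the machine.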

{color:#003366}{*}TSM registration information{*}{color}

The four clusters backed up with TSM have the following registration information. The TSM system automatically assigns an initial password (_newpass_), but according to the registration e-mail, this is automatically changed to a new, encrypted password and stored on the machine after the first connection to the TSM servers.

{color:#000000}{*}Darius2{*}{color}
Server: oc11-bk-ent-1.mit.edu
Nodename: DARIUS2.CSBI
Schedule: BUS-0700

{color:#000000}{*}Darius1{*}{color}
Server: backup-i.mit.edu
Nodename: DARIUS1.CSBI
Schedule: BUS-2400

{color:#000000}{*}Cyrus1{*}{color}
Server: backup-i.mit.edu
Nodename: CYRUS1.CSBI
Schedule: BUS-2400

{color:#000000}{*}Quantum2{*}{color}
Server: backup-i.mit.edu
Nodename: QUANTUM2
Schedule: BUS-2400
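
When setting up a restoring machine, these values have to be copied into dsm.sys. The sketch below keeps them in one lookup function and prints a matching stanza. tsm_settings and dsm_sys_stanza are hypothetical helpers, the stanza label mit_tsm is a made-up name, and a real dsm.sys may need further options from the IS&T instructions.

```shell
#!/bin/sh
# Sketch: look up a cluster's registration values and print a dsm.sys
# stanza for it. Helper names and the stanza label are hypothetical.

# Prints "server nodename schedule" for a cluster (lowercase name).
tsm_settings() {
    case "$1" in
        darius2)  echo "oc11-bk-ent-1.mit.edu DARIUS2.CSBI BUS-0700" ;;
        darius1)  echo "backup-i.mit.edu DARIUS1.CSBI BUS-2400" ;;
        cyrus1)   echo "backup-i.mit.edu CYRUS1.CSBI BUS-2400" ;;
        quantum2) echo "backup-i.mit.edu QUANTUM2 BUS-2400" ;;
        *) echo "unknown cluster: $1" >&2; return 1 ;;
    esac
}

# Prints the three settings that step 7 says to change in dsm.sys.
dsm_sys_stanza() {
    settings="$(tsm_settings "$1")" || return 1
    # Split "server nodename schedule" into $1 $2 $3.
    set -- $settings
    cat <<EOF
SErvername  mit_tsm
   TCPServeraddress  $1
   NODename          $2
   ERRORLOGName      /opt/tivoli/tsm/client/ba/bin/dsmerror.log
EOF
}

dsm_sys_stanza darius2
```

Review the output before merging it into /opt/tivoli/tsm/client/ba/bin/dsm.sys by hand.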