In December 1996 and January 1997, 40 GB of tar archives were transferred from LBNL MSS to seedis.census.gov, a VAX at the Census Bureau. Some files were corrupted, and it was discovered that the VMS operating system could not support 9 GB disk drives. After the operating system was upgraded, the corrupted files were deleted. A re-inventory of the tar archives was performed in February 1997.
tapes in the LBNL mass storage system (MSS)
tar archives in MSS
truncated tar archives
597 tar archives were copied to seedis.census.gov, out of 861 available on MSS. The tar archives NOT copied include
For details, see checks for completeness.
Of the 597 tar archives that were copied, sixteen have internal EOFs that prevent tar from reading them completely. The original tar files at LBL have the same problem, and it occurs with all versions of tar (UNIX and VMS). The defective tar files are the following:
                               files  files
dka  tape   fmt  dup of  slot  found  total  contents
---  -----  ---  ------  ----  -----  -----  --------------------
200  05948  gss  -             2433   3873   mort6878
300  17231  gss  -             370    423    cens5th
300  40131  bck  50059   h47   180    ?      censagr001 - 2of3
300  40151  bck  50046   h08   2122   ?      seedis004 - 1of2
300  40188  bck  50101   -     164    ?      cdc002 - 1of2
300  50072  bck  40084   j15   78     ?      cens80005 - 1of3
500  40128  bck  -       -     185    ?      [census80.rept2b]
500  40153  bck  50076   j19   2222   ?      parap010 - 2of3
500  40183  bck  50096   j34   1789   ?      sy$ush1 - 2of4
500  40185  bck  50098   -     4933   ?      sy$ush1 - 4of4
500  40186  bck  50099   -     1071   ?      sy$ush2 - 11 dirs
500  50206  bck  -       -     10     ?      [cache.temp.seedata]
600  04428  gss  -       -     524    636    census80/stf4bpa
600  04467  gss  -       -     28     34     census80/stf4bpa
600  10673  gss  -       -     172    300    census80/stf2aa
600  10905  gss  -       -     67     74     census80/stf4bpa
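Assuming the "files found" column counts the files tar could read before the internal EOF and "files total" counts the files expected, the share recovered from the six gss archives (the only rows with a known total) can be worked out as in this minimal Python sketch. The function and variable names are illustrative only; the numbers are copied from the table above.

```python
# (tape, files found, files total) for the gss rows of the table above.
# The bck rows are omitted because their totals are unknown ("?").
GSS_ROWS = [
    ("05948", 2433, 3873),
    ("17231", 370, 423),
    ("04428", 524, 636),
    ("04467", 28, 34),
    ("10673", 172, 300),
    ("10905", 67, 74),
]


def percent_recovered(found, total):
    """Percentage of expected files that were found, to one decimal place."""
    return round(100.0 * found / total, 1)


def summarize(rows):
    """Return per-tape percentages plus the summed found and total counts."""
    per_tape = {tape: percent_recovered(f, t) for tape, f, t in rows}
    found_sum = sum(f for _, f, _ in rows)
    total_sum = sum(t for _, _, t in rows)
    return per_tape, found_sum, total_sum


if __name__ == "__main__":
    per_tape, found_sum, total_sum = summarize(GSS_ROWS)
    for tape, pct in per_tape.items():
        print(f"{tape}: {pct}% of files found")
    print(f"all gss tapes: {found_sum} of {total_sum} files")
```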
procedure for processing tapes
errors encountered
files used by these routines
inventory of tar archives
1. Edit tarinfo.txt to select high-priority tapes. The tapes to be selected have a 1 in the first column. If one copy is defective, choose the other. If both copies are OK, try to minimize the number of unique MSS tapes (AAnnnn) that will be needed.
rlogin (cedr or parep2) as merrill
cd $MDOCS/census/tar2seedis/
(make scratch copy of tarinfo.txt)
cp tarinfo.txt tarinfo.tmp
vi tarinfo.tmp
(in another window examine tlist.out to see tape names, to find tapes that are duplicates of each other)
rcs.csh
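The selection rule in step 1 is applied by hand in an editor, but it can be sketched in Python to make the logic explicit. This is only an illustration: the `Copy` record, the function `choose_copies`, and the tape numbers in the usage example are hypothetical, and the layout of tarinfo.txt is not assumed. Where several copies are readable, the sketch uses a simple greedy preference for a copy whose MSS tape is already needed, which reduces the number of MSS tapes but is not guaranteed to give the minimum.

```python
from typing import NamedTuple


class Copy(NamedTuple):
    tape: str        # original tape number (hypothetical values below)
    mss_tape: str    # physical MSS tape, in the AAnnnn form
    defective: bool  # True if this copy cannot be read completely


def choose_copies(groups):
    """Pick one readable copy from each group of duplicate tapes.

    Pass 1: if exactly one copy is readable, take it (forced choice).
    Pass 2: if several are readable, prefer one whose MSS tape is
            already needed; otherwise take the first readable copy.
    Groups with no readable copy are skipped.
    """
    chosen = {}
    needed = set()
    undecided = []
    for i, group in enumerate(groups):
        good = [c for c in group if not c.defective]
        if len(good) == 1:
            chosen[i] = good[0]
            needed.add(good[0].mss_tape)
        elif good:
            undecided.append((i, good))
    for i, good in undecided:
        pick = next((c for c in good if c.mss_tape in needed), good[0])
        chosen[i] = pick
        needed.add(pick.mss_tape)
    return [chosen[i] for i in sorted(chosen)], needed


if __name__ == "__main__":
    groups = [
        [Copy("11111", "AA0001", True), Copy("22222", "AA0002", False)],
        [Copy("33333", "AA0003", False), Copy("44444", "AA0002", False)],
    ]
    picked, needed = choose_copies(groups)
    print([c.tape for c in picked], sorted(needed))
```

In the usage example the first group has only one readable copy, so its MSS tape becomes required; the second group then reuses that same MSS tape rather than adding a new one.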
2. To get the necessary tapes into the robot, type
prestage.pl
which creates files prestage.csh and premssls.csh. Then type
prestage.csh
to stage the tapes, and
premssls.csh
to see the results. (1 means the file has been staged, and therefore the tape is in the robot).
This routine stages only one small file (gssa.log or bcka.log) from each MSS tape. It does NOT stage the tar files to disk.
3. Send e-mail to hhholmes@lbl requesting that any tapes not in the robot be put there.
4. Check on any jobs that are in progress. On LBL machines where jobs have been submitted (e.g. parep2 and cedr), type
ps.csh
Note the computer name, process IDs, tape numbers and starting times of the runs. Kill any jobs that have been running longer than a few hours.
kill -9 pid
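The contents of ps.csh are not shown here. As a rough sketch of the check in step 4, the Python function below scans process-listing text for transfer jobs that have run too long. It assumes three-column input of the kind produced by `ps -eo pid,etimes,args` on systems whose ps supports the `etimes` (elapsed seconds) field, which may not match the machines named above. The three-hour default stands in for "a few hours", and the name `long_running` and the sample listing are hypothetical.

```python
def long_running(ps_output, max_seconds=3 * 3600, needle="tar2seedis"):
    """Return (pid, elapsed_seconds) for matching jobs older than max_seconds.

    ps_output is text with lines of the form "PID ELAPSED COMMAND",
    where ELAPSED is the running time in seconds.
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # skip the header line and anything malformed
        pid, elapsed, command = int(fields[0]), int(fields[1]), fields[2]
        if needle in command and elapsed > max_seconds:
            hits.append((pid, elapsed))
    return hits


if __name__ == "__main__":
    # Hypothetical listing, for illustration only.
    sample = """\
    PID ELAPSED COMMAND
   1201   14400 /bin/csh tar2seedis.csh 11111
   1350     600 /bin/csh tar2seedis.csh 22222
   1402   90000 vi tarinfo.tmp
"""
    for pid, elapsed in long_running(sample):
        print(f"candidate for kill: pid {pid}, running {elapsed // 3600} h")
```

The function only reports candidates; deciding whether to kill a job is left to the operator, as in the step above.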
5. Add all status=1 tapes in tarinfo.txt to queue.txt with status=1:
makequeue.pl queue.txt
Edit queue.txt to remove any jobs already in progress, and any tapes already in queue.txt with status=1.
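Step 5 combines an automatic merge with a manual clean-up. The sketch below shows one way both could be done in a single pass. It is not makequeue.pl: the record layout (status code first, tape number second, separated by whitespace) and the function name `make_queue` are assumptions made for the example.

```python
def make_queue(tarinfo_lines, queue_lines, in_progress=()):
    """Return queue_lines plus a "1 <tape>" entry for each new priority tape.

    Assumed (hypothetical) record layout: status code, then tape number.
    Tapes already in the queue or listed in in_progress are not added.
    """
    already_queued = set()
    for line in queue_lines:
        fields = line.split()
        if len(fields) >= 2:
            already_queued.add(fields[1])
    skip = already_queued | set(in_progress)

    result = list(queue_lines)
    for line in tarinfo_lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] != "1":
            continue  # only status=1 (priority) tapes are queued
        tape = fields[1]
        if tape not in skip:
            result.append(f"1 {tape}")
            skip.add(tape)  # guard against duplicates within tarinfo
    return result


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    tarinfo = ["1 11111 a.tar", "0 22222 b.tar", "1 33333 c.tar"]
    print(make_queue(tarinfo, ["1 33333"]))
```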
6. To check the completion status of one tape, type (for example)
checkftp.pl 56789
7. If the job has finished successfully, you will be asked whether you wish to execute the command: (for example)
update.pl 56789 0 1 7
If you answer "y" the status code for tape 56789 in tarinfo.txt will be changed from 1 (priority) to 7 (FTP completed), and the files 56789.log and ftp_56789.log will be deleted.
Then you will be shown a list of the remaining files ftp_nnnnn.log and asked to type one of the following:
5-digit tape number    check another tape
n                      exit, do nothing
8. Edit tar2seedis.pl to specify which seedis disk drive (dka200-dka700) is to receive (a) GSS tar archives and (b) BCK tar archives.
9. Several tar2seedis.csh jobs can be executing in parallel. As each one finishes, it will resubmit itself, using the tape number of the next status=1 tape in queue.txt. To start the first job, type
onetape.pl
(wait for job to get started)
control-Z
bg
ps
onetape.pl stages files for the next 7 jobs in queue.txt. It then flags the first entry in queue.txt as "in progress" and proceeds to execute tar2seedis.csh for the first of those jobs.
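The queue handling described in step 9 can be illustrated with a small in-memory Python sketch. It is not the logic of onetape.pl or tar2seedis.csh: the queue is modelled as a list of (status, tape) pairs, the "P" marker for "in progress" is a hypothetical placeholder because the real code is not given, and `transfer` stands in for one run of the transfer script.

```python
IN_PROGRESS = "P"  # hypothetical marker; the real status code is not given


def next_to_stage(queue, count=7):
    """Tape numbers of the next `count` status=1 entries (the ones to stage)."""
    return [tape for status, tape in queue if status == "1"][:count]


def claim_next(queue):
    """Flag the first status=1 entry as in progress and return its tape number.

    Returns None when no status=1 entry is left, which is the point at
    which a self-resubmitting job would stop.
    """
    for i, (status, tape) in enumerate(queue):
        if status == "1":
            queue[i] = (IN_PROGRESS, tape)
            return tape
    return None


def run_worker(queue, transfer):
    """Process tapes one after another until no status=1 entry remains."""
    done = []
    while True:
        tape = claim_next(queue)
        if tape is None:
            return done
        transfer(tape)  # stands in for one run of the transfer script
        done.append(tape)


if __name__ == "__main__":
    # Hypothetical queue contents, for illustration only.
    queue = [("P", "11111"), ("1", "22222"), ("1", "33333")]
    print(next_to_stage(queue))
    print(run_worker(queue, lambda tape: print("transferring", tape)))
```

Because this sketch keeps the queue in memory, it sidesteps a problem that real parallel jobs sharing one queue file would have to handle: two jobs reading the file at the same moment could claim the same tape unless access is serialized.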
$MDOCS/census/tar2seedis/tar2seedis.csh
(setenv MDOCS /CEDRCD/data1/merrill/docs)
script to FTP one tar archive to seedis.census.gov
to execute:
rlogin (cedr or parep2) -l merrill
cd $MDOCS/census/tar2seedis/
tar2seedis.csh tapenumber
tar2seedis.csh uses:
tar2seedis.pl
tarinfo.txt
msstape.txt
active tapes as of June 1985 (1703 records)
tarinfo1.txt (sorted by original tape number)
tarinfo2.txt (sorted by MSS tape number)
913 files, 39 GB, in 913 directories
This list includes tape number, owner, size, date, tarfilename, and MSS tape number.
$MDOCS/census/tar2seedis/tarinfo.txt
list of MSS tar archives (913 records)
This file is an edited version of $PDOCS/gss/mss/tarinfo.txt
(setenv PDOCS /CEDRCD/data1/merrill/docs/parep)
$MDOCS/census/tar2seedis/msstape.txt
physical MSS tape numbers (913 records)
obtained from $MDOCS/census/tar2seedis/msstape.mgs
SEEDIS volumes (including duplicate copies of volumes)
census/tar2seedis/tar2seedis.html 9/18/97 in:
http://parep2.lbl.gov/mdocs
http://merrill.wwh.net/mdocs
http://imap.chesapeake.net/~merrill/mdocs
merrill@crocker.com