tar file installation instructions
From: Deane Merrill <merrill@cedr.lbl.gov>
Subject: how to untar the tar files
To: chris.stuber@census.gov (chris stuber)
Date: Tue, 11 Mar 1997 12:32:55 -0800 (PST)
Cc: dwmerrill@lbl.gov (deane merrill)
Chris,
To respond to your phone query for help with untarring SEEDIS files:
Here, in parep2.lbl.gov/~merrill/bin, is a chain of sample programs which are used to untar files now at LBL.
This example is for 1980 STF2A and STF2B7R at PLTRACT80 or TRACT80 level, whose tarfiles happen to be stored in GSS format. Other tarfiles are stored in BCK (VMS BACKUP) format and are handled a little differently. The BCK tarfiles are much easier to handle - you just need to make sure that the VMS file attributes are correct (they are not stored in the tarfiles). You will find examples of BCK programs in the same directory as these GSS programs.
Intentionally, to facilitate debugging and stepwise development, the command files are extremely modular.
To get something running in seedis.census.gov:
I would work from the bottom up, with a modified version of gss.com. With some tinkering, it should be able to create *.DAT and *.NDX files from a tarfile that is in GSS format, and put them into the SEEDIS disk cache in the location where SEEDIS expects to find them.
Don't modify any of the software currently used by SEEDIS, or you may break what works now.
Then, generalize and build a UNIX interface, moving up one step at a time. The second step is to create (in VMS) a modified seedistar_stf2x.com which will invoke gss.com with the desired arguments.
The third step is to create (in UNIX) a modified vmsjob_stf2a.pl which will invoke seedistar_stf2x.com. And so on. The highest-level programs (the ones first in this chain) are LBL-specific and you won't need them.
Once you know what works, you may prefer to start over, with a top-down approach. I prefer top-down myself.
In cedr.lbl.gov:~merrill/bin/
- temp.sh
- a sample sh script which invokes stf2x.sh four times to install 1980 Census STF2 files at SEEDIS level PLTRACT80 (place/tract pieces) for four California counties. The "b" means that both file A and file B (SEEDIS databases STF2A and STF2B7R) will be installed.
- stf2x.sh
- a sh script which installs the files for one county. It invokes the perl script random_county_stf2x.pl.
- random_county_stf2x.pl
- with all three arguments specified, installs the files for one county. It invokes the csh script mss_stage_stf2x.csh to "stage" the tar files, i.e. get the tar files from LBL MSS and put them on a UNIX cache disk at LBL.
- mss_stage_stf2x.csh
- a csh script which stages the tar files to the UNIX cache disk. It translates the level and database name into the appropriate MSS pathname, then invokes /vol/local/bin/mssls to check whether the staging of tar files is complete. If not, it resubmits itself at a later time (the time interval is successively doubled, up to a 4 hour maximum). If the staging is complete, it executes vmsjob_stf2x.csh to initiate the extraction from the tar files.
- vmsjob_stf2x.csh
- a csh script which initiates a VMS job on seedis.lbl.gov. If both file A and file B have been requested, this script invokes both of the perl scripts vmsjob_stf2a.pl and vmsjob_stf2b7r.pl.
- vmsjob_stf2a.pl (or vmsjob_stf2b7r.pl)
- a perl script which creates a complete rsh command, to cause files to be put into the VMS cache disk. One invocation transfers the STF2A or STF2B7R files, for one county, at level PLTRACT80 or TRACT80, to the VMS cache disk. An example of the rsh command is listed near the beginning of this perl script. It invokes, with appropriate arguments, the VMS command file disk$seed0:[cache]seedistar_stf2x.com on seedis.lbl.gov.
- seedistar_stf2x.com
- a VMS command file which is submitted, with appropriate arguments, by the UNIX rsh command described above. At LBL it expects to find the tarfile in mss104:[seedis.'OWNER'.gss.'TAPENO']'TARFILE'.
- In seedis.census.gov it will need to find the tarfile in 'DISK':[seedis.mss.'TAPENO']'TARFILE', where the disk drive 'DISK' as a function of tape number is specified by http://parep2.lbl.gov/mdocs/census/tar2seedis/inventory/dirloc.txt.
- This command file first invokes vmstar, then invokes sy$cachetemp:[cache]gss.com.
- gss.com
- a VMS command file which writes to the SEEDIS cache disk selected files from a tarfile in GSS format.
- First it submits a background job sy$seedis:[seedis.cache]cachefree.com to check available space in the cache, and clear out space if necessary.
- Then it uses gssconvert.com to convert the GSS files *.DAT to VMS binary format, and *.NDX to ASCII format.
- Then it uses setblk.exe to set the VMS block size of the *.DAT file as required.
- Then it uses perl to modify the header line in the *.NDX file to point to the location in the VMS cache disk where the *.DAT file will be stored.
- Then it uses nx.com to make the *.NDX file from scratch from the *.DAT file. (This overwrites the *.NDX file which came from the tarfile, which is redundant work in some cases but required in others. For now, leave it in for safety.)
- Then it resets the expiration date of the newly created *.DAT and *.NDX files, so they will be purged later than older files, when the time comes to clear out cache space.
- Then it clears out temporary files no longer needed.
- Then, if the level is TRACT80, it invokes tract80.com to complete processing of those TRACT80 units which have individual PLTRACT80 components. NOTE: This processing assumes that the corresponding files for level PLTRACT80 have already been put in the cache by an earlier run. At LBL this is accomplished by the csh and perl scripts which precede the VMS command scripts.
- Then, for STF2B7R (File B, race specific) it creates separate race-specific *.NDX files which all point to the same *.DAT file.
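For illustration only, here is a minimal Python sketch of the retry schedule described above for mss_stage_stf2x.csh: check whether staging is complete, and if not, wait and check again, doubling the wait each time up to a 4 hour maximum. It is not taken from the csh script. The 15 minute starting interval, the function names, and the callback arguments are all assumptions, and the real script resubmits itself rather than looping in one process.

```python
from datetime import timedelta

MAX_WAIT = timedelta(hours=4)  # the 4 hour maximum mentioned in the description


def retry_intervals(initial, attempts, max_wait=MAX_WAIT):
    """Return the wait before each recheck: doubled each time, capped at max_wait."""
    waits = []
    wait = min(initial, max_wait)
    for _ in range(attempts):
        waits.append(wait)
        wait = min(wait * 2, max_wait)
    return waits


def stage_then_extract(is_staged, start_extraction, sleep, initial, max_wait=MAX_WAIT):
    """Poll is_staged() with doubling waits, then call start_extraction().

    is_staged, start_extraction and sleep are hypothetical callbacks standing in
    for the mssls check, the vmsjob_stf2x.csh call, and the delayed resubmission.
    """
    wait = min(initial, max_wait)
    while not is_staged():
        sleep(wait.total_seconds())
        wait = min(wait * 2, max_wait)
    start_extraction()


if __name__ == "__main__":
    # Assumed starting interval of 15 minutes; the original value is not stated.
    for n, w in enumerate(retry_intervals(timedelta(minutes=15), 7), 1):
        print(f"recheck {n}: wait {w}")
```

With the assumed 15 minute start, the waits grow 15, 30, 60, 120 minutes and then stay at 4 hours.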
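As a further sketch, the two tarfile locations given above for seedistar_stf2x.com can be written as path templates. The Python below only fills in the quoted names OWNER, TAPENO, TARFILE and DISK; it is not part of the VMS command file. The function names are hypothetical, and the tape-to-disk table is an assumed stand-in for the contents of dirloc.txt, whose format is not described here.

```python
def lbl_tarfile_path(owner, tapeno, tarfile):
    """Tarfile location at LBL: mss104:[seedis.'OWNER'.gss.'TAPENO']'TARFILE'."""
    return f"mss104:[seedis.{owner}.gss.{tapeno}]{tarfile}"


def census_tarfile_path(disk_for_tape, tapeno, tarfile):
    """Tarfile location at Census: 'DISK':[seedis.mss.'TAPENO']'TARFILE'.

    disk_for_tape is a hypothetical dict mapping tape number to disk name,
    standing in for the lookup that dirloc.txt provides.
    """
    disk = disk_for_tape[tapeno]
    return f"{disk}:[seedis.mss.{tapeno}]{tarfile}"


if __name__ == "__main__":
    # All values below are made up for the example.
    print(lbl_tarfile_path("someowner", "tape01", "example.tar"))
    print(census_tarfile_path({"tape01": "somedisk"}, "tape01", "example.tar"))
```

Keeping the disk lookup in a separate table means the path logic does not change when tapes are moved between disks.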
Don't hesitate to ask for advice.
Deane
seedis/tar2seedis.msg.html 3/11/97 in:
http://parep2.lbl.gov/mdocs
http://merrill.wwh.net/mdocs
http://imap.chesapeake.net/~merrill/mdocs
merrill@crocker.com