PSN-L Email List Message

Subject: Re: New PSN event file format.
From: Edward Cranswick cranswick@........
Date: Thu, 29 Jun 2000 18:01:03 -0600


Larry-
    After 20 years' experience with digital seismic data file formats,
particularly the USGS "DR100 Format" (DR1), I have been procrastinating about
responding to your proposal of a "New PSN event file format", but I finally just
went out, drank a quadruple espresso followed by two large cups of dark roast
which were consumed in conjunction with two large sugar-laden, white-flour
pastries, and now I am ready to go ... though, maybe I've already gone.

(Actually, I have also been distracted by a number of other issues such as my
USGS job responsibilities and some other PSN stuff which I will post to PSN-L
after I finish this).

Larry Cochrane wrote:

> Greetings,
>
> Recently I created a web page at http://www.seismicnet.com/psnformat4.html
> that documents a new PSN event file format I would like to propose we use
> once software is written to handle the new format. This format would
> replace the existing type 3 format (see
> http://www.seismicnet.com/info/format.txt) currently being used by PSN
> stations. The reason for this change is to add new fields needed to better
> describe the stations seismometer etc.

My first response, like Karl's,

>  Great job coming up with the new PSN type 4 file format.  This should serve
>  us well.  I, for one, really appreciate your efforts.

is that the format is excellent and that you have been very brave (probably
crazy), competent, and conscientious to embark upon this upgrade. The format could
fly as it is now, but I do have a few suggestions and some comments with respect to
the other responses you have already received from Ted, Arie, Angel, etc.

> As you can see from the document the event file is broken up into 4
> sections. The first being the fixed header section. This part of the event
> file would supply the most common information needed by a program like
> WinQuake to display the event data.

Ted suggested a more generalized approach to the format:

>  That's the only are[a] where I thought your design was a little weak
>  -- too much information in a fixed area.  I think that was my big mistake in
>  defining the SDAS/EMON format and I didn't want to see you go down that
>  road again.

but -- though I understand and sympathize with his concern about getting locked
into some rigid and insufficient structure -- I think your plan to have a fixed
header block that contains the most common and minimally required information is
the best approach.

> The next section is variable in length. it would contain variable length
> strings (up to 254 characters in length) used for the comment, station
> location, sensor information etc It would also contain other information
> not in the fixed sections of the header. The format of this section is
> designed so that new fields can be added in the future without changing the
> overall format of the event file.

I think this is important because it adds flexibility to the file structure, and
that addresses Ted's concern. In particular, more information is often needed
about the path from sensor to screen: you should be able to separate the
"Sensitivity" (Ground-motion-unit/Count) into its component parts, as is done in
DR1 format. In general, there is a Transducer (sensor) constant
(Volts/Ground-motion-unit), an amplifier Amplification (scalar multiplier of
Volts), and a Digitizing constant (Counts/Volt), the reciprocal of Volts/LSB, which
is determined by the A/D resolution in bits (which defines the full-scale output in
counts) and the A/D full-scale input voltages (and remember there might be
gain-ranging, though not in the New Millennium, I hope). The value of one of these
components may change, such as the gain, and it is easier to keep track of these
values separately than to just tweak the "Sensitivity".

Sensitivity = 1 / [ (Transducer constant) * Amplification * (Digitizing constant) ]
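
To make the bookkeeping concrete, here is a minimal Python sketch of that
calculation. The function names and the example station values (a 100 V per m/s
transducer, a gain of 10, a 16-bit A/D with a +/-5 V input range) are my own
assumptions for illustration, not fields of the proposed format.

```python
def digitizing_constant(adc_bits, full_scale_volts):
    """Counts per volt for an A/D whose input spans -full_scale_volts..+full_scale_volts."""
    return (2 ** adc_bits) / (2.0 * full_scale_volts)


def sensitivity(transducer_v_per_unit, amplification, counts_per_volt):
    """Ground-motion units per count: the reciprocal of the whole signal chain."""
    return 1.0 / (transducer_v_per_unit * amplification * counts_per_volt)


# Hypothetical station: 100 V per (m/s) transducer, amplifier gain of 10,
# 16-bit A/D with a +/-5 V input range.
counts_per_volt = digitizing_constant(16, 5.0)          # 65536 counts / 10 V
example = sensitivity(100.0, 10.0, counts_per_volt)     # (m/s) per count

# If only the gain changes, only one stored component needs updating.
example_gain_20 = sensitivity(100.0, 20.0, counts_per_volt)
```

Keeping the three components separate means that a gain change is recorded as a
gain change, and the "Sensitivity" is simply recomputed from it.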

We also usually assume that the sensor has a frequency band of nominally "flat
response" to some form of input ground motion (acceleration, velocity, or
displacement, as you have specified in the "Sensor Output" field, which should be
called "Input" because the sensor output is volts) and that this frequency band is
bounded above and below. Therefore, like filter specifications, these upper and
lower frequency bounds also need to be stored in terms of both their corner
frequencies and some measure of the steepness of the response decay (e.g.,
dB/octave) above the upper corner and below the lower corner.
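
As a rough illustration of what would be stored, the following Python sketch
keeps a corner frequency and a roll-off for each edge of the band. The class and
field names and the example numbers are hypothetical, and the attenuation it
computes is only the straight-line (asymptotic) approximation, which ignores the
rounding of the real response near the corners.

```python
import math
from dataclasses import dataclass


@dataclass
class BandEdge:
    corner_hz: float               # corner frequency bounding the flat band
    rolloff_db_per_octave: float   # steepness of the decay beyond the corner


@dataclass
class SensorBand:
    lower: BandEdge
    upper: BandEdge

    def asymptotic_attenuation_db(self, freq_hz):
        """Straight-line attenuation in dB at freq_hz; 0 inside the flat band."""
        if freq_hz < self.lower.corner_hz:
            octaves = math.log2(self.lower.corner_hz / freq_hz)
            return octaves * self.lower.rolloff_db_per_octave
        if freq_hz > self.upper.corner_hz:
            octaves = math.log2(freq_hz / self.upper.corner_hz)
            return octaves * self.upper.rolloff_db_per_octave
        return 0.0


# Hypothetical sensor: flat from 1 Hz to 20 Hz, falling off at 12 dB/octave
# below the band and 24 dB/octave above it.
band = SensorBand(lower=BandEdge(1.0, 12.0), upper=BandEdge(20.0, 24.0))
```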

The above discussion about sensors is similar to Arie's remarks about storing the
filtering and processing parameters, i.e.,

>  One thing that would be keen, could a field show the type of
>  FFT filtering that was used?

I think that Angel's suggestion is important,

>  One of my concerns has been the name of the files themselves lead to
>  confusion now and then.  It would be nice if Winquake could rename
>  files made by SDR so they got away from the 8.3 limitations and
>  contained lots more information.  i.e. for a single event file the
>  name below gives the date and time to the second. File type which in
>  this case is "S" for standard  then the name of the station 5, then
>  that it is a single channel then the sensor orientation information

>  yyyy-mm-dd-hhmm-ssT.NETWO_nnn_snsr

>  2000-06-27-0714-06S.BRU2__001_S__Z

>  Then this file name for a volume file, this give the date and time
>  station and the fact that this volume file has four events in it.

>  2000-06-01-1345-48S.bru2__004

>  These are the files names that are used by Seisan and I'm sure that
>  there are others that beat out this one the DOS limited names.

The "Time.Space" structure of the filename, which the present PSN filename
convention follows somewhat haphazardly, makes a lot of sense, but I think we
could dispense with the "S" for standard and all those hyphens and underscores that
Seisan uses. The most important criterion for event filenames is that they
are unique, and the second is that they can be sensibly sorted, i.e., that they
"naturally" list themselves into easily interpreted sets. You specify a total of
13 bytes of station information, i.e., spatial definition, the location and
direction where the ground moved: "Sensor Network", 6 bytes; "Sensor Name", 6
bytes, i.e., station name; "Component Orientation", 1 byte. This last could be
changed and/or amended to include channel specification.
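
One possible shape for such a name, sketched in Python. The layout
"yyyymmddhhmmss.NETWORK.STATION.C", the function name, and the network code
"PSN" are assumptions made up for this example. It is not a proposal for the
actual convention, only a demonstration that a time-first name with no filler
characters is both unique per channel and sorts chronologically.

```python
from datetime import datetime, timezone


def event_filename(start, network, station, component):
    """Build 'yyyymmddhhmmss.NETWORK.STATION.C' (a hypothetical layout)."""
    # Mirror the header field sizes: 6 + 6 + 1 bytes of station information.
    if len(network) > 6 or len(station) > 6 or len(component) != 1:
        raise ValueError("network/station are at most 6 chars, component is 1")
    stamp = start.strftime("%Y%m%d%H%M%S")
    return f"{stamp}.{network.upper()}.{station.upper()}.{component.upper()}"


names = [
    event_filename(datetime(2000, 6, 27, 7, 14, 6, tzinfo=timezone.utc), "PSN", "BRU2", "Z"),
    event_filename(datetime(2000, 6, 1, 13, 45, 48, tzinfo=timezone.utc), "PSN", "bru2", "Z"),
    event_filename(datetime(2000, 6, 27, 7, 14, 6, tzinfo=timezone.utc), "PSN", "BRU2", "N"),
]

# A plain string sort groups the files by time first, then by place.
ordered = sorted(names)
```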

With respect to the "DateTime Structure", I would suggest that since you are
using a "long" to store the fraction of seconds, you might as well store
nanoseconds. I mean, if you've got it, i.e., the precision, why not flaunt it? You
never know when you'll need it. Also, it would be useful to include a clock
correction in seconds, stored as a double (type/length), that could be added to the
nominal time with which the samples were originally time-stamped when recorded.
Particularly in the case of "unlocked" data, that estimate of true time might be
fuzzy and controversial.
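
A small Python sketch of the idea, assuming the "long" is a signed 32-bit
integer. The class and field names are hypothetical and are not the layout of
the proposed DateTime structure. It shows that a full nanosecond count fits in
such a field, and that the nominal time stamp can be kept untouched while the
correction is applied only on request.

```python
from dataclasses import dataclass

NANOS_PER_SECOND = 1_000_000_000
INT32_MAX = 2**31 - 1   # largest value a signed 32-bit "long" can hold


@dataclass
class SampleTime:
    seconds: int                    # whole seconds since some epoch
    nanoseconds: int                # fractional part, 0 .. 999_999_999
    clock_correction: float = 0.0   # seconds to ADD to the nominal time

    def corrected(self):
        """Return (seconds, nanoseconds) with the clock correction applied."""
        total_ns = self.seconds * NANOS_PER_SECOND + self.nanoseconds
        total_ns += round(self.clock_correction * NANOS_PER_SECOND)
        return divmod(total_ns, NANOS_PER_SECOND)


# The largest fraction, 999,999,999 ns, still fits in a 32-bit long.
fits_in_long = (NANOS_PER_SECOND - 1) <= INT32_MAX

# Hypothetical time stamp whose clock was later judged to be 0.75 s fast.
nominal = SampleTime(seconds=1000, nanoseconds=250_000_000, clock_correction=-0.75)
true_seconds, true_nanos = nominal.corrected()
```

Storing the correction separately, rather than overwriting the time stamp, means
the estimate of true time can be revised later without losing what was recorded.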

> The next sections is the seismogram data. The data array can be either 16
> bit integer, 32 bit integer or floating point. After the data is two CRC-16
> bytes. This is used to verify the integrity of the headers and data sections.

Do you mean dis dat or dat one? When referring to the discrete values of the
recorded seismic timeseries, I always try to use the word "samples" rather than
"data" to distinguish them from the other data, such as time-stamps, instrument
constants, or station info.

>  I'm also proposing a PSN Volume file. This would contain 2 or more PSN type
> 4 event records placed in one file. See the bottom of the psnformat4.html
> document for the format of this file.

I think this is an excellent idea!!!

> If I left anything fields out that you think are needed please let me know.

Two thoughts:

1) To avoid worrying about the different "Type/Length" of parameters, all
parameters -- other than the "Variable Length" info -- could be stored in the
form of the most inclusive type: 8-byte double floats. Though this will waste
some space -- which will be small compared to the data-mass of the samples -- the
uniformity will make the format easier to program and will accommodate future
upgrades of parameter precision.
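
For what it is worth, here is how little code the all-doubles approach needs in
Python. The field names, their order, and the little-endian byte order are
assumptions for this sketch only and are not taken from the proposed header.
Integer-valued parameters such as a sample count survive the round trip because
a double represents every integer up to 2^53 exactly.

```python
import struct

# Hypothetical parameter set; names and order are illustrative only.
FIELDS = ("sample_rate", "latitude", "longitude", "sample_count")


def pack_params(params):
    """Pack every parameter as a little-endian 8-byte double, in FIELDS order."""
    values = (float(params[name]) for name in FIELDS)
    return struct.pack(f"<{len(FIELDS)}d", *values)


def unpack_params(blob):
    """Inverse of pack_params: bytes -> dict of floats keyed by field name."""
    return dict(zip(FIELDS, struct.unpack(f"<{len(FIELDS)}d", blob)))


params = {"sample_rate": 50.0, "latitude": 37.5, "longitude": -122.25,
          "sample_count": 90000}
blob = pack_params(params)       # 4 fields * 8 bytes each
restored = unpack_params(blob)
```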

2) As I said in a presentation at the IRIS workshop in 1983 that was held to
launch the PASSCAL Program: "seismic data is like nuclear waste: both need to be
stored in dumps while awaiting processing", and we need a "Cosmic Database" to
store all these data and make them rapidly accessible to all. Got any ideas about
putting it all together, such that all the header info of all event/volume files
can be rapidly queried and the corresponding waveform samples can be retrieved
via the Web?

> Regards,
> Larry Cochrane
> Redwood City, PSN
>
>
>
> __________________________________________________________
>
> Public Seismic Network Mailing List (PSN-L)
>
> To leave this list email PSN-L-REQUEST@.............. with
> the body of the message (first line only): unsubscribe
> See http://www.seismicnet.com/maillist.html for more information.

Regards,
Edward

--
Edward Cranswick                Tel: 303-273-8609
US Geological Survey, MS 966    Fax: 303-273-8600
PO Box 25046, Federal Center    cranswick@........
Denver, CO 80225-0046  USA      E.M. Forster said, "Only connect".


