Larry-

After 20 years' experience with digital seismic data file formats, particularly the USGS "DR100 Format" (DR1), I have been procrastinating about responding to your proposal of a "New PSN event file format", but I finally just went out, drank a quadruple espresso followed by two large cups of dark roast, which were consumed in conjunction with two large sugar-laden, white-flour pastries, and now I am ready to go ... though, maybe I've already gone. (Actually, I have also been distracted by a number of other issues, such as my USGS job responsibilities and some other PSN stuff which I will post to PSN-L after I finish this.)

Larry Cochrane wrote:
> Greetings,
>
> Recently I created a web page at http://www.seismicnet.com/psnformat4.html
> that documents a new PSN event file format I would like to propose we use
> once software is written to handle the new format. This format would
> replace the existing type 3 format (see
> http://www.seismicnet.com/info/format.txt) currently being used by PSN
> stations. The reason for this change is to add new fields needed to better
> describe the stations seismometer etc.

My first response, like Karl's,

> Great job coming up with the new PSN type 4 file format. This should serve
> us well. I, for one, really appreciate your efforts.

is that the format is excellent and that you have been very brave (probably crazy), competent and conscientious to embark upon this upgrade. The format could fly as is now, but I do have a few suggestions and some comments with respect to the other responses you have already received from Ted, Arie, Angel, etc.

> As you can see from the document the event file is broken up into 4
> sections. The first being the fixed header section. This part of the event
> file would supply the most common information needed by a program like
> WinQuake to display the event data.
Ted suggested a more generalized approach to the format:

> That's the only are[a] where I thought your design was a little weak
> -- too much information in a fixed area. I think that was my big mistake in
> defining the SDAS/EMON format and I didn't want to see you go down that
> road again.

but -- though I understand and sympathize with his concern about getting locked into some rigid and insufficient structure -- I think your plan to have a fixed header block that contains the most common and minimally required information is the best approach.

> The next section is variable in length. it would contain variable length
> strings (up to 254 characters in length) used for the comment, station
> location, sensor information etc It would also contain other information
> not in the fixed sections of the header. The format of this section is
> designed so that new fields can be added in the future without changing the
> overall format of the event file.

I think this is important because it adds flexibility to the file structure, and that addresses Ted's concern. In particular, more information is often needed about the path from sensor to screen: you should be able to separate the "Sensitivity" (Ground-motion-unit/Count) into its component parts, as is done in DR1 format. In general, there is a Transducer (sensor) constant (Volts/Ground-motion-unit), an amplifier Amplification (scalar multiplier of Volts), and a Digitizing constant (Counts/Volt, i.e., the reciprocal of Volts/LSB), which is determined by the A/D resolution in bits (which defines the full-scale output in counts) and the A/D full-scale input voltages (and remember there might be gain-ranging, though not in the New Millennium, I hope). The value of one of these components may change, such as the gain, and it is easier to keep track of these values separately than to just tweak the "Sensitivity".
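As a rough illustration of keeping the components separate, here is a minimal Python sketch that combines them as in the Sensitivity relation given next. The function names and all the numbers (a 16-bit A/D spanning -5 to +5 V, a transducer constant of 100 Volts per ground-motion-unit, a gain of 10) are made up for the example and are not part of the proposed format; the digitizer is assumed to be ideal, with no gain-ranging.

```python
def digitizing_constant(bits, v_min, v_max):
    """Counts per Volt for an ideal A/D converter (assumed: no gain-ranging)."""
    full_scale_counts = 2 ** bits      # full-scale output in counts
    full_scale_volts = v_max - v_min   # full-scale input span in Volts
    return full_scale_counts / full_scale_volts


def sensitivity(transducer_constant, amplification, counts_per_volt):
    """Ground-motion-units per Count, from the three stored components."""
    return 1.0 / (transducer_constant * amplification * counts_per_volt)


# Hypothetical station values, for illustration only.
counts_per_volt = digitizing_constant(bits=16, v_min=-5.0, v_max=5.0)  # 65536 / 10
ground_units_per_count = sensitivity(
    transducer_constant=100.0,  # Volts per ground-motion-unit
    amplification=10.0,         # Volts per Volt
    counts_per_volt=counts_per_volt,
)

print(counts_per_volt)
print(ground_units_per_count)
```

If the gain is later changed, only the amplification value has to be updated; the Sensitivity is recomputed rather than hand-tweaked.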
Sensitivity = 1 / [ (Transducer constant) * Amplification * (Digitizing constant) ]

We also usually assume that the sensor has a frequency band of nominally "flat response" to some form of input ground motion -- acceleration, velocity, displacement: as you have specified in the "Sensor Output" (should be "Input", because the sensor output is volts) -- and that this frequency band is bounded above and below. Therefore, like filter specifications, these upper and lower frequency bounds also need to be stored in terms of both their corner frequencies and some measure of the steepness of their response decays or attenuations (e.g., dB/octave) above and below, respectively, these corner frequencies.

The above discussion about sensors is similar to Arie's remarks about storing the filtering and processing parameters, i.e.,

> One thing that would be keen, could a field show the type of
> FFT filtering that was used?

I think that Angel's suggestion is important,

> One of my concerns has been the name of the files themselves lead to
> confusion now and then. It would be nice if Winquake could rename
> files made by SDR so they got away from the 8.3 limitations and
> contained lots more information. i.e. for a single event file the
> name below gives the date and time to the second. File type which in
> this case is "S" for standard then the name of the station 5, then
> that it is a single channel then the sensor orientation information
> yyyy-mm-dd-hhmm-ssT.NETWO_nnn_snsr
> 2000-06-27-0714-06S.BRU2__001_S__Z
> Then this file name for a volume file, this give the date and time
> station and the fact that this volume file has four events in it.
> 2000-06-01-1345-48S.bru2__004
> These are the files names that are used by Seisan and I'm sure that
> there are others that beat out this one the DOS limited names.
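To make the layout of the names Angel quotes concrete, here is a small Python sketch that splits them into fields. It is only my reading of the two examples above, not Seisan's own definition: I assume a 5-character station field padded with underscores, a 3-digit count (channels for an event file, events for a volume file), and an optional 4-character sensor field. The function and field names are made up for the example.

```python
import re
from datetime import datetime

# Layout assumed from the quoted examples: yyyy-mm-dd-hhmm-ssT.NETWO_nnn_snsr
NAME_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}-\d{4}-\d{2})"  # date and time to the second
    r"(?P<ftype>[A-Za-z])\."                      # file type, e.g. "S"
    r"(?P<station>.{5})_(?P<count>\d{3})"         # station, then channel/event count
    r"(?:_(?P<sensor>.{4}))?$"                    # sensor field (absent for volumes)
)


def parse_event_name(name):
    """Split a Seisan-style file name into its time and station fields."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError("unrecognized file name: " + name)
    return {
        "time": datetime.strptime(match.group("stamp"), "%Y-%m-%d-%H%M-%S"),
        "file_type": match.group("ftype"),
        "station": match.group("station").rstrip("_"),  # drop underscore padding
        "count": int(match.group("count")),
        "sensor": match.group("sensor"),                 # None for a volume file
    }


event = parse_event_name("2000-06-27-0714-06S.BRU2__001_S__Z")
volume = parse_event_name("2000-06-01-1345-48S.bru2__004")
print(event["time"], event["station"], event["sensor"])
print(volume["time"], volume["station"], volume["count"])
```

Because the time comes first with fixed-width, zero-padded fields, a plain alphabetical listing of such names is also a chronological one.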
The "Time.Space" structure of the filename -- which the present PSN filename convention follows somewhat haphazardly -- makes a lot of sense, but I think we could dispense with "S" for standard and all those hyphens and underscores that Seisan uses. The most important criterion for event filenames is that they are unique, and secondly, that they can be sensibly sorted, i.e., that they "naturally" list themselves into easily interpreted sets. You specify a total of 13 bytes of station information, i.e., spatial definition: the location and direction where the ground moved: "Sensor Network", 6 bytes; "Sensor Name", 6 bytes, i.e., station name; "Component Orientation", 1 byte -- this last could be changed and/or amended to include channel specification.

With respect to the "DateTime Structure", I would suggest that since you are using a "long" to store the fraction of seconds, you might as well store nanoseconds. I mean, if you got it, i.e., the precision, why not flaunt it? You never know when you'll need it. Also, it would be useful to include a clock correction in seconds as a double (type/length) that could be added to the nominal time with which the samples were originally time-stamped when recorded. Particularly in the case of "unlocked" data, that estimate of true time might be fuzzy and controversial.

> The next sections is the seismogram data. The data array can be either 16
> bit integer, 32 bit integer or floating point. After the data is two CRC-16
> bytes. This is used to verify the integrity of the headers and data sections.

Do you mean dis dat or dat one? When referring to the discrete values of the recorded seismic timeseries, I always try to use the word "samples" rather than "data" to distinguish them from the other data, such as time-stamps, instrument constants, or station info.

> I'm also proposing a PSN Volume file. This would contain 2 or more PSN type
> 4 event records placed in one file.
> See the bottom of the psnformat4.html
> document for the format of this file.

I think this is an excellent idea!!!

> If I left anything fields out that you think are needed please let me know.

Two thoughts:

1) To avoid worrying about the different "Type/Length" of parameters, all parameters -- other than the "Variable Length" info -- could be stored in the form of the most inclusive type: 8-byte double floats. Though this will waste some space -- which will be small compared to the data-mass of the samples -- its uniformity will be easier to program and it will accommodate future upgrades of parameter precision.

2) As I said in a presentation at the IRIS workshop in 1983 that was held to launch the PASSCAL Program: "seismic data is like nuclear waste: both need to be stored in dumps while awaiting processing", and we need a "Cosmic Database" to store all these data and make them rapidly accessible to all. Got any ideas about putting it all together, such that all the header info of all event/volume files can be rapidly queried and the corresponding waveform samples can be retrieved via the Web?

> Regards,
> Larry Cochrane
> Redwood City, PSN
>
> __________________________________________________________
>
> Public Seismic Network Mailing List (PSN-L)
>
> To leave this list email PSN-L-REQUEST@.............. with
> the body of the message (first line only): unsubscribe
> See http://www.seismicnet.com/maillist.html for more information.

Regards, Edward

--
Edward Cranswick                 Tel: 303-273-8609
US Geological Survey, MS 966     Fax: 303-273-8600
PO Box 25046, Federal Center     cranswick@........
Denver, CO 80225-0046  USA       E.M. Forster said, "Only connect".

__________________________________________________________

Public Seismic Network Mailing List (PSN-L)