HOPS Data manipulation package:

HOST:        maelstrom.harvard.edu (128.103.2.50)

TAR FILES:   pub/HOPS/Datamng/datamng_1.6.tar.Z   (121213:576000 bytes)

VERSION:     1.6 (January 18, 2001)

ORIGIN:      Harvard University, Cambridge Massachusetts
             Harvard Ocean Prediction System (HOPS)

DEVELOPERS:  Patrick J. Haley Jr.   (haley@pacific.harvard.edu)
             Wayne G. Leslie        (leslie@pacific.harvard.edu)

LIBRARIES:   (1) GNUmake version 3.75
                 Free Software Foundation
                 http://www.gnu.ai.mit.edu/

============
INTRODUCTION
============

This directory contains HOPS's data manipulation package.  The package is
designed for manipulation of the MODS data format used by HOPS (an ASCII
format).  The package contains tools to examine the data file, manipulate
the files, manipulate the data within the files and some sample conversion
programs.  This package also contains tools to read and write MODS formatted
data from MATLAB.

============
Installation
============

This package is available over the INTERNET via anonymous FTP from
maelstrom.harvard.edu (128.103.2.50).  When connected, you will be in the
FTP directory.  To obtain this package, go to the directory
"pub/HOPS/Datamng" and get files:

   Readme.datamng       This file.
   datamng_1.6.tar.Z    Compressed tar file of the data manipulation package.

To install this package, simply go to the directory in which you want to put
the data manipulation package and execute the following command:

   zcat datamng_1.6.tar.Z | tar -pxvf -

The tar file datamng_1.6.tar.Z contains the following files:

   add_top.F           cat_hydro.F         convcast.F          minmax.F
   saclant2mods.F      select.F            select_depth.F      selpoly.F
   selpos.F            seltime.F           shifttime.F         smooth_casts.F
   thinclimo.F         timestat.F

   all_lc.F            blkdat_hydro.F      blkdat_sac.F        caldate.F
   cast_stat.F         day_code.F          fgaussian.F         flinear.F
   get_date.F          gregorian.F         heap_sort.F         inside.F
   intbas.F            isnum.F             istyp.F             julian.F
   length.F            lnblk.F             my_handler.F        nxt_blnk.F
   p2z.F               q_init.F            rcond_rep.F         readhydro.F
   rget_val.F          rhydbase.F          rinsert.F           rmblklines.FF
   rmed_filt.F         rq_in.F             rq_out.F            sac_data.F
   sac_head.F          set_outname.F       set_type.F          whydbase.F
   writehydro.F        wrtstr.F            z2p.F

   hydro.h             param_hydro.h       sac_stuff.h

   GNUmakefile.alpha   GNUmakefile.cray    GNUmakefile.iris    GNUmakefile.rs6000
   GNUmakefile.sun3    GNUmakefile.sun4    GNUmakefile.sun5

   UPDATES

   read_cast.m         read_header.m       rhydro.m            whydro.m
   write_cast.m

============
The Makefile
============

Currently, there are seven different GNUmakefiles for seven different
computer architectures:

   GNUmakefile.alpha   GNUmakefile.cray    GNUmakefile.iris    GNUmakefile.rs6000
   GNUmakefile.sun3    GNUmakefile.sun4    GNUmakefile.sun5

These makefiles are written for GNU Make, version 3.75

   Free Software Foundation        (617) 876-3296
   675 Mass Ave.                   gnu@prep.ai.mit.edu
   Cambridge, MA 02139, USA        http://www.gnu.ai.mit.edu/

They are NOT compatible with the standard UNIX Make.

The GNU Makefiles have been designed to allow the user to compile the data
manipulation source codes in a separate directory from that in which the
source codes are located.  The makefile searches for the code segments in
the following alternate paths:

   source code:   (1) the directory containing the GNUmakefile.
                  (2) the directory specified by the macro SRCDIR

   include files: (1) the directory containing the GNUmakefile.
                  (2) the directory specified by the macro PARAMDIR
                  (3) the directory specified by the macro SRCDIR

This provides the user with the flexibility for the following configurations:

(1) The user needs only a copy of a GNUmakefile and a path to the source
    codes to produce a version of the data manipulation package with the
    appropriate C-preprocessing and compiler options.

(2) The user who is modifying the data manipulation codes can isolate those
    routines actually being changed with a copy of a GNUmakefile in a
    sub-directory.

-------------------------------------
GNUmakefile Tunable macro definitions
-------------------------------------

The User needs to check and modify the following macro definitions in the
appropriate GNUmakefile before compiling and linking the application code:

   ADDTOP      Executable name for add_top
   CATHYDRO    Executable name for cat_hydro
   CONVCAST    Executable name for convcast
   MINMAX      Executable name for minmax
   SCLNT2MD    Executable name for saclant2mods
   SELECT      Executable name for select
   SELPOLY     Executable name for selpoly
   SELPOS      Executable name for selpos
   SELTIME     Executable name for seltime
   SHIFTTIM    Executable name for shifttime
   SLCTDPTH    Executable name for select_depth
   SMTHCAST    Executable name for smooth_casts
   THINCLMO    Executable name for thinclimo
   TIMESTAT    Executable name for timestat

   BINDIR      directory path for the executable code.
   CPPFLAGS    C-preprocessing flags and options.
   FFLAGS      flags to the fortran compiler.
   PARAMDIR    alternate directory path for the include files.
   SRCDIR      directory path for the source code.

-----------------------------------
GNUmakefile C-preprocessing Options
-----------------------------------

The following are the available C-preprocessing options to use in the macro
definition CPPFLAGS:

   aixdate     AIX intrinsic date routine (IBM RS6000).
   craydate    CRAY's intrinsic date routine.
   decdate     DEC's VMS intrinsic date routine.
   gendbg      Generic Debugging:  Preserves intermediate files
   rmdocinc    Remove documentation in all include files
   sundate     SUN's intrinsic date routine.
   sunflush    regularly flush output buffers in SUN systems.
   sunfpe      enable SUN's Floating Point Exception trap.

-------------------------------
GNUmakefile Installation Issues
-------------------------------

A number of internal macros are defined for the system commands used by the
GNU makefiles.  These will generally only have to be defined once, the first
time the user installs the data manipulation package on a new system.

   RMBLKLINES  The name given to executable code (provided with this
               package) to remove blank lines from the pre-processed code.
               This is provided only to avoid possible conflicts.
   SHELL       The shell to be used by the makefile.
   RM          The remove command.
   ECHO        The echo command.
   LIB         The netCDF library
   CPP         The C Pre-Processor.
   FC          The FORTRAN compiler

=====================
Compiling and Linking
=====================

Once the software has been installed and the Makefile has been selected and
customized, the User needs to complete the following steps to compile and
link the application code:

(1) Customize include parameter file "param_hydro.h".  The User needs to
    set the following parameters:

       MHDR     Maximum number of lines of text in the file header.
       MHFLDS   Maximum number of field types supported.
       MHPTS    Maximum number of data points per station.
       MHVAR    Maximum number of variables per station.  (MHVAR<=MHFLDS)

    It is recommended to set these values large enough that the codes do
    not need to be recompiled for each new data set, for example:

       parameter (mhdr=20,mhflds=10,mhpts=10000,mhvar=4)

=========
Tool List
=========

The following is a list of the codes supplied in the data manipulation
package.  All are interactive, prompting the user for the necessary data.

Inquiry codes:

   minmax         Reports min/max statistics of the data.
   timestat       Reports time statistics of the data.

File manipulation codes:

   cat_hydro      Concatenates multiple MODS data files.
   select         Extracts casts based on cast id.
   select_depth   Extracts casts based on minimum allowed depth.
   selpoly        Extracts casts based on a polygon.
   selpos         Extracts casts based on position.
   seltime        Extracts casts based on time.
   thinclimo      Reduces gridded climatology resolution by factor of 2.

Data manipulation codes:

   add_top        Adds a surface value to data.
   shifttime      Adds a constant time shift to the data.
   smooth_casts   Smooths the data with linear or Gaussian filters.

Data conversion codes:

   convcast       Converts data file formats.
   saclant2mods   Converts SACLANT formatted data to MODS.

============
Matlab Tools
============

The following is a list of the scripts, supplied in the data manipulation
package, for accessing MODS formatted data from within MATLAB.

User Level:

   rhydro         Reads a MODS formatted data file.
   whydro         Writes a data file in MODS format.  It requires the user
                  to set up all the header information.  Its primary
                  usefulness is to re-write data that was read in with
                  rhydro.

Low Level:

   read_cast      Reads a single cast from a MODS formatted data file.
   read_header    Reads the file header from a MODS formatted data file.
   write_cast     Writes a single cast to a MODS formatted data file.

===========
MODS Format
===========

The basic data format supported by these codes is the MODS format.  A
description of this ASCII format is given below.

The MODS format is designed around "cast" data.  That is, data with depth
dependence taken at points in latitude, longitude and time.  The data file
is in two sections:  the header and the data.  The conventions are as
follows:

   lat/lon:   Latitudes are positive north of the equator.
   --------   Longitudes are positive east of the prime meridian.
              All values are in decimal degrees.

                 (39°30'N, 71°45'W) = (39.5, -71.75)

   depth:     Depths are positive down (below sea-level).
   ------

   time:      Times are given in modified Julian days.
   -----      The modifications are:
                 - The day starts at midnight, not noon.
                 - An offset, given in the header, is subtracted from the
                   actual Julian date.

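The time convention above can be illustrated with a short sketch.  This is
not part of the package (the package's own date handling is in the Fortran
sources); the function name mods_time_to_datetime and its arguments are
hypothetical.  It assumes that adding the header offset back to the integer
part of a MODS time gives the Julian Day Number of the calendar day, and
that the fractional part is counted from midnight.  That reading agrees with
the sample header in this file, where str_time = 10005.4072 is listed
alongside Oct 14 1995.

```python
import math
from datetime import datetime, timedelta

# Python's proleptic Gregorian ordinal and the Julian Day Number differ by a
# constant: JDN = ordinal + 1721425 (e.g. 1 Jan 2000 is ordinal 730120 and
# JDN 2451545).
JDN_MINUS_ORDINAL = 1721425


def mods_time_to_datetime(mods_time, jday_offset):
    """Convert a MODS time to a calendar date and time.

    Assumption (hypothetical helper, not the package's routine): the
    integer part plus the offset is the Julian Day Number of the calendar
    day, and the fractional part is the fraction of that day since midnight.
    """
    whole_days = math.floor(mods_time)
    day_fraction = mods_time - whole_days
    julian_day_number = int(whole_days) + int(jday_offset)
    midnight = datetime.fromordinal(julian_day_number - JDN_MINUS_ORDINAL)
    return midnight + timedelta(days=day_fraction)


if __name__ == "__main__":
    # Values taken from the sample header: str_time and Jday_offset.
    print(mods_time_to_datetime(10005.4072, 2440000))
```

Because the times are stored to four decimal places of a day, the converted
time of day is only good to a few seconds.
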
--------------------------------------------------------------------------------
Sample of MODS formatted data.
--------------------------------------------------------------------------------
title = AIS95 Sample Data
stations = 2
str_time = 10005.4072, Oct 14 1995 09:46:24
end_time = 10014.4775, Oct 23 1995 11:27:39
Jday_offset = 2440000
lng_min = 11.6195
lng_max = 15.4230
lat_min = 36.5342
lat_max = 37.8503
format = ascii, record interleaving
type = XCTD, XBT
fields = depth, temperature, salinity
units = meter, Celsius, PSU
creation_date = Tuesday - March 18, 1997 - 2:50:13 pm
END
3 4 1001 11.6195 36.5342 285.0 10005.4072 1.00E-01 1.00E-03 1.00E-03
0 'XCTD: z t s'
0 18 26 34
23720 23720 23700 23700
36861 36861 36928 36985
2 15 2071 16.5057 37.3998 2614.0 10014.4775 1.00E-01 1.00E-03
0 'XBT: z t'
0 25 32 38 45 51 58 64 71 77 84 90 97 103 109
21860 21860 21860 21860 21850 21850 21850 21850 21850 21840 21840 21830 21840 21840 21830
--------------------------------------------------------------------------------

The file header in the above example is 15 lines long, ending with the
terminator "END".  The information contained in the header is:

   title           A simple description of the data in the file.
   stations        The number of profiles in the data file.
   str_time        The earliest time in the data file.
   end_time        The latest time in the data file.
   Jday_offset     The offset subtracted from the Julian dates in the file.
   lng_min         The westernmost cast longitude in the file.
   lng_max         The easternmost cast longitude in the file.
   lat_min         The southernmost cast latitude in the file.
   lat_max         The northernmost cast latitude in the file.
   format          A description of the data format.
   type            The types of instruments represented in the file.
   fields          The types of oceanographic fields represented in the file.
   units           The respective units for the fields.
   creation_date   The date the file was written.

After the file header is the cast data.  Each cast has its own two-line
header.

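As an illustration of the file header just described, the following sketch
collects the "name = value" entries up to the "END" terminator.  It is not
the package's reader (the package provides Fortran routines and the MATLAB
script read_header for this); the function name parse_mods_header is
hypothetical, and the sketch assumes one entry per line, as the 15-line
count for the sample implies.  Values are kept as strings.

```python
def parse_mods_header(lines):
    """Return (header dictionary, number of header lines including END).

    Assumption: each header line has the form 'name = value' and the
    header ends at a line containing only END.
    """
    header = {}
    for count, line in enumerate(lines, start=1):
        text = line.strip()
        if text == "END":
            return header, count
        if "=" in text:
            name, value = text.split("=", 1)
            header[name.strip()] = value.strip()
    raise ValueError("header terminator END not found")


# The header from the sample in this file.
SAMPLE_HEADER = """\
title = AIS95 Sample Data
stations = 2
str_time = 10005.4072, Oct 14 1995 09:46:24
end_time = 10014.4775, Oct 23 1995 11:27:39
Jday_offset = 2440000
lng_min = 11.6195
lng_max = 15.4230
lat_min = 36.5342
lat_max = 37.8503
format = ascii, record interleaving
type = XCTD, XBT
fields = depth, temperature, salinity
units = meter, Celsius, PSU
creation_date = Tuesday - March 18, 1997 - 2:50:13 pm
END
""".splitlines()

if __name__ == "__main__":
    header, nlines = parse_mods_header(SAMPLE_HEADER)
    print(nlines, header["fields"])
```

The line count returned with the dictionary tells the caller where the cast
data begins.
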
Cast header, line 1:
--------------------

The first line of the cast header looks like:

   NHVAR NHPTS CASTID HLNG HLAT HDPTH HTIME HSCL(1) HSCL(2) ... HSCL(NHVAR)

   NHVAR     - number of variables
   NHPTS     - number of data points
   CASTID    - cast identifier
   HLNG      - longitude
   HLAT      - latitude
   HDPTH     - maximum depth
   HTIME     - time
   HSCL(i)   - scale to apply to variable i

Cast header, line 2:
--------------------

The second line of the cast header looks like:

   HFLAG HTYPE

   HFLAG     - unused flag variable
   HTYPE     - string indicating the instrument type and the oceanographic
               fields represented in the cast.

The data is stored by field.  For the first cast in the above sample, the
XCTD, the data is stored as all depths, then all temperatures and finally
all salinities.  The data is read in and then each value is multiplied by
the HSCL value for its field.
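The cast layout described above can be sketched as follows.  This is only an
illustration, not the package's reader (see readhydro.F or the MATLAB script
read_cast for that); the Python function read_cast shown here and its return
structure are hypothetical.  It assumes that each cast starts on a new line,
that the stored values are whitespace separated, and that the values of a
field may continue over several lines.

```python
def read_cast(lines, start=0):
    """Read one cast beginning at lines[start].

    Returns (cast dictionary, index of the first line after the cast).
    Assumption: data values are whitespace separated and stored field by
    field, NHPTS values per field, possibly wrapped over several lines.
    """
    head = lines[start].split()
    nhvar, nhpts = int(head[0]), int(head[1])
    scales = [float(s) for s in head[7:7 + nhvar]]

    # Cast header, line 2: the flag, then the quoted type string.
    flag, _, htype = lines[start + 1].strip().partition(" ")

    # Gather NHVAR * NHPTS stored values from the following lines.
    raw = []
    pos = start + 2
    while len(raw) < nhvar * nhpts:
        raw.extend(float(v) for v in lines[pos].split())
        pos += 1

    # Split by field and apply the scale HSCL(i) to field i.
    data = [
        [raw[i * nhpts + j] * scales[i] for j in range(nhpts)]
        for i in range(nhvar)
    ]
    cast = {
        "castid": head[2],
        "lng": float(head[3]),
        "lat": float(head[4]),
        "max_depth": float(head[5]),
        "time": float(head[6]),
        "scales": scales,
        "flag": int(flag),
        "type": htype.strip().strip("'"),
        "data": data,
    }
    return cast, pos


# First cast (the XCTD) from the sample in this file.
SAMPLE_CAST = [
    "3 4 1001 11.6195 36.5342 285.0 10005.4072 1.00E-01 1.00E-03 1.00E-03",
    "0 'XCTD: z t s'",
    "0 18 26 34",
    "23720 23720 23700 23700",
    "36861 36861 36928 36985",
]

if __name__ == "__main__":
    cast, next_line = read_cast(SAMPLE_CAST)
    depth, temperature, salinity = cast["data"]
    print(cast["type"], depth, temperature, salinity)
```

With the sample scales, a stored temperature of 23720 corresponds to about
23.72 Celsius and a stored salinity of 36861 to about 36.861 PSU.
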