Read .dss database file in Python


I have a DSS database file and want to extract its schema using Python. What I actually need is to run queries against this database, but I couldn't find any good documentation to start with. So I decided that if I can extract the schema, I can create an SQLite database and run my experiments there.

A hex dump of the file starts like this:

5a44 5353 8854 0000 6e04 0000 0700 0000
362d 5146 14a2 2001 85a9 8c00 3037 4a55
4c31 3400 3330 4e4f 5631 3700 3132 3a35
393a 3134 0000 0000 0010 0000 0100 0000
2000 0000 0500 0000 7f00 0000 df10 0000
...  ...  ...  ...
...  ...  ...  ...
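
The readable bytes in this dump already hint at the format. As a minimal sketch, assuming the dump lists bytes in file order (as `xxd` does) rather than as byte-swapped 16-bit words, the following decodes the first four rows shown above and extracts the printable ASCII runs. The names `hex_dump` and `printable_runs` are made up for this illustration.

```python
import re

# First four rows of the dump from the question, copied verbatim.
hex_dump = """
5a44 5353 8854 0000 6e04 0000 0700 0000
362d 5146 14a2 2001 85a9 8c00 3037 4a55
4c31 3400 3330 4e4f 5631 3700 3132 3a35
393a 3134 0000 0000 0010 0000 0100 0000
"""

# Assumption: bytes appear in file order (xxd-style), not as swapped words.
raw = bytes.fromhex("".join(hex_dump.split()))


def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes as strings."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]


print(raw[:4])               # b'ZDSS'
print(printable_runs(raw))   # ['ZDSS', '6-QF', '07JUL14', '30NOV17', '12:59:14']
```

The leading `ZDSS` marker fits the HEC-DSS format described in the accepted answer. The remaining strings look like a version tag and date/time stamps, but that reading is only a guess from the bytes themselves.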

Note: I am not familiar with DSS databases.

Any help would be appreciated.


There are 2 answers

gyanz (best answer)

DSS, or HEC-DSS, is a database system developed by the U.S. Army Corps of Engineers' Hydrologic Engineering Center (HEC). It is not a relational database, so there are no table schemas to extract; it is designed to store and retrieve large data sets efficiently when those data sets are not necessarily linked to one another. HEC-DSS supports data types such as time series, paired data (comparable to a pandas DataFrame), spatially oriented gridded data, and others. HEC-DSS references data sets, or records, by their pathnames. A pathname has six parts, labeled "A" through "F", in the form /A/B/C/D/E/F/. HEC-DSS is built into HEC's major programs, such as HEC-RAS, HEC-HMS, and HEC-RTS.
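
To make the six-part pathname concrete, here is a minimal sketch in plain Python that splits a pathname into its "A" through "F" parts. The helper `split_pathname` is hypothetical and not part of any HEC library; the sample pathname is the one used in the time-series example later in this answer.

```python
def split_pathname(pathname):
    """Split a /A/B/C/D/E/F/ pathname into a dict keyed by part letter."""
    if not (pathname.startswith("/") and pathname.endswith("/")):
        raise ValueError("pathname must start and end with '/'")
    # Drop the outer slashes, then split; empty parts are kept as ''.
    parts = pathname[1:-1].split("/")
    if len(parts) != 6:
        raise ValueError(f"expected 6 parts, got {len(parts)}")
    return dict(zip("ABCDEF", parts))


parts = split_pathname("/REGULAR/TIMESERIES/FLOW//1HOUR/Ex1/")
print(parts)
# {'A': 'REGULAR', 'B': 'TIMESERIES', 'C': 'FLOW', 'D': '', 'E': '1HOUR', 'F': 'Ex1'}
```

Note that a part may be empty, as the "D" part is here, so the split must preserve empty strings rather than discard them.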

HEC has released a Java library and HEC-DSSVue, a Java-based visual utility for viewing, editing, and manipulating HEC-DSS files. The Java library can be used from Jython, but Jython lacks numpy, pandas, matplotlib, and other popular scientific libraries. So in late 2017 I began working on pydsstools, a Python library for HEC-DSS. Currently, pydsstools supports the major HEC-DSS data types and basic copy and delete operations, and it works on Windows and Ubuntu-like distributions. The following Python code is an example of reading and plotting time-series data:

from pydsstools.heclib.dss import HecDss
import matplotlib.pyplot as plt
import numpy as np

dss_file = "example.dss"
# Pathname in /A/B/C/D/E/F/ form; the D part is left empty here.
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR/Ex1/"
startDate = "15JUL2019 19:00:00"
endDate = "15JUL2019 21:00:00"

with HecDss.Open(dss_file) as fid:
    # Read the time-series record for the given time window.
    ts = fid.read_ts(pathname, window=(startDate, endDate), trim_missing=True)
    times = np.array(ts.pytimes)
    values = ts.values
    # ts.nodata is a boolean mask; plot only the points that have data.
    plt.plot(times[~ts.nodata], values[~ts.nodata], "o")
    plt.show()

Examples for manipulating other data types (paired data, gridded data) are provided in the README of the pydsstools repository on GitHub.
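
Since the question mentions running experiments in SQLite, one possible route is to copy each time series into a simple table once it has been read. The sketch below uses only the standard library. The table name `timeseries`, its column layout, and the sample numbers are all assumptions made for illustration; in practice `times` and `values` would come from `ts.pytimes` and `ts.values` in the example above.

```python
import sqlite3
from datetime import datetime

# Placeholder data standing in for ts.pytimes and ts.values (invented numbers).
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR/Ex1/"
times = [
    datetime(2019, 7, 15, 19),
    datetime(2019, 7, 15, 20),
    datetime(2019, 7, 15, 21),
]
values = [1.0, 2.5, 4.0]

conn = sqlite3.connect(":memory:")  # use a file path to persist the data
# Hypothetical layout: one row per (pathname, timestamp) pair.
conn.execute(
    "CREATE TABLE timeseries ("
    "pathname TEXT NOT NULL, time TEXT NOT NULL, value REAL)"
)
rows = [(pathname, t.isoformat(), float(v)) for t, v in zip(times, values)]
conn.executemany("INSERT INTO timeseries VALUES (?, ?, ?)", rows)
conn.commit()

# Ordinary SQL now works against the copied records.
count, peak = conn.execute(
    "SELECT COUNT(*), MAX(value) FROM timeseries WHERE pathname = ?",
    (pathname,),
).fetchone()
print(count, peak)  # 3 4.0
```

Keeping the pathname as a column lets several records share one table, which mirrors how HEC-DSS identifies records by pathname rather than by table.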

Mattchoo

If you are in a Windows environment, this library looks promising. I'm on Linux, so I couldn't use it, but its examples show what I wanted to do:

https://github.com/gyanz/pydsstools