PyHDF

1. Overview

PyHDF is a Python interface to the HDF4 library. It covers most functions of the Scientific Data Sets (SD API), Vdatas (VS API), and Vgroups (V API) interfaces. PyHDF is more than a thin wrapper around the HDF4 C API: it uses Python features such as object-oriented classes and exception handling to make the library more convenient to use.

At the time this document was written, the latest version was 0.8.2, released in August 2008 and built with HDF4.2r1. It also worked with HDF4.2r3.

2. Installation

Like other Python libraries, PyHDF comes with a setup.py script. Users may need to set include_dirs and library_dirs to point to the HDF4 installation. Although PyHDF was developed for HDF4.2r1, we did not find any problems with HDF4.2r3. If HDF4 was not built with SZIP, the libraries option needs to be changed.
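As a rough sketch of the settings involved, the fragment below shows how these values might look in setup.py. The paths are placeholders, and the helper hdf4_libraries is hypothetical (it is not part of PyHDF). The library names assume a typical HDF4 build that links against mfhdf, df, jpeg and z, plus sz when SZIP support is enabled; check the local installation before relying on them.

```python
# Hypothetical values; adjust the paths to the local HDF4 installation.
include_dirs = ["/usr/local/hdf4/include"]  # directory holding hdf.h and mfhdf.h
library_dirs = ["/usr/local/hdf4/lib"]      # directory holding the HDF4 libraries


def hdf4_libraries(with_szip=True):
    """Return the libraries to link, adding SZIP only when HDF4 was built with it."""
    libs = ["mfhdf", "df", "jpeg", "z"]
    if with_szip:
        libs.append("sz")
    return libs


# Example: HDF4 was built without SZIP, so "sz" is left out.
libraries = hdf4_libraries(with_szip=False)
```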

PyHDF 0.7-3 requires the Numeric package, which has since been replaced by the numpy[1] package. PyHDF was successfully built with Numeric-24-2.

3. How to Use

PyHDF is similar to the HDF4 C API in that most functions have the same or similar names and functionality, but they are grouped into a few classes. For example, the SD API is divided into five Python classes, including SD, SDS, SDim and SDAttr. Figure 1 shows part of a program that creates a Scientific Data Set.

Figure 1 Python code creating an HDF4 SDS with PyHDF interface
from pyhdf.SD import SD, SDC
# Import the array constructor and the 16-bit integer type from NumPy
from numpy import array, int16

data = array(((1, 2, 3),
              (4, 5, 6)), int16)

# Create an HDF file
sd = SD("hello.hdf", SDC.WRITE | SDC.CREATE)

# Create a dataset
sds = sd.create("sds1", SDC.INT16, (2, 3))

# Set the fill value used for elements that are never written
sds.setfillvalue(0)

# Set dimension names
dim1 = sds.dim(0)
dim1.setname("row")
dim2 = sds.dim(1)
dim2.setname("col")

# Assign an attribute to the dataset
sds.units = "miles"

# Write data
sds[:] = data

# Close the dataset
sds.endaccess()

# Flush and close the HDF file
sd.end()

The code in Figure 1 creates an HDF4 file and an SDS object in it. This code is straightforward to those who are familiar with HDF4. As Table 1 shows, many PyHDF interfaces are equivalent to HDF4 C interfaces.

PyHDF API           Equivalent HDF4 C API
SD (constructor)    SDstart
SD.create           SDcreate
SDS.setfillvalue    SDsetfillvalue
SDS.dim             SDgetdimid
SDim.setname        SDsetdimname
SDS.endaccess       SDendaccess
SD.end              SDend
Table 1 PyHDF API and equivalent HDF4 C API

The statement starting with sd = SD() creates an SD instance and is equivalent to calling the SDstart() function. The SD class implements operations that apply to a whole file, such as creating the file or setting a global attribute. PyHDF has no counterpart to the SD interface identifier that SDstart() returns, because the SD class encapsulates both the data and the operations on it.
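To illustrate how the SD object stands in for the interface identifier, here is a minimal sketch that reopens a file such as the one written in Figure 1 and lists its contents. The helper name summarize_file is made up for this example; SD.datasets() and SD.attributes() are PyHDF methods. The import sits inside the function only so that the helper can be defined on a machine where PyHDF is not installed.

```python
def summarize_file(path):
    """Return (dataset names, global attributes) for the HDF4 file at `path`."""
    # Imported here so the helper can be defined even where pyhdf is missing.
    from pyhdf.SD import SD, SDC

    sd = SD(path, SDC.READ)  # plays the role of SDstart(); no identifier to track
    try:
        names = sorted(sd.datasets())    # datasets() maps each dataset name to its description
        global_attrs = sd.attributes()   # file-level (global) attributes as a dictionary
    finally:
        sd.end()                         # plays the role of SDend()
    return names, global_attrs
```

Calling summarize_file("hello.hdf") after running Figure 1 should report the sds1 dataset among the names. Every operation is a method call on sd, so no identifier has to be passed from one call to the next.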

The statement starting with sds.units sets an attribute on the SDS object. This is equivalent to calling the SDsetattr() C function. The next statement, sds[:] = data, writes the actual values to the file, as SDwritedata() does.
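The same attribute and slicing conventions apply when reading. The sketch below, which assumes a file laid out like the one from Figure 1, reads a dataset and its attributes back. The helper name read_dataset is hypothetical; SD.select(), SDS.attributes() and slice indexing on an SDS are PyHDF features. As before, the import is placed inside the function only so the definition does not require PyHDF to be installed.

```python
def read_dataset(path, name):
    """Return (values, attributes) for the SDS called `name` in the file at `path`."""
    # Imported here so the helper can be defined even where pyhdf is missing.
    from pyhdf.SD import SD, SDC

    sd = SD(path, SDC.READ)
    try:
        sds = sd.select(name)         # look the dataset up by name
        try:
            values = sds[:]           # slice read, the counterpart of SDreaddata()
            attrs = sds.attributes()  # dataset attributes, such as "units" in Figure 1
        finally:
            sds.endaccess()
    finally:
        sd.end()
    return values, attrs
```

A call such as values, attrs = read_dataset("hello.hdf", "sds1") would return the array written in Figure 1 together with a dictionary holding the units attribute.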

Both the V API and the VS API are divided into a few classes and encapsulated in the same way as the SD API. This eliminates the need for identifiers and may improve readability.
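As an illustration of the same pattern in the VS API, the following sketch creates a small Vdata. It is based on our understanding of the pyhdf.HDF and pyhdf.VS modules and should be checked against the installed version. The helper name write_vdata, the Vdata name INVENTORY and the field layout are invented for this example.

```python
def write_vdata(path, records):
    """Create the HDF4 file at `path` and store `records` in a two-field Vdata."""
    # Imported here so the helper can be defined even where pyhdf is missing.
    from pyhdf.HDF import HDF, HC

    f = HDF(path, HC.WRITE | HC.CREATE)  # open the file through the HDF class
    vs = f.vstart()                      # start the VS interface on this file
    # Each field is described by (name, type, order).
    vd = vs.create("INVENTORY",
                   (("partid", HC.CHAR8, 5),
                    ("qty", HC.INT16, 1)))
    vd.write(records)                    # one sequence per record, one item per field
    vd.detach()                          # release the Vdata
    vs.end()                             # close the VS interface
    f.close()                            # close the file
```

For example, write_vdata("inventory.hdf", [["Q1234", 12], ["B5432", 7]]) would store two records. The file, the interface and the Vdata are each represented by an object, so no file or Vdata identifiers appear in the code.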

See also the examples on the PyHDF SourceForge project site.

4. References


Last modified: 12/15/2010
Sponsored by Subcontract number 114820 under Raytheon Contract number NNG10HP02C, funded by NASA / Maintained by The HDF Group