How to Read and Visualize HDF-EOS2 data Using MATLAB

MATLABexternal provides APIs for accessing several HDF4external application programming interfaces (APIs) including HDF-EOS2 . In order to use these APIs user must be familiar with the HDF4 library. This page presents a few examples on how to read and visualize HDF-EOS2 data via MATLAB.

MATLAB provides functions to read HDF-EOS grid (GD), point (PT), and swath (SW) data. Since each of them is equivalent to an HDF-EOS2 C API, those who are familiar with the HDF-EOS2 library can easily learn how to handle HDF-EOS2 files in MATLAB.


HDF-EOS grid data is special in that it does not store longitude and latitude values. Instead of storing those values, HDF-EOS grid keeps only several parameters that can generate the entire longitude and latitude values. This can save much space, but this also makes it difficult to retrieve those values unless HDF-EOS API is used.

The degree of difficulty of obtaining longitude and latitude values depends on the projection method. The simplest projection is the geographic projection. We will cover this case using a real AMSR-E AE_RnGdexternal file from NSIDC. You can download the file here.

To access a data field in a grid data, the enclosing HDF-EOS2 file should be opened using hdfgd, first. Using the descriptor returned by that API, the grid should be opened. In this example, we will access a grid named MonthlyRainTotal_GeoGrid in an HDF-EOS2 file named AMSR_E_L3_RainGrid_B05_200707.hdf.

Figure 1 Accessing a grid

file_id = hdfgd('open', FILE_NAME, 'rdonly')
grid_id = hdfgd('attach', file_id, GRID_NAME)
hdfgd is the function which MATLAB provides to read data from an HDF-EOS2 grid file. The first argument that it takes always reflects the task it is supposed to do. In the figure above, open would open the file which is similar to GDopen API from the HDF-EOS2 C library. It will create a handle which is stored in file_id. Using this file_id descriptor, the grid is opened in the next step using attach.

Then, data fields, dimensions and attributes in this grid can be accessed.

Figure 2 Reading a data field in a grid
[rrland, fail] = hdfgd('readfield', grid_id, DATAFIELD_NAME, [], [], [])
The data in a grid can be read out by using readfield. The name of the data field being read is RrLandRain. Note that grid_id, is the descriptor returned by attach. rrland is the buffer where the data is stored. Unlike using C APIs, users do not need to allocate memory for this buffer; memory will be automatically allocated. Also, users do not need to specify the type of rrland because MATLAB is a dynamically typed language.

After finishing reading all fields, one needs to close the grid using detach. Similarly, if accessing the grid file is done, one needs to close the file using close.

Figure 3 Closing the grid and the file
hdfgd('detach', grid_id);
hdfgd('close', file_id);
grid_id is what attach returned, and file_id is what open returned.

A few more steps are required to visualize the data field we just read. To generate a more meaningful plot, the data field needs to be associated with coordinates. Users need to provide longitude and latitude values. Note that the following explanation only applies to the geographic projection. For more information on how to retrieve latitude and longitude from an HDF-EOS2 grid, please check here .

The geographic projection, which is used by MonthlyRainTotal_GeoGrid, maps meridians to equally spaced vertical straight lines, and circles of latitude to evenly spread horizontal straight lines[1]. This implies that we can precisely interpolate all longitude and latitude values if we know the followings:

From the range and the number of points, we can get the space between two adjacent points, which can be called the scale. Then, we can say i-th longitude value is (i + offsetX) * scaleX + leftX. offsetX is given by Pixel Registration. This is out of scope of this page, the value of offsetX and offsetY can be 0, 0.5 or 1. For more information, refer to the HDF-EOS2 Reference[2].

We need to retrieve those four types of information. We will use gridinfo API to get the values for all the required variables.

Figure 4 Retrieving information about a grid structure
[xdimsize, ydimsize, upleft, lowright, status] = hdfgd('gridinfo', grid_id);
The output of the above line of code can be seen in Figure 5.

xdimsize (72) and ydimsize (28) are the values for numX and numY, respectively. Also, upleft and lowright represents coordinates of upper left corner and lower right corner respectively in DMS format[3]. One needs to convert these coordinates in degree. After conversion, one can find these values as 0 degree, 70 degree, 360 degree and -70 degree. As the name of upleft and lowright imply, we can get 0 as leftX, 360 as rightX, 70 as upperY and -70 as lowerY.

Another factor that affects the longitude and latitude values is the Pixel Registration option. Since this option can not be obtained. HDFE_CENTER is assumed. HDFE_CENTER assigns 0.5 to both offsetX and offsetY.

Now, we have enough information from the file.

From the above information, one can calulate the scaleX and the value is 5 degree. Similarly, the value of scaleY is -5 degree. The longitude values are (i + 0.5) * 5 + 0 where i = 0, 1, 2, ..., 71. Similarly, the latitude values are (j + 0.5) * -5 + 70 where j = 0, 1, 2, ..., 27. One can find that the longitude range is from 2.5 to 357.5 and the latitude range is from 67.5 to -67.5.
Figure 6 Calculating latitude and longitude

for i = 0:(xdimsize-1)
lon_value(i+1) = (i+offsetX)*(scaleX) + upleft_d(1);

for j = 0:(ydimsize-1)
lat_value(j+1) = (j+offsetY)*(scaleY) + upleft_d(2);

Generating the correct plot with a geographic map still requires several adjustments for the arrangments of data and latitude and longitude.

We need to pass the latitude, longitude and rrland values to the contour function in order to plot the data. To ensure the plot to be aligned with the world map generated by the geoshow function, the first argument of contour must be longitude and the second argument must be latitude. According to Figure 5, the size of rrland is [72 28], which implies that the first dimension is xdim(lon) and the second dimension is ydim(lat) in MATLAB. This is also required by the contour fuction, which needs lon to be the first argument and lat to be the second argument to ensure the plot to be overlaid with the world map correctly. However, the fastest changing dimension of rrland is along the same longitude line. The size is 72. This can be observed by Figure 7.

From the above figure, we find that the rrland is really stored as [28 72] in the file. Therefore a transpose of the rrland is needed to generate the correct plot.

Displaying the world map with the geoshow function also requires the longitude range starting from -180 degree and ending at 180 degree. So the longitude needs to be translated from '2.5 degree to 357.5 degree' to '-177.5 degree to 177.5 degree'. The latitude also needs to be translated from "north to south"(decreasing of the latitude) to "south to north"(increasing of the latitude) with the geoshow function. Accordingly, the latitude needs to be translated from '67.5 degree to -67.5 degree' to '-67.5 degree to 67.5 degree'.

Also the lower left corner in MATLAB is treated as the origin of the coordinate. However, as we can see from the information obtained by gridinfo, the origin is defined as the upper left corner in the file. Hence, we need to flip rrland before passing it to the contour function. The code section is listed below.

Figure 8 Adjusting the data and geo-location information
ts = transpose(rrland);
halfx= floor(xdimsize/2);
ts_reverse = [ts(:, (halfx+1):xdimsize) ts(:, 1:halfx)];
data = flipud(ts_reverse);

lon_offset = -180;
lon_value = lon_offset + lon_value;
lat_value = fliplr(lat_value);

Finally, one can use the contour function to draw a plot using data, lon and lat calculated above. Since Many options are provided for using the contour function, users may need to refer to MATLAB's detailed document for more information about this function.

Figure 9 Visualizing a data field
contour(lon, lat, data)
geoshow('landareas.shp', 'FaceColor', [0.4 0.4 0.4])

You can see the complete code from here. Figure 10 shows the result.

Swath data

Unlike grid data, swath data contains geo-location fields such as Longitude and Latitude. This allows users to draw plots without the extra step to calculate longitude and latitude values; these values can be read from geo-location fields. However, swath data may contain dimension maps that allow the size of geo-location fields smaller or larger than that of the related data fields. If this is the case, retrieving longitude and latitude values becomes complicated. We will not cover the case when dimension maps are used to handle a swath.

We will use one swath file from AMSR-E/Aqua L2A Global Swath Spatially-Resampled Brightness Temperatures product. You can get one sample file here.

The step to open, read and close is similar.

Figure 11 Accessing a swath

file_id = hdfsw('open', FILE_NAME, 'rdonly')
swath_id = hdfsw('attach', file_id, SWATH_NAME)

[data, fail] = hdfsw('readfield', swath_id, DATAFIELD_NAME, [], [], [])
[lon, status] = hdfsw('readfield', swath_id, 'Longitude', [], [], [])
[lat, status] = hdfsw('readfield', swath_id, 'Latitude', [], [], [])

hdfsw('detach', swath_id);
hdfsw('close', file_id);
The name of APIs is slightly different; hdfsw is used instead of hdfgd. As the function name implies, functions that contain sw are for swath, and the functions that contain gd are for grid. The above code reads one data field 23.8H_Approx._Res.3_TB_(not-resampled) and two geo-location fields, Latitude and Longitude. Usually, a swath data contains both geo-location fields, and they provide longitude values and latitude values.

As the readfield function used to handle a grid file, the readfield function for the swath file in the above piece of code stores the data elements in the variable data. Now that lat and lon contain the latitude and longitude values respectively, one can pass those variables to contour function as the following arguments:

Figure 12 Visualizing a swath field
contourf(lon, lat, data)
geoshow('landareas.shp', 'FaceColor', 'white', 'FaceAlpha', 0)
Here the function geoshow creates the world map, with land area colored white. The fifth and the sixth parameters of the geoshow function are FaceAlpha and 0. These parameters imply that FaceAlpha value is set to zero. This will clear the land areas of the map where the data overlaps with the map so that the plot of the data will show up in those areas. In this example, the first and second arguments of contour are two-dimensional because both Longitude and Latitude are two-dimensional. Then, each element of data is mapped to each element of lat and lon.

You can see the complete code here. Figure 13 shows the result.


Last modified: 09/10/2015
About Us | Contact Info | Archive Info | Disclaimer
Sponsored by Subcontract under Raytheon Contract number NNG15HZ39C, funded by NASA / Maintained by The HDF Group