17.2.5.1.3. Low-level IO modules

Module containing NetCDF file reading functions

cis.data_io.netcdf.find_missing_value(var)

Get the missing / fill value of the variable

Parameters:var – NetCDF Variable instance
Returns:missing / fill value
cis.data_io.netcdf.get_data(var)

Reads raw data from a NetCDF.Variable instance.

Parameters:var – The specific Variable instance to read
Returns:A numpy masked array. Missing values are masked (True in the mask).
cis.data_io.netcdf.get_metadata(var)

Retrieves all metadata

Parameters:var – the Variable to read metadata from
Returns:A metadata object
cis.data_io.netcdf.get_netcdf_file_attributes(filename)

Get all the global attributes from a NetCDF file

Parameters:filename – The filename of the file to get the variables from
Returns:a dictionary of attributes and their values
cis.data_io.netcdf.get_netcdf_file_variables(filename, exclude_coords=False)

Get all the variables contained in a NetCDF file. Variables in NetCDF4 Hierarchical groups are returned with their fully qualified variable name in the form <group1>.<group2....>.<variable_name>, e.g. AVHRR.Ch4CentralWavenumber.

Parameters:
  • filename – The filename of the file to get the variables from
  • exclude_coords – Exclude coordinate variables if True
Returns:

An OrderedDict containing {variable_name: NetCDF Variable instance}
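
The fully qualified naming scheme described above can be illustrated with a minimal sketch. The helper split_qualified_name is hypothetical and not part of CIS; it assumes that group and variable names contain no dots themselves.

```python
def split_qualified_name(qualified_name):
    """Split '<group1>.<group2>.<variable_name>' into (groups, variable_name)."""
    # Everything before the last dot is the group path; the last part is the variable.
    *groups, variable_name = qualified_name.split(".")
    return groups, variable_name


groups, name = split_qualified_name("AVHRR.Ch4CentralWavenumber")
print(groups, name)
```

A variable in the root group has no dots in its name, so the sketch returns an empty group list for it.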

cis.data_io.netcdf.read(filename, usr_variables)

Reads a Variable from a NetCDF file

Parameters:
  • filename – The name (with path) of the NetCDF file to read.
  • usr_variables – A variable (dataset) name to read from the files. The name must appear exactly as in the NetCDF file. Variable names may be fully qualified NetCDF4 Hierarchical group variables in the form <group1>.<group2....>.<variable_name>, e.g. AVHRR.Ch4CentralWavenumber.
Returns:

A Variable instance constructed from the input file

cis.data_io.netcdf.read_many_files(filenames, usr_variables, dim=None)

Reads a single Variable from many NetCDF files. This method uses the netCDF4 MFDataset class and so is NOT suitable for NetCDF4 datasets (only ‘CLASSIC’ netcdf).

Parameters:
  • filenames – A list of NetCDF filenames to read, or a string with wildcards.
  • usr_variables – A list of variable (dataset) names to read from the files. The names must appear exactly as in the NetCDF file.
  • dim – The name of the dimension on which to aggregate the data. The default, None, tries to aggregate over the unlimited dimension.
Returns:

A list of variable instances constructed from all of the input files

cis.data_io.netcdf.read_many_files_individually(filenames, usr_variables)

Read multiple Variables from many NetCDF files manually - i.e. not with MFDataset as this doesn’t always work, in particular for NetCDF4 files.

Parameters:
  • filenames – A list of NetCDF filenames to read, or a string with wildcards.
  • usr_variables – A list of variable (dataset) names to read from the files. The names must appear exactly as in the NetCDF file. Variable names may be fully qualified NetCDF4 Hierarchical group variables in the form <group1>.<group2....>.<variable_name>, e.g. AVHRR.Ch4CentralWavenumber.
Returns:

A dictionary of lists of variable instances constructed from all of the input files with the fully qualified variable name as the key
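
The shape of this return value can be sketched as follows. This is not the CIS implementation: collect_individually and the read_one callable are hypothetical stand-ins for whatever opens a single file and returns a single variable, and the use of OrderedDict is an assumption.

```python
from collections import OrderedDict


def collect_individually(filenames, usr_variables, read_one):
    """Build {variable_name: [one entry per file]}.

    read_one(filename, name) is a hypothetical callable standing in for
    the code that opens one file and returns one variable from it.
    """
    collected = OrderedDict((name, []) for name in usr_variables)
    for filename in filenames:
        for name in usr_variables:
            collected[name].append(read_one(filename, name))
    return collected


def fake_read(filename, name):
    # Stand-in reader that only records what it was asked for.
    return (filename, name)


result = collect_individually(
    ["a.nc", "b.nc"], ["AVHRR.Ch4CentralWavenumber"], fake_read
)
```

Each key maps to a list with one entry per input file, in the order the files were given.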

cis.data_io.netcdf.remove_variables_with_non_spatiotemporal_dimensions(variables, spatiotemporal_var_names)

Remove from a list of netCDF variables any which have dimensionality which is not in an approved list of valid spatial or temporal dimensions (e.g. sensor number, pseudo dimensions). CIS currently does not support variables with this dimensionality and will fail if they are used.

Parameters:
  • variables – Dictionary of netCDF variable names : Variable objects. Variable names may be fully qualified NetCDF4 Hierarchical group variables in the form <group1>.<group2....>.<variable_name>, e.g. AVHRR.Ch4CentralWavenumber.
  • spatiotemporal_var_names – List of valid spatiotemporal dimensions.
Returns:

None
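
Because the function returns None, the filtering presumably happens in place on the dictionary that is passed in. A minimal sketch of that pattern, assuming each variable exposes a dimensions tuple of dimension names (FakeVariable and drop_non_spatiotemporal are hypothetical names, not CIS code):

```python
from collections import namedtuple

# Stand-in for a NetCDF Variable: only the 'dimensions' tuple matters here.
FakeVariable = namedtuple("FakeVariable", ["dimensions"])


def drop_non_spatiotemporal(variables, spatiotemporal_var_names):
    """Delete, in place, every variable with a dimension outside the valid list."""
    valid = set(spatiotemporal_var_names)
    # Iterate over a copy of the keys because the dict is modified in the loop.
    for name in list(variables):
        if not set(variables[name].dimensions) <= valid:
            del variables[name]


variables = {
    "temperature": FakeVariable(("time", "lat", "lon")),
    "AVHRR.Ch4CentralWavenumber": FakeVariable(("sensor_number",)),
}
drop_non_spatiotemporal(variables, ["time", "lat", "lon"])
print(sorted(variables))  # ['temperature']
```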

Module for writing data to NetCDF files

cis.data_io.write_netcdf.add_data_to_file(data_object, filename)
Parameters:
  • data_object
  • filename
Returns:

cis.data_io.write_netcdf.write(data_object, filename)
Parameters:
  • data_object
  • filename
Returns:

cis.data_io.write_netcdf.write_coordinate_list(coord_list, filename)

Writes coordinates to a netCDF file.

Parameters:
  • coord_list – list of Coord objects
  • filename – file to which to write
cis.data_io.write_netcdf.write_coordinates(coords, filename)

Writes coordinates to a netCDF file.

Parameters:
  • coords – UngriddedData or UngriddedCoordinates object for which the coordinates are to be written
  • filename – file to which to write
cis.data_io.hdf.get_hdf4_file_metadata(filename)

This returns a dictionary of file attributes, which often contains metadata information about the whole file. The value of each attribute can simply be a big string which will often need to be parsed manually thereafter.

Parameters:filename – the HDF4 file to read the attributes from
Returns:dictionary of string attributes

cis.data_io.hdf.get_hdf4_file_variables(filename, data_type=None)

Get all variables from a file containing ungridded data. Concatenates the variables from both VD and SD data.

Parameters:
  • filename – The filename of the file to get the variables from
  • data_type – String representing the HDF data type, i.e. ‘VD’ or ‘SD’. If None, both are computed.
cis.data_io.hdf.read(filenames, variables)
cis.data_io.hdf.read_data(data_dict, data_type, missing_values=None)
cis.data_io.hdf.read_metadata(data_dict, data_type)

Module containing hdf file utility functions for the SD object

class cis.data_io.hdf_sd.HDF_SDS(filename, variable)

Bases: object

This class is used in place of the pyhdf.SD.SDS class to allow the file contents to be loaded at a later time rather than in this module's read method (so that we can close the SD instances and free up file handles).

attributes()

Call pyhdf.SD.SDS.attributes(), opening and closing the file

get()

Call pyhdf.SD.SDS.get(), opening and closing the file

info()

Call pyhdf.SD.SDS.info(), opening and closing the file
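
The open-on-demand pattern described for this class can be sketched with an ordinary file in place of pyhdf. LazyFileHandle is a hypothetical illustration, not the HDF_SDS implementation: it stores only the filename and opens and closes the file inside each call, so no file handle is held between calls.

```python
import os
import tempfile


class LazyFileHandle:
    """Remember where the data lives; open the file only when asked."""

    def __init__(self, filename):
        self.filename = filename  # nothing is opened here

    def get(self):
        # Open, read and close within a single call, so no handle stays open.
        with open(self.filename, "rb") as f:
            return f.read()


# Demonstrate with a throwaway file.
fd, path = tempfile.mkstemp()
os.write(fd, b"payload")
os.close(fd)

handle = LazyFileHandle(path)
data = handle.get()
os.remove(path)
```

The cost of this design is one open and close per call; the benefit is that many handles can exist without exhausting the available file descriptors.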

cis.data_io.hdf_sd.get_calipso_data(sds)

Reads raw data from an SD instance. Automatically applies the scaling factors and offsets to the data arrays found in Calipso data.

Parameters:sds – The specific sds instance to read
Returns:A numpy array containing the raw data with missing data replaced by NaN.
cis.data_io.hdf_sd.get_data(sds, missing_values=None)

Reads raw data from an SD instance. Automatically applies the scaling factors and offsets to the data arrays often found in NASA HDF-EOS data (e.g. MODIS)

Parameters:sds – The specific sds instance to read
Returns:A numpy array containing the raw data with missing data replaced by NaN.
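
As a rough illustration of applying a scale factor and offset while replacing missing values with NaN, the sketch below uses one common convention, (raw - add_offset) * scale_factor. The formula actually used by CIS is not stated here and may differ between products, so the order of operations and the names scale_factor and add_offset are assumptions. Plain Python lists stand in for numpy arrays so that the sketch has no dependencies.

```python
import math


def apply_scale_and_offset(raw, scale_factor, add_offset, missing_values):
    """Return (raw - add_offset) * scale_factor, with missing entries as NaN."""
    missing = set(missing_values)
    return [
        math.nan if value in missing else (value - add_offset) * scale_factor
        for value in raw
    ]


scaled = apply_scale_and_offset(
    [10, -9999, 30], scale_factor=0.5, add_offset=10, missing_values=[-9999]
)
```

The missing check is made on the stored value, before any scaling, because fill values are defined in terms of the raw data.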
cis.data_io.hdf_sd.get_hdf_SD_file_variables(filename)

Get all the variables from an HDF SD file

Parameters:filename (str) – The filename of the file to get the variables from
Returns:An OrderedDict containing the variables from the file
cis.data_io.hdf_sd.get_metadata(sds)
cis.data_io.hdf_sd.read(filename, variables=None, datadict=None)

Reads SD from a HDF4 file into a dictionary.

Parameters:
  • filename (str) – The name (with path) of the HDF file to read.
  • variables (iterable) – A sequence of variable (dataset) names to read from the file (default None, causing all variables to be read). The names must appear exactly as in the HDF file.
  • datadict (dict) – Optional dictionary to add data to, otherwise a new, empty dictionary is created
Returns:

A dictionary containing data for requested variables. Missing data is replaced by NaN.

Module containing hdf file utility functions for the VD object

class cis.data_io.hdf_vd.VDS

Bases: cis.data_io.hdf_vd.VDS

cis.data_io.hdf_vd.get_data(vds, first_record=False, missing_values=None)

Actually read the data from the VDS handle. We shouldn’t need to check for HDF being installed here because the VDS object which is being passed to us can only have come from pyhdf.

Parameters:
  • vds
  • first_record
  • missing_values
Returns:

cis.data_io.hdf_vd.get_hdf_VD_file_variables(filename)

Get all the variables from an HDF VD file

Parameters:filename – The filename of the file to get the variables from
Returns:An OrderedDict containing the variables from the file
cis.data_io.hdf_vd.get_metadata(vds)
cis.data_io.hdf_vd.read(filename, variables=None, datadict=None)

Given a filename and a list of variable names, return a dictionary of VD data handles.

Parameters:
  • filename – full path to a single HDF4 file
  • variables – A list of variables to read, if no variables are given, no variables are read
  • datadict – A dictionary of variable name, data handle pairs to be appended to
Returns:

An updated datadict with any new variables appended.
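
The datadict behaviour described above can be sketched as follows. read_into is a hypothetical stand-in, not the CIS function: a tuple takes the place of a real VD data handle, and updating the caller's dictionary in place (rather than copying it) is an assumption.

```python
def read_into(filename, variables=None, datadict=None):
    """Add one placeholder handle per requested variable to datadict."""
    if datadict is None:
        datadict = {}
    # With no variables given, the loop body never runs and nothing is read.
    for name in variables or []:
        datadict[name] = (filename, name)  # placeholder for a real data handle
    return datadict


first = read_into("f.hdf", ["Latitude"])
second = read_into("g.hdf", ["Longitude"], first)
```

Passing the result of one call as the datadict of the next accumulates handles from several calls in a single dictionary.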

cis.data_io.aeronet.get_aeronet_file_variables(filename)
cis.data_io.aeronet.get_file_metadata(filename, variable='', shape=None)
cis.data_io.aeronet.load_aeronet(fname, variables=None)

Loads an AERONET level 2.0 CSV file.

Originally from http://code.google.com/p/metamet/ License: GNU GPL v3
Parameters:
  • fname – data file name
  • variables – A list of variables to return
Returns:

A

cis.data_io.aeronet.load_multiple_aeronet(fnames, variables=None)