18.2.5.1.3. Low-level IO modules¶
Module containing NetCDF file reading functions
-
cis.data_io.netcdf.
apply_offset_and_scaling
(data, add_offset=None, scale_factor=None)¶ Apply a standard offset and scaling to the data. This is deliberately very similar to the python-NetCDF4 implementation as it is anticipated that we can remove it if/when that library implements valid_range masking.
Parameters: - data (ndarray) – Data to scale
- add_offset (float) –
- scale_factor (float) –
Return ndarray: Scaled data
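The usual NetCDF unpacking convention is `unpacked = packed * scale_factor + add_offset`. The sketch below illustrates only that arithmetic, on a plain Python list so that it runs without dependencies. The name `unpack_values` is hypothetical, and this is not the CIS implementation, which works on an ndarray.

```python
def unpack_values(values, add_offset=None, scale_factor=None):
    """Apply the scale first, then the offset; None means 'leave unchanged'."""
    scale = 1.0 if scale_factor is None else scale_factor
    offset = 0.0 if add_offset is None else add_offset
    return [v * scale + offset for v in values]


# Hypothetical packed values, e.g. temperatures stored as small integers.
packed = [0, 100, 250]
unpacked = unpack_values(packed, add_offset=273.15, scale_factor=0.01)
```

With a NumPy array the same expression applies element-wise without the explicit loop.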
-
cis.data_io.netcdf.
find_missing_value
(var)¶ Get the missing / fill value of the variable
Parameters: var – NetCDF Variable instance Returns: missing / fill value
-
cis.data_io.netcdf.
get_data
(var)¶ Reads raw data from a NetCDF.Variable instance. Also applies CF-compliant valid max, min and ranges.
Parameters: var – The specific Variable instance to read Returns: A numpy maskedarray. Missing values are False in the mask.
-
cis.data_io.netcdf.
get_metadata
(var)¶ Retrieves all metadata
Parameters: var – the Variable to read metadata from Returns: A metadata object
-
cis.data_io.netcdf.
get_netcdf_file_attributes
(filename)¶ Get all the global attributes from a NetCDF file
Parameters: filename – The filename of the file to get the variables from Returns: a dictionary of attributes and their values
-
cis.data_io.netcdf.
get_netcdf_file_variables
(filename, exclude_coords=False)¶ Get all the variables contained in a NetCDF file. Variables in NetCDF4 Hierarchical groups are returned with their fully qualified variable name in the form
<group1>/<group2....>/<variable_name>
, e.g. AVHRR/Ch4CentralWavenumber.
Parameters: - filename – The filename of the file to get the variables from
- exclude_coords – Exclude coordinate variables if True
Returns: An OrderedDict containing {variable_name: NetCDF Variable instance}
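To make the fully qualified naming scheme concrete, the following sketch flattens a nested group structure into `<group>/<variable_name>` keys. Nested dictionaries stand in for NetCDF4 groups here, and `flatten_groups` is a hypothetical helper, so this shows the naming rule only and is not how CIS walks a real file.

```python
from collections import OrderedDict


def flatten_groups(group, prefix=""):
    """Map '<group1>/<group2>/<name>' to each variable in a nested group."""
    flat = OrderedDict()
    for name, var in group["variables"].items():
        flat[prefix + name] = var
    for name, child in group["groups"].items():
        flat.update(flatten_groups(child, prefix + name + "/"))
    return flat


# Stand-in for a file with one root variable and one grouped variable.
root = {
    "variables": OrderedDict([("time", "time-var")]),
    "groups": OrderedDict([
        ("AVHRR", {
            "variables": OrderedDict([("Ch4CentralWavenumber", "ch4-var")]),
            "groups": OrderedDict(),
        }),
    ]),
}
names = list(flatten_groups(root))
```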
-
cis.data_io.netcdf.
read
(filename, usr_variables)¶ Reads a Variable from a NetCDF file
Parameters: - filename – The name (with path) of the NetCDF file to read.
- usr_variables – A variable (dataset) name to read from the file. The name must appear exactly as in the NetCDF file. Variable names may be fully qualified NetCDF4 Hierarchical group variables in the form <group1>/<group2....>/<variable_name>, e.g. AVHRR/Ch4CentralWavenumber.
Returns: A Variable instance constructed from the input file
-
cis.data_io.netcdf.
read_many_files
(filenames, usr_variables, dim=None)¶ Reads a single Variable from many NetCDF files. This method uses the netCDF4 MFDataset class and so is NOT suitable for NetCDF4 datasets (only ‘CLASSIC’ netcdf).
Parameters: - filenames – A list of NetCDF filenames to read, or a string with wildcards.
- usr_variables – A list of variable (dataset) names to read from the files. The names must appear exactly as in the NetCDF file.
- dim – The name of the dimension on which to aggregate the data. None is the default which tries to aggregate over the unlimited dimension
Returns: A list of variable instances constructed from all of the input files
-
cis.data_io.netcdf.
read_many_files_individually
(filenames, usr_variables)¶ Read multiple Variables from many NetCDF files manually - i.e. not with MFDataset as this doesn’t always work, in particular for NetCDF4 files.
Parameters: - filenames – A list of NetCDF filenames to read, or a string with wildcards.
- usr_variables – A list of variable (dataset) names to read from the files. The names must appear exactly as in the NetCDF file. Variable names may be fully qualified NetCDF4 Hierarchical group variables in the form <group1>/<group2....>/<variable_name>, e.g. AVHRR/Ch4CentralWavenumber.
Returns: A dictionary of lists of variable instances constructed from all of the input files with the fully qualified variable name as the key
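The shape of the return value (one list per variable name, with one entry per file) can be sketched as follows. `read_individually` and `fake_read` are hypothetical stand-ins that return tuples instead of real NetCDF variables, so only the structure of the result is illustrated.

```python
from collections import OrderedDict


def read_individually(filenames, variable_names, read_one):
    """Return {variable_name: [one object per file]} in file order."""
    result = OrderedDict((name, []) for name in variable_names)
    for filename in filenames:
        for name in variable_names:
            result[name].append(read_one(filename, name))
    return result


def fake_read(filename, name):
    # Placeholder for opening the file and fetching the named variable.
    return (filename, name)


out = read_individually(
    ["a.nc", "b.nc"], ["AVHRR/Ch4CentralWavenumber"], fake_read
)
```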
-
cis.data_io.netcdf.
remove_variables_with_non_spatiotemporal_dimensions
(variables, spatiotemporal_var_names)¶ Remove from a list of netCDF variables any which have dimensionality which is not in an approved list of valid spatial or temporal dimensions (e.g. sensor number, pseudo dimensions). CIS currently does not support variables with this dimensionality and will fail if they are used.
Parameters: - variables – Dictionary of netCDF variable names : Variable objects. Variable names may be fully qualified NetCDF4 Hierarchical group variables in the form <group1>/<group2....>/<variable_name>, e.g. AVHRR/Ch4CentralWavenumber.
- spatiotemporal_var_names – List of valid spatiotemporal dimensions.
Returns: None
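Because the function returns None, the dictionary passed in is presumably modified in place. A minimal sketch of that pattern, assuming each variable exposes a `dimensions` tuple (as netCDF4 variables do) and using the hypothetical names `FakeVariable` and `drop_non_spatiotemporal`:

```python
from collections import namedtuple

# Stand-in for a netCDF variable: only the .dimensions tuple matters here.
FakeVariable = namedtuple("FakeVariable", ["dimensions"])


def drop_non_spatiotemporal(variables, valid_dims):
    """Delete, in place, entries that use any dimension outside valid_dims."""
    valid = set(valid_dims)
    for name in list(variables):  # copy the keys so deletion is safe
        if not set(variables[name].dimensions) <= valid:
            del variables[name]


variables = {
    "temperature": FakeVariable(("time", "lat", "lon")),
    "gain": FakeVariable(("sensor_number",)),
}
result = drop_non_spatiotemporal(variables, ["time", "lat", "lon"])
```

After the call only `temperature` remains, since `gain` depends on a dimension that is not in the approved list.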
Module for writing data to NetCDF files
-
cis.data_io.write_netcdf.
add_data_to_file
(data_object, filename)¶ Parameters: - data_object –
- filename –
Returns:
-
cis.data_io.write_netcdf.
sizeof_fmt
(num, suffix='B')¶ Return a human readable size from an integer number of bytes
From http://stackoverflow.com/questions/1094841/reusable-library-to-get-human-readable-version-of-file-size
Parameters: - num (int) – Number of bytes
- suffix (str) – Little or big B
Return str: Formatted human readable size
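For reference, the widely copied recipe from that Stack Overflow question looks roughly like the sketch below. It is given under a different, hypothetical name (`human_readable_size`) because the CIS version may differ in detail.

```python
def human_readable_size(num, suffix="B"):
    """Format a byte count using binary (1024-based) prefixes."""
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f%s%s" % (num, "Yi", suffix)


small = human_readable_size(512)
medium = human_readable_size(1536)
```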
-
cis.data_io.write_netcdf.
write
(data_object, filename)¶ Parameters: - data_object –
- filename –
Returns:
-
cis.data_io.write_netcdf.
write_coordinate_list
(coord_list, filename)¶ Writes coordinates to a netCDF file.
Parameters: - coord_list – list of Coord objects
- filename – file to which to write
-
cis.data_io.write_netcdf.
write_coordinates
(coords, filename)¶ Writes coordinates to a netCDF file.
Parameters: - coords – UngriddedData or UngriddedCoordinates object for which the coordinates are to be written
- filename – file to which to write
-
cis.data_io.hdf.
get_hdf4_file_metadata
(filename)¶ This returns a dictionary of file attributes, which often contains metadata information about the whole file. The value of each attribute can simply be a big string which will often need to be parsed manually thereafter.
Parameters: filename – The name of the HDF4 file to read Returns: dictionary of string attributes
-
cis.data_io.hdf.
get_hdf4_file_variables
(filename, data_type=None)¶ Get all variables from a file containing ungridded data. Variables from both VD and SD data are concatenated.
Parameters: - filename – The filename of the file to get the variables from
- data_type – String representing the HDF data type, i.e. ‘VD’ or ‘SD’. If None, both are used.
-
cis.data_io.hdf.
read
(filenames, variables)¶
-
cis.data_io.hdf.
read_data
(data_list, read_function)¶ Wrapper for calling an HDF reading function for each dataset, and then concatenating the result.
Parameters: - data_list (list) – A list of data objects to read
- read_function (callable or str) – A function for reading the data, or ‘SD’ or ‘VD’ for default reading routines.
Returns: A single numpy array of concatenated data values.
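The read-then-concatenate pattern can be sketched as below. Plain lists replace NumPy arrays so the example has no dependencies, and `read_and_concatenate` is a hypothetical name. The ‘SD’ / ‘VD’ string shortcuts are not modelled.

```python
from itertools import chain


def read_and_concatenate(data_list, read_function):
    """Read each item with read_function and join the results end to end."""
    return list(chain.from_iterable(read_function(item) for item in data_list))


# Hypothetical per-dataset contents, keyed by a dataset label.
chunks = {"file1": [1.0, 2.0], "file2": [3.0]}
values = read_and_concatenate(["file1", "file2"], chunks.__getitem__)
```

With NumPy arrays, `numpy.concatenate` would play the role of the chained list.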
-
cis.data_io.hdf.
read_metadata
(data_dict, data_type)¶
Module containing hdf file utility functions for the SD object
-
class
cis.data_io.hdf_sd.
HDF_SDS
(filename, variable)¶ Bases:
object
This class is used in place of the pyhdf.SD.SDS class to allow the file contents to be loaded at a later time rather than in this module's read method (so that we can close the SD instances and free up file handles)
-
attributes
()¶ Call pyhdf.SD.SDS.attributes(), opening and closing the file
-
get
()¶ Call pyhdf.SD.SDS.get(), opening and closing the file
-
info
()¶ Call pyhdf.SD.SDS.info(), opening and closing the file
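The idea of deferring file access until a method is called, and releasing the handle straight afterwards, can be sketched with an ordinary text file. `LazyFileVariable` and the `key=value` file layout are assumptions made for illustration; the real class delegates to pyhdf instead.

```python
import os
import tempfile


class LazyFileVariable:
    """Remember where the data lives; open the file only when asked."""

    def __init__(self, filename, variable):
        self.filename = filename
        self.variable = variable

    def get(self):
        # The handle exists only for the duration of this call.
        with open(self.filename) as handle:
            for line in handle:
                key, _, value = line.partition("=")
                if key == self.variable:
                    return value.strip()
        raise KeyError(self.variable)


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("aod=0.12\nlat=51.5\n")

lazy = LazyFileVariable(tmp.name, "lat")  # nothing is opened here
value = lazy.get()                        # file is opened and closed here
os.remove(tmp.name)
```

Holding only the filename and variable name means many such objects can exist without exhausting file handles.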
-
-
cis.data_io.hdf_sd.
get_data
(sds)¶ Reads raw data from an SD instance.
Parameters: sds – The specific sds instance to read Returns: A numpy array containing the raw data, with missing data replaced by NaN.
-
cis.data_io.hdf_sd.
get_hdf_SD_file_variables
(filename)¶ Get all the variables from an HDF SD file
Parameters: filename (str) – The filename of the file to get the variables from Returns: An OrderedDict containing the variables from the file
-
cis.data_io.hdf_sd.
get_metadata
(sds)¶
-
cis.data_io.hdf_sd.
read
(filename, variables=None, datadict=None)¶ Reads SD from an HDF4 file into a dictionary.
Parameters: - filename (str) – The name (with path) of the HDF file to read.
- variables (iterable) – A sequence of variable (dataset) names to read from the file (default None, causing all variables to be read). The names must appear exactly as in the HDF file.
- datadict (dict) – Optional dictionary to add data to, otherwise a new, empty dictionary is created
Returns: A dictionary containing data for requested variables. Missing data is replaced by NaN.
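Replacing missing data by NaN amounts to comparing each value against the fill value. A dependency-free sketch, with the hypothetical helper `fill_to_nan` and an assumed fill value of -9999:

```python
import math


def fill_to_nan(values, fill_value):
    """Return floats with every occurrence of fill_value turned into NaN."""
    return [float("nan") if v == fill_value else float(v) for v in values]


# -9999 is an assumed fill value chosen for this example.
cleaned = fill_to_nan([1.5, -9999, 2.0], fill_value=-9999)
```

NaN cannot be compared with `==`, so later checks on the result need `math.isnan` (or `numpy.isnan` for arrays).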
Module containing hdf file utility functions for the VD object
-
class
cis.data_io.hdf_vd.
VDS
¶ Bases:
cis.data_io.hdf_vd.VDS
-
cis.data_io.hdf_vd.
get_data
(vds, first_record=False, missing_values=None)¶ Actually read the data from the VDS handle. We shouldn’t need to check for HDF being installed here because the VDS object which is being passed to us can only have come from pyhdf.
Parameters: - vds –
- first_record –
- missing_values –
Returns:
-
cis.data_io.hdf_vd.
get_hdf_VD_file_variables
(filename)¶ Get all the variables from an HDF VD file
Parameters: filename – The filename of the file to get the variables from Returns: An OrderedDict containing the variables from the file
-
cis.data_io.hdf_vd.
get_metadata
(vds)¶
-
cis.data_io.hdf_vd.
read
(filename, variables=None, datadict=None)¶ Given a filename and a list of variable names, return a dictionary of VD data handles
Parameters: - filename – full path to a single HDF4 file
- variables – A list of variables to read. If no variables are given, no variables are read
- datadict – A dictionary of variable name, data handle pairs to be appended to
Returns: An updated datadict with any new variables appended.
-
cis.data_io.aeronet.
get_aeronet_file_variables
(filename)¶ Return a list of valid Aeronet file variables with invalid characters removed. We need to remove invalid characters primarily for writing back out to CF-compliant NetCDF.
Parameters: filename – Full path to the file to read Returns: A list of Aeronet variable names in the order they appear in the file
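One possible way to clean column names for NetCDF output is sketched below. The set of characters treated as invalid, the choice to substitute an underscore rather than drop them, the helper name `sanitise_names` and the sample header are all assumptions, and the rule CIS actually applies may differ.

```python
import re


def sanitise_names(names):
    """Replace each run of characters outside [A-Za-z0-9_] with one '_'."""
    return [re.sub(r"[^A-Za-z0-9_]+", "_", name).strip("_") for name in names]


# Hypothetical column names in the style of an Aeronet header row.
header = ["AOT_500", "Water(cm)", "440-870Angstrom"]
clean = sanitise_names(header)
```

The order of the input list is preserved, matching the documented behaviour of returning names in file order.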
-
cis.data_io.aeronet.
get_file_metadata
(filename, variable='', shape=None)¶
-
cis.data_io.aeronet.
load_aeronet
(fname, variables=None)¶ Loads an Aeronet level 2.0 CSV file.
Originally from http://code.google.com/p/metamet/ License: GNU GPL v3
Parameters: - fname – data file name
- variables – A list of variables to return
Returns: A dictionary of variables names and numpy arrays containing the data for that variable
-
cis.data_io.aeronet.
load_multiple_aeronet
(fnames, variables=None)¶