17.2.5.1.1. cis.data_io.products package

17.2.5.1.1.1. The abstract AProduct module

class cis.data_io.products.AProduct.AProduct

Bases: object

Abstract class for the various possible data products. This just defines the interface which the subclasses must implement.

create_coords(filenames)

Reads the coordinates from one or more files. Note that this method may have to make certain assumptions about the file in order to return a single coordinate set. The user should be warned through the logger if this is the case.

Parameters:filenames (list) – List of filenames to read coordinates from
Returns:CommonData object
create_data_object(filenames, variable)

Create and return a CommonData object for a given variable from one or more files.

Parameters:
  • filenames (list) – List of filenames of files to read
  • variable (str) – Variable to read from the files
Returns:

A CommonData object representing the specified variable

Raises:
  • FileIOError – Unable to read a file
  • InvalidVariableError – Variable not present in file
get_file_format(filename)

Returns a file format hierarchy separated by slashes, of the form TopLevelFormat/SubFormat/SubFormat/Version, e.g. NetCDF/GASSP/1.0, ASCII/ASCIIHyperpoint or HDF4/CloudSat. This is mainly used within the ceda_di indexing tool. If not set, it defaults to the product's name.

A filename of an example file can be provided to enable the determination of, for example, a dataset version number.

Parameters:filename (str) – Filename of file to be inspected
Returns:File format, of the form [parent/]format/specific instance/version, or the class name
Return type:str
Raises:FileFormatError if there is an error
get_file_signature()

This method should return a list of regular expressions, which CIS uses to decide which data product to use for a given file. If more than one regular expression is provided in the list then the file can match any of the expressions. The first product with a signature that matches the filename will be used. The order in which the products are searched is determined by the priority property, highest value first; internal products generally have a priority of 10.

For example, this would match all files with a name containing the string ‘CODE’ and with the ‘nc’ extension:

return [r'.*CODE.*\.nc']

Note

If the signature matches, the framework will call AProduct.get_file_type_error(), which gives the product a chance to open the file and check the contents.

Returns:A list of regex to match the product’s file naming convention.
Return type:list
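The selection rule described above (highest priority first, any expression in the list may match) can be sketched as follows. This is only an illustration, not the CIS implementation: the two product classes, the pick_product helper, the patterns, and the choice to match with re.match against the base name of the file are all assumptions made for the example.

```python
import os
import re


class GenericNetCDF:
    """Hypothetical stand-in for a general-purpose product."""
    priority = 10

    def get_file_signature(self):
        return [r'.*\.nc$']


class CodeProduct:
    """Hypothetical stand-in for a more specific product."""
    priority = 20

    def get_file_signature(self):
        # A file may match either expression in the list.
        return [r'.*CODE.*\.nc$', r'.*CODE.*\.nc4$']


def pick_product(filename, products):
    """Return the first product, highest priority first, whose signature matches."""
    basename = os.path.basename(filename)
    for product in sorted(products, key=lambda p: p.priority, reverse=True):
        if any(re.match(sig, basename) for sig in product.get_file_signature()):
            return product
    return None


products = [GenericNetCDF(), CodeProduct()]
chosen = pick_product('/data/run_CODE_2010.nc', products)
other = pick_product('/data/plain.nc', products)
nothing = pick_product('/data/readme.txt', products)
```

Here the more specific product is given a priority above 10 so that it is tried before the general one, which would otherwise also match the same file.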
get_file_type_error(filename)

Check a single file to see if it is of the correct type, and if not return a list of errors. If the return is None then there are no errors and this is the correct data product to use for this file.

This method gives a data product a mechanism to identify itself as the correct product when a specific enough file signature cannot be provided. For example, GASSP is a type of NetCDF file, so its filenames end with .nc, but so do those of other NetCDF files. The data product therefore opens the file and looks for the GASSP version attribute, and returns an error if it does not find it.

Parameters:filename (str) – The filename for the file
Returns:List of errors, or None
Return type:list or None
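The return convention (None when the file is accepted, otherwise a list of errors) might look like the sketch below. It deliberately avoids NetCDF: MarkerProduct, its MARKER line, and the write_temp helper are hypothetical, and a plain text file stands in for the attribute check described above.

```python
import os
import tempfile


class MarkerProduct:
    """Hypothetical product whose files start with a marker line."""
    MARKER = '#MYFORMAT'

    def get_file_type_error(self, filename):
        """Return None if the file looks right, otherwise a list of error strings."""
        try:
            with open(filename) as f:
                first_line = f.readline()
        except OSError as e:
            return ['Unable to open {}: {}'.format(filename, e)]
        if not first_line.startswith(self.MARKER):
            return ['{} does not start with {}'.format(filename, self.MARKER)]
        return None


def write_temp(text):
    """Write text to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    with os.fdopen(fd, 'w') as f:
        f.write(text)
    return path


good = write_temp('#MYFORMAT 1.0\n1,2,3\n')
bad = write_temp('time,value\n1,2\n')
good_result = MarkerProduct().get_file_type_error(good)
bad_result = MarkerProduct().get_file_type_error(bad)
os.remove(good)
os.remove(bad)
```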
get_variable_names(filenames, data_type=None)

Get a list of available variable names from the filenames list passed in. This general implementation can be overridden in specific products to include/exclude variables which may or may not be relevant. The data_type parameter can be used to specify extra information.

Parameters:
  • filenames (list) – List of string filenames of files to be read from
  • data_type (str) – ‘SD’ or ‘VD’, to return only SD or VD variables from HDF files. This may take on other values in specific product implementations.
Returns:

A set of variable names as strings

Return type:

set

priority = 10
valid_dimensions = None
exception cis.data_io.products.AProduct.ProductPluginException(message, original_exception)

Bases: exceptions.Exception

Represents an error which has occurred inside of a Product plugin

original_exception = None
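As a rough illustration of how an exception with a (message, original_exception) signature can carry the underlying failure, consider the sketch below. PluginError, read_with_plugin and broken_reader are hypothetical stand-ins written for this example; they are not the CIS classes.

```python
class PluginError(Exception):
    """Hypothetical stand-in mirroring the (message, original_exception) signature."""

    def __init__(self, message, original_exception):
        super().__init__(message)
        self.original_exception = original_exception


def read_with_plugin(reader, filename):
    """Call a reader and wrap any failure, keeping the original exception."""
    try:
        return reader(filename)
    except Exception as e:
        raise PluginError('Plugin failed while reading {}'.format(filename), e) from e


def broken_reader(filename):
    # Simulates a plugin that fails part-way through reading.
    raise ValueError('bad header')


try:
    read_with_plugin(broken_reader, 'example.nc')
except PluginError as err:
    caught = err
```

Keeping the original exception on the wrapper lets calling code report a uniform plugin error while still exposing the root cause for debugging.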
cis.data_io.products.AProduct.get_coordinates(filenames, product=None)

Top level routine for calling the correct product’s create_coords() routine.

Parameters:
  • filenames (list) – A list of filenames to read data from
  • product (str) – The product to read data with - this should be a string which matches the name of one of the subclasses of AProduct
Returns:

A CoordList object

cis.data_io.products.AProduct.get_data(filenames, variable, product=None)

Top level routine for calling the correct product’s create_data_object() routine.

Parameters:
  • filenames (list) – A list of filenames to read data from
  • variable (str) – The variable to create the CommonData object from
  • product (str) – The product to read data with - this should be a string which matches the name of one of the subclasses of AProduct. If none is supplied it is guessed from the filename signature.
Returns:

A CommonData variable
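Resolving a product from a string that matches the name of a subclass could be done along the lines of the sketch below. It is an assumption about one possible approach, not the CIS code: BaseProduct, Alpha, Beta, all_subclasses and find_product are hypothetical, and ClassNotFoundError is defined locally as a stand-in.

```python
class BaseProduct:
    """Hypothetical stand-in for the abstract base class."""


class ClassNotFoundError(Exception):
    """Local stand-in for the error raised when no product matches."""


class Alpha(BaseProduct):
    pass


class Beta(Alpha):
    pass


def all_subclasses(cls):
    """Collect direct and indirect subclasses of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def find_product(name):
    """Return the subclass whose class name equals name."""
    for cls in all_subclasses(BaseProduct):
        if cls.__name__ == name:
            return cls
    raise ClassNotFoundError('Product {} not found'.format(name))


beta_cls = find_product('Beta')
try:
    find_product('Gamma')
    missing_raised = False
except ClassNotFoundError:
    missing_raised = True
```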

cis.data_io.products.AProduct.get_file_format(filenames, product=None)

Returns the file's format, or throws FileFormatError if there is an error in the format

Parameters:
  • filenames (list) – the filenames to read
  • product (str) – the product to use; if not specified, a matching product is searched for
Returns:

File format

Raises:ClassNotFoundError if there is no reader for this class

cis.data_io.products.AProduct.get_product_full_name(filenames, product=None)

Get the full name of the product which would read this file

Parameters:
  • filenames (list) – list of filenames to read
  • product (str) – specified product to use
cis.data_io.products.AProduct.get_variables(filenames, product=None, data_type=None)

Top level routine for calling the correct product’s get_variable_names() routine.

Parameters:
  • filenames (list) – A list of filenames to read the variables from
  • product (str) – The product to read data with - this should be a string which matches the name of one of the subclasses of AProduct
Returns:

A set of variable names as strings