access_nri_intake.source.builders#

Builders for generating Intake-ESM datastores

Module Contents#

Classes#

BaseBuilder

Base class for creating Intake-ESM datastore builders. Not intended for direct use.

AccessOm2Builder

Intake-ESM datastore builder for ACCESS-OM2 COSIMA datasets

AccessOm3Builder

Intake-ESM datastore builder for ACCESS-OM3 COSIMA datasets

AccessEsm15Builder

Intake-ESM datastore builder for ACCESS-ESM1.5 datasets

AccessCm2Builder

Intake-ESM datastore builder for ACCESS-CM2 datasets

exception access_nri_intake.source.builders.ParserError#

Bases: Exception

Common base class for all non-exit exceptions.

Initialize self. See help(type(self)) for accurate signature.

add_note()#

Exception.add_note(note) – add a note to the exception

with_traceback()#

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class access_nri_intake.source.builders.BaseBuilder(path, depth=0, exclude_patterns=None, include_patterns=None, data_format='netcdf', groupby_attrs=None, aggregations=None, storage_options=None, joblib_parallel_kwargs={'n_jobs': multiprocessing.cpu_count()})#

Bases: ecgtools.builder.Builder

Base class for creating Intake-ESM datastore builders. Not intended for direct use. This builds on the ecgtools.Builder class.

This method should be overridden. The expectation is that some of these arguments will be hardcoded in subclasses of this class.

Parameters:
path: str or list of str

Path or list of paths to crawl for assets/files.

depth: int, optional

Maximum depth to crawl for assets. Default is 0.

exclude_patterns: list of str, optional

List of glob patterns to exclude from crawling.

include_patterns: list of str, optional

List of glob patterns to include when crawling.

data_format: str

The data format. Valid values are ‘netcdf’, ‘reference’ and ‘zarr’.

groupby_attrs: List[str]

Column names (attributes) that define data sets that can be aggregated.

aggregations: List[dict]

List of aggregations to apply to query results, default None

storage_options: dict, optional

Parameters passed to the backend file system, such as Google Cloud Storage or Amazon Web Services S3.

joblib_parallel_kwargs: dict, optional

Parameters passed to joblib.Parallel. Default is {'n_jobs': multiprocessing.cpu_count()}.
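The pattern described above, where a subclass fixes most constructor arguments and exposes only path, can be sketched with stand-in classes. SketchBaseBuilder and SketchModelBuilder are hypothetical and do not depend on ecgtools; the hardcoded values (the depth, the glob patterns, the groupby column names) are illustrative assumptions, not the settings used by any real builder.

```python
from typing import Optional, Union


class SketchBaseBuilder:
    """Stand-in for a base builder; NOT the real BaseBuilder."""

    def __init__(
        self,
        path: Union[str, list],
        depth: int = 0,
        exclude_patterns: Optional[list] = None,
        include_patterns: Optional[list] = None,
        data_format: str = "netcdf",
        groupby_attrs: Optional[list] = None,
        aggregations: Optional[list] = None,
    ):
        # Accept either a single path or a list of paths.
        self.paths = [path] if isinstance(path, str) else list(path)
        self.depth = depth
        self.exclude_patterns = exclude_patterns or []
        self.include_patterns = include_patterns or []
        self.data_format = data_format
        self.groupby_attrs = groupby_attrs or []
        self.aggregations = aggregations or []


class SketchModelBuilder(SketchBaseBuilder):
    """Hypothetical subclass that hardcodes everything except `path`."""

    def __init__(self, path):
        super().__init__(
            path=path,
            depth=2,  # assumed value, for illustration only
            exclude_patterns=["*restart*"],  # assumed pattern
            include_patterns=["*.nc"],  # assumed pattern
            data_format="netcdf",
            groupby_attrs=["file_id", "frequency"],  # assumed column names
        )


builder = SketchModelBuilder("/path/to/output")
```

Callers of such a subclass only supply path, which matches the single-argument signatures of the model-specific builders documented on this page.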

property columns_with_iterables#

Return a set of the columns that have iterables
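As a rough illustration of what "columns that have iterables" means, the sketch below scans a list of plain dictionaries and reports the keys whose values are lists, tuples or sets. It is an assumption-laden stand-in: the real property works on the builder's own data, and the function name and column names here are hypothetical.

```python
def sketch_columns_with_iterables(records):
    """Return the set of column names whose values include a list, tuple or set."""
    found = set()
    for record in records:
        for column, value in record.items():
            # Strings are iterable too, but are treated as scalar values here.
            if isinstance(value, (list, tuple, set)):
                found.add(column)
    return found


records = [
    {"path": "a.nc", "variable": ["temp", "salt"], "frequency": "1mon"},
    {"path": "b.nc", "variable": ["u"], "frequency": "1day"},
]
result = sketch_columns_with_iterables(records)
```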

parse()#

Parse metadata from assets.

save(name, description, directory=None)#

Save datastore contents to a file.

Parameters:
name: str

The name of the file to save the datastore to.

description: str

Detailed multi-line description of the collection.

directory: str, optional

The directory to save the datastore to. If None, use the current directory.

validate_parser()#

Run the parser on a single file and check the schema of the info being parsed

build()#

Builds a datastore from a list of netCDF files or zarr stores.

abstract static parser(file)#

Parse info from a file asset

Parameters:
file: str

The path to the file
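A parser takes a file path and returns information about that asset. The sketch below shows one possible shape for such a function using only the standard library. Everything in it is an assumption made for illustration: the filename layout, the SketchParserError class (a local stand-in, not the module's ParserError), and the invalid_asset and traceback keys used to flag a file that could not be parsed.

```python
import re
import traceback
from pathlib import Path


class SketchParserError(Exception):
    """Local stand-in for a parsing failure; not the library's ParserError."""


# Hypothetical filename layout: <realm>_<frequency>_<YYYY>.nc
FILENAME_PATTERN = re.compile(
    r"^(?P<realm>[a-z]+)_(?P<frequency>\d+[a-z]+)_(?P<year>\d{4})\.nc$"
)


def sketch_parser(file: str) -> dict:
    """Parse info from a file path, returning a flat dict describing the asset."""
    try:
        match = FILENAME_PATTERN.match(Path(file).name)
        if match is None:
            raise SketchParserError(f"Cannot parse {file}")
        info = match.groupdict()
        info["path"] = file
        return info
    except Exception:
        # Assumed convention: flag the asset as invalid rather than abort the run.
        return {"invalid_asset": file, "traceback": traceback.format_exc()}


good = sketch_parser("/data/ocean_1mon_1990.nc")
bad = sketch_parser("/data/notes.txt")
```

Returning a flagged record instead of raising lets a later cleaning step drop the failures in one pass.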

clean_dataframe()#

Clean the dataframe by excluding invalid assets and removing duplicate entries.
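The two cleaning steps named here, excluding invalid assets and removing duplicates, can be sketched on a list of dictionaries. This is not the method's implementation, which operates on a dataframe; the invalid_asset marker and the record fields are hypothetical.

```python
def sketch_clean_records(records):
    """Drop records flagged as invalid, then remove exact duplicates (first wins)."""
    seen = set()
    cleaned = []
    for record in records:
        if "invalid_asset" in record:  # hypothetical marker for a failed parse
            continue
        # Values are assumed hashable so the record can serve as a set key.
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


records = [
    {"path": "a.nc", "frequency": "1mon"},
    {"path": "a.nc", "frequency": "1mon"},  # exact duplicate
    {"invalid_asset": "notes.txt"},  # failed parse
    {"path": "b.nc", "frequency": "1day"},
]
cleaned = sketch_clean_records(records)
```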

class access_nri_intake.source.builders.AccessOm2Builder(path)#

Bases: BaseBuilder

Intake-ESM datastore builder for ACCESS-OM2 COSIMA datasets

Initialise an AccessOm2Builder

Parameters:
path: str or list of str

Path or list of paths to crawl for assets/files.
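A typical workflow, using only the constructor and methods documented on this page, might look like the sketch below. The wrapper function build_om2_datastore is hypothetical, and the paths and names passed to it would be the caller's own. The import sits inside the function so the sketch can be loaded without the package installed.

```python
def build_om2_datastore(path, name, description, directory=None):
    """Build a datastore from the files under `path` and save it."""
    # Imported lazily so this sketch loads even without access_nri_intake.
    from access_nri_intake.source.builders import AccessOm2Builder

    builder = AccessOm2Builder(path=path)
    builder.build()  # build the datastore from the files found under `path`
    builder.save(name=name, description=description, directory=directory)
    return builder
```

Leaving directory as None saves to the current directory, as described under save().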

property columns_with_iterables#

Return a set of the columns that have iterables

static parser(file)#

Parse info from a file asset

Parameters:
file: str

The path to the file

parse()#

Parse metadata from assets.

save(name, description, directory=None)#

Save datastore contents to a file.

Parameters:
name: str

The name of the file to save the datastore to.

description: str

Detailed multi-line description of the collection.

directory: str, optional

The directory to save the datastore to. If None, use the current directory.

validate_parser()#

Run the parser on a single file and check the schema of the info being parsed

build()#

Builds a datastore from a list of netCDF files or zarr stores.

clean_dataframe()#

Clean the dataframe by excluding invalid assets and removing duplicate entries.

class access_nri_intake.source.builders.AccessOm3Builder(path)#

Bases: BaseBuilder

Intake-ESM datastore builder for ACCESS-OM3 COSIMA datasets

Initialise an AccessOm3Builder

Parameters:
path: str or list of str

Path or list of paths to crawl for assets/files.

property columns_with_iterables#

Return a set of the columns that have iterables

static parser(file)#

Parse info from a file asset

Parameters:
file: str

The path to the file

parse()#

Parse metadata from assets.

save(name, description, directory=None)#

Save datastore contents to a file.

Parameters:
name: str

The name of the file to save the datastore to.

description: str

Detailed multi-line description of the collection.

directory: str, optional

The directory to save the datastore to. If None, use the current directory.

validate_parser()#

Run the parser on a single file and check the schema of the info being parsed

build()#

Builds a datastore from a list of netCDF files or zarr stores.

clean_dataframe()#

Clean the dataframe by excluding invalid assets and removing duplicate entries.

class access_nri_intake.source.builders.AccessEsm15Builder(path, ensemble)#

Bases: BaseBuilder

Intake-ESM datastore builder for ACCESS-ESM1.5 datasets

Initialise an AccessEsm15Builder

Parameters:
path: str or list of str

Path or list of paths to crawl for assets/files.

ensemble: boolean

Whether to treat each path as a separate member of an ensemble to join along a new member dimension
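When ensemble is true, each entry in path is treated as one ensemble member. A minimal sketch of that call is shown below. The wrapper function build_esm15_ensemble is hypothetical, and the import is deferred so the sketch loads without the package installed.

```python
def build_esm15_ensemble(member_paths):
    """Build a datastore in which each path is a separate ensemble member."""
    # Imported lazily so this sketch loads even without access_nri_intake.
    from access_nri_intake.source.builders import AccessEsm15Builder

    builder = AccessEsm15Builder(path=list(member_paths), ensemble=True)
    builder.build()
    return builder
```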

property columns_with_iterables#

Return a set of the columns that have iterables

static parser(file)#

Parse info from a file asset

Parameters:
file: str

The path to the file

parse()#

Parse metadata from assets.

save(name, description, directory=None)#

Save datastore contents to a file.

Parameters:
name: str

The name of the file to save the datastore to.

description: str

Detailed multi-line description of the collection.

directory: str, optional

The directory to save the datastore to. If None, use the current directory.

validate_parser()#

Run the parser on a single file and check the schema of the info being parsed

build()#

Builds a datastore from a list of netCDF files or zarr stores.

clean_dataframe()#

Clean the dataframe by excluding invalid assets and removing duplicate entries.

class access_nri_intake.source.builders.AccessCm2Builder(path, ensemble)#

Bases: AccessEsm15Builder

Intake-ESM datastore builder for ACCESS-CM2 datasets

Initialise an AccessCm2Builder

Parameters:
path: str or list of str

Path or list of paths to crawl for assets/files.

ensemble: boolean

Whether to treat each path as a separate member of an ensemble to join along a new member dimension

property columns_with_iterables#

Return a set of the columns that have iterables

static parser(file)#

Parse info from a file asset

Parameters:
file: str

The path to the file

parse()#

Parse metadata from assets.

save(name, description, directory=None)#

Save datastore contents to a file.

Parameters:
name: str

The name of the file to save the datastore to.

description: str

Detailed multi-line description of the collection.

directory: str, optional

The directory to save the datastore to. If None, use the current directory.

validate_parser()#

Run the parser on a single file and check the schema of the info being parsed

build()#

Builds a datastore from a list of netCDF files or zarr stores.

clean_dataframe()#

Clean the dataframe by excluding invalid assets and removing duplicate entries.