access_nri_intake.source.builders
=================================

.. py:module:: access_nri_intake.source.builders

.. autoapi-nested-parse::

   Builders for generating Intake-ESM datastores

   Note: the ``{**default_kwargs, **kwargs}`` pattern is repeated in many of the
   builders. The default kwargs look very similar from builder to builder, but they
   *are not the same*. Unifying them would take considerable extra effort and would
   probably leave the code more complex than it is now, so the repetition is left as is.
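
   As a rough illustration of why the order in that pattern matters, the sketch below
   merges a set of defaults with caller-supplied keyword arguments. The names
   ``default_kwargs`` and ``merged_kwargs`` and the keys used here are hypothetical and
   are not taken from the builders themselves.

   .. code-block:: python

      # Hypothetical defaults; the real builders each define their own.
      default_kwargs = {"depth": 0, "data_format": "netcdf"}


      def merged_kwargs(**kwargs):
          """Return the defaults overridden by any caller-supplied kwargs."""
          # Later unpacking wins, so values in kwargs replace the defaults.
          return {**default_kwargs, **kwargs}


      print(merged_kwargs(depth=2))
      # {'depth': 2, 'data_format': 'netcdf'}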

   ..
       !! processed by numpydoc !!


Classes
-------

.. autoapisummary::

   access_nri_intake.source.builders.AccessOm2Builder
   access_nri_intake.source.builders.AccessOm3Builder
   access_nri_intake.source.builders.Mom6Builder
   access_nri_intake.source.builders.AccessEsm15Builder
   access_nri_intake.source.builders.AccessCm2Builder
   access_nri_intake.source.builders.AccessEsm16Builder
   access_nri_intake.source.builders.OnlineMltBuilder
   access_nri_intake.source.builders.AccessCm3Builder
   access_nri_intake.source.builders.ROMSBuilder
   access_nri_intake.source.builders.WoaBuilder
   access_nri_intake.source.builders.Cmip6Builder


Module Contents
---------------

.. py:class:: AccessOm2Builder(path, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for ACCESS-OM2 COSIMA datasets


   Initialise an AccessOm2Builder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.
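
   A minimal sketch of a typical workflow, based only on the signatures documented on
   this page: construct a builder from a path, call ``build()``, then call ``save()``.
   The helper ``build_and_save`` and the example path, name and description are
   hypothetical. The builder class is passed in as an argument so that the sketch does
   not depend on any particular builder.

   .. code-block:: python

      def build_and_save(builder_cls, path, name, description, directory=None):
          """Build a datastore with ``builder_cls`` and save it.

          ``builder_cls`` is assumed to follow the interface documented here:
          ``builder_cls(path)``, ``.build()`` and
          ``.save(name, description, directory=None, use_parquet=False)``.
          """
          builder = builder_cls(path)
          builder.build()
          builder.save(name=name, description=description, directory=directory)
          return builder


      # Hypothetical usage (requires access_nri_intake and real data on disk):
      #
      #     from access_nri_intake.source.builders import AccessOm2Builder
      #     build_and_save(
      #         AccessOm2Builder,
      #         path="/path/to/experiment",
      #         name="my_experiment",
      #         description="An example ACCESS-OM2 experiment",
      #     )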














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file: str**
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None
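
      The idea of matching a filename against a dictionary of regex patterns can be
      sketched as follows. This is not the implementation used by
      ``parse_filename_freq``: the helper ``guess_frequency``, the patterns and the
      frequency labels are all hypothetical, and the real ``FREQUENCIES`` mapping may be
      structured differently.

      .. code-block:: python

         import re

         # Hypothetical mapping of regex pattern -> frequency label.
         EXAMPLE_FREQUENCIES = {
             r"daily": "1day",
             r"month": "1mon",
             r"year": "1yr",
         }


         def guess_frequency(filename, frequencies=EXAMPLE_FREQUENCIES):
             """Return the label of the first pattern found in ``filename``, else None."""
             for pattern, label in frequencies.items():
                 if re.search(pattern, filename):
                     return label
             return None


         print(guess_frequency("ocean_daily_3d_temp"))  # 1day
         print(guess_frequency("ocean_grid"))  # None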











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netcdf file


      :Parameters:

          **file: str**
              The path to the netCDF file

          **time_dim: str**
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: AccessOm3Builder(path, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for ACCESS-OM3 COSIMA datasets


   Initialise an AccessOm3Builder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file: str**
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netcdf file


      :Parameters:

          **file: str**
              The path to the netCDF file

          **time_dim: str**
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: Mom6Builder(path, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for MOM6 COSIMA datasets


   Initialise a Mom6Builder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file: str**
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netcdf file


      :Parameters:

          **file: str**
              The path to the netCDF file

          **time_dim: str**
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: AccessEsm15Builder(path, ensemble, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for ACCESS-ESM1.5 datasets


   Initialise an AccessEsm15Builder


   :Parameters:

       **path: str or list of str**
           Path or list of paths to crawl for assets/files.

       **ensemble: boolean**
           Whether to treat each path as a separate member of an ensemble to join
           along a new member dimension














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file: str**
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netcdf file


      :Parameters:

          **file: str**
              The path to the netCDF file

          **time_dim: str**
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: AccessCm2Builder(path, ensemble, **kwargs)

   Bases: :py:obj:`AccessEsm15Builder`


   Intake-ESM datastore builder for ACCESS-CM2 datasets


   Initialise an AccessEsm15Builder


   :Parameters:

       **path: str or list of str**
           Path or list of paths to crawl for assets/files.

       **ensemble: boolean**
           Whether to treat each path as a separate member of an ensemble to join
           along a new member dimension














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file: str**
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netcdf file


      :Parameters:

          **file: str**
              The path to the netCDF file

          **time_dim: str**
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: AccessEsm16Builder(path, ensemble, **kwargs)

   Bases: :py:obj:`AccessEsm15Builder`


   Intake-ESM datastore builder for ACCESS-ESM1.6 datasets


   Initialise an AccessEsm15Builder


   :Parameters:

       **path: str or list of str**
           Path or list of paths to crawl for assets/files.

       **ensemble: boolean**
           Whether to treat each path as a separate member of an ensemble to join
           along a new member dimension














   ..
       !! processed by numpydoc !!


   .. py:attribute:: PATH_REGEX
      :value: '.*/output\\d+/([^/]*)(?:/[^/]*)?/.*\\.nc'
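
      To see what this pattern captures, the sketch below applies it with Python's
      ``re`` module to two made-up paths. The paths are hypothetical, and how the
      builder uses the captured group (for example, as input to ``REALM_MAPPING``) is an
      assumption rather than something documented here.

      .. code-block:: python

         import re

         # The pattern as documented above (the doubled backslashes are repr escaping).
         PATH_REGEX = r".*/output\d+/([^/]*)(?:/[^/]*)?/.*\.nc"

         # Hypothetical example paths.
         paths = [
             "/data/exp/output000/ocean/ocean_month.nc",
             "/data/exp/output000/atmosphere/netCDF/atm_mon.nc",
         ]

         for path in paths:
             match = re.match(PATH_REGEX, path)
             # group(1) is the directory directly below outputNNN
             print(match.group(1))
         # ocean
         # atmosphere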



   .. py:attribute:: REALM_MAPPING


   .. py:method:: parser(file)
      :classmethod:


      
      Get the realm and member/experiment id from the file name
















      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netcdf file


      :Parameters:

          **file: str**
              The path to the netCDF file

          **time_dim: str**
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: OnlineMltBuilder(path, ensemble, **kwargs)

   Bases: :py:obj:`AccessEsm16Builder`


   Builder for the Mixed Layer Tracer Budget Diagnostics dataset located at
   /g/data/av17/access-nri/OM2/025deg_jra55_iaf_cycle6_online_mlt
   generated by Ryan Holmes (ryan.holmes@bom.gov.au)

   The dataset consists of a trimmed-down repeat of an existing experiment with
   additional diagnostics added.

   These files are not added to the datastore:

   - ``output*/o2i.nc``: these files have no calendar attribute on the 'time' axis


   Initialise an AccessEsm15Builder


   :Parameters:

       **path: str or list of str**
           Path or list of paths to crawl for assets/files.

       **ensemble: boolean**
           Whether to treat each path as a separate member of an ensemble to join
           along a new member dimension














   ..
       !! processed by numpydoc !!


   .. py:attribute:: PATH_REGEX
      :value: '.*/(?:output\\d+|post_processed_diags|.*)/([^/]*)(?:/[^/]*)?/.*\\.nc'



   .. py:attribute:: REALM_MAPPING


   .. py:method:: parser(file)
      :classmethod:


      
      Get the realm and member/experiment id from the file name
















      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name: str**
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory: str, optional**
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet: bool, optional**
              Whether to save the datastore as a parquet file. Defaults to False,
              which saves as a CSV file. Parquet is both faster and saves space, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return a file id and any time information


      :Parameters:

          **filename: str**
              The filename to parse with the extension removed

          **frequencies: dict, optional**
              A dictionary of regex patterns to match against the filename to determine the frequency

          **redaction_fill: str, optional**
              The character to replace time information with. Defaults to "X"



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netCDF file


      :Parameters:

          **file** : str
              The path to the netCDF file

          **time_dim** : str
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
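
      The two steps named above (dropping invalid assets, then dropping duplicates)
      can be sketched with plain dictionaries. This is only an approximation of the
      idea: the real method works on the builder's dataframe, and the ``clean_rows``
      helper, the ``path`` key and the ``invalid_paths`` argument are hypothetical.

      .. code-block:: python

         def clean_rows(rows, invalid_paths):
             """Drop rows for invalid assets, then drop exact duplicate rows."""
             seen = set()
             cleaned = []
             for row in rows:
                 if row["path"] in invalid_paths:
                     continue  # asset failed parsing or validation
                 key = tuple(sorted(row.items()))
                 if key in seen:
                     continue  # an identical entry has already been kept
                 seen.add(key)
                 cleaned.append(row)
             return cleaned


         rows = [
             {"path": "a.nc", "variable": "temp"},
             {"path": "a.nc", "variable": "temp"},
             {"path": "bad.nc", "variable": "salt"},
         ]
         print(clean_rows(rows, invalid_paths={"bad.nc"}))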
















      ..
          !! processed by numpydoc !!


.. py:class:: AccessCm3Builder(path, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for ACCESS-CM3 datasets


   Initialise an AccessCm3Builder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file** : str
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name** : str
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory** : str, optional
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet** : bool, optional
              Whether to save the datastore as a Parquet file. Defaults to False,
              which saves as a CSV file. Parquet is faster and more compact, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
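
      A minimal sketch of this kind of check, assuming the parser returns a
      dictionary: parse only the first asset and confirm that the expected keys are
      present. ``REQUIRED_KEYS``, ``fake_parser`` and ``check_parser_output`` are
      hypothetical names, and the actual schema checked by the builder may differ.

      .. code-block:: python

         REQUIRED_KEYS = {"path", "variable", "frequency"}  # hypothetical schema


         def fake_parser(path):
             """Stand-in for a builder's parser; returns a dict of parsed info."""
             return {"path": path, "variable": ["temp"], "frequency": "1mon"}


         def check_parser_output(parser, assets, required_keys=REQUIRED_KEYS):
             """Parse the first asset only and verify the expected keys are present."""
             if not assets:
                 raise ValueError("no assets to validate against")
             info = parser(assets[0])
             missing = required_keys - set(info)
             if missing:
                 raise ValueError(f"parser output is missing keys: {sorted(missing)}")
             return info


         info = check_parser_output(fake_parser, ["ocean.nc", "ice.nc"])
         print(sorted(info))

      Checking a single file keeps the validation cheap, so a schema problem can
      surface before every asset is parsed.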
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return the frequency encoded in it, if any


      :Parameters:

          **filename** : str
              The filename to parse, with the extension removed.

          **frequencies** : dict, optional
              A dictionary of regex patterns to match against the filename to determine the frequency.



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netCDF file


      :Parameters:

          **file** : str
              The path to the netCDF file

          **time_dim** : str
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: ROMSBuilder(path, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for ROMS datasets

   See https://github.com/bkgf/ROMSIceShelf for details on the ROMSIceShelf model.


   Initialise a ROMSBuilder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Parse info from a file asset


      :Parameters:

          **file** : str
              The path to the file














      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name** : str
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory** : str, optional
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet** : bool, optional
              Whether to save the datastore as a Parquet file. Defaults to False,
              which saves as a CSV file. Parquet is faster and more compact, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
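
      To make the idea concrete, the sketch below scans rows held as plain
      dictionaries and collects the names of columns whose values are lists, tuples
      or sets. It is a stand-in under stated assumptions, not the property's actual
      implementation, and ``find_iterable_columns`` is a hypothetical helper.

      .. code-block:: python

         def find_iterable_columns(rows):
             """Return the set of column names holding list, tuple or set values."""
             found = set()
             for row in rows:
                 for column, value in row.items():
                     if isinstance(value, (list, tuple, set)):
                         found.add(column)
             return found


         rows = [
             {"path": "a.nc", "variable": ["temp", "salt"]},
             {"path": "b.nc", "variable": ["u"]},
         ]
         print(find_iterable_columns(rows))

      Strings are deliberately not treated as iterables here, since a path or
      frequency string is a single value for cataloguing purposes.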
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return the frequency encoded in it, if any


      :Parameters:

          **filename** : str
              The filename to parse, with the extension removed.

          **frequencies** : dict, optional
              A dictionary of regex patterns to match against the filename to determine the frequency.



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netCDF file


      :Parameters:

          **file** : str
              The path to the netCDF file

          **time_dim** : str
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: WoaBuilder(path, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for WOA datasets


   Initialise a WoaBuilder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.














   ..
       !! processed by numpydoc !!


   .. py:method:: parser(file)
      :classmethod:


      
      Override the parser method to add a grid id to the output dictionary.
















      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name** : str
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory** : str, optional
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet** : bool, optional
              Whether to save the datastore as a Parquet file. Defaults to False,
              which saves as a CSV file. Parquet is faster and more compact, but
              unlike CSV is not human-readable.














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return the frequency encoded in it, if any


      :Parameters:

          **filename** : str
              The filename to parse, with the extension removed.

          **frequencies** : dict, optional
              A dictionary of regex patterns to match against the filename to determine the frequency.



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netCDF file


      :Parameters:

          **file** : str
              The path to the netCDF file

          **time_dim** : str
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


.. py:class:: Cmip6Builder(path, ensemble, **kwargs)

   Bases: :py:obj:`BaseBuilder`


   Intake-ESM datastore builder for CMIP6 datasets


   Initialise a Cmip6Builder


   :Parameters:

       **path** : str or list of str
           Path or list of paths to crawl for assets/files.














   ..
       !! processed by numpydoc !!


   .. py:attribute:: ensemble
      :type:  bool
      :value: True



   .. py:method:: parser(file)
      :classmethod:


      
      No need to do much here - just parse the netCDF file and return the info
      as a dictionary. The realm is obtained from the file metadata following
      https://github.com/ACCESS-NRI/access-nri-intake-catalog/pull/478.
















      ..
          !! processed by numpydoc !!


   .. py:attribute:: PATTERNS
      :type:  list
      :value: ['*.nc']



   .. py:attribute:: paths


   .. py:attribute:: depth
      :value: 0



   .. py:attribute:: exclude_patterns
      :value: None



   .. py:attribute:: include_patterns
      :value: None



   .. py:attribute:: data_format
      :value: 'netcdf'



   .. py:attribute:: groupby_attrs
      :value: None



   .. py:attribute:: aggregations
      :value: None



   .. py:attribute:: storage_options
      :value: None



   .. py:attribute:: joblib_parallel_kwargs


   .. py:method:: parse()

      
      Parse metadata from assets.
















      ..
          !! processed by numpydoc !!


   .. py:method:: save(name, description, directory = None, use_parquet = False)

      
      Save datastore contents to a file.


      :Parameters:

          **name** : str
              The name of the file to save the datastore to.

          **description** : str
              Detailed multi-line description of the collection.

          **directory** : str, optional
              The directory to save the datastore to. If None, use the current directory.

          **use_parquet** : bool, optional
              Whether to save the datastore as a Parquet file. Defaults to False,
              which saves as a CSV file. Parquet is faster and more compact, but
              unlike CSV is not human-readable.
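
      The way ``directory`` and ``use_parquet`` interact can be sketched as a small
      path-building helper. The ``datastore_path`` function and the ``.csv`` and
      ``.parquet`` file extensions are assumptions made for illustration; the names
      of the files actually written by ``save`` are not documented here.

      .. code-block:: python

         from pathlib import Path


         def datastore_path(name, directory=None, use_parquet=False):
             """Build an output path; the file extensions are illustrative assumptions."""
             # Fall back to the current working directory when none is given.
             base = Path.cwd() if directory is None else Path(directory)
             suffix = ".parquet" if use_parquet else ".csv"
             return base / f"{name}{suffix}"


         print(datastore_path("my_datastore"))
         print(datastore_path("my_datastore", directory="/tmp/catalogs", use_parquet=True))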














      ..
          !! processed by numpydoc !!


   .. py:method:: validate_parser()

      
      Run the parser on a single file and check the schema of the info being parsed
















      ..
          !! processed by numpydoc !!


   .. py:method:: build()

      
      Builds a datastore from a list of netCDF files or zarr stores.
















      ..
          !! processed by numpydoc !!


   .. py:property:: columns_with_iterables

      
      Return a set of the columns that have iterables
















      ..
          !! processed by numpydoc !!


   .. py:method:: parse_filename_freq(filename, frequencies = FREQUENCIES)
      :classmethod:


      
      Parse an ACCESS model filename and return the frequency encoded in it, if any


      :Parameters:

          **filename** : str
              The filename to parse, with the extension removed.

          **frequencies** : dict, optional
              A dictionary of regex patterns to match against the filename to determine the frequency.



      :Returns:

          frequency: str | None
              The frequency of the file if available in the filename, otherwise None











      ..
          !! processed by numpydoc !!


   .. py:method:: parse_ncfile(file, time_dim = 'time')
      :classmethod:


      
      Get Intake-ESM datastore entry info from a netCDF file


      :Parameters:

          **file** : str
              The path to the netCDF file

          **time_dim** : str
              The name of the time dimension



      :Returns:

          output_nc_info: _NCFileInfo
              A dataclass containing the information parsed from the file




      :Raises:

          EmptyFileError
              If the file contains no variables







      ..
          !! processed by numpydoc !!


   .. py:property:: valid_assets
      :type: list[str]


      
      Return the list of valid assets that have been parsed and validated
















      ..
          !! processed by numpydoc !!


   .. py:method:: get_assets()


   .. py:method:: clean_dataframe()

      
      Clean the dataframe by excluding invalid assets and removing duplicate entries.
















      ..
          !! processed by numpydoc !!


