Aliasing#
The ACCESS-NRI catalog supports aliasing — the ability to search using alternative, user-friendly names that are automatically mapped to the underlying canonical values stored in the catalog. This is particularly useful for researchers familiar with CMIP vocabularies who want to discover raw ACCESS model data without needing to learn ACCESS-specific variable codes or column names.
For example, a user familiar with CMIP conventions can search for variable="tas" and the
catalog will find files stored under the raw ACCESS variable name fld_s03i236 as well as
any files already labelled tas. Similarly, typing frequency="daily" will match
entries catalogued as 1day.
See also
What is it? gives a high-level overview of the catalog structure. how explains prerequisites and how to start a session on Gadi. The Aliasing demo notebook demonstrates all of the features described on this page.
Two-level architecture#
Aliasing is applied at two levels, corresponding to the two layers of the catalog:
intake.cat.access_nri            ← AliasedDataframeCatalog
│   search(source_id=...)          field aliasing + value aliasing
│   search(model=...)
│
└─ cat["access-esm1-6"]          ← AliasedESMCatalog (per dataset)
      search(variable=...)         value aliasing only
                                   (ESM datastores use native field names)
The top-level catalog (intake.cat.access_nri) supports both
field aliasing (accepting alternative column names) and value aliasing
(accepting alternative search terms). The per-dataset ESM datastores you get
back from it support value aliasing only, since each datastore already has its own
fixed set of field names.
Field aliasing#
Field aliasing applies at the top-level catalog. It lets you use CMIP-style column
names when calling search() on intake.cat.access_nri.
Example
import intake
cat = intake.cat.access_nri
# Using the CMIP-style field name "source_id" instead of ACCESS-NRI's "model"
results = cat.search(source_id="ACCESS-ESM1-5")
# Using "variable_id" instead of "variable"
results = cat.search(variable_id="tas")
When a field alias fires, the library emits a UserWarning so you can see exactly
what mapping was applied. To suppress these warnings, see Controlling alias behaviour below.
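As a rough illustration of the behaviour described above, the sketch below renames aliased keyword arguments and warns about each rename. It is not the library's implementation: FIELD_ALIASES and resolve_field_aliases are hypothetical names, the warning text is made up, and the alias table holds only the two mappings used in the example above.

```python
import warnings

# Hypothetical alias table: only the two mappings shown in the example above.
FIELD_ALIASES = {"source_id": "model", "variable_id": "variable"}


def resolve_field_aliases(query):
    """Return a copy of `query` with aliased field names replaced by canonical ones."""
    resolved = {}
    for field, value in query.items():
        canonical = FIELD_ALIASES.get(field, field)
        if canonical != field:
            # Illustrative message only; the library's wording may differ.
            warnings.warn(
                f"Field aliasing: {field!r} -> {canonical!r}",
                UserWarning,
                stacklevel=2,
            )
        resolved[canonical] = value
    return resolved


# Record warnings so the single rename can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved_query = resolve_field_aliases(
        {"source_id": "ACCESS-ESM1-5", "frequency": "1day"}
    )
```

Only source_id is renamed here; frequency is already a canonical field name, so it passes through without a warning.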
Full field alias reference
The following aliases are accepted at the top-level catalog only. These do not apply when searching inside an individual ESM datastore.
| Alias you can use | Canonical field name | Notes |
|---|---|---|
|  |  | CMIP controlled vocabulary term |
|  |  | CMIP controlled vocabulary term |
|  |  | CMIP controlled vocabulary term |
|  |  | CMIP controlled vocabulary term |
|  |  | CMIP controlled vocabulary term |
|  |  | Short alternative |
|  |  | Short alternative |
Value aliasing#
Value aliasing applies at both the top-level catalog and inside individual ESM datastores. When a value alias fires, the library expands your search to include both your original term and the canonical value, so you never accidentally exclude data that is already stored under the canonical name.
For example:
results = cat.search(frequency="daily")
# Internally becomes: frequency=["1day", "daily"]
results = cat.search(realm="atmosphere")
# Internally becomes: realm=["atmos", "atmosphere"]
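The expansion shown in the comments above can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not the library's code: VALUE_ALIASES and expand_value are hypothetical names, and the table contains only the two mappings quoted above.

```python
# Hypothetical alias tables: only the two mappings quoted in the example above.
VALUE_ALIASES = {
    "frequency": {"daily": "1day"},
    "realm": {"atmosphere": "atmos"},
}


def expand_value(field, value):
    """Expand an aliased term to [canonical, original]; return anything else unchanged."""
    if not isinstance(value, str):
        return value
    canonical = VALUE_ALIASES.get(field, {}).get(value)
    if canonical is None:
        return value
    # Keep both terms so data stored under either name is still matched.
    return sorted({canonical, value})
```

Calling expand_value("frequency", "daily") yields a list containing both "1day" and "daily", while a term that is already canonical, such as "1day", comes back as the same string.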
Value aliases are available for the fields documented below.
Frequency aliases#
These aliases apply to the frequency field.
Frequency alias reference
| Alias | Canonical value |
|---|---|
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
Realm aliases#
These aliases apply to the realm field.
Realm alias reference
| Alias | Canonical value |
|---|---|
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
Model aliases#
These aliases apply to the model field (top-level catalog) and source_id
(ESM datastores).
Model alias reference
| Alias | Canonical value |
|---|---|
|  |  |
Experiment aliases#
These aliases apply to the experiment_id field.
Experiment alias reference
| Alias | Canonical value |
|---|---|
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
Variable aliases — CMIP-to-ACCESS mappings#
The most powerful aliases are the CMIP-to-ACCESS variable mappings. These apply to
the variable field (at both catalog levels) and variable_id (inside ESM
datastores). They allow you to search using CMIP standard variable names and find raw
ACCESS model output stored under ACCESS field codes.
Mappings are loaded automatically from the bundled file
access_nri_intake/data/mappings/access-esm1-6-cmip-mappings.json, which covers
137 variables across the atmosphere, land, and ocean components of
ACCESS-ESM1.6.
# Retrieve an ESM datastore for ACCESS-ESM1-6 data
ds = cat["access-esm1-6"]
# Search using the CMIP name — returns files stored as "fld_s03i236"
ds.search(variable="tas")
# Pass a list of CMIP names
ds.search(variable=["tas", "pr", "ci"])
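To make the mapping step concrete, here is a minimal sketch of loading a CMIP-to-ACCESS lookup from JSON and expanding a list of names. The JSON layout, the access_field key, and both function names are assumptions made for illustration; the real schema of the bundled mapping file may differ. The two entries are the ones quoted on this page.

```python
import json

# Hypothetical JSON layout; the real schema of the bundled mapping file may differ.
SAMPLE_MAPPINGS = """
{
  "tas": {"access_field": "fld_s03i236"},
  "ci": {"access_field": "fld_s05i269"}
}
"""


def load_variable_aliases(text):
    """Build a {CMIP name: ACCESS field} lookup from JSON text."""
    return {name: entry["access_field"] for name, entry in json.loads(text).items()}


def expand_variables(names, aliases):
    """Expand each CMIP name to include its ACCESS field, keeping the original name."""
    expanded = []
    for name in names:
        for candidate in (aliases.get(name), name):
            if candidate is not None and candidate not in expanded:
                expanded.append(candidate)
    return expanded


aliases = load_variable_aliases(SAMPLE_MAPPINGS)
search_terms = expand_variables(["tas", "pr"], aliases)
```

Names without an entry in the lookup (pr in this sample) are kept as they are, so a search for them still matches data already labelled with the CMIP name.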
Representative CMIP variable mapping examples (ACCESS-ESM1.6)
The table below shows a selection of common CMIP variables. The full list of 137
mappings is in
src/access_nri_intake/data/mappings/access-esm1-6-cmip-mappings.json.
| CMIP name | ACCESS field | CF standard name | Component |
|---|---|---|---|
|  |  | air_temperature | atmosphere |
|  |  | precipitation_flux | atmosphere |
|  |  | surface_air_pressure | atmosphere |
|  |  | eastward_wind | atmosphere |
|  |  | northward_wind | atmosphere |
|  |  | convection_time_fraction | atmosphere |
|  |  | cloud_area_fraction | atmosphere |
|  |  | cloud_area_fraction_in_atmosphere_layer | atmosphere |
|  |  | mass_fraction_of_cloud_ice_in_air | atmosphere |
|  |  | surface_upward_latent_heat_flux | atmosphere |
|  |  | surface_upward_sensible_heat_flux | atmosphere |
|  |  | atmosphere |  |
|  |  | land_area_fraction | land |
|  |  | - | ocean |
Controlling alias behaviour#
Alias warnings#
Every time an alias fires, the library emits a UserWarning describing the
mapping that was applied:
import warnings
import intake
cat = intake.cat.access_nri
cat.search(frequency="daily")
# UserWarning: Value aliasing: frequency='daily' → frequency=['1day','daily']
This is intentional — it keeps searches transparent and reproducible. You can suppress
these warnings by passing show_warnings=False when constructing the catalog wrapper,
or by using standard Python warning filters:
# Suppress using Python's warnings module
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    results = cat.search(frequency="daily")
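If silencing every UserWarning is too broad, the standard library also allows filtering on the message text. The sketch below assumes the warning begins with "Value aliasing", as in the example output above; fake_search is a hypothetical stand-in for cat.search() so the snippet runs without the catalog.

```python
import warnings


def fake_search():
    """Stand-in for cat.search(): emits an alias warning and an unrelated one."""
    warnings.warn("Value aliasing: frequency='daily'", UserWarning)
    warnings.warn("something unrelated", UserWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # `message` is a regex matched against the start of the warning text.
    warnings.filterwarnings("ignore", message=r"Value aliasing", category=UserWarning)
    fake_search()

# Only warnings that did not match the filter are recorded.
remaining = [str(w.message) for w in caught]
```

Here the alias warning is dropped while the unrelated warning is still recorded, so other diagnostics remain visible.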
Escaping aliasing with .unwrap()#
Both the top-level catalog wrapper and the per-dataset ESM datastore wrapper expose an
unwrap() method that returns the underlying, unaliased catalog object:
# Get the raw DfFileCatalog (no aliasing)
raw_cat = cat.unwrap()
# Get the raw esm_datastore (no aliasing)
ds = cat["access-esm1-6"]
raw_ds = ds.unwrap()
This is useful when you want to call catalog methods that are not supported by the alias wrapper, or when you need the exact type expected by another library.
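The wrapper-plus-unwrap() pattern can be pictured with the toy classes below. They are hypothetical stand-ins written for this page, not the actual AliasedDataframeCatalog or DfFileCatalog, and the single hard-coded alias is only there to show the difference between the two objects.

```python
class RawCatalog:
    """Stand-in for the underlying, unaliased catalog object."""

    def search(self, **query):
        # Echo the query so the effect of aliasing is easy to inspect.
        return query


class AliasedWrapper:
    """Hypothetical wrapper that rewrites search terms before delegating."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def search(self, **query):
        # One hard-coded value alias, taken from the examples on this page.
        if query.get("frequency") == "daily":
            query["frequency"] = ["1day", "daily"]
        return self._wrapped.search(**query)

    def unwrap(self):
        """Return the underlying object, bypassing all aliasing."""
        return self._wrapped


raw = RawCatalog()
wrapped = AliasedWrapper(raw)
aliased_query = wrapped.search(frequency="daily")
raw_query = wrapped.unwrap().search(frequency="daily")
```

Searching through the wrapper expands the term, whereas searching on the unwrapped object sends "daily" through untouched.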
Regex and other non-string values#
Value aliasing applies only to plain strings and to lists, tuples, or sets of plain
strings. A regex pattern (e.g. "ci|cl|tas"), an integer, or any other value that is
not a plain string is passed through to the underlying catalog unchanged:
# Regex: passed through unchanged — no aliasing applied
ds.search(variable="ci|cl|tas")
# Plain string: aliases applied
ds.search(variable="ci") # → ["fld_s05i269", "ci"]
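One way to picture the pass-through rule is a type check followed by a test for regex metacharacters. The heuristic below is an assumption made for illustration; how the library actually distinguishes a regex from a plain string is not documented here, and REGEX_CHARS, is_plain_string, and expand_variable are hypothetical names.

```python
# Assumed heuristic, not the library's rule: these characters mark a regex.
REGEX_CHARS = set("|*+?()[]{}^$\\")


def is_plain_string(value):
    """True for strings that contain none of the assumed regex metacharacters."""
    return isinstance(value, str) and not (REGEX_CHARS & set(value))


def expand_variable(value, aliases):
    """Expand plain strings (or collections of them); pass anything else through."""
    if is_plain_string(value):
        canonical = aliases.get(value)
        return [canonical, value] if canonical else value
    if isinstance(value, (list, tuple, set)) and all(is_plain_string(v) for v in value):
        expanded = []
        for v in value:
            for candidate in (aliases.get(v), v):
                if candidate and candidate not in expanded:
                    expanded.append(candidate)
        return expanded
    # Regex strings, integers, and other types are returned untouched.
    return value


aliases = {"ci": "fld_s05i269"}  # the single mapping quoted above
```

With this lookup, expand_variable("ci", aliases) returns both the ACCESS field and the CMIP name, while "ci|cl|tas" and non-string values come back exactly as they were passed in.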