Fields
A ZnTrack field is used to define inputs and outputs for a Node on the class level. The following fields are available:
- zntrack.params(default=<dataclasses._MISSING_TYPE object>, **kwargs)
ZnTrack parameter field.
A field to define a parameter for a ZnTrack node.
- Parameters:
default (dict|int|str|float|list|None, optional) – Set a default parameter value.
default_factory (callable, optional) – A callable that returns the default value. Should be used instead of default if the default value is mutable.
Examples
>>> import zntrack
>>> class MyNode(zntrack.Node):
...     param: int = zntrack.params(default=42)
...
...     def run(self) -> None:
...         ...
>>> a = MyNode()
>>> a.param
42
>>> b = MyNode(param=43)
>>> b.param
43
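The difference between `default` and `default_factory` can be sketched with the standard library alone. The example below uses plain `dataclasses.field`, whose `MISSING` sentinel appears in the signature above. It assumes, without verifying, that `zntrack.params` forwards both arguments with the same semantics. The class name `Settings` is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    # Immutable default: safe to share between instances.
    steps: int = 42
    # Mutable default: the factory builds a fresh list for every instance.
    layers: list = field(default_factory=lambda: [16, 16])


a = Settings()
b = Settings()
a.layers.append(32)  # only changes the list owned by `a`
```

Because each instance gets its own list, `b.layers` is unaffected by the change to `a.layers`. Note that `dataclasses` rejects a bare `list` default with a `ValueError` for this reason.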
- zntrack.deps(default=<dataclasses._MISSING_TYPE object>, **kwargs)
Define dependencies for a node.
A Node dependency field can be used to pass data from one node to another. It cannot be used to pass anything that is not a zntrack.Node or a dataclass.
- Parameters:
default (Any, optional) – Should not be set on the class level.
Examples
>>> import zntrack
>>> class MyFirstNode(zntrack.Node):
...     outs: int = zntrack.outs()
...
...     def run(self) -> None:
...         self.outs = 42
...
>>> class MySecondNode(zntrack.Node):
...     deps: int = zntrack.deps()
...
...     def run(self) -> None:
...         ...
>>> with zntrack.Project() as project:
...     node1 = MyFirstNode()
...     node2 = MySecondNode(deps=node1.outs)
>>> project.build()
- zntrack.outs(*, cache: bool = True, independent: bool = False, **kwargs)
Define output for a node.
An output can be anything that can be pickled.
- Parameters:
cache (bool, optional) – Set to True to use the DVC cache for the field. Default is True.
independent (bool, optional) – Whether the output is independent of the node’s inputs. Default is False.
Examples
>>> import zntrack
>>> class MyNode(zntrack.Node):
...     outs: int = zntrack.outs()
...
...     def run(self) -> None:
...         '''Save output to self.outs.'''
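Since an output must be picklable, it can help to check a value before assigning it to an `outs` field. The following is a minimal sketch using only the standard library. The helper name `is_picklable` is hypothetical and not part of ZnTrack.

```python
import pickle


def is_picklable(value) -> bool:
    """Return True if `value` can be serialized with pickle.dumps."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain containers of numbers and strings pickle without issue.
ok = is_picklable({"energies": [0.1, 0.2], "label": "run-1"})
# Lambdas are pickled by qualified name, which cannot be resolved.
bad = is_picklable(lambda x: x)
```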
- zntrack.metrics(*, cache: bool | None = None, independent: bool = False, **kwargs)
Define metrics for a node.
The metrics must be a dictionary that can be serialized to JSON.
- Parameters:
cache (bool, optional) – Set to True to use the DVC cache for the field. If None (the default), zntrack.config.ALWAYS_CACHE is used.
independent (bool, optional) – Whether the output is independent of the node’s inputs. Default is False.
Examples
>>> import zntrack
>>> class MyNode(zntrack.Node):
...     metrics: dict = zntrack.metrics()
...
...     def run(self) -> None:
...         '''Save metrics to self.metrics.'''
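The JSON requirement on metrics can be checked up front with the standard `json` module. This is only an illustrative sketch: the helper name `is_json_serializable` is hypothetical, and ZnTrack may serialize metrics differently internally.

```python
import json


def is_json_serializable(metrics: dict) -> bool:
    """Return True if `metrics` can be encoded by json.dumps."""
    try:
        json.dumps(metrics)
    except (TypeError, ValueError):
        return False
    return True


# Numbers, strings, lists and nested dicts are valid JSON values.
good = is_json_serializable({"loss": 0.12, "accuracy": 0.97})
# A set has no JSON representation, so encoding raises TypeError.
bad = is_json_serializable({"classes": {1, 2, 3}})
```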
- zntrack.plots(*, y: str | list[str] | None = None, cache: bool = True, independent: bool = False, x: str = 'step', x_label: str | None = None, y_label: str | None = None, template: str | None = None, title: str | None = None, autosave: bool = False, **kwargs)
Pandas plot options.
- Parameters:
y (str | list[str]) – Column name(s) to plot.
cache (bool, optional) – Use the DVC cache, by default True.
independent (bool, optional) – This field’s output can be independent of the input to the node. If set to True, the entire Node output will be used for dependencies. This can be useful if the output is, e.g., a list of indices.
x (str, optional) – Column name to use for the x-axis, by default “step”.
x_label (str, optional) – Label for the x-axis, by default None.
y_label (str, optional) – Label for the y-axis, by default None.
template (str, optional) – Plotly template to use, by default None.
title (str, optional) – Title of the plot, by default None.
autosave (bool, optional) – Save the data of this field every time it is being updated. Disable for large dataframes.
Examples
>>> import zntrack
>>> import pandas as pd
>>> class MyNode(zntrack.Node):
...     plots: pd.DataFrame = zntrack.plots(y="loss")
...
...     def run(self):
...         self.plots = pd.DataFrame({"loss": [1, 2, 3]})
- zntrack.params_path(default: str | pathlib.Path | typing.List[str | pathlib.Path] | dataclasses._MISSING_TYPE = <dataclasses._MISSING_TYPE object>, **kwargs)
Define input parameter file path(s).
- Parameters:
default (str|Path|list[str|Path], optional) – Path to one or multiple parameter files.
Examples
>>> import zntrack
>>> class MyNode(zntrack.Node):
...     params_path: str = zntrack.params_path(default="params.yaml")
...
...     def run(self) -> None:
...         ...
>>> a = MyNode()
>>> a.params_path
'params.yaml'
>>> b = MyNode(params_path="params2.yaml")
>>> b.params_path
'params2.yaml'
- zntrack.deps_path(default: str | pathlib.Path | typing.List[str | pathlib.Path] | dataclasses._MISSING_TYPE = <dataclasses._MISSING_TYPE object>, *, cache: bool = True, **kwargs)
Define dependency file path(s) for a node.
- Parameters:
default (str|Path|list[str|Path], optional) – Path to one or multiple dependency files.
cache (bool, optional) – Whether to use the DVC cache for the field. Default is True.
Examples
>>> import zntrack
>>> class MyNode(zntrack.Node):
...     dependencies: list[str] = zntrack.deps_path()
...
...     def run(self) -> None:
...         ...
...
>>> a = MyNode(dependencies=["file1.txt", "file2.txt"])
- zntrack.outs_path(default: str | pathlib.Path | typing.List[str | pathlib.Path] | dataclasses._MISSING_TYPE = <dataclasses._MISSING_TYPE object>, *, cache: bool = True, independent: bool = False, **kwargs)
Define output file path(s) for a node.
- Parameters:
default (str|Path|list[str|Path], optional) – Default path(s) to output files.
cache (bool, optional) – Whether to use the DVC cache for the field. Default is True.
independent (bool, optional) – Whether the output is independent of the node’s inputs. Default is False.
Examples
>>> import zntrack
>>> from pathlib import Path
>>> class MyNode(zntrack.Node):
...     outs_path: Path = zntrack.outs_path(zntrack.nwd / "output.txt")
...
...     def run(self) -> None:
...         '''Save output to self.outs_path.'''
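What "save output to `self.outs_path`" might look like inside `run` can be sketched with `pathlib` alone. The temporary directory below is only a stand-in for the node working directory, and the helper `save_output` is hypothetical. Creating the parent directory explicitly is a defensive assumption, since this page does not say whether ZnTrack creates it.

```python
import tempfile
from pathlib import Path


def save_output(path: Path, text: str) -> None:
    """Write `text` to `path`, creating missing parent directories first."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)


# Stand-in for the node working directory; not a ZnTrack object.
with tempfile.TemporaryDirectory() as tmp:
    outs_path = Path(tmp) / "nodes" / "MyNode" / "output.txt"
    save_output(outs_path, "Hello, ZnTrack!")
    written = outs_path.read_text()
```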
- zntrack.metrics_path(default: str | pathlib.Path | typing.List[str | pathlib.Path] | dataclasses._MISSING_TYPE = <dataclasses._MISSING_TYPE object>, *, cache: bool | None = None, independent: bool = False, **kwargs)
Define metrics file path(s) for a node.
- Parameters:
default (str|Path|list[str|Path], optional) – Path to one or multiple metrics files.
cache (bool, optional) – Whether to use the DVC cache for the field. If None, defaults to zntrack.config.ALWAYS_CACHE.
independent (bool, optional) – Whether the output is independent of the node’s inputs. Default is False.
Examples
>>> import zntrack
>>> from pathlib import Path
>>> class MyNode(zntrack.Node):
...     metrics_path: Path = zntrack.metrics_path(zntrack.nwd / "metrics.json")
...
...     def run(self) -> None:
...         '''Save metrics to self.metrics_path.'''
- zntrack.plots_path(default: str | pathlib.Path | typing.List[str | pathlib.Path] | dataclasses._MISSING_TYPE = <dataclasses._MISSING_TYPE object>, *, cache: bool = True, independent: bool = False, **kwargs)
Create a field that handles plots and figure paths.
- Parameters:
default (str|Path|list[str|Path], optional) – Path to one or multiple plot files. See https://dvc.org/doc/user-guide/experiment-management/visualizing-plots for more information.
cache (bool, optional) – Whether to use the DVC cache for the field. Default is True.
independent (bool, optional) – Set to true if the output of this field can be independent of the node’s inputs. E.g. if a csv file is produced that contains indices it might not change if the inputs to the node change. In such a case subsequent nodes might not rerun if independent is kept as False.
Examples
>>> import zntrack
>>> from pathlib import Path
>>> class MyNode(zntrack.Node):
...     plots_path: Path = zntrack.plots_path(zntrack.nwd / "plots.png")
...
...     def run(self) -> None:
...         '''Save a figure to self.plots_path.'''
It is possible to define custom fields by using zntrack.field().
- zntrack.field(default=<dataclasses._MISSING_TYPE object>, *, cache: bool = True, independent: bool = False, field_type: zntrack.config.FieldTypes, dump_fn: typing.Callable[[zntrack.node.Node, str, str], typing.Any] | typing.Callable[[zntrack.node.Node, str], typing.Any] | None = None, suffix: str | None = None, load_fn: typing.Callable[[zntrack.node.Node, str, str], typing.Any] | typing.Callable[[zntrack.node.Node, str], typing.Any] | None = None, **kwargs)
Create a custom field.
- Parameters:
default (t.Any) – Default value of the field. For an output field, this should be zntrack.NOT_AVAILABLE and should not be set during initialization.
cache (bool) – Use the DVC cache for the field.
independent (bool) – If the output of this field can be independent of the input.
field_type (FieldTypes) – The type of the field.
dump_fn (FN_WITH_SUFFIX | FN_WITHOUT_SUFFIX) – Function to dump the field.
suffix (str) – Suffix to append to the field name. Can be None if the output is a directory.
load_fn (FN_WITHOUT_SUFFIX | FN_WITH_SUFFIX) – Function to load the field.
**kwargs – Additional arguments to pass to the field.
Examples
>>> import numpy as np
>>> import zntrack
>>> def _load_fn(self: zntrack.Node, name: str, suffix: str) -> np.ndarray:
...     with self.state.fs.open(
...         (self.nwd / name).with_suffix(suffix), mode="rb"
...     ) as f:
...         return np.load(f)
...
>>> def _dump_fn(self: zntrack.Node, name: str, suffix: str) -> None:
...     with open((self.nwd / name).with_suffix(suffix), mode="wb") as f:
...         np.save(f, getattr(self, name))
...
>>> def numpy_field(*, cache: bool = True, independent: bool = False, **kwargs):
...     return zntrack.field(
...         default=zntrack.NOT_AVAILABLE,
...         cache=cache,
...         independent=independent,
...         field_type=zntrack.FieldTypes.OUTS,
...         dump_fn=_dump_fn,
...         suffix=".npy",
...         load_fn=_load_fn,
...         **kwargs,
...     )
...
>>> class MyNode(zntrack.Node):
...     data: np.ndarray = numpy_field()
...
...     def run(self) -> None:
...         self.data = np.arange(9).reshape(3, 3)
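The `dump_fn` / `load_fn` pair must round-trip: whatever `dump_fn` writes under `nwd / name` with the given suffix, `load_fn` must be able to read back. The sketch below exercises that contract in isolation with JSON instead of NumPy. `FakeNode` is a hypothetical stand-in that only mimics the `nwd` attribute; it is not a `zntrack.Node`, and it reads with the built-in `open` rather than `self.state.fs.open`.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in exposing only `nwd` and one data attribute."""

    nwd: Path
    data: Optional[dict] = None


def dump_json(node, name: str, suffix: str) -> None:
    # Write the attribute called `name` to <nwd>/<name><suffix>.
    path = (node.nwd / name).with_suffix(suffix)
    with open(path, mode="w") as f:
        json.dump(getattr(node, name), f)


def load_json(node, name: str, suffix: str) -> dict:
    # Read the value back from the same location used by dump_json.
    path = (node.nwd / name).with_suffix(suffix)
    with open(path, mode="r") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    node = FakeNode(nwd=Path(tmp), data={"values": [1, 2, 3]})
    dump_json(node, "data", ".json")
    restored = load_json(node, "data", ".json")
```

Testing the two functions against a stand-in like this keeps serialization bugs separate from pipeline issues before the pair is passed to `zntrack.field`.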