Utility Types in MatNWB

Note

API documentation for “untyped” types will be added in the future.

“Untyped” utility types are classes that are not defined by the NWB schema but are used alongside NWB data classes to provide additional functionality when reading or writing NWB files with MatNWB. These types are located in the +types/+untyped/ namespace within the MatNWB root directory. This section describes the following untyped types: Sets, DataStubs, and DataPipes.

Sets

The Set class (types.untyped.Set) stores a dynamic collection of NWB-typed objects. Some NWB data types hold other data types as property values, and a Set acts as the container for such members. A Set can enforce constraints on its members, for example by restricting them to specified data types. For this reason, it is also referred to as a constrained set.

Data objects are added to a Set as name-value pairs using the add method:

aTimeSeries = types.core.TimeSeries('data', rand(1,10));
someSet = types.untyped.Set();
someSet.add('my timeseries', aTimeSeries)

The example above creates a new Set with one entry:

>> someSet

someSet =

  Set with entries:

    myTimeseries: types.core.TimeSeries

The data object (TimeSeries) is added as a dynamic property on the Set object. Because MATLAB does not support whitespace or special characters in property names, the name is remapped to a valid MATLAB identifier.
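
The remapped name shown above matches what MATLAB's built-in matlab.lang.makeValidName produces for the same input. Whether MatNWB calls this function internally is an assumption. The sketch below only illustrates the kind of transformation involved and needs no MatNWB installation:

```matlab
% Illustration only: MATLAB's built-in name sanitizer, assumed (not confirmed)
% to resemble the remapping MatNWB applies to Set entry names.
originalName = 'my timeseries';
validName = matlab.lang.makeValidName(originalName);

disp(validName)                  % myTimeseries
assert(isvarname(validName))     % the result is a valid MATLAB identifier
```

Whitespace is removed and the letter that follows it is capitalized, which is why two different original names can end up looking similar once remapped.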

Note

The name provided when adding a data object to a Set is preserved in the NWB file. Tools like PyNWB or other NWB/HDF5 readers will display this original name—for example, 'my timeseries'. In MatNWB, we recommend using a consistent naming style that is valid in MATLAB (e.g., PascalCase) to avoid naming ambiguities.

To retrieve the value, refer to the property directly:

timeSeriesCopy = someSet.myTimeseries

Supporting legacy syntax

MatNWB also supports legacy syntax for setting and retrieving items in a Set:

value = someSet.get('key name');
someSet.set('key name', value);

Note

The Set class also supports the keys and values methods, similar to containers.Map, for retrieving cell arrays of keys or values.
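
As a minimal sketch of the keys and values methods mentioned above, the following loop lists each entry of a Set with the class of its value. It assumes MatNWB is installed and on the MATLAB path. The variable names are illustrative.

```matlab
% Assumes MatNWB is installed and on the MATLAB path.
aTimeSeries = types.core.TimeSeries('data', rand(1,10));
someSet = types.untyped.Set();
someSet.add('my timeseries', aTimeSeries);

entryNames = someSet.keys();      % cell array of entry names
entryValues = someSet.values();   % cell array of the stored objects

% Print each entry name next to the class of the object it refers to
for i = 1:numel(entryNames)
    fprintf('%s: %s\n', entryNames{i}, class(entryValues{i}));
end
```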

DataStubs and DataPipes

When working with NWB files, datasets can be very large (gigabytes or more). Loading all this data into memory at once would be impractical or impossible. MatNWB uses two types to handle on-disk data efficiently: DataStubs and DataPipes.

A DataStub (types.untyped.DataStub) represents a read-only reference to data stored in an NWB file. When you read an NWB file, non-scalar and multi-dimensional datasets are automatically represented as DataStubs rather than loaded into memory.

Figure (matnwb_datastub.png): https://github.com/NeurodataWithoutBorders/nwb-overview/blob/main/docs/source/img/matnwb_datastub.png?raw=true

Key characteristics:

  • Lazy loading: Data remains on disk until you explicitly access it

  • Memory efficient: Only the portions you request are loaded

  • MATLAB-style indexing: Access data using familiar syntax like dataStub(1:100, :)

  • Read-only: Cannot be used to modify or write data

You’ll encounter DataStubs whenever you read existing NWB files containing non-scalar or multi-dimensional datasets.
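
The following sketch shows one way a DataStub might be encountered and indexed. It assumes MatNWB is installed and on the MATLAB path. The file name, the entry name ExampleSeries, and the array size are hypothetical choices made for this example. The file is written first so that the snippet is self-contained.

```matlab
% Assumes MatNWB is installed and on the path. File and entry names are
% hypothetical placeholders chosen for this sketch.
nwb = NwbFile( ...
    'session_description', 'DataStub demo', ...
    'identifier', 'datastub-demo-001', ...
    'session_start_time', datetime(2024, 1, 1, 'TimeZone', 'UTC'));
nwb.acquisition.set('ExampleSeries', types.core.TimeSeries( ...
    'data', rand(200, 3), ...
    'data_unit', 'n/a', ...
    'starting_time', 0, ...
    'starting_time_rate', 1));
nwbExport(nwb, 'datastub_demo.nwb');

% Reading the file back: the non-scalar dataset stays on disk
nwbIn = nwbRead('datastub_demo.nwb');
dataStub = nwbIn.acquisition.get('ExampleSeries').data;
disp(class(dataStub))             % a DataStub, per the description above

firstRows = dataStub(1:100, :);   % reads only the requested rows from disk
allData = dataStub.load();        % reads the entire dataset into memory
```

Indexing is usually preferable for large datasets. Calling load() without arguments is only sensible when the whole dataset fits in memory.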

A DataPipe (types.untyped.DataPipe) extends the concept of lazy data access to support writing as well as reading. While DataStubs are created automatically when reading files, you create DataPipes explicitly when writing data.

Key characteristics:

  • Bidirectional: Supports both reading and writing operations

  • Incremental writing: Stream data to disk in chunks rather than all at once

  • Compression support: Apply HDF5 compression and chunking strategies

  • Write optimization: Configure how data is stored on disk for better performance

DataPipes are useful when a dataset is too large to fit in memory, or when you want fine-grained control over how data is stored in the HDF5 file.
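
As a hedged sketch of how a DataPipe might be configured, the example below wraps an initial block of data in a DataPipe with chunking and compression, then assigns it to a TimeSeries. It assumes MatNWB is installed and on the MATLAB path. The sizes, chunk shape, and compression level are illustrative choices, not recommendations.

```matlab
% Assumes MatNWB is on the path. Sizes and settings are illustrative only.
initialData = rand(1000, 4);

dataPipe = types.untyped.DataPipe( ...
    'data', initialData, ...
    'maxSize', [Inf, 4], ...        % allow unlimited growth along dimension 1
    'axis', 1, ...                  % dimension along which data is appended
    'chunkSize', [1000, 4], ...     % HDF5 chunk shape
    'compressionLevel', 3);         % compression level for the stored dataset

% The DataPipe takes the place of a plain array as the dataset value
timeSeries = types.core.TimeSeries( ...
    'data', dataPipe, ...
    'data_unit', 'n/a', ...
    'starting_time', 0, ...
    'starting_time_rate', 1);
```

The chunk shape and compression level trade file size against read and write speed, so suitable values depend on how the data will later be accessed. Once the containing file has been exported, more data can typically be streamed in with the DataPipe's append method. Consult the MatNWB DataPipe tutorial for the exact workflow.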

See also