Using the BigEarthNet v1.0 LMDB Reader#

This section shows an example of using the BigEarthNet v1.0 reader. The reader wraps the Lightning Memory-Mapped Database (LMDB) that stores the dataset in the background and exposes it as an indexable Python object.

Note

Because the reader uses LMDB, which cannot be pickled, it is not thread-safe to use this object after it has been accessed for the first time. Accessing it only after forking is supported, for example inside the __getitem__ method of a PyTorch dataset.
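
One way to follow this advice is to defer construction of the reader until the first item is requested, so that each worker process opens the database itself. The sketch below is a minimal illustration under that assumption. `LazyReaderDataset` and `reader_factory` are hypothetical names that are not part of ConfigILM, and in a real project the class would typically subclass `torch.utils.data.Dataset`.

```python
from typing import Any, Callable, Optional, Sequence


class LazyReaderDataset:
    """Map-style dataset that opens its reader on first access (hypothetical helper)."""

    def __init__(self, keys: Sequence[str], reader_factory: Callable[[], Any]):
        self.keys = list(keys)
        # Store only the factory; the reader itself is created later so that
        # nothing un-picklable lives on the object before workers are forked.
        self._reader_factory = reader_factory
        self._reader: Optional[Any] = None

    def __len__(self) -> int:
        return len(self.keys)

    def __getitem__(self, idx: int):
        if self._reader is None:
            # First access in this process: open the reader now.
            self._reader = self._reader_factory()
        return self._reader[self.keys[idx]]
```

Here `reader_factory` could be a lambda or `functools.partial` that calls the reader's constructor with the arguments described in this section.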

To use the reader, we create a BENv1LMDBReader object. It takes four arguments: the directory containing the LMDB file as a string, a sequence of three ints giving the desired image size (channels, height, width), the bands to use, and the label type.

from configilm.extra.BENv1_utils import BENv1LMDBReader

BEN_reader = BENv1LMDBReader(
        lmdb_dir=my_data_path,  # path to dataset
        image_size=(3, 120, 120),
        bands="RGB",
        label_type="old",
    )
img, lbl = BEN_reader["S2B_MSIL2A_20180502T093039_82_40"]

We expect this reader to return RGB images of size 3x120x120, annotated with the “old” 43-label version. Images are returned as torch tensors and labels as a list of strings.

Size: torch.Size([3, 120, 120])
Labels:
['Complex cultivation patterns',
 'Land principally occupied by agriculture, with significant areas of natural '
 'vegetation',
 'Broad-leaved forest',
 'Transitional woodland/shrub',
 'Water bodies']

Selecting Bands#

If we are interested in the vegetation index, for example, we can create a reader that returns only the bands B08 and B04, which are the two bands the index needs.

The vegetation index is defined as \( VI = \frac{B08 - B04}{B08 + B04} \)

BEN_reader = BENv1LMDBReader(
        lmdb_dir=my_data_path,  # path to dataset
        image_size=(2, 120, 120),
        bands=["B08", "B04"],
        label_type="old",
    )
img, lbl = BEN_reader["S2B_MSIL2A_20180502T093039_82_40"]
veg_idx = (img[0]-img[1])/(img[0]+img[1])

The images returned by this reader have band B08 at index 0 and band B04 at index 1 of the channel dimension, following the order given in the bands parameter. Note that the channel count in the image size must also be set accordingly, i.e. (2, ...), because it is used to check the size after interpolation. The reader already applies interpolation using torch.nn.functional.interpolate() in bicubic mode with aligned corners.
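
The plain expression `(img[0]-img[1])/(img[0]+img[1])` divides by zero wherever both bands are zero. The following is a minimal sketch of a guarded variant, written with NumPy to keep it dependency-light. `vegetation_index` and `eps` are hypothetical names, and mapping undefined pixels to 0 is an assumption made for this illustration, not something the reader does.

```python
import numpy as np


def vegetation_index(nir: np.ndarray, red: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Compute (nir - red) / (nir + red), returning 0 where the sum is ~0."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    denom = nir + red
    undefined = np.abs(denom) < eps
    # Replace near-zero denominators by 1 to avoid division warnings; those
    # pixels are set to 0 afterwards.
    safe = np.where(undefined, 1.0, denom)
    vi = (nir - red) / safe
    return np.where(undefined, 0.0, vi)


# Toy 2x2 bands standing in for img[0] (B08) and img[1] (B04).
b08 = np.array([[3000.0, 0.0], [1000.0, 500.0]])
b04 = np.array([[1000.0, 0.0], [1000.0, 1500.0]])
vi = vegetation_index(b08, b04)
```

For inputs without undefined pixels the result equals the unguarded formula, so values stay in the range from -1 to 1 for non-negative band values.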

For ease of use, some predefined configurations are available, so that the bands they contain do not have to be listed individually. The available predefined configurations and their respective bands are:

Note

Not all configurations are necessarily supported by all implementations in ConfigILM.

    'S1': ['VH', 'VV']
    'S2': ['B02', 'B03', 'B04', 'B08', 'B05', 'B06', 'B07', 'B11', 'B12', 'B8A']
'10m20m': ['B02', 'B03', 'B04', 'B08', 'B05', 'B06', 'B07', 'B11', 'B12', 'B8A', 'VH', 'VV']
   'RGB': ['B04', 'B03', 'B02']
'RGB-IR': ['B04', 'B03', 'B02', 'B08']
      2 : ['VH', 'VV']
     10 : ['B02', 'B03', 'B04', 'B08', 'B05', 'B06', 'B07', 'B11', 'B12', 'B8A']
     12 : ['B02', 'B03', 'B04', 'B08', 'B05', 'B06', 'B07', 'B11', 'B12', 'B8A', 'VH', 'VV']
      3 : ['B04', 'B03', 'B02']
      4 : ['B04', 'B03', 'B02', 'B08']
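
To illustrate how a name or number from this list maps to an explicit band list, here is a small sketch of such a lookup. The dictionary reproduces the list above; `BAND_COMBINATIONS` and `resolve_bands` are hypothetical names for this example and do not claim to mirror how ConfigILM implements the lookup internally.

```python
from typing import List, Sequence, Union

# Predefined configurations as listed above.
_S1 = ["VH", "VV"]
_S2 = ["B02", "B03", "B04", "B08", "B05", "B06", "B07", "B11", "B12", "B8A"]
_RGB = ["B04", "B03", "B02"]

BAND_COMBINATIONS = {
    "S1": _S1,
    "S2": _S2,
    "10m20m": _S2 + _S1,
    "RGB": _RGB,
    "RGB-IR": _RGB + ["B08"],
    2: _S1,
    10: _S2,
    12: _S2 + _S1,
    3: _RGB,
    4: _RGB + ["B08"],
}


def resolve_bands(bands: Union[str, int, Sequence[str]]) -> List[str]:
    """Turn a configuration name/number or an explicit list into a band list."""
    if isinstance(bands, (str, int)):
        # Unknown names or numbers raise a KeyError.
        return list(BAND_COMBINATIONS[bands])
    return list(bands)
```

The length of the resolved list is the channel count that the first entry of `image_size` has to match, for example 4 for `"RGB-IR"`.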

Label types#

We can also request the labels in the “new” 19-label version introduced in Sumbul et al. [7]. Here we see that the label ‘Water bodies’ is converted into ‘Inland waters’, as expected.

from pprint import pprint

BEN_reader = BENv1LMDBReader(
        lmdb_dir=my_data_path,  # path to dataset
        image_size=(3, 120, 120),
        bands="RGB",
        label_type="new",
    )
img, lbl = BEN_reader["S2B_MSIL2A_20180502T093039_82_40"]
pprint(lbl)
['Complex cultivation patterns',
 'Land principally occupied by agriculture, with significant areas of natural '
 'vegetation',
 'Broad-leaved forest',
 'Transitional woodland, shrub',
 'Inland waters']

If desired, a 19-label list can also be converted into a 19-dimensional one-hot tensor that contains a 1 for every label present. Using this function guarantees a uniform conversion: each label always maps to the same position in the vector, regardless of who performs the conversion.

from configilm.extra.BENv1_utils import ben19_list_to_onehot
ben19_list_to_onehot(lbl)
tensor([0., 0., 0., 1., 0., 1., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 1.,
        0.])
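
The idea behind such a uniform conversion can be sketched in a few lines: fix one class order once and derive every vector from it, so the input order of the labels does not matter. The example below uses plain Python lists and a toy class list with only four of the 19 classes. `labels_to_multi_hot` is a hypothetical helper, and the alphabetical ordering is an assumption for this illustration, not a statement about `ben19_list_to_onehot`.

```python
from typing import List, Sequence


def labels_to_multi_hot(labels: Sequence[str], classes: Sequence[str]) -> List[float]:
    """Return a vector with 1.0 at the position of every label that is present."""
    order = sorted(classes)  # fixed order (alphabetical is an assumption)
    index = {name: i for i, name in enumerate(order)}
    vec = [0.0] * len(order)
    for name in labels:
        vec[index[name]] = 1.0  # a KeyError signals an unknown label
    return vec


# Toy class list with only four of the 19 classes, for illustration.
toy_classes = [
    "Inland waters",
    "Broad-leaved forest",
    "Complex cultivation patterns",
    "Transitional woodland, shrub",
]
a = labels_to_multi_hot(["Inland waters", "Broad-leaved forest"], toy_classes)
b = labels_to_multi_hot(["Broad-leaved forest", "Inland waters"], toy_classes)
```

Both calls produce the same vector, because the position of each class depends only on the fixed class order.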

Mean and Standard Deviation#

Each reader object also provides the per-band mean and standard deviation for the chosen band configuration. They are set during initialization and are available as the mean and std attributes.

BEN_reader_1 = BENv1LMDBReader(
        lmdb_dir=my_data_path,  # path to dataset
        image_size=(2, 120, 120),
        bands=["B08", "B04"],
        label_type="old",
    )
print(f"Mean 1: {BEN_reader_1.mean}")
print(f" Std 1: {BEN_reader_1.std}")

BEN_reader_2 = BENv1LMDBReader(
        lmdb_dir=my_data_path,  # path to dataset
        image_size=(3, 120, 120),
        bands=["B04", "B01", "B8A"],
        label_type="old",
    )
print(f"Mean 2: {BEN_reader_2.mean}")
print(f" Std 2: {BEN_reader_2.std}")
Mean 1: [2218.94553375, 590.23569706]
 Std 1: [1365.45589904, 675.88746967]
Mean 2: [590.23569706, 340.76769064, 2266.46036911]
 Std 2: [675.88746967, 554.81258967, 1356.13789355]
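
A typical use of these statistics is per-channel normalization of an image. The sketch below shows the broadcasting involved, using NumPy and the values printed for the first reader. `normalize` is a hypothetical helper, and treating the statistics as one value per channel in band order is an assumption based on the output above.

```python
import numpy as np

# Values printed above for bands ["B08", "B04"].
mean = np.array([2218.94553375, 590.23569706])
std = np.array([1365.45589904, 675.88746967])


def normalize(img: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Normalize a (C, H, W) image with one mean/std value per channel."""
    # Reshape (C,) to (C, 1, 1) so the statistics broadcast over H and W.
    return (img - mean[:, None, None]) / std[:, None, None]


# Toy image: channel 0 equals its mean, channel 1 is one std above its mean.
img = np.stack([
    np.full((120, 120), mean[0]),
    np.full((120, 120), mean[1] + std[1]),
])
out = normalize(img, mean, std)
```

In this toy example the first channel normalizes to 0 everywhere and the second to 1. The same indexing pattern also works for torch tensors once the statistics are converted to a tensor.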