torchjpeg.data
Dataset and Data Processing
-
class torchjpeg.data.ImageList
Bases: object
Structure that holds a list of images (of possibly varying sizes) as a single tensor. This works by padding the images to the same size and storing the original size of each image in a field.
Note
This class was taken from detectron2 (https://github.com/facebookresearch/detectron2/blob/master/detectron2/structures/image_list.py) with a small modification to the padding function that preserves gradients and some small fixes for linting and type checking. Otherwise the class and its documentation are unchanged.
-
property device
Implements the device API for ImageLists by returning the underlying device.
- Returns:
The device that the underlying tensor storage resides on
- Return type:
torch.device
-
static from_tensors()
- Parameters:
tensors – a tuple or list of torch.Tensors, each of shape (Hi, Wi) or (C_1, …, C_K, Hi, Wi) where K >= 1. The Tensors will be padded to the same shape with pad_value.
size_divisibility (int) – If size_divisibility > 0, add padding to ensure the common height and width are divisible by size_divisibility. The required value depends on the model; many models need a divisibility of 32.
pad_value (float) – The value to pad with.
- Returns:
an ImageList.
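The effect of size_divisibility on the common padded shape can be sketched with plain integer arithmetic. This is an illustration only, not the library's code: the helper name padded_hw is hypothetical, and it assumes the common shape is the per-dimension maximum rounded up to the next multiple of size_divisibility.

```python
from typing import Sequence, Tuple


def padded_hw(shapes: Sequence[Tuple[int, int]], size_divisibility: int = 0) -> Tuple[int, int]:
    """Hypothetical helper: common (height, width) that all images are padded to."""
    # Start from the largest height and the largest width in the list.
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    if size_divisibility > 0:
        # Round each dimension up to the next multiple of size_divisibility.
        d = size_divisibility
        max_h = (max_h + d - 1) // d * d
        max_w = (max_w + d - 1) // d * d
    return max_h, max_w


shapes = [(100, 150), (120, 90)]
print(padded_hw(shapes))      # (120, 150)
print(padded_hw(shapes, 32))  # (128, 160)
```

The original sizes, (100, 150) and (120, 90) here, are what the ImageList keeps alongside the padded tensor so that the padding can be undone later.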
-
to(*args: Any, **kwargs: Any) → torchjpeg.data.image_list.ImageList
Implements the device API for ImageLists by copying the underlying storage to the target device.
- Parameters:
All arguments are forwarded to torch.Tensor.to() on the underlying tensor storage.
- Returns:
The ImageList object on the new device
- Return type:
torchjpeg.data.image_list.ImageList
-
torchjpeg.data.crop_batch()
Crops a batch of images to their original size, removing any padding.
- Parameters:
batch (Tensor) – A batch of shape \((N, C, H, W)\) of images which may have been padded either by JPEG or to make them the same size
sizes (Tensor) – A tensor of shape \((N, M)\) where the height and width of image i are stored at positions [i, -1] and [i, -2], respectively.
- Returns:
A list of the cropped images, potentially all with different sizes.
- Return type:
Sequence of Tensors
-
class torchjpeg.data.JPEGQuantizedDataset
Bases: torch.utils.data.dataset.Dataset
Wraps an arbitrary image dataset to return JPEG quantized versions of the image. The amount of quantization is defined using IJG quality settings. If the underlying dataset returns a sequence, the first element of the sequence is taken to be the image, which is quantized, and the remaining elements are returned as the last element of the batch. If the underlying dataset returns a mapping, set image_key to the key of the image to be quantized. The original dictionary, including the image before quantization, will be returned as the last element of the batch.
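For readers unfamiliar with IJG quality settings, the following sketch shows how libjpeg turns a quality value into a quantization table by scaling a base table. It is an illustration under assumptions: the function names are hypothetical, only the standard luminance table is shown, and it is assumed rather than confirmed that this dataset ends up with the same tables.

```python
# Standard luminance base table from Annex K of the JPEG specification (row-major).
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def ijg_scale(quality: int) -> int:
    """Map an IJG quality setting to a percentage scale factor."""
    # libjpeg clamps the quality to [1, 100] before scaling.
    quality = min(max(quality, 1), 100)
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def quantization_table(quality: int, base=BASE_LUMA):
    """Scale a base table, rounding and clamping entries to [1, 255] (baseline JPEG)."""
    scale = ijg_scale(quality)
    return [min(max((q * scale + 50) // 100, 1), 255) for q in base]


print(quantization_table(50) == BASE_LUMA)  # True: quality 50 leaves the base table unchanged
print(set(quantization_table(100)))         # {1}: quality 100 quantizes with all ones
```

Lower quality values give larger table entries and therefore coarser quantization of the DCT coefficients.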
Since the primary return values are all DCT coefficients, padding will be added to the images to make their dimensions a whole multiple of the MCU size. Following JPEG conventions, this is replicate padding added to the bottom and right edges. The original size of the image is returned so that the images can be correctly cropped after processing.
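A minimal sketch of replicate padding on the bottom and right edges, using a nested list as a stand-in for a single-channel image. The helper name pad_to_mcu and the default mcu=8 are assumptions made for illustration; the dataset itself operates on tensors.

```python
def pad_to_mcu(image, mcu=8):
    """Hypothetical helper: pad a list-of-rows image to a multiple of mcu.

    Returns the padded image and the original (height, width).
    """
    h, w = len(image), len(image[0])
    # Round both dimensions up to the next multiple of the MCU size.
    new_h = -(-h // mcu) * mcu
    new_w = -(-w // mcu) * mcu
    # Extend each row to the right by repeating its last pixel.
    rows = [row + [row[-1]] * (new_w - w) for row in image]
    # Extend downward by repeating the (already widened) last row.
    rows += [list(rows[-1]) for _ in range(new_h - h)]
    return rows, (h, w)


padded, original_size = pad_to_mcu([[1, 2, 3], [4, 5, 6]], mcu=4)
for row in padded:
    print(row)
# [1, 2, 3, 3]
# [4, 5, 6, 6]
# [4, 5, 6, 6]
# [4, 5, 6, 6]
print(original_size)  # (2, 3)
```

Keeping the original size next to the padded data is what makes the later crop possible.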
The format returned by this dataset is:
Y Channel Coefficients, CbCr Coefficients, Y Quantization Matrix, CbCr Quantization Matrix, Pre-quantization YCbCr Coefficients, Original Image Size, Optional rest of the batch from the underlying dataset
If the image is grayscale, the CbCr coefficients and quantization matrix will be empty tensors. If the underlying dataset returns an image with no additional data, the final return value will be an empty tensor. This avoids issues with the default collate function. An empty tensor is one initialized with 0 size using torch.empty(); it is detectable as tensor.numel() == 0.
- Parameters:
data (torch.utils.data.Dataset) – The dataset to wrap
quality (int, tuple of two or three ints) – The quality range (min 0, max 100) to draw from, inclusive on both ends. If this is a single integer, only that quality is used; if it is three integers, the last one defines a step size.
stats (torchjpeg.dct.Stats) – Statistics to use for per-frequency per-channel coefficient normalization
mcu (int) – The size of the minimum coded unit; use 16 for 4:2:0 chroma subsampling.
image_key (optional str) – The key to use to extract the image from a dataset which returns a mapping.
deterministic_quality (bool) – False by default. Set to True to include the quality range in the dataset size. In other words, the length of this dataset will be len(quality_range) * len(dataset), and all the qualities in the range will be represented for every image by iterating this dataset.
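The quality and deterministic_quality parameters can be illustrated with a small sketch. The helper names are hypothetical, and it is an assumption that a three-element tuple behaves like an inclusive range with the given step.

```python
def quality_values(quality):
    """Hypothetical helper: expand a quality setting into the list of qualities it covers."""
    if isinstance(quality, int):
        # A single integer means only that quality is used.
        return [quality]
    if len(quality) == 2:
        low, high = quality
        step = 1
    else:
        low, high, step = quality
    # The range is inclusive on both ends.
    return list(range(low, high + 1, step))


def deterministic_length(n_images, quality):
    """Dataset length when every quality is paired with every image."""
    return len(quality_values(quality)) * n_images


print(quality_values(50))                    # [50]
print(quality_values((10, 12)))              # [10, 11, 12]
print(quality_values((10, 30, 10)))          # [10, 20, 30]
print(deterministic_length(4, (10, 30, 10))) # 12
```

With deterministic_quality left at False, the length would instead be that of the wrapped dataset, with a quality drawn from the range for each image.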
Warning
The images returned from this dataset may be of differing sizes. Use the static torchjpeg.data.JPEGQuantizedDataset.collate() to collate them into a batch with padding. Use torchjpeg.data.crop_batch() to crop them back to the correct sizes (this will also remove JPEG padding).
-
static collate()
Custom collate function which works for return values from this dataset. Adds padding to the images so that they can be stored in a single tensor.
- Parameters:
batch_list – Output from this dataset
- Returns:
Batch with each input collated into single tensors
-
class torchjpeg.data.FolderOfJpegDataset
Bases: torch.utils.data.dataset.Dataset
Loads coefficients from a folder of JPEG files without any labels. For each image, it returns the format of torchjpeg.codec.read_coefficients(). The images must actually be JPEG files (stored as JPEGs) for this to work. The relative path to the JPEG file will be returned along with the coefficients. The coefficients themselves are not guaranteed to be the same size; use the collate function to collate these into a batched Tensor by adding padding.
- Parameters:
-
class torchjpeg.data.UnlabeledImageFolder
Bases: torch.utils.data.dataset.Dataset
Dataset loading a folder of unlabeled images recursively. The images are loaded using PIL and are otherwise unchanged; add a transform to turn them into Tensors.
- Parameters:
path (Path) – The path to load recursively from
extensions (List[str]) – The image extensions to look for
transform – Any transform to apply to the images after loading them
Transforms
-
class torchjpeg.data.transforms.RandomJPEG
Bases: object
Applies JPEG compression to a PIL image at a random quality.
-
class torchjpeg.data.transforms.YCbCr
Bases: object
Converts a PIL image to YCbCr color space.
Note
PIL follows the JPEG YCbCr color conversion giving a result in [0, 255].
-
class torchjpeg.data.transforms.YChannel
Bases: object
Converts a tensor with a color image in [0, 1] to the Y channel using ITU-R BT.601 conversion.
Warning
This is not equivalent to the Y channel of a color image that would be used by JPEG. The result is in [16, 240] following the ITU-R BT.601 standard before normalization. This is useful for certain JPEG artifact correction algorithms due to some questionable evaluation choices by that community. The result is normalized to \(\left[\frac{16}{255},\frac{240}{255}\right]\) before being returned.