Annotation Formats

Remo can read and output annotations in a variety of formats. In this section, we present details about the specific formats and options available.

At a high level, we aim to offer native support for:

  • the formats most widely used by the Computer Vision community
  • simple formats that serve as conversion points for processing custom annotation formats

Supported formats

Here is a quick summary of the supported file formats across different tasks. For each format, you can refer to the relevant section further below for more information.

Tags

Regardless of the task selected, you can upload or export image-level tags using a CSV file.

Supported Format | Example | Input / Output
Plain CSV | tags.csv | Both

Image Classification

Supported Format | Example | Input / Output
Plain CSV | plain_img_class.csv | Both
Pascal VOC XML | voc_img_class.xml | Input
Remo JSON | remo_img_class.json | Output

Object Detection

Supported Format | Example | Input / Output
Coco JSON | coco_obj_det_io.json | Both
Plain CSV | obj_det.csv | Both
Pascal VOC XML | voc_obj_det.xml | Input
Open Images CSV | old_obj_det_i.csv | Input
Remo JSON | plain_obj_det_o.json | Output

Instance Segmentation

Supported Format | Example | Input / Output
Coco JSON | coco_inst_seg.json | Both
Remo JSON | plain_inst_seg.json | Output
Plain CSV | plain_inst_seg_o.csv | Output

Additional info

Image files

Image files can have any of these extensions: .png, .jpg, .jpeg, .tif, .tiff.

Class encoding

By class encoding we mean a set of rules that maps class labels to IDs. Some prominent Computer Vision datasets identify their classes this way.

A custom class encoding CSV file can be passed to convert between IDs and labels. This requires the following formatting:

Some pre-made class encodings are also available:

Object coordinates

Across formats, object coordinates follow these conventions:

  • they can be expressed in either pixels or percentages wherever possible
  • (0,0) indicates the top-left corner of an image
  • percentage example: (x, y) = (0.6, 0.9) means (x = 0.6 * width, y = 0.9 * height)
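
The percentage convention can be sketched in a few lines of Python. The helper names to_pixels and to_fractions are hypothetical and not part of Remo; the sketch only assumes that percentages are stored as fractions between 0 and 1, as in the example above.

```python
def to_pixels(x_frac, y_frac, width, height):
    """Convert fractional (0-1) coordinates to pixel coordinates.

    The origin (0, 0) is the top-left corner of the image.
    """
    return x_frac * width, y_frac * height


def to_fractions(x_px, y_px, width, height):
    """Convert pixel coordinates back to fractions of the image size."""
    return x_px / width, y_px / height


# The point (0.6, 0.9) on a hypothetical 200 x 100 image.
x_px, y_px = to_pixels(0.6, 0.9, width=200, height=100)
print(x_px, y_px)
```

Keeping both directions side by side makes it easier to check that a conversion round-trips before uploading annotations.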

Image Classification

Image classification refers to the task of assigning labels to an image as a whole. Remo supports the following formats:


Format | File type | Required fields | Optional fields | Example
Pascal VOC | XML | filename, object | - | voc_img_class.xml
Plain CSV file | CSV | file_name, class_name | - | plain_img_class.csv

Pascal VOC

The Pascal VOC XML annotation format became popular thanks to the ImageNet challenge, which adopted the format of the PASCAL VOC challenge.

By default, classes follow the WordNet encoding on import, which you can download here.

Plain CSV

This is a custom simple format we support to facilitate converting data between formats.

  • Columns in the file must be separated by commas
  • You can use any of the following variations on column headers:

Column | Accepted headers
file_name | image, image_id, image id, file_name, filename, file, file name, file_path, path, file path
class_name | class, label, name, class_name, class name, label_name, label name, category
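
As a minimal sketch of producing such a file, the snippet below writes a two-column CSV with Python's standard csv module. The file names, labels, and the write_plain_csv helper are made up for illustration; only the file_name and class_name headers come from the format description above.

```python
import csv
import io

# Hypothetical annotations: (file name, class label) pairs.
annotations = [
    ("img_001.jpg", "cat"),
    ("img_002.jpg", "dog"),
]


def write_plain_csv(rows, handle):
    """Write a header row followed by one comma-separated row per image."""
    writer = csv.writer(handle)
    writer.writerow(["file_name", "class_name"])
    writer.writerows(rows)


# Write to an in-memory buffer; pass an open file handle to write to disk.
buffer = io.StringIO()
write_plain_csv(annotations, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```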

Object Detection

Object detection refers to the task of identifying objects in an image using rectangular boxes. Remo supports the following formats:

Format | File type | Required fields | Optional fields | Example
Coco Object Detection | JSON | file_name, segmentation, area, iscrowd, image_id, bbox, category_id, id | info | coco_obj_det_io.json
Pascal VOC | XML | filename, size, object | - | voc_obj_det.xml
Open Images | CSV | ImageID, LabelName, Confidence, XMin, XMax, YMin, YMax | Source, IsOccluded, IsGroupOf, IsDepiction, IsInside | old_obj_det_i.csv
Plain CSV file | CSV | file_name, class_name, xmin, ymin, xmax, ymax | - | obj_det.csv

Coco

You can read more about the Common Objects in Context challenge here.

Coordinates for bounding boxes are expressed in pixel values as [xmin, ymin, width, height].
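
Because other formats on this page use corner coordinates (xmin, ymin, xmax, ymax), converting between the two layouts is a common step. The following is a minimal sketch; the function names are hypothetical and not part of Remo.

```python
def coco_to_corners(bbox):
    """Convert a Coco box [xmin, ymin, width, height] to corner form."""
    xmin, ymin, width, height = bbox
    return xmin, ymin, xmin + width, ymin + height


def corners_to_coco(xmin, ymin, xmax, ymax):
    """Convert corner form (xmin, ymin, xmax, ymax) to a Coco box."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# A box starting at (10, 20) that is 30 pixels wide and 40 pixels tall.
print(coco_to_corners([10, 20, 30, 40]))
```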

Pascal VOC

You can read more about the Pascal Visual Objects Classes challenge here.

Classes use the WordNet encoding, which you can download here.
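
As a rough illustration of the required fields (filename, size, object), the sketch below reads a small Pascal VOC-style document with Python's standard library. The sample XML, the class ID in it, and the parse_voc helper are made up for this example. The bndbox child elements follow the usual Pascal VOC layout. This is not how Remo itself parses files.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with one object and a WordNet-style class ID.
VOC_SAMPLE = """<annotation>
  <filename>img_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>n02084071</name>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Extract the file name, image size and boxes from a VOC annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "class_name": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return {
        "file_name": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


print(parse_voc(VOC_SAMPLE))
```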

Open Images

You can read more about the Open Images challenge here.

Classes (LabelName) use the Google Knowledge Graph encoding, which you can access here.

Plain CSV

We support the following variations on column headers:

Column | Accepted headers
file_name | image, image_id, image id, file_name, filename, file, file name, file_path, path, file path
class_name | class, label, name, class_name, class name, label_name, label name, category
xmin | xmin, x_min, x min, x1, x_1
xmax | xmax, x_max, x max, x2, x_2
ymin | ymin, y_min, y min, y1, y_1
ymax | ymax, y_max, y max, y2, y_2
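
When preparing files with a custom script, it can help to map header variations onto one canonical name before processing. The sketch below does this for a subset of the headers listed above. The ALIASES table, the helper names, and the case-insensitive matching are assumptions made for illustration and do not describe Remo's internal behaviour.

```python
import csv
import io

# Subset of the accepted header variations, keyed by canonical name.
ALIASES = {
    "file_name": ["image", "image_id", "filename", "file", "path"],
    "class_name": ["class", "label", "name", "category"],
    "xmin": ["x_min", "x1", "x_1"],
    "xmax": ["x_max", "x2", "x_2"],
    "ymin": ["y_min", "y1", "y_1"],
    "ymax": ["y_max", "y2", "y_2"],
}

# Reverse lookup: alias -> canonical name.
LOOKUP = {
    alias: canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def normalize_header(name):
    """Return the canonical column name, or the cleaned name if unknown."""
    key = name.strip().lower()
    return LOOKUP.get(key, key)


def read_rows(handle):
    """Read a CSV and return one dict per row, keyed by canonical names."""
    reader = csv.reader(handle)
    header = [normalize_header(h) for h in next(reader)]
    return [dict(zip(header, row)) for row in reader]


sample = io.StringIO("filename,label,x1,y1,x2,y2\nimg.jpg,cat,1,2,3,4\n")
print(read_rows(sample))
```

Values are left as strings here; convert them to numbers once you know whether the file uses pixels or percentages.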

Instance Segmentation

Instance segmentation refers to the task of detecting and delineating each distinct object of interest appearing in an image.

Remo supports the following formats:

Format | File type | Required fields | Optional fields | Example
Coco Instance Segmentation | JSON | file_name, segmentation, area, iscrowd, image_id, category_id, id | info | coco_inst_seg.json

Coco

You can read more about the Common Objects in Context challenge here.

Coordinates are expressed in pixel values as [x, y] pairs.
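
In Coco polygon annotations, each polygon in the segmentation field is typically stored as a flat list [x1, y1, x2, y2, ...]. The following sketch, with a hypothetical helper name, regroups such a list into (x, y) pairs. It assumes polygon-style segmentation rather than run-length encoded masks.

```python
def to_pairs(flat):
    """Group a flat list [x1, y1, x2, y2, ...] into (x, y) tuples."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    return list(zip(flat[0::2], flat[1::2]))


# A triangle described by three vertices, in pixels.
print(to_pairs([10, 20, 30, 40, 50, 60]))
```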