Uploading Annotations and Predictions

In this tutorial, we explore different options for uploading annotations to Remo from code.

In particular, we will see how to:

  • add annotations from a file (in a format supported by Remo)
  • add annotations from code (can be used for predictions, and from any input format)

We start off by creating a dataset and populating it with some images:

import remo
import os
import pandas as pd
urls = ['https://remo-scripts.s3-eu-west-1.amazonaws.com/open_images_sample_dataset.zip']
my_dataset = remo.create_dataset(name = 'D1', urls = urls)

Acquiring data - completed
Processing data - completed
Data upload completed

Supported format

To add annotations from a supported file format, we can pass the file via dataset.add_data.

Remo automatically parses annotation files in a variety of formats (such as Pascal XML, COCO JSON, Open Images CSV, etc). You can read more about supported file formats in our documentation.

Example: let's add some annotations for an Object Detection task from a CSV file with encoded classes.

In this case, annotations are stored in a supported CSV file format. Class labels were encoded using Google Knowledge Graph identifiers. Remo automatically detects the class encoding and translates it into the corresponding labels.

annotation_files = [os.path.join(os.getcwd(), 'assets', 'open_sample.csv')]

df = pd.read_csv(annotation_files[0])
df.columns

Index(['ImageID', 'Source', 'LabelName', 'Confidence', 'XMin', 'XMax', 'YMin', 'YMax', 'IsOccluded', 'IsTruncated', 'IsGroupOf', 'IsDepiction', 'IsInside'], dtype='object')
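
To make the column layout above more concrete, here is a minimal sketch of how box coordinates in this kind of file can be read. In the standard Open Images box CSV, XMin, XMax, YMin and YMax are fractions of the image width and height, so they need scaling to get pixel values. The sample rows, the placeholder label IDs, the image size and the helper names are all assumptions made for illustration, and the [xmin, ymin, xmax, ymax] output order is also an assumption. Remo does this parsing for you, so this is only meant to show what the file encodes.

```python
import csv
import io

# Hypothetical two-row sample using a subset of the columns shown above.
# The label IDs are placeholders, not real Knowledge Graph identifiers.
SAMPLE_CSV = """ImageID,LabelName,XMin,XMax,YMin,YMax
000a1249af2bc5f0,/m/placeholder_a,0.25,0.75,0.5,1.0
000a1249af2bc5f0,/m/placeholder_b,0.0,0.5,0.1,0.2
"""


def to_pixel_bbox(row, img_width, img_height):
    """Scale one normalised row to [xmin, ymin, xmax, ymax] in pixels."""
    return [
        round(float(row["XMin"]) * img_width),
        round(float(row["YMin"]) * img_height),
        round(float(row["XMax"]) * img_width),
        round(float(row["YMax"]) * img_height),
    ]


def read_boxes(csv_text, img_width, img_height):
    """Return (image_id, label, pixel_bbox) tuples for every row of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["ImageID"], row["LabelName"], to_pixel_bbox(row, img_width, img_height))
        for row in reader
    ]


# Assumed image size of 1024 x 768 pixels.
boxes = read_boxes(SAMPLE_CSV, img_width=1024, img_height=768)
```

With the assumed 1024 x 768 image, the first row maps to the pixel box [256, 384, 768, 768].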

my_dataset.add_data(local_files=annotation_files, annotation_task = 'Object detection')

Acquiring data - completed
Processing data - completed
Data upload completed

{'session_id': '810e23ad-ff63-42a9-89ed-400fbb7d4fc0', 'created_at': '2020-05-29T13:56:17.832956Z', 'dataset': {'id': 14, 'name': 'D1'}, 'status': 'done', 'substatus': '', 'images': {'pending': 0, 'total': 0, 'successful': 0, 'failed': 0, 'errors': []}, 'annotations': {'pending': 0, 'total': 1, 'successful': 1, 'failed': 0, 'errors': []}, 'errors': [], 'uploaded': {'total': {'items': 0, 'size': 0, 'human_size': '0 b'}, 'images': {'items': 0, 'size': 0, 'human_size': '0 b'}, 'annotations': {'items': 0, 'size': 0, 'human_size': '0 b'}, 'archives': {'items': 0, 'size': 0, 'human_size': '0 b'}}}

We can now see annotation statistics and visually explore the dataset.


[{'AnnotationSet ID': 44, 'AnnotationSet name': 'Object detection', 'n_images': 9, 'n_classes': 15, 'n_objects': 84, 'top_3_classes': [{'name': 'Fruit', 'count': 27}, {'name': 'Sports equipment', 'count': 12}, {'name': 'Human arm', 'count': 7}], 'creation_date': None, 'last_modified_date': '2020-05-29T13:56:05.883892Z'}]



Add predictions or custom annotations

To add data programmatically from any format, we can use the Annotation object.

This can be useful to:

  • upload model predictions
  • upload annotations from any custom file format
  • create an active learning workflow

Example: let's add annotations to a specific image using the add_annotations() method of the dataset class.

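image_name = '000a1249af2bc5f0.jpg'

annotations = []

# First bounding box
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Human hand'
annotation.bbox = [227, 284, 678, 674]
annotations.append(annotation)

# Second bounding box
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Fashion accessory'
annotation.bbox = [496, 322, 544, 370]
annotations.append(annotation)

# Upload both annotations to the dataset
my_dataset.add_annotations(annotations)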

Progress 100% - 1/1 - elapsed 0:00:00.001000 - speed: 1000.00 img / s, ETA: 0:00:00
Acquiring data - completed
Processing data - completed
Data upload completed

We can now retrieve the picture and visualise it:

my_image = my_dataset.image(image_name)
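
For model predictions, the same three fields (img_filename, classes, bbox) usually have to be built from whatever the model returns. The sketch below shows one possible way to filter predictions by confidence and reshape them. BoxRecord is a hypothetical stand-in used so the snippet runs without a Remo server, and the prediction keys ('file', 'label', 'box', 'score') and the 0.5 threshold are assumptions, not part of Remo.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoxRecord:
    """Hypothetical stand-in with the same three fields set on remo.Annotation above."""
    img_filename: str
    classes: str
    bbox: List[int]  # same four-number layout as the bbox lists above


def predictions_to_records(predictions, min_score=0.5):
    """Keep predictions scoring at least min_score and convert them to records."""
    return [
        BoxRecord(p["file"], p["label"], [int(v) for v in p["box"]])
        for p in predictions
        if p["score"] >= min_score
    ]


# Hypothetical model output; the dictionary keys are assumptions.
predictions = [
    {"file": "000a1249af2bc5f0.jpg", "label": "Human hand",
     "box": [227, 284, 678, 674], "score": 0.91},
    {"file": "000a1249af2bc5f0.jpg", "label": "Fashion accessory",
     "box": [496, 322, 544, 370], "score": 0.32},
]

# Only the first prediction passes the 0.5 threshold.
records = predictions_to_records(predictions, min_score=0.5)
```

Each record could then be copied into a remo.Annotation by setting img_filename, classes and bbox as in the example above, and the resulting list passed to add_annotations().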


Annotation sets

Under the hood, Remo organises annotations in an annotation set, which is just a version of all the annotations of a dataset.

Annotation sets offer two main advantages:

  1. high-level operations on all the annotations with just one command
  2. easily manage multiple versions of annotations

What you can do with annotation sets

When training a model, it's not always clear what the right way to label data is. Annotation sets make this exploration easier because you can, for example:

  • see stats of your data
  • change labels for objects from one class to another
  • delete objects of specific classes
  • compare different annotation sets (such as ground truth vs prediction, or annotations coming from different annotators)
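
To illustrate what these operations amount to, here is a small sketch on plain Python data. It does not use the Remo API: the list-of-dicts representation, the sample labels and the function names are all assumptions made for illustration. Each operation returns a new list, which loosely mirrors the idea of keeping several versions of the same annotations side by side.

```python
from collections import Counter

# Hypothetical in-memory annotation set: one dict per annotated object.
annotation_set = [
    {"image": "a.jpg", "label": "Fruit"},
    {"image": "a.jpg", "label": "Apple"},
    {"image": "b.jpg", "label": "Human arm"},
    {"image": "b.jpg", "label": "Fruit"},
]


def class_counts(objects):
    """Count objects per class label."""
    return Counter(obj["label"] for obj in objects)


def rename_class(objects, old, new):
    """Return a new list where every object labelled `old` is labelled `new`."""
    return [{**obj, "label": new} if obj["label"] == old else obj for obj in objects]


def delete_classes(objects, labels):
    """Return a new list without objects whose label is in `labels`."""
    return [obj for obj in objects if obj["label"] not in labels]


# Merge 'Apple' into 'Fruit', then drop 'Human arm' in a separate version.
merged = rename_class(annotation_set, "Apple", "Fruit")
fruit_only = delete_classes(merged, {"Human arm"})
stats = class_counts(merged)
```

Because the original list is left untouched, class_counts(annotation_set) and class_counts(merged) can be compared directly to see the effect of the relabelling.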

In the example we have seen above, Remo automatically creates an annotation set. For more control, it's possible to explicitly manipulate annotation set objects themselves.

To read more about annotation sets, you can check the Remo documentation: https://remo.ai/docs/.