Uploading Annotations and Predictions¶
In this tutorial, we explore the different options for uploading annotations to Remo from code.
In particular, we will see how to:
- add annotations from a file (in a format supported by Remo)
- add annotations from code (useful for predictions, and for any custom input format)
We start off by creating a dataset and populating it with some images:
import remo
import os
import pandas as pd
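# display Remo's interface and visualisations inline in the notebook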
remo.set_viewer('jupyter')
urls = ['https://remo-scripts.s3-eu-west-1.amazonaws.com/open_images_sample_dataset.zip']
my_dataset = remo.create_dataset(name = 'D1', urls = urls)
Acquiring data - completed
Processing data - completed
Data upload completed
Supported formats¶
To add annotations from a supported file format, we can pass the annotation file via dataset.add_data.
Remo automatically parses annotation files in a variety of formats (such as Pascal VOC XML, COCO JSON, Open Images CSV, etc.). You can read more about supported file formats in our documentation.
Example: let's add some annotations for an Object Detection task from a CSV file with encoded classes.
In this case, the annotations are stored in a supported CSV format and the class labels are encoded as Google Knowledge Graph IDs. Remo automatically detects the class encoding and translates it into the corresponding labels.
annotation_files=[os.getcwd() + '/assets/open_sample.csv']
df = pd.read_csv(annotation_files[0])
df.columns
Index(['ImageID', 'Source', 'LabelName', 'Confidence', 'XMin', 'XMax', 'YMin', 'YMax', 'IsOccluded', 'IsTruncated', 'IsGroupOf', 'IsDepiction', 'IsInside'], dtype='object')
my_dataset.add_data(local_files=annotation_files, annotation_task='Object detection')
Acquiring data - completed
Processing data - completed
Data upload completed
{'session_id': '810e23ad-ff63-42a9-89ed-400fbb7d4fc0', 'created_at': '2020-05-29T13:56:17.832956Z', 'dataset': {'id': 14, 'name': 'D1'}, 'status': 'done', 'substatus': '', 'images': {'pending': 0, 'total': 0, 'successful': 0, 'failed': 0, 'errors': []}, 'annotations': {'pending': 0, 'total': 1, 'successful': 1, 'failed': 0, 'errors': []}, 'errors': [], 'uploaded': {'total': {'items': 0, 'size': 0, 'human_size': '0 b'}, 'images': {'items': 0, 'size': 0, 'human_size': '0 b'}, 'annotations': {'items': 0, 'size': 0, 'human_size': '0 b'}, 'archives': {'items': 0, 'size': 0, 'human_size': '0 b'}}}
We can now see annotation statistics and visually explore the dataset:
my_dataset.get_annotation_statistics()
[{'AnnotationSet ID': 44, 'AnnotationSet name': 'Object detection', 'n_images': 9, 'n_classes': 15, 'n_objects': 84, 'top_3_classes': [{'name': 'Fruit', 'count': 27}, {'name': 'Sports equipment', 'count': 12}, {'name': 'Human arm', 'count': 7}], 'creation_date': None, 'last_modified_date': '2020-05-29T13:56:05.883892Z'}]
my_dataset.view()
Add predictions or custom annotations¶
To add annotations programmatically from any format, we can use the Annotation object.
This can be useful to:
- upload model predictions
- upload annotations from any custom file format
- create an active learning workflow
Example: let's add annotations to a specific image using the add_annotations() method of the dataset class.
image_name = '000a1249af2bc5f0.jpg'
annotations = []
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Human hand'
annotation.bbox = [227, 284, 678, 674]
annotations.append(annotation)
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Fashion accessory'
annotation.bbox = [496, 322, 544, 370]
annotations.append(annotation)
my_dataset.add_annotations(annotations)
Progress 100% - 1/1 - elapsed 0:00:00.001000 - speed: 1000.00 img / s, ETA: 0:00:00
Acquiring data - completed
Processing data - completed
Data upload completed
We can now retrieve the picture and visualise it:
my_image = my_dataset.image(image_name)
my_image.view()
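As mentioned above, the same mechanism can be used to upload model predictions: we simply translate the model's output into Annotation objects. Below is a minimal sketch in which the predictions list and its keys (file_name, label, bbox) are hypothetical placeholders for your model's output format; only remo.Annotation and add_annotations come from the steps shown above.
# hypothetical detector output: one dictionary per predicted box
predictions = [
    {'file_name': '000a1249af2bc5f0.jpg', 'label': 'Human hand', 'bbox': [227, 284, 678, 674]},
    {'file_name': '000a1249af2bc5f0.jpg', 'label': 'Fashion accessory', 'bbox': [496, 322, 544, 370]},
]
pred_annotations = []
for pred in predictions:
    annotation = remo.Annotation()
    annotation.img_filename = pred['file_name']
    annotation.classes = pred['label']
    annotation.bbox = pred['bbox']  # same bounding box format as in the example above
    pred_annotations.append(annotation)
my_dataset.add_annotations(pred_annotations)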
Annotation sets¶
Under the hood, Remo organises annotations into annotation sets. An annotation set is simply a version of all the annotations of a dataset.
There are two main advantages to using annotation sets:
- perform high-level operations on all the annotations with a single command
- easily manage multiple versions of the annotations
What you can do with annotation sets
When training a model, it's not always clear what the right way to label the data is. Annotation sets make this exploration easier: for example, you can:
- see stats of your data
- change labels for objects from one class to another
- delete objects of specific classes
- compare different annotation sets (such as ground truth vs prediction, or annotations coming from different annotators)
In the example above, Remo automatically created an annotation set for us. For more control, it's also possible to manipulate AnnotationSet objects explicitly.
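For instance, ground truth and model predictions could be kept in two separate annotation sets and compared. The sketch below is only indicative: it assumes the dataset exposes annotation_sets() and create_annotation_set(), and that add_annotations() accepts an annotation_set_id argument; check the documentation linked below for the exact API.
# list the existing annotation sets of the dataset (assumed method)
my_dataset.annotation_sets()
# create a separate annotation set to hold model predictions (assumed method and signature)
predictions_set = my_dataset.create_annotation_set(annotation_task='Object detection', name='model_predictions')
# add the predictions built earlier to that specific set (assumed keyword argument and .id attribute)
my_dataset.add_annotations(pred_annotations, annotation_set_id=predictions_set.id)
# compare statistics across the annotation sets
my_dataset.get_annotation_statistics()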
To read more about annotation sets, you can check the Remo documentation: https://remo.ai/docs/.