Annotation
The AnnotationClient is used to interact with the AnnotationEngine
service to create tables from existing schema, upload new data, and
download existing annotations. Note that annotations in the
AnnotationEngine are not linked to any particular segmentation, and thus
do not include any root ids. An annotation client is accessed with
client.annotation
.
Getting existing tables¶
A list of the existing tables for the datastack can be found with get_tables().
all_tables = client.annotation.get_tables()
all_tables[0]
Each table has three main properties that can be useful to know:
table_name
: The table name, used to refer to it when uploading or downloading annotations. This is also passed through to the table in the Materialized database.schema_name
: The name of the table's schema from EMAnnotationSchemas (see below).max_annotation_id
: An upper limit on the number of annotations already contained in the table.
Downloading annotations¶
You can download the JSON representation of a data point through the get_annotation() method. This can be useful if you need to look up information on unmaterialized data, or to see what a properly templated annotation looks like.
table_name = all_tables[0]['table_name'] # 'ais_analysis_soma'
annotation_id = 100
client.annotation.get_annotation(annotation_ids=annotation_id, table_name=table_name)
Create a new table¶
One can create a new table with a specified schema with the create_table() method:
client.annotation.create_table(table_name='test_table',
schema_name='microns_func_coreg',
voxel_resolution = [1,1,1],
description="some text to describe your table")
The voxel resolution is the units your position columns will be uploaded in [1,1,1] would imply a nm location, where as [4,4,40] would correspond to voxels of that size. If you are uploading points from a neuroglancer session, you want this to match the units of that neuroglancer view.
Note there are some optional metadata parameters to create_table()
notice_text
: This is text that will show up to users who access this data as a warning. This could be used to warn users that the data is not complete or checked yet, or to advertise that a particular publication should be cited when using this table.read_permission
: one of "PRIVATE" which means only you can read data in this table. "PUBLIC" (default) which means anyone can read this table that has read permissions to this dataset. So if and only if you can read the segmentation results of this data, you can read this table. "GROUP" which means that you must share a common group with this user for them to be able to read. We need to make a way to discover what groups you are in and who you share groups with.write_permission
: one of "PRIVATE" (default), which means only you can write to this table. "PUBLIC" which means anyone can write to this table that has write permissions to this dataset. Note although this means anyone can add data, no annotations are ever truly overwritten. "GROUP" which means that you must share a common group with this user for them to be able to write. We need to make a way to discover what groups you are in and who you share groups with.
If you change your mind about what you want for metadata, some but not all fields can be updated with update_metadata(). This includes the description, the notice_text, and the permissions, but not the name, schema or voxel resolution.
# to update description
client.annotation.update_metadata(table_name='test_table',
description="a new description for my table")
# to make your table readable by anybody who can read this dataset
client.annotation.update_metadata(table_name='test_table',
notice_text="This table isn't done yet, don't trust it. Contact me")
# to make your table readable by anybody who can read this dataset
client.annotation.update_metadata(table_name='test_table',
read_permisison="PUBLIC")
New data can be generated as a dict or list of dicts following the
schema and uploaded with post_annotation
. For example, a
microns_func_coreg
point needs to have:
type
set tomicrons_func_coreg
pt
set to a dict withposition
as a key and the xyz location as a value.func_id
set to an integer.
The following could would create a new annotation and then upload it to the service. Note that you get back the annotation id(s) of what you uploaded.
new_data = {'type': 'microns_func_coreg',
'pt': {'position': [1,2,3]},
'func_id': 0}
client.annotation.post_annotation(table_name='test_table', data=[new_data])
There are methods to simplify annotation uploads if you have a pandas dataframe whose structure mirrors the struction of the annotation schema you want to upload
import pandas as pd
df = pd.DataFrame([{'id':0,
'type': 'microns_func_coreg',
'pt_position': [1,2,3]},
'func_id': 0},
{'id':1,
'type': 'microns_func_coreg',
'pt_position': [3,2,1]},
'func_id': 2}])
client.annotation.post_annotation_df('test_table', df)
Note that here I specified the IDs of my annotations, which you can do, but then its up to you to assure that the IDs don't collide with other IDs. If you leave them blank then the service will assign the IDs for you.
There is a similar method for updating update_annotation_df()
Staged Annotations¶
Staged annotations help ensure that the annotations you post follow the appropriate schema, both by providing guides to the field names and locally validating against a schema before uploading. The most common use case for staged annotations is to create a StagedAnnotation object for a given table, then add annotations to it individually or as a group, and finally upload to the annotation table.
To get a StagedAnnotation object, you can start with either a table name or a schema name. Here, we'll assume that there's already a table called "my_table" that is running a "cell_type_local" schema. If we want to add new annotations to the table, we simply use the table name with stage_annotations().
stage = client.annotation.stage_annotations("my_table")
This stage
object retrieves the schema for the table and hosts a local
collection of annotations. Every time you add an annotation, it is
immediately validated against the schema. To add an annotation, use the
add
method:
stage.add(
cell_type = "pyramidal_cell",
classification_system="excitatory",
pt_position=[100,100,10],
)
The argument names derive from fields in the schema and you must provide
all required fields. Any number of annotations can be added to the
stage. A dataframe of annotations can also be added with
stage.add_dataframe
, and requires an exact match between column names
and schema fields. The key difference between this and posting a
dataframe directly is that annotations added to a StagedAnnotations are
validated locally, allowing any issues to be caught before uploading.
You can see the annotations as a list of dictionary records with
stage.annotation_list
or as a Pandas dataframe with
stage.annotation_dataframe
. Finally, if you initialized the stage with
a table name, this information is stored in the stage
and you can
simply upload it from the client.
client.annotation.upload_staged_annotations(stage)
Updating annotations requires knowing the annotation id of the
annotation you are updating, which is not required in the schema
otherwise. In order to stage updated annotations, set the update
parameter to True
when creating the stage.
update_stage = client.annotation.stage_annotations("my_table", update=True)
update_stage.add(
id=1,
cell_type = "stellate_cell",
classification_system="excitatory",
pt_position=[100,100,10],
)
The update
also informs the framework client to treat the annotations
as an update and it will use the appropriate methods automatically when
uploading client.annotation.upload_staged_annotations
.
If you want to specify ids when posting new annotations, id_field
can
be set to True when creating the StagedAnnotation object. This will
enforce an id
column but still post the data as new annotations.
If you might be adding spatial data in coordinates that might be
different than the resolution for the table, you can also set the
annotation_resolution
when creating the stage. The stage will convert
between the resolution you specify for your own annotations and the
resolution that the table expects.
stage = client.annotation.stage_annotations("my_table", annotation_resolution=[8,8,40])
stage.add(
cell_type='pyramidal_cell',
classification_system="excitatory",
pt_position=[50,50,10],
)