
materializationengine

MaterializationClientV2(server_address, auth_header, api_version, endpoints, server_name, datastack_name, cg_client=None, synapse_table=None, version=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, desired_resolution=None)

Bases: ClientBase

cg_client property

The chunked graph client.

datastack_name property

The name of the datastack.

homepage: HTML property

The homepage for the materialization engine.

server_version: Optional[Version] property

The version of the service running on the remote server. Note that this refers to the software running on the server and has nothing to do with the version of the datastack itself.

tables: TableManager property

The table manager for the materialization engine.

version: int property writable

The version of the materialization. Can be used to set up the client to default to a specific version when timestamps or versions are not specified in queries. If not set, defaults to the most recent version.

Note that if this materialization client is attached to a CAVEclient, the version must be set at the CAVEclient level.
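
The defaulting rule described above can be sketched as a small pure function. This is only an illustrative model of the documented behavior, not the client's actual implementation; `resolve_version` and its argument names are hypothetical.

```python
from typing import Optional


def resolve_version(
    query_version: Optional[int],
    client_version: Optional[int],
    most_recent: int,
) -> int:
    """Pick the version a query would use under the documented defaulting rule.

    Hypothetical helper: a version passed to the query wins, then the version
    set on the client, and finally the most recent version.
    """
    if query_version is not None:
        return query_version
    if client_version is not None:
        return client_version
    return most_recent


# Illustrative version numbers only.
print(resolve_version(None, None, 943))  # nothing set: most recent version
print(resolve_version(None, 117, 943))   # client-level default is used
print(resolve_version(661, 117, 943))    # per-query version takes precedence
```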

views: ViewManager property

The view manager for the materialization engine.

get_table_metadata(table_name, datastack_name=None, version=None, log_warning=True)

Get metadata about a table

Parameters:

Name Type Description Default
table_name str

Name of the table to get metadata for.

required
datastack_name str or None

Name of the datastack. If None, uses the one specified in the client.

None
version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
log_warning bool

Whether to print out warnings to the logger. Defaults to True.

True

Returns:

Type Description
dict

Metadata dictionary for table

get_tables(datastack_name=None, version=None)

Gets a list of table names for a datastack

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client.

None
version int or None

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None

Returns:

Type Description
list

List of table names

get_timestamp(version=None, datastack_name=None)

Get datetime.datetime timestamp for a materialization version.

Parameters:

Name Type Description Default
version int or None

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
datastack_name str or None

Datastack name, by default None. If None, defaults to the value set in the client.

None

Returns:

Type Description
datetime

Datetime when the materialization version was frozen.
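
As a usage sketch, the returned datetime can be compared against the current time to see how old a materialization is. The helper name `materialization_age` is hypothetical, and the code assumes that a timestamp without timezone information should be read as UTC.

```python
import datetime


def materialization_age(mat_client, version=None, now=None):
    """Return how long ago a materialization version was frozen.

    `mat_client` is any object exposing `get_timestamp(version=...)` as
    documented above, for example `client.materialize`.
    """
    frozen_at = mat_client.get_timestamp(version=version)
    if frozen_at.tzinfo is None:
        # Assumption: a naive timestamp is interpreted as UTC.
        frozen_at = frozen_at.replace(tzinfo=datetime.timezone.utc)
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return now - frozen_at


# With a real client this might be called as:
# age = materialization_age(client.materialize, version=117)
```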

get_version_metadata(version=None, datastack_name=None)

Get metadata about a version

Parameters:

Name Type Description Default
version int or None

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
datastack_name str or None

Datastack name, by default None. If None, defaults to the value set in the client.

None

Returns:

Type Description
dict

Dictionary of metadata about the version

get_versions(datastack_name=None, expired=False)

Get the versions available

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack, by default None. If None, uses the one specified in the client.

None
expired bool

Whether to include expired versions, by default False.

False

Returns:

Type Description
dict

Dictionary of versions available

get_versions_metadata(datastack_name=None, expired=False)

Get the metadata for all the versions that are presently available and valid

Parameters:

Name Type Description Default
datastack_name str or None

Datastack name, by default None. If None, defaults to the value set in the client.

None
expired bool

Whether to include expired versions, by default False.

False

Returns:

Type Description
list[dict]

List of metadata dictionaries

ingest_annotation_table(table_name, datastack_name=None)

Trigger supervoxel lookup and root ID lookup of new annotations in a table.

Parameters:

Name Type Description Default
table_name str

Table to trigger

required
datastack_name str

Datastack to trigger it. Defaults to what is set in client.

None

Returns:

Type Description
dict

Status code of response from server

join_query(tables, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, suffixes=None, datastack_name=None, return_df=True, split_positions=False, materialization_version=None, metadata=True, desired_resolution=None, random_sample=None, log_warning=True)

Generic query on materialization tables

Parameters:

Name Type Description Default
tables list of lists with length 2 or 'str'

List of two-element lists, one per table: the first entry is the table name, the second entry is the column used for the join.

required
filter_in_dict dict of dicts

Outer layer: keys are table names. Inner layer: keys are column names, values are allowed entries. Defaults to None.

None
filter_out_dict dict of dicts

Outer layer: keys are table names. Inner layer: keys are column names, values are not allowed entries. Defaults to None.

None
filter_equal_dict dict of dicts

Outer layer: keys are table names. Inner layer: keys are column names, values are specified entry. Defaults to None.

None
filter_spatial_dict dict of dicts

outer layer: keys are table names, inner layer: keys are column names. Values are bounding boxes as [[min_x, min_y,min_z],[max_x, max_y, max_z]], expressed in units of the voxel_resolution of this dataset. Defaults to None.

None
filter_regex_dict dict of dicts

outer layer: keys are table names. inner layer: keys are column names, values are regex strings. Defaults to None

None
select_columns dict of lists of str

Keys are table names, values are the lists of columns to select from that table. Defaults to None, which selects all columns. Passed to the server as select_column_maps. Passing a list instead sends it as select_columns, which is deprecated.

None
offset int

result offset to use. Defaults to None. Will only return top K results.

None
limit int

maximum results to return (server will set upper limit, see get_server_config)

None
suffixes dict

suffixes to use for duplicate columns, keys are table names, values are the suffix

None
datastack_name str

datastack to query. If None defaults to one specified in client.

None
return_df bool

Whether to return as a dataframe. Defaults to True. If False, data is returned as JSON (slower).

True
split_positions bool

Whether to break position columns into x, y, z columns. Defaults to False. If False, data is returned as one column with [x,y,z] array (slower).

False
materialization_version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
metadata bool

Toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.

True
desired_resolution Iterable

What resolution to convert position columns to. Defaults to None, in which case the default resolution is used.

None
random_sample int

if given, will do a tablesample of the table to return that many annotations

None
log_warning bool

Whether to log warnings, by default True

True

Returns:

Type Description
DataFrame

a pandas dataframe of results of query
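
Because the filter arguments of join_query are nested one level deeper than in single-table queries, a small check before sending the request can catch mistakes early. The sketch below is not part of the client; `check_join_filters` and all table and column names are hypothetical, and it assumes `tables` is given as [table_name, join_column] pairs.

```python
def check_join_filters(tables, **nested_filters):
    """Verify that nested filter dicts only reference tables that are joined.

    `tables` is a list of [table_name, join_column] pairs. Each keyword
    argument is a dict of dicts (table name -> column name -> value) or None.
    """
    joined = {table_name for table_name, _join_column in tables}
    for arg_name, nested in nested_filters.items():
        if nested is None:
            continue
        for table_name, columns in nested.items():
            if table_name not in joined:
                raise ValueError(
                    f"{arg_name}: table {table_name!r} is not part of the join"
                )
            if not isinstance(columns, dict):
                raise TypeError(
                    f"{arg_name}[{table_name!r}] must be a dict keyed by column name"
                )


# Hypothetical table and column names, used only to show the nesting.
tables = [["nucleus_table", "id"], ["cell_type_table", "target_id"]]
filters = {
    "filter_in_dict": {"cell_type_table": {"cell_type": ["BC", "MC"]}},
    "filter_out_dict": {"nucleus_table": {"id": [0]}},
    "filter_equal_dict": None,
}
check_join_filters(tables, **filters)

# With a real client the same arguments could then be passed on:
# df = client.materialize.join_query(tables, **filters)
```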

live_live_query(table, timestamp, joins=None, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, split_positions=False, metadata=True, suffixes=None, desired_resolution=None, allow_missing_lookups=False, random_sample=None, log_warning=True)

Beta method for querying CAVE annotation tables with root IDs and annotations at a particular timestamp. Note: this method requires more explicit mapping of filters and selection to tables, as it is designed to test a more general endpoint that should eventually support complex joins.

Parameters:

Name Type Description Default
table str

Principal table to query

required
timestamp datetime

Timestamp to query

required
joins

List of joins, where each join is a list of [table1,column1, table2, column2]

None
filter_in_dict

A dictionary with tables as keys, values are dicts with column keys and list values to accept.

None
filter_out_dict

A dictionary with tables as keys, values are dicts with column keys and list values to reject.

None
filter_equal_dict

A dictionary with tables as keys, values are dicts with column keys and values to equate.

None
filter_spatial_dict

A dictionary with tables as keys, values are dicts with column keys and values of 2x3 list of bounds.

None
select_columns

A dictionary with tables as keys, values are lists of columns to select.

None
offset int

Value to offset query by.

None
limit int

Limit of query.

None
datastack_name str

Datastack to query. Defaults to the one set in the client.

None
split_positions bool

Whether to split positions into separate columns, True is faster.

False
metadata bool

Whether to attach metadata to dataframe.

True
suffixes dict

What suffixes to use on joins, keys are table_names, values are suffixes.

None
desired_resolution Iterable

What resolution to convert position columns to.

None
allow_missing_lookups bool

If there are annotations without supervoxels and root IDs yet, allow results.

False
random_sample int

If given, will do a table sample of the table to return that many annotations.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description

Results of query

Examples:

>>> import datetime
>>> from caveclient import CAVEclient
>>> client = CAVEclient('minnie65_public_v117')
>>> # included, excluded, value and the min_/max_ coordinates are placeholders
>>> client.materialize.live_live_query("table_name", datetime.datetime.now(datetime.timezone.utc),
>>>    joins=[["table_name", "table_column", "joined_table", "joined_column"],
>>>           ["joined_table", "joincol2", "third_table", "joincol_third"]],
>>>    suffixes={
>>>        "table_name": "suffix1",
>>>        "joined_table": "suffix2",
>>>        "third_table": "suffix3"
>>>    },
>>>    select_columns={
>>>        "table_name": ["column", "names"],
>>>        "joined_table": ["joined_column"]
>>>    },
>>>    filter_in_dict={
>>>        "table_name": {
>>>            "column_name": [included, values]
>>>        }
>>>    },
>>>    filter_out_dict={
>>>        "table_name": {
>>>            "column_name": [excluded, values]
>>>        }
>>>    },
>>>    filter_equal_dict={
>>>        "table_name": {
>>>            "column_name": value
>>>        }
>>>    },
>>>    filter_spatial_dict={
>>>        "table_name": {
>>>            "column_name": [[min_x, min_y, min_z], [max_x, max_y, max_z]]
>>>        }
>>>    }
>>> )

live_query(table, timestamp, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, split_positions=False, post_filter=True, metadata=True, merge_reference=True, desired_resolution=None, random_sample=None, log_warning=True)

Generic query on materialization tables

Parameters:

Name Type Description Default
table str

Table to query

required
timestamp datetime

Time to materialize (in utc). Pass datetime.datetime.now(datetime.timezone.utc) for present time.

required
filter_in_dict dict

Keys are column names, values are allowed entries.

None
filter_out_dict dict

Keys are column names, values are not allowed entries.

None
filter_equal_dict dict

Keys are column names, values are specified entry.

None
filter_spatial_dict dict

Keys are column names, values are bounding boxes expressed in units of the voxel_resolution of this dataset. Bounding box is [[min_x, min_y,min_z],[max_x, max_y, max_z]].

None
filter_regex_dict dict

Keys are column names, values are regex strings.

None
select_columns list of str

Columns to select.

None
offset int

Offset in query result.

None
limit int

Maximum results to return (server will set upper limit, see get_server_config).

None
datastack_name str

Datastack to query. If None, defaults to one specified in client.

None
split_positions bool

Whether to break position columns into x,y,z columns. If False data is returned as one column with [x,y,z] array (slower).

False
post_filter bool

Whether to filter down the result based upon the filters specified. If False, it will return the query with present root_ids in the root_id columns, but the rows will reflect the filters translated into their past IDs. So if, for example, a cell had a false merger split off since the last materialization, those annotations on that incorrect portion of the cell will be included if this is False, but will be filtered down if this is True.

True
metadata bool

Toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.

True
merge_reference bool

Toggle to automatically join reference table. If True, metadata will be queried and, if it is a reference table, it will perform a join on the reference table to return the rows of that table.

True
desired_resolution Iterable

Desired resolution you want all spatial points returned in. If None, defaults to one specified in client, if that is None then points are returned as stored in the table and should be in the resolution specified in the table metadata.

None
random_sample int

If given, will do a tablesample of the table to return that many annotations.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description
DataFrame

A pandas dataframe of results of query
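
Since filter_spatial_dict expects bounding boxes in voxel units, coordinates known in nanometers have to be converted first. The following is a minimal sketch of that conversion; `bbox_nm_to_voxels`, the 4 x 4 x 40 nm resolution, and the column name `pt_position` are assumptions for illustration, and the real voxel resolution should be taken from the dataset's own metadata.

```python
import math


def bbox_nm_to_voxels(bbox_nm, voxel_resolution):
    """Convert a [[min_x, min_y, min_z], [max_x, max_y, max_z]] box from nm to voxels.

    The minimum corner is rounded down and the maximum corner is rounded up,
    so the voxel box always covers the requested nanometer box.
    """
    lower, upper = bbox_nm
    if not (len(lower) == len(upper) == len(voxel_resolution) == 3):
        raise ValueError("box corners and resolution must each have 3 entries")
    if any(lo > hi for lo, hi in zip(lower, upper)):
        raise ValueError("each minimum coordinate must not exceed its maximum")
    min_vox = [math.floor(c / r) for c, r in zip(lower, voxel_resolution)]
    max_vox = [math.ceil(c / r) for c, r in zip(upper, voxel_resolution)]
    return [min_vox, max_vox]


# Illustrative values only.
bbox = bbox_nm_to_voxels([[400, 800, 4000], [4010, 8000, 40010]], [4, 4, 40])
filter_spatial_dict = {"pt_position": bbox}
print(filter_spatial_dict)
```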

lookup_supervoxel_ids(table_name, annotation_ids=None, datastack_name=None)

Trigger supervoxel lookups of new annotations in a table.

Parameters:

Name Type Description Default
table_name str

Table to trigger

required
annotation_ids list

List of annotation ids to lookup. Default is None, which will trigger lookup of entire table.

None
datastack_name str

Datastack to trigger it. Defaults to what is set in client.

None

Returns:

Type Description
dict

Status code of response from server

map_filters(filters, timestamp, timestamp_past)

Translate a list of filter dictionaries from a point in the future to a point in the past

Parameters:

Name Type Description Default
filters list[dict]

filter dictionaries with root_ids

required
timestamp datetime

timestamp to query

required
timestamp_past datetime

timestamp to query from

required

Returns:

Type Description
list[dict]

filter dictionaries with past root_ids

dict

mapping of future root_ids to past root_ids

most_recent_version(datastack_name=None)

Get the most recent version of materialization for this datastack name

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client.

None

Returns:

Type Description
int

Most recent version of materialization for this datastack name

query_table(table, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, return_df=True, split_positions=False, materialization_version=None, timestamp=None, metadata=True, merge_reference=True, desired_resolution=None, get_counts=False, random_sample=None, log_warning=True)

Generic query on materialization tables

Parameters:

Name Type Description Default
table str

Table to query

required
filter_in_dict dict

Keys are column names, values are allowed entries, by default None

None
filter_out_dict dict

Keys are column names, values are not allowed entries, by default None

None
filter_equal_dict dict

Keys are column names, values are specified entry, by default None

None
filter_spatial_dict dict

Keys are column names, values are bounding boxes expressed in units of the voxel_resolution of this dataset. Bounding box is [[min_x, min_y,min_z],[max_x, max_y, max_z]], by default None

None
filter_regex_dict dict

Keys are column names, values are regex strings, by default None

None
select_columns list of str

Columns to select, by default None

None
offset int

Result offset to use, by default None. Will only return top K results.

None
limit int

Maximum results to return (server will set upper limit, see get_server_config), by default None

None
datastack_name str

Datastack to query, by default None. If None, defaults to one specified in client.

None
return_df bool

Whether to return as a dataframe, by default True. If False, data is returned as json (slower).

True
split_positions bool

Whether to break position columns into x,y,z columns, by default False. If False data is returned as one column with [x,y,z] array (slower)

False
materialization_version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
timestamp datetime

Timestamp to query, by default None. If passed, a live query is performed. Passing a materialization version as well is an error.

None
metadata bool

Toggle to return metadata, by default True. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.

True
merge_reference bool

Toggle to automatically join reference table, by default True. If True, metadata will be queried and, if it is a reference table, it will perform a join on the reference table to return the rows of that table.

True
desired_resolution Iterable[float]

Desired resolution you want all spatial points returned in, by default None. If None, defaults to one specified in client, if that is None then points are returned as stored in the table and should be in the resolution specified in the table metadata

None
get_counts bool

Whether to get counts of the query, by default False

False
random_sample int

If given, will do a tablesample of the table to return that many annotations

None
log_warning bool

Whether to log warnings, by default True

True

Returns:

Type Description
DataFrame

A pandas dataframe of results of query
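
The offset and limit parameters can be combined to read a result in pages when it exceeds the server-side limit. The sketch below keeps the paging logic independent of the client; `fetch_all_pages` and `fetch_page` are hypothetical names, and the approach assumes the server returns rows in a stable order across requests, which this page does not guarantee.

```python
def fetch_all_pages(fetch_page, page_size, max_pages=1000):
    """Collect rows by calling fetch_page(offset, limit) until a short page arrives.

    `fetch_page` is any callable returning a sequence of rows. A page with
    fewer than `page_size` rows is taken to be the last one.
    """
    rows = []
    for page in range(max_pages):
        batch = list(fetch_page(page * page_size, page_size))
        rows.extend(batch)
        if len(batch) < page_size:
            break
    else:
        # The loop never saw a short page within the allowed number of requests.
        raise RuntimeError("max_pages reached before the result was exhausted")
    return rows


# Stand-in data source so the sketch runs without a server.
data = list(range(25))
rows = fetch_all_pages(lambda offset, limit: data[offset:offset + limit], page_size=10)
print(len(rows))

# With a real client, fetch_page might wrap query_table (assuming that
# return_df=False yields a list of records):
# fetch_page = lambda offset, limit: client.materialize.query_table(
#     "my_table", offset=offset, limit=limit, return_df=False)
```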

raise_for_status(r, log_warning=True) staticmethod

Raises requests.HTTPError, if one occurred.

synapse_query(pre_ids=None, post_ids=None, bounding_box=None, bounding_box_column='post_pt_position', timestamp=None, remove_autapses=True, include_zeros=True, limit=None, offset=None, split_positions=False, desired_resolution=None, materialization_version=None, synapse_table=None, datastack_name=None, metadata=True)

Convenience method for querying synapses.

Will use the synapse table specified in the info service by default. It will also remove autapses by default. NOTE: This is not designed to allow querying of the entire synapse table. A query with no filters will return only a limited number of rows (configured by the server) and will do so in a non-deterministic fashion. Please contact your dataset administrator if you want access to the entire table.

Parameters:

Name Type Description Default
pre_ids Union[int, Iterable, ndarray]

Pre-synaptic cell(s) to query.

None
post_ids Union[int, Iterable, ndarray]

Post-synaptic cell(s) to query.

None
bounding_box Optional[Union[list, ndarray]]

[[min_x, min_y, min_z],[max_x, max_y, max_z]] bounding box to filter synapse locations. Expressed in units of the voxel_resolution of this dataset.

None
bounding_box_column str

Which synapse location column to filter by.

'post_pt_position'
timestamp datetime

Timestamp to query. If passed, the query is recalculated at that timestamp. Do not pass together with materialization_version.

None
remove_autapses bool

Whether to remove autapses from query results.

True
include_zeros bool

Whether to include synapses to/from id=0 (out of segmentation).

True
limit int

Number of synapses to limit. Server-side limit still applies.

None
offset int

Number of synapses to offset query.

None
split_positions bool

Whether to split positions into separate columns, True is faster.

False
desired_resolution Iterable[float]

List or array of the desired resolution you want positions returned in. Useful for materialization queries.

None
materialization_version Optional[int]

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
metadata bool

Whether to attach metadata to dataframe in the df.attrs dictionary.

True

Returns:

Type Description
DataFrame

Results of query.
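
To make the meaning of remove_autapses and include_zeros concrete, the sketch below applies the same two rules to plain Python records. It illustrates the semantics only and says nothing about how the client or server implements them; `filter_synapses` is a hypothetical helper, and the column names `pre_pt_root_id` and `post_pt_root_id` are assumptions about the synapse table layout.

```python
def filter_synapses(
    rows,
    remove_autapses=True,
    include_zeros=True,
    pre_col="pre_pt_root_id",
    post_col="post_pt_root_id",
):
    """Keep synapse records according to the autapse and zero-ID rules."""
    kept = []
    for row in rows:
        pre_id, post_id = row[pre_col], row[post_col]
        if remove_autapses and pre_id == post_id:
            continue  # synapse from a cell onto itself
        if not include_zeros and (pre_id == 0 or post_id == 0):
            continue  # one side lies outside the segmentation
        kept.append(row)
    return kept


# Made-up records for illustration.
synapses = [
    {"id": 1, "pre_pt_root_id": 10, "post_pt_root_id": 20},
    {"id": 2, "pre_pt_root_id": 10, "post_pt_root_id": 10},
    {"id": 3, "pre_pt_root_id": 0, "post_pt_root_id": 20},
]
print([s["id"] for s in filter_synapses(synapses)])
print([s["id"] for s in filter_synapses(synapses, include_zeros=False)])
```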

MaterializationClientV3(*args, **kwargs)

Bases: MaterializationClientV2

cg_client property

The chunked graph client.

datastack_name property

The name of the datastack.

homepage: HTML property

The homepage for the materialization engine.

server_version: Optional[Version] property

The version of the service running on the remote server. Note that this refers to the software running on the server and has nothing to do with the version of the datastack itself.

tables: TableManager property

The table manager for the materialization engine.

version: int property writable

The version of the materialization. Can be used to set up the client to default to a specific version when timestamps or versions are not specified in queries. If not set, defaults to the most recent version.

Note that if this materialization client is attached to a CAVEclient, the version must be set at the CAVEclient level.

views: ViewManager property

The view manager for the materialization engine.

get_table_metadata(table_name, datastack_name=None, version=None, log_warning=True)

Get metadata about a table

Parameters:

Name Type Description Default
table_name str

Name of the table to get metadata for.

required
datastack_name str or None

Name of the datastack. If None, uses the one specified in the client.

None
version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
log_warning bool

Whether to print out warnings to the logger. Defaults to True.

True

Returns:

Type Description
dict

Metadata dictionary for table

get_tables(datastack_name=None, version=None)

Gets a list of table names for a datastack

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client.

None
version int or None

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None

Returns:

Type Description
list

List of table names

get_tables_metadata(datastack_name=None, version=None, log_warning=True)

Get metadata about tables

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack. If None, uses the one specified in the client.

None
version Optional[int]

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
log_warning bool

Whether to print out warnings to the logger. Defaults to True.

True

Returns:

Type Description
dict

Metadata dictionary for table

get_timestamp(version=None, datastack_name=None)

Get datetime.datetime timestamp for a materialization version.

Parameters:

Name Type Description Default
version int or None

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
datastack_name str or None

Datastack name, by default None. If None, defaults to the value set in the client.

None

Returns:

Type Description
datetime

Datetime when the materialization version was frozen.

get_unique_string_values(table, datastack_name=None)

Get unique string values for a table

Parameters:

Name Type Description Default
table str

Table to query

required
datastack_name Optional[str]

Datastack to query. If None, uses the one specified in the client.

None

Returns:

Type Description
dict[str]

A dictionary of column names and their unique values
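
One possible use of the returned dictionary is to check filter values before running a query, so that a misspelled category is caught locally. This is a sketch under the assumption that the result maps each string column name to a list of its values; `unknown_filter_values` and the sample data are hypothetical.

```python
def unknown_filter_values(unique_values, column, requested):
    """Return the requested values that do not occur in the given column."""
    if column not in unique_values:
        raise KeyError(f"no unique values reported for column {column!r}")
    known = set(unique_values[column])
    return [value for value in requested if value not in known]


# Made-up result in the documented shape: column name -> unique values.
unique_values = {"cell_type": ["BC", "MC", "23P"]}
print(unknown_filter_values(unique_values, "cell_type", ["BC", "XYZ"]))

# With a real client the dictionary might come from:
# unique_values = client.materialize.get_unique_string_values("my_table")
```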

get_version_metadata(version=None, datastack_name=None)

Get metadata about a version

Parameters:

Name Type Description Default
version int or None

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
datastack_name str or None

Datastack name, by default None. If None, defaults to the value set in the client.

None

Returns:

Type Description
dict

Dictionary of metadata about the version

get_versions(datastack_name=None, expired=False)

Get the versions available

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack, by default None. If None, uses the one specified in the client.

None
expired bool

Whether to include expired versions, by default False.

False

Returns:

Type Description
dict

Dictionary of versions available

get_versions_metadata(datastack_name=None, expired=False)

Get the metadata for all the versions that are presently available and valid

Parameters:

Name Type Description Default
datastack_name str or None

Datastack name, by default None. If None, defaults to the value set in the client.

None
expired bool

Whether to include expired versions, by default False.

False

Returns:

Type Description
list[dict]

List of metadata dictionaries

get_view_metadata(view_name, materialization_version=None, datastack_name=None, log_warning=True)

Get metadata for a view

Parameters:

Name Type Description Default
view_name str

Name of view to query.

required
materialization_version Optional[int]

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description
dict

Metadata of view

get_view_schema(view_name, materialization_version=None, datastack_name=None, log_warning=True)

Get schema for a view

Parameters:

Name Type Description Default
view_name str

Name of view to query.

required
materialization_version Optional[int]

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description
dict

Schema of view.

get_view_schemas(materialization_version=None, datastack_name=None, log_warning=True)

Get schemas for all views

Parameters:

Name Type Description Default
materialization_version Optional[int]

Version to query. If None, will use version set by client.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description
dict

Schemas of the views.

get_views(version=None, datastack_name=None)

Get all available views for a version

Parameters:

Name Type Description Default
version Optional[int]

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
datastack_name str

Datastack to query. If None, uses the one specified in the client.

None

Returns:

Type Description
list

List of views

ingest_annotation_table(table_name, datastack_name=None)

Trigger supervoxel lookup and root ID lookup of new annotations in a table.

Parameters:

Name Type Description Default
table_name str

Table to trigger

required
datastack_name str

Datastack to trigger it. Defaults to what is set in client.

None

Returns:

Type Description
dict

Status code of response from server

join_query(tables, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, suffixes=None, datastack_name=None, return_df=True, split_positions=False, materialization_version=None, metadata=True, desired_resolution=None, random_sample=None, log_warning=True)

Generic query on materialization tables

Parameters:

Name Type Description Default
tables list of lists with length 2 or 'str'

List of two-element lists, one per table: the first entry is the table name, the second entry is the column used for the join.

required
filter_in_dict dict of dicts

Outer layer: keys are table names. Inner layer: keys are column names, values are allowed entries. Defaults to None.

None
filter_out_dict dict of dicts

Outer layer: keys are table names. Inner layer: keys are column names, values are not allowed entries. Defaults to None.

None
filter_equal_dict dict of dicts

Outer layer: keys are table names. Inner layer: keys are column names, values are specified entry. Defaults to None.

None
filter_spatial_dict dict of dicts

outer layer: keys are table names, inner layer: keys are column names. Values are bounding boxes as [[min_x, min_y,min_z],[max_x, max_y, max_z]], expressed in units of the voxel_resolution of this dataset. Defaults to None.

None
filter_regex_dict dict of dicts

outer layer: keys are table names. inner layer: keys are column names, values are regex strings. Defaults to None

None
select_columns dict of lists of str

Keys are table names, values are the lists of columns to select from that table. Defaults to None, which selects all columns. Passed to the server as select_column_maps. Passing a list instead sends it as select_columns, which is deprecated.

None
offset int

result offset to use. Defaults to None. Will only return top K results.

None
limit int

maximum results to return (server will set upper limit, see get_server_config)

None
suffixes dict

suffixes to use for duplicate columns, keys are table names, values are the suffix

None
datastack_name str

datastack to query. If None defaults to one specified in client.

None
return_df bool

Whether to return as a dataframe. Defaults to True. If False, data is returned as JSON (slower).

True
split_positions bool

Whether to break position columns into x, y, z columns. Defaults to False. If False, data is returned as one column with [x,y,z] array (slower).

False
materialization_version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
metadata bool

Toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.

True
desired_resolution Iterable

What resolution to convert position columns to. Defaults to None, in which case the default resolution is used.

None
random_sample int

if given, will do a tablesample of the table to return that many annotations

None
log_warning bool

Whether to log warnings, by default True

True

Returns:

Type Description
DataFrame

a pandas dataframe of results of query

live_live_query(table, timestamp, joins=None, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, split_positions=False, metadata=True, suffixes=None, desired_resolution=None, allow_missing_lookups=False, allow_invalid_root_ids=False, random_sample=None, log_warning=True)

Beta method for querying CAVE annotation tables with root IDs and annotations at a particular timestamp. Note: this method requires more explicit mapping of filters and selection to tables, as it is designed to test a more general endpoint that should eventually support complex joins.

Parameters:

Name Type Description Default
table str

Principal table to query

required
timestamp datetime

Timestamp to query

required
joins

List of joins, where each join is a list of [table1,column1, table2, column2]

None
filter_in_dict

A dictionary with tables as keys, values are dicts with column keys and list values to accept.

None
filter_out_dict

A dictionary with tables as keys, values are dicts with column keys and list values to reject.

None
filter_equal_dict

A dictionary with tables as keys, values are dicts with column keys and values to equate.

None
filter_spatial_dict

A dictionary with tables as keys, values are dicts with column keys and values of 2x3 list of bounds.

None
filter_regex_dict

A dictionary with tables as keys, values are dicts with column keys and values of regex strings.

None
select_columns

A dictionary with tables as keys, values are lists of columns to select.

None
offset int

Value to offset query by.

None
limit int

Limit of query.

None
datastack_name str

Datastack to query. Defaults to the one set in the client.

None
split_positions bool

Whether to split positions into separate columns. True is faster.

False
metadata bool

Whether to attach metadata to dataframe.

True
suffixes dict

What suffixes to use on joins, keys are table_names, values are suffixes.

None
desired_resolution Iterable

What resolution to convert position columns to.

None
allow_missing_lookups bool

If there are annotations without supervoxels and root IDs yet, allow results.

False
allow_invalid_root_ids bool

If True, ignore root ids not valid at the given timestamp, otherwise raise an error.

False
random_sample int

If given, will do a table sample of the table to return that many annotations.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description

Results of query

Examples:

>>> import datetime
>>> from caveclient import CAVEclient
>>> client = CAVEclient('minnie65_public_v117')
>>> # Unquoted names (included, values, excluded, value, min_x, ...) are
>>> # placeholders to replace with your own values.
>>> client.materialize.live_live_query(
...     "table_name",
...     datetime.datetime.now(datetime.timezone.utc),
...     joins=[
...         ["table_name", "table_column", "joined_table", "joined_column"],
...         ["joined_table", "joincol2", "third_table", "joincol_third"],
...     ],
...     suffixes={
...         "table_name": "suffix1",
...         "joined_table": "suffix2",
...         "third_table": "suffix3",
...     },
...     select_columns={
...         "table_name": ["column", "names"],
...         "joined_table": ["joined_column"],
...     },
...     filter_in_dict={
...         "table_name": {
...             "column_name": [included, values]
...         }
...     },
...     filter_out_dict={
...         "table_name": {
...             "column_name": [excluded, values]
...         }
...     },
...     filter_equal_dict={
...         "table_name": {
...             "column_name": value
...         }
...     },
...     filter_spatial_dict={
...         "table_name": {
...             "column_name": [[min_x, min_y, min_z], [max_x, max_y, max_z]]
...         }
...     },
...     filter_regex_dict={
...         "table_name": {
...             "column_name": "regex_string"
...         }
...     },
... )

live_query(table, timestamp, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, split_positions=False, post_filter=True, metadata=True, merge_reference=True, desired_resolution=None, random_sample=None, log_warning=True)

Generic query on materialization tables

Parameters:

Name Type Description Default
table str

Table to query

required
timestamp datetime

Time to materialize (in utc). Pass datetime.datetime.now(datetime.timezone.utc) for present time.

required
filter_in_dict dict

Keys are column names, values are allowed entries.

None
filter_out_dict dict

Keys are column names, values are not allowed entries.

None
filter_equal_dict dict

Keys are column names, values are specified entry.

None
filter_spatial_dict dict

Keys are column names, values are bounding boxes expressed in units of the voxel_resolution of this dataset. Bounding box is [[min_x, min_y,min_z],[max_x, max_y, max_z]].

None
filter_regex_dict dict

Keys are column names, values are regex strings.

None
select_columns list of str

Columns to select.

None
offset int

Offset in query result.

None
limit int

Maximum results to return (server will set upper limit, see get_server_config).

None
datastack_name str

Datastack to query. If None, defaults to one specified in client.

None
split_positions bool

Whether to break position columns into x, y, z columns. If False, data is returned as one column with an [x, y, z] array (slower).

False
post_filter bool

Whether to filter down the result based upon the filters specified. If False, it will return the query with present root_ids in the root_id columns, but the rows will reflect the filters translated into their past IDs. So if, for example, a cell had a false merger split off since the last materialization, those annotations on that incorrect portion of the cell will be included if this is False, but will be filtered down if this is True.

True
metadata bool

Toggle to return metadata. If True, return table and query metadata in the df.attrs dictionary.

True
merge_reference bool

Toggle to automatically join reference table. If True, metadata will be queried and, if the table is a reference table, a join will be performed on the reference table to return the rows of that table.

True
desired_resolution Iterable

Desired resolution you want all spatial points returned in. If None, defaults to the one specified in the client. If that is also None, points are returned as stored in the table and should be in the resolution specified in the table metadata.

None
random_sample int

If given, will do a tablesample of the table to return that many annotations.

None
log_warning bool

Whether to log warnings.

True

Returns:

Type Description
DataFrame

A pandas dataframe of results of query
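As a rough illustration of how the arguments above fit together, the sketch below wraps a live_query call that restricts results to a set of root IDs and a bounding box. The helper name `run_live_query` and the column names `pt_root_id` and `pt_position` are hypothetical and depend on the table schema. `client` is assumed to be a CAVEclient instance (or any object exposing `client.materialize.live_query`).

```python
import datetime


def run_live_query(client, table, root_ids, bounds):
    """Live query limited to some root IDs and a bounding box.

    `bounds` is [[min_x, min_y, min_z], [max_x, max_y, max_z]].
    Column names below are hypothetical; adapt them to your table.
    """
    # Materialize at the present time, expressed in UTC.
    now = datetime.datetime.now(datetime.timezone.utc)
    return client.materialize.live_query(
        table,
        now,
        filter_in_dict={"pt_root_id": list(root_ids)},
        filter_spatial_dict={"pt_position": bounds},
        split_positions=True,
    )
```

With a real client this would be called as `run_live_query(client, "my_table", [root_id], bounds)`, where `"my_table"` stands in for an actual table name.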

lookup_supervoxel_ids(table_name, annotation_ids=None, datastack_name=None)

Trigger supervoxel lookups of new annotations in a table.

Parameters:

Name Type Description Default
table_name str

Table to trigger

required
annotation_ids list

List of annotation ids to lookup. Default is None, which will trigger lookup of entire table.

None
datastack_name str

Datastack to trigger it. Defaults to what is set in client.

None

Returns:

Type Description
dict

Status code of response from server

map_filters(filters, timestamp, timestamp_past)

Translate a list of filter dictionaries from a point in the future to a point in the past

Parameters:

Name Type Description Default
filters list[dict]

filter dictionaries with root_ids

required
timestamp datetime

timestamp to query

required
timestamp_past datetime

timestamp to query from

required

Returns:

Type Description
list[dict]

filter dictionaries with past root_ids

dict

mapping of future root_ids to past root_ids
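To make the two return values concrete, here is a conceptual illustration of what translating filters through a future-to-past mapping looks like. It is not the library's implementation: the helper `remap_filters`, the column name `pt_root_id`, and the ID values are all made up, and the mapping is supplied by hand rather than looked up from the chunked graph.

```python
def remap_filters(filters, future_to_past):
    """Replace root IDs in each filter dict with their past equivalents.

    `future_to_past` maps a root ID to a list of past root IDs
    (one ID can correspond to several past IDs, e.g. after a merge).
    IDs with no entry in the mapping are kept unchanged.
    """
    remapped = []
    for filt in filters:
        new_filt = {}
        for column, root_ids in filt.items():
            past_ids = []
            for rid in root_ids:
                past_ids.extend(future_to_past.get(rid, [rid]))
            new_filt[column] = past_ids
        remapped.append(new_filt)
    return remapped


filters = [{"pt_root_id": [100, 200]}]
future_to_past = {100: [10, 11], 200: [20]}
past_filters = remap_filters(filters, future_to_past)
```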

most_recent_version(datastack_name=None)

Get the most recent version of materialization for this datastack name

Parameters:

Name Type Description Default
datastack_name str or None

Name of the datastack, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client

None

Returns:

Type Description
int

Most recent version of materialization for this datastack name

query_table(table, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, return_df=True, split_positions=False, materialization_version=None, timestamp=None, metadata=True, merge_reference=True, desired_resolution=None, get_counts=False, random_sample=None, log_warning=True)

Generic query on materialization tables

Parameters:

Name Type Description Default
table str

Table to query

required
filter_in_dict dict

Keys are column names, values are allowed entries, by default None

None
filter_out_dict dict

Keys are column names, values are not allowed entries, by default None

None
filter_equal_dict dict

Keys are column names, values are specified entry, by default None

None
filter_spatial_dict dict

Keys are column names, values are bounding boxes expressed in units of the voxel_resolution of this dataset. Bounding box is [[min_x, min_y,min_z],[max_x, max_y, max_z]], by default None

None
filter_regex_dict dict

Keys are column names, values are regex strings, by default None

None
select_columns list of str

Columns to select, by default None

None
offset int

Result offset to use, by default None. Will only return top K results.

None
limit int

Maximum results to return (server will set upper limit, see get_server_config), by default None

None
datastack_name str

Datastack to query, by default None. If None, defaults to one specified in client.

None
return_df bool

Whether to return as a dataframe, by default True. If False, data is returned as json (slower).

True
split_positions bool

Whether to break position columns into x, y, z columns, by default False. If False, data is returned as one column with an [x, y, z] array (slower).

False
materialization_version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
timestamp datetime

Timestamp to query, by default None. If passed, will do a live query. Raises an error if a materialization version is also passed.

None
metadata bool

Toggle to return metadata, by default True. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.

True
merge_reference bool

Toggle to automatically join reference table, by default True. If True, metadata will be queried and, if the table is a reference table, a join will be performed on the reference table to return the rows of that table.

True
desired_resolution Iterable[float]

Desired resolution you want all spatial points returned in, by default None. If None, defaults to the one specified in the client. If that is also None, points are returned as stored in the table and should be in the resolution specified in the table metadata.

None
get_counts bool

Whether to get counts of the query, by default False

False
random_sample int

If given, will do a tablesample of the table to return that many annotations.

None
log_warning bool

Whether to log warnings, by default True

True

Returns:

Type Description
DataFrame

A pandas dataframe of results of query
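Because filter_spatial_dict expects bounds in units of the dataset's voxel resolution, a bounding box known in nanometers has to be divided by that resolution first. The sketch below shows the arithmetic. The helper `nm_to_voxel_bounds`, the resolution `[4, 4, 40]`, and the column name `pt_position` are assumptions for illustration, so check your dataset's actual voxel resolution.

```python
def nm_to_voxel_bounds(bounds_nm, voxel_resolution):
    """Convert [[min_x, min_y, min_z], [max_x, max_y, max_z]] from nm to voxels."""
    return [
        [coord / res for coord, res in zip(corner, voxel_resolution)]
        for corner in bounds_nm
    ]


# Assumed nm-per-voxel resolution, for illustration only.
voxel_resolution = [4, 4, 40]
bounds_nm = [[400_000, 800_000, 40_000], [408_000, 808_000, 80_000]]
bounds_vox = nm_to_voxel_bounds(bounds_nm, voxel_resolution)

# Hypothetical position column name; adapt to the table being queried.
spatial_filter = {"pt_position": bounds_vox}
```

The resulting dictionary could then be passed as `filter_spatial_dict=spatial_filter` to query_table.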

query_view(view_name, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset=None, limit=None, datastack_name=None, return_df=True, split_positions=False, materialization_version=None, metadata=True, merge_reference=True, desired_resolution=None, get_counts=False, random_sample=None)

Generic query on a view

Parameters:

Name Type Description Default
view_name str

View to query

required
filter_in_dict dict

Keys are column names, values are allowed entries, by default None

None
filter_out_dict dict

Keys are column names, values are not allowed entries, by default None

None
filter_equal_dict dict

Keys are column names, values are specified entry, by default None

None
filter_spatial_dict dict

Keys are column names, values are bounding boxes expressed in units of the voxel_resolution of this dataset. Bounding box is [[min_x, min_y,min_z],[max_x, max_y, max_z]], by default None

None
filter_regex_dict dict

Keys are column names, values are regex strings, by default None

None
select_columns list of str

Columns to select, by default None

None
offset int

Result offset to use, by default None. Will only return top K results.

None
limit int

Maximum results to return (server will set upper limit, see get_server_config), by default None

None
datastack_name str

Datastack to query, by default None. If None, defaults to one specified in client.

None
return_df bool

Whether to return as a dataframe, by default True. If False, data is returned as json (slower).

True
split_positions bool

Whether to break position columns into x, y, z columns, by default False. If False, data is returned as one column with an [x, y, z] array (slower).

False
materialization_version int

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
metadata bool

Toggle to return metadata, by default True. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.

True
merge_reference bool

Toggle to automatically join reference table, by default True. If True, metadata will be queried and, if the table is a reference table, a join will be performed on the reference table to return the rows of that table.

True
desired_resolution Iterable[float]

Desired resolution you want all spatial points returned in, by default None. If None, defaults to the one specified in the client. If that is also None, points are returned as stored in the table and should be in the resolution specified in the table metadata.

None
get_counts bool

Whether to get counts of the query, by default False

False
random_sample int

If given, will do a tablesample of the table to return that many annotations.

None

Returns:

Type Description
DataFrame

A pandas dataframe of results of query

raise_for_status(r, log_warning=True) staticmethod

Raises requests.HTTPError, if one occurred.

synapse_query(pre_ids=None, post_ids=None, bounding_box=None, bounding_box_column='post_pt_position', timestamp=None, remove_autapses=True, include_zeros=True, limit=None, offset=None, split_positions=False, desired_resolution=None, materialization_version=None, synapse_table=None, datastack_name=None, metadata=True)

Convenience method for querying synapses.

Will use the synapse table specified in the info service by default. It will also remove autapses by default. NOTE: This is not designed to allow querying of the entire synapse table. A query with no filters will return only a limited number of rows (configured by the server) and will do so in a non-deterministic fashion. Please contact your dataset administrator if you want access to the entire table.

Parameters:

Name Type Description Default
pre_ids Union[int, Iterable, ndarray]

Pre-synaptic cell(s) to query.

None
post_ids Union[int, Iterable, ndarray]

Post-synaptic cell(s) to query.

None
bounding_box Optional[Union[list, ndarray]]

[[min_x, min_y, min_z],[max_x, max_y, max_z]] bounding box to filter synapse locations. Expressed in units of the voxel_resolution of this dataset.

None
bounding_box_column str

Which synapse location column to filter by.

'post_pt_position'
timestamp datetime

Timestamp to query. If passed, the query is recalculated at that timestamp. Do not pass together with materialization_version.

None
remove_autapses bool

Whether to remove autapses from query results.

True
include_zeros bool

Whether to include synapses to/from id=0 (out of segmentation).

True
limit int

Number of synapses to limit. Server-side limit still applies.

None
offset int

Number of synapses to offset query.

None
split_positions bool

Whether to split positions into separate columns. True is faster.

False
desired_resolution Iterable[float]

List or array of the desired resolution you want query results returned in. Useful for materialization queries.

None
materialization_version Optional[int]

The version of the datastack to query. If None, will query the client version, which defaults to the most recent version.

None
metadata bool

Whether to attach metadata to the dataframe in the df.attrs dictionary.

True

Returns:

Type Description
DataFrame

Results of query.
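A minimal sketch of a typical call, assuming `client` is a CAVEclient instance (or any object exposing `client.materialize.synapse_query`). The helper name `synapses_between` is hypothetical. It asks for the synapses from one cell onto another and excludes partners with id=0.

```python
def synapses_between(client, pre_id, post_id, version=None):
    """Query synapses from `pre_id` onto `post_id`.

    Autapses are removed by default (remove_autapses=True); here
    synapses to/from id=0 are also excluded via include_zeros=False.
    If `version` is None, the client's default version is used.
    """
    return client.materialize.synapse_query(
        pre_ids=pre_id,
        post_ids=post_id,
        include_zeros=False,
        materialization_version=version,
    )
```

Filtering on at least one of pre_ids, post_ids, or bounding_box matters here, since an unfiltered query returns only a server-limited, non-deterministic subset of rows.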

MaterializationClient(server_address, datastack_name=None, auth_client=None, cg_client=None, synapse_table=None, api_version='latest', version=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, desired_resolution=None, over_client=None)

Factory for returning a MaterializationClient

Parameters:

Name Type Description Default
server_address str

Server address to connect to (e.g. https://minniev1.microns-daf.com)

required
datastack_name str

Name of the datastack.

None
auth_client AuthClient or None

Authentication client to use to connect to server. If None, do not use authentication.

None
api_version str or int

Which version of the API to use. 0: legacy client (e.g. www.dynamicannotationframework.com); 2: new API version (e.g. minniev1.microns-daf.com); 'latest': default to the most recent (currently 2).

'latest'
cg_client

chunkedgraph client for live materializations

None
synapse_table

default synapse table for queries

None
version

Default version to query. If None, defaults to the latest version.

None
desired_resolution Iterable[float] or None

If given, should be a list or array of the desired resolution you want query results returned in. Useful for materialization queries.

None

Returns:

Type Description
ClientBaseWithDatastack

Materialization client for the requested API version

concatenate_position_columns(df, inplace=False)

function to take a dataframe with x,y,z position columns and replace them with one column per position with an xyz numpy array. Edits occur

Args:
df (pd.DataFrame): dataframe to alter
inplace (bool): whether to perform edits in place

Returns: pd.DataFrame: [description]

convert_position_columns(df, given_resolution, desired_resolution)

function to take a dataframe with x,y,z position columns and convert them to the desired resolution from the given resolution

Args:
df (pd.DataFrame): dataframe to alter
given_resolution (Iterable[float]): what the given resolution is
desired_resolution (Iterable[float]): what the desired resolution is

Returns: pd.DataFrame: [description]
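The underlying arithmetic for one position can be sketched as follows: multiply each coordinate by the given resolution and divide by the desired resolution. This is a plain-Python illustration on a single [x, y, z] list, not the library's dataframe implementation. The helper name `convert_position` and the sample values are made up, and any rounding the library may apply is not modeled.

```python
def convert_position(position, given_resolution, desired_resolution):
    """Rescale one [x, y, z] position from one voxel resolution to another."""
    return [
        p * given / desired
        for p, given, desired in zip(position, given_resolution, desired_resolution)
    ]


# A point stored at an assumed 4 x 4 x 40 nm voxel resolution.
pos_4_4_40 = [100, 200, 30]

# Convert to nanometers (resolution 1 x 1 x 1).
pos_nm = convert_position(pos_4_4_40, [4, 4, 40], [1, 1, 1])

# Convert to an 8 x 8 x 40 nm voxel grid.
pos_8_8_40 = convert_position(pos_4_4_40, [4, 4, 40], [8, 8, 40])
```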

deserialize_query_response(response)

Deserialize pyarrow responses