Description: Olmsted dataset input file.
Required: ['dataset_id', 'clones']
Type: object
paper
ident
Description: UUID specific to the given object
Type: string
build
samples
Description: Information about each of the samples
Type: array
seeds
Description: Information about each of the seed sequences
Type: array
clones
Description: Information about each of the clonal families
Type: array
dataset_id
Description: Unique identifier for a collection of data
Type: string
subjects
Description: Information about each of the subjects
Type: array
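As a rough illustration of the top-level object described above, the following Python sketch builds a minimal dataset dictionary and reports any missing required keys. The helper name `missing_dataset_keys` and all example values are hypothetical; only the key names come from the schema.

```python
# Required top-level keys, copied from the "Required" line of the dataset schema.
REQUIRED_DATASET_KEYS = ("dataset_id", "clones")


def missing_dataset_keys(dataset):
    """Return the required top-level keys that are absent from a dataset dict."""
    return [key for key in REQUIRED_DATASET_KEYS if key not in dataset]


# Hypothetical example values; only the key names are taken from the schema.
example_dataset = {
    "dataset_id": "example-dataset",
    "clones": [],
    "samples": [],
    "subjects": [],
    "seeds": [],
}

print(missing_dataset_keys(example_dataset))
print(missing_dataset_keys({"clones": []}))
```

A full validation would normally be done against the JSON schema itself; this check only covers the two required keys.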
Description: Information about a paper corresponding to this dataset
Required: ['authorstring']
Type: object
url
Description: Link to online version of the paper.
Type: string
authorstring
Description: String to be displayed citing authors, e.g. "Doe, et al.".
Type: string
Description: Information about how a dataset was built.
Required: ['commit']
Type: object
commit
Description: Commit SHA of whatever build system you used to process the data
Type: string
time
Description: Time at which build was initiated
Type: string
Description: A sample is a collection of sequences.
Required: ['locus']
Type: object
locus
Description: B-cell locus.
Type: string
ident
Description: UUID specific to the given object
Type: string
timepoint_id
Description: Timepoint associated with this sample (may choose "merged" if data has been combined from multiple timepoints)
Type: string
sample_id
Description: Sample id
Type: string
Description: A sequence of interest among other clonal family members.
Required: ['seed_id']
Type: ['object', 'null']
ident
Description: UUID specific to the given object
Type: string
seed_id
Description: Seed id
Type: string
Description: Clonal family of sequences deriving from a particular V(D)J rearrangement event
Required: ['unique_seqs_count', 'mean_mut_freq', 'v_alignment_start', 'v_alignment_end', 'j_alignment_start', 'j_alignment_end']
Type: object
j_call
Description: AIRR: J gene with allele of the inferred ancestor of the clone. For example, IGHJ4*02.
Type: string
clone_id
Description: AIRR: Identifier for the clone.
Type: string
seed_id
Description: Seed sequence id if any.
Type: ['string', 'null']
sample_id
Description: Id of the sample associated with this clonal family.
Type: string
v_call
Description: AIRR: V gene with allele of the inferred ancestor of the clone. For example, IGHV4-59*01.
Type: string
d_call
Description: AIRR: D gene with allele of the inferred ancestor of the clone. For example, IGHD3-10*01.
Type: string
subject_id
Description: Id of subject from which the clonal family was sampled.
Type: string
junction_length
Description: AIRR: Number of nucleotides in the junction. (see AIRR 'junction': Nucleotide sequence for the junction region of the inferred ancestor of the clone, where the junction is defined as the CDR3 plus the two flanking conserved codons.)
Type: integer
d_alignment_start
Description: AIRR: Start position of the D segment in both the sequence_alignment and germline_alignment fields (1-based closed interval).
Type: integer
unique_seqs_count
Description: Number of unique sequences in the clone
Type: integer
d_alignment_end
Description: AIRR: End position of the D segment in both the sequence_alignment and germline_alignment fields (1-based closed interval).
Type: integer
total_read_count
Description: Number of total reads represented by sequences in the clone.
Type: integer
v_alignment_start
Description: AIRR: Start position of the V segment in both the sequence_alignment and germline_alignment fields (1-based closed interval).
Type: integer
trees
Description: Phylogenetic trees, and possibly ancestral sequence reconstructions.
Type: array
j_alignment_end
Description: AIRR: End position of the J segment in both the sequence_alignment and germline_alignment fields (1-based closed interval).
Type: integer
j_alignment_start
Description: AIRR: Start position of the J segment in both the sequence_alignment and germline_alignment fields (1-based closed interval).
Type: integer
has_seed
Description: Does this clone have a seed sequence (see Seed schema) in it?
Type: boolean
ident
Description: UUID specific to the given object
Type: string
mean_mut_freq
Description: Mean mutation frequency across sequences in the clone.
Type: number
germline_alignment
Description: AIRR: Assembled, aligned, full-length inferred ancestor of the clone spanning the same region as the sequence_alignment field of nodes (typically the V(D)J region) and including the same set of corrections and spacers (if any).
Type: string
v_alignment_end
Description: AIRR: End position of the V segment in both the sequence_alignment and germline_alignment fields (1-based closed interval).
Type: integer
junction_start
Description: AIRR: Junction region start position in the alignment (1-based closed interval).
Type: integer
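The alignment coordinates in the clone schema above are 1-based closed intervals, whereas Python slices are 0-based and half-open. The sketch below shows one way to convert between the two. The helper names `segment` and `junction` are hypothetical, the toy sequence and coordinates are invented for illustration, and the sketch assumes junction_start indexes into the same alignment as the segment coordinates.

```python
def segment(alignment, start, end):
    """Return the part of `alignment` covered by a 1-based closed interval."""
    # 1-based closed [start, end] is the 0-based half-open slice [start - 1, end).
    return alignment[start - 1:end]


def junction(alignment, junction_start, junction_length):
    """Return the junction given its 1-based start and its length in nucleotides."""
    begin = junction_start - 1
    return alignment[begin:begin + junction_length]


# Toy clone: the sequence and all coordinates are made up for illustration.
toy_clone = {
    "germline_alignment": "AAACCCGGGTTT",
    "v_alignment_start": 1,
    "v_alignment_end": 6,
    "j_alignment_start": 10,
    "j_alignment_end": 12,
    "junction_start": 4,
    "junction_length": 6,
}

germline = toy_clone["germline_alignment"]
v_region = segment(germline, toy_clone["v_alignment_start"], toy_clone["v_alignment_end"])
j_region = segment(germline, toy_clone["j_alignment_start"], toy_clone["j_alignment_end"])
junction_region = junction(germline, toy_clone["junction_start"], toy_clone["junction_length"])

print(v_region, j_region, junction_region)
```

The same `segment` helper would apply to the sequence_alignment of a node, since the descriptions state that the coordinates refer to both fields.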
Description: Phylogenetic tree and possibly ancestral state reconstruction of sequences in a clonal family.
Required: ['newick', 'nodes']
Type: object
ident
Description: UUID specific to the given object
Type: string
newick
Description: AIRR: Newick string of the tree edges.
Type: string
clone_id
Description: AIRR: Identifier for the clone.
Type: string
downsampling_strategy
Description: If applicable, the downsampling method applied to the set of clonal sequences before passing them to a phylogenetic inference tool.
Type: string
tree_id
Description: AIRR: Identifier for the tree.
Type: string
nodes
Description: AIRR: Dictionary of nodes in the tree, keyed by sequence_id string.
Type: object
downsampled_count
Description: If applicable, the maximum number of sequences kept in the downsampling process.
Type: integer
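Because nodes is keyed by sequence_id and each sequence_id is meant to match an id in the newick string, a quick consistency check can catch mismatches early. The sketch below is a simplified illustration, not a full Newick parser: it assumes plain, unquoted labels and a tree with at least one pair of parentheses. The function names and the toy tree are hypothetical.

```python
import re


def newick_labels(newick):
    """Collect node labels from a Newick string (unquoted labels only)."""
    # A label directly follows "(", "," or ")" and runs until a Newick delimiter.
    return set(re.findall(r"[(,)]\s*([^\s(),:;]+)", newick))


def unmatched_labels(tree):
    """Return (labels missing from nodes, node keys missing from newick)."""
    labels = newick_labels(tree["newick"])
    node_ids = set(tree["nodes"])
    return sorted(labels - node_ids), sorted(node_ids - labels)


def toy_node(sequence_id):
    """Build a toy node holding the three required node fields."""
    return {
        "sequence_id": sequence_id,
        "sequence_alignment": "ATGGCC",  # made-up nucleotide sequence
        "sequence_alignment_aa": "MA",   # made-up amino acid sequence
    }


example_tree = {
    "newick": "((a:1,b:2)n1:0.5,c:3)root;",
    "nodes": {name: toy_node(name) for name in ("a", "b", "n1", "c", "root")},
}

print(unmatched_labels(example_tree))
```

For real data, a dedicated phylogenetics library would be a safer way to read the Newick string than a regular expression.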
Description: Information about the phylogenetic tree nodes and the sequences they represent
Required: ['sequence_id', 'sequence_alignment', 'sequence_alignment_aa']
Type: object
sequence_alignment_aa
Description: Amino acid sequence of the node, aligned to the germline_alignment for this clone, including any indel corrections or spacers.
Type: string
cluster_multiplicity
Description: If clonal family sequences were downsampled by clustering, the cumulative number of times sequences in the cluster were observed.
Type: ['integer', 'null']
timepoint_id
Description: Timepoint associated with sequence, if any.
Type: ['string', 'null']
sequence_id
Description: AIRR: Identifier for this node that matches the id in the newick string and, where possible, the sequence_id in the source repertoire.
Type: string
lbr
Description: Local branching rate (derivative of lbi; see https://arxiv.org/abs/2004.11868).
Type: ['number', 'null']
lbi
Description: Local branching index (see https://arxiv.org/abs/2004.11868).
Type: ['number', 'null']
cluster_timepoint_multiplicities
Description: Sequence multiplicity, broken down by timepoint, including sequences falling in the same cluster if clustering-based downsampling was performed.
Type: array
timepoint_multiplicities
Description: Sequence multiplicity, broken down by timepoint.
Type: array
multiplicity
Description: Number of times sequence was observed in the sample. The presence of a given sequence in a clonal family may represent many identical such sequences in the original sample.
Type: ['integer', 'null']
sequence_alignment
Description: AIRR: Nucleotide sequence of the node, aligned to the germline_alignment for this clone, including any indel corrections or spacers.
Type: string
affinity
Description: Affinity of the antibody for some antigen. Typically the inverse dissociation constant (K_d) in simulation, and the inverse IC50 in data.
Type: ['number', 'null']
Description: Multiplicity at a specific time.
Type: object
multiplicity
Description: Number of times sequence was observed at the given timepoint
Type: ['integer', 'null']
timepoint_id
Description: Id associated with the timepoint in question
Type: string
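Since multiplicity may be null in a per-timepoint entry, code that aggregates these entries needs to handle missing counts. The sketch below totals the counts in a timepoint_multiplicities array, treating null (None in Python) as zero. That treatment is an assumption made for illustration, and the function name and example values are hypothetical.

```python
def total_multiplicity(timepoint_multiplicities):
    """Sum per-timepoint multiplicities, counting null (None) entries as zero."""
    return sum(entry.get("multiplicity") or 0 for entry in timepoint_multiplicities)


# Made-up entries following the per-timepoint multiplicity schema above.
example_entries = [
    {"timepoint_id": "t1", "multiplicity": 3},
    {"timepoint_id": "t2", "multiplicity": None},
    {"timepoint_id": "t3", "multiplicity": 2},
]

print(total_multiplicity(example_entries))
```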
Description: Subject from which the clonal family was sampled.
Required: ['subject_id']
Type: object
subject_id
Description: Subject id
Type: string
ident
Description: UUID specific to the given object
Type: string
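Clones refer to subjects and samples by subject_id and sample_id. The schema as described here does not say that these references must resolve, so the following sketch simply assumes that checking them is useful. It lists clone references that do not match any entry in the dataset's subjects or samples arrays. The function name and example values are hypothetical.

```python
def dangling_references(dataset):
    """List (clone_id, field, value) for clone ids not found in subjects/samples."""
    known = {
        "subject_id": {s.get("subject_id") for s in dataset.get("subjects", [])},
        "sample_id": {s.get("sample_id") for s in dataset.get("samples", [])},
    }
    problems = []
    for clone in dataset.get("clones", []):
        for field, known_ids in known.items():
            # Only check fields the clone actually provides; both are optional.
            if field in clone and clone[field] not in known_ids:
                problems.append((clone.get("clone_id"), field, clone[field]))
    return problems


# Made-up dataset fragment; only the key names come from the schema.
example = {
    "dataset_id": "example-dataset",
    "subjects": [{"subject_id": "subject-1"}],
    "samples": [{"sample_id": "sample-1", "locus": "igh"}],
    "clones": [
        {"clone_id": "clone-1", "subject_id": "subject-1", "sample_id": "sample-1"},
        {"clone_id": "clone-2", "subject_id": "subject-2", "sample_id": "sample-1"},
    ],
}

print(dangling_references(example))
```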