cogent3.core.sequence.RnaSequence

class RnaSequence(moltype: MolType[Any], seq: str | bytes | ndarray[tuple[Any, ...], dtype[integer]] | SeqViewABC, *, name: str | None = None, info: dict[str, Any] | Info | None = None, annotation_offset: int = 0, annotation_db: AnnotationDbABC | None = None)

Holds the standard RNA sequence.

Attributes:
annotation_db

the annotation database for the collection

annotation_offset

The offset between annotation coordinates and sequence coordinates.

info
moltype
name

Methods

add_feature(*, biotype, name, spans[, ...])

add a feature to annotation_db

annotate_matches_to(pattern, biotype, name)

Adds an annotation at sequence positions matching pattern.

can_match(other)

Returns True if every pos in self could match same pos in other.

can_mispair(other)

Returns True if any position in self could mispair with other.

can_pair(other)

Returns True if self and other could pair.

complement()

Returns complement of self, using data from MolType.

copy([exclude_annotations, sliced])

returns a copy of self

copy_annotations(seq_db)

copy annotations into attached annotation db

count(item)

Counts occurrences of item in the sequence; delegates to the underlying self._seq.

count_ambiguous()

Returns the number of ambiguous characters in the sequence.

count_degenerate()

Counts the degenerate bases in the specified sequence.

count_gaps()

Counts the gaps in the specified sequence.

count_kmers([k, use_hook])

return array of counts of all possible kmers of length k

count_variants()

Counts number of possible sequences matching the sequence, given any ambiguous characters in the sequence.

counts([motif_length, include_ambiguity, ...])

returns dict of counts of motifs

degap()

Deletes all gap characters from sequence.

diff(other)

Returns number of differences between self and other.

disambiguate([method])

Returns a non-degenerate sequence from a degenerate one.

distance(other[, function])

Returns distance between self and other using function(i,j).

frac_diff(other)

Returns fraction of positions where self and other differ.

frac_diff_gaps(other)

Returns fraction of positions where self and other differ in gap state.

frac_diff_non_gaps(other)

Returns fraction of non-gap positions where self differs from other.

frac_same(other)

Returns fraction of positions where self and other are the same.

frac_same_gaps(other)

Returns fraction of positions where self and other share gap states.

frac_same_non_gaps(other)

Returns fraction of non-gap positions where self matches other.

frac_similar(other, similar_pairs)

Returns fraction of positions where self[i] is similar to other[i].

from_rich_dict(data)

create a Sequence object from a rich dict

gap_indices()

Returns array of the indices of all gaps in the sequence

gap_vector()

Returns vector of True or False according to which pos are gaps or missing.

get_drawable(*[, biotype, width, vertical])

make a figure from sequence features

get_drawables(*[, biotype])

returns a dict of drawables, keyed by type

get_features(*[, biotype, name, start, ...])

yields Feature instances

get_in_motif_size([motif_length, warn])

returns sequence as list of non-overlapping motifs

get_kmers(k[, strict])

return all overlapping k-mers

get_name()

Returns the sequence name; prefer the name attribute instead.

get_translation([gc, incomplete_ok, ...])

translate to amino acid sequence

get_type()

Return the sequence type as moltype label.

has_annotation_db()

returns True if self has annotation db

has_terminal_stop([gc, strict])

Return True if the sequence has a terminal stop codon.

is_annotated([biotype])

returns True if sequence parent name has any annotations

is_degenerate()

Returns True if sequence contains degenerate characters.

is_gapped()

Returns True if sequence contains gaps.

is_strict()

Returns True if sequence contains only monomers.

is_valid()

Returns True if sequence contains no items absent from alphabet.

iter_kmers(k[, strict])

generates all overlapping k-mers.

make_feature(feature, *args)

return a Feature instance from feature data

matrix_distance(other, matrix)

Returns distance between self and other using a score matrix.

must_pair(other)

Returns True if all positions in self must pair with other.

mw([method, delta])

Returns the molecular weight of (one strand of) the sequence.

parent_coordinates([apply_offset])

returns seqid, start, stop, strand of this sequence on its parent

parse_out_gaps()

returns Map corresponding to gap locations and ungapped Sequence

rc()

Converts a nucleic acid sequence to its reverse complement.

replace_annotation_db(value[, check])

public interface to assigning the annotation_db

resolved_ambiguities()

Returns a list of sets of strings.

reverse_complement()

Converts a nucleic acid sequence to its reverse complement.

sample(*, n, with_replacement, motif_length, ...)

Returns random sample of positions from self, e.g. to bootstrap.

shuffle()

returns a randomized copy of the Sequence object

sliding_windows(window, step[, start, end])

Generator function that yields new sequence objects of a given length at a given interval.

strand_symmetry([motif_length])

returns G-test for strand symmetry

strip_bad()

Removes any symbols not in the alphabet.

strip_bad_and_gaps()

Removes any symbols not in the alphabet, and any gaps.

strip_degenerate()

Removes degenerate bases by stripping them out of the sequence.

to_array([apply_transforms])

returns the numpy array

to_dna()

Returns copy of self as DNA.

to_fasta([make_seqlabel, block_size])

Return string of self in FASTA format, no trailing newline

to_html([wrap, limit, colors, font_size, ...])

returns html with embedded styles for sequence colouring

to_json()

returns a json formatted string

to_moltype(moltype)

returns a copy of self converted to the given moltype

to_phylip([name_len, label_len])

Return string of self in one line for PHYLIP, no newline.

to_rich_dict([exclude_annotations])

returns {'name': name, 'seq': sequence, 'moltype': moltype.label}

to_rna()

Returns copy of self as RNA.

trim_stop_codon([gc, strict])

Removes a terminal stop codon from the sequence

with_masked_annotations(biotypes[, ...])

returns a sequence in which regions annotated with the given biotypes are replaced by mask_char if shadow is False; otherwise all other regions are masked.

with_termini_unknown()

Returns copy of sequence with terminal gaps remapped as missing.

write(filename[, format_name])

Write the sequence to a file.

gapped_by_map

gapped_by_map_motif_iter

gapped_by_map_segment_iter

to_dict
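
Example

Several of the summaries above describe simple per-position computations. The sketch below shows in plain Python, without importing cogent3, what rc(), count_variants() and get_kmers(k) compute for an RNA string. It is an illustration only, not the library's implementation: the function names reverse_complement, count_variants and overlapping_kmers, and the IUPAC_RNA and COMPLEMENT tables, are assumptions introduced here. The tables follow the standard IUPAC nucleotide codes, and gaps and missing-data characters are deliberately left out.

```python
from math import prod

# Hypothetical lookup table (not taken from cogent3): each IUPAC RNA symbol
# mapped to the canonical bases it can stand for.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}

# Complement of every symbol above, including the degenerate codes
# (for example R = A/G pairs with Y = U/C).
COMPLEMENT = str.maketrans("ACGURYSWKMBDHVN", "UGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Complement each symbol, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def count_variants(seq: str) -> int:
    """Number of unambiguous sequences that could match seq.

    Each position contributes as many choices as the bases its symbol
    can represent, so the total is the product over all positions.
    """
    return prod(len(IUPAC_RNA[symbol]) for symbol in seq)


def overlapping_kmers(seq: str, k: int) -> list[str]:
    """All overlapping substrings of length k, in order of position."""
    return [seq[i : i + k] for i in range(len(seq) - k + 1)]


print(reverse_complement("AUGC"))      # GCAU
print(count_variants("AUGRN"))         # 8 (R has 2 choices, N has 4)
print(overlapping_kmers("AUGC", 2))    # ['AU', 'UG', 'GC']
```

The real methods operate on RnaSequence objects and take their alphabet and complement data from the MolType, so results for gapped or otherwise non-canonical input may differ from this sketch.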