Compute the Automorphisms of Mutational Events Between Two Codon Sequences Represented in a Given Abelian Group
Source:R/automorphisms.R
automorphisms.Rd
Given two codon sequences represented in a given Abelian group, this function computes the automorphisms describing codon mutational events. Basically, this function is a wrapper that calls the corresponding function for the specified Abelian group.
Usage
automorphisms(seqs = NULL, filepath = NULL, group = "Z4", ...)
# S4 method for DNAStringSet_OR_NULL
automorphisms(
seqs = NULL,
filepath = NULL,
group = c("Z5", "Z64", "Z125", "Z5^3"),
cube = c("ACGT", "TGCA"),
cube_alt = c("CATG", "GTAC"),
nms = NULL,
start = NA,
end = NA,
chr = 1L,
strand = "+",
num.cores = multicoreWorkers(),
tasks = 0L,
verbose = TRUE
)
Arguments
- seqs
An object from a DNAStringSet or DNAMultipleAlignment class carrying the DNA pairwise alignment of two sequences. The pairwise alignment provided in argument seqs or in the 'fasta' file filepath must correspond to codon sequences.
- filepath
A character vector containing the path to a file in fasta format to be read. This argument must be given if argument seqs is not provided.
- group
A character string denoting the group representation for the given base or codon as shown in reference (1).
- ...
Not in use.
- cube, cube_alt
A character string denoting pairs of the 24 Genetic-code cubes, as given in references (2-3). That is, the base pairs from the given cubes must be complementary to each other. Such a pair of cubes is called dual cubes and, as shown in reference (3), each pair integrates a group.
- nms
Optional. Only used if the DNA sequence alignment provided carries more than two sequences. A character string giving short names for the alignments to be compared. If not given, the automorphisms between pairwise alignments are named 'aln_1', 'aln_2', and so on.
- start, end, chr, strand
Optional parameters required to build a GRanges-class object. If not provided, the default values given in the function definition will be used.
- num.cores, tasks
Parameters for parallel computation using package BiocParallel: the number of cores to use, i.e., at most how many child processes will be run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).
- verbose
If TRUE, prints the progress bar.
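The requirement that the alignment "must correspond to codon sequences" can be checked before calling the function. The following is a minimal base-R sketch, assuming the two aligned sequences are available as plain character strings (copied here from the `aln` object shown in the Examples). The helper `split_codons` is hypothetical and is not part of GenomAutomorphism.

```r
# Hypothetical helper (not part of GenomAutomorphism): split one aligned
# sequence into codons, after checking that its width is a multiple of 3.
split_codons <- function(s) {
    stopifnot(is.character(s), length(s) == 1L, nchar(s) %% 3L == 0L)
    starts <- seq(1L, nchar(s), by = 3L)
    substring(s, starts, starts + 2L)
}

# The two aligned sequences printed for 'aln' in the Examples section
seq1 <- "ACCTATGTTGGTATT---GCGCTCCAACTCCTTGGCTCTAGCTCACTACAT"
seq2 <- "ATCTATGTTGGTATTACGACGCTCCAATTCCTTGGGTCC------CTCCTT"

codons1 <- split_codons(seq1)
codons2 <- split_codons(seq2)

# Number of codon positions in the alignment
length(codons1)

# Codon positions where the two sequences differ (mutations or gaps)
which(codons1 != codons2)
```

A sequence whose width is not a multiple of three makes `split_codons` stop with an error, which is a cheap way to catch an alignment that is not in codon frame.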
Value
This function returns an Automorphism-class object whose metadata
columns include seq1, seq2, autm, and cube; the printed examples also
show aa1, aa2, coord1, and coord2.
Details
Herein, automorphisms are algebraic descriptions of mutational
events observed in codon sequences represented on different Abelian groups.
In particular, as described in references (3-4), for each representation of
the codon set on a defined Abelian group there are 24 possible isomorphic
Abelian groups. These Abelian groups can be labeled based on the DNA
base-order used to generate them. The set of 24 Abelian groups can be
described as a group isomorphic to the symmetric group of degree four
(\(S_4\), see reference (4)). Function automorphismByRanges
permits the classification of the pairwise alignment of protein-coding
sub-regions based on the mutational events observed on it and on the
genetic-code cubes that describe them.
Automorphisms in Z5, Z64 and Z125 are described as functions
\(f(x) = k x \bmod 64\) and \(f(x) = k x \bmod 125\), where k and x are
elements from the set of integers modulo 64 or modulo 125, respectively. If
an automorphism cannot be found on any of the cubes provided in the
argument cube, then function automorphisms will search
for automorphisms in the cubes provided in the argument cube_alt.
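The relation \(f(x) = k x \bmod 64\) can be checked by hand against the coordinates printed in the Examples section. The sketch below uses base R only; `find_coef` is a hypothetical brute-force helper written for illustration and says nothing about how the package searches for or selects a coefficient.

```r
# Hypothetical helper: every k in 0..(n - 1) such that (k * x) %% n == y.
find_coef <- function(x, y, n = 64L) {
    k <- seq_len(n) - 1L
    k[(k * x) %% n == y]
}

# ACC -> ATC: coord1 = 17, coord2 = 49, reported autm = 33
(17 * 33) %% 64
find_coef(17, 49)

# GCG -> ACG: coord1 = 26, coord2 = 18, reported autm = 45
(26 * 45) %% 64
find_coef(26, 18)

# CTA -> CTC with the printed coordinates 52 and 53: 52 * k is always
# even and 53 is odd, so no coefficient satisfies the relation.
find_coef(52, 53)
```

When x is even, as for 26, more than one k can satisfy the relation, so the brute-force result may contain several candidates; the value reported in the autm column is one of them.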
Automorphisms in Z5^3 are described as functions \(f(x) = A x \bmod 5\), where A is a diagonal matrix.
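As a worked instance of the diagonal-matrix form, the sketch below reproduces two rows of the "Z5^3" output in the Examples section with base R. It assumes that the autm string (for example "1,2,1") lists the diagonal entries of A and that coord1 and coord2 are the codon coordinate vectors; this reading is taken from the printed output, not from the package internals.

```r
# Row 1 of the Z5^3 example: ACC (1:2:2) -> ATC (1:4:2), autm = "1,2,1"
A <- diag(c(1, 2, 1))
x <- c(1, 2, 2)
y <- as.vector((A %*% x) %% 5)
y

# Row 13 of the Z5^3 example: TCT (4:2:4) -> TCC (4:2:2), autm = "1,1,3"
A13 <- diag(c(1, 1, 3))
x13 <- c(4, 2, 4)
y13 <- as.vector((A13 %*% x13) %% 5)
y13
```

Because A is diagonal, each codon position is scaled independently, so a single-base substitution changes only one diagonal entry away from 1.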
Arguments cube and cube_alt must be pairs of dual cubes (see section 2.4 from reference 4).
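The 24 cubes and their dual pairs can be enumerated in a few lines of base R. This is only an illustrative sketch: it labels each cube by its base order and assumes, following the description of argument cube, that the dual of a cube is obtained by replacing every base with its complement (A with T, C with G). The helper `perms` is hypothetical.

```r
# Hypothetical helper: all permutations of a vector, as a list.
perms <- function(v) {
    if (length(v) <= 1L) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
        for (p in perms(v[-i])) {
            out[[length(out) + 1L]] <- c(v[i], p)
        }
    }
    out
}

# The 4! = 24 base orders that label the genetic-code cubes
cubes <- vapply(perms(c("A", "C", "G", "T")), paste, character(1),
                collapse = "")
length(cubes)

# Dual cube under the complementarity assumption stated above
dual <- chartr("ACGT", "TGCA", cubes)

# The default arguments form dual pairs under this rule
dual[cubes == "ACGT"]
dual[cubes == "CATG"]
```

Under this rule the defaults `cube = c("ACGT", "TGCA")` and `cube_alt = c("CATG", "GTAC")` are each a cube together with its dual.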
Methods
automorphismByRanges
This function returns a GRanges-class object.
Consecutive mutational events (on the codon sequence) described by
automorphisms on the same cube are grouped in a range.
automorphism_bycoef
This function returns a GRanges-class object.
Consecutive mutational events (on the codon sequence) described by
the same automorphism coefficients are grouped in a range.
getAutomorphisms
This function returns an AutomorphismList-class object as a list of
Automorphism-class objects, which inherit from GRanges-class objects.
conserved_regions
Returns an AutomorphismByCoef class object containing the
requested regions.
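The grouping of consecutive events into ranges can be pictured with run-length encoding. The sketch below is a base-R illustration only, not the implementation used by automorphismByRanges or automorphism_bycoef; it uses the cube and autm columns of the 11-codon "Z64" example (positions 1 to 33) from the Examples section. The helper `group_runs` is hypothetical.

```r
# Columns 'cube' and 'autm' from the Z64 example restricted to codons 1-11
cube <- c(rep("ACGT", 5), "Trnl", rep("ACGT", 5))
autm <- c(33, 1, 1, 1, 1, 0, 45, 1, 1, 41, 1)

# Hypothetical helper: collapse runs of equal consecutive values into
# (start, end, value) rows, with positions counted in codons.
group_runs <- function(x) {
    r <- rle(as.character(x))
    end <- cumsum(r$lengths)
    data.frame(start = end - r$lengths + 1L, end = end, value = r$values)
}

# Ranges of consecutive codons described on the same cube
by_cube <- group_runs(cube)
by_cube

# Ranges of consecutive codons sharing the same coefficient
by_coef <- group_runs(autm)
by_coef
```

Grouping by cube yields three ranges here (codons 1-5, 6, and 7-11), while grouping by coefficient splits the same stretch more finely because every mutated codon interrupts a run of the identity coefficient 1.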
References
1. Sanchez R, Morgado E, Grau R. Gene algebra from a genetic code algebraic structure. J Math Biol. 2005 Oct;51(4):431-57. doi: 10.1007/s00285-005-0332-8. Epub 2005 Jul 13. PMID: 16012800.
2. Robersy Sanchez, Jesus Barreto (2021) Genomic Abelian Finite Groups. doi:10.1101/2021.06.01.446543
3. M. V. Jose, E.R. Morgado, R. Sanchez, T. Govezensky, The 24 possible algebraic representations of the standard genetic code in six or in three dimensions, Adv. Stud. Biol. 4 (2012) 110-152.
4. R. Sanchez. Symmetric Group of the Genetic-Code Cubes. Effect of the Genetic-Code Architecture on the Evolutionary Process. MATCH Commun. Math. Comput. Chem. 79 (2018) 527-560.
Author
Robersy Sanchez (https://genomaths.com).
Examples
## Load a pairwise alignment
data("aln", package = "GenomAutomorphism")
aln
#> DNAStringSet object of length 2:
#> width seq
#> [1] 51 ACCTATGTTGGTATT---GCGCTCCAACTCCTTGGCTCTAGCTCACTACAT
#> [2] 51 ATCTATGTTGGTATTACGACGCTCCAATTCCTTGGGTCC------CTCCTT
## Automorphism on "Z5^3"
autms <- automorphisms(seqs = aln, group = "Z5^3", verbose = FALSE)
autms
#> Automorphism object with 17 ranges and 8 metadata columns:
#> seqnames ranges strand | seq1 seq2 aa1
#> <Rle> <IRanges> <Rle> | <character> <character> <character>
#> [1] 1 1 + | ACC ATC T
#> [2] 1 2 + | TAT TAT Y
#> [3] 1 3 + | GTT GTT V
#> [4] 1 4 + | GGT GGT G
#> [5] 1 5 + | ATT ATT I
#> ... ... ... ... . ... ... ...
#> [13] 1 13 + | TCT TCC S
#> [14] 1 14 + | AGC --- S
#> [15] 1 15 + | TCA --- S
#> [16] 1 16 + | CTA CTC L
#> [17] 1 17 + | CAT CTT H
#> aa2 coord1 coord2 autm cube
#> <character> <matrix> <matrix> <character> <character>
#> [1] I 1:2:2 1:4:2 1,2,1 ACGT
#> [2] Y 4:1:4 4:1:4 1,1,1 ACGT
#> [3] V 3:4:4 3:4:4 1,1,1 ACGT
#> [4] G 3:3:4 3:3:4 1,1,1 ACGT
#> [5] I 1:4:4 1:4:4 1,1,1 ACGT
#> ... ... ... ... ... ...
#> [13] S 4:2:4 4:2:2 1,1,3 ACGT
#> [14] - 1:3:2 0:0:0 0 Trnl
#> [15] - 4:2:1 0:0:0 0 Trnl
#> [16] L 2:4:1 2:4:2 1,1,2 ACGT
#> [17] L 2:1:4 2:4:4 1,4,1 ACGT
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
## Automorphism on "Z64"
autms <- automorphisms(seqs = aln, group = "Z64", verbose = FALSE)
autms
#> Automorphism object with 17 ranges and 8 metadata columns:
#> seqnames ranges strand | seq1 seq2 aa1
#> <Rle> <IRanges> <Rle> | <character> <character> <character>
#> [1] 1 1 + | ACC ATC T
#> [2] 1 2 + | TAT TAT Y
#> [3] 1 3 + | GTT GTT V
#> [4] 1 4 + | GGT GGT G
#> [5] 1 5 + | ATT ATT I
#> ... ... ... ... . ... ... ...
#> [13] 1 13 + | TCT TCC S
#> [14] 1 14 + | AGC --- S
#> [15] 1 15 + | TCA --- S
#> [16] 1 16 + | CTA CTC L
#> [17] 1 17 + | CAT CTT H
#> aa2 coord1 coord2 autm cube
#> <character> <numeric> <numeric> <numeric> <character>
#> [1] I 17 49 33 ACGT
#> [2] Y 15 15 1 ACGT
#> [3] V 59 59 1 ACGT
#> [4] G 43 43 1 ACGT
#> [5] I 51 51 1 ACGT
#> ... ... ... ... ... ...
#> [13] S 31 29 3 ACGT
#> [14] - 33 NA 0 Trnl
#> [15] - 28 NA 0 Trnl
#> [16] L 52 53 30 TGCA
#> [17] L 7 55 17 ACGT
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
## Automorphism on "Z64" from position 1 to 33
autms <- automorphisms(
seqs = aln,
group = "Z64",
start = 1,
end = 33,
verbose = FALSE
)
autms
#> Automorphism object with 11 ranges and 8 metadata columns:
#> seqnames ranges strand | seq1 seq2 aa1
#> <Rle> <IRanges> <Rle> | <character> <character> <character>
#> [1] 1 1 + | ACC ATC T
#> [2] 1 2 + | TAT TAT Y
#> [3] 1 3 + | GTT GTT V
#> [4] 1 4 + | GGT GGT G
#> [5] 1 5 + | ATT ATT I
#> [6] 1 6 + | --- ACG -
#> [7] 1 7 + | GCG ACG A
#> [8] 1 8 + | CTC CTC L
#> [9] 1 9 + | CAA CAA Q
#> [10] 1 10 + | CTC TTC L
#> [11] 1 11 + | CTT CTT L
#> aa2 coord1 coord2 autm cube
#> <character> <numeric> <numeric> <numeric> <character>
#> [1] I 17 49 33 ACGT
#> [2] Y 15 15 1 ACGT
#> [3] V 59 59 1 ACGT
#> [4] G 43 43 1 ACGT
#> [5] I 51 51 1 ACGT
#> [6] T NA 18 0 Trnl
#> [7] T 26 18 45 ACGT
#> [8] L 53 53 1 ACGT
#> [9] Q 4 4 1 ACGT
#> [10] F 53 61 41 ACGT
#> [11] L 55 55 1 ACGT
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
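The coord1 and coord2 values printed for "Z64" follow a simple pattern that can be reproduced in base R. The sketch below is an inference from the output above, not a description of the package internals: it assumes that each base takes the value of its zero-based position in the cube's base order and that the codon coordinate is 4 * (first base) + 16 * (second base) + (third base). The helper `codon_coord` is hypothetical.

```r
# Hypothetical helper: Z64 coordinate of a codon under the rule inferred
# from the printed output (4 * b1 + 16 * b2 + b3, bases numbered 0..3 in
# the order given by 'cube'). Gap codons give NA.
codon_coord <- function(codon, cube = "ACGT") {
    b <- match(strsplit(codon, "")[[1]], strsplit(cube, "")[[1]]) - 1L
    4L * b[1] + 16L * b[2] + b[3]
}

# Rows 1 and 17 of the Z64 example (cube ACGT)
codon_coord("ACC")
codon_coord("ATC")
codon_coord("CAT")
codon_coord("CTT")

# Check the reported coefficients: 33 for ACC -> ATC, 17 for CAT -> CTT
(codon_coord("ACC") * 33) %% 64 == codon_coord("ATC")
(codon_coord("CAT") * 17) %% 64 == codon_coord("CTT")

# Row 16 (CTA -> CTC) is reported on cube TGCA with coefficient 30
(codon_coord("CTA", "TGCA") * 30) %% 64 == codon_coord("CTC", "TGCA")
```

Under the same inferred rule, CTA and CTC map to 11 and 10 on base order TGCA, which agrees with the coefficient 30 reported for row 16. This is an observation drawn from the printed output rather than a documented property.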