Given two aligned DNA sequences represented in the Z5 Abelian group, this function computes the automorphisms describing the mutational events at each base position.

Usage

autZ5(
  seq = NULL,
  filepath = NULL,
  cube = c("ACGT", "TGCA"),
  cube_alt = c("CATG", "GTAC"),
  start = NA,
  end = NA,
  chr = 1L,
  strand = "+",
  num.cores = multicoreWorkers(),
  tasks = 0L,
  verbose = TRUE
)

Arguments

seq

A DNAStringSet or DNAMultipleAlignment object carrying the pairwise DNA alignment of two sequences.

filepath

A character vector containing the path to a file in fasta format to be read. This argument must be given if the seq argument is not provided.

cube, cube_alt

Each is a character vector denoting a pair of the 24 Genetic-code cubes, as given in references (2-3). The bases at corresponding positions of the two cubes in a pair must be complementary to each other. Such a pair of cubes is called dual cubes and, as shown in reference (3), each pair integrates a group.

start, end, chr, strand

Optional parameters required to build a GRanges-class object. If not provided, the default values given in the function definition will be used.

num.cores, tasks

Parameters for parallel computation using the BiocParallel package: the number of cores to use, i.e., at most how many child processes will be run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).

verbose

If TRUE, prints the progress bar.
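
The duality condition on cube and cube_alt can be checked with a small base-R sketch. The helper is_dual_cube() below is hypothetical (it is not part of GenomAutomorphism), and it assumes that "complementary" means Watson-Crick pairing (A with T, C with G) at each position of the two cube strings.

```r
# Hypothetical helper, not a GenomAutomorphism function.
# Assumption: two cubes are dual when the bases at the same
# position are Watson-Crick complements (A<->T, C<->G).
is_dual_cube <- function(cube, dual) {
  # chartr() maps each base of `cube` to its complement
  chartr("ACGT", "TGCA", cube) == dual
}

is_dual_cube("ACGT", "TGCA")   # default value of `cube`
#> [1] TRUE
is_dual_cube("CATG", "GTAC")   # default value of `cube_alt`
#> [1] TRUE
is_dual_cube("ACGT", "CATG")   # not a dual pair
#> [1] FALSE
```

Both default pairs pass this check, which is consistent with the description of cube and cube_alt given above.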

Value

An Automorphism-class object with six metadata columns named seq1, seq2, coord1, coord2, autm, and cube.

Details

Automorphisms in Z5 are described as functions \(f(x) = k x \bmod 5\), where k is a nonzero element and x is any element of the set of integers modulo 5, as noted in reference (1). The pairwise alignment provided in argument seq or in the 'fasta' file filepath must correspond to DNA base sequences.
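
As a minimal sketch of how such an automorphism can be recovered in Z5 (arithmetic modulo 5), the coefficient k for a pair of aligned bases can be found by searching the four nonzero elements. The helpers base2z5() and aut_coef_z5() are hypothetical and are not the package's implementation. The coding A = 1, C = 2, G = 3, T = 4 for cube "ACGT" is an assumption read off the coord1 and coord2 columns of the example output. Gaps are simply mapped to NA here.

```r
# Hypothetical helpers, not GenomAutomorphism functions.

# Assumption: a base is coded by its position in the cube string,
# e.g. A = 1, C = 2, G = 3, T = 4 for cube "ACGT".
# Symbols outside the cube (such as the gap "-") become NA.
base2z5 <- function(bases, cube = "ACGT") {
  match(bases, strsplit(cube, "")[[1]])
}

# For each position, find the k in {1, 2, 3, 4} with (k * x) %% 5 == y.
# Returns NA where either coordinate is missing.
aut_coef_z5 <- function(x, y) {
  vapply(seq_along(x), function(i) {
    k <- which((1:4 * x[i]) %% 5 == y[i])
    if (length(k) == 1L) k else NA_integer_
  }, integer(1))
}

# First five aligned positions of the `aln` data set
seq1 <- c("A", "C", "C", "T", "A")
seq2 <- c("A", "T", "C", "T", "A")
aut_coef_z5(base2z5(seq1), base2z5(seq2))
#> [1] 1 2 1 1 1
```

Because 5 is prime, every nonzero x has a multiplicative inverse modulo 5, so exactly one k exists for each pair of non-gap bases. The brute-force search is used here only for readability.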

References

  1. Sanchez R, Morgado E, Grau R. Gene algebra from a genetic code algebraic structure. J Math Biol. 2005 Oct;51(4):431-57. doi: 10.1007/s00285-005-0332-8. Epub 2005 Jul 13. PMID: 16012800.

  2. Robersy Sanchez, Jesus Barreto (2021) Genomic Abelian Finite Groups. doi:10.1101/2021.06.01.446543

  3. M. V Jose, E.R. Morgado, R. Sanchez, T. Govezensky, The 24 possible algebraic representations of the standard genetic code in six or in three dimensions, Adv. Stud. Biol. 4 (2012) 110-152.

  4. R. Sanchez. Symmetric Group of the Genetic-Code Cubes. Effect of the Genetic-Code Architecture on the Evolutionary Process. MATCH Commun. Math. Comput. Chem. 79 (2018) 527-560.

Examples

## Load a pairwise alignment
data("aln", package = "GenomAutomorphism")
aln
#> DNAStringSet object of length 2:
#>     width seq
#> [1]    51 ACCTATGTTGGTATT---GCGCTCCAACTCCTTGGCTCTAGCTCACTACAT
#> [2]    51 ATCTATGTTGGTATTACGACGCTCCAATTCCTTGGGTCC------CTCCTT

## Automorphism on Z5
autms <- autZ5(seq = aln, verbose = FALSE)
autms
#> Automorphism object with 51 ranges and 6 metadata columns:
#>        seqnames    ranges strand |        seq1        seq2    coord1    coord2
#>           <Rle> <IRanges>  <Rle> | <character> <character> <numeric> <numeric>
#>    [1]        1         1      + |           A           A         1         1
#>    [2]        1         2      + |           C           T         2         4
#>    [3]        1         3      + |           C           C         2         2
#>    [4]        1         4      + |           T           T         4         4
#>    [5]        1         5      + |           A           A         1         1
#>    ...      ...       ...    ... .         ...         ...       ...       ...
#>   [47]        1        47      + |           T           T         4         4
#>   [48]        1        48      + |           A           C         1         2
#>   [49]        1        49      + |           C           C         2         2
#>   [50]        1        50      + |           A           T         1         4
#>   [51]        1        51      + |           T           T         4         4
#>             autm        cube
#>        <numeric> <character>
#>    [1]         1        ACGT
#>    [2]         2        ACGT
#>    [3]         1        ACGT
#>    [4]         1        ACGT
#>    [5]         1        ACGT
#>    ...       ...         ...
#>   [47]         1        ACGT
#>   [48]         2        ACGT
#>   [49]         1        ACGT
#>   [50]         4        ACGT
#>   [51]         1        ACGT
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths