Given two codon sequences represented in the Z64 Abelian group, this function computes the automorphisms describing codon mutational events.

Usage

autZ64(
  seq = NULL,
  filepath = NULL,
  cube = c("ACGT", "TGCA"),
  cube_alt = c("CATG", "GTAC"),
  start = NA,
  end = NA,
  chr = 1L,
  strand = "+",
  genetic_code = getGeneticCode("1"),
  num.cores = multicoreWorkers(),
  tasks = 0L,
  verbose = TRUE
)

Arguments

seq

A DNAStringSet or DNAMultipleAlignment object carrying the pairwise alignment of two DNA sequences. The alignment supplied through seq, or through the FASTA file given in filepath, must correspond to codon sequences.

filepath

A character vector containing the path to a file in FASTA format to be read. This argument must be given if seq is not provided.

cube, cube_alt

Character vectors, each denoting a pair of the 24 genetic-code cubes, as given in references (2-3). The bases at corresponding positions of the two cubes in a pair must be complementary to each other. Such a pair is called a pair of dual cubes and, as shown in reference (3), each pair forms a group.

start, end, chr, strand

Optional parameters used to build the GRanges-class object. If not provided, the default values from the function definition are used.

genetic_code

The named character vector returned by getGeneticCode or similar. The translation of codons into amino acids is useful information for downstream statistical analysis. The standard genetic code is the default value applied in the translation of codons into amino acids (see GENETIC_CODE_TABLE).

num.cores, tasks

Parameters for parallel computation using the BiocParallel package: the number of cores to use, i.e. at most how many child processes will be run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).

verbose

If TRUE, prints the progress bar.
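
The complementarity requirement on cube and cube_alt can be checked with a few lines of base R. This is only a minimal sketch: is_dual_pair is a hypothetical helper introduced here for illustration and is not part of GenomAutomorphism.

```r
# Hypothetical helper (not part of GenomAutomorphism): TRUE when the second
# cube is the position-wise complement of the first cube.
is_dual_pair <- function(cubes) {
  stopifnot(is.character(cubes), length(cubes) == 2L)
  # chartr() swaps A<->T and C<->G at every position of the first cube.
  chartr("ACGT", "TGCA", cubes[1]) == cubes[2]
}

is_dual_pair(c("ACGT", "TGCA"))  # default value of `cube`: TRUE
is_dual_pair(c("CATG", "GTAC"))  # default value of `cube_alt`: TRUE
is_dual_pair(c("ACGT", "CATG"))  # not complementary: FALSE
```

Both default pairs pass this check, which matches the description of dual cubes given for cube and cube_alt.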

Value

An Automorphism-class object. Its metadata columns include seq1, seq2, autm, and cube; the output in the Examples section also shows aa1, aa2, coord1, and coord2.

Details

Automorphisms in Z64 are described as functions \(f(x) = k \cdot x \bmod 64\), where \(k\) and \(x\) are elements of the set of integers modulo 64.
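
The arithmetic can be reproduced in base R. The sketch below is illustrative and is not the package implementation: codon_coord and find_k are hypothetical helpers, and the codon encoding (base values 0 to 3 taken from the cube order, with weights 4, 16, and 1 for the first, second, and third codon positions) is an assumption inferred from the coord1 and coord2 values shown in the Examples output.

```r
# Hypothetical helper: codon coordinate in Z64 for a given cube.
# Assumed encoding: base value = position in the cube minus 1,
# weights 4, 16, 1 for codon positions 1, 2, 3.
codon_coord <- function(codon, cube = "ACGT") {
  bases <- strsplit(cube, "")[[1]]
  x <- match(strsplit(codon, "")[[1]], bases) - 1L
  sum(x * c(4L, 16L, 1L))
}

# Hypothetical helper: smallest k in 0..63 with (k * x) mod 64 == y,
# or NA when no such k exists.
find_k <- function(x, y, modulus = 64L) {
  k <- 0:(modulus - 1L)
  hits <- k[(k * x) %% modulus == y]
  if (length(hits) > 0L) hits[1] else NA_integer_
}

# Row 1 of the example: ACC -> ATC in cube ACGT.
x <- codon_coord("ACC")  # 17
y <- codon_coord("ATC")  # 49
find_k(x, y)             # 33, since 33 * 17 = 561 and 561 mod 64 = 49

# Row 16 of the example: CTA -> CTC has no solution in cube ACGT
# (coordinates 52 and 53), but one exists in the dual cube TGCA.
find_k(codon_coord("CTA"), codon_coord("CTC"))                  # NA
find_k(codon_coord("CTA", "TGCA"), codon_coord("CTC", "TGCA"))  # 30
```

Under these assumptions the sketch reproduces the autm values 33 and 30 reported for rows 1 and 16 of the example, and it illustrates why a second cube can be needed when the first admits no solution.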

References

  1. Sanchez R, Morgado E, Grau R. Gene algebra from a genetic code algebraic structure. J Math Biol. 2005 Oct;51(4):431-57. doi: 10.1007/s00285-005-0332-8. Epub 2005 Jul 13. PMID: 16012800.

  2. Robersy Sanchez, Jesus Barreto (2021) Genomic Abelian Finite Groups. doi:10.1101/2021.06.01.446543

  3. M. V Jose, E.R. Morgado, R. Sanchez, T. Govezensky, The 24 possible algebraic representations of the standard genetic code in six or in three dimensions, Adv. Stud. Biol. 4 (2012) 110-152.

  4. R. Sanchez. Symmetric Group of the Genetic-Code Cubes. Effect of the Genetic-Code Architecture on the Evolutionary Process. MATCH Commun. Math. Comput. Chem. 79 (2018) 527-560.

Author

Robersy Sanchez (https://genomaths.com).

Examples

## Load a pairwise alignment
data("aln", package = "GenomAutomorphism")
aln
#> DNAStringSet object of length 2:
#>     width seq
#> [1]    51 ACCTATGTTGGTATT---GCGCTCCAACTCCTTGGCTCTAGCTCACTACAT
#> [2]    51 ATCTATGTTGGTATTACGACGCTCCAATTCCTTGGGTCC------CTCCTT

## Automorphism on Z64
autms <- autZ64(seq = aln, verbose = FALSE)
autms
#> Automorphism object with 17 ranges and 8 metadata columns:
#>        seqnames    ranges strand |        seq1        seq2         aa1
#>           <Rle> <IRanges>  <Rle> | <character> <character> <character>
#>    [1]        1         1      + |         ACC         ATC           T
#>    [2]        1         2      + |         TAT         TAT           Y
#>    [3]        1         3      + |         GTT         GTT           V
#>    [4]        1         4      + |         GGT         GGT           G
#>    [5]        1         5      + |         ATT         ATT           I
#>    ...      ...       ...    ... .         ...         ...         ...
#>   [13]        1        13      + |         TCT         TCC           S
#>   [14]        1        14      + |         AGC         ---           S
#>   [15]        1        15      + |         TCA         ---           S
#>   [16]        1        16      + |         CTA         CTC           L
#>   [17]        1        17      + |         CAT         CTT           H
#>                aa2    coord1    coord2      autm        cube
#>        <character> <numeric> <numeric> <numeric> <character>
#>    [1]           I        17        49        33        ACGT
#>    [2]           Y        15        15         1        ACGT
#>    [3]           V        59        59         1        ACGT
#>    [4]           G        43        43         1        ACGT
#>    [5]           I        51        51         1        ACGT
#>    ...         ...       ...       ...       ...         ...
#>   [13]           S        31        29         3        ACGT
#>   [14]           -        33        NA         0        Trnl
#>   [15]           -        28        NA         0        Trnl
#>   [16]           L        52        53        30        TGCA
#>   [17]           L         7        55        17        ACGT
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths