Skip to contents

Extract the gene ranges and coordinates from a pairwise alignment of codon/base sequences represented on a given Abelian group.

Usage

seqranges(x, ...)

# S4 method for CodonSeq
seqranges(x, granges = TRUE)

# S4 method for DNAStringSet_OR_NULL
seqranges(
  x,
  granges = TRUE,
  base_seq = TRUE,
  filepath = NULL,
  start = NA,
  end = NA,
  chr = 1L,
  strand = "+"
)

Arguments

x

An object from a DNAStringSet or DNAMultipleAlignment class carrying the DNA pairwise alignment of two sequences.

...

Not in use.

granges

Logical. Whether to return a GRanges-class object or a DataFrame.

base_seq

Logical. Whether to return the base or codon coordinates on the selected Abelian group. If codon coordinates are requested, then the number of the DNA bases in the given sequences must be multiple of 3.

filepath

A character vector containing the path to a file in fasta format to be read. This argument must be given if codon & base arguments are not provided.

start, end, chr, strand

Optional parameters required to build a GRanges-class. If not provided the default values given for the function definition will be used.

Value

A GRanges-class

Details

This function provide an alternative way to get the codon coordinate and the information on the codon sequence from a CodonSeq class objects. The function can either take the output from functions codon_coord or to operate directly on a DNAStringSet or to retrieve the a DNA sequence alignment from a file.

References

  1. Robersy Sanchez, Jesus Barreto (2021) Genomic Abelian Finite Groups. doi:10.1101/2021.06.01.446543

  2. M. V Jose, E.R. Morgado, R. Sanchez, T. Govezensky, The 24 possible algebraic representations of the standard genetic code in six or in three dimensions, Adv. Stud. Biol. 4 (2012) 119-152.PDF.

  3. R. Sanchez. Symmetric Group of the Genetic-Code Cubes. Effect of the Genetic-Code Architecture on the Evolutionary Process MATCH Commun. Math. Comput. Chem. 79 (2018) 527-560.

See also

Author

Robersy Sanchez https://genomaths.com

Examples

## Load a pairwise alignment
data("aln", package = "GenomAutomorphism")
aln
#> DNAStringSet object of length 2:
#>     width seq
#> [1]    51 ACCTATGTTGGTATT---GCGCTCCAACTCCTTGGCTCTAGCTCACTACAT
#> [2]    51 ATCTATGTTGGTATTACGACGCTCCAATTCCTTGGGTCC------CTCCTT

## A GRanges object carrying the aligned DNA sequence.
seqranges(
    x = aln,
    base_seq = TRUE,
    filepath = NULL,
)
#> GRanges object with 51 ranges and 2 metadata columns:
#>        seqnames    ranges strand |        seq1        seq2
#>           <Rle> <IRanges>  <Rle> | <character> <character>
#>    [1]        1         1      + |           A           A
#>    [2]        1         2      + |           C           T
#>    [3]        1         3      + |           C           C
#>    [4]        1         4      + |           T           T
#>    [5]        1         5      + |           A           A
#>    ...      ...       ...    ... .         ...         ...
#>   [47]        1        47      + |           T           T
#>   [48]        1        48      + |           A           C
#>   [49]        1        49      + |           C           C
#>   [50]        1        50      + |           A           T
#>   [51]        1        51      + |           T           T
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## A GRanges object carrying the aligned codon sequence.
seqranges(
    x = aln,
    base_seq = FALSE,
    filepath = NULL,
)
#> GRanges object with 17 ranges and 2 metadata columns:
#>        seqnames    ranges strand |        seq1        seq2
#>           <Rle> <IRanges>  <Rle> | <character> <character>
#>    [1]        1         1      + |         ACC         ATC
#>    [2]        1         2      + |         TAT         TAT
#>    [3]        1         3      + |         GTT         GTT
#>    [4]        1         4      + |         GGT         GGT
#>    [5]        1         5      + |         ATT         ATT
#>    ...      ...       ...    ... .         ...         ...
#>   [13]        1        13      + |         TCT         TCC
#>   [14]        1        14      + |         AGC         ---
#>   [15]        1        15      + |         TCA         ---
#>   [16]        1        16      + |         CTA         CTC
#>   [17]        1        17      + |         CAT         CTT
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths