The amino acid similarity matrices from the Amino Acid Index Database (AAindex, https://www.genome.jp/aaindex/) are provided here. AAindex (ver. 9.2) is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids.
The similarity of amino acids can be represented numerically, expressed in terms of observed mutation rate or physicochemical properties. A similarity matrix, also called a mutation matrix, is a set of 210 numerical values (20 diagonal and 20 x 19/2 = 190 off-diagonal elements) used for sequence alignments and similarity searches.
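The count of 210 follows from symmetry: a 20 x 20 matrix has 400 entries, but only the diagonal and one triangle are independent. A minimal base-R sketch of that arithmetic is given here; the matrix is a placeholder filled with zeros (not real scores), and the residue order is the column order of the matrices printed in the Examples.

```r
## The 20 standard amino acids, one-letter codes
aa <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

## Placeholder 20 x 20 matrix; the zeros stand in for real scores
m <- matrix(0, nrow = 20, ncol = 20, dimnames = list(aa, aa))

n_diag <- length(diag(m))                 # diagonal elements
n_off  <- sum(upper.tri(m))               # one off-diagonal triangle
n_all  <- sum(upper.tri(m, diag = TRUE))  # independent values in total

c(diagonal = n_diag, off_diagonal = n_off, total = n_all)
#> diagonal off_diagonal        total 
#>       20          190          210 
```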
The function aa_phychem_index is a wrapper that calls two other functions: aa_mutmat and aa_index.
Usage
aa_phychem_index(acc = NA, aaindex = NA, acc_list = FALSE, info = FALSE)
aa_mutmat(acc = NA, aaindex = c("aaindex2", "aaindex3"), acc_list = FALSE)
aa_index(acc = NA, acc_list = FALSE, info = FALSE)
Arguments
- acc
Accession id for a specified mutation or contact potential matrix.
- aaindex
Database where the requested accession id is located. The possible values are: "aaindex2" or "aaindex3".
- acc_list
Logical. If TRUE, the list of available matrix ids and index names is returned.
- info
Logical. If TRUE, the whole information for the physicochemical index is returned.
Value
Depending on the user specifications, a mutation or contact potential matrix, a list of available matrix (index) ids, or index names can be returned. More specifically:
- aa_mutmat:
Returns an amino acid mutation matrix or a statistical protein contact potential matrix.
- aa_index:
Returns the specified amino acid physicochemical indices.
See also
aaindex1, aaindex2, aaindex3, and get_mutscore.
Author
Robersy Sanchez https://genomaths.com
Examples
## Load the physicochemical indices and mutation matrices from the package
data("aaindex1", "aaindex2", package = "GenomAutomorphism")
## Get the available mutation matrices
mat <- aa_mutmat(aaindex = "aaindex2", acc_list = TRUE)
mat[seq(10)]
#> [1] "List of 94 Amino Acid Matrices in AAindex ver.9.2"
#> [2] ""
#> [3] "The columns correspond to the AAindex accession number and the description of"
#> [4] "each matrix."
#> [5] ""
#> [6] "ALTS910101 The PAM-120 matrix (Altschul, 1991)"
#> [7] "BENS940101 Log-odds scoring matrix collected in 6.4-8.7 PAM (Benner et al., 1994)"
#> [8] "BENS940102 Log-odds scoring matrix collected in 22-29 PAM (Benner et al., 1994)"
#> [9] "BENS940103 Log-odds scoring matrix collected in 74-100 PAM (Benner et al., 1994)"
#> [10] "BENS940104 Genetic code matrix (Benner et al., 1994)"
## Return the 'Base-substitution-protein-stability matrix
## (Miyazawa-Jernigan, 1993)'
aa_mutmat(acc = "MIYS930101", aaindex = "aaindex2")
#> A R N D C Q E G H I L K M
#> A 0.34 -0.08 -0.16 0.04 -0.51 -0.16 0.03 0.19 -0.21 -0.45 -0.45 -0.20 -0.47
#> R -0.08 0.65 0.01 -0.06 -0.35 0.15 -0.02 0.20 0.11 -0.82 -0.74 0.04 -0.83
#> N -0.16 0.01 0.38 0.20 -0.46 0.20 0.19 -0.11 0.21 -0.83 -0.90 0.38 -0.97
#> D 0.04 -0.06 0.20 0.36 -0.41 0.16 0.32 0.15 0.18 -0.78 -0.75 0.16 -0.90
#> C -0.51 -0.35 -0.46 -0.41 1.02 -0.74 -0.64 -0.18 -0.29 -0.13 -0.02 -0.81 -0.29
#> Q -0.16 0.15 0.20 0.16 -0.74 0.48 0.25 -0.14 0.37 -0.95 -0.74 0.30 -0.97
#> E 0.03 -0.02 0.19 0.32 -0.64 0.25 0.36 0.13 0.13 -0.86 -0.80 0.25 -0.87
#> G 0.19 0.20 -0.11 0.15 -0.18 -0.14 0.13 0.43 -0.19 -0.58 -0.56 -0.16 -0.61
#> H -0.21 0.11 0.21 0.18 -0.29 0.37 0.13 -0.19 0.54 -0.69 -0.50 0.14 -0.86
#> I -0.45 -0.82 -0.83 -0.78 -0.13 -0.95 -0.86 -0.58 -0.69 0.75 0.48 -1.01 0.67
#> L -0.45 -0.74 -0.90 -0.75 -0.02 -0.74 -0.80 -0.56 -0.50 0.48 0.61 -1.06 0.50
#> K -0.20 0.04 0.38 0.16 -0.81 0.30 0.25 -0.16 0.14 -1.01 -1.06 0.49 -1.03
#> M -0.47 -0.83 -0.97 -0.90 -0.29 -0.97 -0.87 -0.61 -0.86 0.67 0.50 -1.03 0.97
#> F -0.47 -0.79 -0.71 -0.62 0.39 -0.85 -0.81 -0.52 -0.43 0.45 0.48 -1.03 0.35
#> P 0.17 0.14 -0.12 -0.15 -0.65 0.15 -0.13 -0.14 0.14 -0.78 -0.67 -0.14 -0.82
#> S 0.16 0.05 -0.02 -0.14 -0.20 -0.20 -0.19 0.00 -0.15 -0.68 -0.70 -0.14 -0.75
#> T 0.18 0.00 0.11 -0.10 -0.62 -0.07 -0.09 -0.09 -0.13 -0.59 -0.77 0.09 -0.61
#> W -0.51 -0.26 -0.79 -0.64 0.82 -0.83 -0.66 -0.15 -0.70 -0.23 0.13 -0.92 0.04
#> Y -0.34 -0.31 0.12 0.09 0.40 -0.02 -0.11 -0.35 0.35 -0.35 -0.27 -0.12 -0.56
#> V -0.06 -0.54 -0.69 -0.39 -0.19 -0.68 -0.42 -0.15 -0.59 0.41 0.41 -0.79 0.41
#> F P S T W Y V
#> A -0.47 0.17 0.16 0.18 -0.51 -0.34 -0.06
#> R -0.79 0.14 0.05 0.00 -0.26 -0.31 -0.54
#> N -0.71 -0.12 -0.02 0.11 -0.79 0.12 -0.69
#> D -0.62 -0.15 -0.14 -0.10 -0.64 0.09 -0.39
#> C 0.39 -0.65 -0.20 -0.62 0.82 0.40 -0.19
#> Q -0.85 0.15 -0.20 -0.07 -0.83 -0.02 -0.68
#> E -0.81 -0.13 -0.19 -0.09 -0.66 -0.11 -0.42
#> G -0.52 -0.14 0.00 -0.09 -0.15 -0.35 -0.15
#> H -0.43 0.14 -0.15 -0.13 -0.70 0.35 -0.59
#> I 0.45 -0.78 -0.68 -0.59 -0.23 -0.35 0.41
#> L 0.48 -0.67 -0.70 -0.77 0.13 -0.27 0.41
#> K -1.03 -0.14 -0.14 0.09 -0.92 -0.12 -0.79
#> M 0.35 -0.82 -0.75 -0.61 0.04 -0.56 0.41
#> F 0.61 -0.76 -0.53 -0.75 0.25 0.14 0.36
#> P -0.76 0.56 0.24 0.25 -0.71 -0.20 -0.47
#> S -0.53 0.24 0.48 0.28 -0.29 -0.04 -0.44
#> T -0.75 0.25 0.28 0.45 -0.70 -0.27 -0.44
#> W 0.25 -0.71 -0.29 -0.70 1.42 0.00 -0.17
#> Y 0.14 -0.20 -0.04 -0.27 0.00 0.84 -0.39
#> V 0.36 -0.47 -0.44 -0.44 -0.17 -0.39 0.54
## Return the 'BLOSUM80 substitution matrix (Henikoff-Henikoff, 1992)'
aa_mutmat(acc = "HENS920103", aaindex = "aaindex2")
#> A R N D C Q E G H I L K M F P S T W Y V
#> A 7 -3 -3 -3 -1 -2 -2 0 -3 -3 -3 -1 -2 -4 -1 2 0 -5 -4 -1
#> R -3 9 -1 -3 -6 1 -1 -4 0 -5 -4 3 -3 -5 -3 -2 -2 -5 -4 -4
#> N -3 -1 9 2 -5 0 -1 -1 1 -6 -6 0 -4 -6 -4 1 0 -7 -4 -5
#> D -3 -3 2 10 -7 -1 2 -3 -2 -7 -7 -2 -6 -6 -3 -1 -2 -8 -6 -6
#> C -1 -6 -5 -7 13 -5 -7 -6 -7 -2 -3 -6 -3 -4 -6 -2 -2 -5 -5 -2
#> Q -2 1 0 -1 -5 9 3 -4 1 -5 -4 2 -1 -5 -3 -1 -1 -4 -3 -4
#> E -2 -1 -1 2 -7 3 8 -4 0 -6 -6 1 -4 -6 -2 -1 -2 -6 -5 -4
#> G 0 -4 -1 -3 -6 -4 -4 9 -4 -7 -7 -3 -5 -6 -5 -1 -3 -6 -6 -6
#> H -3 0 1 -2 -7 1 0 -4 12 -6 -5 -1 -4 -2 -4 -2 -3 -4 3 -5
#> I -3 -5 -6 -7 -2 -5 -6 -7 -6 7 2 -5 2 -1 -5 -4 -2 -5 -3 4
#> L -3 -4 -6 -7 -3 -4 -6 -7 -5 2 6 -4 3 0 -5 -4 -3 -4 -2 1
#> K -1 3 0 -2 -6 2 1 -3 -1 -5 -4 8 -3 -5 -2 -1 -1 -6 -4 -4
#> M -2 -3 -4 -6 -3 -1 -4 -5 -4 2 3 -3 9 0 -4 -3 -1 -3 -3 1
#> F -4 -5 -6 -6 -4 -5 -6 -6 -2 -1 0 -5 0 10 -6 -4 -4 0 4 -2
#> P -1 -3 -4 -3 -6 -3 -2 -5 -4 -5 -5 -2 -4 -6 12 -2 -3 -7 -6 -4
#> S 2 -2 1 -1 -2 -1 -1 -1 -2 -4 -4 -1 -3 -4 -2 7 2 -6 -3 -3
#> T 0 -2 0 -2 -2 -1 -2 -3 -3 -2 -3 -1 -1 -4 -3 2 8 -5 -3 0
#> W -5 -5 -7 -8 -5 -4 -6 -6 -4 -5 -4 -6 -3 0 -7 -6 -5 16 3 -5
#> Y -4 -4 -4 -6 -5 -3 -5 -6 3 -3 -2 -4 -3 4 -6 -3 -3 3 11 -3
#> V -1 -4 -5 -6 -2 -4 -4 -6 -5 4 1 -4 1 -2 -4 -3 0 -5 -3 7
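As a hedged illustration of how such a matrix is used in sequence comparison, the sketch below scores two aligned, gap-free peptides by summing the pairwise entries. The 4 x 4 sub-matrix is copied from the HENS920103 output above; the helper `score_pair` and the two peptides are hypothetical and are not part of the package.

```r
## Sub-matrix for A, R, N, D copied from the HENS920103 (BLOSUM80) output
aa <- c("A", "R", "N", "D")
blosum80_sub <- matrix(
    c( 7, -3, -3, -3,
      -3,  9, -1, -3,
      -3, -1,  9,  2,
      -3, -3,  2, 10),
    nrow = 4, byrow = TRUE, dimnames = list(aa, aa))

## Hypothetical helper: sum of substitution scores over aligned positions
score_pair <- function(seq1, seq2, mat) {
    s1 <- strsplit(seq1, "")[[1]]
    s2 <- strsplit(seq2, "")[[1]]
    stopifnot(length(s1) == length(s2))
    ## A two-column character matrix selects (row, column) pairs by name
    sum(mat[cbind(s1, s2)])
}

## A-A (7) + R-R (9) + N-D (2) + D-N (2)
score_pair("ARND", "ARDN", blosum80_sub)
#> [1] 20
```

The same helper should work with a full 20 x 20 matrix, provided its row and column names are the one-letter residue codes.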
## Using the wrapper function
aa_phychem_index(acc = "EISD840101", aaindex = "aaindex1")
#> A R N D C Q E G H I L K M
#> 0.25 -1.76 -0.64 -0.72 0.04 -0.69 -0.62 0.16 -0.40 0.73 0.53 -1.10 0.26
#> F P S T W Y V
#> 0.61 -0.07 -0.26 -0.18 0.37 0.02 0.54
## Just the info. The information provided after the reference
## corresponds to the correlation of 'EISD840101' with other indices.
aa_phychem_index(acc = "EISD840101", aaindex = "aaindex1", info = TRUE)
#> [1] "H EISD840101"
#> [2] "D Consensus normalized hydrophobicity scale (Eisenberg, 1984)"
#> [3] "R PMID:6383201"
#> [4] "A Eisenberg, D."
#> [5] "T Three-dimensional structure of membrane and surface proteins"
#> [6] "J Ann. Rev. Biochem. 53, 595-623 (1984) Original references: Eisenberg, D., "
#> [7] " Weiss, R.M., Terwilliger, T.C. and Wilcox, W. Faraday Symp. Chem. Soc. 17, "
#> [8] " 109-120 (1982) Eisenberg, D., Weiss, R.M. and Terwilliger, T.C. The "
#> [9] " hydrophobic moment detects periodicity in protein hydrophobicity Proc. Natl. "
#> [10] " Acad. Sci. USA 81, 140-144 (1984)"
#> [11] "C RADA880101 0.968 JACR890101 0.938 RADA880107 0.927"
#> [12] " ROSM880105 0.923 WOLR810101 0.914 WOLR790101 0.909"
#> [13] " RADA880104 0.908 JANJ790102 0.900 JURD980101 0.895"
#> [14] " NADH010102 0.887 CHOC760103 0.885 BLAS910101 0.884"
#> [15] " EISD860101 0.884 KYTJ820101 0.878 FAUJ830101 0.875"
#> [16] " JANJ780102 0.874 OLSK800101 0.869 COWR900101 0.863"
#> [17] " NADH010101 0.861 NADH010103 0.840 NAKH900110 0.838"
#> [18] " EISD860103 0.837 DESM900102 0.828 RADA880108 0.817"
#> [19] " BIOV880102 0.814 BIOV880101 0.811 YUTK870101 0.809"
#> [20] " NADH010104 0.809 ROSG850102 0.806 BASU050103 0.806"
#> [21] " WOLS870101 -0.820 GRAR740102 -0.823 MEIH800102 -0.829"
#> [22] " HOPT810101 -0.846 GUYH850101 -0.849 PUNT030102 -0.854"
#> [23] " LEVM760101 -0.859 OOBM770101 -0.878 JANJ780103 -0.881"
#> [24] " FAUJ880109 -0.890 GUYH850104 -0.892 CHOC760102 -0.892"
#> [25] " KIDA850101 -0.900 JANJ780101 -0.907 KUHL950101 -0.907"
#> [26] " PUNT030101 -0.914 VHEG790101 -0.924 ROSM880102 -0.925"
#> [27] " ENGD860101 -0.936 PRAM900101 -0.936 ROSM880101 -0.947"
#> [28] " GUYH850105 -0.951"
#> [29] "I A/L R/K N/M D/F C/P Q/S E/T G/W H/Y I/V"
#> [30] " 0.25 -1.76 -0.64 -0.72 0.04 -0.69 -0.62 0.16 -0.40 0.73"
#> [31] " 0.53 -1.10 0.26 0.61 -0.07 -0.26 -0.18 0.37 0.02 0.54"
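The "I" block at the end of the record stores the 20 index values in two rows of ten, and its header pairs the residue of the first row with the residue of the second row (A/L, R/K, and so on). A minimal base-R sketch for turning such a block into a named vector follows. The three lines are copied from the output above, and the helper `parse_index_block` is hypothetical, not a function of the package.

```r
## Lines [29] to [31] of the record, copied from the output
rec <- c("I A/L R/K N/M D/F C/P Q/S E/T G/W H/Y I/V",
         " 0.25 -1.76 -0.64 -0.72 0.04 -0.69 -0.62 0.16 -0.40 0.73",
         " 0.53 -1.10 0.26 0.61 -0.07 -0.26 -0.18 0.37 0.02 0.54")

## Hypothetical helper: the header gives 'first-row/second-row' residues
parse_index_block <- function(x) {
    ## Drop the leading field tag "I", then split the residue pairs
    header <- strsplit(trimws(sub("^I", "", x[1])), "\\s+")[[1]]
    pairs <- strsplit(header, "/", fixed = TRUE)
    aa <- c(vapply(pairs, `[`, character(1), 1),
            vapply(pairs, `[`, character(1), 2))
    ## Values of the first row, then values of the second row
    vals <- as.numeric(unlist(strsplit(trimws(x[2:3]), "\\s+")))
    stats::setNames(vals, aa)
}

idx <- parse_index_block(rec)
idx[c("A", "R", "V")]
#>     A     R     V 
#>  0.25 -1.76  0.54 
```

The resulting vector has the same names and values as the one returned by the call without `info = TRUE` shown earlier.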