This data set provides selected physicochemical properties of the four DNA bases. The available properties are:
- "proton_affinity":
Proton affinity is an indication of the thermodynamic gradient between a molecule and the anionic form of that molecule upon removal of a proton from it (Wikipedia). The proton affinity values, given in kJ/mol, were taken from reference (1), which is also available in Wolfram Alpha at https://www.wolframalpha.com/ and in the 'Wolfram Alpha' mobile app. Reference (2) reports several estimates obtained with different computational and experimental approaches.
- "partition_coef":
1-octanol/water partition coefficients, logP. In the physical sciences, a partition coefficient (P) or distribution coefficient (D) is the ratio of the concentrations of a compound in a mixture of two immiscible solvents at equilibrium (3). The partition coefficient measures how hydrophilic ("water-loving") or hydrophobic ("water-fearing") a chemical substance is. Partition coefficients are useful for estimating the distribution of drugs within the body: hydrophobic drugs, with high octanol/water partition coefficients, are distributed mainly to hydrophobic areas such as the lipid bilayers of cells, whereas hydrophilic drugs, with low octanol/water partition coefficients, are found primarily in aqueous regions such as blood serum. The partition coefficient values included here were taken from reference (1), which is also available in Wolfram Alpha at https://www.wolframalpha.com/ and in the 'Wolfram Alpha' mobile app.
- "dipole_moment":
The dipole moment of a DNA base is a measure of the polarity of the chemical bonds between the atoms of the nucleobase. Dipole moments arise from differences in electronegativity between the bonded atoms. Dipole-dipole, dipole-induced-dipole and London force interactions among the bases in the helix are large, and make the free energy of the helix depend on the base composition and sequence. The dipole moments of the DNA bases can also influence the orientation of the bases when they interact with other molecules or surfaces, such as graphene/h-BN interfaces. The concept of dipole moments has also been applied to analyze the electric moments of RNA-binding proteins, which can help identify DNA-binding proteins and provide insights into their mechanisms and prediction. The dipole moment values were taken from reference (4).
- "tautomerization_energy":
The term “tautomerism” is usually defined as structural isomerism with a low barrier to interconversion between the isomers, for example, the enol/imino forms of cytosine and guanine. Tautomerization is the process by which the atoms of a molecule, such as a DNA base, are rearranged to give a different isomer, called a tautomer, which can exist in solution or in a cell. Tautomeric shifts of the DNA bases can potentially contribute to mutagenic mispairings during DNA replication. The energy required for the tautomerization of a DNA base is known as its tautomerization energy. These values were taken from reference (2), and the value for each base corresponds to the average of the values estimated by the different measurement approaches.
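As a small illustration of the logP definition given under "partition_coef", the following base-R sketch converts logP values into concentration ratios P. The values are copied by hand from the dna_phyche table printed in the Examples; the vector names `logP` and `P` are introduced here only for illustration, and the sketch assumes that the tabulated values are base-10 logarithms.

```r
## logP values copied by hand from the partition_coef column of dna_phyche
logP <- c(A = -0.3, C = -1.1, G = -0.9, T = -0.7)

## Assumption: logP is a base-10 logarithm, so
## P = (concentration in 1-octanol) / (concentration in water) = 10^logP
P <- 10^logP
round(P, 3)

## Every logP is negative, so every P is below 1: under this reading each
## base is more concentrated in the aqueous phase than in 1-octanol.
all(P < 1)

## Bases ordered from the lowest to the highest logP
names(sort(logP))
```

A negative logP corresponds to P < 1, which is the hydrophilic case described above.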
Usage
data("dna_phyche", package = "GenomAutomorphism")
References
1. Wolfram Research (2007), ChemicalData, Wolfram Language function, https://reference.wolfram.com/language/ref/ChemicalData.html (updated 2016).
2. Moser A, Range K, York DM. Accurate proton affinity and gas-phase basicity values for molecules important in biocatalysis. J Phys Chem B. 2010;114: 13911–13921. doi:10.1021/jp107450n.
3. Leo A, Hansch C, Elkins D. Partition coefficients and their uses. Chem Rev. 1971;71: 525–616. doi:10.1021/cr60274a001.
4. Vovusha H, Amorim RG, Scheicher RH, Sanyal B. Controlling the orientation of nucleobases by dipole moment interaction with graphene/h-BN interfaces. RSC Adv. 2018;8: 6527–6531. doi:10.1039/c7ra11664k.
Examples
data("dna_phyche", package = "GenomAutomorphism")
dna_phyche
#>   proton_affinity partition_coef dipole_moment tautomerization_energy
#> A           942.8           -0.3          2.51                  12.68
#> C           949.9           -1.1          5.58                   2.47
#> G           959.5           -0.9          6.65                   0.76
#> T           880.9           -0.7          4.37                  12.26
## Select DNA base tautomerization energy
te <- as.list(dna_phyche$tautomerization_energy)
names(te) <- rownames(dna_phyche)
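## Create a DNAStringSet-class object
## (DNAStringSet() is provided by the Biostrings package)
base <- DNAStringSet(x = c(seq1 = 'ACGTGATCAAGT',
                           seq2 = 'GTGTGATCCAGT'))
## Map each base of each sequence to its tautomerization energy
dna_phychem(seqs = base, phychem = te,
            index_name = "Tautomerization-Energy")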
#> MatrixSeq with 2 rows (sequences) and 12 columns (aminoacids/codons):
#> -------
#>       B1    B2   B3    B4   B5    B6    B7   B8    B9   B10  B11   B12
#> S1 12.68  2.47 0.76 12.26 0.76 12.68 12.26 2.47 12.68 12.68 0.76 12.26
#> S2  0.76 12.26 0.76 12.26 0.76 12.68 12.26 2.47  2.47 12.68 0.76 12.26
#> -------
#> Slots: 'seqs', 'matrix', 'names', 'aaindex', 'phychem', 'accession'
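To make the base-to-value mapping in the example above concrete, the sketch below rebuilds the same numeric matrix with base R only. It is not the implementation used by dna_phychem and it returns a plain matrix rather than a MatrixSeq object; it only looks up each base in a named vector. The helper name `base_to_value`, the object names, and the hand-copied values are assumptions introduced for illustration.

```r
## Tautomerization energies copied by hand from the dna_phyche table
te_values <- c(A = 12.68, C = 2.47, G = 0.76, T = 12.26)

## The two sequences used in the example, as plain character strings
seqs <- c(S1 = "ACGTGATCAAGT", S2 = "GTGTGATCCAGT")

## Hypothetical helper: split a sequence into single bases and look up
## the value of each base in a named numeric vector
base_to_value <- function(s, values) {
    bases <- strsplit(s, split = "")[[1]]
    unname(values[bases])
}

## One row per sequence and one column per base position.
## This sketch assumes that all sequences have the same length.
seq_len_bases <- nchar(seqs[[1]])
m <- t(vapply(seqs, base_to_value, numeric(seq_len_bases),
              values = te_values))
colnames(m) <- paste0("B", seq_len(ncol(m)))
m
```

The rows of `m` should agree with the S1 and S2 rows printed by dna_phychem above.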