getGRegionsStat-methods {MethylIT.utils}  R Documentation
A function to estimate summarized measures of a specified
variable given in a GRanges object (a column from the metacolumns of the
GRanges object) after splitting the GRanges object into intervals. A faster
alternative would be getGRegionsStat2.
getGRegionsStat(GR, win.size = 350, step.size = 350, grfeatures = NULL,
    stat = c("sum", "mean", "gmean", "median", "density", "count"),
    absolute = FALSE, select.strand = NULL, column = 1L, prob = FALSE,
    entropy = FALSE, maxgap = 1L, minoverlap = 0L, scaling = 1000L,
    logbase = 2, missings = 0,
    type = c("any", "start", "end", "within", "equal"),
    ignore.strand = FALSE, na.rm = TRUE, naming = FALSE, num.cores = 1L,
    tasks = 0, verbose = TRUE, ...)

## S4 method for signature 'GRanges'
getGRegionsStat(GR, win.size = 350, step.size = 350, grfeatures = NULL,
    stat = c("sum", "mean", "gmean", "median", "density", "count"),
    absolute = FALSE, select.strand = NULL, column = 1L, prob = FALSE,
    entropy = FALSE, maxgap = 1L, minoverlap = 0L, scaling = 1000L,
    logbase = 2, missings = 0,
    type = c("any", "start", "end", "within", "equal"),
    ignore.strand = FALSE, na.rm = TRUE, naming = FALSE)

## S4 method for signature 'list'
getGRegionsStat(GR, win.size = 350, step.size = 350, grfeatures = NULL,
    stat = c("sum", "mean", "gmean", "median", "density", "count"),
    absolute = FALSE, select.strand = NULL, column = 1L, prob = FALSE,
    entropy = FALSE, maxgap = 1L, minoverlap = 0L, scaling = 1000L,
    logbase = 2, missings = 0,
    type = c("any", "start", "end", "within", "equal"),
    ignore.strand = FALSE, na.rm = TRUE, naming = FALSE, num.cores = 1L,
    tasks = 0, verbose = TRUE, ...)

## S4 method for signature 'InfDiv'
getGRegionsStat(GR, win.size = 350, step.size = 350, grfeatures = NULL,
    stat = c("sum", "mean", "gmean", "median", "density", "count"),
    absolute = FALSE, select.strand = NULL, column = 1L, prob = FALSE,
    entropy = FALSE, maxgap = 1L, minoverlap = 0L, scaling = 1000L,
    logbase = 2, missings = 0,
    type = c("any", "start", "end", "within", "equal"),
    ignore.strand = FALSE, na.rm = TRUE, naming = FALSE, num.cores = 1L,
    tasks = 0, verbose = TRUE, ...)

## S4 method for signature 'pDMP'
getGRegionsStat(GR, win.size = 350, step.size = 350, grfeatures = NULL,
    stat = c("sum", "mean", "gmean", "median", "density", "count"),
    absolute = FALSE, select.strand = NULL, column = 1L, prob = FALSE,
    entropy = FALSE, maxgap = 1L, minoverlap = 0L, scaling = 1000L,
    logbase = 2, missings = 0,
    type = c("any", "start", "end", "within", "equal"),
    ignore.strand = FALSE, na.rm = TRUE, naming = FALSE, num.cores = 1L,
    tasks = 0, verbose = TRUE, ...)

## S4 method for signature 'GRangesList'
getGRegionsStat(GR, win.size = 350, step.size = 350, grfeatures = NULL,
    stat = c("sum", "mean", "gmean", "median", "density", "count"),
    absolute = FALSE, select.strand = NULL, column = 1L, prob = FALSE,
    entropy = FALSE, maxgap = 1L, minoverlap = 0L, scaling = 1000L,
    logbase = 2, missings = 0,
    type = c("any", "start", "end", "within", "equal"),
    ignore.strand = FALSE, na.rm = TRUE, naming = FALSE, num.cores = 1L,
    tasks = 0, verbose = TRUE, ...)
GR 
A GRanges object or a list of GRanges objects with the variable of interest in the GRanges metacolumn. 
win.size 
An integer giving the size (in bp) of the windows/regions into which the genomic ranges are split. 
step.size 
Interval at which the regions/windows must be defined, i.e., the distance between the starts of consecutive windows. 
grfeatures 
A GRanges object corresponding to an annotated genomic feature, for example gene regions, transposable elements, exons, or intergenic regions. If provided, parameters 'win.size' and 'step.size' are ignored and the statistics are estimated for 'grfeatures'. 
stat 
Statistic used to estimate the summarized value of the variable of interest in each interval/window. Possible options are: "mean", geometric mean ("gmean"), "median", "density", "count" and "sum" (default). Here, we define "density" as the sum of values from the variable of interest in the given region divided by the length of the region. The option "count" computes the number of positions in the specified regions with values greater than zero in the selected 'column'. 
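The options above can be illustrated on the values that fall inside a single region. The following is a minimal base R sketch, not the package's internal code: the helper name region_stat and its 'width' argument are hypothetical, and multiplying the density by 'scaling' follows the description of the 'scaling' argument.

```r
# Hypothetical helper (illustration only): summarize the values 'x' of the
# variable of interest inside one region of length 'width' (in bp).
region_stat <- function(x, width,
                        stat = c("sum", "mean", "gmean", "median",
                                 "density", "count"),
                        absolute = FALSE, scaling = 1000L, na.rm = TRUE) {
    stat <- match.arg(stat)
    if (absolute) x <- abs(x)
    switch(stat,
        sum = sum(x, na.rm = na.rm),
        mean = mean(x, na.rm = na.rm),
        # Geometric mean; meaningful for strictly positive values
        gmean = exp(mean(log(x), na.rm = na.rm)),
        median = median(x, na.rm = na.rm),
        # Sum of values divided by the region length, then rescaled
        density = scaling * sum(x, na.rm = na.rm) / width,
        # Number of positions with a value greater than zero
        count = sum(x > 0, na.rm = na.rm)
    )
}

x <- c(0.2, 0, 0.8, 0.5)
region_stat(x, width = 4, stat = "sum")      # 1.5
region_stat(x, width = 4, stat = "count")    # 3
region_stat(x, width = 4, stat = "density")  # 1000 * 1.5 / 4 = 375
```

With absolute = TRUE the same helper sums abs(x), which corresponds to the total variation example given for the 'absolute' argument.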
absolute 
Optional. Logical (default: FALSE). Whether to use the absolute values of the variable provided. For example, the difference of methylation levels (TV) could take negative values, and we might be interested in the sum of abs(TV), which is the sum of the total variation distance. 
select.strand 
Optional. If provided, "+" or "-", then the summarized statistic is computed only for the specified DNA strand. 
column 
Integer number denoting the column where the variable of interest is located in the metacolumn of the GRanges object or an integer vector of two elements (only if prob = TRUE). 
prob 
Logical. If TRUE and the variable of interest has values between zero and 1, then the summarized statistic is computed using Fisher's transformation. If length(column) == 2, say with columns x1 and x2, then the variable of interest will be p = x1/(x1 + x2). For example, if x1 and x2 are methylated and unmethylated read counts, respectively, then p is the methylation level. 
entropy 
Logical. Whether to compute the entropy when prob == TRUE. 
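As a small illustration of the two-column case, the methylation level p = x1/(x1 + x2) can be computed directly from read counts. The entropy formula below is an assumption: it uses the Shannon entropy of a Bernoulli variable with logarithm base 'logbase', which may differ from the package's exact computation. The read counts and the helper name bernoulli_entropy are hypothetical, and Fisher's transformation is not reproduced here.

```r
# Hypothetical read counts at four cytosine positions
x1 <- c(8, 0, 5, 10)   # methylated reads
x2 <- c(2, 10, 5, 0)   # unmethylated reads

# Methylation level, as described for length(column) == 2
p <- x1 / (x1 + x2)

# Assumed entropy: Shannon entropy of a Bernoulli variable,
# taking 0 * log(0) as 0
bernoulli_entropy <- function(p, logbase = 2) {
    h <- function(q) ifelse(q > 0, -q * log(q, base = logbase), 0)
    h(p) + h(1 - p)
}

p                      # 0.8 0.0 0.5 1.0
bernoulli_entropy(p)   # largest (1 bit) at p = 0.5, zero at p = 0 and p = 1
```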
maxgap, minoverlap, type 
See ?findOverlaps in the IRanges package for a description of these arguments. 
scaling 
Integer (default: 1000L, as shown in the usage). Scaling factor to be used when stat = "density". For example, if scaling = 1000, then density * scaling denotes the sum of values in 1000 bp. 
logbase 
A positive number: the base with respect to which logarithms are computed when parameter 'entropy = TRUE' (default: logbase = 2). 
missings 
Whether to write '0' or 'NA' on regions where there are no data to compute the statistic (default: 0). 
ignore.strand 
When set to TRUE, the strand information is ignored in the overlap calculations. 
na.rm 
Logical value. If TRUE, the NA values will be removed. 
naming 
Logical value. If TRUE, the rows of the resulting GRanges object will be given the names(GR). Default is FALSE. 
num.cores, tasks 
Parameters for parallel computation using package

verbose 
Logical. Default is TRUE. If TRUE, then the progress of the computational tasks is given. 
maxgap, minoverlap, type, select, ignore.strand 
Used to find overlapping regions. See ? 
This function splits a GRanges object into genomic regions (intervals) of fixed size (as done in function "tileMethylCounts2" from the R package methylKit, with small changes). A summarized statistic (mean, median, geometric mean or sum) is calculated for the specified variable values from each region. Notice that if win.size == step.size, then non-overlapping windows are obtained.
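The relation between win.size and step.size can be sketched in base R. This is only an illustration of fixed-size tiling: the helper make_windows and its 'seq.len' argument are hypothetical, and the package's handling of window boundaries at the ends of a chromosome may differ.

```r
# Hypothetical helper: start/end coordinates of fixed-size windows along a
# sequence of length 'seq.len'
make_windows <- function(seq.len, win.size, step.size) {
    starts <- seq(1, max(1, seq.len - win.size + 1), by = step.size)
    data.frame(start = starts, end = starts + win.size - 1)
}

# win.size == step.size: windows tile the sequence without overlap
# (starts 1, 5, 9; ends 4, 8, 12)
make_windows(12, win.size = 4, step.size = 4)

# step.size < win.size: consecutive windows overlap by win.size - step.size
# (starts 1, 3, 5, 7, 9)
make_windows(12, win.size = 4, step.size = 2)
```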
An object of the same class as GR with the new genomic regions and their corresponding summarized statistic.
Robersy Sanchez
getGRegionsStat2
library(GenomicRanges)
library(MethylIT.utils)

gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr3", "chr4"),
                             c(5, 5, 5, 5)),
              ranges = IRanges(start = 1:20, end = 1:20),
              strand = rep(c("+", "-"), 10),
              GC = seq(1, 0, length = 20))
grs <- getGRegionsStat(gr, win.size = 4, step.size = 4)
grs

## Selecting the positive strand
grs <- getGRegionsStat(gr, win.size = 4, step.size = 4, select.strand = "+")
grs

## Selecting the negative strand
grs <- getGRegionsStat(gr, win.size = 4, step.size = 4, select.strand = "-")
grs

## Operating over a list of GRanges objects
gr2 <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr3", "chr4"),
                              c(5, 5, 5, 5)),
               ranges = IRanges(start = 1:20, end = 1:20),
               strand = rep(c("+", "-"), 10),
               GC = runif(20))
grs <- getGRegionsStat(list(gr1 = gr, gr2 = gr2), win.size = 4, step.size = 4)