getGRegionsStat-methods {MethylIT.utils}    R Documentation

Statistic of Genomic Regions

Description

A function to estimate centrality measures of a specified variable given in a GRanges object (a column from the metacolumns of the GRanges object) after splitting the GRanges object into intervals.

Usage

getGRegionsStat(GR, win.size = 350, step.size = 350,
  grfeatures = NULL, stat = c("sum", "mean", "gmean", "median",
  "density"), absolute = FALSE, select.strand = NULL, column = 1L,
  prob = FALSE, entropy = FALSE, maxgap = -1L, minoverlap = 0L,
  scaling = 1000L, logbase = 2, type = c("any", "start", "end",
  "within", "equal"), ignore.strand = FALSE, na.rm = TRUE,
  num.cores = 1L, tasks = 0, ...)

## S4 method for signature 'GRanges'
getGRegionsStat(GR, win.size, step.size, grfeatures,
  stat, absolute, select.strand, column, prob, entropy, maxgap, minoverlap,
  scaling, logbase, type, ignore.strand, na.rm)

## S4 method for signature 'list'
getGRegionsStat(GR, win.size = 350, step.size = 350,
  grfeatures = NULL, stat = c("sum", "mean", "gmean", "median",
  "density"), absolute = FALSE, select.strand = NULL, column = 1L,
  prob = FALSE, entropy = FALSE, maxgap = -1L, minoverlap = 0L,
  scaling = 1000L, logbase = 2, type = c("any", "start", "end",
  "within", "equal"), ignore.strand = FALSE, na.rm = TRUE,
  num.cores = 1L, tasks = 0, ...)

## S4 method for signature 'InfDiv'
getGRegionsStat(GR, win.size = 350, step.size = 350,
  grfeatures = NULL, stat = c("sum", "mean", "gmean", "median",
  "density"), absolute = FALSE, select.strand = NULL, column = 1L,
  prob = FALSE, entropy = FALSE, maxgap = -1L, minoverlap = 0L,
  scaling = 1000L, logbase = 2, type = c("any", "start", "end",
  "within", "equal"), ignore.strand = FALSE, na.rm = TRUE,
  num.cores = 1L, tasks = 0, ...)

## S4 method for signature 'pDMP'
getGRegionsStat(GR, win.size = 350, step.size = 350,
  grfeatures = NULL, stat = c("sum", "mean", "gmean", "median",
  "density"), absolute = FALSE, select.strand = NULL, column = 1L,
  prob = FALSE, entropy = FALSE, maxgap = -1L, minoverlap = 0L,
  scaling = 1000L, logbase = 2, type = c("any", "start", "end",
  "within", "equal"), ignore.strand = FALSE, na.rm = TRUE,
  num.cores = 1L, tasks = 0, ...)

## S4 method for signature 'GRangesList'
getGRegionsStat(GR, win.size = 350,
  step.size = 350, grfeatures = NULL, stat = c("sum", "mean",
  "gmean", "median", "density"), absolute = FALSE,
  select.strand = NULL, column = 1L, prob = FALSE, entropy = FALSE,
  maxgap = -1L, minoverlap = 0L, scaling = 1000L, logbase = 2,
  type = c("any", "start", "end", "within", "equal"),
  ignore.strand = FALSE, na.rm = TRUE, num.cores = 1L, tasks = 0,
  ...)

Arguments

GR

A GRanges object with the variable of interest in its metacolumns. Methods are also provided for list, InfDiv, pDMP, and GRangesList objects (see Usage).

win.size

An integer giving the size of the windows/regions into which the genomic regions are split.

step.size

Interval at which the regions/windows must be defined. If step.size is equal to win.size, the windows do not overlap.

grfeatures

A GRanges object corresponding to an annotated genomic feature, for example, gene regions, transposable elements, exons, intergenic regions, etc. If provided, the parameters 'win.size' and 'step.size' are ignored and the statistics are estimated for 'grfeatures'.

stat

Statistic used to estimate the summarized value of the variable of interest in each interval/window. Possible options are: "mean", geometric mean ("gmean"), "median", "density", and "sum" (default). Here, "density" is defined as the sum of the values of the variable of interest in the given region divided by the length of the region.
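
As a rough illustration of what each option summarizes, the following base R sketch computes the statistics for the values falling in one hypothetical window. The vector x, the window width win_width, and the geometric-mean formula exp(mean(log(x))) are assumptions made for illustration; the package's internal implementation may differ in its details.

```r
# Hypothetical values of the variable of interest inside one window
x <- c(1, 2, 4, 8)
win_width <- 4  # assumed region length in bp

stats <- c(
  sum     = sum(x),
  mean    = mean(x),
  gmean   = exp(mean(log(x))),   # geometric mean (assumes all x > 0)
  median  = median(x),
  density = sum(x) / win_width   # sum of values divided by region length
)
stats
```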

absolute

Optional. Logical (default: FALSE). Whether to use the absolute values of the variable provided.

select.strand

Optional. If provided ("+" or "-"), the summarized statistic is computed only for the specified DNA strand.

column

Integer denoting the column of the GRanges metacolumns where the variable of interest is located, or an integer vector of two elements (only if prob = TRUE).

prob

Logical. If TRUE and the variable of interest has values between 0 and 1, then the summarized statistic is computed using Fisher's transformation. If length(column) == 2, say with columns x1 and x2, then the variable of interest will be p = x1/(x1 + x2). For example, if x1 and x2 are methylated and unmethylated read counts, respectively, then p is the methylation level.
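
The two-column case can be illustrated with a few lines of base R. The read counts are made-up values, and the sketch shows only how p is derived from the two columns, not the Fisher's transformation step.

```r
# Hypothetical read counts at three cytosine sites
x1 <- c(8, 0, 5)    # methylated read counts (assumed values)
x2 <- c(2, 10, 5)   # unmethylated read counts (assumed values)

# Methylation level at each site
p <- x1 / (x1 + x2)
p
```

Sites with x1 + x2 == 0 would give NaN here, so zero-coverage sites may need to be filtered out beforehand.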

entropy

Logical. Whether to compute the entropy when prob == TRUE.
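
This page does not state which entropy is computed. A minimal sketch, assuming the Shannon entropy of a two-state variable with probability p and logarithms taken in base logbase (the helper shannon_entropy is hypothetical and not part of the package):

```r
# Assumed form: H(p) = -p * log_b(p) - (1 - p) * log_b(1 - p),
# with the convention that 0 * log(0) is taken as 0
shannon_entropy <- function(p, logbase = 2) {
  h <- function(q) ifelse(q > 0, -q * log(q, base = logbase), 0)
  h(p) + h(1 - p)
}

shannon_entropy(c(0, 0.5, 1))
```

Under this assumption the entropy is largest at p = 0.5 and vanishes at p = 0 and p = 1.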

maxgap, minoverlap, type

See ?findOverlaps in the IRanges package for a description of these arguments.
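
To see what these arguments control, the sketch below calls findOverlaps directly on a toy GRanges object and two hypothetical feature regions, then sums the variable per feature. It only illustrates the overlap step under the default argument values; it is not meant to reproduce the internals of getGRegionsStat.

```r
library(GenomicRanges)

# Six single-base sites with a toy variable 'GC' (assumed values)
gr <- GRanges("chr1", IRanges(start = 1:6, width = 1),
              GC = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6))

# Two hypothetical annotated regions
features <- GRanges("chr1", IRanges(start = c(1, 4), end = c(3, 5)))

# Which sites fall in which feature
hits <- findOverlaps(features, gr, maxgap = -1L, minoverlap = 0L,
                     type = "any", ignore.strand = FALSE)

# Sum of 'GC' over the sites overlapping each feature
feature_sum <- tapply(gr$GC[subjectHits(hits)], queryHits(hits), sum)
feature_sum
```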

scaling

Integer (default: 1000). Scaling factor to be used when stat = "density". For example, if scaling = 1000, then density * scaling denotes the sum of values per 1000 bp.
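
A short arithmetic sketch of the scaling, using made-up values and an assumed region length of 350 bp:

```r
values <- c(0.5, 1.5, 2.0, 3.0)  # hypothetical values in one region
region_length <- 350             # assumed region length in bp
scaling <- 1000L

dens <- sum(values) / region_length  # sum of values per bp
scaled_density <- dens * scaling     # sum of values per 1000 bp
scaled_density
```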

logbase

A positive number: the base with respect to which logarithms are computed.

ignore.strand

When set to TRUE, the strand information is ignored in the overlap calculations.

na.rm

Logical value. If TRUE, NA values are removed.

num.cores, tasks

Parameters for parallel computation using the BiocParallel package: the number of cores to use, i.e. at most how many child processes will be run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).
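
For orientation, the sketch below shows how such settings are typically passed to BiocParallel. Mapping num.cores to workers and tasks to tasks is an assumption about how the package forwards these arguments, and the squaring function is only a placeholder workload.

```r
library(BiocParallel)

num.cores <- 2L
tasks <- 0L

# Forked workers are available on Unix-alikes only; fall back to a
# socket cluster elsewhere
bpparam <- if (.Platform$OS.type == "unix") {
  MulticoreParam(workers = num.cores, tasks = tasks)
} else {
  SnowParam(workers = num.cores, type = "SOCK")
}

# Placeholder workload: one job per list element
res <- bplapply(1:4, function(i) i^2, BPPARAM = bpparam)
unlist(res)
```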

Details

This function splits a GRanges object into genomic regions (GR) of fixed size (as done in the function "tileMethylCounts2" from the R package methylKit, with small changes). A summarized statistic (sum, mean, geometric mean, median or density) is calculated for the specified variable values from each region. Notice that if win.size == step.size, then non-overlapping windows are obtained.
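
The relation between win.size and step.size can be sketched in base R. The helper make_windows is hypothetical: it assumes windows start at position 1, that chr_length >= win.size, and that only full-length windows are kept, which is not necessarily how the package places its windows.

```r
# Start/end coordinates of fixed-size windows along one chromosome
make_windows <- function(chr_length, win.size, step.size) {
  starts <- seq(1, chr_length - win.size + 1, by = step.size)
  data.frame(start = starts, end = starts + win.size - 1)
}

w_tiled   <- make_windows(20, win.size = 4, step.size = 4)  # non-overlapping
w_sliding <- make_windows(20, win.size = 4, step.size = 2)  # overlap of 2 bp
w_tiled
w_sliding
```

With step.size smaller than win.size, consecutive windows share positions, so the same site contributes to more than one summarized value.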

Value

A GRanges object with the new genomic regions and their corresponding summarized statistic.

Author(s)

Robersy Sanchez

Examples

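library(GenomicRanges)
library(MethylIT.utils)

## A toy GRanges object: 20 single-base sites over four chromosomes,
## with the variable of interest in the metacolumn 'GC'
gr <- GRanges(seqnames = Rle( c("chr1", "chr2", "chr3", "chr4"),
            c(5, 5, 5, 5)),
            ranges = IRanges(start = 1:20, end = 1:20),
            strand = rep(c("+", "-"), 10),
            GC = seq(1, 0, length.out = 20))
## Sum of 'GC' (the default statistic) in non-overlapping windows of 4 bp
grs <- getGRegionsStat(gr, win.size = 4, step.size = 4)
grs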

## Selecting the positive strand
grs <- getGRegionsStat(gr, win.size = 4, step.size = 4, select.strand = "+")
grs

## Selecting the negative strand
grs <- getGRegionsStat(gr, win.size = 4, step.size = 4, select.strand = "-")
grs

## Operating over a list of GRanges objects
set.seed(123)  # make the random 'GC' values reproducible
gr2 <- GRanges(seqnames = Rle( c("chr1", "chr2", "chr3", "chr4"),
                            c(5, 5, 5, 5)),
                ranges = IRanges(start = 1:20, end = 1:20),
                strand = rep(c("+", "-"), 10),
                GC = runif(20))

grs <- getGRegionsStat(list(gr1 = gr, gr2 = gr2), win.size = 4, step.size = 4)

[Package MethylIT.utils version 0.3.2 ]