Discards the cytosine positions whose coverage is less than 'min.coverage' read counts, greater than 'max.coverage', or greater than the coverage value at the specified 'percentile'.

filterByCoverage(
  x,
  min.coverage = 4,
  max.coverage = Inf,
  percentile = 0.999,
  col.names = c(coverage = NULL, mC = NULL, uC = NULL),
  verbose = TRUE
)

Arguments

x

GRanges object or list of GRanges

min.coverage

Cytosine sites with coverage less than min.coverage are discarded. Default: 4

max.coverage

Cytosine sites with coverage greater than max.coverage are discarded. Default: Inf

percentile

Percentile of the coverage values used as the threshold to remove outliers from each file and from all files stacked. If percentile is 1, no outliers are removed. Default: 0.999

col.names

The column numbers (indices) of the count columns. Since no specific table format is required for the count data, either the number of the 'coverage' column or the numbers of the columns with methylated (mC) and unmethylated (uC) counts must be given. In the latter case, coverage = mC + uC.

verbose

If TRUE, prints the function log to stdout. Default: TRUE

Value

The input GRanges object, or list of GRanges objects, with the cytosine sites that fail the coverage filters removed.

Details

The input must be a GRanges object or a list of GRanges objects whose meta-column table contains either a coverage column or the columns with methylated (mC) and unmethylated (uC) counts.
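
The filtering rules described above can be sketched in base R on a plain data frame. This is only an illustration under stated assumptions, not the function's actual implementation: the helper name filter_by_coverage_sketch and the toy counts are hypothetical, the bounds are taken as inclusive by reading the argument descriptions literally, and the percentile threshold is computed with stats::quantile(). The real function may handle boundaries and stacked files differently.

```r
# Hypothetical helper (not part of the package): applies the documented
# rules to a data frame with methylated (mC) and unmethylated (uC) counts.
filter_by_coverage_sketch <- function(counts, min.coverage = 4,
                                      max.coverage = Inf,
                                      percentile = 0.999) {
  # Coverage is derived from the two count columns: coverage = mC + uC
  coverage <- counts$mC + counts$uC

  # Coverage value at the requested percentile; sites above it are
  # treated as outliers (assumption: default quantile type of R)
  upper <- unname(quantile(coverage, probs = percentile))

  # Keep sites inside [min.coverage, max.coverage] and not above 'upper'
  keep <- coverage >= min.coverage &
    coverage <= max.coverage &
    coverage <= upper

  counts[keep, , drop = FALSE]
}

# Toy data: coverage values are 2, 7, 10, 22, 11 and 1000
counts <- data.frame(pos = 1:6,
                     mC = c(1, 3, 5, 10, 2, 400),
                     uC = c(1, 4, 5, 12, 9, 600))

kept <- filter_by_coverage_sketch(counts, min.coverage = 4,
                                  percentile = 0.9)
kept$pos
```

In this toy data the site with coverage 2 falls below min.coverage and the site with coverage 1000 lies above the 0.9 quantile (511), so positions 2 to 5 are kept.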

Examples

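## makeGRangesFromDataFrame() is provided by GenomicRanges
library(GenomicRanges)

gr1 <- makeGRangesFromDataFrame(
  data.frame(chr = 'chr1', start = 11:15, end = 11:15,
             strand = c('+','-','+','*','.'), mC = 1, uC = 1:5),
  keep.extra.columns = TRUE)

## mC and uC are the first and second metadata columns
filterByCoverage(gr1, min.coverage = 1, max.coverage = 4,
                 col.names = c(mC = 1, uC = 2), verbose = FALSE)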
#> GRanges object with 2 ranges and 2 metadata columns:
#>       seqnames    ranges strand |        mC        uC
#>          <Rle> <IRanges>  <Rle> | <numeric> <integer>
#>   [1]     chr1        11      + |         1         1
#>   [2]     chr1        12      - |         1         2
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths