Build an unique GRanges object from a list of Granges objects.

uniqueGRanges(
  ListOfGranges,
  ncolns = NULL,
  columns = NULL,
  chromosomes = NULL,
  maxgap = -1L,
  minoverlap = 1L,
  missing = 0,
  type = c("any", "start", "end", "within", "equal"),
  select = c("all", "first", "last", "arbitrary"),
  ignore.strand = FALSE,
  num.cores = 1L,
  tasks = 0L,
  verbose = TRUE
)

Arguments

ListOfGranges

Objects to combine. A list of GRanges-class object or a GRangesList-class object.

ncolns

integer. Number of columns to use from the meta-column of each GRanges object. Default value: NULL. If NULL, all the columns (from column 1 to ncolns) from each GRanges will be present in the \(uniqueGRanges\) output. If \(ncolns\) = NULL and \(columns != NULL\), then the function will set \(ncolns = length(columns)\).

columns

integer number(s) corresponding to the specific column(s) to use from the meta-column of each GRanges. Default value: NULL. if provided, the metacolumn from the uniqueGRanges output will contain the specified columns.

chromosomes

Chromosomes used Default value: NULL

maxgap, minoverlap, ignore.strand, select, type

The same as in findOverlaps-methods.

missing

A numerical value (default 0) or NA to write in ranges with missing values. For example, suppose that we want to build a uniqueGRanges object from the GRanges objects X and Y. If a given range k from the GRanges object X with metacolumn value x is missing in the GRanges object Y, then the metacolumn of range k from uniqueGRanges(list(X,Y)) object will be the row vector (x,0) or (x,NA) if missing = NA.

num.cores, tasks

Parameters for parallel computation using package BiocParallel-package: the number of cores to use, i.e. at most how many child processes will be run simultaneously (see bplapply and the number of tasks per job (only for Linux OS).

verbose

if TRUE, prints the function log to stdout

Value

a GRanges object

Details

The metadata of each one of these GRanges must have one or more columns to yield a unique GRanges object with metadata columns from the original GRanges objects. Otherwise, a unique GRanges object will be created without metadata columns. Additionally, all metadata must be the same class, e.g. all numeric or all characters, or all factor

Author

Robersy Sanchez (https://genomaths.com).

Examples

## 'GRanges' object construction
dfChr1 <- data.frame(chr = 'chr1', start = 11:15, end = 11:15,
                     strand = c('+','-','+','*','.'), score = 1:5)
dfChr2 <- data.frame(chr = 'chr1', start = 11:15, end = 11:15,
                     strand = c('+','-','+','*','.'), score = 1:5)
dfChr3 <- data.frame(chr = 'chr1', start = 11:15, end = 11:15,
                     strand = c('+','-','+','*','+'), score = 1:5)

gr1 <- makeGRangesFromDataFrame(dfChr1, keep.extra.columns = TRUE)
gr2 <- makeGRangesFromDataFrame(dfChr2, keep.extra.columns = TRUE)
gr3 <- makeGRangesFromDataFrame(dfChr3, keep.extra.columns = TRUE)

## A 'GRangesList' object:
grList <- GRangesList('gr1' = gr1, 'gr2' = gr2, 'gr3' = gr3)
grList
#> GRangesList object of length 3:
#> $gr1
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      + |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      * |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 
#> $gr2
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      + |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      * |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 
#> $gr3
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      + |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      + |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 

## The default setting does not ignore strands value (ignore.strand = FALSE).
## In object 'gr1' and 'gr2', the strand at the 5th range is unknown.
## However, in object 'gr3' the 5th range has "+" strand. The defaul setting
## preserve this information. Two ranges with start = 15 are yielded: 1) the
## first range the first range with strand = '*' carries the value 5 in the
## columns derived from 'gr1' and 'gr2', and 0 (missing = 0) in the column
## derived from 'gr3'.
uniqueGRanges(grList)
#>  *** Building coordinates for the new GRanges object ...
#>  *** Strand information is preserved ...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |======================================================================| 100%
#> 
#>  *** Sorting by chromosomes and start positions...
#> GRanges object with 6 ranges and 3 metadata columns:
#>       seqnames    ranges strand |     score   score.1   score.2
#>          <Rle> <IRanges>  <Rle> | <numeric> <numeric> <numeric>
#>   [1]     chr1        11      + |         1         1         1
#>   [2]     chr1        12      - |         2         2         2
#>   [3]     chr1        13      + |         3         3         3
#>   [4]     chr1        14      * |         4         4         4
#>   [5]     chr1        15      * |         5         5         0
#>   [6]     chr1        15      + |         0         0         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## A new GRanges object is included with values in different strands
dfChr4 <- data.frame(chr = 'chr1', start = 11:15, end = 11:15,
                     strand = c('-','-','+','*','-'), score = 1:5)
gr4 <- makeGRangesFromDataFrame(dfChr4, keep.extra.columns = TRUE)
grList <- GRangesList('gr1' = gr1, 'gr2' = gr2, 'gr3' = gr3, gr4 = gr4)
grList
#> GRangesList object of length 4:
#> $gr1
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      + |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      * |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 
#> $gr2
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      + |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      * |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 
#> $gr3
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      + |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      + |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 
#> $gr4
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |     score
#>          <Rle> <IRanges>  <Rle> | <integer>
#>   [1]     chr1        11      - |         1
#>   [2]     chr1        12      - |         2
#>   [3]     chr1        13      + |         3
#>   [4]     chr1        14      * |         4
#>   [5]     chr1        15      - |         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
#> 

## The default setting preserve the information adding new ranges.
uniqueGRanges(grList)
#>  *** Building coordinates for the new GRanges object ...
#>  *** Strand information is preserved ...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |======================================================================| 100%
#> 
#>  *** Sorting by chromosomes and start positions...
#> GRanges object with 8 ranges and 4 metadata columns:
#>       seqnames    ranges strand |     score   score.1   score.2   score.3
#>          <Rle> <IRanges>  <Rle> | <numeric> <numeric> <numeric> <numeric>
#>   [1]     chr1        11      + |         1         1         1         0
#>   [2]     chr1        11      - |         0         0         0         1
#>   [3]     chr1        12      - |         2         2         2         2
#>   [4]     chr1        13      + |         3         3         3         3
#>   [5]     chr1        14      * |         4         4         4         4
#>   [6]     chr1        15      * |         5         5         0         0
#>   [7]     chr1        15      + |         0         0         5         0
#>   [8]     chr1        15      - |         0         0         0         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## The setting: 'ignore.strand = TRUE' ignores the strand information
## and only the information on the base position is preserved.
uniqueGRanges(grList, ignore.strand = TRUE)
#>  *** Building coordinates for the new GRanges object ...
#>  *** Setting strand information to * ...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |======================================================================| 100%
#> 
#>  *** Sorting by chromosomes and start positions...
#> GRanges object with 5 ranges and 4 metadata columns:
#>       seqnames    ranges strand |     score   score.1   score.2   score.3
#>          <Rle> <IRanges>  <Rle> | <numeric> <numeric> <numeric> <numeric>
#>   [1]     chr1        11      * |         1         1         1         1
#>   [2]     chr1        12      * |         2         2         2         2
#>   [3]     chr1        13      * |         3         3         3         3
#>   [4]     chr1        14      * |         4         4         4         4
#>   [5]     chr1        15      * |         5         5         5         5
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths