I extracted a DNA sequence from chromosome 1 using subseq() and DNAString() (from Biostrings). My goal is to find all possible transcripts in this sequence.
My plan was to find the overlaps between the TxDb object (the human transcript annotation) and the region the DNA sequence came from, using transcriptsByOverlaps(). To do that I had to build a GRanges object with the start and end positions of the region on chr1.
Here is the code:
> library(GenomicFeatures)
# load the precompiled TxDb annotation for Homo sapiens (hg19) from Bioconductor
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
# build the GRanges object for the region of interest on chr1
> (gr <- GRanges(seqnames = Rle("chr1", 1), ranges = IRanges(start = 119625311, end = 129625309)))
GRanges object with 1 range and 0 metadata columns:
      seqnames                 ranges strand
         <Rle>              <IRanges>  <Rle>
  [1]     chr1 [119625311, 129625309]      *
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> transcriptsByOverlaps(txdb, gr) # this function expects a TxDb object and a GRanges object
Error in .Call2("NCList_find_overlaps_in_groups", start(q), end(q), q_space, :
when 'type' is "any", at least one of 'maxgap' and 'minoverlap' must be set to its default value
I even ran the example from the transcriptsByOverlaps() help page, and it fails with the same error:
> txdb <- loadDb(system.file("extdata", "hg19_knownGene_sample.sqlite",
+ package="GenomicFeatures"))
> gr <- GRanges(seqnames = rep("chr1",2),
+ ranges = IRanges(start=c(500,10500), end=c(10000,30000)),
+ strand = strand(rep("-",2)))
> transcriptsByOverlaps(txdb, gr)
Error in .Call2("NCList_find_overlaps_in_groups", start(q), end(q), q_space, :
when 'type' is "any", at least one of 'maxgap' and 'minoverlap' must be set to its default value
Can anyone help with this? I have read the help page but cannot work out how to solve the problem. Thanks.
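For reference, the kind of result I expect from an "any"-type overlap can be sketched in base R, with no Bioconductor packages involved. This is only a rough illustration under stated assumptions: overlaps_any() is a hypothetical helper written for this post (not a function from any package), the transcript coordinates are invented, and strand is ignored because the query range has strand "*".

```r
# Hypothetical helper: TRUE where the closed interval [start, end] shares at
# least one position with the query interval [q_start, q_end].
overlaps_any <- function(start, end, q_start, q_end) {
  start <= q_end & end >= q_start
}

# Query region on chr1 (same coordinates as the GRanges object above).
q_start <- 119625311
q_end   <- 129625309

# Made-up transcript coordinates, for illustration only.
tx <- data.frame(
  tx_name = c("txA", "txB", "txC", "txD"),
  start   = c(119000000, 119620000, 125000000, 129625310),
  end     = c(119500000, 119630000, 125010000, 129700000),
  stringsAsFactors = FALSE
)

# Keep only the transcripts that overlap the query region.
hits <- tx[overlaps_any(tx$start, tx$end, q_start, q_end), ]
print(hits$tx_name)
```

With these invented coordinates, txB (which straddles the start of the region) and txC (which lies entirely inside it) are kept, while txA ends before the region and txD starts one base after it.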
Here is the output of sessionInfo():
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=it_IT.UTF-8 LC_NUMERIC=C LC_TIME=it_IT.UTF-8
[4] LC_COLLATE=it_IT.UTF-8 LC_MONETARY=it_IT.UTF-8 LC_MESSAGES=it_IT.UTF-8
[7] LC_PAPER=it_IT.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.46.0
[3] rtracklayer_1.36.6 Biostrings_2.46.0
[5] XVector_0.16.0 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[7] GenomicFeatures_1.28.5 AnnotationDbi_1.40.0
[9] Biobase_2.36.2 GenomicRanges_1.30.3
[11] GenomeInfoDb_1.14.0 IRanges_2.12.0
[13] S4Vectors_0.16.0 BiocGenerics_0.24.0
[15] biomaRt_2.34.2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 pillar_1.1.0 compiler_3.4.1
[4] zlibbioc_1.22.0 prettyunits_1.0.2 bitops_1.0-6
[7] tools_3.4.1 progress_1.2.2 zeallot_0.1.0
[10] digest_0.6.14 bit_1.1-12 lattice_0.20-35
[13] RSQLite_2.0 memoise_1.1.0 tibble_1.4.1
[16] pkgconfig_2.0.1 rlang_0.4.1 Matrix_1.2-12
[19] DelayedArray_0.2.7 DBI_0.7 rstudioapi_0.7
[22] GenomeInfoDbData_0.99.0 stringr_1.2.0 httr_1.3.1
[25] knitr_1.18 vctrs_0.2.0 hms_0.5.2
[28] grid_3.4.1 bit64_0.9-7 R6_2.2.2
[31] BiocParallel_1.10.1 XML_3.98-1.9 blob_1.1.0
[34] magrittr_1.5 matrixStats_0.52.2 GenomicAlignments_1.12.2
[37] Rsamtools_1.28.0 backports_1.1.2 SummarizedExperiment_1.6.5
[40] assertthat_0.2.0 stringi_1.1.6 RCurl_1.95-4.10
[43] crayon_1.3.4