I have a GenBank (.gb) file and need to extract some specific features from it: the names and sizes of the protein-coding genes.
LOCUS NC_008137 15318 bp DNA linear MAM 15-APR-2009
DEFINITION Phalanger interpositus mitochondrion, complete genome.
ACCESSION NC_008137
VERSION NC_008137.1 GI:108793518
DBLINK Project: 17043
KEYWORDS .
SOURCE mitochondrion Phalanger interpositus (Stein's cuscus)
ORGANISM Phalanger interpositus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Metatheria; Diprotodontia; Phalangeridae; Phalanger.
REFERENCE 1 (bases 1 to 15318)
AUTHORS Munemasa,M., Nikaido,M., Donnellan,S., Austin,C.C., Okada,N. and
Hasegawa,M.
TITLE Phylogenetic analysis of diprotodontian marsupials based on
complete mitochondrial genomes
JOURNAL Genes Genet. Syst. 81 (3), 181-191 (2006)
PUBMED 16905872
REFERENCE 2 (bases 1 to 15318)
CONSRTM NCBI Genome Project
TITLE Direct Submission
JOURNAL Submitted (12-JUN-2006) National Center for Biotechnology
Information, NIH, Bethesda, MD 20894, USA
REFERENCE 3 (bases 1 to 15318)
AUTHORS Munemasa,M., Nikaido,M., Donnellan,S., Austin,C.C., Okada,N. and
Hasegawa,M.
TITLE Direct Submission
JOURNAL Submitted (08-NOV-2005) Tokyo Institute of Technology, Graduate
School of Bioscience and Biotechnology; Nagatsuta-cho 4259-B-21,
Midori-ku, Kanagawa 226-8501, Japan
COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staff. The
reference sequence was derived from AB241057.
Genome sequence lacks part of non-coding region.
COMPLETENESS: full length.
FEATURES Location/Qualifiers
source 1..15318
/organism="Phalanger interpositus"
/organelle="mitochondrion"
/mol_type="genomic DNA"
/db_xref="taxon:356347"
/tissue_type="liver"
/common="Stein's cuscus"
tRNA 1..69
/product="tRNA-Phe"
rRNA 72..1018
/product="s-rRNA"
/note="12S ribosomal RNA"
tRNA 1020..1088
/product="tRNA-Val"
rRNA 1089..2653
/product="l-rRNA"
/note="16S ribosomal RNA"
tRNA 2654..2727
/product="tRNA-Leu"
/codon_recognized="UUR"
gene 2729..3685
/gene="ND1"
/db_xref="GeneID:4117948"
CDS 2729..3685
/gene="ND1"
/codon_start=1
/transl_table=2
/product="NADH dehydrogenase subunit 1"
/protein_id="YP_637062.1"
/db_xref="GI:108793519"
/db_xref="GeneID:4117948"
/translation="MFIINLLMYIIPILLAIAFLTLVERKALGYMQFRKGPNVVGPYG
LLQPIADGMKLFSKEPLQPVTSSTTMFIIAPTLALTLSLTMWTPLPMPHSLIDLNLGL
LFILALSGLSVYSILWSGWASNSKYALMGALRAVAQTISYEVTLAIILLSIMLINGSF
TLKNLITTQENMWLIITTWPLVMMWYVSTLAETNRAPLDLTEGESELVSGFNVEYAAG
PFAMFFLAEYANIMLMNAMTTILFLGSSINHNFTHLNTLSFMTKTIALTFLFLWVRAS
YPRFRYDQLMHLLWKNFLPMTLAMCLWFISIPIALSCIPPQI"
misc_feature 2729..3682
/gene="ND1"
/note="NADH dehydrogenase; Region: NADHdh; cl00469"
/db_xref="CDD:186018"
tRNA 3686..3751
/product="tRNA-Ile"
tRNA complement(3750..3821)
/product="tRNA-Gln"
tRNA 3821..3878
/product="tRNA-Met"
gene 3889..4932
/gene="ND2"
/db_xref="GeneID:4117949"
CDS 3889..4932
/gene="ND2"
/codon_start=1
/transl_table=2
/product="NADH dehydrogenase subunit 2"
/protein_id="YP_637063.1"
/db_xref="GI:108793520"
/db_xref="GeneID:4117949"
/translation="MSPYILLIMLTSLLLGTSLTLFSNHWLTAWMGLEINTLAIIPMM
TYPNHPRATESAIKYFLTQSTASMMLMFAIINNAWMTNQWTLLQTSDQTSSTIMTLAL
AMKLGLAPFHFWVPEVTQGIPLTSGMILLTWQKIAPTSLMYQISPSLNMKILVMLALL
STILGGWGGLNQTHMRKILAYSSIAHMGWMTIIILINPTLTLLNLAIYITTTLTLFLA
LNHSSITKIKSLANLWNKSSSMTIVIALTLLSLGGLPPLTGFMPKWLILQELITYNNI
ATATMMAMSALLNLFFYMRIIYTTTLTMPPSINNSKLQWPHPQTKTTNIIPLLTIISS
FLLPLTPLSITLS"
I tried using SeqFeature and subfeatures, but it did not work.
From this file I should get (ND1 and 2729..3685, ND2 and 3889..4932, ... and so on if there are more).
I am new to Biopython and would appreciate help with this.
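
A minimal sketch of one way to produce this output, assuming the file is read with Biopython's `SeqIO.parse` in `"genbank"` format and that each protein-coding gene appears as a `CDS` feature with a `/gene` qualifier, as in the record above. The helper names `coding_genes` and `report` and the file name in the usage comment are hypothetical, not part of Biopython.

```python
def coding_genes(record):
    """Return (gene_name, start, end, length) for every CDS feature.

    Coordinates are 1-based and inclusive, as written in the GenBank file.
    Assumes simple (non-joined) locations, so length is just end - start + 1.
    """
    genes = []
    for feature in record.features:
        if feature.type != "CDS":
            continue  # skip source, tRNA, rRNA, gene, misc_feature, ...
        # Qualifier values are lists; use a placeholder if /gene is missing.
        name = feature.qualifiers.get("gene", ["unknown"])[0]
        # Biopython stores 0-based, end-exclusive positions, so add 1 to the start.
        start = int(feature.location.start) + 1
        end = int(feature.location.end)
        genes.append((name, start, end, end - start + 1))
    return genes


def report(path):
    """Print one line per protein-coding gene in a GenBank file."""
    # Imported here so coding_genes() stays usable without Biopython installed.
    from Bio import SeqIO

    for record in SeqIO.parse(path, "genbank"):
        for name, start, end, length in coding_genes(record):
            print(f"{name}\t{start}..{end}\t{length} bp")


# Usage (hypothetical file name):
# report("NC_008137.gb")
```

Filtering on `CDS` rather than `gene` is a design choice: it keeps only protein-coding entries and ignores the tRNA and rRNA features. For the two genes shown above, the lengths work out to 957 bp for ND1 and 1044 bp for ND2.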