| Class | Bio::NBRF |
| In: | lib/bio/db/nbrf.rb |
| Parent: | DB |
| DELIMITER | = | RS = "\n>" | Delimiter of each entry. Bio::FlatFile uses it. | |
| DELIMITER_OVERRUN | = | 1 | (Integer) excess read size included in DELIMITER. |
| entry_id | -> | accession |
| data | [RW] | sequence data of the entry (???) |
| definition | [RW] | Returns the description line of the NBRF/PIR formatted data. |
| entry_id | [RW] | Returns ID described in the entry. |
| entry_overrun | [R] | piece of next entry. Bio::FlatFile uses it. |
| seq_type | [RW] | Returns the sequence type described in the entry: P1 (protein), F1 (protein fragment), DL (DNA linear), DC (DNA circular), RL (RNA linear), RC (RNA circular), N3 (tRNA), N1 (other functional RNA). |
Creates a new NBRF object. It stores the comment and sequence information from one entry of the NBRF/PIR format string. If the argument contains more than one entry, only the first entry is used.
    # File lib/bio/db/nbrf.rb, line 45
    def initialize(str)
      str = str.sub(/\A[\r\n]+/, '') # remove first void lines
      line1, line2, rest = str.split(/^/, 3)

      rest = rest.to_s
      rest.sub!(/^>.*/m, '') # remove trailing entries for sure
      @entry_overrun = $&
      rest.sub!(/\*\s*\z/, '') # remove last '*' and "\n"
      @data = rest

      @definition = line2.to_s.chomp
      if /^>?([A-Za-z0-9]{2})\;(.*)/ =~ line1.to_s then
        @seq_type = $1
        @entry_id = $2
      end
    end
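The parsing steps in the constructor can be tried without BioRuby installed. The following is a minimal standalone sketch that mirrors those steps; `parse_nbrf_entry` and the sample entry are hypothetical and not part of the library.

```ruby
# Hypothetical helper mirroring the steps of the constructor above.
# It is NOT part of BioRuby; it returns a plain Hash instead of an object.
def parse_nbrf_entry(str)
  str = str.sub(/\A[\r\n]+/, '')          # drop leading blank lines
  line1, line2, rest = str.split(/^/, 3)  # header line, definition line, remainder
  rest = rest.to_s.dup                    # dup: nil.to_s may return a frozen string
  overrun = rest.slice!(/^>.*/m)          # text from the next '>' on belongs to later entries
  rest.sub!(/\*\s*\z/, '')                # strip the terminating '*' and trailing whitespace
  result = { definition: line2.to_s.chomp, data: rest, entry_overrun: overrun }
  if (m = /^>?([A-Za-z0-9]{2});(.*)/.match(line1.to_s))
    result[:seq_type] = m[1]
    result[:entry_id] = m[2]
  end
  result
end

# Made-up two-entry input; only the first entry is parsed.
text = ">P1;TEST1\nTest protein\nMKV LLA\nGGH*\n>DL;TEST2\nNext entry\nACGT*\n"
p parse_nbrf_entry(text)
```

Everything from the second `>` onward ends up in `:entry_overrun`, which corresponds to the piece of the next entry that Bio::FlatFile uses.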
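Creates NBRF/PIR formatted text from a hash of parameters (:seq_type, :seq, :width, :entry_id, :definition). Any parameter can be omitted: when :seq_type is missing it is inferred from the class of :seq, and :width defaults to 70 (pass nil or false to disable line wrapping).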
    # File lib/bio/db/nbrf.rb, line 167
    def self.to_nbrf(hash)
      seq_type = hash[:seq_type]
      seq = hash[:seq]
      unless seq_type
        if seq.is_a?(Bio::Sequence::AA) then
          seq_type = 'P1'
        elsif seq.is_a?(Bio::Sequence::NA) then
          seq_type = /u/i =~ seq ? 'RL' : 'DL'
        else
          seq_type = 'XX'
        end
      end
      width = hash.has_key?(:width) ? hash[:width] : 70
      if width then
        seq = seq.to_s + "*"
        seq.gsub!(Regexp.new(".{1,#{width}}"), "\\0\n")
      else
        seq = seq.to_s + "*\n"
      end
      ">#{seq_type};#{hash[:entry_id]}\n#{hash[:definition]}\n#{seq}"
    end
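The line-wrapping rule is the least obvious part of to_nbrf, so here is a small standalone sketch of it using plain strings. `format_nbrf` is a hypothetical name, it does not infer the sequence type from a BioRuby class, and it assumes 'XX' when no type is given.

```ruby
# Hypothetical standalone formatter mirroring the wrapping logic above.
# Not part of BioRuby: the sequence type is passed in rather than inferred.
def format_nbrf(entry_id:, definition:, seq:, seq_type: 'XX', width: 70)
  body = seq.to_s + '*'                   # NBRF sequences end with '*'
  body = if width
           body.gsub(/.{1,#{width}}/, "\\0\n") # newline after every `width` characters
         else
           body + "\n"                    # no wrapping: one long line
         end
  ">#{seq_type};#{entry_id}\n#{definition}\n#{body}"
end

puts format_nbrf(entry_id: 'TEST1', definition: 'Test protein',
                 seq: 'MKVLLAGGH', seq_type: 'P1', width: 5)
```

Because the `*` terminator is appended before wrapping, it counts toward the line width and can land on a line of its own when the sequence length is an exact multiple of the width.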
Returns the protein (amino acid) sequence. Calling aaseq on a nucleic acid sequence raises RuntimeError. Use this method only when you already know that the entry holds a protein sequence.
    # File lib/bio/db/nbrf.rb, line 143
    def aaseq
      if seq.is_a?(Bio::Sequence::NA) then
        raise 'not nucleic but protein sequence'
      elsif seq.is_a?(Bio::Sequence::AA) then
        seq
      else
        Bio::Sequence::AA.new(seq)
      end
    end
Returns the nucleic acid sequence. Calling naseq on a protein sequence raises RuntimeError. Use this method only when you already know that the entry holds a nucleic acid sequence.
    # File lib/bio/db/nbrf.rb, line 122
    def naseq
      if seq.is_a?(Bio::Sequence::AA) then
        raise 'not nucleic but protein sequence'
      elsif seq.is_a?(Bio::Sequence::NA) then
        seq
      else
        Bio::Sequence::NA.new(seq)
      end
    end
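Both aaseq and naseq follow the same three-way pattern: refuse the opposite type, pass the matching type through, and convert anything untyped. A rough sketch of that pattern, using stand-in classes so it runs without BioRuby (`FakeAA`, `FakeNA`, `as_protein` and the error text are all hypothetical):

```ruby
# Stand-in classes for illustration only; the real BioRuby classes are
# Bio::Sequence::AA and Bio::Sequence::NA.
class FakeAA < String; end
class FakeNA < String; end

# Hypothetical helper showing the guard pattern used by aaseq.
def as_protein(seq)
  case seq
  when FakeNA then raise 'nucleic acid sequence requested as protein' # RuntimeError
  when FakeAA then seq                    # already the right type: return unchanged
  else FakeAA.new(seq)                    # untyped data: convert on request
  end
end

begin
  as_protein(FakeNA.new('acgt'))
rescue RuntimeError => e
  puts "rejected: #{e.message}"
end
```

Callers that cannot be sure of the entry type may prefer to check seq_type first rather than rescue the exception.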
Returns sequence data. Returns Bio::Sequence::NA, Bio::Sequence::AA or Bio::Sequence, according to the sequence type.
    # File lib/bio/db/nbrf.rb, line 107
    def seq
      unless defined?(@seq)
        @seq = seq_class.new(@data.tr(" \t\r\n0-9", '')) # lazy clean up
      end
      @seq
    end
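The "lazy clean up" in seq combines two ideas: stripping whitespace and digits with `String#tr`, and caching the result on first access. A minimal sketch of both, using a hypothetical `LazyEntry` class that returns a plain String rather than a Bio::Sequence object:

```ruby
# Hypothetical stand-in for an entry object; not BioRuby's Bio::NBRF.
class LazyEntry
  def initialize(raw)
    @raw = raw
  end

  # Strip spaces, tabs, line breaks and digits on first access,
  # then return the cached result on every later call.
  def seq
    @seq = @raw.tr(" \t\r\n0-9", '') unless defined?(@seq)
    @seq
  end
end

entry = LazyEntry.new("  1 MKV LLA\r\n 11 GGH\n")
puts entry.seq  # prints "MKVLLAGGH"
```

Deferring the cleanup means entries that are only scanned for their ID or definition never pay for processing the sequence body.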
Returns Bio::Sequence::AA, Bio::Sequence::NA, or Bio::Sequence, depending on sequence type.
    # File lib/bio/db/nbrf.rb, line 91
    def seq_class
      case @seq_type
      when /[PF]1/ # protein
        Sequence::AA
      when /[DR][LC]/, /N[13]/ # nucleic
        Sequence::NA
      else
        Sequence
      end
    end
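To see how the type codes map onto sequence classes, the same `case` patterns can be exercised in isolation. In this sketch `sequence_kind` is a hypothetical helper that returns symbols in place of the BioRuby classes:

```ruby
# Hypothetical helper: classify a two-letter NBRF sequence type code
# with the same patterns as seq_class, returning a symbol instead of a class.
def sequence_kind(seq_type)
  case seq_type
  when /[PF]1/             then :protein  # P1, F1
  when /[DR][LC]/, /N[13]/ then :nucleic  # DL, DC, RL, RC, N1, N3
  else                          :generic  # unknown or missing type code
  end
end

%w[P1 F1 DL RC N3 XX].each do |code|
  puts "#{code} -> #{sequence_kind(code)}"
end
```

An unrecognized or missing code falls through to the generic case, which matches the `else` branch returning plain `Sequence` above.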