Class Bio::NBRF
In: lib/bio/db/nbrf.rb  (CVS)
Parent: DB

Sequence data class for NBRF/PIR flatfile format.

Methods

aalen   aaseq   entry   length   nalen   naseq   new   seq   seq_class   to_nbrf   to_s  

Constants

DELIMITER = RS = "\n>"   Delimiter of each entry. Bio::FlatFile uses it.
DELIMITER_OVERRUN = 1   (Integer) excess read size included in DELIMITER.
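
The delimiter "\n>" ends with the ">" that opens the next entry, which is why one character of overrun is recorded. The following is a minimal sketch in plain Ruby of splitting a multi-entry string on that delimiter. It does not reproduce how Bio::FlatFile reads entries; split_nbrf_entries is a hypothetical helper introduced only for illustration.

```ruby
# Hypothetical helper, not part of BioRuby: splits a multi-entry string
# on the "\n>" delimiter and restores the characters the split consumed.
DELIMITER = "\n>"

def split_nbrf_entries(text)
  chunks = text.split(DELIMITER)
  chunks.each_with_index.map do |chunk, i|
    # Every chunk except the first lost its leading ">" to the delimiter.
    chunk = ">" + chunk unless i.zero?
    # Every chunk except the last lost its trailing newline to the delimiter.
    chunk += "\n" unless i == chunks.size - 1
    chunk
  end
end

text = ">P1;A\ndef A\nMKV*\n>P1;B\ndef B\nLLA*\n"
entries = split_nbrf_entries(text)
```

Each element of entries is then a complete single-entry string that starts with ">".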

External Aliases

entry_id -> accession

Attributes

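data  [RW]  Raw sequence data of the entry, as read from the input with the trailing '*' removed. Whitespace and digits are stripped only when seq is called.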
definition  [RW]  Returns the description line of the NBRF/PIR formatted data.
entry_id  [RW]  Returns ID described in the entry.
entry_overrun  [R]  Piece of the next entry that was read along with this one. Bio::FlatFile uses it.
seq_type  [RW]  Returns sequence type described in the entry.
 P1 (protein), F1 (protein fragment)
 DL (DNA linear), DC (DNA circular)
 RL (RNA linear), RC (RNA circular)
 N3 (tRNA), N1 (other functional RNA)

Public Class methods

Creates a new NBRF object. It stores the comment and sequence information from one entry of the NBRF/PIR format string. If the argument contains more than one entry, only the first entry is used.

[Source]

# File lib/bio/db/nbrf.rb, line 45
    def initialize(str)
      str = str.sub(/\A[\r\n]+/, '') # remove first void lines
      line1, line2, rest = str.split(/^/, 3)

      rest = rest.to_s
      rest.sub!(/^>.*/m, '') # remove trailing entries for sure
      @entry_overrun = $&
      rest.sub!(/\*\s*\z/, '') # remove last '*' and "\n"
      @data = rest

      @definition = line2.to_s.chomp
      if /^>?([A-Za-z0-9]{2})\;(.*)/ =~ line1.to_s then
        @seq_type = $1
        @entry_id = $2
      end
    end
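
The parsing steps in the source above can be tried without BioRuby installed. The following is a standalone sketch that mirrors the same regular expressions on plain strings; parse_nbrf_header is a hypothetical name, and the entry ID and definition in the sample input are made up.

```ruby
# Standalone sketch (no BioRuby required) of the parsing shown above.
# parse_nbrf_header is a hypothetical name used only for this illustration.
def parse_nbrf_header(str)
  str = str.sub(/\A[\r\n]+/, '')          # drop leading blank lines
  line1, line2, rest = str.split(/^/, 3)  # header, definition, remainder
  rest = rest.to_s.sub(/^>.*/m, '')       # discard any following entries
  rest = rest.sub(/\*\s*\z/, '')          # strip the terminating '*'
  m = /^>?([A-Za-z0-9]{2});(.*)/.match(line1.to_s)
  {
    seq_type:   m && m[1],
    entry_id:   m && m[2],
    definition: line2.to_s.chomp,
    data:       rest
  }
end

# Made-up two-entry input; only the first entry is kept.
input = "\n>P1;EXAMPLE1\nhypothetical protein\nMKVLAAGIVG LLAKRT\n*\n" \
        ">P1;EXAMPLE2\nanother protein\nAAA*\n"
entry = parse_nbrf_header(input)
```

Note that the data field still contains the spaces and newlines of the input; they are removed later, when the sequence is requested.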

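Creates NBRF/PIR formatted text from a hash. The recognized keys are :seq_type, :seq, :entry_id, :definition, and :width; any of them can be omitted. If :seq_type is omitted, it is inferred from the class of :seq: P1 for Bio::Sequence::AA, RL or DL for Bio::Sequence::NA (RL if the sequence contains 'u'), and XX otherwise. Sequence lines are wrapped at :width characters (default 70); pass nil as :width to disable wrapping.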

[Source]

# File lib/bio/db/nbrf.rb, line 167
    def self.to_nbrf(hash)
      seq_type = hash[:seq_type]
      seq = hash[:seq]
      unless seq_type
        if seq.is_a?(Bio::Sequence::AA) then
          seq_type = 'P1'
        elsif seq.is_a?(Bio::Sequence::NA) then
          seq_type = /u/i =~ seq ? 'RL' : 'DL'
        else
          seq_type = 'XX'
        end
      end
      width = hash.has_key?(:width) ? hash[:width] : 70
      if width then
        seq = seq.to_s + "*"
        seq.gsub!(Regexp.new(".{1,#{width}}"), "\\0\n")
      else
        seq = seq.to_s + "*\n"
      end
      ">#{seq_type};#{hash[:entry_id]}\n#{hash[:definition]}\n#{seq}"
    end
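
To illustrate the wrapping behaviour, here is a minimal sketch that applies the same gsub-based line breaking to plain strings. format_nbrf is a hypothetical stand-in for Bio::NBRF.to_nbrf, it skips the sequence type inference, and the sample values are made up.

```ruby
# Standalone sketch of the wrapping step; format_nbrf is a hypothetical
# stand-in for Bio::NBRF.to_nbrf that works on plain strings.
def format_nbrf(seq_type:, entry_id:, definition:, seq:, width: 70)
  body = seq.to_s + "*"
  if width
    # Break the sequence (including the terminating '*') into lines
    # of at most `width` characters, each followed by a newline.
    body = body.gsub(/.{1,#{width}}/, "\\0\n")
  else
    # No wrapping: emit the sequence on a single line.
    body += "\n"
  end
  ">#{seq_type};#{entry_id}\n#{definition}\n#{body}"
end

wrapped = format_nbrf(seq_type: "P1", entry_id: "EXAMPLE1",
                      definition: "hypothetical protein",
                      seq: "MKVLAAGIVG", width: 4)
unwrapped = format_nbrf(seq_type: "P1", entry_id: "EXAMPLE1",
                        definition: "hypothetical protein",
                        seq: "MKVLAAGIVG", width: nil)
```

Because the '*' terminator is appended before wrapping, it counts toward the line width and may end up on a line of its own.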

Public Instance methods

Returns the length of the protein (amino acid) sequence. If you call aalen on a nucleic acid sequence, a RuntimeError is raised. Use this method only when you already know that the sequence is a protein.

[Source]

# File lib/bio/db/nbrf.rb, line 157
    def aalen
      aaseq.length
    end

Returns the protein (amino acid) sequence. If you call aaseq on a nucleic acid sequence, a RuntimeError is raised. Use this method only when you already know that the sequence is a protein.

[Source]

# File lib/bio/db/nbrf.rb, line 143
    def aaseq
      if seq.is_a?(Bio::Sequence::NA) then
        raise 'not nucleic but protein sequence'
      elsif seq.is_a?(Bio::Sequence::AA) then
        seq
      else
        Bio::Sequence::AA.new(seq)
      end
    end

Returns the stored entry as NBRF/PIR formatted text (same as to_s).

[Source]

# File lib/bio/db/nbrf.rb, line 84
    def entry
      @entry = ">#{@seq_type or 'XX'};#{@entry_id}\n#{definition}\n#{@data}*\n"
    end

Returns sequence length.

[Source]

# File lib/bio/db/nbrf.rb, line 115
    def length
      seq.length
    end

Returns the length of the nucleic acid sequence. If you call nalen on a protein sequence, a RuntimeError is raised. Use this method only when you already know that the sequence is a nucleic acid.

[Source]

# File lib/bio/db/nbrf.rb, line 135
    def nalen
      naseq.length
    end

Returns the nucleic acid sequence. If you call naseq on a protein sequence, a RuntimeError is raised. Use this method only when you already know that the sequence is a nucleic acid.

[Source]

# File lib/bio/db/nbrf.rb, line 122
    def naseq
      if seq.is_a?(Bio::Sequence::AA) then
        raise 'not nucleic but protein sequence'
      elsif seq.is_a?(Bio::Sequence::NA) then
        seq
      else
        Bio::Sequence::NA.new(seq)
      end
    end

Returns sequence data. Returns Bio::Sequence::NA, Bio::Sequence::AA or Bio::Sequence, according to the sequence type.

[Source]

# File lib/bio/db/nbrf.rb, line 107
    def seq
      unless defined?(@seq)
        @seq = seq_class.new(@data.tr(" \t\r\n0-9", '')) # lazy clean up
      end
      @seq
    end
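
The lazy clean-up relies on String#tr deleting the listed characters when the replacement string is empty, and on defined? to compute the result only once. The following is a small sketch of both ideas in plain Ruby; LazySeq is a hypothetical class that returns a String rather than a Bio::Sequence object.

```ruby
# Hypothetical class illustrating the clean-up and memoization in seq.
class LazySeq
  def initialize(data)
    @data = data
  end

  def seq
    # With an empty replacement, tr removes spaces, tabs, line breaks
    # and digits. The result is cached in @seq on the first call.
    @seq = @data.tr(" \t\r\n0-9", '') unless defined?(@seq)
    @seq
  end
end

record = LazySeq.new("MKVLAAGIVG 10\r\nLLAKRT\t16\n")
clean = record.seq
```

A second call to record.seq returns the same cached object instead of cleaning the raw data again.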

Returns Bio::Sequence::AA, Bio::Sequence::NA, or Bio::Sequence, depending on sequence type.

[Source]

# File lib/bio/db/nbrf.rb, line 91
    def seq_class
      case @seq_type
      when /[PF]1/
        # protein
        Sequence::AA
      when /[DR][LC]/, /N[13]/
        # nucleic
        Sequence::NA
      else
        Sequence
      end
    end
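
The dispatch on the two-letter type code can be checked in isolation. This sketch uses the same patterns as the source above but returns symbols instead of BioRuby classes; sequence_kind is a hypothetical name.

```ruby
# Hypothetical helper mirroring the case expression in seq_class,
# returning a symbol in place of a Bio::Sequence class.
def sequence_kind(seq_type)
  case seq_type
  when /[PF]1/             then :protein  # P1, F1
  when /[DR][LC]/, /N[13]/ then :nucleic  # DL, DC, RL, RC, N1, N3
  else                          :generic  # unknown or missing type
  end
end

kinds = %w[P1 F1 DL RC N3 XX].map { |code| sequence_kind(code) }
```

An unknown code such as XX, or a nil type, falls through to the generic branch.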
to_s()

Alias for entry
