| Class | Bio::Locations |
| In: | lib/bio/location.rb |
| Parent: | Object |
The Bio::Locations class is a container for Bio::Location objects: creating a Bio::Locations object (based on a GenBank style position string) will spawn an array of Bio::Location objects.
locations = Bio::Locations.new('join(complement(500..550), 600..625)')
locations.each do |loc|
  puts "class = " + loc.class.to_s
  puts "range = #{loc.from}..#{loc.to} (strand = #{loc.strand})"
end
# Output would be:
# class = Bio::Location
# range = 500..550 (strand = -1)
# class = Bio::Location
# range = 600..625 (strand = 1)
# For the following three location strings, print the span and range
['one-of(898,900)..983',
 'one-of(5971..6308,5971..6309)',
 '8050..one-of(10731,10758,10905,11242)'].each do |loc|
  location = Bio::Locations.new(loc)
  puts location.span
  puts location.range
end
The GenBank manual 'gbrel.txt' classifies position notations into 10 patterns, labeled (A) to (J) in the excerpt below.
3.4.12.2 Feature Location
The second column of the feature descriptor line designates the
location of the feature in the sequence. The location descriptor
begins at position 22. Several conventions are used to indicate
sequence location.
Base numbers in location descriptors refer to numbering in the entry,
which is not necessarily the same as the numbering scheme used in the
published report. The first base in the presented sequence is numbered
base 1. Sequences are presented in the 5' to 3' direction.
Location descriptors can be one of the following:
(A) 1. A single base;
(B) 2. A contiguous span of bases;
(C) 3. A site between two bases;
(D) 4. A single base chosen from a range of bases;
(E) 5. A single base chosen from among two or more specified bases;
(F) 6. A joining of sequence spans;
(G) 7. A reference to an entry other than the one to which the feature
belongs (i.e., a remote entry), followed by a location descriptor
referring to the remote sequence;
(H) 8. A literal sequence (a string of bases enclosed in quotation marks).
(C) A site between two residues, such as an endonuclease cleavage site, is
indicated by listing the two bases separated by a carat (e.g., 23^24).
(D) A single residue chosen from a range of residues is indicated by the
number of the first and last bases in the range separated by a single
period (e.g., 23.79). The symbols < and > indicate that the end point
(I) of the range is beyond the specified base number.
(B) A contiguous span of bases is indicated by the number of the first and
last bases in the range separated by two periods (e.g., 23..79). The
(I) symbols < and > indicate that the end point of the range is beyond the
specified base number. Starting and ending positions can be indicated
by base number or by one of the operators described below.
Operators are prefixes that specify what must be done to the indicated
sequence to locate the feature. The following are the operators
available, along with their most common format and a description.
(J) complement (location): The feature is complementary to the location
indicated. Complementary strands are read 5' to 3'.
(F) join (location, location, .. location): The indicated elements should
be placed end to end to form one contiguous sequence.
(F) order (location, location, .. location): The elements are found in the
specified order in the 5' to 3' direction, but nothing is implied about
the rationality of joining them.
(F) group (location, location, .. location): The elements are related and
should be grouped together, but no order is implied.
(E) one-of (location, location, .. location): The element can be any one,
but only one, of the items listed.
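The excerpt above can be made concrete with a small classifier. This is a minimal standalone sketch, not BioRuby's parser: the method name `classify_location` and its return strings are hypothetical, and it only recognizes the simplest forms (it does not handle operators nested inside a span, such as 'one-of(898,900)..983', or remote entry references).

```ruby
# Hypothetical helper for illustration only; not part of BioRuby.
# Classifies a simple GenBank location descriptor by its surface syntax.
def classify_location(str)
  s = str.delete(' ')
  case s
  when /\A(complement|join|order|group|one-of)\(.*\)\z/
    "operator: #{$1}"                 # (J), (F) or (E), depending on the operator
  when /\A[<>]?\d+\.\.[<>]?\d+\z/
    'contiguous span'                 # (B), optionally with < or > end points (I)
  when /\A\d+\^\d+\z/
    'site between two bases'          # (C)
  when /\A\d+\.\d+\z/
    'single base from a range'        # (D)
  when /\A\d+\z/
    'single base'                     # (A)
  else
    'unrecognized'
  end
end

puts classify_location('23..79')
puts classify_location('23^24')
puts classify_location('complement(500..550)')
```

The span pattern is tested before the single-period pattern so that '23..79' is never read as '23.79'.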
| locations | [RW] | An Array of Bio::Location objects |
Parses a GenBank style position string and returns a Bio::Locations object, which contains a list of Bio::Location objects.
locations = Bio::Locations.new('join(complement(500..550), 600..625)')
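Arguments: position, a GenBank style position string (an Array of Bio::Location objects is also accepted and stored as-is)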
| Returns: | Bio::Locations object |
# File lib/bio/location.rb, line 295
def initialize(position)
  if position.is_a? Array
    @locations = position
  else
    position = gbl_cleanup(position)    # preprocessing
    @locations = gbl_pos2loc(position)  # create an Array of Bio::Location objects
  end
end
Returns nth Bio::Location object.
# File lib/bio/location.rb, line 327
def [](n)
  @locations[n]
end
Converts relative position in the locus to position in the whole of the DNA sequence.
This method can, for example, be used to relate positions in an RNA to those in the DNA sequence it derives from. With the optional :aa flag, n is interpreted as an amino-acid position, and the method returns the absolute position of the first nucleotide of the corresponding codon.
loc = Bio::Locations.new('complement(12838..13533)')
puts loc.absolute(10) # => 13524
puts loc.absolute(10, :aa) # => 13506
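Arguments: n, the relative position within the locus; type (optional), :aa to interpret n as an amino-acid position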
| Returns: | position within the whole of the sequence |
# File lib/bio/location.rb, line 417
def absolute(n, type = nil)
  case type
  when :location
    ;
  when :aa
    n = (n - 1) * 3 + 1
    rel2abs(n)
  else
    rel2abs(n)
  end
end
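The listing delegates to rel2abs, whose source is not shown here. The following standalone sketch shows one way such a conversion could work, assuming the relative position is counted span by span and that a complement span is read from its high end. `Span` and `to_absolute` are hypothetical names, not BioRuby API.

```ruby
# Standalone illustration; Span and to_absolute are hypothetical names.
Span = Struct.new(:from, :to, :strand)

# Maps a 1-based relative position n onto absolute sequence coordinates.
# With type == :aa, n is an amino-acid position and is first converted
# to the relative position of the first base of its codon.
def to_absolute(spans, n, type = nil)
  n = (n - 1) * 3 + 1 if type == :aa
  return nil unless n > 0
  cursor = 0                            # bases consumed by earlier spans
  spans.each do |s|
    len = s.to - s.from + 1
    if n <= cursor + len
      offset = n - cursor - 1           # 0-based offset inside this span
      return s.strand < 0 ? s.to - offset : s.from + offset
    end
    cursor += len
  end
  nil                                   # n lies beyond the last span
end

cds = [Span.new(12838, 13533, -1)]      # complement(12838..13533)
puts to_absolute(cds, 10)               # 13524
puts to_absolute(cds, 10, :aa)          # 13506
```

Both results agree with the documented example for 'complement(12838..13533)'.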
Iterates on each Bio::Location object.
# File lib/bio/location.rb, line 320
def each
  @locations.each do |x|
    yield(x)
  end
end
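Evaluates equality with another Bio::Locations object. The comparison is made on the sorted locations, so the order of the contained Bio::Location objects does not matter. Returns nil if other is not a Bio::Locations object.
# File lib/bio/location.rb, line 308
def equals?(other)
  if ! other.kind_of?(Bio::Locations)
    return nil
  end
  if self.sort == other.sort
    return true
  else
    return false
  end
end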
Returns first Bio::Location object.
# File lib/bio/location.rb, line 332
def first
  @locations.first
end
Returns last Bio::Location object.
# File lib/bio/location.rb, line 337
def last
  @locations.last
end
Converts absolute position in the whole of the DNA sequence to relative position in the locus.
This method can for example be used to relate positions in a DNA-sequence with those in RNA. In this use, the optional ’:aa’-flag returns the position of the associated amino-acid rather than the nucleotide.
loc = Bio::Locations.new('complement(12838..13533)')
puts loc.relative(13524) # => 10
puts loc.relative(13506, :aa) # => 10
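Arguments: n, the absolute position in the whole sequence; type (optional), :aa to return the amino-acid position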
| Returns: | position within the location |
# File lib/bio/location.rb, line 385
def relative(n, type = nil)
  case type
  when :location
    ;
  when :aa
    if n = abs2rel(n)
      (n - 1) / 3 + 1
    else
      nil
    end
  else
    abs2rel(n)
  end
end
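The arithmetic can be checked by hand for a single span. This is a minimal sketch under the assumption that a complement span is counted from its high end; `to_relative` and its keyword arguments are hypothetical, and joined locations are deliberately not handled.

```ruby
# Standalone illustration for one span only; to_relative is a hypothetical name.
# Returns the 1-based position of absolute coordinate n inside from..to,
# or nil when n lies outside the span. With type: :aa the nucleotide
# position is converted to the position of the codon that contains it.
def to_relative(n, from:, to:, strand: 1, type: nil)
  return nil unless n.between?(from, to)
  rel = strand < 0 ? to - n + 1 : n - from + 1
  type == :aa ? (rel - 1) / 3 + 1 : rel
end

# complement(12838..13533)
puts to_relative(13524, from: 12838, to: 13533, strand: -1)             # 10
puts to_relative(13506, from: 12838, to: 13533, strand: -1, type: :aa)  # 10
```

Position 13506 is the 28th base of the complement strand, which falls in the tenth codon: (28 - 1) / 3 + 1 = 10 with integer division.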
Returns an Array containing overall min and max position [min, max] of this Bio::Locations object.
# File lib/bio/location.rb, line 343
def span
  span_min = @locations.min { |a,b| a.from <=> b.from }
  span_max = @locations.max { |a,b| a.to <=> b.to }
  return span_min.from, span_max.to
end
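The same min/max idea can be tried without BioRuby by modelling each location as a plain Ruby Range. This is only a sketch; `overall_span` is a hypothetical helper name.

```ruby
# Standalone illustration; overall_span is a hypothetical helper.
# Takes an Array of integer Ranges and returns [smallest start, largest end].
def overall_span(ranges)
  [ranges.map(&:begin).min, ranges.map(&:end).max]
end

p overall_span([500..550, 600..625])    # [500, 625]
p overall_span([5971..6309, 5971..6308]) # [5971, 6309]
```

As in the listing above, the minimum and maximum are found independently, so the input does not need to be sorted.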