KEGG_API.rd

Path: doc/KEGG_API.rd  (CVS)
Last Update: Wed Dec 27 22:40:45 +0900 2006

  $Id: KEGG_API.rd,v 1.5 2006/12/27 13:40:45 k Exp $

    Copyright (C) 2003-2006 Toshiaki Katayama <k@bioruby.org>

KEGG API

KEGG API is a web service to use the KEGG system from your program via SOAP/WSDL.

We have been making the ((<KEGG|URL:/kegg/>)) system available at ((<GenomeNet|URL:/>)). KEGG is a suite of databases including GENES, SSDB, PATHWAY, LIGAND, LinkDB, etc. for genome research and related research areas in molecular and cellular biology. These databases and the associated computation services are available via the WWW, and their user interfaces are built for web browsers. The interfaces are therefore designed to be accessed by humans, not by machines, which makes it troublesome for researchers who want to use KEGG in an automated manner. Besides, from the database developer's side, it is impossible to prepare CGI programs that satisfy the whole variety of users' needs.

In recent years, the Internet technology for application-to-application communication referred to as the ((<web service|URL:www.oreillynet.com/lpt/a/webservices/2002/02/12/webservicefaqs.html>)) has been improving at a rapid rate. For example, Google, a popular Internet search engine, provides a web service called the ((<Google Web API|URL:www.google.com/apis/>)). The service enables users to develop software that accesses and manipulates a massive amount of web documents that are constantly refreshed. In the field of genome research, a similar kind of web service called ((<DAS|URL:www.biodas.org/>)) (distributed annotation system) has been used on several web sites, including ((<Ensembl|URL:www.ensembl.org/>)), ((<Wormbase|URL:www.wormbase.org/>)), ((<Flybase|URL:www.flybase.org/>)), ((<SGD|URL:www.yeastgenome.org/>)), and ((<TIGR|URL:www.tigr.org/>)).

Against the background and trends noted above, we have started developing a new web service called KEGG API using ((<SOAP|URL:www.w3.org/TR/SOAP/>)) and ((<WSDL|URL:www.w3.org/TR/wsdl20/>)). The service has been tested with ((<Ruby|URL:www.ruby-lang.org/>)) (Ruby 1.8.2 or Ruby 1.6.8 with ((<SOAP4R|URL:raa.ruby-lang.org/project/soap4r/>)) version 1.4.8.1) and ((<Perl|URL:www.perl.org/>)) (((<SOAP::Lite|URL:www.soaplite.com/>)) version 0.55). Although the service has not been tested with clients written in other languages, it should work with any language that can handle SOAP/WSDL.

The ((<BioRuby|URL:bioruby.org/>)) project prepared a Ruby library to handle the KEGG API, so users of the Ruby language should check out the latest release of the BioRuby distribution.

For general information on KEGG API, see the following page at GenomeNet:

  * ((<URL:http://www.genome.jp/kegg/soap/>))

Table of contents

  • ((<Introduction>))
  • ((<KEGG API Quick Start>))
    • ((<Quick Start with Perl>))
      • ((<Perl FAQ>))
    • ((<Quick Start with Ruby>))
    • ((<Quick Start with Python>))
    • ((<Quick Start with Java>))
  • ((<KEGG API Reference>))
    • ((<WSDL file>))
    • ((<Terminology>))
    • ((<Returned values>))
      • ((<SSDBRelation>)), ((<ArrayOfSSDBRelation>))
      • ((<MotifResult>)), ((<ArrayOfMotifResult>))
      • ((<Definition>)), ((<ArrayOfDefinition>))
      • ((<LinkDBRelation>)), ((<ArrayOfLinkDBRelation>))
      • ((<PathwayElement>)), ((<ArrayOfPathwayElement>))
      • ((<PathwayElementRelation>)), ((<ArrayOfPathwayElementRelation>))
        • ((<Subtype>)), ((<ArrayOfSubtype>))
      • ((<StructureAlignment>)), ((<ArrayOfStructureAlignment>))
    • ((<Methods>))
      • ((<Meta information>))
        • ((<list_databases>))
        • ((<list_organisms>))
        • ((<list_pathways>))
      • ((<DBGET>))
        • ((<binfo>))
        • ((<bfind>))
        • ((<bget>))
        • ((<btit>))
        • ((<bconv>))
      • ((<LinkDB>))
        • ((<Database cross references>))
          • ((<get_linkdb_by_entry>))
          • ((<get_linkdb_between_databases>))
        • ((<Relation among genes and enzymes>))
          • ((<get_genes_by_enzyme>))
          • ((<get_enzymes_by_gene>))
        • ((<Relation among enzymes, compounds and reactions>))
          • ((<get_enzymes_by_compound>))
          • ((<get_enzymes_by_glycan>))
          • ((<get_enzymes_by_reaction>))
          • ((<get_compounds_by_enzyme>))
          • ((<get_compounds_by_reaction>))
          • ((<get_glycans_by_enzyme>))
          • ((<get_glycans_by_reaction>))
          • ((<get_reactions_by_enzyme>))
          • ((<get_reactions_by_compound>))
          • ((<get_reactions_by_glycan>))
      • ((<SSDB>))
        • ((<get_best_best_neighbors_by_gene>))
        • ((<get_best_neighbors_by_gene>))
        • ((<get_reverse_best_neighbors_by_gene>))
        • ((<get_paralogs_by_gene>))
      • ((<Motif>))
        • ((<get_motifs_by_gene>))
        • ((<get_genes_by_motifs>))
      • ((<KO>))
        • ((<get_ko_by_gene>))
        • ((<get_ko_by_ko_class>))
        • ((<get_genes_by_ko_class>))
        • ((<get_genes_by_ko>))
      • ((<PATHWAY>))
        • ((<Coloring pathways>))
          • ((<mark_pathway_by_objects>))
          • ((<color_pathway_by_objects>))
          • ((<color_pathway_by_elements>))
          • ((<get_html_of_marked_pathway_by_objects>))
          • ((<get_html_of_colored_pathway_by_objects>))
          • ((<get_html_of_colored_pathway_by_elements>))
        • ((<Relations of objects on the pathway>))
          • ((<get_element_relations_by_pathway>))
        • ((<Objects on the pathway>))
          • ((<get_elements_by_pathway>))
          • ((<get_genes_by_pathway>))
          • ((<get_enzymes_by_pathway>))
          • ((<get_compounds_by_pathway>))
          • ((<get_glycans_by_pathway>))
          • ((<get_reactions_by_pathway>))
          • ((<get_kos_by_pathway>))
        • ((<Pathways by objects>))
          • ((<get_pathways_by_genes>))
          • ((<get_pathways_by_enzymes>))
          • ((<get_pathways_by_compounds>))
          • ((<get_pathways_by_glycans>))
          • ((<get_pathways_by_reactions>))
          • ((<get_pathways_by_kos>))
        • ((<Relation among pathways>))
          • ((<get_linked_pathways>))
      • ((<GENES>))
        • ((<get_genes_by_organism>))
      • ((<GENOME>))
        • ((<get_number_of_genes_by_organism>))
      • ((<LIGAND>))
        • ((<convert_mol_to_kcf>))
        • ((<search_compounds_by_name>))
        • ((<search_drugs_by_name>))
        • ((<search_glycans_by_name>))
        • ((<search_compounds_by_composition>))
        • ((<search_drugs_by_composition>))
        • ((<search_glycans_by_composition>))
        • ((<search_compounds_by_mass>))
        • ((<search_drugs_by_mass>))
        • ((<search_glycans_by_mass>))
        • ((<search_compounds_by_subcomp>))
        • ((<search_drugs_by_subcomp>))
        • ((<search_glycans_by_kcam>))

Introduction

This guide explains how to use the KEGG API in your programs for searching and retrieving data from the KEGG database.

KEGG API Quick Start

As always, the best way to become familiar with it is by looking at an example. This document shows sample code written in several languages. After understanding the first example, try the other APIs.

First, you have to install the SOAP-related libraries for the programming language of your choice.

Quick Start with Perl

In the case of Perl, you need to install the following packages:

  * ((<SOAP Lite|URL:http://www.soaplite.com/>)) (tested with 0.60)
    * Note: SOAP Lite > 0.60 is reported to have errors in some methods for now.
  * ((<MIME-Base64|URL:http://search.cpan.org/author/GAAS/MIME-Base64/>))
  * ((<LWP|URL:http://search.cpan.org/author/GAAS/libwww-perl/>))
  * ((<URI|URL:http://search.cpan.org/author/GAAS/URI/>))

Here's a first example in Perl.

  #!/usr/bin/env perl

  use SOAP::Lite;

  $wsdl = 'http://soap.genome.jp/KEGG.wsdl';

  $serv = SOAP::Lite->service($wsdl);

  $offset = 1;
  $limit = 5;

  $top5 = $serv->get_best_neighbors_by_gene('eco:b0002', $offset, $limit);

  foreach $hit (@{$top5}) {
    print "$hit->{genes_id1}\t$hit->{genes_id2}\t$hit->{sw_score}\n";
  }

The output will be

  eco:b0002       eco:b0002       5283
  eco:b0002       ecj:JW0001      5283
  eco:b0002       sfx:S0002       5271
  eco:b0002       sfl:SF0002      5271
  eco:b0002       ecc:c0003       5269

showing that eco:b0002 has a Smith-Waterman score of 5271 with sfl:SF0002 as the 4th hit in the entire KEGG/GENES database (here, "eco" means E. coli K-12 MG1655 and "sfl" means Shigella flexneri 2457T in the KEGG organism codes).

The method internally searches the KEGG/SSDB (Sequence Similarity Database) database, which contains information about the amino acid sequence similarities among all protein coding genes in the complete genomes, together with information about best hits and bidirectional best hits (best-best hits). Gene x in genome A and gene y in genome B are called bidirectional best hits when x is the best hit of query y against all genes in A and vice versa; this relation is often used as an operational definition of orthologs.
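The definition of bidirectional best hits can be made concrete with a small sketch. This is only an illustration in Python 3, not how SSDB is implemented: the function names, the nested-dictionary layout, and the gene names and scores are all hypothetical.

```python
def best_hit(scores, query):
    """Return the target with the highest score for `query`, or None."""
    targets = scores.get(query, {})
    if not targets:
        return None
    return max(targets, key=targets.get)


def is_bidirectional_best_hit(a_vs_b, b_vs_a, x, y):
    """True when y is the best hit of x in genome B and x is the best hit of y in genome A."""
    return best_hit(a_vs_b, x) == y and best_hit(b_vs_a, y) == x


# Hypothetical similarity scores (not real KEGG data).
# a_vs_b[x][y] is the score of query x (genome A) against target y (genome B).
a_vs_b = {"A:x": {"B:y": 900, "B:z": 400}, "A:w": {"B:y": 950}}
b_vs_a = {"B:y": {"A:x": 900, "A:w": 950}, "B:z": {"A:x": 400}}

print(is_bidirectional_best_hit(a_vs_b, b_vs_a, "A:w", "B:y"))  # True
print(is_bidirectional_best_hit(a_vs_b, b_vs_a, "A:x", "B:y"))  # False
```

In the second call, B:y is the best hit of A:x, but the best hit of B:y is A:w, so the pair is a best hit in one direction only.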

The next example simply lists the pathways for E. coli ("eco") in the KEGG database.

  #!/usr/bin/env perl

  use SOAP::Lite;

  $wsdl = 'http://soap.genome.jp/KEGG.wsdl';

  $results = SOAP::Lite
               -> service($wsdl)
               -> list_pathways("eco");

  foreach $path (@{$results}) {
    print "$path->{entry_id}\t$path->{definition}\n";
  }

This example colors the boxes corresponding to the E. coli genes b1002 and b2388 on the glycolysis pathway of E. coli (path:eco00010).

  #!/usr/bin/env perl

  use SOAP::Lite;

  $wsdl = 'http://soap.genome.jp/KEGG.wsdl';

  $serv = SOAP::Lite -> service($wsdl);

  $genes = SOAP::Data->type(array => ["eco:b1002", "eco:b2388"]);

  $result = $serv -> mark_pathway_by_objects("path:eco00010", $genes);

  print $result;        # URL of the generated image

Perl FAQ

If you use KEGG API methods that require arguments of the ArrayOfstring data type, you need the following modifications, depending on your version of SOAP::Lite.

SOAP::Lite version <= 0.60

As you see in the above example, you always need to convert a Perl array into a SOAP object explicitly in SOAP::Lite by

  SOAP::Data->type(array => [value1, value2, .. ])

when you pass an array as the argument for any KEGG API method.

SOAP::Lite version > 0.60

You should use version 0.69 or later, as versions 0.61 through 0.68 contain bugs.

You need to add the following code to your program to pass an array of string and/or int data to the SOAP server.

  sub SOAP::Serializer::as_ArrayOfstring{
    my ($self, $value, $name, $type, $attr) = @_;
    return [$name, {'xsi:type' => 'array', %$attr}, $value];
  }

  sub SOAP::Serializer::as_ArrayOfint{
    my ($self, $value, $name, $type, $attr) = @_;
    return [$name, {'xsi:type' => 'array', %$attr}, $value];
  }

By adding the above, you can write

  $genes = ["eco:b1002", "eco:b2388"];

instead of the following (writing as follows is also permitted).

  $genes = SOAP::Data->type(array => ["eco:b1002", "eco:b2388"]);

Sample program

You can test with the following script for SOAP::Lite v0.69. If it works, the URL of the generated image will be returned.

  #!/usr/bin/env perl

  use SOAP::Lite +trace => [qw(debug)];

  print "SOAP::Lite = ", $SOAP::Lite::VERSION, "\n";

  my $serv = SOAP::Lite -> service("http://soap.genome.jp/KEGG.wsdl");

  my $genes = ["eco:b1002", "eco:b2388"];

  my $result = $serv->mark_pathway_by_objects("path:eco00010", $genes);
  print $result, "\n";

  # subroutines implicitly used in the above code

  sub SOAP::Serializer::as_ArrayOfstring{
    my ($self, $value, $name, $type, $attr) = @_;
    return [$name, {'xsi:type' => 'array', %$attr}, $value];
  }

  sub SOAP::Serializer::as_ArrayOfint{
    my ($self, $value, $name, $type, $attr) = @_;
    return [$name, {'xsi:type' => 'array', %$attr}, $value];
  }

Quick Start with Ruby

If you are using Ruby 1.8.1 or later, you are ready to use KEGG API as Ruby already supports SOAP in its standard library.

If your Ruby is 1.6.8 or older, you need to install the following:

  * ((<SOAP4R|URL:http://raa.ruby-lang.org/list.rhtml?name=soap4r>)) 1.5.1 or later
  * One of the following XML processing libraries
    * ((<rexml|URL:http://raa.ruby-lang.org/list.rhtml?name=rexml>))
    * ((<xmlparser|URL:http://raa.ruby-lang.org/list.rhtml?name=xmlparser>))
    * ((<xmlscan|URL:http://raa.ruby-lang.org/list.rhtml?name=xmlscan>))
  * ((<date2|URL:http://raa.ruby-lang.org/list.rhtml?name=date2>))
  * ((<devel-logger|URL:http://raa.ruby-lang.org/list.rhtml?name=devel-logger>))
  * ((<uconv|URL:http://raa.ruby-lang.org/list.rhtml?name=uconv>))
  * ((<http-access2|URL:http://raa.ruby-lang.org/list.rhtml?name=http-access2>))

Here's sample code for Ruby with the same functionality as the first Perl example shown above.

  #!/usr/bin/env ruby

  require 'soap/wsdlDriver'

  wsdl = "http://soap.genome.jp/KEGG.wsdl"
  serv = SOAP::WSDLDriverFactory.new(wsdl).create_rpc_driver
  serv.generate_explicit_type = true
  # if uncommented, you can see transactions for debug
  #serv.wiredump_dev = STDERR

  offset = 1
  limit = 5

  top5 = serv.get_best_neighbors_by_gene('eco:b0002', offset, limit)
  top5.each do |hit|
    print hit.genes_id1, "\t", hit.genes_id2, "\t", hit.sw_score, "\n"
  end

You may need to iterate, increasing offset and/or limit, to obtain all the results.

  #!/usr/bin/env ruby

  require 'soap/wsdlDriver'

  wsdl = "http://soap.genome.jp/KEGG.wsdl"
  serv = SOAP::WSDLDriverFactory.new(wsdl).create_rpc_driver
  serv.generate_explicit_type = true

  offset = 1
  limit = 100

  loop do
    results = serv.get_best_neighbors_by_gene('eco:b0002', offset, limit)
    # an empty result is an empty array, which is still truthy in Ruby
    break if results.nil? || results.empty?
    results.each do |hit|
      print hit.genes_id1, "\t", hit.genes_id2, "\t", hit.sw_score, "\n"
    end
    offset += limit
  end

This iteration is done automatically by the ((<BioRuby|URL:bioruby.org/>)) library, which implements get_all_* methods for this purpose. BioRuby also provides filtering functionality for selecting the needed fields from the complex data types.

  #!/usr/bin/env ruby

  require 'bio'

  serv = Bio::KEGG::API.new

  results = serv.get_all_best_neighbors_by_gene('eco:b0002')

  results.each do |hit|
    print hit.genes_id1, "\t", hit.genes_id2, "\t", hit.sw_score, "\n"
  end

  # Same as above but using filter to select fields
  fields = [:genes_id1, :genes_id2, :sw_score]
  results.each do |hit|
    puts hit.filter(fields).join("\t")
  end

  # Different filters to pick additional fields for each amino acid sequence
  fields1 = [:genes_id1, :start_position1, :end_position1, :best_flag_1to2]
  fields2 = [:genes_id2, :start_position2, :end_position2, :best_flag_2to1]
  results.each do |hit|
    print "> score: ", hit.sw_score, ", identity: ", hit.identity, "\n"
    print "1:\t", hit.filter(fields1).join("\t"), "\n"
    print "2:\t", hit.filter(fields2).join("\t"), "\n"
  end
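The paging that a get_all_* style method hides can be pictured with the following Python 3 sketch. It is not BioRuby's implementation: fetch_all, fetch_page and the fake data are hypothetical stand-ins, and the sketch assumes a 1-based offset (as in the examples above) and that an empty page marks the end of the results.

```python
def fetch_all(fetch_page, limit=100):
    """Collect every result from a paged method.

    `fetch_page(offset, limit)` stands in for a KEGG API call such as
    get_best_neighbors_by_gene with the gene already bound.
    """
    results = []
    offset = 1                      # assumed 1-based, as in the examples
    while True:
        page = fetch_page(offset, limit)
        if not page:                # empty array (or None): nothing left
            break
        results.extend(page)
        if len(page) < limit:       # short page: this was the last one
            break
        offset += limit
    return results


# Stand-in for the remote service: seven fake hits served page by page.
FAKE_HITS = ["hit%d" % i for i in range(1, 8)]


def fake_page(offset, limit):
    """Return `limit` items starting from the `offset`th (1-based) item."""
    return FAKE_HITS[offset - 1:offset - 1 + limit]


print(fetch_all(fake_page, limit=3))
```

Stopping on a short page saves one round trip; dropping that check and relying only on the empty page would give the same result.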

The equivalent of the second Perl example described above is

  #!/usr/bin/env ruby

  require 'bio'

  serv = Bio::KEGG::API.new

  list = serv.list_pathways("eco")
  list.each do |path|
    print path.entry_id, "\t", path.definition, "\n"
  end

and the equivalent of the last example is as follows.

  #!/usr/bin/env ruby

  require 'bio'

  serv = Bio::KEGG::API.new

  genes = ["eco:b1002", "eco:b2388"]

  result = serv.mark_pathway_by_objects("path:eco00010", genes)

  print result          # URL of the generated image

Quick Start with Python

In the case of Python, you have to install

  * ((<SOAPpy|URL:http://pywebsvcs.sourceforge.net/>))

plus some extra packages required for SOAPpy ( ((<fpconst|URL:www.analytics.washington.edu/Zope/projects/fpconst>)), ((<PyXML|URL:pyxml.sourceforge.net/>)) etc.).

Here's sample code using KEGG API with Python.

  #!/usr/bin/env python

  from SOAPpy import WSDL

  wsdl = 'http://soap.genome.jp/KEGG.wsdl'
  serv = WSDL.Proxy(wsdl)

  results = serv.get_genes_by_pathway('path:eco00020')
  print results

Quick Start with Java

In the case of Java, you need to obtain the Apache Axis library, version axis-1_2alpha or newer (axis-1_1 doesn't work properly with KEGG API)

  * ((<Apache Axis|URL:http://ws.apache.org/axis/>))

and put required jar files in an appropriate directory.

For the binary distribution of the Apache axis-1_2alpha release, copy the jar files stored under the axis-1_2alpha/lib/ to the directory of your choice.

  % cp axis-1_2alpha/lib/*.jar /path/to/lib/

You can use WSDL2Java coming with Apache Axis to generate classes needed for the KEGG API automatically.

To generate classes and documents for the KEGG API, download the script ((<axisfix.pl|URL:www.genome.jp/kegg/soap/support/axisfix.pl>)) and follow the steps below:

  % java -classpath /path/to/lib/axis.jar:/path/to/lib/jaxrpc.jar:/path/to/lib/commons-logging.jar:/path/to/lib/commons-discovery.jar:/path/to/lib/saaj.jar:/path/to/lib/wsdl4j.jar:. org.apache.axis.wsdl.WSDL2Java -p keggapi http://soap.genome.jp/KEGG.wsdl
  % perl -i axisfix.pl keggapi/KEGGBindingStub.java
  % javac -classpath /path/to/lib/axis.jar:/path/to/lib/jaxrpc.jar:/path/to/lib/wsdl4j.jar:. keggapi/KEGGLocator.java
  % jar cvf keggapi.jar keggapi/*
  % javadoc -classpath /path/to/lib/axis.jar:/path/to/lib/jaxrpc.jar -d keggapi_javadoc keggapi/*.java

This program does the same job as the Python example (extended to accept a pathway_id as the argument).

  import keggapi.*;

  class GetGenesByPathway {
          public static void main(String[] args) throws Exception {
                  KEGGLocator  locator = new KEGGLocator();
                  KEGGPortType serv    = locator.getKEGGPort();

                  String   query   = args[0];
                  String[] results = serv.get_genes_by_pathway(query);

                  for (int i = 0; i < results.length; i++) {
                          System.out.println(results[i]);
                  }
          }
  }

This is another example, which uses the ArrayOfSSDBRelation data type.

  import keggapi.*;

  class GetBestNeighborsByGene {
          public static void main(String[] args) throws Exception {
                  KEGGLocator    locator  = new KEGGLocator();
                  KEGGPortType   serv     = locator.getKEGGPort();

                  String         query    = args[0];
                  SSDBRelation[] results  = null;

                  results = serv.get_best_neighbors_by_gene(query, 1, 50);

                  for (int i = 0; i < results.length; i++) {
                          String gene1  = results[i].getGenes_id1();
                          String gene2  = results[i].getGenes_id2();
                          int    score  = results[i].getSw_score();
                          System.out.println(gene1 + "\t" + gene2 + "\t" + score);
                  }
          }
  }

Compile and execute this program (don't forget to include the keggapi.jar file in your classpath) as follows:

  % javac -classpath /path/to/lib/axis.jar:/path/to/lib/jaxrpc.jar:/path/to/lib/wsdl4j.jar:/path/to/keggapi.jar GetBestNeighborsByGene.java

  % java -classpath /path/to/lib/axis.jar:/path/to/lib/jaxrpc.jar:/path/to/lib/commons-logging.jar:/path/to/lib/commons-discovery.jar:/path/to/lib/saaj.jar:/path/to/lib/wsdl4j.jar:/path/to/keggapi.jar:. GetBestNeighborsByGene eco:b0002

You may wish to set the CLASSPATH environment variable.

bash/zsh:

  % for i in /path/to/lib/*.jar
  do
    CLASSPATH="${CLASSPATH}:${i}"
  done
  % export CLASSPATH

tcsh:

  % foreach i ( /path/to/lib/*.jar )
    setenv CLASSPATH ${CLASSPATH}:${i}
  end

For the other cases, consult the javadoc pages generated by WSDL2Java.

  * ((<URL:http://www.genome.jp/kegg/soap/doc/keggapi_javadoc/>))

KEGG API Reference

WSDL file

Users can use a WSDL file to create a SOAP client driver. The WSDL file for the KEGG API can be found at:

  * ((<URL:http://soap.genome.jp/KEGG.wsdl>))

Terminology

  * 'org' is a three-letter (or four-letter) organism code used in KEGG.
    The list can be found at (see the description of the list_organisms
    method below):

    * ((<URL:http://www.genome.jp/kegg/catalog/org_list.html>))

  * 'db' is a database name used in GenomeNet service. See the
    description of the list_databases method below.

  * 'entry_id' is a unique identifier whose format is the combination of
    the database name and the identifier of an entry joined by a colon
    as 'database:entry' (e.g. 'embl:J00231' means an EMBL entry 'J00231').
    'entry_id' includes 'genes_id', 'enzyme_id', 'compound_id', 'drug_id',
    'glycan_id', 'reaction_id', 'pathway_id' and 'motif_id' described below.

  * 'genes_id' is a gene identifier used in KEGG/GENES which consists of
    'keggorg' and a gene name (e.g. 'eco:b0001' means an E. coli gene 'b0001').

  * 'enzyme_id' is an enzyme identifier consisting of database name 'ec'
    and an enzyme code used in KEGG/LIGAND ENZYME database.
    (e.g. 'ec:1.1.1.1' means an alcohol dehydrogenase enzyme)

  * 'compound_id' is a compound identifier consisting of database name
    'cpd' and a compound number used in KEGG COMPOUND / LIGAND database
    (e.g. 'cpd:C00158' means citric acid).  Note that some compounds
    also have 'glycan_id' and both IDs are accepted and converted internally
    by the corresponding methods.

  * 'drug_id' is a drug identifier consisting of database name 'dr'
    and a drug number used in KEGG DRUG / LIGAND database
    (e.g. 'dr:D00201' means tetracycline).

  * 'glycan_id' is a glycan identifier consisting of database name 'gl'
    and a glycan number used in KEGG GLYCAN database (e.g. 'gl:G00050'
    means paragloboside).  Note that some glycans also have 'compound_id'
    and both IDs are accepted and converted internally by the corresponding
    methods.

  * 'reaction_id' is a reaction identifier consisting of database name 'rn'
    and a reaction number used in KEGG/REACTION (e.g. 'rn:R00959' is a
    reaction which converts cpd:C00103 into cpd:C00668).

  * 'pathway_id' is a pathway identifier consisting of 'path' and a pathway
    number used in KEGG/PATHWAY. Pathway numbers prefixed by 'map' specify
    the reference pathway and pathways prefixed by the 'keggorg' specify
    pathways specific to the organism (e.g. 'path:map00020' means the reference
    pathway for the citrate cycle and 'path:eco00020' means the same pathway
    with the E. coli genes marked).

  * 'motif_id' is a motif identifier consisting of motif database names
    ('ps' for prosite, 'bl' for blocks, 'pr' for prints, 'pd' for prodom,
    and 'pf' for pfam) and a motif entry name. (e.g. 'pf:DnaJ' means a Pfam
    database entry 'DnaJ').

  * 'ko_id' is a KO identifier consisting of 'ko' and a KO number used in
    KEGG/KO. KO (KEGG Orthology) is a classification of orthologous genes
    defined by KEGG (e.g. 'ko:K02598' means a KO group for nitrite transporter
    NirC genes).

  * 'ko_class_id' is a KO class identifier which is used to classify
    'ko_id' hierarchically (e.g. '01110' means a 'Carbohydrate Metabolism'
    class).

    * ((<URL:http://www.genome.jp/dbget-bin/get_htext?KO>))

  * 'offset' and 'limit' are both integers and are used to control the
    number of results returned at once.  Methods having these arguments
    return the first 'limit' results starting from the 'offset'th result
    (the examples in this document start from offset 1).

  * 'fg_color_list' is a list of colors for the foreground (corresponding
    to the texts and borders of the objects on the KEGG pathway map).

  * 'bg_color_list' is a list of colors for the background (corresponding
    to the inside of the objects on the KEGG pathway map).

Related site:

  * ((<URL:http://www.genome.jp/kegg/kegg3.html>))
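The 'database:entry' convention described above is easy to check on the client side before calling the service. The following Python 3 sketch is only an illustration: split_entry_id, kind_of and KNOWN_PREFIXES are hypothetical names, and the prefix table is just the set of database names mentioned in the terminology list, so any other prefix (organism codes, 'embl', and so on) is reported as unknown.

```python
def split_entry_id(entry_id):
    """Split 'database:entry' at the first colon into (database, entry)."""
    db, sep, entry = entry_id.partition(":")
    if not sep or not db or not entry:
        raise ValueError("not in 'database:entry' form: %r" % entry_id)
    return db, entry


# Database names taken from the terminology list; a rough lookup table only.
KNOWN_PREFIXES = {
    "ec": "enzyme_id", "cpd": "compound_id", "dr": "drug_id",
    "gl": "glycan_id", "rn": "reaction_id", "path": "pathway_id",
    "ko": "ko_id", "ps": "motif_id", "bl": "motif_id",
    "pr": "motif_id", "pd": "motif_id", "pf": "motif_id",
}


def kind_of(entry_id):
    """Return the identifier kind for a known prefix, or None otherwise."""
    db, _entry = split_entry_id(entry_id)
    return KNOWN_PREFIXES.get(db)


print(split_entry_id("embl:J00231"))   # ('embl', 'J00231')
print(kind_of("cpd:C00158"))           # compound_id
print(kind_of("eco:b0001"))            # None (organism codes are not in the table)
```

A genes_id cannot be recognized from a fixed table because its prefix is an organism code; the list_organisms method would be the place to obtain those codes.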

Returned values

Many of the KEGG API methods return a set of values in a complex data structure as described below. This section summarizes all of these data types. Note that the returned values for an empty result will be

  * an empty array -- for the methods which return ArrayOf'OBJ'
  * an empty string -- for the methods which return String
  * -1 -- for the methods which return int
  * NULL -- for the methods which return any other 'OBJ'
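Client code often wants a single test for "nothing came back". A minimal Python 3 sketch of such a test follows. The name is_empty_result is hypothetical, and the sketch assumes that the SOAP client maps arrays to lists, NULL to None, and int results to Python integers; check how your own client library maps these types before relying on it.

```python
def is_empty_result(value):
    """True if `value` is one of the documented empty-result markers."""
    if value is None:                        # NULL for other object types
        return True
    if isinstance(value, bool):              # booleans are never a marker
        return False
    if isinstance(value, int):
        return value == -1                   # methods which return int
    if isinstance(value, (str, list, tuple)):
        return len(value) == 0               # empty string or empty array
    return False


print(is_empty_result([]))      # True
print(is_empty_result(-1))      # True
print(is_empty_result(0))       # False
```

Note that 0 is treated as a real value here, since only -1 is documented as the empty marker for int results.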

+ SSDBRelation

SSDBRelation data type contains the following fields:

  genes_id1         genes_id of the query (string)
  genes_id2         genes_id of the target (string)
  sw_score          Smith-Waterman score between genes_id1 and genes_id2 (int)
  bit_score         bit score between genes_id1 and genes_id2 (float)
  identity          identity between genes_id1 and genes_id2 (float)
  overlap           overlap length between genes_id1 and genes_id2 (int)
  start_position1   start position of the alignment in genes_id1 (int)
  end_position1     end position of the alignment in genes_id1 (int)
  start_position2   start position of the alignment in genes_id2 (int)
  end_position2     end position of the alignment in genes_id2 (int)
  best_flag_1to2    best flag from genes_id1 to genes_id2 (boolean)
  best_flag_2to1    best flag from genes_id2 to genes_id1 (boolean)
  definition1       definition string of the genes_id1 (string)
  definition2       definition string of the genes_id2 (string)
  length1           amino acid length of the genes_id1 (int)
  length2           amino acid length of the genes_id2 (int)

+ ArrayOfSSDBRelation

ArrayOfSSDBRelation data type is a list of the SSDBRelation data type.
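As an illustration of working with these fields, the Python 3 sketch below keeps only the hits whose best flags are set in both directions. The Hit class is a hypothetical local type holding a subset of the SSDBRelation fields, the records are made up, and reading "both flags true" as a best-best hit is an assumption based on the field descriptions above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Subset of the SSDBRelation fields (hypothetical local type)."""
    genes_id1: str
    genes_id2: str
    sw_score: int
    best_flag_1to2: bool
    best_flag_2to1: bool


def best_best_hits(hits):
    """Keep only the hits flagged as best in both directions."""
    return [h for h in hits if h.best_flag_1to2 and h.best_flag_2to1]


# Made-up records for illustration only (not real KEGG data).
hits = [
    Hit("org1:geneA", "org2:geneB", 1200, True, True),
    Hit("org1:geneA", "org3:geneC", 1100, True, False),
    Hit("org1:geneA", "org4:geneD", 300, False, False),
]

for h in best_best_hits(hits):
    print(h.genes_id1, h.genes_id2, h.sw_score, sep="\t")
```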

+ MotifResult

MotifResult data type contains the following fields:

  motif_id          motif_id of the motif (string)
  definition        definition of the motif (string)
  genes_id          genes_id of the gene containing the motif (string)
  start_position    start position of the motif match (int)
  end_position      end position of the motif match (int)
  score             score of the motif match for TIGRFAM and PROSITE (float)
  evalue            E-value of the motif match for Pfam (double)

Note: ‘score’ and/or ‘evalue’ is set to -1 if the corresponding value is not applicable.

+ ArrayOfMotifResult

ArrayOfMotifResult data type is a list of the MotifResult data type.

+ Definition

Definition data type contains the following fields:

  entry_id          database entry_id (string)
  definition        definition of the entry (string)

+ ArrayOfDefinition

ArrayOfDefinition data type is a list of the Definition data type.

+ LinkDBRelation

LinkDBRelation data type contains the following fields:

  entry_id1         entry_id of the starting entry (string)
  entry_id2         entry_id of the terminal entry (string)
  type              type of the link as "direct" or "indirect" (string)
  path              link path information across the databases (string)

+ ArrayOfLinkDBRelation

ArrayOfLinkDBRelation data type is a list of the LinkDBRelation data type.

+ PathwayElement

PathwayElement represents the object on the KEGG PATHWAY map. PathwayElement data type contains the following fields:

  element_id        unique identifier of the object on the pathway (int)
  type              type of the object ("gene", "enzyme" etc.) (string)
  names             array of names of the object (ArrayOfstring)
  components        array of element_ids of the group components (ArrayOfint)

+ ArrayOfPathwayElement

ArrayOfPathwayElement data type is a list of the PathwayElement data type.

+ PathwayElementRelation

PathwayElementRelation represents the relationship between PathwayElements. PathwayElementRelation data type contains the following fields:

  element_id1       unique identifier of the object on the pathway (int)
  element_id2       unique identifier of the object on the pathway (int)
  type              type of relation ("ECrel", "maplink" etc.) (string)
  subtypes          array of objects involved in the relation (ArrayOfSubtype)

+ ArrayOfPathwayElementRelation

ArrayOfPathwayElementRelation data type is a list of the PathwayElementRelation data type.

++ Subtype

Subtype is used in the PathwayElementRelation data type to represent the object involved in the relation. Subtype data type contains the following fields:

  element_id        unique identifier of the object on the pathway (int)
  relation          kind of relation ("compound", "inhibition" etc.) (string)
  type              type of relation ("+p", "--|" etc.) (string)

++ ArrayOfSubtype

ArrayOfSubtype data type is a list of the Subtype data type.
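Because each PathwayElementRelation carries an array of Subtype values, client code usually walks two levels. The Python 3 sketch below shows one way to do that with plain dictionaries keyed by the field names above. The function name, the dictionary representation, and the sample values are hypothetical; the strings merely reuse examples from the field descriptions and do not describe a real pathway.

```python
def relation_summary(rel):
    """Build a one-line summary of a PathwayElementRelation-like dict."""
    via = ", ".join(
        "%s (%s) element %d" % (s["relation"], s["type"], s["element_id"])
        for s in rel["subtypes"]
    )
    return "%d %s %d via [%s]" % (
        rel["element_id1"], rel["type"], rel["element_id2"], via)


# Placeholder record; the values are not taken from a real KEGG pathway.
relation = {
    "element_id1": 1,
    "element_id2": 2,
    "type": "ECrel",
    "subtypes": [
        {"element_id": 3, "relation": "inhibition", "type": "--|"},
    ],
}

print(relation_summary(relation))
```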

+ StructureAlignment

StructureAlignment represents structural alignment of nodes between two molecules with score. StructureAlignment data type contains the following fields:

  target_id         entry_id of the target (string)
  score             alignment score (float)
  query_nodes       indices of aligned nodes in the query molecule (ArrayOfint)
  target_nodes      indices of aligned nodes in the target molecule (ArrayOfint)

+ ArrayOfStructureAlignment

ArrayOfStructureAlignment data type is a list of the StructureAlignment data type.

Methods

Meta information

This section describes the methods for retrieving general information about the latest version of the KEGG database.
