WebScipio RESTful Web Service

You can access the functionality of WebScipio from within any software program by using the web service API.
To illustrate the process, we show a transcript of a Ruby session, which can easily be adapted to your programming language.
Lines starting with a number in parentheses, and the unnumbered lines that follow them, are user input. Lines starting with "=>" show the value returned by the preceding input, for example the response from the server.

(0) require 'net/http'
require 'yaml'
(1) url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'search_species' => 'true', 'query' => 'drosophila'}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body
=> "109342452031"
(2) url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
response = Net::HTTP.get_response(url)
yaml_string = response.body
species = YAML::load(yaml_string)
=> ["Drosophila_ananassae_TSC_14024_0371_13", "Drosophila_elegans", "Drosophila_erecta_TSC_14021_0224_01", "Drosophila_ficusphila", "Drosophila_grimshawi_TSC_15287_2541_00", "Drosophila_kikkawai", "Drosophila_melanogaster", "Drosophila_mojavensis_TSC_15081_1352_22", "Drosophila_persimilis_MSH_3", "Drosophila_pseudoobscura_MV2_25", "Drosophila_sechellia_Rob3c", "Drosophila_simulans_str__Mosaic", "Drosophila_simulans_str__c1674", "Drosophila_simulans_str__md106", "Drosophila_simulans_str__md199", "Drosophila_simulans_str__nc48", "Drosophila_simulans_str__sim4", "Drosophila_simulans_str__sim6", "Drosophila_simulans_str__white501", "Drosophila_takahashii", "Drosophila_virilis_TSC_15010_1051_87", "Drosophila_willistoni_TSC_14030_0811_24", "Drosophila_yakuba_Tai18E2", "Kluyveromyces_lactis_NRRL_Y_1140"]
(3) url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'search_species' => 'true', 'query' => 'primate'}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body
=> "791866516211"
(4) url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
response = Net::HTTP.get_response(url)
yaml_string = response.body
species = YAML::load(yaml_string)
=> ["Callithrix_jacchus", "Gorilla_gorilla", "Gorilla_gorilla_gorilla", "Homo_sapiens", "Homo_sapiens_African_Individual_NA18507", "Homo_sapiens_JCVenter", "Homo_sapiens_JDWatson", "Macaca_fascicularis", "Macaca_mulatta", "Microcebus_murinus", "Nomascus_leucogenys", "Otolemur_garnettii", "Pan_troglodytes", "Papio_hamadryas", "Pongo_abelii", "Tarsius_syrichta"]
(5) url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'search_genomes' => 'true', 'query' => 'Daphnia_pulex'}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body
=> "218502844915"
(6) url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
response = Net::HTTP.get_response(url)
yaml_string = response.body
genomes = YAML::load(yaml_string)
=> [{:minor_version=>0, :type=>"supercontigs", :species=>"Daphnia_pulex", :mini_version=>0, :size=>"219.77MB", :reference=>"jgi", :major_version=>1, :path=>"genomes_jgi/Daphnia_pulex_v1_supercontigs.fasta"}, {:minor_version=>0, :type=>"supercontigs", :species=>"Daphnia_pulex", :mini_version=>0, :size=>"191.35MB", :reference=>"ncbi", :major_version=>1, :path=>"genomes_ncbi/Daphnia_pulex_v1_supercontigs.fasta"}, {:minor_version=>0, :type=>"contigs", :species=>"Daphnia_pulex", :mini_version=>0, :size=>"155.07MB", :reference=>"ncbi", :major_version=>1, :path=>"genomes_ncbi/Daphnia_pulex_v1_contigs.fasta"}]
(7) query_fasta = '>DapMhc
MPPKKDMGPDPDPAQYLFVSLEMKRADQTKPYDGKKATWVPCEKDSYQLGEITGTKGDLV
VVKVADGNEKMVKKDQCFPVNPPKFEKVEDMADLTYLNDAAVLHNLRQRYYHKLIYTYSG
LFCVAINPYKRFPIYTQRVIKMYIGKRRNEVPPHIFCISDGAYMDMLTNHENQSMLITGE
SGAGKTENTKKVIQYFAQIAKDTKGSKHTFSSGGNLEDQIVQTNPVLEAFGNAKTTRNDN
SSRFGKFIRIHFGNSGKLAGADIETYLLEKARVISQQALERSYHIFYQIMSGKLPTLKAD
CCLVDDIYQYNFVSQGKITIPSMDDSEEMALTDEAFEILGMGEQRPEIWKITAAVMHFGT
MKFKQRGREEQADPDGTQEGENVAKMMGVDGPQLYMNFLKPRIKVGNEFVTQGRNVNQVV
YSIGAMAKAIFDRLFKWLVKRVNETLETGQKRVTFIGVLDIAGFEIFDYNGFEQLCINFT
NEKLQQFFNHHMFVLEQEEYKKEGIDWVFMDFGMDLQACIELMEKPMGVLSILEEESMFP
KATDQTFAEKLNNNHLGKSASFVKPKPAKAGCKEAHFAIAHYAGTVPYNITGWLEKNKDP
LNDTVVDQFKKGSSKLVQEIFADHPGQSGGKEEAKGGKRTKGSGFQTVSALYREQLNGLM
KTLNATSPHFIRCIIPNETKSPGVIDSHLVMHQLTCNGVLEGIRICRKGFPNRMVYPDFK
HRYMILAPNEMKAEPDERKAAKICLEKIALDPEWYRIGHTKVFFKAGVLGQLEEMRDDKL
AKIITWMQSFIRGYHTRKQYKQLQDQRVALCVVQRNLRSYLQMRTWAWYRLWQKVKPLLN
VTRVEDEIKALEDKAAAAQANFEKEEKLRKELETNLAKLTKEKEDLLNRLQAESGTVADF
HDKQNKLMSQKADLESQLSDTQERLQQEEDARNQLFQNKKKLEQEASGLKKDIEDLELAL
QKTETDKATKDHQIRNLNDEIAHQDELINKLNKEKKHMQEVNQKTAEDLQASEDKVNHLN
KVKAKLEQTLDELEDSLEREKKLRADIEKNKRKTEGDLKLTQEAVADLERNKKELEQTIQ
RKDKEIASLNAKLEDEQSLVGKLQKQIKELQSRIEELEEEVEAERQARAKAEKQRADLAR
ELEELGERLEEAGGATAAQIELNKKREAELSKLRRDLEESNIQHESVLSNLRKKHNDAVS
EMSEQIDQLNKMKAKAEKDRSQFAGENNDLRAAMDHVSSDKAAAEKMTKMLQQQLNEIQS
KLDEANRSLNDFDVQKKKLTIENSDYLRQLEDAESQVSQLQKLKISLTTQLEDSKRMADE
EGRERATLLGKFRNLEHDIDNIREQLDEESEAKADLQRQLSKSNADCQMWRHKYESEGVA
KAEELEDAKRKLQARLGEAEEAIESLNQKNVALEKIKMRLSGELDDMHVEVERATVLANQ
MEKRGKNFDKVVSEWKAKVDDLAAELDASQKECRNYSTELFRLKAGYDESQEHLEAVRRE
NKNLADEIKDLMDQIGEGGRNVHEIDKQRKRLEVEKEELQAALEEAESALEQEENKVLRA
QLELSQVRQEIDRRIQEKEEEFENTRKNHQRAIDSMQASLEAEAKGKAEALRMKKKLESD
INELEIALDHANKANAEAQKSIKRYQQSIKETQSALEEEQRNRDDLREQYGIAERRANAL
QGELEESRTLLEQADRARRQAETELADAHEQLHDLTAQAASSSAAKRKMESELQTLHADL
DDMINETKNSEEKAKKAMVDAARLADELRAEQEHAQAQEKQRKALELQVKELQVRLDESE
NNALKGGKKAIQKLEERVRGLETELDGEQRRHADAQKNLRKSERRIKELTFQSDEDRKNH
ERMQDLVDKLQQKIKTYKRQIEEAEEIAALNLAKFRKAQQELEEADERAELADQAVSKLR
AKGRGGSASRLSPPPQMKPRSKRDFE'
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'scipio_run' => 'true', 'target_file_path' => 'genomes_jgi/Daphnia_pulex_v1_supercontigs.fasta', 'query' => query_fasta}
#optional_scipio_parameters = {
# :min_score => 0.3, # best_size
# :minid => 90, # min_identity
# :maxmis => 7, # max_mismatches
# :min_coverage => 60,
# :reg_size => 2000,
# :multiple_results => false,
# :single_target_hits => false,
# :transtable => 1,
# :max_assemble_size => 75000,
# :max_move_exon => 6,
# :gap_to_close => 6,
# :min_intron_len => 22,
# :accepted_intron_penalty => "1.0",
# :blattile => 7, # tile_size
# :blatoneoff => false,
# :blatscore => 15,
# :blatidentity => 81,
# :exhaust_align_size => 15000,
# :exhaust_gap_size => 21
#}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body
=> "276393024066"
(8) url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
run_result = ["running", ""]
while(run_result[0] == "running") do
response = Net::HTTP.get_response(url)
yaml_string = response.body
run_result = YAML::load(yaml_string)
sleep(10)
end
scipio_results = YAML::load(run_result[1])
=> {"DapMhc"=>[{"number"=>1, "matchings"=>[{"number"=>1, "overl
ap"=>nil, "mismatchlist"=>[], "prot_start"=>0, "nucl_end"=>2
01, "prot_end"=>67, "translation"=>"MPPKKDMGPDPDPAQYLFVSLEMK
RADQTKPYDGKKATWVPCEKDSYQLGEITGTKGDLVVVKVADG", "dna_start"=>2
048124, "type"=>"exon", "contig"=>1, "seqshifts"=>[], "undet
erminedlist"=>[], "nucl_start"=>0, "seq"=>"atgcctcccaagaagga
tatgggacccgatcccgacccagcccaatacctcttcgtttccctggaaatgaaacgtgc
...
QIEEAEEIAALNLAKFRKAQQELEEADERAELADQAVSKLRAKGRGGSASRLSPPPQMKP
RSKRDFE", "prot_len"=>1946, "status"=>"auto", "undetermined"
=>0, "mismatches"=>0, "matches"=>1946, "dna_end"=>2070830}]}
(9) scipio_results["DapMhc"].size
=> 1
(10) scipio_results["DapMhc"][0]["target"]
=> "scaffold_6"
(11) scipio_results["DapMhc"][0]["dna_start"]
=> 2048124
(12) scipio_results["DapMhc"][0]["matchings"].size
=> 57
(13) scipio_results["DapMhc"][0]["matchings"].select{|m| m["type"] == "exon"}.size
=> 29
(14) scipio_results["DapMhc"][0]["matchings"].select{|m| m["type"] == "exon"}.map{|e| [e["dna_start"], e["dna_end"]]}
=> [[2048124, 2048325], [2048507, 2048654], [2049778, 2049935], [2050465, 2050493], [2050633, 2050739], [2052035, 2052128], [2052199, 2052263], [2052336, 2052435], [2052631, 2052735], [2055462, 2055597], [2056634, 2056754], [2056821, 2056971], [2057249, 2057420], [2059163, 2059343], [2059422, 2059573], [2059936, 2059989], [2061071, 2061159], [2061759, 2061877], [2062068, 2062186], [2064404, 2064541], [2064625, 2064962], [2066715, 2066844], [2066930, 2067175], [2067247, 2067506], [2067578, 2067832], [2068003, 2068082], [2068560, 2069505], [2069584, 2070718], [2070794, 2070830]]
(15) yaml_string = YAML::dump(scipio_results)
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'mutu_exon_run' => 'true', 'query' => yaml_string}
#optional_mutu_exon_parameters = {
# :length_difference => "20",
# :min_score => "15",
# :min_exon_length_aa => "15",
# :search_up_down_stream => true,
# :all => false,
# :use_start_codon => "auto",
# :use_stop_codon => "auto",
# :max_recursion_depth => "0"
#}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body
=> "401632858060"
(16) url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
run_result = ["running", ""]
while(run_result[0] == "running") do
response = Net::HTTP.get_response(url)
yaml_string = response.body
run_result = YAML::load(yaml_string)
sleep(10)
end
scipio_results_with_mutu_exons = YAML::load(run_result[1])
=> {"DapMhc"=>[{"unmatched"=>0, "matchings"=>[{"overlap"=>nil,
"number"=>1, "prot_start"=>0, "mismatchlist"=>[], "nucl_end"
=>201, "first_exon"=>true, "prot_end"=>67, "type"=>"exon", "
dna_start"=>2048124, "translation"=>"MPPKKDMGPDPDPAQYLFVSLEM
...
TCCGAATCCCACACGAATTTCAACCGTCTCTCTCTCCTCTCTCTCATCAACTTGTTTGAT
TTTTTGTCGTCGTCGTCGTTGATCAACAACTCGAAACAACAAG", "dna_end"=>207
0830, "matches"=>1946, "mismatches"=>0, "undetermined"=>0}]}
(17) yaml_string = YAML::dump(scipio_results)
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'tandem_genes_run' => 'true', 'query' => yaml_string}
#optional_tandem_genes_parameters = {
# :length_difference => "10",
# :min_score => "15",
# :min_exon_length_aa => "10",
# :min_tandem_gene_score => "30",
# :search_for_concatenated_exons => false,
# :search_for_splitted_exons => false,
# :use_start_codon => "auto",
# :use_stop_codon => "auto",
# :generate_tandem_gene_results => true
#}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body
=> "219595899748"
(18) url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
run_result = ["running", ""]
while(run_result[0] == "running") do
response = Net::HTTP.get_response(url)
yaml_string = response.body
run_result = YAML::load(yaml_string)
sleep(10)
end
scipio_results_with_tandem_genes = YAML::load(run_result[1])
=> {"DapMhc"=>[{"unmatched"=>0, "matchings"=>[{"overlap"=>nil,
"number"=>1, "prot_start"=>0, "mismatchlist"=>[], "nucl_end"
=>201, "first_exon"=>true, "prot_end"=>67, "type"=>"exon", "
dna_start"=>2048124, "translation"=>"MPPKKDMGPDPDPAQYLFVSLEM
...
TCCGAATCCCACACGAATTTCAACCGTCTCTCTCTCCTCTCTCTCATCAACTTGTTTGAT
TTTTTGTCGTCGTCGTCGTTGATCAACAACTCGAAACAACAAG", "dna_end"=>207
0830, "matches"=>1946, "mismatches"=>0, "undetermined"=>0}]}
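
The submit-and-poll pattern of steps (7)/(8), (15)/(16) and (17)/(18) is the same each time, so it can be wrapped in a few helpers. The following is only a sketch: the method names (submit_search, fetch_status, wait_for_result, parse_result), the attempt limit, and the choice to return an empty hash for "nothing_found" are assumptions made for illustration and are not part of the WebScipio API. The status strings are the ones listed in the explanation of step (8).

```ruby
require 'net/http'
require 'uri'
require 'yaml'

# Base URL as used in the transcript.
API_URL = 'http://www.webscipio.org/api_searches'

# POST the parameters and return the job ID (the response body).
def submit_search(post_parameters)
  Net::HTTP.post_form(URI.parse(API_URL), post_parameters).body.strip
end

# GET the YAML document for a job ID and parse it.
def fetch_status(id)
  body = Net::HTTP.get_response(URI.parse("#{API_URL}/#{id}.yaml")).body
  YAML.load(body)
end

# Poll until the status is no longer "running".
# interval and max_attempts are illustrative defaults, not API requirements.
# fetcher can be replaced (e.g. in tests) by anything that responds to call.
def wait_for_result(id, interval: 10, max_attempts: 360, fetcher: method(:fetch_status))
  max_attempts.times do
    status, payload = fetcher.call(id)
    return [status, payload] unless status == 'running'
    sleep(interval)
  end
  raise "job #{id} is still running after #{max_attempts} attempts"
end

# Turn the [status, payload] pair into a parsed result.
# Returning {} for "nothing_found" is a design choice of this sketch.
def parse_result(status, payload)
  case status
  when 'finished'      then YAML.load(payload)
  when 'nothing_found' then {}
  else raise "WebScipio job ended with status #{status.inspect}"
  end
end
```

With these helpers, a run reduces to id = submit_search(post_parameters), then status, payload = wait_for_result(id), then parse_result(status, payload). Unlike the loops in the transcript, this version does not sleep again after the final response and gives up after a fixed number of attempts.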

Explanation

(0) Load the http and yaml libraries.
(1) Send a POST to url "http://www.webscipio.org/api_searches" with parameters "search_species" set to "true" and "query" set to "drosophila". This will generate a search for all species with "drosophila" in their names. The response will be an ID to get the result.
(2) Send a GET to url "http://www.webscipio.org/api_searches/109342452031.yaml", where "109342452031" is the ID from the response. You can use .xml, .json and .html to get responses in formats other than YAML. You will get an array of all species found.
(3) Send a POST to url "http://www.webscipio.org/api_searches" with parameters "search_species" set to "true" and "query" set to "primate". This will generate a search for all primates. The response will be an ID to get the result.
(4) Send a GET to url "http://www.webscipio.org/api_searches/791866516211.yaml", where "791866516211" is the ID from the response. You will get an array of all species found.
(5) Send a POST to url "http://www.webscipio.org/api_searches" with parameters "search_genomes" set to "true" and "query" set to "Daphnia_pulex". This will generate a search for all genome files of the specified organism. The response will be an ID to get the result.
(6) Send a GET to url "http://www.webscipio.org/api_searches/218502844915.yaml", where "218502844915" is the ID from the response. You will get an array of all genome files found. Each genome file is represented by a hash containing the version, type, size and path of the genome file.
(7) Send a POST to url "http://www.webscipio.org/api_searches" with parameters "scipio_run" set to "true", "target_file_path" set to the chosen Daphnia genome "genomes_jgi/Daphnia_pulex_v1_supercontigs.fasta" and "query" set to a myosin heavy chain protein sequence. This will start a Scipio run with default parameters. You could change all parameters by setting them in the POST call. The response will be an ID to get the result.
(8) Send a GET to url "http://www.webscipio.org/api_searches/276393024066.yaml", where "276393024066" is the ID from the response. Because the Scipio run can take some time, you have to repeat the GET until the run is finished. The result is an array with two elements. The first element is the status ("running", "finished", "nothing_found" or "error") and the second element is the Scipio result in YAML format.
(9) The Scipio result is a hash with a key/value pair for each protein queried. To get the number of contigs on which the gene is distributed, ask for the size of the array (in this case only one contig).
(10) Choose the first contig, which contains a hash with key/value pairs describing the gene. Use the "target" key to get the name of the contig.
(11) Choose the first contig and ask for the starting position of the gene in the DNA sequence.
(12) To get the number of introns, exons and gaps, ask for the size of "matchings".
(13) To get the number of exons select the matchings of type "exon".
(14) To get the start and end positions of the exons of the gene, select the exons and map them onto their start and end positions in the DNA sequence.
(15) Send a POST to url "http://www.webscipio.org/api_searches" with parameters "mutu_exon_run" set to "true" and "query" set to the Scipio result in YAML format. This will start a search for mutually exclusive exons with default parameters. You can change all parameters by setting them in the POST call. The response will be an ID to get the result.
(16) Send a GET to url "http://www.webscipio.org/api_searches/401632858060.yaml", where "401632858060" is the ID from the response. Because the search can take some time, you have to repeat the GET until the run is finished. The result is an array with two elements. The first element is the status ("running", "finished", "nothing_found" or "error") and the second element is the Scipio result including the mutually exclusive exons found.
(17) Send a POST to url "http://www.webscipio.org/api_searches" with parameters "tandem_genes_run" set to "true" and "query" set to the Scipio result in YAML format. This will start a search for tandem genes with default parameters. You can change all parameters by setting them in the POST call. The response will be an ID to get the result.
(18) Send a GET to url "http://www.webscipio.org/api_searches/219595899748.yaml", where "219595899748" is the ID from the response. Because the search can take some time, you have to repeat the GET until the run is finished. The result is an array with two elements. The first element is the status ("running", "finished", "nothing_found" or "error") and the second element is the Scipio result including the tandem genes found.
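
The lookups of steps (10) to (14) can be collected into one small method that summarizes a single hit. This is a minimal sketch: the method name summarize_hit and the sample data are invented for illustration (the sample is not server output), and only keys that appear in the transcript are read ("target", "dna_start", "dna_end", "matchings", "type").

```ruby
# Summarize one hit (one element of scipio_results["DapMhc"]).
def summarize_hit(hit)
  exons = hit['matchings'].select { |m| m['type'] == 'exon' }
  {
    target: hit['target'],                # contig name, as in step (10)
    gene_start: hit['dna_start'],         # as in step (11)
    matching_count: hit['matchings'].size, # as in step (12)
    exon_count: exons.size,               # as in step (13)
    exon_ranges: exons.map { |e| [e['dna_start'], e['dna_end']] } # step (14)
  }
end

# Hand-made hit with invented values, shaped like the transcript output.
sample_hit = {
  'target' => 'scaffold_x',
  'dna_start' => 100,
  'matchings' => [
    { 'type' => 'exon',   'dna_start' => 100, 'dna_end' => 160 },
    { 'type' => 'intron', 'dna_start' => 160, 'dna_end' => 220 },
    { 'type' => 'exon',   'dna_start' => 220, 'dna_end' => 300 }
  ]
}

summary = summarize_hit(sample_hit)
puts summary[:exon_count]  # prints 2
p summary[:exon_ranges]    # prints [[100, 160], [220, 300]]
```

For a real result, the same method would be applied to each element, for example scipio_results["DapMhc"].map { |hit| summarize_hit(hit) }.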