Seq2Fun Databases

  • For most non-model organisms, biological interpretation of study outcomes is limited to protein-coding genes with functional annotations such as KEGG pathways, Gene Ontology (GO), or the PANTHER classification system. Focusing the Seq2Fun databases on functionally annotated protein-coding genes, such as those with GO terms or KEGG Orthology (KO) assignments, therefore meets the needs of most scientists studying non-model organisms.

    We provide 31 pre-built databases, listed in the table below, that can be downloaded here.

    Group Species Proteins Orthologs Filename Date
    algae 14 155495 34628 algae.tar.gz 03-02-2022
    alveolates 21 207674 48656 alveolates.tar.gz 03-02-2022
    amoebozoa 7 81844 20217 amoebozoa.tar.gz 03-02-2022
    amphibians 3 75261 9838 amphibians.tar.gz 03-02-2022
    animals 370 7150735 236447 animals.tar.gz 03-02-2022
    apicomplexans 18 93576 13823 apicomplexans.tar.gz 03-02-2022
    arthropods 119 1727651 101867 arthropods.tar.gz 03-02-2022
    ascomycetes 100 904642 93433 ascomycetes.tar.gz 03-02-2022
    basidiomycetes 33 363997 52723 basidiomycetes.tar.gz 03-02-2022
    birds 31 482205 13868 birds.tar.gz 03-02-2022
    cnidarians 9 203000 18682 cnidarians.tar.gz 03-02-2022
    crustaceans 7 154960 32225 crustaceans.tar.gz 03-02-2022
    dothideomycetes 10 123200 26288 dothideomycetes.tar.gz 03-02-2022
    eudicots 93 3180221 72086 eudicots.tar.gz 03-02-2022
    euglenozoa 9 86483 11790 euglenozoa.tar.gz 03-02-2022
    eurotiomycetes 20 196228 23006 eurotiomycetes.tar.gz 03-02-2022
    fishes 64 1736572 27327 fishes.tar.gz 03-02-2022
    flatworms 4 58181 15156 flatworms.tar.gz 03-02-2022
    fungi 138 1278312 141223 fungi.tar.gz 03-02-2022
    insects 101 1376824 60375 insects.tar.gz 03-02-2022
    leotiomycetes 5 67865 19335 leotiomycetes.tar.gz 03-02-2022
    mammals 94 1910363 32471 mammals.tar.gz 03-02-2022
    mollusks 9 206905 29992 mollusks.tar.gz 03-02-2022
    monocots 17 560027 32745 monocots.tar.gz 03-02-2022
    nematodes 6 134093 32549 nematodes.tar.gz 03-02-2022
    plants 127 3968027 128122 plants.tar.gz 03-02-2022
    protists 52 660237 126969 protists.tar.gz 03-02-2022
    reptiles 20 384584 11946 reptiles.tar.gz 03-02-2022
    saccharomycetes 36 195913 13337 saccharomycetes.tar.gz 03-02-2022
    stramenopiles 8 119746 28902 stramenopiles.tar.gz 03-02-2022
    vertebrates 212 4588985 59392 vertebrates.tar.gz 03-02-2022

    If you want to download the databases of Seq2Fun version 1, please click here.
  • We fully support custom-built databases. See MANUAL 13. Custom built database.

  • The following 8 databases were used to assess Seq2Fun version 1 with mouse, chicken, zebrafish, and roundworm datasets.
    The RNA-seq data can be downloaded from here.

    Group Proteins KOs Species Filename
    Mammals_no_mouse 356,672 5,622 64 mammals_no_mouse.tar.gz
    Mouse 8,438 5,437 1 mouse.tar.gz
    Birds_no_chicken 81,576 4,176 23 birds_no_chicken.tar.gz
    Chicken 4,930 3,921 1 chicken.tar.gz
    Fishes_no_zebrafish 267,954 4,235 38 fishes_no_zebrafish.tar.gz
    Zebrafish 6,047 3,963 1 zebrafish.tar.gz
    Nematodes_no_roundworm 13,939 2,950 5 nematodes_no_worm.tar.gz
    Roundworm 3,081 2,391 1 worm.tar.gz
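
All of the databases listed above are distributed as `.tar.gz` archives, so they need to be unpacked before use. The following is a minimal Python sketch of that step, assuming an archive such as `birds.tar.gz` has already been downloaded to the working directory. The helper name `unpack_database` and the output folder `seq2fun_db` are hypothetical, and the sketch makes no assumption about which files an archive contains; it only lists and extracts whatever is inside.

```python
import tarfile
from pathlib import Path


def unpack_database(archive_path, dest_dir):
    """Extract a downloaded .tar.gz archive and return its member names.

    Hypothetical helper for illustration; it is not part of Seq2Fun.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Use the safer "data" extraction filter when this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir, **kwargs)
    return names


if __name__ == "__main__":
    # Assumed local file name, taken from the Filename column above.
    archive = Path("birds.tar.gz")
    if archive.exists():
        for name in unpack_database(archive, Path("seq2fun_db")):
            print(name)
```

After extraction, the unpacked files can be passed to Seq2Fun as described in the manual.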