For most non-model organisms, biological interpretation of study results is limited to protein-coding genes with functional
annotations such as KEGG pathways, Gene Ontology (GO) terms, or the PANTHER classification system. Seq2Fun databases therefore
focus on functionally annotated genes, that is, protein-coding genes with GO and KO (KEGG Orthology) assignments, which meets the needs of most scientists studying non-model organisms.
We provide about 30 pre-built databases, which can be downloaded here.
Group | Species | Proteins | Orthologs | Filename | Date |
---|---|---|---|---|---|
algae | 14 | 155495 | 34628 | algae.tar.gz | 03-02-2022 |
alveolates | 21 | 207674 | 48656 | alveolates.tar.gz | 03-02-2022 |
amoebozoa | 7 | 81844 | 20217 | amoebozoa.tar.gz | 03-02-2022 |
amphibians | 3 | 75261 | 9838 | amphibians.tar.gz | 03-02-2022 |
animals | 370 | 7150735 | 236447 | animals.tar.gz | 03-02-2022 |
apicomplexans | 18 | 93576 | 13823 | apicomplexans.tar.gz | 03-02-2022 |
arthropods | 119 | 1727651 | 101867 | arthropods.tar.gz | 03-02-2022 |
ascomycetes | 100 | 904642 | 93433 | ascomycetes.tar.gz | 03-02-2022 |
basidiomycetes | 33 | 363997 | 52723 | basidiomycetes.tar.gz | 03-02-2022 |
birds | 31 | 482205 | 13868 | birds.tar.gz | 03-02-2022 |
cnidarians | 9 | 203000 | 18682 | cnidarians.tar.gz | 03-02-2022 |
crustaceans | 7 | 154960 | 32225 | crustaceans.tar.gz | 03-02-2022 |
dothideomycetes | 10 | 123200 | 26288 | dothideomycetes.tar.gz | 03-02-2022 |
eudicots | 93 | 3180221 | 72086 | eudicots.tar.gz | 03-02-2022 |
euglenozoa | 9 | 86483 | 11790 | euglenozoa.tar.gz | 03-02-2022 |
eurotiomycetes | 20 | 196228 | 23006 | eurotiomycetes.tar.gz | 03-02-2022 |
fishes | 64 | 1736572 | 27327 | fishes.tar.gz | 03-02-2022 |
flatworms | 4 | 58181 | 15156 | flatworms.tar.gz | 03-02-2022 |
fungi | 138 | 1278312 | 141223 | fungi.tar.gz | 03-02-2022 |
insects | 101 | 1376824 | 60375 | insects.tar.gz | 03-02-2022 |
leotiomycetes | 5 | 67865 | 19335 | leotiomycetes.tar.gz | 03-02-2022 |
mammals | 94 | 1910363 | 32471 | mammals.tar.gz | 03-02-2022 |
mollusks | 9 | 206905 | 29992 | mollusks.tar.gz | 03-02-2022 |
monocots | 17 | 560027 | 32745 | monocots.tar.gz | 03-02-2022 |
nematodes | 6 | 134093 | 32549 | nematodes.tar.gz | 03-02-2022 |
plants | 127 | 3968027 | 128122 | plants.tar.gz | 03-02-2022 |
protists | 52 | 660237 | 126969 | protists.tar.gz | 03-02-2022 |
reptiles | 20 | 384584 | 11946 | reptiles.tar.gz | 03-02-2022 |
saccharomycetes | 36 | 195913 | 13337 | saccharomycetes.tar.gz | 03-02-2022 |
stramenopiles | 8 | 119746 | 28902 | stramenopiles.tar.gz | 03-02-2022 |
vertebrates | 212 | 4588985 | 59392 | vertebrates.tar.gz | 03-02-2022 |
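
Each database in the table is distributed as a gzip-compressed tar archive. The following is a minimal sketch of unpacking one after download, using only the Python standard library. The function name `extract_database` and the destination directory are illustrative assumptions, not part of Seq2Fun, and the internal layout of the archives is not described here, so the function simply reports whatever member names it finds.

```python
import tarfile
from pathlib import Path


def extract_database(archive_path, dest_dir):
    """Unpack a downloaded .tar.gz archive and return the member names.

    Hypothetical helper for illustration; not part of Seq2Fun itself.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
        return [member.name for member in members]


# Example call, assuming birds.tar.gz was already downloaded
# into the current directory:
#   names = extract_database("birds.tar.gz", "database")
#   print(names)
```

The path check is a precaution that applies to any archive obtained over the network; it rejects an archive outright rather than skipping individual entries.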
We fully support custom-built databases. See MANUAL 13. Custom built database.
The following eight databases were used to assess Seq2Fun version 1 on mouse, chicken, zebrafish, and roundworm datasets.
The RNA-seq data can be downloaded here.
Group | Proteins | KOs | Species | Filename |
---|---|---|---|---|
Mammals_no_mouse | 356,672 | 5,622 | 64 | mammals_no_mouse.tar.gz |
Mouse | 8,438 | 5,437 | 1 | mouse.tar.gz |
Birds_no_chicken | 81,576 | 4,176 | 23 | birds_no_chicken.tar.gz |
Chicken | 4,930 | 3,921 | 1 | chicken.tar.gz |
Fishes_no_zebrafish | 267,954 | 4,235 | 38 | fishes_no_zebrafish.tar.gz |
Zebrafish | 6,047 | 3,963 | 1 | zebrafish.tar.gz |
Nematodes_no_roundworm | 13,939 | 2,950 | 5 | nematodes_no_worm.tar.gz |
Roundworm | 3,081 | 2,391 | 1 | worm.tar.gz |