Bioinformatics tools are used by scientists and researchers to aid discovery in online databases. The e-science era and high-throughput technologies have brought rapid growth in the published literature and have multiplied the number of databases and the volume of biological data (Hey,T. 2005). Computers can help researchers to manage this information explosion (Fedoroff N. 2005). Managing data requires knowledge representation, which in turn requires ontologies to structure the complex domain data and to aid in querying, integrating and mining public data.
Ontologies are tools which represent domain-specific knowledge and provide a content-oriented research environment (Mizoguchi R. 1997). An ontology is a data model that represents a domain and is used to reason about the objects in that domain and the relations between them. More generally, an ontology is a specification of entities (or concepts), relations, instances and axioms in an area of study. A reference ontology is similar to a scientific theory: it contains representations of biological reality which are correct according to our current understanding. An application ontology, however, is a software artefact for structuring data according to some hierarchy of classes, for the purpose of managing and manipulating that data and supporting interoperability of various resources (Nigam Shah, 24th July 2008).
Ontology has its roots in Western philosophy and dates back to Aristotle, who used it to provide a definitive, step-by-step classification of entities, i.e. the objects in nature and their reality in general. It was first applied to scientific theories by Quine (1953), to understand those theories in first-order logic. An ontology provides a vocabulary for a specific domain to describe the concepts and relations in that domain, yielding a framework for knowledge sharing, reuse and representation across different communities (Gruber, T. R. 1991 & Smith, B. 2003). Unlike in philosophy, where ontology captures truth about the world and its entities, in information technology ontologies have a smaller scope, capturing only the reality pertaining to the domain. With the progress of semantic technologies, ontologists have attempted to make data representation and integration more effective and to increase interoperability.
Bioinformatics has become increasingly associated with computational methods to store, model and analyse biological data. Two of the major challenges facing bioinformatics are (1) data management and (2) knowledge discovery (Sidhu,A.S. 2007). Both challenges raise issues of how to represent data, how to aid practical use and how to make data more amenable to computation. The desire to interpret and represent data in a form that helps meet these challenges has led bioinformaticians to the field of ontologies.
Ontologies in biomedicine range from simple lists of terms, e.g. the SOFG Anatomy Entry List (SAEL) (Parkinson,H. 2004), to expressive sources of knowledge, e.g. the Gene Ontology (GO) (Ashburner,M. 2000). In the simple lists of terms, the class names can be used to annotate data and to aid in interpreting experimental results. The highly expressive ontologies are built on a large scale, contain detailed biomedical knowledge and support powerful computer reasoning and applications (Nigam Shah, 24th July 2008).
Ontologies in bioinformatics have greatly improved the way biological data are handled and represented. The success of the Gene Ontology (Bada, M. et al., 2004 & Gene Ontology Consortium 2006) and the establishment of the Open Biomedical Ontologies (OBO) consortium (Smith,B. 2007) have resulted in a large number of open source biomedical ontologies.
The OBO Foundry
The success of GO gave rise to the establishment of the Open Biomedical Ontologies (OBO) consortium in 2001 (Smith,B. 2007), which aims to coordinate the development of ontologies. Ontologies are intended to aid data interoperability and the capture of domain knowledge. However, the increasing ease with which an ontology can be developed has also led to a proliferation of bioontologies, some of which overlap in scope. This can lead to duplication of effort and inconsistencies, and it complicates integration.
To tackle this issue, the OBO Foundry was established as a comprehensive body for the developers of life-science ontologies (Smith,B. 2007). The OBO Foundry promotes the development of ontologies using the key principles that have supported the success of the GO, namely that ontologies be open access, orthogonal (that is, non-overlapping in scope), instantiated in a well-specified syntax (such as OBO or OWL) and designed to share a common space of identifiers. In essence, the OBO format offers a simpler approach than that adopted in, for instance, OWL, removing the need to understand certain elements of description logic axioms. As a consequence, however, OBO lacks the expressivity offered by OWL and therefore does not take advantage of what an axiom-based description logic language offers, such as computational reasoning and inference.
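To give a feel for the simpler OBO flat-file syntax mentioned above, the following is a minimal sketch of reading `[Term]` stanzas in Python. The sample text, its `EX:` identifiers and the helper name `parse_obo_terms` are illustrative assumptions rather than part of any official OBO tooling, and the sketch deliberately ignores most of the format (header tags, escape sequences, trailing modifiers).

```python
from typing import Dict, List, Optional

# Illustrative sample in OBO flat-file style; the identifiers are made up.
SAMPLE_OBO = """\
[Term]
id: EX:0000001
name: cell

[Term]
id: EX:0000002
name: neuron
is_a: EX:0000001 ! cell
"""


def parse_obo_terms(text: str) -> List[Dict[str, List[str]]]:
    """Collect tag-value pairs from each [Term] stanza (hypothetical helper)."""
    terms: List[Dict[str, List[str]]] = []
    current: Optional[Dict[str, List[str]]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if current is None or ":" not in line:
            continue
        tag, value = line.split(":", 1)
        # Drop the trailing "! comment" that OBO files carry for readability.
        value = value.split("!", 1)[0].strip()
        current.setdefault(tag.strip(), []).append(value)
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
for term in terms:
    print(term["id"][0], term["name"][0], term.get("is_a", []))
```

Each term is little more than an identifier, a name and a list of parents, which is why the format is easy to adopt but offers less to a reasoner than OWL axioms do.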
There are already a number of bioontologies hosted on the OBO Foundry repository, covering domain areas such as biochemistry (ChEBI) (Berkeley Bioinformatics Open Source Project September 20, 2009), phenotype (PATO) (Berkeley Bioinformatics Open Source Project September 20, 2009) and a large array of anatomies including cells (Cell) (Berkeley Bioinformatics Open Source Project September 20, 2009), fly (Drosophila Gross Anatomy) (Berkeley Bioinformatics Open Source Project September 20, 2009) and mouse (Mouse Adult Gross Anatomy) (Berkeley Bioinformatics Open Source Project September 20, 2009). Developers of these ontologies commit themselves to following the OBO Foundry principles and to providing procedures for user feedback and for identifying successive versions. However, tracking the continuous evolution of these ontologies and providing information about changes remains an unaddressed issue for most of them. A method for tracing these changes would provide a useful service to the bioontology-consuming community, and to the ontology community more generally.
Features of Ontology
A major feature of an ontology is the capability to describe conceptual objects as well as tangible objects. A conceptual object is an ideal object which does not exist in the real world, whereas a tangible object is one which is present and occupies some time or space. This feature is debatable for ontologies in the biological domain, in the sense that such ontologies should not model objects that do not exist in the real world and should represent only entities that exist (Nigam Shah, 24th July 2008).
Ontologies typically use the subsumption (also known as "is-a") relation, which allows a parent-child relation to be represented. The parent-child relation is of particular importance in fields such as biology, where taxonomic knowledge involving superclass/subclass relations has been in widespread use for many years. This is demonstrated in ontologies such as the Gene Ontology (Ashburner,M. 2000) and the Cell Ontology (Bard,J. 2005).
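The subsumption relation described above is transitive: if a motor neuron is a neuron and a neuron is a cell, then a motor neuron is a cell. The short Python sketch below illustrates this with a toy hierarchy; the class names and the helper functions `ancestors` and `is_subclass_of` are assumptions made for illustration and are not taken from any particular ontology.

```python
from typing import Dict, List, Set

# Toy is-a hierarchy (child -> direct parents); class names are illustrative only.
IS_A: Dict[str, List[str]] = {
    "motor neuron": ["neuron"],
    "neuron": ["cell"],
    "cell": ["anatomical entity"],
    "anatomical entity": [],
}


def ancestors(term: str, is_a: Dict[str, List[str]]) -> Set[str]:
    """Return every class that subsumes `term`, following is-a links transitively."""
    found: Set[str] = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(is_a.get(parent, []))
    return found


def is_subclass_of(child: str, parent: str, is_a: Dict[str, List[str]]) -> bool:
    """True if `parent` is reachable from `child` through one or more is-a links."""
    return parent in ancestors(child, is_a)


print(sorted(ancestors("motor neuron", IS_A)))
print(is_subclass_of("motor neuron", "cell", IS_A))
```

A query for everything annotated with "cell" can then also return items annotated with any of its subclasses, which is the main practical benefit of the parent-child structure.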
Specifically, ontologies are used in the following ways:
Reference for naming things
By naming objects, we establish a set of controlled terms for labelling entities in databases and datasets. This helps researchers to interpret and analyse large online datasets. Such datasets can contain synonymous terms and abbreviations which all refer to the same object, so merging datasets in a uniform way, when similar entities are labelled differently in different resources, is quite a challenge. An ontology provides a single name (the class name) for each entity it contains (though it can represent alternative names for that entity through the appropriate relations). Because it provides a reference for naming things, an ontology can also be applied to index the literature, improving search as well as text mining applications (Nigam Shah, 24th July 2008).
Large databases use a standard set of terms from an ontology for indexing data and linking them to other sources. For example, the NCI Thesaurus, developed by the National Cancer Institute, is designed to integrate molecular and clinical cancer-related information and is used by many databases (Hartel,F.W. 2005).
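The idea of one class name with alternative names attached can be sketched in a few lines of Python. The vocabulary below, and the helper names `build_lookup` and `normalise`, are hypothetical and chosen only for illustration; they are not drawn from the NCI Thesaurus or any other resource.

```python
from typing import Dict, List, Optional

# Hypothetical controlled vocabulary: one preferred class name plus its synonyms.
VOCABULARY: Dict[str, List[str]] = {
    "neoplasm": ["tumour", "tumor", "neoplasia"],
    "myocardial infarction": ["heart attack", "MI"],
}


def build_lookup(vocabulary: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every preferred name and synonym (lower-cased) to the preferred name."""
    lookup: Dict[str, str] = {}
    for preferred, synonyms in vocabulary.items():
        for label in [preferred, *synonyms]:
            lookup[label.lower()] = preferred
    return lookup


def normalise(label: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the preferred class name for a free-text label, or None if unknown."""
    return lookup.get(label.strip().lower())


LOOKUP = build_lookup(VOCABULARY)
records = ["Tumor", "heart attack", "Neoplasm", "fracture"]
print([normalise(record, LOOKUP) for record in records])
```

Labels that map to the same preferred name can then be merged or indexed together, while labels that return `None` are candidates for manual curation.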
Representation of encyclopaedic knowledge
Complex biological knowledge recorded in texts is accessible to humans but not to computers. Ontologies are therefore developed to make this vast body of knowledge explicit and accessible to both scientists and machines. The representation of encyclopaedic knowledge in ontologies has applications in many biological domains, ranging from help in basic research to decision support. For example, Dameron and colleagues developed a numeric and symbolic representation of brain cortex anatomy as a reference ontology which could be reused in applications such as decision support in neurosurgery, teaching, and sharing of neuroimaging data for research purposes (Dameron,O. 2002).
Specification of information models
Information models, such as database schema diagrams, UML diagrams or entity-relationship diagrams, can be built by reusing existing ontologies. The use of ontologies for building models of biological information and databases has several advantages. Ontologies support automated reasoning over classes and make the connections between different data types in databases explicit. They also provide a definite specification of the terms used to convey information in the biological domain. Complex database and knowledge-base schemas can be visualized in an intuitive manner, since tools such as Protégé (Stanford Center for Biomedical Informatics Research 2009) contain visualization tools that enable developers to create graphical visualizations of schemas (Noy,N.F. 2003). If the ontology is in the Web Ontology Language (OWL) format, the representation is Semantic Web compatible (Horrocks, I. 2009). Information models also provide standard terms for annotating microarray experiments when they are submitted to public repositories, and give a clear description of how the experiments were carried out.
Representation of semantics of data for information integration
As the amount of biological data available online is massive, ontologies can simplify the process of integrating and retrieving data across various resources. Researchers can combine assorted data across different databases by specifying the semantics of the data in each database. This can be done by linking the shared characteristics of entities in the databases. Computer reasoning programs can be applied to ontologies to determine whether two objects in different databases refer to the same biomedical entity. Thus, an ontology-based framework can facilitate the exchange, integration and validation of information. TAMBIS is a project that aimed to provide transparent access to disparate biological databases and analysis tools, enabling users to access and virtually integrate a wide range of biomedical resources (Stevens,R. 2000). The creators of TAMBIS developed an application that uses the TAMBIS ontology to enable users to formulate a query across a set of diverse biomedical sources, providing a means to virtually integrate these resources.
Computer reasoning with data
Computer reasoning comprises methods that use an ontology to draw conclusions based on the knowledge it contains, along with any additional information or facts available.
Presently, it is not easy to combine the current knowledge about biological systems and to hypothesize models spanning a large number of proteins and genes (Kuchinsky A. 2009). It is hard to ascertain whether theories are supported by the data, to revise conflicting theories and to work out the implications of the modified theories (Karp,P.D. 2001). This situation makes it apparent that new tools need to be developed which make use of formal methods to query and analyse the information at hand (Gifford,D.K. 2001). Fedoroff et al. (Fedoroff N. 2005) suggested formal representational systems appropriate for representing models of biological systems, and computational tools that can manipulate, check, and use these models to make predictions and form explanations. HyBrow (Hypothesis Browser) is a system for the representation, manipulation and integration of diverse biological data, such as gene expression, protein interactions and annotations, with prior biological knowledge for the purpose of evaluating alternative hypotheses. HyBrow's purpose is to evaluate and rank hypotheses based on user-defined 'rules' and on consistency with all information available to it (Racunas,S.A. 2004). Another way in which computer reasoning with ontologies has been generalized to other domains is by encoding classification criteria in explicit ontology-based form. Ontologies can represent the classification criteria explicitly using the logic formalisms that some ontology languages provide, such as OWL, which offers the necessary expressivity.
Languages in Bio Ontologies
Like any ontology language, OWL is designed to explicitly represent the meaning of terms from a domain and the relationships that hold between those terms.
An OWL ontology consists of a set of axioms (statements of truth) which place restrictions on sets of classes and the types of relationships that hold between them. These axioms make use of two types of properties: object properties and data type properties. Object properties form relations between one object and another, while data type properties relate objects to data type values. For these properties, multiple domains and ranges may be declared, and inverse properties and equivalences are permitted. OWL also permits property restrictions stating that class X satisfies certain conditions, that is, that all instances of X satisfy the conditions, resulting in a rich array of property relations. These features remove ambiguity in ontological representation and require languages that are amenable to reasoning by computer algorithms. Reasoning with ontological axioms is important for applications attempting to perform such an automated process, hence a language whose syntax supports reasoning offers advantages.
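One consequence of declaring domains and ranges is worth illustrating: in OWL these declarations are used for inference, so asserting that two individuals are linked by an object property lets a reasoner conclude what classes they belong to. The plain-Python sketch below mimics only that single inference step. The property `encodes`, the classes and the individuals are hypothetical, and the helper `infer_types` is an assumption for illustration; a real OWL reasoner does far more than this.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical object property declarations: property -> (domain class, range class).
OBJECT_PROPERTIES: Dict[str, Tuple[str, str]] = {
    "encodes": ("Gene", "Protein"),
}

# Hypothetical assertions of the form (subject, property, object).
ASSERTIONS: List[Tuple[str, str, str]] = [
    ("geneA", "encodes", "proteinA"),
]


def infer_types(
    assertions: List[Tuple[str, str, str]],
    properties: Dict[str, Tuple[str, str]],
) -> Dict[str, Set[str]]:
    """Infer class membership of individuals from domain and range declarations."""
    types: Dict[str, Set[str]] = {}
    for subject, prop, obj in assertions:
        if prop not in properties:
            continue  # undeclared property: nothing can be inferred here
        domain, range_ = properties[prop]
        types.setdefault(subject, set()).add(domain)
        types.setdefault(obj, set()).add(range_)
    return types


INFERRED = infer_types(ASSERTIONS, OBJECT_PROPERTIES)
print(INFERRED)
```

Nothing in the input states that `geneA` is a gene; the classification follows from the property declaration alone, which is the kind of conclusion that makes an axiom-based language attractive for automated processing.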
Description logics are characterised by the construction of complex concepts from simpler ones, which can be achieved incrementally by adding more axioms to a model. This fits neatly with the process of building an ontology, which is essentially concerned with capturing statements of truth about a domain. Editing tools such as Protégé have empowered the community to use languages such as OWL readily, without requiring a deep understanding of description logics.
The GO stands as probably the first and, to date, the most successful of the bioontologies currently used in bioinformatics. The success of the GO was dependent upon several important factors, such as community involvement, limited scope, continuous evolution and active curation, and the early use of the ontology (Bada, M. et al., 2004).
Continuous and active development has meant that the ontology has evolved over time, with version control used to monitor releases as classes are added or modified. This is an important consideration and relevant to the work of this thesis. As a very actively curated and ever-changing ontology, the GO is one example where the ability to keep abreast of class changes would be a useful asset when employing it in an application.
Presently, GO has reached a considerable size of more than 16,500 terms linked to databases holding over 120,000 gene products (Harris,M.A. 2008). This helps users to find more details regarding a particular gene, enriching the knowledge associated with the gene. It is therefore essential to provide a mechanism for protecting GO-based annotations from inconsistencies, errors or error propagation.
A tool called GOChase was developed for use with databases using GO concepts, to flag terms that may have been incorrectly used for annotations because they have been changed in more recent GO versions (Park,Y.R. 2005). Over 200 inconsistencies due to the use of obsolete or altered terms were found, highlighting the potential of such a tool. GOChase, however, has only a limited scope: databases using other bioontologies are likely to suffer from similar inconsistencies through the use of old versions of ontology classes, as are applications that use bioontologies for purposes other than database annotation. Whatizit (Rebholz-Schuhmann,D. 2008) is a web-based application which extracts information of interest from a document by text mining. It identifies biological terms and connects them to public databases. Gene2diseases (Perez-Iratxeta,C. 2002) is a publicly available database of candidate genes for mapped inherited human diseases, in which the GO terms linked with a particular inherited disease can be viewed.
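The general idea of checking annotations against a newer ontology release, as described for GOChase above, can be sketched as follows. This is not GOChase's actual algorithm; the `EX:` identifiers, the annotation list and the helper `flag_annotations` are all hypothetical and serve only to show the shape of such a check.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical snapshot of a newer ontology release.
CURRENT_TERMS: Set[str] = {"EX:0001", "EX:0003"}
# Obsolete term -> replacement term, or None when no replacement is given.
OBSOLETE_TERMS: Dict[str, Optional[str]] = {"EX:0002": "EX:0003", "EX:0004": None}

# Hypothetical annotations: (gene product, term identifier).
ANNOTATIONS: List[Tuple[str, str]] = [
    ("geneA", "EX:0001"),
    ("geneB", "EX:0002"),
    ("geneC", "EX:0004"),
    ("geneD", "EX:0009"),
]


def flag_annotations(
    annotations: List[Tuple[str, str]],
    current_terms: Set[str],
    obsolete_terms: Dict[str, Optional[str]],
) -> List[Tuple[str, str, str]]:
    """Return (gene product, term, reason) for annotations needing attention."""
    flagged: List[Tuple[str, str, str]] = []
    for product, term in annotations:
        if term in current_terms:
            continue  # annotation is still valid in the newer release
        if term in obsolete_terms:
            replacement = obsolete_terms[term]
            if replacement:
                reason = f"obsolete, replaced by {replacement}"
            else:
                reason = "obsolete, no replacement"
        else:
            reason = "unknown term"
        flagged.append((product, term, reason))
    return flagged


FLAGGED = flag_annotations(ANNOTATIONS, CURRENT_TERMS, OBSOLETE_TERMS)
for row in FLAGGED:
    print(row)
```

Because the check depends only on term identifiers and their status, the same pattern could in principle be applied to annotations made with other bioontologies, which is the wider scope argued for above.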
The Ontology for Biomedical Investigations (OBI) is an integrated ontology for the description of biological and clinical investigations (OBI Consortium 2009). The ontology includes a set of 'universal' terms that are applicable across various biological and technological domains, and domain-specific terms relevant only to a given domain. The efforts of OBO have made bio-ontologies reliable in representing data and rich in annotations, enhancing queries to aid end users (researchers) in their interpretation of the available data (Smith,B. 2007; Bodenreider,O. 2006 & Stevens,R. 2000).
Although the expression of biological knowledge has become simpler with the development of ontologies, accuracy has become a problem with the explosion of data. Because the development of ontologies is time-consuming and labour-intensive, there is an urgent need for approaches that increase accuracy while reducing manual effort. Text mining is an extension of data mining and is widely used in the life sciences. It is an option for the automated creation of ontologies or ontology-like knowledge bases, and a variety of tools are being developed to build and maintain ontologies using text as the corpus. Brewster and colleagues (Brewster,C. 2009) developed a text mining approach to build an ontology from texts for the animal behaviour domain, using OWL as the semantic language. The approach was simple and attempts to build an ontology in a reduced time span from journal text taken from an Elsevier database. However, its downside was the amount of noise in the results (i.e. reduced accuracy) and the effort involved in excluding terms that belong to other domains. Another application that deals with the acquisition of knowledge from text is the Text2Onto tool (Cimiano,P et al 2005). This application is a redesigned version of its predecessor, TextToOnto (Maedche, A. 2004). Earlier attempts to automate the ontology learning process from a set of textual data include TextToOnto, OntoLearn (Navigli R., et al 2003), the Mo'k Workbench (Bisson G. 2000) and the Asium system. These tools suffered shortcomings because their designs depended on very specific ontology models which could not always be translated to semantic formalisms (Cimiano,P et al 2005).
These drawbacks have been overcome in Text2Onto through three features: an ontology-model-independent platform; the Probabilistic Ontology Model (POM), which represents the results together with their probabilities; and data-driven change discovery algorithms, which detect changes in the corpus and modify the POM accordingly. The application has a user-centric architecture: it is equipped with a graphical user interface featuring corpus management, a workflow editor, configuration dialogues for the algorithms and graph-based POM visualisations (presenting the results to the user according to a varying confidence threshold).
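The notion of presenting learned results according to a confidence threshold can be illustrated with a small sketch. This is not Text2Onto's data model or API; the `Candidate` type, the example entries and the function `visible` are hypothetical, and the confidence values are invented for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A learned ontology element tagged with a confidence value in [0, 1]."""
    kind: str
    label: str
    confidence: float


# Hypothetical learning results; the values are invented for illustration.
POM: List[Candidate] = [
    Candidate("concept", "foraging", 0.92),
    Candidate("subclass-of", "foraging -> behaviour", 0.71),
    Candidate("concept", "figure", 0.18),
]


def visible(pom: List[Candidate], threshold: float) -> List[Candidate]:
    """Keep candidates at or above the threshold, most confident first."""
    kept = [candidate for candidate in pom if candidate.confidence >= threshold]
    return sorted(kept, key=lambda candidate: candidate.confidence, reverse=True)


for candidate in visible(POM, threshold=0.5):
    print(candidate.kind, candidate.label, candidate.confidence)
```

Raising the threshold hides noisy candidates such as the low-scoring "figure" entry, while lowering it exposes more of the learned model for manual review.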
Applications in Bioinformatics
ArrayExpress is a standards-compliant public database based at the European Bioinformatics Institute (EBI) which contains data from microarray gene expression experiments (Parkinson,H. 2007). Upon submission, data are manually curated by a team of database curators before being made public in the database. This curation step involves checking the data, for example, for the presence of raw and processed data and for the accuracy and completeness of the biological information provided, including the presence of experimental factors and their values in the sample annotation. Gene expression profiles can be queried by gene names and properties, such as Gene Ontology (GO) terms, and these profiles can then be visualized.
In recent times, EBI has developed the Experimental Factor Ontology (EFO) (Malone J. 2008). EFO is an application-focused ontology developed using the experimental variables from the ArrayExpress database (Parkinson,H. 2007) and aimed at improving the annotations in the ArrayExpress repository. The ontology is built semi-automatically, with mappings to a range of other ontologies such as the NCI Thesaurus (Fragoso,G. 2004), the Disease Ontology (Dyck, P. 2003), the Cell Type Ontology (Bard,J. 2005) and the Zebrafish Anatomy and Development (ZFA) ontology (Sprague,J. 2003), and it is used to annotate data in ArrayExpress. The software used to process ArrayExpress gene expression data provides the use case for this thesis. This project aims at a formal representation of the software used in biomedical investigations, and we aim to produce an ontology which is compatible with OBI and which could be merged with OBI in the future.
Gene Expression Atlas
The Gene Expression Atlas (ArrayExpress Atlas) is a semantically enriched database which has evolved from the ArrayExpress gene expression data archive. It enables scientists to search and compare gene expression data according to disease, cell type and tissue conditions (EMBL-EBI). It aims at integrating ontologies for high-quality annotation of genes and provides summarized reports for analysing the data.
Aims of this Project:
A software ontology is needed to describe the data types, algorithms and software implementations currently used in experimental biology. The only existing software ontology has several problems and is not currently orthogonal to other OBO Foundry efforts such as the Ontology for Biomedical Investigations (OBI). Therefore, this thesis will deal with
- evaluation of the existing software ontology
- application of text mining methodology to extend the software ontology
- restructuring of the software ontology using an upper level ontology to enhance interoperability
- validation of the ontology using data and use cases from ArrayExpress.
References
- Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M. and Sherlock, G., 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat.Genet., 25, 25-29.
- Bada, M., Stevens, R., Goble, C., Gil, Y., Ashburner, M., Blake, J., Cherry, M., Harris, M., Lewis, S., 2004. A short study on the success of the Gene Ontology. 1(2), 235-240.
- Bard, J., Rhee, S.Y. and Ashburner, M., 2005. An ontology for cell types. Genome Biol., 6, R21.
- Berkeley Bioinformatics Open Source Project, September 20, 2009. The Open Biomedical Ontologies. [online]. Available at: http://www.obofoundry.org/ [accessed September 20, 2009].
- Bisson G., Nedellec C and Canamero D., 2000. Designing clustering methods for ontology building: The Mo'K workbench. In: Staab S., et al, eds. Proceedings of the Workshop on Ontology Learning, 2000.
- Bodenreider, O. & Stevens, R., 2006. Bio-ontologies: current trends and future directions. Brief Bioinform, 7, 256-274.
- Brewster, C., Jupp, S., Luciano, J., Shotton, D., Stevens, R.D. and Zhang, Z., 2009. Issues in learning an ontology from text. BMC Bioinformatics, 10 Suppl 5, S1.
- Cimiano, P. and Volker, J., 2005. Text2Onto - A Framework for Ontology Learning and Data-driven Change Discovery.
- Cimino, J.J. & Zhu, X., 2006. The practical impact of ontologies on biomedical informatics. Yearb.Med.Inform., 124-135.
- Dameron, O., Gibaud, B., Burgun, A. and Morandi, X., 2002. Towards a sharable numeric and symbolic knowledge base on cerebral cortex anatomy: lessons learned from a prototype. Proc.AMIA.Symp., 185-189.
- Degtyarenko, K., de Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcantara, R., Darsow, M., Guedj, M. and Ashburner, M., 2008. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res., 36, D344-50.
- Dyck, P. and Chisolm, R., 2003. Disease Ontology: Structuring Medical Billing Codes for Medical Record Mining and Disease Gene Association. Proceedings of the Sixth Annual Bio-ontologies Meeting, Brisbane, 2003.
- European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), September 2009. Gene Expression Atlas [online]. Available at: http://www.ebi.ac.uk/gxa/help/AboutAtlas
- Fedoroff N., Racunas S.A. and Shrager J., 2005. Making Biological Computing Smarter. The Scientist, 19(11), pp.20-21.
- Fragoso, G., de Coronado, S., Haber, M., Hartel, F. and Wright, L., 2004. Overview and Utilization of the NCI Thesaurus. Comp.Funct.Genomics, 5, 648-654.
- Gene Ontology Consortium, 2006. The Gene Ontology (GO) project in 2006. Nucleic Acids Res., 34, D322-6.
- Gifford, D.K., 2001. Blazing pathways through genetic mountains. Science, 293, 2049-2051.
- Gruber, T.R., 1991. The role of common ontology in achieving sharable, reusable knowledge bases. In: J. Allen, R. Fikes and E. Sandewall, eds. International Conference on Principles of Knowledge Representation and Reasoning, 1991, San Mateo, California: Morgan Kaufmann, pp.601-602.
- Harris, M.A., 2008. Developing an ontology. Methods Mol.Biol., 452, 111-124.
- Hartel, F.W., de Coronado, S., Dionne, R., Fragoso, G. and Golbeck, J., 2005. Modeling a description logic vocabulary for cancer research. J.Biomed.Inform., 38, 114-129.
- Hey, T. & Trefethen, A.E., 2005. Cyberinfrastructure for e-Science. Science, 308, 817-821.
- Horridge M., et al, 2009. The Manchester OWL Syntax. [online]. Available at: http://www.webont.org/owled/2006/acceptedLong/submission_9.pdf [accessed September 2009].
- Horrocks, I., 2009. An ontology language for the semantic Web. [online] 17(2), pp.74-75. Available from: http://www2.computer.org/portal/web/guest/home [accessed 18th August 2009].
- Karp, P.D., 2001. Pathway databases: a case study in computational symbolic theories. Science, 293, 2040-2044.
- Kuchinsky A., et al, 2009. Biological storytelling: a software tool for biological information organization based upon narrative structure. [online] 23(2), pp.4-5. Available from: http://portal.acm.org/citation.cfm?id=962185.962186 [accessed 1st September 2009].
- Maedche, A. & Staab S., 2004. Ontology Learning. In Staab S. & Studer R., eds, Handbook on Ontologies. Germany: Springer. 173-189.
- Malone J., et al, 2008. Developing an application focused experimental factor ontology: embracing the OBO Community. In Proc. of ISMB 2008 SIG meeting on Bio-ontologies, 2008.
- Mizoguchi R. and Ikeda M., 1997. Towards Ontology Engineering, Proc. of PACES/SPICIS, 1997, pp259-266.
- Navigli R., Velardi P., Gangemi A., 2003. Ontology Learning and its application to automated terminology translation. 18:1, 22-31.
- Nigam Shah, B.S., 24th July 2008. Tutorial on How to Make Useful Ontologies for Biomedicine. [online]. Available at: http://www.bioontology.org/wiki/index.php/How_to_Make_Useful_Ontologies_for_Biomedicine [accessed September 8, 2009].
- Noy, N.F., Crubezy, M., Fergerson, R.W., Knublauch, H., Tu, S.W., Vendetti, J. and Musen, M.A., 2003. Protege-2000: an open-source ontology-development and knowledge-acquisition environment. AMIA.Annu.Symp.Proc., 953.
- OBI Consortium, 2009. The Ontology for Biomedical Investigations. [online]. Available at: http://obi-ontology.org/page/Main_Page [accessed August 2009].
- Park, Y.R., Park, C.H. and Kim, J.H., 2005. GOChase: correcting errors from Gene Ontology-based annotations for gene products. Bioinformatics, 21, 829-831.
- Parkinson, H., Aitken, S., Baldock, R.A., Bard, J.B., Burger, A., Hayamizu, T.F., Rector, A., Ringwald, M., Rogers, J., Rosse, C., Stoeckert, C.J. and Davidson, D., 2004. The SOFG Anatomy Entry List (SAEL): An Annotation Tool for Functional Genomics Data. Comp.Funct.Genomics, 5, 521-527.
- Parkinson, H., Kapushesky, M., Shojatalab, M., Abeygunawardena, N., Coulson, R., Farne, A., Holloway, E., Kolesnykov, N., Lilja, P., Lukk, M., Mani, R., Rayner, T., Sharma, A., William, E., Sarkans, U. and Brazma, A., 2007. ArrayExpress--a public database of microarray experiments and gene expression profiles. Nucleic Acids Res., 35, D747-50.
- Perez-Iratxeta, C., Bork, P. and Andrade, M.A., 2002. Association of genes to genetically inherited diseases using data mining. Nat.Genet., 31, 316-319.
- Racunas, S.A., Shah, N.H., Albert, I. and Fedoroff, N.V., 2004. HyBrow: a prototype system for computer-aided hypothesis evaluation. Bioinformatics, 20 Suppl 1, i257-64.
- Rebholz-Schuhmann, D., Arregui, M., Gaudan, S., Kirsch, H. and Jimeno, A., 2008. Text processing through Web services: calling Whatizit. Bioinformatics, 24, 296-298.
- Rubin, D.L., Lewis, S.E., Mungall, C.J., Misra, S., Westerfield, M., Ashburner, M., Sim, I., Chute, C.G., Solbrig, H., Storey, M.A., Smith, B., Day-Richter, J., Noy, N.F. and Musen, M.A., 2006. National Center for Biomedical Ontology: advancing biomedicine through structured organization of scientific knowledge. OMICS, 10, 185-198.
- Sidhu, A.S., Dillon, T.S., Chang, E. and Chen, J.Y., 2007. Ontologies for bioinformatics. Int.J.Bioinform Res.Appl., 3, 261-267.
- Smith, B., 2003. Ontology. In Luciano Floridi, ed, Blackwell Guide to the Philosophy of Computing and Information. Oxford: Blackwell. 155-166.
- Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L.J., Eilbeck, K., Ireland, A., Mungall, C.J., OBI Consortium, Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S.A., Scheuermann, R.H., Shah, N., Whetzel, P.L. and Lewis, S., 2007. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat.Biotechnol., 25, 1251-1255.
- Sprague, J., Clements, D., Conlin, T., Edwards, P., Frazer, K., Schaper, K., Segerdell, E., Song, P., Sprunger, B. and Westerfield, M., 2003. The Zebrafish Information Network (ZFIN): the zebrafish model organism database. Nucleic Acids Res., 31, 241-243.
- Stanford Center for Biomedical Informatics Research, 2009. Protégé. [online]. Available at: http://protege.stanford.edu/ [accessed August 2009].
- Stevens, R., Baker, P., Bechhofer, S., Ng, G., Jacoby, A., Paton, N.W., Goble, C.A. and Brass, A., 2000. TAMBIS: transparent access to multiple bioinformatics information sources. Bioinformatics, 16, 184-185.