Consensus lists of such terms from different knowledge domains are collectively known as Knowledge Organization Systems (KOS) and can range from simple glossaries and dictionaries (or controlled vocabularies) through to more complex classification schemes, taxonomies, thesauri, gazetteers and ontologies. Here we present the Ontogrator web application, in which we have used a set of KOS to demonstrate how data can be marked up to create informative facets for search and discovery. We are particularly interested in the use of ontologies as facets. Ontologies can be loosely defined as sets of concepts or terms together with explicit relationships between them. Perhaps the best-known example of an applied ontology in the field of molecular biology is the Gene Ontology (GO) [7], but a wide range of ontologies is now available [8], opening up a range of options for future aggregation and Ontogration of data.
Material and methods

The Ontogrator web application provides a JavaScript GUI (graphical user interface) running within a web browser. This web application fetches data on demand from a back-end comprising a set of REST (representational state transfer) web services supported by a LAMP (Linux, Apache, MySQL, PHP) software stack. A MySQL database is constructed and indexed specifically to support the functions of the browser GUI. The back-end service (see Figure 1) performs the following key functions:

A – Data acquisition: ingestion of raw data from primary sources;
B – Semantic indexing: detecting concepts in the data using text mining;
C – Browsing services: providing the client with an efficient concept-based retrieval service;
D – Data and facet updates: periodic refreshing of the underlying resources.
Figure 1 The Ontogrator platform. Ontologies, or other KOS, and selected content are processed for use in Ontogrator. After data acquisition and annotation (semantic indexing), browsing services enable exploration and discovery through the web application.

A – Data acquisition

Data to be imported are converted to tabular format and pre-processed using a PHP script that is customized for each data source. This script identifies which columns should be scanned for terms and constructs a unique identifier for each record. For example, a data resource with a habitat column would be marked for matching against the Environment Ontology.
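The per-source pre-processing step described above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the PHP used by the platform, and the `SourceConfig` structure, the `preprocess` function, the column names and the sample rows are all hypothetical, chosen only to show how columns might be flagged for scanning and how a record identifier might be built.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SourceConfig:
    """Hypothetical per-source settings; not Ontogrator's actual format."""
    name: str                      # short label for the data source
    id_columns: List[str]          # columns combined into a unique record id
    scan_columns: Dict[str, str]   # column name -> ontology to match against


def preprocess(tabular_text: str, config: SourceConfig) -> List[dict]:
    """Read tab-separated rows, build a record id, and list the text to scan."""
    reader = csv.DictReader(io.StringIO(tabular_text), delimiter="\t")
    records = []
    for row in reader:
        # Unique identifier: source name plus the values of the id columns.
        record_id = config.name + ":" + "_".join(row[c] for c in config.id_columns)
        # Only non-empty cells in flagged columns are queued for annotation.
        to_scan = [
            {"column": col, "ontology": onto, "text": row[col]}
            for col, onto in config.scan_columns.items()
            if row.get(col)
        ]
        records.append({"id": record_id, "fields": dict(row), "to_scan": to_scan})
    return records


# Made-up sample data with a habitat column marked for the Environment Ontology.
sample = "accession\thabitat\nS001\tmarine sediment\nS002\tfreshwater lake\n"
config = SourceConfig(
    name="demo",
    id_columns=["accession"],
    scan_columns={"habitat": "Environment Ontology"},
)
records = preprocess(sample, config)
for rec in records:
    print(rec["id"], rec["to_scan"])
```

Keeping the source-specific decisions in a small configuration object means that everything downstream of this step can be shared across data sources.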
Once the input processors have been constructed, the remainder of the processing is fully automated. The import scripts create appropriate tables in the back-end database to hold both the data and any hits found during semantic indexing.

B – Semantic indexing

Concept annotation is performed by Terminizer [9], an external web service that detects mentions of concepts from a given ontology in a given textual passage.
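As a rough illustration of how records and annotation hits might be stored and queried, the sketch below uses Python with an in-memory SQLite database in place of MySQL. The table layout, the `DEMO:` term identifiers and the `find_hits` function are assumptions made for this example. In particular, `find_hits` is a naive label matcher standing in for Terminizer. It does not call that service and does not reproduce its behaviour.

```python
import re
import sqlite3

# Hypothetical mini-vocabulary (term id -> label); not real ontology identifiers.
TERMS = {"DEMO:0001": "marine sediment", "DEMO:0002": "lake"}


def find_hits(text, terms):
    """Naive stand-in for concept annotation: case-insensitive whole-word matching."""
    hits = []
    for term_id, label in terms.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((term_id, m.start(), m.end()))
    return hits


conn = sqlite3.connect(":memory:")
# One table for the imported data and one for the hits found in it (assumed schema).
conn.executescript("""
CREATE TABLE record (id TEXT PRIMARY KEY, habitat TEXT);
CREATE TABLE hit (
    record_id   TEXT REFERENCES record(id),
    column_name TEXT,
    term_id     TEXT,
    start_pos   INTEGER,
    end_pos     INTEGER
);
CREATE INDEX hit_by_term ON hit (term_id);
""")

rows = [("demo:S001", "Marine sediment, 20 m depth"), ("demo:S002", "freshwater lake")]
conn.executemany("INSERT INTO record VALUES (?, ?)", rows)
for record_id, habitat in rows:
    for term_id, start, end in find_hits(habitat, TERMS):
        conn.execute(
            "INSERT INTO hit VALUES (?, ?, ?, ?, ?)",
            (record_id, "habitat", term_id, start, end),
        )

# Concept-based retrieval: every record annotated with a given term.
matched = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT record_id FROM hit WHERE term_id = ? ORDER BY record_id",
        ("DEMO:0002",),
    )
]
print(matched)
```

Indexing the hit table by term identifier reflects the idea of a database built specifically for concept-based lookups, although the real schema may differ.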