This page describes the relational database models used in the BioQ database documentation tools.

h1. Contents

{toc:exclude=Contents}

h1. The documentation schema

The file {bioq_svn_server}perl/dbdoc/db_doc_model.mwb{bioq_svn_server} is a MySQL Workbench document containing the schema and ER diagrams. The diagrams below were taken directly from this file.

h2. The core model

This entity-relationship (ER) diagram shows the core relational model for our database documentation (from subversion revision 70).

!core_model.png|thumbnail,border=1!

h2. The Biologic-Experiment-Result (BERT) relational model

Our BERT relational model is designed to document a genomic database by tracing the experimental source of the data. The key entities in the model are the biologics, the experiments and the experimental results, as illustrated in the following diagram.

!trace_table_legend.gif|thumbnail,border=1,width=200!

Here is an example of the BERT model applied to linkage disequilibrium data from the HapMap database.

!trace_db_bioq_hapmap_pr28_dbsnp132_table_asw_plink_ld.gif|thumbnail,border=1,width=200!

The following diagram shows our Biologic-Experiment-Result (BERT) relational model (from Subversion revision 70).

!bert_model.png|thumbnail,border=1!

h2. The queries model

This is our model for storing queries in the documentation database. This information is used to populate the BioQ queries for each database.

 !queries_model.png|thumbnail,border=1,width=200!

h1. The BioQ documentation database

The BioQ::Documentation (dbDoc) features, which allow investigators to browse genomic database documentation, are based on a single MySQL database ({color:#ff0000}{*}TODO: specify PHP variable name{*}{color}).  The current documentation database is _bioq_dbdoc_1_. Some ER diagrams for specific models in the database are shown above.

h2. dbDoc Tables

h3. db, tbl and col

These provide descriptions of the databases, tables and columns, respectively, in the BioQ genomic databases.

h3. db_refs, tbl_refs and col_refs

Bibliographic references.

h3. database_links, column_links, tbl_links

Web links.

{info}To do: we should rename these to db_links, etc. They are probably not referenced in code very much, if at all. See [http://deku.psych.wucon.wustl.edu:8081/browse/BIOQ-21]. {info}

h3. flow_group

In the BERT model, a flow group is a group of tables that can be input or output for a specific process.

!flow_group.gif|border=1!

h3. flow_group_tables

The tables in a flow group.

!flow_group_tables.gif|border=1!

h3. flow_group_tags

Tags that describe a flow group. These are not the same as filters.

h3. process

A process in the BERT model.

!process.gif|border=1!

h3. process_flow_group

A group of tables, columns and/or databases which are input/output for a process or experiment in the BERT model. It's possible to have multiple groups with the same name for different processes. Each process must have a group and groups may be singletons.

 !process_flow_group.gif|border=1!

h3. process_tags

Tags that describe a process. These can also be used as filters. No entries as of revision 70.

h3. query

A query used in the BioQ web application.

 !query.gif|border=1!

h3. query_column

This is used in [Process.pm|https://deku.psych.wucon.wustl.edu/svn/bioq/trunk/web/perl/dbdoc/Process.pm] to look up information about columns being queried.
{info}It's actually not clear what the purpose of query_column is. Need to make sure this is actually being used, and for what.{info}

h3. reference

Reference table showing things like Subversion revision number.

h3. relationships

For recording and describing relationships between tables in genomic databases. No entries as of revision 70.

h3. results_tables

For each table with table_type='result', determine if it can be traced to subjects and/or biologics.

h3. search_terms

Items used in the dbDoc autocomplete search tool.

{info}Need to improve how this table is populated. See [http://deku.psych.wucon.wustl.edu:8081/browse/BIOQ-18]. {info}

h3. tags

Tags are used to group tables into categories. The 'tags' table contains names and descriptions of the tags.

h3. tbl_tags

This records which tags go with which tables.
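
The relationship between _tags_ and _tbl_tags_ can be sketched as a simple join. This is a minimal illustration only: it uses an in-memory SQLite database as a stand-in for MySQL, and the column names and sample rows are assumptions, not the real dbDoc schema (which is defined in _db_doc_model.mwb_).

```python
import sqlite3

# Stand-in schema: column names here are hypothetical, not taken from dbDoc.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (tag_name TEXT PRIMARY KEY, description TEXT);
CREATE TABLE tbl_tags (
    db_name  TEXT,
    tbl_name TEXT,
    tag_name TEXT REFERENCES tags(tag_name)
);
-- Illustrative rows only.
INSERT INTO tags VALUES ('ld', 'Linkage disequilibrium results');
INSERT INTO tbl_tags VALUES ('hapmap', 'asw_plink_ld', 'ld');
INSERT INTO tbl_tags VALUES ('hapmap', 'ceu_plink_ld', 'ld');
""")


def tables_with_tag(conn, tag_name):
    """Return the names of all tables carrying the given tag, sorted by name."""
    rows = conn.execute(
        "SELECT t.tbl_name FROM tbl_tags t "
        "JOIN tags g ON g.tag_name = t.tag_name "
        "WHERE g.tag_name = ? ORDER BY t.tbl_name",
        (tag_name,),
    )
    return [row[0] for row in rows]


print(tables_with_tag(conn, "ld"))
```

Keeping tag names and descriptions in one table and the table-to-tag assignments in another lets a tag be described once and applied to any number of tables.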

h1. Miscellaneous Tasks

h2. Updating the dbDoc schema

# Dump the existing database
## Use mysqldump \--complete-insert \--no-create-db \--no-create-info \--verbose
## See /projects/bioinf/ssaccone/dbdoc_util/dump_dbdoc_data.sh
# If necessary, create the schema file (currently using [MySQL Workbench|http://wb.mysql.com/] with the file _db_doc_model.mwb_)
## The schema file is currently named db_doc_model.sql
## Note that the schema file should not have a 'USE' statement - instead use the dbdoc_util.pl _dbdoc-db_ option.
# Run _dbdoc_util.pl initdb_
## You may also use the predefined script _template_init.sh_ and the corresponding options file _template_init.opt_
# Load the original dumped database
# See also the script /projects/bioinf/ssaccone/dbdoc_util/update_schema.sh
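
The data-only dump in the first step can be sketched as follows. The four options are the ones listed above, and they are options of the mysqldump tool. The helper name is hypothetical, connection options (user, host, password) are deliberately left out, and this is not a substitute for the _dump_dbdoc_data.sh_ script, whose contents are not shown here.

```python
import shlex


def build_dump_command(db_name):
    """Return a mysqldump argument list for a data-only dump of db_name.

    --no-create-db and --no-create-info suppress the CREATE statements,
    so the dump can be loaded into a freshly initialised schema.
    """
    return [
        "mysqldump",
        "--complete-insert",   # INSERT statements name every column
        "--no-create-db",
        "--no-create-info",
        "--verbose",
        db_name,
    ]


cmd = build_dump_command("bioq_dbdoc_1")
# Print a copy-pasteable command line; redirect its output to a file when running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Because --complete-insert writes column names into each INSERT statement, the reload step is less sensitive to column order changes in the new schema.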

h2. Initializing documentation for a single database

This is used to take an existing genomic database, such as HapMap, and set up entries in the dbDoc database for all the tables and columns in the database. This will ensure that entries for the different tables and columns exist so that more detailed documentation can be added.

*Example:* see /projects/bioinf/ssaccone/hapmap/dbdoc/hapmap_p3r3_dbsnp132.sh
# Run dbdoc_util.pl initdoc (with options) to set up dbDoc for a specified existing genomic MySQL database
# Run dbdoc_util.pl updatedoc \--dbdoc-xml-file=<file> to read XML documentation
## Suggestion: break up the XML documentation into multiple files, such as a "core" file that does not change often and another file that describes recent changes. This is currently done for the HapMap databases.
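
The multi-file suggestion can be sketched as one _updatedoc_ call per XML file. The file names below are hypothetical, and running _updatedoc_ once per file is an assumption about how the files would be loaded. Any other options the script requires (for example the _dbdoc-db_ option) are omitted.

```python
def updatedoc_commands(xml_files, script="dbdoc_util.pl"):
    """Build one 'updatedoc' argument list per XML documentation file."""
    return [
        [script, "updatedoc", "--dbdoc-xml-file=" + path]
        for path in xml_files
    ]


# Hypothetical file names: a stable "core" file and a file of recent changes.
commands = updatedoc_commands(["hapmap_core.xml", "hapmap_changes.xml"])
for cmd in commands:
    print(" ".join(cmd))
```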

h2. Modifying documentation

* *Deletions*: at the moment (Subversion revision 29) a row can be deleted from the _db_ table and the deletion will propagate throughout the database.
* *Updates*: at the moment (Subversion revision 29) changes do not cascade well due to a circular foreign key in the process/flow_group tables. To change the name of a database, the documentation should be re-loaded from XML.
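
The deletion behaviour can be illustrated with cascading foreign keys. This is a sketch under stated assumptions: it uses an in-memory SQLite database rather than MySQL, the column names are hypothetical, and it covers only _db_, _tbl_ and _col_. It does not attempt to reproduce the circular foreign key in the process/flow_group tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Hypothetical columns; the real schema lives in db_doc_model.mwb.
CREATE TABLE db (db_name TEXT PRIMARY KEY);
CREATE TABLE tbl (
    db_name  TEXT REFERENCES db(db_name) ON DELETE CASCADE,
    tbl_name TEXT,
    PRIMARY KEY (db_name, tbl_name)
);
CREATE TABLE col (
    db_name  TEXT,
    tbl_name TEXT,
    col_name TEXT,
    FOREIGN KEY (db_name, tbl_name)
        REFERENCES tbl(db_name, tbl_name) ON DELETE CASCADE
);
INSERT INTO db  VALUES ('hapmap');
INSERT INTO tbl VALUES ('hapmap', 'asw_plink_ld');
INSERT INTO col VALUES ('hapmap', 'asw_plink_ld', 'r2');
""")


def count_rows(conn, table):
    """Count the rows in one of the fixed sketch tables."""
    return conn.execute("SELECT COUNT(*) FROM " + table).fetchone()[0]


# Deleting the database row removes its tables and columns as well.
conn.execute("DELETE FROM db WHERE db_name = ?", ("hapmap",))
conn.commit()
remaining = {t: count_rows(conn, t) for t in ("db", "tbl", "col")}
print(remaining)
```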