User Manual DB manager

From GBWiki

Phenyx DB Manager

This page describes the use of the Phenyx Databanks Manager.


What is the Phenyx DB Manager?

The Phenyx DB Manager installs, updates and maintains the sequence databanks that Phenyx queries. Its main features are:

  • Intuitive graphical interface in the Management Console
  • Databanks can be installed from various source formats (Uniprot dat, fasta, Phenyx precompiled, etc.)
  • Each databank can be installed in multiple versions, one of which is defined as the default
  • Source databanks can be amino acids or nucleotides
  • DB installer can create decoy versions using different approaches (reverse, shuffle, HMM, etc.)
  • No need to stop using Phenyx while installing a new version of a databank
  • Databanks can be attributed to all users or restricted to one user (see details on Private Databanks here)
  • Each user can create their own set of private databanks


How to use the DB manager?

Log in to Phenyx as the user default to install or manage a databank available to all users, or log in as a specific user to install or manage a databank private to that user. From the Desktop, click on the Management Console link image:Management-console-link.gif

Select the DB manager link in the management console

image:DB-manager-link.gif

Note: the Phenyx sysadmin must grant you the right to install and update databanks. Otherwise, the link is absent from your console.


Install a new Databank

Because a databank can be installed in several versions, the DB manager uses a hierarchy: a databank instance contains one or more databank versions. Adding a new databank therefore means creating a new databank instance and then adding a version to it.

Here is what the DB manager looks like when empty: image:DB-manager-01.gif


Instantiate a new DB

To create a new databank, you first need to instantiate it, then add a new version.

  • click on the new bank item
  • in the window that appears, provide a name (as you want it to appear in the job submission window; caution: use '_' to separate words, and do not use accented characters), specify the sequence type (amino acid or nucleotide), whether taxonomy is available, whether it is a decoy databank and, if so, which other databank it relates to (provided the referenced DB exists)

image:DB-manager-02.gif
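
The naming caution above can be checked before typing a name into the form. The following is a minimal Python sketch, not part of Phenyx: the function names `is_safe_bank_name` and `suggest_bank_name` are hypothetical, and the rule applied (ASCII letters, digits and '_' only) is an assumption that is stricter than what this page states. Phenyx may accept other characters.

```python
import re

# Assumption: only ASCII letters, digits and '_' are treated as safe.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_safe_bank_name(name: str) -> bool:
    """Return True if the name has no spaces, accents or other special characters."""
    return _SAFE_NAME.fullmatch(name) is not None


def suggest_bank_name(name: str) -> str:
    """Replace runs of whitespace with '_' (accented letters are left untouched)."""
    return re.sub(r"\s+", "_", name.strip())


print(is_safe_bank_name("my_yeast_db"))   # a name using '_' as separator
print(suggest_bank_name("my yeast db"))   # spaces turned into '_'
```

A name that still fails `is_safe_bank_name` after `suggest_bank_name` most likely contains an accented or special character and should be retyped by hand.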

The DB appears in the tree representation. image:DB-manager-03.gif

Note: if the databank name is already used by another user, the new databank will not appear in the tree. Try another name.


Add a new version to a DB

You can now add a new version of this DB: select new version, select the installer of your choice, and follow the corresponding instructions:

  • If you select the uniprot(.dat) installer, you can either upload a local file in Uniprot format (browse), or enter/select a url for the source file (for instance, you can select ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz for UniprotKB/Swiss-Prot from the -pre defined urls- menu). Also provide a release name/number. For Uniprot, click on the "current" link to find the current release number: this downloads the release description file. Copy the release description into the release field (for instance: 56.6 of 16-Dec-2008). The default value is the current date in the format YYYYMMDD. The value entered in the form will appear unchanged in the submission page. Click on Submit. The new version appears in the tree with a rotating red status spot, which turns green once the DB preparation is finished. If it does not, you can manually refresh the tree by clicking on the image:DB-manager-09.gif icon.


image:DB_manager_5modif.gif
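
As a small illustration of the default release value mentioned above (the current date as YYYYMMDD), the following Python sketch formats a date the same way. The function name `default_release_label` is hypothetical; the DB manager fills in this default by itself.

```python
from datetime import date


def default_release_label(day: date) -> str:
    """Format a date as YYYYMMDD, the layout of the default release value."""
    return day.strftime("%Y%m%d")


print(default_release_label(date(2008, 12, 16)))  # prints 20081216
```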


  • If you select the fasta installer, you can either upload a local file or enter a url for the source file. The fasta file can already be in the Phenyx-specific format, or it can be parsed using a regular expression. The Phenyx fasta format resembles a regular fasta, but the information fields in the header follow a fixed layout. Example of a header line in this format: >SWN:PWP1_HUMAN \ID=PWP1_HUMAN \DE=PERIODIC TRYPTOPHAN PROTEIN 1 HOMOLOG (KERATINOCYTE PROTEIN IEF).

Specify a release name/number as it will appear in the submission page and click on "submit".

image:DB_manager_6bis.gif
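
To make the layout of a Phenyx fasta header more concrete, here is a minimal Python sketch that splits the example header shown above into its parts. It is an illustration only: the function name `parse_phenyx_header` is hypothetical, and the assumption that every field starts with a backslash, an upper-case key and '=' is inferred from that single example, not from a format specification.

```python
import re

# Assumption: each field is introduced by a backslash, an upper-case key and '='.
_FIELD = re.compile(r"\\([A-Z]+)=")


def parse_phenyx_header(line: str) -> dict:
    """Split a header line into its first token ('AC') and its keyed fields."""
    body = line.lstrip(">").strip()
    # re.split with one capturing group yields [first_token, key1, value1, key2, value2, ...]
    parts = _FIELD.split(body)
    fields = {"AC": parts[0].strip()}
    for key, value in zip(parts[1::2], parts[2::2]):
        fields[key] = value.strip()
    return fields


header = r">SWN:PWP1_HUMAN \ID=PWP1_HUMAN \DE=PERIODIC TRYPTOPHAN PROTEIN 1 HOMOLOG (KERATINOCYTE PROTEIN IEF)."
for key, value in parse_phenyx_header(header).items():
    print(key, "=", value)
```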


Here are examples of regular expressions:

example 1:

>sp|P68250|1433B_BOVIN 14-3-3 protein beta/alpha OS=Bos taurus GN=YWHAB PE=1 SV=2
AAAAAAAA
regular expression:
s/^sp\|(\w+?)\|(.*)/$1 \\DE=$2/

example 2:

>PA0001 gi|15595199|ref|NP_064721.1| chromosomal replication initiator protein DnaA [Pseudomonas_aeruginosa_PAO1]
AAAAAA
regular expression:
s/(\w+)\s*(.*)/$1 \\DE=$2/
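
The two rules above can be tried outside Phenyx before an installation. The following Python sketch mirrors them with `re.sub`; it is an approximation, since the rules are written in Perl-style s/// syntax. Two assumptions are made: the rule is applied to the header text without its leading ">" (suggested by the `^sp` anchor in example 1), and Python's regular expression engine behaves like the one Phenyx uses for these simple patterns. The function name `apply_rule` is hypothetical.

```python
import re


def apply_rule(header_line: str, pattern: str, replacement: str) -> str:
    """Apply one substitution to a header, like a Perl s/pattern/replacement/."""
    # Assumption: the rule sees the header without its leading '>'.
    text = header_line[1:] if header_line.startswith(">") else header_line
    return re.sub(pattern, replacement, text, count=1)


# Perl's "$1 \\DE=$2" is written r"\1 \\DE=\2" in a Python replacement template.
header1 = ">sp|P68250|1433B_BOVIN 14-3-3 protein beta/alpha OS=Bos taurus GN=YWHAB PE=1 SV=2"
result1 = apply_rule(header1, r"^sp\|(\w+?)\|(.*)", r"\1 \\DE=\2")

header2 = (">PA0001 gi|15595199|ref|NP_064721.1| chromosomal replication "
           "initiator protein DnaA [Pseudomonas_aeruginosa_PAO1]")
result2 = apply_rule(header2, r"(\w+)\s*(.*)", r"\1 \\DE=\2")

print(result1)
print(result2)
```

With these inputs, example 1 yields `P68250` followed by a `\DE=` field that starts with the entry name `1433B_BOVIN`, and example 2 yields `PA0001` followed by a `\DE=` field holding the rest of the header.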

Note: it is recommended to test a regular expression on a small databank (3 entries, for instance) and check the result before moving on to a large one. To check the result, double-click on a version and download the headers.zip file.
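
One way to obtain such a small test databank is to cut the first few entries out of the full fasta file. The Python sketch below does this on a list of lines; the function name `first_entries` and the sample entries are hypothetical, and reading from or writing to real files is left to the caller.

```python
from typing import Iterable, List


def first_entries(lines: Iterable[str], n: int = 3) -> List[str]:
    """Return the lines of the first n fasta entries (header plus sequence lines)."""
    kept: List[str] = []
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n:
                break  # the (n+1)-th header marks the end of the subset
        if seen:  # ignore anything that precedes the first header
            kept.append(line)
    return kept


sample = [">entry1 first\n", "AAAA\n", ">entry2\n", "CCCC\n", "DDDD\n",
          ">entry3\n", "EEEE\n", ">entry4\n", "FFFF\n"]
print("".join(first_entries(sample, 3)), end="")
```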


  • If you select the precompiled installer, it will upload and install a databank available in the precompiled, Phenyx-archived format. You can either browse for the local archive, or use the -pre defined urls- menu to access our repository of precompiled databanks (ftp), then click on "submit".

image:DB_manager_7modif.gif


  • If you select the fasta content installer, you can copy and paste a fasta-formatted list of amino acid sequences, and provide a release/version value. This feature is useful to search a limited number of entries.

image:DB_manager_8bis.gif

Note: The format can be either a simple fasta file such as:

  >sequence1
  DFRTACNMKRFSTDDPHGGST
  >sequence2
  DFRTACNMKRFSTDDPHGGST

The parsing rule is basic: the AC consists of all characters after ">" up to the first space. All subsequent text on the header line is ignored, so no ID or description is included.
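
The basic parsing rule can be sketched in a few lines of Python. This is an illustration of the rule as stated on this page, not the Phenyx code: the function name `parse_simple_fasta` is hypothetical, and details such as the handling of duplicate ACs or blank lines are assumptions.

```python
from typing import Dict, Iterable, Optional


def parse_simple_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each AC to its sequence, keeping only the text before the first space."""
    entries: Dict[str, str] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are skipped
        if line.startswith(">"):
            # AC = everything after '>' up to the first space; the rest is ignored.
            current = line[1:].split(" ", 1)[0]
            entries[current] = ""
        elif current is not None:
            entries[current] += line  # sequences may span several lines
    return entries


text = """>sequence1 this description is ignored
DFRTACNMKRFSTDDPHGGST
>sequence2
DFRTACNMKRFSTDDPHGGST
"""
print(parse_simple_fasta(text.splitlines()))
```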

The format can also be the Phenyx fasta format.


  • If you select the NCBI(nr) formatted installer, it will upload and install both the nr.gz file and the corresponding file containing the proteins' gi and taxid information (the gi_taxid_prot.dmp file). Are you uploading an NCBI fasta file with sorted ACs? Then check the Sorted data box to further reduce the calculation time on the databank.

image:DB_manager_09bis.gif

Note: if you want to install a database in NCBI format that is available at another ftp address, change the target address. If the file is local to the Phenyx server (not on the client computer), you can also use file:///path/database_filename
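
The file:/// form above is simply the prefix file:// followed by the absolute path of the file on the server. The Python sketch below builds such a url; the function name `local_file_url` and the example path are hypothetical, and whether the DB manager expects special characters to be percent-encoded (as done here) is an assumption.

```python
from urllib.parse import quote


def local_file_url(absolute_path: str) -> str:
    """Build a file:// url for an absolute path on the Phenyx server."""
    if not absolute_path.startswith("/"):
        raise ValueError("expected an absolute path starting with '/'")
    # quote() keeps '/' and percent-encodes characters such as spaces.
    return "file://" + quote(absolute_path)


print(local_file_url("/data/banks/nr.gz"))  # prints file:///data/banks/nr.gz
```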

Install a decoy Databank

A decoy databank is created from an existing "forward" databank and refers to it. You therefore need to create the "forward" version first. Then instantiate a new bank and provide a name, select the "is decoy?" option and choose the "forward" databank it relates to.

image:DB-manager-11.gif

Then create a new version. A new installer named "decoy" appears in the Installer menu. Provide a release name/number and select the decoying method of your choice.

image:DB-manager-12.gif
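
To give an idea of what two of the decoying methods do to a sequence, here is a minimal Python sketch of the reverse and shuffle approaches in their simplest textbook form. It does not reproduce the Phenyx implementation, whose details are not described on this page; the function names and the fixed random seed are assumptions made for the illustration.

```python
import random


def reverse_decoy(sequence: str) -> str:
    """Return the sequence read from the last residue to the first."""
    return sequence[::-1]


def shuffle_decoy(sequence: str, seed: int = 0) -> str:
    """Return the same residues in a random order (reproducible for a given seed)."""
    residues = list(sequence)
    random.Random(seed).shuffle(residues)
    return "".join(residues)


forward = "DFRTACNMKRFSTDDPHGGST"
print(reverse_decoy(forward))  # prints TSGGHPDDTSFRKMNCATRFD
print(shuffle_decoy(forward))
```

Both methods keep the length and amino acid composition of the forward sequence, which is why the resulting entries can serve as a comparable set of false targets.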

Change default version

The active version of the DB is the one whose checkbox is checked. To change it, click on the checkbox of another version. The new version becomes active and is used as the default in new identification jobs.

image:DB-manager-13.gif

Archive a Databank version, visualize its content

If you want to export and archive a DB, double-click on a DB version. You can decide to either download the full archive in zip format, or to download only the entry headers in a phenyx fasta format. The installer log will be helpful in case of errors.

image:DB_manager_10bis.gif

Delete DBs

If you want to delete a DB version or a DB instance (together with all its DB versions), drag and drop the corresponding DB onto the trash icon. Be careful: this erases the files on the disk!
