User Manual DB manager
From GBWiki
Phenyx DB Manager
This page describes the use of the Phenyx Databanks Manager.
What is the Phenyx DB Manager?
The Phenyx DB Manager installs, updates and maintains the sequence databanks queried by Phenyx. Its main features are:
- Intuitive graphical interface in the Management Console
- Databanks can be installed from various source formats (Uniprot dat, fasta, Phenyx precompiled, etc.)
- Each databank can be installed in multiple versions, one of which is defined as the default
- Source databanks can be amino acids or nucleotides
- DB installer can create decoy versions using different approaches (reverse, shuffle, HMM, etc.)
- No need to stop using Phenyx while installing a new version of a databank
- Databanks can be made available to all users or restricted to one user (see the details on Private Databanks)
- Each user can create their own set of private databanks
How to use the DB manager?
Log in to Phenyx as the user default to install or manage a databank available to all users, or log in as a specific user to install or manage a databank private to that user.
From the Desktop, click on the Management Console link
Select the DB manager link in the management console
Note: the Phenyx sysadmin must grant you the right to install and update databanks. Otherwise, the link is absent from your console.
Install a new Databank
Because a databank can be installed in several versions, the DB manager uses a two-level hierarchy: a databank instance contains one or more databank versions. Adding a new databank therefore means creating a new databank instance and then adding a version to it.
Here is what the DB manager looks like when empty:
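The instance/version hierarchy can be pictured with a small data model. This is only an illustrative sketch: the class name `DatabankInstance`, its fields and the bank name `uniprot_sprot` are hypothetical and do not describe how Phenyx stores databanks internally.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatabankInstance:
    """Hypothetical model of one databank instance and its versions."""
    name: str
    versions: List[str] = field(default_factory=list)  # release names
    default_version: Optional[str] = None

    def add_version(self, release: str) -> None:
        # Each version of an instance is identified by its release name.
        if release in self.versions:
            raise ValueError(f"version {release!r} already exists")
        self.versions.append(release)

    def set_default(self, release: str) -> None:
        # Only an installed version can become the default one.
        if release not in self.versions:
            raise ValueError(f"unknown version {release!r}")
        self.default_version = release


bank = DatabankInstance("uniprot_sprot")
bank.add_version("56.6 of 16-Dec-2008")
bank.add_version("20090101")
bank.set_default("56.6 of 16-Dec-2008")
```

An instance without any version is valid in this model, which mirrors the two-step workflow: instantiate first, then add versions.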
Instantiate a new DB
To create a new databank, first instantiate it, then add a new version.
- click on the new bank item
- in the window that appears, provide a name (as you want it to appear in the job submission window; caution: use '_' to separate words, accents are not allowed), specify the sequence type (amino acid or nucleotide), whether taxonomy is available, and whether the databank is a decoy version; if it is, select the databank it relates to (if the databank it refers to already exists)
The DB appears in the tree representation.
Note: if the databank name is already used by another user, it will not appear in the tree. Try another name.
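The naming caution above (use '_' between words, no accents) can be checked before typing a name into the form. The sketch below assumes that only ASCII letters, digits and '_' are safe; the manual does not give the exact character set, and both function names are hypothetical.

```python
import re
import unicodedata

# Assumption: ASCII letters, digits and '_' only (not stated in the manual).
_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def is_valid_bank_name(name: str) -> bool:
    """Return True if the name has no spaces, accents or other symbols."""
    return _NAME_RE.fullmatch(name) is not None


def suggest_bank_name(label: str) -> str:
    """Turn a free-text label into a candidate databank name."""
    # Decompose accented characters, then drop the non-ASCII accent marks.
    decomposed = unicodedata.normalize("NFKD", label)
    ascii_label = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace every run of other characters with a single underscore.
    return re.sub(r"[^A-Za-z0-9]+", "_", ascii_label).strip("_")


print(is_valid_bank_name("my_human_db"))      # valid
print(is_valid_bank_name("my human db"))      # invalid: contains spaces
print(suggest_bank_name("Levure protéome"))
```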
Add a new version to a DB
You can now add a new version of this DB: select new version, select the installer of your choice, and follow the corresponding instructions:
- If you select the uniprot(.dat) installer, you can either upload a local file in Uniprot format (browse), or enter or select a URL for the source file (for instance, select ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz for UniprotKB/Swiss-Prot from the -pre defined urls- menu). Also provide a release name/number. For Uniprot, click on the "current" link to find the current release number: this downloads the release description file. Copy the release description into the release field (for instance: 56.6 of 16-Dec-2008). The default value is the current date in YYYYMMDD format. The value entered in the form will appear unchanged in the submission page. Click on Submit. The new version appears in the tree with a rotating red status spot, which turns green once the DB preparation is finished. If it does not, you can manually refresh the tree by clicking on the refresh icon.
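For reference, the default release value (current date as YYYYMMDD) corresponds to the following formatting. The function name `default_release_label` is hypothetical and only illustrates the date format.

```python
from datetime import date


def default_release_label(day=None):
    """Format a date as YYYYMMDD; uses today's date when none is given."""
    day = day or date.today()
    return day.strftime("%Y%m%d")


print(default_release_label(date(2008, 12, 16)))  # prints 20081216
```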
- If you select the fasta installer, you can either upload a local file or enter a URL for the source file. The fasta file can already be in the Phenyx-specific format, or it can be parsed using a regular expression. The Phenyx fasta format resembles a regular fasta, but the information fields in the header are formatted. Example of a header line in this format: >SWN:PWP1_HUMAN \ID=PWP1_HUMAN \DE=PERIODIC TRYPTOPHAN PROTEIN 1 HOMOLOG (KERATINOCYTE PROTEIN IEF).
Specify a release name/number as it will appear in the submission page and click on "submit".
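To make the header layout concrete, here is a minimal sketch that splits a Phenyx fasta header into its leading token and its backslash-prefixed fields. It assumes that every field has the form `\KEY=value` and that the first whitespace-delimited token is the entry identifier; only the `ID` and `DE` keys appear in the example above, and the function name is hypothetical.

```python
import re

# One "\KEY=value" field; the value runs up to the next field or end of line.
_FIELD_RE = re.compile(r"\\(\w+)=(.*?)(?=\s+\\\w+=|$)")


def parse_phenyx_header(line):
    """Return (leading_token, fields) for one Phenyx fasta header line."""
    body = line.strip()
    if body.startswith(">"):
        body = body[1:]
    token, _, rest = body.partition(" ")
    fields = {m.group(1): m.group(2).strip() for m in _FIELD_RE.finditer(rest)}
    return token, fields


header = (r">SWN:PWP1_HUMAN \ID=PWP1_HUMAN "
          r"\DE=PERIODIC TRYPTOPHAN PROTEIN 1 HOMOLOG (KERATINOCYTE PROTEIN IEF).")
token, fields = parse_phenyx_header(header)
print(token)
print(fields["ID"])
print(fields["DE"])
```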
Here are examples of regular expressions:
example 1:
>sp|P68250|1433B_BOVIN 14-3-3 protein beta/alpha OS=Bos taurus GN=YWHAB PE=1 SV=2
AAAAAAAA
regular expression: s/^sp\|(\w+?)\|(.*)/$1 \\DE=$2/
example 2:
>PA0001 gi|15595199|ref|NP_064721.1| chromosomal replication initiator protein DnaA [Pseudomonas_aeruginosa_PAO1]
AAAAAA
regular expression: s/(\w+)\s*(.*)/$1 \\DE=$2/
Note: it is recommended to test a regular expression on a small databank (3 entries, for instance) and check the result before moving to a large one. To check the result, double-click on a version and download the headers.zip file.
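A rule can also be previewed offline before installing even a small databank. The sketch below translates the two substitutions above into Python's `re` module. It assumes that the leading ">" is removed before the rule is applied (the first rule is anchored on `^sp`); this is an assumption about the installer, not documented behaviour, and `apply_rule` and `to_phenyx` are hypothetical helper names.

```python
import re


def to_phenyx(match):
    """Build the replacement '$1 \\DE=$2' from the two captured groups."""
    return f"{match.group(1)} \\DE={match.group(2)}"


def apply_rule(header, pattern):
    """Apply one substitution rule to a fasta header line."""
    # Assumption: the installer sees the header without its leading '>'.
    text = header[1:] if header.startswith(">") else header
    # count=1 mirrors a Perl s/// without the /g modifier.
    return re.sub(pattern, to_phenyx, text, count=1)


ex1 = apply_rule(
    ">sp|P68250|1433B_BOVIN 14-3-3 protein beta/alpha "
    "OS=Bos taurus GN=YWHAB PE=1 SV=2",
    r"^sp\|(\w+?)\|(.*)",
)
ex2 = apply_rule(
    ">PA0001 gi|15595199|ref|NP_064721.1| chromosomal replication "
    "initiator protein DnaA [Pseudomonas_aeruginosa_PAO1]",
    r"(\w+)\s*(.*)",
)
print(ex1)
print(ex2)
```

Under these assumptions, the first rule keeps P68250 as the leading token and moves the rest of the line into the \DE field; the second does the same with PA0001.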
- If you select the precompiled installer, it will upload and install a databank available in the precompiled, Phenyx-archived format. You can either browse for the local archive, or use the -pre defined urls- menu to access our precompiled databanks repository (ftp), then click on "submit".
- If you select the fasta content installer, you can copy and paste a fasta-formatted list of amino acid sequences, and provide a release/version value. This feature is useful to search a limited number of entries.
Note: The format can be either a simple fasta file such as:
>sequence1
DFRTACNMKRFSTDDPHGGST
>sequence2
DFRTACNMKRFSTDDPHGGST
The parsing rule is basic: the AC includes all characters after ">" up to the first space. All subsequent text on the header line is ignored. No ID or description is included.
The format can also be the Phenyx fasta format.
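The basic parsing rule for simple fasta content can be sketched as follows. This is an illustration of the rule as described, not the installer's code; joining sequences that span several lines is an assumption taken from common fasta practice, and the function name is hypothetical.

```python
def parse_simple_fasta(text):
    """Return a list of (AC, sequence) pairs from simple fasta content."""
    entries = []
    ac, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if ac is not None:
                entries.append((ac, "".join(chunks)))
            # The AC is everything after '>' up to the first space;
            # the rest of the header line is ignored.
            ac = line[1:].split(" ", 1)[0]
            chunks = []
        elif ac is not None:
            # Assumption: a sequence may span several lines.
            chunks.append(line)
    if ac is not None:
        entries.append((ac, "".join(chunks)))
    return entries


content = """>sequence1 this text is ignored
DFRTACNMKRFSTDDPHGGST
>sequence2
DFRTACNMKRF
STDDPHGGST
"""
for ac, seq in parse_simple_fasta(content):
    print(ac, seq)
```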
- If you select the NCBI(nr) formatted installer, it will upload and install both the nr.gz file and the corresponding file with the proteins' gi and taxid information (gi_taxid_prot.dmp file). If you are uploading an NCBI fasta file with sorted ACs, check the Sorted data box to further improve the calculation time on the databank.
Note: if you want to install a database in NCBI format that is available at another ftp address, change the target address. If the file is local to the Phenyx server (not on the client computer), you can also use file:///path/database_filename
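A file URL of this form can be derived from an absolute path on the server. The sketch below is only a convenience for building the string; the path `/data/databases/nr.gz` and the function name are hypothetical.

```python
from urllib.parse import quote


def server_file_url(path):
    """Build a file:/// URL from an absolute path on the Phenyx server."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path on the server")
    # quote() keeps '/' and percent-encodes characters such as spaces.
    return "file://" + quote(path)


print(server_file_url("/data/databases/nr.gz"))  # prints file:///data/databases/nr.gz
```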
Install a decoy Databank
A decoy databank is created from an existing "forward" databank and refers to it. Therefore, you first need to create the "forward" version. Then, instantiate a new bank and provide a name, select the "is decoy?" option and choose the "forward" databank it relates to.
Then create a new version. A new installer named "decoy" appears in the Installer menu. Provide a release name/number and select the decoying method of your choice.
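To illustrate what two of the decoying methods mean for a single sequence, here is a minimal sketch of the reverse and shuffle approaches. It shows the general techniques only; it is not the Phenyx decoy installer, whose exact behaviour (and the HMM method) is not described on this page. The function names and the `seed` parameter are hypothetical.

```python
import random


def reverse_decoy(sequence):
    """Reverse decoy: the forward sequence read from end to start."""
    return sequence[::-1]


def shuffle_decoy(sequence, seed=0):
    """Shuffle decoy: same residues as the forward sequence, random order."""
    residues = list(sequence)
    # A seeded generator keeps the decoy reproducible between runs.
    random.Random(seed).shuffle(residues)
    return "".join(residues)


forward = "DFRTACNMKRFSTDDPHGGST"
print(reverse_decoy(forward))  # prints TSGGHPDDTSFRKMNCATRFD
print(shuffle_decoy(forward))
```

Both methods preserve the length and amino acid composition of the forward sequence.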
Change default version
The active version of the DB is checked. If you want to change it, click on another version checkbox. The new version becomes active and can be used as default in new identification jobs.
Archive a Databank version, visualize its content
If you want to export and archive a DB, double-click on a DB version. You can decide to either download the full archive in zip format, or to download only the entry headers in a phenyx fasta format. The installer log will be helpful in case of errors.
Delete DBs
If you want to delete a DB version or a DB instance (and also all associated DB versions), drag and drop the corresponding DB to the trash icon. Be careful, it will erase the files on the disk!











