OMP ID format

From OMPwiki

Modified from the original Google Doc:

https://docs.google.com/document/d/1N3PQM4Prar9aZn5NlJCZJ0rCx7w_G_s2IbIBYXl8CO0/edit#heading=h.whbibz64jbk9

Prefixes

Pan-genome

OMP_PG: vs OMP_PGM vs ?

Jim and Michelle suggested OMP_TX (for taxon) or OMP_SP (for species)

Pan-gene

OMP_GN: vs OMP_PGN vs ?

Strain/Substrain

OMP_ST: vs OMP_STR vs ?

Allele

OMP_AL: vs OMP_ALL vs ?

Phenotype Annotation

OMP_AN: vs OMP_ANN vs ?


Term ID numbers

Decided not to use leading zeros.
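A minimal sketch of what the no-leading-zeros decision could look like in practice. The pattern below is an assumption, since the prefixes above are still undecided: it accepts any "OMP_" prefix followed by two or three capital letters, a colon, and digits. The names ID_PATTERN and normalize_id are hypothetical.

```python
import re

# Assumed ID shape: "OMP_" + 2-3 capital letters + ":" + digits.
# The actual prefixes are still under discussion, so this is a placeholder.
ID_PATTERN = re.compile(r"^(OMP_[A-Z]{2,3}):(\d+)$")


def normalize_id(term_id: str) -> str:
    """Return the ID with leading zeros stripped from the numeric part."""
    match = ID_PATTERN.match(term_id)
    if match is None:
        raise ValueError(f"not an OMP-style ID: {term_id!r}")
    prefix, number = match.groups()
    # int() drops any leading zeros; an all-zero number becomes "0".
    return f"{prefix}:{int(number)}"


print(normalize_id("OMP_ST:00004"))
print(normalize_id("OMP_ST:62"))
```

An ID that is already free of leading zeros passes through unchanged, so the function can be applied safely to mixed input.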

Pan-genome

  • Includes only the genomes of strains.
  • Includes three categories:
    1. Core genome: genes present in all strains
    2. Dispensable genome: genes present in two or more strains, but not all
    3. Unique genes: genes specific to a single strain
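The three categories above can be sketched as a simple count of how many strains carry each gene. This is only an illustration: the function name, the strain names, and the gene names are hypothetical. It assumes that a gene present in every strain counts as core rather than dispensable, so the dispensable category is non-empty only when there are at least three strains.

```python
def classify_genes(presence: dict[str, set[str]]) -> dict[str, str]:
    """Map each gene to 'core', 'dispensable', or 'unique'.

    presence maps a strain name to the set of genes found in that strain.
    """
    n_strains = len(presence)

    # Count the number of strains in which each gene occurs.
    counts: dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1

    categories: dict[str, str] = {}
    for gene, count in counts.items():
        if count == n_strains:
            categories[gene] = "core"         # present in all strains
        elif count == 1:
            categories[gene] = "unique"       # specific to a single strain
        else:
            categories[gene] = "dispensable"  # two or more strains, not all
    return categories


# Hypothetical input: three strains and four genes.
example = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2"},
    "strain_C": {"g1", "g4"},
}
result = classify_genes(example)
print(sorted(result.items()))
```

The core check runs first, so with a single strain every gene is reported as core.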


Strains and sub-strains

  • In EcoliWiki, the prefix for all strains and their derivatives is “Strain:”. Should we just omit this prefix in OMP?
    • Examples:

OMP_ST:00004 ! K-12 vs. OMP_ST:00004 ! Strain:K-12

OMP_ST:00062 ! MG1655 vs. OMP_ST:00062 ! Strain:MG1655


  • Should we make the distinction between a strain and its derivatives?

OMP_ST:00004 ! K-12

OMP_ST:00062 ! K-12_MG1655 vs. OMP_ST:00062 ! MG1655


  • E.coli_K-12_MG1655
  • We will have our own unique IDs and cross-reference them with NCBI.
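One way to keep both naming options open is to store the short strain name together with a link to its parent strain, and derive the longer label on demand. The sketch below is hypothetical: the Strain class, its field names, and strip_strain_prefix are not part of any agreed design. The xrefs list is left empty as a placeholder for NCBI cross-references. The IDs are copied from the examples above as written.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Strain:
    omp_id: str                                     # our own unique ID
    name: str                                       # short name, e.g. "MG1655"
    parent: Optional["Strain"] = None               # strain this one derives from
    xrefs: list[str] = field(default_factory=list)  # placeholder for NCBI cross-references

    def qualified_name(self, sep: str = "_") -> str:
        """Join the names of all ancestors and this strain, root first."""
        parts = []
        node: Optional["Strain"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return sep.join(reversed(parts))


def strip_strain_prefix(label: str) -> str:
    """Drop a leading 'Strain:' from a label if it is present."""
    prefix = "Strain:"
    return label[len(prefix):] if label.startswith(prefix) else label


k12 = Strain("OMP_ST:00004", strip_strain_prefix("Strain:K-12"))
mg1655 = Strain("OMP_ST:00062", strip_strain_prefix("Strain:MG1655"), parent=k12)

print(mg1655.name)
print(mg1655.qualified_name())
```

With this layout, the choice between "MG1655" and "K-12_MG1655" becomes a display decision rather than something fixed in the stored record.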