Initial steps before setting up a gene/disease specific database

Over 2000 gene/disease specific databases (LSDBs) already exist and are accessible on the internet. It is therefore highly recommended that a new database be created only when no database exists for a particular gene or disease, or when the existing database is no longer functional [1].

Duplicate databases are not only a potential waste of resources and of curators' time and effort, but can also have broader impacts: confusion for users, scattered data and data maintenance issues [2].

Gene/disease specific databases are only as complete, current and reliable as the curators maintaining them. Initiating a database involves a long-term commitment, with an ongoing requirement for maintenance work as new research and clinical data become available [1]. Long-term sustainability needs to be considered before a gene/disease specific database is established [1].

Background research

To prevent the creation of duplicate databases, a search of existing resources is essential:


A simple way to search for an existing database is to enter the following URL into your browser:, where GENESYMBOL corresponds to an HGNC-approved gene symbol.
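The substitution of a gene symbol for GENESYMBOL can be sketched in a few lines of Python. This is only an illustration: the lookup URL itself is not reproduced in this text, so `URL_TEMPLATE` below is a placeholder on the reserved `example.org` domain, and the function name `build_lookup_url` is hypothetical.

```python
from urllib.parse import quote

# ASSUMPTION: placeholder template only. Replace it with the real lookup URL;
# example.org is a reserved domain and hosts no database.
URL_TEMPLATE = "https://example.org/lsdb/{GENESYMBOL}"


def build_lookup_url(gene_symbol: str, template: str = URL_TEMPLATE) -> str:
    """Return the template with GENESYMBOL replaced by the given gene symbol."""
    symbol = gene_symbol.strip()
    if not symbol:
        raise ValueError("gene symbol must not be empty")
    # Percent-encode the symbol so unexpected characters cannot alter the URL.
    return template.replace("{GENESYMBOL}", quote(symbol, safe=""))


print(build_lookup_url("BRCA1"))  # https://example.org/lsdb/BRCA1
```

The sketch does not check that the symbol is actually HGNC approved; that check would still need to be made against HGNC.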


If a gene/disease specific database already exists for the gene/disease of interest, it is strongly recommended to collaborate with the current curators and offer assistance rather than duplicate their work [2].

If a database for the gene/disease of interest is found but is inactive or abandoned, attempts should be made to contact the listed curators to discuss reactivating the existing database before a new database is established [2].


The Leiden Open Variation Databases (LOVD) have operational databases for all genes. Curator positions at many of these databases are vacant, and curator access rights can be requested from the LOVD team.

Deciding to establish a database

It is advisable to contact the Human Variome Project International Coordinating Office and notify them of the intention to establish a new database. They may be aware of others working towards the same goal, in which case collaboration with these colleagues would be advised [2].

The Human Variome Project International Coordinating Office can be contacted via


When initiating a database, it is recommended to first define a governance structure to address the overall operation and management of the database [3].

Governance structures should at least:

  • Identify the custodian: The custodian (owner) is the person or entity accountable and responsible for the operation of the database [3].
  • Identify the curator (may also be the custodian): The curator's specific role is to oversee the management of the database [3]. Ideally, the curator is an expert in the field of the gene of interest [1].
  • Identify the database administrator: The database administrator is responsible for managing the infrastructure behind the data, such as installing and updating software, server maintenance, regular back-ups and data security [2].
More information on curation and the role of the curator can be found in the section of this guide on curation.
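As an illustration only, the three roles listed above could be recorded in a small structure that reports which roles are still unassigned. The class name `Governance`, its field names and the example holder are assumptions made for this sketch and are not prescribed by the cited guidelines.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Governance:
    """Hypothetical record of who holds each governance role."""
    custodian: Optional[str] = None
    curator: Optional[str] = None  # may be the same person/entity as the custodian
    database_administrator: Optional[str] = None

    def missing_roles(self) -> List[str]:
        """Return the names of roles that have not been assigned yet."""
        roles = {
            "custodian": self.custodian,
            "curator": self.curator,
            "database administrator": self.database_administrator,
        }
        return [name for name, holder in roles.items() if not holder]


# Example: custodian and curator are the same entity, administrator not yet named.
plan = Governance(custodian="Example Institute", curator="Example Institute")
print(plan.missing_roles())  # ['database administrator']
```

A check of this kind is merely a reminder that each role has a named holder before the database is launched; it does not replace a written governance agreement.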

Considerations for ongoing management

When initiating a new gene/disease specific database, it is important to consider its future management. At initiation, the time investment may not be a concern, but as circumstances change (finance, funding, research staff, etc.) maintaining the database may become problematic. Collaborative efforts are therefore advisable [2].

Consider the options of:

  • Additional curators to assist with workload and commitment, and provide added recognition in the field.
  • A larger-scale collaboration within the research community.


