1. How should I cite mBiobank in my own publications?
  2. To date, we have published one research paper based on the data in mBiobank. If you use data from mBiobank, you should cite this paper in your own publications:

    Cao, Y., Li, L., Xu, M. et al. The ChinaMAP analytics of deep whole genome sequences in 10,588 individuals. Cell Res (2020). https://doi.org/10.1038/s41422-020-0322-9

  3. What are the details of the data included in mBiobank?
  4. Currently, mBiobank contains information on 136,745,826 SNPs and 10,703,115 INDELs on autosomes from the phase 1 study of the China Metabolic Analytics Project (ChinaMAP). The phase 1 study set out to analyze deep (~40X) whole genome sequencing data from 10,588 individuals in the Chinese natural population.

  5. Could you please show an example of a search result?
  6. Authorized users can query SNP and INDEL data by gene symbol, rs ID, genomic region, or genomic position. The search result table includes the chromosome name, position, rs ID in dbSNP v149, reference allele, alternate allele, variant quality, allele frequency in the 10,588 samples, gene symbol, affected transcript, and allele frequency in the 1000 Genomes database. The columns are:

    • Chr, Chromosome
    • Position, The reference position
    • dbSNP, The rs ID in dbSNP
    • Ref, Reference base
    • Alt, Alternate base
    • Qual, Phred-scaled quality score for the assertion made in ALT
    • Alt freq, Allele frequency of the alternate allele in the 10,588 samples
    • Count, Allele count in genotypes / Total number of alleles in called genotypes
    • Gene, Gene symbol
    • Transcript, Affected transcript
    • 1KGP_AF, Allele frequency in the 1KGP (1000 Genomes Project)
    • 1KGP_EAS_AF, Allele frequency of EAS populations in the 1KGP
    • 1KGP_AMR_AF, Allele frequency of AMR populations in the 1KGP
    • 1KGP_AFR_AF, Allele frequency of AFR populations in the 1KGP
    • 1KGP_EUR_AF, Allele frequency of EUR populations in the 1KGP
    • 1KGP_SAS_AF, Allele frequency of SAS populations in the 1KGP
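
To illustrate how the Count, Alt freq, and Qual columns relate, here is a minimal Python sketch. It assumes that Count is displayed as a string of the form "allele count / total alleles"; that layout, the function names, and the example values are hypothetical and are not taken from mBiobank itself. The Phred conversion follows the standard definition, Qual = -10 * log10(P), where P is the probability that the ALT assertion is wrong.

```python
def parse_count(count: str) -> tuple[int, int]:
    """Split a Count string such as '5294 / 21176' into (allele count, total alleles).

    The 'AC / AN' layout is an assumption made for this example.
    """
    ac_text, an_text = count.split("/")
    ac, an = int(ac_text.strip()), int(an_text.strip())
    if an <= 0 or not 0 <= ac <= an:
        raise ValueError(f"implausible allele counts: {count!r}")
    return ac, an


def alt_freq(count: str) -> float:
    """Alternate allele frequency: allele count divided by total called alleles."""
    ac, an = parse_count(count)
    return ac / an


def phred_to_error_prob(qual: float) -> float:
    """Convert a Phred-scaled quality score to the probability that the call is wrong."""
    return 10 ** (-qual / 10)


if __name__ == "__main__":
    # Made-up example values, not real mBiobank records.
    print(alt_freq("5294 / 21176"))   # 0.25
    print(phred_to_error_prob(30.0))  # about one chance in a thousand
```

Note that the total number of alleles can be smaller than twice the number of samples, because only called genotypes are counted.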

  7. What is the workflow of the NGS data analysis in mBiobank?
  8. The flowchart of data processing:

  9. What genome build is the ChinaMAP data based on?
  10. All data are based on GRCh38.