project_name	project_id	v_region	target_size	raw_data_source	local_processed_fasta	processed_fasta	num_samples	num_subjects	study_design_notes	area	attribute_data_type	attributes_values	short_desc	literature_source	sequencing_technology	study_design	original_mapping_file
Cho 2012	cho	V3	177	https://www.ncbi.nlm.nih.gov/bioproject/168618	MSI	http://metagenome.cs.umn.edu/public/MLRepo/fasta/cho2012.fasta.gz	95	47	Two sample types available	Antibiotics	Categorical	"Abx: Control, Penicillin, Chlortetracycline, Vancomycin, VancomycinPenicillin; Source: cecal, fecal"	"Mouse fecal and cecal samples, Control vs. 4 kinds of antibiotics"	http://www.ncbi.nlm.nih.gov/pubmed/22914093	454	Cross-Sectional	./datasets/cho/mapping-orig.txt
Claesson 2012	claesson	V4	221	https://qiita.ucsd.edu/download/13268	https://qiita.ucsd.edu/download/13265	https://qiita.ucsd.edu/download/13265	168	168		Age	Categorical	"AGE: elderly, young"	Elderly and young adults	https://www.ncbi.nlm.nih.gov/pubmed/22797518	454	Cross-Sectional	./datasets/claesson/mapping-orig.txt
David 2014	david	V4	282	https://www.mg-rast.org/linkin.cgi?project=mgp6248	MSI	http://metagenome.cs.umn.edu/public/MLRepo/fasta/david2014.fasta.gz	235	11	"Longitudinal data available, but subset by last day of intervention to compare groups"	Diet	"Categorical, Integer"	"Diet: Plant, Animal; Day: -4 to -1 (baseline), 0 to 4 (diet), 5 to 10 (washout)"	"Plant-based vs. Animal-based diet, Cross-over study"	https://www.ncbi.nlm.nih.gov/pubmed/24336217	Illumina MiSeq	Longitudinal	./datasets/david/mapping-orig.txt
Gevers 2014	gevers	V4	173	https://qiita.ucsd.edu/download/33520	https://qiita.ucsd.edu/download/33519	https://qiita.ucsd.edu/download/33519	1321	668		IBD	Categorical	"DIAGNOSIS: no, IC, CD, UC; BODY_SITE: UBERON:feces, UBERON:rectum, UBERON:colon, UBERON:ileum;"	Biopsies from IBD patients prior to treatment	https://www.ncbi.nlm.nih.gov/pubmed/24629344	Illumina MiSeq	Cross-Sectional	./datasets/gevers/mapping-orig.txt
HMP 2012	hmp	V35	527	https://qiita.ucsd.edu/download/64700	https://qiita.ucsd.edu/download/64699	https://qiita.ucsd.edu/download/64699	6407	242	Multiple body site samples per person	"Body Habitat, Gender"	Categorical	"SEX: female, male; HMPBODYSUBSITE: Tongue_dorsum, Left_Antecubital_fossa, Left_Retroauricular_crease, Anterior_nares, Subgingival_plaque, Hard_palate, Posterior_fornix, Stool, Throat, Right_Retroauricular_crease, Mid_vagina, Buccal_mucosa, Vaginal_introitus, Saliva, Right_Antecubital_fossa, Supragingival_plaque, Palatine_Tonsils, Attached_Keratinized_gingiva; HMPBODYSUPERSITE: Oral, Skin, Airways, Urogenital_tract, Gastrointestinal_tract"	Up to 18 body sites across 242 healthy subjects at 1-2 time points	https://www.ncbi.nlm.nih.gov/pubmed/22699609	454	Cross-Sectional	./datasets/hmp/mapping-orig.txt
Kostic 2012	kostic	V35	569	https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?study=SRP000383	MSI	http://metagenome.cs.umn.edu/public/MLRepo/fasta/montassier2016.fasta.gz	190	95	Samples are paired per person	Colorectal Cancer	Categorical	"DIAGNOSIS: Healthy, Tumor"	Adjacent Healthy vs. Tumor Colon Biopsy Tissues	https://www.ncbi.nlm.nih.gov/pubmed/22009990	454	Paired	./datasets/kostic/mapping-orig.txt
Montassier 2016	bacteremia	V56	280	https://www.ncbi.nlm.nih.gov/sra/SRX733464	MSI	http://metagenome.cs.umn.edu/public/MLRepo/fasta/montassier2016.fasta.gz	28	28		Bacteremia	Categorical	"Treatment: NObact, bact"	Patients prior to chemotherapy who did or did not develop bacteremia	https://www.ncbi.nlm.nih.gov/pubmed/27121964	454	Cross-Sectional	./datasets/bacteremia/mapping-orig.txt
Morgan 2012	sokol	V35	569	https://www.ncbi.nlm.nih.gov/bioproject/82111	MSI	http://metagenome.cs.umn.edu/public/MLRepo/fasta/morgan2012.fasta.gz	231	231	Cross-sectional study with different subjects for all sample types	IBD	Categorical	"ULCERATIVE_COLIT_OR_CROHNS_DIS: Crohn's disease, Healthy, Ulcerative Colitis; BODY_SITE: UBERON:ileal mucosa, UBERON:feces, UBERON:mucosa of descending colon"	"Healthy, Crohn's Disease, or Ulcerative Colitis patients"	https://www.ncbi.nlm.nih.gov/pubmed/23013615	454	Cross-Sectional	./datasets/sokol/mapping-orig.txt
Turnbaugh 2009	turnbaugh_twins	V2	230	https://qiita.ucsd.edu/download/6982	https://qiita.ucsd.edu/download/6979	https://qiita.ucsd.edu/download/6979	281	154		Obesity	Categorical	"OBESITYCAT: Lean, Overweight, Obese; TWIN_MOTHER: Twin, Mother; ZYGOSITY: DZ, MZ, NA"	"Monozygotic or dizygotic twin pairs concordant for BMI class, and their mothers"	http://www.ncbi.nlm.nih.gov/pubmed/19043404	454	Cross-Sectional	./datasets/turnbaugh_twins/mapping-orig.txt
Wu 2011	bushman_cafe	V12	244	https://qiita.ucsd.edu/download/2032	https://qiita.ucsd.edu/download/2029	https://qiita.ucsd.edu/download/2029	95	10	"Longitudinal data available, but subset by last day of intervention to compare groups"	Diet	Categorical	"DIET: HighFat, LowFat; DAY: 01 to 10"	Controlled HighFat or LowFat feeding on 10 subjects over 10 days	https://www.ncbi.nlm.nih.gov/pubmed/21885731	454	Longitudinal	./datasets/bushman_cafe/mapping-orig.txt
Yatsunenko 2012	yatsunenko	V4	282	https://www.mg-rast.org/linkin.cgi?project=mgp401	MSI	http://metagenome.cs.umn.edu/public/MLRepo/fasta/yatsunenko2012.fasta.gz	531	531	Infants should be analyzed separately from non-infants	"Geography, Age, Gender"	"Categorical, Continuous"	"COUNTRY: GAZ:Venezuela, GAZ:United States of America, GAZ:Malawi; AGE: continuous"	"Humans of varying ages from the USA, Malawi, and Venezuela"	http://www.ncbi.nlm.nih.gov/pubmed/22699611	Illumina MiSeq	Cross-Sectional	./datasets/yatsunenko/mapping-orig.txt
Ravel 2011	ravel	V12	240	https://www.ncbi.nlm.nih.gov/sra/SRA022855	/project/flatiron2/tonya/vaginal_otus/combined_seqs.fna	http://metagenome.cs.umn.edu/public/MLRepo/fasta/ravel2011.fasta.gz	396	396		Bacterial Vaginosis	"Categorical, Integer"	"Ethnic_Group: White, Black, Asian, Hispanic; Nugent_score_category: Low, Intermediate, High; Nugent_score; pH"	Vaginal samples from four ethnic groups nugent scores for bacterial vaginosis	https://www.ncbi.nlm.nih.gov/pubmed/20534435	454	Cross-Sectional	./datasets/ravel/mapping-orig.txt
Karlsson 2013	karlsson	NA	NA	http://www.ncbi.nlm.nih.gov/sra?term=ERP002469	/project/flatiron2/data/public_shotgun/karlsson2013/shizen_20161130/fasta/combined_seqs.fna	http://metagenome.cs.umn.edu/public/MLRepo/fasta/karlsson2013.fasta.gz	144	144		Diabetes	Categorical	"Classification: NGT, IGT, T2D"	"Patients with normal, impaired, or type 2 diabetes glucose tolerance categories"	https://www.ncbi.nlm.nih.gov/pubmed/23719380	Illumina HiSeq (shotgun)	Cross-Sectional	./datasets/karlsson/mapping-orig.txt
Qin 2012	qin2012	NA	NA	http://www.ncbi.nlm.nih.gov/sra?term=SRA045646; https://www.ncbi.nlm.nih.gov/sra?term=SRA050230	/home/knightsd/cutle051/data/public_shotgun/qin2012/shi7_170630b/combined_seqs.fna	http://metagenome.cs.umn.edu/public/MLRepo/fasta/qin2012.fasta.gz	134	134		Diabetes	Categorical	"Diabetic: Y, N"	Healthy vs type 2 diabetes Chinese patients	https://www.ncbi.nlm.nih.gov/pubmed/23023125	Illumina HiSeq (shotgun)	Cross-Sectional	./datasets/qin2012/mapping-orig.txt
Qin 2014	qin2014	NA	NA	https://www.ebi.ac.uk/ena/data/view/PRJEB6337	/project/flatiron2/data/public_shotgun/qin2014/shi7_20170417/qin2014_combined_seqs.kfn6	http://metagenome.cs.umn.edu/public/MLRepo/fasta/qin2014_combined_seqs.fna.gz	130	130		Cirrhosis	Categorical	"Cirrhotic: Cirrhosis, Healthy"	Cirrhosis versus healthy	https://www.ncbi.nlm.nih.gov/pubmed/25079328	Illumina HiSeq (shotgun)	Cross-Sectional	./datasets/qin2014/mapping-orig.txt