prep_data sheets

Contextual data about how the samples were prepared for sequencing, including how they were extracted, which molecular protocols were used, and how they were sequenced. Two prep_data sheets are provided: one for amplicon preps (amplicon_prep_data) and one for metagenomic preps (metag_prep_data). All of the terms in the metag_prep_data sheet are also in the amplicon_prep_data sheet. The first section of these sheets follows the format for an NCBI SRA upload and should NOT be rearranged or renamed. Each row is a separate sequencing library preparation, distinguished by a unique library_id. One sample from sample_prep can appear multiple times on these sheets if multiple marker genes were amplified or multiple replicate sequencing libraries were prepared.
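Because a sample_name may repeat across rows while every library_id must be unique, a quick check before upload can catch accidental duplicates. The following is a minimal sketch, assuming the sheet has been exported as tab-separated text; the function name duplicate_library_ids and the example rows are hypothetical and not part of the template.

```python
import csv
import io
from collections import Counter


def duplicate_library_ids(tsv_text):
    """Return the library_id values that occur on more than one row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter(row["library_id"] for row in reader)
    return sorted(lib for lib, n in counts.items() if n > 1)


# Hypothetical rows: one sample amplified with two marker genes (allowed),
# plus a third row that reuses an existing library_id (not allowed).
example = (
    "sample_name\tlibrary_id\ttarget_gene\n"
    "PROJ_R1_S1_5m_A\tlib001\t16S rRNA\n"
    "PROJ_R1_S1_5m_A\tlib002\t18S rRNA\n"
    "PROJ_R1_S2_5m_A\tlib002\t16S rRNA\n"
)

print(duplicate_library_ids(example))  # ['lib002']
```

An empty list means every row has its own library_id; repeated sample_name values are not reported because they are expected.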

Terms

Term	Definition	Required By
sample_name	Sample Name is a name that you choose for the sample. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. Every Sample Name from a single Submitter must be unique. Suggested format: PROJECT_REGION_STATION_DEPTH_REPLICATE	NCBI+OBIS
title	Short description that will identify the dataset on public pages. A clear and concise formula for the title would be like: {methodology} of {organism}: {sample info}	NCBI+OBIS
biosample_accession	BioSample accession from NCBI, provided after creating a biosample on NCBI, such as during the SRA submission process	Recommended
samp_vol_we_dna_ext	Volume (ml) or mass (g) of total collected sample processed for DNA extraction	Recommended
samp_mat_process	Any processing applied to the sample during or after retrieving the sample from the environment.	Recommended
size_frac	Filtering pore size used in sample preparation	Optional
library_id	Short unique identifier for the sequencing library. Each library_id must be unique.	NCBI
library_strategy		NCBI
library_source		NCBI
library_selection		NCBI
lib_layout	Specify whether to expect single, paired, or other configuration of reads	NCBI
platform		NCBI
instrument_model		NCBI
design_description	Free-form description of the methods used to create the sequencing library; a brief 'materials and methods' section.	NCBI
drive_location	Internal storage location of sequencing files	Recommended
sra_accession	Provide the NCBI SRA accession once generated by NCBI.	Recommended
date_dna_extracted	Add date that DNA was extracted. Used for internal data management.	Internal
extraction_personnel	Add names of personnel who did extraction, separated by a space \|	Internal
date_pcr	Add date that PCR was run. Used for internal data management.	Internal
pcr_personnel	Add names of personnel who did PCR, separated by a space \|	Internal
seq_facility	Name of facility that did the sequencing	Recommended
seq_meth	Sequencer and read length	Recommended
nucl_acid_ext	A link to a literature reference, electronic resource or a standard operating procedure (SOP), that describes the material separation to recover the nucleic acid fraction from a sample	Recommended
target_gene	Targeted gene or marker name for marker-based studies	Recommended
target_subfragment	Name of subfragment of a gene or marker. Important to e.g. identify special regions on marker genes, like the hypervariable V6 region of the 16S rRNA gene	Recommended
pcr_primer_forward	Forward PCR primer that was used to amplify the sequence of the targeted gene, locus or subfragment.	Recommended
pcr_primer_reverse	Reverse PCR primer that was used to amplify the sequence of the targeted gene, locus or subfragment.	Recommended
pcr_primer_name_forward	Name of the forward PCR primer	Recommended
pcr_primer_name_reverse	Name of the reverse PCR primer	Recommended
pcr_primer_reference	Reference for the primers	Recommended
pcr_cond	Description of reaction conditions and components of PCR in the form of 'initial denaturation:94degC_1.5min; annealing=...'. Example: initial denaturation:94_3;annealing:50_1;elongation:72_1.5;final elongation:72_10;35	Recommended
nucl_acid_amp	A link to a literature reference, electronic resource or a standard operating procedure (SOP), that describes the enzymatic amplification (PCR, TMA, NASBA) of specific nucleic acids	Optional
adapters	Adapters provide priming sequences for both amplification and sequencing of the sample-library fragments. Both adapters should be reported, in uppercase letters	Recommended
mid_barcode	Molecular barcodes, called Multiplex Identifiers (MIDs), that are used to specifically tag unique samples in a sequencing run. Sequence should be reported in uppercase letters. MIxS term: mid	Optional
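The pcr_cond example in the table packs several values into one string, so it can help to see how it breaks apart. The sketch below is illustrative only: it assumes each step is written as name:temperature_duration, that temperature is in degrees Celsius and duration in minutes (as the 'degC' and 'min' in the definition suggest), and that a trailing bare number is the cycle count. The function name parse_pcr_cond is hypothetical.

```python
def parse_pcr_cond(value):
    """Split a pcr_cond string into named steps and an optional cycle count.

    Returns (steps, cycles), where steps maps each step name to a
    (temperature, duration) tuple of floats. Units are assumed, not
    stated in the string itself.
    """
    steps = {}
    cycles = None
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            name, setting = part.split(":", 1)
            temp, duration = setting.split("_", 1)
            steps[name.strip()] = (float(temp), float(duration))
        else:
            # Assumption: a field without a step name is the total cycle count.
            cycles = int(part)
    return steps, cycles


steps, cycles = parse_pcr_cond(
    "initial denaturation:94_3;annealing:50_1;elongation:72_1.5;"
    "final elongation:72_10;35"
)
print(steps["elongation"])  # (72.0, 1.5)
print(cycles)               # 35
```

A string that does not follow this pattern raises a ValueError, which can serve as a simple format check before the sheet is submitted.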