WGS
NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. SenNet is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.
Version 1 (no longer accepting data)
Attribute | Type | Description | Allowable Values | Required |
---|---|---|---|---|
version | Allowable Value | Version of the schema to use when validating this metadata. | ['1'] | True |
description | Textfield | Free-text description of this assay. | | True |
source_id | Textfield | SenNet Display ID of the source of the assayed tissue. | | True |
tissue_id | Textfield | SenNet Display ID of the assayed tissue. | | True |
execution_datetime | Datetime | Start date and time of the assay, typically taken from a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, MM the month, DD the day, hh the hour, and mm the minutes, each with leading zeros. | | True |
protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
operator | Textfield | Name of the person responsible for executing the assay. | | True |
operator_email | Textfield | Email address for the operator. | | True |
pi | Textfield | Name of the principal investigator responsible for the data. | | True |
pi_email | Textfield | Email address for the principal investigator. | | True |
assay_category | Allowable Value | Each assay is placed into one of the following four general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
assay_type | Allowable Value | The specific type of assay being executed. | ['WGS'] | True |
analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['DNA'] | True |
is_targeted | Allowable Value | Specifies whether specific molecules are targeted for detection or measurement by the assay. | ['Yes', 'No'] | True |
acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
gdna_fragmentation_quality_assurance | Allowable Value | Is the gDNA integrity good enough for WGS? This is usually checked by running a gel. | ['Pass', 'Fail'] | True |
dna_assay_input_value | Numeric | Amount of DNA input into library preparation. | | True |
dna_assay_input_unit | Allowable Value | Units of DNA input into library preparation. | ['ug'] | False |
library_construction_method | Textfield | Describes the DNA library preparation kit: the modality of isolating gDNA, fragmentation, and generating sequencing libraries. | | True |
library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used. | | True |
library_layout | Allowable Value | State whether the library was generated for single-end or paired-end sequencing. | ['single-end', 'paired-end'] | True |
library_adapter_sequence | Textfield | The adapter sequence to be used for adapter trimming, starting with the 5’ end (e.g. 5-ATCCTGAGAA). | | True |
library_final_yield | Numeric | Total amount of library after the final PCR amplification step. | | True |
library_final_yield_unit | Allowable Value | Units of the total amount of library after the final PCR amplification step. | ['ng'] | False |
library_average_fragment_size | Numeric | Average size in base pairs (bp) of sequencing library fragments, estimated via gel electrophoresis or Bioanalyzer/TapeStation. | | True |
sequencing_reagent_kit | Textfield | Reagent kit used for sequencing. | | True |
sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for each read, for example Read1, i7 index, i5 index, and Read2. | | True |
sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + …) | | True |
sequencing_phix_percent | Numeric | Percent PhiX loaded to the run. | | True |
contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
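
The `execution_datetime` and `sequencing_read_format` attributes both prescribe a string layout, so a submitter may want to check values before upload. The following is a minimal sketch, not SenNet's validator: the function names are hypothetical, the hour is assumed to use a 24-hour clock (the schema does not say), and the sample values are illustrative only.

```python
import re
from datetime import datetime

# Strict shape check: Python's strptime alone would accept "2023-4-17 9:30",
# but the schema asks for leading zeros in every field.
_DATETIME_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}")


def parse_execution_datetime(value: str) -> datetime:
    """Parse a 'YYYY-MM-DD hh:mm' string (24-hour clock assumed)."""
    if not _DATETIME_SHAPE.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD hh:mm, got {value!r}")
    # strptime rejects out-of-range fields such as month 13 or hour 25.
    return datetime.strptime(value, "%Y-%m-%d %H:%M")


def parse_read_format(value: str) -> list[int]:
    """Split a slash-delimited cycle list into integers, one per read."""
    return [int(part) for part in value.split("/")]


if __name__ == "__main__":
    print(parse_execution_datetime("2023-04-17 09:30"))
    print(parse_read_format("151/8/8/151"))
```

Both functions raise `ValueError` on malformed input, which lets a caller collect every problem in a metadata row before reporting.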
Version 0
Attribute | Type | Description | Allowable Values | Required |
---|---|---|---|---|
source_id | Textfield | SenNet Display ID of the source of the assayed tissue. | | True |
tissue_id | Textfield | SenNet Display ID of the assayed tissue. | | True |
execution_datetime | Datetime | Start date and time of the assay, typically taken from a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, MM the month, DD the day, hh the hour, and mm the minutes, each with leading zeros. | | True |
protocols_io_doi | Textfield | DOI for protocols.io referring to the protocol for this assay. | | True |
operator | Textfield | Name of the person responsible for executing the assay. | | True |
operator_email | Textfield | Email address for the operator. | | True |
pi | Textfield | Name of the principal investigator responsible for the data. | | True |
pi_email | Textfield | Email address for the principal investigator. | | True |
assay_category | Allowable Value | Each assay is placed into one of the following four general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ['sequence'] | True |
assay_type | Allowable Value | The specific type of assay being executed. | ['WGS'] | True |
analyte_class | Allowable Value | Analytes are the target molecules being measured with the assay. | ['DNA'] | True |
is_targeted | Allowable Value | Specifies whether specific molecules are targeted for detection or measurement by the assay. | ['Yes', 'No'] | True |
acquisition_instrument_vendor | Textfield | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | True |
acquisition_instrument_model | Textfield | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | True |
gdna_fragmentation_quality_assurance | Allowable Value | Is the gDNA integrity good enough for WGS? This is usually checked by running a gel. | ['Pass', 'Fail'] | True |
dna_assay_input_value | Numeric | Amount of DNA input into library preparation. | | True |
dna_assay_input_unit | Allowable Value | Units of DNA input into library preparation. | ['ug'] | False |
library_construction_method | Textfield | Describes the DNA library preparation kit: the modality of isolating gDNA, fragmentation, and generating sequencing libraries. | | True |
library_construction_protocols_io_doi | Textfield | A link to the protocol document containing the library construction method (including version) that was used. | | True |
library_layout | Allowable Value | State whether the library was generated for single-end or paired-end sequencing. | ['single-end', 'paired-end'] | True |
library_adapter_sequence | Textfield | The adapter sequence to be used for adapter trimming, starting with the 5’ end (e.g. 5-ATCCTGAGAA). | | True |
library_final_yield | Numeric | Total amount of library after the final PCR amplification step. | | True |
library_final_yield_unit | Allowable Value | Units of the total amount of library after the final PCR amplification step. | ['ng'] | False |
library_average_fragment_size | Numeric | Average size in base pairs (bp) of sequencing library fragments, estimated via gel electrophoresis or Bioanalyzer/TapeStation. | | True |
sequencing_reagent_kit | Textfield | Reagent kit used for sequencing. | | True |
sequencing_read_format | Textfield | Slash-delimited list of the number of sequencing cycles for each read, for example Read1, i7 index, i5 index, and Read2. | | True |
sequencing_read_percent_q30 | Numeric | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + …) | | True |
sequencing_phix_percent | Numeric | Percent PhiX loaded to the run. | | True |
contributors_path | Textfield | Relative path to file with ORCID IDs for contributors for this dataset. | | True |
data_path | Textfield | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | True |
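
The `sequencing_read_percent_q30` description gives only the weighted sum of per-read Q30 values. One plausible reading, sketched below, divides that sum by the total number of bases to obtain an average. The normalization step, the function name, and the sample numbers are assumptions for illustration, not part of the schema.

```python
from typing import Iterable, Tuple


def weighted_percent_q30(reads: Iterable[Tuple[int, float]]) -> float:
    """Return the base-weighted average of per-read percent Q30.

    Each item is (number_of_bases, percent_q30) for one read segment,
    e.g. the UMI read or R2. Dividing by the total base count is an
    assumption; the schema text shows only the weighted sum.
    """
    reads = list(reads)  # allow generators, since we iterate twice
    total_bases = sum(n_bases for n_bases, _ in reads)
    if total_bases == 0:
        raise ValueError("at least one read with a non-zero base count is required")
    weighted_sum = sum(n_bases * q30 for n_bases, q30 in reads)
    return weighted_sum / total_bases


if __name__ == "__main__":
    # Illustrative values: 100 bases at 90% Q30 and 300 bases at 94% Q30.
    print(weighted_percent_q30([(100, 90.0), (300, 94.0)]))  # 93.0
```

Weighting by base count keeps a short index or UMI read from pulling the overall figure as strongly as a long insert read would.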