The experiments will be carried out by the platform members. The team brings its expertise and know-how in transcriptome analysis, as well as its workforce, to this collaboration.

A telephone or on-site meeting will systematically be held to define the biological objective(s) and question(s) of the project, in order to establish, among other things, the experimental design. The participants will be the collaborator(s), the members of the Transcriptome platform concerned and the members of the "Genomic Networks" team concerned.

It’s important to note that many factors influence the level of gene expression in a plant. Controlling the experimental conditions is therefore crucial if a difference in expression is to be linked to the function studied. A control plant compared with a plant that has undergone a specific treatment should thus be grown in the same nutritional and light environment as the treated plant. For example, a delay between harvests during the day will reveal differences due to the circadian expression of many genes. Likewise, a lack of homogeneity in watering or phytosanitary treatment can be a source of variability unrelated to the process under study. These considerations must be taken into account to ensure the reproducibility of the samples.

3.a Replicates

It’s essential to distinguish between technical and biological replicates.

Technical replicates:

The replicates of a sample are prepared at the same time (sowing, sampling, extraction, etc.).

They allow the observation and quantification of technical biases (technical variability) and control the reproducibility of studies.

They serve as a quality control of the data obtained.

The conclusions are valid only for the individual sampled.

Biological replicates:

The replicates of a sample are prepared (sowing, sampling, extraction, etc.) at least 24 hours apart (beware of the circadian cycle).

They allow the observation of inter-individual variability and the conclusions can be generalized to the populations studied.

In any case, it is necessary to plan at least two biological repetitions of the experiment, i.e. the whole experiment carried out 3 times in total. The objective is to characterize the biological variability between replicates and to “remove” it, in order to identify the genes whose difference in expression is related only to the factor studied.

3.b Quantity and quality of material required for experiments

A quantity of 4 µg of total RNA (minimum concentration of 200 ng/µl) per sample is required. Contact the platform if you are unable to obtain this quantity. Since the purity of the RNAs is one of the most important factors for the success of the experiment, it is preferable to use a column extraction protocol (RNeasy type) that includes the DNase I step.
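The two thresholds above lend themselves to a simple check before shipping. The sketch below is a hypothetical helper: the function name and the example values are assumptions for illustration, not part of the platform's procedure. It computes the total RNA mass from a measured concentration and volume and compares it with the 4 µg and 200 ng/µl minima. At exactly 200 ng/µl, 4 µg corresponds to 20 µl.

```python
# Thresholds taken from the requirement stated above.
MIN_TOTAL_NG = 4000.0        # 4 µg expressed in ng
MIN_CONC_NG_PER_UL = 200.0   # minimum concentration in ng/µl


def rna_sample_ok(conc_ng_per_ul, volume_ul):
    """Return (ok, total_ng) for one sample (hypothetical helper).

    ok is True only if both the concentration and the total mass
    reach the minima; total_ng is the computed mass in ng.
    """
    total_ng = conc_ng_per_ul * volume_ul
    ok = conc_ng_per_ul >= MIN_CONC_NG_PER_UL and total_ng >= MIN_TOTAL_NG
    return ok, total_ng


if __name__ == "__main__":
    # Illustrative measurements: (concentration in ng/µl, volume in µl)
    for conc, vol in [(200.0, 20.0), (150.0, 40.0), (250.0, 10.0)]:
        ok, total = rna_sample_ok(conc, vol)
        print(f"{conc} ng/µl x {vol} µl = {total} ng, sufficient: {ok}")
```

A sample can fail on either criterion: 150 ng/µl in 40 µl holds enough mass but is too dilute, whereas 250 ng/µl in 10 µl is concentrated enough but holds too little RNA.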

For "difficult" samples such as seeds and roots, the addition of PVP is very useful. Contact us if necessary.

The total RNAs are to be sent in the elution solution, on dry ice. On arrival at the platform, their quality will be assessed on an Agilent Bioanalyzer chip and they will be quantified with "Ribogreen".

RNAs sent by the collaborators must be accompanied by the duly completed information table (last page, to be printed).

RNA samples can be returned to the collaborator upon request and at the collaborator's expense.

4. Characteristics of sequencing runs and delays

Sequencing runs are performed on the platform's NextSeq500 (Illumina), or on HiSeq2000 or HiSeq4000 (Illumina) sequencers via the CNS Genomics Institute in Evry.

The number of reads per sample should be adjusted according to your initial biological question.

A deadline for library construction and sequencing will be given once RNAs of satisfactory quality and quantity have been received. Depending on the options chosen for the bioinformatics and statistical analyses, additional time will be allowed. This period will take into account in particular:

After statistical analysis of the raw results, a list of genes is produced for each comparison as an Excel file. It includes the average count in condition 1, the average count in condition 2, the log2-ratio, and both a raw p-value and an adjusted p-value, the latter allowing false positives to be controlled.
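To illustrate how the columns of such a table relate to one another, here is a minimal sketch in Python. It assumes the Benjamini-Hochberg procedure for the adjusted p-value, a log2-ratio computed as condition 2 over condition 1, and a pseudocount of 1 to avoid dividing by zero. None of these choices is stated in this charter, the function names and toy values are hypothetical, and the platform's actual statistical method may differ.

```python
import math


def log2_ratio(mean_1, mean_2, pseudocount=1.0):
    """log2 of condition 2 over condition 1.

    The direction of the ratio and the pseudocount are assumptions.
    """
    return math.log2((mean_2 + pseudocount) / (mean_1 + pseudocount))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data:
# (gene, mean count in condition 1, mean count in condition 2, raw p-value)
genes = [
    ("geneA", 99.0, 399.0, 0.001),
    ("geneB", 50.0, 55.0, 0.40),
    ("geneC", 199.0, 49.0, 0.01),
    ("geneD", 10.0, 12.0, 0.03),
]

adj = benjamini_hochberg([g[3] for g in genes])
table = [
    {
        "gene": name,
        "mean_1": m1,
        "mean_2": m2,
        "log2_ratio": log2_ratio(m1, m2),
        "p_raw": p,
        "p_adj": q,
    }
    for (name, m1, m2, p), q in zip(genes, adj)
]

if __name__ == "__main__":
    for row in table:
        print(row)
```

The adjusted p-value is never smaller than the raw one: it accounts for the number of genes tested, which is why it is the value to threshold when controlling false positives across the whole list.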

6. Data exchange format

All the results (counts, Excel file, PCA...) will be sent via Renater.

The raw data (fastq) and the contigs (if produced) will be available for download via a cloud or a secure site. The partner laboratory is responsible for downloading the raw data to its own server as soon as possible, and within a maximum of 1 month after the sequences are made available.

Data storage, bioinformatics and statistical analyses are carried out at IPS2.

IPS2 commits to keeping the raw data (fastq archives, not images) for 1 year after the data have been made available. After this period, the data will be destroyed.

7. Databases

The results of the experiments are expected to be integrated into the CATdb database (Gagnot et al. Nucleic Acids Res. 2008; Zaag et al. Nucleic Acids Res. 2015), which is compatible with the MIAME standard (Brazma et al. 2001. Nat Genet. 29(4):365-71), and submitted to the NCBI Gene Expression Omnibus (GEO) database. GEO will issue an accession number, which is recommended for any publication of transcriptome results.

To do this, the platform will send you a submission file to collect the information necessary for these submissions (cultivation conditions, treatment, etc.).

Note: if you do not intend to publish all the data at the same time, fill in 2 separate files in order to obtain 2 accession numbers (contact us for more information if necessary).

Only the projects for which we carry out the analyses (Option B1 or B2) will be submitted to the 2 databases mentioned above.

8. Data release

The data will be made public 2 years after the end of the project. There are, however, exceptions that will be discussed on a case-by-case basis:

1) if the project is in partnership with an industrial company

2) if the project is part of an ANR/KBBE project; the results are made available to the public only 1 year after the end of the project itself.

3) if the transcriptome results are in the process of being published or are being exploited for a patent filing.

9. Publication of results

These are scientific collaborations between IPS2 and the partner, in which the platform provides its expertise. Only the cost of consumables is covered by the partner laboratory. As such, a member of the Transcriptome platform and a member of the IPS2 "Genomic Networks" team will be co-authors of the first publication in which the transcriptome data are presented or used. The same agreement applies to any patent filed at the initiative of the collaborator in which the transcriptome results are used.

You will also be asked to cite the CATdb database in the data description text, for example: "Microarray data from this article were deposited at Gene Expression Omnibus (Edgar 2002): http://www.ncbi.nlm.nih.gov/geo/; accession no. GSEXXXXX and at CATdb (Gagnot 2007): http://urgv.evry.inra.fr/CATdb/; Project: XXXX according to the "Minimum Information About a Microarray Experiment" standards."

Is there a reference genome or UniGene set (transcriptome) available: yes/no:

If so, which one?

6. Number of libraries - description of samples per run (organ, stage of sampling according to Boyes et al. Plant Cell 2001, treatment ...)

The nomenclature to be followed for the names of the samples is as follows: conditionX_Y

X: one condition (example 1: genotype; example 2: treatment) or several conditions (example 3: genotype_treatment_treatment2; example 4: genotype_treatment 1); in the latter case the conditions are separated by an underscore (_).