Unfiltered sequencing data for plant metabarcode (fasta format)

This FASTA file contains unfiltered sequencing data (i.e. merged reads assigned to their original sample) for the plant metabarcode. Amplicons were generated using the universal primers g (5'-GGGCAATCCTGAGCCAA-3') and h (5'-CCATTGAGTCTCTGCACCTATC-3') (Taberlet et al. 2007). Sequences were produced by 2 x 100 bp paired-end sequencing on an Illumina HiSeq 2500 platform. The first processing steps were performed with the OBITOOLS software (http://metabarcoding.org/obitools) as follows: (i) Forward and reverse reads corresponding to the same sequence were aligned and merged with the IlluminaPairEnd program; only merged sequences with a high alignment quality score (>=40) were retained. (ii) Each merged sequence was assigned to its original sample with the ngsfilter program, using the tag information previously added to the primers; for this step, only sequences containing both primers (with a maximum of 3 mismatches per primer) and exact tag sequences were selected. (iii) To reduce the file size, strictly identical sequences were merged while keeping information about their sample of origin.
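For readers who want to see the logic of steps (ii) and (iii) in concrete terms, the following is a minimal Python sketch. It is not the OBITOOLS code: the function names (`count_mismatches`, `has_primer`, `dereplicate`) are hypothetical, the primer check assumes the primer sits at the very start of the read and counts substitutions only (no indels, no tag handling), and the input is assumed to be simple (sample, sequence) pairs rather than an actual FASTA file.

```python
from collections import Counter, defaultdict

# Primer g, copied from the description above.
FORWARD_PRIMER = "GGGCAATCCTGAGCCAA"


def count_mismatches(primer, read_prefix):
    """Count substitutions between a primer and a read prefix of equal length.

    Assumption: no indels are considered; a prefix shorter than the primer
    is treated as a complete mismatch.
    """
    if len(read_prefix) < len(primer):
        return len(primer)
    return sum(1 for p, r in zip(primer, read_prefix) if p != r)


def has_primer(read, primer, max_mismatches=3):
    """Return True if the read starts with the primer, allowing mismatches."""
    prefix = read[:len(primer)].upper()
    return count_mismatches(primer, prefix) <= max_mismatches


def dereplicate(records):
    """Merge strictly identical sequences, keeping per-sample read counts.

    records: iterable of (sample_name, sequence) pairs (hypothetical input).
    Returns a dict mapping each unique sequence to a Counter of samples.
    """
    merged = defaultdict(Counter)
    for sample, sequence in records:
        merged[sequence.upper()][sample] += 1
    return dict(merged)
```

As a usage illustration, calling `dereplicate` on the pairs `("s1", "ACGT")`, `("s2", "ACGT")` and `("s1", "TTTT")` yields two unique sequences, with the per-sample counts for `ACGT` recording one read each from `s1` and `s2`. This mirrors the idea of shrinking the file while retaining the origin of every sequence.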