In silico analysis enabling informed design for genome editing in medicinal cannabis; gene families and variant characterisation
Journal contribution posted on 2021-09-28, 22:07, authored by Lennon Matchett-Oates, Shivraj Braich, German Spangenberg, Simone Rochfort, Noel Cogan
Cannabis has been used worldwide for centuries for industrial, recreational and medicinal purposes; however, to date no successful attempts at editing genes involved in cannabinoid biosynthesis have been reported. This study proposes and develops an in silico best-practice approach for the design and implementation of genome editing technologies in cannabis to target all genes involved in cannabinoid biosynthesis. A large dataset of reference genomes was accessed and mined to determine copy number variation and associated SNP variants, identifying optimum target edit sites for genotype-independent editing. Copy number variation and highly polymorphic gene sequences exist in the genome, making genome editing using CRISPR, Zinc Fingers and TALENs technically difficult. Alleles and additional gene copies were evaluated through nucleotide and amino acid alignments and comparative sequence analysis. Based on the determined gene copy number and presence of SNPs, multiple online CRISPR design tools were used to design sgRNAs targeting every gene, accompanying allele and homolog throughout all involved pathways to create knockouts for further investigation. Universal sgRNAs were designed for highly homologous sequences using MultiTargeter and visualised using Sequencher, creating unique sgRNAs that avoid SNP and shared nucleotide locations while targeting optimal edit sites. This framework has wider applications to all plant species, regardless of ploidy number or highly homologous gene sequences. It makes a best-practice approach to genome editing possible in all plant species, including cannabis, and delivers a comprehensive in silico evaluation of cannabinoid pathway diversity from a large set of whole genome sequences. Identification of SNP variants across all genes could improve genome editing, potentially leading to novel applications across multiple disciplines, including agriculture and medicine.
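The idea of a "universal" sgRNA for highly homologous gene copies can be illustrated with a minimal sketch. It is not the algorithm used by MultiTargeter or by the authors; it only assumes an SpCas9-style target (a 20-nt protospacer followed by an NGG PAM), scans the forward strand only, and uses toy sequences. The function names `protospacers` and `shared_protospacers` and all sequences are hypothetical.

```python
import re


def protospacers(seq):
    """Return all 20-nt protospacers followed by an NGG PAM (forward strand only)."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    return {m.group(1) for m in re.finditer(r"(?=([ACGT]{20})[ACGT]GG)", seq)}


def shared_protospacers(copies):
    """Return protospacers found, with a PAM, in every gene copy or allele.

    A candidate that overlaps a SNP in any copy drops out, because its
    20-nt sequence is no longer identical across all copies.
    """
    sets = [protospacers(s) for s in copies.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Toy data (assumption): two copies sharing one target, and an allele with a SNP.
guide = "GACCTTGGAACTCATCGTAC"
guide_snp = "GACCTTGGAACTTATCGTAC"  # C>T substitution inside the target
copies = {
    "copy_a": "ATAT" + guide + "TGG" + "ATAT",
    "copy_b": "CCAT" + guide + "AGG" + "TTAA",
}

universal = shared_protospacers(copies)
with_snp_allele = shared_protospacers(
    {**copies, "allele_c": "ATAT" + guide_snp + "TGG" + "ATAT"}
)
```

In this toy case `universal` holds the single shared target, while `with_snp_allele` is empty because the SNP breaks the exact match in one allele. A real design would also scan the reverse strand and score off-targets, which this sketch omits.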