

Project information

Project number: 900400
Status: Completed
Start date: 01.01.2009
End date: 31.12.2012

The Salmon Louse Genome Sequencing Project: Part 1

Results achieved
Results Part 1

The salmon louse genome was sequenced to a final coverage of approximately 180X. Sequencing finished in 2011, which formally completed part 1 of the salmon louse genome project.

Part 1 of the Salmon Louse Sequencing Project was financed by IMR, FHF and MH.

Results Part 2
Initial assemblies were generated using a number of assemblers and various combinations of assemblers (since many assemblers will not accept all of the data) before scaffolding. Use of the resulting genome assemblies and annotations was secured by making the results publicly available under an access agreement from March 2012. This availability has resulted in use by 46 scientists from 12 institutions. A final assembly was made in January 2013, based on comparisons of assemblers and on supporting linkage-group information supplied by the PrevenT project (Salmon louse – prevention and treatment (PrevenT), FHF-900416). The final assembly (generated by Newbler) appears to be of very good quality, with an N50 of 570 kb and a total size of 695 Mb.
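
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the contig or scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not project data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths (arbitrary units), not taken from the assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

A larger N50 means that half of the assembled sequence sits in longer pieces, which is why it is used when comparing the output of different assemblers.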

Summary (translated from Norwegian)
The genetic material (genome) of the salmon louse has now been mapped. Researchers around the world are actively using it to learn more about the louse and perhaps to find out how a vaccine or other agents against lice can be made. FHF took part in the first phase of the project, and also in phase 2, in which the genome was assembled and quality-assured through the project Salmon louse – prevention and treatment (PrevenT) (FHF-900416), which FHF finances together with the Research Council of Norway. The sequenced genome is considered to be mapped with very good quality and level of detail, with a length of about 695 million base pairs.
Background
FHF, Marine Harvest (MH) and the Institute of Marine Research (IMR) have decided to finance sequencing of the salmon louse (Lepeophtheirus salmonis) genome.

The sequencing project will consist of two parts as follows:
Part 1: Genome sequencing
Part 2: Bioinformatics processing.

The present project will fulfill part 1.
Objectives
• To sequence and assemble a draft salmon louse genome.
• To identify the majority of L. salmonis genes, make the genome sequence publicly available and thereafter perform annotation.
The genome will be of paramount importance for molecular approaches to developing new salmon louse treatments.
Project design and implementation
Genome sequencing requires that every base is sequenced several times. Six-fold genome coverage (6X) is regarded as a draft sequence, in which more than 95% of the genome is covered. A 6X coverage normally provides sufficient data to assemble most of the DNA into long stretches of sequence. Finished sequence requires more than 9X coverage, at which point 99% of the DNA is covered with more than 99.99% accuracy. These figures apply to Sanger sequencing (read length of >750 bp).
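
The relationship between coverage and the fraction of the genome covered can be illustrated with the idealised Lander-Waterman (Poisson) model, in which reads are assumed to land uniformly at random and the expected fraction of bases covered at least once is 1 - e^(-c) for coverage c. This is a sketch under that assumption only; the function name is illustrative, and the percentages quoted in the text are more conservative than the model because real libraries are not perfectly random.

```python
import math


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read.

    Assumes the idealised Lander-Waterman model: read start positions are
    uniform and independent, so per-base depth is Poisson with mean `coverage`.
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


# Coverage levels mentioned in the project description.
for c in (6, 9, 15, 20):
    print(f"{c}X: {expected_fraction_covered(c):.4%} of bases expected to be covered")
```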

New sequencing technology (UTS) has recently emerged that increases throughput, but with shorter reads (currently up to ~500 bp for the 454 Titanium platform). About 15X coverage with the new technology is considered to give a draft sequence quality comparable to ~6X Sanger coverage.

In addition, modelling the effect of coverage on assembled contig size shows that 20X coverage provides as good an assembly as 50X coverage. A factor that significantly improves assembly is the use of paired-end reads. Paired-end libraries can be made with different jump sizes, where the jump size is the distance between the two ends in a pair. Using 2–3 paired-end libraries with different jump sizes (e.g. 3 kb, 8 kb and 12 kb) will increase contig size and facilitate scaffold building.

End sequences from long-insert clones (e.g. fosmid or BAC clones) and a genetic map are resources that further facilitate genome assembly. However, these resources will not be part of the initial genomic approach to the salmon louse; they will be established and used at a later stage.

The research group has established an inbred line of the salmon louse. The line originated from a single female louse followed by full-sibling crosses. It has been kept for 22 generations, and genotyping reveals only homozygous loci. This inbred line will serve as the template for all sequencing, as this will facilitate subsequent assembly.

Sequencing strategy
All sequencing will be performed on the Ls1a inbred strain.
1. 454 Titanium shotgun sequencing to a coverage of ~20X.
2. Paired-end reads to >10X using Sanger sequencing.
3. Ad hoc Sanger sequencing as required to bridge gaps.
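
To give a sense of scale for step 1, coverage is the number of reads times the read length divided by the genome size, so the number of reads needed for a target coverage follows directly. The sketch below is a rough estimate only: it assumes the ~500 bp upper read length quoted for 454 Titanium and the 695 Mb assembly size reported in the results, which was not known when the strategy was planned. The function name is illustrative.

```python
import math


def reads_needed(genome_size_bp, read_length_bp, target_coverage):
    """Reads required so that reads * read_length / genome_size >= target."""
    if genome_size_bp <= 0 or read_length_bp <= 0:
        raise ValueError("genome size and read length must be positive")
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Assumed inputs: 695 Mb genome (reported assembly size), 500 bp reads, 20X.
estimate = reads_needed(695_000_000, 500, 20)
print(f"about {estimate / 1e6:.1f} million reads")
```

Because actual 454 reads are often shorter than the maximum length, the real number of reads required would be higher than this estimate.
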
Dissemination of project results
The results will be made publicly available when the sequence quality has been approved by the academic parties (IMR; CBU-UniResearch, Bergen, Norway; University of Bergen; Max Planck Institute, Berlin, Germany).