We’re seeking a maternity cover bioinformatician in my group at the Earlham Institute. I can guarantee a diverse set of projects to work on across a number of genomics technologies and an inviting working environment at the Earlham!

The Platforms and Pipelines (P&P) Group at Earlham is responsible for delivering the BBSRC National Capability in Genomics (NCG), providing access to state of the art genomics technologies for research groups throughout the UK as well as delivering sequencing and genomics services to the associated Science Faculty groups and commercial customers.

The Bioinformatics team within P&P performs primary data analysis for research projects from a wide variety of NGS platforms, including Illumina, PacBio, Oxford Nanopore and associated technologies from BioNano Genomics and 10X Genomics. These projects cover practically every NGS application, including genome assembly, amplicon re-sequencing, exome capture, ChIP-Seq, RAD-Seq, RNA-Seq, microRNA-Seq and metagenomics.

The post will join 2 other bioinformaticians and a bioinformatics team manager and will be responsible for data analysis, producing bioinformatics pipelines for use with new sequencing applications delivered on HPC platforms, maintaining and improving current pipelines, increasing automation of tasks and supporting the project management team with bioinformatics and NGS customer enquiries.
We are encouraging the development of an Agile culture and make extensive use of Atlassian products such as JIRA, Stash and Confluence.

We’re looking for a new member for our Project Management Team! (now filled)
http://metagenom.es/?p=10879
Wed, 28 Oct 2015 09:13:24 +0000

Let’s cut the preamble and let you just hit the link: http://jobs.tgac.ac.uk/Details.asp?vacancyID=11099

For those people seeking an alternate career path in science this is a fantastic opportunity to engage with external customers and our Science Faculty as well as working closely with the lab staff and bioinformaticians in my group at TGAC.

Advert blurb:

The Genome Analysis Centre (TGAC) has an exciting opportunity within the Platforms and Pipelines Group for a Customer Liaison Officer. Working in a dynamic, high-throughput genomics facility with a focus on next-generation sequencing (NGS), the post holder will ensure that high customer service levels are achieved as TGAC’s customer base grows and expands, by responding to customers in a timely manner and ensuring that the outputs from the projects are produced on time and to specification. This is an opportunity to work closely with bioinformatics and laboratory-based teams in an Agile project management environment.

We’re (not) recruiting a bioinformatician!
http://metagenom.es/?p=10871
Tue, 19 May 2015 10:57:41 +0000

Note: Actually we’re not, we never recruited to this position, but keep your eyes peeled – we will be readvertising this post in the next couple of months! (Dec 15/Jan16)

The third of our three currently open posts is for a bioinformatician within the Platforms and Pipelines bioinformatics team. The post will join three other bioinformaticians and the Bioinformatics Manager to provide bioinformatics consultation and data analysis across a number of diverse sequencing/genomics platforms and projects.

This is a great opportunity to work in a cutting-edge genomics environment and experience the incredible breadth of TGAC’s work. The bioinformatics team is closely integrated with our project management and lab teams and, as the final step in project delivery, is an important part of making sure that we deliver excellent data and analysis to our customers.

TGAC has a particular remit to work with bleeding-edge technologies, and whilst next-generation sequencing is at the core of what we do, the successful candidate will be able to experience first-hand data from new platforms as they come online.

We’re recruiting a Bioinformatics Manager! (now filled)
http://metagenom.es/?p=10867
Tue, 19 May 2015 08:44:37 +0000

The Platforms and Pipelines Group is looking for a leader for our bioinformatics team. This is an exciting opportunity to work in an environment with a very strong focus on genomics and computational biology. The post-holder will be critical to the development of our National Capability in Genomics, driving the transfer of cutting-edge analysis techniques from Science Faculty into production for the wider research community.

The bioinformatics team is strongly aligned with our high-throughput laboratory and as such is expected to deal with data from all commercially available sequencing platforms as well as optical mapping platforms, serving applications from exome sequencing to RNA-Seq, metagenomics, genome assembly, epigenetics and more. The post holder will also have a rare opportunity to work across a huge breadth of non-model and model organism genomes.

An eye for detail, a strong focus on QC and the ability to direct the future of a team of people to support customer requirements in a rapidly changing scientific environment are a must.

We’re recruiting an Automation Specialist! (now filled)
http://metagenom.es/?p=10860
Tue, 19 May 2015 08:31:58 +0000

The Platforms and Pipelines Group at TGAC is recruiting an Automation Specialist for our high-throughput next-generation sequencing and genomics laboratory. The role will involve leading development of new automation protocols on a number of liquid handling systems from Perkin-Elmer, Beckman Coulter and Labcyte.

Our liquid handling systems support critical parts of our DNA extraction and NGS library preparation pipelines, and the right candidate will be someone who enjoys working with cutting-edge robotics platforms and transitioning complex lab protocols onto liquid handling systems. This is a critical post for the smooth running of the group, and the post holder will liaise closely with vendors to train on-site and prospective users of the system. TGAC has an exciting remit to work with cutting-edge technologies and tools and the post holder will be expected to advise on future strategies for lab automation.

Snuck into the lab to take a few (blurry) pictures!
http://metagenom.es/?p=10851
Tue, 31 Mar 2015 19:06:32 +0000

So I nipped into the lab at the end of the day to drop some samples off with Chris, and ended up getting a lesson in setting up and programming the Sciclone G3 Perkin Elmer automated liquid handlers from Gawain and Fiona. On the way out I thought I’d just grab a few pictures of the toys..

I left out the Argus (Opgen) which sits next to the BioNano, and the 454 FLX machines which are also no longer in use, but flank the aisle next to the bank of Illuminas.

Positions open in the Platforms & Pipelines Group at TGAC (now filled)
http://metagenom.es/?p=10835
Mon, 02 Mar 2015 10:33:42 +0000

So my new role, which I will elaborate on later, is as the Head of the Platforms and Pipelines Group at The Genome Analysis Centre in Norwich. We have a couple of positions vacant currently in the group. For those of you interested in Bioinformatics posts, stay tuned for later advertisements.

For those of you unfamiliar with TGAC, it is a UK hub for innovative Bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It also has a state of the art DNA sequencing facility operating multiple complementary technologies for data generation that provide the foundation for analyses furthering our fundamental understanding of genomes and how they function.

The positions open are in Chris Watkins’ group, the Platforms & Pipelines Project Management Team. One is a Customer Liaison Officer (Pipelines) and the other is a Customer Liaison Officer (Sales).

The Pipelines role is a new post, and will ensure that customer enquiries receive timely responses and that a project’s deliverables are provided on time, to a high quality and communicated clearly to the customer. An undergraduate degree in a relevant area is essential, and candidates with some knowledge of cutting-edge bioinformatics and genomics would be welcome.

The Sales role is also a new post, and where the Pipelines role deals with customer engagement after a project is secured, the Sales role will deal with securing those projects. This means dealing with sales enquiries, issuing quotes and invoices, and working on some marketing aspects for the team. Again an undergraduate degree in a relevant area is essential, and given the complexity of the projects, those with post-graduate qualifications would be welcome.

Please circulate this to people you think might be interested!

15th International Conference on Human Genome Variation – Meeting report
http://metagenom.es/?p=10827
Fri, 26 Sep 2014 14:43:22 +0000

Last week I was lucky enough to attend the HGV2014 meeting at the Culloden Hotel in Belfast. It was my first trip to Northern Ireland and my first attendance at an HGV meeting. The meeting is small and intimate, but had a great wide-ranging programme, and I would heartily recommend attending if you get the chance and have an interest in clinical or human genomics.

HGV2014 Meeting Report, Session 7 “NEXT-GEN ‘OMICS AND THE ACTIONABLE GENOME”
http://metagenom.es/?p=10825
Fri, 26 Sep 2014 14:25:38 +0000

Caveats: I have not taken notes in every talk of every session. A lack of notes for a particular speaker does not constitute disinterest on my part; I simply took notes for the talks that were directly related to my current work. If I have misquoted, misrepresented or misunderstood anything, and you are the speaker concerned, or a member of the team involved in the work, please leave a comment on the post, and I will rectify the situation accordingly.

7.1 Christine Eng, Baylor College of Medicine: “Clinical Exome Sequencing for the Diagnosis of Mendelian Disorders”

Christine spoke about the pipeline for clinical WES at Baylor. Samples are sequenced to 140x to achieve >40x coverage across 85% of the exome. A SNP array is run in conjunction with each sample. Concordance with the SNP array is tested for each sample and this must exceed 99%.
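To make those two acceptance criteria concrete, here is a minimal sketch of how such a QC gate might look. Only the three thresholds come from my notes; the function names, the input shapes (a list of per-base depths, dictionaries of genotypes keyed by site) and the way concordance is computed are my own assumptions and not a description of Baylor's actual pipeline.

```python
# Thresholds quoted in the talk notes; everything else here is illustrative.
MIN_DEPTH = 40                 # a base counts as covered when depth is above this
MIN_FRACTION_COVERED = 0.85    # fraction of exome bases that must be covered
MIN_ARRAY_CONCORDANCE = 0.99   # genotype agreement with the SNP array must exceed this


def fraction_covered(depths, min_depth=MIN_DEPTH):
    """Fraction of target bases with depth strictly above min_depth."""
    if not depths:
        return 0.0
    return sum(1 for d in depths if d > min_depth) / len(depths)


def array_concordance(seq_genotypes, array_genotypes):
    """Fraction of sites present in both call sets where the genotypes agree."""
    shared = set(seq_genotypes) & set(array_genotypes)
    if not shared:
        return 0.0
    matches = sum(1 for site in shared
                  if seq_genotypes[site] == array_genotypes[site])
    return matches / len(shared)


def passes_qc(depths, seq_genotypes, array_genotypes):
    """Apply both gates: exome coverage and SNP array concordance."""
    covered_ok = fraction_covered(depths) >= MIN_FRACTION_COVERED
    concordance_ok = (array_concordance(seq_genotypes, array_genotypes)
                      > MIN_ARRAY_CONCORDANCE)
    return covered_ok and concordance_ok
```

A real pipeline would work from per-target coverage summaries rather than a Python list of depths, but the shape of the decision is the same: a sample is only passed on for variant calling when both checks succeed.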

BWA is the primary mapper, but variants are called with ATLAS and annotated with Cassandra (Annovar is a dependency of Cassandra).

Variants are filtered against HGMD. Filtered for variants which are <5% MAF. 4000 clinical internal exomes have been run so there is a further requirement for variants to have a <2% MAF in this dataset.
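The two frequency cut-offs can be sketched as a simple filter. This is an illustration under stated assumptions: the `Variant` record, its field names and the decision to treat a variant missing from a dataset as having a frequency of zero are all hypothetical, and the HGMD step is not modelled at all.

```python
from dataclasses import dataclass
from typing import Optional

POPULATION_MAF_CUTOFF = 0.05  # "<5% MAF" from the notes
INTERNAL_MAF_CUTOFF = 0.02    # "<2% MAF" in the internal clinical exomes


@dataclass
class Variant:
    variant_id: str                  # hypothetical identifier
    population_maf: Optional[float]  # None when absent from the population resource
    internal_maf: Optional[float]    # None when never seen in the internal cohort


def is_rare(variant):
    """True when the variant is below both frequency cut-offs."""
    # Assumption: a variant missing from a dataset is treated as frequency 0.
    population = variant.population_maf or 0.0
    internal = variant.internal_maf or 0.0
    return population < POPULATION_MAF_CUTOFF and internal < INTERNAL_MAF_CUTOFF


def filter_rare(variants):
    """Keep only the variants that pass both frequency filters."""
    return [v for v in variants if is_rare(v)]


if __name__ == "__main__":
    # Made-up example records, purely to show the filter in use.
    candidates = [
        Variant("novel", None, None),
        Variant("rare_everywhere", 0.001, 0.004),
        Variant("common_in_population", 0.12, 0.01),
        Variant("common_internally", 0.01, 0.03),
    ]
    for kept in filter_rare(candidates):
        print(kept.variant_id)
```

The internal cut-off is the interesting one: a variant that looks rare in public data but turns up repeatedly in your own clinical exomes is more likely to be a platform or pipeline artefact, or a local population variant, than the cause of a rare disorder.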

The gene list used by the system is updated weekly, and variants of unknown significance (VOUS) in genes related to the disorder are reported to all patients. This is much more extensive reporting than that of groups who feel VOUS muddy the waters.

An expanded report can be requested in addition, which also reports deleterious mutations in genes for which there is no disease/phenotype linkage. The hit rate for molecular diagnostics via clinical exome is 25%, so 75% of cases are not clinically solved. These patients are then asked if they would like to opt in to a research programme so that the data can be shared and aggregated for greater diagnostic power.

11/504 cases had two distinct disorders presenting at the same time. 280 cases were autosomal dominant and 86% of the dominant cases are de novo mutations. 187 cases were autosomal recessive: 57% were compound heterozygous, 3% were UPD and 37% had homozygosity due to shared ancestry.

Many initially unsolved diagnoses can be successfully resolved when the data are revisited 6-12 months later, such is the pace of new data deposition.

They use guidelines from CPIC (from PharmGKB) and data on drug/gene interactions and there is linking to a prescription database, so the pipeline is ‘end to end’.

HGV2014 Meeting Report, Session 6: “UNDERSTANDING THE EVOLVING GENOME”
http://metagenom.es/?p=10822
Fri, 26 Sep 2014 12:33:58 +0000

Caveats: I have not taken notes in every talk of every session. A lack of notes for a particular speaker does not constitute disinterest on my part; I simply took notes for the talks that were directly related to my current work. If I have misquoted, misrepresented or misunderstood anything, and you are the speaker concerned, or a member of the team involved in the work, please leave a comment on the post, and I will rectify the situation accordingly.

Yves introduced “Endeavour”, which takes a gene list, matches it to the disease of interest and ranks the genes, but this requires phenotypic information to be ‘rich’. Two main questions need to be addressed: 1) What genes are related to a phenotype? And 2) Which variants in a gene are pathogenic? Candidate gene prioritization is not a new thing, and has a long history in microarray analysis. Whilst it’s easy to interrogate things like pathway information, GO terms and literature, it is much harder to find relevant expression profile information or functional annotation, and existing machine learning tools do not really support these data types.

eXtasy allows variants to be ranked by effects on structural change in the protein, association in a case/control or GWAS study, and evolutionary conservation.

The problem though is one of multiscale data integration – we might know that a megabase region is interesting through one technique, a gene is interesting by another technique, and then we need to find the variant of interest from a list of variants in that gene.
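As a toy illustration of that narrowing from region to gene to variant, here is a minimal sketch. It is not how Endeavour or eXtasy work internally; the record types, coordinates and gene names are all hypothetical, and the "integration" is just a strict intersection of the three levels of evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # inclusive

    def contains(self, chrom, pos):
        return chrom == self.chrom and self.start <= pos <= self.end

    def overlaps(self, other):
        return (self.chrom == other.chrom
                and self.start <= other.end
                and other.start <= self.end)


@dataclass(frozen=True)
class Gene:
    name: str
    region: Region


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_variants(region_of_interest, genes, genes_of_interest, variants):
    """Variants lying in a flagged gene that also overlaps the flagged region."""
    selected = [g for g in genes
                if g.name in genes_of_interest
                and g.region.overlaps(region_of_interest)]
    return [v for v in variants
            if any(g.region.contains(v.chrom, v.pos) for g in selected)]


if __name__ == "__main__":
    # Made-up coordinates and gene names, for illustration only.
    linkage_region = Region("chr1", 1_000_000, 2_000_000)
    genes = [
        Gene("GENE_A", Region("chr1", 1_200_000, 1_250_000)),
        Gene("GENE_B", Region("chr1", 5_000_000, 5_050_000)),
        Gene("GENE_C", Region("chr1", 1_500_000, 1_520_000)),
    ]
    flagged = {"GENE_A", "GENE_B"}
    variants = [
        Variant("chr1", 1_210_000, "C", "T"),
        Variant("chr1", 1_510_000, "G", "A"),
        Variant("chr1", 5_010_000, "A", "G"),
    ]
    for v in candidate_variants(linkage_region, genes, flagged, variants):
        print(v)
```

A hard intersection like this throws evidence away whenever one level is wrong or missing, which is presumably why ranking approaches that weigh the different sources of evidence are attractive.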

They have performed HGMD to HPO mappings (1142 HPO terms cover HGMD mutations). It was noted that Polyphen and SIFT are useless for distinguishing between disease-causing and rare, benign variants.

eXtasy produces rankings for a VCF file by taking the trained classifier data and using a random forest approach to rank. One of the underlying assumptions of this approach is that any rare variant found in the 1kG dataset is benign as they are meant to be nominally asymptomatic individuals.

These approaches are integrated into NGS-Logistics, a federated analysis of variants over multiple sites which has some similarities to the Beacon approaches discussed previously. NGS-Logistics is a project looking for test and partner sites.

However, it’s clear that what is required, as much as a perfect database of pathogenic mutations, is a database of benign ones: local population controls for ethnicity matching, but also high MAF variants and rare variants in asymptomatic datasets.

Aoife started by saying that most CNVs in the human genome are benign. The quality that makes a CNV pathogenic is that of gene dosage. Haploinsufficiency (where half the product != half the activity) affects about 3% of genes in a systematic study in yeast. This is going to affect certain classes of genes, for instance those where concentration-dependent effects are very important (morphogens in developmental biology for example).

This can occur through mechanisms like a propensity towards low affinity promiscuous aggregation of protein product. Consequently the relative balance of genes can be the problem where it affects the stoichiometry of the system.

This is against the background of clear genome duplication over the course of vertebrate evolution. This would suggest that dosage-sensitive genes should be retained after subsequent chromosomal rearrangement and loss. About 20-30% of the genes can be traced back to these duplication events and they are enriched for developmental genes and members of protein complexes. These are called “ohnologs”.

What is interesting is that 60% of these are never associated with CNV events or deletions and duplications in healthy people and they are highly enriched for disease genes.

Under discussion in this talk was the characterization of Loss of Function (LoF) mutations. There are a lot of people who prefer not to use this term and would rather describe them as broken down into various classes, which can include:

Truncating nonsense SNVs

Splice disrupting mutations

Frameshift indels

Large structural variations
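The four classes above can be sketched as a small classifier. This is a hedged illustration, not anything presented in the talk: it assumes each variant arrives with REF and ALT alleles plus a Sequence Ontology style consequence term (such as `stop_gained`) from an upstream annotator, and the 50 bp size threshold for calling something a large structural variation is my own assumption.

```python
from enum import Enum


class LofClass(Enum):
    NONSENSE_SNV = "truncating nonsense SNV"
    SPLICE = "splice disrupting mutation"
    FRAMESHIFT_INDEL = "frameshift indel"
    LARGE_SV = "large structural variation"


# Assumption: events of 50 bp or more are treated as structural variation.
SV_MIN_LENGTH = 50

SPLICE_TERMS = ("splice_donor_variant", "splice_acceptor_variant")


def classify_lof(ref, alt, consequence):
    """Return the LoF class for a variant, or None if it fits none of them."""
    if max(len(ref), len(alt)) >= SV_MIN_LENGTH:
        return LofClass.LARGE_SV
    if consequence in SPLICE_TERMS:
        return LofClass.SPLICE
    if len(ref) == 1 and len(alt) == 1 and consequence == "stop_gained":
        return LofClass.NONSENSE_SNV
    # A coding indel shifts the reading frame when the net length change
    # is not a multiple of three.
    length_change = abs(len(ref) - len(alt))
    if consequence == "frameshift_variant" and length_change % 3 != 0:
        return LofClass.FRAMESHIFT_INDEL
    return None


if __name__ == "__main__":
    examples = [
        ("C", "T", "stop_gained"),
        ("G", "A", "splice_donor_variant"),
        ("AT", "A", "frameshift_variant"),
        ("A" * 60, "A", "transcript_ablation"),
        ("C", "T", "missense_variant"),
    ]
    for ref, alt, consequence in examples:
        print(consequence, "->", classify_lof(ref, alt, consequence))
```

Putting a variant into one of these classes says nothing yet about whether it actually abolishes function, which is the harder question the rest of the talk was concerned with.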

The average person carries around a hundred LoF mutations of which around 1/5th are in a homozygous state.

It was commented that people trying to divine information from e.g. 1kG datasets had to contend with lots of sequencing artefacts or annotation artefacts when assessing this.

In particular, the effects of introducing a stop codon into a transcript are hard to predict. Some of the time this will be masked by splicing events or controlled by nonsense-mediated decay, which means the variants may not be pathogenic at all.

Also, stop codons in the last exon of a gene may not be of great interest as they are unlikely to have large effects on protein conformation.

The ALOFT pipeline was developed to annotate loss of function mutations. This uses a number of resources to make predictions, including information about NMD, protein domains, gene networks (shortest path to known disease genes) as well as evolutionary conservation scores (GERP), dN/dS information from mouse and macaque and a random forest approach to classification. A list of benign variants is used in the training set, including things like homozygous stop mutations in the 1kG dataset which are assumed to be non-pathogenic. Dominant effects are likely to occur in haploinsufficient genes with an HGMD entry.