For editing genome annotations, many of my colleagues use Artemis while others use Apollo. For my own use, I’ve usually just written scripts that generate GFF and visualized the output in GBrowse, JBrowse, or IGV. For the genomics class I co-teach, we’ve had students edit GFF in a text editor (emacs!) and display it in IGV. But this year we moved much of what we used to do on the command line to our local teaching Galaxy, so after many years of avoiding them, I need to get up to speed quickly with Artemis and/or Apollo (in the long run, we’re going to use WebApollo, but that isn’t happening before the next homework assignment). Development of desktop Apollo has stopped and it’s not clear what the status of Artemis is, so this learning exercise may not be that useful.
To teach the kinds of things that MAKER does as a complete workflow, we are showing students how to take pieces of ab initio and data-driven evidence and assemble by hand the kind of evidence stack that MAKER automates. This means that we want to start with an undecorated fasta file of our artificial genome and load a bunch of gff, gtf, and bam files.
Everything below was done on a MacBook Air running OS X 10.9 (Mavericks).
Loading a fasta file
It seems like there are a couple of ways to do this. I was able to load my fasta file using either File > Read an entry or by invoking a project manager (which only seems to be available from the File menu if nothing else is open). I initially opened a copy of my fasta file from a directory I had used with IGV, but found that this caused saves to fail because there was also a fasta index file present. After copying the file into my Artemis working directory, I was able to open and save it. This is what the viewer looks like.
The top line of the viewer shows a selector for feature sets, aka “Entries” in Artemis’ jargon. Below the entry bar (which can be hidden), the viewer shows an overview and a detailed view. Scroll bars on the right allow you to adjust the zoom of each; you can make the lower panel more of an overview than the top if you want. Double clicking on either panel jumps the other to the area you are viewing. A variety of graph options for things like GC content are available and open as additional panels. As you zoom out, Artemis shows stop codons in all 6 reading frames. As you zoom in, you get amino acid and DNA sequences.
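For anyone wondering what a GC content graph is plotting, here is a minimal Python sketch of a sliding-window GC calculation. The function name and the window and step sizes are my own choices for illustration, not Artemis defaults, and this is not how Artemis itself implements the plot.

```python
def gc_content_windows(seq, window=120, step=60):
    """Return a list of (start, gc_fraction) for each full window.

    start is 0-based; window and step are illustrative values,
    not settings taken from Artemis.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        results.append((start, gc / window))
    return results
```

Calling gc_content_windows on the genome sequence gives one value per window, which is the kind of series a GC graph panel draws along the sequence.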
Layers of annotations are “Entries”, so I can load additional files in different formats or create them using Artemis’ built-in tools. For example, Create > Mark Open Reading Frames gives this:
Several things have changed.
- We have a new entry “ORFS_100+” (I used the default lower limit of 100 aa for ORF calling) on the entries bar.
- The panels are now decorated with aqua blocks showing CDS features
- The bottom panel shows a textual list of CDS features
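To make the ORF-marking idea concrete, here is a rough Python sketch that scans all six reading frames for stop-free stretches of at least a minimum number of codons. It assumes an ORF is simply a run of codons between stop codons (no start codon required) and uses the standard stop codons; I have not checked that this matches exactly what Artemis does, and the function names are made up for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (non-ACGT characters are left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def open_frames(seq, min_aa=100):
    """Yield (strand, frame, start, end) for stop-free runs of >= min_aa codons.

    start and end are 0-based, end-exclusive, measured on the strand that
    was scanned. Assumption: an ORF is any stop-to-stop stretch.
    """
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq).upper())):
        for frame in range(3):
            run_start = frame
            for pos in range(frame, len(s) - 2, 3):
                if s[pos:pos + 3] in STOPS:
                    if (pos - run_start) // 3 >= min_aa:
                        yield (strand, frame, run_start, pos)
                    run_start = pos + 3
            # Trailing run that reaches the last complete codon in this frame.
            last = frame + ((len(s) - frame) // 3) * 3
            if (last - run_start) // 3 >= min_aa:
                yield (strand, frame, run_start, last)
```

With min_aa=100 this mirrors the lower limit I used above; lowering it produces many more (and shorter) candidate ORFs.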
I loaded a couple more entries as gff files:
- Augustus gene prediction
- Blastx parsed with a bioperl script I wrote
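My actual Blastx-to-GFF conversion was a bioperl script, which is not shown here. As a stand-in, this is a small Python sketch of the same idea. It assumes the hits are in BLAST's 12-column tabular format (outfmt 6) with the genome as the query, and the choice of "protein_match" as the feature type and the function name are assumptions for illustration.

```python
def blast_tab_to_gff3(line, source="blastx"):
    """Convert one 12-column BLAST tabular hit to a GFF3 line.

    Assumed column order: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    f = line.rstrip("\n").split("\t")
    qseqid, sseqid = f[0], f[1]
    qstart, qend = int(f[6]), int(f[7])
    evalue = f[10]
    # blastx reports reverse-strand hits with qstart > qend.
    strand = "+" if qstart <= qend else "-"
    start, end = min(qstart, qend), max(qstart, qend)
    attrs = f"Name={sseqid};Target={sseqid} {f[8]} {f[9]}"
    return "\t".join(
        [qseqid, source, "protein_match", str(start), str(end),
         evalue, strand, ".", attrs]
    )
```

Each hit becomes one feature on the genome, with the E-value in the score column and the protein coordinates kept in the Target attribute.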
To get this view I tried some additional options from the Display menu. I first tried Display > Show One Line Per Entry View; the view shown here is Display > Feature Stack View. Both options create another panel above the overview genome panel.
There are some nice things about the display, but other parts are kind of a mess:
- I like how the coding exons are linked across different reading frames
- The parent-child feature relationships seem to be incomplete. CDS features are linked within a transcript, but parts of the same gene feature are displayed separately, and are stacked onto each other in a way that is hard to see.
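One way to check whether the parent-child links are actually present in the file (as opposed to being lost in the display) is to group features by their Parent attribute. This is a minimal Python sketch that assumes GFF3-style ID= and Parent= attributes in column 9; my evidence files may not all follow that convention, and the function names are invented for the example.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Parse a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def children_by_parent(gff_lines):
    """Map each parent ID to a list of (type, start, end) child features."""
    children = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = parse_attributes(cols[8])
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                children[parent].append((cols[2], int(cols[3]), int(cols[4])))
    return dict(children)
```

If a gene ID never shows up as a parent, or exons and CDS features point at different parents, that would explain why pieces of the same gene end up drawn separately.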
Create a new set of annotations
Create > New Entry adds an entry to the entry bar called “no_name”. Yes, really. There’s no field to name the entry when you create it. You have to use Entries > Set Name of Entry and pick the no_name entry before you can rename it.
Features can be copied from the evidence entry sets to your custom entry and then edited. But I don’t think I’ve found the right way to copy a whole set of related features (e.g. gene, transcript, introns, CDS) together.
That’s where I am so far… more later, perhaps.
Artemis manual (ftp/pdf)
- Berriman and Rutherford (2003) Brief Bioinform 4:124-32
- NOVA course for Pseudomonas-Plant Interaction