Tutorial: Running MAFFT in Galileo

Written and developed by Agustin Pardo

Written and developed by

Matthew Gasperetti

matthew@hypernetlabs.io

Alexander Berry

alexander@hypernetlabs.io

Getting started with MAFFT in Galileo

To get started with Galileo, log into your account using Firefox or Chrome and download our MAFFT example folder from GitHub. The downloaded folder contains an example .fasta file.

MAFFT

This tutorial demonstrates how to use MAFFT. MAFFT creates multiple sequence alignments (MSAs) of nucleotide or protein sequences. It lines all the sequences up against one another so that you can see where they match and where they differ:

Multiple sequence alignment of 18 protein sequences. Each color represents an amino acid, so you can see at which positions the sequences share the same amino acid and at which they differ. Note: MAFFT cannot visualize the alignment.

Our example folder contains the file “input.fasta”. The MAFFT program takes as input a file of nucleotide or protein sequences in a format called “FASTA”, hence the name “input.fasta”. This file format is commonly used in bioinformatics.
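For readers who have not seen FASTA before, here is a minimal sketch of how such a file is laid out and how it could be read. It assumes the common convention that each record starts with a “>” header line followed by one or more lines of sequence. The function name parse_fasta and the two sample records are made up for illustration and are not taken from the tutorial’s input.fasta.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of record header -> sequence string."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A ">" line starts a new record; the rest of the line is its name.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence data may be wrapped over several lines; join them.
            records[header] += line
    return records


# Hypothetical sample data, only to show the layout of a FASTA file.
example = """>seq1 example protein
MKTAYIAKQR
QISFVKSHFS
>seq2 example protein
MKTAYIAKQRQISF
"""

records = parse_fasta(example.splitlines())
for name, sequence in records.items():
    print(name, len(sequence))
```

To inspect the example file itself, the same function could be called on an open file handle, for instance parse_fasta(open("input.fasta")).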

Understanding the user interface

When you log into Galileo, the first thing you’ll see is your Dashboard:

View of the Galileo Dashboard

To run the MAFFT example, start by navigating to the Missions tab using the side menu.

Using the Configuration Wizard to create your own project

Drag and drop the entire MAFFT example folder you downloaded from our GitHub to the “Add a mission” staging area. The Configuration Wizard will appear to help you create the appropriate computing environment to run MAFFT. Select the mission type (MAFFT), give the job a name, and provide the wizard with the names of the input .fasta file and the output .fasta file. Make sure the name of the input file matches that of the data file you uploaded.

Click the “Submit” button. Once the folder has been uploaded, click on the “Run mission” button in the newly-created “Galileo-examples-MAFFT” mission below the staging area. You will be asked to select a station on which to run the mission.

Choose the “Linux” station to begin and click “Run mission”. In addition to the MAFFT example job, you can also upload any .fasta file and follow the same process. After the mission has been launched, you’ll be able to see the job running in the Jobs tab:

The job runs quickly in Galileo – try running it locally and comparing the run times
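If you would like to try the local comparison, the following is a rough sketch of one way to time a local run. It assumes that MAFFT is installed on your machine and available on your PATH as “mafft”, and it uses MAFFT’s “--auto” option, which lets the program pick an alignment strategy. The helper names build_mafft_command and run_mafft_locally are hypothetical, and this is not necessarily the exact command Galileo runs for the mission.

```python
import shutil
import subprocess
import time
from typing import List


def build_mafft_command(input_path: str) -> List[str]:
    """Build the argument list for a basic MAFFT run."""
    return ["mafft", "--auto", input_path]


def run_mafft_locally(input_path: str, output_path: str) -> float:
    """Run MAFFT on input_path and return the wall-clock time in seconds."""
    if shutil.which("mafft") is None:
        raise RuntimeError("mafft executable not found on PATH")
    start = time.perf_counter()
    with open(output_path, "w") as out:
        # MAFFT writes the alignment to standard output, so redirect it to a file.
        subprocess.run(build_mafft_command(input_path), stdout=out, check=True)
    return time.perf_counter() - start


# Show the command that would be executed, without running it.
command = build_mafft_command("input.fasta")
print(" ".join(command))
```

Calling run_mafft_locally("input.fasta", "results.fasta") from inside the example folder would then give a local run time to set against the time shown in the Jobs tab.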

When the example job completes, hit the Download button under Action to download the results:

The results will be downloaded as a .zip file. It contains an output.log file, which records the output of the analysis, and a folder called results, which stores the plots and other files created by the analysis.

The downloaded .zip file contains a folder called results and a file called output.log

The output is contained in the results.fasta file:
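As a quick sanity check on the output, the sketch below reads an aligned FASTA file and reports the alignment length and the number of fully identical columns. It assumes that gaps are written as “-” and that every aligned sequence has the same length, which is what a multiple sequence alignment should look like. The function names and the three sample sequences are invented for illustration and are not the contents of results.fasta.

```python
from typing import Dict, Iterable, List, Tuple


def read_alignment(lines: Iterable[str]) -> Dict[str, str]:
    """Read aligned FASTA records into a header -> aligned sequence mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif line and header is not None:
            records[header] += line
    return records


def summarize_alignment(sequences: List[str]) -> Tuple[int, int]:
    """Return (alignment length, number of gap-free identical columns)."""
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences should all have the same length")
    # zip(*sequences) walks the alignment column by column.
    identical = sum(
        1 for column in zip(*sequences)
        if len(set(column)) == 1 and column[0] != "-"
    )
    return lengths.pop(), identical


# Hypothetical aligned sequences, only to demonstrate the check.
example = """>a
MK-TAY
>b
MKQTAY
>c
MK-SAY
"""

alignment = read_alignment(example.splitlines())
summary = summarize_alignment(list(alignment.values()))
print(summary)
```

For the real output, read_alignment could be given an open handle on the results.fasta file from the downloaded results folder.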

We hope this tutorial was helpful. Please let us know if you have any questions or any problems using Galileo. Your feedback is extremely important to us. Contact us anytime at matthew@hypernetlabs.io or alexander@hypernetlabs.io.