Getting started with MAFFT in Galileo
This tutorial demonstrates how to use MAFFT in Galileo. MAFFT creates multiple sequence alignments (MSAs) of nucleotide or protein sequences: it aligns all the sequences together so that you can see where they differ.
The example folder contains the file “input.fasta”. MAFFT takes as input a file of nucleotide or protein sequences in a format called “FASTA”, hence the name “input.fasta”. This file format is commonly used in bioinformatics.
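To make the FASTA format concrete, here is a minimal sketch of how such a file can be read in Python. It assumes the common layout in which each record starts with a header line beginning with “>” followed by one or more lines of sequence. The function name parse_fasta and the two sample records are hypothetical and are not taken from the example folder.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new record; drop the leading ">".
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical sample data, for illustration only.
example = """>seq1 hypothetical sample
ACGTACGT
ACGT
>seq2 hypothetical sample
ACGTTCGT
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

The sequences in a MAFFT input file do not need to have the same length; making them line up is the job of the alignment.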
Understanding the user interface
When you log in to Galileo, the first thing you’ll see is your Dashboard.
To run the MAFFT example, start by navigating to the Missions tab using the side menu.
Using the Configuration Wizard to create your own project
Drag and drop the entire MAFFT example folder you downloaded from our GitHub to the “Add a mission” staging area. The Configuration Wizard will appear to help you create the appropriate computing environment to run MAFFT. Select the mission type (MAFFT), give the job a name, and provide the wizard with the names of the input .fasta file and the output .fasta file. Make sure the name of the input file matches that of the data file you uploaded.
Click the “Submit” button. Once the folder has been uploaded, click the “Run mission” button in the newly created “Galileo-examples-MAFFT” mission below the staging area. You will be asked to select a station on which to run the mission.
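Because the input name typed into the wizard has to match the uploaded file exactly, it can help to check the folder locally before uploading. The following is a small sketch of such a check. It is not part of Galileo; the helper names and the folder path are hypothetical.

```python
from pathlib import Path


def find_fasta_files(folder):
    """Return the sorted names of .fasta files directly inside `folder`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".fasta"
    )


def input_name_matches(folder, input_name):
    """True if `input_name` is exactly the name of a .fasta file in `folder`."""
    return input_name in find_fasta_files(folder)


# Hypothetical local path to the downloaded example folder.
folder = "MAFFT"
if Path(folder).is_dir():
    print(find_fasta_files(folder))
    print(input_name_matches(folder, "input.fasta"))
```

The comparison is case-sensitive on purpose, so that a name such as “Input.fasta” is reported as a mismatch.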
Choose the “Linux” station to begin and click “Run mission”. In addition to the MAFFT example job, you can also upload any .fasta file and follow the same process. After the mission has been launched, you’ll be able to see the job running in the Jobs tab.
The results are downloaded as a .zip file. It contains an output.log file with the results of the analysis and a folder called results, which stores plots and other files created by the analysis.
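If you prefer to inspect the download from a script, the archive can be opened with Python’s standard zipfile module. This is a minimal sketch; the archive name “results.zip” is an assumption, since the actual name of the downloaded file may differ.

```python
import os
import zipfile


def list_results(zip_path):
    """Return the names of all entries in the downloaded results archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Hypothetical file name; replace it with the name of your download.
zip_path = "results.zip"
if os.path.exists(zip_path):
    for name in list_results(zip_path):
        print(name)
```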
The aligned sequences are contained in the results.fasta file.
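In an alignment, every sequence is padded with gap characters (“-”) so that all sequences have the same length, which makes it possible to compare them column by column. The sketch below shows one simple way to find the columns where aligned sequences differ. The function name and the two short aligned sequences are hypothetical and are not actual MAFFT output.

```python
def differing_columns(aligned):
    """Return the 0-based column indices where the aligned sequences differ.

    `aligned` maps sequence names to aligned sequences of equal length,
    with "-" marking gaps.
    """
    seqs = list(aligned.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    # A column differs if it contains more than one distinct character.
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


# Hypothetical aligned sequences, for illustration only.
aligned = {
    "seq1": "ACGT-ACGT",
    "seq2": "ACGTTACGA",
}
print(differing_columns(aligned))  # prints [4, 8]
```

Here column 4 differs because seq1 has a gap where seq2 has a T, and column 8 differs because of a T/A substitution.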