Galaxy installation under production mode
David Roquis - 24aug2013

Walkthrough for installing Galaxy in production mode, for better efficiency and speed.
It uses NGINX as a proxy server for uploads and downloads, PostgreSQL as the database manager and virtualenv as a virtual Python environment (for increased stability).
You need a Python version between 2.6 and 2.7.3. Python 3 is not supported. Working inside virtualenv lets you work around a system Python that does not meet this requirement.
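
As a quick check before starting, the sketch below tests whether the Python found on your PATH is in the supported range. The helper name python_version_ok is hypothetical, and the sketch assumes that any 2.6.x or 2.7.x release is acceptable.

```shell
#!/bin/sh
# Hypothetical helper (not part of Galaxy): succeeds for 2.6.x and 2.7.x only.
# Assumption: every 2.6/2.7 point release is acceptable.
python_version_ok() {
    case "$1" in
        2.6|2.6.*|2.7|2.7.*) return 0 ;;
        *) return 1 ;;
    esac
}

# "python -V" prints "Python X.Y.Z"; Python 2 writes it to stderr, hence 2>&1.
if command -v python >/dev/null 2>&1; then
    version=$(python -V 2>&1 | awk '{print $2}')
    if python_version_ok "$version"; then
        echo "Python $version is supported"
    else
        echo "Python $version is not supported, use virtualenv with Python 2.6 or 2.7"
    fi
fi
```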

I multithreaded the Galaxy processes into 2 web servers, 1 job manager and 2 job handlers.
This configuration should work on any computer with 6 cores or more. On a 4-core system, use only 1 web server and 1 job manager.
For more detailed information, please refer to the Galaxy wiki.
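
The rule of thumb above can be sketched as a small shell function. The name suggest_layout is hypothetical, and treating 4 and 5 cores the same way, as well as the message for smaller machines, is an assumption rather than something this walkthrough tested.

```shell
#!/bin/sh
# Hypothetical helper: maps a core count to the process layout described above.
suggest_layout() {
    cores=$1
    if [ "$cores" -ge 6 ]; then
        echo "2 web servers, 1 job manager, 2 job handlers"
    elif [ "$cores" -ge 4 ]; then
        echo "1 web server, 1 job manager"
    else
        # Assumption: machines with fewer than 4 cores are outside this guide.
        echo "not covered by this walkthrough"
    fi
}

# getconf reports the number of online processors; fall back to 1 if it fails.
suggest_layout "$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
```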

I strongly suggest that you do this install under a new user named "galaxy2production" with limited administrative privileges.

This install method was tested under Fedora 13, 16, 17 and 18.
On another Linux system, replace "yum" with the command for your local package manager.

Four configuration files need to be created/edited to complete this installation. They can be downloaded here:
Modifications to add to the galaxy configuration file
NGINX proxy server configuration file
Galaxy init file
NGINX init file

Command lines that must be typed in a terminal are displayed in italic.

Error logs (very useful for troubleshooting) are stored in the Galaxy main directory, in paster.log (no multithreading) or in web#.log, handler#.log and manager#.log if you multithread.
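
A minimal sketch for looking at these logs in one go follows. The helper name galaxy_logs is hypothetical, and the path ~/galaxy-dist assumes the install location used later in this walkthrough.

```shell
#!/bin/sh
# Hypothetical helper: prints the Galaxy log files that exist in a directory.
galaxy_logs() {
    dir=$1
    for f in "$dir"/paster.log "$dir"/web*.log "$dir"/handler*.log "$dir"/manager*.log; do
        # Unmatched patterns stay literal, so keep only real files.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}

# Show the last 20 lines of every log found (assumes Galaxy lives in ~/galaxy-dist).
galaxy_logs "$HOME/galaxy-dist" | while read -r log; do
    echo "== $log =="
    tail -n 20 "$log"
done
```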


Table of contents
    1. Install Galaxy dependencies
    2. Install PostgreSQL as a database manager
    3. Install a Python virtual environment
    4. Create an init script
    5. Install NGINX with upload module
    6. Scaling and load balancing mode
    7. Set Galaxy on your network
    8. Troubleshooting
        8.1. Disabling AJAX upload module
        8.2. Removing a stuck job from the job manager


1. Install Galaxy dependencies

#install nedit as a text editor (optional)
yum install nedit

#install mercurial (as root), allows automatic updates of Galaxy
yum install '*mercurial*'

#Install git (as root) --> already in Fedora 16 and above
yum install git
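
#To confirm that the tools installed above are on your PATH, a minimal sketch
#(the helper name missing_commands is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: prints every command from its arguments that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
    return 0
}

missing=$(missing_commands hg git)
if [ -n "$missing" ]; then
    echo "Still missing:" $missing
else
    echo "hg and git are available"
fi
```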

#Create new unix user (as root). Here we used galaxy2production but name it the way you want.
useradd galaxy2production
passwd galaxy2production
exit
su - galaxy2production

#Install galaxy via mercurial
hg clone https://bitbucket.org/galaxy/galaxy-dist/

#Create the galaxy config file
cd ~/galaxy-dist
cp universe_wsgi.ini.sample universe_wsgi.ini
#Note: This file has to be edited. All the changes are listed in the file "galaxy universe modifications.txt",
#which can be downloaded at the top of this page.


2. Install PostgreSQL as a database manager

#install postgresql (as root), high performance database --> already in Fedora 16
#the server package provides the postgresql service used below
yum install postgresql postgresql-server

#Install phpPgAdmin (as root) and configure it --> allows easy editing of the Galaxy database
yum install phpPgAdmin.noarch
#access phpPgAdmin at http://localhost/phpPgAdmin

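#initialize and start postgresql (as root)
service postgresql initdb
service postgresql start
chkconfig postgresql on
#--if that doesn't work (Unknown operation initdb), initialize the database this way instead--
su - postgres -c "PGDATA=/var/lib/pgsql/data initdb"
service postgresql start
chkconfig postgresql on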

#create new user and db in postgresql (do as root)
#the database name is quoted because PostgreSQL folds unquoted names to lowercase, while psql -d is case sensitive
su - postgres
psql template1
CREATE USER galaxy2production WITH PASSWORD 'galaxy2production';
CREATE DATABASE "galaxyprodSQL";
GRANT ALL PRIVILEGES ON DATABASE "galaxyprodSQL" TO galaxy2production;
\q

#test galaxy2production user login
su - galaxy2production
psql -d galaxyprodSQL -U galaxy2production
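
#Galaxy is pointed at this database through the database_connection setting in universe_wsgi.ini.
#The sketch below only builds the value; the helper name galaxy_db_url is hypothetical, and the
#postgres:// URL layout, the host (localhost) and the port (5432) are assumptions to check against
#your own setup. Use the database name exactly as "\l" lists it in psql, since names are case sensitive here.

```shell
#!/bin/sh
# Hypothetical helper: builds a connection URL from user, password, host, port, database.
galaxy_db_url() {
    printf 'postgres://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Credentials from the steps above; host and port are assumptions (local server, default port).
echo "database_connection = $(galaxy_db_url galaxy2production galaxy2production localhost 5432 galaxyprodSQL)"
```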


3. Install a Python virtual environment

#install virtualenv (Python virtual environment)
wget https://raw.github.com/pypa/virtualenv/master/virtualenv.py

#create a sandbox for Python using virtualenv
python ./virtualenv.py --no-site-packages galaxy_env
. ./galaxy_env/bin/activate
which python

#Edit ~/.bashrc to define TEMP and to source the virtualenv, by adding these lines:
source ~/galaxy_env/bin/activate

TEMP=$HOME/galaxy-dist/database/tmp
export TEMP

#Ensure that ~/.bash_profile sources ~/.bashrc
# this should be in ~/.bash_profile
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
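
#If you prefer to script the ~/.bashrc edits, here is a minimal sketch that adds a line only
#when it is not already present, so it can be run more than once. The helper name append_once
#is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: appends a line to a file unless the exact line is already there.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines, -F treats the line as a fixed string, not a pattern.
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (run as the galaxy user):
#   append_once ~/.bashrc 'source ~/galaxy_env/bin/activate'
#   append_once ~/.bashrc 'export TEMP=$HOME/galaxy-dist/database/tmp'
```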


4. Create an init script

#Create a script at /etc/init.d/galaxy (you can download it at the top of this page)
#Make the script executable and add it to system service
sudo chmod 755 /etc/init.d/galaxy
sudo /sbin/chkconfig --add galaxy
#The last step does not work on every system; depending on yours, you might have to register the service a different way.
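
#One possible cause of a failing "chkconfig --add" is an init script without the "# chkconfig:" and
#"# description:" header comments that chkconfig reads. The sketch below checks for them; the helper
#name has_chkconfig_header is hypothetical, and other causes are possible.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when an init script carries both header
# comments that chkconfig looks for ("# chkconfig:" and "# description:").
has_chkconfig_header() {
    grep -q '^#[[:space:]]*chkconfig:' "$1" && grep -q '^#[[:space:]]*description:' "$1"
}

script=/etc/init.d/galaxy
if [ -f "$script" ]; then
    if has_chkconfig_header "$script"; then
        echo "$script has chkconfig tags"
    else
        echo "$script lacks chkconfig tags"
    fi
fi
```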


5. Install NGINX with upload module

#Install nginx (as root for yum), a high performance web server and reverse proxy. It will save a lot of trouble when downloading large files from Galaxy
#The upload module allows NGINX to handle uploads from users to the Galaxy database
#This module is a HUGE time saver and will sometimes allow you to overcome the 2 GB upload cap.
yum install pcre-devel zlib-devel openssl-devel
wget http://nginx.org/download/nginx-1.2.2.tar.gz
tar xvf nginx-1.2.2.tar.gz
wget http://www.grid.net.ru/nginx/download/nginx_upload_module-2.2.0.tar.gz
tar xvf nginx_upload_module-2.2.0.tar.gz
cd nginx-1.2.2
./configure --sbin-path=/usr/local/sbin --with-http_ssl_module --add-module=../nginx_upload_module-2.2.0
make
#if make doesn't work, try downloading the upload module directly from branch 2.2 at
#https://github.com/podados/nginx-upload-module/tree/2.2
sudo make install
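
#To confirm that the upload module was compiled in, you can inspect the configure arguments that
#"nginx -V" prints. A minimal sketch, assuming the binary ended up at /usr/local/sbin/nginx
#(the helper name configure_has_module is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given "nginx -V" output lists a module
# that was added with --add-module.
configure_has_module() {
    printf '%s\n' "$1" | grep -q -- "--add-module=[^ ]*$2"
}

# nginx -V prints its version and configure arguments on stderr, hence 2>&1.
if [ -x /usr/local/sbin/nginx ]; then
    if configure_has_module "$(/usr/local/sbin/nginx -V 2>&1)" nginx_upload_module; then
        echo "upload module compiled in"
    else
        echo "upload module missing, rebuild with --add-module"
    fi
fi
```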

#Create a script at /etc/init.d/nginx (you can download it at the top of this page)
#Make the script executable and add it to system service
sudo chmod 755 /etc/init.d/nginx
sudo /sbin/chkconfig nginx on
sudo /sbin/chkconfig --list nginx

#Modify /usr/local/nginx/conf/nginx.conf (you can download it at the top of this page)

#Start galaxy and nginx
sudo /etc/init.d/galaxy start
sudo /etc/init.d/nginx start
#The first galaxy startup will most probably fail, as it has to fetch some python eggs and update the database schema

#Galaxy accessible at localhost/galaxy

#To look for galaxy updates, go to the galaxy-dist folder
hg incoming
#If there is a new version, update with
hg pull -u


6. Scaling and load balancing mode

#Allows running multiple galaxy instances to do multithreading
#Follow the changes in universe_wsgi.ini (you can download it at the top of this page).
#uncomment lines 14 and 34 and comment out lines 16 and 36 in /etc/init.d/galaxy
#Follow the instructions on line 34 in /usr/local/nginx/conf/nginx.conf (add one extra line for each web server used).
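
#The "one extra line for each web server" in nginx.conf can be generated rather than typed by hand.
#A minimal sketch, assuming each web server is listed as a "server localhost:PORT;" line and that
#the ports are consecutive starting at 8080; both are assumptions, so check them against your own
#nginx.conf and universe_wsgi.ini. The helper name upstream_lines is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one nginx "server" line per Galaxy web process,
# starting at a base port and counting upward.
upstream_lines() {
    count=$1
    port=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "    server localhost:$((port + i));"
        i=$((i + 1))
    done
}

# Two web servers on ports 8080 and 8081 (assumed ports).
upstream_lines 2 8080
```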
 

7. Set Galaxy on your network

#edit /etc/hosts using the IP and the hostname of the machine
#(use ifconfig to get the IP and hostname to get the hostname)
#for example, if the hostname is someserver1.host-name.net and the IP is 192.1.0.123:
::1 localhost.someserver1.host-name.net localhost
127.0.0.1 localhost.someserver1.host-name.net localhost
192.1.0.123 someserver1.host-name.net someserver1
192.1.0.123 someserver1.host-name.net
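
#The line with the short alias can be derived from the IP and the full hostname. A minimal sketch
#(the helper name hosts_entry is hypothetical; it only prints the line, it does not edit /etc/hosts):

```shell
#!/bin/sh
# Hypothetical helper: prints "IP FQDN SHORTNAME" in /etc/hosts format.
hosts_entry() {
    ip=$1
    fqdn=$2
    # ${fqdn%%.*} keeps everything before the first dot (the short host name).
    echo "$ip $fqdn ${fqdn%%.*}"
}

hosts_entry 192.1.0.123 someserver1.host-name.net
```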

#in firewall, choose persistent configuration and open TCP ports used by NGINX
#See NGINX configuration file (you can download it at the top of this page) to see
#which ports are used (usually 80 or 8080).


8. Troubleshooting

#Most troubleshooting can be done by looking in the Galaxy Wiki or Mailing List. However, here is a list of specific problems we faced and how to fix them.

    8.1. Disabling AJAX upload module

#In some browsers, the data upload keeps loading but never finishes. Here is how to fix this:
#In tools/data_source/upload.xml change

<param name="file_data" type="file" size="30" label="File" ajax-upload="true" help="TIP: Due to browser limitations, uploading files larger than 2GB is guaranteed to fail.  To upload large files, use the URL method (below) or FTP (if enabled by the site administrator).">

#to

<param name="file_data" type="file" size="30" label="File" ajax-upload="False" help="TIP: Due to browser limitations, uploading files larger than 2GB is guaranteed to fail.  To upload large files, use the URL method (below) or FTP (if enabled by the site administrator).">
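
#The same change can be made with sed instead of a text editor. A minimal sketch that keeps a .bak
#copy of the original file (the helper name disable_ajax_upload is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: rewrites ajax-upload="true" to ajax-upload="False" in a file,
# keeping a .bak copy of the original next to it.
disable_ajax_upload() {
    file=$1
    cp "$file" "$file.bak" &&
    sed 's/ajax-upload="true"/ajax-upload="False"/' "$file.bak" > "$file"
}

# Usage, from the Galaxy main directory:
#   disable_ajax_upload tools/data_source/upload.xml
```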


    8.2. Removing a stuck job from the job manager

#It sometimes happens that a job is stuck in the job manager with no way to remove it (no checkbox for deletion).
#This is caused by a bug in the database, and the entry can easily be edited to clear it.

#Install and start pgadmin3 to edit the database

yum install pgadmin3
pgadmin3

#Log in to the galaxyprodSQL database. You will need to enter the following info (it may differ depending on how you installed
#the database).
Name: galaxyprodSQL
Port: 5432
MaintenanceDB: postgres
Username: galaxy2production
Password: galaxy2production

#Under /Tables/Jobs, find the row with the job ID corresponding to the one stuck in your job manager, and type "error" in
#column "state character" (instead of "upload" or whatever else is written there).
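
#The same edit could in principle be done from the command line with psql. The sketch below only
#prints the SQL statement so you can review it first. The helper name stuck_job_sql is hypothetical,
#and the table name "job" and the column names "id" and "state" are assumptions: check them in
#pgadmin3 before running anything.

```shell
#!/bin/sh
# Hypothetical helper: prints an UPDATE statement that marks one job as "error".
# Assumed names: table "job", columns "id" and "state".
stuck_job_sql() {
    case "$1" in
        # Reject empty or non-numeric ids so nothing unexpected ends up in the SQL.
        ''|*[!0-9]*) echo "job id must be a number" >&2; return 1 ;;
    esac
    echo "UPDATE job SET state = 'error' WHERE id = $1;"
}

# Usage: review the statement, then pipe it to psql as the Galaxy database user:
#   stuck_job_sql 42 | psql -d galaxyprodSQL -U galaxy2production
```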