Thursday 12 February 2015

FTP upload to Galaxy using ProFTPd and PBKDF2

Recently I've been looking at enabling FTP upload for a local instance of Galaxy (based on the latest_2014.08.11 version of galaxy-dist.)

The Galaxy documentation for integrating an FTP upload server can be found at https://wiki.galaxyproject.org/Admin/Config/UploadviaFTP, and I've been working with their recommended choice of ProFTPd. Overall the instructions are pretty straightforward, but I've encountered a few issues, mostly to do with Galaxy's change from SHA1 to PBKDF2 as its default choice of password authentication. This post details how I handled these to get the upload working.

Note that I'm assuming that Galaxy is using Postgres as its database engine.

1. Get ProFTPd

ProFTP is simple to install on Scientific Linux 6 via yum:
  • yum install proftpd proftpd-postgresql
with proftpd-postgresql providing the extensions required by ProFTPd to access PostgreSQL databases.

If you need to build ProFTPd manually from source (for example because the default version doesn't have features that you need such as handling PBKDF2 password encryption - see below) then download the code from the ProFTP website and do e.g.:

# yum install postgresql-devel openssl-devel
# tar zvxf proftpd-1.3.5.tar.gz
# cd proftpd-1.3.5
# ./configure --prefix=/opt/apps/proftpd/1.3.5 --disable-auth-file --disable-ncurses --disable-ident --disable-shadow --enable-openssl --with-modules=mod_sql:mod_sql_postgres:mod_sql_passwd
# make ; make install

Note that the final step must be performed with superuser privileges.

2. Check how your Galaxy installation handles password encryption

Galaxy appears to support two types of password encryption: older versions of Galaxy use SHA1 to encrypt its passwords, whereas newer versions use a more sophisticated protocol called PBKDF2.

If you're using SHA1 then configuring ProFTPd is pretty straightforward, and the instructions on the Galaxy wiki should work out of the box. If you're using PBKDF2 then the configuration is a little more involved.

You can configure Galaxy to explicitly revert to SHA1 by setting use_pbkdf2 = False in the configuration files, in the [app:main] section; however by default PBKDF2 is used, and this blog post assumes that this is the case.

3. (Optionally) Set up a database user specifically for FTP authentication

This is not critical but is recommended. When users try to upload files to the FTP server they will log in using their Galaxy username and password. In order to enable this ProFTPd needs to be able to query Galaxy's database to check these credentials, and doing it via a database user with limited privileges (essentially only SELECT on the galaxy_user table) is more secure than via the one that Galaxy itself uses.

For Postgresql the instructions given on the Galaxy wiki are fine.

4. Create an area where ProFTPd will put uploaded files (and point Galaxy to it)

This should be a directory on the system which is readable by the Galaxy user. The ftp_upload_dir parameter in the Galaxy config file should be set to point to this location.

(It appears that you also need to set a value for ftp_upload_site in order for the uploaded files to be presented to the user when they got to "Upload Files".)

5. Configure ProFTPd

ProFTPd's default configuration file is located in /etc/proftpd.conf (if using the default system installation), or otherwise in the etc subdirectory where you installed ProFTPd if you built your own.

5.1 Configuring ProFTPd to use SHA1 password authentication

The Galaxy documentation gives an example ProFTPd config file that should work for the old SHA1 password encryption. I don't cover using SHA1 any further in this post.

5.2 Configuring ProFTPd to use PBKDF2 password authentication

As this is not documented on the Galaxy wiki, I used a sample ProFTPd configuration posted by Ricardo Perez in this thread from the Galaxy Developers mailing list as a starting point: http://dev.list.galaxyproject.org/ProFTPD-integration-with-Galaxy-td4660295.html - his example was invaluable to me for getting this working.

Here's a version of the ProFTPd conf file that I created to enable PBKDF2 authentication:


Note that the SQLPasswordPBKDF2 directive is not available in ProFTPd before version 1.3.5rc3, so check which version you're using.

(It should also be possible to configure ProFTPd to use both SHA1 and PBKDF2 authentication, and there are hints on how to do this in Ricardo's message linked above. However I haven't tried implementing it yet.)

6. Test your ProFTPd settings

ProFTPd can be run as a system service but during initial setup and debugging I found it useful to run directly from a console. In particular:
  • proftpd --config /path/to/conf_file -t performs basic checks on the conf file and warns if there are any syntax errors or other problems
  • proftpd --config /path/to/conf_file -n starts the server in "no daemon" mode
  • proftpd --config /path/to/conf_file -n -d 10 runs in debugging mode with maximal output, which is useful for diagnosing problems with the authentication.
On our system I also needed to update the Shorewall firewall settings to allow external access to port 21 (the default port used by FTP services), by editing the /etc/shorewall/rules file.

You can then test by ftp'ing to the server and checking that you can log in using your Galaxy credentials, upload a file, see that it appears in the correct place on the file with the correct file ownership and permissions (it should be read/writeable by the user running the Galaxy process), and check that Galaxy's upload tool presents it as an option.

If any of these steps fail then running ProFTPd with the debugging option can be really helpful in understanding what's happening behind the scenes.

One other gotcha is that if the Galaxy user UID or GID is less than 999, then you will need to set SQLMinID (or similar) in the ProFTPd conf file to a suitable value, otherwise the uploaded files will not be assigned to the correct user (you can get the UID/GID using the "id" command).

7. Make ProFTPd run as a service

If everything appears to be working then you can set up ProFTP to run as a system service - if you're using the system installed version then there should already be be an /etc/init.d/proftpd file to allow you to do

service proftpd start

Otherwise you will need to make your own init.d script for ProFTPd - I used the one in the documentation at http://www.proftpd.org/docs/howto/Stopping.html as a starting point, put it into /etc/init.d/ and edited the FTPD_BIN and FTPD_CONF variables to point to the appropriate files for my installation.

Once this is done you should have FTP uploads working with Galaxy using PBKDF2 password authentication.

Updates: fixed typos in name of "PBKDF2" and clarify that SHA1 is not used (27/02/2015).

Welcome

I'm starting this blog principally as an on-going way to document what I've learned about working with Penn State's Galaxy platform for data intensive biomedical research. Part of my current role involves supporting a number of Galaxy instances at the University of Manchester, which includes the set up, maintenance and administration of Galaxy for teaching and research. I also do some development of new tools within the Galaxy framework, in collaboration with local bioinformaticians and researchers. For want of a better term I am a "galactic engineer"; and given this focus I expect that my posts will be more technical than scientific.

I love Galaxy: it's a great platform supported by talented developers and a friendly and active community, and we have benefited from deploying it locally both for research and training. However I find it can sometimes be tricky to keep up with "best practice" as Galaxy continues to evolve- especially given the limited time I have to spend on it alongside other things. So while this blog is primarily for my own benefit (and the information is likely to go out of date over time) I hope that it might still be of use to others out there like me in their own efforts to set up and use Galaxy.

Finally, a disclaimer: all opinions, mistakes and misapprehensions are my own!