Deploying Production Galaxy Instances on OpenStack with CloudBioLinux and CloudMan

John Chilton
Minnesota Supercomputing Institute

Deploying Production Galaxy Instances with CloudBioLinux and CloudMan

Private Cloud Computing is Coming
Businesses large and small are flocking to Amazon et al. because they are cheap.

Reasons research institutions might not immediately switch to Amazon
  Storage Costs
  High Utilization
  Data Access Policies

Enter OpenStack
Deploy open source cloud infrastructure on your own hardware.

Galaxy, OpenStack
  Python
  Open Source
  Vibrant Community

[Diagram: Admin, Developer, DB Server, File Server, App Server, Repository]
Galaxy is not the code in the repository; it is the whole stack on the application server.

[Diagram: Cloud Infrastructure (OpenStack), DB Server, File Server, Application VMs (Web and Compute), Repository]

Common Scenario
Two people or teams need to be intimately familiar with Galaxy and must frequently communicate.
Opportunity to reduce workload by building Galaxy using a common community template.

CloudBioLinux, CloudMan
"A fully automated infrastructure installs software and data, with packages specified in simple configuration files."
https://github.com/chapmanb/cloudbiolinux
Do not waste effort manually installing software; automate it.
"CloudMan is a cloud manager that orchestrates all of the steps required to provision a complete compute cluster environment on a cloud infrastructure; subsequently, it allows one to manage the cluster, all through a web browser."

However...
Saving money, however, is not the only reason to employ cloud computing. As I will argue for the specific case of Galaxy, cloud computing can also help manage complexity.

Why? What? How?
  What is OpenStack and private cloud computing?
  Why deploy Galaxy in a (private) cloud?
  How to build production Galaxy instances for the cloud.

User-Data
A block of YAML text used to configure a VM at launch time.
http://wiki.galaxyproject.org/CloudMan/UserData
CloudMan uses it to configure the virtual machine: Galaxy, nginx, NFS, arbitrary other files.

Splitting Galaxy into Multiple Processes
user-data
  configure_multiple_galaxy_processes: True
  web_thread_count: 2
  handler_thread_count: 2
  galaxy_conf_dir: /mnt/galaxyTools/galaxy-central/conf.d

galaxy_conf_dir
https://bitbucket.org/galaxy/galaxy-central/pull-request/44/
Very useful in non-cloud contexts as well. Allows universe_wsgi.ini to be split into a directory of files (à la /etc/sudoers.d or /etc/apache/conf.d).

Benefits
  Allows some properties to be set in the repository and others in the runtime environment.
  Easier for configuration management tools such as Puppet or Chef to work with.
  Separates development/production properties and/or developer/admin properties.

External Authentication
User-Data
  galaxy_conf_dir: /mnt/galaxyTools/galaxy-central/conf.d
  galaxy_universe_use_remote_user: True
  galaxy_universe_remote_user_maildomain: <domain_name>
  galaxy_universe_remote_user_logout_href: https://logout@<galaxy_url>/
  galaxy_universe_require_login: True

Galaxy Reports Application
A powerful tool that provides a wealth of valuable data on every job that Galaxy has run, as well as disk usage accounting, etc.
Implemented a CloudMan "service" for this...
user-data
  services:
   - name: Galaxy
   - name: GalaxyReports
   - name: Postgres

SSL
user-data
  conf_files:
   - path: /usr/nginx/conf/key
     content: <base64 encoding of key>
   - path: /usr/nginx/conf/cert
     content: <base64 encoding of cert>

Configure arbitrary config files on VM
  server {
      listen 80;
      server_name galaxyp.msi.umn.edu;
      rewrite ^ https://$server_name$request_uri? permanent;
  }
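
The conf_files entries in the SSL user-data carry each file as a path plus a base64 encoding of its content. As a minimal sketch of how such a fragment could be generated, the Python below encodes placeholder bytes and prints them in the same list shape. The helper names (conf_file_entry, render_conf_files) and the sample bytes are hypothetical, and the choice of standard, unwrapped base64 is an assumption about what CloudMan expects rather than something the slides state.

```python
import base64


def conf_file_entry(path, content):
    """Return one conf_files entry: a target path plus base64-encoded content."""
    # Assumption: standard base64 alphabet on a single line, no wrapping.
    encoded = base64.b64encode(content).decode("ascii")
    return {"path": path, "content": encoded}


def render_conf_files(entries):
    """Render entries in the YAML list shape shown in the user-data slides."""
    lines = ["conf_files:"]
    for entry in entries:
        lines.append(" - path: " + entry["path"])
        lines.append("   content: " + entry["content"])
    return "\n".join(lines)


# Placeholder bytes standing in for a real key and certificate.
entries = [
    conf_file_entry("/usr/nginx/conf/key", b"example key bytes"),
    conf_file_entry("/usr/nginx/conf/cert", b"example cert bytes"),
]
user_data_fragment = render_conf_files(entries)
print(user_data_fragment)
```

In practice the bytes would be read from the real key and certificate files, and the printed fragment would be merged into the rest of the user-data supplied at launch time.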