XSEDE16 has ended
Monday, July 18 • 1:00pm - 5:00pm
Tutorial: Introduction to Scientific Workflow Technologies on XSEDE


This is a proposal for a joint tutorial between the ECSS workflows team and the teams behind the following workflow technologies: Swift, Makeflow/Work Queue, RADICAL-Pilot, and Pegasus. The goal is for attendees to leave the tutorial with an understanding of the workflow-related services and tools available, to learn how to use them on XSEDE through hands-on exercises, and to be able to apply this knowledge to their own workloads on XSEDE and other computing resources. The tutorial will be based on the successful XSEDE15 tutorial (http://sched.co/3YdC). All tutorial material is available from https://sites.google.com/site/xsedeworkflows/. The tutorial format will be primarily hands-on and interactive.

One major obstacle when running workflows on XSEDE is where to run the workflow engine. Larger projects and groups might have their own submit hosts, but users commonly struggle to find a home for their workflow runs. For this reason, the ECSS workflows team has set up a submit host hosted on IU Quarry, based on feedback from the XSEDE14 workflow birds-of-a-feather session. The host is a clone of the login.xsede.org single sign-on host. Thus, just like login.xsede.org, any XSEDE user with an active allocation will automatically have access. For the tutorial, we will also provide Jetstream-packaged VMs of the workflow software, as appropriate. Alongside the host, we are also assembling content for a website highlighting tested workflow systems with XSEDE-specific examples that users can use to try out the different tools. These examples will serve as the basis for the proposed tutorial’s hands-on exercises.

Swift is a simple language for writing parallel scripts that run many copies of ordinary programs concurrently as soon as their inputs are available, reducing the need for complex parallel programming. The same script runs on multi-core computers, clusters, clouds, grids and supercomputers, and is thus a useful tool for moving computations from a laptop or workstation to any XSEDE resource. Swift can run a million programs, thousands at a time, launching hundreds per second. This hands-on tutorial will give participants a taste of running simple parallel scripts on XSEDE systems and provide pointers for applying Swift to their own scientific work.
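The "run as soon as inputs are available" idea can be illustrated with a minimal sketch. It is written in Python with the standard library, not in the Swift scripting language, and the function names (`simulate`, `summarize`, `run_sweep`) are hypothetical stand-ins for ordinary programs. It only shows the dataflow pattern: independent runs proceed concurrently, and the dependent step waits for all of their outputs.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(seed: int) -> int:
    """Hypothetical stand-in for an ordinary program run once per input."""
    return seed * seed


def summarize(values: list[int]) -> int:
    """Hypothetical stand-in for a step that needs every simulation output."""
    return sum(values)


def run_sweep(seeds: list[int], max_workers: int = 4) -> int:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each simulate() call is independent, so all of them may run concurrently.
        futures = [pool.submit(simulate, seed) for seed in seeds]
        # summarize() can only start once every one of its inputs is available.
        return summarize([future.result() for future in futures])


if __name__ == "__main__":
    print(run_sweep(list(range(10))))
```

In a real workflow system the stand-in functions would be external programs exchanging files, and the thread pool would be replaced by cluster or cloud resources.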

Makeflow is a workflow engine for executing large, complex workflows of up to thousands of tasks and hundreds of gigabytes. In this section of the tutorial, users will learn the basics of writing a Makeflow, which is based on the traditional Make construct. In the hands-on example, users will learn to write Makeflow rules, run a makeflow locally, and run its tasks on XSEDE resources. Users will be introduced to Work Queue, a scalable master/worker framework, and will create workers on XSEDE resources and connect them to the makeflow. Users will also learn to use Work Queue to monitor workflows, along with the basics of debugging makeflows.
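To make the Make-style rule idea concrete, here is a small sketch, assuming a rule is simply a set of target files, a set of source files, and a command. It uses only the Python standard library and is not Makeflow's own syntax or implementation. The `Rule` class, `execution_order` function, and the file and command names are hypothetical. It derives a valid execution order from the file dependencies, which is the core of what a Make-like engine must do before dispatching tasks.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass(frozen=True)
class Rule:
    targets: tuple[str, ...]  # files the command produces
    sources: tuple[str, ...]  # files the command needs
    command: str              # assumed unique per rule in this sketch


def execution_order(rules: list[Rule]) -> list[str]:
    """Return the commands in an order that respects file dependencies."""
    # Map each produced file to the rule that creates it.
    producer = {target: rule for rule in rules for target in rule.targets}
    # A rule depends on whichever rules produce its sources; sources with
    # no producer are treated as pre-existing input files.
    graph = {
        rule.command: {producer[s].command for s in rule.sources if s in producer}
        for rule in rules
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    rules = [
        Rule(("final.txt",), ("a.out", "b.out"), "merge a.out b.out"),
        Rule(("a.out",), ("a.in",), "sim a.in"),
        Rule(("b.out",), ("b.in",), "sim b.in"),
    ]
    for command in execution_order(rules):
        print(command)
```

The two `sim` commands share no files, so an engine is free to run them at the same time, for example on separate workers; only `merge` has to wait.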

The Pegasus Workflow Management System sits on top of HTCondor DAGMan. In this section of the tutorial, users will learn how to create abstract workflows, and how to plan, execute, and monitor the resulting executable workflow. The first workflow will be run locally on the submit host, while the other two hands-on examples will run workflows on XSEDE resources. One workflow will run jobs across resources and highlight the workflow system’s data management capability in such setups. The other will use the pegasus-mpi-cluster tool to execute a high-throughput workload in an efficient and well-behaved manner on one of the XSEDE high performance computing resources.
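The step from an abstract workflow to an executable one can be sketched as follows. This is a simplified illustration in plain Python, not the Pegasus API or its planner. The `Job` class, `plan` function, and the site and file names are hypothetical. It assumes jobs are listed in dependency order and that each job has already been assigned to a site, and it shows one thing a planner adds: explicit transfer steps when a file produced on one resource is consumed on another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    inputs: tuple[str, ...]   # logical file names, no site-specific paths
    outputs: tuple[str, ...]


def plan(jobs: list[Job], site_of: dict[str, str]) -> list[str]:
    """Turn site-agnostic jobs into concrete steps, adding data transfers.

    Assumes `jobs` is already in dependency order.
    """
    location: dict[str, str] = {}  # logical file name -> site holding it
    steps: list[str] = []
    for job in jobs:
        site = site_of[job.name]
        for name in job.inputs:
            source = location.get(name)
            # Files with no known location are treated as already staged.
            if source is not None and source != site:
                steps.append(f"transfer {name}: {source} -> {site}")
                location[name] = site
        steps.append(f"run {job.name} @ {site}")
        for name in job.outputs:
            location[name] = site
    return steps


if __name__ == "__main__":
    jobs = [
        Job("preprocess", ("raw.dat",), ("clean.dat",)),
        Job("analyze", ("clean.dat",), ("result.dat",)),
    ]
    for step in plan(jobs, {"preprocess": "siteA", "analyze": "siteB"}):
        print(step)
```

Keeping the abstract description free of sites and paths is what lets the same workflow be planned for a local run on the submit host or for a run spread across several resources.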

RADICAL-Pilot allows a user to run large numbers of tasks concurrently on a multitude of distributed computing resources. A task can be a large parallel simulation or a single-core analysis routine. RADICAL-Pilot is a “programmable” Pilot-Job system developed in Python. After discussing the concept of “Pilot Jobs”, we introduce how to use RADICAL-Pilot to support task-level parallelism. We then demonstrate how to write simple Python applications that use RADICAL-Pilot to execute coupled tasks on distributed computing resources. Additionally, the user can specify input and output data for the tasks, which the system handles transparently.
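The pilot-job concept can be sketched without any real API: a pilot is a single placeholder allocation of cores, acquired once, into which many tasks of different sizes are then placed. The code below is a toy model in plain Python and does not use RADICAL-Pilot itself. The `Task` class and `schedule_waves` function are hypothetical, and packing tasks into fixed "waves" is a deliberate simplification of how a real pilot system places tasks as cores free up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cores: int


def schedule_waves(pilot_cores: int, tasks: list[Task]) -> list[list[str]]:
    """Greedily pack tasks into successive waves that fit inside the pilot."""
    for task in tasks:
        if task.cores > pilot_cores:
            raise ValueError(f"{task.name} needs more cores than the pilot holds")
    waves: list[list[str]] = []
    pending = list(tasks)
    while pending:
        free = pilot_cores
        wave: list[str] = []
        deferred: list[Task] = []
        for task in pending:
            if task.cores <= free:
                # The task fits in the cores still free in this wave.
                wave.append(task.name)
                free -= task.cores
            else:
                deferred.append(task)
        waves.append(wave)
        pending = deferred
    return waves


if __name__ == "__main__":
    tasks = [
        Task("simulation", 8),   # a parallel task filling the whole pilot
        Task("analysis-1", 1),   # single-core tasks
        Task("analysis-2", 1),
        Task("analysis-3", 1),
        Task("simulation-small", 4),
    ]
    for number, wave in enumerate(schedule_waves(8, tasks), start=1):
        print(number, wave)
```

Because the pilot is requested from the batch system only once, the individual tasks avoid waiting in the queue separately, which is what makes running large numbers of small tasks practical.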

Attendee prerequisites: Participants are expected to bring their own laptops with the following software installed: SSH client, web browser, PDF reader. We assume basic familiarity with working in a Linux environment.

Special Needs: Even though the submit host enables users with existing allocations to use their own accounts, we would like to have access to a set of XSEDE training accounts for users who currently do not have active allocations.

Monday July 18, 2016 1:00pm - 5:00pm EDT