XSEDE16: Full Schedule

8:00am EDT

Tutorial: The many faces of data management, interaction, and analysis using Wrangler.

The goal of this tutorial is to provide guidance to participants on large-scale data services and analysis support with the newest XSEDE data research system, Wrangler. Because Wrangler is largely the first XSEDE resource of its kind, both users and XSEDE staff need training to take advantage of the novel research opportunities it presents. The tutorial consists of two major components. The morning session focuses on helping users become familiar with the unique architecture and characteristics of the Wrangler system and with the data services Wrangler supports, including large-scale file-based data management, database services, and data sharing services. The morning presentations include an introduction to the Wrangler system and its user environment, the use of reservations for computing, data systems for structured and unstructured data, and data access layers using both Wrangler's replicated long-term storage system and its high-speed flash storage system. We will also introduce the Wrangler graphical interfaces, including the Wrangler Portal, web-based tools served by Wrangler such as Jupyter notebooks and RStudio, and the iDrop web interface for iRODS. The afternoon session will focus on data-driven analysis support on Wrangler. The presentations are centered on the use of the dynamically provisioned Hadoop ecosystem on Wrangler. They include an introduction to the core Hadoop cluster for big data analysis, using existing analysis routines through Hadoop Streaming, interactive analysis with Spark, and using Hadoop/Spark through the Python and R interfaces, which are often more familiar to researchers.

Speakers

Niall Gaffney

Amit Gupta

Ruizhu Huang

Christopher Jordan

David Walling

Weijia Xu
