XSEDE16 has ended

Workforce Development and Diversity
Tuesday, July 19

10:30am EDT

WDD: Challenges and Accomplishments of the Computational Science Undergraduate Research Experiences (CSURE) REU Program
The Computational Science for Undergraduate Research Experiences (CSURE) is an NSF-funded Research Experiences for Undergraduates (REU) program organized by the Joint Institute for Computational Sciences (JICS), www.jics.utk.edu/csure-reu. The main goal of the CSURE project is to direct a group of ten undergraduate students to explore emerging computational science models and techniques proven to work on the supercomputers at the National Institute for Computational Sciences (NICS). In addition, a number of summer interns from Hong Kong have participated in the program. The CSURE program focuses on five scientific domains: chemistry and material sciences, systems biology, engineering mechanics, atmospheric sciences, and parallel solvers on emergent platforms. The program also benefits from a joint relationship with researchers at the Oak Ridge National Laboratory. Because the research topics and participant backgrounds are so diverse, this paper details the challenges and resolutions in managing and coordinating the program, delivering cohesive tutorial materials, and directing mentorship of individual projects, along with a few lessons learned since the program started in 2013.

Tuesday July 19, 2016 10:30am - 11:00am EDT

11:00am EDT

WDD: Assisting Bioinformatics Programs at Minority Institutions: Needs Assessment, and Lessons Learned - A Look at an Internship Program
PURPOSE: We present work in assisting Bioinformatics efforts at minority institutions in the USA funded through an NIH grant over the last 15 years. The primary aim was to create a program for assisting minority institutions in building multidisciplinary bioinformatics training programs. DESIGN: The program involves four components for immediate and long-term increases in research opportunities at minority institutions. Component 1: A two-week Summer Institute in Bioinformatics introducing the breadth of bioinformatics while discussing open research problems. Component 2: Strengthening or establishing bioinformatics programs at minority serving campuses by teaching bioinformatics in collaboration with local faculty. Component 3: A five to eight-week research internship at the PSC for students that completed bioinformatics courses on their campuses. Component 4: Development of a model curriculum for a concentration in bioinformatics in biology, computer science, or mathematics. In this paper we will report on the results of the internship program. 
METHODS: In compliance with federal regulations (45 CFR 46) concerning Human Subjects Research, the survey materials and procedures used and discussed in this paper were approved by Carnegie Mellon University’s Institutional Review Board (IRB No. HS-13-099 on 3/15/13, IRB No. HS-14-141 on 3/18/14, and IRB No. HS-15-178 on 3/12/15). Under these approved procedures, we began to conduct voluntary pre- and post-surveys of interns and summer institute participants. These surveys were completed at the very beginning of their summer experience at the PSC (pre-survey) and again at the end of their summer experience at the PSC (post-survey). This paper reports on a subset of the questions asked in the pre- and post-surveys, mainly on the demographics and skill sets of the participants that are most relevant to computation and high performance computing.
DEMOGRAPHICS: Twenty-one minority institutions have benefited from the grant. Of the thirty-six student surveys completed, 36% were completed by undergraduate students, 36% by master’s students and 25% by doctoral students. 96% of total participants identified that they were attending a minority serving institution (MSI), with 47% indicating that they were attending a Hispanic serving institution, 44% indicating that they were attending a historically black college/university, and 5% indicating that they were attending an “other” minority serving institution. 82% of participants self-identified as belonging to racial and ethnic groups that have been shown by the National Science Foundation to be underrepresented in health-related sciences on a national basis, which include African Americans, Hispanic Americans, Native Americans, Alaskan Natives, Hawaiian Natives, and Natives of the U.S. Pacific Islands.
NEEDS ASSESSMENT: The MARC pre-survey also included questions asking participants to identify their prior bioinformatics knowledge. Few student participants self-identified as having intermediate bioinformatics knowledge. The majority of participants in the pre-surveys identified themselves as being able to run bioinformatics programs but being uncomfortable changing the program parameters. One-third had not done basic bioinformatics analysis (such as database search and multiple alignment). About one-half had not done more advanced bioinformatics analysis, and more than one-half had not worked with structural data. About three-quarters or more of the participants had not been exposed to common NGS analyses. The share of participants who reported Basic or Advanced skills with programming, databases, or the UNIX operating system was 30% or less. When asked before the workshop, through an open-text unstructured question, to list the basic steps and tools needed for these analyses, participants typically answered “I do not know”. In the post-survey, the majority of the participants expressed that they could run basic bioinformatics analyses at an intermediate or advanced level. For NGS tasks done during the training, less than one-third expressed that they could perform these analyses at an advanced level. The participants reported improvements in their programming, database, or UNIX skills, but 30-50% still indicated low-level skills in these areas.
LESSONS LEARNED: A strong team effort at the teaching level is needed to help improve the skill set of interns. Follow-up is key to helping students maintain or improve the skills gained and carry out research successfully. It is very difficult to add courses or degrees in many state Minority Serving Institutions. Variability in traditional Biology curricula makes adapting the courses and modules required for broader improvement of computational skills a challenge. Math requirements at the bachelor’s degree level and “introductory computing” courses can be substantial barriers to success for biology students. Finally, better bioinformatics and computational textbooks for biologists at the undergraduate level are needed.
CONCLUSIONS: This program has been a highly successful outreach effort and a very sound and cost-effective use of the MARC funding program from NIH. Important lessons have been learned about bioinformatics education that should be implemented at the policy level in order to ensure that educators, students and researchers at minority serving institutions can address science problems using state-of-the-art computational methods, computational genomics and Big Data. 

GRANT SUPPORT: This work was supported by National Institutes of Health Minority Access to Research Careers (MARC) grant T36-GM-095335 to the Pittsburgh Supercomputing Center. It also used the BioU computing cluster, which was made available by National Institutes of Health grant T36-GM-008789 to the Pittsburgh Supercomputing Center. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the National Science Foundation grant OCI-1053575. Specifically, it used the Blacklight supercomputer system at the Pittsburgh Supercomputing Center (PSC).

Tuesday July 19, 2016 11:00am - 11:30am EDT

11:30am EDT

WDD: An Agile Approach for Engaging Students in Research and Development
Opportunities to solve real-world problems, collaboratively or individually, can have a significant impact on the education and career goals of students. Research groups and businesses that provide such opportunities can also benefit significantly. However, several factors and strategies play a role in developing a mutually beneficial synergy between the students and research groups (or businesses). Adopting a disciplined agile approach for student engagement and retention is one such strategy. In this paper we discuss our experiences in engaging students in research and development through this approach. Some lessons learnt and recommendations are also included in the paper.

Tuesday July 19, 2016 11:30am - 12:00pm EDT

3:30pm EDT

WDD: XSEDE Scholars Tutorial: The Things I Wish I Had Known When I Started in HPC
High Performance Computing (HPC) clusters have been essential in the simulation of physical systems due to their large computing power and storage capabilities. However, new users, most of whom are in non-computing fields, face several challenges when working with these supercomputers because the information found online can be overwhelming and no practical guides are readily available. The panel will cover challenges and answers on several topics, ranging from getting access to a supercomputer, logging in to the system, and submitting jobs to the future needs of non-traditional users.

Tuesday July 19, 2016 3:30pm - 4:15pm EDT

4:15pm EDT

WDD: Orion: Discovery Environment for HPC Research and Bridging XSEDE Resources
We present a case study on how Georgia State University (GSU) grew its active High Performance Computing (HPC) research community by 80% in 2015 over the previous year, and how GSU is projected to double its active HPC research community in 2016 over 2015. In October 2015, GSU launched an institutional HPC resource, Orion, which provides batch and interactive compute environments. Currently, Orion supports both the traditional and non-traditional research communities on our campus as well as our affiliates from Qatar University, University of Toronto, and Georgia Tech. At GSU, Research Solutions’ HPC facilitators are responsible for facilitating HPC research, which takes the form of providing technical support in developing pipelines and automating the job submission process for the various applications researchers need for their research. This approach has resulted in nearly 80% growth in our active HPC users from 2014 to 2015, and we are currently on track to double our active HPC user community in 2016. XSEDE remains a backbone of our ambitious goals, as we rely on XSEDE to provide the necessary resources for select users whose research quickly exceeds our local infrastructure.

Tuesday July 19, 2016 4:15pm - 5:00pm EDT
Wednesday, July 20

8:30am EDT

WDD: The Advanced Cyberinfrastructure Research and Education Facilitators Virtual Residency: Toward a National Cyberinfrastructure Workforce
An Advanced Cyberinfrastructure Research and Education Facilitator (ACI-REF) works directly with researchers to advance the computing- and data-intensive aspects of their research, helping them to make effective use of Cyberinfrastructure (CI). The University of Oklahoma (OU) is leading a national "virtual residency" program to prepare ACI-REFs to provide CI facilitation to the diverse populations of Science, Technology, Engineering and Mathematics (STEM) researchers that they serve. Until recently, CI facilitators have had no education or training program; the Virtual Residency program addresses this national need by providing: (1) training, specifically (a) summer workshops and (b) third party training opportunity alerts; (2) a community of facilitators, enabled by (c) a biweekly conference call and (d) a mailing list.

Wednesday July 20, 2016 8:30am - 9:00am EDT

9:00am EDT

WDD: Access and Inclusion in XSEDE Training
Computing in science and engineering is now ubiquitous: digital technologies underpin, accelerate, and enable new, even transformational, research in all domains. Access to an array of integrated and well-supported high-end digital services is critical for the advancement of knowledge. Driven by community needs, XSEDE (the Extreme Science and Engineering Discovery Environment) substantially enhances the productivity of a growing community of scholars, researchers, and engineers through access to advanced digital services that support open research. 
An XSEDE strategic goal is to extend use of high-end digital services to new communities by preparing the current and next generations of scholars, researchers, and engineers in the use of advanced digital technologies via training, education, and outreach. The mission of XSEDE’s Under-Represented Community Engagement (URCE) program is to raise awareness of the value of advanced digital research services and recruit users from new communities. In collaboration with XSEDE training and education programs, the URCE program works with faculty and students who are non-traditional users of XSEDE resources and helps them use XSEDE's advanced digital research services and ecosystem.
The focus of this work is individual researchers, research teams, faculty, staff, and students who have limited or no exposure to advanced digital services. These are first-time users. The institutions that the URCE program works with are small, minority, and resource-limited; and the individuals are under-represented minorities and women. In order for first-time users to be successful, they need training, practice, user support, extended collaborative support, and software tools and environments, including gateways, that allow them to rapidly join the community and become productive.
Over the past four years, the URCE program has organized and facilitated training across the country at a variety of institutions, ranging from small private Historically Black Colleges and Universities (HBCUs) such as Philander Smith to the University of Texas at El Paso, which is a large public Hispanic Serving Institution, and in collaboration with research institutions that have significant diversity initiatives on their campuses. Every URCE training workshop has included extensive post-workshop evaluation, and the participants' progress in engaging with XSEDE services is tracked so we can identify deepening engagement and persistence.
This type of success has been achieved because training has evolved due to our reflection on the post workshop feedback and data. The practices that have been incorporated include providing the motivation for using these types of services, promoting the simplest access through gateways, careful tailoring of the content to the audience, and developing persistence after the event.

Lorna Rivera

Research Faculty, Georgia Institute of Technology

Wednesday July 20, 2016 9:00am - 9:20am EDT

9:20am EDT

WDD: Rescuing Lost History: Using Big Data to Recover Black Women’s Lived Experiences
This study employs latent Dirichlet allocation (LDA) algorithms and comparative text mining to search 800,000 periodicals in JSTOR (Journal Storage) and HathiTrust from 1746 to 2014 to identify the types of conversations that emerge about Black women's shared experience over time and the resulting knowledge that developed. We used MALLET to interrogate various genres of text (poetry, science, psychology, sociology, African American Studies, policy, etc.). We also used comparative text mining (CTM) to explore latent themes across collections written in different time periods by analyzing the common and expert models. We used data visualization techniques, such as tree maps, to identify spikes in certain topics during various historical contexts such as slavery, Reconstruction, Jim Crow, etc. We identified a subset of our corpus (20,000) comprised of known Black or Black women authors and compared patterns of words in the subset against the larger 800,000 corpus. Preliminary findings indicate that when we pulled 300,000 volumes, about 80,000 (~25%) did not have subject metadata. This appears to suggest that a researcher who searched for volumes about Black women may not have access to a significant amount of data on the topic. When volumes are not tagged properly, researchers would have to know that they exist when they do their searches. The recovery nature of this project involves identifying these untagged volumes and making the corpus publicly available to librarians and others with copyright considerations.
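The recovery step described in this abstract, identifying volumes that lack subject metadata, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record layout and the field names `volume_id` and `subjects` are hypothetical, chosen only for illustration, and the sample records are made up.

```python
from typing import Iterable, List, Mapping


def find_untagged(volumes: Iterable[Mapping[str, object]]) -> List[str]:
    """Return the ids of volumes whose subject metadata is missing or empty."""
    untagged = []
    for volume in volumes:
        subjects = volume.get("subjects")  # hypothetical field name
        # A missing key, None, an empty list, or only blank strings
        # all count as "no subject metadata".
        if not subjects or all(not str(s).strip() for s in subjects):
            untagged.append(str(volume["volume_id"]))  # hypothetical field name
    return untagged


def untagged_share(volumes: Iterable[Mapping[str, object]]) -> float:
    """Fraction of volumes without subject metadata (0.0 for an empty input)."""
    records = list(volumes)
    if not records:
        return 0.0
    return len(find_untagged(records)) / len(records)


# Made-up records for illustration only.
sample = [
    {"volume_id": "v1", "subjects": ["African American women"]},
    {"volume_id": "v2", "subjects": []},
    {"volume_id": "v3"},
    {"volume_id": "v4", "subjects": ["  "]},
]

print(find_untagged(sample))
print(untagged_share(sample))
```

Treating blank strings as missing is a design choice of this sketch; a real catalogue may encode absent subjects differently, so the check would need to match the actual metadata schema.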

Wednesday July 20, 2016 9:20am - 9:40am EDT

9:40am EDT

WDD: Advanced Research Network Infrastructure Enables Research Opportunities in the DC Area

Abstract: This paper reports on the development of the Capital Area Advanced Research and Education Network (CAAREN), which provides an advanced research and education network infrastructure for the D.C. metro region. Working in partnership with the D.C. Office of the Chief Technology Officer’s program, DC-Net, the George Washington University Division of Information Technology (IT) developed CAAREN out of a need for advanced research infrastructure for the D.C. area, and a need for collaboration among D.C. area universities. CAAREN’s objective is to provide an advanced research network infrastructure, as well as outreach and services for K-12 schools, museums, libraries and similar organizations within the D.C. metro region. A number of initiatives have been completed or are underway to help advance research and education for current and future generations to advance scientific discovery through the use of this cyberinfrastructure.

Wednesday July 20, 2016 9:40am - 10:00am EDT

10:30am EDT

WDD: Preparing Tomorrow's Cyberinfrastructure Leaders Today
This panel session will gather community recommendations on strategies for preparing the workforce to advance the capabilities, capacities and utilization of cyberinfrastructure with an emphasis on Computational and Data-enabled Science and Engineering (CDS&E), Data Science and related areas. The panel members will pose challenges and controversial ideas to stimulate audience participation to identify methods and approaches the community should pursue in support of advancing research, discovery, and scholarly studies through the usage of computational and data-enabled tools, resources, and methods. The XSEDE16 audience will be challenged to contribute their suggestions both during and after the session via social media and discussion forums. The audience suggestions will be incorporated into a report that will be publicly disseminated by the middle of 2017.


There are many reports that have documented the need to prepare a diverse workforce able to advance research, discovery, scholarly studies and economic competitiveness via CI. The White House issued an Executive Order establishing the National Strategic Computing Initiative (NSCI) to coordinate research, development, and deployment strategies, to draw on the strengths of departments and agencies to move the US federal government into a position that sharpens, develops, and streamlines a wide range of new 21st century applications. The NSCI goals include (a) advancing core technologies to solve difficult computational problems and (b) fostering increased use of the new capabilities in the public and private sectors. Workforce development is a key element of this initiative.


XSEDE and the NSF-funded Blue Waters project, along with other projects and organizations, are working to advance workforce development. They place an emphasis on:

* increasing formal and informal education and training opportunities in Cyberinfrastructure (CI), Computational and Data-enabled Science and Engineering (CDS&E), Data Science and related capabilities;

* expanding the workforce in these areas by fostering the identification, recruitment and cultivation of CI, CDS&E and Data Science practitioners, and of faculty who educate these practitioners;

* developing approaches to increase the diversity of the CI, CDS&E and Data Science workforce.


The goal of the panel discussions is to identify recommendations that will help advance discovery and scholarly studies via the applications of computational and data-enabled tools, resources, and methods.


This panel is intended to spark dialogue among the XSEDE16 participants on strategies that have potential to enhance the preparation of the workforce. The panel members will have 10 minutes each to present diverse viewpoints from a variety of community perspectives.

Following the panel presentations, a series of challenging questions will be posed to the audience to promote discussions to identify possible solutions. The moderator will foster discussions among the audience and will be proactive in ensuring that the audience is as engaged as the panel in the discussions. The moderator will guide the group towards making recommendations for enhancing workforce development.


The audience will be encouraged to provide oral comments and to provide written comments using polling software that everyone will be able to observe in real time, thus giving all attendees a voice in the discussion and a chance to explore topics not previously raised by the panel. The audience members will also be encouraged to submit written suggestions after the conference.


The panel organizers will provide a forum for open discussion and contributions of ideas during and after the Conference. The organizers will post drafts of recommendations throughout the development process, and the final set of recommendations will be posted by the middle of 2017.

Emcee - Henry Neeman, University of Oklahoma

Panel Members:

* Linda Akli, Southeastern Universities Research Association (SURA) will represent the needs of the Diversity Forum

* John Towns, NCSA, University of Illinois will provide a CIO’s perspective

* John Mosher, Oklahoma Innovation Institute will provide a Champion’s perspective

* Thomas Hauser, University of Colorado, Boulder

* Nathan Weeks, USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University

Wednesday July 20, 2016 10:30am - 12:00pm EDT