From HPCWire and Intel:

“Learning how to write parallel and vector HPC programs will take a lot longer than five minutes. But with a series of five-minute videos introduced this week, Intel’s director and parallel programming evangelist James Reinders gives prospective parallel and vector programmers an introduction to the tools and techniques they’ll use to write code for the chip giant’s latest processors and coprocessors.

The new series, called The Five Minute Guide to Parallel and Vector Software Programming, will cover a different aspect of HPC programming every week for 12 weeks. New episodes will come out every Wednesday through the middle of August.”

The current line-up:

  • “Vectorization using Intel Cilk Plus Array Notation in C++/C” is already out;
  • “Data alignment for effective vectorization in Fortran and C++/C” on June 26;
  • “Faster math performance with Intel Math Kernel Library” on July 3;
  • “Automatic offload with Intel Math Kernel Library” on July 10;
  • “Threading with OpenMP” on July 17;
  • “Simplified threading with Intel Cilk Plus” on July 24;
  • “Threading with Intel Threading Building Blocks (when Intel Cilk Plus isn’t enough)” on July 31;
  • “Performance analysis with Intel VTune Amplifier XE” on August 7;
  • “Distributed Computing with Intel MPI Library” on August 14;
  • and “Balancing MPI Applications” on August 21.

The videos are being hosted at the HPCWire website.

Original announcement found at HPCWire.

Starting May 1, Coursera will be offering a High-Performance Scientific Computing class from Randall J. LeVeque of the University of Washington.

From the course description:

Programming-oriented course on effectively using modern computers to solve scientific computing problems arising in the physical/engineering sciences and other fields. Provides an introduction to efficient serial and parallel computing using Fortran 90, OpenMP, MPI, and Python, and software development tools such as version control, Makefiles, and debugging.

About the Course: Computation and simulation are increasingly important in all aspects of science and engineering. At the same time, writing efficient computer programs to take full advantage of current computers is becoming increasingly difficult. Even laptops now have 4 or more processors, but using them all to solve a single problem faster often requires rethinking the algorithm to introduce parallelism, and then programming in a language that can express this parallelism. Writing efficient programs also requires some knowledge of machine arithmetic, computer architecture, and memory hierarchies.


Filed Under (Grants, Programming, Training) by John Pormann, Ph.D. on 07-03-2013

Sorry for the late notice … just got this from our friends at Shodor Foundation:

Training available for students in U.S., Europe, and Japan at International Summer School on HPC Challenges in Computational Sciences


Graduate students and postdoctoral scholars in the United States, Europe, and Japan are invited to apply for the fourth International Summer School on HPC Challenges in Computational Sciences, to be held June 23-28, 2013, at New York University in New York City. The summer school is sponsored by the U.S. National Science Foundation’s Extreme Science and Engineering Discovery Environment (XSEDE) project, the European Union Seventh Framework Program’s Partnership for Advanced Computing in Europe (PRACE), and the RIKEN Advanced Institute for Computational Science (RIKEN AICS).


Leading American, European and Japanese computational scientists and high-performance computing technologists will offer instruction on a variety of topics, including:

  • Access to EU, U.S., and Japanese cyberinfrastructures
  • HPC challenges by discipline (e.g., bioinformatics, computer science, chemistry, and physics)
  • HPC programming proficiencies
  • Performance analysis & profiling
  • Algorithmic approaches & numerical libraries
  • Data-intensive computing
  • Scientific visualization


The expense-paid summer school will benefit advanced scholars from European, U.S., and Japanese institutions who use HPC to conduct research.


Further information and the application for the 2013 summer school are available online; applications are due by March 18.





Hermann Lederer
RZG, Max Planck Society, Germany

Simon Wong
ICHEC, Ireland

Mitsuhisa Sato

Scott Lathrop
NCSA, University of Illinois at Urbana-Champaign, United States




About PRACE: The Partnership for Advanced Computing in Europe (PRACE) is an international non-profit association with its seat in Brussels. The PRACE Research Infrastructure provides a persistent world-class high-performance computing service for scientists and researchers from academia and industry in Europe. The Implementation Phase of PRACE receives funding from the EU’s Seventh Framework Programme (FP7/2007-2013) under grant agreements RI-261557, RI-283493, and RI-312763. For more information, see the PRACE website.


About RIKEN AICS: RIKEN is one of Japan’s largest research organizations, with institutes and centers in locations throughout Japan. The Advanced Institute for Computational Science (AICS) strives to create an international center of excellence dedicated to generating world-leading results through the use of its world-class supercomputer, the “K computer.” It serves as the core of the “innovative high-performance computer infrastructure” project promoted by the Ministry of Education, Culture, Sports, Science and Technology.

About XSEDE: The Extreme Science and Engineering Discovery Environment (XSEDE) is the most advanced, powerful, and robust collection of integrated digital resources and services in the world. It is a single virtual system that scientists can use to interactively share computing resources, data, and expertise. The five-year project is supported by the U.S. National Science Foundation. For more information, see the XSEDE website.

Filed Under (Optimization, Programming, Websites) by John Pormann, Ph.D. on 28-01-2013

There’s a great article on profiling techniques for HPC over at Admin Magazine.  The author, Jeff Layton, lays out the differences between profiling and tracing, and then provides links to a wealth of tools:  processor tracing tools, system profiling tools, system tracing tools, and even MPI profiling/tracing tools.

“Knowing your application is one of the keys to being able to improve it and, perhaps most importantly, being able to judge which architecture (or architectures) you should think about using. In essence, “knowing yourself” from an application perspective. This ability is very important in the current climate, where non-x86 processors are on the rise and where accelerators are also becoming more commonplace and diverse.”

Filed Under (Grants, Programming) by John Pormann, Ph.D. on 09-01-2013

Intel Corporation has announced that it is giving away an Ultrabook, two solid-state drives, and ten $50 gift certificates to the winners of its Intel® MKL success story contest. You can submit your story of how MKL has contributed to the success of your research, lab, or organization. All you have to do is email your 1500-3000 word success story to the contest address. The contest is open through 11:59 p.m. Pacific time on February 28, 2013.

More information, including contest rules, judging criteria, and prize details, is available on the contest website.

Filed Under (Computational Science, Programming, Training) by John Pormann, Ph.D. on 10-10-2012

The Scalable Computing Support Center is looking for a motivated undergraduate student to assist with programming and support for our 550-machine Linux cluster. This is an excellent opportunity to develop your programming skills in a user-focused environment. Initial project focus is on cluster monitoring systems, but can be expanded or adapted to fit your skills.

  • Must be experienced with one or more programming languages, e.g., Java, PHP, Perl, or Python
  • Knowledge of web and database interactions is preferred
  • GUI programming experience is a plus
  • Flexible hours — up to 10 hours per week
  • PAID opportunity!!

For more information about our center, see our website.

If interested, please contact Dr. John Pormann, john.pormann at

Filed Under (Programming, Training) by John Pormann, Ph.D. on 05-04-2012

Join MathWorks for a free MATLAB seminar on Thursday, April 26, 2012, in the Fitzpatrick Center Schiciano Auditorium Side A.

Data Analysis, Parallel & GPU Computing with MATLAB at Duke University

Register now: registration is available online.


Presenter: Bonita Vormawor, Application Engineer

9:45 – 10:00 a.m.

Registration and sign-in. Walk-ins are welcome.

10:00 a.m. – 1:00 p.m.

Data Analysis, Parallel & GPU Computing with MATLAB

Part 1: Data Analysis with MATLAB

Attend this free seminar to find out how you can use MATLAB and its add-on products to develop algorithms, visualize and analyze data, and perform numeric computation.

MathWorks engineers will provide an overview of MATLAB through live demonstrations, examples, and user testimonials, showing how you can use MATLAB and related toolboxes to:

  • Access data from many sources (files, other software, hardware, etc.)
  • Use interactive tools for iterative exploration, design, and problem solving
  • Automate and capture your work in easy-to-write scripts and programs
  • Share your results with others by automatically creating reports
  • Build and deploy GUI-based applications


MATLAB provides a flexible environment for teaching and research in a wide range of applications, including signal processing and communications, image processing, math and optimization, statistics and data analysis, control systems, hardware data acquisition, computational finance, and computational biology.


Part 2: Parallel Computing with MATLAB

In this session, you will learn how to solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters. We will introduce you to high-level programming constructs that allow you to parallelize MATLAB applications without CUDA or MPI programming and run them on multiple processors. We will also show you how to overcome the memory limits of your desktop computer and solve problems that require manipulating very large matrices by distributing your data.


Highlights include:

  • Toolboxes with built-in support for parallel computing
  • Creating parallel applications to speed up independent tasks
  • Programming with distributed arrays to work with large data sets
  • Scaling up to computer clusters, grid environments, or clouds
  • Tips on developing parallel algorithms



Filed Under (Programming, Training, Websites) by John Pormann, Ph.D. on 29-03-2012

The NSF-funded XSEDE Training, Education, and Outreach Services (TEOS) program is nearing the end of its first year of activities. On behalf of the TEOS managers, we are asking individuals to participate in a brief survey to ensure that our programs and offerings are responsive to the needs of the community.

(Duke/SCSC will be filling out this survey as well, but if you feel that your needs are unique or different from the base campus needs, please contact XSEDE directly)

We are attaching a brief overview of the XSEDE project and the TEOS goals and strategies for your information.

The online survey is located at:

There are 30 questions split among the following sections:

  • Background on yourself
  • Training
  • Education
  • Outreach
  • Campus Bridging
  • General Comments

You may skip any sections or questions that you feel are outside your range of knowledge or interest. We would like all surveys to be completed no later than April 15, 2012. This will give us time to review your comments and, if needed, make adjustments to our future plans.

Please feel free to share this survey and the attached document with others you know who would like to share their opinions and advice.

Thank you in advance for sharing your comments and suggestions.

Filed Under (Programming, Training) by John Pormann, Ph.D. on 06-01-2012

We have posted our Spring 2012 training sessions to the website:

This semester we will be offering three seminars:

We’ve posted another web-based video on the wiki/Training page:

“Intro to OpenMP” covers the basic use of OpenMP — a programming environment for parallel application development.  OpenMP is a set of compiler “directives” for C/C++ and Fortran, so rather than having to learn a whole new programming language, you can continue to code in a familiar language and just add directives (instructions/hints to the compiler) on where you think parallelism could be extracted.  A simple example:

#pragma omp parallel for
for(i=0; i<N; i++) {
    y[i] = alpha * x[i] + y[i];
}

The addition of that one “pragma” statement converts this single-CPU loop into a multi-CPU/parallel loop.  Take a look at the video and see how to add multi-CPU/multi-core parallelism to your existing applications.