Research Briefs

Statistics software for the geosciences

Using R, scientists can build complex graphics from statistical analyses. One example is a graph showing an ensemble reconstruction of Northern Hemisphere annual temperatures and the maximum decadal average, based on annual climate proxies and an energy balance model.

NCAR researchers are increasingly adopting an innovative tool for statistical computing and graphics. Called R, this community software project is the statistics equivalent of the Linux movement.

R is a dialect of S, a statistical language created at Bell Laboratories in the 1970s. R itself was developed at the University of Auckland, New Zealand, in the early 1990s and has since become widely used by statisticians for statistical software development and data analysis. Its source code is freely available, and the software runs on a wide variety of UNIX platforms and similar systems, as well as on Windows and macOS.

R's sophisticated suite of tools for data manipulation, calculation, and graphical display was developed by the statistics community. "Currently hundreds of statistical researchers contribute to the R archive of supplemental packages, giving geoscientists access to high-powered, high-level statistical methods," says NCAR scientist Steve Sain.

IMAGe scientists have been involved with R since its early development and have created packages for spatial statistics, extreme value analysis, radiosonde data, and sparse matrix algebra. More recently, researchers in RAL and other divisions have begun using the software. "One of our goals is for scientists and visitors coming to NCAR to be aware of R and be comfortable with this statistics package as another one of their computation tools," says Doug Nychka, director of IMAGe.

One of R's strengths is that it is free and runs on nearly any platform, Nychka points out. "In the world of reproducible science, we'd like to suggest that statistical analysis be done in R," he says. "If someone wants to reproduce the results, they can simply install R and use the source code to go from raw data to finished, published tables and figures."