For High Energy Physics, the go-to framework for big data analysis has been CERN's ROOT framework. ROOT is a massive C++ library that even predates the STL in some areas. It also includes a JIT C++ interpreter called Cling, probably the best in the business. If you have heard of the Xeus C++ Kernel for Jupyter, that is built on top of Cling. ROOT has everything a HEP physicist could want: math, plotting, histograms, tuple and tree structures, a very powerful file format for IO, machine learning, Python bindings, and more. It also does things like dictionary generation and arbitrary class serialization (other large frameworks like Qt have similar generation tools).
You may already be guessing one of the most common problems for ROOT. It is huge and difficult to install; if you build from source, that's a several-hour task on a single core. It has gotten much better in the last five years, and there are several places you can find ROOT, but there are still areas where it is challenging. This is especially true for Python: ROOT is linked to just one version of Python, and the one you get with pre-built ROOT can often be the wrong one. And, if you use the Anaconda Python distribution, which is the most popular scientific distribution of Python and massively successful for ML frameworks, the general rule even for people who build ROOT themselves has been: don't. But now, you can get a fully featured ROOT binary package for macOS or Linux, for Python 2.7, 3.6, or 3.7, from Conda-Forge, the most popular Anaconda community channel!
Intro to Conda
If you don't already have Anaconda or Conda, you can go to anaconda.com and download Anaconda, or you can install miniconda, which is just the Conda package manager without anaconda installed in the base environment.
If you manage your system, there are also yum and apt packages. There are also Docker images.
If you want to do this in an entirely automated way, for example on a new system or on a continuous integration (CI) system, the following commands will set up miniconda:
# Download the Linux installer
wget -nv http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh

# Or download the macOS installer
wget -nv https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O miniconda.sh

# Install Conda (same for macOS and Linux)
bash miniconda.sh -b -p $HOME/miniconda
source $HOME/miniconda/etc/profile.d/conda.sh # Add to bashrc - similar files available for fish and csh
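If a CI script has to run on both platforms, the choice of installer can be automated. The following is only a minimal sketch in Python: the function name `miniconda_installer_url` is hypothetical, and the two URLs are simply the ones used in the shell commands above.

```python
import platform

# Installer URLs copied from the shell commands above.
INSTALLERS = {
    "Linux": "http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh",
    "Darwin": "https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh",
}


def miniconda_installer_url(system=None):
    """Return the Miniconda installer URL for a platform name.

    `system` defaults to platform.system(), which reports "Linux" on Linux
    and "Darwin" on macOS. (Hypothetical helper, for illustration only.)
    """
    system = system or platform.system()
    try:
        return INSTALLERS[system]
    except KeyError:
        raise ValueError(f"No Miniconda installer listed for {system!r}") from None


if __name__ == "__main__":
    try:
        print(miniconda_installer_url())
    except ValueError as err:
        print(err)
```

The printed URL can then be passed to `wget` exactly as in the commands above.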
If you use Binder, it already uses Conda when your repository has an environment.yml file. See an example that uses ROOT here.
For Conda, you really should be working in an environment. Here is how you prepare a new environment with ROOT preinstalled:
conda create -n my_root_env root -c conda-forge
This will make a new environment called my_root_env, and install ROOT into it. You can specify other packages too, like a version of Python (2.7, 3.6, or 3.7), anaconda (which will install 100 or so scientific Python packages), or individual packages. ROOT will automatically add all of its dependencies, like Pythia8 and Numpy.
Advanced: How to use an environment file instead
It is even better to use an environment.yml file. This is a list of channels and dependencies that you can distribute with your project.
This is an example of a simple environment.yml file:
name: my_root_env
channels:
  - conda-forge
dependencies:
  - root
Then, you run Conda like this:
conda env create -f environment.yml
If you want to capture your exact environment in a reproducible manner, with all package versions, run this:
conda env export > environment.yml
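If a project needs several such files (for example, one per supported Python version), they can be generated rather than written by hand. This is a minimal sketch, not part of Conda: `render_environment_yml` is a hypothetical helper that writes the same three keys (name, channels, dependencies) as plain text, and `python=3.7` is just an illustrative pin.

```python
def render_environment_yml(name, dependencies, channels=("conda-forge",)):
    """Return the text of a minimal environment.yml file.

    Hypothetical helper: it only formats strings, so each dependency must
    already be a valid Conda spec such as "root" or "python=3.7".
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dependency}" for dependency in dependencies]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Pin the Python version next to ROOT (illustrative choice).
    print(render_environment_yml("my_root_env", ["python=3.7", "root"]), end="")
```

Writing the returned text to environment.yml gives a file that `conda env create -f environment.yml` can consume.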
To enter the environment:
conda activate my_root_env
The first time you enter the environment, you should add the conda-forge channel to the search list (otherwise, you will have to add -c conda-forge every time you install or update something):
conda config --env --add channels conda-forge
To leave the environment:
conda deactivate
Installing into the current environment
If you are already in an environment (even the base environment; that is generally not a good idea, but it is supported by Conda-ROOT), then you will want to do something like this:
conda install root -c conda-forge
If you want to enable conda-forge as a searched channel so that you don't have to add this flag every time you do anything, run:
conda config --env --add channels conda-forge
This really just adds a line to the current environment's condarc, or to ~/.condarc (which applies globally) if you do not include the --env flag.
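To confirm that the channel was picked up, you can ask Conda for its effective channel list. The sketch below assumes the `conda` executable is on your `PATH` and relies on `conda config --show channels --json`; the function name `configured_channels` is hypothetical, and it returns `None` instead of failing when Conda is not available.

```python
import json
import shutil
import subprocess


def configured_channels():
    """Return Conda's channel search list, or None if it cannot be read."""
    conda = shutil.which("conda")
    if conda is None:
        return None  # conda is not on PATH
    result = subprocess.run(
        [conda, "config", "--show", "channels", "--json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    try:
        return json.loads(result.stdout).get("channels")
    except (json.JSONDecodeError, AttributeError):
        return None


if __name__ == "__main__":
    channels = configured_channels()
    if channels is None:
        print("Could not query conda")
    else:
        print("conda-forge enabled:", "conda-forge" in channels)
```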
Things to try
Almost everything in ROOT should be supported; this was built with lots of options turned on. Here are a few things to try:
root: you can start up a session and see the splash screen; Control-D to exit.
import ROOT: will load PyROOT.
root --notebook: will start a notebook server with a ROOT kernel choice.
rootbrowse: will open a TBrowser session so you can look through files.
root -l -q $ROOTSYS/tutorials/dataframe/df013_InspectAnalysis.C: will run a DataFrame example with an animated plot.
root -b -q -l -n -e "std::cout << TROOT::GetTutorialDir() << std::endl;": will print the tutorial dir.
root -b -l -q -e 'std::cout << (float) TPython::Eval("1+1") << endl;': will run Python from C++ ROOT.
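From the Python side, a quick smoke test of PyROOT might look like the sketch below. It is an illustration rather than an official check: the function name `pyroot_smoke_test` and the histogram name `h_demo` are made up, and the function returns `None` when ROOT cannot be imported, so it is safe to run outside the environment.

```python
def pyroot_smoke_test():
    """Fill a small histogram with PyROOT.

    Returns (ROOT version string, number of entries), or None if the ROOT
    module cannot be imported in the current Python.
    """
    try:
        import ROOT
    except ImportError:
        return None

    ROOT.gROOT.SetBatch(True)  # do not open any graphics windows
    hist = ROOT.TH1F("h_demo", "Demo histogram;x;entries", 10, 0.0, 10.0)
    for value in (1.5, 2.5, 2.5, 7.0):
        hist.Fill(value)
    return str(ROOT.gROOT.GetVersion()), int(hist.GetEntries())


if __name__ == "__main__":
    outcome = pyroot_smoke_test()
    if outcome is None:
        print("PyROOT is not importable from this Python")
    else:
        print("ROOT %s, histogram entries: %d" % outcome)
```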
The ROOT package will prepare the required compilers (see below). Everything in Conda is symlinked into $CONDA_PREFIX if you build things by hand; tools like CMake should find it automatically. While thisroot.* scripts exist, they should not be used. Graphics, rootbrowse, etc. all should work. Any Conda-ROOT issues can be reported to the root-feedstock.
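If you do build by hand without CMake, ROOT's `root-config` helper reports the flags to use. The sketch below collects a few of them from Python; it assumes `root-config` is on your `PATH` (it should be inside an activated environment containing ROOT), and the function name `root_build_flags` is hypothetical.

```python
import shutil
import subprocess


def root_build_flags():
    """Return ROOT's version, compile flags, and link flags as a dict.

    Returns None if root-config is not on PATH or reports an error.
    """
    root_config = shutil.which("root-config")
    if root_config is None:
        return None
    flags = {}
    for option in ("--version", "--cflags", "--libs"):
        result = subprocess.run(
            [root_config, option], capture_output=True, text=True
        )
        if result.returncode != 0:
            return None
        # Store under "version", "cflags", "libs".
        flags[option.lstrip("-")] = result.stdout.strip()
    return flags


if __name__ == "__main__":
    flags = root_build_flags()
    if flags is None:
        print("root-config not found; is the ROOT environment active?")
    else:
        for key, value in flags.items():
            print(f"{key}: {value}")
```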
ROOT was built with and will report
On Linux, there really aren't any special caveats, just a few that are general to Conda itself and to the compilers package. When ROOT is in the active environment, compiler variables such as $CXX point to the Conda compilers, GCC 7.3.
The caveats on macOS were removed on 9-25-2019; you no longer need a special 10.9 SDK. You simply need to have any SDK (so install Xcode), and you should be good to go. Again, as on Linux, new compilers are added (Clang 8).
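To see which compilers an activated environment actually points to, you can inspect the environment variables. This is a small sketch under stated assumptions: it assumes the Conda compiler packages export `CC` alongside the `CXX` variable mentioned above, and `describe_compilers` is a hypothetical helper name.

```python
import os
import shutil


def describe_compilers(environ=None):
    """Report CONDA_PREFIX and the commands behind CC and CXX.

    `environ` defaults to os.environ; a plain dict can be passed instead.
    Each "<VAR>_path" entry is the resolved executable, or None if the
    variable is unset or the command is not found on PATH.
    """
    environ = os.environ if environ is None else environ
    report = {"CONDA_PREFIX": environ.get("CONDA_PREFIX")}
    for variable in ("CC", "CXX"):
        command = environ.get(variable)
        report[variable] = command
        report[variable + "_path"] = (
            shutil.which(command, path=environ.get("PATH")) if command else None
        )
    return report


if __name__ == "__main__":
    for key, value in describe_compilers().items():
        print(f"{key}: {value}")
```

If the paths printed for the compilers sit under `$CONDA_PREFIX`, the Conda compilers are the ones in use.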
Feel free to refer to the conda build documentation if you want to build anything.
ROOT does not link to Python directly in order to properly support PyROOT from Python, but has been patched to provide the correct behavior to allow PyROOT to also be used from ROOT's C++ command line. Please report any bugs for this to the root-feedstock.
Building a library that uses ROOT
If you want to provide a package that uses ROOT, you probably do not want to replace the system compilers on the command line. To support this, ROOT was broken into several packages. You can install the root_base package to just get ROOT. The root-dependencies package stores all the dependencies (note: the full list includes things like Qt and is larger than ROOT itself). The root-binaries package stores the ROOT executables. And finally, the full root package includes compilers, jupyter, and a few other things. See the recipe for definitions.
How it was made possible
This was a monumental feat, but it was enabled by the new technologies from Conda and Conda-Forge. The Conda 4.6 release provides much better support for environments, and the unified activation allows packages to rely on environment changes. While ROOT is very careful to respect your environment (the only variable it directly sets is ROOTSYS, to be nice), unified activation helps compiler packages and other tools work together. Anaconda in version 5.0 changed to a unified and modern compiler stack, and Conda-Forge spent months converting all of the packages from the old, diverse compilers to the single compiler stack. ROOT's Linux packages were available on the day this project was completed.
Future, history, and thanks
This was done in collaboration with the ROOT team. Many fixes were pushed to ROOT to make this possible, and are in the 6.16.00 release and upcoming 6.16.02 release. There is an ongoing effort to integrate the Conda machinery into the ROOT nightly testing, so that we won't get caught by surprise in an update and so that nightly builds of master will be available (probably in the
This project was made possible by Chris Burr, with the help of Henry Schreiner, Enrico Guiraud, and Patrick Bos. Other members of the ROOT team that helped contribute are Guilherme Amadio, Axel Naumann, and Danilo Piparo.
Special thanks to the previous way to get ROOT on Conda, the NLeSC formula by Daniela Remenska. That was a great effort, but since it did not run with CI and it required a massive number of custom dependencies instead of relying on Conda-Forge to package those dependencies, it was impossible to maintain, and remained stuck on older versions of Python and ROOT. It predated Conda-Forge and conda-build 3, which made the current project possible. It was one of the inspirations for this project, though, and deserves a special place of honor.