Given the complexity of numerical combustion models, the computational combustion community has always aimed to benefit from leading-edge supercomputers. This goal is at the core of the Center of Excellence in Combustion (CoEC), which aims to prepare the European combustion reference codes for efficient exploitation of upcoming Exascale supercomputers.
A key milestone accomplished in the first half of the CoEC project has been a performance audit of the reference codes constituting the CoEC’s software hub. This activity has been performed in collaboration with the Performance Optimization and Productivity (POP) Center of Excellence, which provides performance optimization and productivity services for academic and industrial codes in all domains.
The analyses provided by the POP team have established a baseline of the codes’ performance and included recommendations for meaningful optimizations. Moreover, since all the analyses followed the same methodology, the results can be compared directly across codes.
The test cases chosen for the performance audit fall into three categories that represent fundamental challenges of computational combustion: i) gas-phase simulations, ii) spray combustion simulations, and iii) soot formation simulations. The first category deals with the efficient simulation of gaseous combustion and focuses mainly on the treatment of chemistry and flame-wall interactions. For spray combustion, the focus is on analyzing Eulerian-Lagrangian methodologies. Finally, the challenge in soot formation is to handle the large number of partial differential equations that describe soot formation and transport in the Eulerian model. Table 1 shows the category chosen for each reference code and information about the test case considered. Details of these benchmarks can be found in deliverable D5.1, prepared in CoEC’s Work Package 5: “Exascale technologies.”
The POP CoE has its own methodology for analyzing parallel codes and measuring the impact of the different factors inherent to parallel computing. In particular, the POP model decomposes any code’s performance into factors that can be organized in a tree, as shown in Figure 1. The metrics measured directly are those of the “leaf” nodes, namely: load balance, serialization efficiency, transfer efficiency, IPC scalability, instruction scalability, and frequency scalability. The metric of any other node is obtained by multiplying the metrics of its child nodes. For example, global efficiency (the root node) is the product of computational efficiency and parallel efficiency.
Figure 1: POP performance metrics
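The multiplicative rule described above can be illustrated with a minimal Python sketch. The leaf values below are hypothetical and do not come from any CoEC audit. The tree layout is an assumption based on the standard POP hierarchy, including an intermediate "communication efficiency" node (serialization times transfer) that the text above does not name.

```python
from math import prod

# Hypothetical leaf measurements (1.0 = ideal); not taken from any audit.
LEAVES = {
    "load_balance": 0.90,
    "serialization_efficiency": 0.80,
    "transfer_efficiency": 0.95,
    "ipc_scalability": 1.00,
    "instruction_scalability": 0.98,
    "frequency_scalability": 0.99,
}

# Inner nodes and their children (assumed layout of the POP metric tree).
TREE = {
    "global_efficiency": ["parallel_efficiency", "computational_efficiency"],
    "parallel_efficiency": ["load_balance", "communication_efficiency"],
    "communication_efficiency": [
        "serialization_efficiency",
        "transfer_efficiency",
    ],
    "computational_efficiency": [
        "ipc_scalability",
        "instruction_scalability",
        "frequency_scalability",
    ],
}


def metric(name, leaves=LEAVES, tree=TREE):
    """Return a node's value: measured if it is a leaf, else the product of its children."""
    if name in leaves:
        return leaves[name]
    return prod(metric(child, leaves, tree) for child in tree[name])


if __name__ == "__main__":
    for node in TREE:
        print(f"{node}: {metric(node):.3f}")
```

Because every inner node is a plain product, a single poor leaf value (here the hypothetical serialization efficiency of 0.80) propagates all the way up to the global efficiency, which is why the audits focus on the weakest leaf first.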
We have observed that codes solving the same phenomenology face similar performance issues. For example, a common concern for the codes tackling spray simulations is the load imbalance observed in the Lagrangian particle transport. This imbalance is due to an uneven distribution of particles across the domain. The POP audits point out various possible solutions, such as runtime dynamic load balancing or mesh repartitioning mechanisms. In particular, the CNRS-CORIA lab, which develops and hosts the combustion code YALES2, is working on a two-constraint load balancing algorithm to enable the correct balancing of both the Eulerian grid and the Lagrangian particles in the same calculation.
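To make the effect of an uneven particle distribution concrete, the sketch below computes the POP load balance metric, defined as the average useful computation time per process divided by the maximum. Using the particle count per rank as a proxy for computation time is an assumption made here for illustration only, and the numbers are hypothetical.

```python
def load_balance(work_per_rank):
    """Load balance efficiency: average work per rank divided by the maximum."""
    if not work_per_rank:
        raise ValueError("need at least one rank")
    peak = max(work_per_rank)
    if peak == 0:
        return 1.0  # no work anywhere: trivially balanced
    average = sum(work_per_rank) / len(work_per_rank)
    return average / peak


if __name__ == "__main__":
    # Hypothetical particle counts on four ranks.
    clustered = [4000, 0, 0, 0]      # all particles near the injector
    even = [1000, 1000, 1000, 1000]  # same total, evenly spread
    print(load_balance(clustered))
    print(load_balance(even))
```

With all particles on one rank out of four, the metric drops to 0.25, meaning three quarters of the available compute time in the particle phase is spent waiting.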
Figure 2 shows the performance measurements for Alya (before applying the recommended optimizations) when running the CORIA Rouen Spray Burner (CRSB) test case. We can observe that the most critical aspect affecting performance is serialization. Exploring this aspect in further detail, the POP team discovered that the serialization is mainly caused by irregular load balance in different phases of the time step. This aspect is being tackled by the Barcelona Supercomputing Center in the context of WP5. Hopefully, positive results will be described in future newsletters. Meanwhile, we recommend using the POP services if you want to boost your code’s performance!
Figure 2: POP report for Alya