Attempts to improve the performance of programs have been ongoing for decades. Operating systems devote a large part of their technology to running several programs simultaneously in the best possible way. From the point of view of computer architecture, techniques have been incorporated to speed up access to the different types of memory, to improve pipelining, to execute instructions that operate on a whole set of (vector) data, and so on.
There has been a revolution in the field of computer processors in recent years. The tendency is no longer to increase processor frequency but to focus on calculation capability, taking the number of cores in the processor as the scaling factor. Years ago these technologies were only available in specialised processors, but today they are readily found on the market in computers, graphics cards and even mobile phones.
The evolution in hardware has gone hand in hand with an evolution in the software written to benefit from it, and in recent years there has also been a revolution in the technologies that implement parallelism in applications. Historically, parallelism was achieved using specific communication libraries that created a ‘virtual parallel unit’ out of several machines, with parallelism based on message passing. The de facto standard technology for this purpose is the Message Passing Interface (MPI), which is very widespread in calculation centres and computer clusters in the scientific and technological circles of companies and universities. With the advent of multi-core processors, where memory is shared between the cores, parallelism technology has extended to that market too, and new programming environments and libraries that exploit this capacity have appeared. OpenMP and Intel Threading Building Blocks are examples of this type of technology.
There is a tendency to combine this kind of technology with MPI to obtain parallelism in applications, both within the multi-core processors where they run and across the nodes of a computer cluster. Applying these technologies is generally extremely complex and depends greatly on the problem to be solved. In the case of EcosimPro, the fact that a large part of the simulation model is generated dynamically presents an added challenge: applying this type of solution to real models, which establish a highly specific and optimised calculation order, is therefore very complex.
The upcoming versions of EcosimPro and PROOSIS will incorporate new parallelism capabilities, such as the simultaneous execution of simulation cases (e.g. parametric studies), that will greatly reduce calculation times, as well as other techniques to extract as much parallelism as possible between tasks of the same process. Furthermore, the latest C/C++/FORTRAN compilers already parallelise certain tasks automatically, so simply switching compilers can improve calculation times without any changes to the program.