Hello, today I bring the results of some quick parallel efficiency tests. I run on each of three computers two identical cases: one using a single core and another using 4 cores. To get an efficiency factor, I divide the time it took to run the 4-core case by 4 times the time it took to run the case on a single core. In the best case the result is 1, meaning there was no waste in multiple-core communication/overhead. But... we see this is not the case:
As you can see, the results are quite bad. I am not sure what causes this inefficiency. One reason I have seen floating around the internets (both the good and bad internets) is that the primary cache (L1?) size severely limits this. I have also seen that the secondary and below cache (L2, L3) sizes are not very useful in this application at least.
Computer 1 has an i7-2600 core, Computer 2 has an i5-750, and Computer 3 has a different i5 (Q9550). The memory is more than enough (verified by looking at system monitor). I used OpenFoam 1.5-dev with generalized grid interface (GGI, rotor in the mesh).
Given these results my impetus to cluster computers together has died. Haha.
No comments:
Post a Comment