Load Storms: Long load times or Fails to Load
Sometimes a module takes a long time to load, or Lmod reports an
error that it cannot load a module that it should be able to load. One
likely cause is a load storm: Lmod is loading and reloading the same
modulefiles over and over. This can happen when a module loads other
modules, and those modules in turn load some of the same modules. For
example, A/1.0.lua is:
load("B/2.0")
load("C/2.0")
The module B/2.0.lua is:
load("C/2.0")
load("D/2.0")
And module C/2.0.lua is:
load("D/2.0")
The load() function always loads the requested modulefile, even if that modulefile is already loaded.
Lmod can report what is happening. Using the -D debug flag it is possible to track what gets loaded:
$ module purge
$ module -D load A 2> ~/load_storm.log
$ grep 'MasterControl:.*load(' ~/load_storm.log
The output of the grep is:
MasterControl:load(mA={A}){
MasterControl:load(mA={B/2.0}){
MasterControl:load(mA={C/2.0}){
MasterControl:load(mA={D/2.0}){
MasterControl:load(mA={D/2.0}){
MasterControl:unload(mA={D}){
MasterControl:load(mA={D/2.0}){
MasterControl:load(mA={C/2.0}){
MasterControl:unload(mA={C}){
MasterControl:unload(mA={D/2.0}){
MasterControl:load(mA={C/2.0}){
MasterControl:load(mA={D/2.0})
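For longer logs, counting the entries by eye becomes tedious. The following is a minimal sketch of a shell helper that tallies load() calls per module. The function name count_loads is hypothetical, and it assumes the log lines have the MasterControl:load(mA={...}) format shown above, which may differ between Lmod versions.

```shell
# Hypothetical helper (not part of Lmod): tally load() calls per module.
# Assumes debug lines of the form  MasterControl:load(mA={NAME}){
count_loads() {
    # $1: path to the log written by `module -D load ... 2> logfile`
    grep -o 'MasterControl:load(mA={[^}]*}' "$1" |  # keep load entries only (unload does not match)
        sed 's/.*{//; s/}$//' |                     # strip everything but the module name
        sort | uniq -c | sort -rn |                 # count, most frequently loaded first
        awk '{print $1, $2}'                        # normalize to "COUNT NAME"
}

# Usage, with the log file created earlier:
#   count_loads ~/load_storm.log
```

Applied to the output listed above, this would report four loads of D/2.0 and three of C/2.0, with A and B/2.0 loaded once each.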
We can see that the D/2.0.lua module is loaded 4 times in this
example. To avoid this problem, reduce the number of loads.
In this case, the B/2.0.lua module need only load the C module, as D is
already loaded by the C module. If this is not practical, then
placing guards around the load statements can also reduce the number of
loads. For example, changing A/1.0.lua to:
if (not isloaded("B/2.0")) then
    load("B/2.0")
end
if (not isloaded("C/2.0")) then
    load("C/2.0")
end
and similarly for the other modules reduces the loading of each module to one time. This can be seen by running the debug module load and grepping the results as before:
MasterControl:load(mA={A}){
MasterControl:load(mA={B/2.0}){
MasterControl:load(mA={C/2.0}){
MasterControl:load(mA={D/2.0}){
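To confirm that the guards have removed every repeated load, the same log can be checked automatically. This is a minimal sketch. The function name no_repeated_loads is hypothetical, and it assumes the MasterControl:load(mA={...}) line format shown in the output above.

```shell
# Hypothetical helper (not part of Lmod): succeed only if no module
# appears in more than one load() entry of the debug log.
no_repeated_loads() {
    # $1: path to the log written by `module -D load ... 2> logfile`
    # `uniq -d` prints only the entries that occur more than once.
    dups=$(grep -o 'MasterControl:load(mA={[^}]*}' "$1" | sort | uniq -d)
    if [ -n "$dups" ]; then
        printf 'loaded more than once:\n%s\n' "$dups" >&2
        return 1
    fi
    return 0
}

# Usage, with the log file name used earlier:
#   no_repeated_loads ~/load_storm.log && echo "no load storm"
```

The check returns a nonzero status for the unguarded modulefiles and zero once every module is loaded a single time.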
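The guard statements above do not unload the dependent modules: unloading A won't unload B, C, or D. When A is unloaded, B and C are still loaded, so each guard is false and its load() call, which Lmod would otherwise treat as an unload, never runs. Changing the guard statements to the following allows both loading and unloading: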
if (not isloaded("B/2.0") or mode() == "unload") then
    load("B/2.0")
end
if (not isloaded("C/2.0") or mode() == "unload") then
    load("C/2.0")
end