When using parallel computing, one thread reports the error “executable returned RC=134” while other threads run normally

Dear Professor,
When using parallel computing, one thread reports the error “executable returned RC=134” while the other threads run normally. The error does not occur every time and may disappear when I rerun the simulation, so I cannot reproduce it reliably. I followed the approach described in the thread “Executable returned RC=134”, but it had no effect. A single thread uses 22.4 GB of memory, and I run 5 threads. My computer has 128 GB of memory and 32 GB of swap space.


Dear @xingxing ,

Apologies for the delay in response.

Would it be possible to see the .inp, the .log, the .out, and the .err file produced when this error occurs?

Best regards,
Katie

Thank you for your reply. I have deleted the previous files, so I will upload files from another simulation that shows the same error.
10test4_01.inp (3.0 KB)
Processing: 10test4_01003.log…
10test4_01003.out (72.3 KB)
10test4_03.inp (3.0 KB)
10test4_03004.log (1.5 KB)
10test4_03004.out (72.3 KB)

The .err file is empty and cannot be uploaded.
The files I uploaded come from these two folders.

Perhaps it’s because I didn’t set the maximum energy on the BEAM card?

Thank you for this. Would you also be able to provide the source routine you are using?

This is my source routine:
10test4.f (20.7 KB)

Thanks for this. Can you also provide the .txt file that you call within the source routine?

The txt file may be too large to upload, so I will provide you with some screenshots. Please check if they are helpful.

The issue arises when accessing the 10test4.txt file: because of its size, there is insufficient memory. The recommendation is to reduce the number of points in the source term, and with it the size of the .txt file, which should remove the problem. Given the size of the file, the simulation is unlikely to sample every point in the current source term, so the reduction should not affect the effective source definition.
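
One possible way to reduce the number of points is to draw a random subset of the file while streaming through it, so that the full file never has to fit in memory. The following is only a sketch in Python, not part of the source routine from this thread. It assumes one particle per line, no header lines, and that a uniformly random subset is acceptable for the source term. The function name `subsample_lines` and the output file name are hypothetical.

```python
import random


def subsample_lines(src_path, dst_path, keep, seed=12345):
    """Write `keep` uniformly chosen lines of src_path to dst_path.

    Uses reservoir sampling (Algorithm R): the input is read once,
    line by line, and only `keep` lines are held in memory.
    Assumes one particle per line and no header (an assumption,
    the real file layout is not shown in the thread).
    """
    rng = random.Random(seed)
    reservoir = []
    with open(src_path, "r") as src:
        for index, line in enumerate(src):
            if not line.endswith("\n"):
                line += "\n"  # last line may lack a trailing newline
            if index < keep:
                reservoir.append(line)
            else:
                # Replace an existing entry with probability keep / (index + 1)
                j = rng.randint(0, index)
                if j < keep:
                    reservoir[j] = line
    with open(dst_path, "w") as dst:
        dst.writelines(reservoir)
    return len(reservoir)
```

A call such as `subsample_lines("10test4.txt", "10test4_small.txt", keep=10_000_000)` would write ten million lines to a new file. If the source routine stores the number of particles separately, that value would have to be updated to match.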

Kind regards,
Kate


Thank you for your reply. I increased the number of particles recorded in the txt file to reduce the statistical error of the simulation results. Currently, the txt file contains 3e8 particles. Once the number of simulated histories reaches approximately 1e10, increasing it further no longer reduces the statistical error in the 0.3 cm × 0.3 cm × 0.3 cm dose grid. If I want smaller statistical errors in this dose grid while reducing the number of particles in the txt file to prevent the “executable returned RC=134” error, can the BIASING card or the EMF-BIAS card help me achieve this?

For this purpose you should instead increase the number of primaries you run in FLUKA, which is what determines the resulting statistical error, and not the content of your source file. The latter is only meant to represent the source term reliably, without growing to such a large size, which is the reason for your system failure.

Please do not refer to “executable returned RC=134”, which by itself does not mean anything, but rather to the error backtrace in the .log file:

#11 0x7480396a65f8 in __libc_calloc
at ./malloc/malloc.c:3679

#17 0x62fd7ed8e8e5 in read_phase_space_file_
at /home/xin/PSF/10test4/source_library.inc:1041
#18 0x62fd7ed952e7 in source_
at /home/xin/PSF/10test4/10test4.f:566

pointing to a memory allocation issue in the reading of your huge .txt file, as Kate indicated.
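
To make the memory argument concrete, here is a rough back-of-the-envelope estimate in Python. The number of stored values per particle (8) and double precision (8 bytes per value) are assumptions, since the actual layout used in `source_library.inc` is not shown here. The 3e8 particles, 22.4 GB per run, 5 parallel runs and 128 GB of RAM are the figures given earlier in this thread.

```python
def phase_space_bytes(n_particles, values_per_particle=8, bytes_per_value=8):
    """Rough size of the in-memory phase-space table, in bytes.

    values_per_particle=8 is an assumption (for example position,
    direction, energy and weight); bytes_per_value=8 assumes
    double precision storage.
    """
    return n_particles * values_per_particle * bytes_per_value


GB = 1e9

# 3e8 particles in the .txt file, as stated in the thread
estimated_per_run_gb = phase_space_bytes(3e8) / GB

# Figures reported by the original poster
reported_per_run_gb = 22.4
parallel_runs = 5
ram_gb = 128.0

total_reported_gb = parallel_runs * reported_per_run_gb
ram_headroom_gb = ram_gb - total_reported_gb

print(f"estimated table size per run: {estimated_per_run_gb:.1f} GB")
print(f"five runs at the reported footprint: {total_reported_gb:.1f} GB")
print(f"RAM left for everything else: {ram_headroom_gb:.1f} GB")
```

Under these assumptions the table alone is of the same order as the reported 22.4 GB per run, and five runs together leave little of the 128 GB free, so an occasional failed allocation in one run would not be surprising.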

Do you mean increasing the value of ni? That is indeed effective. However, when I set ni to 4e8, with five cycles and five parallel runs and the bin size shown in the figure, the statistical error I get is around 1%. When I increase ni further, the statistical error no longer decreases. Is this normal?



Up to which value did you increase it? Keep in mind that the error is expected to scale with the inverse square root of the factor by which the statistics are increased.
More importantly, I would consider a 1% statistical error a satisfactory result, since at that point the uncertainty of the result is dominated by the systematic error, which has many sources.
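
The square-root scaling mentioned above can be written down as a small helper. This is a sketch in Python that assumes purely statistical, independent histories. The function names are hypothetical, and the starting point (about 1% at 4e8 × 5 × 5 primaries) is taken from this thread.

```python
import math


def scaled_error(error_now, primaries_now, primaries_new):
    """Expected statistical error after changing the number of primaries,
    assuming the error scales as 1 / sqrt(number of primaries)."""
    return error_now * math.sqrt(primaries_now / primaries_new)


def primaries_for_target(error_now, primaries_now, error_target):
    """Number of primaries needed to reach error_target under the same
    1 / sqrt(N) assumption."""
    return primaries_now * (error_now / error_target) ** 2


# Starting point from the thread: about 1% at 4e8 x 5 x 5 primaries
current_primaries = 4e8 * 5 * 5
current_error_percent = 1.0

# Ten times more primaries
print(f"{scaled_error(current_error_percent, current_primaries, 10 * current_primaries):.2f} %")

# Primaries needed to halve the error
print(f"{primaries_for_target(current_error_percent, current_primaries, 0.5):.1e}")
```

Under this scaling, ten times more primaries should bring a 1% error down to roughly 0.32%, and halving the error costs four times the primaries.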

[quote=“Francesco Cerutti, post:15, topic:7449, username:ceruttif”]
Up to which value did you increase it?
[/quote]
I increased the number of particles per simulation cycle while keeping the number of cycles unchanged.

Let me explain the results I want to obtain and some of the methods I have considered, and I hope you can give me some suggestions. I want to obtain the dose distribution in a 0.3×0.3 cm plane at a depth of 30 cm along the central axis of the water tank. I use a USRBIN card scoring DOSE with 0.3 cm × 0.3 cm × 0.3 cm bins, and the statistical error needs to be reduced to 0.3%. To this end, I have taken the following actions:

  1. Increase the number of particles (photons) recorded in the txt phase-space file to ensure that there are enough particles in each bin.
  2. Increase the number of particle histories during simulation.
  3. Increase the number of simulation cycles.

The following are some methods that I think can help reduce statistical errors:

  1. Lower the production and transport thresholds for photons and electrons.
  2. Use the BIASING card to set a larger weight in the scoring area.

I would like to get your suggestions on whether the above methods are effective or whether there are other methods to reduce statistical errors.

Lowering the production and transport thresholds for photons and electrons may increase the accuracy of the result (and the required CPU time), but may not necessarily reduce the statistical error.

You do not need any biasing for a scoring area that is in the core of the radiation propagation.

You got ~1% statistical error with 10 billion primaries (4e8×5×5). My question was up to which value you then increased your statistics.

When I increased the number of particles from (4e7×5×5) to (4e8×5×5), the statistical error decreased to approximately one-third of the original value. However, when further increasing the number of simulated particles from (4e8×5×5) to (4e9×5×5), the statistical error showed no significant reduction compared to that of (4e8×5×5).
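
For reference, the two steps described above can be compared with the square-root expectation. This Python sketch only restates the numbers from this post. The "observed" ratios are approximate readings of "approximately one-third" and "no significant reduction", not measured values.

```python
import math


def expected_ratio(primaries_old, primaries_new):
    """Expected ratio new_error / old_error under 1 / sqrt(N) scaling."""
    return math.sqrt(primaries_old / primaries_new)


# (label, primaries before, primaries after, approximate observed ratio)
steps = [
    ("4e7x5x5 -> 4e8x5x5", 4e7 * 5 * 5, 4e8 * 5 * 5, 1.0 / 3.0),
    ("4e8x5x5 -> 4e9x5x5", 4e8 * 5 * 5, 4e9 * 5 * 5, 1.0),
]

for label, n_old, n_new, observed in steps:
    expected = expected_ratio(n_old, n_new)
    print(f"{label}: expected {expected:.2f}, observed about {observed:.2f}")
```

The first step agrees with the expected factor of about 0.32, while the second step does not, which is the discrepancy in question.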