Multiple simulation cycles in two-step simulation

Dear FLUKA experts,

I’m not sure how to implement multiple simulation cycles in a two-step simulation and I would appreciate some advice.

The situation is as follows. I run multiple cycles of the first simulation, e.g. 5, and via fluscw.f this generates a phase space file for every cycle. What is now the easiest way to run 5 cycles in the second simulation? I do not want to run a single (larger) cycle in the first step and then run multiple cycles in the second step, since in that case I would just resample the same distribution coming from the same random number seed. Here are my thoughts:

  • Doing some hard coding of file names in the source.f file for the second simulation seems like a bad idea.
  • I could just combine the 5 phase space files from the first simulation into one big file (after all, it is just a long list of particles), but that does not seem very elegant to me and I would prefer to avoid generating huge files.
  • Maybe one could use Flair’s loop variable capability and loop through the phase space files from the first simulation. But I didn’t manage to use Flair’s loop variables with file names instead of numerical values.
  • I could write a script that generates the input files for the cycles of the second step according to Flair’s naming convention but with the different phase space file names in the input file. Or is there a way to do this automatically in Flair?
  • Any other ideas?

Thanks,
Lorenzo

Dear Lorenzo,

In my experience, this kind of 2-step simulation is done the following way:

  1. In the first step you create a single large phase space data file (one long cycle, or the concatenated results of multiple cycles; it doesn’t matter).
  2. In the second step, in each cycle (job) you read the one data file.
  3. You don’t go line by line each time, but you randomly select lines.
  4. To make it more random, you can smear your phase space further. Just add a small Gaussian variation to the starting coordinates (if it makes sense), direction, energy, etc. (see the sketch after this list).
  5. This way you can run many more primaries in the second step than you have in the phase space data file, which is usually needed unless you run a very long first step.
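If it helps, here is a minimal sketch of the smearing in point 4 for a fixed-format source.f, assuming the record has already been read; EKIN, SIGE and XDUMMY are placeholder names to adapt to your setup. FLRNDM is FLUKA’s uniform random number generator available in user routines, and a standard normal deviate is built from two uniforms via Box-Muller:

    *     Box-Muller: two uniform numbers -> one standard normal deviate
          DOUBLE PRECISION XDUMMY, RND1, RND2, GAUSS, EKIN, SIGE
          DOUBLE PRECISION FLRNDM
          RND1  = FLRNDM (XDUMMY)
          RND2  = FLRNDM (XDUMMY)
          GAUSS = SQRT ( -2.D0 * LOG ( 1.D0 - RND1 ) )
         &      * COS ( 6.28318530717959D0 * RND2 )
    *     Add a small Gaussian variation, here to the kinetic energy;
    *     the same recipe applies to coordinates and direction cosines
    *     (renormalise the direction afterwards)
          EKIN  = EKIN + SIGE * GAUSS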

In my opinion, using multiple phase space data files is more restrictive, since the number of jobs/cycles has to be the same in both steps.

But if you still prefer to load different files, that is possible too.

  • Hard-coding file names is maybe not elegant, but it works. You can even signal which file to read through a WHAT on the SOURCE card, which reaches the source routine as WHASOU: if WHASOU(1) == 1 then read file_1, etc. (see the sketch after this list).
  • If you use a #define $variable on the SOURCE card, you can vary this $variable in a Flair loop run.
  • In Flair you can define lists, and use this to change the file name on an OPEN card. (See http://flair.web.cern.ch/flair/manual/F3.6.html, section: Arrays, tuples and lists)
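To illustrate the first bullet, a minimal sketch for the classic source.f template, where LFIRST is the template’s first-call flag, WHASOU comes from sourcm.inc, and LUNPSF plus the file names are placeholders for your setup:

    *     Pick the phase space file according to WHAT(1) of the SOURCE
    *     card, available here as WHASOU(1); done once, on the first call
          IF ( LFIRST ) THEN
             LFIRST = .FALSE.
             IF ( NINT (WHASOU(1)) .EQ. 1 ) THEN
                OPEN ( UNIT = LUNPSF, FILE = 'phase-space_1.dat',
         &             STATUS = 'OLD' )
             ELSE
                OPEN ( UNIT = LUNPSF, FILE = 'phase-space_2.dat',
         &             STATUS = 'OLD' )
             END IF
          END IF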

I hope this helps.

Cheers,
David

Dear David

Thanks a lot for your detailed answer! Things are now much clearer.

Best,
Lorenzo

Hi @lorenzo.mercolli,

I’d like to add a few more aspects to David’s answer, as there are some very subtle pitfalls in this kind of simulation.

Nr of phase space files:
What we usually did in the past was to have one phase space file per cycle in the first step. Like any output file generated, it will get the cycle suffix (e.g. phase-space001.dat). For the second step you can modify your rfluka script to create a link/copy to the corresponding file for the respective cycle, since the script knows the current cycle number and thus the suffix.

Nr of particles in the second step:
As you want to profit from “boosted” statistics, you can iterate over your phase space file multiple times.

Randomization of the parameters or position in the phase space file:
While this is tempting, it adds little benefit and can introduce a subtle bias with artifacts that are, again, difficult to spot. So unless you know perfectly well what you are doing and can ensure that it has absolutely no impact on the results, I would advise against it. In the end it is not really necessary, as I will explain:

  • if you randomize which particle to start in the second step, you either need to read the whole file, which takes a lot of memory, or you need to seek around on disk, which incurs a runtime penalty due to increased I/O load.

  • if you keep the order, a 2-step run using all particles only once should give you exactly (!) the same result as the one-step run if you start with the same random seed! This is extremely important because there are subtle effects that you might not see easily and might attribute to statistical fluctuations.

  • if you loop over the phase space file (in its original order) several times, you will arrive at a different point in the random number generator’s sequence whenever you restart the same particle another time. With a good RNG this should provide sufficient randomness to avoid correlation, and your histories quickly comply with the requirements of a Markov chain. What remains is a negligibly small number of slightly correlated sub-steps that averages out over the whole ensemble. In my opinion this is preferable to introducing additional randomness in direction/energy, which, depending on the physics problem you are looking at, might introduce unphysical behaviour and a bias that is difficult to spot (a minimal reading pattern for this looping is sketched below).
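For completeness, a minimal sketch of this looping inside source.f, reading records in their original order and rewinding at end-of-file; LUNPSF and the record layout are placeholders:

    *     Read the next phase space record; at EOF rewind and start the
    *     next pass over the file in the same, original order
          READ ( LUNPSF, *, IOSTAT = IOS ) JPART, EKIN, WEIGHT,
         &      XX, YY, ZZ, CX, CY, CZ
          IF ( IOS .NE. 0 ) THEN
             REWIND ( LUNPSF )
             READ ( LUNPSF, * ) JPART, EKIN, WEIGHT,
         &         XX, YY, ZZ, CX, CY, CZ
          END IF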

Final caveats:
One important point that is often missed concerns the first step. In order to avoid unphysical and wrong behaviour, which is often very difficult to spot, you must make sure that you do not consider particles twice. This means that once a particle is written to the phase space file it must be stopped! Otherwise it could back-scatter, come back, and be written to your file again. This can happen especially for low-energy neutrons and in the end creates subtle artifacts and incorrect results that are tricky to spot!
The best way is to assign the BLACKHOLE material to the volume that is used as the particle sink. Whenever particles cross into that volume they should be written to the phase space file; they are then automatically killed by the code, allowing you to conserve energy, keep the weights correct, etc.
I would even suggest that the routine which writes out the phase space file checks this. You just need to include flkmat.inc, and then you can check the material in the array MEDFLK like this:

*     BLACKHOLE is FLUKA material number 1; MEDFLK (declared in
*     flkmat.inc) maps each region to its material.  RegId(I) stands
*     for the number of the scoring region.
      IF ( MEDFLK( RegId(I), 1 ) .NE. 1 ) THEN
         WRITE (LUNERR,*) '%%% FLUKA 2Step: Scoring region ', RegId(I)
         WRITE (LUNERR,*) 'must be set to BLACKHOLE!!!'
         STOP '%%% FLUKA 2Step error. See the .err file!'
      END IF
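For context, the natural place to write such a file is the BXDRAW entry of mgdraw.f, which is called at every boundary crossing. A minimal sketch, where NRSINK (the number of your sink region) and LUNPSF are placeholders; the particle quantities come from trackr.inc and the rest masses AM from paprop.inc:

          ENTRY BXDRAW ( ICODE, MREG, NEWREG, XSCO, YSCO, ZSCO )
    *     Write the particle exactly once: when it enters the BLACKHOLE
    *     sink region, just before FLUKA kills it there
          IF ( NEWREG .EQ. NRSINK ) THEN
             WRITE ( LUNPSF, * ) JTRACK, ETRACK - AM (JTRACK), WTRACK,
         &         XSCO, YSCO, ZSCO, CXTRCK, CYTRCK, CZTRCK
          END IF
          RETURN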

One final point concerns the error calculation: if you do a two-step simulation and analyse the corresponding statistical error, you will easily get very small uncertainties which suggest convergence of your result. This is treacherous, because it is not the real uncertainty but only the uncertainty of the second step. Imagine you have a phase space file with only 10 particles inside. Clearly the uncertainty of the second step will be almost zero if you run those 10 particles often enough. But the final result is most probably completely off, simply because the first step was not calculated with sufficient statistical significance!
Therefore, don’t trust the small errors you will see - they are fake.
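As a back-of-envelope illustration (my own, assuming the two steps fluctuate independently): the relative variances add, $\sigma_{\rm tot}^2 \approx \sigma_1^2 + \sigma_2^2$ with $\sigma_1 \sim 1/\sqrt{N_1}$, where $N_1$ is the number of particles in the phase space file. However long you run the second step and however small $\sigma_2$ gets, the true uncertainty never drops below that $1/\sqrt{N_1}$ floor.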

Cheers
Chris

Dear @ctheis

Thanks a lot for the detailed explanation! Indeed, I was wondering how I could randomize the particles in the phase space file without having to read the whole file into memory.

I didn’t think about the issue of stopping the particles after they cross the plane where the phase space file is written. But of course you are right. I’ll check with (fewer) particles how much the results differ in my case.

Cheers,
Lorenzo