Process multiple files

Versions

FLUKA: 4-5.0
Flair: 3.4-3
Operating system: CentOS, Ubuntu

Dear Flair experts,

Following the “Processing results in a cluster” thread, I encounter difficulties with post‑processing large output sets, especially since different machines (local workstations and remote servers) sometimes require different solutions. Is there a way to run the processing step in parallel (e.g., multi-threaded) or submit it to a batch system directly from Flair?

Alternatively, I’m familiar with parallelizing runs via Run → Custom (tsp or custom batch scripts). Could a similar or equivalent hook be implemented for the processing stage? For example, a configurable “Process → Custom” option that executes a user script or enables local multi-threaded processing, selected (for example) from a drop‑down list?

Thank you kindly in advance,
Hen Shukrun

Dear Hen,

If I understand correctly, you would like to merge parts of your results before merging them into a final file containing all the runs.
For the moment, let’s forget about parallelization; I will come back to it later.
If you have several .bnn files containing the same detectors, but obtained with different simulations, you can merge these using the meshtk tool included in Flair.
Now, if you go to your Flair installation folder you will find the meshtk program (possibly a soft link to geoviewer/meshtk, depending on the version). If you type ./meshtk -h you will see the manual on how to use the software. Among the displayed information you will find the following example:

./meshtk 7E12 run1.bnn:1-3 + 1e10 run2.bnn:2-4 -o sum.bnn
          create a new 'sum.bnn' mesh data file where each bin will contain
          the '7e12*run1[i,j,k] + 1e10*run2[i,j,k]' for detectors 1,2,3 and 2,3,4.
          corresponding detectors should have the same dimensions

which works as explained. Be careful with the normalization! Since your results in the USRBIN scorings are usually per primary, you might want to keep this convention. In that case the multiplication factors have to sum to 1. For example, if run1.bnn was obtained with 1e6 primaries and run2.bnn was obtained with 2e6 primaries, you would have to merge them like this:

./meshtk 1e6/3e6 run1.bnn:1-3 + 2e6/3e6 run2.bnn:2-4 -o sum.bnn

so that your final results are still expressed per primary.
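If your meshtk version does not accept expressions like 1e6/3e6 directly, you can compute the weights in the shell first. A minimal sketch, where the primary counts and file names are example values (the meshtk call is only echoed, so the snippet runs without FLUKA or Flair installed):

```shell
# Example values: number of primaries in each run (hypothetical)
N1=1000000      # primaries used to produce run1.bnn
N2=2000000      # primaries used to produce run2.bnn
NTOT=$((N1 + N2))

# Per-primary weights: each run's fraction of the total primaries
W1=$(awk -v a="$N1" -v b="$NTOT" 'BEGIN { printf "%.6f", a / b }')
W2=$(awk -v a="$N2" -v b="$NTOT" 'BEGIN { printf "%.6f", a / b }')

# Echoed rather than executed, so the sketch is harmless to run as-is:
echo "./meshtk $W1 run1.bnn:1-3 + $W2 run2.bnn:2-4 -o sum.bnn"
```

The two weights sum to 1, so the merged result stays expressed per primary.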

Now coming back to the parallelization.
Surely on a cluster you can submit a job which runs, for example, the usbsuw program on a subset of your results. Possibly you can even add it at the end of your FLUKA job submission script inside Flair, to be run only on the files from a given spawn (you would have to make sure that the files are then correctly moved from the cluster to your working directory).

You can also have a merging script in Flair; in that case you would simply ignore the fluka command that Flair normally passes to the script as a command-line argument. Please have a look at the Flair manual:

In a custom Flair run script, the fluka command is passed with $* to the batch job script that is being created. This fluka command is built by Flair from the parameters chosen in the Run tab, for example the number of cycles, the executable path, etc.
In principle, you could fish the information about the subset of your runs out of this command and, in your job script, run usbsuw instead of fluka.
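As a sketch of that idea, a custom script can take the last argument of the fluka command, which is the input file name, and derive the run name from it instead of executing FLUKA. The rfluka command line below is a made-up example of what Flair would pass in:

```shell
# Simulated arguments, as Flair would pass them to a custom run script
# (paths and names here are hypothetical examples):
set -- /opt/fluka/bin/rfluka -e ./myexe -M 5 myrun.inp

# The input file is the last token of the command; strip .inp for the run name
INPUT=$(printf '%s\n' "$@" | tail -n 1)
RUN="${INPUT%.inp}"

# Instead of running "$@" (i.e. FLUKA), one could now merge this run's
# output files, e.g. with usbsuw; here we only print what would happen:
echo "would merge ${RUN}*_fort.* instead of running FLUKA"
```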

About the multi-threaded processing, I am not sure it makes sense, to be honest.

I hope this helps at least to an extent.
Let me know if you have any further questions.

Cheers,
Jerzy

Dear @jemancza,
Thank you kindly for your response, it was very informative and helped me pinpoint my problem.
I run FLUKA projects on several machines - some are cluster servers, others are multi-core workstations. My goal is not to merge existing .bnn files, but to try and speed up the processing step.
If I understand correctly, when I click “Process” in Flair, the post‑processors are single‑threaded and run on the head node. As a result, the conversion of hundreds of fort.* files to .bnn takes ~30–60 minutes, even though many CPU cores are idle locally or available on the cluster. I was looking for a way to use the available local CPU cores or send the processing stage to the cluster.
As you mentioned, sending the parameters manually to a custom processing script is an option, but if I understand correctly, this requires separate customization per machine and per run. I am therefore wondering whether a feature similar to Run → Custom could be added for processing, one that executes a general processing script on the cluster. In this way, if I understand correctly, only one custom script would be needed per server (as is now the case for run submissions).
Regarding local multi-threaded processing - I am not familiar with the technical details, but can a similar logic (sending to local threads) be applied here?
Alternatively, what might be a better solution for multi-threaded workstations (rather than clusters)? For example, on a workstation with over 90 threads, a simulation with a total run time of 15 minutes, spawned over 80 threads with 5 cycles each, takes more than 30 minutes to process.
Thank you kindly in advance,
Hen

Dear @rachel.hen.shukrun,

First of all, sorry for the very late reply! I was on vacation.

You wrote:

If I understand correctly, when I click “Process” in Flair, the post‑processors are single‑threaded and run on the head node. As a result, the conversion of hundreds of fort.* files to .bnn takes ~30–60 minutes, even though many CPU cores are idle locally or available on the cluster. I was looking for a way to use the available local CPU cores or send the processing stage to the cluster.

First of all, the total processing speed will mostly depend on the number of scorings that you have in your input file but also on the number of runs that have to be merged and the size (the number of bins) of your scorings. You can try to optimize these parameters in your simulations.

Keep in mind that a single instance of usbsuw will be writing to a single file and it has to have access to all the cycles to correctly calculate the errors. Can you please explain what exactly you want to be happening in parallel?

If you want to implement parallel merging - you can approach the problem in two ways (maybe more, these are the two I can think of).

  1. First merging a subset of runs (or even individual runs) and then merging the intermediate files into a final .bnn file. This is the method I already described; it can be achieved with meshtk from Flair or a native FLUKA program called usbscw. A similar issue was discussed before:
    The generation of bnn files on the cluster is slow - #2
    As an example, you can add the merging step (depending on the type of scorings you have, but most likely usbsuw for your USRBINs) at the end of your Flair custom run script, so that each core on the cluster merges the results from a single spawn. Then you would run meshtk as described above to merge the files into a single file (this should be very fast).

  2. You can merge separate units on separate cores. This is probably the better way. You would have to send a job script where the merging is limited to a given unit number, executing for example something like usbsuw *fort.44, where 44 is the unit number. It is up to you how many scorings are included in a given unit, and in that way you can optimize your merging speed.
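A sketch of method 2 for a single unit, assuming the classic interactive interface of usbsuw (input file names on stdin, a blank line, then the output name); check the prompts of usbsuw on your own installation, as the interface may differ between versions. The call is guarded so the sketch is harmless on a machine without FLUKA:

```shell
UNIT=44                          # USRBIN unit number to merge (example value)
OUT="merged_${UNIT}.bnn"

if command -v usbsuw >/dev/null 2>&1; then
  # Feed every cycle file for this unit, a blank line, then the output name
  { ls -1 -- *_fort."${UNIT}"; echo; echo "$OUT"; } | usbsuw
else
  echo "usbsuw not in PATH; would merge unit ${UNIT} into ${OUT}"
fi
```

One such job per unit number lets the units be merged in parallel on separate cores.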

Now regarding this:

As you mentioned, sending the parameters manually to a custom processing script is an option, but if I understand correctly, this requires separate customization per machine and per run. I am therefore wondering whether a feature similar to Run → Custom could be added for processing, one that executes a general processing script on the cluster.

A separate script per machine (or cluster or server, whatever you call it): most likely yes, but not necessarily per run. As I mentioned in my previous post, you would have to fish the information about the working directory out of the command-line argument passed on by Flair, which is normally used in a run script simply as $*. Just have a look at an example job script generated by the Flair custom run script on your machine and you will see what is hiding behind the $*. It should be something like

/YOUR_PATH_TO_FLUKA/bin/rfluka -e {your_executable_path} -M {number_of_cycles} YOUR_FLUKA_INPUT.inp

After this line you could add another operation which, for example, merges all of your USRBINs from a given spawn (a simulation on a single node). But you can also delete the part of the script which actually runs FLUKA and keep only the merging part. In that case, you would need to find a way to extract the run name and number (there is a Flair variable for this) so that the merging happens only on a subset of your simulation results, OR just merge everything on the cluster with a single node.

The general problem with parallel merging using method 1 described above, if you are merging the .bnn files and not the fort.* files, is that you lose the information about possible correlations between runs. Normally your simulations should not be correlated, but there are some specific cases where they might be. I will have to run some tests out of my own curiosity, and if you want we can continue the discussion on this topic.

Therefore, I think the smarter and safer way is to use method 2. The only disadvantage is that I don’t really know how you could incorporate this into a Flair script. However, it should be very easy to write a bash script which generates a job script for a given unit number.
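As a sketch of such a generator: the SLURM directives and the stdin convention of usbsuw below are assumptions, so adapt them to your batch system and FLUKA version. Inner dollar signs are escaped so they are evaluated when the job runs, not when the script is generated:

```shell
UNIT="${1:-44}"                  # unit number from the command line, default 44
JOB="merge_unit_${UNIT}.sh"

# Write a one-task batch job that merges all cycle files of this unit
# (SLURM directives are an assumed example; usbsuw stdin convention assumed)
cat > "$JOB" <<EOF
#!/bin/bash
#SBATCH --job-name=merge_u${UNIT}
#SBATCH --ntasks=1

cd "\$SLURM_SUBMIT_DIR"
{ ls -1 -- *_fort.${UNIT}; echo; echo merged_${UNIT}.bnn; } | usbsuw
EOF

chmod +x "$JOB"
echo "wrote $JOB"                # submit with: sbatch $JOB
```

Running it once per unit number produces one independent merging job per unit.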

Cheers,

Jerzy