Dear @shusheng,
Yes, those additional cards were not expected to change the result; they were only meant to clean up the input and speed up the execution.
Especially in a close geometry, small changes in the distances can have a big impact. An absolute comparison of HPGe detector efficiency may require significant fine-tuning of the geometry, including the distance between the detector window and the crystal (not always known accurately), and even the thickness of the crystal dead layer. The accuracy of the experimental result is of course up to you to judge; even there, the accuracy of the sample position and of the efficiency characterisation is crucial.
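To give a rough feeling for this distance sensitivity, here is a minimal sketch that only considers the purely geometric acceptance of an on-axis point source facing a flat circular detector face. It ignores attenuation, the dead layer and the intrinsic crystal response, so it is not a substitute for the full simulation. The function names, the 30 mm radius and the two distances are hypothetical values chosen for illustration, not taken from your setup.

```python
import math


def geometric_efficiency(distance_mm: float, radius_mm: float) -> float:
    """Fraction of 4*pi subtended by a disc, seen from an on-axis point source."""
    return 0.5 * (1.0 - distance_mm / math.hypot(distance_mm, radius_mm))


def relative_change(distance_mm: float, shift_mm: float, radius_mm: float) -> float:
    """Relative change of the geometric efficiency when the source moves by shift_mm."""
    base = geometric_efficiency(distance_mm, radius_mm)
    shifted = geometric_efficiency(distance_mm + shift_mm, radius_mm)
    return (shifted - base) / base


RADIUS_MM = 30.0  # hypothetical crystal radius, for illustration only
SHIFT_MM = 1.0    # hypothetical positioning error

for distance in (5.0, 250.0):  # hypothetical close and far source positions
    change = relative_change(distance, SHIFT_MM, RADIUS_MM)
    print(f"d = {distance:6.1f} mm: {100.0 * change:+.2f} % for a {SHIFT_MM} mm shift")
```

In this simplified picture the same 1 mm shift changes the acceptance by several times more at the close position than at the far one, which is why the window-to-crystal distance and the sample position matter so much in your case.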
What remains more puzzling is the discrepancy between the FLUKA and MCNP results. Were these obtained with exactly the same geometry (e.g. by importing/exporting the geometry via Flair) or were these independently constructed models? Also, is the scoring comparable? Were other physics settings consistent?
These are just some ideas. It could be worth comparing the two inputs, perhaps stripping them down to the minimum requirements and comparing the results again (once the geometry and source definitions are consistent).
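When you compare the results again, it may help to express the difference between the two codes in units of the combined statistical uncertainty, so that a real discrepancy can be told apart from statistical noise. Below is a minimal sketch, assuming both results are available as a value plus a relative statistical error given as a fraction (percent errors would need to be divided by 100 first) and that the two runs are statistically independent. The function name and the numbers in the usage lines are hypothetical placeholders, not results from either code.

```python
import math


def discrepancy_in_sigma(value_a: float, rel_err_a: float,
                         value_b: float, rel_err_b: float) -> float:
    """Difference between two independent results, in combined standard deviations."""
    sigma_a = abs(value_a) * rel_err_a
    sigma_b = abs(value_b) * rel_err_b
    combined = math.hypot(sigma_a, sigma_b)  # quadrature sum of absolute errors
    if combined == 0.0:
        raise ValueError("both uncertainties are zero; cannot normalise the difference")
    return (value_a - value_b) / combined


# Hypothetical placeholder numbers, for illustration only.
result_code_a = (1.10e-3, 0.02)  # (efficiency, relative error) from one code
result_code_b = (1.00e-3, 0.03)  # (efficiency, relative error) from the other code

n_sigma = discrepancy_in_sigma(*result_code_a, *result_code_b)
print(f"difference: {n_sigma:.1f} combined standard deviations")
```

A difference of many standard deviations would point to a systematic cause (geometry, source, scoring or physics settings) rather than to insufficient statistics.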