I see it this way: different files are used to store semantically different types of data. With a single file for the architecture, it's simple and efficient to share it and to use visualisation tools (like https://lutzroeder.github.io/Netron/). I understand your argument for consistency though. To keep these two components consistent, MXNet will give you warnings/errors if you try to load weights onto an incompatible model. If you still want flexibility, you can disable the warnings/errors with the following two arguments when loading:
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
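
To make the two flags above concrete, here is a rough pure-Python sketch of the check they control. It is not MXNet's implementation: `check_compatibility` and the parameter names are hypothetical, made up for illustration, and it only compares names (MXNet also compares shapes, for instance). In Gluon you would pass the same flags to the parameter-loading call on your Block, which as far as I know is `load_parameters` in recent 1.x releases.

```python
def check_compatibility(model_params, file_params,
                        allow_missing=False, ignore_extra=False):
    """Hypothetical helper: return the names that would be loaded, or raise."""
    model_params, file_params = set(model_params), set(file_params)
    missing = model_params - file_params  # in the model, absent from the file
    extra = file_params - model_params    # in the file, absent from the model
    if missing and not allow_missing:
        raise ValueError("missing from file: %s" % sorted(missing))
    if extra and not ignore_extra:
        raise ValueError("not present in model: %s" % sorted(extra))
    # Only parameters known to both sides are actually loaded.
    return sorted(model_params & file_params)


# Made-up parameter names: the model and the file disagree on one layer.
model = ["dense0_weight", "dense0_bias", "dense1_weight"]
saved = ["dense0_weight", "dense0_bias", "dense2_weight"]

# With both flags set, the mismatch is tolerated and the overlap is loaded.
loaded = check_compatibility(model, saved, allow_missing=True, ignore_extra=True)
print(loaded)
```

With the defaults (both False), the same call raises, which is the consistency check I mentioned. Setting only one flag relaxes only that direction of the mismatch.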