Combining Different Datasets for Customized Image Semantic Segmentation

What is the recommended way to combine multiple public datasets (and potentially a private dataset) to train a PSPNet? I’m thinking of using ADE20k and CocoStuff. Also, how can I select only the part of each dataset that covers the classes I’m interested in? Thanks!

Unsupervised domain adaptation by backpropagation is a very nice idea for combining multiple datasets of similar object types, and I am planning to experiment with it in the coming months. Keep in mind that this is still (to the best of my knowledge) an open problem, related to self-supervised and semi-supervised learning.
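
On the class-selection part of the question, one common approach (a rough sketch, not a tested recipe for ADE20k or CocoStuff) is to remap every dataset's label IDs into one small shared label space and send everything else to an "ignore" index that the loss skips. The class IDs, the table names and the ignore value 255 below are all placeholder assumptions; the real IDs have to come from each dataset's own label list.

```python
import numpy as np

# Assumed "ignore" label; it must match whatever index the training loss skips.
IGNORE_INDEX = 255

# Hypothetical source-ID -> shared-ID tables. These numbers are placeholders,
# NOT the real ADE20k / CocoStuff class indices.
DATASET_A_TO_SHARED = {3: 0, 12: 1}          # e.g. "road" -> 0, "person" -> 1
DATASET_B_TO_SHARED = {148: 0, 0: 1, 2: 2}   # e.g. "road" -> 0, "person" -> 1, "car" -> 2


def build_lut(mapping, num_source_ids=256):
    """Build a lookup table: wanted source IDs -> shared IDs, the rest -> ignore."""
    lut = np.full(num_source_ids, IGNORE_INDEX, dtype=np.uint8)
    for src_id, shared_id in mapping.items():
        lut[src_id] = shared_id
    return lut


def remap_mask(mask, lut):
    """Translate an integer label mask into the shared label space."""
    return lut[mask]


def has_wanted_class(mask, lut):
    """True if the mask contains at least one pixel of a class we kept."""
    return bool(np.any(lut[mask] != IGNORE_INDEX))


if __name__ == "__main__":
    lut_a = build_lut(DATASET_A_TO_SHARED)
    mask = np.array([[3, 12], [7, 3]], dtype=np.uint8)
    print(remap_mask(mask, lut_a))        # classes 3 and 12 kept, class 7 ignored
    print(has_wanted_class(mask, lut_a))
```

With one lookup table per dataset, the remapped samples can be concatenated into a single training set, and `has_wanted_class` can be used to drop images that contain none of the classes of interest. This only handles the label bookkeeping; it does nothing about the domain gap between the datasets, which is where the domain adaptation idea above comes in.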