How can I improve speed when processing multiple videos with one GPU?

Assume I have one GPU and 6 videos that I want to process at the same time. How should I speed things up?

Solutions I have tried:

Solution 1:

Create one global object that does the inference jobs from different threads. The batch size of this object is 1.

Solution 2:

Create a few global objects that do the inference jobs from different threads. The batch size of each object is 1.

Unfortunately, solution 2 does not make things faster, so I think I need another solution, one that uses a larger batch size to process the video frames.

Solution 3 (haven’t tried):

a : Set the batch size to 6
b : If the number of queued images reaches 6, feed them into the network
c : If the number of queued images does not reach 6 within a certain period (e.g. 500 ms), feed the images collected so far into the network
d : Return the inference result of each image
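For what it's worth, steps a to d can be sketched with a queue and one worker thread. This is only a rough sketch under my own assumptions: `MicroBatcher`, `submit`, and `fake_infer` are hypothetical names, and `fake_infer` stands in for whatever batched forward pass your framework provides. Each video thread calls `submit(frame)` and gets back a future for that frame's result.

```python
import queue
import threading
import time
from concurrent.futures import Future

BATCH_SIZE = 6    # step a
MAX_WAIT_S = 0.5  # step c: flush after 500 ms even if the batch is not full


def fake_infer(batch):
    # Hypothetical stand-in for a real batched forward pass on the GPU.
    return [f"result:{frame}" for frame in batch]


class MicroBatcher:
    """Collects frames from many threads and runs them through infer_fn in batches."""

    def __init__(self, infer_fn, batch_size=BATCH_SIZE, max_wait_s=MAX_WAIT_S):
        self._infer_fn = infer_fn
        self._batch_size = batch_size
        self._max_wait_s = max_wait_s
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, frame):
        # Called from any video thread; the future resolves to this frame's result.
        fut = Future()
        self._requests.put((frame, fut))
        return fut

    def close(self):
        # None is the shutdown signal for the worker thread.
        self._requests.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._requests.get()  # block until the first frame arrives
            if item is None:
                return
            pending = [item]
            deadline = time.monotonic() + self._max_wait_s
            stop = False
            # Steps b and c: wait for a full batch or for the deadline.
            while len(pending) < self._batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    item = self._requests.get(timeout=remaining)
                except queue.Empty:
                    break
                if item is None:
                    stop = True
                    break
                pending.append(item)
            frames = [frame for frame, _ in pending]
            try:
                results = self._infer_fn(frames)
                # Step d: hand each caller the result for its own frame.
                for (_, fut), res in zip(pending, results):
                    fut.set_result(res)
            except Exception as exc:
                for _, fut in pending:
                    fut.set_exception(exc)
            if stop:
                return


if __name__ == "__main__":
    batcher = MicroBatcher(fake_infer)
    futures = [batcher.submit(f"frame{i}") for i in range(8)]
    for fut in futures:
        print(fut.result())
    batcher.close()
```

The timer starts when the first frame of a batch arrives, so no frame waits longer than the timeout before being sent to the network. Whether 500 ms is acceptable depends on how much latency you can tolerate per frame.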

What kind of solution is recommended? Thanks.

I don’t think multithreading is really going to help speed things up… What you need to do is make sure that your GPU is utilized at 100%, using as much of your GPU memory as possible.

From my experience, usually the problem is in loading data from the disk and doing preprocessing. This is where you might need multiprocessing/multithreading.
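To illustrate the loading/preprocessing point, here is a minimal sketch that moves decoding and preprocessing into a background thread and hands the results over through a bounded queue, so the inference loop does not sit idle waiting for data. The names `prefetch`, `frame_source`, and `preprocess` are hypothetical placeholders, not part of any library. In Python, threads mainly help when the work is I/O or releases the GIL; for CPU-heavy pure-Python preprocessing, multiprocessing is usually the better fit.

```python
import queue
import threading


def prefetch(frame_source, preprocess, capacity=32):
    """Yield preprocess(frame) for each frame, computed in a background thread.

    frame_source and preprocess are hypothetical placeholders for your own
    video decoder and preprocessing function.
    """
    buf = queue.Queue(maxsize=capacity)  # bounded, so memory use stays limited

    def producer():
        try:
            for frame in frame_source:
                buf.put(("item", preprocess(frame)))
            buf.put(("done", None))
        except Exception as exc:
            # Forward the error so the consumer does not wait forever.
            buf.put(("error", exc))

    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, value = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise value
        yield value


if __name__ == "__main__":
    # Toy usage: integers stand in for frames, doubling stands in for preprocessing.
    for ready in prefetch(range(5), lambda frame: frame * 2):
        print(ready)
```

One such generator per video, all feeding a single batched inference loop, keeps the data side busy while the GPU works on the previous batch.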

There is a good video about increasing training performance. Although it focuses on training, the same principles apply here. I highly recommend watching it: https://www.youtube.com/watch?v=Cqo7FPftNyo