Assume I have one GPU and 6 videos that I want to process at the same time. How should I speed things up?
Solutions I have tried:
1 : Create one global object that runs the inference jobs for all threads; the batch size of this object is 1.
2 : Create a few global objects that run the inference jobs in different threads; the batch size of each object is 1.
Unfortunately, solution 2 does not make things faster, so I think I need another solution, one that uses a larger batch size to process the video frames.
Solution 3 (haven't tried):
a : Set the batch size to 6.
b : Once 6 images have accumulated, feed them into the network.
c : If fewer than 6 images have accumulated within a certain period (e.g. 500 ms), feed the images collected so far into the network.
d : Return the inference result for each image.
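To make steps a to d of Solution 3 concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration under assumptions: `FrameBatcher`, `infer_batch`, `max_batch_size` and `max_wait_s` are hypothetical names I made up, and `infer_batch` stands in for whatever call runs the real network on a list of frames and returns one result per frame, in the same order.

```python
import queue
import threading
import time
from concurrent.futures import Future


class FrameBatcher:
    """Collects frames from many threads and runs them through the model in batches.

    Hypothetical helper: `infer_batch` is assumed to take a list of frames and
    return a list with exactly one result per frame, in the same order.
    """

    def __init__(self, infer_batch, max_batch_size=6, max_wait_s=0.5):
        self._infer_batch = infer_batch
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, frame):
        """Called by each video thread; returns a Future holding that frame's result."""
        future = Future()
        self._queue.put((frame, future))
        return future

    def close(self):
        """Ask the worker to stop after the frames already queued, then wait for it."""
        self._queue.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._queue.get()  # block until the first frame arrives
            if item is None:
                return
            batch = [item]
            deadline = time.monotonic() + self._max_wait_s
            stop = False
            # Step b / c: fill the batch until it is full or the deadline passes.
            while len(batch) < self._max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    item = self._queue.get(timeout=remaining)
                except queue.Empty:
                    break
                if item is None:
                    stop = True
                    break
                batch.append(item)
            frames = [frame for frame, _ in batch]
            futures = [future for _, future in batch]
            try:
                results = self._infer_batch(frames)
                # Step d: hand each result back to the thread that submitted the frame.
                for future, result in zip(futures, results):
                    future.set_result(result)
            except Exception as exc:
                for future in futures:
                    future.set_exception(exc)
            if stop:
                return


if __name__ == "__main__":
    # Stand-in for the real network: doubles each "frame" (here just an int).
    def fake_infer_batch(frames):
        return [frame * 2 for frame in frames]

    batcher = FrameBatcher(fake_infer_batch, max_batch_size=6, max_wait_s=0.5)
    futures = [batcher.submit(i) for i in range(6)]
    print([future.result() for future in futures])
    batcher.close()
```

Each video thread would call `submit(frame)` and then wait on `future.result()`, so only the single worker thread talks to the GPU. The 500 ms timeout is counted from the first frame of each batch, so it also bounds the extra latency added to any frame.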
What kind of solution is recommended? Thanks.