I’m a little confused about how we can calculate perplexity without training a model.
My understanding of the perplexity formula is that at every time step t, we compute the probability our trained model assigns to the correct output y_t. Given that, though, how would it be possible to know the perplexity without fully training the model?
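To make my understanding concrete, here is how I would compute it, assuming the usual definition PP = exp(-(1/T) Σ_t log p(y_t | y_<t)); the function and variable names below are just illustrative:

```python
import math

def perplexity(token_probs):
    # Perplexity is the exponential of the average negative
    # log-probability: PP = exp(-(1/T) * sum_t log p(y_t | y_<t))
    T = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / T)

# Any model that assigns probabilities works here, trained or not.
# For example, an untrained model that is uniform over a vocabulary
# of size V assigns p = 1/V to every token, so its perplexity is V.
V = 10000
uniform_probs = [1.0 / V] * 5
print(perplexity(uniform_probs))  # ~10000, up to float rounding
```

If this is right, then perplexity only needs *some* probabilities over the tokens, which is why I am unsure where the trained model comes in.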