artificial intelligence - Exact Hidden Markov Model training algorithm


In most cases, the Baum-Welch algorithm is used to train a Hidden Markov Model.

In many papers, however, it is argued that the BW algorithm only optimizes until it gets stuck in a local optimum.
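For concreteness, here is a minimal sketch of Baum-Welch for a discrete-output HMM (my own illustration, not from the question; the observation sequence and hyperparameters are placeholders), run from several random initializations:

```python
# Minimal Baum-Welch for a discrete-output HMM, run from several random
# seeds to show the fitted log-likelihood depending on the initialization.
import numpy as np

def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass; returns log-likelihood and posteriors."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N)); beta = np.zeros((T, N)); c = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta                 # P(state_t = i | obs)
    xi = np.zeros((T - 1, N, N))         # P(state_t = i, state_{t+1} = j | obs)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A \
            * (B[:, obs[t + 1]] * beta[t + 1])[None, :] / c[t + 1]
    return np.log(c).sum(), gamma, xi

def baum_welch(obs, n_states, n_symbols, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(n_states))
    A = rng.dirichlet(np.ones(n_states), size=n_states)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)
    for _ in range(n_iter):
        ll, gamma, xi = forward_backward(obs, pi, A, B)
        pi = gamma[0] / gamma[0].sum()
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return forward_backward(obs, pi, A, B)[0]

obs = np.array([0, 1, 0, 0, 2, 1, 0, 2, 2, 1, 0, 0, 1, 2, 0] * 10)
for seed in range(5):
    ll = baum_welch(obs, n_states=2, n_symbols=3, seed=seed)
    print(f"seed {seed}: final log-likelihood {ll:.4f}")
```

Different seeds can end at visibly different log-likelihoods, which is exactly the local-optimum behaviour at issue here.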

Does there exist an exact algorithm that succeeds in finding the global optimum (other than enumerating all possible models and evaluating them)?

Of course, for most applications BW will work fine. I am, however, interested in finding lower bounds on the amount of information loss when reducing the number of states, and therefore I need to generate the best model possible.

We are looking for an efficient NP-hard algorithm (one that only enumerates over a (potentially) exponential number of extreme points), not one that enumerates over a discretized set of floating-point values for each probability in the model.

A quick search finds, at http://www.cs.tau.ac.il/~rshamir/algmb/98/scribe/html/lec06/node6.html: "In this case, the problem of finding the optimal set of parameters $\theta^{\ast}$ is known to be NP-complete. The Baum-Welch algorithm [2], which is a special case of the EM technique (Expectation and Maximization), can be used for heuristically finding a solution to the problem." Therefore, an EM variant guaranteed to find the global optimum in polynomial time would prove P = NP; no such variant is known, and most likely none exists.

This problem is not convex, because there are in fact multiple globally optimal solutions with the same score. Given a proposed solution, which typically gives a probability distribution over observations given the underlying state, you can, for instance, rename hidden state 0 to hidden state 1 and vice versa, adjusting the probability distributions so that the observed behaviour generated by the two different solutions is identical. If there are n hidden states, there are at least n! local optima, produced by permuting the hidden states amongst themselves.
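To make the permutation symmetry concrete, here is a small self-contained check (my illustration; the 2-state parameters are arbitrary) that relabelling the hidden states leaves the likelihood unchanged:

```python
# Permuting the hidden states of an HMM leaves the likelihood unchanged,
# so any optimum comes with a family of equally-scoring relabelled copies.
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | pi, A, B)."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum()); alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum()); alpha /= alpha.sum()
    return ll

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.9, 0.1], [0.3, 0.7]])
obs = np.array([0, 1, 1, 0, 0, 1, 0])

perm = [1, 0]                                  # swap hidden states 0 and 1
pi_p, A_p, B_p = pi[perm], A[np.ix_(perm, perm)], B[perm]

print(log_likelihood(obs, pi, A, B))           # identical values: two distinct
print(log_likelihood(obs, pi_p, A_p, B_p))     # parameter points, same score
```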

On another application of EM, https://www.math.ias.edu/csdm/files/11-12/amoitra_disentangling_gaussians.pdf provides an algorithm for finding a globally optimal Gaussian mixture model. It observes that the EM algorithm is used for this problem, points out that it is not guaranteed to find the global optimum, and does not reference any related work on a version of EM that might be (it does say that the EM algorithm is slow).

If you are trying some sort of likelihood ratio test between e.g. a 4-state model and a 5-state model, it would be embarrassing if, due to local optima, the fitted 5-state model had a lower likelihood than the 4-state model. One way to avoid this, or to recover from it, is to start the 5-state EM from a starting point close to one of the best 4-state models found. For instance, create a 5th state reachable with probability epsilon, with an output distribution reflecting the average of the four 4-state output distributions, keeping the 4-state output distributions as the other four distributions in the new 5-state model, and multiplying in a factor of (1 - epsilon) somewhere so that everything still adds up to one; a sketch of this expansion follows.
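A hedged sketch of that warm start (the exact placement of the (1 - epsilon) factor is my choice, since the text above leaves it open, and expand_hmm is a name made up for illustration):

```python
# Warm-start a 5-state HMM from a fitted 4-state model: add one
# near-unreachable state whose output distribution is the average of the
# existing states' output distributions.
import numpy as np

def expand_hmm(pi, A, B, eps=1e-3):
    N = len(pi)
    pi5 = np.append(pi * (1 - eps), eps)      # tiny initial mass on state N
    A5 = np.zeros((N + 1, N + 1))
    A5[:N, :N] = A * (1 - eps)                # old transitions, rescaled
    A5[:N, N] = eps                           # every old state can reach it
    A5[N, :N] = 1.0 / N                       # leave the new state uniformly
    B5 = np.vstack([B, B.mean(axis=0)])       # averaged output distribution
    return pi5, A5, B5

# usage: feed (pi5, A5, B5) to Baum-Welch as the initial 5-state parameters
pi4 = np.array([0.25, 0.25, 0.25, 0.25])
A4 = np.full((4, 4), 0.25)
B4 = np.array([[0.8, 0.2], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]])
pi5, A5, B5 = expand_hmm(pi4, A4, B4)
assert np.allclose(pi5.sum(), 1) and np.allclose(A5.sum(axis=1), 1)
```

With a small epsilon the expanded model's likelihood starts essentially at the 4-state model's, so EM on the 5-state model should not end up below it.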

