[Neural Network] {Université de Sherbrooke} C3: Conditional Random Field

http://info.usherbrooke.ca/hlarochelle/neural_networks/content.html

 

 

 

These characteristics may come from a word (handwriting data).

 

sequence of observations => model the joint distribution of the labels over the whole sequence, p(y_1, ..., y_K | X)

 

 


 

linear chain CRF

 

usually => i.i.d. assumption (each example is treated independently)

but adjacent positions in a sequence depend on each other => linear chain CRF

 

 

 

first term: unary score, computed from x_k

second term: pairwise score between adjacent labels, from the V matrix
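
To make the two terms concrete, here is a minimal sketch of the unnormalized log-score of one label sequence. The names `unary`, `V` and `sequence_score` are mine, not from the lecture; `unary[k, c]` is assumed to hold the score that the network gives to label `c` at position `k`, and `V[i, j]` the score for label `i` being followed by label `j`.

```python
import numpy as np

def sequence_score(unary, V, y):
    """Unnormalized log-score of the label sequence y.

    unary: (K, C) array, unary[k, c] = score of label c at position k (from x_k)
    V:     (C, C) array, V[i, j] = score of label i followed by label j
    y:     length-K sequence of integer labels
    """
    y = np.asarray(y)
    first = unary[np.arange(len(y)), y].sum()  # sum_k unary[k, y_k]
    second = V[y[:-1], y[1:]].sum()            # sum_k V[y_k, y_{k+1}]
    return first + second

# Toy numbers, made up for illustration: K = 3 positions, C = 2 labels.
unary = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]])
V = np.array([[0.0, 1.0], [-1.0, 0.0]])
score = sequence_score(unary, V, [0, 1, 1])  # 3.5 from unary + 1.0 from V
```

Exponentiating this score and dividing by Z(X) would give p(y | X).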

 

 

 


context window

 

 

 

 

 three neural networks, giving the scores a(-1), a(0), a(+1) for the previous, current and next position in the window

 

 

 

 alternative: only one NN
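
A rough sketch of the context window idea, assuming each of the three "networks" is just a single linear layer and that positions outside the sequence are padded with zero vectors. Both are simplifications of mine, and the names `window_unary`, `W_prev`, `W_cur`, `W_next` are hypothetical.

```python
import numpy as np

def window_unary(X, W_prev, W_cur, W_next):
    """Unary scores for label y_k from x_{k-1}, x_k and x_{k+1}.

    X: (K, D) inputs; each W_*: (D, C) weights. Returns a (K, C) array.
    Positions outside the sequence are zero vectors (an assumption).
    """
    K, D = X.shape
    pad = np.zeros((1, D))
    X_prev = np.vstack([pad, X[:-1]])  # row k holds x_{k-1}
    X_next = np.vstack([X[1:], pad])   # row k holds x_{k+1}
    # Sum of the three scores a(-1) + a(0) + a(+1) at every position.
    return X_prev @ W_prev + X @ W_cur + X_next @ W_next

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))  # K = 4 positions, D = 3 features
W_prev, W_cur, W_next = (rng.normal(size=(3, 2)) for _ in range(3))
U = window_unary(X, W_prev, W_cur, W_next)  # shape (4, 2)
```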

 

 

 

 

 

 


 

 

computing the partition function

 

 

 y' ≠ y

 

y_k is the label sequence whose probability we evaluate

y'_k is the summation variable in Z(X); it ranges over all possible label sequences

 

the goal here is to calculate Z(X) in polynomial time (dynamic programming)

 

if someone gives me a value for y'_2, then we can calculate \alpha_1(y'_2) by summing over y'_1 only
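
A minimal sketch of this dynamic program in log space, checked against brute-force enumeration on a tiny example. The function names and the `unary` / `V` layout (`unary[k, c]` is the score of label `c` at position `k`, `V[i, j]` the score of label `i` followed by label `j`) are my own assumptions. Positions are 0-indexed in the code, so the first loop iteration computes \alpha_1.

```python
import itertools
import numpy as np

def log_partition(unary, V):
    """log Z(X) by the forward (alpha) recursion, O(K * C^2) time."""
    K, C = unary.shape
    log_alpha = np.zeros(C)  # nothing summed out yet
    for k in range(K - 1):
        # new log_alpha[j] = log sum_i exp(log_alpha[i] + unary[k, i] + V[i, j])
        log_alpha = np.logaddexp.reduce(
            (log_alpha + unary[k])[:, None] + V, axis=0)
    # Sum out the last label.
    return np.logaddexp.reduce(log_alpha + unary[K - 1])

def log_partition_brute(unary, V):
    """Same quantity by enumerating all C^K sequences (exponential time)."""
    K, C = unary.shape
    scores = []
    for y in itertools.product(range(C), repeat=K):
        s = sum(unary[k, y[k]] for k in range(K))
        s += sum(V[y[k], y[k + 1]] for k in range(K - 1))
        scores.append(s)
    return np.log(np.sum(np.exp(scores)))

rng = np.random.default_rng(0)
unary = rng.normal(size=(5, 3))  # K = 5 positions, C = 3 labels
V = rng.normal(size=(3, 3))
fast = log_partition(unary, V)
slow = log_partition_brute(unary, V)  # agrees with `fast` up to rounding
```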

 

 

 

 

 

 

 

 

https://www.spaces.ac.cn/archives/5542/comment-page-1 

 

 

 

 

 

 

advantage function????

 

a = max_n x_n

V(s) = max_a Q(a|s)

A(a|s) = Q(a|s) - V(s)
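
If the maximum `a` here is the one used in the log-sum-exp trick (my reading of these notes, not something they state), then subtracting it before exponentiating plays a role similar to subtracting V(s) from Q(a|s). A small sketch, with `log_sum_exp` a name of my own:

```python
import numpy as np

def log_sum_exp(x):
    """log(sum(exp(x))) computed stably by shifting with a = max(x)."""
    x = np.asarray(x, dtype=float)
    a = np.max(x)
    # Every x - a is <= 0, so exp never overflows.
    return a + np.log(np.sum(np.exp(x - a)))

x = np.array([1000.0, 1001.0, 1002.0])
stable = log_sum_exp(x)  # finite, whereas np.exp(1000.0) alone overflows
```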

 

 

 

 

 

 

 

 

 


 

 

computing marginals
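
A minimal sketch of computing the marginals p(y_k | X) with a forward (alpha) and a backward (beta) table, assuming the usual forward-backward recursion for a linear chain. The `unary` / `V` layout and the function name are my own choices: `unary[k, c]` is the score of label `c` at position `k`, `V[i, j]` the score of label `i` followed by label `j`.

```python
import numpy as np

def marginals(unary, V):
    """Return a (K, C) array P with P[k, c] = p(y_k = c | X)."""
    K, C = unary.shape
    log_alpha = np.zeros((K, C))  # everything before position k summed out
    log_beta = np.zeros((K, C))   # everything after position k summed out
    for k in range(1, K):
        # log_alpha[k, j] = log sum_i exp(log_alpha[k-1, i] + unary[k-1, i] + V[i, j])
        log_alpha[k] = np.logaddexp.reduce(
            (log_alpha[k - 1] + unary[k - 1])[:, None] + V, axis=0)
    for k in range(K - 2, -1, -1):
        # log_beta[k, i] = log sum_j exp(V[i, j] + unary[k+1, j] + log_beta[k+1, j])
        log_beta[k] = np.logaddexp.reduce(
            V + (unary[k + 1] + log_beta[k + 1])[None, :], axis=1)
    log_p = log_alpha + unary + log_beta
    # Normalizing each row is the same as dividing by Z(X).
    log_p -= np.logaddexp.reduce(log_p, axis=1, keepdims=True)
    return np.exp(log_p)

rng = np.random.default_rng(0)
P = marginals(rng.normal(size=(4, 3)), rng.normal(size=(3, 3)))
```

Each row of `P` sums to 1, since it is a distribution over the labels at one position.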

 

 

 

 


 

 

performing classification
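
One common way to classify a whole sequence is to take the single most probable label sequence with the Viterbi algorithm, which is the partition-function recursion with sum replaced by max. A minimal sketch under that assumption, where the `unary` / `V` layout and the function name are mine:

```python
import numpy as np

def viterbi(unary, V):
    """Most probable label sequence and its unnormalized log-score.

    unary: (K, C) array, unary[k, c] = score of label c at position k
    V:     (C, C) array, V[i, j] = score of label i followed by label j
    """
    K, C = unary.shape
    best = unary[0].copy()              # best[j]: best prefix score ending in label j
    back = np.zeros((K, C), dtype=int)  # back[k, j]: best previous label
    for k in range(1, K):
        cand = best[:, None] + V        # cand[i, j]: come from i, move to j
        back[k] = cand.argmax(axis=0)
        best = cand.max(axis=0) + unary[k]
    y = [int(best.argmax())]
    for k in range(K - 1, 0, -1):       # follow the back-pointers
        y.append(int(back[k, y[-1]]))
    return y[::-1], float(best.max())

# Toy numbers, made up for illustration.
unary = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]])
V = np.array([[0.0, 1.0], [-1.0, 0.0]])
labels, score = viterbi(unary, V)  # labels == [0, 1, 1], score == 4.5
```

Taking the argmax of each marginal p(y_k | X) separately is another option; it can give a different answer, because it optimizes each position on its own instead of the sequence as a whole.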

 

 

 

 

 

 

 

 

 

 


 

 

factors, sufficient statistics and linear CRF

 

 

 

 

 

 

 

 


 

 

 

Markov network

 

 

 

 


 

 

factor graph

 

 

A factor graph is another visualization; it gets rid of the ambiguity of a Markov network about which factors are present.

 

 

 

 


 

 

belief propagation

 

 

 

 

 

 

 

 

 

posted @ 2019-05-18 09:04  ecoflex