Chain

LinearChainCRF

class supar.structs.chain.LinearChainCRF(scores: torch.Tensor, trans: Optional[torch.Tensor] = None, lens: Optional[torch.LongTensor] = None)

Linear-chain CRFs (Lafferty et al., 2001).

Parameters
  • scores (Tensor) – [batch_size, seq_len, n_tags]. Log potentials.

  • trans (Tensor) – [n_tags+1, n_tags+1]. Transition scores. trans[-1, :-1] holds the transition scores from the start position, and trans[:-1, -1] holds the transition scores to the end position. Default: None.

  • lens (LongTensor) – [batch_size]. Sentence lengths for masking. Default: None.
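The layout of trans can be illustrated with a small plain-Python sketch that scores a single tag sequence. It assumes trans[i][j] scores a move from tag i to tag j, which is what the start/end convention above suggests, and it ignores batching and lens masking. The helper name sequence_score is hypothetical and not part of supar.

```python
def sequence_score(scores, trans, tags):
    """Sum emission and transition scores for one unpadded sentence.

    scores: seq_len x n_tags nested list of log potentials
    trans:  (n_tags+1) x (n_tags+1) nested list; the last row holds start
            transitions and the last column holds end transitions
    tags:   list of tag indices, one per position
    """
    total = trans[-1][tags[0]]                # start -> first tag
    for i, tag in enumerate(tags):
        total += scores[i][tag]               # log potential at position i
        if i > 0:
            total += trans[tags[i - 1]][tag]  # previous tag -> current tag
    total += trans[tags[-1]][-1]              # last tag -> end
    return total


# Toy instance: 3 positions, 2 tags (values chosen by hand for illustration)
scores = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
trans = [[0.1, 0.2, 0.3],   # from tag 0 to: tag 0, tag 1, end
         [0.4, 0.5, 0.6],   # from tag 1 to: tag 0, tag 1, end
         [0.7, 0.8, 0.0]]   # from start to: tag 0, tag 1 (corner not used here)
path_score = sequence_score(scores, trans, [0, 1, 1])
```

The extra row and column are what make trans one larger than n_tags in each dimension.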

Examples

>>> import torch
>>> from supar import LinearChainCRF
>>> batch_size, seq_len, n_tags = 2, 5, 4
>>> lens = torch.tensor([3, 4])
>>> value = torch.randint(n_tags, (batch_size, seq_len))
>>> s1 = LinearChainCRF(torch.randn(batch_size, seq_len, n_tags),
...                     torch.randn(n_tags+1, n_tags+1),
...                     lens)
>>> s2 = LinearChainCRF(torch.randn(batch_size, seq_len, n_tags),
...                     torch.randn(n_tags+1, n_tags+1),
...                     lens)
>>> s1.max
tensor([4.4120, 8.9672], grad_fn=<MaxBackward0>)
>>> s1.argmax
tensor([[2, 0, 3, 0, 0],
        [3, 3, 3, 2, 0]])
>>> s1.log_partition
tensor([ 6.3486, 10.9106], grad_fn=<LogsumexpBackward>)
>>> s1.log_prob(value)
tensor([ -8.1515, -10.5572], grad_fn=<SubBackward0>)
>>> s1.entropy
tensor([3.4150, 3.6549], grad_fn=<SelectBackward>)
>>> s1.kl(s2)
tensor([4.0333, 4.3807], grad_fn=<SelectBackward>)
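For intuition about what log_partition and log_prob return, the following is a minimal plain-Python sketch of the forward algorithm for one unpadded sentence, checked against brute-force enumeration. It is not supar's implementation: the function names are hypothetical, batching and lens masking are left out, and trans[i][j] is assumed to score a move from tag i to tag j with the last row/column reserved for start/end.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(scores, trans, tags):
    """Score of one tag path: start + log potentials + transitions + end."""
    total = trans[-1][tags[0]] + scores[0][tags[0]]
    for i in range(1, len(tags)):
        total += trans[tags[i - 1]][tags[i]] + scores[i][tags[i]]
    return total + trans[tags[-1]][-1]


def log_partition(scores, trans):
    """Forward algorithm: log of the summed exp-scores of all tag paths."""
    n_tags = len(scores[0])
    # alpha[t]: log-sum over all prefixes that end in tag t
    alpha = [trans[-1][t] + scores[0][t] for t in range(n_tags)]
    for row in scores[1:]:
        alpha = [logsumexp([alpha[p] + trans[p][t] for p in range(n_tags)]) + row[t]
                 for t in range(n_tags)]
    return logsumexp([alpha[t] + trans[t][-1] for t in range(n_tags)])


# Toy instance: 3 positions, 2 tags (values chosen by hand for illustration)
scores = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
trans = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.0]]

log_z = log_partition(scores, trans)
# Brute-force check: enumerate all n_tags ** seq_len paths
brute = logsumexp([sequence_score(scores, trans, p)
                   for p in product(range(2), repeat=3)])
# log p(y) = score(y) - log Z
log_prob = sequence_score(scores, trans, [0, 1, 1]) - log_z
```

The dynamic program costs O(seq_len * n_tags^2), whereas the enumeration grows exponentially with the sentence length and is only useful as a check on tiny inputs.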

property argmax

Computes \(\arg\max_y p(y)\) of the distribution \(p(y)\).
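As a rough illustration of what such an argmax involves, here is a minimal plain-Python Viterbi sketch for one unpadded sentence. It is a sketch under assumptions, not supar's code: the function name viterbi is hypothetical, batching and lens masking are omitted, and trans[i][j] is assumed to score a move from tag i to tag j with the last row/column reserved for start/end.

```python
def viterbi(scores, trans):
    """Return (best_score, best_tags) for one unpadded sentence."""
    n_tags = len(scores[0])
    # best[t]: score of the best prefix that ends in tag t
    best = [trans[-1][t] + scores[0][t] for t in range(n_tags)]
    backptrs = []
    for row in scores[1:]:
        step, ptrs = [], []
        for t in range(n_tags):
            # best previous tag for reaching tag t at this position
            prev = max(range(n_tags), key=lambda p: best[p] + trans[p][t])
            ptrs.append(prev)
            step.append(best[prev] + trans[prev][t] + row[t])
        best = step
        backptrs.append(ptrs)
    # close the path with the transition to the end position
    last = max(range(n_tags), key=lambda t: best[t] + trans[t][-1])
    best_score = best[last] + trans[last][-1]
    # follow back-pointers from the last position to the first
    tags = [last]
    for ptrs in reversed(backptrs):
        tags.append(ptrs[tags[-1]])
    tags.reverse()
    return best_score, tags


# Toy instance: 3 positions, 2 tags (values chosen by hand for illustration)
scores = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
trans = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.0]]
best_score, best_tags = viterbi(scores, trans)
```

Because normalization does not change the ranking of paths, maximizing the unnormalized path score yields the same sequence as maximizing \(p(y)\).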

topk(k: int) → torch.LongTensor

Computes the k-argmax of the distribution \(p(y)\), i.e. the k highest-scoring tag sequences.
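To make the notion of a k-argmax concrete, the sketch below ranks every tag sequence of a tiny sentence by brute force and keeps the k best. This is deliberately not an efficient k-best algorithm and says nothing about how supar computes topk or how the returned tensor is shaped; the names sequence_score and topk_paths are hypothetical, and trans[i][j] is assumed to score a move from tag i to tag j with the last row/column reserved for start/end.

```python
import heapq
from itertools import product


def sequence_score(scores, trans, tags):
    """Score of one tag path: start + log potentials + transitions + end."""
    total = trans[-1][tags[0]] + scores[0][tags[0]]
    for i in range(1, len(tags)):
        total += trans[tags[i - 1]][tags[i]] + scores[i][tags[i]]
    return total + trans[tags[-1]][-1]


def topk_paths(scores, trans, k):
    """Return the k highest-scoring tag paths, best first (exhaustive search)."""
    n_tags = len(scores[0])
    paths = product(range(n_tags), repeat=len(scores))
    return heapq.nlargest(k, paths, key=lambda p: sequence_score(scores, trans, p))


# Toy instance: 3 positions, 2 tags (values chosen by hand for illustration)
scores = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
trans = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.0]]
top2 = topk_paths(scores, trans, 2)
```

The exhaustive search visits n_tags ** seq_len paths, so it only serves as a reference for checking a dynamic-programming k-best routine on small inputs.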