Tree¶
MatrixTree¶
- class supar.structs.tree.MatrixTree(scores, lens=None, multiroot=False)[source]¶
MatrixTree for calculating partitions and marginals of non-projective dependency trees in \(O(n^3)\) by an adaptation of Kirchhoff’s MatrixTree Theorem [Koo et al. 2007].
- Parameters
Examples
>>> from supar import MatrixTree >>> batch_size, seq_len = 2, 5 >>> lens = torch.tensor([3, 4]) >>> arcs = torch.tensor([[0, 2, 0, 4, 2], [0, 3, 1, 0, 3]]) >>> s1 = MatrixTree(torch.randn(batch_size, seq_len, seq_len), lens) >>> s2 = MatrixTree(torch.randn(batch_size, seq_len, seq_len), lens) >>> s1.max tensor([0.7174, 3.7910], grad_fn=<SumBackward1>) >>> s1.argmax tensor([[0, 0, 1, 1, 0], [0, 4, 1, 0, 3]]) >>> s1.log_partition tensor([2.0229, 6.0558], grad_fn=<CopyBackwards>) >>> s1.log_prob(arcs) tensor([-3.2209, -2.5756], grad_fn=<SubBackward0>) >>> s1.entropy tensor([1.9711, 3.4497], grad_fn=<SubBackward0>) >>> s1.kl(s2) tensor([1.3354, 2.6914], grad_fn=<AddBackward0>)
- property max¶
Computes the max score of the distribution \(p(y)\).
- property argmax¶
Computes \(\arg\max_y p(y)\) of the distribution \(p(y)\).
- sample()[source]¶
Obtains a structured sample from the distribution \(y \sim p(y)\). TODO: multi-sampling.
- property entropy¶
Computes entropy \(H[p]\) of the distribution \(p(y)\).
- cross_entropy(other)[source]¶
Computes cross-entropy \(H[p,q]\) of self and another distribution.
- Parameters
other (StructuredDistribution) – Comparison distribution.
- kl(other)[source]¶
Computes KL-divergence \(KL[p \parallel q]=H[p,q]-H[p]\) of self and another distribution.
- Parameters
other (StructuredDistribution) – Comparison distribution.
DependencyCRF¶
- class supar.structs.tree.DependencyCRF(scores, lens=None, multiroot=False)[source]¶
First-order TreeCRF for projective dependency trees [Eisner 2000, Zhang et al. 2020a].
- Parameters
Examples
>>> from supar import DependencyCRF >>> batch_size, seq_len = 2, 5 >>> lens = torch.tensor([3, 4]) >>> arcs = torch.tensor([[0, 2, 0, 4, 2], [0, 3, 1, 0, 3]]) >>> s1 = DependencyCRF(torch.randn(batch_size, seq_len, seq_len), lens) >>> s2 = DependencyCRF(torch.randn(batch_size, seq_len, seq_len), lens) >>> s1.max tensor([3.6346, 1.7194], grad_fn=<IndexBackward>) >>> s1.argmax tensor([[0, 2, 3, 0, 0], [0, 0, 3, 1, 1]]) >>> s1.log_partition tensor([4.1007, 3.3383], grad_fn=<IndexBackward>) >>> s1.log_prob(arcs) tensor([-1.3866, -5.5352], grad_fn=<SubBackward0>) >>> s1.entropy tensor([0.9979, 2.6056], grad_fn=<IndexBackward>) >>> s1.kl(s2) tensor([1.6631, 2.6558], grad_fn=<IndexBackward>)
- property argmax¶
Computes \(\arg\max_y p(y)\) of the distribution \(p(y)\).
Dependency2oCRF¶
- class supar.structs.tree.Dependency2oCRF(scores, lens=None, multiroot=False)[source]¶
Second-order TreeCRF for projective dependency trees [McDonald & Pereira 2006, Zhang et al. 2020a].
- Parameters
Examples
>>> from supar import Dependency2oCRF >>> batch_size, seq_len = 2, 5 >>> lens = torch.tensor([3, 4]) >>> arcs = torch.tensor([[0, 2, 0, 4, 2], [0, 3, 1, 0, 3]]) >>> sibs = torch.tensor([CoNLL.get_sibs(i) for i in arcs[:, 1:].tolist()]) >>> s1 = Dependency2oCRF((torch.randn(batch_size, seq_len, seq_len), torch.randn(batch_size, seq_len, seq_len, seq_len)), lens) >>> s2 = Dependency2oCRF((torch.randn(batch_size, seq_len, seq_len), torch.randn(batch_size, seq_len, seq_len, seq_len)), lens) >>> s1.max tensor([0.7574, 3.3634], grad_fn=<IndexBackward>) >>> s1.argmax tensor([[0, 3, 3, 0, 0], [0, 4, 4, 4, 0]]) >>> s1.log_partition tensor([1.9906, 4.3599], grad_fn=<IndexBackward>) >>> s1.log_prob((arcs, sibs)) tensor([-0.6975, -6.2845], grad_fn=<SubBackward0>) >>> s1.entropy tensor([1.6436, 2.1717], grad_fn=<IndexBackward>) >>> s1.kl(s2) tensor([0.4929, 2.0759], grad_fn=<IndexBackward>)
- property argmax¶
Computes \(\arg\max_y p(y)\) of the distribution \(p(y)\).
ConstituencyCRF¶
- class supar.structs.tree.ConstituencyCRF(scores, lens=None)[source]¶
Constituency TreeCRF [Stern et al. 2017, Zhang et al. 2020b].
- Parameters
scores (Tensor) –
[batch_size, seq_len, seq_len]. Scores of all constituents.lens (LongTensor) –
[batch_size]. Sentence lengths for masking.
Examples
>>> from supar import ConstituencyCRF >>> batch_size, seq_len = 2, 5 >>> lens = torch.tensor([3, 4]) >>> charts = torch.tensor([[[0, 1, 0, 1, 0], [0, 0, 1, 1, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]], [[0, 1, 1, 0, 1], [0, 0, 1, 0, 0], [0, 0, 0, 1, 1], [0, 0, 0, 0, 1], [0, 0, 0, 0, 0]]]).bool() >>> s1 = ConstituencyCRF(torch.randn(batch_size, seq_len, seq_len), lens) >>> s2 = ConstituencyCRF(torch.randn(batch_size, seq_len, seq_len), lens) >>> s1.max tensor([ 2.5068, -0.5628], grad_fn=<IndexBackward>) >>> s1.argmax [[[0, 3], [0, 1], [1, 3], [1, 2], [2, 3]], [[0, 4], [0, 2], [0, 1], [1, 2], [2, 4], [2, 3], [3, 4]]] >>> s1.log_partition tensor([2.9235, 0.0154], grad_fn=<IndexBackward>) >>> s1.log_prob(charts) tensor([-0.4167, -0.5781], grad_fn=<SubBackward0>) >>> s1.entropy tensor([0.6415, 1.2026], grad_fn=<IndexBackward>) >>> s1.kl(s2) tensor([0.0362, 2.9017], grad_fn=<IndexBackward>)
- property argmax¶
Computes \(\arg\max_y p(y)\) of the distribution \(p(y)\).