Functions¶
Tarjan¶
- supar.structs.fn.tarjan(sequence)[source]¶
Tarjan algorithm for finding Strongly Connected Components (SCCs) of a graph.
- Parameters
sequence (list) – List of head indices.
- Yields
A list of indices making up a SCC. All self-loops are ignored.
Examples
>>> next(tarjan([2, 5, 0, 3, 1])) # (1 -> 5 -> 2 -> 1) is a cycle [2, 5, 1]
ChuLiuEdmods¶
- supar.structs.fn.chuliu_edmonds(s)[source]¶
ChuLiu/Edmonds algorithm for non-projective decoding [McDonald et al. 2005].
Some code is borrowed from tdozat’s implementation. Descriptions of notations and formulas can be found in [McDonald et al. 2005].
Notes
The algorithm does not guarantee to parse a single-root tree.
MST¶
- supar.structs.fn.mst(scores, mask, multiroot=False)[source]¶
MST algorithm for decoding non-projective trees. This is a wrapper for ChuLiu/Edmonds algorithm.
The algorithm first runs ChuLiu/Edmonds to parse a tree and then have a check of multi-roots, If
multiroot=Trueand there indeed exist multi-roots, the algorithm seeks to find best single-root trees by iterating all possible single-root trees parsed by ChuLiu/Edmonds. Otherwise the resulting trees are directly taken as the final outputs.- Parameters
scores (Tensor) –
[batch_size, seq_len, seq_len]. Scores of all dependent-head pairs.mask (BoolTensor) –
[batch_size, seq_len]. The mask to avoid parsing over padding tokens. The first column serving as pseudo words for roots should beFalse.multiroot (bool) – Ensures to parse a single-root tree If
False.
- Returns
A tensor with shape
[batch_size, seq_len]for the resulting non-projective parse trees.- Return type
Examples
>>> scores = torch.tensor([[[-11.9436, -13.1464, -6.4789, -13.8917], [-60.6957, -60.2866, -48.6457, -63.8125], [-38.1747, -49.9296, -45.2733, -49.5571], [-19.7504, -23.9066, -9.9139, -16.2088]]]) >>> scores[:, 0, 1:] = MIN >>> scores.diagonal(0, 1, 2)[1:].fill_(MIN) >>> mask = torch.tensor([[False, True, True, True]]) >>> mst(scores, mask) tensor([[0, 2, 0, 2]])