Functions

Tarjan

supar.structs.fn.tarjan(sequence)[source]

Tarjan algorithm for finding Strongly Connected Components (SCCs) of a graph.

Parameters

sequence (list) – List of head indices.

Yields

A list of indices making up a SCC. All self-loops are ignored.

Examples

>>> next(tarjan([2, 5, 0, 3, 1]))  # (1 -> 5 -> 2 -> 1) is a cycle
[2, 5, 1]

ChuLiuEdmods

supar.structs.fn.chuliu_edmonds(s)[source]

ChuLiu/Edmonds algorithm for non-projective decoding [McDonald et al. 2005].

Some code is borrowed from tdozat’s implementation. Descriptions of notations and formulas can be found in [McDonald et al. 2005].

Notes

The algorithm does not guarantee to parse a single-root tree.

Parameters

s (Tensor) – [seq_len, seq_len]. Scores of all dependent-head pairs.

Returns

A tensor with shape [seq_len] for the resulting non-projective parse tree.

Return type

Tensor

MST

supar.structs.fn.mst(scores, mask, multiroot=False)[source]

MST algorithm for decoding non-projective trees. This is a wrapper for ChuLiu/Edmonds algorithm.

The algorithm first runs ChuLiu/Edmonds to parse a tree and then have a check of multi-roots, If multiroot=True and there indeed exist multi-roots, the algorithm seeks to find best single-root trees by iterating all possible single-root trees parsed by ChuLiu/Edmonds. Otherwise the resulting trees are directly taken as the final outputs.

Parameters
  • scores (Tensor) – [batch_size, seq_len, seq_len]. Scores of all dependent-head pairs.

  • mask (BoolTensor) – [batch_size, seq_len]. The mask to avoid parsing over padding tokens. The first column serving as pseudo words for roots should be False.

  • multiroot (bool) – Ensures to parse a single-root tree If False.

Returns

A tensor with shape [batch_size, seq_len] for the resulting non-projective parse trees.

Return type

Tensor

Examples

>>> scores = torch.tensor([[[-11.9436, -13.1464,  -6.4789, -13.8917],
                            [-60.6957, -60.2866, -48.6457, -63.8125],
                            [-38.1747, -49.9296, -45.2733, -49.5571],
                            [-19.7504, -23.9066,  -9.9139, -16.2088]]])
>>> scores[:, 0, 1:] = MIN
>>> scores.diagonal(0, 1, 2)[1:].fill_(MIN)
>>> mask = torch.tensor([[False,  True,  True,  True]])
>>> mst(scores, mask)
tensor([[0, 2, 0, 2]])