Vocab
- class supar.utils.vocab.Vocab(counter, min_freq=1, specials=[], unk_index=0)
Defines a vocabulary object that will be used to numericalize a field.
- Parameters
counter (Counter) – Counter object holding the frequencies of each value found in the data.
min_freq (int) – The minimum frequency needed to include a token in the vocabulary. Default: 1.
specials (list[str]) – The list of special tokens (e.g., pad, unk, bos and eos) that will be prepended to the vocabulary. Default: [].
unk_index (int) – The index of the unk token. Default: 0.
- itos
A list of token strings indexed by their numerical identifiers.
- stoi
A defaultdict object mapping token strings to numerical identifiers.
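
The relationship between the parameters and the two attributes can be illustrated with a small stand-in class. This is a minimal sketch, not supar's implementation: `MiniVocab` is a hypothetical name, the sorted token order is an assumption made only to keep the result deterministic, and the fallback of unknown strings to `unk_index` is inferred from the `defaultdict` description above.

```python
from collections import Counter, defaultdict


class MiniVocab:
    """Hypothetical stand-in for the documented behaviour; not supar's code."""

    def __init__(self, counter, min_freq=1, specials=(), unk_index=0):
        # Special tokens are prepended, so they take the lowest indices.
        self.itos = list(specials)
        # Keep tokens whose frequency reaches min_freq.
        # Sorting is an assumption of this sketch, used for a stable order.
        self.itos += sorted(
            token for token, freq in counter.items()
            if freq >= min_freq and token not in self.itos
        )
        # Strings missing from the vocabulary fall back to unk_index.
        self.stoi = defaultdict(lambda: unk_index)
        self.stoi.update({token: i for i, token in enumerate(self.itos)})


# "the" occurs three times; every other word occurs once.
counter = Counter("the cat sat on the mat the end".split())
vocab = MiniVocab(counter, min_freq=2, specials=["<pad>", "<unk>"], unk_index=1)

# Numericalize a short sequence: "cat" is below min_freq, so it maps to unk_index.
ids = [vocab.stoi[token] for token in ["the", "cat"]]
print(vocab.itos)  # ['<pad>', '<unk>', 'the']
print(ids)         # [2, 1]
```

Note that `unk_index` should match the position of the unk token within `specials`; in the sketch, `"<unk>"` sits at index 1, so `unk_index=1` is passed.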