pytext.models package

Subpackages

Submodules

pytext.models.crf module

class pytext.models.crf.CRF(num_tags: int)[source]

Bases: torch.nn.modules.module.Module

Compute the log-likelihood of the input assuming a conditional random field model.

Parameters:num_tags – The number of tags
decode(emissions: torch.FloatTensor, seq_lens: torch.LongTensor) → torch.Tensor[source]

Given a set of emission probabilities, return the predicted tags.

Parameters:
  • emissions – Emission probabilities with expected shape of batch_size * seq_len * num_labels
  • seq_lens – Length of each input.
export_to_caffe2(workspace, init_net, predict_net, logits_output_name)[source]

Exports the CRF layer to Caffe2 by manually adding the necessary operators to the init_net and predict_net.

Parameters:
  • init_net – caffe2 init net created by the current graph
  • predict_net – caffe2 net created by the current graph
  • workspace – caffe2 current workspace
  • output_names – current output names of the caffe2 net
  • py_model – original pytorch model object
Returns:

The updated predictions blob name

Return type:

string

forward(emissions: torch.FloatTensor, tags: torch.LongTensor, ignore_index=0, reduce: bool = True) → torch.autograd.variable.Variable[source]

Compute log-likelihood of input.

Parameters:
  • emissions – Emission values for different tags for each input. The expected shape is batch_size * seq_len * num_labels. Padding should be on the right side of the input.
  • tags – Actual tags for each token in the input. Expected shape is batch_size * seq_len
get_transitions()[source]
reset_parameters() → None[source]
set_transitions(transitions: torch.Tensor = None)[source]
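
A minimal usage sketch of the CRF layer (a hypothetical example, assuming pytext and torch are importable; the tensors are random placeholders that follow the documented batch_size * seq_len * num_labels layout):

import torch
from pytext.models.crf import CRF

num_tags, batch_size, seq_len = 5, 2, 7
crf = CRF(num_tags)

emissions = torch.randn(batch_size, seq_len, num_tags)    # emission scores per token
tags = torch.randint(0, num_tags, (batch_size, seq_len))  # gold tags, padded on the right
seq_lens = torch.tensor([seq_len, seq_len - 2])            # true length of each sequence

log_likelihood = crf(emissions, tags)        # forward() computes the log-likelihood
best_tags = crf.decode(emissions, seq_lens)  # predicted tag sequence for each input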

pytext.models.disjoint_multitask_model module

class pytext.models.disjoint_multitask_model.DisjointMultitaskModel(models)[source]

Bases: pytext.models.model.Model

Wrapper model to train multiple PyText models that share parameters. Designed to be used for multi-tasking when the tasks have disjoint datasets.

Modules which have the same shared_module_key and type share parameters. Only the first such module in each case needs to be configured in full.

Parameters:models (dict) – Dictionary of sub-task models.
current_model

type – Current model to route the input batch to.
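
A hypothetical construction sketch; doc_model and word_model stand in for already-built sub-task Model instances, and batch_context/batch_inputs are placeholders supplied by the PyText data iterator and trainer:

from pytext.models.disjoint_multitask_model import DisjointMultitaskModel

# Keys are illustrative sub-task names.
multitask_model = DisjointMultitaskModel({"doc_task": doc_model, "word_task": word_model})

# The trainer calls contextualize() before each batch so that current_model points
# at the sub-model owning that batch; forward/get_loss/get_pred are then routed to it.
multitask_model.contextualize(batch_context)
logits = multitask_model(*batch_inputs)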

Config

alias of pytext.config.component.ComponentMeta.__new__.<locals>.Config

contextualize(context)[source]

Add additional context into the model. context can be anything that helps maintain or update state. For example, it is used by DisjointMultitaskModel for changing the task that should be trained with a given iterator.

forward(*inputs) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_loss(logits, targets, context)[source]
get_pred(logits, targets, context, *args)[source]
load_state_dict(state_dict, strict=True)[source]

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Parameters:
  • state_dict (dict) – a dict containing parameters and persistent buffers.
  • strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True
save_modules(base_path, suffix='')[source]

Save each sub-module in a separate file so it can be reused later.

state_dict()[source]

Returns a dictionary containing a whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names.

Returns:a dictionary containing a whole state of the module
Return type:dict

Example:

>>> module.state_dict().keys()
['bias', 'weight']

pytext.models.distributed_model module

class pytext.models.distributed_model.DistributedModel(*args, **kwargs)[source]

Bases: torch.nn.parallel.distributed.DistributedDataParallel

Wrapper model class to train models in a distributed data parallel manner. The way to use this class to train your model in a distributed manner is:

distributed_model = DistributedModel(
    module=model,
    device_ids=[device_id0, device_id1],
    output_device=device_id0,
    broadcast_buffers=False,
)

where model is an instance of the actual model class you want to train in a distributed manner.

cpu()[source]

Moves all model parameters and buffers to the CPU.

Returns:self
Return type:Module
eval(stage=<Stage.TEST: 'Test'>)[source]

Override to set stage

train(mode=True)[source]

Override to set stage

pytext.models.doc_model module

class pytext.models.doc_model.DocModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]

Bases: pytext.models.model.Model

An n-ary document classification model. It can be used for all text classification scenarios. It supports PureDocAttention, BiLSTMDocAttention and DocNNRepresentation as ways to represent the document, followed by a multi-layer perceptron (MLPDecoder) that projects the document representation into the label/target space.

It can be instantiated just like any other Model.

Config[source]

alias of DocModel.Config

pytext.models.joint_model module

class pytext.models.joint_model.JointModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]

Bases: pytext.models.model.Model

A joint intent-slot model. It is framed as a model that performs document classification and word tagging tasks, with the embedding and text representation layers shared between the two tasks.

The supported representation layers are based on bidirectional LSTM or CNN.

It can be instantiated just like any other Model.

Config[source]

alias of JointModel.Config

classmethod from_config(model_config, feat_config, metadata: pytext.data.data_handler.CommonMetadata)[source]

pytext.models.model module

class pytext.models.model.Model(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]

Bases: torch.nn.modules.module.Module, pytext.config.component.Component

Generic single-task model class that expects four components:

  1. Embedding
  2. Representation
  3. Decoder
  4. Output Layer

Model also has a stage flag to indicate whether it is in the train, eval, or test stage. This is needed because the built-in train/eval flag in PyTorch cannot distinguish between eval and test, a distinction that some use cases require.

Forward pass: embedding -> representation -> decoder -> output_layer

These four components have specific responsibilities as described below.

Embedding layer should implement the way to represent each token in the input text. It can be as simple as just token/word embedding or can be composed of multiple ways to represent a token, e.g., word embedding, character embedding, etc.

Representation layer should implement the way to encode the entire input text such that the output vector(s) can be used by the decoder to produce logits. There is no restriction on the number of inputs it should encode, nor on the number of ways to encode the input.

Decoder layer should implement the way to consume the output of the model's representation and produce logits that can be used by the output layer to compute loss or generate predictions (and prediction scores/confidence).

Output layer should implement the way loss computation is done as well as the logic to generate predictions from the logits.

Let us discuss the joint intent-slot model as a case to go over these layers. The model predicts intent of input utterance and the slots in the utterance. (Refer to Train Intent-Slot model on ATIS Dataset for details about intent-slot model.)

  1. EmbeddingList layer is tasked with representing tokens. To do so we can use a learnable word embedding table in conjunction with a learnable character embedding table whose outputs are distilled to a token-level representation using a CNN and pooling. Note: This class is meant to be reused by all models. It acts as a container of all the different ways of representing a token/word.
  2. BiLSTMDocSlotAttention is tasked with encoding the embedded input string for intent classification and slot filling. In order to do that, it has a shared bidirectional LSTM layer followed by separate attention layers for document-level and word-level attention. Finally, it produces two vectors per utterance.
  3. IntentSlotModelDecoder accepts the two input vectors from BiLSTMDocSlotAttention and produces logits for intent classification and slot filling. Conditioned on a flag, it can also use the probabilities from intent classification for slot filling.
  4. IntentSlotOutputLayer implements the logic behind computing loss and prediction, as well as how to export this layer to Caffe2. This is used by the model exporter as a post-processing Caffe2 operator.
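
An illustrative sketch (not the actual Model.forward implementation) of how the four components are wired together in the forward pass; token_ids and seq_lens are hypothetical inputs, and the exact arguments each layer accepts vary by module:

def forward_sketch(model, token_ids, seq_lens):
    embedded = model.embedding(token_ids)                       # tokens -> embedding vectors
    representation = model.representation(embedded, seq_lens)   # encode the whole input
    logits = model.decoder(representation)                      # project into label/target space
    # model.output_layer consumes these logits later, via get_loss() / get_pred()
    return logits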
Parameters:
  • embedding (EmbeddingBase) – Embedding layer that represents each input token.
  • representation (RepresentationBase) – Representation layer that encodes the embedded input.
  • decoder (DecoderBase) – Decoder that projects the representation into logits.
  • output_layer (OutputLayerBase) – Output layer that computes loss and predictions from the logits.
embedding
representation
decoder
output_layer
Config[source]

alias of Model.Config

classmethod compose_embedding(sub_emb_module_dict: Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase]) → pytext.models.embeddings.embedding_list.EmbeddingList[source]

Default implementation is to compose an instance of EmbeddingList with all the sub-embedding modules. You should override this class method if you want to implement a specific way to embed tokens/words.

Parameters:sub_emb_module_dict (Dict[str, EmbeddingBase]) – Named dictionary of embedding modules, each of which implements a way to embed/encode a token.
Returns:An instance of EmbeddingList.
Return type:EmbeddingList
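
A hypothetical override sketch, for a model that only wants the word embedding instead of concatenating every sub-embedding; the "word_feat" key and the EmbeddingList arguments here are illustrative assumptions:

from pytext.models.embeddings.embedding_list import EmbeddingList
from pytext.models.model import Model


class WordOnlyModel(Model):
    @classmethod
    def compose_embedding(cls, sub_emb_module_dict):
        # keep a single sub-embedding ("word_feat" is a hypothetical key)
        return EmbeddingList([sub_emb_module_dict["word_feat"]], concat=True)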
contextualize(context)[source]

Add additional context into the model. context can be anything that helps maintain or update state. For example, it is used by DisjointMultitaskModel for changing the task that should be trained with a given iterator.

classmethod create_embedding(feat_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata)[source]
classmethod create_sub_embs(emb_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata) → Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase][source]

Creates the embedding modules defined in the emb_config.

Parameters:
  • emb_config (FeatureConfig) – Object containing all the sub-embedding configurations.
  • metadata (CommonMetadata) – Object containing features and label metadata.
Returns:

Named dictionary of embedding modules.

Return type:

Dict[str, EmbeddingBase]

eval(stage=<Stage.TEST: 'Test'>)[source]

Override to explicitly maintain the stage (train, eval, test).

forward(*inputs) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.model.Model.Config, feat_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata)[source]
get_loss(logit, target, context)[source]
get_param_groups_for_optimizer() → List[Dict[str, List[torch.nn.parameter.Parameter]]][source]

Returns a list of parameter groups of the format {“params”: param_list}. The parameter groups loosely correspond to layers and are ordered from low to high. Currently, only the embedding layer can provide multiple param groups, and other layers are put into one param group. The output of this method is passed to the optimizer so that schedulers can change learning rates by layer.
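
A short sketch of how the returned groups are typically consumed; model is a placeholder for an already-constructed Model, and the optimizer type and learning rate are arbitrary choices for illustration:

import torch

param_groups = model.get_param_groups_for_optimizer()  # [{"params": [...]}, ...]
optimizer = torch.optim.Adam(param_groups, lr=1e-3)    # schedulers can then adjust
                                                        # learning rates per group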

get_pred(logit, target=None, context=None, *args)[source]
prepare_for_onnx_export_(**kwargs)[source]

Make model exportable via ONNX trace.

save_modules(base_path: str = '', suffix: str = '')[source]

Save each sub-module in a separate file so it can be reused later.

train(mode=True)[source]

Override to explicitly maintain the stage (train, eval, test).

pytext.models.module module

class pytext.models.module.Module(config=None)[source]

Bases: torch.nn.modules.module.Module, pytext.config.component.Component

Generic module class that serves as base class for all PyText modules.

Parameters:config (type) – Module’s config object. The specific contents of this object depend on the module. Defaults to None.
Config

alias of pytext.config.module_config.ModuleConfig

freeze() → None[source]
pytext.models.module.create_module(module_config, *args, create_fn=<function _create_module_from_registry>, **kwargs)[source]

Create a module object given the module’s config object. This relies on the global shared module registry, so your module must be registered: it has to be imported somewhere in the code path executed during module creation (ideally in your model class) for the module to be visible to the registry.

Parameters:
  • module_config (type) – Module config object.
  • create_fn (type) – The function to use for creating the module. Use this parameter to pass your own function if module creation requires custom code. Defaults to _create_module_from_registry().
Returns:

The created module object.

Return type:

Module
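
A hypothetical sketch of defining and creating a registered module; the class name, Config field, constructor signature, and the extra positional argument are illustrative assumptions, not part of the PyText API:

import torch.nn as nn
from pytext.config.module_config import ModuleConfig
from pytext.models.module import Module, create_module


class ProjectionModule(Module):
    class Config(ModuleConfig):
        out_dim: int = 16

    def __init__(self, config: Config, in_dim: int) -> None:
        super().__init__(config)
        self.linear = nn.Linear(in_dim, config.out_dim)

    def forward(self, x):
        return self.linear(x)


# create_module() resolves ProjectionModule from its Config via the shared registry;
# extra *args/**kwargs (here the hypothetical in_dim) are forwarded to the creation function.
projection = create_module(ProjectionModule.Config(), 32)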

pytext.models.pair_classification_model module

class pytext.models.pair_classification_model.PairClassificationModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]

Bases: pytext.models.model.Model

A classification model that scores a pair of texts, for example, a model for natural language inference.

The model shares embedding space (so it doesn’t support pairs of texts where left and right are in different languages). It uses a bidirectional LSTM or CNN to represent the two documents, and concatenates them along with their absolute difference and elementwise product. This concatenated pair representation is passed to a multi-layer perceptron that decodes it into the label/target space.

See https://arxiv.org/pdf/1705.02364.pdf for more details.

It can be instantiated just like any other Model.
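
A hedged sketch of the pair combination described above; this is a generic illustration of the documented concatenation, not the exact PyText implementation:

import torch

def combine_pair(left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
    # [left; right; |left - right|; left * right], decoded by the MLP afterwards
    return torch.cat([left, right, (left - right).abs(), left * right], dim=-1)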

Config[source]

alias of PairClassificationModel.Config

classmethod compose_embedding(sub_embs)[source]

Default implementation is to compose an instance of EmbeddingList with all the sub-embedding modules. You should override this class method if you want to implement a specific way to embed tokens/words.

Parameters:sub_emb_module_dict (Dict[str, EmbeddingBase]) – Named dictionary of embedding modules, each of which implements a way to embed/encode a token.
Returns:An instance of EmbeddingList.
Return type:EmbeddingList
save_modules(base_path: str = '', suffix: str = '')[source]

Save each sub-module in a separate file so it can be reused later.

pytext.models.word_model module

class pytext.models.word_model.WordTaggingModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]

Bases: pytext.models.model.Model

Word tagging model. It can be used for any task that requires predicting the tag for a word/token. For example, the following tasks (not an exhaustive list) can be modeled as word tagging tasks:

  1. Part-of-speech tagging.
  2. Named entity recognition.
  3. Slot filling for task-oriented dialog.

It can be instantiated just like any other Model.

Config[source]

alias of WordTaggingModel.Config

Module contents

class pytext.models.Model(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]

Bases: torch.nn.modules.module.Module, pytext.config.component.Component

Generic single-task model class that expects four components:

  1. Embedding
  2. Representation
  3. Decoder
  4. Output Layer

Model also has a stage flag to indicate whether it is in the train, eval, or test stage. This is needed because the built-in train/eval flag in PyTorch cannot distinguish between eval and test, a distinction that some use cases require.

Forward pass: embedding -> representation -> decoder -> output_layer

These four components have specific responsibilities as described below.

Embedding layer should implement the way to represent each token in the input text. It can be as simple as just token/word embedding or can be composed of multiple ways to represent a token, e.g., word embedding, character embedding, etc.

Representation layer should implement the way to encode the entire input text such that the output vector(s) can be used by the decoder to produce logits. There is no restriction on the number of inputs it should encode, nor on the number of ways to encode the input.

Decoder layer should implement the way to consume the output of the model's representation and produce logits that can be used by the output layer to compute loss or generate predictions (and prediction scores/confidence).

Output layer should implement the way loss computation is done as well as the logic to generate predictions from the logits.

Let us discuss the joint intent-slot model as a case to go over these layers. The model predicts intent of input utterance and the slots in the utterance. (Refer to Train Intent-Slot model on ATIS Dataset for details about intent-slot model.)

  1. EmbeddingList layer is tasked with representing tokens. To do so we can use a learnable word embedding table in conjunction with a learnable character embedding table whose outputs are distilled to a token-level representation using a CNN and pooling. Note: This class is meant to be reused by all models. It acts as a container of all the different ways of representing a token/word.
  2. BiLSTMDocSlotAttention is tasked with encoding the embedded input string for intent classification and slot filling. In order to do that, it has a shared bidirectional LSTM layer followed by separate attention layers for document-level and word-level attention. Finally, it produces two vectors per utterance.
  3. IntentSlotModelDecoder accepts the two input vectors from BiLSTMDocSlotAttention and produces logits for intent classification and slot filling. Conditioned on a flag, it can also use the probabilities from intent classification for slot filling.
  4. IntentSlotOutputLayer implements the logic behind computing loss and prediction, as well as how to export this layer to Caffe2. This is used by the model exporter as a post-processing Caffe2 operator.
Parameters:
  • embedding (EmbeddingBase) – Embedding layer that represents each input token.
  • representation (RepresentationBase) – Representation layer that encodes the embedded input.
  • decoder (DecoderBase) – Decoder that projects the representation into logits.
  • output_layer (OutputLayerBase) – Output layer that computes loss and predictions from the logits.
embedding
representation
decoder
output_layer
Config[source]

alias of Model.Config

classmethod compose_embedding(sub_emb_module_dict: Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase]) → pytext.models.embeddings.embedding_list.EmbeddingList[source]

Default implementation is to compose an instance of EmbeddingList with all the sub-embedding modules. You should override this class method if you want to implement a specific way to embed tokens/words.

Parameters:sub_emb_module_dict (Dict[str, EmbeddingBase]) – Named dictionary of embedding modules, each of which implements a way to embed/encode a token.
Returns:An instance of EmbeddingList.
Return type:EmbeddingList
contextualize(context)[source]

Add additional context into the model. context can be anything that helps maintain or update state. For example, it is used by DisjointMultitaskModel for changing the task that should be trained with a given iterator.

classmethod create_embedding(feat_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata)[source]
classmethod create_sub_embs(emb_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata) → Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase][source]

Creates the embedding modules defined in the emb_config.

Parameters:
  • emb_config (FeatureConfig) – Object containing all the sub-embedding configurations.
  • metadata (CommonMetadata) – Object containing features and label metadata.
Returns:

Named dictionary of embedding modules.

Return type:

Dict[str, EmbeddingBase]

eval(stage=<Stage.TEST: 'Test'>)[source]

Override to explicitly maintain the stage (train, eval, test).

forward(*inputs) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.model.Model.Config, feat_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata)[source]
get_loss(logit, target, context)[source]
get_param_groups_for_optimizer() → List[Dict[str, List[torch.nn.parameter.Parameter]]][source]

Returns a list of parameter groups of the format {“params”: param_list}. The parameter groups loosely correspond to layers and are ordered from low to high. Currently, only the embedding layer can provide multiple param groups, and other layers are put into one param group. The output of this method is passed to the optimizer so that schedulers can change learning rates by layer.

get_pred(logit, target=None, context=None, *args)[source]
prepare_for_onnx_export_(**kwargs)[source]

Make model exportable via ONNX trace.

save_modules(base_path: str = '', suffix: str = '')[source]

Save each sub-module in a separate file so it can be reused later.

train(mode=True)[source]

Override to explicitly maintain the stage (train, eval, test).