com.microsoft - Sampling

Sampling - 1

Version

  • name: Sampling

  • domain: com.microsoft

  • since_version: 1

  • function:

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

  • custom - INT : If set to 1, custom sampling logic is used.

  • decoder - GRAPH (required) : Decoder subgraph to execute in a loop.

  • decoder_start_token_id - INT : The id of the token that indicates decoding starts.

  • encoder - GRAPH : The subgraph for initialization of encoder and decoder. It will be called once before decoder subgraph.

  • eos_token_id - INT (required) : The id of the end-of-sequence token.

  • filter_value - FLOAT : All filtered values will be set to this float value.

  • init_decoder - GRAPH : The subgraph for the first decoding run. It will be called once before the decoder subgraph. This is relevant only for the GPT-2 model. If this attribute is missing, the decoder subgraph will be used for all decoding runs.

  • min_tokens_to_keep - INT : Minimum number of tokens kept per batch example in the output.

  • model_type - INT : Model type: 0 for decoder-only models such as GPT-2; 1 for encoder-decoder models such as BART.

  • no_repeat_ngram_size - INT : Size of the n-grams that must not be repeated.

  • pad_token_id - INT (required) : The id of the padding token.

  • presence_penalty - FLOAT : Presence penalty for custom sampling.

  • temperature - FLOAT : The value used to modulate the next-token probabilities.

  • top_p - FLOAT : If set to a float < 1, only the smallest set of most probable tokens whose probabilities add up to top_p or higher is kept for generation.

  • vocab_size - INT : Size of the vocabulary. If not provided, it will be inferred from the decoder subgraph’s output shape.
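
To make the interplay of temperature, top_p, filter_value and min_tokens_to_keep concrete, here is a minimal pure-Python sketch of nucleus (top-p) filtering for a single batch example. It is an illustration of the attribute descriptions above, not the operator's actual kernel: the function name `top_p_filter`, the default values, and the choice of -1e9 as `filter_value` are assumptions made for this example.

```python
import math


def top_p_filter(logits, top_p=1.0, temperature=1.0,
                 filter_value=-1e9, min_tokens_to_keep=1):
    """Hypothetical helper: apply temperature, then mask tokens outside the nucleus."""
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    if top_p >= 1.0:
        return scaled  # nothing to filter

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least probable, keeping them until the kept
    # probability mass reaches top_p and min_tokens_to_keep is satisfied.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set()
    mass = 0.0
    for i in order:
        if mass >= top_p and len(keep) >= min_tokens_to_keep:
            break
        keep.add(i)
        mass += probs[i]

    # Every token outside the kept set is overwritten with filter_value.
    return [x if i in keep else filter_value for i, x in enumerate(scaled)]


# Four tokens with probabilities 0.5, 0.3, 0.15 and 0.05.
example_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
filtered = top_p_filter(example_logits, top_p=0.7)
kept_indices = [i for i, x in enumerate(filtered) if x != -1e9]
print(kept_indices)
```

With top_p set to 0.7, the two most probable tokens (cumulative mass 0.8) form the smallest set that reaches the threshold, so the remaining two are set to the filter value. Raising min_tokens_to_keep forces more tokens to survive even when the threshold is met earlier.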

Inputs

Between 2 and 9 inputs.

  • input_ids (heterogeneous) - I:

  • max_length (heterogeneous) - I:

  • min_length (optional, heterogeneous) - I:

  • repetition_penalty (optional, heterogeneous) - T:

  • vocab_mask (optional, heterogeneous) - I:

  • prefix_vocab_mask (optional, heterogeneous) - I:

  • attention_mask (optional, heterogeneous) - I:

  • presence_mask (optional, heterogeneous) - I:

  • seed (optional, heterogeneous) - I:
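
Because the inputs are positional, supplying a later optional input (for example attention_mask) while skipping earlier ones requires placeholders. In ONNX, an omitted optional input is conventionally written as an empty string in the node's input list, and trailing omitted inputs can simply be dropped. The sketch below builds such a list from the slot order given above; the helper name `positional_inputs` and the tensor names in the example are hypothetical.

```python
# Input slots in the order listed above; the first two are required.
INPUT_ORDER = [
    "input_ids", "max_length", "min_length", "repetition_penalty",
    "vocab_mask", "prefix_vocab_mask", "attention_mask", "presence_mask",
    "seed",
]
REQUIRED = {"input_ids", "max_length"}


def positional_inputs(provided):
    """Hypothetical helper: map {slot: tensor_name} to a positional name list."""
    missing = REQUIRED - set(provided)
    if missing:
        raise ValueError(f"missing required inputs: {sorted(missing)}")
    unknown = set(provided) - set(INPUT_ORDER)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")

    # An empty string marks an optional input that is not supplied.
    names = [provided.get(slot, "") for slot in INPUT_ORDER]
    # Trailing empty entries carry no information and can be dropped.
    while names and names[-1] == "":
        names.pop()
    return names


node_inputs = positional_inputs({
    "input_ids": "ids",            # illustrative tensor names
    "max_length": "max_len",
    "attention_mask": "mask",
})
print(node_inputs)  # ['ids', 'max_len', '', '', '', '', 'mask']
```

The resulting list always has between 2 and 9 entries, matching the input count stated above.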

Outputs

Between 1 and 2 outputs.

  • sequences (heterogeneous) - I:

  • filtered_logits (optional, heterogeneous) - T:

Type Constraints

  • T in ( tensor(float) ): Constrain input and output types to float tensors.

  • I in ( tensor(int32) ): Constrain integer inputs and outputs to int32 tensors.

Examples