com.microsoft - Sampling#
Sampling - 1#
Version
name: Sampling (GitHub)
domain: com.microsoft
since_version: 1
function:
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 1 of domain com.microsoft.
Summary
Attributes
custom - INT : If 1 custom sampling logic
decoder - GRAPH (required) : Decoder subgraph to execute in a loop.
decoder_start_token_id - INT : The id of the token that indicates decoding starts.
encoder - GRAPH : The subgraph for initialization of encoder and decoder. It will be called once before decoder subgraph.
eos_token_id - INT (required) : The id of the end-of-sequence token
filter_value - FLOAT : All filtered values will be set to this float value.
init_decoder - GRAPH : The subgraph for the first decoding run. It will be called once before decoder subgraph. This is relevant only for the GPT2 model. If this attribute is missing, the decoder subgraph will be used for all decoding runs
min_tokens_to_keep - INT : Minimumber of tokens we keep per batch example in the output.
model_type - INT : Model type: 0 for decoder only like GPT-2; 1 for encoder decoder like Bart
no_repeat_ngram_size - INT : no repeat ngrams size
pad_token_id - INT (required) : The id of the padding token
presence_penalty - FLOAT : Presence penalty for custom sampling
temperature - FLOAT : The value used to module the next token probabilities.
top_p - FLOAT : If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation.
vocab_size - INT : Size of the vocabulary. If not provided, it will be inferred from the decoder subgraph’s output shape
Inputs
Between 2 and 9 inputs.
input_ids (heterogeneous) - I:
max_length (heterogeneous) - I:
min_length (optional, heterogeneous) - I:
repetition_penalty (optional, heterogeneous) - T:
vocab_mask (optional, heterogeneous) - I:
prefix_vocab_mask (optional, heterogeneous) - I:
attention_mask (optional, heterogeneous) - I:
presence_mask (optional, heterogeneous) - I:
seed (optional, heterogeneous) - I:
Outputs
Between 1 and 2 outputs.
sequences (heterogeneous) - I:
filtered_logits (optional, heterogeneous) - T:
Type Constraints
T in ( tensor(float) ): Constrain input and output types to float tensors.
I in ( tensor(int32) ): Constrain to integer types
Examples