com.microsoft - LongformerAttention#

LongformerAttention - 1#

Version

  • name: LongformerAttention (GitHub)

  • domain: com.microsoft

  • since_version: 1

  • function:

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

  • num_heads - INT (required) : Number of attention heads

  • window - INT (required) : One sided attention windows length W, or half of total window length

Inputs

  • input (heterogeneous) - T:

  • weight (heterogeneous) - T:

  • bias (heterogeneous) - T:

  • mask (heterogeneous) - T:

  • global_weight (heterogeneous) - T:

  • global_bias (heterogeneous) - T:

  • global (heterogeneous) - G:

Outputs

  • output (heterogeneous) - T:

Type Constraints

  • T in ( tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

  • G in ( tensor(int32) ): Constrain to integer types

Examples