module onnxrt.ops_cpu.op_string_normalizer
Short summary
module mlprodict.onnxrt.ops_cpu.op_string_normalizer
Runtime operator.
Classes
class | truncated documentation |
---|---|
StringNormalizer | The operator is not really thread-safe because Python cannot use two locales at the same time. Stop words should … |
Properties
property | truncated documentation |
---|---|
| Returns the list of arguments as well as the list of parameters with the default values (close to the signature). … |
| Returns the list of modified parameters. |
| Returns the list of optional arguments. |
| Returns the list of optional arguments. |
| Returns all parameters in a dictionary. |
Methods
method | truncated documentation |
---|---|
_run | Normalizes strings. |
_run_column | Normalizes strings in a column. |
strip_accents_unicode | Transforms accented Unicode symbols into their simple counterparts. Source: sklearn/feature_extraction/text.py. … |
Documentation
Runtime operator.
- class mlprodict.onnxrt.ops_cpu.op_string_normalizer.StringNormalizer(onnx_node, desc=None, **options)
Bases: mlprodict.onnxrt.ops_cpu._op.OpRunUnary
The operator is not really thread-safe because Python cannot use two locales at the same time. Stop words should not be implemented here, as tokenization usually happens after this step.
- Parameters
onnx_node – onnx node
desc – internal representation
expected_attributes – expected attributes for this node
options – runtime options
- __init__(onnx_node, desc=None, **options)
- Parameters
onnx_node – onnx node
desc – internal representation
expected_attributes – expected attributes for this node
options – runtime options
- _infer_shapes(x)
Returns the same shape by default.
- _remove_stopwords(text, stops)
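This method has no documentation, so the following is only a sketch of what removing stop words from a string could look like. It assumes that tokens are separated by single spaces and that matching is exact and case sensitive. The function `remove_stopwords` is a hypothetical stand-in, not the actual implementation.

```python
def remove_stopwords(text, stops):
    """Drop every space-separated token of `text` found in `stops`."""
    # Assumption: tokens are delimited by single spaces and compared exactly.
    return " ".join(tok for tok in text.split(" ") if tok not in stops)


print(remove_stopwords("the cat sat on the mat", {"the", "on"}))
# prints "cat sat mat"
```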
- _run(x)
Normalizes strings.
- _run_column(cin, cout)
Normalizes strings in a column.
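As an illustration of column-wise normalization, the sketch below reads strings from an input column and writes the normalized values into an output column of the same length. It is an assumption-laden example, not the runtime's code: `normalize_column` is a hypothetical name, plain lists stand in for the arrays, and the `case_change_action` values (`"LOWER"`, `"UPPER"`, `"NONE"`) are borrowed from the ONNX StringNormalizer specification.

```python
def normalize_column(cin, cout, case_change_action="NONE"):
    """Write the normalized strings of `cin` into `cout`, element by element."""
    for i, value in enumerate(cin):
        if case_change_action == "LOWER":
            value = value.lower()
        elif case_change_action == "UPPER":
            value = value.upper()
        # Any other action (for example "NONE") leaves the string unchanged.
        cout[i] = value


column = ["Hello World", "ONNX Runtime"]
result = [None] * len(column)
normalize_column(column, result, case_change_action="LOWER")
print(result)
# prints ['hello world', 'onnx runtime']
```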
- strip_accents_unicode(s)
Transforms accented Unicode symbols into their simple counterparts. Source: sklearn/feature_extraction/text.py.
- Parameters
s – string, the string to strip
- Returns
the cleaned string
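A common way to strip accents, along the lines of the scikit-learn function cited as the source, is to decompose each character with Unicode NFKD normalization and then drop the combining marks. The sketch below uses only the standard `unicodedata` module. The name `strip_accents` is illustrative and the code is not copied from either library.

```python
import unicodedata


def strip_accents(s):
    """Return `s` with combining marks removed after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", s)
    # unicodedata.combining returns 0 for base characters, non-zero for marks.
    return "".join(c for c in decomposed if not unicodedata.combining(c))


print(strip_accents("déjà vu, garçon"))
# prints "deja vu, garcon"
```

Characters that have no Unicode decomposition, such as "ø", pass through unchanged with this approach.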