# Constant sequential filling operation needs output shape parameter #492
Could the parameters […]

---
So the final proposal is like this: `constant(start, step, operand_descriptor)`, right? (I guess the order of parameters is like this.) @huningxin @fdwr WDYT?

---
I'm fine with it either way. If you as a caller already have shape and data type variables, then requiring a temporary tensor desc is a little more bothersome than just passing them directly as parameters; but if you already have a tensor desc, then passing that directly is slightly more convenient. 🤷 Since the internal implementation complexity doesn't really change either way, maybe asking someone from the WebNN API caller point of view (like Wanming from ORT) would be a nice tie breaker?

Whatever we do, we should use the same pattern in the future with similar operators. So if we introduce, for example, a diagonal matrix operator that fills a new tensor with 1's along the diagonal and 0's everywhere else, then it should likewise take either a tensor desc or two separate parameters. I don't see that we have much existing precedent though. Any ops that currently take a shape (like reshape or expand) take the shape as a separate parameter, and ops that take a data type (like cast) take it as a separate parameter, but we don't have any others that take both shape and data type.

---
Not much impact to the WebNN EP; I am fine with it either way as well.

---
> In Web APIs they usually prefer passing options in dictionaries.

Well, these are essential parameters, not options, but it would be more consistent with the existing […]

Usage:
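The original example here is elided; a minimal sketch of the two calling styles under discussion (the `start`/`step` member names follow the proposal later in this thread; neither `constant()` shape below is shipped API):

```js
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
const desc = {dataType: 'float32', shape: [4]};

// Essential parameters passed positionally:
// const seq = builder.constant(0, 2, desc);

// ...versus grouped into a dictionary alongside the descriptor:
// const seq = builder.constant(desc, {start: 0, step: 2});
```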
---

If it causes no ambiguity in the IDL, then I'd like both overload parameters to go in the same order, rather than inconsistently reversed between them. Also if you make […]

---
I think marking fields as `required` […]. The following is also fine IMO:

```webidl
dictionary FillSequenceDescriptor {
  required double start;
  double step = 0;
};

constant(MLOperandDescriptor desc, FillSequenceDescriptor fillSequence);
```

The type we use for […] How to support int64 remains an open question :)

---
One of the three current […]

...and so there's precedent for passing at least that much precision. I agree that float32 is not sufficient to cover the accuracy needed (it cannot even represent odd int32 values >= 16777216 accurately).
One operator (split) already accepts parameters of differing types:
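For reference, split's IDL (quoting the spec from memory, so treat as approximate) takes either a count or a sequence of sizes for its second parameter:

```webidl
partial interface MLGraphBuilder {
  // The second parameter is either a number of equal splits or a
  // sequence of explicit split sizes.
  sequence<MLOperand> split(
      MLOperand input,
      (unsigned long or sequence<unsigned long>) splits,
      optional MLSplitOptions options = {});
};
```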
So (barring any IDL issues I'm not seeing?) we could accept […]

Curiously I see a […]

---
I thought of the same approach as you, and I am trying to add bigint to verify the feasibility of the implementation.

---
Updates to the constant sequential filling operation: […]

On one hand, we can continue to review these two CLs to find potential problems and verify feasibility. On the other hand, can we first create an updated specification PR that matches what is being implemented, and get folks with more WebIDL and JavaScript API design knowledge to take a look at it to make sure it will work the way we intend it to, as Reilly said? WDYT @huningxin @fdwr @wacky6

---
Drive-by comment: use of bigint is extremely limited in Chromium at the moment, so I'm not surprised if the binding code generation can't handle a union with it yet. Note also this in WebIDL: https://webidl.spec.whatwg.org/#limit-bigint-numeric-unions. While WebIDL support for a union of bigint and another numeric type is desired, there is a request to file an issue so that the problem can be discussed. And from that bug tracker, it doesn't look like anyone has actually done it yet.

---
> https://webidl.spec.whatwg.org/#limit-bigint-numeric-unions

Joshua: I see. Well, the point in this case is to preserve values all the way through and avoid precision errors. I see one existing issue here: whatwg/webidl#959 (but it's oddly empty and already closed).

---
Yeah, I think the concern (from the WebIDL editors) is that they want to ensure that it's going to be used correctly (as proposed here), with handling at the C++ (etc.) layer that preserves full precision as either a float64 or a bigint. The WebIDL editors just don't want to do extra work unless it's really needed, and don't want anyone writing such a union unless the precision is actually needed. It absolutely is here... this may just be the first spec where that's actually the case! 🤯

---
Comparing the Blink bindings implementation for unions and the spec for union conversions, it looks like the cases for bigint are simply missing, i.e. not implemented yet.

---
We have some parameter validation discussions happening in the fill-sequence constant overload CL. I'm content to restrict this sequential fill overload to reject infinity as an input (to simplify parameter validation), so long as we have some other means of creating tensors filled with such values; and I propose we repurpose the existing […]

---
We're deciding between (a) vs (b); see the sketch after this list.

(a) 3 separate functions: one consuming buffers, one for constant fills, one for sequence fills:

➕ more direct

(b) 2 functions: one consuming buffers, and one that takes an options object with optional params (credit @a-sully):

➕ pretty immediately clear to the reader, without needing to look up whether parameter 2 or 3 is the step.
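A sketch of the two shapes (all method and dictionary member names here are hypothetical, just to make the trade-off concrete):

```js
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
const desc = {dataType: 'float32', shape: [2, 2]};

// (a) Three separate methods:
// builder.constant(desc, new Float32Array([1, 2, 3, 4])); // from a buffer
// builder.constantFill(desc, 0);                          // constant fill
// builder.constantSequence(desc, 0, 1);                   // sequence fill

// (b) Two methods; fill behavior goes in an options dictionary:
// builder.constant(desc, new Float32Array([1, 2, 3, 4])); // from a buffer
// builder.constant(desc, {start: 0});                     // constant fill
// builder.constant(desc, {start: 0, step: 1});            // sequence fill
```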
---

This seems nice from the caller's perspective, but I think we'll need hand-written parsing logic; I don't think the WebIDL binding generator is smart enough to tell different kinds of dictionaries apart.

```webidl
dictionary FillConstant {
  MLNumber fill;  // or `start`
};

dictionary FillWithSequence {
  MLNumber start;
  MLNumber step;
};
```

---
Yup! That recommendation is based on the guidance here: https://w3ctag.github.io/design-principles/#prefer-dictionaries

And my suggestion for combining the "fill with a given value" and "fill with this value + a step" methods into one is based on the guidance here: https://w3ctag.github.io/design-principles/#optional-parameters. If the dictionary argument itself is optional, then just this is sufficient:
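A sketch of that shape (the dictionary and member names are my assumption, not from the guidance):

```webidl
dictionary MLFillSequenceOptions {
  MLNumber start = 0;  // fill value, or first element of the sequence
  MLNumber step = 0;   // a step of 0 degenerates to a constant fill
};

partial interface MLGraphBuilder {
  // With the dictionary optional, constant(desc) alone yields a zero fill.
  MLOperand constant(MLOperandDescriptor descriptor,
                     optional MLFillSequenceOptions options = {});
};
```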
I'm not sure, though as @fdwr mentioned above, I think if we use […]

---
Taking a step back and looking at the problem statement described in the issue description, I'd like to poke at this claim: […]

Let's first look at the Problematic Library differences:

PyTorch seems to have resolved this by deprecating `torch.range` […]. So there's no problematic library difference, right?...

...What about DirectML?! DirectML does not require an explicit `end` […]. Infinite […]
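For context, DirectML's fill-sequence operator desc (from DirectML.h, quoted from memory, so treat as approximate) carries the shape on the output tensor and has no end parameter at all:

```c
// The output shape is explicit in OutputTensor; elements are generated
// as ValueStart + index * ValueDelta, so no `end` is ever needed.
struct DML_FILL_VALUE_SEQUENCE_OPERATOR_DESC {
    const DML_TENSOR_DESC* OutputTensor;
    DML_TENSOR_DATA_TYPE ValueDataType;
    DML_SCALAR_UNION ValueStart;
    DML_SCALAR_UNION ValueDelta;
};
```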
---
Taking an even bigger step back, I was curious what the rationale for adding this operator (which I'll refer to as `fillSequence`) was […]

In light of the recent discussions on #614 about achieving minimal buffer copies by using […]

Passing an […]

Questions for the group: […]

†Hopefully 🤞 Pending further […]

---
For some MLGraphBuilder methods the type of a numeric input can vary; e.g. for constant() an explicit MLOperandDataType is provided, while for clamp() and pad() the data type is implied by input operands. In these cases, specifying the numeric value as either a float/double or int64 type runs into accuracy or range issues: you can't accurately represent all int64 values as a double, and you can't represent the full range of floats as int64. (You also can't represent all int64 values as a long long either; over 2^53 things get weird. But that's a digression.)

Per discussion in whatwg/webidl#1388, this change introduces a union between a bigint type and unrestricted double called MLNumber. Callers can pass a JS number (1234, 1.1234e38) or a JS bigint (9007199254740993n), and the implementation will treat it properly based on the explicit or implicit MLOperandDataType. Usage of this type should be limited to only those cases.

Fixes webmachinelearning#442

Note that webmachinelearning#492 proposes changes to the constant sequential filling operation; this just adds IDL to match the current spec prose. Some of the concerns raised in webmachinelearning#325 are addressed (e.g. clamp()'s options). However, several other options are still specified as "float" and should maybe be "double", but MLNumber is likely not appropriate for those, so they are not updated here.
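To make the precision point concrete, a sketch assuming the scalar constant(dataType, value) overload described above:

```js
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// A plain JS number is fine where float64 precision suffices:
const huge = builder.constant('float32', 1.1234e38);

// 9007199254740993 (2^53 + 1) is not representable as a float64, so a
// bigint is the only way to hand it over exactly for an int64 operand:
const exact = builder.constant('int64', 9007199254740993n);
```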
---

My biggest concern is deducing tensor shapes from wobbly floating-point calculations, especially considering that the caller (maybe JavaScript, maybe compiled WebAssembly, maybe TF with one rounding behavior, maybe PT with another rounding behavior...) and the browser (with its own FPU details) can differ. It would be bad if the browser computed its range subtraction and delta division via float64 whereas the caller used float32, or vice versa; a mismatch in the output shape can result in the whole graph execution breaking due to incompatible shapes when combined with other operators. High-level frameworks like TF Lite and PyTorch can do what they want and include such helpful functions based solely on FP parameters, but low-level APIs like WebNN should take more explicit tensor shapes.
🤔 Seems not so. I picked nice integer numbers for the example above, but more generically, if you pass the same values into both libraries, the output shapes differ:

```python
import torch
import tensorflow

ptRange = torch.arange(0.3908395899830157, 1.9541979499150786, 0.3908395899830157, dtype=torch.float32)
tfRange = tensorflow.range(0.3908395899830157, 1.9541979499150786, 0.3908395899830157, dtype=tensorflow.float32)

print("---------------------------------")
print("PT shape: ", ptRange.shape)
print("   dtype: ", ptRange.dtype)
print("   value: ", ptRange)
print("TF shape: ", tfRange.shape)
print("   dtype: ", tfRange.dtype)
print("   value: ", tfRange)

# PT shape:  torch.Size([4])
#    dtype:  torch.float32
#    value:  tensor([0.3908, 0.7817, 1.1725, 1.5634])
# TF shape:  (5,)
#    dtype:  <dtype: 'float32'>
#    value:  tf.Tensor([0.39083958 0.78167915 1.1725187  1.5633583  1.9541979 ], shape=(5,), dtype=float32)
```

I'd sooner drop this op altogether (and force the caller to fill in the Float32Array) than include it with an FP-deduced shape.
(update 20240419T0320) I opened a doc fix for the pseudocode. The actual DML output is […]

---
Thanks again for the deep dive @fdwr :) Just to clarify a couple points...
If we went down this route, we would clearly specify the behavior of the browser; say, to always be float64. So if frameworks are doing their parallel calculations in some other data type, that's a bug in the framework and not something we (as builders of the web platform) should be too worried about. Of course we should strive to design the API such that callers fall into the pit of success. But there's a difference between an API which is unintuitive to use (say, because the parameters are ordered differently from the analogous function you're used to in Python) and one which *cannot reliably be used correctly* (say, if the browser sometimes computes shapes in float32 and other times in float64, and you have no way to know which will be used). I think the "FP deduced shape" API we're discussing here is the former. But it sounds like (based on comments like the one below) you're suggesting that that may not be realistic, given the discrepancies in the underlying frameworks?

If that's true, I concur: […]
Recalling the questions I posed to the group earlier… […]

Unless there are very compelling answers to 2 and 3, I'm in favor of removing this operator. WDYT?

---
@a-sully: I had to analyze some models deeper first before I could reply. Looking through ~700 models in my mini-collection, with 19 having […]

If we encounter a more compelling case later where a large fill directly on the GPU is worthwhile, then we can consider re-adding fillSequence (with an explicit shape parameter, and only computing once rather than each […])

---
The above PR merged. Let's close this issue?

---
See `constant(start, end, step, type)` at https://www.w3.org/TR/webnn/#dom-mlgraphbuilder-constant-start-end-step-type

**Problem**

Computing the output shape indirectly from a floating-point calculation is brittle, because the caller, depending on its equation and rounding, can yield a different size than WebNN, which will cause graph execution to fail due to mismatched tensors. Such incompatible off-by-one differences can occur via differing calculation precisions (e.g. the caller computes the output shape in float64 whereas Chromium computes it in float32) and via differing handling of the inclusive/exclusive end value (see a non-hypothetical example below). So it's prudent for the library to pass WebNN the shape explicitly, and we've increasingly been moving toward more explicitness anyway (e.g. removing the special null values in Reshape). The output shape was in the `fillSequence` operator, but got dropped, and it reoccurred to me while reviewing Bin's Chromium implementation.

**Proposal**
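Per the parameter discussion earlier in the thread, the shape-explicit form under consideration was along these lines (a sketch, not necessarily the exact IDL from the linked CLs):

```webidl
partial interface MLGraphBuilder {
  // The descriptor supplies the output shape and data type explicitly;
  // each element is computed as start + (index * step).
  MLOperand constant(MLOperandDescriptor descriptor,
                     MLNumber start,
                     optional MLNumber step = 0);
};
```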
* Note the `end` is now redundant, since it was only used for size calculation, not needed for the core `start + (index * step)` formula.

**Problematic Library differences**
We see differences between TF and PT with the exact same inputs due to how they handle the end value (included vs. excluded), which emphasizes why WebNN should be clearly informed by the caller of the expected shape:
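As an illustration of that end-handling difference (my own numbers, not the original example):

```python
import torch
import tensorflow as tf

# torch.range() includes the end value (and is deprecated partly for this
# ambiguity); tf.range() excludes it, so the output shapes differ by one.
print(torch.range(0, 5, 1))  # tensor([0., 1., 2., 3., 4., 5.]) -> 6 elements
print(tf.range(0, 5, 1))     # tf.Tensor([0 1 2 3 4], ...)      -> 5 elements
```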
**Usage Examples**
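With the explicit-shape proposal, usage would look something like this (a sketch consistent with the IDL sketch above, not shipped API):

```js
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// Five elements 0, 2, 4, 6, 8: the shape is stated explicitly, so no
// floating-point size deduction happens on either side of the API.
const ramp = builder.constant(
    {dataType: 'float32', shape: [5]},
    /*start*/ 0, /*step*/ 2);
```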
@wchao1115 @huningxin