Improve protobuf support for BigQuery Storage Write API #13873

Open
alvioshki opened this issue Nov 25, 2024 · 3 comments
Assignees
Labels
api: bigquerystorage Issues related to the BigQuery Storage API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@alvioshki

Feature request in short: improve support for, or remove, the protobuf restrictions of the BigQuery Storage Write API in the C# client.

I tried to migrate from the BigQuery legacy streaming API to the BigQuery Storage Write API using the C# client library.
However, the protobuf handling restrictions prevented me from completing the migration:
https://cloud.google.com/bigquery/docs/write-api#proto_buffer_handling

I have existing proto3 contracts. They use package specifiers, imports for wrapper types, and other features that are restricted according to the documentation. I tried using them anyway and hit limitations with imports, enums, and default values (I've managed to work around this one).

Changing the proto3 contracts is not an option for me, as I have too many of them. Writing custom proto2 contracts and mapping from one to the other is a no-go as well. I've considered implementing a dynamic mapper, but I am not confident the investment would pay off in the end.

The documentation states that "The Java and Go clients support arbitrary protocol buffers, because the client library normalizes the protocol buffer schema.".

What does this mean? My guess is that these client libraries have built-in mechanisms to handle and adapt arbitrary protocol buffer schemas, allowing more flexibility than in other languages. Perhaps they adapt the schema to conform to the restrictions imposed by the BigQuery Write API, for example by removing or adjusting package specifiers, flattening nested messages or enums to meet the top-level definition requirements, and resolving external references and embedding them into a single schema.
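To make the guess above concrete, here is a minimal sketch of what "embedding external references into a single schema" could look like. It is an illustration only, not the Java or Go clients' actual logic, and it does not use the real protobuf library. `Field`, `Message`, `flat_name`, `normalize`, and the `shop.v1` / `common` package names are all hypothetical; replacing dots with underscores is an assumed naming scheme.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for protobuf descriptors (not the real protobuf API).
@dataclass
class Field:
    name: str
    type_name: str  # a scalar such as "string", or a fully qualified message name

@dataclass
class Message:
    full_name: str  # e.g. "shop.v1.Order"
    fields: list

SCALARS = {"string", "int64", "bool", "double", "bytes"}

def flat_name(full_name):
    # Assumed scheme: drop the package structure by replacing dots with underscores.
    return full_name.replace(".", "_")

def normalize(root, registry):
    """Build one self-contained schema for `root`, pulling in every
    message it references (directly or transitively) from `registry`."""
    nested = {}  # flat name -> list of (field name, type) pairs

    def visit(full_name):
        key = flat_name(full_name)
        if key in nested:
            return key
        nested[key] = None  # placeholder so recursive messages terminate
        fields = []
        for f in registry[full_name].fields:
            if f.type_name in SCALARS:
                fields.append((f.name, f.type_name))
            else:
                # Cross-package reference: embed the target and point at its flat name.
                fields.append((f.name, visit(f.type_name)))
        nested[key] = fields
        return key

    root_key = visit(root)
    root_fields = nested.pop(root_key)
    return {"name": root_key, "fields": root_fields, "nested": nested}

# Usage: an Order that imports Address from another package.
registry = {
    "shop.v1.Order": Message("shop.v1.Order",
                             [Field("id", "int64"), Field("ship_to", "common.Address")]),
    "common.Address": Message("common.Address", [Field("city", "string")]),
}
result = normalize("shop.v1.Order", registry)
```

In this sketch the caller keeps its original contracts untouched; only the schema sent alongside the rows is rewritten, which is the property that would make such a feature useful for existing proto3 contracts.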

If I understand this correctly, could it be implemented in the C# client?

I would like to know if and when such an improvement arrives. Until then I will keep using the legacy streaming API.

@jskeet jskeet assigned amanda-tarafa and unassigned jskeet Nov 25, 2024
@jskeet jskeet added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Nov 25, 2024
@jskeet
Collaborator

jskeet commented Nov 25, 2024

Thanks for the clear description. We'll talk to the BigQuery team internally and discuss the next steps.

@amanda-tarafa amanda-tarafa added priority: p2 Moderately-important priority. Fix may not be included in next release. api: bigquerystorage Issues related to the BigQuery Storage API. labels Nov 25, 2024
@amanda-tarafa
Contributor

@alvioshki We've discussed this internally with the BigQuery team. As you described, it's a little more complex than converting from proto3 to proto2, although that is also part of the problem. See the documentation for the NormalizeDescriptor function in Go for more details.

Unfortunately, we have higher priority work at the moment, so we cannot immediately look into implementing similar normalization logic for the .NET client. We'll keep this request in our backlog for future consideration.

@amanda-tarafa amanda-tarafa added priority: p3 Desirable enhancement or fix. May not be included in next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Feb 19, 2025
@alvioshki
Author

alvioshki commented Feb 20, 2025 via email
