Implement dataless cubes #6253
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #6253 +/- ##
==========================================
- Coverage 89.83% 89.73% -0.10%
==========================================
Files 88 88
Lines 23347 23451 +104
Branches 4344 4383 +39
==========================================
+ Hits 20974 21044 +70
- Misses 1646 1664 +18
- Partials 727 743 +16
☔ View full report in Codecov by Sentry.
This reverts commit 6ed270d.
@ESadek-MO I'm only partially through the review, but I thought I'd push these review comments early so that you can see them and address them ASAP.
Note that you also need to refactor the following DataManager methods:
- __eq__ to account for dataless cubes, particularly when dataless cubes are involved in the operation i.e., for the new use cases between data and dataless, and dataless and dataless
- lazy_data to deal with the dataless case
- akin to the lazy_data method, you also have to deal with core_data for the dataless case
- __repr__ needs to cope with the dataless case i.e., provide the shape
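The checklist above can be illustrated with a minimal sketch. This is not the Iris DataManager: the class name ToyDataManager, its 1-D list payload, the ValueError exceptions, and the equality semantics (two dataless managers are equal when their shapes match, and a dataless manager never equals one holding data) are all assumptions made for illustration. Lazy data is left out entirely.

```python
class ToyDataManager:
    """Hypothetical stand-in for DataManager (not the Iris implementation)."""

    def __init__(self, data=None, shape=None):
        if (shape is not None) and (data is not None):
            raise ValueError('"shape" should only be provided if "data" is None')
        if shape is None and data is None:
            raise ValueError('either "data" or "shape" must be provided')
        # A plain list stands in for a real 1-D array in this sketch.
        self._real_array = None if data is None else list(data)
        self._shape = None if shape is None else tuple(shape)

    def is_dataless(self) -> bool:
        """Determine whether the manager has no data."""
        return self._shape is not None

    def core_data(self):
        # Dataless case: there is no payload, so None is returned.
        return self._real_array

    @property
    def shape(self):
        if self.is_dataless():
            return self._shape
        return (len(self._real_array),)

    def __eq__(self, other):
        if not isinstance(other, ToyDataManager):
            return NotImplemented
        if self.is_dataless() or other.is_dataless():
            # Assumed semantics: only two dataless managers with the same
            # shape compare equal; data vs dataless is never equal.
            return (
                self.is_dataless()
                and other.is_dataless()
                and self.shape == other.shape
            )
        return self._real_array == other._real_array

    def __repr__(self):
        name = type(self).__name__
        if self.is_dataless():
            # With no payload to show, report the shape instead.
            return f"{name}(shape={self.shape!r})"
        return f"{name}({self._real_array!r})"
```

Whether data and dataless managers should ever compare equal is a design decision for the PR; the sketch simply picks the strictest option.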
lib/iris/__init__.py
@@ -832,3 +841,6 @@ def use_plugin(plugin_name):
     significance of the import statement and warn that it is an unused import.
     """
     importlib.import_module(f"iris.plugins.{plugin_name}")
+
+
+DATALESS_COPY = "NONE"
@ESadek-MO It's better to have such constants defined at the top of the module, plus with a comment please 👍
Around about line+142 (after the constraint definition conveniences seems about right)
lib/iris/_data_manager.py
    managed. If a value of None is given, the data manager will be
    considered dataless.

shape :
@ESadek-MO We've been adopting the following numpydoc standard for specifying the type of parameters i.e., shape : tuple, optional
Same standard applies to the data parameter 👍
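As a hedged illustration of that convention, the docstring below uses the numpydoc "name : type, optional" form for both parameters. The function name check_manager_args, the array_like type given for data, and the choice of ValueError are assumptions and are not taken from the PR.

```python
def check_manager_args(data=None, shape=None):
    """Validate the arguments for a (hypothetical) data manager.

    Parameters
    ----------
    data : array_like, optional
        The data payload to be managed. If ``None``, the manager is
        considered dataless.
    shape : tuple, optional
        The shape of the manager. Only valid when ``data=None``.

    Returns
    -------
    bool
        Whether the arguments describe a dataless manager.

    """
    if (shape is not None) and (data is not None):
        raise ValueError('"shape" should only be provided if "data" is None')
    return data is None and shape is not None
```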
lib/iris/_data_manager.py
shape :
    A tuple, representing the shape of the data manager. This can only
    be used in the case of `data=None`, and will render the data manager
Use double-ticks, see here
i.e., ``data=None``
lib/iris/_data_manager.py
"""
if (shape is not None) and (data is not None):
    msg = "`shape` should only be provided if `data is None`"
- msg = "`shape` should only be provided if `data is None`"
+ msg = '"shape" should only be provided if "data" is None'
lib/iris/_data_manager.py
# Initialise the instance.
self._lazy_array = None
self._real_array = None

# Assign the data payload to be managed.
self._shape = shape
@ESadek-MO The comment on line+45 applies to self.data = data on line+47.
Could you move self._shape = shape to the above # Initialise the instance. block 👍
lib/iris/_data_manager.py
if not dataless:
    data = np.asarray(data)
@ESadek-MO We shouldn't need this defensive code for the dataless case i.e., we shouldn't get here.
lib/iris/_data_manager.py
        result = self.core_data().shape
    return result

def is_dataless(self):
- def is_dataless(self):
+ def is_dataless(self) -> bool:
lib/iris/_data_manager.py
    return result

def is_dataless(self):
    """Determine whether the cube is dataless.
@ESadek-MO Perhaps it's best not to use dataless to describe is_dataless e.g., maybe something like "Determine whether the cube has no data." instead?
lib/iris/_data_manager.py
if self.core_data() is None:
    result = self._shape
else:
    result = self.core_data().shape
return result
- if self.core_data() is None:
-     result = self._shape
- else:
-     result = self.core_data().shape
- return result
+ return self._shape if self._shape else self.core_data().shape
lib/iris/_data_manager.py
    bool

    """
    return (self.core_data() is None) and (self.shape is not None)
@ESadek-MO Given our axiom (and if this isn't true then something is wrong) it must always be the case that either:
- self._shape is None and (self._lazy_array is not None or self._real_array is not None), or
- self._shape is not None and (self._lazy_array is None and self._real_array is None)
Therefore, is_dataless should be defined simply as return self._shape is not None, right?
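A minimal sketch of a check for that axiom follows. The function name check_axioms and its error messages are hypothetical, and the "exactly one of lazy or real" rule for the non-dataless branch is assumed from the XOR test that appears elsewhere in this PR.

```python
def check_axioms(shape, lazy_array, real_array) -> bool:
    """Return True if dataless, False if data is held; raise if inconsistent."""
    is_lazy = lazy_array is not None
    is_real = real_array is not None
    # The proposed definition: dataless means a stored shape is present.
    # Note that an empty tuple () is a valid (scalar) shape, hence "is not None".
    is_dataless = shape is not None

    if is_dataless and (is_lazy or is_real):
        raise AssertionError("a dataless manager must not hold lazy or real data")
    if not is_dataless and not (is_lazy ^ is_real):
        raise AssertionError("expected exactly one of lazy or real data")
    return is_dataless
```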
Something is wrong with the _assert_axiom code now. I need to catch a bus, so will have to figure out why tomorrow.
Although the tests are passing, next time I look at this, I need to investigate the data setter. Currently, line 256 causes issues, as self._shape should be changeable if self._shape = (), and it won't be. That is to say, it works as expected, but I believe it hasn't been written in such a way as it SHOULD be.
lib/iris/_data_manager.py
  @property
  def shape(self):
      """The shape of the data being managed."""
-     return self.core_data().shape
+     return self._shape if self._shape else self.core_data().shape
This is defensive; it wouldn't cause any issues that I can think of to just have return self._shape, but this should protect us in case I've missed something.

I've actually removed this now.
() is falsy. I could have easily done ...if self._shape is not None, but I decided there's no point in adding redundant code. Easy undo if my reviewer disagrees!
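The pitfall mentioned here is easy to reproduce in isolation. In the sketch below, both helper names are hypothetical; they only contrast a truthiness test with an explicit None test when the stored shape is the scalar shape ().

```python
def shape_truthy(stored_shape, fallback_shape):
    # Truthiness test: an empty tuple () is falsy, so a scalar shape
    # is silently replaced by the fallback.
    return stored_shape if stored_shape else fallback_shape


def shape_explicit(stored_shape, fallback_shape):
    # Explicit None test: () is kept, only a missing shape falls back.
    return stored_shape if stored_shape is not None else fallback_shape
```

With a stored shape of () and a fallback of (3,), the first helper returns (3,) while the second returns ().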
state = is_lazy ^ is_real
assert state, emsg.format("" if is_lazy else "no ", "" if is_real else "no ")

if not (is_lazy ^ is_real):
This might be clearer as if is_lazy == is_real.
I did see this, but forgot to comment.
Personally, I think I prefer it as it is, but if there's significant desire for your suggestion, I don't mind.
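For reference, the two spellings are interchangeable as long as both operands are plain bool values, which a short exhaustive check can confirm. The helper name forms_agree is hypothetical.

```python
from itertools import product


def forms_agree() -> bool:
    """Check `not (a ^ b)` against `a == b` for every pair of booleans."""
    return all(
        (not (is_lazy ^ is_real)) == (is_lazy == is_real)
        for is_lazy, is_real in product((False, True), repeat=2)
    )
```

The equivalence relies on the operands being bool; for arbitrary integers, ^ is a bitwise operation and the two forms are not a drop-in swap in general.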
Closes #4447.
I plan to do this in four stages:
- data is None when handed a shape value, and can essentially round-trip removing and adding data. This shouldn't break existing tests. 71c7ae8
- DataManager methods all make sense and work with None data. DataManager.copy() is an example of a method that won't make sense with None data.
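The round trip of removing and adding data described above might look roughly like the following. This is a sketch under stated assumptions, not the PR's code: the class name RoundTripSketch is hypothetical, a plain list stands in for a 1-D array, and rejecting data of a different shape with ValueError is an assumed rule.

```python
class RoundTripSketch:
    """Hypothetical manager that starts dataless and can gain or lose data."""

    def __init__(self, shape):
        self._shape = tuple(shape)  # stored only while dataless
        self._data = None

    @property
    def shape(self):
        if self._shape is not None:
            return self._shape
        return (len(self._data),)

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, values):
        if values is None:
            # Removing data: remember the shape so the manager stays usable.
            self._shape = self.shape
            self._data = None
            return
        values = list(values)
        if (len(values),) != self.shape:
            raise ValueError(f"require data with shape {self.shape}")
        # Adding data: the payload now defines the shape.
        self._data = values
        self._shape = None
```

Usage: create RoundTripSketch((3,)), assign a three-element list to data, then assign None; the shape stays (3,) throughout.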