diff mbox series

[08/20] python/machine.py: fix _popen access

Message ID 20201006235817.3280413-9-jsnow@redhat.com
State New
Series python/qemu: strictly typed mypy conversion, pt2

Commit Message

John Snow Oct. 6, 2020, 11:58 p.m. UTC
As always, Optional[T] causes problems with unchecked access. Add a
helper that asserts the pipe is present before we attempt to talk with
it.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 python/qemu/machine.py | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Comments

Kevin Wolf Oct. 7, 2020, 10:07 a.m. UTC | #1
On 07.10.2020 at 01:58, John Snow wrote:
> As always, Optional[T] causes problems with unchecked access. Add a
> helper that asserts the pipe is present before we attempt to talk with
> it.
> 
> Signed-off-by: John Snow <jsnow@redhat.com>

First a question about the preexisting state: I see that after
initialising self._popen once, we never reset it to None. Should we do
so on shutdown?

>  python/qemu/machine.py | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> index 3e9cf09fd2d..4e762fcd529 100644
> --- a/python/qemu/machine.py
> +++ b/python/qemu/machine.py
> @@ -131,7 +131,7 @@ def __init__(self, binary, args=None, wrapper=None, name=None,
>          # Runstate
>          self._qemu_log_path = None
>          self._qemu_log_file = None
> -        self._popen = None
> +        self._popen: Optional['subprocess.Popen[bytes]'] = None

Another option that we have, especially if it's an attribute that is
never reset, would be to set the attribute only when it first gets a
value other than None. Accessing it while it hasn't been set yet
automatically results in an AttributeError. I don't think that's much
worse than the exception raised explicitly in a property wrapper.

In this case, you would only declare the type in __init__, but not
assign a value to it:

    self._popen: Optional['subprocess.Popen[bytes]']

Maybe a nicer alternative in some cases than adding properties around
everything.

Instead of checking for None, you would then have to use hasattr(),
which is a bit uglier, so I guess it's mainly for attributes where you
can assume that you will always have a value (if the caller isn't buggy)
and therefore don't even have a check in most places.

>          self._events = []
>          self._iolog = None
>          self._qmp_set = True   # Enable QMP monitor by default.
> @@ -244,6 +244,12 @@ def is_running(self):
>          """Returns true if the VM is running."""
>          return self._popen is not None and self._popen.poll() is None
>  
> +    @property
> +    def _subp(self) -> 'subprocess.Popen[bytes]':
> +        if self._popen is None:
> +            raise QEMUMachineError('Subprocess pipe not present')
> +        return self._popen
> +
>      def exitcode(self):
>          """Returns the exit code if possible, or None."""
>          if self._popen is None:

Of course, even if an alternative is possible, what you have is still
correct.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
John Snow Oct. 7, 2020, 6:44 p.m. UTC | #2
On 10/7/20 6:07 AM, Kevin Wolf wrote:
> On 07.10.2020 at 01:58, John Snow wrote:
>> As always, Optional[T] causes problems with unchecked access. Add a
>> helper that asserts the pipe is present before we attempt to talk with
>> it.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
> 
> First a question about the preexisting state: I see that after
> initialising self._popen once, we never reset it to None. Should we do
> so on shutdown?
> 

Yup, we should.

>>   python/qemu/machine.py | 16 +++++++++++-----
>>   1 file changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
>> index 3e9cf09fd2d..4e762fcd529 100644
>> --- a/python/qemu/machine.py
>> +++ b/python/qemu/machine.py
>> @@ -131,7 +131,7 @@ def __init__(self, binary, args=None, wrapper=None, name=None,
>>           # Runstate
>>           self._qemu_log_path = None
>>           self._qemu_log_file = None
>> -        self._popen = None
>> +        self._popen: Optional['subprocess.Popen[bytes]'] = None
> 
> Another option that we have, especially if it's an attribute that is
> never reset, would be to set the attribute only when it first gets a
> value other than None. Accessing it while it hasn't been set yet
> automatically results in an AttributeError. I don't think that's much
> worse than the exception raised explicitly in a property wrapper.
> 
> In this case, you would only declare the type in __init__, but not
> assign a value to it:
> 
>      self._popen: Optional['subprocess.Popen[bytes]']
> 

If you do this, you can just declare it as non-Optional. Whenever it 
exists, it is definitely a subprocess.Popen[bytes].

> Maybe a nicer alternative in some cases than adding properties around
> everything.
> 
> Instead of checking for None, you would then have to use hasattr(),
> which is a bit uglier, so I guess it's mainly for attributes where you
> can assume that you will always have a value (if the caller isn't buggy)
> and therefore don't even have a check in most places.
> 

As long as the style checkers are OK with that sort of thing. After a 
very quick test, it seems like they might be.

Generally, we run into trouble because pylint et al want variables to be 
declared in __init__, but doing so requires Optional[T] most of the time 
to allow something to be initialized later.

A lot of our stateful objects have this kind of pattern. QAPIGen has a 
ton of it. machine.py has a ton of it too.

You can basically imply the stateful check by just forgoing the actual 
initialization, which trades the explicit check for the implicit one 
when you get the AttributeError.

This is maybe more convenient -- less code to write, certainly. The 
error message you get I think is going to be a little worse, though.

I think I have been leaning towards the cute little @property shims 
because it follows a familiar OO model where a specific class always has 
a finite set of properties that does not grow or shrink. You can also 
use the shim to give a meaningful error that might be nicer to read than 
the AttributeError.
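
To make the trade-off concrete, here is a small sketch comparing the two
failure modes. All names (MachineError, WithShim, WithoutInit, describe)
are made up for the comparison and are not part of machine.py; object
stands in for the real subprocess type.

```python
from typing import Optional


class MachineError(Exception):
    """Hypothetical stand-in for QEMUMachineError."""


class WithShim:
    """Attribute always exists; a property guards access to it."""

    def __init__(self) -> None:
        self._popen: Optional[object] = None

    @property
    def _subp(self) -> object:
        if self._popen is None:
            raise MachineError('Subprocess pipe not present')
        return self._popen


class WithoutInit:
    """Attribute is declared for the type checker but never assigned."""

    def __init__(self) -> None:
        self._popen: object


def describe(obj: object, name: str) -> str:
    """Return the exception type and message raised by reading obj.name."""
    try:
        getattr(obj, name)
    except Exception as err:  # broad on purpose: we only report the error
        return f'{type(err).__name__}: {err}'
    return 'ok'
```

describe(WithShim(), '_subp') reports the custom message, while
describe(WithoutInit(), '_popen') reports whatever wording the
interpreter uses for a missing attribute.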

I'm open to suggestions on better patterns. I had considered at one 
point that it might be nice to split Machine out into a version with and 
without the console to make stronger typing guarantees. It has 
implications for how shutdown and cleanup and so on is handled, too.

(I had some WIP patches to do this, but I think I got a little stuck 
making the code pretty, and then the release, and then I got busy, and...)

>>           self._events = []
>>           self._iolog = None
>>           self._qmp_set = True   # Enable QMP monitor by default.
>> @@ -244,6 +244,12 @@ def is_running(self):
>>           """Returns true if the VM is running."""
>>           return self._popen is not None and self._popen.poll() is None
>>   
>> +    @property
>> +    def _subp(self) -> 'subprocess.Popen[bytes]':
>> +        if self._popen is None:
>> +            raise QEMUMachineError('Subprocess pipe not present')
>> +        return self._popen
>> +
>>       def exitcode(self):
>>           """Returns the exit code if possible, or None."""
>>           if self._popen is None:
> 
> Of course, even if an alternative is possible, what you have is still
> correct.
> 
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> 

Thanks; I'll continue with this for now, but I really am open to talking 
about better ways to model the common pattern of "Optional sub-feature 
for a class that can be engaged post-initialization".

It's an interesting typing problem. If we were using semantic types, 
what we are describing is an f(x) such that:

f(object-without-feature) -> object-with-feature

It's a kind of semantic cast where we are doing something akin to an 
in-place transformation of a base type to a subtype. I'm not sure I have 
encountered any language that actually intentionally supports such a 
paradigm.

(Maybe haskell? I just assume haskell can do everything if you learn to 
lie to computers well enough.)

--js
Kevin Wolf Oct. 8, 2020, 7:04 a.m. UTC | #3
On 07.10.2020 at 20:44, John Snow wrote:
> On 10/7/20 6:07 AM, Kevin Wolf wrote:
> > On 07.10.2020 at 01:58, John Snow wrote:
> > > As always, Optional[T] causes problems with unchecked access. Add a
> > > helper that asserts the pipe is present before we attempt to talk with
> > > it.
> > > 
> > > Signed-off-by: John Snow <jsnow@redhat.com>
> > 
> > First a question about the preexisting state: I see that after
> > initialising self._popen once, we never reset it to None. Should we do
> > so on shutdown?
> > 
> 
> Yup, we should.
> 
> > >   python/qemu/machine.py | 16 +++++++++++-----
> > >   1 file changed, 11 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> > > index 3e9cf09fd2d..4e762fcd529 100644
> > > --- a/python/qemu/machine.py
> > > +++ b/python/qemu/machine.py
> > > @@ -131,7 +131,7 @@ def __init__(self, binary, args=None, wrapper=None, name=None,
> > >           # Runstate
> > >           self._qemu_log_path = None
> > >           self._qemu_log_file = None
> > > -        self._popen = None
> > > +        self._popen: Optional['subprocess.Popen[bytes]'] = None
> > 
> > Another option that we have, especially if it's an attribute that is
> > never reset, would be to set the attribute only when it first gets a
> > value other than None. Accessing it while it hasn't been set yet
> > automatically results in an AttributeError. I don't think that's much
> > worse than the exception raised explicitly in a property wrapper.
> > 
> > In this case, you would only declare the type in __init__, but not
> > assign a value to it:
> > 
> >      self._popen: Optional['subprocess.Popen[bytes]']
> > 
> 
> If you do this, you can just declare it as non-Optional. Whenever it exists,
> it is definitely a subprocess.Popen[bytes].

Sorry, yes, copied too much while thinking too little.

Getting rid of Optional was the whole point of the suggestion.

> > Maybe a nicer alternative in some cases than adding properties around
> > everything.
> > 
> > Instead of checking for None, you would then have to use hasattr(),
> > which is a bit uglier, so I guess it's mainly for attributes where you
> > can assume that you will always have a value (if the caller isn't buggy)
> > and therefore don't even have a check in most places.
> > 
> 
> As long as the style checkers are OK with that sort of thing. After a very
> quick test, it seems like they might be.
> 
> Generally, we run into trouble because pylint et al want variables to be
> declared in __init__, but doing so requires Optional[T] most of the time to
> allow something to be initialized later.
> 
> A lot of our stateful objects have this kind of pattern. QAPIGen has a ton
> of it. machine.py has a ton of it too.
> 
> You can basically imply the stateful check by just forgoing the actual
> initialization, which trades the explicit check for the implicit one when
> you get the AttributeError.
> 
> This is maybe more convenient -- less code to write, certainly. The error
> message you get I think is going to be a little worse, though.

Whether this matters depends on the meaning of the individual attribute.

There can be attributes that can legitimately be None during most of
the lifetime of the object. These should clearly be Optional.

In many cases, however, the contract says that you must first call method
A that initialises the attribute and then you can call method B which
uses it.  Calling B without A would be a bug, so it's not an error
message that users should ever see. For developers who will then look at
the stack trace anyway, I don't think it should make a big difference.

Here, it's usually expected that the attribute is not None except during
phases where the object is mostly inactive anyway (like VMs before
launch or after shutdown). Then you can just not add the attribute yet
and access it without checks (which would only throw an exception
anyway) elsewhere.

> I think I have been leaning towards the cute little @property shims because
> it follows a familiar OO model where a specific class always has a finite
> set of properties that does not grow or shrink. You can also use the shim to
> give a meaningful error that might be nicer to read than the AttributeError.
> 
> I'm open to suggestions on better patterns. I had considered at one point
> that it might be nice to split Machine out into a version with and without
> the console to make stronger typing guarantees. It has implications for how
> shutdown and cleanup and so on is handled, too.
> 
> (I had some WIP patches to do this, but I think I got a little stuck making
> the code pretty, and then the release, and then I got busy, and...)

I guess the way to have everything static would be splitting QEMUMachine
into QEMUVMConfig (which exists without a running QEMU instance) and
QEMUVMInstance (which gets a QEMUVMConfig passed to its constructor and
is directly tied to a QEMU process).

Not sure if it would be worth such a major change.

> > >           self._events = []
> > >           self._iolog = None
> > >           self._qmp_set = True   # Enable QMP monitor by default.
> > > @@ -244,6 +244,12 @@ def is_running(self):
> > >           """Returns true if the VM is running."""
> > >           return self._popen is not None and self._popen.poll() is None
> > > +    @property
> > > +    def _subp(self) -> 'subprocess.Popen[bytes]':
> > > +        if self._popen is None:
> > > +            raise QEMUMachineError('Subprocess pipe not present')
> > > +        return self._popen

The major downside that I saw while reviewing this patch (besides having
extra code just for making the error message of what is essentially a
failed assertion nicer) is that we have two names for the same thing, we
have both names in active use in the other methods, and I'll never be
able to remember which of _subp and _popen is the real attribute and
which is the property (or that they are related at all and changing one
will actually change the other, too) without looking it up.

I mean, I guess tools will tell me after getting it wrong, but still...

Properties can make a nice external interface, but I feel using them
internally while you don't avoid accessing the real attribute in methods
other than the property implementation is more confusing than helpful.

> > >       def exitcode(self):
> > >           """Returns the exit code if possible, or None."""
> > >           if self._popen is None:
> > 
> > Of course, even if an alternative is possible, what you have is still
> > correct.
> > 
> > Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> > 
> 
> Thanks; I'll continue with this for now, but I really am open to talking
> about better ways to model the common pattern of "Optional sub-feature for a
> class that can be engaged post-initialization".
> 
> It's an interesting typing problem. If we were using semantic types, what we
> are describing is an f(x) such that:
> 
> f(object-without-feature) -> object-with-feature
> 
> It's a kind of semantic cast where we are doing something akin to an
> in-place transformation of a base type to a subtype. I'm not sure I have
> encountered any language that actually intentionally supports such a
> paradigm.
> 
> (Maybe haskell? I just assume haskell can do everything if you learn to lie
> to computers well enough.)

You can always express this kind of thing as object-with-feature
containing an object-without-feature.

Kevin
John Snow Oct. 8, 2020, 3:29 p.m. UTC | #4
On 10/8/20 3:04 AM, Kevin Wolf wrote:
> The major downside that I saw while reviewing this patch (besides having
> extra code just for making the error message of what is essentially a
> failed assertion nicer) is that we have two names for the same thing, we
> have both names in active use in the other methods, and I'll never be
> able to remember which of _subp and _popen is the real attribute and
> which is the property (or that they are related at all and changing one
> will actually change the other, too) without looking it up.
> 
> I mean, I guess tools will tell me after getting it wrong, but still...
> 
> Properties can make a nice external interface, but I feel using them
> internally while you don't avoid accessing the real attribute in methods
> other than the property implementation is more confusing than helpful.

Good point. I'll see if I can find a nicer cleanup soon. For now I will 
suggest relying on the type checker to spot if we get it wrong.

I do think the little property wrappers are kind of distracting, but 
seemed like the quickest means to an end at the time. With type checking 
fully in place, refactors can be a little more fearless going forward, I 
think.

--js

Patch

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 3e9cf09fd2d..4e762fcd529 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -131,7 +131,7 @@  def __init__(self, binary, args=None, wrapper=None, name=None,
         # Runstate
         self._qemu_log_path = None
         self._qemu_log_file = None
-        self._popen = None
+        self._popen: Optional['subprocess.Popen[bytes]'] = None
         self._events = []
         self._iolog = None
         self._qmp_set = True   # Enable QMP monitor by default.
@@ -244,6 +244,12 @@  def is_running(self):
         """Returns true if the VM is running."""
         return self._popen is not None and self._popen.poll() is None
 
+    @property
+    def _subp(self) -> 'subprocess.Popen[bytes]':
+        if self._popen is None:
+            raise QEMUMachineError('Subprocess pipe not present')
+        return self._popen
+
     def exitcode(self):
         """Returns the exit code if possible, or None."""
         if self._popen is None:
@@ -254,7 +260,7 @@  def get_pid(self):
         """Returns the PID of the running process, or None."""
         if not self.is_running():
             return None
-        return self._popen.pid
+        return self._subp.pid
 
     def _load_io_log(self):
         if self._qemu_log_path is not None:
@@ -415,8 +421,8 @@  def _hard_shutdown(self) -> None:
             waiting for the QEMU process to terminate.
         """
         self._early_cleanup()
-        self._popen.kill()
-        self._popen.wait(timeout=60)
+        self._subp.kill()
+        self._subp.wait(timeout=60)
 
     def _soft_shutdown(self, timeout: Optional[int],
                        has_quit: bool = False) -> None:
@@ -440,7 +446,7 @@  def _soft_shutdown(self, timeout: Optional[int],
                 self._qmp.cmd('quit')
 
         # May raise subprocess.TimeoutExpired
-        self._popen.wait(timeout=timeout)
+        self._subp.wait(timeout=timeout)
 
     def _do_shutdown(self, timeout: Optional[int],
                      has_quit: bool = False) -> None: