[Qemu-devel] [RFC PATCH] remove numpy dependency

Joannah Nanjekye posted 1 patch 6 years, 4 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1509986155-4735-1-git-send-email-nanjekyejoannah@gmail.com
Test checkpatch failed
Test docker passed
Test ppc passed
Test s390x passed
scripts/analyze-migration.py | 23 +++++++++++++++--------
1 file changed, 15 insertions(+), 8 deletions(-)
[Qemu-devel] [RFC PATCH] remove numpy dependency
Posted by Joannah Nanjekye 6 years, 4 months ago
Users tend to hit an ImportError when running analyze-migration.py due
to the numpy dependency.  numpy functionality isn't actually used, just
binary serialization that the standard library 'struct' module already
provides.  Removing the dependency allows the script to run
out-of-the-box.

Signed-off-by: Joannah Nanjekye <nanjekyejoannah@gmail.com>
---
 scripts/analyze-migration.py | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 1455387..6175c99 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -17,7 +17,6 @@
 # You should have received a copy of the GNU Lesser General Public
 # License along with this library; if not, see <http://www.gnu.org/licenses/>.
 
-import numpy as np
 import json
 import os
 import argparse
@@ -36,23 +35,29 @@ class MigrationFile(object):
         self.file = open(self.filename, "rb")
 
     def read64(self):
-        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i8')[0])
+        buffer = file.read(64)
+        return struct.unpack('>i16', buffer)[0]
 
     def read32(self):
-        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i4')[0])
+        buffer = file.read(32)
+        return struct.unpack('>i8', buffer)[0]
 
     def read16(self):
-        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i2')[0])
+        buffer = file.read(16)
+        return struct.unpack('>i4', buffer)[0]
 
     def read8(self):
-        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i1')[0])
+        buffer = file.read(8)
+        return struct.unpack('>i2', buffer)[0]
 
     def readstr(self, len = None):
+        read_format = str(len) + 'd'
         if len is None:
             len = self.read8()
         if len == 0:
             return ""
-        return np.fromfile(self.file, count=1, dtype=('S%d' % len))[0]
+        buffer = file.read(8)
+        return struct.unpack(read_format, buffer[0:(0 + struct.calcsize(read_format))])
 
     def readvar(self, size = None):
         if size is None:
@@ -303,8 +308,10 @@ class VMSDFieldInt(VMSDFieldGeneric):
 
     def read(self):
         super(VMSDFieldInt, self).read()
-        self.sdata = np.fromstring(self.data, count=1, dtype=(self.sdtype))[0]
-        self.udata = np.fromstring(self.data, count=1, dtype=(self.udtype))[0]
+        buffer = file.read(self.data)
+        read_format = self.sdtype
+        self.sdata = struct.unpack(self.sdtype, buffer[0:(0 + struct.calcsize(self.sdtype))])
+        self.sdata = struct.unpack(self.udtype, buffer[0:(0 + struct.calcsize(self.udtype))])
         self.data = self.sdata
         return self.data
 
-- 
2.7.4


Re: [Qemu-devel] [RFC PATCH] remove numpy dependency
Posted by no-reply@patchew.org 6 years, 4 months ago
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC PATCH] remove numpy dependency
Type: series
Message-id: 1509986155-4735-1-git-send-email-nanjekyejoannah@gmail.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
d0c6cb1cd0 remove numpy dependency

=== OUTPUT BEGIN ===
Checking PATCH 1/1: remove numpy dependency...
WARNING: line over 80 characters
#58: FILE: scripts/analyze-migration.py:60:
+        return struct.unpack(read_format, buffer[0:(0 + struct.calcsize(read_format))])

ERROR: line over 90 characters
#70: FILE: scripts/analyze-migration.py:313:
+        self.sdata = struct.unpack(self.sdtype, buffer[0:(0 + struct.calcsize(self.sdtype))])

ERROR: line over 90 characters
#71: FILE: scripts/analyze-migration.py:314:
+        self.sdata = struct.unpack(self.udtype, buffer[0:(0 + struct.calcsize(self.udtype))])

total: 2 errors, 1 warnings, 53 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org
Re: [Qemu-devel] [RFC PATCH] remove numpy dependency
Posted by Stefan Hajnoczi 6 years, 4 months ago
On Mon, Nov 06, 2017 at 07:35:55PM +0300, Joannah Nanjekye wrote:
> Users tend to hit an ImportError when running analyze-migration.py due
> to the numpy dependency.  numpy functionality isn't actually used, just
> binary serialization that the standard library 'struct' module already
> provides.  Removing the dependency allows the script to run
> out-of-the-box.
> 
> Signed-off-by: Joannah Nanjekye <nanjekyejoannah@gmail.com>
> ---
>  scripts/analyze-migration.py | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
> index 1455387..6175c99 100755
> --- a/scripts/analyze-migration.py
> +++ b/scripts/analyze-migration.py
> @@ -17,7 +17,6 @@
>  # You should have received a copy of the GNU Lesser General Public
>  # License along with this library; if not, see <http://www.gnu.org/licenses/>.
>  
> -import numpy as np
>  import json
>  import os
>  import argparse
> @@ -36,23 +35,29 @@ class MigrationFile(object):
>          self.file = open(self.filename, "rb")
>  
>      def read64(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i8')[0])

dtype='>i8' is a 64-bit (8 bytes) big-endian signed integer according to
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html#arrays-dtypes-constructing.

> +        buffer = file.read(64)

This reads 64 bytes (not bits!).  It should read 8 bytes.

> +        return struct.unpack('>i16', buffer)[0]

This should be '>q' according to
https://docs.python.org/2.7/library/struct.html#format-strings.

>  
>      def read32(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i4')[0])
> +        buffer = file.read(32)

read(4)

> +        return struct.unpack('>i8', buffer)[0]

'>i'

>  
>      def read16(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i2')[0])
> +        buffer = file.read(16)

read(2)

> +        return struct.unpack('>i4', buffer)[0]

'>h'

>  
>      def read8(self):
> -        return np.asscalar(np.fromfile(self.file, count=1, dtype='>i1')[0])
> +        buffer = file.read(8)

read(1)

> +        return struct.unpack('>i2', buffer)[0]

'b'

>  
>      def readstr(self, len = None):
> +        read_format = str(len) + 'd'
>          if len is None:
>              len = self.read8()
>          if len == 0:
>              return ""
> -        return np.fromfile(self.file, count=1, dtype=('S%d' % len))[0]
> +        buffer = file.read(8)
> +        return struct.unpack(read_format, buffer[0:(0 + struct.calcsize(read_format))])

According to the numpy documentation 'S' produces raw bytes (not a
unicode string).  To get the raw bytes we just need file.read(len).  The
struct module isn't needed.

Something like this should work:

  if len is None:
      len = self.read8()
  if len == 0:
      return ""
  return file.read(len).split(b'\0')[0]

>  
>      def readvar(self, size = None):
>          if size is None:
> @@ -303,8 +308,10 @@ class VMSDFieldInt(VMSDFieldGeneric):
>  
>      def read(self):
>          super(VMSDFieldInt, self).read()
> -        self.sdata = np.fromstring(self.data, count=1, dtype=(self.sdtype))[0]
> -        self.udata = np.fromstring(self.data, count=1, dtype=(self.udtype))[0]
> +        buffer = file.read(self.data)

This statement doesn't make sense.  According to the Python
documentation:

  read([size]) -> read at most size bytes, returned as a string.

self.data *is* the buffer.  There's no need to call file.read().

> +        read_format = self.sdtype

Unused.

> +        self.sdata = struct.unpack(self.sdtype, buffer[0:(0 + struct.calcsize(self.sdtype))])

struct.calcsize() is unnecessary since self.size already contains the
size in bytes.

struct.unpack returns a tuple and we need the first element, so it
should be:

  self.sdata = struct.unpack(...)[0]

> +        self.sdata = struct.unpack(self.udtype, buffer[0:(0 + struct.calcsize(self.udtype))])

The data type specifiers are incorrect because the struct module uses
different syntax than numpy:

  self.sdtype = '>i%d' % self.size
  self.udtype = '>u%d' % self.size

The struct data type specifiers can be looked up like this:

  sdtypes = {1: 'b', 2: '>h', 4: '>i', 8: '>q'}
  udtypes = {1: 'B', 2: '>H', 4: '>I', 8: '>Q'}
  self.sdtype = sdtypes[self.size]
  self.udtype = udtypes[self.size]