ndarray.nbytes vs sys.getsizeof(ndarray)

Hello all!

I’m working on a NumPy module.
Here’s a snippet of code for some insight:

from sys import getsizeof as sz

import numpy as np

values = [-127, -57, -6, 0, 9, 42, 125]
x = np.array(values, dtype=np.int8)  # 1 byte per element
y = np.array(values)                 # default integer dtype (int64 on this machine)

print("x -> ", x.nbytes, sz(x))
print("y -> ", y.nbytes, sz(y))

Output:

x ->  7 103
y ->  56 152

The NumPy manual says that ndarray.nbytes "does not include memory consumed by non-element attributes of the array object."

How should I interpret the extra memory consumption reported by sys.getsizeof()?
The Python documentation says it returns the size of an object in bytes.

What are these non-element array attributes?

NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.
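As a minimal sketch of the point above (the variable names are only for illustration), np.append does not grow an array in place; it returns a new array with its own buffer. The original array stays alive for as long as something still refers to it.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int8)
b = np.append(a, 4)  # builds and returns a new array; a is left untouched

print(a.size, b.size)          # 3 4
print(b is a)                  # False
print(np.shares_memory(a, b))  # False: b has its own buffer
```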

The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory.
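Because every element has the same size, nbytes is simply size * itemsize. A small sketch using the values from the question; the dtype for y is spelled out as int64 here as an assumption, since the default integer dtype can differ between platforms:

```python
import numpy as np

values = [-127, -57, -6, 0, 9, 42, 125]
x = np.array(values, dtype=np.int8)
y = np.array(values, dtype=np.int64)  # explicit, to avoid platform-dependent defaults

for arr in (x, y):
    # nbytes counts only the element buffer: number of elements * bytes per element
    print(arr.dtype, arr.itemsize, arr.size, arr.nbytes)
# int8 1 7 7
# int64 8 7 56
```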

This explanation from Stack Overflow says that nbytes returns the size of the stored data, while getsizeof returns the size of the whole array object, including its overhead.

Thank you @monorienaghogho for your quick response and your time!
From the above, both x and y are ndarrays.

I came across this while looking for a clarification of Python object overhead. I think it sums things up.

Even an empty ndarray has a fixed baseline size that accounts for the ndarray object itself (much like object metadata), because it is an instance of a class. As the array grows, this overhead stays constant.
Makes sense to me now.
Even sys.getsizeof() doesn’t report the full allocation.

from sys import getsizeof as sz
import numpy as np

values = [-127, -57, -6, 0, 9, 42, 125]

_empty = np.array([])
_empty8 = np.array([], dtype=np.int8)
_one = np.array([1])
_one8 = np.array([1], dtype=np.int8)
_x = np.array(values)
__x = np.array(values, dtype=np.int8)

print("_empty (", _empty.dtype, ") -> ", _empty.nbytes, sz(_empty))
print("_empty8 (", _empty8.dtype, ") -> ", _empty8.nbytes, sz(_empty8))
print("_one (", _one.dtype, ") -> ", _one.nbytes, sz(_one))
print("_one8 (", _one8.dtype, ") -> ", _one8.nbytes, sz(_one8))
print("_x (", _x.dtype, ") -> ", _x.nbytes, sz(_x))
print("__x (", __x.dtype, ") -> ", __x.nbytes, sz(__x))

Output:

_empty ( float64 ) ->  0 96  # 96 bytes of overhead, no data
_empty8 ( int8 ) ->  0 96
_one ( int64 ) ->  8 104  # 96 bytes overhead + 8 bytes data
_one8 ( int8 ) ->  1 97  # 96 bytes overhead + 1 byte data
_x ( int64 ) ->  56 152  # 96 bytes overhead + 56 bytes data (ndarray.nbytes = 7 * 8)
__x ( int8 ) ->  7 103  # 96 bytes overhead + 7 bytes data
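The same check can be written as a subtraction. This is only a sketch: overhead() is a hypothetical helper, and the exact overhead (96 above) may differ with NumPy version and platform. It also shows one case where sys.getsizeof() reports less than the memory in use: a view that does not own its buffer.

```python
import sys

import numpy as np


def overhead(arr):
    """Hypothetical helper: bytes sys.getsizeof reports beyond the element buffer."""
    return sys.getsizeof(arr) - arr.nbytes


# 1-D arrays that own their data: the overhead should not depend on the length.
owned = [np.zeros(n, dtype=np.int64) for n in (0, 1, 7, 1000)]
overheads = [overhead(a) for a in owned]
print(overheads)  # the same number repeated; its value depends on the NumPy build

# A view shares the base array's buffer instead of owning one.
base = np.zeros(1000, dtype=np.int64)
view = base[:]
print(view.flags.owndata)                 # False
print(sys.getsizeof(view) < base.nbytes)  # True: the shared 8000-byte buffer is not counted
```

So for a view, nbytes still reports the size of the elements it exposes, while sys.getsizeof() covers only the array object itself.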