Create 2D numpy array from buffer

Problem Description:

Consider a system with n_channels channels, each transmitting n_samples samples at a given sampling rate. The 1D buffer holding the timestamps and the buffer holding the (n_channels, n_samples) data are allocated as follows:

from ctypes import c_double, c_float

# Assume a 2-second window, 3 channels, sampled at 1024 Hz
# data: (n_channels, n_samples) = (3, 2048)
# timestamps: (n_samples,) = (2048,)
n_channels = 3
n_samples = 2048
n_data_values = n_channels * n_samples
data_buffer = (c_float * n_data_values)()
ts_buffer = (c_double * n_samples)() 

I have a C++ binary library that fills the buffer. The function can be summarized as:

from ctypes import byref

fill_buffers(
    byref(data_buffer),
    byref(ts_buffer),
)

At this point, I have 2 filled buffers, one with 2048 elements (timestamps) and one with 3 * 2048 elements (data). I want to load those 2 buffers into numpy arrays as efficiently as possible.

np.frombuffer seems great for reading a 1D array, e.g. the timestamps, but I can't find a counterpart for N-dimensional arrays.

import numpy as np

# read from buffer for the 1D array
timestamps = np.frombuffer(ts_buffer)  # 192 ns ± 1.11 ns per loop
timestamps = np.array(ts_buffer)  # 854 ns ± 2.99 ns per loop

For now, the data array is loaded with:

data = np.array(data_buffer).reshape(-1, n_channels, order="C").T

Is there a way to use the same efficient method as np.frombuffer while specifying the output shape and the order?


This question is different from "How can I initialize a NumPy array from a multidimensional buffer?" and from "How to restore a 2-dimensional numpy.array from a bytestring?" since it asks not just for an alternative to np.frombuffer, but for one that is equally efficient.


EDIT: Why is np.frombuffer(data_buffer).reshape(-1, n_channels).T not working? With 3 channels and 1024 points (to speed up my testing), I get len(data_buffer) = 3072, but:

np.array(data_buffer).reshape(-1, 3).T.size  # 3072
np.frombuffer(data_buffer).reshape(-1, 3).T.size  # 1536

The application is a LabStreamingLayer buffer. The buffer is filled here https://github.com/labstreaminglayer/liblsl-Python/blob/87276974a311bcf7ceb3383e9d04c6bdcf302771/pylsl/pylsl.py#L854-L861
using the C++ library https://github.com/sccn/liblsl with specifically this function https://github.com/sccn/liblsl/blob/08aa186326e9a339316b7d5677ef31b3651b4aad/src/lsl_inlet_c.cpp#L180-L185

Solution – 1

Does np.frombuffer(data_buffer, dtype=c_float).reshape(-1, n_channels, order="C").T not work correctly? The way you are doing it, np.array also treats the buffer as a 1D array until you reshape it.
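The size mismatch reported in the question's EDIT (1536 instead of 3072) is consistent with np.frombuffer defaulting to dtype=float64 when no dtype is given: the 4-byte floats in the buffer get read as half as many 8-byte values. A minimal sketch of that effect, assuming the same 3-channel, 1024-sample test buffer as in the question:

```python
import numpy as np
from ctypes import c_float

n_channels = 3
n_samples = 1024
data_buffer = (c_float * (n_channels * n_samples))()  # 3072 * 4 = 12288 bytes

# Default dtype is float64: 12288 bytes / 8 bytes per item = 1536 items
as_float64 = np.frombuffer(data_buffer)

# Matching dtype: 12288 bytes / 4 bytes per item = 3072 items
as_float32 = np.frombuffer(data_buffer, dtype=np.float32)

# Reshape to (n_samples, n_channels), then transpose to (n_channels, n_samples)
data = as_float32.reshape(-1, n_channels).T

print(as_float64.size, as_float32.size, data.shape)
```

Passing dtype=c_float or dtype=np.float32 should be equivalent here, since NumPy converts the ctypes type to the corresponding dtype.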

For me, the following code produces the right shapes. (It is hard to verify that it works correctly without an MWE for the data that should be in the buffers.)

import numpy as np
from ctypes import c_double, c_float

# Assume a 2-second window, 3 channels, sampled at 1024 Hz
# data: (n_channels, n_samples) = (3, 2048)
# timestamps: (n_samples,) = (2048,)
n_channels = 3
n_samples = 2048
n_data_values = n_channels * n_samples
data_buffer = (c_float * n_data_values)()  # Note that c_float is typically 32 bits, while c_double and numpy's default float are 64 bits
ts_buffer = (c_double * n_samples)()

# Create a mock buffer

input_data = np.arange(0, n_data_values, dtype=c_float)
input_data_buffer = input_data.tobytes()


timestamps = np.frombuffer(ts_buffer) 

# Note: the data type must be specified explicitly for the array of floats
data = np.frombuffer(input_data_buffer, dtype=c_float).reshape(-1, n_channels, order="C").T
# data has values 0,1,2 for first time point, 3,4,5 for second, and so on
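On the efficiency side of the question: np.frombuffer wraps the existing memory instead of copying it, and both reshape on a contiguous array and .T return views, so the whole chain should involve no copy of the data. The sketch below illustrates this on a ctypes buffer like the one in the question; writing into the buffer by hand stands in for the C++ fill function, which is an assumption made only for the demonstration.

```python
import numpy as np
from ctypes import c_float

n_channels = 3
n_samples = 2048
data_buffer = (c_float * (n_channels * n_samples))()

# Wrap the ctypes buffer without copying, then view it as (n_channels, n_samples)
data = np.frombuffer(data_buffer, dtype=np.float32).reshape(-1, n_channels).T

# Stand-in for the library filling the buffer:
# flat index 4 = sample 1, channel 1 in the interleaved layout
data_buffer[4] = 7.0

# The array sees the change because it shares memory with the buffer
print(data[1, 1])

# Take an explicit copy if the values must survive the next fill
snapshot = data.copy()
```

Because the array is a view, a later call that refills data_buffer also changes data; copying, as in the last line, decouples the two at the cost of one memory copy.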