This reference design can be run on a Ryzen™ AI NPU.
In the design, a 2-D array stored in row-major layout is read from external memory into ComputeTile2 in a transposed layout, using an implicit copy performed by the compute tile's Data Movement Accelerator (DMA). The data is read from and written to external memory through the Shim tile (col, 0).
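To make the idea of a transposed read concrete, here is a minimal sketch in plain Python. It is not the design's code and does not use the project's API. It only models, under stated assumptions, how walking a row-major buffer with (size, stride) pairs yields the elements in transposed order. The helper strided_read, the dimensions M and K, and the buffer contents are all hypothetical and chosen for illustration.

```python
def strided_read(buffer, dims):
    """Yield elements of a flat buffer in the order described by `dims`.

    `dims` is a list of (size, stride) pairs, outermost dimension first.
    This is a hypothetical software model of a strided access pattern,
    not the hardware DMA or its programming interface.
    """
    def walk(offset, level):
        if level == len(dims):
            yield buffer[offset]
            return
        size, stride = dims[level]
        for i in range(size):
            yield from walk(offset + i * stride, level + 1)

    yield from walk(0, 0)


# Hypothetical dimensions: an M x K array stored row-major in a flat buffer.
M, K = 2, 3
source = list(range(M * K))  # rows are [0, 1, 2] and [3, 4, 5]

# Outer loop: K columns, stepping 1 element between columns.
# Inner loop: M rows, stepping K elements (one full row) between rows.
transposed = list(strided_read(source, [(K, 1), (M, K)]))

print(transposed)  # [0, 3, 1, 4, 2, 5], i.e. the K x M transpose in row-major order
```

In this model no element is copied twice and no compute is involved: the reordering comes entirely from the order in which addresses are visited, which is the sense in which the copy above is called implicit.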
The implicit copy is performed using the object_fifo_link operation, which specifies that input data arriving via of_in is forwarded via of_out using the compute tile's DMA. This operation and its functionality are described in more depth in Section-2b of the programming guide.
To compile and run the design on the NPU:
make
make run