Compute back ends.
Compute back ends for task execution on heterogeneous compute devices.
- OpenCLBackend: main/default OpenCL 1/2 back end.
- Pseudocomp: fake back end for testing.
- An absent back end is also supported, in which case native CPU code is used.
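The back end is selected when the engine is initialized, before any nngn.compute call is made. The sketch below only illustrates the idea of preferring OpenCL and falling back to the testing back end; the set_compute entry point and the OPENCL_BACKEND/PSEUDOCOMP constants are assumptions used for illustration, not taken from this page:

-- Hypothetical initialization: prefer OpenCL, fall back to the fake back end.
if not nngn:set_compute(Compute.OPENCL_BACKEND, nil) then
    nngn:set_compute(Compute.PSEUDOCOMP, nil)
end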
The following area of the screenshots page shows some of the compute capabilities:
https://bbguimaraes.com/nngn/screenshots/compute.html
Lua
The compute back end is exposed to Lua via the nngn.compute variable. Compiling a program and executing a kernel is done similarly to the C++ API:
prog = nngn.compute:create_prog(io.open("prog.cl"):read("a"))
nngn.compute:execute(prog, "fn", Compute.BLOCKING, {256, 256}, {16, 16}, {
    Compute.FLOAT, 3.1415,
    Compute.BYTEV, {0, 1, 2, 3},
})
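The program source passed to create_prog is ordinary OpenCL C. A hypothetical prog.cl matching the call above (a kernel named fn taking the float argument and the four-byte vector, here assumed to arrive as a global uchar pointer) could look like the following, shown embedded in a Lua string rather than loaded from a file:

src = [[
kernel void fn(float f, global uchar *v) {
    const size_t i = get_global_id(1) * get_global_size(0) + get_global_id(0);
    // Only the first four work items of the 256x256 range touch the argument.
    if(i < 4)
        v[i] = v[i] + (uchar)f;
}
]]
prog = nngn.compute:create_prog(src)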
Compute.BLOCKING makes the execution synchronous. Kernels can also be executed concurrently (requires a device with an out-of-order queue) and synchronized using events:
events = {}
-- Last two arguments are wait list and output events.
nngn.compute:execute(prog, "k0", 0, {1}, {1}, {}, nil, events)
-- Reuse output.
nngn.compute:execute(prog, "k1", 0, {1}, {1}, {}, nil, events)
-- Wait for all previous events, reuse output.
nngn.compute:execute(prog, "k2", 0, {1}, {1}, {}, events, events)
-- Wait for all previous events and release them.
nngn.compute:wait(events)
nngn.compute:release_events(events)
Separate tables can be used to construct more complex task graphs.
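As a minimal sketch building on the calls above, two independent kernels can each record into their own event table and a third kernel can wait on both; copying the entries of the two tables into a single wait list is an assumption about how event values can be combined on the Lua side:

a, b, out = {}, {}, {}
-- Two independent kernels, each recording its completion event.
nngn.compute:execute(prog, "k0", 0, {1}, {1}, {}, nil, a)
nngn.compute:execute(prog, "k1", 0, {1}, {1}, {}, nil, b)
-- Combine both event lists into one wait list for the final kernel.
deps = {}
for _, e in ipairs(a) do table.insert(deps, e) end
for _, e in ipairs(b) do table.insert(deps, e) end
nngn.compute:execute(prog, "k2", 0, {1}, {1}, {}, deps, out)
-- Wait for the final kernel, then release all events.
nngn.compute:wait(out)
nngn.compute:release_events(deps)
nngn.compute:release_events(out)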
Buffers
To create and populate a device buffer:
v = Compute.create_vector(size)
Compute.fill_rnd_vector(v)
b = nngn.compute:create_buffer(Compute.READ_WRITE, Compute.FLOATV, size)
nngn.compute:write_buffer(b, 0, size, Compute.FLOATV, v)
Common operations on buffers are creation:
rw = Compute.READ_WRITE
b = nngn.compute:create_buffer(rw, Compute.FLOATV, size)
-- From existing data.
b = nngn.compute:create_buffer(
    rw, Compute.FLOATV, size,
    Compute.create_vector(1024))
write:

nngn.compute:write_buffer(b, off, n, Compute.FLOATV, v)
and read:

nngn.compute:read_buffer(b, Compute.FLOATV, n, v)
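As a usage example that only reuses the calls shown above, a simple round trip writes host data to a device buffer and reads it back into a second host vector (size follows the same convention as the earlier snippets):

v = Compute.create_vector(size)
Compute.fill_rnd_vector(v)
b = nngn.compute:create_buffer(Compute.READ_WRITE, Compute.FLOATV, size)
nngn.compute:write_buffer(b, 0, size, Compute.FLOATV, v)
-- Read the device contents back into a second host vector.
out = Compute.create_vector(size)
nngn.compute:read_buffer(b, Compute.FLOATV, size, out)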