KernelIntrinsics API #635
Your PR requires formatting changes to meet the project's style guidelines. Click here to view the suggested changes.
diff --git a/src/intrinsics.jl b/src/intrinsics.jl
index e0b2c46a..2025bf5e 100644
--- a/src/intrinsics.jl
+++ b/src/intrinsics.jl
@@ -159,7 +159,6 @@ end
function _print end
-
"""
Kernel{Backend, Kern}
c16a665 to b166baa
Can you rebase?
b166baa to 928e6fd
928e6fd to cd7476e
@christiangnrd do you think we need a lower-level kernel launch interface? Otherwise the three-dimensional indices would be superfluous.
cd7476e to 84d0c68
I'd been thinking about that and I think so. Would something that, assuming you wrote the whole kernel with
Modified from initial Claude code
vchuravy
left a comment
Nice! Maybe we can also add some simple shuffle operations?
"""
    barrier()

After a `barrier()` call, all reads and writes to global and local memory
from each thread in the workgroup are visible from all other threads in the
workgroup.

!!! note
    `barrier()` must be encountered by all workitems of a work-group executing the kernel or by none at all.

!!! note
    Backend implementations **must** implement:
    ```
    @device_override barrier()
    ```
"""
function barrier()
    error("Group barrier used outside kernel or not captured")
end
We should specify the memory semantics that are required by the barrier implementation
Do you mean that we should specify the semantics in the error message, or that the semantics described in the docstring should be more specific?
The semantics in the docstring should be more specific.
src/KernelAbstractions.jl
Outdated
    return quote
-        $SharedMemory($(esc(T)), Val($(esc(dims))), Val($(QuoteNode(id))))
+        $SharedMemory($(esc(T)), Val($(esc(dims))))
This is technically an ABI break, which I had avoided so far.
Is ca11482 sufficient or do we still need to require backends to take in a third unused argument on the KI side?
@@ -0,0 +1,342 @@
module KernelIntrinsics
Can you maybe add a short module-level docstring elaborating on the positioning of KernelIntrinsics vs. the KernelAbstractions intrinsics? AFAIU, everything here operates in the launch domain, whereas KA.jl returns positions in the domain of the array or problem being launched over (i.e. ndrange).
I've thrown up something. Very open to feedback
I'm thinking we should either directly rename
I am okay with providing the alias!
One additional thing to consider is which intrinsics we need to implement something like #559. Can we add some primitive shuffles? Or add a test kernel that implements a reduction correctly?
The goal is to allow kernels to be written without relying on KernelAbstractions macros.
See #562 for the initial discussion.
@vchuravy @maleadt