HuggingFace Kernels API — load and run custom compute kernels from the Hub.
Kernels are optimized computation primitives (GPU kernels, custom ops) shared on the Hub. They can be loaded and used in training/inference pipelines.
See: https://huggingface.co/docs/kernels
Example
# List available kernels
{:ok, kernels} = HuggingfaceClient.list_kernels(access_token: token)
# Get kernel info
{:ok, kernel} = HuggingfaceClient.kernel_info(
"kernels/flash-attention-2",
access_token: token
)
# Get kernel download URLs
{:ok, files} = HuggingfaceClient.kernel_files(
"kernels/flash-attention-2",
backend: "cuda",
access_token: token
)
Summary
Functions
Returns the download URL for a kernel file.
Returns the files available for a kernel (binaries, headers, etc.).
Gets detailed information about a specific kernel.
Lists available kernels on the Hub.
Searches for kernels compatible with a specific operation or architecture.
Functions
Returns the download URL for a kernel file.
Example
url = HuggingfaceClient.kernel_download_url(
"flash-attention/flash-attention-2-cuda",
filename: "flash_attention_2_cuda.so",
access_token: token
)
@spec files( String.t(), keyword() ) :: {:ok, [map()]} | {:error, Exception.t()}
Returns the files available for a kernel (binaries, headers, etc.).
Options
:backend— filter to specific backend:"cuda","triton","metal":version— specific version/revision (default:"main"):access_token
Example
{:ok, files} = HuggingfaceClient.kernel_files(
"flash-attention/flash-attention-2-cuda",
backend: "cuda",
access_token: token
)
Enum.each(files, fn f -> IO.puts(f["rfilename"]) end)
@spec info( String.t(), keyword() ) :: {:ok, map()} | {:error, Exception.t()}
Gets detailed information about a specific kernel.
Example
{:ok, kernel} = HuggingfaceClient.kernel_info("flash-attention/flash-attention-2-cuda",
access_token: token
)
IO.puts("Supported backends: #{inspect(kernel["tags"])}")
@spec list(keyword()) :: {:ok, [map()]} | {:error, Exception.t()}
Lists available kernels on the Hub.
Options
:search— filter by name:backend— filter by backend:"cuda","triton","metal","cpu":sort— sort field:"downloads","likes","lastModified":limit— max results:access_token
Example
{:ok, kernels} = HuggingfaceClient.list_kernels(
backend: "cuda",
sort: "downloads",
access_token: token
)
Enum.each(kernels, fn k -> IO.puts(k["id"]) end)
@spec search( String.t(), keyword() ) :: {:ok, [map()]} | {:error, Exception.t()}
Searches for kernels compatible with a specific operation or architecture.
Example
{:ok, matches} = HuggingfaceClient.search_kernels("flash attention",
backend: "cuda",
access_token: token
)