HuggingfaceClient.Hub.FileSystem (huggingface_client v0.1.0)

HuggingFace FileSystem — fsspec-compatible interface to Hub repos and Buckets.

Provides filesystem-like operations (ls, glob, info, exists?, read, read_text, write, cp, rm) over hf:// URIs, mirroring the Python HfFileSystem class.

URI format

hf://[repo_type_prefix]repo_id[@revision]/path/in/repo

Where repo_type_prefix is:

  • (none) for models: hf://username/my-model/config.json
  • datasets/ for datasets: hf://datasets/rajpurkar/squad/train.csv
  • spaces/ for Spaces: hf://spaces/my-org/my-space/app.py
  • buckets/ for Buckets: hf://buckets/username/my-bucket/data.bin
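To make the URI grammar above concrete, here is a minimal parsing sketch. `HfUriSketch` and its return shape are hypothetical and not part of huggingface_client. It assumes every repo ID has the form "namespace/name", so single-segment IDs such as "gpt2" are rejected: a pure string parser cannot tell whether the next segment is part of the repo ID or the start of the file path.

```elixir
defmodule HfUriSketch do
  # Hypothetical helper for illustration; not part of huggingface_client.

  # Splits an hf:// URI (or the same path without the scheme) into
  # repo type, repo ID, optional revision, and path inside the repo.
  # Assumes repo IDs always look like "namespace/name".
  def parse(uri) when is_binary(uri) do
    path = String.replace_prefix(uri, "hf://", "")

    # The first segment selects the repo type; no prefix means a model.
    {type, rest} =
      case String.split(path, "/", parts: 2) do
        ["datasets", rest] -> {:dataset, rest}
        ["spaces", rest] -> {:space, rest}
        ["buckets", rest] -> {:bucket, rest}
        _ -> {:model, path}
      end

    case String.split(rest, "/", parts: 3) do
      [namespace, name_and_rev | tail] ->
        # An optional "@revision" suffix is attached to the repo name.
        {name, revision} =
          case String.split(name_and_rev, "@", parts: 2) do
            [name, rev] -> {name, rev}
            [name] -> {name, nil}
          end

        {:ok,
         %{
           type: type,
           repo_id: namespace <> "/" <> name,
           revision: revision,
           path: List.first(tail) || ""
         }}

      _ ->
        {:error, :invalid_uri}
    end
  end
end

HfUriSketch.parse("hf://username/my-model@v1.0/config.json")
# => {:ok, %{type: :model, repo_id: "username/my-model", revision: "v1.0", path: "config.json"}}
```

Revisions that themselves contain a slash are not handled by this sketch.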

Example

fs = HuggingfaceClient.Hub.FileSystem.new(token: "hf_...")

# List files
{:ok, files} = HuggingfaceClient.Hub.FileSystem.ls(fs, "gpt2")
{:ok, files} = HuggingfaceClient.Hub.FileSystem.ls(fs, "datasets/rajpurkar/squad")

# Read a file
{:ok, content} = HuggingfaceClient.Hub.FileSystem.read(fs, "gpt2/config.json")

# Write a file
:ok = HuggingfaceClient.Hub.FileSystem.write(fs, "my-org/my-model/README.md", "# My Model")

# Glob pattern
{:ok, matches} = HuggingfaceClient.Hub.FileSystem.glob(fs, "gpt2/*.json")

# Copy
:ok = HuggingfaceClient.Hub.FileSystem.cp(fs,
  "datasets/rajpurkar/squad/train.csv",
  "buckets/my-user/my-bucket/squad-train.csv"
)

Summary

Functions

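cp(fs, src_path, dst_path, opts \\ [])

Copies a file from one hf:// path to another.

exists?(fs, path)

Returns true if a file exists at the given hf:// path.

glob(fs, pattern, opts \\ [])

Returns paths matching a glob pattern.

info(fs, path, opts \\ [])

Returns metadata about a file or directory at an hf:// path.

ls(fs, path, opts \\ [])

Lists files and directories at an hf:// path.

new(opts \\ [])

Creates a new FileSystem client.

read(fs, path, opts \\ [])

Reads the content of a file at an hf:// path.

read_text(fs, path, opts \\ [])

Reads a file as text (UTF-8).

rm(fs, path, opts \\ [])

Deletes a file at an hf:// path.

write(fs, path, content, opts \\ [])

Writes content to a file at an hf:// path.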

Types

t()

@type t() :: %HuggingfaceClient.Hub.FileSystem{
  endpoint: String.t(),
  skip_cache: boolean(),
  token: String.t() | nil
}

Functions

cp(fs, src_path, dst_path, opts \\ [])

@spec cp(t(), String.t(), String.t(), keyword()) :: :ok | {:error, Exception.t()}

Copies a file from one hf:// path to another.

For bucket destinations, uses the server-side copy API. For repo destinations, downloads then uploads.

Example

:ok = HuggingfaceClient.Hub.FileSystem.cp(fs,
  "datasets/rajpurkar/squad/train.csv",
  "buckets/my-user/my-bucket/squad-train.csv"
)
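The two copy strategies described above can be sketched as follows. `CopySketch` is a hypothetical module, and how the library actually dispatches internally is an assumption here. The read and write steps are passed in as functions so the download-then-upload flow can be exercised without network access.

```elixir
defmodule CopySketch do
  # Hypothetical helper for illustration; not part of huggingface_client.

  # Picks a strategy from the destination prefix, following the rule
  # in the documentation: buckets copy server-side, repos round-trip.
  def strategy(dst_path) do
    case String.replace_prefix(dst_path, "hf://", "") do
      "buckets/" <> _ -> :server_side_copy
      _ -> :download_then_upload
    end
  end

  # Download-then-upload with injected functions:
  #   read.(path)           returns {:ok, binary} or {:error, reason}
  #   write.(path, content) returns :ok or {:error, reason}
  # The first failing step is returned unchanged.
  def download_then_upload(read, write, src_path, dst_path) do
    with {:ok, content} <- read.(src_path),
         :ok <- write.(dst_path, content) do
      :ok
    end
  end
end
```

With a real client, the injected functions could be `&HuggingfaceClient.Hub.FileSystem.read(fs, &1)` and `&HuggingfaceClient.Hub.FileSystem.write(fs, &1, &2)`, which match the signatures documented on this page.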

exists?(fs, path)

@spec exists?(t(), String.t()) :: boolean()

Returns true if a file exists at the given hf:// path.

glob(fs, pattern, opts \\ [])

@spec glob(t(), String.t(), keyword()) ::
  {:ok, [String.t()]} | {:error, Exception.t()}

Returns paths matching a glob pattern.

Example

{:ok, matches} = HuggingfaceClient.Hub.FileSystem.glob(fs, "gpt2/*.json")
{:ok, matches} = HuggingfaceClient.Hub.FileSystem.glob(fs, "datasets/rajpurkar/squad/**/*.csv")
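The two patterns above differ in how far a wildcard can reach. The sketch below shows one common interpretation, assumed here rather than confirmed for this library: `*` and `?` stay within a single path segment, while `**/` spans zero or more directories. `GlobSketch` is a hypothetical module that translates a pattern into a regex and filters a local list of paths.

```elixir
defmodule GlobSketch do
  # Hypothetical helper for illustration; not part of huggingface_client.

  # Compiles a glob pattern into an anchored regex.
  def to_regex(pattern) do
    Regex.compile!("\\A" <> translate(pattern, "") <> "\\z")
  end

  # Keeps only the paths that match the pattern.
  def filter(paths, pattern) do
    regex = to_regex(pattern)
    Enum.filter(paths, &Regex.match?(regex, &1))
  end

  # Clause order matters: "**/" must be tried before "**" and "*".
  defp translate("", acc), do: acc
  defp translate("**/" <> rest, acc), do: translate(rest, acc <> "(?:.*/)?")
  defp translate("**" <> rest, acc), do: translate(rest, acc <> ".*")
  defp translate("*" <> rest, acc), do: translate(rest, acc <> "[^/]*")
  defp translate("?" <> rest, acc), do: translate(rest, acc <> "[^/]")

  # Any other character is matched literally.
  defp translate(<<char::utf8, rest::binary>>, acc) do
    translate(rest, acc <> Regex.escape(<<char::utf8>>))
  end
end

paths = ["gpt2/config.json", "gpt2/onnx/config.json", "gpt2/README.md"]

GlobSketch.filter(paths, "gpt2/*.json")
# => ["gpt2/config.json"]

GlobSketch.filter(paths, "gpt2/**/*.json")
# => ["gpt2/config.json", "gpt2/onnx/config.json"]
```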

info(fs, path, opts \\ [])

@spec info(t(), String.t(), keyword()) :: {:ok, map()} | {:error, Exception.t()}

Returns metadata about a file or directory at an hf:// path.

Example

{:ok, info} = HuggingfaceClient.Hub.FileSystem.info(fs, "gpt2/config.json")
IO.puts("Size: #{info["size"]}")
IO.puts("SHA: #{info["sha"]}")

ls(fs, path, opts \\ [])

@spec ls(t(), String.t(), keyword()) ::
  {:ok, [map() | String.t()]} | {:error, Exception.t()}

Lists files and directories at an hf:// path.

Parameters

  • fs — filesystem client
  • path — hf:// URI or simplified path (e.g. "gpt2" or "gpt2/subfolder")
  • :detail — option passed in opts; if true, return a list of metadata maps; if false, return a list of path strings

Example

{:ok, files} = HuggingfaceClient.Hub.FileSystem.ls(fs, "gpt2")
{:ok, files} = HuggingfaceClient.Hub.FileSystem.ls(fs, "datasets/rajpurkar/squad", detail: true)

new(opts \\ [])

@spec new(keyword()) :: t()

Creates a new FileSystem client.

Options

  • :token — HF access token
  • :endpoint — Hub endpoint (default: https://huggingface.co)
  • :skip_cache — bypass local cache for reads (default: false)
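One way to assemble these options is from environment variables, sketched below. `ClientOptsSketch` is hypothetical, and the variable names `HF_TOKEN` and `HF_ENDPOINT` are assumptions: this page does not say whether the library reads any environment variables itself.

```elixir
defmodule ClientOptsSketch do
  # Hypothetical helper for illustration; not part of huggingface_client.

  @default_endpoint "https://huggingface.co"

  # Builds the keyword list documented for new/1 from a map of
  # environment variables. HF_TOKEN and HF_ENDPOINT are assumed names.
  def from_env(env \\ System.get_env()) do
    [
      token: Map.get(env, "HF_TOKEN"),
      endpoint: Map.get(env, "HF_ENDPOINT", @default_endpoint),
      skip_cache: false
    ]
  end
end

ClientOptsSketch.from_env(%{"HF_TOKEN" => "hf_..."})
# => [token: "hf_...", endpoint: "https://huggingface.co", skip_cache: false]
```

The resulting list could then be passed straight to `HuggingfaceClient.Hub.FileSystem.new/1`.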

read(fs, path, opts \\ [])

@spec read(t(), String.t(), keyword()) :: {:ok, binary()} | {:error, Exception.t()}

Reads the content of a file at an hf:// path.

Example

{:ok, content} = HuggingfaceClient.Hub.FileSystem.read(fs, "gpt2/config.json")
config = Jason.decode!(content)

# Read dataset file
{:ok, csv} = HuggingfaceClient.Hub.FileSystem.read(fs, "datasets/my-user/my-dataset/train.csv")
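The examples above match only on `{:ok, content}`, which raises a `MatchError` if the read fails. Since the spec returns `{:error, Exception.t()}` on failure, both branches can be handled explicitly. The sketch below uses a hypothetical `ReadResultSketch` module and takes the result tuple as input, so it runs without a client.

```elixir
defmodule ReadResultSketch do
  # Hypothetical helper for illustration; not part of huggingface_client.

  # Turns the result of read/3 into a short human-readable summary.
  def describe({:ok, content}) when is_binary(content) do
    "read #{byte_size(content)} bytes"
  end

  def describe({:error, exception}) when is_exception(exception) do
    "read failed: " <> Exception.message(exception)
  end
end

ReadResultSketch.describe({:ok, "abc"})
# => "read 3 bytes"

ReadResultSketch.describe({:error, %RuntimeError{message: "not found"}})
# => "read failed: not found"
```

The `RuntimeError` here is only a stand-in; this page does not name the exception types the library returns.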

read_text(fs, path, opts \\ [])

@spec read_text(t(), String.t(), keyword()) ::
  {:ok, String.t()} | {:error, Exception.t()}

Reads a file as text (UTF-8).

Example

{:ok, text} = HuggingfaceClient.Hub.FileSystem.read_text(fs, "gpt2/README.md")

rm(fs, path, opts \\ [])

@spec rm(t(), String.t(), keyword()) :: :ok | {:error, Exception.t()}

Deletes a file at an hf:// path.

Example

:ok = HuggingfaceClient.Hub.FileSystem.rm(fs, "my-org/my-model/old-weights.bin")
:ok = HuggingfaceClient.Hub.FileSystem.rm(fs, "buckets/my-user/my-bucket/temp.txt")

write(fs, path, content, opts \\ [])

@spec write(t(), String.t(), binary() | String.t(), keyword()) ::
  :ok | {:error, Exception.t()}

Writes content to a file at an hf:// path.

For repos this creates a commit. For buckets this does a direct upload.

Options

  • :commit_message — commit message for repo writes
  • :revision — branch to write to (default: "main")

Example

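# Repo write: creates a commit
:ok = HuggingfaceClient.Hub.FileSystem.write(fs, "my-org/my-model/README.md", "# My Model")

# Bucket write: direct upload (binary_data is a placeholder payload)
binary_data = <<0, 1, 2, 3>>
:ok = HuggingfaceClient.Hub.FileSystem.write(fs, "buckets/my-user/my-bucket/data.bin", binary_data)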