aws/s3/transfer

S3 multipart-upload helper. Splits a buffered body into parts, runs CreateMultipartUpload → UploadPart × N → CompleteMultipartUpload, and best-effort aborts the upload on any failure so dangling uploads don’t accumulate in the bucket (S3 charges storage for the parts of an incomplete multipart upload until it is aborted, and such uploads do not appear in ordinary object listings, so they are easy to overlook).

Two entry points: upload for callers that already have the bytes in a BitArray, and upload_from_stream for callers holding a StreamingBody. The streaming variant rechunks across chunk boundaries so wire-side part sizes follow part_size_bytes rather than the source’s chunking. Both today hold the full body in memory; bounded-memory streaming arrives when StreamingBody grows a lazy Source(...) variant (file handles, generators).

The upload-coordination logic is sequential — parts upload one at a time. Parallel uploads (the bandwidth-saturating common case) want a Task-based fan-out around this helper; building that lives in aws/s3/transfer_parallel.gleam once a use case pins the right concurrency knob.

Types

Errors a multipart upload surfaces. CreateFailed / UploadPartFailed / CompleteFailed wrap the underlying typed S3 error so callers can pattern-match on the wire-side cause (NoSuchBucket, AccessDenied, etc.). UploadPartFailed also records which part number failed so callers know what to retry. MissingUploadId is returned if S3’s CreateMultipartUpload response arrives without an upload_id (this should never happen in production, but the wire type is Option(String), so it is surfaced explicitly rather than asserted). EmptyBody is returned when the body contains no bytes; no request is sent in that case.

pub type Error {
  CreateFailed(cause: s3.CreateMultipartUploadError)
  UploadPartFailed(part_number: Int, cause: s3.UploadPartError)
  CompleteFailed(cause: s3.CompleteMultipartUploadError)
  MissingUploadId
  EmptyBody
}
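
As a rough illustration of matching on these variants, the sketch below turns each error into a log line. UploadError is a hypothetical stand-in for the Error type above: its cause fields are plain strings only so that the example compiles without the s3 error types, and describe is not part of this module.

```gleam
import gleam/int

/// Hypothetical stand-in for transfer.Error. The real type wraps typed
/// s3 errors; String is used here only to keep the sketch self-contained.
pub type UploadError {
  CreateFailed(cause: String)
  UploadPartFailed(part_number: Int, cause: String)
  CompleteFailed(cause: String)
  MissingUploadId
  EmptyBody
}

/// Render an error as a single human-readable line.
pub fn describe(error: UploadError) -> String {
  case error {
    CreateFailed(cause) -> "could not start upload: " <> cause
    UploadPartFailed(part_number, cause) ->
      "part " <> int.to_string(part_number) <> " failed: " <> cause
    CompleteFailed(cause) -> "could not finalise upload: " <> cause
    MissingUploadId -> "S3 returned no upload id"
    EmptyBody -> "nothing to upload: body is empty"
  }
}
```

Against the real type the case expression would have the same shape, with the cause values matched as typed S3 errors rather than strings.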

Per-object metadata and access-control options applied at CreateMultipartUpload time. The S3-facing fields surfaced today are content_type, content_encoding, content_disposition, cache_control, metadata, acl, storage_class, and server_side_encryption; they cover the most common S3 PutObject use cases (HTTP content metadata, ACL, storage tier, SSE). max_concurrency is not sent to S3; it controls how parts are uploaded (see with_max_concurrency). Callers needing more exotic options (object lock, grants, request payer, SSE-C, etc.) should call s3.create_multipart_upload / s3.upload_part / s3.complete_multipart_upload directly until those fields land on UploadOptions.

Construct with default_options() and override individual fields via record-update syntax:

let opts = transfer.UploadOptions(
  ..transfer.default_options(),
  content_type: option.Some("application/json"),
  cache_control: option.Some("max-age=3600"),
)

pub type UploadOptions {
  UploadOptions(
    content_type: option.Option(String),
    content_encoding: option.Option(String),
    content_disposition: option.Option(String),
    cache_control: option.Option(String),
    metadata: option.Option(dict.Dict(String, String)),
    acl: option.Option(s3.ObjectCannedACL),
    storage_class: option.Option(s3.StorageClass),
    server_side_encryption: option.Option(s3.ServerSideEncryption),
    max_concurrency: option.Option(Int),
  )
}

Result of a successful multipart upload. upload_id is exposed so callers can correlate with S3 access logs or with their own audit trail.

pub type UploadResult {
  UploadResult(
    bucket: String,
    key: String,
    upload_id: String,
    parts_uploaded: Int,
  )
}

Values

pub fn default_options() -> UploadOptions

All-None options — what upload / upload_from_stream pass when callers don’t supply their own. Equivalent to using s3.create_multipart_upload with no metadata overrides; S3 applies its bucket-level defaults. max_concurrency: None keeps the sequential coordinator — with_max_concurrency flips it to the parallel path.

pub const default_part_size_bytes: Int

S3’s documented minimum part size (5 MiB) for all parts except the last. Smaller part sizes are rejected with EntityTooSmall at CompleteMultipartUpload time; larger sizes cut down on per-part round trips but raise outstanding-request memory.

pub const max_parts_per_upload: Int

S3’s hard cap on parts per multipart upload (10,000). S3 rejects part numbers above 10,000 with InvalidArgument regardless of total size, so part_size_for scales part_size_bytes up for large totals to stay inside this limit.

pub fn part_size_for(total_bytes: Int) -> Int

Pick a part size large enough to fit total_bytes inside S3’s 10,000-parts-per-upload cap. Always returns at least default_part_size_bytes (5 MiB, the S3 minimum). Use this to drive upload / upload_from_stream when the body could be arbitrarily large: up to 5 MiB × 10,000 = 52,428,800,000 bytes (about 48.8 GiB, roughly 50 GB) it returns the 5 MiB default; past that it scales up so the part count stays at or under 10,000.

For zero or negative total_bytes the helper returns the default — callers that don’t know the size up front can pass 0 and accept the 5 MiB part size until they have a better estimate.
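
The behaviour described above can be approximated with a ceiling division floored at the minimum part size. This is a minimal sketch under that assumption, not the module’s actual implementation; part_size_sketch, min_part_size_bytes, and max_parts are hypothetical names chosen for the example.

```gleam
import gleam/int

/// 5 MiB, S3's minimum size for every part except the last.
pub const min_part_size_bytes: Int = 5_242_880

/// S3's cap on the number of parts in one multipart upload.
pub const max_parts: Int = 10_000

/// Smallest part size that keeps the part count at or under max_parts,
/// never going below the S3 minimum. Unknown sizes (<= 0) get the minimum.
pub fn part_size_sketch(total_bytes: Int) -> Int {
  case total_bytes <= 0 {
    True -> min_part_size_bytes
    False -> {
      // Ceiling division: (a + b - 1) / b for positive a and b.
      let needed = { total_bytes + max_parts - 1 } / max_parts
      int.max(needed, min_part_size_bytes)
    }
  }
}
```

With these constants, a body of exactly 52,428,800,000 bytes still fits in 10,000 parts of 5 MiB, and one extra byte pushes the part size up to 5,242,881 bytes.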

pub fn upload(
  client client: s3.Client,
  bucket bucket: String,
  key key: String,
  body body: BitArray,
  part_size_bytes part_size_bytes: Int,
) -> Result(UploadResult, Error)

Upload body as bucket/key via S3’s multipart API. Splits the body into parts of part_size_bytes (the last part may be smaller), uploads each, then finalises with CompleteMultipartUpload.

Any failure mid-flight triggers a best-effort AbortMultipartUpload so the bucket doesn’t accumulate dangling uploads. The abort’s own success / failure is intentionally silenced — the caller already has the more interesting error from the step that failed.
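
The best-effort abort described above can be pictured as a small wrapper around each step. The sketch below is illustrative only: with_abort_on_error is a hypothetical helper, and the step and abort callbacks stand in for the real S3 calls.

```gleam
/// Run `step`; if it fails, call `abort` for cleanup, discard whatever the
/// abort returns, and hand back the step's original error.
pub fn with_abort_on_error(
  step: fn() -> Result(a, e),
  abort: fn() -> Result(Nil, abort_error),
) -> Result(a, e) {
  case step() {
    Ok(value) -> Ok(value)
    Error(reason) -> {
      // Best-effort: the abort outcome is deliberately ignored.
      let _ = abort()
      Error(reason)
    }
  }
}
```

Because the abort result is dropped, a failed cleanup never masks the error that triggered it.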

An empty body returns Error(EmptyBody); S3 rejects empty multipart uploads with EntityTooSmall, so we short-circuit before the create round trip.
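
To make the splitting rule concrete, here is a minimal sketch of cutting a BitArray into fixed-size parts with a shorter final part. split_parts and do_split are hypothetical names, and the sketch returns an empty list for an empty body where the real helper returns Error(EmptyBody).

```gleam
import gleam/bit_array
import gleam/int
import gleam/list

/// Split `body` into consecutive slices of `part_size` bytes; the final
/// slice holds whatever remains. Returns [] for an empty body or a
/// non-positive part size.
pub fn split_parts(body: BitArray, part_size: Int) -> List(BitArray) {
  case part_size <= 0 {
    True -> []
    False -> do_split(body, part_size, 0, [])
  }
}

fn do_split(
  body: BitArray,
  part_size: Int,
  offset: Int,
  acc: List(BitArray),
) -> List(BitArray) {
  let remaining = bit_array.byte_size(body) - offset
  case remaining <= 0 {
    True -> list.reverse(acc)
    False -> {
      let take = int.min(part_size, remaining)
      // slice only fails for an out-of-bounds range, which the min above
      // rules out.
      let assert Ok(part) = bit_array.slice(from: body, at: offset, take: take)
      do_split(body, part_size, offset + take, [part, ..acc])
    }
  }
}
```

The position of each slice in the returned list, counted from 1, corresponds to the part number it would be uploaded under.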

pub fn upload_from_stream(
  client client: s3.Client,
  bucket bucket: String,
  key key: String,
  body body: streaming.StreamingBody,
  part_size_bytes part_size_bytes: Int,
) -> Result(UploadResult, Error)

Same as upload, but takes a StreamingBody instead of a buffered BitArray. Walks the body’s chunks once, re-aggregating across chunk boundaries so the wire-side part sizes follow part_size_bytes rather than the source’s chunking. This is useful when the body comes from a chunked transport or builder that emits frequent small chunks (request streaming, log ingestion, line-oriented producers).

Today both StreamingBody representations (Buffered / Chunked) hold their full bytes in memory, so this variant doesn’t yet reduce peak memory vs upload(buffer_to_bit_array(body), ...). Once StreamingBody grows a lazy Source(...) variant (file handles, generators), this path picks up true bounded-memory streaming for free.
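
The re-aggregation across chunk boundaries can be sketched as a fold that carries leftover bytes from one chunk into the next. This is an illustration under simplifying assumptions: the source is modelled as a plain List(BitArray) rather than a StreamingBody, rechunk and drain are hypothetical names, and the repeated appends favour clarity over efficiency.

```gleam
import gleam/bit_array
import gleam/list

/// Regroup arbitrarily sized chunks into parts of exactly `part_size`
/// bytes, with any leftover bytes emitted as a shorter final part.
pub fn rechunk(chunks: List(BitArray), part_size: Int) -> List(BitArray) {
  case part_size <= 0 {
    True -> []
    False -> {
      // State: #(bytes not yet emitted, finished parts in reverse order).
      let #(pending, done) =
        list.fold(chunks, #(<<>>, []), fn(state, chunk) {
          let #(pending, done) = state
          drain(bit_array.append(pending, chunk), part_size, done)
        })
      case bit_array.byte_size(pending) {
        0 -> list.reverse(done)
        _ -> list.reverse([pending, ..done])
      }
    }
  }
}

/// Emit full parts from the front of `buffer` until less than one part
/// remains, then return the remainder together with the parts so far.
fn drain(
  buffer: BitArray,
  part_size: Int,
  done: List(BitArray),
) -> #(BitArray, List(BitArray)) {
  let size = bit_array.byte_size(buffer)
  case size >= part_size {
    False -> #(buffer, done)
    True -> {
      let assert Ok(part) = bit_array.slice(buffer, 0, part_size)
      let assert Ok(rest) = bit_array.slice(buffer, part_size, size - part_size)
      drain(rest, part_size, [part, ..done])
    }
  }
}
```

Only the bytes that do not yet fill a part are carried between chunks, which is the property a lazy source would need for bounded-memory streaming.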

pub fn upload_from_stream_with_options(
  client client: s3.Client,
  bucket bucket: String,
  key key: String,
  body body: streaming.StreamingBody,
  part_size_bytes part_size_bytes: Int,
  options options: UploadOptions,
) -> Result(UploadResult, Error)

upload_from_stream with caller-specified per-object metadata. See UploadOptions.

pub fn upload_with_options(
  client client: s3.Client,
  bucket bucket: String,
  key key: String,
  body body: BitArray,
  part_size_bytes part_size_bytes: Int,
  options options: UploadOptions,
) -> Result(UploadResult, Error)

upload with caller-specified per-object metadata — sets HTTP content metadata (Content-Type, Cache-Control, etc.), the optional ACL / storage class / SSE choice, and any user metadata at CreateMultipartUpload time. See UploadOptions for the field set.

pub fn with_max_concurrency(
  opts: UploadOptions,
  n: Int,
) -> UploadOptions

Override the parallel-upload concurrency cap on an existing UploadOptions. n must be ≥ 1 — values ≤ 0 are coerced to sequential (None) so callers can pass a derived count without guarding it.
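
The coercion rule can be sketched as a single case expression. normalise_concurrency is a hypothetical name, and the sketch assumes that any n ≥ 1 is stored unchanged as Some(n), which the description above implies but does not state outright.

```gleam
import gleam/option.{type Option, None, Some}

/// Map a requested concurrency to the stored option: non-positive values
/// fall back to sequential (None), anything else is kept as Some(n).
pub fn normalise_concurrency(n: Int) -> Option(Int) {
  case n <= 0 {
    True -> None
    False -> Some(n)
  }
}
```

This lets a caller pass a derived value, such as a core count minus some reserve, without first checking that it is positive.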
