Zap v0.1.0

Native ZIP archive creation with chunked input and output.

Erlang/OTP provides the powerful :zip and :zlib modules, but they can only create an archive all at once. That requires all of the data to be kept in memory or written to disk. What if you don't have enough space to keep the file in memory or on disk? With Zap you can add files one at a time while writing chunks of data at the same time.
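
For contrast, the all-at-once approach can be sketched with Erlang's :zip module as follows. This is a minimal illustration that does not use Zap; the file names and contents are made up, and the whole archive ends up as one in-memory binary.

```elixir
# Build a small archive entirely in memory with Erlang's :zip module.
# File names are charlists; the names and contents here are invented.
files = [
  {~c"a.txt", "aaaa"},
  {~c"b.txt", "bbbb"}
]

# With the :memory option the finished archive is returned as one binary,
# so every entry must be available before anything can be written out.
{:ok, {_name, archive}} = :zip.create(~c"archive.zip", files, [:memory])

# Read the entry names back out of the in-memory archive. The listing
# starts with a comment record, followed by one :zip_file record per entry.
{:ok, listing} = :zip.table(archive)
names = for entry <- listing, elem(entry, 0) == :zip_file, do: elem(entry, 1)
```

Here `archive` holds the complete ZIP, which is exactly the memory cost that incremental flushing is meant to avoid.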

Examples

Create a ZIP by adding a single entry at a time:

iodata =
  Zap.new()
  |> Zap.entry("a.txt", a_binary)
  |> Zap.entry("b.txt", some_iodata)
  |> Zap.entry("c.txt", more_iodata)
  |> Zap.to_iodata()

File.write!("archive.zip", iodata, [:binary, :raw])

Use Enum.into/2, supported through the Collectable protocol, to build a ZIP dynamically:

iodata =
  "*.*"
  |> Path.wildcard()
  |> Enum.map(fn path -> {Path.basename(path), File.read!(path)} end)
  |> Enum.into(Zap.new())
  |> Zap.to_iodata()

File.write!("files.zip", iodata, [:binary, :raw])

Use Zap.into_stream/2 to incrementally build a ZIP by chunking files into an archive:

one_mb = 1024 * 1024

write_fun = &File.write!("streamed.zip", &1, [:append, :binary, :raw])

file_list
|> Stream.map(fn path -> {Path.basename(path), File.read!(path)} end)
|> Zap.into_stream(one_mb)
|> Stream.each(write_fun)
|> Stream.run()

Glossary

The entry and header bytes are composed according to the ZIP specification provided by PKWARE. Some helpful terms that you may encounter in the function documentation:

  • LFH (Local File Header): Included before each file in the archive. The header contains details about the name and size of the entry.
  • CDH (Central Directory Header): Placed at the end of an archive. It contains summary information about the files contained within the archive.
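
To make the LFH more concrete, here is a minimal sketch of how a Local File Header for an uncompressed entry could be encoded with Elixir's binary syntax, following the field order in the ZIP specification. The module `HeaderSketch` and its function are hypothetical and are not Zap's implementation; Zap may set flags, timestamps, or extra fields differently.

```elixir
defmodule HeaderSketch do
  # Hypothetical helper, not part of Zap: encode a Local File Header for a
  # "stored" (uncompressed) entry. All multi-byte fields are little-endian.
  def local_file_header(name, data) when is_binary(name) and is_binary(data) do
    size = byte_size(data)

    <<
      # LFH signature, "PK" followed by bytes 3 and 4 on disk
      0x04034B50::little-32,
      # Version needed to extract (2.0)
      20::little-16,
      # General purpose bit flags (none set in this sketch)
      0::little-16,
      # Compression method: 0 means stored
      0::little-16,
      # Last modified time and date in MS-DOS format (00:00:00, 1980-01-01)
      0::little-16,
      0x21::little-16,
      # CRC-32 of the uncompressed data
      :erlang.crc32(data)::little-32,
      # Compressed and uncompressed sizes (equal for stored entries)
      size::little-32,
      size::little-32,
      # File name length and extra field length
      byte_size(name)::little-16,
      0::little-16,
      name::binary
    >>
  end
end

header = HeaderSketch.local_file_header("a.txt", "a")
```

The fixed part of the header is 30 bytes, so `header` here is 30 bytes plus the length of the name. The entry data would follow immediately after it.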

Summary

Functions

bytes(zap)

Check the total number of un-flushed bytes available.

entry(zap, name, data)

Add a named entry to a zap struct.

final(zap)

Generate the final CDH (Central Directory Header), required to complete an archive.

flush(zap, bytes \\ :all)

Flush a fixed number of bytes from the stored entries.

into_stream(enum, chunk_size \\ 1024 * 1024)

Stream an enumerable of name/data pairs into a zip structure and emit chunks of zip data.

new()

Initialize a new Zap struct.

to_iodata(zap)

Output a complete iolist of data from a Zap struct.

Types

t()

t() :: %Zap{entries: [Zap.Entry.t()]}

Functions

bytes(zap)

(since 0.1.0)
bytes(zap :: t()) :: non_neg_integer()

Check the total number of un-flushed bytes available.

Example

iex> Zap.bytes(Zap.new())
0

iex> Zap.new()
...> |> Zap.entry("a.txt", "a")
...> |> Zap.bytes()
52

iex> zap = Zap.new()
...> zap = Zap.entry(zap, "a.txt", "a")
...> {zap, _bytes} = Zap.flush(zap, :all)
...> Zap.bytes(zap)
0

entry(zap, name, data)

(since 0.1.0)
entry(zap :: t(), name :: binary(), data :: binary()) :: t()

Add a named entry to a zap struct.

Example

Zap.new()
|> Zap.entry("a.jpg", jpg_data)
|> Zap.entry("b.png", png_data)

final(zap)

(since 0.1.0)
final(zap :: t()) :: {t(), iodata()}

Generate the final CDH (Central Directory Header), required to complete an archive.

flush(zap, bytes \\ :all)

(since 0.1.0)
flush(zap :: t(), bytes :: pos_integer() | :all) :: {t(), iodata()}

Flush a fixed number of bytes from the stored entries.

Flushing is stateful, meaning the same data won't be flushed on successive calls.
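
The stateful pattern can be illustrated without Zap. The sketch below uses a hypothetical `FlushSketch` module operating on a plain binary buffer; it only mirrors the `{state, output}` return shape from the spec above and makes no claim about how Zap tracks flushed bytes internally.

```elixir
defmodule FlushSketch do
  # Hypothetical stand-in for a buffer of pending archive bytes; not part of
  # Zap. Returns {remaining_buffer, flushed_bytes}.
  def flush(buffer, :all) when is_binary(buffer), do: {<<>>, buffer}

  def flush(buffer, count) when is_binary(buffer) and is_integer(count) and count > 0 do
    # Never take more than the buffer currently holds.
    take = min(count, byte_size(buffer))
    <<out::binary-size(take), rest::binary>> = buffer
    {rest, out}
  end
end

buffer = "aaaabbbb"

# Each call returns the updated state, which must be threaded into the next.
{buffer, first} = FlushSketch.flush(buffer, 4)
{buffer, second} = FlushSketch.flush(buffer, :all)
{_buffer, third} = FlushSketch.flush(buffer, :all)
```

Because the returned state replaces the old one, `first` is "aaaa", `second` is "bbbb", and `third` is empty: nothing is emitted twice.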

Example

iex> Zap.new()
...> |> Zap.entry("a.txt", "aaaa")
...> |> Zap.entry("b.txt", "bbbb")
...> |> Zap.flush()
...> |> elem(1)
...> |> IO.iodata_length()
110

into_stream(enum, chunk_size \\ 1024 * 1024)

(since 0.1.0)
into_stream(enum :: Enumerable.t(), chunk_size :: pos_integer()) ::
  Enumerable.t()

Stream an enumerable of name/data pairs into a zip structure and emit chunks of zip data.

Each emitted chunk will be at least chunk_size bytes, but a chunk may be much larger. The last emitted chunk automatically includes the central directory header, the closing set of bytes.
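
The "at least chunk_size" behaviour can be sketched with Stream.chunk_while/4 on plain binaries. The `ChunkSketch` module below is hypothetical and is not Zap's implementation; it only shows why a chunk can overshoot the requested size: data is emitted once the buffered total reaches the threshold, whatever the size of the piece that crossed it.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper, not part of Zap: regroup a stream of binaries so
  # that each emitted chunk holds at least `chunk_size` bytes.
  def at_least(enum, chunk_size) when is_integer(chunk_size) and chunk_size > 0 do
    Stream.chunk_while(
      enum,
      {[], 0},
      fn data, {acc, size} ->
        acc = [acc, data]
        size = size + byte_size(data)

        if size >= chunk_size do
          # Enough bytes buffered: emit everything and start over.
          {:cont, acc, {[], 0}}
        else
          {:cont, {acc, size}}
        end
      end,
      fn
        # Nothing left over once the input is exhausted.
        {_acc, 0} -> {:cont, {[], 0}}
        # Emit the remainder as a final chunk.
        {acc, _size} -> {:cont, acc, {[], 0}}
      end
    )
  end
end

sizes =
  ["aaaa", "bb", "cccccccccccc", "d"]
  |> ChunkSketch.at_least(4)
  |> Enum.map(&IO.iodata_length/1)
```

With a threshold of 4 bytes, `sizes` is `[4, 14, 1]`: the second chunk overshoots because a 12-byte piece arrived while 2 bytes were already buffered, and the remainder is emitted when the input ends.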

Example

iex> %{"a.txt" => "aaaa", "b.txt" => "bbbb"}
...> |> Zap.into_stream(8)
...> |> Enum.to_list()
...> |> IO.iodata_to_binary()
...> |> :zip.table()
...> |> elem(0)
:ok

new()

(since 0.1.0)
new() :: t()

Initialize a new Zap struct.

The struct is used to accumulate entries, which can then be flushed as parts of a zip file.

to_iodata(zap)

(since 0.1.0)
to_iodata(zap :: t()) :: iolist()

Output a complete iolist of data from a Zap struct.

This is a convenient way of combining the output from flush/1 and final/1.

Unlike flush/2 and final/1, which return the updated zap struct alongside their output, this function returns only the iodata, as the spec and the example show.

Example

iex> Zap.new()
...> |> Zap.entry("a.txt", "aaaa")
...> |> Zap.entry("b.txt", "bbbb")
...> |> Zap.to_iodata()
...> |> IO.iodata_length()
248