BulkUpsert (Bulk Upsert v0.2.0)
View SourceBulk upsert a list of Ecto structs and their nested associations in one call.
Summary
Functions
Validate a list of attrs maps (attrs_list) by passing them through an Ecto changeset,
then upsert the valid items to the database that corresponds to a given Ecto repo_module (e.g.
YourProject.Repo).
Functions
Validate a list of attrs maps (attrs_list) by passing them through an Ecto changeset,
then upsert the valid items to the database that corresponds to a given Ecto repo_module (e.g.
YourProject.Repo).
Using a changeset serves two purposes:
- The changeset can be used to validate and transform the data.
- Using a changeset allows this function to perform bulk upserts with nested associations.
For validation, each list item in the attrs_list is converted to a changeset for a given
schema_module. By default, this function expects the schema module to contain a 2-arity
function called :changeset. (See the #options section for more info.)
Basic example
iex> BulkUpsert.bulk_upsert(
...> YourProject.Repo,
...> YourProject.Persons.Person,
...> _attrs_list = [
...> %{id: 1, name: "Alice", age: 25, phone_number: "555-1234"},
...> %{id: 2, name: "Bob", age: 35, phone_number: "555-2345"},
...> ]
...> )
:okOptions
:changeset_function_atom- The name of the 2-arity changeset function to apply for the givenschema_module(Default::changeset):chunk_size- The number of parent attrs items to insert into the database in a single query. Can be increased or decreased as needed to avoid hitting Postgres max item limit for a single query. (Default:1000):insert_all_function_module- Instead of using the:insert_allfunction in the givenrepo_module, you may specify the name of a custom module to use instead. (Default: Inherited from the value specified in therepo_modulefunction argument, e.g.YourProject.Repo))- Example:
YourProject.OtherRepo
- Example:
:insert_all_function_atom- Instead of using your repo module's:insert_allfunction, you may pass a compatible equivalent that accepts the same arguments. (Default::insert_all)- Example:
:insert_all_with_autogenerated_timestamps
- Example:
:insert_all_opts- Pass customoptsto theinsert_all/3function. This option consists of a map whose key is the schema or source that may have items being upserted, and the value is theYourProject.Repo.insert_all/3opts that will be applied when items for that schema are being upserted. By default, this function is configured to replace all values in a given struct, except for the primary key(s) and the insert timestamp. (Default:%{})- Example:
%{YourProject.Persons.Person => [on_conflict: {:nothing}]} - A
many_to_manyjoin table is keyed by its source, e.g.%{"persons_topics" => [...]}.
- Example:
:placeholders- Set fields from shared values that are sent to the database once instead of once per row, using the:placeholdersfeature of Ecto'sinsert_all/3. This option is a map whose key is the schema or source being upserted, and the value is a map offield => value. The fields are set after changeset validation, so they do not need to appear in the attrs. (Default:%{})- Example:
%{YourProject.Persons.Person => %{inserted_at: DateTime.utc_now()}} - Placeholder fields bypass the changeset, so they are not cast or validated.
- Do not include a placeholder field in the changeset's
validate_required/2. The value is absent during validation, so the changeset would be invalid and the row would be skipped.
- Example:
:recover_changeset_errors- If the given fields in a changeset have errors, then replace them with a custom fallback value. (Default:%{})- Example:
%{YourProject.Persons.Person => %{phone_number: "INVALID"}}
- Example:
:replace_all_except- If a row already exists, then all fields will be replaced except the primary key, and any fields specified here. (Default:[])- Example:
[:field, :other_field]
- Example:
:timeout- The maximum timeout for a transaction. (Default:15000)- Example:
60_000
- Example:
Examples
Upsert a list of Person attrs using the changeset function
YourProject.Persons.Person.upsert_changeset/2 to validate the attrs:
iex> attrs_list = [%{id: 1, name: "Alice", ...}]
iex> BulkUpsert.bulk_upsert(
...> YourProject.Repo,
...> YourProject.Persons.Person,
...> attrs_list,
...> changeset_function_atom: :upsert_changeset
...> )
:okUpsert a list of attrs, but overwrite the :name field if there is a conflict.
If using this option, you must declare each schema that will get a customized :insert_all_opts
keyword list. Any schemas that are not given custom :insert_all_opts will overwrite all fields
except the primary key:
iex> insert_all_opts = %{
...> YourProject.Persons.Person => [on_conflict: {:replace, [:name]}]
...> }
iex> BulkUpsert.bulk_upsert(
...> YourProject.Repo,
...> YourProject.Persons.Person,
...> _attrs_list = [%{id: 1, name: "Alicia"}],
...> insert_all_opts: insert_all_opts
...> )
:okAssociations
Nested associations are upserted in the same call as the parent:
has_manyandhas_one: the associated records are upserted into their own table. Each child must include its foreign key in its attrs, since it is upserted directly viainsert_all/3.many_to_many: the associated records are upserted into their own table, and the join table rows linking each parent to its associations are upserted as well. Duplicate records and links are removed automatically.embeds_oneandembeds_many: embedded data has no table of its own, so it is stored inline on the parent row as part of the parent upsert.
Known limitations
- Nested
belongs_toassociations are not upserted. To associate with abelongs_toparent, include its foreign key field in the attrs (e.g.category_id).