Ecto.Query.preload

preload(query, bindings \\ [], expr)

(macro)

Preloads the associations into the result set.

Imagine you have a schema Post with a has_many :comments association and you execute the following query:

Repo.all from p in Post, preload: [:comments]

The example above will fetch all posts from the database and then run a separate query returning all comments associated with the given posts. The comments are then processed and associated with each returned post under the comments field.
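The association step can be pictured in plain Elixir. This is only a rough sketch of the idea, not Ecto's actual implementation; the maps below are hypothetical stand-ins for the Post and Comment structs returned by the two queries.

```elixir
# Hypothetical in-memory rows standing in for the results of the two queries.
posts = [%{id: 1, title: "first"}, %{id: 2, title: "second"}]

comments = [
  %{id: 10, post_id: 1, body: "a"},
  %{id: 11, post_id: 1, body: "b"},
  %{id: 12, post_id: 2, body: "c"}
]

# Group the comments by their foreign key, then attach each group to its post.
comments_by_post = Enum.group_by(comments, & &1.post_id)

posts_with_comments =
  Enum.map(posts, fn post ->
    Map.put(post, :comments, Map.get(comments_by_post, post.id, []))
  end)
```

A post with no comments ends up with an empty list under comments rather than being dropped.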

Often, you may want posts and comments to be selected and filtered in the same query. For such cases, you can explicitly tell an existing join to be preloaded into the result set:

Repo.all from p in Post,
           join: c in assoc(p, :comments),
           where: c.published_at > p.updated_at,
           preload: [comments: c]

In the example above, instead of issuing a separate query to fetch comments, Ecto will fetch posts and comments in a single query and then do a separate pass associating each comment with its parent post. Therefore, instead of returning one row per matching post and comment pair, as a plain join would, it returns only posts with the comments field properly filled in.
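Keep in mind that a plain join is an inner join, so posts without any matching comments are left out of the result. A minimal sketch, assuming the same Post schema and Repo used on this page, that keeps such posts by using a left join:

```elixir
import Ecto.Query

# Post and Repo are the application modules assumed throughout this page.
# With a left join, posts that have no comments are still returned,
# and their comments field is loaded as an empty list.
Repo.all(
  from p in Post,
    left_join: c in assoc(p, :comments),
    preload: [comments: c]
)
```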

Nested associations can also be preloaded in both formats:

Repo.all from p in Post,
           preload: [comments: :likes]

Repo.all from p in Post,
           join: c in assoc(p, :comments),
           join: l in assoc(c, :likes),
           where: l.inserted_at > c.updated_at,
           preload: [comments: {c, likes: l}]

Applying a limit to the association can be achieved with inner_lateral_join:

Repo.all from p in Post, as: :post,
           join: c in assoc(p, :comments),
           inner_lateral_join: top_five in subquery(
             from Comment,
             where: [post_id: parent_as(:post).id],
             order_by: :popularity,
             limit: 5,
             select: [:id]
           ), on: top_five.id == c.id,
           preload: [comments: c]

Preload queries

Preload also allows queries to be given, allowing you to filter or customize how the preloads are fetched:

comments_query = from c in Comment, order_by: c.published_at
Repo.all from p in Post, preload: [comments: ^comments_query]

The example above will issue two queries, one for loading posts and then another for loading the comments associated with the posts. Comments will be ordered by published_at.
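The preload query can filter as well as order. A small sketch, assuming Comment has a nullable published_at field (the filter itself is only an illustration):

```elixir
import Ecto.Query

# Only comments that have a published_at value, newest first.
comments_query =
  from c in Comment,
    where: not is_nil(c.published_at),
    order_by: [desc: c.published_at]

# Still two queries: one for posts, one for the filtered comments.
Repo.all(from p in Post, preload: [comments: ^comments_query])
```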

When specifying a preload query, you can still preload the associations of those records. For instance, you could preload an author's published posts and the comments on those posts:

posts_query = from p in Post, where: p.state == :published
Repo.all from a in Author, preload: [posts: ^{posts_query, [:comments]}]

Note: keep in mind operations like limit and offset in the preload query will affect the whole result set and not each association. For example, the query below:

comments_query = from c in Comment, order_by: c.popularity, limit: 5
Repo.all from p in Post, preload: [comments: ^comments_query]

won't bring the top five comments per post. Rather, it will bring only the top five comments across all posts. Instead, use a window:

ranking_query =
  from c in Comment,
  select: %{id: c.id, row_number: over(row_number(), :posts_partition)},
  windows: [posts_partition: [partition_by: :post_id, order_by: :popularity]]

comments_query =
  from c in Comment,
  join: r in subquery(ranking_query),
  on: c.id == r.id and r.row_number <= 5

Repo.all from p in Post, preload: [comments: ^comments_query]
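The difference between a limit on the whole preload query and a limit per partition can be illustrated in plain Elixir. The data below is hypothetical, and the sketch assumes that a higher popularity value means a more popular comment:

```elixir
# Hypothetical in-memory comments; higher popularity means more popular here.
comments = [
  %{id: 1, post_id: 1, popularity: 9},
  %{id: 2, post_id: 1, popularity: 8},
  %{id: 3, post_id: 1, popularity: 7},
  %{id: 4, post_id: 2, popularity: 3},
  %{id: 5, post_id: 2, popularity: 2}
]

limit = 2

# A limit on the preload query: applied once, across all posts.
global_top =
  comments
  |> Enum.sort_by(& &1.popularity, :desc)
  |> Enum.take(limit)

# A window partitioned by post_id: the limit is applied inside each partition.
per_post_top =
  comments
  |> Enum.group_by(& &1.post_id)
  |> Map.new(fn {post_id, group} ->
    {post_id, group |> Enum.sort_by(& &1.popularity, :desc) |> Enum.take(limit)}
  end)
```

Here global_top contains only comments of post 1, so post 2 would be preloaded with no comments at all, while per_post_top keeps up to two comments for each post.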

Preload functions

Preload also allows functions to be given. In such cases, the function receives the IDs of the parent association and it must return the associated data. Ecto will then map this data and sort it by the relationship key:

comment_preloader = fn post_ids -> fetch_comments_by_post_ids(post_ids) end
Repo.all from p in Post, preload: [comments: ^comment_preloader]

This is useful when the whole dataset was already loaded or must be explicitly fetched from elsewhere. The IDs received by the preloading function and the result it returns depend on the association type:

  • For has_many and belongs_to - the function receives the IDs of the parent association and it must return a list of maps or structs with the associated entries. The associated map/struct must contain the "foreign_key" field. For example, if a post has many comments, when preloading the comments with a custom function, the function will receive a list of "post_ids" as the argument and it must return maps or structs representing the comments. The maps/structs must include the :post_id field.

  • For has_many :through - it behaves similarly to a regular has_many, but note that the IDs received are those of the last association in the chain. Imagine, for example, that a post has many comments and each comment has an author. A post may therefore have many comments_authors, written as has_many :comments_authors, through: [:comments, :author]. When preloading authors with a custom function via :comments_authors, the function will receive the IDs of the authors, because :author is the last step.

  • For many_to_many - the function receives the IDs of the parent association and it must return a list of tuples, each with the parent id as the first element and the association map or struct as the second. For example, if a post has many tags, when preloading the tags with a custom function, the function will receive a list of "post_ids" as the argument and it must return tuples in the format of {post_id, tag}.
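The return shapes described above can be sketched with data that is already in memory. Everything below is hypothetical: the all_comments and post_tags lists stand in for a dataset loaded elsewhere, and a many_to_many :tags association on Post is assumed.

```elixir
# Hypothetical data that was already loaded from elsewhere.
all_comments = [
  %{id: 10, post_id: 1},
  %{id: 11, post_id: 2},
  %{id: 12, post_id: 3}
]

post_tags = [
  {1, %{id: 100, name: "elixir"}},
  {2, %{id: 101, name: "ecto"}},
  {3, %{id: 100, name: "elixir"}}
]

# has_many: return the associated entries; each one includes :post_id.
comment_preloader = fn post_ids ->
  Enum.filter(all_comments, &(&1.post_id in post_ids))
end

# many_to_many: return {post_id, tag} tuples.
tag_preloader = fn post_ids ->
  Enum.filter(post_tags, fn {post_id, _tag} -> post_id in post_ids end)
end
```

Under these assumptions, the functions could be passed as preload: [comments: ^comment_preloader, tags: ^tag_preloader].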

Keywords example

# Returns all posts, their associated comments, and the associated
# likes for those comments.
from(p in Post,
  preload: [comments: :likes],
  select: p
)

Expressions examples

Post |> preload(:comments) |> select([p], p)

Post
|> join(:left, [p], c in assoc(p, :comments))
|> preload([p, c], [:user, comments: c])
|> select([p], p)
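The expression form also accepts the interpolated preload queries described earlier. A brief sketch, assuming the same Post, Comment and Repo modules:

```elixir
import Ecto.Query

comments_query = from c in Comment, order_by: c.published_at

# Pipe-based equivalent of `preload: [comments: ^comments_query]`.
Post
|> preload(comments: ^comments_query)
|> Repo.all()
```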