What do you want to change?
sqlc generates methods for :many queries that read all rows into a slice. For example, the parameterised ListAuthorsByIDs query executes:
-- name: ListAuthorsByIDs :many
SELECT id, bio, birth_year FROM authors
WHERE id = ANY($1::int[]);
and the generated Go method allocates a []Author, iterates over rows.Next(), scans each row into a struct, and appends it to the slice before returning. This is convenient but can be inefficient for large result sets because it requires holding all rows in memory.
Standard :many generated code
const listAuthors = `-- name: ListAuthors :many
SELECT id, name, bio FROM authors
ORDER BY name
`
func (q *Queries) ListAuthors(ctx context.Context) ([]Author, error) {
	rows, err := q.db.QueryContext(ctx, listAuthors)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	var items []Author
	for rows.Next() {
		var i Author
		if err := rows.Scan(&i.ID, &i.Name, &i.Bio); err != nil {
			return nil, err
		}
		items = append(items, i)
	}
	if err := rows.Close(); err != nil {
		return nil, err
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}
	return items, nil
}
There have been discussions about adding streaming APIs (e.g. via a :stream or :callback annotation) [1] and a long‑running PR to support iterators [2]. A maintainer noted that the forthcoming Go iter package may influence how such a feature should be implemented, and no built‑in solution has been decided on yet.
Proposal
To avoid committing to a particular iterator/channel API, I propose generating two low‑level helper functions per :many query when a new emit_stream_helpers option is enabled. These helpers decouple executing the query from scanning rows, so callers can implement their own streaming or pooling logic while still relying on sqlc’s type‑safe scanning.
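As a rough sketch of how the option could be wired up, it would sit alongside the existing Go codegen flags in sqlc.yaml (emit_stream_helpers is the proposed flag and does not exist today):

version: "2"
sql:
  - engine: "postgresql"
    queries: "query.sql"
    schema: "schema.sql"
    gen:
      go:
        package: "db"
        out: "db"
        emit_prepared_queries: true
        emit_stream_helpers: true # proposed option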
Example using ListAuthorsByIDs
Today sqlc generates a ListAuthorsByIDs method that returns ([]Author, error) and reads all rows into a slice. With emit_stream_helpers: true, sqlc could additionally generate helpers like these:
const listAuthorsByIDs = `-- name: ListAuthorsByIDs :many
SELECT id, bio, birth_year FROM authors
WHERE id = ANY($1::int[])`
// ListAuthorsByIDsPrepare executes the query and returns *sql.Rows.
// It accepts the same parameters as the current ListAuthorsByIDs.
func (q *Queries) ListAuthorsByIDsPrepare(ctx context.Context, ids []int64) (*sql.Rows, error) {
	return q.db.QueryContext(ctx, listAuthorsByIDs, pq.Array(ids))
}

// ListAuthorsByIDsBindRow scans the current row into dest.
// The caller is responsible for allocating dest and can reuse it.
func (q *Queries) ListAuthorsByIDsBindRow(rows *sql.Rows, dest *Author) error {
	return rows.Scan(&dest.ID, &dest.Bio, &dest.BirthYear)
}
A caller can then stream results without materialising a slice:
ctx := context.Background()
rows, err := q.ListAuthorsByIDsPrepare(ctx, []int64{1, 2, 3})
if err != nil {
	return err
}
defer rows.Close()

buf := &Author{} // could come from a sync.Pool
for rows.Next() {
	if err := q.ListAuthorsByIDsBindRow(rows, buf); err != nil {
		return err
	}
	// process buf (e.g. write to a gRPC stream)
}
if err := rows.Err(); err != nil {
	return err
}
Benefits
- Constant memory usage: The caller controls when each row is read; no slice is allocated.
- Memory pooling: Because BindRow accepts a pointer, callers can reuse a struct from a sync.Pool to minimise allocations.
- Flexibility: Low‑level helpers let users wrap them in a channel, iterator, or callback abstraction, and leave room for future integration with Go’s iter package (see the sketch after this list).
- Consistency with existing patterns: sqlc already supports prepared queries (emit_prepared_queries); this proposal applies a similar pattern at the per‑query level.
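To make the flexibility point concrete, here is a rough caller‑side sketch (not proposed generated code) that wraps the two helpers in Go 1.23’s iter.Seq2; the StreamAuthorsByIDs name and the error‑as‑second‑value convention are just one possible choice:

// StreamAuthorsByIDs is a hypothetical wrapper, written by the caller, that
// adapts the proposed helpers to an iter.Seq2 (requires Go 1.23+ and "iter").
func StreamAuthorsByIDs(ctx context.Context, q *Queries, ids []int64) iter.Seq2[Author, error] {
	return func(yield func(Author, error) bool) {
		rows, err := q.ListAuthorsByIDsPrepare(ctx, ids)
		if err != nil {
			yield(Author{}, err)
			return
		}
		defer rows.Close()
		for rows.Next() {
			var a Author
			if err := q.ListAuthorsByIDsBindRow(rows, &a); err != nil {
				yield(Author{}, err)
				return
			}
			if !yield(a, nil) { // the caller broke out of its range loop
				return
			}
		}
		if err := rows.Err(); err != nil {
			yield(Author{}, err)
		}
	}
}

A caller could then range over the results directly:

for author, err := range StreamAuthorsByIDs(ctx, q, []int64{1, 2, 3}) {
	if err != nil {
		return err
	}
	// process author without materialising the full result set
}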
Open questions / feedback
- Should helpers be generated by default or behind a new emit_stream_helpers flag?
- What naming best communicates their purpose (e.g. Prepare/BindRow, Rows/ScanRow)?
- Should Prepare return *sql.Rows (executing the query) or *sql.Stmt (requiring the caller to execute)? See the sketch after this list.
- Would returning a boolean from BindRow (e.g. (bool, error)) be useful for early termination?
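For concreteness, here are hypothetical shapes for the last two questions; the names and signatures are illustrative only, not a committed design:

// *sql.Stmt variant: Prepare only prepares the statement and leaves execution
// to the caller, mirroring emit_prepared_queries (this assumes the generated
// DBTX interface exposes PrepareContext, as it does for database/sql today).
func (q *Queries) ListAuthorsByIDsStmt(ctx context.Context) (*sql.Stmt, error) {
	return q.db.PrepareContext(ctx, listAuthorsByIDs)
}

// (bool, error) variant, one possible reading of the last question: BindRow
// advances and scans in a single call and returns false once no rows remain,
// so the caller's loop terminates as soon as the result set is exhausted.
func (q *Queries) ListAuthorsByIDsNextRow(rows *sql.Rows, dest *Author) (bool, error) {
	if !rows.Next() {
		return false, rows.Err()
	}
	return true, rows.Scan(&dest.ID, &dest.Bio, &dest.BirthYear)
}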
I plan to contribute an implementation for this feature. Before beginning, I’d appreciate feedback from maintainers and the community on whether this approach fits the project’s direction and how the API should be designed.
What database engines need to be changed?
PostgreSQL
What programming language backends need to be changed?
Go