pgUnionAll
If you need to pull data from multiple tables and cannot use joins to do so,
then pgUnionAll
is your friend.
This step class uses the SQL UNION ALL
construct to select a number
(including zero, which is a good number for GraphQL unions 😉) of fields
from a one or more different tables that might all be part of the same union or
interface in GraphQL.
You may specify a list of shared fields, and if so then you can order by these
shared fields, or apply conditions to them, and we'll pass these orders and
conditions down to the individual table selects (as part of the UNION ALL
) to
ensure that we get the results in the most efficient manner. You can of course
still select fields that are not shared using the normal GraphQL typed fragment
spreading mechanism.
Right now these shared fields must match name and type exactly on each source in the union (we don't check this, but unexpected errors may occur at runtime if you don't adhere to it). There's definitely scope to soften these requirements - get in touch if this is something you need.
pgUnionAll
is polymorphic-capable (but it doesn't
have to be polymorphic!) and supports both limit/offset and cursor pagination.
pgUnionAll function
The pgUnionAll
function accepts one argument - the PgUnionAllStepConfig
.
This configuration object has the following entries:
resourceByTypeName
- (required) a map from GraphQL type name to the relevantPgResource
from which matching records can be fetched.members
- (optional) a list of details of the sources and relationship paths to combine in theunion all
statement; each entry inmembers
will become anotherunion all
'dselect
statement. If unspecified, we'll generate members for you based onresourceByTypeName
.- TODO: document subkeys of members
attributes
- (optional) an object defining the available common attributes (if any) as a map from the attribute name to a specification object containing thecodec
to use for the attribute; this is generally used with GraphQL interfacesmode
- (optional)normal
for normal mode (default), oraggregate
for performing aggregates (such ascount(*)
)
Every source
must have the same executor
, whether the source is defined
explicitly, or implicitly by following the given relationships.
Every final source
(the source found at the end of any relationship paths)
must have a primary key (an entry in source.uniques
with
isPrimary === true
) that can be used to fetch the resulting record that
matches the entry in the union.
Applying conditions
Conditions can be applied to the resulting step via the .where()
method,
which accepts an object containing the following keys:
attribute
- the (string) name of the attribute from thepgUnionAll
attributes
to apply the condition againstcallback
- a callback function, invoked for each union source and passed the alias for that source, that should return an SQL fragment expressing the condition.
callback
will be called for each entry in members
since each source is
responsible for adding its own conditions.
Custom ordering
The order of the union can be specified via the .orderBy()
method,
which accepts an object containing the following keys:
attribute
- the (string) name of the attribute from thepgUnionAll
attributes
to use for ordering.direction
- eitherASC
for ascending order, orDESC
for descending order. All other values have undefined results that may change in a patch release.
Every entry in members
will be ordered, and the union all
will be ordered again
to ensure a stable ordering result.
Pagination
Limit/offset pagination can be accomplished via .setFirst($n)
and
.setOffset($n)
. pgUnionAll
also implements the relevant interfaces to
support the connection
step for cursor
pagination.
Example
const $vulnerabilities = pgUnionAll({
executor: firstPartyVulnerabilitiesResource.executor,
resourceByTypeName: {
FirstPartyVulnerability: firstPartyVulnerabilitiesResource,
ThirdPartyVulnerability: thirdPartyVulnerabilitiesResource,
},
attributes: {
cvss_score: {
codec: TYPES.float,
},
},
});
$vulnerabilities.orderBy({
attribute: "cvss_score",
direction: "DESC",
});
$vulnerabilities.where({
attribute: "cvss_score",
callback: (alias) =>
sql`${alias} > ${$vulnerabilities.placeholder(constant(6), TYPES.float)}`,
});
$vulnerabilities.setFirst(2);
$vulnerabilities.setOffset(2);
pgUnionAll SQL explained
Though the UNION ALL
complicates PostgreSQL's planning and execution, we've
put effort into building the most efficient SQL queries we can for this
problem, whilst still supporting pagination, custom conditions and custom
ordering. This does result in more complex SQL queries than you may be used
to from this module. Effectively the queries look like this:
-- OUTER SELECT
select
__union__."0"::text,
__union__."1"::text
from (
-- MIDDLE SELECT
select
__first_table__."0",
__first_table__."1",
__first_table__."2",
"n"
from (
-- INNER SELECT
select
__first_table__."column1" as "0",
__first_table__."id" as "1",
'FirstTable' as "2",
row_number() over (partition by 1) as "n"
from first_table as __first_table__
where ...
order by __first_table__."column1"
limit ...
)
-- Any number of additional "middle selects" from different tables
-- via 'union all'
union all
select
...
order by
"0" desc,
"n" asc,
"2" asc
limit ...
offset ...
) __union__
We'll have as many "inner select" and "middle select" fragments as there are tables in the union.
Each "inner select" is responsible for selecting the requisite common fields
from each individual table, applying any conditions (into the where
clause),
applying the ordering (order by
clause), and applying a limit (which will be
the main limit plus the offset so that we can source enough rows for the
union all
's limit/offset to apply).
The middle select exists solely because union all
only
allows a single order by
at the end of the statement, and for some reason we
think we know better how to optimize this query than Postgres does... (Time
will tell.) So the middle select just re-selects the relevant attributes.
The union all
statement then orders by the relevant attributes again
(including the type name and the row_number()
to ensure there's a stable
order) and applies the final limit/offset.
Finally the "outer select" selects the fields we need, and casts them according to the codecs involved. Note that we couldn't have cast them earlier since they were used in ordering, and casting them to text (for example) could seriously compromise the ordering.