14.3. Controlling the Planner with Explicit JOIN Clauses

It is possible to control the query planner to some extent by using the explicit JOIN syntax. To see why this matters, we first need some background.

In a simple join query, such as:

    SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;

the planner is free to join the given tables in any order. For example, it could generate a query plan that joins A to B, using the WHERE condition a.id = b.id, and then joins C to this joined table, using the other WHERE condition. Or it could join B to C and then join A to that result. Or it could join A to C and then join them with B, but that would be inefficient, since the full Cartesian product of A and C would have to be formed, there being no applicable condition in the WHERE clause to allow optimization of the join. (All joins in the PostgreSQL executor happen between two input tables, so it's necessary to build up the result in one or another of these fashions.) The important point is that these different join possibilities give semantically equivalent results but might have hugely different execution costs. Therefore, the planner will explore all of them to try to find the most efficient query plan.

When a query only involves two or three tables, there aren't many join orders to worry about. But the number of possible join orders grows exponentially as the number of tables expands. Beyond ten or so input tables it's no longer practical to do an exhaustive search of all the possibilities, and even for six or seven tables planning might take an annoyingly long time. When there are too many input tables, the PostgreSQL planner will switch from exhaustive search to a genetic probabilistic search through a limited number of possibilities. (The switch-over threshold is set by the geqo_threshold run-time parameter.) The genetic search takes less time, but it won't necessarily find the best possible plan.

When the query involves outer joins, the planner has less freedom than it does for plain (inner) joins. For example, consider:

    SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

Although this query's restrictions are superficially similar to the previous example, the semantics are different because a row must be emitted for each row of A that has no matching row in the join of B and C. Therefore the planner has no choice of join order here: it must join B to C and then join A to that result. Accordingly, this query takes less time to plan than the previous query. In other cases, the planner might be able to determine that more than one join order is safe. For example, given:

    SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);

it is valid to join A to either B or C first. Currently, only FULL JOIN completely constrains the join order. Most practical cases involving LEFT JOIN or RIGHT JOIN can be rearranged to some extent.

Explicit inner join syntax (INNER JOIN, CROSS JOIN, or unadorned JOIN) is semantically the same as listing the input relations in FROM, so it does not constrain the join order.

Even though most kinds of JOIN don't completely constrain the join order, it is possible to instruct the PostgreSQL query planner to treat all JOIN clauses as constraining the join order anyway. For example, these three queries are logically equivalent:

    SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
    SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
    SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

But if we tell the planner to honor the JOIN order, the second and third take less time to plan than the first. This effect is not worth worrying about for only three tables, but it can be a lifesaver with many tables.

To force the planner to follow the join order laid out by explicit JOINs, set the join_collapse_limit run-time parameter to 1. (Other possible values are discussed below.)
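
The join_collapse_limit setting described in this section can be tried out inside a single transaction. What follows is a minimal sketch, not a recommended configuration. It assumes throwaway temporary tables named a, b, and c whose columns simply mirror the names used in the example queries; they are hypothetical and do not come from any real schema. SET LOCAL is used so that the setting reverts when the transaction ends.

```sql
-- Illustrative sketch only: a, b, c are hypothetical temp tables that
-- mirror the column names used in the example queries.
BEGIN;

CREATE TEMP TABLE a (id int);
CREATE TEMP TABLE b (id int, ref int);
CREATE TEMP TABLE c (id int);

-- Inspect the current values of the two parameters mentioned in the text.
SHOW join_collapse_limit;
SHOW geqo_threshold;

-- Plan the query under the current setting; the planner may reorder the joins.
EXPLAIN SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

-- Ask the planner to follow the explicit JOIN order, for this transaction only.
SET LOCAL join_collapse_limit = 1;
EXPLAIN SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

-- Discard the temp tables and the SET LOCAL change.
ROLLBACK;
```

On tables this small the two plans may well look identical. The point of the setting is to reduce the planner's search effort on queries with many joins, so any difference is more likely to show up in planning time on larger queries than in the plan shape here.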