sql - ORDER BY column from right table of LEFT OUTER JOIN -
i'm having serious performance issues when using left outer join
, trying use column in right table in postgres. have table of users , table online_users lists user ids online in website. both tables have indexes in user ids. need run select on users table , list first users online, followed users aren't online. select is:
select * users left join online_users on (users.id = online_users.usr_id) order online_users.online_date
i have indexes on users.id
, online_users.usr_id
, online_users.online_date
, reason, when run analyze
on query, index online_users.online_date
isn't used postgres , full scan ruins query's performance.
is there way optimize query without changing tables' structure (these tables replicated, changing structure require major refactoring of our project).
postgre version 9.3
below explain analyze:
query plan ------------------------------------------------------------------------------------------------------------------------------------ sort (cost=2589440.94..2595456.84 rows=2406361 width=506) (actual time=18635.686..25775.334 rows=2239030 loops=1) sort key: usuarios_online.datamessenger sort method: external merge disk: 512424kb -> hash left join (cost=219.73..130113.66 rows=2406361 width=506) (actual time=0.723..12388.266 rows=2239030 loops=1) hash cond: (usuarios.id = usuarios_online.id_usr) -> seq scan on usuarios (cost=0.00..108832.61 rows=2406361 width=494) (actual time=0.009..7328.191 rows=2238984 loops=1) -> hash (cost=212.66..212.66 rows=566 width=12) (actual time=0.704..0.704 rows=572 loops=1) buckets: 1024 batches: 1 memory usage: 27kb -> seq scan on usuarios_online (cost=0.00..212.66 rows=566 width=12) (actual time=0.079..0.555 rows=572 loops=1) total runtime: 28519.611 ms (10 rows)
since order rows online_users
, makes sense use union
query instead:
( select usr_id, online_date -- more columns? online_users order online_date ) union select u.id, null -- more matching columns? users u left join online_users o on u.id = o.usr_id o.usr_id null;
should faster in case.
online_users
can utilize index on online_date
now.
2 simpler query plans can use indexes more easily.
other users don't have sorted @ all. second select
needs exclude online_users
:
parentheses around first select
required allow order by
put it.
this might further optimized, depending on undeclared specifics of case.
Comments
Post a Comment