sql - ORDER BY column from right table of LEFT OUTER JOIN -


i'm having serious performance issues when using left outer join , trying use column in right table in postgres. have table of users , table online_users lists user ids online in website. both tables have indexes in user ids. need run select on users table , list first users online, followed users aren't online. select is:

select * users left join online_users on (users.id = online_users.usr_id) order online_users.online_date 

i have indexes on users.id, online_users.usr_id , online_users.online_date, reason, when run analyze on query, index online_users.online_date isn't used postgres , full scan ruins query's performance.

is there way optimize query without changing tables' structure (these tables replicated, changing structure require major refactoring of our project).

postgre version 9.3

below explain analyze:

                                                          query plan                                                              ------------------------------------------------------------------------------------------------------------------------------------  sort  (cost=2589440.94..2595456.84 rows=2406361 width=506) (actual time=18635.686..25775.334 rows=2239030 loops=1)    sort key: usuarios_online.datamessenger    sort method: external merge  disk: 512424kb    ->  hash left join  (cost=219.73..130113.66 rows=2406361 width=506) (actual time=0.723..12388.266 rows=2239030 loops=1)          hash cond: (usuarios.id = usuarios_online.id_usr)          ->  seq scan on usuarios  (cost=0.00..108832.61 rows=2406361 width=494) (actual time=0.009..7328.191 rows=2238984 loops=1)          ->  hash  (cost=212.66..212.66 rows=566 width=12) (actual time=0.704..0.704 rows=572 loops=1)                buckets: 1024  batches: 1  memory usage: 27kb                ->  seq scan on usuarios_online  (cost=0.00..212.66 rows=566 width=12) (actual time=0.079..0.555 rows=572 loops=1)  total runtime: 28519.611 ms (10 rows) 

since order rows online_users, makes sense use union query instead:

( select usr_id, online_date  -- more columns?   online_users order  online_date ) union select u.id, null  -- more matching columns?   users u left   join online_users o on u.id = o.usr_id  o.usr_id null; 

should faster in case.

online_users can utilize index on online_date now.
2 simpler query plans can use indexes more easily.
other users don't have sorted @ all. second select needs exclude online_users:

parentheses around first select required allow order by put it.

this might further optimized, depending on undeclared specifics of case.


Comments

Popular posts from this blog

python - pip install -U PySide error -

arrays - C++ error: a brace-enclosed initializer is not allowed here before ‘{’ token -

cytoscape.js - How to add nodes to Dagre layout with Cytoscape -