Course : Data Engineering
Mission: Using an Index
Slide: Index Performance on Join
In this slide, the output of the exercise is showing that the total cost before and after the index creation in column state is same, but am expecting the cost is lesser when we create an index. Any specific reasons for this kind of output?
import pprint as pp
conn = psycopg2.connect(dbname="dq", user="hud_admin", password="abc123")
cur = conn.cursor()
cur.execute("CREATE INDEX state_idx ON homeless_by_coc(state)")
conn.commit()
cur.execute("EXPLAIN (format json, ANALYZE)SELECT hbc.state, hbc.coc_number, hbc.coc_name, si.name FROM homeless_by_coc as hbc, state_info as si WHERE hbc.state = si.postal")
pp.pprint(cur.fetchall())
cur.execute("DROP INDEX if EXISTS state_idx")
conn.commit()
cur.execute("EXPLAIN (format json, ANALYZE)SELECT hbc.state, hbc.coc_number, hbc.coc_name, si.name FROM homeless_by_coc as hbc, state_info as si WHERE hbc.state = si.postal")
pp.pprint(cur.fetchall())
Both the queries have the total cost as 9956.15
Hi @sreeramkolluru
The short answer is: both queries have the same query plan and therefore the same estimated cost.
Long answer:
If you look at the result of EXPLAIN
with and without the index you will see that they have the same query plan.
HashJoin
Seq Scan
Hash
The cost is estimated depending on the query plan so it is natural to get the same cost.
This means that Postgres decided not to use the state_idx
. In general, Postgres will decide to ignore an index and revert back to a sequential scan if the query results in too many rows because in these situations it is actually slower to use an index.
If you actually execute the query (removing the EXPLAIN
) and print the number of rows in the result you will see that it has 85449
rows. In total the homeless_by_coc
table has 86529
rows.
As you can see, the query outputs nearly every row so Postgres decided not to use the index since to go over all rows it is faster to do a simple sequential scan.
I hope this answers your question. Do not hesitate to ask for more clarifications if you are still having trouble understanding it.
2 Likes
Thanks for such a clear explanation.