Postgres SQL GROUP BY refering to column in SELECT?

Do not test this on https://sqliteonline.com, their timestamp datatype is somehow messed up. (rounding down timestamp values at insertion time).

This example is using Postgres 12.2 in Dbeaver, which should be very strict with following SQL standards.

Why does GROUP BY truncated (truncated is a column alias) not error?

At runtime during GROUP BY, it should not know the aliased columns in SELECT clause yet, much less the DATE_TRUNC calculation, yet truncated which is an alias can be used in GROUP BY?
I expected the correct way to require the full calculation in SELECT to be repeated like GROUP BY date_trunc('day',created_at)

Is truncated somehow being automatically translated to date_trunc('day',created_at) at the backend?
I definitely remember learning from DQ somewhere that the full calculation needs to be specified in GROUP BY and aliases won’t work

DROP TABLE IF exists post_events; 

CREATE TABLE post_events(
	user_id INTEGER,
 	created_at timestamp,
  	event_name VARCHAR
);

insert into post_events
VALUES
(123,'2019-01-01T10:00','enter'),
(123,'2019-01-01T11:00','post'),
(456,'2019-02-02T7:00','enter'),
(456,'2019-02-02T8:00','cancel'),
(456,'2019-02-02T9:00','enter'),
(456,'2019-02-02T19:00','post')
SELECT date_trunc('day',created_at) truncated,COUNT(user_id )
FROM post_events 
GROUP BY truncated

hi @hanqi

This is almost 0 help, but you may understand it better than me.

https://www.postgresql.org/message-id/7608.1259177709%40sss.pgh.pa.us

This is only in the context of Postgres.

Hi Hanqi,

TLDR this is non-standard SQL in Postgresql they left in.

I’m also not sure if the column alias prohibitions are still in later versions of SQL standard. You might want not to use these aliases in GROUP BY clauses, if you want portable SQL, but it’s probably a beneficial quirk of the platform as retyping those column alias expressions in different clauses is a bit unpleasant. The following link lays it out as well what SQL clauses you can use column aliases on in Postgresql. https://www.enterprisedb.com/postgres-tutorials/how-use-tables-and-column-aliases-when-building-postgresql-query

The restriction against the column aliases in GROUP BY clause has been part of the SQL standard and many commercial products (Oracle, SQL Server). The reason for this is because of order of the operations of different clause and the SELECT clause is parsed later. I had never noticed that a column alias worked in GROUP BY in PG before, but I guess the thread @Rucha posted above shows it’s a implementation difference they have stuck with for a while. The implication of the post seems to be there isn’t really a strong enough motivation to to not just save the alias definition ( the translation you mention above when parsing what to do for GROUP BY even if GROUP BY will get parsed first.

If you run you query above with the EXPLAIN hint, you will see the query plan has a step Group Key: date_trunc(‘day’::text, created_at)
EXPLAIN SELECT date_trunc(‘day’,created_at) truncated,COUNT(user_id )
FROM post_events
GROUP BY truncated

1 Like