Help Running Query Directly From SEDE

Screen Link:

Hello,

For this project, I was experimenting with pulling the data from 2020.

This is my SQL code:

SELECT p.Id, pl.CreationDate, p.Score, p.ViewCount, t.TagName, p.AnswerCount, p.FavoriteCount
  FROM Posts AS p
 WHERE p.CreationDate >= '2020-01-01 00:00:00' 
   AND p.CreationDate <= '2020-12-31 23:59:59'
   AND p.PostTypeId = 1
  LEFT JOIN PostTypes AS pt ON pt.Id = p.PostTypeId
  LEFT JOIN PostLinks AS pl ON pl.PostId = p.Id
  LEFT JOIN PostTags AS ptg ON ptg.PostId = pl.PostId
  LEFT JOIN Tags AS t ON t.Id = ptg.TagId;

Since the CreationDate is in datetime format, I searched online for how to filter by date on T-SQL and I found multiple stack overflow answers that suggested the above. I think I am doing it right? I tried with the times and without, both yielded errors.

I also want to only include questions (PostTypeId = 1)

For the joining I am envisioning starting at Posts and left joining all the required tables based on the schema. Is this the right idea?

SEDE is giving me this error: Line 6: Incorrect syntax near the keyword 'LEFT'..

If I delete the WHERE clauses, the query runs fine so I am assuming I am making a mistake somewhere in the WHERE clause.

Where I’m I going wrong?

Thank you for your time!

1 Like

Actually I think I got it with this code:

WITH a AS (
           SELECT Id, CreationDate, Score, ViewCount, AnswerCount, FavoriteCount
             FROM Posts
            WHERE CreationDate >= '2020-01-01 00:00:00' 
              AND CreationDate <= '2020-12-31 23:59:59'
              AND PostTypeId = 1
          )

SELECT a.Id, a.CreationDate, a.Score, a.ViewCount, t.TagName, a.AnswerCount, a.FavoriteCount
  FROM a 
  LEFT JOIN PostLinks AS pl ON pl.PostId = a.Id
  LEFT JOIN PostTags AS ptg ON ptg.PostId = pl.PostId
  LEFT JOIN Tags AS t ON t.Id = ptg.TagId;

But I am only getting 1,914 rows out of 9,021 that actually have tag in the TagName column. Is this right?

Don’t start with writing the code immediately. Give a minute or two and think about the problem. Then think what is one step before this query. Starting point (given all these tables) -> step 1 -> … -> step n taken to reach this query. And, slow work backwards.

Learn the differences between each type of joins.

  • Left Join -> may contains null on Right Table columns. And contains all rows of Left Table.
  • Right Join -> may contains null on Left Table columns. And contains all rows of Right Table.
  • Join -> only matching rows from left and right table.

Knowledge on different types of joins and together with working backwards will help you to build your intuition on what data to merge in order to solve the problem.

This help me to solve complex SQL on LeetCode.

I don’t have access to the premium DQ mission. Only can help as much as possible:

For your multiple left joins, can be describe as
((a LEFT JOIN x) LEFT JOIN y) LEFT JOIN z)

x = PostLinks pl ON pl.PostId = a.Id
y = PostTag ptg ON ptg.PostId = pl.PostId
z = Tags tON t.Id = ptg.TagId

Each Left Join takes

  • Everything on the Left Table
  • And, any matching rows on the Right Table.
  • And, non matching rows on the Left Table, set NULL on right columns.

You have 3 Left Joins. Each Joins reduce the selection of data.

1 Like