Code Error + 2 doubts on CAST(Float ) and column display in main query

Screen Link:
Building And Organizing Complex Queries | Dataquest

My Code:

WITH subq AS
        playlist.playlist_id, AS playlist_name,
        COUNT(track.track_id) AS number_of_tracks,
        CAST(SUM(track.milliseconds/1000) as FLOAT) AS length_seconds
        FROM playlist
        LEFT JOIN playlist_track ON playlist_track.playlist_id = playlist.playlist_id
        LEFT JOIN track ON track.track_id =playlist_track.track_id
GROUP BY playlist_id, playlist_name
ORDER BY playlist_id;

What I expected to happen:
The code to execute successfully

What actually happened:

I got just 1 row as output instead of 18 rows

I compared my code with DQ’s solution and made changes as long i could understand the logic. Apart from the query that my code still did not work, I have 2 other doubts:

  1. Mission asks us to calculate length of playlist as float. I do not see CAST function being used. Is it implicit in the format of 1000.0 used by DQ ? Also is there anything wrong with the way i have used CAST?

  2. For the main query, using the example given by DQ, I used "SELECT * " and aggregate functions within sub query to display the columns . However, in the DQ solution the main query mentions the name of the columns /aliases to be displayed. Is that wrong ?

DQ solution is below:

WITH playlist_info AS
p.playlist_id, playlist_name, track_name,
(t.milliseconds / 1000.0) length_seconds
FROM playlist p
LEFT JOIN playlist_track pt ON pt.playlist_id = p.playlist_id
LEFT JOIN track t ON t.track_id = pt.track_id

COUNT(track_name) number_of_tracks,
SUM(length_seconds) length_seconds
FROM playlist_info

Someone had asked the same query earlier, but guess it wasn’t answered:
I ran the GROUP BY clause within the subquery and got near identical results yet slightly off and do not understand why - Q&A / DQ Courses - Dataquest Community

Hi @raturi22

To answer your questions:

  1. SQL will recognise 1000.0 as a FLOAT value, and we know from this screen: that if you multiply or divide with a FLOAT value, you will always return a FLOAT value. While your code CAST(SUM(track.milliseconds/1000) as FLOAT) isn’t incorrect in terms of syntax, there’s an important distinction here in the methodology that will affect the returned value. Recall that the milliseconds column from the track table contains a list of track lengths in milliseconds, stored as INTERGER values. Note the following examples, where I’ve substituted t.milliseconds with a dummy track length
  • DQ method: 8359 / 1000.0 = 8.359 here, an INTEGER value is divided by a FLOAT value, and therefore a FLOAT is returned
  • Your method: 8359 / 1000 = 8 here, dividing an INTERGER by and INTERGER returns an INTERGER. You then go on to effectively CAST(8 AS FLOAT) = 8.0, a very different value from above!
  1. Here you have attempted to perform aggregate functions within your subquery, while placing the GROUP BY operation outside as part of the main query. Essentially, you haven’t specified what the aggregations in your subquery should aggregate by. Take a look at this page on how GROUP BY works. What you need to do is either:
  • move your GROUP BY into your subquery OR
  • as in the DQ method, move your aggregations into the main query

Hope that makes sense

1 Like

thanks! - keeping group by and aggregate function together is new knowledge!