Different Method Similar Results - Feedback

Hi guys. This was my approach and my results and analysis. Would like feedback. I could not do it the way you were showing it so I just gave up. I have no experience in SQL so basically I couldn’t bridge the gap from the instruction the platform has, google, and my understanding. I’m content with these results. I attached my analysis and explanation also. Thanks in advance.


Album Bought - 19.84%
Album Not Bought - 66.67%
Manually Added Tracks - 13.5%

After running the query for analysis it shows that if doing away with the ability to purchase a complete album the loss in sales would be significant.

The analysis was conducted by looking at the invoices, counting how many tracks were in the invoice, comparing that to the number of tracks in the albums we have listed.

This gives rise to three scenarios:

  1. The amount of tracks is greater than the number in the album which means the person added individual tracks manually. This was one of the options customers had when purchasing as well.
  2. The amount of tracks purchased in the invoice was below the amount of tracks in the album, which naturally means the complete album was not purchased.
  3. The amount of tracks in the invoice are the same which would mean the complete album was purchased.
WITH album_main as (
SELECT  a.album_id album_id,  count( distinct t.track_id) track_id_num
FROM album a
LEFT JOIN track t on a.album_id = t.album_id
LEFT JOIN invoice_line il on t.track_id = il.track_id
LEFT JOIN invoice i on il.invoice_id = i.invoice_id
GROUP BY a.album_id


invoiced_as AS (
SELECT i.invoice_id invoice_id, a.album_id album_id, count(distinct t.track_id) tracks_per_invoice
FROM album a
LEFT JOIN track t on a.album_id = t.album_id
LEFT JOIN invoice_line il on t.track_id = il.track_id
LEFT JOIN invoice i on il.invoice_id = i.invoice_id
GROUP BY i.invoice_id

order by i.invoice_id desc

bought_it AS (

select invoiced_as.invoice_id, invoiced_as.album_id from_album, album_main.album_id album_id, album_main.track_id_num tracks_on_album, invoiced_as.tracks_per_invoice,
       case when (tracks_per_invoice/album_main.track_id_num) - 1 > 0 then 'Manually Added Tracks'
            when tracks_per_invoice/album_main.track_id_num < 1.0 then 'Album Not Bought'
            when tracks_per_invoice/album_main.track_id_num = 1.0 then 'Album Bought'
            END as bought_or_not
from invoiced_as
left join album_main on album_main.album_id=invoiced_as.album_id)

select bought_or_not, round(cast(count(bought_or_not) as float)/(select count(*) from bought_it)*100,2) bought_percent
from bought_it
group by bought_or_not

@hanqi - Any shot you could give some feedback when you have time please?

Sorry I have no idea what’s going on here, I haven’t looked at that mission in years.

Is there something wrong (functional/non-functional) with your sql code?
Are you looking for alternative ways to answer the same DQ question?

Maybe you could summarize

  1. The goal as described by dataquest
  2. Your interpretation of the goal (if 1. is not obvious)
  3. If you are looking for alternative solutions, what are your hypothesis of what you can tweak from your currently working code?
  4. Which part of your current answer are you not satisfied?

It’s very mentally taxing for me to go through that sql without comments and guess why each table has to be joined, why there are 3 CTEs, whether all 3 CTE are in a linear or parallel dependency chain, or what are the intermediate output samples from each CTE, why each selected column is necessary (eg. can’t see why invoiced_as must select invoice_id)