Confused about the best solution

Hi, I am working from this portion of the mission:

Screen Link:

The way I’m addressing this is by creating two views (the first is intended to isolate invoices from the USA, the second joins the genres table to the tracks table). Finally, the third block combines these, but grouping by genre, and then counting the number of tracks sold per genre. As I look around the discussions here, I am finding that others are finding different output for their versions of this work. I am getting much lower numbers. So, these queries “work” together, but they do not seem to produce the correct results (it’s also nearly impossible for me to figure out, at this point, how I’d go a step further and calculate percentages, but I’d at least try and get this right). Anyone see what I might be doing wrong?

My Code:

First View:
SELECT i.BillingCountry, i.invoiceid, it.trackid
    FROM invoices AS i
    INNER JOIN invoice_items AS it ON it.invoiceid = i.invoiceid
    WHERE BillingCountry = 'USA';

Second View:

CREATE VIEW Tracks_and_genres AS
SELECT t.trackid, t.genreid,
    FROM tracks AS t
    INNER JOIN genres AS g ON g.genreid = t.genreid

Combining these views:

SELECT, COUNT(USA.trackid) AS Tracks_Sold_Per_Genre
FROM Tracks_and_genres AS tg
INNER JOIN USA_Only AS USA ON USA.trackid = tg.trackID

Hey @Willyjgolden

We are going to analyze this based on our existing data and decide which are the best genres to be added: Hip-Hop, Punk, Pop or Blues.

You can follow the below code which is a very simple version and easy to understand


WITH usa_tracks_sold AS (
SELECT il.track_id
FROM invoice AS i
INNER JOIN invoice_line AS il
ON il.invoice_id = i.invoice_id
WHERE i.billing_country = ‘USA’

 genre_popularity AS (SELECT AS genre, 
                           COUNT(t.track_id) AS tracks_sold, 
                           ROUND(COUNT(t.track_id) / CAST((SELECT COUNT(*) 
                                                             FROM usa_tracks_sold) AS FLOAT), 3) AS percentage_sold
                      FROM track AS t
                           INNER JOIN genre AS g
                              ON g.genre_id = t.genre_id

                           INNER JOIN usa_tracks_sold
                              ON usa_tracks_sold.track_id = t.track_id 
                     GROUP BY g.genre_id
                     ORDER BY tracks_sold DESC)

SELECT genre AS Genre,
tracks_sold AS ‘Tracks Sold’,
percentage_sold AS ‘Percentage Sold’
FROM genre_popularity

Once you are done with this , try to convert the query to dataframe and plot it . You can easily identify the percentage of each genres.

Hope it helps.