Even after you fix your solution, it’s going to end up being different because the proposed solution is using a different definition of what an album purchase is.

For example the following invoice ids are purchases that include full albums with only one track that the given solution ignores: `{4, 517, 7, 11, 15, 527, 20, 27, 548, 37, 38, 39, 41, 565, 59, 577, 579, 67, 586, 78, 597, 87, 91, 94, 97, 101, 103, 106, 111, 121, 127, 134, 137, 164, 168, 176, 180, 189, 190, 195, 198, 210, 216, 218, 227, 242, 250, 256, 260, 265, 270, 272, 275, 285, 288, 306, 311, 313, 316, 324, 325, 338, 345, 360, 365, 368, 369, 384, 395, 402, 405, 415, 421, 423, 424, 428, 434, 437, 441, 445, 450, 452, 456, 462, 467, 486, 509}`.

You can take a look at, for instance, invoice id 597. The track id 2093 is part of an album with only one track. To see Dataquest’s invoice classification, you can run the following query:

``````WITH invoice_first_track AS
(
SELECT
il.invoice_id invoice_id,
MIN(il.track_id) first_track_id
FROM invoice_line il
GROUP BY 1
)

SELECT
ifs.*,
CASE
WHEN
(
SELECT t.track_id FROM track t
WHERE t.album_id = (
SELECT t2.album_id FROM track t2
WHERE t2.track_id = ifs.first_track_id
)

EXCEPT

SELECT il2.track_id FROM invoice_line il2
WHERE il2.invoice_id = ifs.invoice_id
) IS NULL
AND
(
SELECT il2.track_id FROM invoice_line il2
WHERE il2.invoice_id = ifs.invoice_id

EXCEPT

SELECT t.track_id FROM track t
WHERE t.album_id = (
SELECT t2.album_id FROM track t2
WHERE t2.track_id = ifs.first_track_id
)
) IS NULL
THEN "yes"
ELSE "no"
END AS "album_purchase"
FROM invoice_first_track ifs
;
``````

• Filtering for albums with more than one track. There are a ton of albums with only one track.

• `SELECT count(*) FROM tracks_purchased` is a table comprised of tracks, basically. You’re supposed to count invoices, hence the huge number 2966, despite the fact that the `invoice` table only has 614 rows.

2 Likes

Hey @saidakbarp,

We have a `solved` feature that allows you the ability to mark something as the “correct” answer, which helps future students with the same question quickly find the solution they’re looking for.

Here’s an article on how to mark posts as solved - I don’t want to do this for you until I know that solution/explanation works.

Best,
Alvin.

2 Likes

@saidakbarp I don’t know if this is still relevant to you. But a few solutions to this were given here.

Hi @Bruno ,

I’m not sure why we have to find the invoice_first_track in order to find if the customer puchased an album or just tracks.

Hey. @ngokieuphu

We don’t need to find `invoice_first_track`. That was used in Dataquest’s solution, but it’s not necessary. I even suggested a strategy that doesn’t use that table.

When i run my code my answer differs from the solution. Did i mess up something logically because I don’t see what I did wrong. My logic being that if there are any invoices with more than one distinct album_id than it is not an album and if the album has less than a few tracks than it does not count as an album. The solution from this code is 132 albums and 482 individual purchases.

``````WITH album_indiv AS
(
SELECT
CASE
WHEN COUNT(DISTINCT album_id)=1 and COUNT(il.track_id)>4 THEN "album"
ELSE "individual"
END purchase_type,
t.album_id,
t.name,
il.invoice_id
FROM invoice_line il
INNER JOIN track t ON t.track_id=il.track_id
GROUP BY invoice_id
)

SELECT
CASE
WHEN purchase_type="album" THEN "yes"
WHEN purchase_type="individual" THEN "no"
END album_purchase,
COUNT(*) total_type

FROM album_indiv ai
GROUP BY purchase_type
``````

Hi Bruno. I tried using the method you mentioned and I get a different answer than the solution. I was wondering if you know what mistake I made. Thanks!

``````WITH
album_title AS
(
SELECT
a.title,
a.album_id,
COUNT(t.name) total_tracks
FROM album a
INNER JOIN track t ON t.album_id=a.album_id
GROUP BY a.title
),

invoice_table AS
(
SELECT
invoice_id,
album_id,
COUNT(il.track_id) tracks_purchased
FROM invoice_line il
INNER JOIN track t ON t.track_id=il.track_id
GROUP BY il.invoice_id
)
SELECT
CASE
WHEN total_tracks=tracks_purchased THEN "YES"
ELSE "NO"
END album_purchase,
COUNT(*) total_type,
CAST(COUNT(*) as float)/CAST(MAX(invoice_id) as float) percentage
FROM album_title at
INNER JOIN invoice_table it ON it.album_id=at.album_id
GROUP BY album_purchase
``````

Hey, Ja.

In any case, in that answer you mention, I provided a file which classifies invoices as to whether they are album purchases or not. You can use that to investigate the differences and try to figure it out yourself — it’s a good exercise.

Hi @Bruno,

I am also having trouble with this exercise. I want to understand Dataquest’s solution since it shows something I have not seen in past exercises.

“…SELECT t2.album_id FROM track t2
WHERE t2.track_id = ifs.first_track_id…”

In past execises, I remember seeing a number or string in the WHERE clause, for instance:
WHERE t2.track_id = 2
or a subquery that results in a number or string. In this exercise, we are comparing two columns in the WHERE clause, similarly to when we use the JOIN ON clause.

I do not understand how WHERE in this situation works, specially considering that the ifs table was not joined in this subquery.

Can you help me to understand this, please?

I do not want to advance with an alternate solution because I fear I might need to use this WHERE condition again in future exercises.

Thank you very much for your help

Hey, Rodrigo.

Since this is an entirely different question, I won’t go into too much detail. Feel free to ask your questions in a new post if this doesn’t suffice.

This is great feedback, thanks. @rodrigoc.lomonaco

A clause like `WHERE a_column = another_column` is filtering for rows in which the values of the columns match.

For instance, say you have a table called `a_table` that looks like this:

a_column another_column yet_another_column
1 1 “Whateva!”
2 5 “I do what I want.”
3 3 “— Eric Cartman”

The query `SELECT * FROM a_table WHERE a_column = another_column;` yields

a_column another_column yet_another_column
1 1 “Whateva!”
3 3 “— Eric Cartman”
1 Like

When you meant that I should ask my questions in a new post, I am not sure if you meant I should create a new topic or if I should not reply the post I have previously replied. If you meant the former, I am sorry

I still have one question. In the same subquery:

a column of “ifs” table is used but “ifs” table is not in a FROM clause nor in a JOIN clause. I thought that if you are going to use a column from a table, this table had to be in a FROM clause or in a JOIN clause. Was that assumption incorrect?

Thanks again and sorry if I did a mistake.

Sorry, I used the wrong term, I meant a new topic. The one we’re using isn’t meant to encompass all questions regarding 191-7. The `191-7` tag only serves to inform what the topic pertains to.

It is:

1 Like

I think you are asking about “within the same SELECT statement, how did a new table appear?”

This was the concept where i remember the lessons missed totally.
Here is a clear example.

2 Likes

Thank you, @Bruno and @hanqi!

Oh, my mistake, Bruno. I will keep this in mind in future questions.

That is exactly what I meant, hanqi! I didn’t know I could use an information from an outer query in an inner query. I always thought the subqueries had to be sufficient on their own. I will try to find some exercises regarding correlated subqueries before tackling this 191-7 project.

Use the search function here on Discourse to search for 191-7, there are other solutions around that do not use the this technique.

Hey, Bruno. I had run through these topics regarding 191-7. I had found another solution in:

The proposed solution counts the number of distinct tracks in an album and the number of distinct tracks in an invoice, compares both and considers it an album purchase if the numbers are equal, isn’t it?

However, can’t we have some false positives in doing so? I mean, cases when the number of distinct tracks in an invoice are the same number of tracks in an album but the tracks themselves are different? In this situation, the invoice will be called an album purchase but it should not be.

I am not sure if I misinterpreted the code, though.

Not quite. It’s the number of distinct tracks in each combination of invoice and album.

If you have further questions concerning that solution, please reply to the post where that solution is posted or create a new topic including the solution you wish to ask about.

Thanks, Bruno!

1 Like

2 posts were split to a new topic: Answering Business Questions Using SQL

I am certainly no expert (yet - hopefully one day ), but this approach seems a lot more simple and intuitive to me than the solution given. I started going down this route, got stuck on some details, got very confused by the proposed solution and started thinking I was way off base… I went back and did it this way in the end. Thanks for your help!

1 Like