Dear TAs,
I believe there is a mistake in the last exercise within the I/O Bound Program
module. The related links are given here at:
Description
Instruction
The Best Fielder is defined by someone with the bigger RF score, which has a formula of RF = (A + PO) / G
, where G must be > 100.
So the project wanted us to find the top 20 fielders with SQL, and here is the final solution:
query = '''
SELECT
(CAST(A AS FLOAT) + PO) / G as RF,
playerID
FROM Fielding
GROUP BY Fielding.playerID
HAVING G > 100
ORDER BY RF desc
LIMIT 20;
'''
However, I believe the answer is incorrect. This is because each player can has more than one RF scores. Therefore we need to extract the MAX(RF)
score from each player. Otherwise the clause (CAST(A AS FLOAT) + PO)/G
will simply return None
if that person has more than one entry. So I believe the correct solution should be:
query = '''
SELECT
MAX((CAST(A AS FLOAT) + PO) / G) as RF,
playerID
FROM Fielding
GROUP BY Fielding.playerID
HAVING G > 100
ORDER BY RF desc
LIMIT 20;
'''
The answers for the Best Pitcher and Best Batter are also questionable.
I believe the answers should be:
# best batter
query = '''
SELECT
MAX((CAST(H AS FLOAT) + BB + HBP) / (AB + BB + HBP + SF) +
(H + "2B" + 2*"3B" + 3*HR) / AB) as OPS,
playerID
FROM Batting
GROUP BY playerID
HAVING AB > 100
ORDER BY OPS desc
LIMIT 20;
'''
# best pitcher
query = '''
SELECT
MIN((13 * CAST(HR AS FLOAT) + 3*BB - 2*SO) / IPOuts + 3.2) as FIP,
playerID
FROM Pitching
GROUP BY playerID
HAVING IPOuts > 100
ORDER BY FIP asc
LIMIT 20;
'''
Correct me if I am wrong.
Thanks