My question is the same as in the title: when we are joining two data frames why FROM
statement doesn’t need both data frames names?
for example:
SELECT c.*, f.name country_name FROM facts f
INNER JOIN cities c ON c.facts_id = f.id
LIMIT 5;
Why does from
has just facts
instead of facts
and cities
?
Does it matter which one I choose (when I choose just one)?
The simple answer: FROM
doesn’t need two dataframes. Consider the simplest of queries:
SELECT *
FROM cities;
This is a perfectly valid query and FROM
needs only one dataframe. It’s only when we want to JOIN
two dataframes that we actually need the name of another dataframe. Therefore we name the second dataframe only when we use a JOIN
statement and the name of the second dataframe is part of the JOIN
statement.
Yes, it will matter which one you choose if you’re only choosing one – you will need to choose the one that has the features you’re looking to query. ie the features you place in the SELECT
statement.
…so when I join two, it doesn’t matter which I choose in FROM
statement? Am I understand this particular case correctly?
The only time order doesn’t matter is when doing an inner join. For all other types of joins, it will matter which is in the FROM
statement and which one is in the JOIN
statement.
Check out this stackoverflow post for a more in depth explanation.