Guided Project: Designing and Creating a Database / UNIQUE contraint COMPOUND KEY

Hello,
I have been stuck for hours on page 7 of the guided project; and Rstudio is user-unfriendly, so apologies for not formatting this in the correct way…

We create the “team_appearance” table. We want to have a coumpound key composed of game_id and team_id. I looked at my tables and it doesn’t make sense, my compound key should absolutely be unique. However, I get this answer ALL THE TIME, which makes no sense since my compound key is unique:

Error: UNIQUE constraint failed: team_appearance.team_id, team_appearance.game_id

my code is (abbreviated to 4 columns for clarity):

q1 <-"CREATE TABLE team_appearance **
** (team_id TEXT,

** game_id TEXT,**
** home TEXT,**
** league_id TEXT,**
** PRIMARY KEY (team_id,game_id));"**

dbExecute(con,q1) **
q2<- "INSERT INTO team_appearance
** SELECT

** h_name,**
** game_id,**
** 1 AS home,**
** h_league**
** FROM game_log**
UNION
** SELECT **
** v_name,**
** game_id,**
** 0 AS home,**
** v_league**
** FROM game_log;"**

dbExecute(con,q2)

result8 ← dbGetQuery(con,q2)

yet I get the UNIQUE constraint violated every time. why???

Thanks in avance

1 Like

Hi @claire.guesdon77,

Can you share the link to the guided project?

It’ll make it easier for others to know which project you’re having trouble with.

And for formatting, just put ``` before the start of the code and another after the end of the code.

```

code
code
code

```

I think the ‘Preformatted text’ option in the toolbar also works the same. It’s the one that looks like </>.

1 Like

Good afternoon,

the guided project is : Learn data science with Python and R projects

The code is on my laptop in R Studio. I am not sure how to create a link to a local file…I attach the text of the R file in this message.

Regards
R_project_notworking.txt (4.1 KB)

code below :

q27 <-"CREATE TABLE team_appearance4 (team_id TEXT,
                                    game_id TEXT,
                                    home TEXT,
                                    league_id TEXT,
                                    PRIMARY KEY (team_id,game_id));"

dbExecute(con,q27) 
                                   
q28 <- "INSERT INTO team_appearance4
    SELECT
        h_name,
        game_id,
        1 AS home,
        h_league
      FROM game_log
UNION
    SELECT    
        v_name,
        game_id,
        0 AS home,
        v_league
    from game_log;"

dbExecute(con,q28)

result8 <- dbGetQuery(con,q28) 
#ERROR APPEAR AFTER HERE```
**Error: UNIQUE constraint failed: team_appearance4.game_id, team_appearance4.team_id**

Honestly I spent yesterday 9pm-2am on this thing and I really don’t understand the error. I fully deleted and rebuilt the database several times. Tonight I destroyed and rebuilt it again from scratch and I have the same error.
A little help for the learners who work full time and learn at night would be helpful, I can’t do that many 2 am stops until I give up. I already had zero help on the Chinook database project when I ran into trouble (ticket #155052) I look forward to see some suggestions. Thanks

1 Like

Hi Claire,

Welcome to the Community!

Sorry to hear that you’re experiencing this issue.

While you’re waiting for help on your question from our Community members, I can comment on your situation with ticket #155052. I’ve checked it right now and noticed that you received the reply from our Support Team on 8.11.2022. Have you seen and read it? If you haven’t received it, please check your spam folder. Anyway, one of the suggestions from the Support Team was:

If, even after posting in our community, you didn’t receive adequate help, please reply to this ticket, and we will do our best to assist you.

To get help on that ticket, could you please reply to it and continue being in contact with the Support Team until the question is fully resolved? Thank you!

Our Community support is purely volunteering-based meaning that our helpers here also work, study, and volunteer at the same time. This is the reason why sometimes the help doesn’t arrive immediately. Also, it’s important to provide as many (and as clear) details as possible when asking a question on our forum (I remember a cool phrase coined by @masterryan.prof - help us help you learn :grinning:). For this purpose, you can find some useful links here and, in general, the whole topic is very helpful.

Happy learning! :nerd_face:

2 Likes

dear Elena, thank for your message.

I clearly exposed the Guided project I had trouble with and the error I received.

I am fully aware that this community is purely volunteer, but when I try to contact Dataquest Help for a non-trivial issue I have spent hours on without going anywhere, they send a canned, automatic email to state that I should “post it to the community”, which is an insult to intelligence. As you mentionned, there is no reason why the community should have any quick answer…In the meantime, it leaves the student (you and me, paying participants because Dataquest isn’t free) frustrated and burnt out, unable to progress. I assure it is not laziness, I only posted 2 questions in the last 3 years I used Dataquest. For both I spent >10 hours debugging without success. There is no progress for anyone under these circumstances. it might well be an arcane issue with the version of R I am using

I wish we could have dedicated support for issues, and not lazy emails “post it to our Free Labor Community because we cannot be bothered to help you the few times you have a non-trivial question”.

3 Likes

Dear Claire,

I’m really sorry to hear that you feel so frustrated about this situation :disappointed: Anyway, since the main objective here is to resolve the problem, I would suggest you do the following:

  • Reply to that ticket where you were directed to use the Community and explain that your question isn’t actually Community-relevant, so it needs the attention of the Support Team (maybe add the link to this thread as well).
  • Do the same with the previous ticket about another issue you had.
  • In the meantime, let’s wait for our Community members to reply to you in this thread. Indeed, @wanzulfikri is already in touch with you about your issue, so let’s give him some time to reply, or maybe other @moderators or Community members will be able to help you.

Hope your issue will be resolved soon!

2 Likes

Hi @claire.guesdon77,

Sorry to hear about your frustrations. I’ll have a look at your problem and see if there’s anything I can do about it.

To be frank, we’re most likely living on different ends of the globe so our time is not synchronised and it might take me a few hours for me to reply. I sometimes think that most people here are asleep when I open the forum and vice-versa. Also, sometimes the problem can be quite difficult to debug and it can take hours at times (I personally need to spend that much time to solve some problems). This happens especially if the code is running locally instead of the Dataquest platform (which I understand because I think you need to run them locally for R guided projects).

I’m not trying to minimise your concerns in a way because I think they’re valid. I just want you to know that we in the community are doing our best to help you and speaking for myself, I’m very sorry that my best is not good enough to allay your frustrations.

1 Like

Hello @claire.guesdon77
Sorry for the frustration you are in. I had the same experience and I know how much the frustration
it is , when you can’t move on to the next step due to unresolved error. Fortunately, my problem was solved by the community members and I was able to progress with my learning.

I am also into the issue to see how quick it can get resolved. Though as mentioned by @wanzulfikri , the problem my take much time more so when it is being worked locally as compared to when it is worked on Dataquest platform.

Meanwhile , I have to add this to the issue;

UNIQUE constraint failed error always occur when you are inserting a data that has an entry which is already in the corresponding column of the table that you are inserting into. Like in your case, team_appearance4.game_id render the same value as that of team_appearance4.team_id despite querying the fields as unique. This normally occurs when generating the keys manually. You can try adding IGNORE while inserting these values into team_appearance4 table to allow the query to bypass the constraint;

q28 <- "INSERT or IGNORE  INTO team_appearance4
    SELECT
        h_name,
        game_id,
        1 AS home,
        h_league
      FROM game_log

Not very sure whether this will solve your issue , but it is as well giving us a broader thoughts on the bug.

Happy Learning.

2 Likes

Hi @claire.guesdon77,

If @brayanopiyo18 suggestion helped to solve your problem, please ignore what I’m going to write next.

I’m trying to run your code but I can’t seem to install RSQLServer. It seems that it has been removed from CRAN and the Github repository has also been archived quite a while ago. Can you clarify how you were able to install that library? Are you using an old R version by chance (type R.version in the R studio console to see the version)?

1 Like

Hi @wanzulfikri, I installed the R studio recommended by Dataquest and indeed, it throws an error about RSQLServer. I bypass the issue with the following code:

conn = dbConnect(RSQLite::SQLite(), "./mlb.db")
dbListTables(conn)

this actually works despite the package not loading…I don’t know why, this is actually a part of the difficulty of these R projects, the local environment is not consistent. With the Python path it’s easier to debug :slight_smile:

2 Likes

Thanks a lot for the suggestions! I implemented it and the result was that my end table had zero lines. Not a single one! In order to make sure my database was alright, I built from scratch 2 small tables with a small subset of data (4 columns x 15 lines), one for home, one for visitors. Then I tried to make an union of these two. The result worked well (4 columns x 30 lines, with ‘home’ being either a 0 or a 1. However, when I tried to insert these columns into the final table that is structured with the compound key, it either raises an UNIQUE error or it does not include a single line. If I remove the KEY declaration, then the table accepts the insertion of all the rows.

So this is the code that seems to create the issue:

q37 <-"CREATE TABLE team_appearance14 (team_id TEXT,
                                    game_id TEXT,
                                    home TEXT,
                                    league_id TEXT,
                                    PRIMARY KEY (game_id,team_id));"

after a dozen hours I am at my wit’s end, wasn’t it how we are supposed to declare a compound key?

regards

2 Likes

I see.

Yeah, RSQLServer is probably not needed and RSQLite is enough. Okay then, I’ll modify the code and see if I can get to the bottom of your issue.

1 Like

Hi @claire.guesdon77,

I believe the following is the source of your problem:

dbExecute(con,q2)

result8 ← dbGetQuery(con,q2)

From the documentation, both dbExecute and dbGetQuery executes the query. The only difference is the former doesn’t return the result while the latter returns the result:

DBI has new functions dbSendStatement() and dbExecute(), which are the counterparts of dbSendQuery() and dbGetQuery() for SQL statements that do not return a tabular result, such as inserting records into a table, updating a table, or setting engine parameters. It is good practice, although currently not enforced, to use the new functions when you don’t expect a result.

This means that your code is actually inserting the results of the union twice so there’s going to be duplicate data because of the second query with dbGetQuery. For INSERT, use dbExecute once and use a different SELECT query instead to get the results. Here’s an example:

q28 <- "INSERT INTO team_appearance4
    SELECT
        h_name,
        game_id,
        1 AS home,
        h_league
      FROM game_log
UNION
    SELECT    
        v_name,
        game_id,
        0 AS home,
        v_league
    from game_log;"

# The following has already inserted the rows.
#
# If you run dbGetQuery with the query q28 again, 
# you will be repeating the same query thus inserting duplicate data into team_appearance
#
#That will violate the UNIQUE constraint.
dbExecute(con,q28)

# my modifications
q_select_team_appearance <- "SELECT * FROM team_appearance4"

result8 <- dbGetQuery(con,q_select_team_appearance) 

print(result8)

From what I’ve seen, you can’t use dbGetQuery directly with INSERT statement or you’ll get the following error : SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().

It’s still quite odd that in your original code when you have both, it didn’t stop INSERT from being used with dbGetQuery when you used it the second time. The safest option is to use dbExecute with INSERT and then query the results of that insertion with dbGetQuery using SELECT.

Try to implement my suggestion and see if it fixes your problem.

4 Likes

Thanks! Indeed it looks very promising, it worked on my mini-set of tables! This is the solution.
FYI This weekend I deleted and reinstalled both R and Rstudio in order to avoid version issues. I still have an issue with not loading RSQLite. It does load SQLite, so the workaround that functions for me is:

con <- dbConnect(RSQLite::SQLite(),"my_db_name.db")

in case anyone had issues like me with connecting to the DB.

I also see that my last version of R has the following error so I wouln’t use DbGetQuery anymore:

Warning message:
In result_fetch([email protected], n = n) :
SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().

hopefully this can help the next person sweating with Rstudio!

2 Likes

No worries @claire.guesdon77.

Happy learning.

1 Like