Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

CS377 Database Systems, Spring 2024

Assignment 1

You can optionally work in pairs for this assignment! Due: Friday, February 16 at 11:59 PM sharp on Gradescope [Late submissions within 24 hours of the deadline: see syllabus for policy]

Introduction

For this assignment, you will write relational algebra queries on two different databases: (1) the Barnes and Noble office equipment database, and (2) the Olympic games of 2023. We are providing both schemas for you. Later in the course, you will learn about how to develop your own schema based on knowledge of the domain. Even though developing a schema is really the first step in building a database, it is a more advanced task than querying an existing database; this is why we will be learning about it and practicing it later.

Write the queries below in relational algebra. There are a number of variations on relational algebra, and different notations for the operations. You may only use the operations and notations we used in the lectures: Π, σ, ./, ./condition, ×, ∩, ∪, −, ρ. Assume all relations are sets (not bags). Some of the queries cannot be expressed in the language that you are using. In those cases, simply write “cannot be expressed”.

Additional points to keep in mind:

• Do not make any assumptions about the data that are not enforced by the original constraints given in the schema above. Your queries should work for any database that satisfies those constraints.

• Assume that every tuple has a value for every attribute. (For those of you who know some SQL, in other words, there are no null values.)

• Remember that the condition on a select operation may only examine the values of the attributes in one tuple (not whole columns), and that it can use comparison operators (such as ≤ and ≠) and boolean operators (∨, ∧ and ¬). Two relations in our schema have a date attribute. You may use comparison operators on such values. You may refer to the year component of a date d using the notation d.year.

• You are encouraged to use the assignment operator (:=) to define intermediate results, and it is a good idea to add commentary explaining what you’re doing. This way, even if your final answer is not completely correct, you may receive partial credit.

• The order of the columns in the result doesn’t matter.

• When asked for a maximum or minimum, if there are ties, report all of them.

1 Barnes and Noble Office Equipment Database (28 points)

Consider the following relational schema for the Barnes and Noble section that sells printers and computers:

Relations

• Company(companyID, cname, address, city, state, zipcode)

A tuple in this relation represents a company that makes computers and/or printers. companyID is the company id, cname is the company name, address is the street address of the company headquarters, city is the city of the company headquarters, state is the state of the company headquarters, zipcode is the zipcode of the company headquarters.

• Computer(computerID, model, laptop, os, speed, ram, harddrive, screen, price)

A tuple in this relation represents a computer product. computerID is the computer id, model is the computer model name, laptop is a boolean representing whether or not it is a laptop, os is the operating system of the computer, speed is the cpu speed of the computer, ram is the size of the computer memory, harddrive is the size of the computer harddrive, screen is the size of the screen (e.g., 11-inch, 13-inch, etc.), price is the sales price of the computer.

• ProduceComputer(companyID, computerID)

A tuple in this relation represents which company produces the computer. companyID is the company id that makes the product and computerID is the computer id.

• Printer(printerID, model, color, price, scanner, wireless)

A tuple in this relation represents a printer product. printerID is the printer id, model is the printer model name, color is a boolean representing whether or not it is a color printer, price is the sales price of the printer, scanner is a boolean representing whether or not it has scanning functionality, wireless is a boolean representing whether or not it has a wireless connection.

• ProducePrinter(companyID, printerID)

A tuple in this relation represents which company produces the printer. companyID is the company id that makes the product and printerID is the printer id.

Integrity constraints

• ProduceComputer[companyID] ⊆ Company[companyID]

• ProduceComputer[computerID] ⊆ Computer[computerID]

• ProducePrinter[companyID] ⊆ Company[companyID]

• ProducePrinter[printerID] ⊆ Printer[printerID]

Express the following queries in Relational Algebra:

1. [7 pts] Find the name of the company and the model name for all personal computers (non-laptops) that are more than $800 and less than $1200.

2. [7 pts] What company only sells computer with a “Windows” operating system?

3. [7 pts] What company makes the laptop with the largest ram and what city are they located in?

4. [7 pts] Which company makes at least 3 different color printers but does not make a laptop with a 15-inch screen?

2 Olympic Games (72 points)

The schema below represents data for the Olympic Games 2023, where people can buy tickets for matches they would like to attend, and the results are recorded as events are completed. The schema attempts to represent the domain well, but in order to keep the assignment simple, some things are simplified and other details are not represented. For example, the fact that some sporting disciplines are team events, while others are individual competitions is not represented. Each athlete, whether competing in an individual event or a team competition, is awarded a medal for finishing in the top 3 (so multiple athletes can get a gold medal in the same event, if it happens to be a team event). Remember that all your queries will be written with respect to the schema below.

Relations

• Country(CID, cname)

A tuple in this relation represents a country that participates in the competition. CID is the country id, cname is the country name.

• Athlete(AID, fname, lname, sport, CID, gold, silver, bronze)

A tuple in this relation represents an athlete who is participating in the Pan Am games. AID is the athlete’s ID, fname is their first name, lname is their last name, sport is the general sporting discipline the athlete is qualified to compete in (for example: athletics, swimming, fencing, diving, boxing, etc. – you can assume no athlete is qualified for more than one discipline), and CID is the country they are representing (each athlete only represents one country). Finally, gold/silver/bronze represent the number of medals of each type won so far by that athlete.

• Stadium(SID, sname, address, capacity)

A tuple in this relation represents a stadium where a match takes place. SID is the stadium’s ID, sname is the stadium name, address represents its location within the city of Lima, capacity is the capacity of the stadium (number of seats).

• Event(EID, date, time, sport, description, SID)

A tuple in this relation represents a sporting event that took place or will be taking place in the future. EID is the id of the event, date is the date the event is scheduled on, time is the time of the event, sport is the sporting discipline of the event, and description is the type of event in that sport (e.g., sport can be “athletics”, description can be “100m sprint”). The SID is the id of the stadium where the event takes place.

• Ticket(TID, dateIssued, timeIssued, EID)

A tuple in this relation represents a ticket that was purchased, (regardless of whether the event has taken place yet or not - ticket purchases can be made in advance). TID is the ticket’s id, dateIssued is the date that it was purchased, timeIssued is the time that it was purchased, and EID is the EID of the event it was purchased for.

• Result(EID, AID, medal)

This relation keeps track of all the results of completed events so far. A tuple in this relation represents the fact that a medal has been won in a specific event by a specific athlete. EID is the event id, AID is the athlete competing in this event, and medal is the medal type (gold, silver, bronze, or nomedal) which the respective athlete won in the competition.

Integrity constraints

• Athlete[CID] ⊆ Country[CID]

• Event[SID] ⊆ Stadium[SID]

• Ticket[EID] ⊆ Event[EID]

• Result[EID] ⊆ Event[EID]

• Result[AID] ⊆ Athlete[AID]

• ΠmedalResult ⊆ {“gold”, “silver”, “bronze”, “nomedal”}

Olympic Games: Queries (60 pts)

5. [7 pts] Find all the sporting disciplines that Mexican athletes have competed in so far.

6. [8 pts] Find the last names of the athlete(s) of the country(ies) that did not win any medals yet (either because they did not compete, or because their athletes did not rank in the top 3 in any event so far).

7. [8 pts] Find the first and last name of the athletes whose sporting discipline is “swimming” and have won the highest number of gold medals among all athletes who compete in the same sport (i.e., swimming).

8. [8 pts] Find the name of every country that has won at least one of every type of medal (gold, silver, and bronze).

9. [7 pts] Find the sports disciplines of the events for which at least two tickets were bought on the date of the event.

10. [7 pts] Find the discipline (sport) of the event for which the highest number of tickets was purchased.

11. [7 pts] Find the athlete with the highest overall number of gold medals won so far, and report that athlete’s first and last name, country name, and number of gold medals won.

12. [8 pts] Find the gold medalist country of the event for which the very first ticket out of all the tickets in the database was purchased.

Olympic Games: Additional Integrity Constraints (12 pts)

Below are some additional integrity constraints on our schema. Express each of them using the notation we introduced in class (see slides 16-18 in lecture 6). If a constraint cannot be expressed using such notations, simply write “cannot be expressed”.

1. [4 pts] All tickets for an event have to be purchased before the time of the event.

2. [4 pts] An athlete could not have competed in an event for a sporting discipline that they are not qualified to participate in.

3. [4 pts] The number of tickets purchased for an event should not exceed the capacity of the stadium where the event takes place.

Submission instructions

Your assignment must be typed; handwritten assignments will not be marked. You may use any word-processing software you like. Many academics use LaTeX. It produces beautifully typeset text and handles mathematical notation well. Whatever you choose to use, you need to produce a final document in pdf format, and you must call it “A1.pdf’.

You are allowed, and in fact encouraged, to work with a partner for this assignment. You must declare your team (if you are working with a partner) and submit your work on Canvas. For this assignment, hand in just one file: “A1.pdf”. If you are working in a pair, only one of you should submit the file, but please remember to declare your team on Canvas before submission. Late submissions are subject to our late policy (see course syllabus). If you are working in a pair, both students must have a late token each in order to accept the assignment within 24 hours with no penalty. Otherwise, the standard 10% penalty will be applied to both.

Honor Code: Solve this assignment without collaborating with classmates (besides your assignment partner), and without consulting external/online resources. The assignment is governed by the College Honor Code andDepartmental Policy. Remember, any work you submit must be your own; otherwise you risk being investigated by the Honor Council and facing the consequences of that.

Please remember to include the following comment at the beginning of the a1.pdf file:

THIS WORK WAS MY (OUR) OWN WORK. IT WAS WRITTEN WITHOUT CONSULTING

WORK WRITTEN BY OTHER STUDENTS OR COPIED FROM ONLINE RESOURCES.

_Student1_Name_, _Student2_Name_