Research paper assignments (RP1 and RP2)
Overview
For RP1, you will choose a research paper published at a recent edition of VLDB (International Conference on Very Large Data Bases), a top-tier database research conference, and answer a few questions about it. For RP2, you will give a presentation on the content of your chosen paper.
You can complete these two assignments individually or in a team of two. Note, however, that you cannot work with the same partner on these assignments and on the final project (FP1-FP4). If you would like the instructor to find a partner for you, please just ask.
RP1 (120 points)
RP1 consists of two main steps. First, you must choose your paper. Then, you must submit a document answering a few questions about the paper. These two steps are outlined below.
Step 1 of RP1: choose your paper
VLDB, officially known as the International Conference on Very Large Data Bases, is a prestigious annual conference presenting original research about databases. Only about 20% of papers submitted to the conference are accepted for presentation and publication. Therefore, papers published in the proceedings of the conference are likely to be of high quality. The title of the conference reflects its focus at the time when it was founded in the 1970s, but these days the conference is about all types of database research, not just “very large” databases.
You must choose a paper that was published in one of the two most recent VLDB conferences: VLDB 2021 or VLDB 2020. As each conference accepts hundreds of papers, the choice may be rather overwhelming. To help narrow it down, you can focus on just the papers that were awarded prizes. The prize-winning papers are listed at the following links:
- VLDB 2021 award papers: Any of the first four award-winning papers can be chosen. The “Best Demonstration Award” paper is not eligible.
- VLDB 2020 award papers: The “Best Demonstration” and “Best Demonstration Runner-up” papers are not eligible. Any of the others, including runner-up or honorable mention, may be chosen.
You are also free to choose any other paper published in one of the main research tracks at these two conferences. PDF versions of these papers are available at the following links:
- VLDB 2021 all papers: Click on paper title to download PDF.
- VLDB 2020 all papers: Click on “download PDF” to view a paper. A free registration is required.
Each individual or team must choose a different paper. Papers will be assigned on a first-come, first-served basis. You can see a list of papers that have already been chosen at the following link:
You can make your selection at the following link:
Here is a list of papers already chosen (it may be out of date, but the instructor will try to update it whenever possible):
Alyssa | DatAgent: The Imminent Age of Intelligent Data Assistants. Mandamadiotis et al., VLDB21 |
Leah and Katie | Are We Ready For Learned Cardinality Estimation? Wang et al., VLDB21 |
Leo | MyRocks: LSM-Tree Database Storage Engine Serving Facebook’s Social Graph. Matsunobu et al., VLDB20 |
John and Amir | Optimizing Fitness-For-Use of Differentially Private Linear Queries. Xiao et al., VLDB21 |
Khanh | RAMP-TAO: Layering Atomic Transactions on Facebook’s Online TAO Data Store. Cheng et al., VLDB21 |
Han | Davos: A System for Interactive Data-Driven Decision Making. Shang et al., VLDB21 |
Sophia | Auctus: A Dataset Search Engine for Data Discovery and Augmentation. Castelo et al., VLDB21 |
Son | DIAMetrics: Benchmarking Query Engines at Scale. Gruenheid et al., VLDB21 |
Billy | Optimizing Bipartite Matching in Real-World Applications by Incremental Cost Computation. Abeywickrama et al., VLDB21 |
Evan | SaS: SSD as SQL Database System. Park et al., VLDB21 |
William | Opportunities for Optimism in Contended Main-Memory Multicore Transactions. Huang et al., VLDB20 |
Pamela | Inspector Gadget: A Data Programming-based Labeling System for Industrial Images. Heo et al., VLDB20 |
Step 2 of RP1: submit paper description
For this part of the assignment, you must submit to Moodle a document answering the following questions. The submission is due at the start of class on the due date listed on the class schedule.
In answering the following questions, you are not required to do extensive reading or research. It should be possible to answer the questions based only on reading the abstract, introduction, and conclusion of your chosen paper. You may also find it useful to examine the figures and captions in the main body of the paper, without trying to understand all details.
Below are the questions to be answered in your submission. In each case, one sentence should be sufficient to answer the question.
- Give the full citation of your chosen paper in APA format. Example:
  - Kandula, S., Orr, L., & Chaudhuri, S. (2019). Pushing data-induced predicates through joins in big-data clusters. Proceedings of the VLDB Endowment, 13(3), 252-265.
- What is the main problem the authors are trying to solve? In other words, what is the main research question of the paper?
- What is the main idea that the authors use to solve the problem you identified in the previous question? Note that you are not required to understand any technical details at this stage. You can re-state the authors’ claims without necessarily understanding them.
- What kind of experiments did the authors run to demonstrate that their solution to the research problem is a good one?
- Based on the citations given in the introduction of the paper (or possibly elsewhere), guess which previously-published paper is the most closely related to this one. Give a citation of that paper in APA format as your answer to this question.
- Obtain a PDF version of the closely-related paper you identified in the previous question. Most likely it will be freely available. If not, you will need to submit an inter-library loan request to the Dickinson library. As your answer to this question, state whether you have already obtained the PDF or, alternatively, describe how you submitted your request for it.
RP2 (300 points)
The final objective of RP2 is to give a presentation explaining the content of your chosen paper to our class. To achieve this, you will:
- read your chosen paper carefully
- read one other background paper lightly (most likely, the paper you identified at the end of RP1)
- do a small amount of additional background research if necessary
- create your presentation
- deliver your presentation in class
The amount of time required to achieve all of the above will vary according to many factors such as the nature of the paper that you chose, but a reasonable target would be at least five hours of preparation.
Your presentation should be about 10 minutes in length, and there will be about five minutes of questions from the audience afterwards.
You must submit your slides, in any reasonable format, to Moodle before the start of the class in which you are delivering your presentation.
To achieve an excellent grade, your presentation should excel in the following respects. It should:
- meet the target length of 10 minutes, plus or minus two minutes;
- clearly explain the problem that the paper is trying to solve, or the research question being addressed;
- clearly explain the method used to solve the problem and/or address the research question;
- clearly explain any experiments demonstrating the success of the main idea;
- use clear and engaging visual materials (slides and/or whiteboard);
- employ large fonts that are very easy to read, with only a small number of words on each slide so that the audience can read every word;
- be delivered in a clear, engaging, and fluent voice;
- be delivered without reading verbatim from notes or slides (using notes is a good idea; just don’t read from them word for word);
- demonstrate a level of understanding of the technical material that is achievable by an undergraduate student (you are not expected to master all technical details);
- demonstrate some understanding of at least one additional background source;
- include a list of sources in APA format, including the chosen paper and at least one additional background paper.
Note that it will not be possible to explain everything about the paper within the 10-minute time limit. Part of the challenge of this assignment is to judiciously choose which material to explain and which to leave out.
Schedule of presentations (assigned randomly)
In class on Thursday 3/31:
- Pamela
- Han
- William
- Son
- Evan
In class on Monday 4/4:
- Leah and Katie
- Leo
- Billy
- Khanh
- Alyssa
In class on Thursday 4/7:
- Sophia
- John and Amir