Research paper assignments (RP1 and RP2)

Overview

For RP1, you will choose a research paper published at a recent edition of VLDB (International Conference on Very Large Data Bases), a top-tier database research conference. For RP2, you will give a presentation on the content of your chosen paper.

You can complete these two assignments individually or in a team of two. Note, however, that you cannot work with the same partner on this assignment and on the final project (FP1-FP4). If you would like the instructor to find a partner for you, please just ask.

RP1 (120 points)

RP1 consists of two main steps. First, choose your paper. Then, submit a document answering a few questions about it. Both steps are described below.

Step 1 of RP1: choose your paper

VLDB, officially known as the International Conference on Very Large Data Bases, is a prestigious annual conference presenting original research about databases. Only about 20% of papers submitted to the conference are accepted for presentation and publication. Therefore, papers published in the proceedings of the conference are likely to be of high quality. The title of the conference reflects its focus when it was founded in the 1970s, but these days the conference covers all types of database research, not just “very large” databases.

You must choose a paper that was published in one of the two most recent VLDB conferences: VLDB 2021 or VLDB 2020. Each conference accepts hundreds of papers, so choosing among them may be rather overwhelming. To narrow your choice, you can focus on just the papers that were awarded prizes. The prize-winning papers are listed at the following links:

You are also free to choose any other paper published in one of the main research tracks at these two conferences. PDF versions of these papers are available at the following links:

Each individual or team must choose a different paper. Papers will be assigned on a first-come, first-served basis. You can see a list of papers that have already been chosen at the following link:

You can make your selection at the following link:

Here is a list of papers already chosen (it may be out of date, but the instructor will try to update it whenever possible):

   
Alyssa DatAgent: The Imminent Age of Intelligent Data Assistants. Mandamadiotis et al., VLDB21
Leah and Katie Are We Ready For Learned Cardinality Estimation? Wang et al., VLDB21
Leo MyRocks: LSM-Tree Database Storage Engine Serving Facebook’s Social Graph. Matsunobu et al., VLDB20
John and Amir Optimizing Fitness-For-Use of Differentially Private Linear Queries. Xiao et al., VLDB21
Khanh RAMP-TAO: Layering Atomic Transactions on Facebook’s Online TAO Data Store. Cheng et al., VLDB21
Han Davos: A System for Interactive Data-Driven Decision Making. Shang et al., VLDB21
Sophia Auctus: A Dataset Search Engine for Data Discovery and Augmentation. Castelo et al., VLDB21
Son DIAMetrics: Benchmarking Query Engines at Scale. Gruenheid et al., VLDB21
Billy Optimizing Bipartite Matching in Real-World Applications by Incremental Cost Computation. Abeywickrama et al., VLDB21
Evan SaS: SSD as SQL Database System. Park et al., VLDB21
William Opportunities for Optimism in Contended Main-Memory Multicore Transactions. Huang et al., VLDB20
Pamela Inspector Gadget: A Data Programming-based Labeling System for Industrial Images. Heo et al., VLDB20

Step 2 of RP1: submit paper description

For this part of the assignment, you must submit to Moodle a document answering the following questions. The submission is due at the start of class on the due date listed on the class schedule.

In answering the following questions, you are not required to do extensive reading or research. It should be possible to answer the questions based only on reading the abstract, introduction, and conclusion of your chosen paper. You may also find it useful to examine the figures and captions in the main body of the paper, without trying to understand all details.

Below are the questions to be answered in your submission. In each case, one sentence should be sufficient to answer the question.

  1. Give the full citation of your chosen paper in APA format. Example:
    • Kandula, S., Orr, L., & Chaudhuri, S. (2019). Pushing data-induced predicates through joins in big-data clusters. Proceedings of the VLDB Endowment, 13(3), 252-265.
  2. What is the main problem the authors are trying to solve? In other words, what is the main research question of the paper?

  3. What is the main idea that the authors use to solve the problem you identified in the previous question? Note that you are not required to understand any technical details at this stage. You can re-state the authors’ claims without necessarily understanding them.

  4. What kind of experiments did the authors run to demonstrate that their solution to the research problem is a good one?

  5. Based on the citations given in the introduction of the paper (or possibly elsewhere), guess which previously published paper is the most closely related to this one. Give a citation of that paper in APA format as your answer to this question.

  6. Obtain a PDF version of the closely related paper you identified in the previous question. Most likely it will be freely available. If not, you will need to submit an interlibrary loan request to the Dickinson library. As your answer to this question, state whether you have already obtained the PDF or, alternatively, describe how you submitted your request for it.

RP2 (300 points)

The final objective of RP2 is to give a presentation explaining the content of your chosen paper to our class. To achieve this, you will

The amount of time required to achieve all of the above will vary according to many factors such as the nature of the paper that you chose, but a reasonable target would be at least five hours of preparation.

Your presentation should be about 10 minutes in length, and there will be about five minutes of questions from the audience afterwards.

You must submit your slides, in any reasonable format, to Moodle before the start of the class in which you are delivering your presentation.

To achieve an excellent grade, your presentation should excel in the following aspects. It should:

Note that it will not be possible to explain everything about the paper within the 10-minute time limit. Part of the challenge of this assignment is to judiciously choose which material to explain and which to leave out.

Schedule of presentations (assigned randomly)

In class on Thursday 3/31:

  1. Pamela
  2. Han
  3. William
  4. Son
  5. Evan

In class on Monday 4/4:

  1. Leah and Katie
  2. Leo
  3. Billy
  4. Khanh
  5. Alyssa

In class on Thursday 4/7:

  1. Sophia
  2. John and Amir