A Look at Two Papers
- These are from KDD 2017.
- This conference had an acceptance rate of:
- 8.6% for research papers
- 9.2% for applied papers.
- Clustering Individual Transactional Data for Masses of Users
- This was in the research track.
- What does the abstract do?
- Introduce the problem.
- Introduce the algorithm (solution).
- State that an experiment was successful.
- State that an application using the algorithm has been developed.
- What does the introduction do?
- Big picture
- I sure would like a citation justifying the 5 GB of data per year.
- It is stated as Gb (gigabits), but I bet they mean GB (gigabytes).
- Definition of focus.
- Not Data Science
- Not clustering
- "In this paper we focus on the problem of performing transactional clustering for a large number of different datasets."
- Note that they even narrow the scope further.
- "Given a collection of transactions, transactional clustering consists in discovering groups of homogeneous transactions which share many common items [30]."
- Notice in this paragraph, they continue to narrow the problem.
- Large number of users.
- Large amount of data for each user.
- Need to perform the algorithm for each user.
- No ability to tune parameters.
- A discussion of the state of the art.
- They discuss what has been done before.
- This is the place that you show that you know what you are talking about.
- Or at least that you have done your homework
- X-means is presented.
- They explain how it solves part of the problem but not the rest.
- Two other clustering algorithms are presented the same way.
- There are many citations in this section.
- Finally, their solution is introduced in some detail.
- Note, this is not a scientific presentation at this point.
- It is still an overview.
- From the introduction, you may need to go do some background work.
- Do you know all of the terms?
- Do you know the algorithms they are comparing this to?
- Do you understand the problem?
- This paper has section 2, "Related Work".
- It begins a more detailed discussion of the problem and how others have attempted to solve it.
- If you are missing background, this is the place to pick it up.
- You might find a different solution to your problem.
- This section is not always present.
- Section three
- This begins a detailed discussion of the problem.
- Mathematical definitions are given/made.
- Note the use of math here. (Set theory)
- In section four they present the algorithm.
- This is an examination of the algorithm in detail.
- It should provide sufficient detail for a reader to implement the algorithm.
- This becomes a tough read
- Individual sentences are "dense"
- Large amount of detail.
- You might skip this section on first read.
- Section 5 describes experimental results.
- Again, sufficient information should be present to allow readers to recreate the experiment.
- They used artificial data.
- This data was from previous experiments.
- This allows some comparison with previously reported results.
- The Real Case study section
- Last section, Conclusions
- Usually includes future directions.
- "Not All Passes Are Created Equal:" Objectively Measuring the Risk and Reward of Passes in Soccer from Tracking Data
- This was in the applied track.
- Look for similarities and differences between the two papers.
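
Going back to the first paper's definition of transactional clustering ("groups of homogeneous transactions which share many common items"), a small sketch can make the idea concrete while doing background work. This is not the paper's algorithm. It is a toy illustration that assumes Jaccard similarity as the measure of "sharing common items" and uses a naive greedy grouping; the names `jaccard`, `group_transactions`, and `threshold` are made up for this example.

```python
def jaccard(a, b):
    """Fraction of items two transactions share (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_transactions(transactions, threshold=0.5):
    """Greedy grouping (illustrative only, not the paper's method).

    Each transaction joins the first group whose first member is
    similar enough; otherwise it starts a new group.
    """
    groups = []
    for t in transactions:
        for g in groups:
            if jaccard(t, g[0]) >= threshold:
                g.append(t)
                break
        else:
            groups.append([t])
    return groups


# Hypothetical shopping baskets for one user.
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread", "butter"},
    {"nails", "hammer"},
    {"hammer", "nails", "glue"},
]
groups = group_transactions(baskets)
print(len(groups))  # two groups: groceries and hardware
```

Note that `threshold` is a hand-picked parameter. The paper's narrowed problem statement says there is no ability to tune parameters for each user, which is one reason a naive approach like this would not be enough.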