Monday, September 12, 2011

Performance Modeling and Analysis of Flash-based Storage Devices

I often find how easy it is to reject relatively well-written papers. Here is my pretend conference review of this paper. Do not get me wrong: no offense is intended, and I am not angry about endless paper reading.

Dear authors,

We regret to inform you that your paper #1234567890 was not accepted for presentation and publication in the Proceedings of the Symposium on Perfect Computer Science Papers (SPCSP 2011).

For this year's SPCSP program, we selected 2 papers from the 747 papers submitted. There were many good papers that were submitted and seriously considered for acceptance by the TPC. However, to maintain our high reputation, we could not include many of the papers we would have liked.

===== Review 1 =====

> Overall Placement: Among all papers submitted to SPCSP, where would you place this one?
top 2%, but not top 1%

> Contributions:
This paper proposes a new way to model and analyze the performance of flash-based storage devices using regression trees. It also provides basic performance measurements of Intel and Samsung SSDs (after reading this paper, you may want to go buy Intel SSDs).

> Strengths:
This paper is in IEEE format, which is considerably easier to read than ACM format.

> Weaknesses:
Why do we have three papers to read by next Monday?

> Detailed comments:

Overall, this is a fairly readable paper. I also found it educational for a reader with a little expertise in SSDs. However, here are a few things you might want to consider:
  • Your plots are not very legible.
  • In Fig 3, (b) and (c) are quite redundant since the write size is fixed.
  • Why does the Y-axis of Fig 4 begin from 10, not 0?
  • You should have explicitly stated that Fig 4 is about write operations.
  • Fig 6 and Fig 7 convey the same information. The quantities they plot are just inversely proportional.
The authors do not explain why they chose regression trees over other alternatives (e.g., why not neural networks?). After all, I am not fully convinced that we need a fairly sophisticated black-box method to predict performance (in other words, why not just measure it?). One possible use could be to estimate the performance of SSDs under currently unknown workloads, but I am not sure how common that really is.

The authors claim that their model is more beneficial for SSDs than for HDDs, but I cannot see any meaningful difference for real traces in Table VII. In addition, the table shows only MRE (Mean Relative Error, not Meal, Ready-to-Eat). An average error reported without the distribution of errors might make people skeptical about the usefulness of the approach on real-world traces.
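
To make the last point concrete, here is a minimal Python sketch. The latencies and both "models" are made up for illustration and are not taken from the paper. It shows two hypothetical predictors with the same MRE whose error distributions look nothing alike.

```python
from statistics import mean, median


def relative_errors(measured, predicted):
    """Per-request relative error: |predicted - measured| / measured."""
    return [abs(p - m) / m for m, p in zip(measured, predicted)]


# Hypothetical measured latencies in ms (not from the paper).
measured = [1.0] * 10

# Hypothetical model A: off by 20% on every request.
model_a = [1.2] * 10
# Hypothetical model B: exact on nine requests, off by 200% on one.
model_b = [1.0] * 9 + [3.0]

for name, predicted in (("A", model_a), ("B", model_b)):
    errs = relative_errors(measured, predicted)
    print(
        f"model {name}: MRE={mean(errs):.2f} "
        f"median={median(errs):.2f} max={max(errs):.2f}"
    )
```

Both models report an MRE of about 0.20, yet model A is mildly wrong everywhere while model B is perfect except for one badly mispredicted request. Reporting a median, a maximum, or a few percentiles alongside the mean would let readers tell these cases apart.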
