Comment Locator

About

Programmers should write code comments, but not on every line of code. Because both too few and too many comments are undesirable, programmers must judiciously decide where to write them. We have created a machine learning model that suggests locations where a programmer should write a code comment. We trained it on existing commented code to learn the locations that developers choose. Once trained, the model can predict comment locations in new code. Our models achieved a precision of 74% and a recall of 13% in identifying comment-worthy locations. This first success opens the door to future work, both on the new where-to-comment problem and on guiding comment generation.
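
To make the precision and recall figures above concrete, here is a minimal sketch of how the two metrics could be computed for predicted comment locations. It is not the evaluation code from the paper: the function name `precision_recall`, the representation of locations as sets of line numbers, and the toy data are all assumptions made for illustration.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for predicted comment locations.

    Both arguments are collections of line numbers (a hypothetical
    representation chosen for this sketch).
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of suggested locations that developers really commented.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of developer-commented locations that were suggested.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Made-up toy data, not taken from the paper's dataset.
actual = {3, 10, 17, 25, 31, 40, 52, 60}   # lines developers commented
predicted = {3, 11}                        # lines a model suggested

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A model with high precision and low recall, like the one reported above, makes few suggestions, but most of the suggestions it does make coincide with places where developers actually wrote comments.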

This page presents the code and data from our paper:
"Where should I comment my code? A dataset and model for predicting locations that need comments",
Annie Louis, Santanu Kumar Dash, Earl T. Barr, Michael D. Ernst and Charles Sutton.
Proceedings of ICSE NIER 2020.

Downloads

[code] built with TensorFlow 1.14
[data]
[best models]

More Information

For more information on the corpus, please contact us: