A Convolutional Attention Network for Extreme Summarization of Source Code

A convolutional attentional neural network for summarizing source code tokens into a short, concise method name.
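As a rough illustration of the general idea behind convolutional attention, the sketch below scores each token position with a single 1-D convolution filter over the token embeddings, normalizes the scores with a softmax, and returns the attention-weighted sum of the embeddings. It is not the architecture from the paper or the linked repository: the function name `conv_attention`, the single filter, the zero padding, and the toy 2-D embeddings are all assumptions made for this example.

```python
import math


def conv_attention(embeddings, kernel, bias=0.0):
    """Toy convolutional attention over a token sequence (hypothetical helper).

    embeddings: list of T vectors, each of dimension D
    kernel:     list of W vectors of dimension D (one 1-D filter, W odd)
    Returns (weights, context): T attention weights and a D-dim context vector.
    """
    T = len(embeddings)
    D = len(embeddings[0])
    W = len(kernel)
    half = W // 2
    zero = [0.0] * D

    # Convolution: one scalar score per token position, zero-padded at the edges.
    scores = []
    for t in range(T):
        s = bias
        for w in range(W):
            idx = t + w - half
            vec = embeddings[idx] if 0 <= idx < T else zero
            s += sum(k * x for k, x in zip(kernel[w], vec))
        scores.append(s)

    # Softmax over positions (subtracting the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the token embeddings.
    context = [
        sum(weights[t] * embeddings[t][d] for t in range(T)) for d in range(D)
    ]
    return weights, context


# Toy usage with made-up embeddings for three code tokens.
tokens = ["return", "size", "0"]
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_kernel = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
weights, context = conv_attention(toy_embeddings, toy_kernel)
for tok, w in zip(tokens, weights):
    print(f"{tok}: {w:.3f}")
```

In a summarization model, weights like these indicate which input tokens each predicted piece of the method name attends to, which is what the attention visualizations on this page display.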

Dataset

The dataset consists of the parsed source code of the top 11 projects. For information about its structure, see dataset.txt.

Download

Source Code

The source code for this work can be found at this GitHub repository.

Visualization of Attention

The attention visualizations for the predicted method names of the libgdx project are available as the following standalone HTML files:

[batch 1] [batch 2] [batch 3] [batch 4] [batch 5] [batch 6] [batch 7] [batch 8] [batch 9] [batch 10] [batch 11] [batch 12] [batch 13] [batch 14] [batch 15] [batch 16] [batch 17] [batch 18] [batch 19] [batch 20] [batch 21]

About

The related arXiv preprint can be found here. Please cite:

@inproceedings{allamanis2016convolutional,
  title={A Convolutional Attention Network for Extreme Summarization of Source Code},
  author={Allamanis, Miltiadis and Peng, Hao and Sutton, Charles},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2016}
}

More related work from the MAST Group, and related resources about Big Code.