Group photo

Members of CUP, June 2017. This photo demonstrates both how excited we are to be working together, and that Charles should not be allowed to take the photo next year.


Our research spans machine learning, deep learning, data mining, software engineering, and programming languages.

Here are some recent and current projects. For the whole story, see our list of publications.

(Members of CUP: Would you like your research to be listed here? Then create a web page and add the link below in the git repo!)

AI for Analysing Large Data Sets

Data analysis is a long process that includes data cleaning, visualizing, applying a machine learning method, evaluating the performance of an algorithm, and monitoring it over time. We are building new tools based on machine learning and artificial intelligence to help people carry out each step of the process much faster.

  • AIDA: An artificial intelligence for practical data analytics.
  • MIST: Mining Interesting Stuff. Probabilistic machine learning for finding commonly occurring patterns in large data sets.
  • Word Storms: Multiples of Word Clouds for Visual Comparison of Documents. WWW 2014 paper.

AI for Software

Software development is difficult and expensive. Billions of lines of code are now available online, and they contain huge amounts of implicit information on how to write software that is easy to read and easy to debug. We are building new methods that use machine learning to identify patterns that capture this information and use it to help software developers.

  • MAST: An overview of our work on AI and deep learning to help software development

Deep Learning

We also carry out fundamental research in how to build probabilistic and deep learning methods that reflect the structure of complex problems.

  • VEEGAN: Better training for Generative Adversarial Networks (GANs) that avoids some of the pathologies of GAN training. NIPS 2017 paper.
  • SemVec: Deep learning for continuous representations of symbolic expressions. Towards combining symbolic and continuous AI.

Natural Language Processing

We also develop machine learning models, recently using deep learning, to bring novel and robust capabilities to language processing systems.

  • DDD: Deep Dungeons and Dragons is a model for a new narrative learning task. DDD aims to uncover latent predictive ties between characters and their actions in stories, when both characters and actions are described in complex natural language text.

Other Applications


At the moment, the best way to see our publications is this list of publications.


We have a weekly reading group at 4pm on Fridays. Everyone is welcome to attend. More about this is available on the CUP reading group page.

We also have "special interest groups" for subsets of members who are interested in a specific topic. All CUP SIGs are required to have catchy acronyms:

  • MAST: Machine learning for software engineering and programming languages.
  • AIDA: Artificial Intelligence for Data Analytics


Former members

  • Mingjun Zhong (postdoctoral researcher), now lecturer (assistant professor), University of Lincoln
  • Miltiadis Allamanis, PhD (2016), now postdoc, Microsoft Research
  • Krzysztof Geras, PhD (2016), now postdoc, New York University
  • Yichuan Zhang, PhD (2015)
  • Jaroslav Fowkes, postdoctoral researcher, now researcher at Oxford University
  • Pankajan Chantirasegaran (research programmer)
  • Daniel Renshaw, MPhil (2016)