Jordan Clive

Lead Machine Learning Engineer @ Chattermill AI • Previously: Imperial College London / Funding Circle


Hey, thanks for stopping by! 👋

I am Lead Deep Learning Engineer at Chattermill, where I modify, train, and deploy deep learning models. I also work with the LAION AI (Stability AI) research group, training models on their multi-node research cluster as well as on the JUWELS and LUMI supercomputers.

I'm interested in all aspects of machine learning, particularly parameter-efficient adaptation of large language and vision models. My research interests lie in productionizing ML systems, long-context text summarization, and the data flywheel effect for building robust, scalable systems. As an engineer, I care about production-grade ML systems and about shifting compute from inference to training, which leads to faster predictions at serving time.

Previously, I completed my Master's in ML at Imperial College London, where I was advised by Marek Rei and Kris Cao of DeepMind and worked on parameter-efficient natural language generation. I completed my undergraduate degree in mathematical physics at Durham University.

I also love contributing to open-source projects and have been an active contributor to projects such as the BigScience Workshop and the GEM Benchmark.

In my spare time, I enjoy sports (mostly tennis and running), mountaineering, rock-climbing and ski-touring.

learn ⇄ imagine ⇄ build ⇄ deploy

Selected Publications

  1. Control Prefixes for Parameter-Efficient Text Generation
    Jordan Clive, Kris Cao, Marek Rei

    In Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM) at EMNLP 2022

    Prefix-tuning is a powerful lightweight technique for adapting a large pre-trained language model to a downstream application. However, it uses the same dataset-level tuned prompt for all examples in the dataset. We extend this idea and propose a dynamic method, Control Prefixes, which allows for the inclusion of conditional input-dependent information, combining the benefits of prompt tuning and controlled generation. The method incorporates attribute-level learnable representations into different layers of a pre-trained transformer, allowing for the generated text to be guided in a particular direction. We provide a systematic evaluation of the technique and apply it to five datasets from the GEM benchmark for natural language generation (NLG). Although the aim is to develop a parameter-efficient model, we show Control Prefixes can even outperform full fine-tuning methods. We present state-of-the-art results on several data-to-text datasets, including WebNLG.
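    A brief, illustrative code sketch of the control-prefix idea is included after this publication list.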
  2. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
    Teven Le Scao, Angela Fan, Christopher Akiki, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, and 375 more authors

    In arXiv (2211.05100) 2022

    Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
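
Because the BLOOM weights are openly released, the model can be loaded directly through the Hugging Face transformers library. The snippet below is a minimal usage sketch rather than anything from the paper itself; it assumes the smaller bigscience/bloom-560m checkpoint purely so the example runs on modest hardware (the full 176B-parameter model is published as bigscience/bloom).

```python
# Minimal usage sketch for the openly released BLOOM checkpoints.
# "bigscience/bloom-560m" is assumed here only to keep the example small;
# the full 176B-parameter model is published as "bigscience/bloom".
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

prompt = "BLOOM is a multilingual language model that"
inputs = tokenizer(prompt, return_tensors="pt")

# BLOOM is a decoder-only causal LM, so generation is a plain
# left-to-right continuation of the prompt.
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```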
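
For the Control Prefixes paper above (publication 1), here is a minimal PyTorch sketch of the core mechanism under stated assumptions: a shared, task-level prefix and a short attribute-dependent "control" prefix are learned as extra key/value states for every layer of a frozen language model, so only the prefix parameters are trained. The module name, prefix lengths, and tensor shapes are illustrative choices, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class ControlPrefixes(nn.Module):
    """Task-level + attribute-level learnable prefixes for a frozen LM (illustrative)."""

    def __init__(self, n_layers: int, n_heads: int, head_dim: int,
                 n_attributes: int, task_len: int = 10, ctrl_len: int = 3):
        super().__init__()
        # Shared prefix for the whole dataset (standard prefix-tuning)...
        self.task_prefix = nn.Parameter(
            torch.randn(n_layers, 2, task_len, n_heads, head_dim) * 0.02)
        # ...plus one short prefix per attribute value (the "control" part).
        self.ctrl_prefix = nn.Parameter(
            torch.randn(n_attributes, n_layers, 2, ctrl_len, n_heads, head_dim) * 0.02)

    def forward(self, attribute_ids: torch.Tensor):
        """Return per-layer (key, value) prefixes for a batch of attribute ids."""
        bsz = attribute_ids.shape[0]
        task = self.task_prefix.unsqueeze(0).expand(bsz, -1, -1, -1, -1, -1)
        ctrl = self.ctrl_prefix[attribute_ids]           # (bsz, L, 2, ctrl_len, H, D)
        prefix = torch.cat([task, ctrl], dim=3)          # join along the length axis
        # Repackage as one (key, value) pair per layer, each of shape
        # (bsz, n_heads, prefix_len, head_dim), as a transformer layer expects.
        past_key_values = []
        for layer in range(prefix.shape[1]):
            k = prefix[:, layer, 0].transpose(1, 2)
            v = prefix[:, layer, 1].transpose(1, 2)
            past_key_values.append((k, v))
        return tuple(past_key_values)


# Toy usage: a 12-layer model with 12 heads of size 64 and 5 attribute values.
prefixes = ControlPrefixes(n_layers=12, n_heads=12, head_dim=64, n_attributes=5)
attr = torch.tensor([0, 3])            # two examples carrying different attributes
pkv = prefixes(attr)
print(len(pkv), pkv[0][0].shape)       # 12 layers; keys of shape (2, 12, 13, 64)
```

In a real setup these per-layer (key, value) tuples would be handed to a frozen pre-trained model (for example via the past_key_values argument in Hugging Face transformers), with the attention mask extended to cover the prefix positions, so that gradients flow only into the prefix parameters.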
© Copyright 2023 Jordan Clive.