Jordan Clive
Lead Deep Learning Scientist @ Chattermill AI • Previously: Imperial College London / Funding Circle
Hey, thanks for stopping by! đź‘‹
I am the Lead Deep Learning Scientist at Chattermill, where I lead the design and implementation of ML pipelines to adapt, train and deploy deep learning models. Recently, I worked with the LAION AI (Stability AI) research group, training models on their multi-node research cluster as well as on the JUWELS and LUMI supercomputers.
I'm interested in all aspects of machine learning, particularly parameter-efficient adaptation of large language, vision and speech models. My research interests lie in long-context text summarization and the data flywheel effect for robust, scalable systems. As an engineer, I'm interested in production-grade ML systems.
Previously, I completed my Master's in ML at Imperial College London, where I was advised by Marek Rei and Kris Cao of DeepMind and worked on parameter-efficient natural language generation. I completed my undergraduate degree in mathematical physics at Durham University.
I also love contributing to open-source projects and have been an active contributor to the BigScience Workshop, the Massive Text Embedding Benchmark (MTEB) and the GEM benchmark.
In my spare time, I enjoy sports (mostly tennis and running), mountaineering, rock-climbing and ski-touring.
learn ⇄ imagine ⇆ build ⇆ deploy
Selected Publications
Control Prefixes for Parameter-Efficient Text Generation
EMNLP, Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM) 2022
Prefix-tuning is a powerful lightweight technique for adapting a large pre-trained language model to a downstream application. However, it uses the same dataset-level tuned prompt for all examples in the dataset. We extend this idea and propose a dynamic method, Control Prefixes, which allows for the inclusion of conditional input-dependent information, combining the benefits of prompt tuning and controlled generation. The method incorporates attribute-level learnable representations into different layers of a pre-trained transformer, allowing for the generated text to be guided in a particular direction. We provide a systematic evaluation of the technique and apply it to five datasets from the GEM benchmark for natural language generation (NLG). Although the aim is to develop a parameter-efficient model, we show Control Prefixes can even outperform full fine-tuning methods. We present state-of-the-art results on several data-to-text datasets, including WebNLG.
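A conceptual sketch of the idea described in the abstract (not the paper's implementation): a shared task-level prefix is combined with attribute-level prefixes selected per example, and the result is injected as extra key/value states at each layer of a frozen pre-trained transformer. Layer counts, dimensions and the attribute vocabulary below are illustrative placeholders.

```python
# Conceptual sketch of Control Prefixes: task-level + attribute-level learnable
# prefixes, prepended as key/value states in every layer of a frozen LM.
import torch
import torch.nn as nn

class ControlPrefixes(nn.Module):
    def __init__(self, n_layers=12, n_heads=12, head_dim=64,
                 task_prefix_len=10, attr_prefix_len=3, n_attributes=5):
        super().__init__()
        shape = (n_layers, 2, n_heads, head_dim)  # 2 = key and value
        # Shared task-level prefix (standard prefix-tuning).
        self.task_prefix = nn.Parameter(
            torch.randn(task_prefix_len, *shape) * 0.02)
        # One learnable prefix per attribute label (e.g. domain or category),
        # giving the input-dependent control described in the abstract.
        self.attr_prefix = nn.Parameter(
            torch.randn(n_attributes, attr_prefix_len, *shape) * 0.02)

    def forward(self, attribute_ids):
        # attribute_ids: (batch,) integer attribute label for each example.
        batch = attribute_ids.shape[0]
        task = self.task_prefix.unsqueeze(0).expand(batch, -1, -1, -1, -1, -1)
        attr = self.attr_prefix[attribute_ids]
        # Concatenate along the prefix-length dimension; in practice this is
        # passed to the frozen base model as past key/value states per layer.
        return torch.cat([task, attr], dim=1)
```

Only these prefix parameters would be trained while the underlying language model stays frozen, which is what makes the approach parameter-efficient.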
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
In arXiv (2211.05100) 2022
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
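Since the checkpoints are openly released, they can be used directly with the Hugging Face transformers library. A minimal sketch, assuming the checkpoints published under the bigscience organization on the Hugging Face Hub; the smaller bloom-560m variant is used here so the example runs on modest hardware (the full 176B model requires multi-GPU sharding).

```python
# Minimal sketch: greedy text generation with an open BLOOM checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # small variant; swap for larger checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The BLOOM language model was trained on"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```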