Cambridge Quantum releases NLP toolkit and library

Update: December 12, 2023

Cambridge Quantum (“CQ”) has released a toolkit and library for Quantum Natural Language Processing (QNLP). The toolkit is called lambeq, named after the mathematician and linguist Joachim Lambek.

lambeq converts sentences into quantum circuits. It is designed to accelerate the development of practical, real-world QNLP applications, including dialogue, text mining, language translation, text-to-speech, language generation and bioinformatics.

lambeq has been released as fully open-source software for the benefit of the world’s quantum computing community and the rapidly growing ecosystem of quantum computing researchers, developers and users. lambeq works seamlessly with TKET, CQ’s quantum software development platform, which is also fully open source. This gives QNLP developers access to the broadest possible range of quantum computers.

lambeq was conceived, designed and engineered by CQ’s Oxford-based quantum computing research team led by Chief Scientist Bob Coecke, with senior scientist Dimitrios Kartsaklis, Ph.D., as chief architect of the platform. lambeq, and QNLP more broadly, are the result of a research project stretching back over a decade.

lambeq enables and automates the design and deployment of NLP experiments of the compositional-distributional (DisCo) type that CQ scientists have previously described.

This means moving from syntax/grammar diagrams, which encode a text’s structure, to either (classical) tensor networks or quantum circuits implemented with TKET, ready to be optimised for machine learning tasks such as text classification. lambeq has a modular design so that users can swap components in and out of the model and have flexibility in architecture design.
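The step from grammar to tensors can be illustrated with a toy sketch in plain Python. It does not use lambeq or TKET and is not their API. It assumes a pregroup-style grammar (the formalism associated with Lambek), a hypothetical three-word lexicon, and made-up two-dimensional word vectors. The first half checks that the word types of a sentence cancel down to the sentence type "s"; the second half contracts a verb tensor with its subject and object vectors, which is the classical tensor-network reading of the same diagram.

```python
# Toy illustration only: this is NOT lambeq's API.
# Assumption: basic types "n" (noun) and "s" (sentence), with adjoints
# written "n.r" (right adjoint) and "n.l" (left adjoint).

# Hypothetical lexicon: a transitive verb expects a noun on each side
# and produces a sentence.
LEXICON = {
    "Alice": ["n"],
    "Bob": ["n"],
    "loves": ["n.r", "s", "n.l"],
}


def cancels(left, right):
    """True if two adjacent types contract: x followed by x.r, or x.l followed by x."""
    return right == left + ".r" or left == right + ".l"


def reduce_types(types):
    """Greedily cancel adjacent pairs until no pair cancels (enough for this toy)."""
    types = list(types)
    changed = True
    while changed:
        changed = False
        for i in range(len(types) - 1):
            if cancels(types[i], types[i + 1]):
                del types[i:i + 2]
                changed = True
                break
    return types


def is_grammatical(sentence):
    """A word sequence is accepted if its types reduce to the single type 's'."""
    types = [t for word in sentence.split() for t in LEXICON[word]]
    return reduce_types(types) == ["s"]


# Made-up word meanings: nouns are vectors, the verb is an order-3 tensor
# indexed as [subject][sentence][object], mirroring its type n.r, s, n.l.
VECTORS = {"Alice": [1.0, 0.0], "Bob": [0.0, 1.0]}
LOVES = [
    [[0.0, 1.0], [0.0, 0.0]],  # subject index 0
    [[0.0, 0.0], [1.0, 0.0]],  # subject index 1
]


def contract(subj, verb, obj):
    """Sum over the subject and object indices, leaving a sentence vector."""
    return [
        sum(
            subj[i] * verb[i][s][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj))
        )
        for s in range(len(verb[0]))
    ]


if __name__ == "__main__":
    print(is_grammatical("Alice loves Bob"))
    print(contract(VECTORS["Alice"], LOVES, VECTORS["Bob"]))
```

In a quantum setting the same wiring would be realised by parameterised circuits rather than explicit arrays, and the entries that are hard-coded here would be the parameters learned during training.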

lambeq removes the barriers to entry for practitioners and researchers who are focused on AI and human-machine interactions, potentially one of the most significant applications of quantum technologies.

TKET has gained a worldwide user base now measured in the hundreds of thousands. lambeq has the potential to become the most important toolkit for members of the quantum computing community who want to engage with QNLP applications, which are among the most important markets for AI. It has also recently become apparent that QNLP will be applicable to the analysis of symbol sequences that arise in genomics and proteomics.

lambeq has been released as a conventional Python repository on GitHub and is available here: https://github.com/CQCL/lambeq. The quantum circuits generated by lambeq have so far been implemented and executed on IBM quantum computers and on Honeywell Quantum Solutions’ H series devices.

The toolkit is introduced in a technical report on arXiv, available here, while a more generally accessible blog post can be found here. Technical enquiries can be directed to lambeq-support@cambridgequantum.com.