Hybrid Quantum-Classical Transformer Architecture
The whiteboard shows how a quantum-enhanced Transformer architecture is built. The model begins with classical input data (X), just like a standard Transformer. This data is mapped into a quantum state using Quantum Embedding (U_enc), where the classical values are encoded into qubit superposition and entanglement. Once embedded, the data passes through a Parameterized Quantum Circuit (PQC).
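To make the embedding step concrete, here is a minimal sketch in PennyLane (the framework is an assumption; the whiteboard does not name one). Each classical feature becomes a single-qubit rotation angle, and a chain of CNOTs entangles the qubits so the encoded values interact within one state:

```python
import pennylane as qml
import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def embed(x):
    # U_enc: encode each classical feature as a single-qubit rotation angle
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation="Y")
    # Entangle neighbouring qubits so the encoded features interact
    for w in range(n_qubits - 1):
        qml.CNOT(wires=[w, w + 1])
    return qml.state()

x = np.array([0.1, 0.5, -0.3, 0.9])  # classical input X
state = embed(x)                     # 2**4 complex amplitudes
```

Angle encoding is only one choice here; amplitude or basis encoding would slot into the same U_enc position.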
Here, the classical Q–K–V attention mechanism is replaced by a trainable quantum process. The circuit parameters are optimized during training, similar to learning weights in a classical neural network. After measurement, the quantum state produces a hybrid output (Y), which is fed into classical linear layers for further processing. This creates a hybrid loop:
- Quantum circuits handle representation and interaction
- Classical layers handle optimization and scaling
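A hedged sketch of this loop, assuming PennyLane with a PyTorch head (neither is specified on the whiteboard): a PQC with trainable weights stands in where the Q–K–V block would sit, its Pauli-Z expectation values are the measured output Y, and a classical linear layer consumes Y. Because the pipeline is end-to-end differentiable, one optimizer updates the quantum and classical parameters together:

```python
import pennylane as qml
import torch

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def pqc(inputs, weights):
    # Re-embed the incoming features (U_enc) ...
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    # ... then apply the trainable circuit standing in for Q-K-V attention
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # Measurement turns the quantum state into classical values Y
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

weight_shapes = {"weights": qml.StronglyEntanglingLayers.shape(n_layers, n_qubits)}
quantum_block = qml.qnn.TorchLayer(pqc, weight_shapes)

# Hybrid model: quantum block for representation, classical head for scaling
model = torch.nn.Sequential(quantum_block, torch.nn.Linear(n_qubits, 2))

x = torch.rand(8, n_qubits)   # batch of classical inputs X
y = model(x)                  # hybrid output Y after measurement + linear layer
y.pow(2).mean().backward()    # gradients reach the circuit parameters too
```

On a simulator the circuit gradients come from backpropagation; on real hardware the same training loop would typically rely on parameter-shift rules instead.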
There are two architectural directions: the PQC approach, in which parameterized quantum circuits act as trainable attention blocks as described above, and the QLA approach, which implements attention via quantum linear algebra subroutines [2][4].

References
[1] Quantum-enhanced Transformers Overview: https://arxiv.org/html/2504.03192v1
[2] PQC vs QLA Architectures: https://arxiv.org/html/2504.03192v2
[3] Survey on Quantum Transformers: https://arxiv.org/abs/2504.03192
[4] Challenges in Quantum ML: https://quantumzeitgeist.com/exploring-transformer-models-in-quantum-machine-learning-challenges-and-future-directions-for-pqc-and-qla-approaches/
[5] Quantinuum Hardware Advances: https://www.quantinuum.com/blog/our-hardware-is-now-running-quantum-transformers
[6] Yale-NVIDIA Collaboration: https://www.bqpsim.com/articles/quantum-enhanced-artificial-intelligence