Quantum Convolutional Neural Networks
for High-Energy Physics Analysis at the LHC

A Google Summer of Code 2021 Project Repository.
This project aims to demonstrate quantum machine learning's potential, specifically Quantum Convolutional Neural Network (QCNN), in HEP events classification from particle image data.
The code used in the research is wrapped as an open-source package to ease future research in this field.
Check the How to Use section to learn more about it.

Introduction

Organization
- Machine Learning for Science (ML4Sci)
Student
- Eraraya Ricardo Muten
Mentors
- Prof. Sergei V. Gleyzer, Dr. Emanuele Usai, and Raphael Koh
Project Details

Abstract

One of the challenges in High-Energy Physics (HEP) is events classification, which is to predict whether an image of particle jets belongs to events being sought after or just background signals. Classical Convolutional Neural Network (CNN) has been proven a powerful algorithm in image classification, including jets image. As quantum computers promise many advantages over classical computing, comes a question on whether quantum machine learning (QML) can give any improvement in solving the problem. This project aims to demonstrate quantum machine learning's potential, specifically Quantum Convolutional Neural Network (QCNN), in HEP events classification from image data. The code used in the research is wrapped as an open-source package to ease future research in this field.

How to Use

Package Description

This package is a TensorFlow Quantum implementation of quantum convolution and classifier with Data Re-uploading[3] ansatz. Both are wrapped as Keras layers that can easily be integrated into other Keras layers (classical and/or quantum), acting as building blocks for Quantum Convolutional Neural Networks (both hybrid and fully quantum). The model can be trained using Keras API.

Package Dependencies

Installation

git clone https://github.com/eraraya-ricardo/qcnn-hep.git
cd qcnn-hep
python -m pip install -r requirements.txt
python setup.py

For a more detail step-by-step installation, please refer to Docs and Tutorial.

Docs and Tutorial

Docs: or download here.
Tutorial: or download here.

Weekly Progress

Week 1: Looking and getting used to the dataset, train a classical ResNet[1] model as a baseline.
Week 2: Coding the graph-convolution preprocessing[2] and Quantum Conv layer with data re-uploading[3] PQC.
Week 3: Testing the first iteration of the QCNN model, coding the parallelized convolution, testing ResNet with 8x8 images.
Week 4: Tested the Kaggle platform, tested the parallelized convolution, trained QCNN with varying hyperparameters.
Week 5: Tested the classical CNN and Fully-connected NN, started to train the QCNN v0.1.0 with varying filter size & stride, coded the new ansatz for quantum convolution layer based on [4].
Week 6: Tested the new quantum convolution ansatz[4], try to combine ideas from data re-uploading circuit to the new ansatz, presented a short summary about the project at the MCQST Student Conference.
Week 7: Tested the QCNN v0.1.1 on MNIST[5] and LArTPC[4] dataset.
Week 8: Wrapped the code in the development notebooks as a Python package.
Week 9: Tested the QCNN v0.1.1 and classical CNN on the Quark-Gluon[6] dataset.
Week 10: Cleaned up the repository, README, and the docs/tutorial notebook.

Project's Datasets

Photon-Electron Electromagnetic Calorimeter (ECAL) Dataset

Averages of Photon (left) and Electron (right) image samples from the dataset.

The dataset contains images from two types of particles: photons (0) and electrons (1) captured by the ECAL detector.

Each pixel corresponds to a detector cell.
The intensity of the pixel corresponds to how much energy is measured in that cell.
In total, there are 498,000 samples, equally distributed between the two classes.
The size of the images are 32x32.

If you are interested on using the datast for your study, contact me and I can try to connect you to the people at ML4Sci who have the dataset.

Liquid Argon Time Projection Chamber (LArTPC) Dataset[4]

An image sample for each class from LArTPC dataset.

The dataset contains images of simulated particle activities (μ+, e−, p+, π+, π0, γ) in a LArTPC detector. This dataset is prepared by the authors for study in [4].

The images have a resolution of 480 x 600 pixels, where each pixel in the x-axis represents a single wire and each pixel in the y-axis represents a sampling time tick.
Colors in the images represent the sizes of the ionization energy loss along the particle trajectories when measured by LArTPC’s wire planes.
In total, there are 100 samples for each class.
Each particle’s momentum is set such that the mean range of the particle is about 2 meters, so the classification is not sensitive to the image size.
In this study, the images are scaled to 30x30, prepared by the original authors of the dataset.
You can check reference [4] for more details of the dataset.

The dataset can be obtained from the original authors of [4] upon reasonable request.

Quark-Gluon Dataset[6]

Averages of Gluon (left) and Quark (right) image samples of the track channel from the subdataset of 10k samples.

Cropped to 40 x 40.

The dataset contains images of simulated quark and gluon jets. The image has three channels, the first channel is the reconstructed tracks of the jet, the second channel is the images captured by the electromagnetic calorimeter (ECAL) detector, and the third channel is the images captured by the hadronic calorimeter (HCAL) detector.

The images have a resolution of 125 x 125 pixels (for every channel).
Since the original size of 125 x 125 pixels is too large for quantum computing simulation, we cropped the images into certain size. For now, we limit the current size to 40 x 40 pixels.
In this study, we focus mostly on the tracks channel.
You can check reference [6] for more details of the dataset.

If you are interested on using the datast for your study, contact me and I can try to connect you to the people at ML4Sci who have the dataset.

MNIST Dataset[5]

An image sample for each class from MNIST dataset.

The dataset contains images of grayscale (8 bit) handwritten digits, 28 x 28 pixels, has a training set of 60,000 examples, and a test set of 10,000 examples.

It can be obtained from [5].

Research

Hardware and Platform

The whole project is run on Google Colab with GPU vary between V100, P100, or T4. The runtime listed in the Research section might not be too accurate as the GPU used vary between runs. The main benchmarking metric is the Test AUC.

Results

Early Testing

Test on ECAL Dataset

Notebook Version Name	Notes	Num. Trainable Params	Test AUC	Runtime (secs per epoch)
ResNet v2	Whole samples with 15% for test samples, 200 epochs, 128 batch size, classical preprocessing: MinMax scaling -> subtract mean, optimizer: Adam(learning_rate=lr_schedule)	295,074	0.80	-
QCNN v0.1.0 (data re-uploading circuit)	Whole samples with 15% for test samples, 10 epochs, 128 batch size, 1 qubits, 1 layers, filter size = [3, 3], stride = [1, 1], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling, optimizer: Adam(learning_rate=lr_schedule)	190	0.730	(about 1.5 hours/epoch)
ResNet v2	Whole samples with 15% for test samples, 200 epochs, 128 batch size, classical preprocessing: crop to 8x8 -> MinMax scaling -> subtract mean, optimizer: Adam(learning_rate=lr_schedule)	295,074	0.63 (overfit, train AUC = 0.80)	-
QCNN v0.2.0 (data re-uploading circuit)	Whole samples with 15% for test samples, 10 epochs, 128 batch size, 2 qubits, 2 layers, filter size = [2, 2], stride = [2, 1], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling, optimizer: Adam(learning_rate=lr_schedule)	194	0.68	-
Classical CNN	Whole samples with 15% for test samples, 10 epochs, 128 batch size, filter size = [3, 3], stride = [1, 1], num. of filters = [2, 1], conv activation = [relu, relu], use_bias = [True, True], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling, optimizer: Adam(learning_rate=lr_schedule)	193	0.738	13
Classical Fully-connected NN	Whole samples with 15% for test samples, 10 epochs, 128 batch size, num. of nodes = [3, 2], activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling -> flatten to 64, optimizer: Adam(learning_rate=lr_schedule)	203	0.691	10

QCNN v0.1.1 (data re-uploading circuit[3])

Test on ECAL Dataset

10k samples with 15% for test samples, 200 epochs, 128 batch size, varying qubits, varying layers, filter size = [3, 3], stride = [1, 1], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling

optimizer: Adam(learning_rate=lr_schedule)

Num. Qubits	Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Runtime (secs per epoch)
1	1	190	0.689	0.636	±80
1	2	226	0.716	0.666	±165
1	3	262	0.687	0.622	±330
1	4	298	0.691	0.607	±370
2	1	226	0.687	0.661	±200
2	2	298	0.710	0.645	±420
3	1	262	0.691	0.655	±350
3	2	370	0.707	0.636	±670

Validation AUC for Varying the Number of Layers and Qubits (0.0 = not tested).

10k samples with 15% for test samples, 200 epochs, 128 batch size, varying qubits, varying layers, filter size = [2, 2], stride = [1, 1], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling

Num. Qubits	Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Runtime (secs per epoch)
1	1	338	0.650	0.623	±120

Test on LArTPC Dataset

2 classes, 160 training samples (80 per class), 40 testing samples (20 per class), 200 epochs, 16 batch size, varying qubits, varying layers, filter size = [3, 2], stride = [2, 2], followed by classical head [2] with activation [softmax], classical preprocessing: log scaling -> MinMax scaling

optimizer: RMSProp(learning_rate=0.01, rho=0.99, epsilon=1e-08)

Sample Images of e- vs μ+.

Sample Images of p+ vs μ+.

Sample Images of π+ vs μ+.

Classes	Num. Qubits	Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Train Accuracy	Test Accuracy	Runtime (secs per epoch)
e- vs μ+	1	1	130	1.0	0.977	1.0	0.925	±6
e- vs μ+	1	2	160	1.0	0.971	1.0	0.925	±14
e- vs μ+	2	2	220	1.0	0.996	1.0	0.950	±27
p+ vs μ+	1	1	130	1.0	0.980	1.0	0.950	±6
p+ vs μ+	2	2	220	1.0	0.969	1.0	0.925	±25
π+ vs μ+	1	1	130	1.0	0.928	1.0	0.850	±6
π+ vs μ+	1	2	160	1.0	0.921	1.0	0.875	±11
π+ vs μ+	1	3	190	1.0	0.863	1.0	0.825	±18
π+ vs μ+	2	1	160	1.0	0.890	1.0	0.925	±12
π+ vs μ+	2	2	220	1.0	0.977	1.0	0.950	±24
π+ vs μ+	2	3	280	1.0	0.940	1.0	0.850	±36
π+ vs μ+	3	1	190	1.0	0.896	1.0	0.875	±18
π+ vs μ+	3	2	280	1.0	0.971	1.0	0.925	±35
π+ vs μ+	3	3	370	1.0	0.954	1.0	0.850	±55

-----

Comparison of Best Models.
The model developed in this project (*) produced similar results with the QCNN and classical CNN models from [4] (†) with less number of qubits and trainable parameters. Reference [4] didn't report their AUC scores.

Classes	Model	Num. Qubits	Num. Trainable Params	Train Accuracy	Test Accuracy	Train AUC	Test AUC
e- vs μ+	QCNN-DRC* QCNN† CNN†	2 9 (classical)	220 472 498	1.0 1.0 0.9938	0.950 0.925 0.950	1.0 - -	0.996 - -
p+ vs μ+	QCNN-DRC* QCNN† CNN†	1 9 (classical)	130 472 498	1.0 1.0 0.9125	0.950 0.975 0.80	1.0 - -	0.980 - -
π+ vs μ+	QCNN-DRC* QCNN† CNN†	2 9 (classical)	220 472 498	1.0 0.9688 0.975	0.950 0.975 0.825	1.0 - -	0.977 - -

Test on Quark-Gluon Dataset

This part is still a working progress. A much higher specs computational device (more RAMs) is needed for training the model because this dataset is huge.
Early testing with small samples showed a promising results. The training accuracies and AUCs are high, indicates that the model was able to learn how to differentiate the data. Low test metrics indicates overfitting, the model failed to generalize well -> we need to train the model on larger number of samples.

2 classes, 850 training samples (425 per class), 150 testing samples (75 per class), 200 epochs, 128 batch size, varying qubits, varying layers, filter size = [3, 3], stride = [2, 1], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 40x40 -> log scaling -> MinMax scaling

optimizer: Adam(learning_rate=lr_schedule)

Averages of Gluon (left) and Quark (right) image samples of the track channel from the subdataset of 1k samples after cropping to 40 x 40.

Num. Qubits	Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Train Accuracy	Test Accuracy	Runtime (secs per epoch)
1	1	2374	0.712	0.531	0.659	0.567	128
1	2	2410	0.928	0.559	0.846	0.567	370
1	3	2446	0.992	0.571	0.956	0.560	670
2	1	2410	0.829	0.663	0.754	0.640	450

Test on MNIST Dataset

2 classes, 400 training samples (200 per class), 1000 testing samples (500 per class), 10 epochs, 32 batch size, varying qubits, varying layers, filter size = [3, 3], stride = [2, 2], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 27x27 -> pixels range [0, 1] (divide all pixels by 255)

optimizer: Adam(learning_rate=lr_schedule)

Classes	Num. Qubits	Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Train Accuracy	Test Accuracy	Runtime (secs per epoch)
0 vs 1	1	1	350	0.999	0.999	0.99	0.99	±40

-----

2 classes, 400 training samples (200 per class), 1000 testing samples (500 per class), 200 epochs, 32 batch size, varying qubits, varying layers, filter size = [3, 3], stride = [2, 2], followed by classical head [2] with activation [softmax], classical preprocessing: pixels range [0, 1] (divide all pixels by 255)

optimizer: Adam(learning_rate=0.001)

Classes	Num. Qubits	Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Train Accuracy	Test Accuracy	Runtime (secs per epoch)
0 vs 1	1	1	110	0.9999	0.9999	0.998	0.996	±41
3 vs 6	1	1	110	0.999	0.998	0.998	0.985	±55
4 vs 7	1	1	110	0.9999	0.993	0.998	0.967	±42
8 vs 9	1	1	110	0.997	0.978	0.975	0.927	±41
2 vs 5	1	1	110	0.9997	0.994	0.995	0.962	±41

QCNN v0.3.0 (circuit from [4])

Test on ECAL Dataset

10k samples with 15% for test samples, 200 epochs, 128 batch size, varying layers, filter size = [3, 2], stride = [1, 1], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> convert all pixels' value with arctan function

optimizer: Adam(learning_rate=lr_schedule)

Num. Layers	Num. Trainable Params	Train AUC	Test AUC	Runtime (secs per epoch)
1	265	0.613	0.586	±300
2	304	0.663	0.644	±570
3	343	0.647	0.630	±780
4	382	0.653	0.635	±950

Classical CNN

Test on ECAL Dataset

10k samples with 15% for test samples, 200 epochs, 128 batch size, filter size = [3, 3], stride = [1, 1], conv activation = [relu, relu], use_bias = [True, True], followed by classical head [8, 2] with activation [relu, softmax], classical preprocessing: crop to 8x8 -> standard scaling

optimizer: Adam(learning_rate=lr_schedule)

Num. of filters	Num. Trainable Params	AUC Train	AUC Test	Runtime (secs per epoch)
[2, 1]	193	0.723	0.675	±0.268
[4, 1]	231	0.735	0.693	±0.268
[6, 1]	269	0.745	0.696	±0.268
[8, 1]	307	0.746	0.700	±0.268
[4, 2]	396	0.764	0.699	±0.268
[4, 3]	561	0.784	0.687	±0.268

General Notes

lr_schedule

def lr_schedule(epoch):
    """Learning Rate Schedule

    Learning rate is scheduled to be reduced after 80, 120, 160, 180 epochs.
    Called automatically every epoch as part of callbacks during training.

    # Arguments
        epoch (int): The number of epochs

    # Returns
        lr (float32): learning rate
    """
    lr = 1e-3
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    print('Learning rate: ', lr)
    return lr

# Both `lr_scheduler` and `lr_reducer` are used for `callbacks` argument of Keras' model training API
lr_scheduler = tf.keras.callbacks.LearningRateScheduler(lr_schedule)
lr_reducer = tf.keras.callbacks.ReduceLROnPlateau(factor=np.sqrt(0.1),
                               cooldown=0,
                               patience=5,
                               min_lr=0.5e-6)

References

[1] He, J. (2016). Identity Mappings in Deep Residual Networks. In Computer Vision – ECCV 2016 (pp. 630–645). Springer International Publishing.

[2] Samuel Yen-Chi Chen, Tzu-Chieh Wei, Chao Zhang, Haiwang Yu, & Shinjae Yoo. (2021). Hybrid Quantum-Classical Graph Convolutional Network.

[3] Pérez-Salinas, A., Cervera-Lierta, A., Gil-Fuster, E., & Latorre, J. (2020). Data re-uploading for a universal quantum classifier. Quantum, 4, 226.

[4] Samuel Yen-Chi Chen, Tzu-Chieh Wei, Chao Zhang, Haiwang Yu, & Shinjae Yoo. (2020). Quantum Convolutional Neural Networks for High Energy Physics Data Analysis.

[5] LeCun Y, Cortes C. MNIST handwritten digit database 2010.

[6] Andrews, M., Alison, J., An, S., Burkle, B., Gleyzer, S., Narain, M., Paulini, M., Poczos, B., & Usai, E. (2020). End-to-end jet classification of quarks and gluons with the cms open data. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 977, 164304.

Contributing and Reporting

Contributing

If you want to contribute to the code (e.g., adding new features, provide another example of usage in Jupyter Notebook, research results), please let me know by sending a pull request with a comprehensive PR note on new things that you added and the reasoning. I won't be too strict on it as the main goal of this project is more towards "research" rather than "code development". So, as long as it is clear and good enough, I will merge the PR.

You can also open a new issue or contact me if you want to discuss things first.

Reporting

If you used the code and found any bugs/errors, or have any suggestions, critics, requests, etc., please let me know either by opening up a new issue or by contacting me.

Thank you!

Name		Name	Last commit message	Last commit date
Latest commit History 238 Commits
assets		assets
development notebooks		development notebooks
qcnn_drc		qcnn_drc
sandbox		sandbox
slides		slides
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

License

ericardomuten/qcnn-hep

Folders and files

Latest commit

History

Repository files navigation

Quantum Convolutional Neural Networksfor High-Energy Physics Analysis at the LHC

Table of (Main) Contents

Introduction

Abstract

How to Use

Package Description

Package Dependencies

Installation

Docs and Tutorial

Weekly Progress

Project's Datasets

Photon-Electron Electromagnetic Calorimeter (ECAL) Dataset

Liquid Argon Time Projection Chamber (LArTPC) Dataset[4]

Quark-Gluon Dataset[6]

MNIST Dataset[5]

Research

Hardware and Platform

Results

Early Testing

Test on ECAL Dataset

QCNN v0.1.1 (data re-uploading circuit[3])

Test on ECAL Dataset

Test on LArTPC Dataset

Test on Quark-Gluon Dataset

Test on MNIST Dataset

QCNN v0.3.0 (circuit from [4])

Test on ECAL Dataset

Classical CNN

Test on ECAL Dataset

General Notes

References

Contributing and Reporting

Contributing

Reporting

About

Topics

Resources

License

Stars

Watchers

Forks

Languages

Quantum Convolutional Neural Networks
for High-Energy Physics Analysis at the LHC