EA Selects NN Architecture

Timeframe: May 2022
Tools: Jupyter Notebook, Python, PyTorch, Scikit-Learn Digits dataset
00. Overview

Implementing an Evolutionary Algorithm for Neural Architecture Search.

Within the field of computational intelligence, evolutionary algorithms and neural networks are two frameworks that aim to advance the goal of optimization. Evolutionary algorithms emulate processes found in nature, while neural networks approximate the structure of the human brain to identify patterns. This project combines these frameworks by using an evolutionary algorithm to select the architecture of a neural network. The resulting architecture is compared against two baseline models: a fully connected neural network and a convolutional neural network. The Scikit-Learn Digits dataset is used to train and evaluate the models, with the goal of identifying the best-performing algorithm on the test set.

Problem Statement
This project focuses on three subgoals:

1. Implementing a convolutional neural network.
2. Implementing an evolutionary algorithm for neural architecture search.
3. Evaluating the performance of the resulting algorithm.
01. Methodology

Convolutional Neural Network.

A convolutional neural network (CNN) was implemented in PyTorch with the following general architecture:
1. A 2D convolutional layer followed by an activation function and pooling.
2. Flattening of the output.
3. A linear layer with another activation function.
4. A final linear layer followed by a log softmax function.
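The four steps above can be sketched in PyTorch roughly as follows. The specific layer sizes (8 filters, a 3x3 kernel) are assumptions for illustration, not the project's exact configuration; the input shape matches the 8x8 Digits images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DigitsCNN(nn.Module):
    """CNN sketch for 8x8 Digits images (layer sizes are hypothetical)."""
    def __init__(self, n_hidden=50):
        super().__init__()
        # Step 1: 2D convolution; padding=1, stride=1 keeps the 8x8 spatial size
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1, stride=1)
        self.pool = nn.MaxPool2d(2)          # pooling: 8x8 -> 4x4
        self.fc1 = nn.Linear(8 * 4 * 4, n_hidden)
        self.fc2 = nn.Linear(n_hidden, 10)   # 10 digit classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv(x)))      # conv -> activation -> pooling
        x = torch.flatten(x, 1)                  # step 2: flatten
        x = F.relu(self.fc1(x))                  # step 3: linear + activation
        return F.log_softmax(self.fc2(x), dim=1) # step 4: linear + log softmax
```

The log softmax output pairs with a negative log-likelihood loss (`nn.NLLLoss`) during training.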

Hyperparameter Choices
The CNN architecture allows the following hyperparameter configurations:

2D Convolutional Layer:
Filter and kernel settings: (padding = 1, stride = 1) or (padding = 2, stride = 1)
Activation Function:
ReLU, sigmoid, tanh, softplus, or ELU
Pooling:
Filter size and type: max pooling or average pooling
Linear Layer:
Neurons: 10 to 100 in multiples of 10
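One way to represent this search space for the evolutionary algorithm is a fixed-length genome sampled from the options above. The dictionary encoding below is a sketch, not the project's actual representation; the option names are hypothetical.

```python
import random

# Hypothetical encoding of the hyperparameter choices listed above
SEARCH_SPACE = {
    "conv":       [(1, 1), (2, 1)],                        # (padding, stride)
    "activation": ["relu", "sigmoid", "tanh", "softplus", "elu"],
    "pooling":    ["max", "avg"],
    "neurons":    list(range(10, 101, 10)),                # 10..100 in steps of 10
}

def random_genome(rng=random):
    """Sample one architecture configuration from the search space."""
    return {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}
```

Each genome then maps one-to-one onto a concrete CNN configuration, which the evolutionary algorithm can train and score.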
Evolutionary Algorithm
The evolutionary algorithm performs neural architecture search with the following components:

Crossover Operator:
A one-point crossover recombines parts of parent configurations.
Survival Selection:
Elitist selection copies the best-performing individuals to the next generation.
Mutation Operators:
A probabilistic bit flip for binary values and random integer generation for integer values
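The three operators above can be sketched over list-encoded genomes as follows. This is a minimal illustration assuming lower fitness is better (consistent with the best score of 0.029 reported later); probabilities and ranges are placeholders.

```python
import random

def one_point_crossover(parent_a, parent_b, rng=random):
    """Recombine two equal-length genomes at a single cut point."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def mutate(genome, p_mutate=0.1, int_range=(0, 9), rng=random):
    """Bit-flip binary genes with probability p_mutate; redraw integer genes."""
    mutated = []
    for gene in genome:
        if rng.random() < p_mutate:
            if gene in (0, 1):
                mutated.append(1 - gene)                 # bit flip
            else:
                mutated.append(rng.randint(*int_range))  # random integer
        else:
            mutated.append(gene)
    return mutated

def elitist_selection(population, fitnesses, n_elite):
    """Copy the n_elite best individuals (lowest fitness = best, assumed)."""
    ranked = sorted(zip(fitnesses, population), key=lambda pair: pair[0])
    return [individual for _, individual in ranked[:n_elite]]
```

In a full run, elites pass to the next generation unchanged while the rest of the population is refilled with mutated crossover offspring.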
Models
Three models were implemented and trained on the Scikit-Learn Digits dataset:

1. A fully connected neural network (FCNN).
2. A convolutional neural network (CNN).
3. A CNN optimized through the evolutionary algorithm (EA-CNN).
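All three models train on the same data. Loading and splitting the Digits dataset might look like the sketch below; the 80/20 split ratio and normalization are assumptions, not the project's documented settings.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Load the 1,797 grayscale 8x8 digit images
digits = load_digits()
X = digits.images.astype(np.float32) / 16.0  # pixel values range 0..16
X = X[:, None, :, :]                         # add a channel dim for the CNNs
y = digits.target

# Held-out test set for the final comparison (split ratio assumed)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```

The FCNN would flatten each 8x8 image to a 64-dimensional vector, while the CNNs consume the `(1, 8, 8)` tensors directly.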
02. Results

Results & Discussion.

The EA-CNN achieved a best fitness score of 0.029 after 25 generations.
This project explored the integration of evolutionary algorithms with neural networks for architecture optimization. While the EA-CNN optimized efficiently in early generations, its final performance lagged behind that of the standard CNN and FCNN. Future work could address computational constraints by increasing the number of generations and training epochs, or by incorporating parallel processing to explore more configurations.

03. Report

Report PDF