https://www.kaggle.com/code/mertcanay/aygazgoruntuislemebootcamp
This project implements a Convolutional Neural Network (CNN) to classify images of 10 animal species. The goal was to build and evaluate a model that can effectively identify different animal classes from images. While the project successfully delivered a functional model, the final results indicate room for improvement in both model design and data handling.
- Source: Animals with Attributes 2
- Classes: 10 animal species (e.g., collie, dolphin, elephant, polar bear).
- Image Preprocessing: All images were resized to 128×128 pixels.
- Train/Test Split: 70% training, 30% testing (a loading-and-splitting sketch follows below).
- Augmentation: Basic techniques like rotation and blurring were applied. Advanced augmentation (e.g., flipping, brightness adjustment) was not implemented due to time constraints.
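For reference, a minimal loading-and-splitting sketch along these lines is shown below. The `DATA_DIR` path, the folder-per-class layout, and the random seed are assumptions for illustration, not taken from the notebook.

```python
# Hypothetical data-loading sketch; DATA_DIR and the folder-per-class layout
# are illustrative assumptions. Adjust to the actual dataset location.
import os

import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

DATA_DIR = "JPEGImages"  # hypothetical root folder with one subfolder per class
IMG_SIZE = 128           # images are resized to 128x128 as described above

images, labels = [], []
class_names = sorted(os.listdir(DATA_DIR))
for label, cls in enumerate(class_names):
    cls_dir = os.path.join(DATA_DIR, cls)
    for fname in os.listdir(cls_dir):
        img = cv2.imread(os.path.join(cls_dir, fname))
        if img is None:
            continue  # skip unreadable files
        images.append(cv2.resize(img, (IMG_SIZE, IMG_SIZE)))
        labels.append(label)

X = np.array(images, dtype="float32") / 255.0           # scale pixels to [0, 1]
y = to_categorical(labels, num_classes=len(class_names))

# 70/30 split, stratified so every class keeps the same train/test proportion
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=labels, random_state=42
)
```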
- Type: Convolutional Neural Network (CNN)
- Layers:
- 3 convolutional layers with 32, 64, and 128 filters, respectively.
- 3 max-pooling layers to reduce spatial dimensions.
- 1 fully connected layer with 256 neurons.
- Dropout layer with a rate of 50% to prevent overfitting.
- Final dense layer for 10-class classification.
- Loss Function: Categorical Crossentropy
- Optimizer: Adam
- Total Parameters: ~6.5 million (the full architecture is sketched below).
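Here is a minimal Keras sketch of this architecture. The 3×3 kernels, 2×2 pooling, valid padding, and ReLU activations are assumptions the summary does not spell out, but with these choices the model comes out to roughly the reported ~6.5 million parameters.

```python
# Minimal Keras sketch of the described CNN; kernel sizes, padding, and
# activations are assumptions where the summary above does not specify them.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),    # fully connected layer
    layers.Dropout(0.5),                     # 50% dropout against overfitting
    layers.Dense(10, activation="softmax"),  # 10-class output
])

model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()  # with these choices: ~6.5M parameters, matching the summary
```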
- Training Accuracy: ~61.82%
- Validation Accuracy: ~35.28%
- Validation Loss: 3.98
- Test Accuracy: ~35.28%
- Test Loss: 3.98
- Overfitting: The model fit the training data far better than it generalized, reaching ~62% accuracy on training but only ~35% on the validation and test data.
- Limited Data Augmentation: Only basic augmentation techniques were applied, leading to a lack of diversity in the training data.
- Model Complexity: With over 6.5 million parameters, the model may be too large for the given dataset.
- Dataset Size: Each class contained only 650 images. For a model with this level of complexity, the dataset size was insufficient.
- Advanced Data Augmentation: Techniques like flipping, zooming, brightness adjustment, and shifting could have provided the model with more diverse data, improving generalization.
- Simpler Model Design: Reducing the number of filters and the size of the fully connected layers could have mitigated overfitting and made the model more efficient.
- Transfer Learning: Utilizing a pretrained model (e.g., ResNet or VGG16) could have leveraged already-learned features and boosted accuracy with limited data.
- Longer Training: Increasing the number of epochs and using learning rate scheduling might have allowed the model to converge better.
- Enhance Data Augmentation:
- Add brightness, contrast, flipping, and zoom transformations (see the sketch below).
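A sketch of such a pipeline using Keras' ImageDataGenerator follows; the ranges are illustrative assumptions, not tuned values.

```python
# Illustrative augmentation sketch; all parameter values are assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=20,            # rotation, as already used in the project
    horizontal_flip=True,         # flipping
    brightness_range=(0.8, 1.2),  # brightness adjustment
    zoom_range=0.15,              # zoom
    width_shift_range=0.1,        # horizontal shifting
    height_shift_range=0.1,       # vertical shifting
)

# Contrast jitter is not built into ImageDataGenerator; a preprocessing layer
# such as tf.keras.layers.RandomContrast(0.2) inside the model can cover it.

# Feed augmented batches directly into training, e.g.:
# model.fit(augmenter.flow(X_train, y_train, batch_size=32), epochs=30)
```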
- Reduce Model Complexity:
- Use fewer filters in convolutional layers (e.g., 16, 32, 64).
- Reduce the number of dense layer neurons.
- Use Pretrained Models:
- Fine-tune a model like ResNet50 or VGG16 on the current dataset (a sketch follows).
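Below is a hedged fine-tuning sketch with ResNet50. Freezing the backbone first and attaching a small classification head are common defaults, not choices from the original notebook.

```python
# Transfer-learning sketch; the head design and freezing strategy are assumptions.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(128, 128, 3))
base.trainable = False  # phase 1: train only the new head on frozen features

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # 10 animal classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Note: ResNet50 expects inputs preprocessed with
# tf.keras.applications.resnet50.preprocess_input (raw 0-255 pixels),
# not the [0, 1] scaling used for the custom CNN.

# Phase 2 (optional): unfreeze the top layers of `base` and retrain with a
# much lower learning rate to fine-tune the pretrained features.
```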
- Extend Training:
- Increase the number of epochs and monitor learning using callbacks like `ReduceLROnPlateau` (see the sketch below).
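A sketch of such a training run with learning-rate scheduling and early stopping is shown here; the epoch count and patience values are illustrative assumptions.

```python
# Extended-training sketch; epoch count and patience values are assumptions.
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

callbacks = [
    # Halve the learning rate when validation loss stops improving
    ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                      patience=3, min_lr=1e-6),
    # Stop once validation loss plateaus for good, keeping the best weights
    EarlyStopping(monitor="val_loss", patience=8,
                  restore_best_weights=True),
]

history = model.fit(
    X_train, y_train,
    validation_split=0.2,  # hold out part of the training set for monitoring
    epochs=60,             # more epochs than before; callbacks cap wasted work
    batch_size=32,
    callbacks=callbacks,
)
```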
- Improved Generalization: Data augmentation and a simpler architecture would help the model perform better on unseen data.
- Faster Training: A smaller model would require less computational power and train faster.
- Higher Accuracy: Pretrained models would provide a significant performance boost with minimal effort.
This project provided valuable insights into CNN model design and its challenges with limited data. While the model achieved a modest accuracy, there is considerable room for improvement through better data handling, model design, and advanced techniques like transfer learning.