Integration of few-shot learning methods to improve image classification performance with small datasets

Document Type: Original Article

Authors

Computer Engineering Department, Yazd University, Yazd, Iran

Abstract

Despite the significant advances in artificial intelligence methods in recent years, these methods still require large amounts of data to train. To address this need, a machine learning paradigm called few-shot learning has been proposed. One approach in this field is prototypical networks, which combine metric learning and meta-learning: given only a small number of samples of each new class, the classifier attempts to generalize to those classes. This study proposes a modified form of prototypical networks for the few-shot classification problem. First, to improve the performance of prototypical networks, the Mahalanobis distance is used instead of the Euclidean distance to measure the distance between samples. This improves classification accuracy on the Omniglot and miniImageNet datasets, where the proposed network achieves 99.1% and 68.5% accuracy, respectively. We then introduce a general approach that automatically improves the architecture of convolutional neural networks using a genetic algorithm; here, this approach is applied specifically to the Omniglot dataset, starting from the original architecture of the prototypical network. Finally, replacing the original architecture with the evolved one improves the network's accuracy to 99.5%.
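To make the distance swap concrete, the following is a minimal NumPy sketch of prototypical classification with a Mahalanobis metric. The embedding vectors, the shared covariance estimated from the support set, and the regularization constant are illustrative assumptions; the abstract does not state how the covariance is estimated in the proposed network. With cov set to the identity matrix, the sketch reduces to a standard Euclidean prototypical network.

import numpy as np

def prototypes(support_emb, support_labels, n_classes):
    # Class prototype = mean of the embedded support samples of that class.
    return np.stack([support_emb[support_labels == k].mean(axis=0)
                     for k in range(n_classes)])

def mahalanobis_logits(query_emb, protos, cov):
    # Negative squared Mahalanobis distance to each prototype, used as
    # logits; a softmax over classes then gives p(y = k | x).
    cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))  # regularized inverse
    diffs = query_emb[:, None, :] - protos[None, :, :]          # shape (q, k, d)
    d2 = np.einsum('qkd,de,qke->qk', diffs, cov_inv, diffs)     # squared distances
    return -d2

# Toy 5-way, 5-shot episode with 8-dimensional embeddings.
rng = np.random.default_rng(0)
emb_s = rng.normal(size=(25, 8))           # 5 support samples per class
lab_s = np.repeat(np.arange(5), 5)
emb_q = rng.normal(size=(10, 8))           # 10 query samples
cov = np.cov(emb_s, rowvar=False)          # covariance assumed shared across classes
pred = mahalanobis_logits(emb_q, prototypes(emb_s, lab_s, 5), cov).argmax(axis=1)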
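The architecture-search step can be sketched in the same spirit as a plain genetic algorithm over encoder configurations. Everything below is assumed for illustration: the genome (filter counts for the four convolutional blocks of the common prototypical-network encoder), the placeholder fitness function (in the study itself this would be episodic validation accuracy on Omniglot, not this toy score), and the truncation-selection, one-point-crossover, and mutation operators.

import random

CHOICES = [32, 64, 96, 128]          # candidate filter counts per conv block

def random_genome():
    # One gene per convolutional block of the encoder.
    return [random.choice(CHOICES) for _ in range(4)]

def fitness(genome):
    # Placeholder: the real objective would train the encoder described by
    # `genome` on few-shot episodes and return validation accuracy.
    return -sum((f - 64) ** 2 for f in genome)

def crossover(a, b):
    cut = random.randrange(1, len(a))    # one-point crossover
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.1):
    return [random.choice(CHOICES) if random.random() < rate else f
            for f in genome]

def evolve(pop_size=20, generations=30):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()   # best-found filter configuration under the toy fitness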

Keywords

