Integration of Few-Shot Learning Methods to Improve Image Classification Performance with Small Datasets

Article Type: Research Article

Authors

1 M.Sc. Student, Department of Computer Engineering, Yazd University, Yazd, Iran

2 Associate Professor, Department of Computer Engineering, Yazd University, Yazd, Iran

Abstract

In this study, a modified form of the prototypical networks approach is proposed for solving the few-shot classification problem. In these networks, the classifier attempts to generalize to new classes given only a small number of examples of each. In the proposed approach, the Mahalanobis distance is used instead of the Euclidean distance of the reference approach to measure the distance between samples. The network thereby learns a metric space in which classification can be performed by computing distances to a prototype representation of each class. In addition, a five-layer neural network architecture with 5×5 filters is used in place of the four-block architecture introduced in the reference prototypical networks approach. These changes improved classification performance on the Omniglot and miniImageNet datasets: the proposed network achieved accuracies of 99.1% and 68.5% on these two datasets, respectively, surpassing the accuracy of the original prototypical networks. The results show that some simple design decisions can yield considerable improvements over recent approaches in this area that rely on complex architectural choices and meta-learning.
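As an illustrative sketch (not the paper's implementation), the classification rule of prototypical networks with a Mahalanobis metric can be written in NumPy as follows. The 2-D toy embeddings, the pooled within-class covariance estimate, and the regularisation term are all assumptions of this example; in the paper, the embeddings would instead come from the trained convolutional encoder.

```python
import numpy as np

def prototypes(support, labels, n_classes):
    """Class prototype = mean embedding of that class's support examples."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def mahalanobis_logits(queries, protos, cov):
    """Negative squared Mahalanobis distance from each query to each prototype."""
    inv = np.linalg.inv(cov)
    diffs = queries[:, None, :] - protos[None, :, :]      # shape (Q, C, D)
    d2 = np.einsum("qcd,de,qce->qc", diffs, inv, diffs)   # squared distances
    return -d2                                            # larger logit = closer

# Toy episode: 2 classes, 2-D "embeddings", 3 support points per class.
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0.0, 0.1, (3, 2)), rng.normal(1.0, 0.1, (3, 2))])
labels = np.array([0, 0, 0, 1, 1, 1])

protos = prototypes(support, labels, 2)
# Pooled within-class covariance, regularised so it stays invertible.
cov = np.cov((support - protos[labels]).T) + 1e-3 * np.eye(2)

queries = np.array([[0.0, 0.0], [1.0, 1.0]])
pred = mahalanobis_logits(queries, protos, cov).argmax(axis=1)
```

Setting the covariance to the identity recovers the squared Euclidean distance of the original prototypical networks, so this formulation contains the reference approach as a special case.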

Article Title [English]

Integration of few-shot learning methods to improve image classification performance with small datasets

Authors [English]

  • Ali Bashiri 1
  • Ali Mohammad Latif 2
1 Computer Engineering Department, Yazd University, Yazd, Iran
2 Computer Engineering Department, Yazd University, Yazd, Iran
Abstract [English]

Despite the significant advances in artificial intelligence methods in recent years, these methods still require large amounts of data for training. To address this need, a machine learning paradigm called few-shot learning has been proposed. One method in this field is the prototypical networks approach, which combines metric learning and meta-learning. In these networks, the classifier attempts to generalize to new classes given only a small number of examples of each. In this study, a modified form of prototypical networks is proposed to solve the few-shot classification problem. First, to improve the performance of prototypical networks, the Mahalanobis distance was used instead of the Euclidean distance to measure the distance between samples. This improved classification performance on the Omniglot and miniImageNet datasets: the proposed network achieved 99.1% and 68.5% accuracy on these two datasets, respectively. Next, a general approach is introduced that can automatically improve the architecture of convolutional neural networks using a genetic algorithm. In this research, this approach was applied specifically to the Omniglot dataset, starting from the original architecture of prototypical networks. Finally, replacing the original architecture of the prototypical network with the architecture found by this search further improved accuracy, reaching 99.5%.
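The genetic search over convolutional architectures described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual procedure: the genome encoding (one filter count per convolutional block), the truncation selection scheme, and the toy fitness function standing in for "train the candidate network and return its validation accuracy" are all hypothetical.

```python
import random

CHOICES = [32, 64, 96, 128]   # filter counts a gene may take (assumed)
GENOME_LEN = 4                # one gene per convolutional block (assumed)

def random_genome(rng):
    return [rng.choice(CHOICES) for _ in range(GENOME_LEN)]

def fitness(genome):
    # Placeholder for "train the network, return validation accuracy".
    # Toy objective: prefer wider later layers.
    return sum(f * (i + 1) for i, f in enumerate(genome))

def crossover(a, b, rng):
    cut = rng.randrange(1, GENOME_LEN)        # one-point crossover
    return a[:cut] + b[cut:]

def mutate(genome, rng, p=0.2):
    return [rng.choice(CHOICES) if rng.random() < p else g for g in genome]

def evolve(generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness)

best = evolve()
```

In the actual search, each fitness evaluation would involve training the candidate network on Omniglot episodes, which makes the evaluation budget (population size times generations) the dominant cost of this approach.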

کلیدواژه‌ها [English]

  • Classification
  • Few-shot learning
  • Meta-learning
  • Metric learning