[1] A. M. Turing. Computing machinery and intelligence. Mind, 59(236):433–460, 1950.
[2] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition, 248–255, 2009.
[3] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 1097–1105, 2012.
[4] M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning. MIT Press, 2018.
[5] S. Kadam and V. Vaidya. Review and analysis of zero, one and few shot learning approaches. Intelligent Systems Design and Applications, 100–112, 2018.
[6] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[7] O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra, et al. Matching networks for one shot learning. Advances in Neural Information Processing Systems, 3630–3638, 2016.
[8] C. Finn, P. Abbeel, and S. Levine. Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning, 1126–1135, 2017.
[9] A. Bellet, A. Habrard, and M. Sebban. A survey on metric learning for feature vectors and structured data. arXiv preprint, 2013.
[10] B. Kulis. Metric learning: A survey. Foundations and Trends in Machine Learning, 5(4):287–364, 2012.
[11] J. Goldberger, G. E. Hinton, S. T. Roweis, and R. Salakhutdinov. Neighbourhood components analysis. Advances in Neural Information Processing Systems, 513–520, 2004.
[12] J. Snell, K. Swersky, and R. S. Zemel. Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems, 4077–4087, 2017.
[13] C. Finn, P. Abbeel, and S. Levine. Model-agnostic meta-learning for fast adaptation of deep networks. Proceedings of the 34th International Conference on Machine Learning, 1126–1135, 2017.
[14] K. Lee, S. Maji, A. Ravichandran, and S. Soatto. Meta-learning with differentiable convex optimization. IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[15] K. Guu, T. B. Hashimoto, Y. Oren, and P. Liang. Generating sentences by editing prototypes. Transactions of the Association for Computational Linguistics, 6:437–450, 2018.
[16] E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, and A. M. Bronstein. Baby steps towards few-shot learning with multiple semantics. arXiv:1906.01905 [cs.CV], 2019.
[17] A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman divergences. Journal of Machine Learning Research, 6:1705–1749, 2005.
[18] D. Goldberg and J. Holland. Genetic algorithms and machine learning. Machine Learning, 3:95–99, 1988.
[19] O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra, et al. Matching networks for one shot learning. Advances in Neural Information Processing Systems, 3630–3638, 2016.
[20] S. Ravi and H. Larochelle. Optimization as a model for few-shot learning. International Conference on Learning Representations, 2017.
[21] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. IEEE Conference on Computer Vision and Pattern Recognition, 1–9, 2015.
[22] C. Finn, P. Abbeel, and S. Levine. Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning, 2017.
[23] H. Edwards and A. Storkey. Towards a neural statistician. International Conference on Learning Representations, 2017.
[24] L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 2579–2605, 2008.
[25] A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, 2009.