طبقه‌بندی کننده دومرحله‌ای مبتنی بر نمایش تنک و کاربرد آن در تشخیص سرطان

نوع مقاله: مقاله کامل پژوهشی

نویسندگان

1 ﺩﺍﻧﺸﺠﻮی ﻛﺎﺭﺷﻨﺎﺳﻰ ﺍﺭﺷﺪ،ﮔﺮﻭﻩ ﻣﺨﺎﺑﺮﺍﺕ، ﺩﺍﻧﺸﻜﺪﻩ ﻣﻬﻨﺪﺳﻰ ﺑﺮﻕ ﻭ ﻛﺎﻣﭙﻴﻮﺗﺮ، ﺩﺍﻧﺸﮕﺎﻩ ﻳﺰﺩ

2 ﺍﺳﺘﺎﺩﻳﺎﺭ، ﮔﺮﻭﻩ ﻣﺨﺎﺑﺮﺍﺕ، ﺩﺍﻧﺸﻜﺪﻩ ﻣﻬﻨﺪﺳﻰ ﺑﺮﻕ ﻭ ﻛﺎﻣﭙﻴﻮﺗﺮ، ﺩﺍﻧﺸﮕﺎﻩ ﻳﺰﺩ

10.22041/ijbme.2014.13555

چکیده

ﺑﺎ ﺗﻮﺟﻪ ﺑﻪ ﻧﺘﺎﻳﺞ ﻣﻮﻓﻘﻴﺖﺁﻣﻴﺰ ﻃﺒﻘﻪﺑﻨﺪیﻛﻨﻨﺪﻩ ﻣﺒﺘﻨﻰ ﺑﺮ ﻧﻤﺎﻳﺶ ﺗﻨﮏ (SRC) ﻭ ﺧﻮﺷﻪﺑﻨﺪی ﺯﻳﺮﻓﻀﺎی ﺗﻨﮏ (SSC) ﺩﺭ ﻛﺎﺭﺑﺮﺩﻫﺎی ﻣﺨﺘﻠﻒ، ﺩﺭ ﺍﻳﻦ ﻣﻘﺎﻟﻪ ﺑﺎ ﺗﺮﻛﻴﺐ ﺍﻳﻦ ﺩﻭ ﺭﻭﺵ، ﻳﮏ ﺭﻭﺵ ﻃﺒﻘﻪﺑﻨﺪی ﺳﻠﺴﻠﻪ ﻣﺮﺍﺗﺒﻰ ﺍﺭﺍﺋﻪ ﻣﻰﺷﻮﺩ. ﺍﻳﺪﻩ ﺍﺻﻠﻰ ﺩﺭ ﺭﻭﺵﻫﺎی ﻃﺒﻘﻪﺑﻨﺪی ﻭ ﺧﻮﺷﻪﺑﻨﺪی ﻣﺒﺘﻨﻰ ﺑﺮ ﻧﻤﺎﻳﺶ ﺗﻨﮏ، ﻧﻤﺎﻳﺶ ﻫﺮ ﺩﺍﺩﻩ ﺑﻪ ﺻﻮﺭﺕ ﺗﺮﻛﻴﺐ ﺧﻄﻰ ﺗﻨﮏ ﺍﺯ ﺳﺎﻳﺮ ﺩﺍﺩﻩﻫﺎ ﺍﺳﺖ ﺑﻪ ﮔﻮﻧﻪﺍی ﻛﻪ ﺩﺍﺩﻩﻫﺎی ﻣﺸﺎﺑﻪ ﺑﺎ ﺩﺍﺩﻩ ﻣﻮﺭﺩ ﻧﻈﺮ ﺩﺭ ﺍﻳﻦ ﺗﺮﻛﻴﺐ ﺧﻄﻰ ﺑﻴﺸﺘﺮﻳﻦ ﻭﺯﻥ ﺭﺍ ﺑﻪ ﺧﻮﺩ ﺍﺧﺘﺼﺎﺹ ﺩﻫﻨﺪ. ﺩﺭ ﺭﻭﺵ ﭘﻴﺸﻨﻬﺎﺩی، ﺑﻪ ﻣﻨﻈﻮﺭ ﺩﺳﺖﻳﺎﺑﻰ ﺑﻪ ﺻﺤﺖ ﻃﺒﻘﻪﺑﻨﺪی ﺑﻴﺸﺘﺮ، ﺍﺑﺘﺪﺍ ﺩﺍﺩﻩﻫﺎی ﺁﻣﻮﺯﺷﻰ ﺑﺎ ﺍﺳﺘﻔﺎﺩﻩ ﺍﺯ ﺭﻭﺵ ﺧﻮﺷﻪﺑﻨﺪی ﺯﻳﺮﻓﻀﺎی ﺗﻨﮏ ﺑﺨﺶﺑﻨﺪی ﻣﻰﺷﻮﻧﺪ. ﺳﭙﺲ ﺑﺎ ﺍﺳﺘﻔﺎﺩﻩ ﺍﺯ ﺷﻴﻮﮤ ﺑﻜﺎﺭ ﮔﺮﻓﺘﻪ ﺷﺪﻩ ﺩﺭ ﻃﺒﻘﻪﺑﻨﺪیﻛﻨﻨﺪﻩ ﻣﺒﺘﻨﻰ ﺑﺮ ﻧﻤﺎﻳﺶ ﺗﻨﮏ، ﻃﺒﻘﻪﺑﻨﺪیﻛﻨﻨﺪﻩﺍی ﺩﻭ ﻣﺮﺣﻠﻪﺍی ﻃﺮﺍﺣﻰ ﻣﻰﺷﻮﺩ. ﺩﺭ ﻣﺮﺣﻠﺔ ﺍﻭﻝ، ﺧﻮﺷﻪﺍی ﻛﻪ ﺩﺍﺩﻩ ﻭﺭﻭﺩی ﺑﻴﺸﺘﺮﻳﻦ ﺷﺒﺎﻫﺖ ﺭﺍ ﺑﺎ ﺁﻥ ﺩﺍﺭﺩ ﺗﻌﻴﻴﻦ ﺷﺪﻩ ﻭ ﺩﺭ ﻣﺮﺣﻠﻪ ﺑﻌﺪ ﻃﺒﻘﺔ ﻣﺮﺑﻮﻃﻪ ﺑﺮﭼﺴﺐ ﺩﺍﺩﻩ) ﺗﻌﻴﻴﻦ ﻣﻰﺷﻮﺩ. ﺑﺮﺍی ﺍﺭﺯﻳﺎﺑﻰ ﺭﻭﺵ ﭘﻴﺸﻨﻬﺎﺩی ﺍﺯ ﺩﺍﺩﮔﺎﻥ ﺭﻳﺰﺁﺭﺍﻳﻪ Tumors-14 -ﻛﻪ ﺣﺎﻭی ﺍﻃﻼﻋﺎﺕ ﻣﺮﺑﻮﻁ ﺑﻪ ﻧﻮﻉ ﺳﺮﻃﺎﻥ ﻣﺨﺘﻠﻒ ﺍﺳﺖ- ﺍﺳﺘﻔﺎﺩﻩ ﺷﺪﻩ ﺍﺳﺖ. ﺍﺯ ﺟﻤﻠﻪ ﻭﻳﮋﮔﻰﻫﺎی ﺍﻳﻦ ﺩﺍﺩﮔﺎﻥ ﺗﻌﺪﺍﺩ ﺯﻳﺎﺩ ﺑﻌﺪ ﺩﺭ ﻣﻘﺎﺑﻞ ﺗﻌﺪﺍﺩ ﻛﻢ ﻧﻤﻮﻧﻪ ﺩﺭ ﺍﺳﺖ ﻛﻪ ﻋﻤﻞ ﻃﺒﻘﻪﺑﻨﺪی ﺁﻥﻫﺎ ﺭﺍ ﺑﻪ ﻣﺴﺄﻟﻪﺍی ﭼﺎﻟﺶﺑﺮﺍﻧﮕﻴﺰ ﺗﺒﺪﻳﻞ ﻣﻰﻛﻨﺪ. ﺍﺑﻌﺎﺩ ﺯﻳﺎﺩ ﺩﺍﺩﻩﻫﺎ ﻧﻪ ﺗﻨﻬﺎ ﻣﺸﻜﻼﺗﻰ ﺍﺯ ﺟﻤﻠﻪ ﻧﻔﺮﻳﻦ ﺍﺑﻌﺎﺩ ﻭ ﺑﻴﺶ ﺍﻧﻄﺒﺎﻕ ﻃﺒﻘﻪﺑﻨﺪیﻛﻨﻨﺪﻩ ﺑﻪ ﺩﺍﺩﻩﻫﺎی ﺁﻣﻮﺯﺷﻰ ﺭﺍ ﺑﻪ ﺩﻧﺒﺎﻝ ﺩﺍﺭﺩ، ﺑﻠﻜﻪ ﺑﺎﻋﺚ ﺍﻓﺰﺍﻳﺶ ﭘﻴﭽﻴﺪﮔﻰ ﻣﺤﺎﺳﺒﺎﺗﻰ ﺷﺪﻩ؛ ﺯﻣﺎﻥ ﻻﺯﻡ ﺭﺍ ﺑﺮﺍی ﺍﺟﺮﺍی ﺍﻟﮕﻮﺭﻳﺘﻢﻫﺎ ﺍﻓﺰﺍﻳﺶ ﻣﻰﺩﻫﺪ. ﺁﺯﻣﺎﻳﺶﻫﺎی ﺍﻧﺠﺎﻡ ﺷﺪﻩ ﺑﺮ ﺍﻳﻦ ﺩﺍﺩﮔﺎﻥ ﺑﺎ ﺍﺳﺘﻔﺎﺩﻩ ﺍﺯ ﺭﻭﺵ ﭘﻴﺸﻨﻬﺎﺩی ﻧﺸﺎﻥ ﻣﻰﺩﻫﺪ ﻛﻪ ﺩﺭ ﻣﻘﺎﻳﺴﻪ ﺑﺎ ﺳﺎﻳﺮ ﺭﻭﺵﻫﺎی ﻃﺒﻘﻪﺑﻨﺪی، ﺍﻳﻦ ﺭﻭﺵ ﺑﻪ ﻧﺘﺎﻳﺞ ﺑﻬﺘﺮی ﻣﻨﺠﺮ ﻣﻰﺷﻮﺩ.

کلیدواژه‌ها

موضوعات


عنوان مقاله [English]

Two Stages Sparse Representation-based Classifier and its Application for Cancer Classification

نویسندگان [English]

  • Malihe Miri 1
  • Mohammad Taghi Sadeghi 2
  • Vahid Abootalebi 2
1 M.Sc., Signal Processing Research Lab, Electrical and Computer Engineering Department, Yazd University
2 Assistant Professor, Signal processing Research Lab, Electrical and Computer Engineering Department, Yazd University
چکیده [English]

Successful outcomes of Sparse Representation-based Classifier (SRC) and Sparse Subspace Clustering (SSC) in many applications motivated us to combine these methods and propose a hierarchical classifier. The main idea behind the SRC and SSC algorithms is to represent a data using a sparse linear combination of elementary signals so that those elementary signals which are similar to the data contribute mainly in the representation. In this paper, the performance of a sparse representation based classifier is improved by pre-clustering of training samples using the SSC algorithm. A twostage SRC is then designed using the resulting clusters. A test data is classified by first determining the most similar cluster. The data label is subsequently found using the second stage classifier. The performance of the proposed method is evaluated considering cancer classification problem using the 14-Tumors microarray dataset. Due to low number of data samples per each class and high dimensionality of the data, this is a challenging problem. Curse of dimensionality, overfitting of the classifier to the training data and computational complexity are the possible related problems. Our experimental results show that the proposed method outperforms some other state of the art classifiers.

کلیدواژه‌ها [English]

  • Sparse Subspace Clustering
  • Microarray data
  • Cancer classification
  • Hierarchical classifier
  • Sparse Representation-based Classification
  • sparse representation

[1]    Donoho D., Compressed sensing; IEEE Trans.  Information Theory, 2006; 52(4): 1289–1306.

[2]    Donoho D., For Most Large Underdetermined Systems of Linear

Equations the Minimal l1 -Norm Solution Is Also the Sparsest Solution;  Comm. Pure and Applied Math., 2006; 59( 6): 797-829.

[3]    Elhamifar E., Vidal R., Sparse subspace clustering: Algorithm, theory, and applications; to appear in IEEE Trans. on Pattern Analysis and Machine Intelligence, 2012; arXiv preprint arXiv: 1203.1005.

[4]    Elhamifar E., Vidal R., Sparse subspace clustering; Proc. Int. Conf. on Computer Vision and Pattern Recognition, 2009; 2790–2797.

[5]    Liu G., Lin Z., Yu Y., Robust subspace segmentation by low-rank representation; Proc. Int. Conf. on International Conference on Machine Learning, 2010.

[6]    Ho J., Yang M.H., Lim J., Lee K.C., Kriegman D., Clustering appearances of objects under varying illumination conditions; Proc. Int. Conf. on Computer Vision and Pattern Recognition, 2003.

[7]    Vidal R., Ma Y., Sastry S., Generalized principal component analysis (gpca); IEEE Trans. on Pattern Analysis and Machine Intelligence, 2005; 27(12): 1945–1959.

[8]    Tipping M.E., Bishop C.M., Mixtures of probabilistic principal component analyzers; Neural Computation, 1999;11(2): 443–482.

[9]    Fischler M.A., Bolles R.C., Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography; Communications of the ACM, 1981; 24(6): 381–395.

[10] Von Luxburg U., A tutorial on spectral clustering; Statistics and Computing, 2007; 17.

[11] Saha B., Pham D., Phung D., Venkatesh S., Sparse Subspace Clustering via Group Sparse Coding; 2013.

[12] Wright J., Yang A.Y., Ganesh A., Sastry S.S., Ma Y., Robust  face  recognition  via  sparse  representation; IEEE Trans. on Pattern Analysis and Machine Intelligence, 2009; 31(2): 210–227.

[13] Sahu S., Panda G., Barik R., A Hybrid Method of Feature Extraction for Tumor Classification Using Microarray Gene Expression Data; International Journal of Computer Science & Informatics,2011;1: issue-1.

[14] Sharma A., Paliwal K., Cancer classification by gradient LDA technique using microarray gene expression data; Data & Knowledge Engineering, 2008; 66: 338–347.

[15] Guyon I., Weston J., Barnhill S., Gene Selection for Cancer Classification using Support Vector Machines; Machine Learning, 2002; 46: 389–422.

[16] Ramaswamy S., Tamayo P., Rifkin R., Multiclass cancer  diagnosis  using  tumor  gene  expression  signatures; Proc. National Academy of Sciences of the United States of America, 2001; 98(26): 15149–15154.

[17] Linder R., Dew D., Sudhoff H., Theegarten D., Remberger K., Poppel S. J., Wagner M., The subsequent  artificial  neural  network (SANN) approach might bring more classificatory power to ANN based DNA microarray analyses;  Bioinformatics, 2004; 20(18): 3544–3552.

[18] Zhang R., Huang G. B., Sundararajan N., Saratchandran P., Multi-category classification using an extreme learning machine for microarray gene expression cancer diagnosis; IEEE/ACM Trans. on Computational Biology and Bioinformatics, 2007;4(3): 485– 495, July.

[19] Shabgahi A. Z., Abadeh M. S., A fuzzy classification system based on memetic algorithm for cancer disease diagnosis; in Proc. 18th IEEE Iranian Conference of Biomedical Engineering (ICBME), 2011.

[20] Hang X. Wu F., Sparse Representation for Classification of Tumors Using Gene Expression Data; Journal of Biomedicine and Biotechnology, 2009.

[21] Chen S., Donoho D., Saunders M. A., Atomic decomposition by basis pursuit; SIAM journal on scientific computing, 1998; 20: 33-61.

[22] Mallat S. G., Zhifeng Z., Matching pursuit with time-frequency dictionaries; IEEE Trans. On Signal Processing, 1993; 41: 3397-3415.

[23] Pati Y. C., Rezaiifar R., Krishnaprasad P. S., orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition; 1993; rec. 27 Asilomar cnf. Signals, syst. Comput, 41-44.

[24] Yang A. Y., Zhou Z., Ganesh A., Sastry, S. S., Ma Y., Fast  l1 -Minimization Algorithms For Robust Face Recognition; IEEE Transactions on Image Processing, 2013; 99:1-1- 0.

[25] Nan Z., Jian Y., K nearest neighbor based local sparse represen-tation classifier; CCPR, IEEE, 2010; 1–5.