Academic paper
Efficient Data Clustering using Fast Choice for Number of Clusters
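- Publisher: 한국산업경영시스템학회 (Society of Korea Industrial and Systems Engineering)
- Authors: Sung-Soo Kim (김성수), Bum-Su Kang (강범수)
- Publication: 『산업경영시스템학회지』 (Journal of the Society of Korea Industrial and Systems Engineering), Vol. 41, No. 2, pp. 1-8 (8 pages in total)
- Subject classification: Economics and Business > Business Administration
- Publication date: 2018.06.30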
Abstract
The K-means algorithm is one of the most popular and widely used clustering methods because it is easy to implement and very efficient. However, it can only be used with a fixed number of clusters, because it evaluates clustering solutions by the intra-cluster distance alone. The silhouette is a useful and stable validity index for choosing a clustering solution together with its number of clusters on unsupervised data, because it considers both the intra-cluster and the inter-cluster distance. However, this index carries a high computational burden because it computes a quality measure for every data object. The objective of this paper is to propose a fast and simple speed-up method that overcomes this limitation, so that the silhouette can be used for effective large-scale data clustering.
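For reference, the standard silhouette definition is sketched below; it is assumed here to be the index the abstract refers to. The symbols a, b, s, S, C_I and d are notation introduced for this note and are not taken from the paper. For an object i assigned to cluster C_I with more than one member:

```latex
\begin{align*}
a(i) &= \frac{1}{|C_I| - 1} \sum_{j \in C_I,\, j \neq i} d(i, j), \\
b(i) &= \min_{J \neq I} \frac{1}{|C_J|} \sum_{j \in C_J} d(i, j), \\
s(i) &= \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad
S = \frac{1}{n} \sum_{i=1}^{n} s(i).
\end{align*}
```

Evaluating S requires the distance from every object to every other object, which is on the order of n^2 distance evaluations per candidate clustering. This is the per-object cost described above.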
In the first step, the proposed method calculates and saves the distances between data objects once. In the second step, this distance matrix is used to calculate the relative distance rate (Vj) of each data object j, and this rate is used to choose a suitable number of clusters without much computation time. In the third step, an efficient heuristic algorithm (group search optimization, GSO, in this paper) searches for the global optimum while saving computation, starting from good initial solutions selected probabilistically for the data clustering. Experiments and analysis using the Ruspini, Iris, Wine, and Breast cancer datasets in the UCI Machine Learning Repository show that the proposed method significantly reduces computation time compared with using the original silhouette alone. In particular, the proposed method performs much better than the previous method on larger datasets.
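The first step (compute the distances once, then reuse them) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the relative distance rate Vj and the GSO search are not defined in the abstract and are not reproduced here. The function names `pairwise_distances` and `mean_silhouette`, the Euclidean metric, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X (assumed metric)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mean_silhouette(D, labels):
    """Mean standard silhouette, using only the cached distance matrix D."""
    n = len(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return 0.0  # silhouette is undefined for a single cluster
    member = labels[:, None] == clusters[None, :]   # (n, k) membership mask
    sums = D @ member.astype(float)                 # distance totals per cluster
    sizes = member.sum(axis=0)                      # cluster sizes
    rows = np.arange(n)
    own = np.argmax(member, axis=1)                 # own-cluster column per object
    own_size = sizes[own]
    # a: mean distance to the other members of the object's own cluster
    a = sums[rows, own] / np.maximum(own_size - 1, 1)
    # b: smallest mean distance to any other cluster
    mean_other = sums / sizes
    mean_other[rows, own] = np.inf
    b = mean_other.min(axis=1)
    denom = np.maximum(a, b)
    safe = np.where(denom > 0, denom, 1.0)          # avoid division by zero
    s = np.where(denom > 0, (b - a) / safe, 0.0)
    s[own_size == 1] = 0.0                          # convention for singletons
    return float(s.mean())

# Toy data: two tight groups on a line (illustrative only).
X = np.array([[0.0], [1.0], [10.0], [11.0]])
D = pairwise_distances(X)  # computed once, reused for every candidate below

candidates = {
    "two tight groups": np.array([0, 0, 1, 1]),
    "interleaved": np.array([0, 1, 0, 1]),
}
scores = {name: mean_silhouette(D, lab) for name, lab in candidates.items()}
```

Because `D` is built once, scoring each additional candidate labeling only involves sums over cached entries, with no repeated distance computations. The sketch still evaluates the full silhouette for every candidate, which is the cost the proposed method aims to reduce.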
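Table of contents
1. Background and purpose of the study
2. Fast choice of the number of clusters and heuristic algorithm
3. Experiments and analysis
4. Conclusion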
Papers in this issue
- Optimal Allocation Model of Anti-Artillery Radar by Using ArcGIS and its Specifications
- Cash Flow Statement Preparation Using Accounts Reconciliation Method for IACF
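- 산업경영시스템학회지 (Journal of the Society of Korea Industrial and Systems Engineering), Vol. 41, No. 2: About the authors
- 산업경영시스템학회지 (Journal of the Society of Korea Industrial and Systems Engineering), Vol. 41, No. 2: Table of contents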
- Multivariate Process Capability Index Using Inverted Normal Loss Function
- An Efficient One Class Classifier Using Gaussian-based Hyper-Rectangle Generation
- The Benchmark Model of Servitization through Similar Company Cases
- Effects of Abnormal Neck Posture on Postural Stability
- Optimizing Assembly Line Balancing Problems with Soft Constraints
- Analysis of Vertical Differentiation Strategy of a Monopolistic Company under Network Externality
- An Estimation of ASL in Appraisal :
- An Appropriated Share between Revenue Expenditure and Capital Expenditure in Capital Stock Estimation for Infrastructure
- Network Betweenness Centrality and Passenger Flow Analysis of Seoul Metropolitan Subway Lines
- Analysis and Probability of Overestimation by an Imperfect Inspector with Errors of Triangular Distributions
- Developing a Decision-Making Model to Determine the Preventive Maintenance Schedule for the Leased Equipment
- Impact of Function and Design Elements of Sign on Customer Preference
- Design and Application of Two-Stage Performance Measurement System Considering Dynamic Capabilities
- Failure Analysis to Derive the Causes of Abnormal Condition of Electric Locomotive Subsystem
- Public Satisfaction Analysis of Weather Forecast Service by Using Twitter
- Efficient Data Clustering using Fast Choice for Number of Clusters
- Flow of Paper Review
- The Effect of QM Activities on the Management Results of Small and Medium sized Enterprises in South Korea