Original title: Paper: Replication Under Scalable Hashing
Translated title: Paper: Replication Under Scalable Hashing
From the abstract:
Typical algorithms for decentralized data distribution work best in a system that is fully built before it is first used; adding or removing components results in either extensive reorganization of data or load imbalance in the system. We have developed a family of decentralized algorithms, RUSH (Replication Under Scalable Hashing), that maps replicated objects to a scalable collection of storage servers or disks. RUSH algorithms distribute objects to servers according to user-specified server weighting. While all RUSH variants support addition of servers to the system, different variants have different characteristics with respect to lookup time in petabyte-scale systems, performance with mirroring (as opposed to redundancy codes), and storage server removal. All RUSH variants redistribute as few objects as possible when new servers are added or existing servers are removed, and all variants guarantee that no two replicas of a particular object are ever placed on the same server. Because there is no central directory, clients can compute data locations in parallel, allowing thousands of clients to access objects on thousands of servers simultaneously.
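The properties the abstract claims (placement computed independently by every client, user-specified server weights, replicas never sharing a server, limited data movement when membership changes) can be made concrete with a small sketch. The Python snippet below is not the RUSH algorithm from the paper; it is a toy weighted rendezvous-hashing placement shown only for illustration, and every name in it (`place_replicas`, the server ids and weights) is hypothetical.

```python
import hashlib
import math

def _score(obj_id: str, server_id: str, weight: float) -> float:
    """Deterministic pseudo-random score for an (object, server) pair.

    Every client derives the same score from the same inputs, so there is
    no central directory to consult.
    """
    digest = hashlib.sha256(f"{obj_id}:{server_id}".encode()).digest()
    # Map the first 8 hash bytes to a float strictly inside (0, 1), then
    # scale by the server weight (weighted rendezvous hashing) so heavier
    # servers win proportionally more objects.
    u = (int.from_bytes(digest[:8], "big") + 0.5) / 2.0 ** 64
    return -weight / math.log(u)

def place_replicas(obj_id: str, servers: dict, replicas: int = 3) -> list:
    """Pick `replicas` distinct servers to hold obj_id.

    Ranking the whole server set and taking the top entries guarantees the
    replicas land on different servers; adding or removing one server only
    moves objects whose top-`replicas` ranking actually changes.
    """
    ranked = sorted(servers, key=lambda s: _score(obj_id, s, servers[s]), reverse=True)
    return ranked[:replicas]

# Hypothetical cluster: four servers, one with double weight. Every client
# computes the same three distinct server names for the same object id.
servers = {"server-0": 1.0, "server-1": 1.0, "server-2": 2.0, "server-3": 1.0}
print(place_replicas("object-42", servers))
```

Because placement here is a pure function of the object id and the current server set, thousands of clients can compute locations in parallel without coordination, which is the property the abstract emphasizes; the actual RUSH variants achieve this with different lookup-time and reorganization trade-offs than this sketch.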