6 | Notion

Terminology
- skewed: a state in which data or queries pile up on particular partitions
- hot spot: a partition that takes disproportionately high load (a skewed partition)
Strategy
- random assignment
  - makes hot spots easy to avoid
  - at read time it is unknown which partition holds a key, so every partition has to be queried
- key range partitioning
  - keys are always kept sorted, applying the same idea as LSM-trees
  - hot spots are still possible, e.g. when access concentrates on one key range
- hash key partitioning
  - poor fit for range queries, since adjacent keys are scattered across partitions
  - a hot spot still occurs when many queries hit the same key
- document-based partitioning
  - indexes a field (column): IDs that share the same field value are grouped under one secondary index entry, kept local to each partition
- term-based partitioning
  - the secondary index entries for a term are gathered in one place
  - looking up an item no longer requires checking every node
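The key range and hash key strategies above can be sketched as two lookup functions. This is a minimal illustration, not how any particular database does it: the boundary values, the partition count, and the function names `range_partition` and `hash_partition` are all assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical split points dividing the key space into 4 ranges:
# [.., "g"), ["g", "n"), ["n", "t"), ["t", ..)
BOUNDARIES = ["g", "n", "t"]
NUM_PARTITIONS = 4  # assumed partition count for the hash scheme


def range_partition(key: str) -> int:
    """Key range partitioning: binary-search the sorted boundaries."""
    return bisect.bisect_right(BOUNDARIES, key)


def hash_partition(key: str) -> int:
    """Hash key partitioning: a stable hash decides the partition.

    md5 is used only because it is deterministic across processes
    (Python's built-in hash() is salted per process for strings).
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


keys = ["apple", "apricot", "avocado"]
# Adjacent keys land in the same range partition, so a range scan
# touches one partition, but a popular range becomes a hot spot.
print([range_partition(k) for k in keys])  # [0, 0, 0]
# With hashing, placement ignores key order, so a range query
# generally has to ask every partition.
print([hash_partition(k) for k in keys])
```

Neither function helps when a single key is extremely popular: both always send that key to the same partition.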
Rebalancing
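- redistributes the load placed on partitions across the nodes
- reads and writes must keep working while rebalancing is in progress
- keep data movement between nodes to a minimum
  - this is the biggest weakness of the hash mod N approach: changing N moves most keys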
- fixed number of partitions
  - item growth has to be predicted in advance, since the partition count is chosen up front
  - Riak, Elasticsearch, Couchbase, Voldemort
- dynamic partitioning
  - a partition is split in two as its items grow, and merged with a neighbor as they shrink
  - if partitions are too small, splits and merges happen often and add overhead
  - HBase, RethinkDB
- proportional to nodes
  - when a node is added, it picks existing partitions at random and splits them
  - Cassandra
- automatic or manual rebalancing
  - fully automatic rebalancing can trigger too often → performance degradation
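The weakness of hash mod N compared with a fixed number of partitions can be checked with a small simulation. This is a sketch under assumptions: 64 partitions, a cluster growing from 4 to 5 nodes, synthetic keys, and a simple stealing rule in `add_node`. None of it is taken from the systems listed above.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 64  # fixed up front; value chosen arbitrarily for this sketch


def stable_hash(key: str) -> int:
    """Deterministic hash (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def node_by_mod_n(key: str, num_nodes: int) -> int:
    """hash mod N: the node is derived directly from the node count."""
    return stable_hash(key) % num_nodes


def initial_assignment(num_nodes: int) -> dict[int, int]:
    """Fixed partitions: spread partition ids round-robin over the nodes."""
    return {p: p % num_nodes for p in range(NUM_PARTITIONS)}


def add_node(assignment: dict[int, int], new_node: int) -> dict[int, int]:
    """The new node steals whole partitions until it holds a fair share.

    Assumes nodes are numbered 0..new_node. Partitions that are not
    stolen stay exactly where they were.
    """
    target = NUM_PARTITIONS // (new_node + 1)
    counts = Counter(assignment.values())
    result = dict(assignment)
    stolen = 0
    for partition in sorted(assignment):
        if stolen == target:
            break
        owner = assignment[partition]
        if counts[owner] > target:  # only take from nodes above the fair share
            result[partition] = new_node
            counts[owner] -= 1
            stolen += 1
    return result


def node_by_fixed_partitions(key: str, assignment: dict[int, int]) -> int:
    """The key -> partition mapping never changes; only partition -> node does."""
    return assignment[stable_hash(key) % NUM_PARTITIONS]


keys = [f"user:{i}" for i in range(10_000)]

moved_mod_n = sum(
    node_by_mod_n(k, 4) != node_by_mod_n(k, 5) for k in keys
) / len(keys)

before = initial_assignment(4)
after = add_node(before, new_node=4)
moved_fixed = sum(
    node_by_fixed_partitions(k, before) != node_by_fixed_partitions(k, after)
    for k in keys
) / len(keys)

print(f"hash mod N:       {moved_mod_n:.0%} of keys moved")
print(f"fixed partitions: {moved_fixed:.0%} of keys moved")
```

With mod N a key stays put only when its hash gives the same remainder for both node counts, so most keys move. With fixed partitions only the stolen partitions move, and they all go to the new node.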
Request routing
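- contact any node → it redirects (forwards) the request to the node that owns the partition
- routing tier
  - ZooKeeper: a tool for managing partition metadata; it passes that information to the routing tier
- partition-aware client: the client is required to know which node owns each partition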
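The three routing options above differ only in who consults the partition-to-node mapping. The in-memory sketch below is an illustration under assumptions: `PartitionMap`, `Node`, and `RoutingTier` are hypothetical classes, and `PartitionMap` merely stands in for metadata that a coordination service such as ZooKeeper would hold. No real ZooKeeper API is used.

```python
import hashlib

NUM_PARTITIONS = 8  # assumed partition count


def partition_of(key: str) -> int:
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class PartitionMap:
    """Partition -> node name; stand-in for coordination-service metadata."""

    def __init__(self, node_names):
        self.assignment = {
            p: node_names[p % len(node_names)] for p in range(NUM_PARTITIONS)
        }

    def owner(self, key: str) -> str:
        return self.assignment[partition_of(key)]


class Node:
    def __init__(self, name, partition_map, peers):
        self.name = name
        self.partition_map = partition_map
        self.peers = peers  # shared dict: node name -> Node
        self.store = {}

    def get(self, key: str):
        """Option 1: any node accepts the request and forwards if needed."""
        owner = self.partition_map.owner(key)
        if owner == self.name:
            return self.store.get(key)
        return self.peers[owner].get(key)


class RoutingTier:
    """Option 2: a partition-aware layer in front of the nodes."""

    def __init__(self, partition_map, nodes):
        self.partition_map = partition_map
        self.nodes = nodes

    def get(self, key: str):
        return self.nodes[self.partition_map.owner(key)].get(key)


names = ["node-a", "node-b", "node-c"]
partition_map = PartitionMap(names)
nodes = {}
for name in names:
    nodes[name] = Node(name, partition_map, nodes)

# Store a value on the node that owns the key's partition.
nodes[partition_map.owner("user:42")].store["user:42"] = "Alice"

print(nodes["node-a"].get("user:42"))                     # option 1
router = RoutingTier(partition_map, nodes)
print(router.get("user:42"))                              # option 2
# Option 3: the client holds the map itself and connects directly.
print(nodes[partition_map.owner("user:42")].get("user:42"))
```

In all three cases the mapping has to be kept up to date when partitions are rebalanced, which is why it is kept in one authoritative place.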