MediumRating 1479

355. Design Twitter

hash-tablelinked-listdesignheap-priority-queue

在 LeetCode 上作答 ↗

解題說明

Design Twitter

題目描述：設計一個簡化版推特系統，支援以下操作：

postTweet(userId, tweetId)：使用者發推文。
getNewsFeed(userId)：回傳使用者自己及其追蹤者最新的最多 10 則推文（按時間由新到舊）。
follow(followerId, followeeId)：追蹤功能。
unfollow(followerId, followeeId)：取消追蹤。

解題思路：getNewsFeed 是此題難點——需要合併多個使用者的推文串流並取前 10 筆，這是**多路合併（K-way merge）**問題，以最小堆實作最有效率。

資料設計：

tweets：unordered_map<int, vector<pair<int,int>>>，userId → [(timestamp, tweetId), ...]，每個使用者的推文按時間遞增存放（vector 尾端是最新的）。
following：unordered_map<int, unordered_set<int>>，userId → 追蹤的人的集合。
全域時間戳 timer 記錄發推順序。

getNewsFeed 流程：

收集 userId 自己 + 所有追蹤者的推文串流。
對每個串流，將其最新推文（即 vector 尾端）放入最大堆（按 timestamp 排序）。
每次從堆頂取出最新推文加入結果，然後將同一串流的次新推文推入堆。
重複至取得 10 筆或所有串流耗盡。

堆中每個元素記錄：(timestamp, tweetId, userId, indexInVector)，方便回溯找下一筆。

C++ 解法

複雜度分析

時間複雜度：postTweet O(1)，follow/unfollow O(1) 平均（雜湊表）；getNewsFeed O(F log F + 10 log F) = O(F log F)，其中 F 為追蹤人數（加上自己共 F+1 個串流）。建堆初始化需要對每個串流各做一次 push O(F log F)，之後最多取 10 次，每次 heap 操作為 O(log F)，合計 O(F log F)。

空間複雜度：O(T + U²)，其中 T 為總推文數，U 為使用者數。tweets 儲存 O(T) 筆推文，following 最壞情況 O(U²)（每個使用者追蹤所有其他人）。getNewsFeed 的堆最多容納 F+1 個元素，為 O(U) 臨時空間。

虛擬碼

1. postTweet(userId, tweetId): append (globalTimer++, tweetId) to tweets[userId].
2. follow(followerId, followeeId): add followeeId to following[followerId] (skip self-follow).
3. unfollow(followerId, followeeId): remove followeeId from following[followerId].
4. getNewsFeed(userId):
   a. Collect all relevant users: {userId} ∪ following[userId].
   b. For each user u, push their most recent tweet (last element) as (timestamp, tweetId, u, lastIndex) into a max-heap.
   c. While feed has < 10 tweets and heap is not empty:
      - Pop (ts, tid, u, idx) from heap → add tid to feed.
      - If idx-1 >= 0, push (tweets[u][idx-1].ts, tweets[u][idx-1].tid, u, idx-1) into heap.
   d. Return feed.

其他解法

暴力合併排序 O(T log T)：對 userId 及其追蹤者的所有推文合併成一個大陣列，直接按 timestamp 排序後取前 10。實作簡單，但需掃描所有歷史推文，當推文總量 T 很大時效率低落，而堆方法只需存取每個串流的最新端。

每次 getNewsFeed 預先排序快取：在 follow/unfollow/postTweet 時標記 feed 為 dirty，在 getNewsFeed 時重新生成並快取。適合讀多寫少場景（Twitter 的實際使用模式），但快取失效策略複雜，且快取本身需要 O(U * 10) 空間。

LinkedList + 時間戳指針：每個使用者的推文用雙向鏈結串列儲存，各串流維護當前讀取指針，實現真正的 O(1) 移動到下一筆。與本解的堆方法類似，但鏈結串列在 cache 效率上不如連續的 vector。

延伸思考

若系統需要支援數百萬使用者和數十億則推文，現有的 in-memory 設計哪裡會成為瓶頸？如何設計分散式方案？
若要加入「轉推（retweet）」功能，資料結構和 getNewsFeed 邏輯需要哪些調整？
若 getNewsFeed 需要支援分頁（例如每次取 10 則，可以繼續往下取下 10 則），如何修改設計以支援有狀態的分頁查詢？

解題說明

Design Twitter

題目描述：設計一個簡化版推特系統，支援以下操作：

postTweet(userId, tweetId)：使用者發推文。
getNewsFeed(userId)：回傳使用者自己及其追蹤者最新的最多 10 則推文（按時間由新到舊）。
follow(followerId, followeeId)：追蹤功能。
unfollow(followerId, followeeId)：取消追蹤。

解題思路：getNewsFeed 是此題難點——需要合併多個使用者的推文串流並取前 10 筆，這是**多路合併（K-way merge）**問題，以最小堆實作最有效率。

資料設計：

tweets：unordered_map<int, vector<pair<int,int>>>，userId → [(timestamp, tweetId), ...]，每個使用者的推文按時間遞增存放（vector 尾端是最新的）。
following：unordered_map<int, unordered_set<int>>，userId → 追蹤的人的集合。
全域時間戳 timer 記錄發推順序。

getNewsFeed 流程：

收集 userId 自己 + 所有追蹤者的推文串流。
對每個串流，將其最新推文（即 vector 尾端）放入最大堆（按 timestamp 排序）。
每次從堆頂取出最新推文加入結果，然後將同一串流的次新推文推入堆。
重複至取得 10 筆或所有串流耗盡。

堆中每個元素記錄：(timestamp, tweetId, userId, indexInVector)，方便回溯找下一筆。

複雜度分析

虛擬碼

1. postTweet(userId, tweetId): append (globalTimer++, tweetId) to tweets[userId]. 2. follow(followerId, followeeId): add followeeId to following[followerId] (skip self-follow). 3. unfollow(followerId, followeeId): remove followeeId from following[followerId]. 4. getNewsFeed(userId): a. Collect all relevant users: {userId} ∪ following[userId]. b. For each user u, push their most recent tweet (last element) as (timestamp, tweetId, u, lastIndex) into a max-heap. c. While feed has < 10 tweets and heap is not empty: - Pop (ts, tid, u, idx) from heap → add tid to feed. - If idx-1 >= 0, push (tweets[u][idx-1].ts, tweets[u][idx-1].tid, u, idx-1) into heap. d. Return feed.

其他解法

延伸思考

若系統需要支援數百萬使用者和數十億則推文，現有的 in-memory 設計哪裡會成為瓶頸？如何設計分散式方案？
若要加入「轉推（retweet）」功能，資料結構和 getNewsFeed 邏輯需要哪些調整？
若 getNewsFeed 需要支援分頁（例如每次取 10 則，可以繼續往下取下 10 則），如何修改設計以支援有狀態的分頁查詢？