Hacker News Source
The Hacker News archiver fetches top stories from Hacker News using the official Firebase API.
Configuration
[sources.hackernews]
type = "api"
display_name = "Hacker News"
frequency = "daily"
enabled = true
top_stories = 30
include_comments = true
max_comment_depth = 3
max_comments_per_level = 10
include_article_content = true
Options
| Option | Type | Default | Description |
|---|---|---|---|
| type | string | — | Must be "api" |
| display_name | string | — | Name shown in output |
| frequency | string | "daily" | Fetch frequency hint |
| enabled | bool | true | Enable/disable this source |
| top_stories | int | 30 | Number of top stories to fetch |
| include_comments | bool | true | Whether to archive comment threads |
| max_comment_depth | int | 3 | How deep to traverse comment trees |
| max_comments_per_level | int | 10 | Max comments per nesting level |
| include_article_content | bool | true | Fetch and extract full article content |
How It Works
- Fetches the top story IDs from https://hacker-news.firebaseio.com/v0/topstories.json
- For each story (up to top_stories count), fetches the story details
- If include_article_content is true, fetches each story's linked URL and extracts readable content using readability
- If include_comments is true, recursively fetches comment threads up to the configured depth
- All fetches run concurrently using the configured max_workers
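
The steps above can be sketched roughly as follows. This is not the archiver's actual code: the function names, the `replies` and `comments` keys, the `fetch` parameter, and the `max_workers=8` default are assumptions made for illustration, and the article-extraction step is left out. The `/v0/item/<id>.json` endpoint and the `kids`, `deleted`, and `dead` fields come from the public Hacker News API.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

API = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url):
    """Default fetcher: GET a URL and decode the JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def fetch_comments(ids, depth, max_depth, per_level, fetch=fetch_json):
    """Recursively fetch comments, honoring the depth and per-level limits."""
    if depth > max_depth:
        return []
    comments = []
    # Only the first per_level IDs are requested at each level.
    for cid in ids[:per_level]:
        item = fetch(f"{API}/item/{cid}.json")
        # Missing, deleted, or dead items carry no usable text; skip them.
        if not item or item.get("deleted") or item.get("dead"):
            continue
        item["replies"] = fetch_comments(
            item.get("kids", []), depth + 1, max_depth, per_level, fetch
        )
        comments.append(item)
    return comments


def fetch_story(story_id, include_comments, max_depth, per_level, fetch=fetch_json):
    """Fetch one story and, optionally, its comment tree."""
    story = fetch(f"{API}/item/{story_id}.json")
    if story and include_comments:
        story["comments"] = fetch_comments(
            story.get("kids", []), 1, max_depth, per_level, fetch
        )
    return story


def fetch_top_stories(top_stories=30, include_comments=True, max_comment_depth=3,
                      max_comments_per_level=10, max_workers=8, fetch=fetch_json):
    """Fetch the first top_stories stories concurrently, preserving rank order."""
    ids = fetch(f"{API}/topstories.json")[:top_stories]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        stories = pool.map(
            lambda sid: fetch_story(
                sid, include_comments, max_comment_depth, max_comments_per_level, fetch
            ),
            ids,
        )
        return [s for s in stories if s]
```

Passing the fetcher as a parameter keeps the sketch testable without network access; in this version only stories are fetched in parallel, while each story's comments are walked sequentially inside its worker.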
Output
Each story produces one article file containing:
- Story title, URL, score, and author
- Extracted article content (if enabled)
- Comment thread (if enabled), rendered as nested HTML
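
To make "nested HTML" concrete, here is one possible way to render a comment tree as nested lists. The archiver's real markup is not documented here, so the `<ul>`/`<li>` structure, the `render_comments` name, and the `replies` key are assumptions; the `by` and `text` fields are those returned by the Hacker News API.

```python
from html import escape


def render_comments(comments):
    """Render a comment tree as nested <ul> lists; returns "" for no comments."""
    if not comments:
        return ""
    items = []
    for c in comments:
        author = escape(c.get("by", "unknown"))
        # The HN API returns comment text as HTML, so it is inserted as-is here.
        # A real archiver would likely sanitize it first.
        body = c.get("text", "")
        # Each reply list becomes a nested <ul> inside its parent <li>.
        replies = render_comments(c.get("replies", []))
        items.append(f"<li><b>{author}</b>: {body}{replies}</li>")
    return "<ul>" + "".join(items) + "</ul>"
```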
Tips
Tuning for Speed
If you only want headlines without full articles, set include_article_content = false. This skips fetching and extracting each story's linked page, which substantially reduces fetch time.
Comment Depth
HN comment threads can be very deep. Setting max_comment_depth = 2 and max_comments_per_level = 5 keeps output manageable while still capturing the highlights.
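
To get a feel for how these two settings interact, the sketch below computes an upper bound on the number of comments archived per story. It assumes that max_comments_per_level caps the direct replies fetched under each parent (the story or a comment); if the archiver instead caps the total per level, the bound is much smaller. The function name is hypothetical.

```python
def max_comments_per_story(depth, per_level):
    """Upper bound: per_level + per_level**2 + ... + per_level**depth."""
    return sum(per_level ** d for d in range(1, depth + 1))


print(max_comments_per_story(3, 10))  # 1110 with the defaults
print(max_comments_per_story(2, 5))   # 30 with the values suggested above
```

Under that assumption, the suggested values cut the worst case from 1110 comments per story to 30.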