Hacker News Source

The Hacker News archiver fetches top stories from Hacker News using the official Firebase API.

Configuration

[sources.hackernews]
type = "api"
display_name = "Hacker News"
frequency = "daily"
enabled = true
top_stories = 30
include_comments = true
max_comment_depth = 3
max_comments_per_level = 10
include_article_content = true

Options

Option                   Type    Default  Description
type                     string  -        Must be "api"
display_name             string  -        Name shown in output
frequency                string  "daily"  Fetch frequency hint
enabled                  bool    true     Enable/disable this source
top_stories              int     30       Number of top stories to fetch
include_comments         bool    true     Whether to archive comment threads
max_comment_depth        int     3        How deep to traverse comment trees
max_comments_per_level   int     10       Max comments per nesting level
include_article_content  bool    true     Fetch and extract full article content

How It Works

  1. Fetches the top story IDs from https://hacker-news.firebaseio.com/v0/topstories.json
  2. For each story (up to top_stories count), fetches the story details
  3. If include_article_content is true, fetches each story's linked URL and extracts readable content using readability
  4. If include_comments is true, recursively fetches comment threads up to the configured depth
  5. All fetches run concurrently using the configured max_workers
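
The steps above can be sketched roughly as follows. This is an illustration of the flow, not the archiver's actual code: the function names, the cfg dictionary, the injectable get parameter, and the max_workers default of 8 are all assumptions made for the example, and step 3 (article extraction) is left out. The two endpoints used, topstories.json and item/<id>.json, are part of the public Hacker News Firebase API.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

API = "https://hacker-news.firebaseio.com/v0"


def fetch_json(path):
    """Fetch one JSON document from the HN Firebase API."""
    with urlopen(f"{API}/{path}.json", timeout=10) as resp:
        return json.load(resp)


def fetch_comments(ids, depth, max_depth, per_level, get=fetch_json):
    """Recursively fetch comments, stopping below max_depth (hypothetical helper)."""
    if depth > max_depth:
        return []
    comments = []
    # Assumption: the per-level limit applies to each parent's replies.
    for cid in ids[:per_level]:
        item = get(f"item/{cid}")
        if not item or item.get("deleted") or item.get("dead"):
            continue
        comments.append({
            "by": item.get("by"),
            "text": item.get("text", ""),
            "replies": fetch_comments(item.get("kids", []), depth + 1,
                                      max_depth, per_level, get),
        })
    return comments


def fetch_story(story_id, cfg, get=fetch_json):
    """Fetch one story and, if enabled, its comment tree."""
    story = get(f"item/{story_id}") or {}
    result = {key: story.get(key) for key in ("title", "url", "score", "by")}
    if cfg["include_comments"]:
        result["comments"] = fetch_comments(
            story.get("kids", []), 1,
            cfg["max_comment_depth"], cfg["max_comments_per_level"], get)
    return result


def fetch_top_stories(cfg, get=fetch_json, max_workers=8):
    """Fetch the first top_stories IDs and resolve them concurrently."""
    ids = get("topstories")[: cfg["top_stories"]]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda sid: fetch_story(sid, cfg, get), ids))
```

In this sketch only the per-story work is spread across the thread pool, and comments within a story are fetched sequentially. Passing get as a parameter lets the functions be exercised against canned data without touching the network.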

Output

Each story produces one article file containing:

  • Story title, URL, score, and author
  • Extracted article content (if enabled)
  • Comment thread (if enabled), rendered as nested HTML
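
As a rough picture of what "nested HTML" could look like, the sketch below turns a comment tree into nested lists. The dictionary shape (by, text, replies) and the function name render_comments are assumptions for illustration; the archiver's real markup may differ. The Hacker News API returns comment text as HTML, so the sketch inserts it unchanged and escapes only the author name.

```python
from html import escape


def render_comments(comments):
    """Render a list of comment dicts as nested <ul> lists (hypothetical helper)."""
    if not comments:
        return ""
    items = []
    for c in comments:
        author = escape(c.get("by") or "[unknown]")
        body = c.get("text", "")  # already HTML in the HN API
        replies = render_comments(c.get("replies", []))
        items.append(f"<li><b>{author}</b>: {body}{replies}</li>")
    return "<ul>" + "".join(items) + "</ul>"


thread = [{"by": "a", "text": "hi", "replies": [{"by": "b", "text": "yo", "replies": []}]}]
print(render_comments(thread))
# <ul><li><b>a</b>: hi<ul><li><b>b</b>: yo</li></ul></li></ul>
```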

Tips

Tuning for Speed

If you only want headlines without full articles, set include_article_content = false. The archiver then skips fetching each story's linked URL and running readability extraction, which reduces fetch time substantially.

Comment Depth

HN comment threads can be very deep. Setting max_comment_depth = 2 and max_comments_per_level = 5 keeps output manageable while still capturing the highlights.
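
To get a feel for how these two settings interact, the following sketch computes an upper bound on comments per story. It assumes max_comments_per_level limits the replies fetched under each parent, which is one reading of the option's description; if the limit instead applies to a whole level, the bound is simply depth times the limit. The function name is hypothetical.

```python
def max_comments_per_story(depth, per_level):
    """Upper bound on comments fetched for one story.

    Assumes each parent contributes at most per_level replies,
    so level d holds at most per_level ** d comments.
    """
    return sum(per_level ** d for d in range(1, depth + 1))


print(max_comments_per_story(3, 10))  # defaults: 10 + 100 + 1000 = 1110
print(max_comments_per_story(2, 5))   # suggested: 5 + 25 = 30
```

Under that assumption, the suggested settings cap a 30-story run at 900 comment fetches, compared with a worst case of 33,300 for the defaults. Real threads are usually far smaller than the worst case.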