Chapter 06

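Strategies for Reading Technical Documentation

RFCs, API docs, man pages, papers, source comments: each of the five document types has its own conventions, and reading everything word by word is a beginner's detour.

6.1 Structural Differences Among the Five Document Types

The English documents programmers read every day fall roughly into five types, each with its own writing tradition, information density, and target reader. Reading them all as if they were one genre is inefficient.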

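Type             Typical sources        Density                              How to read
RFC / spec       IETF, W3C, ECMAScript  Very high (every word matters)       Word by word, against the glossary
API docs         MDN, Stripe, AWS       Medium (parameter table + examples)  Lookup style; examples first
man page         UNIX manual            Medium                               SYNOPSIS → DESCRIPTION → EXAMPLES
Paper            arXiv, ACM, USENIX     High (includes math)                 Skim in passes (see 6.4)
Source comments  Project README, godoc                                       Read top to bottom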

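The sections below take them one at a time.

6.2 RFCs: the MUST / SHOULD / MAY Keyword System

RFCs (Requests for Comments) are the "bible" of Internet protocols.

RFC writing has its own "meta-specification" (BCP 14, that is, RFC 2119 as updated by RFC 8174), which fixes the exact meaning of a handful of keywords. The keywords carry that meaning only when they appear in all capitals: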

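Keyword      Equivalents       Meaning
MUST         REQUIRED, SHALL   Absolute requirement; must be implemented
MUST NOT     SHALL NOT         Absolute prohibition
SHOULD       RECOMMENDED       Strongly recommended; deviate only with good reason
SHOULD NOT   NOT RECOMMENDED   Strongly discouraged
MAY          OPTIONAL          Optional; omitting it is still compliant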

Examples (HTTP/1.1, RFC 7230):

"A server MUST NOT send a response containing Transfer-Encoding
unless the corresponding request indicates HTTP/1.1 (or later)."

"Since the Host field-value is critical information for handling a
request, a user agent SHOULD generate Host as the first header field
following the request-line."

"A user agent MAY attempt to recover from such an error condition."

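These three sentences carry very different normative weight: violating the first is a bug; deviating from the second is permitted but not recommended; the third is entirely up to the implementer. Reading an RFC without paying attention to capitalization means missing what it actually requires.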
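Because only the uppercase forms are normative, a keyword scan has to be case-sensitive. The following is a minimal sketch of such a scan, not a full BCP 14 parser; `find_keywords` is a hypothetical helper, and the third sample sentence is made up to show that lowercase words are ignored.

```python
import re

# BCP 14 keywords, longest phrases first so "MUST NOT" is not read as "MUST".
KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT", "NOT RECOMMENDED",
    "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]
# Deliberately no re.IGNORECASE: only the all-caps form is normative.
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b")


def find_keywords(text: str) -> list[str]:
    """Return the normative keywords in text, in order of appearance."""
    # Collapse line breaks first so a wrapped "MUST\nNOT" still matches.
    return PATTERN.findall(" ".join(text.split()))


sentences = [
    "A server MUST NOT send a response containing Transfer-Encoding\n"
    "unless the corresponding request indicates HTTP/1.1 (or later).",
    "A user agent MAY attempt to recover from such an error condition.",
    "Clients may retry, and should log the failure.",  # made-up, lowercase
]
for sentence in sentences:
    print(find_keywords(sentence))
```

The three lines printed are `['MUST NOT']`, `['MAY']`, and `[]`: the lowercase "may" and "should" in the last sentence are ordinary English, not requirements.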

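Reading order for an RFC

Do not read front to back. A suggested order:

  1. Abstract: a statement of purpose in about five lines.
  2. Status of This Memo: is it an Internet Standard or merely Informational?
  3. Table of Contents: skim the overall structure.
  4. Introduction: motivation and background.
  5. Terminology: required reading; it defines every term used later.
  6. Examples (if any): the fastest way to see what the protocol looks like.
  7. Core sections, as needed; go back to Terminology whenever a term is unclear.
  8. Security Considerations: the security essentials (the part most often skipped).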

Common RFC sentence patterns

# Defining terms
"For the purposes of this document, the following terms are used:"

# Setting scope
"This specification does not define ..."
"The mechanism described herein is intended for ..."

# Formal prohibition
"Implementations MUST NOT rely on ..."

# Recommendation
"It is RECOMMENDED that implementations ..."

# Citing another RFC
"... as described in [RFC7234] Section 5.2 ..."

# Marking deprecation
"The use of this header is deprecated; see Section X for migration."

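6.3 API Docs: Examples First, Spec Second

Most modern API docs follow a structure similar to Stripe's or GitHub's:

1. Endpoint description       - one-sentence summary
2. URL                        - POST /v1/charges
3. Parameters                 - table of all parameters
4. Returns                    - what comes back
5. Errors                     - possible errors
6. Examples                   - curl, code snippets
7. Webhooks / Events (if any)

An efficient order for reading API docs: example → parameters → errors → spec. This is the opposite of how you read an RFC: API docs are reference material, and their purpose is to get your first call working quickly.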

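Keywords to recognize

# Required vs. optional, and stability
required        # must be provided
optional        # may be omitted
nullable        # may be null
deprecated      # discouraged; usually slated for removal in a future major version
beta / alpha    # experimental, may change
preview         # preview release

# Data type terms
string, integer, boolean, array, object,
timestamp (Unix epoch / ISO 8601),
enum (fixed set of values),
union (one of several types),
recursive (self-referencing structure)

# Behavior modifiers
idempotent          # repeating the call has the same effect as making it once
atomic              # all-or-nothing
eventually consistent  # replicas converge over time
strongly typed      # types are strictly enforced
backward compatible # existing clients keep working
breaking change     # incompatible change

# Limits
rate limit          # cap on requests per unit of time
quota               # cap on total usage
throttling          # slowing or rejecting requests over the limit
pagination          # results returned in pages (cursor / offset / page-based)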

Reading a Stripe API doc entry

POST /v1/charges
Creates a new charge object.

Parameters
─────────
amount       integer    REQUIRED
             A positive integer representing how much to charge in
             the smallest currency unit (e.g., 100 cents to charge
             $1.00).

currency     string     REQUIRED
             Three-letter ISO currency code, in lowercase. Must be
             a supported currency.

source       string     OPTIONAL
             A payment source to be charged. ...

Returns
───────
Returns a charge object if the charge succeeded.

Errors
──────
402  card_declined  - The card was declined.
402  insufficient_funds  - The card has insufficient funds.

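Note a few details of the English in this excerpt: descriptions drop the subject ("Creates a new charge object.", "Must be a supported currency."), and the unit of amount is pinned down with a concrete "e.g." example rather than left to guesswork.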
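The parameter descriptions in the excerpt translate almost directly into checks. Here is a minimal sketch under stated assumptions: `to_smallest_unit` and `check_charge_params` are hypothetical helpers (not part of any Stripe library), the currency is assumed to have two decimal places, and the "supported currency" rule is left out because the excerpt does not list the supported codes.

```python
from decimal import Decimal


def to_smallest_unit(amount: Decimal, exponent: int = 2) -> int:
    """Convert a decimal amount to the smallest unit (e.g. dollars to cents)."""
    scaled = amount * (10 ** exponent)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more than {exponent} decimal places")
    return int(scaled)


def check_charge_params(params: dict) -> list[str]:
    """Return the problems found; an empty list means the checks passed."""
    problems = []
    amount = params.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        problems.append("amount: required, must be a positive integer")
    currency = params.get("currency")
    if not (isinstance(currency, str) and len(currency) == 3
            and currency.isascii() and currency.isalpha()
            and currency.islower()):
        problems.append("currency: required, three lowercase letters")
    return problems


print(to_smallest_unit(Decimal("1.00")))                        # 100
print(check_charge_params({"amount": 100, "currency": "usd"}))  # []
print(check_charge_params({"amount": 1.0, "currency": "USD"}))
```

The last call reports both parameters: 1.0 is a float rather than an integer, and "USD" is not lowercase.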

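6.4 Speed-Reading Papers

A programmer reads a paper not to understand it completely but to decide whether the idea is useful. S. Keshav's classic paper "How to Read a Paper" proposes a three-pass method:

First pass: 5-10 minutes

After the first pass you should be able to answer the "five Cs":

  1. Category: what type of paper is this? (measurement, system, theory)
  2. Context: which other work does it relate to?
  3. Correctness: do the assumptions look valid?
  4. Contributions: what are the main contributions?
  5. Clarity: is it well written?

If the first pass tells you the paper is not useful to you, stop here.

Second pass: 1 hour

Third pass: 4-5 hours

If you need to reproduce the paper: put yourself in the authors' place and rebuild the argument sentence by sentence. Only on this pass do you need to look up every unfamiliar term.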

Common sentence patterns in papers

# Opening the Introduction
"In recent years, X has become increasingly important ..."
"Despite significant progress in X, the problem of Y remains ..."
"X is a fundamental problem in Y ..."

# Stating contributions
"In this paper, we present / propose / introduce ..."
"Our key insight is that ..."
"Our contributions are threefold: (1) ... (2) ... (3) ..."
"To the best of our knowledge, this is the first work to ..."

# Describing experiments
"We evaluate our approach on ..."
"Our method outperforms the baseline by X% on Y benchmark."
"Compared to prior work [REF], our approach achieves ..."

# Limitations (always read these)
"Our approach has several limitations. First, ..."
"We leave the extension to ... as future work."

# Conclusion
"In conclusion, we have shown that ..."
"This work opens up several directions for future research ..."

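6.5 Changelogs and Release Notes

Reading changelogs is something engineers do constantly: upgrading dependencies, tracking breaking changes. The Keep a Changelog convention (https://keepachangelog.com) defines six types of changes; the table below adds a seventh, Performance, which is non-standard but common: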

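Category      Meaning
Added         New features
Changed       Changes to existing functionality
Deprecated    Soon to be removed
Removed       Removed features
Fixed         Bug fixes
Security      Security fixes
Performance   Performance improvements (non-standard but common)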

What to check when upgrading, in order: Removed → Deprecated → Changed → Security → Added → Fixed. The first three decide whether you need to change your code.
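That review order can be applied mechanically to a changelog written in the Keep a Changelog layout (`### Category` headings with `- item` lines). The following is a rough sketch: `group_entries` and `in_review_order` are hypothetical helpers, and the sample changelog is invented for illustration.

```python
# Review order from the text: code-affecting categories first.
PRIORITY = ["Removed", "Deprecated", "Changed", "Security", "Added", "Fixed"]


def group_entries(changelog: str) -> dict[str, list[str]]:
    """Group '- item' lines under their '### Category' heading."""
    groups: dict[str, list[str]] = {}
    current = None
    for line in changelog.splitlines():
        line = line.strip()
        if line.startswith("### "):
            current = line[4:].strip()
            groups.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            groups[current].append(line[2:])
    return groups


def in_review_order(groups: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Sort categories by PRIORITY; unknown ones (e.g. Performance) go last."""
    def rank(name: str) -> int:
        return PRIORITY.index(name) if name in PRIORITY else len(PRIORITY)
    return sorted(groups.items(), key=lambda kv: rank(kv[0]))


# Invented sample release, for illustration only.
SAMPLE = """\
## [2.0.0] - 2024-01-15
### Added
- New `retry` option.
### Fixed
- Crash on empty input.
### Removed
- Legacy `v1` client.
### Deprecated
- `timeout_ms`; use `timeout`.
"""

for category, items in in_review_order(group_entries(SAMPLE)):
    print(category, items)
```

For the sample, the categories print as Removed, Deprecated, Added, Fixed, so the entries that may force a code change are the first thing you see.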

Key phrases in release notes

# Severity markers
🚨 BREAKING CHANGE: ...
⚠️  Deprecation notice: ...
🔒 Security fix: CVE-2024-XXXX

# Describing a change
"X is now Y"               # plain statement of the new behavior
"X has been replaced by Y" # replacement
"X is no longer supported" # support dropped
"X has been moved to Y"    # relocation

# Migration guidance
"To upgrade, run ..."
"You will need to update your config from X to Y."
"See the migration guide at ..."

# Acknowledgements
"Special thanks to @username for ..."
"This release was made possible by contributions from ..."

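The phrases above are regular enough to triage release notes automatically. This is a small sketch, assuming a hypothetical `classify` helper and invented sample lines; the rule list is a starting point, not an exhaustive vocabulary.

```python
import re

# (pattern, label) pairs checked in order; the first match wins.
RULES = [
    (re.compile(r"breaking change", re.IGNORECASE), "breaking"),
    (re.compile(r"no longer supported", re.IGNORECASE), "breaking"),
    (re.compile(r"has been replaced by", re.IGNORECASE), "replaced"),
    (re.compile(r"has been moved to", re.IGNORECASE), "moved"),
    (re.compile(r"deprecat", re.IGNORECASE), "deprecated"),
    (re.compile(r"security fix|CVE-", re.IGNORECASE), "security"),
]


def classify(line: str) -> str:
    """Return the label of the first matching rule, or 'info' if none match."""
    for pattern, label in RULES:
        if pattern.search(line):
            return label
    return "info"


# Invented release-note lines, for illustration only.
notes = [
    "BREAKING CHANGE: the `v1` client has been removed.",
    "`timeout_ms` has been replaced by `timeout`.",
    "Deprecation notice: `legacy_mode` will be removed.",
    "Security fix: CVE-2024-XXXX",
    "Special thanks to @username for the report.",
]
for note in notes:
    print(f"{classify(note):<10} {note}")
```

Lines labeled breaking, replaced, moved, or deprecated are the ones to read before upgrading; "info" lines can wait.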
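6.6 The English of Stack Traces and Error Messages

Error messages are English that programmers read every day, yet many have never organized the vocabulary systematically.

Common error vocabulary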

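Term                         Meaning                                       Where you see it
NullPointerException / NPE   Null reference was dereferenced               Java
nil pointer dereference      nil pointer was dereferenced                  Go
SegFault / SIGSEGV           Segmentation fault (invalid memory access)    C, C++
panic                        Runtime error that aborts normal execution    Go, Rust
OOM / Out of Memory          Memory exhausted                              All
Stack Overflow               Call stack exhausted (recursion too deep)     All
TLE (Time Limit Exceeded)    Ran past the time limit                       Algorithms / online judges
Race condition               Result depends on thread timing               Concurrency
Deadlock                     Threads block each other forever              Concurrency
Livelock                     Threads keep yielding without progress        Concurrency
Heisenbug                    Bug that vanishes when you observe it         Debugging
Memory leak                  Memory is never released                      All
Buffer overflow              Write past the end of a buffer                C/C++
Use after free               Memory used after being freed                 C/C++
Double free                  Same memory freed twice                       C/C++
Dangling pointer             Pointer to memory that is no longer valid     C/C++
Type mismatch                Types do not match                            Typed languages
Cannot find module           Module could not be resolved                  JS/TS
Undefined is not a function  Called something that is undefined            JS
Connection refused           Connection was refused                        Network
Connection reset             Connection was reset                          Network
Read timeout                 Read did not finish in time                   I/O
Permission denied            Insufficient permissions                      OS
No such file or directory    File or directory does not exist              OS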

Dissecting a typical stack trace

Traceback (most recent call last):
  File "app.py", line 42, in <module>
    main()
  File "app.py", line 28, in main
    user = fetch_user(user_id)
  File "/lib/api.py", line 156, in fetch_user
    return _parse(response.json())
  File "/lib/parser.py", line 89, in _parse
    return User(**data["user"])
KeyError: 'user'

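How to read it:

  1. Start at the bottom for the error type: KeyError: 'user' means the dictionary has no 'user' key.
  2. Walk the call stack from the bottom up: the error was raised in _parse while accessing data["user"].
  3. The chain that led there starts at app.py line 42 → main → fetch_user.

"most recent call last" is Python's default order: the last frame printed is where the error was raised, and the first frame printed is the entry point. Java and Go print in the opposite order by default (most recent call first). Always confirm the order before reading a trace.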

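The same bottom-up reading can be done in code with Python's standard `traceback` module. Below is a minimal sketch that reproduces a KeyError like the one above with simplified stand-in functions (the bodies are assumptions; only the call chain mirrors the example).

```python
import traceback


def _parse(data: dict) -> str:
    return data["user"]  # raises KeyError when the key is missing


def fetch_user() -> str:
    return _parse({})  # simulated response without a "user" key


def main() -> None:
    fetch_user()


try:
    main()
except KeyError as exc:
    # extract_tb lists frames oldest first, the same order Python prints.
    frames = traceback.extract_tb(exc.__traceback__)
    error_type = type(exc).__name__
    missing_key = exc.args[0]
    innermost = frames[-1]  # last frame = where the error was raised
    call_chain = [frame.name for frame in frames]

print(f"{error_type}: {missing_key!r}")  # KeyError: 'user'
print("raised in:", innermost.name)      # raised in: _parse
print(" -> ".join(call_chain[-3:]))      # main -> fetch_user -> _parse
```

Indexing with `frames[-1]` is the programmatic version of "start at the bottom": with most-recent-call-last ordering, the final frame is the one that raised.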
6.7 Source Comments: @param / @return / @throws / @deprecated / @since

Most languages have their own doc-comment style, but the tag semantics are highly consistent:

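Tag                  Meaning                         Example / note
@param               Parameter description           @param id The user ID
@returns / @return   Return value                    @returns The user object
@throws / @raises    Exceptions that may be thrown   @throws NotFoundError if user missing
@deprecated          Deprecated                      @deprecated Use newApi() instead
@since               Version introduced              @since 2.4.0
@see                 Reference                       @see RelatedClass
@example             Usage example                   Followed by a code block
@todo                To do                           @todo handle empty case
@override            Overrides the parent            Java/TS
@inheritDoc          Inherits the parent's comment   Java
@author              Author                          Gradually replaced by git blame
@version             Version                         Same as above
@link                Cross-reference                 {@link OtherClass}
@internal            Internal API                    Not recommended for external use
@beta                Experimental                    API may change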

JSDoc example

/**
 * Fetches a user by their ID, optionally including related orders.
 *
 * @param id - The user's unique identifier (UUID v4).
 * @param options - Optional fetch options.
 * @param options.includeOrders - If true, also fetches orders. Defaults to false.
 * @returns A promise that resolves to the user object.
 * @throws {NotFoundError} If no user with the given ID exists.
 * @throws {NetworkError} If the underlying HTTP call fails.
 * @example
 * const user = await fetchUser('abc-123', { includeOrders: true });
 * @since 2.4.0
 * @see {@link User}
 */
async function fetchUser(id, options = {}) { ... }

Python docstring (Google style)

def fetch_user(id: str, include_orders: bool = False) -> User:
    """Fetches a user by their ID, optionally including related orders.

    Args:
        id: The user's unique identifier (UUID v4).
        include_orders: If True, also fetches orders. Defaults to False.

    Returns:
        The user object.

    Raises:
        NotFoundError: If no user with the given ID exists.
        NetworkError: If the underlying HTTP call fails.

    Example:
        >>> user = fetch_user('abc-123', include_orders=True)

    .. versionadded:: 2.4.0
    """

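Because the section names in a Google-style docstring are fixed, the docstring can be read the way a tool reads it. The sketch below uses `inspect.getdoc` from the standard library; `docstring_sections` is a hypothetical helper that handles only the simple layout shown above, and `fetch_user` is a stub that returns nothing useful.

```python
import inspect
import re


def fetch_user(id: str, include_orders: bool = False) -> dict:
    """Fetches a user by their ID, optionally including related orders.

    Args:
        id: The user's unique identifier (UUID v4).
        include_orders: If True, also fetches orders. Defaults to False.

    Returns:
        The user object.

    Raises:
        NotFoundError: If no user with the given ID exists.
        NetworkError: If the underlying HTTP call fails.
    """
    raise NotImplementedError  # body omitted; only the docstring matters here


def docstring_sections(func) -> dict[str, list[str]]:
    """Split a Google-style docstring into {section name: [entry lines]}."""
    sections: dict[str, list[str]] = {"Summary": []}
    current = "Summary"
    # inspect.getdoc strips the common leading indentation, so section
    # headers such as "Args:" start at column 0 and entries stay indented.
    for line in (inspect.getdoc(func) or "").splitlines():
        header = re.fullmatch(r"([A-Z][A-Za-z ]*):", line)
        if header:
            current = header.group(1)
            sections[current] = []
        elif line.strip():
            sections[current].append(line.strip())
    return sections


sections = docstring_sections(fetch_user)
print(sections["Summary"][0])
print([entry.split(":")[0] for entry in sections["Raises"]])
```

The second line printed is `['NotFoundError', 'NetworkError']`: the Raises section tells a caller which exceptions to handle without reading the function body.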
6.8 Reading man Pages

UNIX man pages have a fixed structure, with conventional all-caps section headings:

$ man grep

NAME           # command name + one-line summary
   grep, egrep, fgrep - print lines matching a pattern

SYNOPSIS       # usage syntax (the most important part)
   grep [OPTIONS] PATTERN [FILE...]

DESCRIPTION    # detailed behavior
   grep searches the named input FILEs for lines matching ...

OPTIONS        # every flag
   -i, --ignore-case
       Ignore case distinctions ...

EXAMPLES       # examples
   grep -i 'error' app.log

EXIT STATUS    # exit codes
   0 if a line is selected, 1 if no lines were selected, 2 if an error occurred.

SEE ALSO       # related commands
   awk(1), sed(1), regex(7)

BUGS           # known issues
AUTHOR
COPYRIGHT

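SYNOPSIS notation: square brackets mark an optional item and an ellipsis marks a repeatable one, so [FILE...] means zero or more files; an unbracketed item such as PATTERN is required.

The number after each SEE ALSO entry is the manual section: (1) is user commands and (7) is miscellaneous (conventions and overviews).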
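In SYNOPSIS lines, brackets conventionally mean "optional" and a trailing ellipsis means "repeatable"; in SEE ALSO, the parenthesized digit is the manual section. The sketch below shows both readings applied to the grep example. `classify_synopsis` and `parse_see_also` are hypothetical helpers that cover only simple whitespace-separated tokens and single-digit sections, not nested groups or suffixed sections.

```python
import re


def classify_synopsis(synopsis: str) -> list[tuple[str, str]]:
    """Label each whitespace-separated SYNOPSIS token after the command name."""
    labels = []
    for token in synopsis.split()[1:]:
        optional = token.startswith("[") and token.endswith("]")
        core = token.strip("[]")
        repeatable = core.endswith("...")
        name = core.removesuffix("...")
        kind = "optional" if optional else "required"
        if repeatable:
            kind += ", repeatable"
        labels.append((name, kind))
    return labels


def parse_see_also(entry: str) -> tuple[str, int]:
    """Split a reference such as 'regex(7)' into (page name, section number)."""
    match = re.fullmatch(r"\s*([\w.+-]+)\((\d)\)\s*", entry)
    if match is None:
        raise ValueError(f"not a man page reference: {entry!r}")
    return match.group(1), int(match.group(2))


print(classify_synopsis("grep [OPTIONS] PATTERN [FILE...]"))
print([parse_see_also(e) for e in "awk(1), sed(1), regex(7)".split(",")])
```

The first call labels OPTIONS as optional, PATTERN as required, and FILE as optional and repeatable; the second yields `[('awk', 1), ('sed', 1), ('regex', 7)]`.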

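6.9 Practice Suggestions

  1. Read one RFC a week, starting with the protocols you use every day (HTTP, OAuth, JSON).
  2. Read one set of release notes a week, for the libraries your project depends on most (React, Postgres, Kubernetes).
  3. Read one stack trace a week: deliberately pick traces from other people's issues to practice on.
  4. Read one paper a month, using the three-pass method.
  5. When you hit an unfamiliar term, look it up right away at https://www.computerhope.com/jargon.htm or on English Wikipedia.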

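The more you read, the better you write. Reading technical documentation is the best-value training in English for programmers: it is the direct input to your daily work, and you do it every day anyway.

6.10 Chapter Summary

The next chapter switches to the output side: technical writing.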