A path segment (one of the parts of a path separated by /) in an absolute URI path can contain zero or more pchar characters, where pchar is defined as follows:
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
pct-encoded = "%" HEXDIG HEXDIG
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
So it’s basically A–Z, a–z, 0–9, -, ., _, ~, !, $, &, ', (, ), *, +, ,, ;, =, :, @, as well as % that must be followed by two hexadecimal digits. Any other character/byte needs to be encoded using the percent-encoding.
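As a rough illustration of the grammar above, here is a minimal Python sketch that checks whether a string is a valid path segment and percent-encodes arbitrary text for use as one. The names is_valid_segment and encode_segment are hypothetical helpers made up for this example; encoding non-ASCII text as UTF-8 before percent-encoding is an assumption (it is what urllib.parse.quote does by default), not something the grammar itself requires.

```python
import re
import string
from urllib.parse import quote

# Characters that may appear literally in a path segment (pchar without pct-encoded).
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
PCHAR_LITERALS = UNRESERVED + SUB_DELIMS + ":@"

# segment = *pchar, where "%" is only valid when followed by two hex digits.
SEGMENT_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})*")


def is_valid_segment(segment: str) -> bool:
    """Return True if the string matches the segment grammar as-is."""
    return SEGMENT_RE.fullmatch(segment) is not None


def encode_segment(raw: str) -> str:
    """Percent-encode everything that is not a literal pchar (UTF-8 assumed)."""
    # quote() never encodes letters, digits, "_", ".", "-" and "~";
    # the remaining pchar literals are passed explicitly via `safe`.
    return quote(raw, safe=SUB_DELIMS + ":@")
```

For example, encode_segment("a b/c") escapes both the space and the slash, so the result stays a single segment, while a string such as "~user:@x" passes through unchanged.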
Although these are 79 characters in total that can be used in a path segment literally, some user agents do encode some of these characters as well (e.g. %7E instead of ~). That’s why many use just the 62 alphanumeric characters (i.e. A–Z, a–z, 0–9) or the Base 64 Encoding with URL and Filename Safe Alphabet (i.e. A–Z, a–z, 0–9, -, _).
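To make the Base 64 option concrete, here is a small Python sketch that turns arbitrary bytes into a token using only the URL and Filename Safe Alphabet, and back again. The helper names to_url_token and from_url_token are made up for this example. Stripping the trailing = padding is a common convention assumed here, not a requirement: = is itself a valid pchar, so padded output is also legal in a path segment.

```python
import base64


def to_url_token(data: bytes) -> str:
    """Encode bytes with the URL-safe Base 64 alphabet, without '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def from_url_token(token: str) -> bytes:
    """Decode a token produced by to_url_token, restoring the padding first."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)
```

The resulting tokens contain only A–Z, a–z, 0–9, - and _, so they avoid the characters that some user agents re-encode.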