RFC3492

Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)

Punycode is a simple and efficient transfer encoding syntax designed for use with Internationalized Domain Names in Applications (IDNA). It uniquely and reversibly transforms a Unicode string into an ASCII string. ASCII characters in the Unicode string are represented literally, and non-ASCII characters are represented by ASCII characters that are allowed in host name labels (letters, digits, and hyphens). This document defines a general algorithm called Bootstring that allows a string of basic code points to uniquely represent any string of code points drawn from a larger set. Punycode is an instance of Bootstring that uses particular parameter values specified by this document, appropriate for IDNA. [STANDARDS TRACK]

RFC 3492                     IDNA Punycode                    March 2003


   when unique encodings are needed.  Second, the integer is not self-
   delimiting, so if multiple integers are concatenated the boundaries
   between them are lost.

   The generalized variable-length representation solves these two
   problems.  The digit values are still 0 through base-1, but now the
   integer is self-delimiting by means of thresholds t(j), each of which
   is in the range 0 through base-1.  Exactly one digit, the most
   significant, satisfies digit_j < t(j).  Therefore, if several
   integers are concatenated, it is easy to separate them, starting with
   the first if they are little-endian (least significant digit first),
   or starting with the last if they are big-endian (most significant
   digit first).  As before, the value is the sum over j of digit_j *
   w(j), but the weights are different:

      w(0) = 1
      w(j) = w(j-1) * (base - t(j-1)) for j > 0
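
   The recurrence can be sketched in a few lines of Python.  This is
   only an illustration: the function name weights and the sample
   inputs (base 8, thresholds 2, 3, 5, 5) are assumptions chosen for
   the sketch, not part of the specification.

```python
def weights(thresholds, base):
    """Return [w(0), w(1), ..., w(k)] for thresholds [t(0), ..., t(k-1)]."""
    w = [1]                              # w(0) = 1
    for t in thresholds:
        w.append(w[-1] * (base - t))     # w(j) = w(j-1) * (base - t(j-1))
    return w


# Illustrative inputs only.
print(weights([2, 3, 5, 5], base=8))     # [1, 6, 30, 90, 270]
```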

   For example, consider the little-endian sequence of base 8 digits
   734251...  Suppose the thresholds are 2, 3, 5, 5, 5, 5...  This
   implies that the weights are 1, 1*(8-2) = 6, 6*(8-3) = 30, 30*(8-5) =
   90, 90*(8-5) = 270, and so on.  7 is not less than 2, and 3 is not
   less than 3, but 4 is less than 5, so 4 is the last digit.  The value
   of 734 is 7*1 + 3*6 + 4*30 = 145.  The next integer is 251, with
   value 2*1 + 5*6 + 1*30 = 62.  Decoding this representation is very
   similar to decoding a conventional integer:  Start with a current
   value of N = 0 and a weight w = 1.  Fetch the next digit d and
   increase N by d * w.  If d is less than the current threshold (t)
   then stop, otherwise increase w by a factor of (base - t), update t
   for the next position, and repeat.
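
   The decoding procedure just described might look as follows in
   Python.  This is a minimal sketch, not the reference
   implementation: the names decode_one and example_t are
   hypothetical, digits are passed as a list of integers rather than
   characters, and the thresholds are supplied as a function of the
   digit position j.

```python
def decode_one(digits, pos, threshold, base):
    """Decode one little-endian integer starting at digits[pos].

    threshold(j) gives t(j) for digit position j within the integer.
    Returns (value, next_pos), where next_pos indexes the first digit
    of the following integer.
    """
    n, w = 0, 1
    for j, d in enumerate(digits[pos:]):
        t = threshold(j)
        n += d * w                   # increase N by d * w
        if d < t:                    # most significant digit reached
            return n, pos + j + 1
        w *= base - t                # scale the weight for the next digit
    raise ValueError("input ended before a terminating digit")


def example_t(j):
    """Sample thresholds 2, 3, 5, 5, 5, ... from the worked example."""
    return (2, 3)[j] if j < 2 else 5


digits = [7, 3, 4, 2, 5, 1]
first, pos = decode_one(digits, 0, example_t, base=8)
second, pos = decode_one(digits, pos, example_t, base=8)
print(first, second)                 # 145 62
```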

   Encoding this representation is similar to encoding a conventional
   integer:  If N < t then output one digit for N and stop, otherwise
   output the digit for t + ((N - t) mod (base - t)), then replace N
   with (N - t) div (base - t), update t for the next position, and
   repeat.
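
   A corresponding encoder can be sketched the same way.  The names
   encode_one and example_t are hypothetical, and the sketch assumes
   that the thresholds eventually exceed the remaining value so that
   the loop terminates (thresholds that are always 0 would never
   produce a final digit).

```python
def encode_one(n, threshold, base):
    """Encode nonnegative integer n as a little-endian list of digits.

    threshold(j) gives t(j) for digit position j.
    """
    if n < 0:
        raise ValueError("only nonnegative integers can be encoded")
    digits = []
    j = 0
    while True:
        t = threshold(j)
        if n < t:                            # final, most significant digit
            digits.append(n)
            return digits
        digits.append(t + (n - t) % (base - t))
        n = (n - t) // (base - t)
        j += 1


def example_t(j):
    """Sample thresholds 2, 3, 5, 5, 5, ... from the worked example."""
    return (2, 3)[j] if j < 2 else 5


print(encode_one(145, example_t, base=8))    # [7, 3, 4]
print(encode_one(62, example_t, base=8))     # [2, 5, 1]
```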

   For any particular set of values of t(j), there is exactly one
   generalized variable-length representation of each nonnegative
   integral value.
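
   The uniqueness property can be checked by brute force for a small
   case.  The sketch below reuses the sample thresholds 2, 3, 5 and
   base 8 from the example above (the helper names are assumptions),
   enumerates every well-formed digit sequence of up to three digits,
   and confirms that the resulting values are exactly 0, 1, 2, ...
   with no gaps and no duplicates.

```python
from itertools import product

BASE = 8
THRESHOLDS = [2, 3, 5]   # sample t(0), t(1), t(2)


def value(digits):
    """Sum of digit_j * w(j) for a little-endian digit sequence."""
    n, w = 0, 1
    for d, t in zip(digits, THRESHOLDS):
        n += d * w
        w *= BASE - t
    return n


def well_formed(digits):
    """True if only the last (most significant) digit is below its threshold."""
    *lower, last = digits
    return (all(d >= t for d, t in zip(lower, THRESHOLDS))
            and last < THRESHOLDS[len(digits) - 1])


values = sorted(
    value(ds)
    for length in range(1, len(THRESHOLDS) + 1)
    for ds in product(range(BASE), repeat=length)
    if well_formed(ds)
)

# Every value from 0 upward appears exactly once.
assert values == list(range(len(values)))
print(len(values))       # 170
```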

   Bootstring uses little-endian ordering so that the deltas can be
   separated starting with the first.  The t(j) values are defined in
   terms of the constants base, tmin, and tmax, and a state variable
   called bias:

      t(j) = base * (j + 1) - bias,
      clamped to the range tmin through tmax
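
   As a sketch, the clamped threshold can be computed as shown below.
   The constant values used here (base = 36, tmin = 1, tmax = 26, and
   an initial bias of 72) are assumptions taken from the Punycode
   parameter values given elsewhere in RFC 3492; they do not appear
   in this passage.  The function name threshold is hypothetical.

```python
# Assumed Punycode parameter values (not stated in this passage).
BASE, TMIN, TMAX = 36, 1, 26
INITIAL_BIAS = 72


def threshold(j, bias):
    """t(j) = base * (j + 1) - bias, clamped to the range tmin..tmax."""
    t = BASE * (j + 1) - bias
    return max(TMIN, min(TMAX, t))


print([threshold(j, INITIAL_BIAS) for j in range(4)])   # [1, 1, 26, 26]
```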

Costello                    Standards Track                     [Page 6]