Which technique is used in hash algorithm to produce fixed length string from variable length?

Suppose you want to hash a pair of bit strings $(m, r)$ with a hash function $H_0$ defined on bit strings, where $m$ is variable-length and $r$ is fixed-length—maybe a unique salt, or a secret key, or an unpredictable randomization. We will consider possible functions $H(m, r) := H_0(\operatorname{encode}(m, r))$ for some encoding function (although there are other possible ways to hash tuples using an iterated hash function).

First, what security properties are you hoping for when you say ‘hash’? Here are some possibilities:

  • Independence. If $r \ne r'$, then the functions $m \mapsto H(m, r)$ and $m \mapsto H(m, r')$ are (effectively) independent functions.

    Here we are modeling $H_0$ as a uniform random choice of functions with quirks like the length extension attack below; we are interested not in how $H$ behaves on different messages with the same $r$, but rather how $H$ behaves with different $r$.

    This is important for password hashing, because techniques like rainbow tables provide a batch advantage trying to find inputs $m_1, \dots, m_n$ given the outputs $h_1 = f(m_1), \dots, h_n = f(m_n)$ of the same function[1][2]. However, you should use a serious password hash like scrypt or argon2 if you're hashing passwords, and they are already defined in terms of separate password and salt parameters.

  • Collision resistance. It is hard for the adversary to find $(m, r) \ne (m', r')$ such that $H(m, r) = H(m', r')$. This is a standard property of functions like SHA-256 and BLAKE2b; if it holds for $H_0$ we might hope that it holds for $H$.

    This is important for commitments: you can prove to someone that you have a message $m$ without revealing it by picking $r$ uniformly at random and revealing $h = H(m, r)$; then when you want to reveal what the message was, you can give them $(m, r)$ and they can verify $h = H(m, r)$.

  • (Enhanced) target collision resistance. It is hard for the adversary to find a message $m$ such that, given uniform random $r$ after choosing $m$, they can find $m' \ne m$ and $r'$ with $H(m, r) = H(m', r')$.

    With mere TCR[3] (first introduced to the literature as the less-pronounceable UOWHF[4]), the adversary is restricted to $r' = r$; with eTCR[5], the adversary is allowed to choose $r'$, so eTCR is a stronger requirement.

    (e)TCR is important for randomized signature schemes like RSA-PSS and Ed25519. The first secure signature scheme in history, Rabin's, used a randomized hash. Even if $H_0$ turns out to have collisions like MD5, there are choices for $H$ that remain eTCR to the best of our knowledge. Using a TCR construction would have thwarted MD5 certificate forgery, including a particular attack used by the governments of the United States and Israel to sabotage Iran's nuclear program—but RSA, Inc., and the IEEE adopted $H_0(r \mathbin\| H_0(m))$ instead of $H_0(r \mathbin\| m)$ in the standards[6][7] for PSS signatures. (More history.)

  • Pseudorandomness. It is hard for an anyone who doesn't know $r$ to distinguish the function $m \mapsto H(m, r)$ from a uniform random function of the same shape.

    Pseudorandomness is useful for many things including deterministic key derivation and message authentication codes.

These properties aren't exhaustive, but they're likely to be relevant in a wide variety of applications.

What encoding might we use to define $H$ in terms of $H_0$, and what properties does the resulting $H$ provide? Let's assume $H_0$ is an iterated hash function, like essentially all of them, and let's assume for simplicity that $m$ an integral number of blocks long.

  1. Suffix-MAC: $H(m, r) := H_0(m \mathbin\| r)$, putting $r$ at the end.

    • provides independence
    • provides exactly as much collision resistance as $H_0$: a collision in $H$ is a collision in $H_0$ and vice versa
    • provides at most as much (e)TCR as $H_0$ provides collision resistance: finding a collision in $H_0$ is enough to break (e)TCR of $H$—so, no (e)TCR for MD5 or SHA-1!
    • provides at most as much pseudorandomness as $H_0$ provides collision resistance, for iterated Merkle–Damgård functions with no output filter like MD5 or SHA-256: finding a collision in $H_0$ is enough to distinguish $H$ from uniform random—so, no pseudorandomness at all for MD5 or SHA-1!
  2. Prefix-MAC: $H(m, r) := H_0(r \mathbin\| m)$, putting $r$ at the beginning.

    • provides independence
    • provides exactly as much collision resistance as $H_0$: a collision in $H$ is a collision in $H_0$ and vice versa
    • provides at least as much (e)TCR as $H_0$ provides collision resistance: an (e)TCR break of $H$ implies a collision in $H_0$, but finding a collision in $H_0$ does not necessarily help to find a collision in $H$—e.g., nobody has broken eTCR for MD5 in this form
    • provides no pseudorandomness for for iterated Merkle–Damgård functions with no output filter like MD5, SHA-256, and SHA-512, for which length extension attacks apply: given $H(m)$ but not $m$, it is easy to find $H(\operatorname{pad}(m) \mathbin\| m')$ for any suffix $m'$, which a uniform random function exhibits with negligible probability—so, no pseudorandomness for MD5, SHA-1, SHA-256, SHA-512!
      • Exception: If $m$ is fixed-length, then prefix-MAC provides pseudorandomness under reasonable assumptions about $H_0$ even when $H_0$ is MD5, SHA-256, or SHA-512.
      • Note that SHA-3, BLAKE2, SHA-512/256, and essentially all new designs in the past decade do not have this weakness, so $H(r \mathbin\| m)$ is pseudorandom—but they usually also provide separate keyed hashes like KMAC128 or keyed BLAKE2, which are better to use because they are effectively independent even if you happen to hash the message $r \mathbin\| m$ with SHA-3 or unkeyed BLAKE2.
  3. Envelope-MAC or sandwich-MAC: $H(m, r) := H_0(r \mathbin\| m \mathbin\| r)$, putting $r$ at the beginning and the end.

    • provides independence
    • provides exactly as much collision resistance as $H_0$
    • provides at least as much (e)TCR as $H_0$ provides collision resistance
    • provides pseudorandomness under reasonable assumptions about $H_0$

    This is a predecessor to HMAC and actually it provides about the same security as HMAC with less complexity, as long as $m$ is padded so that the $r$ keys appear in their own blocks; it was left in the dustbin of history mainly because someone noticed that it does not provide security if the trailing $r$ straddles a block boundary[8] and thereby gave it a bad reputation, but it has since been proved that with appropriate padding it is a perfectly good choice[9] (paywall-free).

  4. HMAC: $H(m, r) := H_0\bigl((r \oplus \mathrm{opad}) \mathbin\| H_0((r \oplus \mathrm{ipad}) \mathbin\| m)\bigr)$, using prefix-MAC first to compress the message and then as an output filter, with two keys derived by a simple key schedule.

    • provides independence
    • provides probably about the same collision resistance as $H_0$
    • provides at least as much (e)TCR as $H_0$ provides collision resistance
    • provides pseudorandomness under mostly reasonable assumptions about $H_0$

Which hashing technique is used for hashing?

The NTLM algorithm is used for password hashing during authentication. It is the successor of the LANMAN algorithm. NTLM was followed with NTLMv2. NTLMv2 uses an HMAC-MD5 algorithm for hashing.

What type of algorithm always gives a fixed length output?

Determinism — A hash algorithm should be deterministic, meaning that it always gives you an output of identical size regardless of the size of the input you started with. This means that if you're hashing a single sentence, the resulting output should be the same size as one you'd get when hashing an entire book.

What kind of technique is hashing technique?

Hashing is a technique or process of mapping keys, and values into the hash table by using a hash function. It is done for faster access to elements. The efficiency of mapping depends on the efficiency of the hash function used. at the index x%10 in an Array.

Which of the following function that converts any length message into a fixed length hash value that acts as an authenticator?

A one-way hash function, also known as a message digest, is a mathematical function that takes a variable-length input string and converts it into a fixed-length binary sequence that is computationally difficult to invert—that is, generate the original string from the hash.