Archive for April, 2008

DFLY: syscall frequency

Tuesday, April 29th, 2008

What does the syscall traffic look like for a “make buildkernel”?

% ktrace -di -f /home/dion/projects/syscall-hist/trace1.ktlog -t c make buildkernel
% ktrace -di -f /home/dion/projects/syscall-hist/trace2-cw.ktlog -t cw make buildkernel

Then, I used this “thing” to chew the log files and spit out a nicer form. This code is pretty loosely based on the source to kdump.

#include <sys/ktrace.h>
#include <sys/time.h>
 
#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
 
static void process_ktrace_log(FILE *logf);
 
uint64_t syscalls[512] = { 0 };
 
int main(int argc, char *argv[]) {
    FILE *logf;
 
    logf = fopen(argv[1], "r");
    process_ktrace_log(logf);
    fclose(logf);
 
    return 0;
}
 
static void process_ktrace_log(FILE *logf) {
    struct ktr_header ktrhdr;
    struct timeval tv_curr, tv_first, *tvp_first = NULL;
    int ktrlen;
    char *m = NULL;
    int m_size = 0;
 
    FILE *logsc = fopen("sc.log", "w");
    FILE *logcsw = fopen("csw.log", "w");
 
    while (fread(&ktrhdr, sizeof(struct ktr_header), 1, logf) == 1) {
        if ((ktrlen = ktrhdr.ktr_len) < 0)
            errx(1, "bogus length 0x%x", ktrlen);
        if (ktrlen > m_size) {
            m = (void *)realloc(m, ktrlen+1);
            if (m == NULL)
                errx(1, "%s", strerror(ENOMEM));
            m_size = ktrlen;
        }
 
        if (ktrlen && fread(m, ktrlen, 1, logf) == 0)
            errx(1, "data too short");
 
        if (!tvp_first) {
            tv_first = ktrhdr.ktr_time;
            tvp_first = &tv_first;
        }
 
        timersub(&ktrhdr.ktr_time, tvp_first, &tv_curr);
 
        switch (ktrhdr.ktr_type) {
            case KTR_SYSCALL:
                {
                    struct ktr_syscall *ktrsc = (struct ktr_syscall *)m;
                    fprintf(logsc, "%16lld %d\n", 
                            (uint64_t) tv_curr.tv_sec * 1000000L +
                                       tv_curr.tv_usec,
                            ktrsc->ktr_code);
                    syscalls[ktrsc->ktr_code]++;
                }
                break;
            case KTR_CSW:
                {
                    struct ktr_csw *ktrcsw = (struct ktr_csw *)m;
                }
                break;
            default:
                break;
        }
    }
 
    fclose(logsc);
    fclose(logcsw);
 
    {
        FILE *schistlog = fopen("schist.log", "w");
        int i;
 
        for(i = 0; i < 512; i++) {
            if (syscalls[i]) fprintf(schistlog, "%d %lld\n", i, syscalls[i]);
        }
 
        fclose(schistlog);
    }
}

Then I used this gnuplot input to generate a quick plot:

set terminal png
set output 'schist-1m.png'
set xrange [0:60000000]
plot "sc.log" using 1:2 with dots

Finally, I used this little python script (I’m guessing some awk/sed genius could do it on the command line, but I can’t):

#!/usr/pkg/bin/python2.4
 
import sys
import re
 
sc = open("/usr/include/sys/syscall.h")
scdict = {}
for line in sc.readlines():
    mo = re.match("^#define\s+SYS_(\S+)\s+(\d+)\s*$", line)
    if mo is not None:
        scdict[mo.group(2)] = mo.group(1)
sc.close()
 
dpair_re = re.compile("^\s*(\d+) (\d+)$")
for line in sys.stdin.readlines():
    mo = dpair_re.match(line)
    if mo is not None:
        print '%15s %3s %s' % (mo.group(2), mo.group(1), scdict[mo.group(1)])

Then I sorted it on the command line:
% ./schist.py < schist.log | sort -rn > schist-chew.log

And the top 15 are: (full results)

        1047837 197 mmap
         828677   5 open
         412613   3 read
         355321   6 close
         343725 476 fstat
         287935  73 munmap
         278365 475 stat
         114136   4 write
         108059 342 sigaction
          88929 199 lseek
          64747 477 lstat
          42090  92 fcntl
          40985 472 set_tls_area
          29434 253 issetugid
          27946  59 execve

[EDIT] Added the graph and final results.

[SHA, Right] What’s Joux Talkin’ Bout?

Tuesday, April 29th, 2008
"Differential Collisions in SHA-0"

Chabaud & Joux begin by attacking a weakened SHA-0 variation.  The first version, which
they call SHI1, is equivalent to SHA-0 with the addition operations in A_{i+1} replaced
with XORs and the f_i functions replaced by XOR.  This effectively removes the
non-linearity introduced by the f_i function and the addition operations.

The analysis of SHI1 begins by examing the effect of perturbations of the vector
W_{i} (0 <= i < 80) directly (instead of trying to study the perturbations of the
message block, M or W_i (0 <= i < 16)).  In other words, suppose we directly supply the
vector W, instead of using the expansion function E.  Since we are mounting a
differential attack, we will start with some vector W and study how permutations of this
vector effect the final result.

The paper gives the following differential path to correct a flip of W^{i}_{1}:
W^{i+1}_{6}
W^{i+2}_{1}
W^{i+3}_{31}
W^{i+4}_{31}
W^{i+5}_{31}
This differential path gives the bits we need to flip to "undo" a given bit flip (bit
1 of W^{i} or W^{i}_{1} for this example).

TODO: Give the output of the python test script -- showing the correction?
TODO: Add image from the Joux paper (Figure 2)

Performing the above flips, we can produce a second vector, W', which will produce the
same result -- a collision!

This shows we can produce a collision if we can supply both W and W' in their
entirety, but SHI1 contains an expansion function producing W from the message block
M.  The last 64 words of W are derived from a recurrence primed with the first 16
words (which is M).  Some values we could choose for W (or W') would be impossible
to generate using any M.  This leads to our next step; we must ensure that both W
and W' can result from the expansion process.

Let's start by representing the collisions as an error vector, m_{0}.  m_{0} is
80 bits long and will contain a 1 in position i if we intend to negate W^{i}_{1}
(for now, we will only concern ourselves with correcting flips in bit 1 -- see
Note 1 in the paper).  m_{0} must be 0 for the i >= 75 "since a perturbation in
round i is never corrected before round i + 6, and since all perturbations must be
corrected by round 80."

The intuitive explanation of the process
----------------------------------------

Important observation:
* The expansion process does not interleave bits!  This turns it into a function
from 16 bits to 80 bits over each bit in the word.

1. Find valid perturbations -- these are deduced by ensuring they fit the expansion
recurrence relation.  It is important to see that since the compression functions
starts primed with some A - E, the recurrence must actually start at the 11th word
(5 steps [A-E] behind the 16th that the recurrence is defined at).

The search is brute force with a search space of 2^16.  It is simple.

We will call the chosen error vector e_{0}

TODO: Include the functions taken from sha_exp_rev.py that compute valid error
vectors.

2. Now, derive the global differential mask (which is M in the paper -- M is also
the message block... bad naming).  The global differential mask is derived by fixing
the flips found in the previous step with the differential path described in the
prebvious section.  Since the SHI1 defines all combination function in the
compression function as XOR, we can XOR the differential paths for all the bits
flipped in e_{0} to compute up with the global mask.  We will call the global mask
G.  We only need the first 16 words of this mask since those will define the rest of
it (via the expansion function).

NOTE: Maybe worth pointing out that this will generate a valid W' because e_{0}
satisfies (9).

3. Given the global mask M and *any* input message M, SHI1(M) == SHI1(M \xor G).

Collision!  Hooray!  Wait.  That's just SHI1.  It's all linear.  We just solved an
algebra equation.  Oh.

[SHA, Right] The Algorithm

Tuesday, April 29th, 2008

SHA-0 (or something more clever here)

SHA-0 was specified in the first Secure Hash Standard (SHS) (current draft) issued in 1993. Two years later, NIST issued a small change to the SHA-0 expansion function; the new function is known as SHA-1. There was no justification for the updated function, but it is assumed the NSA found a collision attack.

Structure of the Hash Function

The SHA hash function works on blocks of 512 bits and outputs a hash value of 160 bits. The function proceeds as follows:

  1. Pad the message by adding a single 1 bit and then a string of 0 bits suffixed with the length (in bits) of the message without padding (represented as a 64-bit big endian integer). The number of zero bits is chosen as the minimum number to evenly pad the message to the next 512 bit boundary.
  2. The five state variables are initialized to fixed constants:
    A = 0x67452301
    B = 0xefcdab89
    C = 0x98badcfe
    D = 0x10325476
    E = 0xc3d2e1f0
  3. For each message block, call the compression function using the current state, A through E, and the current block. The compression function will return the new state. Pseudocode:
    state[i + 1] = compress(current_block, state[i])
  4. The output of the function is the concatentation of the state variables. Pseudocode:
    return A ++ B ++ C ++ D ++ E

The Compression Function

The compression function is where all the “good stuff” happens.

compress(M, h) is computed as:

  1. Divide the 512-bit message block, M, into 16 32-bit words W_0, W_1, \ldots, W_15.
  2. Expand the 16 words into 80 words using the expansion function defined by the recurrence relation:
    W_i = W_{i-3} \oplus W_{i-8} \oplus W_{i-14} \oplus W_{i-16}
  3. Initialize the state variables, A, B, C, D, E, using h
  4. For each word in W, W_i, compute the new state as:
    \begin{align*}A_{i+1} &= (W_i + A_i \lll 5 + f_i(B_i, C_i, D_i) + E_i + K_i) \mod 2^{32}\\B_{i+1} &= A_i\\C_{i+1} &= B_i \lll 30\\D_{i+1} &= C_i\\E_{i+1} &= D_i\end{align*}
    The function f_i(B,C,D) is defined below.
  5. The output is given as:
    (A_0 + A_80, B_0 + B_80, C_0 + C_80, D_0 + D_80, E_0 + E_80)
The function "f_i(X,Y,Z)" and "K_i" are defined as:
  If i is in the range:
    [ 0, 20)   "IF"   f_i = (X & Y) | (X & Z)             K_i = 0x5A827999
    [20, 39)   "XOR"  f_i = X ^ Y ^ Z                     K_i = 0x6ED9EBA1
    [40, 59)   "MAJ"  f_i = (X & Y) | (X & Z) | (Y & Z)   K_i = 0x8F1BBCDC
    [60, 79)   "XOR"  f_i = X ^ Y ^ Z                     K_i = 0xCA62C1D6

NOTE: Add a link to the python implementation.