SEETF 2022 Crypto writeups

Lost Modulus, The True ECC, DLP and Neutrality

Challenges and thoughts

Lost Modulus
Excellent beginner RSA challenge, with the quirk that you’re not given $n$, nor $e$. I happened to have set a similar challenge for an internal CTF once, and so got first blood on this.

UniveRSAlity
A more programming-oriented challenge, with a section playing with JSON and squeezing bytes into a tiny size.
(No writeup yet)

The True ECC
Discrete log problem on an ellipse. Fun, but I went down a rabbit hole trying to find an isomorphism. Eventually realised that the order of the group was super smooth, then proceeded to mess up the Pohlig-Hellman implementation.

DLP
Another DLP problem, this one in a standard group of integers modulo $n$. As someone who doesn’t know $p$-adics, this problem took quite a bit of researching. Definitely possible with some trial and error.

Neutrality
Best challenge (imho). Tantalizingly simple setup: a one-time pad, but exactly half of the bits are $1$. Solved in an interesting way.

Lost Modulus

44 Solves
I’ve hidden my flag as a modulus, I’m sure no one will be able to retrieve it

Setup

The main encryption function is given below:

def encrypt(m1, m2):
    e = getPrime(256)
    assert m1.bit_length() >= 1600 and long_to_bytes(m1).startswith(b"SEE{"), \
        'first message must be at least 1600 bits and begin with "SEE{"'

    assert 500 <= m2.bit_length() <= 600, \
        'second message must be within 500 to 600 bits'

    return pow(m1, e, n), pow(m2, e, n)

Upon connecting, we’re prompted for m1 and m2. If successful, we receive the two outputs.

Crucially, neither the value of n nor that of e is disclosed to us. n also happens to be derived from the flag:

n = bytes_to_long(FLAG)

while n.bit_length() < 2048:
    n *= n

Equations

Given that this is an RSA challenge, we’re going to run into math somewhere. Let’s first start by writing each equation.
$$
c_1 = m_1^e \mod n \\
c_2 = m_2^e \mod n
$$
Given that $n$ is unknown, it makes more sense to write the reduction as part of the equation, since we’re unable to compute anything $\mod n$ ourselves.
$$
c_1 = m_1^e + k_1 n \\
c_2 = m_2^e + k_2 n
$$
As is, the two equations are unrelated. But what if we force them to be related? With unpadded RSA’s homomorphic properties in mind, we know that
$$
E(a\cdot b) = (a^e)(b^e) = (ab)^e = E(a) \cdot E(b)
$$
And hence
$$
E(a^b) = E(a)^b
$$
Of course, this is all under $\mod n$:
$$
E(a^b) + nk = E(a)^b \\
E(a)^b - E(a^b) = nk
$$
Perfect! We’ve more or less isolated $n$. We’ll deal with the unknown factor $k$ later.

We can use our two messages as follows:
$$
m_1 = a^b, m_2 = a \\
E(a)^b - E(a^b) = nk \\
c_2^b - c_1 = nk
$$
All that’s left is to pick appropriate $a$ and $b$. The challenge restricts $m_1$ to at least $1600$ bits and $m_2$ to between $500$ and $600$ bits. This tells us that only $b \ge 3$ works, since the number of bits scales with the power ($b = 2$ would cap $m_1$ at $1200$ bits). Now, we can set
$$
m_2 = \left \lceil { \text{“SEE\{…”}^{1/3} } \right \rceil \\
m_1 = m_2^3
$$
which ensures that both fit the conditions. From here, getting $nk$ is easy.
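
Here’s a minimal sketch of this construction (the integer cube-root helper is my own, and the zero-padding length is just one choice that satisfies the bounds):

from Crypto.Util.number import bytes_to_long, long_to_bytes

def icbrt_ceil(n):
    # smallest integer x with x^3 >= n, by binary search
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi) // 2
        if mid**3 < n:
            lo = mid + 1
        else:
            hi = mid
    return lo

# target: an integer of more than 1600 bits whose bytes begin with "SEE{"
target = bytes_to_long(b"SEE{" + b"\x00" * 197)

m2 = icbrt_ceil(target)   # ~536 bits, within [500, 600]
m1 = m2**3                # >= target; the rounding error only touches the
                          # low-order bytes, so the "SEE{" prefix survives

assert long_to_bytes(m1).startswith(b"SEE{") and m1.bit_length() >= 1600
assert 500 <= m2.bit_length() <= 600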

Isolating $n$

Finally, we need to isolate $n$. Here’s a little trick: per connection, we can only get one output $nk$. But who said we can only connect once? By connecting multiple times, and using slightly different $m_1, m_2$, we can get many different $nk_1, nk_2, nk_3\ …$! Hence, we can perform a simple GCD operation to extract $n$:
$$
n = \text{gcd} (nk_1, nk_2, …)
$$
With that, we can repeatedly take square roots to extract the original flag.
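
Putting it all together, here’s a sketch of the final extraction, where nk_values is a hypothetical list holding the $c_2^3 - c_1$ values collected across connections:

from math import gcd, isqrt
from functools import reduce
from Crypto.Util.number import long_to_bytes

n = reduce(gcd, nk_values)   # may still carry a small common factor in
                             # unlucky cases; strip small primes if needed

# n was built by repeatedly squaring the flag until it passed 2048 bits,
# so undo that by taking square roots while the value stays a perfect square
flag = n
while isqrt(flag) ** 2 == flag:
    flag = isqrt(flag)

print(long_to_bytes(flag))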

The True ECC

20 Solves
You know what you think of when you hear “Elliptic Curve”? It’s ellipses of course!

Setup

We’re provided with a group with its associated operator:
$$
(x, y) \in \mathbb{F}_p^2 \\
x^2 + ay^2 = 1 \\
(x, y) \times (w, z) = (xw-ayz, xz+yw) \\
a = 376014,\ p = 2^{521} - 1
$$
From here, it’s standard DH. Hence, the target is to solve the discrete logarithm on this group.
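
To make the group concrete, here’s a minimal sketch of the group law and scalar multiplication in plain Python (my own naming; the writeup’s own code instead overloads * for scalar multiplication, as in G * (p+1)):

p = 2**521 - 1
a = 376014

class Point:
    """A point (x, y) on the ellipse x^2 + a*y^2 = 1 over F_p."""
    def __init__(self, x, y):
        self.x, self.y = x % p, y % p

    def __mul__(self, other):
        # group law: (x, y) * (w, z) = (xw - ayz, xz + yw)
        x, y, w, z = self.x, self.y, other.x, other.y
        return Point((x * w - a * y * z) % p, (x * z + y * w) % p)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

def scalar(P, k):
    # square-and-multiply, with (1, 0) as the identity
    result, base = Point(1, 0), P
    while k:
        if k & 1:
            result = result * base
        base = base * base
        k >>= 1
    return result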

First look

As the challenge title suggests, this describes an ellipse. There’s a similar challenge on CryptoHack, but instead of $x^2 + ay^2 = 1$, it uses $x^2 - ay^2 = 1$. While it is indeed possible to rewrite our curve in that form by swapping $a$ for $-(p-a)$ (the same value $\bmod p$), this doesn’t transform the challenge into CryptoHack’s. Crucially, $a$ itself is a quadratic residue mod $p$, but $p-a$ isn’t. The solution there requires the coefficient to have a square root mod $p$, and hence doesn’t work here.
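
We can check the residue claims directly with Euler’s criterion: $a^{(p-1)/2} \equiv 1 \pmod p$ exactly when $a$ is a nonzero quadratic residue.

p = 2**521 - 1
a = 376014

print(pow(a, (p - 1) // 2, p) == 1)      # True:  a is a quadratic residue
print(pow(p - a, (p - 1) // 2, p) == 1)  # False: p - a is a non-residue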

Paper hunting

As you do, we started searching for this group in the literature. We found a paper: “A Note on Cyclic Groups, Finite Fields, and the Discrete Logarithm Problem” which mentions a key result for the $x^2 - ay^2 = 1$ form of the group:
$$
\text{The group is of order } p - \left ( \frac ap \right )
$$

With $( \frac ap )$ denoting the Legendre symbol.

As mentioned earlier, we can change the form of our group by substituting $a$ with $-(p-a)$, which is identical $\bmod p$. This puts the curve into the $x^2 - a'y^2 = 1$ form with $a' = p - a$:
$$
(x, y) \in \mathbb{F}_p^2 \\
x^2 - a’y^2 = 1 \\
(x, y) \times (w, z) = (xw+a’yz, xz+yw) \\
a’ = p-376014,\ p = 2^{521} - 1
$$
And since $p-376014$ is a quadratic non-residue, we have that the order is $p - (-1) = p+1$. Indeed, we can verify this by checking $(x, y)^{p+1} = (1, 0)$, the identity point.

print(G * (p+1))
print(A * (p+1))
print(B * (p+1))

(1, 0)
(1, 0)
(1, 0)

Pohlig-Hellman

But wait, $p = 2^{521}- 1$. Hence $p+1 = 2^{521}$, which makes the order extremely smooth. This means that we can solve the discrete logarithm efficiently using Pohlig-Hellman.

First, we need to get the order of $g$ itself. We know that the group’s order is $2^{521}$, hence the order of $g$ must be some power of $2$ dividing it. We can count down from $k = 521$, stopping at the first $k$ where $2^k \cdot g$ is no longer $(1, 0)$. This tells us that one higher, $2^{k+1}$, is the order of $g$.

for k in range(521, -1, -1):
    print(k, G*2^k)

521 (1, 0)
520 (1, 0)
519 (6864797660130609714981900799081393217269435300143305409394463459185543183397656052122559640661454554977296311391480858037121987999716643812574028291115057150, 0)
518 (0, 4933663095141084727279631578976133543495929397222497716068101484425317635581526996358350866139637980808679837735912670165020804007354867802815153320120466630)
...

Hence, the order of $g$ is $2^{520}$.

From here, we can proceed to implement the rest of the algorithm. This implementation is based on the equations from Wikipedia.

e = 520 # order of g
gamma = G * (2^(e-1))

def pohlig(h):
    xk = 0

    for k in range(e):
        hk = (G * -xk + h) * (2^(e-1-k))

        # normally, we do BSGS. But there are only two possible
        # values, so we can just check hk == gamma for dk = 1.
        if (hk == gamma):
            xk += 2^k

    return xk

dlogA = pohlig(A)
print(dlogA)
1589751248623626893528421560300030437266981369969303509254567645856824144987509893536438358158445818359875350681075731857177654303541340038222788806443757470

With this, we can recover the secret by calculating B * dlogA, then decrypt the flag accordingly.

DLP

19 Solves
Huh wait, but I thought DLP is hard.

Setup

Another DLP. Here, the group we’re given is the integers mod $n$, with a generator $g$:
$$
n = \prod_{i=1}^{16} p_i^{w_i} \\
g = r^{\prod (p_i-1)}
$$
For 256-bit primes $p_i$ and exponents $1 \le w_i \le 9$. We’re tasked to recover $m$, given $g^m \bmod n$.
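
For reference, here’s a hypothetical reconstruction of the setup based on that description (the actual challenge source may differ in details):

from Crypto.Util.number import getPrime
from random import randint
from math import prod

primes = [getPrime(256) for _ in range(16)]
power  = [randint(1, 9) for _ in range(16)]

n = prod(p**w for p, w in zip(primes, power))
r = randint(2, n - 1)
g = pow(r, prod(p - 1 for p in primes), n)

m  = randint(0, prod(p**(w - 1) for p, w in zip(primes, power)) - 1)
gm = pow(g, m, n)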

Prime Powers

There’s a lot going on, with many points that could be vulnerable. One thing we noticed immediately was that $g \equiv g^m \equiv 1 \pmod{p_i}$ for every prime $p_i$. This is easily proven with Fermat’s little theorem, since each $p_i - 1$ divides the exponent (assuming $p_i \nmid r$):
$$
g = r^{\prod_j (p_j-1)} \\
g \equiv r^{\prod_j (p_j-1) \bmod (p_i-1)} \equiv r^0 \equiv 1 \pmod{p_i} \\
g^m \equiv 1^m \equiv 1 \pmod{p_i}
$$
While we didn’t quite know how to apply it, it was interesting nonetheless.

all(
    g % prime == 1 and
    gm % prime == 1
    for prime in primes
)
True

The next logical step was to break the modulus down, and the prime powers were a great place to do this. Supposing we could find discrete logarithms $m_i$ satisfying $g^{m_i} \equiv g^m \pmod{p_i^{w_i}}$, we could easily recombine all 16 of them to give $m$:
$$
g^{m_i} \equiv g^m \pmod{p_i^{w_i}} \\
m_i \equiv m \pmod{\varphi(p_i^{w_i})} \\
m = \text{CRT}\left(m_i,\ \varphi(p_i^{w_i})\right) \\
\text{where }\ \varphi(p^w) = p^{w-1} (p-1)
$$
With that in mind, we searched “discrete logarithm modulo prime power” on Google. The first result was exactly what we needed: a way to convert solutions $x$ of $a^x \equiv b \pmod p$ into solutions of $a^x \equiv b \pmod{p^k}$.

Typically, this wouldn’t be that useful, as we’d still have to solve the discrete logarithm $\bmod p$, which could take a while. However, let’s bring back that property we observed:
$$
g \equiv g^m \equiv 1 \pmod{p_i} \\
g^1 \equiv g^m \pmod{p_i}
$$
As it turns out, we’ve already solved it: $x = 1$ works modulo every $p_i$. With the algorithm from StackExchange, we could then easily find a value for
$$
g^x \equiv g^m \pmod{p_i^{w_i}}
$$

Solving

Implementing the algorithm:

def theta(x, p, e):
    x = pow(x, p-1, p^(2*e-1))
    numer = int(pow(x, p^(e-1), p^(2*e-1)) - 1)
    denom = p^e

    assert numer % denom == 0
    return (numer//denom) % p^(e-1)

# given x1, where a^x1 = b mod p, returns x such that a^x = b mod p^e.
def lift(x1, a, p, e, b=None):
    if b is None:
        b = int(pow(a, x1, p))
    else:
        assert pow(a, x1, p) == b%p

    x2 = theta(b, p, e) * pow(theta(a, p, e), -1, p^(e-1))
    x2 = int(x2)

    x = CRT_list([x1, x2], [p-1, p^(e-1)])
    return x

Now, we can go through each prime power $(p^w)_i$, calculating the discrete log for each prime power, $m_i$. From there, we can reconstruct $m$ using CRT.

m_vs = []
m_mods = []

for p, w in zip(primes, power):
    new_pow = lift(1, g, p, w, b=gm)
    sub_mod = (p-1)*p^(w-1)

    m_vs.append(new_pow)
    m_mods.append(sub_mod)

m = CRT_list(m_vs, m_mods)
pow(g, m, n) == gm
True

We should take note, though, that the true value of $m$ was generated below a certain bound:

m = randint(0, product(p**(w-1) for p, w in zip(primes, power)) - 1)

And unfortunately,

m < product(p**(w-1) for p, w in zip(primes, power))
False

Not to worry. The modulus that $m$ was computed under is the LCM of all the sub_mod values. Since $g \equiv 1 \pmod{p_i}$, the order of $g$ modulo $p_i^{w_i}$ divides $p_i^{w_i-1}$, and so the order of $g$ modulo $n$ divides $\prod_i p_i^{w_i-1}$. This implies that
$$
m' = m \bmod \prod_i p_i^{w_i-1}
$$
is also a valid solution. This time,

prodpw = product(p**(w-1) for p, w in zip(primes, power))

m2 = m % prodpw

pow(g, m2, n) == gm
True

m2 < prodpw
True

It solves the discrete log, and also fits within the original bounds. Running the final section of code with m2 reveals the flag.

Neutrality

5 Solves
I’ve been learning about one-time pads, but they sometimes generate more 1 bits than 0 bits or vice-versa. I think this leaks some information, so it’s better to ensure that the one-time pad is always neutral.

Setup

The setup is incredibly simple. The flag is encrypted with a one-time pad. This pad has exactly half of its bits set to $1$, and the other half to $0$. We receive 200 different encryptions, and that’s it. Each encryption is 320 bits, so the flag is 40 bytes.
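
Based on that description, each encryption plausibly looks something like the following sketch (a reconstruction, not the actual challenge source; each pad is a random arrangement of 160 ones and 160 zeros):

from Crypto.Util.number import bytes_to_long
import random

flag = b"SEE{" + b"?" * 35 + b"}"       # placeholder: 40 bytes = 320 bits
m = bytes_to_long(flag)

def encrypt_once(m):
    pad_bits = [1] * 160 + [0] * 160
    random.shuffle(pad_bits)
    pad = int("".join(map(str, pad_bits)), 2)
    return m ^ pad

outputs = [encrypt_once(m) for _ in range(200)]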

Assumptions

The first thing we should note is that, without any assumptions on the flag, this is impossible to solve, since a huge number of plaintexts could plausibly have produced these ciphertexts.

The first assumption we can lay down is that the flag format is consistent with previous challenges, i.e. SEE{ ... }. This already gives us 40 bits of information, and 40 bits of known pad per sample.

Another assumption that we can use is that the middle portion of the flag will only consist of lowercase letters, numbers, and underscores. If we end up not finding any solutions, we can revisit these assumptions (perhaps allow uppercase).

PyTorch

wait hear me out

Honestly, I didn’t expect this to work either. The intended solution would probably involve LLL, but I couldn’t figure out how to construct a lattice (and was too braindead at 3am to try). Instead, I turned to what I knew how to do: shitty machine learning.

We can model the remaining 280 bits as real variables to be optimized against a loss function that captures the constraints of the challenge. With a well-designed loss function, the optimizer should converge to a valid flag.

Preprocessing

We first need to convert all the data to a form that can be used by PyTorch.

Let’s read and convert each of the encryptions to bits in a matrix.

with open("output.txt") as f:
data = f.readlines()

data = [*map(int, data)]

import numpy as np

# 200 samples, 320 bits
bits = np.zeros((200, 320))
for idx, x in enumerate(data):
bits[idx] = [*map(int, bin(x)[2:].rjust(320, "0"))]

bits
# array([[1., 1., 0., ..., 1., 0., 0.],
# [1., 1., 1., ..., 1., 0., 1.],
# [1., 0., 0., ..., 1., 1., 0.],
# ...,
# [1., 1., 0., ..., 1., 0., 1.],
# [0., 0., 0., ..., 0., 1., 1.],
# [0., 0., 0., ..., 0., 0., 1.]])

We can do the same for the known SEE{ and } parts:

from Crypto.Util.number import *
from math import ceil, log

def binarr(x):
    n = ceil(log(x, 256))*8
    return np.array(list(map(int, bin(x)[2:].rjust(n, "0"))))

# SEE{
fstart = b"SEE{"
fstart = bytes_to_long(fstart)
fstart = binarr(fstart)

# }
fend = b"}"
fend = bytes_to_long(fend)
fend = binarr(fend)

With these arrays, we can do fun things, like calculate how many of the known bits have already been flipped. The resulting length-200 array tells us, for each sample, how many of the known SEE{ and } bit positions were flipped in that ciphertext.

diffs = (bits[:, :4*8] != fstart).sum(axis=1) + (bits[:, -8:] != fend).sum(axis=1) 
diffs
# array([21, 18, 18, 25, 23, 24, 23, 22, 22, 20, 23, 22, 15, 19, 16, 17, 20,
# 19, 19, 24, 23, 21, 25, 20, 18, 18, 18, 24, 20, 16, 15, 22, 17, 19,
# 15, 17, 22, 21, 24, 22, 19, 15, 19, 26, 18, 21, 20, 25, 21, 22, 20,
# 21, 21, 16, 23, 26, 21, 19, 20, 19, 19, 24, 24, 24, 16, 21, 20, 18,
# 22, 21, 23, 23, 22, 20, 20, 18, 19, 22, 22, 21, 19, 28, 21, 15, 19,
# 18, 20, 16, 23, 21, 17, 14, 21, 23, 19, 19, 23, 22, 17, 26, 19, 24,
# 23, 23, 22, 22, 16, 16, 21, 19, 22, 19, 19, 21, 17, 16, 21, 15, 14,
# 18, 18, 19, 19, 21, 18, 16, 13, 24, 20, 20, 18, 22, 18, 18, 19, 21,
# 21, 20, 21, 18, 19, 18, 24, 18, 19, 22, 13, 18, 19, 22, 21, 19, 18,
# 12, 18, 20, 22, 21, 22, 21, 26, 19, 14, 16, 19, 20, 24, 21, 19, 22,
# 18, 23, 18, 18, 17, 18, 18, 21, 18, 18, 19, 22, 21, 16, 19, 22, 22,
# 20, 21, 22, 21, 18, 19, 23, 22, 19, 20, 18, 20, 16])

Modelling

The key to a successful gradient-descent optimization is a continuous and differentiable loss function. At each step, the optimizer nudges the variables to make the loss slightly smaller than before.

We’ll model each bit’s trainable variable as a real number. To convert it to an actual bit between 0 and 1, we can apply the sigmoid function to each element.
$$
\text{sigmoid}(x) = \frac1 {1 + e^{-x}}
$$
Although the bits are not exactly 0 or exactly 1, we need to give the model this fuzziness to allow it to train properly. This way, there is a smooth decrease in loss as it changes bits from 0 to 1.

Next, we can find the number of bit deltas between the model’s bits and each of the 200 sample bits. We can then add that to the earlier found deltas from the assumed SEE{}, and for each of the 200 samples, it should exactly equal 160. The absolute difference between the total number of deltas and 160 can be our final loss function. Together, it looks like this:

import torch
sig = torch.nn.Sigmoid()
var = torch.tensor([0.0]*280, requires_grad=True)

noised = torch.tensor(bits[:, 4*8:-8])
tdiffs = torch.tensor(diffs)

out = sig(var)
delta = (torch.abs(noised-out)).sum(axis=1)
match = (torch.abs(delta + tdiffs - 160)).mean()

When match is 0, we’ve found a set of bits that could have been the flag.

Let’s add the rest of the training loop:

sig = torch.nn.Sigmoid()

# optimiser over the 280 trainable bit variables
# (Adam is an assumption here; the original choice of optimiser isn't shown)
optimizer = torch.optim.Adam([var])

for i in range(1000000):
    optimizer.zero_grad()

    out = sig(var)
    delta = (torch.abs(noised-out)).sum(axis=1)
    match = (torch.abs(delta + tdiffs - 160)).mean()

    loss = match

    # pass gradients, optimise slightly
    loss.backward()
    optimizer.step()

    # training progress
    if i % 5000 == 0:
        interpret(var)   # helper that renders the current bits as bytes
        print(float(loss))

b'qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq'
3.064923250703141
b"u8/\x04<_'qn[b%}au`Og{4cc7b.0Br1a}Epdf"
0.16595518081914634
b"58/\x05<_'sl[b%}ku\xe1oi\x7f<c!7b&0Br0a}Erdf"
0.04761254854034633
b"5:/\x05<_'sn[b%}ku\xe1oi\x7f<c!?b&0Bs0c}Gpdf"
0.017458999711670913
b"5\xba/\x05<_'sn[b%}ku\xe1oi\x7f<c!?b&0Bs0c}Gpdf"
0.009512746030159178
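
(interpret isn’t shown in the snippets above; a rough guess at what it might look like, purely an assumption on my part:)

from Crypto.Util.number import long_to_bytes

def interpret(var):
    # Threshold the sigmoid'd variables into bits and regroup them into the
    # 35 unknown bytes for display. (Assumed reconstruction, not the author's
    # actual helper.)
    bits = (torch.sigmoid(var) > 0.5).int().tolist()
    print(long_to_bytes(int("".join(map(str, bits)), 2)).rjust(35, b"\x00"))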

Well, it converged, but not to anything great. We haven’t yet included the condition that the characters should be printable. We can do that by summing, over each group of 8 bits, the minimum distance to a list of “approved” 8-bit patterns.

import string

chars = string.digits + string.ascii_letters + "_"
comp = torch.tensor([binarr(ord(c)) for c in chars])

def force_set(x):
    diff = torch.abs(comp-x).sum(dim=1)
    return torch.min(diff)

We can add that to the loss:

loss = match + sum(force_set(y) for y in out.reshape(-1, 8))/35

And now when we run the loop:

b'qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq'
3.064923250703141
b'e8m$0_bqo[bewaeagi{4aawbFrBp0atExde'
0.34255598894902506
b'50-50_can_be_leaky_4c17bf2c20c4a8df'
0.1167937920015538
b'50-50_can_be_leaky_4c17bf2a20c4a8df'
0.01567253488514325

It works!