M0rsarchive
Note to fellow-HTBers: Only write-ups of retired HTB machines or challenges are allowed.
Challenge info ¶
M0rsarchive [by swani]
Just unzip the archive … several times …
The challenge ¶
The challenge description says it all: we get a zip file and we need to unzip *each and every* layer to get to the flag.
First reconnaissance ¶
Unzipping the first zip file, M0rsarchive.zip — which just contains the actual challenge — we get the following files:
$ sha256sum M0rsarchive.zip
c1c6436427e2bd82c114890069c72eb7c1b67d56ae101ccca195fbc2325707b0 M0rsarchive.zip
$ unzip M0rsarchive.zip
Archive: M0rsarchive.zip
[M0rsarchive.zip] flag_999.zip password:
extracting: flag_999.zip
extracting: pwd.png
$ ls
flag_999.zip M0rsarchive.zip pwd.png
$ unzip -l flag_999.zip
Archive: flag_999.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   624326  2018-10-03 20:02   flag/flag_998.zip
       99  2018-10-03 20:02   flag/pwd.png
---------                     -------
   624425                     2 files
Opening pwd.png in the default image viewer, we get a pretty small image with some coloured stripes and dots: morse!
Decoding the morse code with any morse decoder tool, we get 9.
So we unzip flag_999.zip using this password. This gives us the next layer, which again contains a zip file and an image with morse code.
$ unzip flag_999.zip
Archive: flag_999.zip
[flag_999.zip] flag/flag_998.zip password: #9
extracting: flag/flag_998.zip
inflating: flag/pwd.png
$ cd flag/
$ ls
flag_998.zip pwd.png
The morse code is now split over 2 lines, so this will probably be a 2-character pin/password.
This calls for automation!
Automating the boring stuff with Python ¶
This might be a chapter in Al Sweigart’s book, I don’t know as I still have to read it (it’s on my TODO-list…). So we’ll wing it by ourselves 😋
What do we need to achieve?
- Open an image
- “Read” morse code contained in image
- Decode morse into text
- Unzip protected zip using this password
- Repeat endlessly (or until we run out of zip files)
For the second step I first looked into Optical Character Recognition (OCR), but that was taking me way too far.
So let's think about it for a second. We see an image containing 2 colours, with dots (.) and dashes (-) of 1x1px and 3x1px respectively.
So we should be able to go over each pixel and match the colours.
The following code enumerates the pixels in an image. We assume the first (top-left) pixel has the background colour and that differently coloured pixels are part of the morse code.
We then convert these pixels into symbols we can match, a bit like using 1's and 0's.
Finally we take the output, clean it up a little and convert it into morse: 3 consecutive symbols are a dash, single symbols are dots, and a run of two or more background pixels (such as the gap between two image rows) separates letters.
Image to morse converter
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

from PIL import Image


def getMorse(image):
    """Get morse out of image

    This assumes the background colour is static and morse code has a different colour.
    Morse code can be in any colour, as long as it is not the same as the top-left pixel.

    >>> getMorse('pwd.png')
    ['----.']
    """
    im = Image.open(image, 'r')
    pixels = list(im.getdata())
    # The first (top-left) pixel is assumed to have the background colour
    background = pixels[0]
    chars = []
    for v in pixels:
        if v == background:
            chars.append(" ")
        else:
            chars.append("*")
    output = "".join(chars)

    # Clean output by removing whitespace front and back
    output = re.sub(r'^\s*', '', output)
    output = re.sub(r'\s*$', '', output)
    # Then make dash out of every combination of 3 dots
    output = re.sub(r'\*{3}', '-', output)
    # Convert dots to actual dot
    output = re.sub(r'\*', '.', output)
    # Convert whitespace between letters (i.e. >1 bg pixel) to separator
    output = re.sub(r'\s{2,}', ' | ', output)
    # Remove whitespace
    output = re.sub(r'\s', '', output)
    # Return list of letters
    output = output.split('|')
    return output
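To see what the clean-up does without needing an image, here is a minimal sketch that runs the same kind of substitutions on a hand-written symbol string. The function name symbols_to_morse and the sample strings are made up for illustration; a * stands for a foreground pixel and a space for a background pixel.

```python
import re


def symbols_to_morse(symbols):
    """Turn a '*'/' ' pixel string into a list of morse letters."""
    output = symbols.strip()                   # drop leading/trailing background
    output = re.sub(r'\*{3}', '-', output)     # 3 consecutive pixels -> dash
    output = re.sub(r'\*', '.', output)        # remaining single pixels -> dot
    output = re.sub(r'\s{2,}', '|', output)    # wide gaps separate letters
    output = re.sub(r'\s', '', output)         # 1px gaps inside a letter vanish
    return output.split('|')


# Hypothetical one-row image: a 1px border around the five symbols of "9"
print(symbols_to_morse(" *** *** *** *** * "))
# Hypothetical two-row image, rows concatenated in row-major order
print(symbols_to_morse(" * ***   " + " *** *   "))
```

The first call prints ['----.'] and the second prints ['.-', '-.']: the background at the end of one row and the start of the next forms a gap wider than one pixel, which is what ends up separating the letters.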
Next, we convert the morse back into text. For this we basically use a dictionary with the morse code as key and the letter as value.
Note that most websites give you dictionaries that are the other way around. While you can match based on the value, it’s more efficient to first write some code to mirror the dictionary and then use that one for the decoding, since we don’t need to encode anything.
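Mirroring such a dictionary takes a single dict comprehension. This is only a sketch with a handful of entries; the names ENCODE and DECODE are made up for the example.

```python
# Letter-to-morse table as most references list it (only a few entries shown)
ENCODE = {'A': '.-', 'B': '-...', 'E': '.', '9': '----.'}

# Swap keys and values once, so decoding becomes a plain dictionary lookup
DECODE = {code: letter for letter, code in ENCODE.items()}

print(DECODE['----.'])
```

This prints 9. Because every morse code maps to exactly one character, no entries are lost when the dictionary is flipped.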
Morse decoder
def getPassword(morse):
    """Decode morse

    Convert morse back into text.
    Takes list of letters as input, returns converted text.
    Note that challenge uses lowercase letters.

    >>> getPassword(['----.'])
    '9'
    """
    MORSE_CODE_DICT = {
        '.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D',
        '.': 'E', '..-.': 'F', '--.': 'G', '....': 'H',
        '..': 'I', '.---': 'J', '-.-': 'K', '.-..': 'L',
        '--': 'M', '-.': 'N', '---': 'O', '.--.': 'P',
        '--.-': 'Q', '.-.': 'R', '...': 'S', '-': 'T',
        '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X',
        '-.--': 'Y', '--..': 'Z', '-----': '0', '.----': '1',
        '..---': '2', '...--': '3', '....-': '4', '.....': '5',
        '-....': '6', '--...': '7', '---..': '8', '----.': '9',
        '-..-.': '/', '.-.-.-': '.', '-.--.-': ')', '..--..': '?',
        '-.--.': '(', '-....-': '-', '--..--': ','
    }
    # Decode all letters in one go; an unknown code raises a KeyError
    return "".join([MORSE_CODE_DICT[item] for item in morse]).lower()
Note that the zip passwords are in lowercase. If you forget to add this detail to the code, you’ll get to somewhere in the 700’s before you encounter a zip where the decoded password won’t work.
To use this Python script in an automated process, it's best to add a main function that calls these functions and passes the necessary parameters.
def main():
    """Auto start

    Used for automation.
    Automatically call methods and use 'pwd.png' as input image.
    """
    # print() with a single argument works on both Python 2 and Python 3
    print(getPassword(getMorse("pwd.png")))


if __name__ == "__main__":
    main()
Finally, we need to write some code to recursively unzip each layer. A few lines of Bash will help us with this. I used a similar routine for the Eternal Loop challenge.
#!/bin/bash
RESULT=0
while [ $RESULT -eq 0 ]
do
    PASSWORD="$( python /home/user/HTB/Challenges/Misc/M0rsarchive/M0rsarchive.py )"
    ZIPFILE="$( ls *.zip )"
    unzip -P "$PASSWORD" "$ZIPFILE"
    RESULT=$?
    echo "Unzipped $ZIPFILE using password $PASSWORD ($RESULT)"
    cd flag
done
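For comparison, the same loop could probably be written in Python with the standard zipfile module. The sketch below is not what I ran: the function name extract_layers and the password_for callback are hypothetical, and it assumes the archives use the legacy zip encryption, since that is the only kind zipfile can decrypt. Each layer is extracted into its own numbered directory, so the paths do not keep growing.

```python
import os
import zipfile


def extract_layers(zip_path, out_dir, password_for):
    """Extract nested archives until a layer contains no further .zip file.

    password_for is a hypothetical callback: it receives the path of the
    pwd.png that sits next to the current archive and returns the password
    as a string. Returns the last extraction directory and its member names.
    """
    png_path = os.path.join(os.path.dirname(zip_path), "pwd.png")
    layer = 0
    while True:
        password = password_for(png_path)
        # One flat directory per layer instead of flag/flag/flag/...
        target = os.path.join(out_dir, str(layer))
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target, pwd=password.encode())
            names = archive.namelist()
        inner = [name for name in names if name.endswith(".zip")]
        if not inner:
            return target, names
        # The next archive and its pwd.png were extracted side by side
        zip_path = os.path.join(target, inner[0])
        png_path = os.path.join(os.path.dirname(zip_path), "pwd.png")
        layer += 1
```

With the two functions from the script above, a call would look something like extract_layers("flag_999.zip", "layers", lambda png: getPassword(getMorse(png))).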
Getting the flag ¶
We run the script and once it’s no longer able to unzip the next layer, it means it has reached the end. So we now basically just have to check the contents of the final layer.
Using find will give you an error, because the path to the flag file has become way too long.
$ find . -iname "flag" -type f -exec cat {} \;
cat: ./flag/flag/[...]/flag/flag/flag: File name too long
I decided to use a little bit of automation for this as well:
$ while [ $? -eq 0 ]; do cd flag/; done
$ pwd
/home/user/HTB/Challenges/Misc/flag/flag/flag/[...]/flag/flag/flag
$ cat flag
HTB{##########}
Understanding how to extract the morse code from the image was the trickiest part of this challenge.
I'm pretty happy with my solution and I'm curious about how others solved it. Perhaps someone did implement OCR for this?