Note to fellow-HTBers: Only write-ups of retired HTB machines or challenges are allowed.

Challenge info

M0rsarchive [by swani]
Just unzip the archive … several times …

The challenge

The challenge description says it all: we get a zip file and we need to extract *each and every* layer to get to the flag.

First reconnaissance

Unzipping the first zip file, M0rsarchive.zip — which just contains the actual challenge — we get the following files:

$ sha256sum M0rsarchive.zip 
c1c6436427e2bd82c114890069c72eb7c1b67d56ae101ccca195fbc2325707b0 M0rsarchive.zip

$ unzip M0rsarchive.zip 
Archive: M0rsarchive.zip
[M0rsarchive.zip] flag_999.zip password: 
 extracting: flag_999.zip       
 extracting: pwd.png

$ ls
flag_999.zip M0rsarchive.zip pwd.png

$ unzip -l flag_999.zip 
Archive:  flag_999.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   624326  2018-10-03 20:02   flag/flag_998.zip
       99  2018-10-03 20:02   flag/pwd.png
---------                     -------
   624425                     2 files

Opening pwd.png in the default image viewer, we get a pretty small image with some coloured stripes and dots: morse!

Decoding the morse code with an arbitrary online morse decoder, we get 9.

So we unzip flag_999.zip using this password. This gives us the next layer, containing yet again a zip file and an image with a morse code.

$ unzip flag_999.zip 
Archive:  flag_999.zip
[flag_999.zip] flag/flag_998.zip password: #9
 extracting: flag/flag_998.zip       
  inflating: flag/pwd.png

$ cd flag/
$ ls
flag_998.zip  pwd.png

The morse code is now split over 2 lines, so this will probably be a 2-character pin/password.

This calls for automation!

Automating the boring stuff with Python

This might be a chapter in Al Sweigart’s book, I don’t know as I still have to read it (it’s on my TODO-list…). So we’ll wing it by ourselves 😋

What do we need to achieve?

  1. Open an image
  2. “Read” morse code contained in image
  3. Decode morse into text
  4. Unzip protected zip using this password
  5. Repeat endlessly (or until we run out of zip files)

For step 2 I first looked into Optical Character Recognition (OCR), but this was taking me way too far.
So let’s think about it for a second. We see an image containing 2 colours, with dots and dashes of 1x1px and 3x1px respectively.
So we should be able to go over each pixel and match the colours.

The following code enumerates the pixels in an image. We assume the first (top-left) pixel has the background colour and that differently coloured pixels are part of the morse code.
We then convert these pixels into symbols we can match, a bit like using 1’s and 0’s.
Finally we take the output, clean it up a little and convert it into morse: 3 consecutive symbols are a dash, single symbols are dots, and runs of 2 or more background pixels (which is what a line break in the image turns into) are letter separators.

Image to morse converter

#!/usr/bin/env python
# -*- coding: utf-8 -*-

def getMorse(image):
  """Get morse out of image

  This assumes the background colour is static and morse code has a different colour.
  Morse code can be in any colour, as long as it is not the same as the top-left pixel.


  >>> getMorse('pwd.png')
  ['----.']

  """

  from PIL import Image
  import re

  im = Image.open(image, 'r')

  chars = []
  background = im.getdata()[0]

  for i, v in enumerate(list(im.getdata())):
    if v == background:
      chars.append(" ")
    else:
      chars.append("*")

    # print "{0: <4}: {1: <15} ({2})".format(i, v, "[-] BG" if v == background else "[+] FG")

  output =  "".join(chars)

  # Clean output by removing whitespace front and back
  # Then make dash out of every combination of 3 dots
  # Convert dots to actual dot
  # Convert whitespace between letters (i.e. >1 bg pixel) to separator
  # Remove whitespace
  # Return list of letters
  output = re.sub(r'^\s*', '', output)
  output = re.sub(r'\s*$', '', output)
  output = re.sub(r'\*{3}', '-', output)
  output = re.sub(r'\*', '.', output)
  output = re.sub(r'\s{2,}', ' | ', output)
  output = re.sub(r'\s', '', output)
  output = output.split('|')

  return output
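
To see what the clean-up step does without needing an image, here is a minimal sketch that runs the same substitutions on a hand-made string. The function name pixels_to_morse and the two 21-pixel rows are made up for illustration; they are not taken from the challenge files.

```python
import re


def pixels_to_morse(flat):
    """Turn a flattened pixel string ('*' = foreground, ' ' = background) into morse letters."""
    flat = flat.strip()                   # drop leading/trailing background
    flat = re.sub(r'\*{3}', '-', flat)    # 3 foreground pixels in a row -> dash
    flat = re.sub(r'\*', '.', flat)       # remaining single pixels -> dot
    flat = re.sub(r'\s{2,}', '|', flat)   # 2+ background pixels -> letter separator
    flat = flat.replace(' ', '')          # single gaps inside a letter carry no information
    return flat.split('|')


# Hypothetical 2-row image, 21 px wide, flattened row by row
row1 = " *** *** *** *** *   "   # ----.
row2 = " * ***               "   # .-
print(pixels_to_morse(row1 + row2))
```

Because the rows are simply glued together, the background at the end of one row and the start of the next merges into a gap of at least 2 pixels, which is what separates the letters.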

Next, we convert the morse back into text. For this we basically use a dictionary with the morse code as key and the letter as value.
Note that most websites give you dictionaries that are the other way around. While you can match based on the value, it’s more efficient to first write some code to mirror the dictionary and then use that one for the decoding, since we don’t need to encode anything.
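
Mirroring such a dictionary takes a single comprehension. The snippet below is a small sketch with a truncated table; the names ENCODE and DECODE are only used here for illustration.

```python
# Encoding table as commonly published (letter -> morse); truncated for brevity
ENCODE = {'A': '.-', 'B': '-...', 'E': '.', '9': '----.'}

# Mirror it once so decoding is a single dictionary lookup per letter
DECODE = {code: letter for letter, code in ENCODE.items()}

print(DECODE['----.'])
```

This only works because every morse code maps to exactly one character, so no keys collide after the swap.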

Morse decoder

def getPassword(morse):
  """Decode morse

  Convert morse back into text.
  Takes list of letters as input, returns converted text.

  Note that challenge uses lowercase letters.

  >>> getPassword(['----.'])
  '9'

  """
  MORSE_CODE_DICT = {
    '.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D',
    '.': 'E', '..-.': 'F', '--.': 'G', '....': 'H',
    '..': 'I', '.---': 'J', '-.-': 'K', '.-..': 'L',
    '--': 'M', '-.': 'N', '---': 'O', '.--.': 'P',
    '--.-': 'Q', '.-.': 'R', '...': 'S', '-': 'T',
    '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X',
    '-.--': 'Y', '--..': 'Z', '-----': '0', '.----': '1',
    '..---': '2', '...--': '3', '....-': '4', '.....': '5',
    '-....': '6', '--...': '7', '---..': '8', '----.': '9',
    '-..-.': '/', '.-.-.-': '.', '-.--.-': ')', '..--..': '?',
    '-.--.': '(', '-....-': '-', '--..--': ','
  }

  # Look up every letter; an unknown code raises a KeyError instead of failing inside join()
  return "".join(MORSE_CODE_DICT[item] for item in morse).lower()

Note that the zip passwords are in lowercase. If you forget to add this detail to the code, you’ll get to somewhere in the 700’s before you encounter a zip where the decoded password won’t work.

To use this Python script in an automated process, it’s best to add a main-method which will call these methods and pass the necessary parameters.

def main():
  """ Auto start

  Used for automation.
  Automatically call methods and use 'pwd.png' as input image.

  """
  print(getPassword(getMorse("pwd.png")))


if __name__ == "__main__":
  main()

Finally, we need to write some code to unzip each layer in turn. A few lines of Bash will help us with this. I used a similar routine for the Eternal Loop challenge.

#!/bin/bash

RESULT=0

while [ $RESULT -eq 0 ]
do
        PASSWORD="$( python /home/user/HTB/Challenges/Misc/M0rsarchive/M0rsarchive.py )"
        ZIPFILE="$( ls *.zip )"
        unzip -P "$PASSWORD" "$ZIPFILE"
        RESULT=$?
        echo "Unzipped $ZIPFILE using password $PASSWORD ($RESULT)"
        cd flag
done
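
For comparison, the same loop could probably be written in Python with the standard zipfile module. This is only a sketch, not what I ran: peel_layers and its get_password callback are hypothetical names, it assumes every archive unpacks into a flag/ folder as in the listing above, and it assumes the archives use the legacy zip encryption that zipfile can read. Each layer goes into its own temporary folder, so the path never grows.

```python
import os
import shutil
import tempfile
import zipfile


def peel_layers(zip_path, pwd_image, get_password):
    """Extract nested archives until a layer has no inner zip.

    get_password maps an image path to the password string,
    e.g. lambda p: getPassword(getMorse(p)).
    Returns the folder of the last extracted layer.
    """
    previous = None
    while True:
        password = get_password(pwd_image)
        out_dir = tempfile.mkdtemp(prefix="layer_")
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(out_dir, pwd=password.encode())
        if previous:
            shutil.rmtree(previous)  # the layer we just opened is no longer needed
        previous = out_dir
        layer = os.path.join(out_dir, "flag")  # assumed: archives unpack into flag/
        inner = sorted(f for f in os.listdir(layer) if f.endswith(".zip"))
        if not inner:
            return layer
        zip_path = os.path.join(layer, inner[0])
        pwd_image = os.path.join(layer, "pwd.png")
```

Keep in mind that the zipfile documentation warns that decryption is slow because it is implemented in pure Python, so for a thousand layers the Bash loop with unzip may well be quicker.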

Getting the flag

We run the script and once it’s no longer able to unzip the next layer, it means it has reached the end. So we now basically just have to check the contents of the final layer.

Using find will give you an error, because the path to the flag file has become way too long.

$ find . -iname "flag" -type f -exec cat {} \;
cat: ./flag/flag/[...]/flag/flag/flag: File name too long

I decided to use a little bit of automation for this as well:

$ while [ $? -eq 0 ]; do cd flag/; done
$ pwd
/home/user/HTB/Challenges/Misc/flag/flag/flag/[...]/flag/flag/flag
$ cat flag
HTB{##########}
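
The same trick, walking down one folder at a time so that only short relative paths are ever used, can be sketched in Python as well. The function name read_deep_flag is made up, and it assumes the layout seen above: nested flag/ folders with a file called flag at the bottom.

```python
import os


def read_deep_flag(start_dir, name="flag"):
    """Descend through nested `name` folders one step at a time, then read the file `name`."""
    origin = os.getcwd()
    try:
        os.chdir(start_dir)
        while os.path.isdir(name):  # relative path, so its length never grows
            os.chdir(name)
        with open(name) as fh:
            return fh.read().strip()
    finally:
        os.chdir(origin)  # go back to where we started
```

Calling read_deep_flag on the folder that holds the first flag/ directory should return the contents of the innermost flag file.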

Understanding how to extract the morse code from the image was the trickiest part of this challenge.

I’m pretty happy with my solution and I’m curious about how others solved it. Perhaps someone did implement OCR for this?