Introduction to Information Security, Assignments of Computer Science

An introduction to the project for CS 6035: Introduction to Information Security at Georgia Tech. The project covers topics including buffer overflows, assembly programming, and reverse engineering through a series of "capture the flag" (CTF) challenges that students must complete. The challenges exercise security concepts and techniques such as exploiting buffer overflows, analyzing assembly code, and understanding client-server communication. The document also describes the virtual machine (VM) environment that students use to complete the project, including download links and login credentials.

Typology: Assignments

2020/2021

Uploaded on 06/23/2024

home-ew

GT CS 6035: Introduction to Information Security
Project
Capture The Flag!
Learning Goals of this Project:
Students will learn introductory-level concepts about binary exploitation. This lab develops
understanding of control flow hijacking through different tasks/challenges designed to show
certain vulnerabilities or weaknesses in a C program. The Python library pwntools will be used to
demonstrate some exploitation techniques and to automate successfully hacking a program.
The final deliverables:
A single JSON-formatted file will be submitted to Gradescope. This file should be named
project_ctf.json. A template can be found in the home directory.
See Submission Details for more information.
Important Reference Material :
● This Intro to pwntools/pwndbg video shows how to automate some exploits and use our exploit framework on the VM
● If you’re an absolute beginner with no Linux experience, This Website may be able to help
● pwntools Documentation
● GDB command cheat sheet
Submission:
Gradescope (autograded) - see Submission Details
Virtual Machine:
(Note: downloads can be very slow when the project first releases due to very high traffic in the
first few hours/day)
- Parallels VM for Apple M1-based systems
  - You can install Parallels for Mac here
  - VM Download Link
  - Username: parallels, Password: password
- Intel/AMD x64 version, using VirtualBox (6.1.16 is suggested, but you can try any version if one is already installed)
  - VM Download
  - Windows VirtualBox 6.1.16 Download
  - Mac VirtualBox 6.1.16 Download
  - Username: cs6035, Password: cs6035
- Note: there are no root permissions on the VM

00_intro

Step 1: Open a terminal and cd into the project directory project_ctf/00_intro.

$ cd ~/project_ctf/00_intro

Inspect the contents of the readme file:

$ cat readme

Follow the instructions in the readme to modify e.py with your GTID (the 9-digit numeric school ID number that looks like 901234567), then execute the script to get your first flag! The script's output contains a submission hash. Copy this submission hash and place it in the JSON file in your home directory, ~/project_ctf.json.

SUBMIT YOUR FIRST FLAG TO MAKE SURE IT WORKS BEFORE CONTINUING

Also, it is a very good idea to submit each flag you get to make sure it works before moving on, in case of any issues.

(Applicable for all flags): If for whatever reason you don’t get a flag and you’re positive you should, try running the exploit once or twice more. The flag generator can have some unexpected behaviors. When in doubt, make a post in Ed Discussion to ‘All Instructors’ and we will assist you if possible.

01_buffer_overflow_

In this task you will learn details about binaries compiled from C code, and how some basic things can be exploited, such as process redirection or control flow hijacking. The steps in this flag are discussed in depth in the intro video.

In this directory you have an executable binary named ‘flag’ which is vulnerable to a buffer overflow in one of its functions. We will be using an exploitation library called pwntools to automate some of the overflow techniques and get the binary to call a function it otherwise wouldn’t have. This function, called ‘call_me’, generates a key using your Gradescope User ID to get a valid flag that will pass the autograder.

First, run the binary just to see what the program is doing:

$ cd ~/project_ctf/01_buffer_overflow_
$ ./flag

The binary asks for a string. Input any text you want or just press enter, and you’ll see that the program does nothing and just exits. That's just to simplify the code so we can focus on the exploit. The binary is statically linked to a shared object which has a lot of methods that construct the key and has a simple function called ‘call_me()’ which will print out your key.

This is where we will start learning about binary file formats. Without going into a deep dive about program structure, operating systems, compilers, assembly language, machine code, etc., you will still be able to understand that there are two aspects that are key in binary exploitation:

  • Data: any collection of bits that represents some kind of data element (like an ASCII character, integer value, pointer, etc.)
  • Addresses: at this scope we can just think of addresses as fully unique identifiers of specific data elements. These are logical locations the computer understands.

A buffer overflow occurs when too much data is fed into an unprotected (or poorly protected) data buffer. In 64-bit C programs, a small number of bytes past the start of a function's local buffer, the stack frame stores a saved return address. When the function returns, that value is loaded into the Instruction Pointer, the register that points to the instruction currently being executed. If we overwrite the saved return address with a valid address, a well-formed attack can manipulate the control flow of the program and have it execute arbitrary (or otherwise unintended) code.

Starting off easy, we are going to modify e.py and learn a few basics of the pwntools library. Open e.py with your favorite text editor and analyze the content and comments. Once you understand what they do, fill in the cyclic size (this number is up to you, based on your understanding of the program and what would break it) and get a segmentation fault message by running:

$ ./e.py dbg

This will open up a gdb terminal with a breakpoint set at main(). Type ‘c’ to continue from the breakpoint (you sometimes need to press ‘c’ twice if you don’t see the error; this is an issue with how gdb attaches to processes).

Note: The handout's screenshot for this step is taken from an Intel-based system. On ARM-based systems (Apple M1), the registers will be different. For more info you can visit the ARM Documentation.
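To illustrate how a cyclic pattern pinpoints the overflow offset, here is a minimal pure-Python sketch. It does not use pwntools itself (pwntools ships its own cyclic() and cyclic_find() helpers). The names cyclic_pattern and offset_from_crash and the example offset of 72 are illustrative assumptions, not values taken from the project binary.

```python
import struct
from itertools import islice
from string import ascii_lowercase


def de_bruijn(alphabet: bytes, n: int):
    """Yield a de Bruijn sequence: every n-byte window occurs at most once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic_pattern(length: int, n: int = 4) -> bytes:
    """Return the first `length` bytes of the pattern (starts b'aaaabaaac...')."""
    return bytes(islice(de_bruijn(ascii_lowercase.encode(), n), length))


def offset_from_crash(value: int, length: int, n: int = 4) -> int:
    """Map the 8 bytes that overwrote the return address back to an offset."""
    raw = struct.pack("<Q", value)  # little-endian, as the bytes sat in memory
    return cyclic_pattern(length, n).find(raw[:n])


# Hypothetical crash: pretend bytes 72..79 of the pattern hit the return address.
pattern = cyclic_pattern(200)
crashed_value = struct.unpack("<Q", pattern[72:80])[0]
print(offset_from_crash(crashed_value, 200))
```

Because every 4-byte window of the pattern is unique, the bytes seen in the debugger at crash time identify exactly how much padding precedes the saved return address.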

Stepping away from the pwntools library for a moment, we now need to find something usable within the binary that will allow us to actually call a function or do something other than just crashing the program. We will use the Linux command ‘objdump’, which takes a binary file and outputs a dump containing some key information about it. The -D flag disassembles the binary, printing addresses, machine code, and assembly code; here we redirect that output into a file:

$ objdump -D flag > flag.asm

Then open flag.asm. You will see a bunch of (likely) confusing information that at a high level translates to the code that you can see in the ‘flag.c’ file. You aren’t going to have to go through this file in any great depth (unless you want to?); we are just going to focus on finding an address within the binary file that holds the machine code responsible for making a function call to ‘call_me()’. Search for the string ‘call_me’ in flag.asm and keep looking until you find the assembly instruction:

  • For Intel/AMD CPUs: call <call_me>
  • For M1/aarch64-based systems: bl <call_me>

For Intel/AMD CPUs: note down the address of the line showing the call (it will be different in your binary).

On M1/aarch64-based systems, the instruction will be a bl rather than a call; note down its address in the same way.

Now open e.py and replace the value in the line

BINARY_ADDRESS = 0xdeadbeefdeadbeef

with the hexadecimal value of the address above (prepend 0x to the value you noted). Now run ./e.py again from the command line (without dbg) and check the terminal output.

● Did you get it? Awesome! Submit your first flag to Gradescope (follow the APPENDIX for more details).
● If not, retrace your steps in this task, and make sure you used the address of the call call_me instruction from the earlier step and not the address of the actual function call_me().
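The search through the disassembly and the payload layout can be sketched in plain Python as follows. This is only an illustration: SAMPLE_ASM is a made-up excerpt in objdump's usual layout, the addresses are hypothetical, and find_call_sites and build_payload are names invented for this sketch. The struct.pack("<Q", ...) call does the same little-endian 8-byte packing as pwntools' p64().

```python
import re
import struct

# Hypothetical excerpt of flag.asm; real addresses will differ in your binary.
SAMPLE_ASM = """\
0000000000401196 <call_me>:
  401196:\t55                   \tpush   %rbp
00000000004011f0 <main>:
  40121a:\te8 77 ff ff ff       \tcall   401196 <call_me>
  40121f:\tb8 00 00 00 00       \tmov    $0x0,%eax
"""


def find_call_sites(asm_text: str, symbol: str) -> list:
    """Return addresses of instructions that call (x86) or bl (aarch64) `symbol`."""
    pattern = re.compile(
        r"^\s*([0-9a-f]+):\s.*\b(?:call|callq|bl)\s+[0-9a-f]+\s+<%s>"
        % re.escape(symbol),
        re.MULTILINE,
    )
    return [int(match.group(1), 16) for match in pattern.finditer(asm_text)]


def build_payload(offset: int, address: int) -> bytes:
    """Padding up to the saved return address, then the new 64-bit target."""
    return b"A" * offset + struct.pack("<Q", address)


call_sites = find_call_sites(SAMPLE_ASM, "call_me")
print([hex(site) for site in call_sites])
```

Note that the function's own label line (`<call_me>:`) is deliberately not matched: the task asks for the address of the instruction that calls call_me, not the address of call_me() itself.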

02_bad_rando

This program (very conveniently) leaks out part of the libc base address.

  • this address is randomized via ASLR, so it will change a little bit every time the program is launched
  • run the program a few times and notice which bytes are different and which ones aren't

The next step is analyzing the C file to see what we are comparing against in order to get to 'call_me'.

  • system() is a libc function; use GDB to get the address of system using 'p system'
  • repeat this a few times and see if you notice a pattern in the address returned for system

Fortunately there's only one byte that is missing from our formula, so we can do some scripting in Python to try out the remaining values.

  • pwntools has a family of functions, recv<line|until|all>(), that let us manipulate the string returned by the program (before we send the payload) and use it to build the input we send in
  • the recv functions return a BYTES object, so you will need to do some clever manipulation of the strings that are returned; this will probably take a few iterations and permutations to get the value in the right format
  • note that the C file uses scanf to read in a hexadecimal number, meaning you don't need to use p64(); you are sending in the STRING REPRESENTATION of a hex number, WITHOUT the '0x' at the beginning, exactly as you would type it on the command line, like 'ffaabbccdd' or 'f701234abcd' etc!
  • your task is going to be:
      • get the value leaked from the program
      • modify it with the offset of the system() function
      • fill in the remaining byte with a random value
      • send to the process
      • (repeat until you get a flag)
  • note: I recommend using recvall() after you send in each payload, and writing your loop logic around the output (see other flags for what kind of string output you can expect) to see if you got the right value!
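The parsing and guessing steps above might look roughly like this in plain Python. Everything specific here is an assumption made for illustration: the format of the leaked line, the position of the missing byte (missing_shift), and the helper names parse_leak and candidates. The sketch also assumes the leaked value has already been adjusted by the system() offset and that the unknown byte is zeroed in it.

```python
from typing import Iterator


def parse_leak(line: bytes) -> int:
    """Turn a received bytes line into an integer.

    Assumes (hypothetically) that the hex value is the last
    whitespace-separated token, with or without a leading '0x'.
    """
    token = line.decode("ascii").strip().split()[-1]
    return int(token, 16)


def candidates(known: int, missing_shift: int) -> Iterator[str]:
    """Yield all 256 guesses for one unknown byte as hex strings without '0x'.

    `known` must have the unknown byte zeroed; `missing_shift` is the bit
    position of that byte (a multiple of 8).
    """
    for guess in range(256):
        yield format(known | (guess << missing_shift), "x")


# Hypothetical leaked line and byte position, for illustration only.
leaked = parse_leak(b"leak: 0x7f0000123400\n")
guesses = list(candidates(leaked, 24))
print(len(guesses), guesses[0], guesses[-1])
```

In a real attempt, each guess would be sent to a fresh process (for example with pwntools' sendline()) and the output collected with recvall() to check whether a flag came back.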

02_p4s5w0rd

STRINGS!

Now it's time to learn a really useful technique to find all the available strings in a program. And by strings, we mean any sequence of printable characters that exists in the binary: things like variable names, hardcoded paths, debug messages, or eeeeevenn.... passwords? Hopefully not in a real program, but you would be surprised.

This binary has zero debugging information and you do not have the source code available, but guess what? The program is written terribly and is very unsafe, with passwords stored in plain text that can easily be dumped/searched in the binary! I would recommend running the program once or twice to see what it's doing (checking a series of responses to questions); if you get every question right, you will get the flag!

To get the strings for the program, run the command:

$ strings flag

This will output it all to the terminal, which isn't super helpful, so I would suggest redirecting the output to a file:

$ strings flag > flag_str

Now you will be able to grep/search/navigate the file in a new terminal and will (hopefully) be able to figure out what the correct responses would be for the given questions. (Hint: strings are stored in the binary in the order that they’re written in the C code, so it might be a good idea to search for the questions being asked; it should be pretty easy to determine the answers from there!) Good luck!
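For intuition about what the strings command is doing, here is a rough Python approximation: it scans raw bytes for runs of printable ASCII at least four characters long. The sample bytes and the name extract_strings are made up for this sketch, and the real tool has more options and slightly different rules (for example, it also treats tabs as printable).

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII (0x20-0x7e) at least `min_len` long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]


# Hypothetical binary contents: two strings separated by non-printable bytes.
sample = b"\x00\x01First question?\x00answer-one\x00ab\x7f\x02"
print(extract_strings(sample))
```

The short run `ab` is dropped because it is under the minimum length, which is how the real tool filters out most accidental matches in machine code.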

03_XORbius

Time to rev up those Reverse Engineering motors, because you need to unravel the logic that this program is checking against in order to get to the call_me() function! No buffer overflow this time; you ‘simply’ need to input the right values that will correctly decode the logic and pass the checks. If you’re unfamiliar with C operators, this TUTORIAL has all the necessary operations detailed. We suggest pen and paper for this one to work through the logic by hand, or a ton of experimentation to get the right value!
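As a reminder of the property that makes XOR-based checks reversible, here is a tiny sketch. The check `(user_input ^ KEY) == TARGET` and both constants are hypothetical and are not taken from the project binary; the real program's logic has to be read from its code.

```python
def solve_xor_check(key: int, target: int) -> int:
    """XOR is its own inverse: if (x ^ key) == target, then x == target ^ key."""
    return target ^ key


# Hypothetical constants, for illustration only.
KEY = 0x5A5A
TARGET = 0x1337
user_input = solve_xor_check(KEY, TARGET)
print(hex(user_input), (user_input ^ KEY) == TARGET)
```

The same idea applies step by step when several operations are chained: undo them in reverse order, starting from the value the check expects.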

03_pointy_pointy_point

We see there is an unsafe() function which has some checks for different local variables. The positioning of these variables is important: they are declared before the input buffer, which means that overflowing the buffer will overwrite their data. You will find additional details on this flag in the readme file of the folder. This program contains a buffer overflow; however, you will not be redirecting the control flow to a specific binary address. Rather, you will need to enter the right values to trick the pointer arithmetic logic and get to the call_me() function.

Pictorially, this is what our crafted exploit needs to look like (remember the stack grows down).

Now we will need to supply the argument, which will be on the stack immediately after our pop gadget. Figure out what that value needs to be, and add it as p64() after the pop gadget.

Then we need to put the address of the function as the next call. Use objdump or gdb to find the addresses (you should probably get the second function address while you’re at it). The call to our pop gadget will ‘ret’ and then hit this second function call to enter one of the unsafe functions.

Finally, we need to finish our execution chain by calling the second function, which will allow for exploitation. Append that address to your chain and see if you get a flag!
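The chain layout described above can be sketched as a sequence of packed 64-bit values. This is a minimal illustration under stated assumptions: the offset, gadget address, argument, and function addresses below are all hypothetical placeholders, build_chain is a name invented for this sketch, and the local p64 helper stands in for the pwntools function of the same name.

```python
import struct


def p64(value: int) -> bytes:
    """Local stand-in for pwntools' p64(): little-endian 64-bit packing."""
    return struct.pack("<Q", value)


def build_chain(offset: int, pop_gadget: int, argument: int,
                first_func: int, second_func: int) -> bytes:
    """Lay out the chain in the order the text describes."""
    return b"".join([
        b"A" * offset,     # padding up to the saved return address
        p64(pop_gadget),   # gadget that pops the next value into a register
        p64(argument),     # the value the gadget pops (the function argument)
        p64(first_func),   # where the gadget's 'ret' lands
        p64(second_func),  # next return target, finishing the chain
    ])


# All numbers here are hypothetical placeholders.
chain = build_chain(40, 0x401111, 0xDEADBEEF, 0x402222, 0x403333)
print(len(chain))
```

Each entry after the padding is consumed by one `ret` (or by the gadget's pop), which is why the order on the stack mirrors the order of execution.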

03_hunt_then_rop

(APPLE_M1_AARCH64)

You’ve made it! You are now on your final task. In this directory is the entire contents of /usr/bin/, a collection of binary files that make up a lot of common Linux commands. One of these files has been overwritten by a vulnerable program. It is your task to figure out which one.

You are given a list of checksum values that are known good, so your first task will be determining the sha256 hash of all of the files in this directory, and then finding the one that does not match. You are free to do this however you would like. NOTE: in your scripting method, ignore the files ‘checksums’ and ‘user.txt’. They will likely report a mismatch, but you can be certain they are not the files in question.

$ cat checksums

Once you find the file, it is time to begin our exploit of that file. This is a bit more complex than the other flags and will require a full ROP (return oriented programming) exploit to chain calls together. We will also need a new tool called Ropper to find a ‘gadget’ in order to supply a function argument and pass a specific check.

In 64-bit programs, functions get their arguments through registers; on aarch64-based ARM CPUs, the x0 register supplies the first function argument. So we need to find a gadget (a piece of code whose address we can overwrite the instruction pointer with, that will perform a certain action and then continue with the control flow hijack) that will pop a value from the stack into the appropriate register. A helpful reference here should show you which registers might be useful to you. We want a gadget that uses the LOAD REGISTER (ldr) operation to access something relative to the stack pointer [sp]. Use ropper like this (see the screenshot for example output):

$ ropper --file flag | grep "ldr"

The call to our pop gadget will ‘ret’ and then hit this second function call to enter one of the unsafe functions. Finally, we need to finish our execution chain by calling the second function, which will allow for exploitation. Append that address to your chain and see if you get a flag!
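The checksum comparison at the start of this task could be scripted in many ways; one possible sketch using only the Python standard library follows. It assumes the checksums file uses the common `sha256sum` layout (a hex digest, whitespace, then a file name), which may not match the actual file, and the names sha256_of and find_mismatches are invented for this example.

```python
import hashlib
import os
from typing import List, Sequence


def sha256_of(path: str) -> str:
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(directory: str, checksums_text: str,
                    ignore: Sequence[str] = ("checksums", "user.txt")) -> List[str]:
    """Return names whose current sha256 differs from the known-good list.

    Assumes each line looks like '<hex digest>  <file name>'.
    """
    mismatches = []
    for line in checksums_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        expected, name = parts
        name = os.path.basename(name.lstrip("*"))
        if name in ignore:
            continue
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or sha256_of(path) != expected.lower():
            mismatches.append(name)
    return mismatches
```

A typical use would be to read the checksums file into a string, call find_mismatches(".", text) from inside the task directory, and inspect whichever name is returned.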

Rubric

This project is worth 15% of your grade. There are a total of 110 points available for this project. If you complete all flags and get all 110 points, you get an extra 10% of the project applied to your grade, for an effective extra credit of 1.5% of the final course grade.

Flag                        % grade

00_intro                    0
01_basic_overflow_1         10
01_basic_overflow_2         10
02_assemble_the_assembly    15
02_bad_rando                15
02_p4s5w0rd                 15
02_the_server_client_one    15
03_hunt_then_rop            10
03_pointy_pointy_point      10
03_XORbius                  10

Total % Possible            110
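The extra-credit arithmetic in the rubric can be checked with a few lines of Python. This sketch assumes that 100 points corresponds to the full 15% project weight, which is how the stated 1.5% figure works out; the function name course_contribution is made up for the example.

```python
def course_contribution(points: float, project_weight: float = 15.0,
                        base_points: float = 100.0) -> float:
    """Percentage of the final course grade earned from the project."""
    return project_weight * points / base_points


full_marks = course_contribution(100)  # the nominal 15%
all_flags = course_contribution(110)   # all 110 points
print(full_marks, all_flags, all_flags - full_marks)
```

The difference between the two results is the effective extra credit from completing every flag.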