Hacking/Shellcode/Restricted instruction set
From Skypher
|
▼Main Page |
|
Some vulnerabilities required a shellcode that can pass through heavy filtering before being run in order to be succesfully exploited. Data such as filenames, paths, urls, etc... get checked for illegal characters before being processed by an application. Filters that remove or convert certain characters can make exploitation really difficult, but not always impossible. On two seperate phrack articles, rix and obscou have already proven that it is possible to write working alphanumeric and unicode shellcode. My shellcode encoder ALPHA proves that it is possible to write completely upper- and/or lowercase, alphanumeric ascii shellcode and even alphanumeric unicode shellcode. This article explains how I developed the decoders that ALPHA uses to make working alpahnumeric code and provides ideas for a more universal solution to the problem of working with a restricted instruction set.
When coding assembler using a very limited instruction set, you'll often think that you cannot a certain thing, like set a register to NULL. I found that quite a few times you're just overlooking the very unobvious. Sometimes you will have to write a page of code to do what you could have done with one instruction (assuming that instruction is not in your limited instruction set). A good example of this is my SEH GetPC code; it uses a large number of instructions to do what a simple CALL/POP (6 bytes) could have done, if only the CALL instruction could be encoded using alphanumeric characters only.
Requirements
The techniques discussed in this article should work on any IA-32 machine, independent of the Operating System it is running. You will need to have access to an assembler or a compiler that can turn assembler into machine code and/or a debugger for creating and testing your code.
A generic solution to encoding code
The problem with filtered input is that the number of possible values that a byte is allowed to have is less then 256. We do not yet know what the code we want to execute is going to be. We must therefore assume that it will contain all 256 possible values a byte can normally have. We are looking for generic solution for turning ANY code into code that contains only those values for bytes that are allowed. My solution is twofold:
- Create a decoder that can pass through the filter and work.
- Encode the original code as data that can pass through the filter, such that the decoder can regenerate it when it gets run.
The size of the resulting code is the size of the decoder plus the size of the encoder original code. In order to keep the size small, you should keep the decoder simple and the encoded data packed efficiently. Unfortunately, there is a trade of between these two sizes; to denser you want to pack your encoded data, the more complex a decoder will often need to be. So smaller data often means a larger decoder. It makes sense to create a simple decoder first to see if it can be done and then consider your options for increasing the desity of your encoded data. Because we are encoding shellcode, which is often very small, there is not a lot of data to encode. This means that the encoded data is likely to be small and any increase in the data density will not make a huge impact on the overall size.
Data density
For instance, there are 62 alphanumeric characters, but ALPHA only use 4 bits to encode data (16 values, 67% efficiency). Assuming you are encoding 200 bytes of shellcode, (200*8 bits), you could theoretically encode your data in 269 bytes, if your data desity was optimal. Instead, it is encoded into 400 bytes. You could decrease the size of the encoded data by up to 131 bytes by using a more optimal encoding. However, you will need to create a new decoder that can decoder it, which will very likely be more than 131 bytes larger than the simple decoder. The end result will therefore be larger in total.
Very low data density
If the encoded data contains a very limited set of bytes, you might consider using a decoder table, for instance: if your shellcode uses 15 different characters, you might consider replacing each one of these bytes in the shellcode with a valid character. The decoder creates a lookup table where every encoded byte translates back to the original. But since this is a very specific case, which doesn't happen very often, I won't be discussing this in detail.
Decoder loops
I refer to a "decoder loop" when I'm talking about a piece of assembler code that can decoded any number of encoded bytes to the original data byte by byte. All my decoders follow these simple steps (but not necessarily in this order):
.-> |1. Read encoded data (input)| L |2. Decode it| O |3. Store the result (output)| O |4. Move to the next piece of data| P |5. Check whether it's reached the end of the input`--'|6. Jump to step 1 if not.V(decoding finished)
Some of these steps can be done using one instruction, some will take more and sometimes two or more steps can be combined into one or more instruction(s). It all depends on which instructions you have available, how you en-/decode your data and how you plan to make all this work. I'll discuss each step separately and give some examples of how to implement them and how to combine them.
