One limitation of most alphanumeric shellcode decoders, including those in ALPHA2 and the soon-to-be-released ALPHA3, is that they need to know where they are located in memory in order to decode themselves and run correctly. This makes a nopslide hard to use in most circumstances, because you typically only need a nopslide when you do not know exactly where your shellcode is in memory to begin with.
Countslide GetPC is a new technique that I developed which lets you use a nopslide and still determine exactly where your shellcode is, provided you can roughly predict where it will be located in memory.
Given a range of addresses Amin to Amax in which you predict your shellcode will start, we calculate the average address Aavg and the maximum absolute deviation Dmax like so:
Aavg == (Amin + Amax) / 2
Dmax == (Amax - Amin) / 2
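The two formulas can be checked with a few lines of Python. This is only a sketch: the function name `predict_center` and the example range are made up for illustration, and it assumes the range width is even so that both values are whole addresses.

```python
def predict_center(a_min, a_max):
    """Return (a_avg, d_max) for the predicted range a_min..a_max."""
    if a_max < a_min:
        raise ValueError("a_max must not be below a_min")
    span = a_max - a_min
    if span % 2:
        # Assumption of this sketch: an even width keeps Aavg an integer.
        raise ValueError("range width must be even")
    d_max = span // 2          # Dmax == (Amax - Amin) / 2
    a_avg = a_min + d_max      # same value as (Amin + Amax) / 2
    return a_avg, d_max


# Hypothetical range, chosen only for illustration.
a_avg, d_max = predict_center(0x0C0C0000, 0x0C0C1000)
print(hex(a_avg), hex(d_max))
```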
If the nopslide has length Dmax * 2 and starts at any address in this range, a return address of Aavg + Dmax will always land in the nopslide, so the code at the end of the nopslide gets executed:
[Diagram: the nopslide base can lie anywhere from Aavg - Dmax to Aavg + Dmax; the return address is fixed at Aavg + Dmax.]
D = -Dmax: [Nopslide][code], O = 2 * Dmax
D = X: [Nopslide][code], O = Dmax - X
D = +Dmax: [Nopslide][code], O = 0
In this example, the actual deviation D of the nopslide base from Aavg determines where in the nopslide the exploit ends up jumping to. The base address of the nopslide Anop plus the offset O in the nopslide where execution starts equals the return address Aavg + Dmax:
Anop + O == Aavg + Dmax
Because Aavg and Dmax are values we predict, we can calculate the base address Anop of the nopslide if we can calculate O. And because we know the length of the nopslide is Dmax * 2, we can calculate the base address Apatcher of the code that follows the nopslide as well:
Anop == Aavg + Dmax - O
Apatcher == Aavg + Dmax * 3 - O
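To see the algebra at work, here is a small Python sketch that takes a nopslide base as input and derives O and Apatcher from it. The real shellcode of course does not know Anop; the function `locate` and all the numbers are hypothetical and only serve to check the equations above.

```python
def locate(a_avg, d_max, a_nop):
    """Return (offset, a_patcher) for a nopslide that really starts at a_nop."""
    return_address = a_avg + d_max
    offset = return_address - a_nop          # O, from Anop + O == Aavg + Dmax
    if not 0 <= offset <= 2 * d_max:
        raise ValueError("a_nop lies outside the predicted range")
    a_patcher = a_avg + 3 * d_max - offset   # Apatcher == Aavg + Dmax * 3 - O
    return offset, a_patcher


A_AVG, D_MAX = 0x0C0C0800, 0x800             # hypothetical values
for a_nop in (A_AVG - D_MAX, A_AVG + 0x123, A_AVG + D_MAX):
    offset, a_patcher = locate(A_AVG, D_MAX, a_nop)
    # The patcher always sits right behind the nopslide.
    assert a_patcher == a_nop + 2 * D_MAX
    print(hex(a_nop), hex(offset), hex(a_patcher))
```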
So, any address Aavg + Dmax * 3 + X will be in the code that follows the nopslide at offset O + X (if that code is large enough). We can choose to overwrite a byte at that address to modify the code following the nopslide. Which byte of the code gets modified depends entirely on the value of O. This means that the value of O can directly influence what our code does and this is what we use to calculate the value of O.
A small piece of code of length P, which I will call the patcher, is put after the nopslide, followed by a second nopslide of length Dmax * 2, which I will call the countslide. When executed, the patcher overwrites a byte in the countslide at address Aavg + Dmax * 3 + P (the modification address), which is always inside the countslide, at offset O from its start. Here’s an example:
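[Diagram: memory layout of nopslide, patcher and countslide.]
Nopslide: Anop to Anop + Dmax * 2
Patcher: Anop + Dmax * 2 to Anop + Dmax * 2 + P
Countslide: Anop + Dmax * 2 + P to Anop + Dmax * 4 + P
Return address: Anop + O == Aavg + Dmax
Modification address: Anop + O + P + Dmax * 2 == Aavg + Dmax * 3 + P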
The countslide will consist entirely of one-byte INC ECX instructions. The patcher will overwrite one byte at the predictable address Aavg + Dmax * 3 + P with a one-byte POP ECX instruction. It then stores the predictable value Aavg + Dmax * 3 + P + 1 on the stack, after which the countslide is executed.
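The effect of the patch can be sketched in Python. On 32-bit x86, INC ECX encodes as the single byte 0x41 ("A") and POP ECX as 0x59 ("Y"), so the countslide is a run of "A" characters with one "Y" in it. The function name and the numbers below are hypothetical. The sketch rejects the boundary case O == Dmax * 2, where the computed index would be one past the last countslide byte.

```python
INC_ECX = 0x41  # one-byte x86 "inc ecx"
POP_ECX = 0x59  # one-byte x86 "pop ecx"


def patched_countslide(a_avg, d_max, patcher_len, a_nop):
    """Return the countslide bytes after the patcher has written POP ECX."""
    countslide_base = a_nop + 2 * d_max + patcher_len
    modification_address = a_avg + 3 * d_max + patcher_len
    index = modification_address - countslide_base   # equals O
    slide = bytearray([INC_ECX] * (2 * d_max))
    if not 0 <= index < len(slide):
        raise ValueError("modification address falls outside the countslide")
    slide[index] = POP_ECX
    return slide


# Hypothetical numbers: Aavg = 0x1000, Dmax = 8, P = 5, actual deviation +3.
slide = patched_countslide(0x1000, 8, 5, 0x1000 + 3)
print(slide.decode("ascii"))   # prints AAAAAYAAAAAAAAAA
```

With a deviation of +3 the offset O is 5, so the POP ECX lands on the sixth byte of the countslide.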
Here is what will happen after the exploit makes execution jump to address Aavg + Dmax in the nopslide:
- the nopslide executes until it reaches the patcher,
- the patcher modifies the countslide at Aavg + Dmax * 3 + P,
- the patcher saves the value Aavg + Dmax * 3 + P + 1 on the stack, after which the countslide is executed,
- the countslide increments ECX over and over, acting like a normal nopslide, until it runs into the patched POP ECX,
- the POP ECX instruction pops the value Aavg + Dmax * 3 + P + 1, saved there by the patcher, off the stack into ECX,
- the countslide then continues to increment ECX for every one-byte instruction it executes, until it reaches its end.
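The steps above can be modelled with a toy interpreter in Python. This is a sketch under stated assumptions, not an x86 emulator: it only understands the two one-byte opcodes 0x41 (INC ECX) and 0x59 (POP ECX), it treats the nopslide and the patcher as black boxes, and the function name and numbers are hypothetical.

```python
INC_ECX, POP_ECX = 0x41, 0x59   # one-byte x86 opcodes


def run_countslide(a_avg, d_max, patcher_len, a_nop):
    """Model the steps above; return (ecx, eip) when the countslide ends."""
    countslide_base = a_nop + 2 * d_max + patcher_len
    countslide_end = countslide_base + 2 * d_max
    modification_address = a_avg + 3 * d_max + patcher_len

    # Nopslide: the return address must land in it (or right at the patcher).
    if not a_nop <= a_avg + d_max <= a_nop + 2 * d_max:
        raise ValueError("return address misses the nopslide")

    # Patcher: overwrite one countslide byte and push the known value.
    memory = {addr: INC_ECX for addr in range(countslide_base, countslide_end)}
    if modification_address not in memory:
        raise ValueError("boundary case O == Dmax * 2 is not modelled")
    memory[modification_address] = POP_ECX
    stack = [modification_address + 1]

    # Countslide: ECX holds an unknown value until the POP replaces it.
    ecx = 0xDEADBEEF
    eip = countslide_base
    while eip < countslide_end:
        if memory[eip] == POP_ECX:
            ecx = stack.pop()
        else:
            ecx = (ecx + 1) & 0xFFFFFFFF
        eip += 1
    return ecx, eip


# Hypothetical numbers; every base with O < Dmax * 2 ends with ECX == EIP.
A_AVG, D_MAX, P = 0x1000, 8, 5
for a_nop in range(A_AVG - D_MAX + 1, A_AVG + D_MAX + 1):
    ecx, eip = run_countslide(A_AVG, D_MAX, P, a_nop)
    assert ecx == eip == a_nop + 4 * D_MAX + P
```

The initial ECX value is arbitrary on purpose: whatever was counted before the POP ECX is discarded, so only the increments after it matter.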
The number Ninc of INC ECX instructions executed in the countslide after the POP ECX depends on Dmax and O as follows:
Ninc == Dmax * 2 - O - 1
So, taking into account that the POP ECX sets ECX to Aavg + Dmax * 3 + P + 1, after the countslide has completely been executed, the value in ECX will be:
ECX == Aavg + Dmax * 3 + P + 1 + Ninc
ECX == Aavg + Dmax * 5 + P - O
And because Anop + O == Aavg + Dmax, this means the value in ECX is:
ECX == Anop + Dmax * 4 + P
Which, as you can see in the second diagram above, is exactly where our countslide ends, so at this point ECX == EIP. The countslide is followed by the shellcode, which can use ECX as the source of its base address.
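As a sanity check on the closed form, the following Python sketch evaluates the formula for Ninc and the resulting ECX for every entry offset O below Dmax * 2 and compares it with the end of the countslide. The function name and the constants are hypothetical. The offset O == Dmax * 2 is left out because Ninc would be negative there.

```python
def ecx_after_countslide(a_avg, d_max, patcher_len, offset):
    """Evaluate the closed-form ECX value for a given entry offset O."""
    n_inc = 2 * d_max - offset - 1            # Ninc == Dmax * 2 - O - 1
    popped = a_avg + 3 * d_max + patcher_len + 1
    return popped + n_inc                     # Aavg + Dmax * 5 + P - O


A_AVG, D_MAX, P = 0x1000, 8, 5                # hypothetical values
for offset in range(0, 2 * D_MAX):
    a_nop = A_AVG + D_MAX - offset            # Anop == Aavg + Dmax - O
    countslide_end = a_nop + 4 * D_MAX + P
    assert ecx_after_countslide(A_AVG, D_MAX, P, offset) == countslide_end
```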
*UPDATE*: ALPHA3 comes with a working version of Countslide mixedcase alphanumeric ascii GetPC for x86.


7 Comments to “Countslide alphanumeric GetPC”
2010/01/03
I suspect that i’m being overly dense here…
You assume that your nop sled will reside in a predictable sub-region of [Amin, Amax] and based on this you can calculate the fixed address of the countslide (Aavg + Dmax * 3 + P). this effectively means you are able to pre-compute the location of your actual shellcode to begin with, since the address of the patcher and the following countslide is a constant. if this is true, why not just replace the patcher with a bit of code that directly initializes ECX to the effective address of the shellcode, since in essence this seems to be what you’re doing. I can understand that this value may need to be encoded somehow to be alpha compatible, although it seems like the patcher already has to account for this in your example considering that it pushes this value to the stack.
what is the benefit of the added complexity that you describe above?
2010/01/03
I need to do all these elaborate tricks to calculate the exact value of Anop, not Aavg; the latter is the predictable average address where the shellcode will likely be, but it doesn’t say much about the exact location of the shellcode when it gets executed. The former is what you need to know; it is the exact location where the shellcode starts for each try and it changes with each try. This is a value predicted to be in the range [Amin, Amax]. The countslide trick is the only way that I know to calculate it.
2010/01/10
i guess what i’m saying is if you can predict that code execution will start somewhere within [A_min, A_max], then you can assume the address of EIP is A_max and then append the “patcher” stub starting at A_max + 1. This is because A_min and A_max are fixed addresses per your description above.
in other words, you fill [A_min, A_max] with your NOP sled, followed by the “patcher” piece which is really just a fixed geteip that stores the address of A_max + lengthof(patcher) in the desired register, followed by the decoder stub.
what is the downside of doing it this way over what was described above?
2010/01/10
We are working under the assumption that you cannot know EXACTLY where your shellcode will start. If you did, you would not need a nopsled at all. Since we do not know where the shellcode starts, we do not know where in the shellcode to put the patcher to have it at an exact location. We can only make sure it is in a certain region.
2010/01/11
ah yes, what you say makes sense. i was thinking about it a different way (aka the wrong way). that is indeed a clever approach, nice work!