Installing the Capstone disassembler (with commentary)

The premise

I came across Capstone while trying to build a simple tool for binary analysis, just to have some fun with homegrown tooling. The first module I implemented was a parser which reads through a PE (.exe) file and walks the header to find where the sections, tables, and other metadata live in the file. Knowing where the sections, the data, and the code are, I was thinking: let's find the basic blocks and plot the CFG. What would that need? Understanding the binary code. It looks simple: just make a big hashmap mapping bytes to mnemonics and operands, right? Then I realized this actually is not that simple — the ISA we're talking about is not small — and along the way I found Capstone, which does what I needed here, but faster, for virtually all architectures, and with helper functions out of the box: point it at the beginning of a code section, and it disassembles all the way through.

So Capstone is a framework for disassembly: it helps you disassemble bytes into human-readable instructions for [almost] any ISA on [almost] any platform. You can view the full feature list on their website, but the bottom line is that this tool is ubiquitous in reversing tools: their website lists 402 projects in which Capstone has been used, among them radare2, which you might recall from another post.

Another feature is that there are bindings for many languages: Go, JavaScript, .NET (C#), Java, Python, Haskell, Rust, PHP, and more. This means you can easily integrate it into whatever else you are doing, in any language — a big bonus. In this tutorial let's first get it to work on Windows with C (this is the toughest option; if you want to use it in JavaScript, there's no hitch: just grab the minified js and you can use it, couldn't be easier).

Spending a few hours stuck trying to get Capstone playing nice with Visual Studio led me to learn more about how the C runtime works in the Windows world, and that is useful for other Visual Studio endeavours as well: say you want to do a project with OpenGL and need glm or freeglut (or really any other C-related project), you will see a lot of this. Another benefit of looking into Capstone's repository is that Capstone is a well-crafted tool. Writers say the best way to learn writing is to read good prose; similarly, Capstone conforms to good coding practices, and reading its code is instructive. It is also a relatively big C project that works on Linux, Windows, and macOS, so it is worth checking out if you are interested in making a multi-platform C project — a class in using CMake. Another feature of Capstone is that you can build your own customized version, trimming the features you do not need, which again is a pattern that is very useful in various projects.

Windows - Staring MSVC in the eye

On Windows, the problem is that building C with MSVC has its quirks. Trying out the sample test from the website, we set up a new console project in Visual Studio, and then, of course, we need to tell Visual Studio how to actually use Capstone: where to look for the headers and libraries.
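In case you have not seen it, the sample boils down to cs_open, cs_disasm, and cs_close; a minimal version looks roughly like this (the architecture and the byte string are only an example, so treat it as a sketch rather than the official sample verbatim):

#include <stdio.h>
#include <inttypes.h>
#include <capstone/capstone.h>

int main(void)
{
	/* a few x86-64 bytes: push rbp; mov rax, qword ptr [rip + 0x13b8] */
	const uint8_t code[] = "\x55\x48\x8b\x05\xb8\x13\x00\x00";
	csh handle;
	cs_insn *insn;
	size_t count;

	if (cs_open(CS_ARCH_X86, CS_MODE_64, &handle) != CS_ERR_OK)
		return 1;

	/* disassemble everything, pretending the code lives at 0x1000 */
	count = cs_disasm(handle, code, sizeof(code) - 1, 0x1000, 0, &insn);
	if (count > 0) {
		for (size_t j = 0; j < count; j++)
			printf("0x%" PRIx64 ":\t%s\t%s\n",
			       insn[j].address, insn[j].mnemonic, insn[j].op_str);
		cs_free(insn, count);
	}
	cs_close(&handle);
	return 0;
}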

The Windows distribution, when extracted, includes a folder called include where the headers are, plus .lib files, a .dll, and cstool.exe, which is a command-line tool. You can do the following (this is right from the README file in the cstool directory):
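Something along these lines — the byte string here is just an example of mine, so check the README for the exact invocation it shows:

> cstool x64 "55 48 8b 05 b8 13 00 00"
> cstool -d x64 "55 48 8b 05 b8 13 00 00"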

The _-d_ option shows details. Pretty cool for a quick query. Moving on: the .lib files carry the import information the linker uses to build the IAT (Import Address Table), which tells where to find the different functions in the .dll; the .dll contains the actual code and is loaded at runtime. So with no .lib, the linker spews errors along the lines of "I need this function but I cannot find it"; with no .dll, the program tries to find the .dll at load time and fails; and if there is a .dll but it is not compatible with the .lib, function resolution through the import table goes wrong and you hit some weird errors at runtime instead.

First we tell Visual Studio (I am on 2017, but instructions would be similar) where that include folder is located [my project is named caps]: Go to Project > caps Properties…

As can be seen, you just head to C/C++ > General, where there is a field for Additional Include Directories, and add the directory of the include folder (containing the relevant headers); don't forget the separating delimiter _;_. Now we put capstone.dll and the .lib in the project directory. By default Visual Studio looks for DLL files there, but you can also specify the location under the VC++ Directories tab instead of putting the DLL in the project directory: you might want to be fancy and keep the .lib and .dll files in separate folders rather than the root of the project. You can also put them anywhere, but it is better if they do not live outside your project folder, e.g. imagine you want to release the code.

And finally we specify what .lib files we need using: Linker > Input > Additional Dependencies

It should now build without error, right? That's what I thought, but I got a ton of errors like LNK2001 unresolved external symbol vsnprintf. After searching the names showing up, I realized the linker was having problems even with memcpy; on Linux these are all part of libc, the [shared] library for standard C. So we probably have a .lib problem here. TL;DR: to get the C runtime, it turns out we need to add these .lib files as well (the same way as before): _ucrt.lib; libvcruntime.lib; libcmt.lib;_

Adding these should fix the problem. You can see some details here, but basically ucrt.lib is the Universal CRT: back in the old days it was organized differently, but in Visual Studio 2015 Microsoft refactored the CRT; the standard C library, POSIX extensions, and some Microsoft-specific macros and variables were moved into the UCRT, and the compiler-specific parts (as opposed to the Universal in UCRT) were moved to libvcruntime.lib and libcmt.lib. (LIBC[D].LIB, LIBCMT[D].LIB, and MSVCRT[D].LIB are all similar: MT is the multithreaded static version, the D suffix marks the debug builds, and MSVCRT is the import library for the DLL version of the runtime, which is not statically linked and thus also requires the corresponding DLL.) You can work out which symbols live in which library by omitting that lib and trying to build:

The better way, though, is to use Microsoft's DUMPBIN tool to see info on each library. As an example, some runtime check functions (you can spot them by the RTC in their names) are in libcmt.lib, while the famous free, malloc, strlen, … we know and love are in ucrt.lib. And if you see weird long symbols, particularly with at-signs in the name, that is due to name mangling.
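For example, from a Developer Command Prompt (so that dumpbin is on the PATH) you can grep a library for a symbol roughly like this; the exact symbol names you will see depend on your toolchain version:

> dumpbin /LINKERMEMBER libcmt.lib | findstr RTC
> dumpbin /LINKERMEMBER ucrt.lib | findstr malloc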

Linux - Goodbye .dll ; hello .so

On Linux things are easier, really; the fastest way is to just build from source: get the GitHub repo and (as the instructions say):

$ sudo ./make.sh 
$ sudo ./make.sh install

And you are good to go. If you prefer not to build from source for some reason, then use the packaged versions:

$ sudo apt-get install libcapstone-dev

This is the development package and pulls in libcapstone3 as a dependency, but you can also apt-get libcapstone3 directly if you wish. At the time of writing, the Capstone website is outdated and instructs you to apt-get libcapstone2, but that package is now gone (you won't find it) and you should use version 3 instead. (If you build from source you are getting the latest, which is version 4.)

There are examples, if you have got the repo from GitHub, under capstone/tests/, and you can make them and try out the generated binaries. Reading the Makefile there will give you an idea of how it works, but it is a bit involved, so if you are in a separate directory and just trying to compile your own code, do this:

$ gcc test.c -lcapstone

The _-l[LIBRARY_NAME]_ switch tells the compiler (actually the linker, ld) that you are linking against the [LIBRARY_NAME] library (here capstone) which you have installed. If you do not specify this option, you will not have a problem with the header — gcc knows where to find it — but the linker will spew undefined-reference errors. Compared to Windows, this takes the place of all that .dll and .lib work we did. Using:

$ dpkg -L libcapstone3

You can view which files were installed where (the -L switch lists them), and you can see that installation puts Capstone files in directories like /usr/share, /usr/lib, … (one for headers, one for shared libraries, one for docs, etc.), so when you specify -lcapstone, gcc looks for libcapstone.so or libcapstone.a in these directories. Slick!

Post by: Iman Hosseini

An excursion in evaluation of binary analysis tools

As outlined in a previous blog post, there are multiple tools for binary and source analysis. They are crafted with slightly different motivations and considerations by their respective developers and thus it is not trivial or well-defined to generally evaluate them against each other.

If we really want a comparative evaluation, though, then to make it meaningful we need to identify a feature supported by all the tools under consideration. Generating a callgraph is a baseline feature every tool supports, and the idea is simple: the tool analyzes the code (be it binary or source), identifies the functions, and, if a function foo can call bar, adds an edge foo->bar to the callgraph.

The test programs I used were just a bunch of functions which call each other. The main was empty, but that suffices, since all that matters is the callgraph, i.e. how the defined functions call each other. To give that some structure I generated hypercubes with them: for an N-hypercube, for i from 0 to 2^N - 1, name the functions name(i) = "f" + [binary representation of i], so for N = 2 we have f00, f01, f10, f11. Now treat each binary representation as coordinates in an N-dimensional space; each function corresponds to a vertex of a hypercube, which for 2D is a regular square and for 3D a (normal) cube. How are the vertices connected? Two points are connected if their Hamming distance is 1 (i.e. they differ in exactly one coordinate).
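To make the construction concrete, here is a small sketch of such a generator. It is an illustrative reimplementation, not the actual generator from the Benchmark-Utils repo linked at the end of this post; it prints a C program whose callgraph is an N-hypercube:

/* Sketch: emit a C program whose callgraph is an N-hypercube.
 * f<bits(i)> calls f<bits(j)> for every neighbor j obtained by flipping a
 * 0-bit of i to 1, so each hypercube edge shows up as exactly one call.
 * main() stays empty on purpose: the functions only need to exist for the
 * static callgraph, they are never executed. */
#include <stdio.h>

#define N 3 /* hypercube dimension */

static void emit_name(unsigned v)
{
	printf("f");
	for (int b = N - 1; b >= 0; b--)
		printf("%u", (v >> b) & 1u);
}

int main(void)
{
	unsigned total = 1u << N;

	/* forward declarations so definition order does not matter */
	for (unsigned i = 0; i < total; i++) {
		printf("void ");
		emit_name(i);
		printf("(void);\n");
	}
	/* definitions: one call per Hamming-distance-1 neighbor "above" i */
	for (unsigned i = 0; i < total; i++) {
		printf("void ");
		emit_name(i);
		printf("(void) {\n");
		for (unsigned b = 0; b < N; b++) {
			if (i & (1u << b))
				continue;
			printf("\t");
			emit_name(i | (1u << b));
			printf("();\n");
		}
		printf("}\n");
	}
	printf("int main(void) { return 0; }\n");
	return 0;
}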


FIG1. 10 dimensional graph, rendered using GraphStream



Hypercubes admit nice symmetries and are aesthetically pleasing, but there is a downside: for our test we want to cover programs of different sizes, and we would fancy a way to dial in whatever size we want. With hypercubes we do not have such a gauge; note that 'size' here means the number of edges, which for an N-cube is N·2^(N-1), so we can only get programs of sizes 1, 4, 12, 32, 80, … Not only can we not make programs of arbitrary size, the achievable sizes are not even evenly spaced, and the gaps between them diverge as N increases. The title image is a glimpse of one of the callgraphs generated by bap. It is very easy with bap:

$ bap executable -dcfg > output.dot

Due to the symmetry of the hypercubes, it is very easy to check the resulting callgraphs: if we have the correct number of edges, and each edge is between functions at Hamming distance 1, the callgraph is correct. But to have a better tunable measure for performance I generated programs using another model: an (n, p) program has n functions, and for each ordered pair of functions there is a probability p that one calls the other, so the expected number of edges is n·(n-1)·p ≈ n²·p. Now we have two parameters to tune the size.
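A sketch of a generator for this model, again only illustrative (n and p are read from the command line here; the real generator in the repo may differ):

/* Sketch of the (n, p) model: n functions; each ordered pair (i, j), i != j,
 * becomes a call f_i -> f_j with probability p, so expected edges ~ n*(n-1)*p.
 * rand() is left unseeded, so the output is deterministic across runs. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	int n = argc > 1 ? atoi(argv[1]) : 100;      /* number of functions */
	double p = argc > 2 ? atof(argv[2]) : 0.05;  /* call probability    */

	for (int i = 0; i < n; i++)
		printf("void f%d(void);\n", i);
	for (int i = 0; i < n; i++) {
		printf("void f%d(void) {\n", i);
		for (int j = 0; j < n; j++)
			if (j != i && (double)rand() / RAND_MAX < p)
				printf("\tf%d();\n", j);
		printf("}\n");
	}
	printf("int main(void) { return 0; }\n");
	return 0;
}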

One measure of performance would then be the time it takes a tool to produce the callgraph. But time is variable. We can average over a number of trials to get a more stable measure, but it would still differ from system to system and depend on the state of the system at runtime. Valgrind is a tool which lets you count machine instructions instead: the instruction count does not depend on the runtime state, and much less on the system, since for the same architecture the same compiler generates the same machine code.
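Concretely, an instruction count for one run can be collected with callgrind, which ships with valgrind; ./the-tool and ./testprogram below are placeholder names:

$ valgrind --tool=callgrind --callgrind-out-file=callgrind.out ./the-tool ./testprogram
$ callgrind_annotate callgrind.out | head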


FIG2. Radare2 takes a linear number of instructions (and thus roughly linear time) to find the callgraph



FIG3. Radare2 vs BAP (yellow), x: number of edges | y: instruction count


For Radare2, as shown in FIG2 (a plot of instruction count against the number of edges), the number of instructions (which is itself roughly linearly related to time) increases linearly with the number of edges (each edge is a function call inside the body of another function), so it seems that the underlying implementation in Radare2 is O(N), where N is the number of calls/edges.
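For reference, one way to pull the callgraph out of radare2 is shown below; the graph commands have changed across radare2 versions, so treat agCd (run the analysis, then dump the global callgraph as dot) as an assumption to verify against your build rather than a guaranteed invocation:

$ r2 -q -A -c 'agCd' ./testprogram > callgraph.dot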

When comparing Radare2 with bap (FIG3), we see that even though both are linear, radare2 is much faster, and for bap there is a threshold after which the slope bumps up. Also, rather peculiarly, if you feed bap very large code (more than about 100 functions) you get false positives: bap detects functions which do not exist. We should only be getting edges like "f12 -> f18", but we suddenly begin to get bogus ones too, like "sub_4064ab -> f308".

I tried various ansatzes to find out the reason behind this (radare2 does not show this artifact), but it seems totally arbitrary: even the functions which get messed up are not at the end or the beginning, i.e. out of 200 functions, f121 is messed up while f200 is fine. I discussed it with Ivan Gotovchits, the main developer of bap, and he said it is just a false positive of some kind and that it might be fixed by turning off byteweight — which did not fix it.

The code for the C code generator is available at: https://github.com/S4Lab/Benchmark-Utils

Post by: Iman Hosseini

Coffee in S4Lab

Photos by Parmida Vahdatnia

It has now been roughly 8 months that I have been part of S4Lab, and during this time I have learned about the various qualities that make S4 great. Recently there has been a new addition which I am especially fond of: we've got a full-stack coffee framework.

Grinding the beans in-house means that you get the aroma of coffee welcoming you to the lab and a fresh cup of coffee is the perfect beginning for a day of work. And if you are thinking decaf…

Honoré de Balzac, famous novelist and playwright, on coffee: “This coffee falls into your stomach, and straightway there is a general commotion. Ideas begin to move like the battalions of the Grand Army of the battlefield, and the battle takes place. Things remembered arrive at full gallop, ensuing to the wind. The light cavalry of comparisons deliver a magnificent deploying charge, the artillery of logic hurry up with their train and ammunition, the shafts of which start up like sharpshooters. Similes arise, the paper is covered with ink; for the struggle commences and is concluded with torrents of black water, just as a battle with powder.” Only for us, instead of similes it is data structures arising, and instead of paper we have emacs. [Editorial Note: vim would have been/is a much better selection than emacs]

Post by: Iman Hosseini

Automated Binary Analysis (a short intro)

Automated program analysis is the practice of analyzing computer programs using [other] programs, as opposed to code audits by a developer. The objective can be to find security vulnerabilities, to optimize, or to reverse engineer an obfuscated program and learn about its flow. There are benefits to this approach. It scales to the huge number of programs out there, which would be impossible to audit "manually". Code auditing is a skill that is not easily gained, so a developer good enough for the job does not come cheap. Also, online and live monitoring of code might be a necessity.
Automated program analysis can be implemented at two levels: static analysis is concerned with analyzing code at compile time, without actually executing the program, while dynamic analysis executes the program and takes the runtime into account as well. Also, the input can be the source code of the program or the compiled binary. Each approach has its benefits and disadvantages. Inferring the callgraph of a program, tracking API calls, and recovering the control flow graph of the program are among the typical tasks of analysis platforms.
A technique used for runtime analysis of programs is symbolic execution, which runs through the program assuming symbolic (rather than concrete) values for the variables to gain further knowledge about the flow of the program.

A concrete example


Trail of Bits sketches a way to detect Heartbleed. It is based on a characterization of the vulnerability via calls to ntoh and hton, which can taint a variable, followed by a call to memcpy that is passed a tainted value without bound-checking.
To this end, they used the Clang Analyzer. As part of LLVM, there is a tool to do static analysis of programs: scan-build. To perform a custom analysis, everything goes into a C++ class that registers as a plugin. Here's how the code looks:

void NetworkTaintChecker::checkPostCall(const CallEvent &Call,
                                        CheckerContext &C) const {
  const IdentifierInfo *ID = Call.getCalleeIdentifier();
  if (ID == NULL) {
    return;
  }

  // Taint the return value of ntohl/ntohs: it came from the network.
  if (ID->getName() == "ntohl" || ID->getName() == "ntohs") {
    ProgramStateRef State = C.getState();
    SymbolRef Sym = Call.getReturnValue().getAsSymbol();

    if (Sym) {
      ProgramStateRef newState = State->addTaint(Sym);
      C.addTransition(newState);
    }
  }
}

Similarly, to check for function calls, we could have used BAP. In its original and preferred OCaml it looks like this:

module CG = Graphs.Callgraph
module CFG = Graphs.Tid


This gives access to the callgraph and the CFG. Now imagine we want a function that takes the call graph cg, a target function, and a subroutine term sub, and returns the blocks in sub containing calls whose destination reaches target in the call graph. BAP provides nifty methods for us.

let callsites cg target sub =
  Term.enum blk_t sub |>
  Seq.concat_map ~f:(fun blk ->
    Term.enum jmp_t blk |> Seq.filter_map ~f:(fun j ->
      match Jmp.kind j with
      | Goto _ | Ret _ | Int (_,_) -> None
      | Call dst -> match Call.target dst with
        | Direct tid when reaches cg tid target ->
          Some (Term.tid blk)
        | _ -> None))

In a similar manner, every custom analysis is developed in an OCaml file (or via the other bindings) and then registered as a plugin with BAP. (instructions here)

So many tools!


There are also other tools built with different motivations in mind, among them angr, ROSE, and radare2.
Radare2 is more suited for CTFs; ROSE started out as a source analysis platform and thus supports source analysis as well as binary; and angr was used to build Mechanical Phish, which won 3rd place at DARPA's Cyber Grand Challenge. ROSE is developed in C++ and is fast and great for research, but when you want to work on small programs (like a CTF challenge) you'd rather take other options, like bap, which is written in OCaml and comes with baptop, a REPL environment where you can interactively run commands and play around with the binary.

The most recent work published using angr is HeapHopper: Bringing Bounded Model Checking to Heap Implementation Security (USENIX Security '18), which builds on angr a tool, HeapHopper, that inspects heap implementations for vulnerabilities. The method has been shown to be successful on a ptmalloc vulnerability, and it uses angr as its symbolic execution engine to check compiled programs, after an attempted exploit, for security violations.

Post by: Iman Hosseini

S4Lab Walkthrough

What we do at S4Lab

The short version of our work is that we like the kind of magic that concerns software. That means that, while security is not mathematically provable and the world is full of smart programmers with cute programs, we want to show the weaknesses in software and systems (how cute can a program be?) and find ways to secure them.


Software Security

Researchers have introduced lots of program analysis methods over the last four decades. Initially, the primary goal of these methods was compiler optimization. With the emergence of malware, new dynamic approaches have been introduced for the sake of finding the malicious behavior that only appears at runtime.

In recent years, with the rising cost that software bugs may incur, researchers have used the available automatic program analysis methods to test programs and answer the question "Does a program have any bugs?". Although none of the methods answer this question entirely, the goal of automatic program testing is to identify as many real bugs as possible with minimal user intervention, and to measure/show their risk.
Here at S4lab, we want to answer the following questions:

  • What kind of magic reveals as many software bugs as possible?
  • Can we decide which parts of a program are more vulnerable?
  • Can we improve the magic to prove vulnerabilities?
  • How can we measure the risk each bug has?

Traditionally, program analysis methods are categorized as follows:

Dimensions

If you want to get familiar with these approaches, we suggest the following references:


System Security

Although using secure software is essential, it is always possible that interactions between the different layers of the computing stack remain vulnerable. Therefore, developing tools and methods to detect potential flaws in communication channels, e.g., kernel/user, kernel/virtualization layer, kernel/hardware layer, etc., is important.
To get familiar with this area we suggest reviewing the following papers:


Find what is wrong


Find what is wrong? (artwork by @wackyposters)

You see, but you do not observe. The distinction is clear. [Sherlock Holmes]

While our primary focus is software and system security, any new security-related problem may interest us, especially finding new and innovative ways to measure security or to show that there are flaws in the software and systems we use in our everyday life. You should be curious enough to sense that there is something wrong about them; indeed, you should observe carefully.
To get familiar with these topics we suggest the following references:

What you should do to join S4Lab

If we knew what it was we were doing, it would not be called research, would it? [Albert Einstein]

You should understand that security-related fields are not as straightforward as shown in Hollywood movies. In most cases, we do not know what the next step is, and we are not sure if we can find a way to prove that security is broken. But that doesn't mean we should stop trying.
You should be patient, ambitious, and determined enough to work at S4Lab.
If you find S4Lab suitable for your work, please follow the steps below and you will see how deep the rabbit hole goes.


Let us know what interests you more

Summarize two of the papers we have suggested reviewing above; choose the ones that interest you the most. The writing should be in your own words, and try to avoid reusing sentences from the papers.
You can add any related data, possible open problems that you may find, or only focus on the papers.
Each summary should not be longer than 2 pages.


Prove that you are a hacker (i.e. problem solver)

Most hackers are young because young people tend to be adaptable. As long as you remain adaptable, you can always be a good hacker. [Emmanuel Goldstein]

Being a hacker means you do not leave a problem unsolved because it is written in Haskell.
Before joining S4Lab, you need to prove that you are a hacker; to do so, please answer the following questions:

  • Many websites expose their “.git” files; please show how this could be dangerous.
  • Imagine that we have 2**48 text files. Explain how we can find which files are the same.
  • Write a hello-world C program and explain how we can dump its binary code with radare2.

Please remember that you do not need to write very much, and try to solve each task as automatically as you can.


Stay tuned

Upload the requested data to your public GitHub account and send us its URL.


You should love Git!

Post by: Solmaz Salimi

Top NDSS'18 Papers

Some time ago I read a post somewhere (I really can't remember where) with a simple question: what are the top papers one should read from the conferences every year in order to keep up to date with the latest and most important developments? This was an interesting yet challenging question. Nevertheless, we (i.e. our research team) decided to review papers from the top 4 security conferences in our regular weekly meetings and, through a process, select the top 5 papers from each conference.

How did we go about it?
This was our first try at the issue, so not much of a scientific approach! We started studying papers as they were presented, from the first session onward. In every meeting we selected the top papers from the sessions reviewed, then argued whether they should be kept in a top-10 list and, if the top-10 list was full, whether any paper should be removed to make room for the newly selected one. At the end of the process, we were left with 10 papers for the whole conference, from which we argued and selected the top 5.

What is considered “TOP”?
“Top” is a subjective matter, and to make matters worse, security is a very wide field with very different topics requiring very different expertise. Over the sessions it became clear that there were biases. For example, if a topic was new to everyone, they would argue that it should be included in the “top” list, whereas there may have been previous research on the topic that the students were unaware of. There was also bias toward topics everyone was more familiar with; in such cases the argument was that a major problem was solved (even though the scope of the work was very narrow). So again, top is a subjective matter. But I believe that as time goes on, and we keep doing this for a few more conferences, this bias will become less of an issue, as everyone will have seen more papers to compare with.

Was the process useful?
Most definitely. Although we did not cover every detail in the papers we read, we did study the current problems being addressed by researchers and the approaches taken to tackle them. In fact, as papers were reviewed from different sessions, students were exposed to problems and solutions whose subjects were very different from what they are researching. I find this to be extremely important. Students usually get too focused on a specific topic and lose sight of the bigger picture. In other words, such an exercise is great for providing “breadth” to the student, in addition to the “depth” obtained as part of research on a specific topic.

Here are the top 5 selected papers from NDSS’18:

1- Superset Disassembly: Statically Rewriting x86 Binaries Without Heuristics. Erick Bauman (University of Texas at Dallas), Zhiqiang Lin (University of Texas at Dallas), and Kevin Hamlen (University of Texas at Dallas).
2- Face Flashing: a Secure Liveness Detection Protocol based on Light Reflections. Di Tang (Chinese University of Hong Kong), Zhe Zhou (Fudan University), Yinqian Zhang (Ohio State University), and Kehuan Zhang (Chinese University of Hong Kong).
3- Cloud Strife: Mitigating the Security Risks of Domain-Validated Certificates. Kevin Borgolte (UC Santa Barbara), Tobias Fiebig (TU Delft), Shuang Hao (UT Dallas), Christopher Kruegel (UC Santa Barbara), and Giovanni Vigna (UC Santa Barbara).
4- Device Pairing at the Touch of an Electrode. Marc Roeschlin (University of Oxford), Ivan Martinovic (University of Oxford), and Kasper B. Rasmussen (University of Oxford).
5- Settling Payments Fast and Private: Efficient Decentralized Routing for Path-Based Transactions. Stefanie Roos (University of Waterloo), Pedro Moreno-Sanchez (Purdue University), Aniket Kate (Purdue University), and Ian Goldberg (University of Waterloo).

Last but not least, we are always happy to get more people in the room arguing over the papers. So if you would like to participate in the process for the next conference review, please drop me an email.

Post by: Mehdi Kharrazi