Kam1n0 v2.x is a scalable assembly management and analysis platform. It allows a user to first index a (large) collection of binaries into different repositories and then provides analytic services on top of them, such as clone search and classification. It supports multi-tenant access and management of assembly repositories through the concept of an Application. An application instance contains its own exclusive repository and provides a specialized analytic service. Considering the versatility of reverse engineering tasks, the Kam1n0 v2.x server currently provides three different types of clone-search applications (Asm-Clone, Sym1n0, and Asm2Vec) and an executable classification application based on Asm2Vec. New application types can be added to the platform.
A user can create multiple application instances. An application instance can be shared among a specific group of users. The application repository read-write access and on-off status can be controlled by the application owner. Kam1n0 v2.x server can serve the applications concurrently using several shared resource pools.
Kam1n0 was developed by Steven H. H. Ding and Miles Q. Li under the supervision of Benjamin C. M. Fung of the Data Mining and Security Lab at McGill University in Canada. It won the second prize at the Hex-Rays Plug-In Contest 2015. If you find Kam1n0 useful, please cite our papers:
S. H. H. Ding, B. C. M. Fung, and P. Charland. Kam1n0: MapReduce-based Assembly Clone Search for Reverse Engineering. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 461-470, San Francisco, CA: ACM Press, August 2016.
S. H. H. Ding, B. C. M. Fung, and P. Charland. Asm2Vec: boosting static representation robustness for binary clone search against code obfuscation and compiler optimization. In Proceedings of the 40th IEEE Symposium on Security and Privacy (S&P), 18 pages, San Francisco, CA: IEEE Computer Society, May 2019.
Asm-Clone applications address the efficient subgraph search problem (i.e., the subgraph isomorphism problem) for assembly functions (<1.3s average query time and <30ms average index time with 2.3M functions). Given a target function (the one on the left as shown below), an Asm-Clone application can identify the cloned subgraphs among other functions in the repository (the one on the right as shown below).
Sym1n0 applications perform semantic clone search by differentiated fuzz testing and constraint solving. This is an efficient and scalable dynamic-static hybrid approach (<1s average query time and <100ms average index time with 1.5M functions). Given a target function (the one on the left as shown below), a Sym1n0 application can identify the cloned subgraphs among other functions in the repository (the one on the right as shown below). It also supports visualization of the abstract syntax graph.
Asm2Vec leverages representation learning. It understands the lexical semantic relationships of assembly code. For example, `xmm*` registers are semantically related to vector operations such as `addps`, and `memcpy` is similar to `strcpy`. The graph below shows different assembly functions compiled from the same source code of `gmpz_tdiv_r_2exp` in libgmp. From left to right, the assembly functions are compiled with the GCC O0 option, the GCC O3 option, the O-LLVM obfuscator Control Flow Graph Flattening option, and the O-LLVM obfuscator Bogus Control Flow Graph option. Asm2Vec can statically identify them as clones.
In this application, the user defines a set of software classes based on functional relatedness and provides binaries belonging to each class. The system then automatically groups functions into clusters in which functions are connected, directly or indirectly, by a clone relation. The clusters that are discriminative for the classification are kept and serve as signatures of their classes. Given a target binary, the system shows the degree to which it belongs to each software class.
The application uses Asm2Vec as its function similarity computation model.
The figure below shows the major UI components and functionalities of Kam1n0 v2.x. We adopt a material design. In general, each user has an application list, a running-job list, and a result file list.
The current release of Kam1n0 consists of two installers: the core server and IDA Pro plug-in.
Installer | Included components | Description |
---|---|---|
Kam1n0-Server.msi | Core engine | Main engine providing service for indexing and searching. |
Kam1n0-Server.msi | Workbench | A user interface to manage the repositories and running service. |
Kam1n0-Server.msi | Web user interface | Web user interface for searching/indexing binary files and assembly functions. |
Kam1n0-Server.msi | Visual C++ redistributable for VS 15 | Dependency for z3. |
Kam1n0-IDA-Plugin.msi | Plug-in | Connectors and user interface. |
Kam1n0-IDA-Plugin.msi | PyPI wheels for Cefpython | Rendering engine for the user interface. |
Kam1n0-IDA-Plugin.msi | PyPI and dependent wheels | Package management for Python. Included for IDA 6.8 & 6.9. |
The Kam1n0 core engine is purely written in Java. You need the following dependencies:
Download the `Kam1n0-Server.msi` file from our release page. Follow the instructions to install the server. You will be prompted to select an installation path. IDA Pro is optional on the server if the server does not need to disassemble binaries itself, that is, if clients submit assembly functions through the Kam1n0 plug-in for IDA Pro. Still, it is strongly recommended to install IDA Pro alongside the Kam1n0 server. The Kam1n0 server automatically detects your IDA Pro installation by looking up the default application used to open `.i64` files.
The Kam1n0 IDA Pro plug-in is written in Python for the logic and in HTML/JavaScript for the rendering. The following dependencies are required for its installation:
Next, download the `Kam1n0-IDA-Plugin.msi` installer from our release page. Follow the instructions to install the plug-in and runtime. Please note that the plug-in has to be installed in the IDA Pro plugins folder, which is located at `$IDA_PRO_PATH$/plugins`. For example, on Windows, the path could be `C:/Program Files (x86)/IDA 6.95/plugins`. The installer will detect and validate the path.
Ensure you have the Oracle version of Java 11 (not default-jdk from apt).
Add the Java PPA: `sudo add-apt-repository ppa:webupd8team/java`
If this fails with an error (`~webupd8team not found`) and you are behind a proxy, make sure you set and export your `http_proxy` and `https_proxy` environment variables, and then try again with the `-E` option on sudo. Additionally, if you get an 'add-apt-repository: command not found' error, try `sudo apt install -y software-properties-common`.
Then run `sudo apt-get update` and `sudo apt-get install oracle-java8-installer`.
Verify the installation with `java -version`; you may need to manually set the JAVA_HOME environment variable (in `/etc/environment`), for example `JAVA_HOME=/usr/lib/jvm/java-11-oracle`.
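The Java check can also be scripted. The following is a minimal sketch that is not part of Kam1n0: `java_major_version` is a hypothetical helper, and it assumes the usual `... version "X.Y.Z"` banner on the first line of `java -version` output.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Kam1n0): print the Java major version
# found in a `java -version` banner line such as: java version "11.0.2" 2019-01-15 LTS
java_major_version() {
    # Pull out the quoted version string, e.g. 11.0.2 or 1.8.0_292.
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    [ -n "$ver" ] || return 1
    # Legacy scheme: 1.8.0_292 means Java 8, so drop the leading "1.".
    case "$ver" in
        1.*) ver=${ver#1.} ;;
    esac
    # Keep only the leading digits (the major version).
    printf '%s\n' "$ver" | sed 's/[^0-9].*//'
}

# `java -version` prints to stderr, so redirect it before reading the banner.
if command -v java >/dev/null 2>&1; then
    major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
    if [ "$major" = "11" ]; then
        echo "Java 11 detected"
    else
        echo "Java 11 expected, found major version: ${major:-unknown}"
    fi
    echo "JAVA_HOME=${JAVA_HOME:-<not set>}"
else
    echo "java not found on PATH"
fi
```

If the script reports a different major version, check which `java` binary comes first on your PATH and whether JAVA_HOME points at the intended installation.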
Download the latest release for Linux (Kam1n0-IDA-Plugin.tar.gz and Kam1n0-Server.tar.gz) from Kam1n0-Community.
Extract the two tarballs (i.e., `tar -xvzf Kam1n0-IDA-Plugin.tar.gz` and `tar -xvzf Kam1n0-Server.tar.gz`).
Extracting Kam1n0-Server.tar.gz creates the `server` directory.
Inside the `server` directory, you should see a file called `kam1n0.properties`, which is where you set the various configuration options for Kam1n0; this is very important. Set `kam1n0.data.path` to where you would like your Kam1n0-related data to be written. We choose to put it in the same place that we keep our `server`. `kam1n0.ida.home` refers to where your IDA installation is located. Comment out this line (and `kam1n0.ida.batch`, the line following it) if you do not have IDA and do not plan to use Kam1n0 for disassembly. For more (accurate) information about the `kam1n0.properties` file, see the `kam1n0.properties.explained` file.
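These edits can be made by hand in any text editor. As an illustration only, the sketch below scripts them on a scratch file. It assumes that `kam1n0.properties` uses one `key=value` entry per line with `#` for comments; `set_prop`, `comment_prop`, and `escape_key` are hypothetical helpers, and every path and value shown is a placeholder rather than a recommended setting.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Kam1n0) for editing a key=value
# properties file. Assumes one "key=value" entry per line and "#" comments.

# escape_key KEY: escape dots so KEY is matched literally in a regex.
escape_key() {
    printf '%s' "$1" | sed 's/\./\\./g'
}

# set_prop FILE KEY VALUE: drop any existing assignment of KEY, then append KEY=VALUE.
set_prop() {
    pattern=$(escape_key "$2")
    tmp=$(mktemp)
    grep -v "^[[:space:]]*${pattern}[[:space:]]*=" "$1" > "$tmp"
    printf '%s=%s\n' "$2" "$3" >> "$tmp"
    # Write back in place so the file keeps its permissions.
    cat "$tmp" > "$1" && rm -f "$tmp"
}

# comment_prop FILE KEY: prefix the assignment of KEY with "#".
comment_prop() {
    pattern=$(escape_key "$2")
    tmp=$(mktemp)
    sed "s/^[[:space:]]*\(${pattern}[[:space:]]*=\)/#\1/" "$1" > "$tmp"
    cat "$tmp" > "$1" && rm -f "$tmp"
}

# Demo on a scratch file; all paths and values below are placeholders.
demo=$(mktemp)
printf '%s\n' 'kam1n0.data.path=/old/data' 'kam1n0.ida.home=/opt/ida' 'kam1n0.ida.batch=4' > "$demo"
set_prop "$demo" kam1n0.data.path /home/user/kam1n0/data
comment_prop "$demo" kam1n0.ida.home
comment_prop "$demo" kam1n0.ida.batch
cat "$demo"
```

Note that `set_prop` moves the entry to the end of the file; back up the real `kam1n0.properties` before pointing a script like this at it.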
Run kam1n0-server-workbench: `java -jar kam1n0-server-workbench.jar`. This should open a window that prompts you to actually start Kam1n0. Alternatively, run kam1n0-server: `java -jar kam1n0-server.jar --start`. This starts the server from the console without a window.
To connect and use it, go to `127.0.0.1:8571` in your browser (the default port Kam1n0 listens on should be 8571, but it can be changed in kam1n0.properties). You should see the Kam1n0 web UI. From there, follow the tutorial on the Kam1n0-Community repo if you do not know how to use Kam1n0.
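When the server is started from the console, it may take a moment before the web UI answers. The sketch below polls the default address with `curl` until it gets a response; `wait_for_http` is a hypothetical helper, not part of Kam1n0, and any HTTP response (including an error page) counts as reachable.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Kam1n0): poll a URL until it answers.
# Usage: wait_for_http URL [TRIES] [DELAY_SECONDS]
wait_for_http() {
    url=$1
    tries=${2:-30}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s: quiet, -o /dev/null: discard the body, --max-time: bound each attempt.
        if curl -s -o /dev/null --max-time 2 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# 8571 is the default port; adjust it if kam1n0.properties sets another one.
if wait_for_http "http://127.0.0.1:8571" 2 1; then
    echo "Kam1n0 web UI is reachable"
else
    echo "No response on port 8571 yet"
fi
```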
The assembly code repositories and configuration files used in previous versions (<2.0.0) are no longer supported by the latest version. Please contact us if you need to migrate your old repositories.
Clone the latest stable branch (don't forget `--recursive`!):
git clone --recursive -b master2.x --single-branch https://github.com/McGill-DMaS/Kam1n0-Community
IntelliJ: Import the root /kam1n0/kam1n0/ as a Maven project. All the submodules will be loaded accordingly.
Eclipse EE: Add the cloned git repository to the Git view, then import all Maven projects from the repository. You may need to modify the classpath to resolve any errors.
All resource paths are dynamically modified when running inside an IDE (through the kam1n0-resources submodule).
To build the project:
cd /kam1n0/kam1n0
mvn -DskipTests clean package
mvn -DskipTests package
The resulting binaries can be found in /kam1n0/build-bins/
To run the test code, you will first need to download `chromedriver.exe` from http://chromedriver.chromium.org/ and add its absolute path to an environment variable named `webdriver.chrome.driver`. A Chrome browser must also be installed on the system, because the test code launches a browser instance to test the user interface. The complete testing procedure takes approximately 3 hours.
cd /kam1n0/kam1n0
mvn -DskipTests clean package # you can skip this one if you already built the package
mvn -DskipTests package # you can skip this one if you already built the package
mvn -DforkMode=never test
These commands only compile the Java code and use pre-compiled wheels of libvex and z3, so they work out of the box. Building libvex and z3 is platform-dependent. We use a fork of libvex from Angr. More complete build scripts, as well as installers for Windows/Linux, can be found under /kam1n0-builds/
We have a Jenkins server for continuous development and delivery. The latest stable release will be posted here. Periodically, we will synchronize our internal experimental branch with this repository.
The software was developed by Steven H. H. Ding, Miles Q. Li, and Benjamin C. M. Fung in the McGill Data Mining and Security Lab and Queen's L1NNA Research Laboratory in Canada. It is distributed under the Apache License Version 2.0. Please refer to LICENSE.txt for details.
Copyright 2014-2021 McGill University and the Researchers. All rights reserved.