June 2024
Introduction
At Interrupt Labs we often find that difficult problems are best tackled through close collaboration between researchers. Whilst collaboration features are present in the most popular tools for native binary analysis (Binary Ninja, Ghidra and IDA Pro), the same is not true for the most popular Java bytecode analysis tools (JEB and JADX).
This post introduces JADX Collaboration, a plugin for JADX-GUI that allows users to seamlessly share their analysis with one another without ever leaving the program.
Investigation
Before designing the plugin, we need to understand how JADX’s renaming system works internally.
Let’s begin by finding where the rename information is stored. As a starting point, we can use the fact that renames are saved to disk with the rest of the project information. After some searching, we find the JadxProject class, which has save and load methods for synchronising the in-memory project with the on-disk project file. JadxProject contains ProjectData, which in turn contains JadxCodeData, which in turn contains a list of renames (ICodeRename).
ICodeRename is an interface with three methods:
- getNewName, returning a String.
- getNodeRef, returning an IJavaNodeRef interface with three methods:
  - getType, returning a RefType enum (FIELD, METHOD, CLASS or PKG).
  - getDeclaringClass, returning a String.
  - getShortId, returning a String.
- getCodeRef, returning null or an IJavaCodeRef interface with two methods:
  - getAttachType, returning a CodeRefType enum (INSN, CATCH, VAR or MTH_ARG).
  - getIndex, returning an int.
We can guess that components retaining their names after being compiled are referred to only by nodeRef (null codeRef), whereas components not retaining their names are narrowed down by nodeRef (e.g. referencing the method to which an argument belongs) and identified using codeRef (e.g. referencing the index of the argument in the method).
To test this assumption, let’s load a simple .jar file into JADX and rename the HelloWorld.main method to methodRename and its argument to argumentRename. Here is how the renames were saved to disk:
[
  {
    "nodeRef": {
      "refType": "METHOD",
      "declClass": "HelloWorld",
      "shortId": "main([Ljava/lang/String;)V"
    },
    "newName": "methodRename"
  },
  {
    "nodeRef": {
      "refType": "METHOD",
      "declClass": "HelloWorld",
      "shortId": "main([Ljava/lang/String;)V"
    },
    "codeRef": {
      "attachType": "VAR",
      "index": 196608
    },
    "newName": "argumentRename"
  }
]
In terms of designing the plugin, our most important takeaway is that the combination of nodeRef and codeRef uniquely identifies the target of a rename.
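To make this takeaway concrete, here is a minimal Java sketch of a rename lookup keyed on that combination. The names NodeRef, CodeRef, RenameKey and RenameIndex are hypothetical stand-ins for the JADX interfaces described above. They are not part of JADX's API, nor necessarily how the plugin stores its data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified mirrors of the JADX types described above.
enum RefType { FIELD, METHOD, CLASS, PKG }
enum CodeRefType { INSN, CATCH, VAR, MTH_ARG }

record NodeRef(RefType type, String declaringClass, String shortId) {}
record CodeRef(CodeRefType attachType, int index) {}

// The rename target: a nodeRef plus an optional (nullable) codeRef.
// Records compare by value, so two keys built from equal parts are equal.
record RenameKey(NodeRef nodeRef, CodeRef codeRef) {}

final class RenameIndex {
    private final Map<RenameKey, String> namesByTarget = new HashMap<>();

    // Stores a rename and returns the previous name for the same target, or null.
    String put(NodeRef nodeRef, CodeRef codeRef, String newName) {
        return namesByTarget.put(new RenameKey(nodeRef, codeRef), newName);
    }

    // Returns the current name for a target, or null if it was never renamed.
    String get(NodeRef nodeRef, CodeRef codeRef) {
        return namesByTarget.get(new RenameKey(nodeRef, codeRef));
    }

    int size() {
        return namesByTarget.size();
    }
}
```

With the example above, the method rename and the argument rename share the same nodeRef but differ in codeRef, so they occupy two separate entries.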
Design
Requirements
When designing the plugin, let’s consider three key requirements:
- The plugin should not require custom server software to distribute data between users, and should instead use something existing like git.
- The plugin should not require a constant connection to the distribution mechanism. A connection should only be necessary when a user wants to send or receive changes.
- The plugin should have some method of conflict resolution (e.g. if two users rename the same target at more or less the same time).
Core Design
We will base the core design of the plugin on git, and as such use three separate locations for rename information:
- Project - The renames as stored by JADX in JadxCodeData. Analogous to git’s working directory.
- Local repository file - A representation of the project renames, but with additional information needed to accurately synchronise with the remote repository file (e.g. user ID and version information). Analogous to git’s local repository.
- Remote repository file - A representation of the renames shared between all users. Analogous to git’s remote repository.
We’ll also have two operations for synchronising changes:
- Pull - Analogous to a git commit followed by a git pull:
  - The local repository file is updated from the project.
  - The local repository file is updated from the remote repository file.
  - The project is updated from the local repository file.
- Push - Analogous to a git commit followed by a git push:
  - The local repository file is updated from the project.
  - The local repository file is updated from the remote repository file.
  - The remote repository file is updated from the local repository file.
  - The project is updated from the local repository file.
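The order of these steps can be sketched in Java as follows. This is an illustration only: the SyncSketch class is hypothetical, rename targets are reduced to opaque string keys, and each "updated from" step is modelled as a naive overwrite (putAll) standing in for the real merge, so conflicts and deletions are ignored here.

```java
import java.util.HashMap;
import java.util.Map;

final class SyncSketch {
    // Rename target (opaque key) -> new name, one map per location.
    final Map<String, String> project = new HashMap<>();
    final Map<String, String> local = new HashMap<>();
    final Map<String, String> remote; // shared between all users

    SyncSketch(Map<String, String> sharedRemote) {
        this.remote = sharedRemote;
    }

    void pull() {
        local.putAll(project);   // 1. local repository <- project
        local.putAll(remote);    // 2. local repository <- remote repository
        replaceProject();        // 3. project <- local repository
    }

    void push() {
        local.putAll(project);   // 1. local repository <- project
        local.putAll(remote);    // 2. local repository <- remote repository
        remote.putAll(local);    // 3. remote repository <- local repository
        replaceProject();        // 4. project <- local repository
    }

    private void replaceProject() {
        project.clear();
        project.putAll(local);
    }
}
```

In this sketch, a rename pushed by one user appears in another user's project after that user pulls, while the second user's unpushed renames stay out of the remote map.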
Distributing the Remote Repository File
To maximise the plugin’s utility, we should make the process of distributing the remote repository file customisable using scripts. A pre-pull script should be responsible for copying the remote repository file from a remote location to a local location and a post-push script should be responsible for the reverse.
The scripts should use their return codes to indicate success (zero), temporary failure (one) or permanent failure (two or above), and the plugin should respond accordingly.
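As a rough illustration, the return code convention used by the sample scripts later in this post (zero for success, one for temporary failure, two or above for permanent failure) could be interpreted in Java as shown below. The ScriptResult class and its method names are hypothetical. How the plugin reacts to each outcome is not shown, and treating any other value as permanent is an assumption.

```java
import java.io.IOException;

final class ScriptResult {
    enum Outcome { SUCCESS, TEMPORARY_FAILURE, PERMANENT_FAILURE }

    // Maps a script's exit code to an outcome.
    // Assumption: anything other than 0 or 1 is treated as permanent.
    static Outcome classify(int exitCode) {
        if (exitCode == 0) {
            return Outcome.SUCCESS;
        }
        if (exitCode == 1) {
            return Outcome.TEMPORARY_FAILURE;
        }
        return Outcome.PERMANENT_FAILURE;
    }

    // Runs a script with the repository file path as its only argument
    // and classifies the exit code once the script has finished.
    static Outcome run(String scriptPath, String repositoryFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(scriptPath, repositoryFile)
                .inheritIO() // show the script's output alongside our own
                .start();
        return classify(process.waitFor());
    }
}
```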
Conflict Resolution
One of the most challenging aspects of designing this plugin is detecting and handling conflicts. After some research, we decide to use a version vector:
- Each rename in the local and remote repository file has a version vector, with an integer for every user (zero by default).
- When a user updates a rename in their local repository file, their integer in the version vector is incremented.
- When the local repository file and remote repository file are synchronised, the version vectors are compared and different actions are taken accordingly. The table below documents these actions (user A is the local user):
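The comparison step on its own (not the actions taken afterwards) can be sketched with the standard version vector ordering. The VersionVector class and the Order names below are hypothetical, and how each outcome maps to an action is left open.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class VersionVector {
    enum Order { EQUAL, LOCAL_NEWER, REMOTE_NEWER, CONFLICT }

    // User ID -> counter; users that are absent count as zero.
    private final Map<String, Integer> counters = new HashMap<>();

    // Called when the given user updates the rename in their local repository file.
    void increment(String userId) {
        counters.merge(userId, 1, Integer::sum);
    }

    int get(String userId) {
        return counters.getOrDefault(userId, 0);
    }

    VersionVector copy() {
        VersionVector result = new VersionVector();
        result.counters.putAll(counters);
        return result;
    }

    // One side is newer if it is ahead for at least one user and behind for none.
    // If each side is ahead for some user, the edits were concurrent: a conflict.
    static Order compare(VersionVector local, VersionVector remote) {
        Set<String> users = new HashSet<>(local.counters.keySet());
        users.addAll(remote.counters.keySet());
        boolean localAhead = false;
        boolean remoteAhead = false;
        for (String user : users) {
            if (local.get(user) > remote.get(user)) {
                localAhead = true;
            } else if (remote.get(user) > local.get(user)) {
                remoteAhead = true;
            }
        }
        if (localAhead && remoteAhead) {
            return Order.CONFLICT;
        }
        if (localAhead) {
            return Order.LOCAL_NEWER;
        }
        return remoteAhead ? Order.REMOTE_NEWER : Order.EQUAL;
    }
}
```

For example, if users A and B both change the same rename starting from the same synchronised state, each vector is ahead in its own user's slot and the comparison reports a conflict.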
Implementation
Interfacing With JADX
The development version of JADX has a plugin API which makes a few things easy:
- Adding per-project configuration options (useful for specifying the repository file and scripts).
- Registering menu options and key-binds (useful for triggering the pull and push actions).
- Dispatching events (useful for reusing some of JADX’s existing logic).
Unfortunately, however, the problem of how to refresh the relevant GUI components after retrieving new renames is not so easy. There is a ReloadProject event which refreshes the entire project, but that wouldn't be suitable for large projects that take a long time to initially load. There is also a NodeRenamedByUser event which is handled by RenameService, where methods from a JRenameNode interface contained in the event are used to calculate which classes to refresh.
When using JADX normally, only one rename occurs at a time, so the built-in JRenameNode implementations, which each handle a single rename, are not suitable for applying many changes at once. We can instead:
- Compare the old and new project renames after completing a pull or push action to find those that have changed.
- Find the classes associated with the changed renames:
  - The containing class for a variable, field or method.
  - The class itself for a class.
  - The contained classes for a package.
- Add those associated classes to a set, along with any classes that import them. This is not quite optimal because, for example, not all dependent classes will need to be refreshed after a method rename, but it should be good enough for now.
- Implement JRenameNode with just enough functionality to pass through the RenameService logic and deliver the previously calculated set of classes.
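The first of these steps, finding which renames changed, can be sketched in Java as a comparison of two maps. RenameDiff is a hypothetical helper. The key type K stands for whatever identifies a rename target (for example, the nodeRef and codeRef pair), and mapping the changed targets to JADX classes is left out because it depends on JADX internals.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

final class RenameDiff {
    // Returns every target whose name was added, removed or altered
    // between the two snapshots (target -> new name).
    static <K> Set<K> changedTargets(Map<K, String> before, Map<K, String> after) {
        Set<K> candidates = new HashSet<>(before.keySet());
        candidates.addAll(after.keySet());

        Set<K> changed = new HashSet<>();
        for (K target : candidates) {
            // A missing entry yields null, so additions and removals count as changes.
            if (!Objects.equals(before.get(target), after.get(target))) {
                changed.add(target);
            }
        }
        return changed;
    }
}
```

Taking a snapshot of the project renames before the pull or push and another afterwards would give the two inputs.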
Synchronising Changes Over Git
As we discussed in the design, the method of distributing the remote repository file is customisable. To simplify the usage of the plugin, let’s write sample pre-pull and post-push scripts to synchronise using git.
Pre-Pull
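# Argument: path to the remote repository file.
[ $# -eq 1 ] || exit 2
# The git repository is assumed to be the directory containing the file.
dir="$(dirname "$1")" || exit 3
cd "$dir" || exit 4
# On failure, abandon any half-finished merge and report a permanent failure.
git pull || { git merge --abort; exit 5; }
exit 0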
The script assumes that the git repository is the parent directory of the remote repository file argument and attempts to git pull. On failure it aborts any merges and reports a permanent failure (exit code two or above). On success it returns exit code zero.
Post-Push
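# Argument: path to the remote repository file.
[ $# -eq 1 ] || exit 2
dir="$(dirname "$1")" || exit 3
file="$(basename "$1")" || exit 4
cd "$dir" || exit 5
# Unstage everything so that only the repository file is committed.
git reset || exit 6
# After the cd above, address the file by its base name so relative paths still work.
git add -- "$file" || exit 7
git commit --allow-empty -m "Update $file repository" || exit 8
# If the push fails, drop the new commit and report a temporary failure.
git push || { git reset --hard HEAD~1; exit 1; }
exit 0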
Again, the script assumes that the git repository is the parent of the remote repository file argument. This time it does a git reset (to ensure we don’t accidentally commit other files) before doing a git add and git commit on the remote repository file (--allow-empty is important so that things don’t break if there have been no changes). It then tries to git push, using git reset to undo any changes if there is a problem and reporting a temporary failure (exit code one).
Conclusion
Overall, the plugin meets the requirements we laid out and seems to work well. Whilst the initial setup is a bit complex, using it after that is as simple as pressing a key combination or selecting a menu option.
A combination of manual and unit testing was used to evaluate the plugin, but it still needs to be trialled in a real multi-user scenario.
Thank you for reading. Documentation and source code can be found here.