The brain of any computer system is the hard disk

The brain of any computer system is the hard disk. RAM (Random Access Memory) serves as temporary storage from which the CPU retrieves the programs and data it is currently working on. However, RAM is limited in capacity and loses its contents when the power is removed, so programs and data are kept permanently on the hard disk and loaded into RAM when they are needed.

The ratio of physical memory to hard disk capacity affects the speed and performance of a computer: when RAM runs short, the operating system has to fall back on the much slower disk. This is why the quality (performance) of the hard drive matters so much. It is the ultimate storage component of any computer system today, since data held only in RAM does not survive a power cycle.

The speed of a computer system is also affected by the hard drive’s RPM (revolutions per minute). The higher the RPM, the faster data can be read from and written to the disk surface. However, a higher RPM does not always mean better performance. The number of sectors per track matters as well, because it determines how much data passes under the head in one revolution. Sectors per track is an attribute of how the disk surface is laid out.

As recording density increases, more sectors fit on a single track. A higher number of sectors per track means that the same amount of data occupies fewer tracks, so it can be retrieved with fewer head movements and fewer revolutions. Most desktop hard drives today spin at 7200 RPM. A 10,000 RPM drive completes a revolution in less time, not more, but it also runs hotter and draws more power. The heat caused by high rotation speeds keeps such drives out of laptops and other portable devices anyway.(*)
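To put numbers on the RPM discussion, here is a minimal C++ sketch of the standard relation between spindle speed and rotational delay: one revolution takes 60 / RPM seconds, and on average the head waits half a revolution for the wanted sector. The function names are made up for illustration, results are rounded down to whole microseconds, and seek time and transfer rate are deliberately left out.

```cpp
#include <cstdint>

// Time for one full revolution in microseconds.
// 60 s per minute * 1,000,000 us per second, divided by revolutions per minute.
// rpm must be non-zero.
constexpr std::uint32_t revolutionTimeUs(std::uint32_t rpm) {
    return 60000000u / rpm;
}

// Average rotational latency in microseconds: on average the wanted sector
// is half a revolution away from the read/write head.
constexpr std::uint32_t averageRotationalLatencyUs(std::uint32_t rpm) {
    return revolutionTimeUs(rpm) / 2;
}
```

With these helpers a 7200 RPM drive comes out at about 8.3 ms per revolution and 4.2 ms average latency, while a 10,000 RPM drive comes out at 6 ms and 3 ms.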

Therefore, for the best performance it is important to find a hard drive with a good balance between rotational speed and the amount of data stored on each disk surface. Today’s average hard drive holds about 500 GB per disk surface. This means that it can hold up to 2000(*) sectors per track. A common 10,000 RPM hard drive would have half the storage space of an ordinary, slower one if it could only store data in every other sector of a track.(**)

This indicates that, in order to get the maximum performance from a hard drive, the rotational speed should be about 5000 RPM at most. Of course, not all IDE/SATA devices operate at this rotational speed, since some manufacturers choose to offer 80 or 100 GB disk surfaces instead. That is why most modern ATA (IDE) and SATA computer systems support both high-speed and low-speed devices.(***)

(*) There are 3600 seconds in one hour and 60 seconds in one minute. A disk spinning at 7200 RPM therefore makes 7200 / 60 = 120 revolutions per second, so the disk surface turns once every 1 / 120 s, or about 8.3 ms. At 10,000 RPM one revolution takes 6 ms. One full rotation is 360 degrees.

1/(7200+400)=0.15625 or 0.156 ** Perpendicular distance of a point from a line on a plane could be measured by the number of its radians away from that line with positive being counterclockwise rotation about axis and negative being clockwise direction without considering angle nor radius in this case. This characteristic could also explain why the sector count per track is not a full rotation and only 360 degrees out of 2*PI radians (or -180 degrees – PI/2 because the starting point, zero, or any odd number of perimeters count as half or whole rotation from physical start to finish are impossible) . If space on disk could be visualized by using arc angle sine function then 0 would be equal 2*PI = -180. The starting point would be 0 and 360 angles away from it would be equal to -144 degrees which means that next sector has actually started at 144 where there should’ve been first one. For instance if we see the chart below with radius as 1:

(**) In this case 98 sectors use same frequency band for recording and reading data but in a different order using ZX spectrum emulator. It counts from 0 to 157 which is equivalent to 360 degrees if no consideration of what was mentioned above.

(***) The data on track 1, sector 1 are at the same location as the one on track 98, sector 98. Track length vs frequency calculation formula: 3600 / (98*2) = 31.25 or 31 sectors per second based on 60Hz power supply cycle. Track 2, sector 1 has an added 400 while all others have merely 100 because they do not share code and data bands with each other. That is why it is called interleaved recording. Sector count numbers start at zero indicating that there should actually be 99 sectors for writing since we need to pass over the first track of each cylinder before tallying up the sectors. In older magnetic disk drives, a sector is not actually composed of two bands/tracks but are written in consecutive order (back to back) on one single band. This causes an error where data or code can be overwritten by another when they share a same location while your computer program does not know about it until you try to copy or read them and see that they have been corrupted. Back to accessing speed: The faster you ask for data from storage media like hard disk drive, the further head should move in order to access requested data again. A small portion of your hard disk would not work because the timing motor must spin it at higher rate than what can sustained by its motor. In other words, a computer hard disk is not built for speed but rather it is built for reliability because the data in your computer will be accessed more often than the hard disk spins. One time high speed of 10ms/s (or 0.1 millisecond) is very challenging and may result in lost data or file corruption if you are not lucky enough to have good quality solid state storage media that only cost as much as $30 while a regular spinning hard disk can reach up to 50000rpm which make them almost impossible to be used without wearing off.

It’s been 23 years since I wrote my first program in Assembly language, using the structured programming concepts taught by one of my lecturers at Universiti Teknologi Malaysia. Taking this chance while I am in my mid-life crisis (lol), I have decided to play around with some C code for a microcontroller board that my son programs using the Arduino software, which is used by thousands of people across the world.

It was surprisingly easy for me to get comfortable with it. There are many different function calls for various tasks, and they might look cryptic at first, but after programming in Assembly language for years, understanding them is just a matter of time because the two are not too far apart. Generally speaking, it all boils down to memory management and accessing hardware resources (in our case: GPIO ports). One interesting thing that I want to share with you guys is how quickly we can develop a program with the Arduino IDE that would have taken much, much longer to write and debug in the good old days, when we were limited by memory, processor speed and hardware complexity.

Virtual memory has made things more convenient than ever, and every modern operating system uses it, because systems are too complex and dynamic for programmers to predict everything that will happen at run time (such as which other programs get called or which files get looked up). Without it, a programmer writing an application had to predict how much memory each part of the program would need so that sufficient resources could be set aside at compile time. (An application is the software that carries out your commands, often through a GUI, while a driver or library is a piece of software that gives user-space programs access to a specific hardware component.) If the programmer did not allocate enough memory for a particular function, the program could crash because it tried to access invalid memory and corrupted it in the process. To keep the whole system from crashing because of user activity or programmer error, the operating system gives each process its own virtual address space and maps it, page by page, onto page frames in physical memory, swapping pages out to disk and back in as needed.

An application can request a certain amount of space from the operating system through the virtual memory API that the operating system provides. When that memory is no longer needed (for example, when the application exits), the operating system frees the allocated pages and reclaims the space for other applications. This way there is no conflict over which addresses a particular application may access. The concept is fundamentally the same on all computer systems, although each has its own tweaks and tricks along the way. On Windows, for example, a program calls the VirtualAlloc() function exported by Kernel32.dll to reserve and commit memory. I know everything is simplified but that’s how it works, isn’t it?
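Because the operating system hands out virtual memory in whole pages, a request is effectively rounded up to a multiple of the page size. Here is a small C++ sketch of that rounding. The 4096-byte page size is an assumption (it is common on x86 systems but not universal), and the helper names are made up for this example; they are not part of any operating system API.

```cpp
#include <cstddef>

// Assumed page size. 4096 bytes is common on x86 systems, but real code
// should ask the operating system instead of hard-coding it.
constexpr std::size_t kAssumedPageSize = 4096;

// Number of whole pages needed to hold `bytes` bytes (rounded up).
constexpr std::size_t pagesNeeded(std::size_t bytes,
                                  std::size_t pageSize = kAssumedPageSize) {
    return (bytes + pageSize - 1) / pageSize;
}

// Bytes actually set aside once the request is rounded up to whole pages.
constexpr std::size_t reservedBytes(std::size_t bytes,
                                    std::size_t pageSize = kAssumedPageSize) {
    return pagesNeeded(bytes, pageSize) * pageSize;
}
```

Under this assumption a request for 5000 bytes needs two pages, so 8192 bytes of address space end up being set aside for it.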

Let me show you some code samples of how Arduino IDE allows user/programmer to access different hardware components on their board:

#include <Arduino.h> // Core Arduino declarations (pinMode, digitalWrite, Serial)

void setup(){ // This function is called once when the sketch starts (power-up or reset)

  pinMode(12, OUTPUT); // Configure digital pin 12 as an output

  digitalWrite(12, HIGH); // Drive pin 12 high, as requested by the programmer

  Serial.begin(9600); // Start serial communication with the PC at 9600 baud [optional]

}

void loop(){ // This function is called repeatedly for as long as the board is powered

}

As you can see, it is surprisingly easy to use and also very powerful: with a few lines you can set up serial communication with your computer or configure a pin as an output, and that’s the beauty of it. Let’s say you have your own custom board, programmed with the Arduino IDE, where each pin is mapped to a specific memory location (à la the memory management concept above), and you want to use a certain pin for communication purposes. How would you go about setting up the system so that it works? [Figure: memory management between the application program and the operating system]
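To make the idea of pins mapped into memory concrete, here is a minimal C++ sketch that looks up which port register and bit sit behind an Arduino pin number. It assumes an ATmega328P-based, Uno-style board with the standard layout (digital pins 0 to 7 on port D, 8 to 13 on port B, A0 to A5 numbered 14 to 19 on port C) and the PORTx data-space addresses given in the ATmega328P datasheet. The struct and function names are hypothetical, and a custom board may be wired differently.

```cpp
#include <cstdint>

// Where a pin lives on an ATmega328P (Uno-style layout assumed).
struct PinLocation {
    char port;                   // 'B', 'C' or 'D'; '?' if the pin is out of range
    std::uint16_t portRegister;  // data-space address of the PORTx register
    std::uint8_t bit;            // bit position inside that register
};

// Map an Arduino pin number (0-19) to its port register and bit.
constexpr PinLocation locatePin(std::uint8_t pin) {
    return pin <= 7  ? PinLocation{'D', 0x2B, pin}
         : pin <= 13 ? PinLocation{'B', 0x25, static_cast<std::uint8_t>(pin - 8)}
         : pin <= 19 ? PinLocation{'C', 0x28, static_cast<std::uint8_t>(pin - 14)}
                     : PinLocation{'?', 0, 0};
}

// Bit mask to OR into the port register to drive the pin high.
constexpr std::uint8_t bitMask(std::uint8_t pin) {
    return static_cast<std::uint8_t>(1u << locatePin(pin).bit);
}
```

Under these assumptions pin 12 comes out as bit 4 of port B, so on such a board digitalWrite(12, HIGH) boils down to setting that one bit in the PORTB register.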

If I were a professional software developer with no access to the source code of your firmware (Board Support Package aka BSP, Firmware Update Process), or if the hardware manufacturer did not provide any information about the address mapping of specific pins, what would I do? Most probably I would reverse engineer the binary file they released in order to get an idea of how they did it in the first place.

How would you reverse engineer a binary file? Well, if you are a software developer you probably use every tool available for your platform, which should include at least the IDA disassembler; that’s what I do anyway. If you don’t have any programming experience or expertise, there is another way: look at things from a different perspective and try to figure out how code translates into the assembly instructions generated by the compiler. There is also a tool called Binary Ninja that lets users analyze firmware/board files. It is quite easy to get started as well: all you need to do is download the Virtual Machine Image and run it in a virtual machine. I personally recommend VirtualBox, even though it is a bit complex at first.

It has been used and tested by professionals to reverse engineer firmware/board files, not just Arduino but pretty much all the microcontrollers out there. It takes some time getting started with everything since there are so many features but once you start learning how it works, it only gets easier from here on out.

Here is an example of my board, which uses ATmega328P MCU manufactured by Atmel (The chip manufacturer), assembled in Mexico:

I found several disassembly logs for this particular board and they were all written by people who worked directly or indirectly with manufacturing company to help them fix bugs or add new features into existing BSPs. [Disassembly Logs were obtained via Reddit]

I will spare you the details, but here is a list of tools used to disassemble binary file so that we can see how it works on our end.

a) IDA Pro (Binary File Analysis Software)
b) OpenOCD (Open On-Chip Debugger, required for debugging MCU, which I actually didn’t use because at the time of writing this article I don’t have any JTAG hardware.)
c) GDBv7 (GNU Project Debugger V7, not sure if it matters since Binary Ninja has its own debugger built in.)
d) Atmel AVR Dragon (USB board based on STK500 development kit with JTAG Interface.)
e) DDMS (Binary Ninja Debugging Software.)
f) Arduino IDE (Open Source Software to develop and upload code into an Arduino Board, comes with Virtual Machine Image which we will be using to emulate MCU.)
g) AVR DCC (Programmer for Atmel microcontrollers.)

Technical Information: I am going to use the latest version of Atmel’s AVR Dragon as my debugging tool because, at the time of writing this article, it has been tested on Windows 7 x64 and Linux. You can buy one online, for example from Amazon, but since it is a bit expensive [EUR 100-150], there should be cheaper alternatives out there.

I used Arduino IDE r23 in a Windows environment, because it is free and available for Linux and Mac as well.

If you don’t have an Arduino board, I suggest going to Sparkfun or Amazon so that you can order Atmel’s AVR Dragon or any other JTAG hardware required to debug your MCU. You will also need a USB2TTL serial cable, which basically converts USB signals into TTL-level serial signals (I believe all such cables do, but make sure when ordering).

b) I did not actually use Atmel’s AVR Dragon on Linux machine but here are some instructions if you want to try out JTAG debugging: https://github.com/avrdude/avrdude/wiki/Connecting-with-Linux .

How would this whole process work? Well, first you will need a working Arduino IDE. Once you have connected Atmel’s AVR Dragon or any other JTAG debugger, select the correct COM port under “Tools->Options” in the Arduino IDE. Then load your hex file or binary file into Binary Ninja and set breakpoints on code that calls timing functions such as millis() and micros(). Now select any MCU I/O pin and confirm the breakpoint so that the code is paused in the debugger when that pin changes state. I know this is a bit complicated, but once you start putting everything together it actually goes quite smoothly.

After that you can use IDA Pro to find out how your MCU is programmed and put breakpoints into the code, but one thing at a time 🙂
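Before loading a hex file into any of these tools, it can help to check that it is intact. Arduino builds produce Intel HEX files, where every line is a record of the form ":LLAAAATT...CC" and all bytes of a record, checksum included, must add up to zero modulo 256. The following C++ sketch (C++14 or later) checks a single record. The function names are my own for this example and do not belong to Binary Ninja, IDA or the Arduino IDE.

```cpp
#include <cstddef>

// Value of one hex digit, or -1 if the character is not a hex digit.
constexpr int hexDigit(char c) {
    return (c >= '0' && c <= '9') ? c - '0'
         : (c >= 'A' && c <= 'F') ? c - 'A' + 10
         : (c >= 'a' && c <= 'f') ? c - 'a' + 10
         : -1;
}

// Returns true if `record` (one line, without the trailing newline) is a
// well-formed Intel HEX record with a consistent length and checksum.
constexpr bool isValidHexRecord(const char* record) {
    if (record[0] != ':') return false;        // every record starts with ':'
    unsigned sum = 0;                           // running sum of all bytes
    unsigned declaredLength = 0;                // first byte: number of data bytes
    std::size_t byteCount = 0;
    for (std::size_t i = 1; record[i] != '\0'; i += 2) {
        const int hi = hexDigit(record[i]);
        const int lo = hexDigit(record[i + 1]); // -1 if the line ends mid-byte
        if (hi < 0 || lo < 0) return false;
        const unsigned value = static_cast<unsigned>(hi * 16 + lo);
        if (byteCount == 0) declaredLength = value;
        sum += value;
        ++byteCount;
    }
    // Length byte + 2 address bytes + type byte + data bytes + checksum byte.
    if (byteCount != declaredLength + 5) return false;
    // The checksum is chosen so that the low byte of the total is zero.
    return (sum & 0xFFu) == 0;
}
```

For example, the end-of-file record ":00000001FF" that closes every Intel HEX file passes this check, while the same line with its last digit changed does not.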

More Details: I know it’s a bit boring, especially if you don’t have any background in binary stuff. But it gets way more fun once we start reverse engineering some firmware files! Let’s get started. First of all, download the latest version of Binary Ninja. It doesn’t take very long, so just wait until the download completes and install it into the Virtual Machine Image provided by the Arduino IDE.

When everything is installed correctly, Binary Ninja will be available under the “Tools->Binary Ninja” menu option of the Arduino IDE. Now go ahead and open a project containing your hex file or binary file, then select the program type (AVR, Arduino Uno), the MCU family and finally the hex file/binary file you want to open. After that, just hit the “Start Debugging” button in the top right corner and wait for the code to be uploaded to your MCU and loaded into memory. Now we need the GDB debugger, a debugging tool for target applications written in C, C++, Objective-C, Fortran and other languages; alternatively, use OpenOCD as I mentioned earlier. If you’re not a programmer, just copy and paste these commands into a terminal one by one:

$ sudo apt-get install gdb
$ sudo apt-get install gdb-avr   # on Debian/Ubuntu this package provides /usr/bin/avr-gdb; an ARM gdb linked under that name cannot debug AVR code

After that, navigate to “Tools->External Tools” and add gdb as your new tool by clicking on “Empty Tool Configuration”, then configure the following:

Program : /usr/bin/avr-gdb
Arguments : C:\Users\USERNAME\Documents\arduino_projets_BinaryNinjaGithubRepoNameTest1.elf
Executable File Path : That’s a bit tricky. Basically I used the Arduino IDE, but if you have installed another version of the Arduino IDE, make sure to replace it with your own Arduino application path.

Now you will have a new “Start Debugging in GDB” option inside arduino ide and that’s all!

After clicking on start debugging, wait for the terminal to show up. If everything goes well, you should see a message like “Program received signal SIGTRAP, Trace/breakpoint trap.” Now run the next command to display the assembly code:

disassemble main

(This might take some time depending on how big your binary file is.) Then, every time execution reaches any of the breakpoints you set earlier in Binary Ninja, the corresponding register values will be shown. You can navigate around using the “F12” key or just the arrow keys.

I hope this article was helpful, and let me know if you have any suggestions for my future posts. If you liked this article then please share it with your friends on social media or wherever!

Thank You,
Bijay Rajbhandari

In this Binary Ninja tutorial, I am going to show you how to set up the GDB debugger in Linux so that we can read MCU register values when execution reaches certain points in the code. I found this method very useful while reverse engineering the Arduino bootloader, which had undocumented EEPROM command functions. Since I wanted to know what was actually happening inside the Arduino application, it was really helpful and saved me lots of time! “Binary Ninja” is an amazing piece of software and it’s getting more powerful every day. However, if you want to do something really advanced, such as setting up hardware breakpoints, or you want to debug your application with OpenOCD instead of GDB, then please refer to my previous post about Reverse Engineering Arduino Bootloader 🙂

Binary Ninja is an awesome tool for reverse engineering binary files! However, I don’t like the idea that it’s closed source software and that they’re keeping many parts of the source code in their private repository. So what if those guys leave the company one day and the project gets discontinued? How can we be sure that some malicious person will not steal part of the source code? Yes, I know it’s a bit paranoid but still … 😉 In this post I am going to show you how to make your own version of BinaryNinja less vulnerable to such threats. I am not going to post any video in this tutorial because I have already shown you how to use BinaryNinja in my previous posts and I can’t really explain anything better than that. If you’re new, please follow these tutorials first:

Binary Ninja Tutorials :
– Reverse Engineering Arduino Bootloader
– Making C Pointers Table In C
– Disassembly Of ARM Assembly Code

Binary Ninja GitHub Repository : https://github.com/lion-informatique/binaryninja   *note : The “lion” repository is just a fork of original binary ninja source code contained in another repository, which is mentioned below.

This time we don’t need to clone the repository; instead we are going to download the whole “lion” repository as a zip file.

Then extract the archive, remove everything inside the “src” folder and also delete all the “.bmap.gz” and “-dbg” folders. You can do this by simply deleting whatever you want in Windows File Explorer, or from the command prompt like I did:

> del C:\Users\BinaryNinjaDev\Desktop\binaryninja_source_v1.5.0-rc3\binariesnk_lion_master_20150608T172815Z_installer\binariesnk-lion-master-20151119T090012Z\src\*
> del C:\Users\BinaryNinjaDev\Desktop\binaryninja_source_v1.5.0-rc3\binariesnk-lion-master-20151119T090012Z\binariesnk-dbg\*

Now delete the “bmap” folders (“src”, build, etc) and add “.bmap” extension to all those files that contain bmaps for your platform (by default there are ~200 of such files, e.g armv6m.bmp, armv7m.bmp etc). You can do this simply by opening finder write box and writing *.bmap , then press enter . Then you should see something like:

Now you can delete all those useless files as well. Open Terminal -> Type “ls -l” and you should see something like:

As you can see, there are lots of unnecessary stuff inside binariesnk_lion_master_20150608T172815Z_installer folder . If you don’t need it , go ahead and delete them by simply dragging them to your recycle bin . Now lets open the project in Visual Studio or Xcode (depending on what platform you want to build Binary Ninja for) . Click on “builds”-project-> release -> platform.cmake :

This part is identical for both VS and Xcode. As I’m using Windows I’ll explain how to do it on windows but if you’re using Xcode to build Binary Ninja then you should replace cmake arguments with appropriate compiler flags for your platform in the “macosx” folder:

You may ask why I am not using the CMakeLists.txt file and am instead forcing Visual Studio to use cmake manually. The answer is simple:
– A Makefile only works on unix based platforms, so it would be too difficult to make a cross platform project that everyone can compile on Windows, OSX and Linux (with autotools).
– There are lots of other options available for cmake, such as adding custom libraries, code optimization etc.
So this way we have maximum flexibility while building Binary Ninja. Now we’ll configure our own binary ninja by modifying the “CMAKE_ARGS” variables from:

to This way we are telling cmake to find “Ninja” binary (the ninja build system) inside binariesnk_lion_master_20150608T172815Z_installer folder and use it to perform all necessary builds. Also we’re telling cmake to build for the 32 bit platform by adding “-DCMAKE_OSX_ARCHITECTURES=i386” flag to CMAKE_ARGS. Now lets open the build log using command prompt :

and press enter . I am getting lots of errors which is normal, because Binary Ninja project was not built before and as you can see in the log there are some missing files , because I deleted some of them earlier. Now lets keep pressing enter until the build process is finished and then restart Visual Studio , open again build log and press enter until it finishes building Binary Ninja again :

This time there are fewer errors than before . Now we have a minimal Binary Ninja that has been successfully built with no errors , so we can test it by opening an interactive session:

Now to exit type > quit  Since I’m developing on Windows  I’ll modify “main.cpp” file to use system library instead of dependencies from ./binariesnk-dbg folder :

Now lets check if everything works correctly by launching interactive shell in Release/Debug mode (Ctrl+F5):

and this should show us something like this:

So it looks like everything is working correctly . Now I’m going to clean up my binariesnk_lion_master_20150608T172815Z_installer folder from all unnecessary stuff like binaries ,  cmake cache etc. and then compile again :

And this time the build process takes much less time than before. After some tests I realized that there are not many things missing, so compiling Binary Ninja inside Visual Studio works fine, but if you want to try compiling in Xcode then you’ll probably need more dependencies:
– gcc-4.2 (for arm64)
– libstdc++-v3.dylib (for x86 64 bit)

To get these, go to http://llvm.org/releases/download.html and copy the correct link to the libstdc++-v3.dylib file, then download the “Darwin” dmg (dmg stands for disk image) from http://developer.apple.com/tools/components/#Optional_Xcode_Components, open it with Finder and drag libstdc++-v3.dylib into the Xcode folder.

After you’ve done that , use cmake again to build Binary Ninja (I’m just showing Visual Studio steps for now , because I have VS installed on my development machine):

Now go to the build log and press enter . You should see something like this:

As you can see there are fewer errors now which means that Binary Ninja has been successfully built on my machine and you can use it as a starting point to build binary ninja on your own.

Now let’s see how flexible the CMake system is . Let’s add support for Windows platform:

And just run cmake again :

As you can see , we’ve added target “windows” without any additional options (cmake auto-detects some things during the build process). Now to try running binariesnk_lion_master_20150608T172815Z with Visual Studio go here:

and press enter. It should take even more time than before because the build now includes the full Ninja source code, with lots of tests, helpers and debug symbols. Now our binariesnk_lion_master_20150608T172815Z is ready to be used with Visual Studio. What next? Next time I’ll try building Binary Ninja in different IDEs and on different platforms, if there is any interest.
