PSA: I’d love a job where I could work on more than one of: Android BSPs, Drivers, Emulation/Virtualization, GPUs, LLVM. UK/USA/Germany preferred. Cambridge UK/London preferred even more.
I have a lot of experience bootstrapping firmware and debugging complex software problems. I have repeatedly ported Linux to different boards without any kind of datasheets. I also have experience developing higher-level software in various domains (Server backend, graphics rendering, video encoding).
I believe my strongest point is that I can quickly learn a new area of knowledge using a systematic approach. At work I am good at reverse-engineering, at prototyping or digging up examples for obscure undocumented systems, and at figuring out dependencies. I believe this is a competitive advantage because it allows me and my team to evaluate solutions and ideas from other vendors at a very early stage.
I also like to explain tech concepts to non-tech people.
Below I provide a detailed description of the projects I am proud to have been involved in. I believe my experience allows me to work on any aspect of firmware initialization and system-level debugging as well as virtualization, and I could also quickly fill enough gaps to work on a job involving compilers or media processing (DSP and GPGPU algorithms).
- Firmware/BSP: u-boot, LittleKernel (LK, TLK), UEFI, EDK2
- Tooling: GNU (GCC) Toolchain, clang, LLVM, coverage (gcov, lcov). Distributions: Debian, Yocto (OpenEmbedded)
- GPU: OpenGL/ES, Mesa, Metal, CUDA, OpenCL
Here are some areas I would be interested to work on. Ideally I would like a research/project bootstrapping opportunity to combine knowledge of several of the areas in one project.
I am currently working at Kaspersky Lab developing solutions based on a custom in-house operating system. While porting software I have implemented missing functionality across the software stack: Libc, VFS, Device Drivers.
I have ported GCOV to collect kernel code coverage data. I found out many people I knew also needed coverage, so I made a tutorial based on the Open-Source project - LittleKernel OS.
The project I worked on for almost a year was creating a compatibility layer to run VxWorks projects. I have ported the firmware of a networking appliance to run on our OS without loss of functionality. I have also developed a tool to convert VxWorks projects and a clang-based refactoring tool for C files. I also had to develop a QEMU-based simulator to speed up debugging of hardware initialization issues.
The project was a big success for our department and was highly praised by our management. We were able to demonstrate it at multiple conferences and attract new customers (though it is debatable whether that was our job as SW engineers or something the marketing department should have taken care of).
I made several blog posts vaguely describing the process.
I have used clang and LLVM to transform a lot of legacy code. One project I can discuss involved identifying variables that lacked initializers and initializing them with default values. The upshot is that I learned to use some of the LLVM APIs and also realized the need to explore other solutions.
For a couple of weeks in 2014 I worked with Chris Wade on his project of emulating the iPhone SoC using ARM CPUs with Hardware Virtualization Extensions.
At the end of 2013 I experimented with virtualization on an ARM Cortex-A15 CPU, the TI OMAP5432. I believe I was one of the first people outside ARM/TI to enable HYP mode (Hypervisor support) on this CPU in U-Boot.
I also got the XNU kernel (which is the kernel used by Apple in iOS and OS X) to run on my development board.
I was contacted by the author of iEmu, an emulator which allowed running the binary OS image of the iPhone 4 on an x86 PC. We started working on getting the emulation working on Cortex-A15 CPUs with Virtualization Extensions. I did some bootstrapping work, but then concentrated on my full-time job and GSoC.
During Summer 2014, while studying for a Master’s degree, I successfully completed a GSoC project with FreeBSD. Initially I wanted to port FreeBSD to some ARM board to learn about the FreeBSD kernel, but the mentor suggested porting FreeBSD to the Android Emulator so that everyone could test it. I have done that, and also ported the latest (nightly) version of the Android Emulator to FreeBSD. I had to do it because the stable version had a bug, and also because FreeBSD at that moment did not run 64-bit Linux binaries.
The exposure to the FreeBSD kernel, which used clang, got me interested in trying out the clang tools (ASan, AST rewriter), and I have subsequently used them at work to improve software quality and implement custom code analysis tools. I have a blog post about the GSoC experience:
At NPTV I was working on a video-streaming server. I have mostly learned a lot about networking and rendering from more senior engineers.
Initially I did a lot of bug-fixing, but later worked on performance optimization. I have implemented a simple algorithm to minimize the number of redraws in the scene by splitting the stage into 64 tiles and using the stencil buffer to repaint only the modified areas. I have also implemented geometry batching to significantly reduce the number of draw calls. That has improved performance by around 5-10 percent on average and dramatically in the case of complex view hierarchies.
We were perhaps one of the earliest users of AddressSanitizer and other sanitizers in production, and that allowed us to uncover a lot of bugs. However, we found out that enabling AddressSanitizer in the application breaks the NVIDIA driver, which provided part of our critical functionality.
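The tile bookkeeping behind that redraw optimization can be sketched roughly as follows. This is only an illustration in Python, not the production code: the 8 x 8 grid layout, the `dirty_tiles` name and the rectangle format are assumptions. The resulting set of tiles is what would be rendered into the stencil buffer before repainting.

```python
from typing import Iterable, Set, Tuple

GRID = 8  # assumed layout: 8 x 8 = 64 tiles

Rect = Tuple[int, int, int, int]  # x, y, width, height in pixels


def dirty_tiles(rects: Iterable[Rect], stage_w: int, stage_h: int,
                grid: int = GRID) -> Set[Tuple[int, int]]:
    """Return the (column, row) indices of all tiles touched by the rectangles."""
    tiles: Set[Tuple[int, int]] = set()
    for x, y, w, h in rects:
        # Clamp the rectangle to the stage and skip empty ones.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + w, stage_w), min(y + h, stage_h)
        if x0 >= x1 or y0 >= y1:
            continue
        # Integer math avoids rounding issues when the stage size
        # is not a multiple of the grid size.
        col0 = x0 * grid // stage_w
        col1 = (x1 - 1) * grid // stage_w
        row0 = y0 * grid // stage_h
        row1 = (y1 - 1) * grid // stage_h
        for row in range(row0, row1 + 1):
            for col in range(col0, col1 + 1):
                tiles.add((col, row))
    return tiles
```

Only the tiles in the returned set need to be marked in the stencil mask; everything else keeps the pixels from the previous frame.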
We ended up patching the LLVM toolchain to make it compatible with how the NVIDIA driver uses the memory. I have sent an email to the llvm mailing list, and some of the AddressSanitizer developers confirmed it was the only solution, albeit a fragile one.
My main project was porting our software to the Intel GPU H.264 Encoder. The problem I spent most of my time on was implementing zero-copy texture sharing between the OpenGL renderer and the H.264 Encoder. The Linux OpenGL stack only allows getting the memory descriptor (libdrm BO) for OpenGL ES objects, so I had to come up with a hack to implement similar functionality for non-ES OpenGL.
The result was that our encoding frame rate jumped from 30FPS to 300FPS since it was now not limited by the PCI bandwidth. This allowed us to process up to four clients in 4K resolution on a single Core i3 machine, and it uncovered a potential for cutting the costs dramatically compared to NVIDIA GPUs.
I have used the OMAP5432UEVM board extensively to do a lot of development for ARM.
During the time I worked on the project, Linux kernel support for ARM HYP mode and LPAE was still being developed, so I had to introduce a set of hacks to get it working. I started threads on LKML to discuss these hacks. After making sure they were correct and learning that people on LKML were already working on this topic, I decided to keep my own tree for my purposes and wait for others to commit the missing pieces upstream.
Even though I did not push changes upstream, I had a working solution enabling virtualization via HYP mode almost 1.5 years before upstream software. Numerous people have contacted me to thank me and to request help setting up this configuration, and I’m pleased I was able to help them.
I was interested in learning about driver and BSP development for the XNU kernel and decided to port XNU to OMAP5. I used the ARM port by the developer who goes by the handle “winocm”. The code that I have published and described in my blog allows one to boot XNU and see the debugging UART output. I have subsequently used this kernel in an emulation project, adding support for additional peripherals.
At Ksys Labs (ru) I was working with the microkernels - namely, Fiasco.OC, Intel NOVA and the Genode OS. I was working on the projects in a team of two together with my supervisor. We have communicated with the original authors of the projects both online and in real life during the conferences.
Our biggest achievements over the course of 2 years were:
For the purpose of removing the proprietary binary blobs from the Android code, I needed to reverse-engineer the firmware loader for the modem. The project called “Replicant”, a FOSS Android distro, already supported the older generation of Samsung modems, so I took the majority of the other code from it and later contributed my changes back upstream.
At work I still had to keep a fork of the Replicant code because I needed some fixes for stability (memory corruption) and functionality (non-ascii SMS decoding) which I was unable to get accepted upstream for some time.
I ported the Linux kernel to various phones and tablets. My aim was to learn how to do driver and OS kernel development. Another goal was getting a fully Open-Source (FOSS) software stack with no vendor-supplied drivers.
My biggest achievement was the port to the Sony Xperia X1 phone, which originally ran Windows CE. I have also contributed to the ports to other similar devices which were mainly done by other people (HTC HD Mini, HTC Rhodium).
I also blogged a bit about my experience with OEM Linux. My pet peeve with OEM/Hardware vendors is that when they design a BSP, they always hack the internals of driver frameworks, copy-paste drivers and abuse platform data/FDT. The result is that updating the device to a newer kernel is almost like a rewrite from scratch. It hurts both customers/enthusiasts (who don’t want to spend their time on that) and vendors (updating to a new Android release takes at least half a year and as the result almost no devices receive updates).
Besides, I have submitted two minor patches that were accepted to the upstream Linux Kernel. They are really minor, because they fix small logic errors and introduce little functionality. I wanted to submit a large patch with the refactoring of Qualcomm RPC driver, but got talked out of it by a Qualcomm engineer because future SoCs were a total redesign. However, I have learnt how to push patches upstream with git-send-email.
I decided to avoid trying to work with upstream because keeping up with the pace while working alone or in small teams is too hard, and reverse-engineering drivers and developing a BSP takes time comparable to the life-cycle of the device.
I was interested in porting Android to a mobile phone that originally ran Windows Mobile. The phone was the Sony Ericsson Xperia X1, known by its OEM name “HTC Kovsky”. I was later contracted to repeat part of the process for the HTC Photon, which had the MSM7227 CPU and was identical to its Android counterpart, which made the job easier.
I ported the Linux Kernel, wrote most of the drivers, and ported the Qualcomm LK Bootloader, which allowed Linux to run natively without chain-loading through Windows CE. I consider it a good achievement because all hardware worked as well as in the original Windows CE firmware, and because many people in the xda-developers.com community have used the firmware and thanked me.
In the course of the project I have got to know many people from the OpenMoko hardware community. I have also ported various non-Android distributions including SHR Linux and Mer (early predecessor of both Sailfish and Tizen). Reverse-engineering the driver for the Camera signal processor (Video Front End or VFE) was my introduction to MMU, DMA and peripheral memory management.
I have also learnt a bit about the architecture of the X11 and Android windowing systems while porting libhybris. Libhybris is an adaptation layer that allows mixing code with incompatible ABIs (GNU GLIBC and Android Libc) in the same application. It was used to run Android OpenGL drivers in X11.
The Qualcomm SoCs have two CPUs: the Application CPU (AP, Application processor) and the Modem CPU (BP, baseband processor). The BP runs the firmware called AMSS which controls most peripherals. AP uses a custom RPC to request BP to perform certain actions. Between different versions of Android and AMSS the RPC format differs. Windows Mobile/CE AMSS is older and does not control many peripherals. The access to them is given to AP. Therefore drivers had to be written for Linux.
At the time I started there was a complete Android port for one of the older MSM SoCs, which provided a lot of invaluable information. The initial port for my device and similar ones was done by the htc-linux.org project, but it did not support a lot of peripherals and was not of production-grade quality. It was not something I could use daily on the phone without being embarrassed. My goal was to make a firmware that would boot from internal memory (NAND) and have ALL the hardware working exactly as in Windows Mobile or better. With the exception of the front camera and the camera autofocus, this goal was accomplished over the course of a year.
I have done the following work on the Linux Kernel at the HTC-Linux.org project:
On the userspace side I have:
The following was done to LK Bootloader (an open-source bootloader used by Qualcomm):
The source code for this project is available at the links below.
I consider it a great project because in the process I made a lot of connections to people throughout the industry, and our kernel was even used by one of the ChromeOS engineers internally. Besides, these projects got me a high-paying and challenging job before I even graduated.
I was working with the Acer A500 tablet powered by the NVIDIA Tegra 2 SoC. My aim was to run a GNU/Linux distro with the X11 subsystem instead of Android. I have ported the BSP for this particular tablet to the ChromeOS kernel, which at that point was the upstream for GNU/Linux (non-Android) kernel development from NVIDIA.
Acer delayed the source code release for a long period of time. I discovered (by analysing GPIO and register dumps obtained under Linux/Android) that the tablet was very similar to the NVIDIA Ventana reference board and to a tablet from another vendor, namely the Asus TF101. I was able to port the kernel and have the majority of the hardware working, with the exception of the front camera, long before the official source code release by Acer. I was later joined by another developer who used my kernel tree for running GNU/Linux on the Asus TF101.
This kernel was later used by [] from ChromeOS project who set up a platform for building ChromeOS images for Asus TF101.
At the time I was working on the Tegra project, the Linux Kernel introduced the unified kernel, which could boot across multiple ARM SoCs using a single binary. That posed certain requirements on the OEM bootloaders. Since almost all of the OEM Tegra boards were providing the same MTYPE (Machine Type), I needed a way to boot newer-style kernels without touching the OEM bootloader. I decided to write a wrapper that would, at compile time, prepend assembly code to the Linux binary image to load the correct MTYPE value into the corresponding ARM register.
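The idea of such a wrapper can be illustrated with a small sketch. It is written in Python as a post-processing step rather than as part of the kernel build, and it is not the original code: `wrap_kernel` is a hypothetical name, and the sketch assumes a position-independent image entered in ARM (not Thumb) mode, following the ARM Linux boot convention that r1 carries the machine type.

```python
import struct

# ARM (A32) instruction encodings, stored little-endian:
LDR_R1_PC = 0xE59F1000  # ldr r1, [pc]  (pc reads as +8, so this loads the word at offset 8)
B_SKIP = 0xEA000000     # b <offset 12> (jumps over the literal word)


def wrap_kernel(image: bytes, mtype: int) -> bytes:
    """Prepend a 12-byte stub that sets r1 = mtype and falls into the image."""
    if not 0 <= mtype <= 0xFFFFFFFF:
        raise ValueError("machine type must fit in 32 bits")
    stub = struct.pack("<III", LDR_R1_PC, B_SKIP, mtype)
    return stub + image
```

With this layout the bootloader still jumps to offset 0, the stub overrides whatever value the bootloader put in r1, and execution continues at the original entry point 12 bytes later.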
I have also ported Linux kernel and Trolltech QTopia to Asus P525 PDA, and my friend ported it to Asus P535.
I had two laptops with discrete graphics. I have tried fixing the “vga_switcheroo” module in Linux Kernel to improve power-saving. After a lot of discussions on LKML and IRC I was unable to find a solution that would be accepted upstream. However, I was able to solve all power-draining issues locally. A detailed write-up is available in my blog.
The Intel GPU SDK had a tool called “intel-gpu-dump” to read out the contents of the GPU configuration registers. I wrote a Python tool that writes back (restores) the GPU state as dumped by that tool. I used it for debugging eDP panel initialization issues on an Apple MacBook laptop.
My experience with both software and embedded systems started when I had a smartphone running Windows Mobile 5.0. The interesting thing about Windows Mobile was that it was composed of independent software packages and designed with OTA updates in mind. However, all binaries were statically linked to a fixed address, which prevented porting them from other devices. Together with a group of fellow hackers I have developed a tool to relocate the Windows CE Modules (DLLs and EXEs). This made it possible to port firmware from other devices.
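A restore tool of this kind boils down to parsing the dump and replaying the writes. The sketch below is not the original script: the `0x<offset>: 0x<value>` line format and the `write_reg` callback are assumptions made for illustration, and the real dump format may differ.

```python
import re
from typing import Callable, Iterator, Tuple

# Hypothetical dump line format: "0x00061140: 0x80000000  optional comment"
LINE_RE = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s*:\s*(0x[0-9a-fA-F]+)")


def parse_dump(text: str) -> Iterator[Tuple[int, int]]:
    """Yield (register offset, value) pairs, skipping lines that do not match."""
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            yield int(match.group(1), 16), int(match.group(2), 16)


def restore(text: str, write_reg: Callable[[int, int], None]) -> int:
    """Replay every dumped register through write_reg; return the number of writes."""
    count = 0
    for offset, value in parse_dump(text):
        write_reg(offset, value)
        count += 1
    return count
```

Passing the write function as a callback keeps the parser testable without hardware; on a real machine it would be backed by whatever MMIO access mechanism is available.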
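The core step of relocating a module is patching every absolute address by the difference between the old and new base. The Python sketch below shows only that step and makes a big assumption: that the list of fixup offsets is already known. Recovering that information from Windows CE modules was the hard part and is not shown; `rebase` and its parameters are hypothetical names.

```python
import struct
from typing import Iterable


def rebase(image: bytes, fixups: Iterable[int],
           old_base: int, new_base: int) -> bytes:
    """Add (new_base - old_base) to each 32-bit little-endian word at the fixup offsets."""
    delta = new_base - old_base
    out = bytearray(image)
    for offset in fixups:
        (value,) = struct.unpack_from("<I", out, offset)
        # Wrap around at 32 bits, as the hardware would.
        struct.pack_into("<I", out, offset, (value + delta) & 0xFFFFFFFF)
    return bytes(out)
```

Words that are not listed as fixups are left untouched, which is why an accurate fixup list matters: patching a constant that merely looks like an address corrupts the module.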
As a member of various online communities devoted to tinkering/hacking smartphones and PDAs, I have done the following:
To learn how to use OpenGL 2.0+ and shaders, I wrote a simple engine. While writing it I learnt about certain OpenGL peculiarities (such as glBufferData). https://github.com/astarasikov/sxge
While it is quite basic, it was useful because I have used it to practice and prototype a smooth animation concept and also used it for evaluating zero-copy texture sharing between OpenGL and the H.264 encoder on the Intel GPU platform. See the branch for details.
I got myself an Apple laptop and a Dell multitouch display. Then I realized there was no OS X driver for the display, so I decided to write one myself to get familiar with OS X APIs. The blog post in the link describes the process in more detail.
I decided to learn both the Swift language and the Metal GPU API, and also to brush up on ray tracing and ray casting, so I made a simple demo with distance functions in Metal and Swift.
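For readers unfamiliar with distance functions, the ray-marching loop at the heart of such a demo looks roughly like this. The sketch is in Python for brevity and is not taken from the Metal/Swift demo; the function names and the step and distance limits are illustrative assumptions.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def sphere_sdf(p: Sequence[float], center: Sequence[float], radius: float) -> float:
    """Signed distance from point p to the surface of a sphere."""
    return math.dist(p, center) - radius


def march(origin: Vec3, direction: Vec3, sdf: Callable[[Vec3], float],
          max_steps: int = 64, max_dist: float = 100.0,
          eps: float = 1e-4) -> Optional[float]:
    """Sphere tracing: step along the ray by the distance to the nearest surface.

    direction is expected to be normalized. Returns the hit distance or None.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t
        t += dist
        if t > max_dist:
            break
    return None
```

For example, `march((0, 0, 0), (0, 0, 1), lambda p: sphere_sdf(p, (0, 0, 5), 1.0))` finds the near side of a unit sphere centered five units down the z axis. On a GPU the same loop runs once per pixel in a fragment or compute shader.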
I ported Libfreenect2 renderer from OpenCV to OpenGL 3.2 for Mac OS X for my experiments.
This is an application that was intended as a copy of a Windows-only application called ISE2. It allows one to search for images in binary files by specifying the offset in bytes or pixels and the byte layout (such as RGB565, RGB888, etc.).
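Interpreting raw bytes under a given layout is the basic operation of such a viewer. Below is a minimal Python sketch for RGB565, not the application's code: the function name, the default little-endian byte order and the bit-replication expansion to 8 bits are assumptions.

```python
from typing import List, Tuple


def decode_rgb565(data: bytes, offset: int, width: int, height: int,
                  little_endian: bool = True) -> List[Tuple[int, int, int]]:
    """Decode width*height RGB565 pixels starting at a byte offset into RGB888 tuples."""
    end = offset + 2 * width * height
    if offset < 0 or end > len(data):
        raise ValueError("image does not fit into the data at this offset")
    order = "little" if little_endian else "big"
    pixels = []
    for pos in range(offset, end, 2):
        word = int.from_bytes(data[pos:pos + 2], order)
        r5, g6, b5 = (word >> 11) & 0x1F, (word >> 5) & 0x3F, word & 0x1F
        # Expand to 8 bits by replicating the top bits into the low bits,
        # so that full intensity maps to 255.
        pixels.append(((r5 << 3) | (r5 >> 2),
                       (g6 << 2) | (g6 >> 4),
                       (b5 << 3) | (b5 >> 2)))
    return pixels
```

Trying different offsets, widths and byte orders until a recognizable picture appears is the usual workflow when hunting for bitmaps in firmware images.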