Karthik's Blog

About Me

Fri, 19 Jan 2024 00:00:00 +0000

Hello, I am Karthik, a Computer Science PhD student at Stony Brook University. My current research covers Kernel Security, Static Analysis, Fuzzing, LLVM, and ARM TrustZone.

Check out my development setup

Early Detection of Configuration Errors to Reduce Failure Damage

Fri, 19 Jan 2024 00:00:00 +0000

Paper: Early Detection of Configuration Errors to Reduce Failure Damage

Defines Latent Configuration (LC) errors: errors caused by insufficient validation of a configuration value, which stay undetected until the value is actually used
- There can be a long gap between loading the configuration, generally in the initialization phase, and actually using it (thus latent).
When such configurations are related to reliability, availability or serviceability (RAS), LC errors can lead to downtime.
Two main issues with such configs
- The values are not checked at all. eg: check if file exists
- The values are not checked according to the usage. eg: value is used in open(config_value, WRITE)
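A minimal sketch of the second issue, checking a value the way it will later be used. This is not the paper's checker; the `*_path` naming convention and both function names are assumptions made up for the example.

```python
import os
import tempfile


def check_writable_path(path):
    """Return None if `path` looks openable for writing, else a reason string."""
    if os.path.isdir(path):
        return "is a directory"
    if os.path.exists(path):
        return None if os.access(path, os.W_OK) else "exists but is not writable"
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        return "parent directory does not exist"
    if not os.access(parent, os.W_OK):
        return "parent directory is not writable"
    return None


def validate_config(config):
    """Check every (hypothetical) '*_path' option right after loading."""
    errors = {}
    for key, value in config.items():
        if key.endswith("_path"):
            reason = check_writable_path(value)
            if reason:
                errors[key] = reason
    return errors


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = {
            "log_path": os.path.join(tmp, "app.log"),
            "dump_path": os.path.join(tmp, "missing", "core"),
        }
        print(validate_config(config))
```

A plain "does the file exist" check would reject `log_path` even though opening it for writing would succeed, which is why the check follows the usage.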
The paper implements a checker based on static analysis and instrumentation
Static analysis:
- Taint analysis to go from the configuration to the actual usage along the data flow path. Control flow is ignored in most cases to avoid over-tainting
- Along with these instructions, the dependent values are also extracted. Eg: open(config_value, permission) <- here permission is a dependent value
- Any usage whose dependent value cannot be determined is skipped. Eg: a dependent value read from network
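A toy sketch of the data-flow idea on a made-up three-address IR. The IR format, the op names and `find_config_usages` are all assumptions for illustration; the real analysis runs on actual program code.

```python
def find_config_usages(program, sinks=("open", "connect")):
    """Follow config values along data flow and collect their usages.

    `program` is a list of (dest, op, args) tuples in a hypothetical IR.
    Returns (config_key, sink_op, dependent_constant_values) tuples.
    """
    tainted = {}    # variable -> config key it derives from
    constants = {}  # variable -> statically known value
    usages = []
    for dest, op, args in program:
        if op == "get_config":
            tainted[dest] = args[0]
        elif op == "const":
            constants[dest] = args[0]
        elif op in sinks:
            keys = [tainted[a] for a in args if a in tainted]
            if not keys:
                continue
            deps = [a for a in args if a not in tainted]
            # Skip the usage if any dependent value is not statically known.
            if all(d in constants for d in deps):
                usages.append((keys[0], op, [constants[d] for d in deps]))
        else:
            # Plain data flow (copy, concat, ...): propagate the taint.
            for a in args:
                if a in tainted:
                    tainted[dest] = tainted[a]
                    break
    return usages


program = [
    ("dir", "get_config", ["data_dir"]),
    ("mode", "const", ["w"]),
    ("suffix", "const", ["/out.log"]),
    ("path", "concat", ["dir", "suffix"]),
    ("fd", "open", ["path", "mode"]),
    ("host", "get_config", ["peer"]),
    ("port", "recv", []),             # value only known at run time
    ("sock", "connect", ["host", "port"]),
]
print(find_config_usages(program))
```

The `connect` usage is dropped because `port` comes from the network, matching the "skipped" rule above.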
Instrumentation:
- Code is generated to perform the same checks as the actual usage, but in a “sandboxed” manner
- Here any side effect on the program is avoided. Eg: a local copy of global value is used instead of the actual global value.
- Utilities are written to check the actions performed by some library and system calls.
This generated code is run right after the initialization phase of the program
- The developer needs to annotate two things
  - The interface of how configuration values are fetched
  - The place where program moves from initialization state to execution
TOCTOU issues are avoided by adding support to run these checkers regularly in a thread
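A rough sketch of running the generated checks once after initialization and then periodically in a thread. `start_periodic_checks` and its parameters are hypothetical names, not the paper's interface.

```python
import threading


def start_periodic_checks(checks, interval_s, on_error, stop_event):
    """Run every check immediately, then again every `interval_s` seconds.

    `checks` maps a config key to a zero-argument function that returns
    None on success or a reason string on failure.
    """
    def loop():
        while True:
            for name, check in checks.items():
                reason = check()
                if reason:
                    on_error(name, reason)
            # wait() returns True once stop_event is set, ending the loop.
            if stop_event.wait(interval_s):
                return

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The first pass corresponds to the check right after initialization; later passes catch values that became invalid afterwards, for example a directory that was deleted in the meantime.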

Efficient Scalable Thread-Safety-Violation Detection

Fri, 19 Jan 2024 00:00:00 +0000

Paper: Efficient Scalable Thread-Safety-Violation Detection

Existing solutions
- Static or dynamic analysis to identify the potential buggy locations to inject delays.
  - Injects a small number of delays but needs a large analysis time
- Inject probabilistic random delays
  - Injects a large number of delays but needs little analysis time
- TSVD tries to find the middle ground
TSVD employs two techniques to select the points to inject delays
- Near miss tracking
- Happens before relationship identification
Near miss tracking
- Identify two operations on the same thread-unsafe object, at least one of which is a write, that happen close to each other in time on different threads
- If the time difference falls within the threshold, mark them as a dangerous pair
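A small sketch of near miss tracking over a recorded access log. The `Access` record, the location strings and the threshold value are assumptions for the example; TSVD itself does this at run time inside its instrumentation.

```python
from collections import namedtuple

Access = namedtuple("Access", "time_ms thread obj location is_write")


def find_dangerous_pairs(accesses, threshold_ms):
    """Return location pairs that nearly overlapped on different threads."""
    pairs = set()
    recent_by_obj = {}  # object -> accesses still inside the time window
    for acc in sorted(accesses, key=lambda a: a.time_ms):
        recent = recent_by_obj.setdefault(acc.obj, [])
        # Drop accesses that are too old to count as a near miss.
        recent[:] = [p for p in recent if acc.time_ms - p.time_ms <= threshold_ms]
        for prev in recent:
            if prev.thread != acc.thread and (prev.is_write or acc.is_write):
                pairs.add(tuple(sorted((prev.location, acc.location))))
        recent.append(acc)
    return pairs


log = [
    Access(0, 1, "dict", "A.cs:10", True),
    Access(3, 2, "dict", "B.cs:20", False),   # near miss with A.cs:10
    Access(500, 2, "dict", "B.cs:30", True),  # too far from earlier accesses
    Access(510, 1, "list", "A.cs:40", False),
    Access(512, 2, "list", "B.cs:50", False),  # two reads, not a conflict
]
print(find_dangerous_pairs(log, threshold_ms=100))
```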
Happens Before relationship identification
- If adding a delay at location 1 also delays the execution of location 2, location 1 likely happens before location 2, so the two cannot run concurrently
Delay is injected on all such pairs
- The delay probability is decayed if a pair does not trigger an error
- Once the probability of delay drops to 0, the pair is removed from the dangerous pair list
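A sketch of the decay logic, assuming a fixed subtractive step. The `DelayPlanner` class and the step size are assumptions, and the actual decay schedule in TSVD may differ.

```python
import random


class DelayPlanner:
    """Tracks the delay probability of each dangerous pair."""

    def __init__(self, step=0.25, rng=None):
        self.step = step          # hypothetical decay step
        self.rng = rng or random.Random()
        self.prob = {}            # pair -> probability of injecting a delay

    def add_pair(self, pair):
        self.prob.setdefault(pair, 1.0)

    def should_delay(self, pair):
        return self.rng.random() < self.prob.get(pair, 0.0)

    def report(self, pair, triggered_error):
        """Decay the pair after a delay that did not expose a violation."""
        if triggered_error or pair not in self.prob:
            return
        self.prob[pair] -= self.step
        if self.prob[pair] <= 0:
            del self.prob[pair]   # no longer a dangerous pair
```

With a step of 0.25 a pair is dropped after four delays that expose nothing, which bounds the overhead spent on pairs that never race.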
Built to support .NET projects
- Instrumentation and Runtime library
Evaluation
- Why were some bugs missed?
  - The two operations are close to each other only in some rare executions
  - False positives in the happens-before prediction
  - Delay injection was not sufficient to expose the bugs

kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels

Fri, 19 Jan 2024 00:00:00 +0000

Paper: kAFL

Feedback fuzzing of closed source kernel mode components
Feedback using hardware capabilities
- Intel PT
  - Q: What does it give?
Challenges with kernel fuzzing
- Lots of states
- Interrupts and threads
- No straightforward way to “invoke” the kernel
Technical details
- x86-64: The virtual address space is split into kernel and userspace halves
  - Total virtual address space: 2^48
    - Why?
  - Each half gets 2^47
  - Switching from user to kernel on syscalls does not switch the page table
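The arithmetic behind the split, as a small sketch assuming standard 4-level paging with 4 KiB pages (9 index bits per level plus a 12-bit page offset, which is where the 48 bits come from). The constant and function names are made up for the example.

```python
VA_BITS = 48
PAGE_OFFSET_BITS = 12
INDEX_BITS_PER_LEVEL = 9
LEVELS = 4
assert PAGE_OFFSET_BITS + LEVELS * INDEX_BITS_PER_LEVEL == VA_BITS

USER_END = (1 << (VA_BITS - 1)) - 1               # 0x00007fffffffffff
KERNEL_START = (1 << 64) - (1 << (VA_BITS - 1))   # 0xffff800000000000


def is_canonical(addr):
    """Bits 63..47 must be all zeros or all ones."""
    top = addr >> (VA_BITS - 1)
    return top == 0 or top == (1 << (64 - VA_BITS + 1)) - 1


def half(addr):
    if not is_canonical(addr):
        return "non-canonical"
    return "user" if addr <= USER_END else "kernel"


print(hex(USER_END), hex(KERNEL_START))
print(half(0x00007FFFDEADB000), half(0xFFFFFFFF81000000), half(0x0000800000000000))
```

Each half covers 2^47 bytes, and everything between the two halves is non-canonical.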
Intel Processor Trace
- Three packet types
  - Taken-Not-Taken: For conditional jumps, tell if a branch is taken or not
  - Target-IP: Indirect jumps, target IP
  - Flow Update Packets: Interrupts and async events.
- Filters can be added to these
  - IP range
  - Privilege level / ring
  - CR3 filter: Only when the cr3 value matches. Helps in filtering per process
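A toy sketch of why the trace is so compact: the decoder walks the control flow itself and only consumes a packet where the path cannot be known statically. The CFG encoding and `reconstruct_path` are assumptions; real TNT packets pack several taken/not-taken bits, here each packet carries one.

```python
def reconstruct_path(cfg, entry, packets):
    """Rebuild the executed block sequence from a packet stream.

    cfg maps a block name to one of:
      ("cond", taken_target, not_taken_target)
      ("jmp", target)      direct jump, needs no packet
      ("indirect",)        target comes from a TIP packet
      ("end",)
    """
    stream = iter(packets)
    path = [entry]
    block = entry
    while cfg[block][0] != "end":
        node = cfg[block]
        if node[0] == "jmp":
            block = node[1]
        elif node[0] == "cond":
            kind, taken = next(stream)
            if kind != "TNT":
                raise ValueError("expected a TNT packet")
            block = node[1] if taken else node[2]
        else:  # indirect jump
            kind, target = next(stream)
            if kind != "TIP":
                raise ValueError("expected a TIP packet")
            block = target
        path.append(block)
    return path


cfg = {
    "A": ("cond", "B", "C"),
    "B": ("jmp", "D"),
    "C": ("jmp", "D"),
    "D": ("indirect",),
    "E": ("end",),
    "F": ("end",),
}
print(reconstruct_path(cfg, "A", [("TNT", True), ("TIP", "E")]))
```

Two packets are enough to recover a four-block path, but only because the decoder has the code (here `cfg`) available.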
System Design
- Components
  - Host user space process: kAFL
  - QEMU-PT + KVM-PT for getting the processor trace from guest
  - Usermode agent in the target OS
- Setup:
  - Agent performs a hypercall to provide the address of the kernel panic handler
  - Host patches this handler to get feedback on a crash
    - Instead of waiting for the timeout
  - Then CR3 is exchanged from agent to host
    - This is used to set the filter
  - Then a shared memory address is exchanged where the agent expects the input for fuzzing
  - Fuzzing loop starts
  - While fuzzing is being performed, QEMU-PT decodes the trace
  - When the agent is done, it sends a hypercall (hc_finished).
    - On this VM-Exit, tracing stops
- Fuzzing logic
  - This is the core and works similarly to AFL
  - Also runs fuzzing in parallel
    - Most fuzzing is not CPU bound, so this helps
- User mode agent
  - Broken into loader and agent
  - Agent lets you run an arbitrary program, thus making it easier
  - The loader also checks whether the program crashed so that it can restart it
- KVM-PT
  - This helps in tracing a virtual CPU instead of a logical CPU
  - By enabling on vm-entry and disabling on vm-exit
- QEMU-PT
  - QEMU-PT also filters the stream of executed addresses—based on previous knowledge of non-deterministic basic blocks—to prevent false-positive fuzzing results, and makes those available to the fuzzing logic as AFL-compatible bitmaps
  - ???
- Also cache the disassembly results to speed up populating the bitmap
- Stateful and non deterministic
  - Interrupts generate non-deterministic executions
  - So the fuzzer runs the program multiple times with the same input and identifies such basic blocks
  - Adds them to a blacklist
  - Blacklisted blocks are ignored when updating the coverage map
- Hypercalls
  - Accessible from ring3
  - So add custom hypercalls that can help in fuzzing
    - Eg: crash, ask for input
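A sketch of the blacklist idea from the "Stateful and non deterministic" notes, combined with an AFL-style edge bitmap. The edge hash (`cur ^ prev`, then `prev = cur >> 1`) is the scheme AFL documents; using raw block addresses and these function names are assumptions, and kAFL's own hashing may differ.

```python
MAP_SIZE = 1 << 16


def find_nondeterministic_blocks(traces):
    """Blocks that appear in some runs of the same input but not in all."""
    sets = [set(t) for t in traces]
    return set.union(*sets) - set.intersection(*sets)


def update_bitmap(bitmap, trace, blacklist):
    """Record edges of `trace`; return True if a new edge was seen."""
    new_edge = False
    prev = 0
    for addr in trace:
        if addr in blacklist:
            continue  # ignore blocks known to be non-deterministic
        idx = (addr ^ prev) % len(bitmap)
        if bitmap[idx] == 0:
            new_edge = True
        bitmap[idx] = min(bitmap[idx] + 1, 255)
        prev = addr >> 1
    return new_edge


runs = [
    [0x10, 0x20, 0x30],
    [0x10, 0x99, 0x20, 0x30],  # 0x99: an interrupt handler block
    [0x10, 0x20, 0x30],
]
blacklist = find_nondeterministic_blocks(runs)
bitmap = bytearray(MAP_SIZE)
print(update_bitmap(bitmap, runs[0], blacklist))  # first run adds edges
print(update_bitmap(bitmap, runs[1], blacklist))  # interrupt adds nothing new
```

Without the blacklist the second run would look like new coverage, and the fuzzer would keep an input that found nothing.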
KVM-PT
- vCPU specific traces
  - MSR autoload feature lets you load MSRs on exit or entry
- Continuous tracing
  - Uses ToPA
    - Table of Physical Addresses
    - Each address is associated with behavior on overflow
      - First -> interrupt
      - Second -> Stop tracing
        - But keep it large enough for this to never happen
  - On overflow, the interrupt triggers and results in a VM exit
  - Buffer is cleared and switched back to the VM
QEMU-PT
- Userspace application to interact with KVM-PT
- Controls when tracing starts and stops
- Also decodes the trace to generate an AFL map
- Our Intel PT software decoder acts like a just-in-time decoder, which means that code sections are only considered if they are executed according to the decoded trace data
  - ???
Discussion
- OS specific code
  - Not a necessity but improves fuzzing (cr3 value, custom process to test kernel)
- Kernel JIT
  - Out of scope
  - But very interesting
  - Intel PT does not give all the instruction pointers and needs the executable to decode the trace
    - Becomes tricky

Rx: Treating Bugs As Allergies— A Safe Method to Survive Software Failures

Fri, 19 Jan 2024 00:00:00 +0000

Paper: Rx: Treating Bugs As Allergies— A Safe Method to Survive Software Failures

Software failure recovery to make software more available
Makes use of Checkpointing and Rollback to revert to an older state
Then makes some environmental changes and continues the execution of the application.
- If none of the changes work, it goes back one more checkpoint and retries
Components
- Proxy: Separates client and server interactions and helps in the saving and replay of requests upon re-execution.
- Sensors: Identify when there is an error in the application using exceptions, interrupts etc.
- Checkpointing and Rollback
  - Based on: Flashback
  - Deletes the oldest checkpoints based on strategies.
- Environmental wrappers: For modifying environment during re-execution
  - Memory allocation wrappers: eg: zero fill, add padding
  - Scheduling wrapper to change the unit of time for scheduling
  - User request dropping
- Control unit: Coordinates with all the components
  - Also provides useful information for the programmer to diagnose and fix errors.
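A toy sketch of the rollback and re-execute loop. Everything here is an assumption for illustration: the "server" is a buffer with a fixed capacity, a checkpoint is a `(snapshot, logged_requests)` tuple, and the only environmental change is allocation padding.

```python
import copy


class SoftwareFailure(Exception):
    """Stands in for what a sensor would report (exception, signal, ...)."""


def handle(state, request, env):
    """Toy request handler that overflows when a request is too large."""
    capacity = state["capacity"] + env.get("padding", 0)
    if request > capacity:
        raise SoftwareFailure("buffer overflow")
    state["handled"].append(request)


def recover(checkpoints, env_changes):
    """Try each environment change from the newest checkpoint backwards.

    Returns (recovered_state, env_that_worked), or None if nothing helps.
    """
    for snapshot, logged_requests in reversed(checkpoints):
        for env in env_changes:
            state = copy.deepcopy(snapshot)      # rollback
            try:
                for request in logged_requests:  # replay via the proxy log
                    handle(state, request, env)
            except SoftwareFailure:
                continue                         # try the next change
            return state, env
    return None


checkpoints = [
    ({"capacity": 4, "handled": []}, [2, 3, 6]),   # older checkpoint
    ({"capacity": 4, "handled": [2]}, [3, 6]),     # newest checkpoint
]
env_changes = [{}, {"padding": 1}, {"padding": 4}]
print(recover(checkpoints, env_changes))
```

The environment that made the re-execution succeed (here the padding) is the kind of hint the control unit can hand to the programmer for diagnosis.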
Tested on Squid, Apache, CVS, MySQL

My development setup

Fri, 19 Jan 2024 00:00:00 +0000

Windows

Lenovo Legion Y540
- i7 processor with 8 cores
- 24 GB RAM
Windows 11 Pro
WSL with Ubuntu 20
Ubuntu 22 VM running on Hyper-V
- Mostly accessed using X-Forwarding to WSL

MacBook

M3 Air with 16 GB + 512 GB
This is a new machine

Tools

tmux
VS Code
- With VIM extension
zsh with Oh My Zsh
Terminal
- Windows Terminal on Windows
- Default terminal app on Mac

Other applications

Notability for reading papers and taking notes on iPad
Obsidian for note taking on laptop
- Vault on iCloud to sync across devices
Slack and Discord for messaging
Microsoft Edge as a browser
- Google Scholar PDF reader extension for reading papers. Provides good navigation support for links.
1Password for password management