Defying Analysis With Sparse Malware

If you're writing tools for red teaming or pentesting, the main point of your backdoors, or implants as people are starting to call them, is to enable remote control of a system without being detected. If that fails and your backdoor is found, the next best option is to hinder analysis so that your other backdoors aren't found, your infrastructure isn't found as quickly, or maybe just to screw with the malware analysts and waste their time.

The general theory behind anti-analysis is to identify operations that happen during analysis but don't happen, or can be avoided, during normal execution, then figure out how to break them or interfere with them as much as possible. For example, one popular family of obfuscation techniques rewrites each CPU instruction into complex substitutions and emulations of the original, so analysts have a lot more chaff to sort through while manually examining the file, but execution isn't noticeably slower.

But anti-analysis doesn't need to be so complex; one of the easiest anti-analysis techniques is just to make an executable bigger. When Windows loads an executable into memory, it reads it in and maps it to memory section by section as defined in the executable header, and with some exceptions, generally ignores any bytes outside of those sections. Analysis tools, on the other hand, often do linear sweeps through every byte of the executable, in addition to analyzing by section, to store a copy of the full executable, calculate hashes, extract all strings, and identify executable oddities. As an example, one of the most popular tools for malware researchers and antivirus firms is VirusTotal, a service now owned by Google that analyzes uploaded files by testing them against dozens of antivirus vendors' software to help you determine if a file is malicious, and enable the AV companies to obtain and manually analyze submitted files. VT has an upload size limit of 128MB. So if you append 256MB of random bytes to the end of your executable, it can't be uploaded to VT, but will still run with no problem. The additional bytes won't even be mapped into memory.
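To make the difference between sectioned data and appended bytes concrete, here is a minimal sketch of how a tool might find where the last section's raw data ends in a PE file; everything past that offset is the appended data the loader never maps. The function name end_of_sections and the rd16/rd32 helpers are made up for illustration, and the sketch ignores the exceptions mentioned above (for example, data such as signatures that legitimately lives outside any section).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read little-endian integers from a byte buffer (caller guarantees bounds).
static uint16_t rd16(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint16_t>(b[o] | (b[o + 1] << 8));
}
static uint32_t rd32(const std::vector<uint8_t>& b, size_t o) {
    return static_cast<uint32_t>(b[o]) |
           (static_cast<uint32_t>(b[o + 1]) << 8) |
           (static_cast<uint32_t>(b[o + 2]) << 16) |
           (static_cast<uint32_t>(b[o + 3]) << 24);
}

// Hypothetical helper: returns the file offset at which the last section's
// raw data ends, or 0 if the buffer does not look like a PE file.
uint32_t end_of_sections(const std::vector<uint8_t>& f) {
    if (f.size() < 0x40 || f[0] != 'M' || f[1] != 'Z') return 0;
    size_t pe = rd32(f, 0x3C);                        // e_lfanew: offset of the PE header
    if (pe + 24 > f.size() || rd32(f, pe) != 0x00004550) return 0;  // "PE\0\0"
    uint16_t nsec  = rd16(f, pe + 6);                 // NumberOfSections
    uint16_t optsz = rd16(f, pe + 20);                // SizeOfOptionalHeader
    size_t table = pe + 24 + optsz;                   // first section header
    uint32_t end = 0;
    for (uint16_t i = 0; i < nsec; ++i) {
        size_t s = table + 40u * i;                   // each section header is 40 bytes
        if (s + 40 > f.size()) return 0;
        uint32_t raw_size = rd32(f, s + 16);          // SizeOfRawData
        uint32_t raw_ptr  = rd32(f, s + 20);          // PointerToRawData
        if (raw_ptr + raw_size > end) end = raw_ptr + raw_size;
    }
    return end;
}
```

Subtracting the returned offset from the file size gives the number of trailing bytes. A tool that computes this can flag an oversized file as mostly padding without sweeping through every byte of it.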

But all those additional bytes do take up space (and result in significant hard disk activity when they're first being written) so there's only so much you can do without getting noticed when the system owner wonders what's taking up so much hard disk space. And this is where sparse files get interesting. In NTFS, the master file table (MFT) keeps track of where each file is stored on disk in a series of data runs specifying the location of each fragment of the file.

©Michael Wilkinson, This document may be freely distributed provided this notice remains intact. The original is located at

Sparse files are files that have one or more data runs that are not actually present on disk; the MFT just keeps track of the sections of the file that aren't present, and when a program reads those bytes, NTFS just returns zeros. I assume this is useful for some kind of database software, but it's also convenient for this cheap anti-analysis trick. By converting an executable file to a sparse file and then extending it to, say, 1GB, you can make an executable that takes up no more space on disk yet reports a ridiculously large size. Any incident responder or malware analyst who tries to copy this file off the system, upload it to somewhere like VirusTotal, or even open it in any of a number of analysis programs is going to fail or be seriously slowed down: most network protocols and analysis programs have no special handling for sparse files, so they'll churn through the entire GB.
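The gap between the size a file reports and the space it occupies can be illustrated with a toy model. This is only a sketch: the DataRun struct and both functions are invented for illustration, and real NTFS data runs are stored in a compact variable-length encoding rather than as a struct like this.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for an NTFS data run (not the on-disk format):
// a length in clusters plus a flag saying whether real clusters back it.
struct DataRun {
    uint64_t clusters;
    bool sparse;  // true: nothing allocated on disk, reads return zeros
};

// Logical size: what a directory listing or a naive byte-by-byte reader sees.
uint64_t logical_bytes(const std::vector<DataRun>& runs, uint64_t cluster_size) {
    uint64_t total = 0;
    for (const DataRun& r : runs) total += r.clusters * cluster_size;
    return total;
}

// Allocated size: what actually occupies space on disk.
uint64_t allocated_bytes(const std::vector<DataRun>& runs, uint64_t cluster_size) {
    uint64_t total = 0;
    for (const DataRun& r : runs) {
        if (!r.sparse) total += r.clusters * cluster_size;
    }
    return total;
}
```

With 4096-byte clusters, a 16-cluster real run followed by a 262128-cluster sparse run models a 64KB executable extended to 1GB: the logical size is 1073741824 bytes while only 65536 bytes are allocated. Anything that reads the file linearly pays for the logical size, not the allocated one.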

The code necessary to make one of these files is actually pretty simple. The C++ program below will convert a file into a sparse file and extend (or contract) it to the provided size.

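#include <windows.h>
#include <iostream>
using namespace std;
int main(int argc, char** argv) {
	if (argc < 3) {
		cerr << "Usage: " << argv[0] << " file size" << endl;
		return 1;
	}
	HANDLE h = CreateFileA(argv[1], GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
	if (h == INVALID_HANDLE_VALUE) { cerr << "Cannot open file" << endl; return 1; }
	DWORD outlen;
	DeviceIoControl(h, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &outlen, NULL); // mark the file as sparse
	LARGE_INTEGER newlen;
	newlen.QuadPart = atoll(argv[2]);
	SetFilePointer(h, newlen.LowPart, &newlen.HighPart, FILE_BEGIN); // seek to the requested size
	SetEndOfFile(h); // extend (or truncate) the file to that offset
	CloseHandle(h);
}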

Run it and...

An extended file

The only limit I've seen is that if you extend an executable beyond the 32-bit integer limit you'll get an error trying to run it, so you can't make 100GB files; just stick to a few GB and you'll be fine.
