About Me

Bay Area, CA, United States
I'm a computer security professional, most interested in cybercrime and computer forensics. I'm also on Twitter @bond_alexander. All opinions are my own unless explicitly stated.

Friday, February 24, 2012

Lessons from a DFIR job search

Those of you who follow my blog know that I’ve been trying to transition into computer security for a few years now. After the US Cyber Challenge I got a QA job at an antivirus company that gave me exposure to application security issues and malware basics, but I’d really been trying for a digital forensics or incident response (DFIR) job. As of last October, I finally got a job doing infosec – including incident response – at a Bay Area-based social network (not Facebook).

While I was trying to transition over, I talked to a lot of experienced DFIR people about how they got into this. Ultimately, the path that worked for me ended up being a bit different than their suggestions, so perhaps these notes about what worked and what didn’t will help other people looking for forensic work.

Transition from Operations

Several people I’d talked to started in an operations type of role and then transitioned over, supplementing their sysadmin skills with additional study to make the jump to DFIR. This path is a logical one because sysadmins know how their servers work, their configuration, how to keep them running and reliable, and networking. They will have been exposed to security issues in OS hardening, firewall configuration, and patch management. Most likely they will have responded to incidents as well, at least a DDoS if not a server breach. Ultimately, I think this is the easiest path and the one that most forensics people will continue to come from.

Law Enforcement

Quite a few forensics people got their beginning in law enforcement, particularly federal. Some local police departments are also big enough to support some cybercrime officers – particularly in the Bay Area – but entering forensics through local police is very unreliable, as all new officers are treated the same. They all go to the Academy and work patrol for at least a few years before they even have a chance to try for an investigative position like cybercrime. Federal law enforcement is a more reliable route, but with the current Federal budget, positions are few and far between. You also have to be hired by age 35, and positions are all over the country. I tried for the FBI, but shortly after the written test they entered a hiring freeze that’s lasted a year so far. It’s still a viable route if you’re not picky about where you live or what agency you work for, but given my age and my wife’s living preferences, it’s not going to happen. At some point in the future perhaps I could get a civilian position at a cybercrime lab, but those are still uncommon.


A few people I talked to recommended going the consulting route, getting hired by a consulting firm and doing forensic work for them for a few years. By all accounts consulting is extremely hard work and requires a lot of travel, but the managers at the firm I talked to didn’t care about a low level of experience. As long as I had a handle on the basics, was comfortable with travelling and working very long hours, they would give me the training and mentoring. This seemed like a very likely route and would be a good one for other interested forensic people to follow, but it turned out the firm I was talking to stopped hiring for the time being and I got the job offer for my current job before I caught another firm’s interest.


As I was talking to the consulting firm, I was also applying for corporate jobs. Despite having a forensics certification, I had a hard time getting corporate interest due to my lack of experience. Some good contacts got me a couple interviews, but no offers. Hiring managers who were looking for DFIR people could generally find someone with more experience than me. That’s when I came across a posting for my current company. It wasn’t true DFIR, it was more spam and fraud with some additional application security issues in responding to XSS attacks. Still, it was close enough to some of the things I’d done in the past to get me an interview. During the interviews, I asked about what elements of security were covered under the security team, including incident response and forensics. Despite my lack of experience in this realm, my training and study of DFIR plus my previous experience with web app security encouraged the manager to create an infosec position to start building out an incident response capability. I got the job.

Hopefully these notes about my DFIR job search will help other new and inexperienced forensic hopefuls find a position that works for them. I’d love to read anyone else’s forensic job search stories, so please share them in the comments.

Thursday, October 20, 2011

Status update

Wow, I can't believe I haven't updated this blog since July. A lot has been going on since then, and I've been too busy to keep up the blog. While I hope to have more time and material in the near future, I'm starting a new role at a new company at the end of the month and I don't yet know their stance on personal blogging. Once I've had a chance to get settled there and get to know their stance on blogging, hopefully I'll be back here posting regularly about what's going on.

In the meantime, I've been learning more about incident response. In particular, Harlan Carvey's written some great articles on his blog, and I highly recommend them. They're aimed more at high-level overviews of IR than at step-by-step instructions, but that's what I need right now: the basics of how to approach it. Besides, the details of IR vary dramatically based on your org's situation and needs, so that's really the only way to do it.

Also, Brian Baskin's DerbyCon talk "How to Get Fired After a Security Incident" is now available online. It's a great presentation about common mistakes made in forensics and incident response.

The short version of both Harlan's and Brian's message: prepare yourself before you discover your breach.

Monday, July 11, 2011

Sabotage, Stuxnet and the future of cyber attacks

Last year, before LulzSec and Sony's epic fail, the big topic in computer security was the uniquely sophisticated and targeted malware known as Stuxnet. I blogged about it back in September. Now, Kim Zetter of Wired Magazine's Threat Level blog just posted a great overview of the effort to reverse engineer Stuxnet. If you haven't read it yet, you should now. Not only does it present a lot of great info on Stuxnet, it also gives some good insight into malware reverse engineering in general. The rest of this post will presume you've read it.

Most of the article is excellent, well researched and well written. However, I take serious issue with one of Ralph Langner's quotes towards the end of the article. Here's the excerpt:
They will likely have no second chance to unleash their weapon now. Langner has called Stuxnet a one-shot weapon. Once it was discovered, the attackers would never be able to use it or a similar ploy again without Iran growing immediately suspicious of malfunctioning equipment.

“The attackers had to bet on the assumption that the victim had no clue about cybersecurity, and that no independent third party would successfully analyze the weapon and make results public early, thereby giving the victim a chance to defuse the weapon in time,” Langner said.
In an ideal world, Langner would be completely correct, but in practical terms he's wrong. I have great respect for Langner, his expertise and his work, but it seems that almost daily I'm reading about people falling for the same attacks over and over again. As just one example, Stuxnet spread from network to network through infected USB drives. This isn't a new attack; back in 2008 the Department of Defense was hit by a major attack spread through USB. That virus was successful, but was a re-use of a virus from 2007. One would hope that the US Government takes information security seriously, but just this year DHS tested how many employees would pick up an infected USB drive and plug it into a secure system. Result: 60%. If there was a company or government logo on the drive, it was up to 90%. Old and well-known attacks work, even on high-value targets that really ought to know better. Similarly, although 0-day exploits are highly valued for malware and hacking attempts, the majority of malware out there is successful using exploits for which patches are available.

However, let's give Iran's Atomic Energy Organization the benefit of the doubt. Let's presume that since Stuxnet, they're keeping updated with every critical security patch for every piece of software they run -- an impressive feat! Even that can't keep them safe, because new exploits are discovered daily. To get a sense of the scale of the problem, take a look at the Exploit Database and remember that those are only the exploits that are discovered by responsible security researchers, not criminals. To further complicate issues, Stuxnet has a compartmental structure. From the article: "[Stuxnet] contained multiple components, all compartmentalized into different locations to make it easy to swap out functions and modify the malware as needed." It seems apparent that the authors of Stuxnet could simply swap in new 0-day attacks and continue as before. In fact, earlier this year a security researcher discovered a serious bug in Siemens' industrial control software and wrote proof-of-concept malware to exploit it. He claims that Siemens didn't take aggressive enough action to patch the exploit.

Frankly, the recent hacking of Lockheed, Sony, Oak Ridge National Labs, Sony, InfraGard, Sony, RSA, Sony, HBGary, Sony, and assorted government contractors proves that any network can be penetrated. The only difference now is that people are more aware that attacks on the sophistication level of Stuxnet are possible. This gives incident responders a better chance to identify and react to malware and breaches. This is what Zetter referred to when she wrote "the attackers would never be able to use it or a similar ploy again without Iran growing immediately suspicious of malfunctioning equipment." The difficulty is, equipment malfunctions. Software has bugs and hardware fails, particularly when you're a country dealing with jury-rigged equipment smuggled in under trade embargoes. For any given failure, a cyber attack is the least likely cause, which is why Iran's centrifuge failure rate could increase dramatically for months before a cause was found.

To further complicate the issue, I find it highly unlikely that Iran has sufficient personnel with the skills necessary for incident response and advanced malware reverse-engineering. Quite frankly, even the US government is having problems recruiting and retaining people with those skills. It's hard to imagine that Iran has an easier time with this problem.

In my opinion, the only limiting factor on cyber attacks against physical infrastructure is the will and resources to do it again. It's only a matter of time before another powerful and skilled group decides they want to execute a similar attack.

Monday, June 6, 2011

Reverse engineering a malicious PDF Part 3

Welcome to my series in progress about reversing a malicious PDF. Last time I worked through the first exploit, geticon(), and its shellcode payload. Next, I'll be looking at printf(), which is triggered if the user is running Adobe Reader 7.1. Here's the code:
Again, the payload is the appropriate shellcode variable I found previously (shcode_printf). It's different code this time, but it follows the same pattern. Build the NOP sled, this time in the variable nop. Append the payload to create heapblock. Build a bigger, 261310 character NOP sled in block. Create an array mem and populate it with 1400 copies of the full NOP sled plus heapblock to create the heap spray. Attack the vulnerable function util.printf, overflow the buffer and Adobe Reader hits the NOP sled and executes the shellcode.

This is an older exploit, CVE-2008-2992 made public in November of 2008. A patch was available at the same time the vulnerability was published.

Now, the fun part. What does the shellcode do? Like last time, let's look at the hex and see if there's any obvious URLs.
No such luck. Back to scLog to execute the shellcode. Executing the shellcode shows it loads shell32 to get the Temp path and loads urlmon to try (but fail) to download a file to Temp. Just like before, it tries to access /forum.php?f=PDF and passes along the exploit used (printf). Again, the file would have been saved as a.exe. Simple enough.

Onward to exploit 3, collab().
First, we have a function fix_it, which takes two variables: yarsp and len. It enlarges the string to twice len, then cuts it down to half len. Once again, the shellcode is taken from shcode_collab and stored as var shellcode. Variables cc and addr are set to hex numbers, and sc_len is set to twice the length of the shellcode (338). These are used to calculate the new variable len (equal to 4093910). All of this is leading up to the good stuff, beginning with var yarsp. This variable is defined with a few NOP codes and is then run through fix_it with len. This extends yarsp to a real NOP sled, 2096955 characters long. The variable count2 is defined and a for loop is used to generate mem_array, which is the heap spray. Next, the var overflow is created and extended to 65536 characters. Finally, overflow is passed to the vulnerable Collab.collectEmailInfo() method to trigger the exploit. This is another old exploit, discovered and patched in 2007 (CVE-2007-5659).

So far, so good. Now, onto the shellcode. I run it through scLog just like the others ... and just like the others it loads shell32 to find the temp directory, then uses URLDownloadToFile to try to access /forum.php?f=PDF (Collab)&key=... and save it to temp as a.exe.

The first three exploits all followed the same basic pattern: create a big NOP sled, attach shellcode, replicate it a few hundred times into a heap spray, overflow the buffer and let it go. All the shellcodes had essentially the same function as well, download a trojan EXE to the temp directory. As a result, I'll leave the last exploit and shellcode as an exercise to the reader. :)

Thursday, June 2, 2011

Reverse engineering a malicious PDF Part 2

In Part 1, I began analyzing a malicious PDF. Within the PDF, there was a fair amount of obfuscated malicious Javascript present, which I parsed through. Through many transformations and text replacement, the Javascript eventually decoded and executed the attack code, saved as the variable etppeifjeka.

The attack code was initially obfuscated with excessive exclamation marks:
but once the exclamation marks were removed, it became neat and tidy code. Unlike the malicious Javascript I analyzed last month, once the exclamation marks were removed this code even had line breaks, making it much more legible. The attack code contains several functions: nplayer, printf, geticon, and collab. The PDF contains code to read which version of Acrobat is running, and based on that chooses the exploit to launch.
Adobe has provided some documentation for the app.viewerVersion method. In this case, it's looking at the version of the EScript plugin (which provides Javascript support). The EScript plugin version number is actually the same as the version number for Acrobat itself. Thus, if Acrobat is version 9 or version 8.12 or higher, it runs geticon. If Acrobat is version 7.1, it runs printf. If Acrobat is version 6 or below version 7.11, it runs collab. The last one is oddly written. They might have been trying to write "between 9.1 and 9.2", but as it's written nplayer will be triggered if the version is greater than 9.1 or less than 9.2 ... which means if it hasn't hit one of the other functions it'll hit this one.
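To see why that last check matches everything, here is a minimal sketch. The function names and sample version numbers are hypothetical illustrations, not code from the PDF; only the "greater than 9.1 or less than 9.2" comparison comes from the description above, and the "and" variant is a guess at what the author may have intended.

```javascript
// Hypothetical reconstruction of ONLY the final version check described above.
// Names and sample values are illustrative and do not come from 4469.pdf.

// As described: "greater than 9.1 OR less than 9.2".
function matchesAsWritten(version) {
  return version > 9.1 || version < 9.2;
}

// A guess at the intent: "between 9.1 AND 9.2".
function matchesProbablyIntended(version) {
  return version > 9.1 && version < 9.2;
}

const sampleVersions = [6.0, 7.1, 8.12, 9.0, 9.15, 10.0];
for (const version of sampleVersions) {
  console.log(version, matchesAsWritten(version), matchesProbablyIntended(version));
}
```

Every ordinary number is either above 9.1 or below 9.2, so the "or" form behaves as a catch-all branch, while the "and" form would only match the narrow 9.1 to 9.2 range.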

Here's the code for the geticon function:
Back in part one, I guessed that the shcode variables were the shellcode payloads for the PDF. This function confirms my guess. First, the function grabs shcode_geticon to collect the appropriate shellcode. Then, it's appended to the end of a short NOP sled and saved as the variable garbage. The next bit of code (lines 46-53) uses a variable nopblock to extend the size of the NOP sled. By the time we get through the while loop, the variable block contains a NOP sled 262045 characters long. Since all of this is being stored in memory as the code executes, this is a heap spray. Then, an array called memory is constructed (line 54-55), containing 180 copies of block plus the shellcode. Lines 56-61 construct var buffer with 4012 copies of %0a%0a%0a%0a which are line feeds in hex. Finally, the array buffer is passed to the vulnerable geticon function. The buffer overflows, the execution hits one of the NOP sleds present and executes the shellcode.

Incidentally, this is a well known exploit. I found the exact same exploit code being shared on a security research message board two years ago. Just because all that the news talks about is the new, sophisticated malware, that doesn't mean the old stuff goes away quickly. For example, Contagio has seen samples of this from last January.

Now we've finally reached the point where the shellcode is executed. In this case, what does it do? Daniel Wesemann of the SANS Internet Storm Center provides a short Perl script to take the shellcode and dump it to hex to see if there's anything obvious, like a URL. Here's the code and the results:
In this case, it wasn't very helpful. There's no obvious URL or even anything that follows a URL pattern. The next article in Daniel Wesemann's series continues to compile and disassemble the shellcode for reversing, but I don't read assembly so it's off to another option for me. Malware Tracker has an online shellcode analysis tool, but it didn't work for the four shellcode samples in this pdf. Cruising online guides to shellcode analysis led me to a tool called libEmu. However, when I ran the code in libEmu, it hit my 10000000 step limit before execution actually completed. It looks like either I did something wrong, or I hit an infinite loop in the shellcode. Odd. The same happened with each of the four shellcodes.
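As a rough illustration of the "dump it to hex and look for a URL" step, here is a minimal Javascript sketch. It is not Daniel Wesemann's Perl script. The function names are hypothetical, the sample string is harmless made-up data rather than shellcode from this PDF, and the little-endian byte order is an assumption about how %u-escaped values are laid out in memory on x86.

```javascript
// Sketch only: turn "%uXXXX" escapes into bytes and look for readable text.
// All names and the sample data are illustrative assumptions.

// Each %uXXXX escape is one 16-bit unit; assume little-endian byte order.
function unicodeEscapesToBytes(escaped) {
  const bytes = [];
  const pattern = /%u([0-9a-fA-F]{4})/g;
  let match;
  while ((match = pattern.exec(escaped)) !== null) {
    const unit = parseInt(match[1], 16);
    bytes.push(unit & 0xff, unit >> 8); // low byte first, then high byte
  }
  return bytes;
}

// Render bytes as space-separated two-digit hex.
function toHex(bytes) {
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");
}

// Collect runs of printable ASCII, similar in spirit to the Unix "strings" tool.
function printableRuns(bytes, minLength = 4) {
  const runs = [];
  let current = "";
  for (const b of bytes) {
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
    } else {
      if (current.length >= minLength) runs.push(current);
      current = "";
    }
  }
  if (current.length >= minLength) runs.push(current);
  return runs;
}

// Harmless made-up sample: two filler bytes followed by a short ASCII string.
const sampleEscaped = "%u9090%u7468%u7074%u2f3a%u652f%u0078";
const sampleBytes = unicodeEscapesToBytes(sampleEscaped);
console.log(toHex(sampleBytes));
console.log(printableRuns(sampleBytes));
```

When a payload is encoded or packed, a scan like this finds nothing readable, which is consistent with what happened here and is why the next step is emulation or live execution.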

Since I'm still new to malicious PDF analysis, I talked to some guys in Threat Research here and they pointed me to a tool called PDF Stream Dumper. Interestingly, I was able to execute the shellcode I copied-and-pasted into it, but it choked on the actual PDF stream that Didier Stevens' tools processed without difficulty in Part 1. This confirms the need to use multiple tools: you never know when one will fail you.

I executed the shellcode within PDF Stream Dumper. There are a couple of different ways to do this. scDbg uses libEmu emulation, and it crashes the analysis just like libEmu does running under Linux. Running the scLog version (live, not emulated analysis), the shellcode executes. scLog notes that the shellcode loads urlmon.dll, which is an Internet Explorer library for fetching files from remote URLs. Then, the shellcode tries to access shell32.dll, but scLog kills the shellcode to prevent that. Looking at the memory dump in a hex editor, I see a call to a url: /forum.php?f=PDF (GetIcon)&key=87c1a082278ace8fdf2f63b86db29d6f&u= and a reference to a.exe.

These file references imply downloading an external file, but implication isn't proof. So, let's use scLog again and let it actually load all DLLs this time and see what happens. First, some cautionary notes: this is malware we're executing and I'm turning off some of scLog's safety functions. As a result, I'm adding some safety back in. I took a snapshot of my analysis VM first so I could revert if needed. Since this shellcode looks like a downloader, I'm running Wireshark to see if anything actually gets downloaded.

It tries to access /forum.php?f=PDF (GetIcon)&key=87c1a082278ace8fdf2f63b86db29d6f&u= and download a file as a.exe to the user's temporary directory. However, since that's a local URL, it fails and the shellcode crashes. Wireshark confirms that nothing was downloaded. Interestingly, /forum.php takes parameters including where the file request is coming from (a PDF) and even which exploit is being used (GetIcon). It also looks like this PDF was intended to be viewed online, not downloaded (or emailed) and viewed.
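The request string can be pulled apart to make those parameters explicit. This is a small sketch using the standard URL class; the placeholder host is an assumption added only because the string recovered from memory has no host of its own, and the variable names are illustrative.

```javascript
// Sketch: split the recovered request path into its query parameters.
// "placeholder.invalid" is a made-up base; the recovered string has no host.
const requestPath =
  "/forum.php?f=PDF (GetIcon)&key=87c1a082278ace8fdf2f63b86db29d6f&u=";

const parsedRequest = new URL(requestPath, "http://placeholder.invalid");
const sourceField = parsedRequest.searchParams.get("f");   // where the request came from
const keyField = parsedRequest.searchParams.get("key");    // opaque identifier
const uField = parsedRequest.searchParams.get("u");        // empty in this sample

// The exploit name is the text in parentheses inside the "f" parameter.
const exploitMatch = /\(([^)]+)\)/.exec(sourceField);
const exploitName = exploitMatch ? exploitMatch[1] : null;

console.log(parsedRequest.pathname, sourceField, exploitName, keyField, JSON.stringify(uField));
```

The need for a placeholder base here mirrors the failure seen in scLog: without a host, there is nowhere for the download request to go.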

In summary, the geticon() function exploits a known vulnerability in Acrobat to hook urlmon.dll to download and execute additional malware to exploit the user's system. The vulnerability is known as CVE-2009-0927 and there is a patch available to prevent it from affecting your system. Moral of the story: keep your system patched and be careful where you browse.

Next, I'll get to the other exploits and shellcode.

Thursday, May 26, 2011

Reverse engineering a malicious PDF Part 1

One of the projects I work on is a malicious Javascript scanner. It also scans PDFs since the malicious part of PDFs is usually encoded Javascript. To test the scanner, we regularly collect malicious PDFs and run them against the scanner to see if they're detected. Of course, in order to determine if a file is really malicious, sometimes you need to go in by hand and see what's going on. To this end, Didier Stevens wrote a chapter on analyzing malicious PDFs. I'll be using that as a reference as I go through a malicious PDF here. I recommend reading it alongside this article. Didier is far better at this than I am, so I won't be trying to explain structural concepts which he explains far better. The PDF I'll be looking at is named 4469.pdf. It was downloaded "in the wild" from a website listed on the Malware Domain List.

Didier provides several python scripts that are useful for analyzing PDFs, the first is pdfid.py. It examines the PDF for indicators of a possibly malicious PDF, such as the presence of Javascript, automatic actions, and document length (most malicious PDFs are only one page). Here's what the results look like for 4469.pdf.
In this case, the PDF is only one page, contains Javascript and contains code that will launch when the PDF is opened (OpenAction). This is potentially suspicious, so let's keep investigating. We know that the Javascript is where the malicious activity will happen, so let's look at that first, using Didier's pdf-parser.py.
pdf-parser.py located Javascript only in indirect object 2 0 of the PDF. However, indirect object 2 0 references indirect object 11 0 as an OpenAction and Javascript. In a moment we'll see why pdf-parser.py didn't identify indirect object 11 0 as containing Javascript. For now, we see that the PDF is invoking Javascript when the file is opened, which we expect from a malicious PDF, and we expect that indirect object 11 contains our payload.

Using pdf-parser.py again, I can parse out indirect object 11, which is a stream object compressed with the Flate method. Interestingly, this is exactly the same situation as in Didier's example script, so it seems this is a common way to obfuscate malicious code in PDFs. Since it's common, Didier provides a method to uncompress the script: pdf-parser.py --object 11 --filter --raw 4469.pdf ... and voila, we have malicious code:
It keeps going like that for another couple pages.

Just like in the malicious Javascript I took a look at last month, the functions and variables all have random names: function hddd(fff), var fpziycpii, etc. There's also plenty of junk characters and excessive transformations to make analysis more annoying. Here's one example towards the end of the script:
for(yrauyiyqouoi=0;yrauyiyqouoi<gmgdouaeyd;yrauyiyqouoi++){var dsfsg = yrauyiyqouoi+1;xrywreom+='var oynaoyoyaia'+dsfsg+' = oynaoyoyaia'+yrauyiyqouoi+';';this[fuquoudieeel](xrywreom);}
There's even a section where every letter is interspersed with a bunch of exclamation marks. It's all messy, but nothing we can't eventually analyze. Let's start at the top.

The first code introduced is function hddd which takes the parameter fff. It takes the parameter and replaces the ** with %u. There are four separate strings processed by this function, which means each string is actually a unicode-encoded string. These are stored as variables: shcode_geticon, shcode_newplayer, shcode_printf, and shcode_collab. Based on the names, these strings are likely the shellcode payloads, but we'll see when we get there.
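A minimal sketch of that substitution is below. The function names and the sample string are hypothetical; only the "replace ** with %u" behavior comes from the analysis above. The follow-up decoding step is an assumption about how such a string is typically consumed later, not something confirmed at this point in the script.

```javascript
// Sketch of what a function like hddd appears to do: restore "%u" escapes
// that were masked as "**". Names and sample data are illustrative.
function restoreUnicodeEscapes(masked) {
  return masked.split("**").join("%u");
}

// Assumed follow-up step: decode each %uXXXX escape into one character.
function decodeUnicodeEscapes(escaped) {
  return escaped.replace(/%u([0-9a-fA-F]{4})/g, (whole, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Harmless sample: the two characters "H" and "i", masked the same way.
const maskedSample = "**0048**0069";
const restoredSample = restoreUnicodeEscapes(maskedSample);
console.log(restoredSample);
console.log(decodeUnicodeEscapes(restoredSample));
```

Masking the %u marker like this is a cheap way to hide the telltale escape pattern from simple signature scanners while keeping the payload trivially recoverable at run time.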

Next we have:
var fpziycpii = 'e';var uinsenagexo = 'l';var fuquoudieeel = fpziycpii+'va'+uinsenagexo;var ioafyyad = this[fuquoudieeel];rtaoyuupaue = "ioa!fyya!d!('!t!hi!s![!fuqu!ou!d!ie!ee!l](o!y!na!o!yoy!aia!'+!gmg!do!u!a!e!yd!+!')!;!'!)!;".replace(/[!]/g, '');
Stepping through it, the variable fuquoudieeel takes the first two variables and combines them to get "eval", so ioafyyad is this[eval]. Next, rtaoyuupaue is a string that has the replace function executed on it. In this case, the replace function just removes all the extra exclamation points that are in there, yielding:
ioafyyad('this[fuquoudieeel](oynaoyoyaia'+gmgdouaeyd+');');
If we substitute in the known variables, we get:
this[eval]('this[eval](oynaoyoyaia'+gmgdouaeyd+');');

That's an improvement, but there's still work to do. The variable gmgdouaeyd is later defined as 1100, so we get oynaoyoyaia1100, a variable which isn't defined yet. There's a section towards the end with oynaoyoyaia0 = eiuaopyj; but obviously that's not the same variable. It may be a typo, or it may be junk code ... we'll see. For now, let's move on.
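The string handling in this step can be checked safely without running the malicious script, because it never needs to call eval. The sketch below reuses the variable names and the obfuscated string from the snippet above; the obfuscatedCall name is added for illustration, and the resulting strings are only printed, never executed.

```javascript
// Safe check of the string manipulation: nothing here is executed as code.
const fpziycpii = 'e';
const uinsenagexo = 'l';
const fuquoudieeel = fpziycpii + 'va' + uinsenagexo; // assembles the name "eval"

// The obfuscated string from the script, copied as data (illustrative name).
const obfuscatedCall =
  "ioa!fyya!d!('!t!hi!s![!fuqu!ou!d!ie!ee!l](o!y!na!o!yoy!aia!'+!gmg!do!u!a!e!yd!+!')!;!'!)!;";

// Same replace as the original: strip every exclamation mark.
const rtaoyuupaue = obfuscatedCall.replace(/[!]/g, '');

console.log(fuquoudieeel);
console.log(rtaoyuupaue);
```

Treating the obfuscated code as plain data like this is a handy habit: every transformation can be replayed and inspected while the final eval call is left out.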

Next we have another function:

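function iuoyzemuyyi(ieuohhrk) {
  var iuioathlpau = '!';
  var unetoptou = '';
  // Loop header missing from the listing; reconstructed from the use of xqqauiae as a character index.
  for (var xqqauiae = 0; xqqauiae < ieuohhrk.length; xqqauiae++) {
    var yaomwteez = ieuohhrk.charAt(xqqauiae);
    if(yaomwteez == iuioathlpau) {  } else { unetoptou+=yaomwteez; }
  }
  return unetoptou;
}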
This function is a longer, more complicated way of removing the exclamation marks from a string. Like the last one, this is applied to another code section stored as a string and obfuscated with five exclamation marks. The string is un-obfuscated and stored as the variable etppeifjeka. That's a long section and it looks like that's part of the payload, so we'll get to that in part 2. For now, let's skip past it and see how it's used.

The last section is this:
eiuaopyj = ''+etppeifjeka+'';
var gmgdouaeyd = 1100;
var xrywreom = '';
oynaoyoyaia0 = eiuaopyj;
for(yrauyiyqouoi=0;yrauyiyqouoi<gmgdouaeyd;yrauyiyqouoi++){
var dsfsg = yrauyiyqouoi+1;
xrywreom+='var oynaoyoyaia'+dsfsg+' = oynaoyoyaia'+yrauyiyqouoi+';';
this[fuquoudieeel](xrywreom);
}
This section is odd, to say the least. The for loop constructs a string which is stored in the variable xrywreom. The loop counts from 0 to 1100 and builds a section of code that declares a series of variables oynaoyoyaiaX where X is the current number, and the variable is set to equal the previous number. The output looks like this:
var oynaoyoyaia1 = oynaoyoyaia0;var oynaoyoyaia2 = oynaoyoyaia1;var oynaoyoyaia3 = oynaoyoyaia2;var oynaoyoyaia4 = oynaoyoyaia3;var oynaoyoyaia5 = oynaoyoyaia4;
It goes up to var oynaoyoyaia1100 = oynaoyoyaia1099; Each step of the loop, the loop runs this[fuquoudieeel](xrywreom); which executes the code stored in the variable. This creates 1100 variables and sets them all equal to eiuaopyj (the variable holding the obfuscated section we haven't examined yet). Let's go back to the earlier section where we saw a reference to oynaoyoyaia. We had deobfuscated it to this point:
which evaluates back to etppeifjeka, the probable payload.
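The alias chain can be reproduced and followed without evaluating anything. In this sketch the function names are hypothetical, and the chain is resolved by parsing the generated text rather than by running it, which is a substitute for the script's own eval call.

```javascript
// Sketch: rebuild the generated declarations as text and follow the chain.
// buildAliasChain / resolveAlias are illustrative names, not from the PDF.
function buildAliasChain(count) {
  let source = "";
  for (let i = 0; i < count; i++) {
    const next = i + 1;
    // Same string the loop appends on each pass.
    source += "var oynaoyoyaia" + next + " = oynaoyoyaia" + i + ";";
  }
  return source;
}

// Follow "var A = B;" links until reaching a name that is not an alias.
function resolveAlias(source, name) {
  const aliases = new Map();
  const pattern = /var (\w+) = (\w+);/g;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    aliases.set(match[1], match[2]);
  }
  let current = name;
  while (aliases.has(current)) {
    current = aliases.get(current);
  }
  return current;
}

const firstFive = buildAliasChain(5);
console.log(firstFive);

const fullChain = buildAliasChain(1100);
console.log(resolveAlias(fullChain, "oynaoyoyaia1100")); // oynaoyoyaia0
```

Since oynaoyoyaia0 is assigned eiuaopyj, which in turn holds etppeifjeka, the 1100-step chain is pure indirection: it adds nothing functionally and exists only to slow down analysis.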

After the for loop is the last line of Javascript in this PDF: ioafyyad(rtaoyuupaue); As we've already discovered, ioafyyad is this[eval] and rtaoyuupaue is this[eval]('(oynaoyoyaia1100);'); so that is the line of code that actually triggers the exploit.

All that's left to do is deobfuscate the exploit itself and see what it does.

Thursday, May 5, 2011

Firefox 4 Browser Forensics, Part 5

We're nearing the end of my series on Firefox 4 forensics (click here for the full list). Media coverage has finally started to make people aware of how much their online behavior is tracked, and the addition of "Private Browsing" modes in all major browsers is making browser anti-forensics easier than ever. This means we'll probably encounter it in our investigations.

First, I'll cover actions that prevent the creation of artifacts: turning "Remember History" off and using "Private Browsing" mode. Then I'll cover some methods of destroying artifacts that have been created. I won't be covering third-party products.

Preventative antiforensics

To test "Private Browsing" mode, I activated private browsing, searched for Nmap and downloaded the latest version and then closed Firefox. First, I wanted to see if the page was listed in the browser history, so I opened places.sqlite and queried: "select * from moz_places where url like '%nmap%';" No result.  Same with searching for 'input' in typed_urls, no cookies from the domain and nothing in the download history either. However, the google search for "nmap", many nmap images, the websites http://nmap.org, and http://nmap.org/download.html all appear in the browser cache with the appropriate timestamps and fetch count. This, plus having the creation time of the downloaded file, tells us exactly what the user did and when.

Turning off browsing history is easy: it's front-and-center on the "Privacy" tab in options. The default is "Remember history", but there are custom history settings as well as just "off". To test the artifacts, I turned off browsing history, googled "metasploit", and downloaded the latest version. As expected, nothing appears in moz_places (browsing history). Nothing shows up in the cache or the download history either, so oddly enough turning off browsing history protects privacy better than "Private Browsing" mode. That means the only possibilities for detection are outside of Firefox, such as using the operating system to track who was logged in when the downloaded file was created and who executed it.

Note, this is accurate at the time of writing, for the current version of Firefox (4.0.1). Once this is made known, it's entirely possible that any of these behaviors will change. You should always run your own testing to confirm behavior before trusting it in a case.

Evidence destruction

But what if the target of our investigation didn't know in advance that he needed to cover his tracks? Firefox has several options to remove recorded data, from the selective to the blunt.

The most selective way to remove data is through the history pane. If you open the history pane and right-click on a history item, you can select "forget this site". Let's imagine this is a "violation of policy" case: browsing porn at work. I browsed to www.pornhub.com and started a video streaming to get a good cache. Opening up the history, it looks like Pornhub connected to several other porn sites, so if our suspect didn't make sure to forget all of the relevant sites there would still be evidence of their illicit browsing. In this case, however, I'm going to make sure and forget about all of them. After "forgetting" all the sites, there are no traces left in places.sqlite. There's evidence that sites were forgotten because of the gap in id numbers, but no indication of what was formerly there. Interestingly, using "forget this site" completely destroys the cache, but only removes the selected site(s) from the browsing history. This is a clear sign of evidence destruction, and the deleted cache files could likely be recovered from unallocated space or from backups (such as Volume Shadow Copy).

If any of the databases are deleted, Firefox will automatically create a new empty copy of it the next time it's run. Normally, the databases will have a modified date of the last browsing event, but a creation date of when Firefox was originally installed. The creation date is not even modified when Firefox is upgraded or the history is "forgotten" through the browser options. Therefore, if the creation date of the tables is more recent than the creation date of core Firefox files (such as firefox.exe), it's a clear indication that the table was deleted around the creation date of the existing table. It may be recoverable through standard means.

Directly modifying the databases would be somewhat more difficult to detect. The databases are modified constantly through regular browsing, so the timestamps wouldn't be a clue. However, like "forgetting this site", there will be a gap in the normally sequential ID numbers that could indicate that something was deleted, and examining the last_visit_date of the sites surrounding the gap might allow you to determine when the missing sites were visited. If backups of the databases exist, they might have the missing data. Also, the cache isn't nearly as user-friendly to edit as a sqlite database so if the cache isn't cleared it could provide a clue for what was lost. Even if the cache had been cleared, the deleted files might be recoverable through standard methods.
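The gap check described above can be sketched as follows. This assumes the rows were already exported from moz_places (for example with a query selecting id and last_visit_date) into plain objects, and that last_visit_date is in microseconds since the Unix epoch. The function names and sample rows are hypothetical.

```javascript
// Sketch: find gaps in moz_places ids and bracket them with neighbouring visits.
// Sample rows are made up; last_visit_date is assumed to be in microseconds.
function microsecondsToDate(microseconds) {
  return microseconds == null ? null : new Date(microseconds / 1000);
}

function findIdGaps(rows) {
  const sorted = [...rows].sort((a, b) => a.id - b.id);
  const gaps = [];
  for (let i = 1; i < sorted.length; i++) {
    const before = sorted[i - 1];
    const after = sorted[i];
    if (after.id - before.id > 1) {
      gaps.push({
        missingFrom: before.id + 1,
        missingTo: after.id - 1,
        visitedBefore: microsecondsToDate(before.last_visit_date),
        visitedAfter: microsecondsToDate(after.last_visit_date),
      });
    }
  }
  return gaps;
}

const exportedRows = [
  { id: 1, last_visit_date: 1304600000000000 },
  { id: 2, last_visit_date: 1304600060000000 },
  { id: 5, last_visit_date: 1304603600000000 },
  { id: 6, last_visit_date: 1304603660000000 },
];

for (const gap of findIdGaps(exportedRows)) {
  console.log(
    "ids " + gap.missingFrom + "-" + gap.missingTo + " missing between",
    gap.visitedBefore.toISOString(),
    "and",
    gap.visitedAfter.toISOString()
  );
}
```

The bracket is only a hint: last_visit_date changes whenever a surrounding site is revisited, so the neighbouring timestamps may be later than the time the missing entries were first created. As with everything else here, confirm the behavior with your own testing before relying on it in a case.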

This isn't meant to be a complete overview of all possible methods of antiforensics with Firefox, just a quick highlight of some possibly relevant issues and how to detect and overcome them. This is the end of my Firefox 4 forensics series, and I hope it'll be a useful reference for your investigations. If any of this information turns out to be incorrect or changes in future versions, please let me know and I'll edit the appropriate post.