Create Your Own Anti-Virus Signatures with ClamAV

Posted on 22/11/2008 by Adam

I use ClamAV on my own mail servers, and I’ve also used it at work alongside several commercial AV engines. Every now and again a viral attachment turns up that none of the AV engines catch, especially when a new threat has just been released. As a Linux user, most virus and malware threats mean little to me, but if you are responsible for Windows users then you need to be on top of the game.

Viral email attachments aren’t the major attack vector for Windows PCs that they were a few years ago. Even so, a few times recently I’ve needed to block viral emails that the major AV engines either weren’t catching or were sufficiently behind the curve on, so I’ve had to create my own signatures to block the attachments while I waited for the AV vendors to catch up.

Enter ClamAV. ClamAV is an anti-virus toolkit for Unix and Windows. Aside from being an on-demand virus scanner, ClamAV comes with a suite of tools for creating your own anti-virus signatures which can then be used as part of the regular AV definitions when running a scan.

The first thing you need is something which you want to detect. It might be a virus, some other piece of malware or maybe just a nuisance application installer. It helps if you’re not running Windows: you won’t infect yourself with whatever you are trying to detect, and the following commands will be easy to run. If you have an email with your attachment or file in it, you need to save the attachment to your PC. If it’s still on the mail server, either download the mail and save the file or, if you have shell access to the server, copy the entire mail file itself to your PC, which is easy if you’re using maildirs. If you use mboxes you need to take a copy of the mail somehow so it’s in a file of its own (look at csplit for example).
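For the mbox case, a minimal sketch of the csplit approach might look like the following. It assumes GNU csplit (for the -z option and the '{*}' repeat count), and split_mbox and the example paths are made-up names for illustration. It splits on lines beginning with "From ", the usual mbox message separator, so treat the result with care if your mailbox doesn’t escape such lines inside message bodies.

```shell
#!/bin/sh
# Sketch: split an mbox file into one file per message using GNU csplit.
# split_mbox is a hypothetical helper name; the paths are placeholders.
split_mbox() {
    mbox=$1      # path to the mbox file
    outdir=$2    # directory to write the msg-NN files into
    mkdir -p "$outdir" || return 1
    # -s suppresses the byte counts, -z drops the empty piece before the
    # first "From " line, -f sets the output file prefix.
    csplit -s -z -f "$outdir/msg-" "$mbox" '/^From /' '{*}'
}

# Example: split_mbox /var/mail/adam /tmp/split
```

Each resulting msg-NN file can then be treated as a single mail file in the steps that follow.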

If you have a file containing the email rather than having saved the attachment from within your mail client, you need to split the text and attachment parts out from each other. The following script does this for you. You need Perl and the MIME::Parser module from CPAN (sudo cpan MIME::Parser for Ubuntu users).

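#!/usr/bin/perl
# Split a saved email into its text and attachment parts.
use strict;
use warnings;
use MIME::Parser;

my $file = $ARGV[0] or die "Usage: $0 <mail file>\n";
my $parser = MIME::Parser->new;

# Write the decoded parts under a per-process directory in /tmp.
mkdir("/tmp/$$") or die "Cannot create /tmp/$$: $!\n";
$parser->output_under("/tmp/$$/");
$parser->output_prefix("msg");

my $entity = $parser->parse_open($file);

# Print the message structure, including the path of each saved part.
$entity->dump_skeleton;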

Save it as strip-attach.pl or something and make it executable. Then run it with an argument of the file to strip such as:

strip-attach.pl <mail file>

The output will give you the paths to the text portion and the attachment portion of the email. If you saved the email attachment to your PC from your mail client, you can start to pay attention now.

What you now have is the file you want to block. If it’s zipped, compressed or in any other kind of container, then extract it first and build your signature from the extracted file. ClamAV can see inside these archives if you configured it to do so and you have the right tools installed (like unzip under Linux for example).

Next create a signature of the file using ClamAV’s sigtool:

cat testfile | sigtool --hex-dump | head -c 2048 > customsig.ndb

In this case, testfile is your undesirable file and we have kept only the first 2KB of the hex dump, which at two hex characters per byte covers the first 1KB of the file. Otherwise the signature would be huge and scanning would be inefficient. We have saved the generated signature in customsig.ndb. In theory, you need to take a signature of a unique portion of the file. You can also take a signature from an offset within the file; it doesn’t have to be from the start of the file. See the ClamAV signature docs for more detail on how to create signatures.
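To take the hex string from an offset rather than from the start of the file, something like the following sketch could work. The helper name hex_at_offset is made up for illustration, and it uses od and tr as a stand-in for sigtool’s hex dump, on the assumption that both produce a plain lowercase hex string with no separators.

```shell
#!/bin/sh
# Sketch: print the hex of COUNT bytes starting OFFSET bytes into FILE.
# hex_at_offset is a hypothetical helper, not part of ClamAV.
hex_at_offset() {
    file=$1; offset=$2; count=$3
    # tail -c +N starts output at byte N (1-based), so add 1 to the offset.
    tail -c "+$((offset + 1))" "$file" | head -c "$count" |
        od -An -v -tx1 | tr -d ' \n'
}

# Example: 64 bytes starting 512 bytes into testfile
# hex_at_offset testfile 512 64
```

Note that the count here is in bytes of the original file, so the hex string that comes out is twice that many characters long.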

You should edit customsig.ndb and prefix the content with the appropriate Name, Type and Offset in the following format:

Name:Type:Offset:malware hex output

Such as:

Trojan.Win32.Emold.A:1:*:4d5a80000100000004001000ffff000040010000000000004000000000000000000000000000000000000000

Name is the virus name. Type is one of the following:

0 = any file
1 = Portable Executable (i.e. Windows exe)
2 = OLE2 component (e.g. a VBA script)
3 = HTML (normalised)
4 = Mail file
5 = Graphics
6 = ELF
7 = ASCII text file (normalised)

Offset is either * or an offset in bytes from the beginning of the file to where the hex string occurs. This is best left as * unless you know where in the file your hex string occurs. Read the ClamAV documentation if this is the case.

For most purposes, a type of 0 (or 1 for a Windows exe), and an offset of * will suffice.
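Putting the format together, a small sketch like this could assemble the signature line for you instead of hand-editing the file. make_ndb_line is a hypothetical helper, not a ClamAV tool; it simply reads a hex string on standard input, strips any whitespace and prefixes the three fields described above.

```shell
#!/bin/sh
# Sketch: build a Name:Type:Offset:hex line from a hex string on stdin.
# make_ndb_line is a hypothetical helper name.
make_ndb_line() {
    name=$1; type=$2; offset=$3
    # Read the hex string from stdin and strip whitespace and newlines.
    hex=$(tr -d ' \t\r\n')
    printf '%s:%s:%s:%s\n' "$name" "$type" "$offset" "$hex"
}

# Example, feeding it from the sigtool pipeline (quote the * so the
# shell does not expand it):
# sigtool --hex-dump < testfile | head -c 2048 |
#     make_ndb_line Trojan.Win32.Emold.A 1 '*' > customsig.ndb
```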

Either name the virus yourself if it’s just a file you don’t want on your network or it’s a new virus, or take a look at what other AV engines call a virus by submitting your suspicious file to somewhere like http://www.virustotal.com/. ClamAV has its own virus naming conventions as detailed in the docs.

My good friend and malware expert Barbie of Message Labs and Birmingham Perl Mongers gave a talk at LugRadio Live UK 2008 where he explained that the people who are first to identify a new virus are the people who name it, though different AV vendors often use different names and the name which is popularised in the press is the one that sticks. If you detect a virus before anybody else, then name it as you like and then find a way of making sure everybody uses your chosen name. Fun and profit awaits you 🙂

Now, test the signature against your suspect file:

clamscan -d customsig.ndb testfile
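If you want to script this check, clamscan’s man page documents an exit status of 0 when no virus is found, 1 when a virus is found and 2 when an error occurred. A minimal sketch that turns the status into a verdict might look like this; describe_scan_status is a made-up helper name.

```shell
#!/bin/sh
# Sketch: turn clamscan's documented exit status into a readable verdict.
# describe_scan_status is a hypothetical helper name.
describe_scan_status() {
    case $1 in
        0) echo "no match: the signature did not fire" ;;
        1) echo "match: the signature detected the file" ;;
        *) echo "error: clamscan could not complete the scan" ;;
    esac
}

# Example:
# clamscan -d customsig.ndb testfile; describe_scan_status $?
```

For a new signature you are hoping for the "match" case against your suspect file, and the "no match" case against a few known-clean files.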

It’s pretty inefficient to store one virus signature per file, so if you’re going to be doing this frequently or you want your signature to be used as part of regular operations, you may as well start keeping your own virus db file as part of ClamAV itself. Simply copy your customsig.ndb to the directory used by ClamAV’s own signatures. On most Linux boxes that’s /var/lib/clamav/, though it might be something like /usr/local/share/clamav/ on FreeBSD or if you compiled ClamAV yourself. Then restart ClamAV and run a regular scan without having to specify your custom sig:

clamscan testfile

And that’s it. Add each new signature line into the customsig.ndb file you put in ClamAV’s signatures directory but be sure to test it first from a standalone sig file so you know it works as expected without affecting the operation of the main ClamAV installation.
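As a rough guard against pasting a malformed line into the live database, a sketch like the following could check the basic shape of a line before appending it. add_sig is a hypothetical helper; the pattern only covers the types 0 to 7 listed above and plain hex bodies, so signatures using ClamAV’s wildcard syntax would need a looser check, and it is no substitute for testing with clamscan first.

```shell
#!/bin/sh
# Sketch: append a signature line to a database file only if it looks sane.
# add_sig is a hypothetical helper; it checks format, not detection.
add_sig() {
    line=$1; db=$2
    # Expect Name:Type:Offset:hex, with Type 0-7, Offset * or a number,
    # and an even number of hex digits.
    if printf '%s\n' "$line" |
        grep -Eq '^[^:]+:[0-7]:(\*|[0-9]+):([0-9A-Fa-f]{2})+$'; then
        printf '%s\n' "$line" >> "$db"
    else
        echo "rejected: $line" >&2
        return 1
    fi
}

# Example: add_sig "$(cat customsig.ndb)" /var/lib/clamav/customsig.ndb
```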

Having created sigs for files which the commercial AV engines weren’t catching, I submitted the suspicious file I was working on to the ClamAV team for detection within ClamAV. Now I guess you have to be a bit closer to the project and certainly more experienced than the novice I am to generate sigs and have them included in ClamAV, but there’s nothing stopping you submitting the suspicious files to the project by uploading them at http://www.clamav.net/sendvirus/.

I did exactly that and was quite pleased to get an email a few weeks later which said a signature for the file I submitted had been included in a ClamAV update, although the same file had been submitted by several other people.

Most people suggest advocacy or documentation as ways non-programmers can help a project. It just goes to show that there are many more ways to help a Free Software project than you might think if you’re not a programmer.

So, why would you want to use ClamAV? If you run mail servers then you should be using it already, regardless of whether you run a proprietary AV engine. ClamAV is free and plugs easily into most Unix style mail servers, either directly or through something like Amavis. ClamAV is pretty good at catching phishing emails too, which is something I’ve not seen much of from the major AV vendors. Details on dealing with phishing sigs are here.

A few years ago I worked at a college where Windows permissions were sufficiently lax that the students were able to install MSN Messenger (now known as Windows Live Messenger) on PCs which were supposed to be for educational purposes only. Certain applications they needed to run required write access to parts of the registry, so the machines couldn’t be locked down any further without serious effort. We had a terrible time trying to keep up with removing it and stopping them downloading it. Had we known at the time (and ignoring the concept of actually locking the machines down properly), we could have run ClamAV on a filtering proxy, created a signature which detected MSN Messenger or other unwanted installers, blocked them at the gate and run a scan across the user directories for saved copies brought in on memory sticks. While that’s fighting fires instead of solving the bigger problem, a simple fix for the major threats would buy you enough breathing space to solve the real problems.

Note that ClamAV is not an in-memory, on-access, real-time background virus scanner; it won’t detect viruses in files as you open or execute them. You need to scan files manually to detect viruses. It’s not intended as a replacement for a desktop AV; it’s intended for gateway services like web and mail filtering, or for scheduled scanning.

Do I need to tell you any more? Go geddit tiger.

13 thoughts on “Create Your Own Anti-Virus Signatures with ClamAV”

shahid on 29/05/2009 at 16:27 said:

How can I accomplish such a task in C#? Please help.
Max on 07/08/2009 at 15:39 said:

Hey, Adam, very good article, it helped me a lot, keep up doing it. Thanks.
Dan on 19/08/2009 at 08:11 said:

Hey, that’s a very useful article. Is it possible to accomplish such a task in Java?
Anti Virus on 04/09/2009 at 08:00 said:

[…]Putting ClamAV to work is easy. Your first step should be deciding where you want to install ClamAV: workstation or server. Once you have installed ClamAV in either location, download the latest virus signatures from a ClamAV mirror using the command freshclam. You can set advanced parameters for this lookup, such as proxies, log generation and specific mirrors, in the freshclam.conf file located under ClamAV’s installation. One of the more useful features is the capability to configure time-based lookups, which allow ClamAV to fetch updates automatically at regular intervals without manual intervention.

The basic ClamAV process for inspecting a file and determining whether it is a known virus is the command-line clamscan utility. Upon execution it will by default inspect every file present in the working directory against the local ClamAV database. You can pass flags to this command to invoke recursive inspection, removal of infected files, and even alarms.

Although clamscan & freshclam fulfill the virus inspection and database update process respectively, their invocation may be considered somewhat clumsy by some, since they are both command-line functions and require extracting of suspicious files into specific directories for manual inspection. For those wanting a more comprehensive interface, there are also graphical front-ends like ClamWin, which is designed for Windows operating systems, ClamXav for Mac OS X, and an OS-agnostic GUI utility written in Java named ClamShell.[…]
Puneet Dikshit on 28/07/2010 at 11:25 said:

Hi Adam, nice info, but it is not sufficient for more than 90 percent of people because they use Windows, not Linux. Kindly explain in more detail or in easier language so everyone can contribute their effort to making “ClamWin” a number one antivirus.
I hope you can understand. I have 3 viruses on my PC. I know they are there but I am unable to delete or remove them. I have tried more than 16 antivirus programs but nothing happens, so kindly send me detailed information on that, meaning the signature files.
Adam Sweet on 28/07/2010 at 11:49 said:

Nice one 🙂

You act like I owe you something because you run Windows and have 3 viruses? This article is for people who run Linux servers. In any case I offer you some options:

1) Figure it out yourself based on the information above. That’s what I’d have to do, I’m not here to spoon feed you. I run Linux. This article is for people who run Linux.

2) Take a backup of your important files and re-install your machine. That’s what people who have viruses they can’t get rid of on Windows have to do.

3) Run an operating system that is immune to Windows viruses. Maybe Linux, maybe something else, then the viruses you have a problem with won’t affect your machine and you can use my instructions above.

4) Install Linux on another hard disk in the same computer, install clamav on your Linux disk, create signatures for the viruses you have and then tell clamav to scan your Windows disk. Clamav won’t remove the files for you, it will just tell you what they are, you can remove them manually.

5) Boot from a Linux live CD and manually remove the problem files from your Windows disk.

I’m slightly concerned by the fact you think you have viruses but not one of 16 AV engines can remove them. What told you you have viruses in the first place and how much do you trust it? While I don’t believe AV companies are perfect, I’ve only encountered 1 piece of malware I couldn’t remove in 10 years of IT and that was 6 years ago. Then again, I don’t run Windows so my exposure is limited, but I’d be wary of removing files that no AV engine can remove on the basis that maybe whatever told you you have viruses is either wrong or itself untrustworthy.

Try uploading your malware files to:

http://virscan.org/

They scan files with 35 AV engines to tell you whether your files are malware or not.

If none of the suggestions above are suitable for you, then you either don’t have the technical understanding to solve your problem in a way I can relate to or you’re just plain doing it wrong.
arun on 28/09/2010 at 13:07 said:

I am going to do a project on creating an antivirus and I need help with what functions (i.e. analysis) are needed to create an antivirus.
Brandon Perry on 06/01/2011 at 07:03 said:

@shahid

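using System;
using System.IO;

byte[] fileHeader;
using (FileStream fs = File.OpenRead("path/to/file.gz"))
{
    // Read at most the first 2048 bytes of the file.
    var buffer = new byte[2048];
    int read, total = 0;

    // Read may return fewer bytes than requested, so loop until full or EOF.
    while (total < buffer.Length &&
           (read = fs.Read(buffer, total, buffer.Length - total)) > 0)
    {
        total += read;
    }

    fileHeader = new byte[total]; // will be up to 2048 in length
    Array.Copy(buffer, fileHeader, total);
}

Code untested, but I hope it helps show you what the logic should be.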
Adam on 28/03/2011 at 21:04 said:

Found this very useful and was able to create the files however when I go to test the new sig file using clamscan I get the following:

# clamscan -d test.ndb somefile.php
LibClamAV Error: cli_ac_addsig: Signature for PHP.Downloader.dor.A is too short
LibClamAV Error: cli_parse_add(): Problem adding signature (3).
LibClamAV Error: Problem parsing database at line 1
LibClamAV Error: Can’t load test.ndb: Malformed database
ERROR: Malformed database

Any input? I’ve been going through various doc files from ClamAV.net.

ClamAV 0.97/12915/Sun Mar 27 20:04:17 2011
Almighty Dervisher on 11/04/2011 at 02:25 said:

Just so you know the only difference between Linux and Windows is Windows is an OS which is profitable to hack (thus gaining either employee’s information, blackmailing companies, or getting a job from it.) where Linux isn’t much so.

Since Linux is becoming more popular, suddenly there are a rare few multi-platform viruses about, and some even targeted to Linux.

It’s not that Linux can’t be hacked, it’s just not as many hackers care to do it. I’m sure Linux is a much more secure OS than Windows, but there is no such thing as ultimate protection.

Here’s a question though,
let’s say I created a virus and before it attacked a computer, it renamed itself based on a bit of the computer’s information (e.g. MAC address, username, etc.), so the virus’s name may have been originally ‘Virus_Ur_Doomed.exe’ but it changed itself to bewrf2r2.exe.
the question is, how would virus scanners/firewalls pick up that it’s a virus? From what I read it seems that you base it off of the name of the virus. I may be wrong, but could you care to explain how firewalls and scanners catch a certain virus where it isn’t name-dependant? I’ve always wanted to make a custom firewall, just to protect the places (mostly browsers) where most firewalls and programs such as spybot S&D don’t cover (My parents and I both use a firewall and spybot, it seems they still get viruses through whatever programs they download and use. I’d like to fix this.)
So if you could perhaps give a brief explanation how you’d pick up and find ‘virus definitions’, that’d be great.
Adam on 11/04/2011 at 11:48 said:

Adam, try making a longer hexdump, such as:

head -c 4096

for example. ClamAV changed the specifications on the length of signatures in 2010, which was after this article was written, though I wasn’t aware they’d cut off support for smaller signatures. I couldn’t find any documentation on this and I don’t really have time to download the code and search it for how long signatures should be.

I did find some comments on people with corrupted or badly formatted signatures which caused ClamAV to bail out, you should be able to find similar stories by Googling.

Another thought, how big was your original file? If you specified a sig length of 2048 (i.e. 2 KB) and your file is less than 2 KB, then it’s not going to work is it? 😉

Finally, I assume PHP.Downloader.dor.A is the signature you created? Are you certain it’s formatted correctly? Like this:

Name:Type:Offset:malware hex output

as detailed in the sig creation documentation:

http://www.clamav.net/doc/latest/signatures.pdf

I’m not aware that the syntax for signature specification has changed, but I haven’t looked into it. You could check your own sigs and compare them to the ones supplied by ClamAV (in /var/lib/clamav/ on my machines, but it depends on what system you’re running and whether you hand compiled or installed from packages). Or pull down the source code and trawl through it to find what the code says it expects.
Adam on 11/04/2011 at 11:49 said:

Almighty Dervisher:

No, it doesn’t depend on the filename at all for the reason you highlight. That would be quite a limitation.

In the example we’re passing the contents of the malware file to sigtool and telling sigtool to create a hexdump of the file contents. Then we’re taking the first 2 KB of the hexdump as the signature of the virus, which should theoretically be enough to uniquely identify the original file. As my article says, it’s sometimes necessary to take your hexdump from an offset number of bytes within the file if the first part isn’t unique enough.
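To see that the file name plays no part, here’s a quick sketch you could run. It uses od and tr as a stand-in for sigtool’s hex dump (an assumption on my part that the two produce the same kind of plain hex string), and the file contents are just a few made-up bytes.

```shell
#!/bin/sh
# Sketch: the hex dump depends on file contents, not on the file name.
# content_hex is a hypothetical helper; od and tr stand in for sigtool.
content_hex() {
    od -An -v -tx1 < "$1" | tr -d ' \n'
}

dir=$(mktemp -d)
printf 'MZ\220\000' > "$dir/Virus_Ur_Doomed.exe"
cp "$dir/Virus_Ur_Doomed.exe" "$dir/bewrf2r2.exe"

# Compare the hex of the original and the renamed copy.
if [ "$(content_hex "$dir/Virus_Ur_Doomed.exe")" = "$(content_hex "$dir/bewrf2r2.exe")" ]; then
    echo "same contents, same hex"
fi
rm -rf "$dir"
```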

Hope that clears it up for you.