Mailing List Archive

[clamav-users] Multiple Streams embedded as base64 inside xml
Hi All,

I have an XML file which has a list of PDF files embedded as base64.

When I scan that XML file, does it also scan the base64 content inside
the XML, or do I need to convert the base64 contents into separate
streams and scan them individually?

Regards,
Gorkem Cinar
Re: [clamav-users] Multiple Streams embedded as base64 inside xml
Hi there,

On Thu, 23 Apr 2020, Görkem ÇINAR via clamav-users wrote:

> I have an XML file which has a list of PDF files embedded as base64.
>
> When I scan that XML file, does it also scan the base64 content inside
> the XML, or do I need to convert the base64 contents into separate
> streams and scan them individually?

If ClamAV recognizes that there's base64 encoded text to be scanned it
will try to scan it, but it's not as simple as that. See for example

https://blog.talosintelligence.com/2013/01/the-0-day-that-wasnt-dissecting-highly.html

To get an answer in one particular case - but perhaps _only_ in that
particular case, see

http://www.clamav.net/documents/creating-signatures-for-clamav

especially the part about half way down the page which talks about

clamscan --debug

and saving temporary files to show how ClamAV has processed the file.
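
As a rough illustration of that debugging step, the sketch below builds a
clamscan command line which keeps the temporary files so they can be
inspected afterwards. Only --debug is named in the message above; the
--leave-temps and --tempdir options, the helper names and the paths are
assumptions added for the example, and the scan runs only if clamscan is
actually installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_debug_scan_command(target, temp_dir):
    """Return the argv for a debug scan that keeps ClamAV's temporary files."""
    return [
        "clamscan",
        "--debug",                # trace how the file is processed
        "--leave-temps",          # assumption: keep extracted/decoded temp files
        f"--tempdir={temp_dir}",  # assumption: collect them in one directory
        str(target),
    ]


def run_debug_scan(target, temp_dir):
    """Run the scan if clamscan is on PATH; return None otherwise."""
    if shutil.which("clamscan") is None:
        return None
    Path(temp_dir).mkdir(parents=True, exist_ok=True)
    # Capture both output streams so the trace can be searched afterwards.
    return subprocess.run(
        build_debug_scan_command(target, temp_dir),
        capture_output=True,
        text=True,
        check=False,
    )
```

Whatever ends up in the temporary directory after such a run shows what
ClamAV extracted or decoded from that one file, which is the "one
particular case" answer mentioned above.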

A signature is just something which matches a string of bytes in the
data being scanned. It's quite possible that a scan could catch some
known problem in *any* file, no matter how compressed, containerized
and obfuscated, if there's already a signature which matches something
in the raw file (that is, before any extraction and/or decoding takes
place); so it might not be necessary for ClamAV to do any processing
on the file before scanning. Some signatures look specifically for
strings which have been obfuscated; try for example

sigtool -l | grep Obfuscated

for what's in your ClamAV database.
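
To make the raw-versus-decoded distinction concrete, here is a minimal
sketch using only Python's standard library. It is not how ClamAV matches
signatures; the placeholder payload and the function name are invented
for the example. It shows that a byte pattern written for the decoded
data cannot appear in the base64 text, while a pattern written against
the encoded form can.

```python
import base64


def find_in_raw_and_decoded(marker, payload):
    """Report whether marker occurs in the base64 text and in the decoded bytes."""
    encoded = base64.b64encode(payload)
    return {
        "raw": marker in encoded,
        "decoded": marker in base64.b64decode(encoded),
    }


# Stand-in payload: just enough bytes to look like the start of a PDF.
sample = b"%PDF-1.4\n% placeholder body, not a real document\n"

# The "%" character is not part of the base64 alphabet, so this marker
# can only ever be found after decoding.
hits = find_in_raw_and_decoded(b"%PDF-", sample)

# A pattern written against the encoded form can match the raw text.
encoded_prefix_hit = base64.b64encode(sample).startswith(b"JVBERi")
```

Note that the encoded form of a pattern depends on where it falls
relative to base64's three-byte grouping, so a pattern aimed at encoded
text in the middle of a payload needs more care than this prefix check.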

While ClamAV is of course capable of decoding base64 text, there are
caveats. There's a tradeoff between scan times and the probability
that something detectable might be present in what's being scanned,
and the signatures themselves contain a field which determines their
applicability so that ClamAV doesn't waste its time scanning for some
threat which cannot be present in the scanned data. If a signature is
restricted to a certain kind of data (it doesn't have to be, but many
are), then no matter whether or not it would match anything in the
scanned data, it won't be used in the scan if ClamAV believes that it
is not scanning that kind of data. One of the things many malware
authors try to do (sometimes quite hard, as you've seen) is hide the
real intent of their creation. Sometimes they succeed, so even if the
answer to your question were a simple "yes", you couldn't really rely
on it.
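
The effect of that applicability field can be sketched as a toy model.
This is not ClamAV's implementation: it assumes the extended signature
layout Name:TargetType:Offset:HexSignature and the target type numbers 0
(any file), 7 (ASCII text) and 10 (PDF) as I recall them from the
signature documentation, and the signature names and patterns are made up.

```python
from dataclasses import dataclass

# Assumed target type numbers (a subset), per the signature documentation.
TARGET_ANY = 0
TARGET_ASCII_TEXT = 7
TARGET_PDF = 10


@dataclass(frozen=True)
class ExtendedSignature:
    name: str
    target_type: int
    offset: str
    hex_pattern: str


def parse_extended_signature(line):
    """Split 'Name:TargetType:Offset:HexSignature[...]' into its first four fields."""
    parts = line.strip().split(":")
    if len(parts) < 4:
        raise ValueError(f"not an extended signature: {line!r}")
    return ExtendedSignature(parts[0], int(parts[1]), parts[2], parts[3])


def applicable_signatures(signatures, detected_type):
    """Names of signatures that would be tried for data of detected_type."""
    return [
        sig.name
        for sig in signatures
        if sig.target_type in (TARGET_ANY, detected_type)
    ]


# Hypothetical signatures, for illustration only.
SIGNATURES = [
    parse_extended_signature("Example.Anything:0:*:deadbeef"),
    parse_extended_signature("Example.PdfOnly:10:*:255044462d"),
    parse_extended_signature("Example.TextOnly:7:*:6576616c28"),
]

# If the scanner decides it is looking at text, the PDF-only signature is
# never tried, whether or not its pattern occurs somewhere in the data.
tried_for_text = applicable_signatures(SIGNATURES, TARGET_ASCII_TEXT)
```

In this model a PDF hidden inside something classified as text is only
checked against PDF-specific signatures if it is first extracted and
recognized as a PDF.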

Not only are you to some extent at the mercy of the malware authors,
you also to some extent depend on the whims of the signature writers.
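
Given those caveats, the conservative option is the second one raised in
the original question: decode each embedded payload yourself and scan it
as a separate file. The sketch below is one way that might look. The
element name "document" is a guess at the XML layout, the helper names
are invented, and the scan step simply shells out to clamscan if it is
installed.

```python
import base64
import shutil
import subprocess
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def extract_payloads(xml_text, tag="document"):
    """Decode the base64 text of every <tag> element in the XML.

    Raises binascii.Error (a ValueError) on text that is not valid base64,
    on the view that an undecodable payload deserves attention rather
    than being silently skipped.
    """
    payloads = []
    for element in ET.fromstring(xml_text).iter(tag):
        # Remove line wrapping and indentation before decoding.
        text = "".join((element.text or "").split())
        payloads.append(base64.b64decode(text, validate=True))
    return payloads


def scan_payload(data):
    """Scan one decoded payload with clamscan; None if clamscan is missing."""
    if shutil.which("clamscan") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "payload.bin"
        path.write_bytes(data)
        result = subprocess.run(
            ["clamscan", "--no-summary", str(path)],
            capture_output=True,
            text=True,
            check=False,
        )
    # clamscan exit codes: 0 = nothing found, 1 = virus found, 2 = error.
    return result.returncode


def scan_embedded_documents(xml_text, tag="document"):
    """Return one clamscan exit code (or None) per embedded payload."""
    return [scan_payload(data) for data in extract_payloads(xml_text, tag)]
```

If the XML uses namespaces, the tag passed in would need to be the
qualified name. Scanning the original XML file as well costs little and
still lets any signature that matches the raw text do its job.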

--

73,
Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
Re: [clamav-users] Multiple Streams embedded as base64 inside xml
G.W. Haywood via clamav-users wrote:
> It's quite possible that a scan could catch some
> known problem in *any* file, no matter how compressed, containerized
> and obfuscated, if there's already a signature which matches something
> in the raw file (that is, before any extraction and/or decoding takes
> place);

That's not entirely true, although I'd be happy to be proven wrong.

I've tried a couple of times to create signatures for JavaScript malware
(and asked for pointers on this list a couple of times), based on an
obfuscation pattern in a series of raw files. I have yet to find a way
to match on the raw file itself in those cases.

-kgd

Re: [clamav-users] Multiple Streams embedded as base64 inside xml
Hi there,

On Fri, 24 Apr 2020, Kris Deugau wrote:

> G.W. Haywood via clamav-users wrote:
>> It's quite possible that a scan could catch some
>> known problem in *any* file, no matter how compressed, containerized
>> and obfuscated, if there's already a signature which matches something
>> in the raw file (that is, before any extraction and/or decoding takes
>> place);
>
> That's not entirely true, although I'd be happy to be proven wrong.
>
> I've tried a couple of times to create signatures for JavaScript malware (and
> asked for pointers on this list a couple of times), based on an obfuscation
> pattern in a series of raw files. I have yet to find a way to match on the
> raw file itself in those cases.

I see some posts from you in 2016 which seemed to be basically about
normalization. Normalization was causing signatures for those things
to fail to match, but switching normalization off would have the same
effect on signatures which needed to work on normalized text. Absent
a signature type which calls for non-normalized text, I think the way
I'd approach that would be to run two instances of clamd - one for the
bulk of the signatures, and one for the (few?) custom signatures which
need to work on the raw files. In 2015 you said that you had trouble
getting signatures of the form

AB??CD??EF??...

to work. I don't know if that's still a problem, but if I were going
to look for such things I'd find it much quicker and easier to add a
Perl regex to my milter configuration than to write ClamAV signatures.
4-5 years ago I was heavily overworked with a new milter, otherwise I
might have piped up at the time. For the omissions I apologize.
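
For what it's worth, the regex approach to a pattern of that shape might
look something like the sketch below. It is written in Python rather than
Perl, it is not taken from any milter, and it handles only plain hex
bytes and the ?? "any byte" wildcard, not the other wildcard forms that
ClamAV signatures allow. The function name is invented for the example.

```python
import re


def hex_signature_to_regex(signature):
    """Translate a hex signature with ?? wildcards into a compiled bytes regex."""
    if len(signature) % 2 != 0:
        raise ValueError("signature must consist of two-character tokens")
    parts = []
    for i in range(0, len(signature), 2):
        token = signature[i:i + 2]
        if token == "??":
            parts.append(b".")  # any single byte
        else:
            # bytes.fromhex raises ValueError on anything but two hex digits.
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL so that "." also matches a newline byte.
    return re.compile(b"".join(parts), re.DOTALL)


pattern = hex_signature_to_regex("AB??CD??EF??")
match = pattern.search(b"\x00\xab\x01\xcd\n\xef\x03\x00")
```

Because the regex runs on whatever bytes it is given, it sees the raw
data with no normalization applied, which is the property the custom
signatures were missing.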

I've remarked before that the bodies of mail which you and I seem to
see are very different. I don't recall ever seeing any of the kind of
obfuscation which has bothered you, but then I probably drop the mails
before they get as far as body scanning. That's a luxury I can afford
which perhaps you can't, but anything from a Yahoo server which claims
a gmail sender address is, in my view, fair game...

--

73,
Ged.
