Add support for parsing DOCX-based MS Office files.
The main purpose is extracting embedded files. You will need to parse the XML,
locate the embedded data, then decode (base64/OLE?) and decompress it.
I analyzed how ClamAV currently scans a .docx file. From my
understanding, it treats the file as a ZIP archive, extracts it to a temporary
folder, and scans each XML file and each inserted media file, such as pictures
and videos. (If I am wrong, kindly correct me.)
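
To illustrate the ZIP view of a .docx file described above, here is a minimal Python sketch. It is not ClamAV code. It assumes the conventional package layout, where embedded objects sit under word/embeddings/ and media under word/media/; the real locations are defined by the package's relationship parts, so these prefixes are an assumption. The names list_embedded_parts and build_sample_docx are hypothetical helpers made up for this example.

```python
import io
import zipfile

# Conventional locations in a WordprocessingML package. The authoritative
# targets come from the relationship parts, so these prefixes are an assumption.
EMBEDDED_PREFIXES = ("word/embeddings/", "word/media/")


def list_embedded_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of ZIP members that look like embedded objects or media."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith(EMBEDDED_PREFIXES) and not name.endswith("/")
        )


def build_sample_docx() -> bytes:
    """Build a tiny stand-in package in memory (not a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/media/image1.png", b"\x89PNG placeholder")
        archive.writestr("word/embeddings/oleObject1.bin", b"OLE placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    for part in list_embedded_parts(build_sample_docx()):
        print(part)
```

Each listed member would then still need its own decoding step (for example OLE parsing for the .bin objects) before the embedded file itself can be scanned.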
After that, I tried embedding the EICAR test file in a picture using the
Steghide tool. Then I scanned that picture, but ClamAV didn't detect
it. The reason may be that Steghide encrypts the embedded file.
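
A small Python sketch of why this could happen, assuming the scanner matches known byte patterns: once the payload is transformed, the pattern is no longer present in the file. The single-byte XOR here is only a toy stand-in for real encryption (it is not what Steghide does), the signature is a harmless placeholder rather than the EICAR string, and contains_signature and xor_bytes are hypothetical helpers.

```python
# Harmless placeholder pattern, standing in for a real signature.
SIGNATURE = b"HARMLESS-TEST-SIGNATURE"


def contains_signature(data: bytes, signature: bytes = SIGNATURE) -> bool:
    """Naive pattern match: report whether the byte pattern occurs in the data."""
    return signature in data


def xor_bytes(data: bytes, key: int = 0xA5) -> bytes:
    """Toy 'encryption': XOR every byte with one key byte (self-inverse)."""
    return bytes(b ^ key for b in data)


plain = b"header " + SIGNATURE + b" trailer"
hidden = xor_bytes(plain)

if __name__ == "__main__":
    print("plain payload matched: ", contains_signature(plain))
    print("hidden payload matched:", contains_signature(hidden))
    print("restored payload matched:", contains_signature(xor_bytes(hidden)))
```

The match only succeeds again after the transform is reversed, which requires the key. A scanner that cannot undo the transform sees only the cover image and unrelated bytes.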
So I would like to know the following:
1. Why didn't ClamAV recognize the encrypted virus?
2. Can anyone help me get started on my project? (So far I have stepped through
the source code using gdb, so I have only a little knowledge of the code.)