Tool for extracting possible IoC information from files

This blog post presents ioc_strings, a tool for gathering relevant technical information from the strings in a file. The tool was developed in the CinCan project for use in incident analysis Continuous Integration (CI) pipelines, and it can also be used standalone by an incident analyst. ioc_strings extracts possible IoC (Indicator of Compromise) information from files, such as URLs, domains, email addresses, and hashes. These IoC types are compatible with Cortex-Analyzers, so the gathered candidates can be fed to Cortex-Analyzers to evaluate whether they are actual IoCs.

Benefit of finding only relevant information

In the image below, the left side shows the output of the Linux strings command, with 112k output strings. The right side shows the output of ioc_strings: 27 lines that contain only relevant possible IoC information for further analysis. Without ioc_strings, going through all of the strings output manually would be a huge job.

Strings versus IoC strings

Functionalities

The tool uses the Linux strings command to gather all strings from a file and then loops through each of them. It also splits each string at whitespace to increase the number of possible IoCs found. For example, the string ip = 127.0.0.1, which as a whole does not match any IoC type, yields three strings: ip, =, and 127.0.0.1. Of these three, 127.0.0.1 is identified as an IP address. The Python libraries iocextract and validators are used to identify IoC types.
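The whitespace-splitting step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the regular expressions and the function names classify and extract_candidates are hypothetical and were made up for this example, and only the standard library is used here instead of iocextract and validators.

```python
import ipaddress
import re

# Hypothetical, simplified patterns for illustration only. The real tool
# relies on the iocextract and validators libraries instead.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
HASH_RE = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def classify(token):
    """Return the list of IoC type names that match a single token."""
    types = []
    try:
        ipaddress.ip_address(token)
        types.append("ip")
    except ValueError:
        pass  # not a valid IPv4 or IPv6 address
    if EMAIL_RE.match(token):
        types.append("email")
    if HASH_RE.match(token):
        types.append("hash")
    return types


def extract_candidates(strings):
    """Check each string as a whole, then each whitespace-separated piece."""
    found = {}
    for s in strings:
        for token in [s, *s.split()]:
            types = classify(token)
            if types:
                found[token] = types
    return found


print(extract_candidates(["ip = 127.0.0.1", "testing"]))
# prints {'127.0.0.1': ['ip']}
```

Checking the whole string first and the pieces second means that a string which is already a clean candidate is kept, while a string such as ip = 127.0.0.1 still yields the address hidden inside it.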

ioc_strings can also be used as a Python library to identify the IoC type of a string. Example code and output:

import ioc_strings

ioc1 = ioc_strings.IOC("8.8.8.8")
print(ioc1.is_ioc())
print(ioc1.data)

ioc2 = ioc_strings.IOC("testing")
print(ioc2.is_ioc())
print(ioc2.data)

output:

True
{'8.8.8.8': ['ip']}
False
{'testing': []}

Example usage

The example files scanned are from https://github.com/ytisf/theZoo.

The input path can be either a file or a directory. If it is a directory, all file paths under it are searched recursively and processed one by one. Directory structure of the example case:

theZoo/malwares/Binaries/Brain.A
├── Brain.A
│ ├── Brain.A.img
│ ├── Brain.A.txt
│ ├── nobrains
│ │ ├── BRAIN
│ │ ├── DEBRAIN.C
│ │ ├── DEBRAIN.EXE
│ │ ├── DREAD.ASM
│ │ ├── DREAD.INC
│ │ ├── DWRITE.ASM
│ │ ├── DWRITE.INC
│ │ ├── README
│ │ ├── VACCINE.COM
│ │ ├── VACCINE.PAS
│ │ └── VACCINE.TXT
│ └── nobrains.zip
├── Brain.A.md5
├── Brain.A.pass
├── Brain.A.sha
└── Brain.A.zip

Command:

iocstrings theZoo/malwares/Binaries/Brain.A/

output:

03f1e073761af071d373f025359da84ec39ada19
c56f135fdaff397ad207f61b4f2042fe
nobrains.zip
nobrains.zip
hubak@elf.stuba.sk
vaccine.com
jwright@atanasoff.cs.iastate.edu

IoC types can also be included in the output with the `-t` option. Command:

iocstrings theZoo/malwares/Binaries/Brain.A/ -t

output (JSONL format):

{"c56f135fdaff397ad207f61b4f2042fe": ["hash"]}
{"03f1e073761af071d373f025359da84ec39ada19": ["hash"]}
{"nobrains.zip": ["domain"]}
{"nobrains.zip": ["domain"]}
{"hubak@elf.stuba.sk": ["email"]}
{"vaccine.com": ["domain"]}
{"jwright@atanasoff.cs.iastate.edu": ["email"]}

The output can also be filtered by IoC type. Command:

iocstrings theZoo/malwares/Binaries/Brain.A/ -t --filter email

output:

{"hubak@elf.stuba.sk": ["email"]}
{"jwright@atanasoff.cs.iastate.edu": ["email"]}
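Because the `-t` output is JSONL, with one JSON object per line, it is easy to post-process in a script. The sketch below groups values by IoC type and drops duplicates such as the repeated nobrains.zip entry. The function name group_by_type is a hypothetical helper for this example and is not part of ioc_strings. The sample lines are copied from the output shown earlier.

```python
import json
from collections import defaultdict


def group_by_type(jsonl_text):
    """Map each IoC type to the sorted, de-duplicated values reported for it."""
    grouped = defaultdict(set)
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # e.g. {"vaccine.com": ["domain"]}
        for value, types in record.items():
            for ioc_type in types:
                grouped[ioc_type].add(value)
    return {ioc_type: sorted(values) for ioc_type, values in grouped.items()}


sample = """\
{"c56f135fdaff397ad207f61b4f2042fe": ["hash"]}
{"nobrains.zip": ["domain"]}
{"nobrains.zip": ["domain"]}
{"hubak@elf.stuba.sk": ["email"]}
{"vaccine.com": ["domain"]}
"""

for ioc_type, values in group_by_type(sample).items():
    print(ioc_type, values)
```

A grouping like this is one possible starting point when handing each candidate to an analyzer that matches its type.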

Summary

The ioc_strings tool is an alternative to the Linux strings command, and perhaps a more convenient one, when analyzing malware files or memory dumps. If you are interested in testing the tool, see the GitLab repository: https://gitlab.com/CinCan/ioc_strings


Onni Hakkari
Project Engineer
Institute of Information Technology, JAMK University of Applied Sciences
