My First Threat Hunting Expedition
One of the things I like doing is reading technical books about various topics. Reading technical books can be challenging because they are not always easy to understand. And what's more, a good technical book should always have exercises.
Here’s my plan: I want to read and do the work of Cyber Threat Hunting, by Nadhem AlFardan, published by Manning. Doing the work is essential for learning. I also like to think that learning in public can help. In other words, for each chapter that interests me, I will write an in-depth technical review, redoing the content and the exercises proposed in the book and adding my own considerations.
I will also upload each script and file to my repository on GitHub, which is a fork of the original repo containing the necessary files.
I will start from the 3rd chapter called Your First Threat Hunting Expedition.
Your First Threat Hunting Expedition
Here’s the threat scenario:
The red team crafted a Microsoft Word document with a suspicious payload and then attached the document to an email they sent to users. Opening the document executes the payload automatically. The code contained in the payload can bypass the existing security controls on the users' machines running Windows 10, going undetected by the anti-virus software. In addition, other security monitoring tools deployed did not generate security alerts for the SOC team to respond to.
Important note: In the book, the author will use Humio to conduct threat hunting. However, the author notes that other tools like Elastic or Splunk could also be used. I will try to use PowerShell (with the help of Claude).
The raw data is available in this public repository. I will upload each script and result in my personal repository.
The first query (not added here) aims to search for events exhibiting Microsoft Office spawning PowerShell. Here's the explanation:
- sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational": Search for Sysmon events only. XmlWinEventLog:Microsoft-Windows-Sysmon/Operational is the sourcetype value assigned to events sent by an agent running on the endpoints and collecting Sysmon events generated by the endpoint.
- _raw.EventID=1: Search for process creation Sysmon events.
- _raw.ParentCommandLine=/winword.exe/i: Search for events with parent command line containing the string winword.exe, case insensitive.
- _raw.CommandLine=/powershell/i: Regex-based and case-insensitive search for events with command line containing powershell.
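The book's first query is written for Humio, but the same hypothesis can be sketched in PowerShell. The snippet below is only an illustration: the sample rows are hypothetical stand-ins for the Sysmon CSV export, and the column names (EventID, ParentCommandLine, CommandLine) are assumed to match the ones used later in this post.

```powershell
# Hypothetical sample rows standing in for the Sysmon CSV export (values are made up).
$sampleEvents = @(
    [pscustomobject]@{ EventID = '1';  ParentCommandLine = '"C:\Program Files\Microsoft Office\Office14\WINWORD.EXE" /n'; CommandLine = 'powershell.exe -nop' }
    [pscustomobject]@{ EventID = '1';  ParentCommandLine = 'C:\Windows\explorer.exe'; CommandLine = 'notepad.exe' }
    [pscustomobject]@{ EventID = '11'; ParentCommandLine = ''; CommandLine = '' }
)

# EventID 1 = process creation; -match is regex-based and case-insensitive by default.
$officeSpawningPowerShell = @($sampleEvents | Where-Object {
    $_.EventID -eq 1 -and
    $_.ParentCommandLine -match 'winword\.exe' -and
    $_.CommandLine -match 'powershell'
})

"Matches: $($officeSpawningPowerShell.Count)"
```

Against the real data set you would replace $sampleEvents with the output of Import-Csv; as noted below, the book's version of this query returns zero results.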
Normally, you would set up multiple queries and test different hypotheses. This particular query returns zero results. What to do now? First, check that the query is correct. If it is, adjust your approach. Our query is correct but does not produce the outcome we expected, so we can modify our initial hypothesis. Here’s an excerpt from the book:
Microsoft Office Word spawning a Windows command shell (cmd). The activity maps to MITRE ATT&CK sub-technique T1059.003 "Command and Scripting Interpreter: Windows Command Shell." In this technique, Microsoft Word did not directly spawn a PowerShell process, but rather PowerShell scripts were created.
It is possible that an attacker instructed Microsoft Word to write a PowerShell script to disk instead and then got that script executed using other commands or processes.
The first search looks for the following:
- Sysmon events with EventID 11 (FileCreate);
- Image field containing the string winword (the name of Word process), case insensitive;
- TargetFilename field containing the string .ps1, case insensitive.
So here’s how to do it in PowerShell:
# Import your CSV file
$events = Import-Csv -Path "ch3_sysmon_events.csv"
# Filter for Event ID 11 (FileCreate) where Word is creating .ps1 files
$suspiciousWordPs1 = $events | Where-Object {
    # Match EventID 11
    $_.EventID -eq 11 -and
    # Match Image containing "winword" (case insensitive)
    $_.Image -match "winword" -and
    # Match TargetFilename ending with .ps1 (case insensitive)
    $_.TargetFilename -match "\.ps1$"
}
# Display the results
$suspiciousWordPs1 | Format-Table -AutoSize
# Export to a new CSV
$suspiciousWordPs1 | Export-Csv -Path "suspicious_word_ps1.csv" -NoTypeInformation
The resulting CSV (converted to JSON for readability) is the following:
[
{
"Channel": "Microsoft-Windows-Sysmon/Operational",
"CommandLine": "",
"Computer": "DESKTOP-PC01",
"DestinationIp": "",
"DestinationPort": "",
"EventID": 11,
"EventRecordID": 301,
"Event{@xmlns}": "http://schemas.microsoft.com/win/2004/08/events/event",
"Execution{@ProcessID}": 1872,
"Execution{@ThreadID}": 3648,
"Image": "C:\\\\Program Files\\\\Microsoft Office\\\\Office14\\\\WINWORD.EXE",
"Keywords": "0x8000000000000000",
"Level": 4,
"Opcode": 0,
"ParentCommandLine": "",
"Protocol": "",
"Provider{@Guid}": "{5770385f-c22a-43e0-bf4c-06f5698ffbd9}",
"Provider{@Name}": "Microsoft-Windows-Sysmon",
"Security{@UserID}": "S-1-5-18",
"TargetFilename": "C:\\\\Users\\\\pc01-user\\\\AppData\\\\Roaming\\\\www.ps1",
"Task": 11,
"TimeCreated": "2021-11-21T13:02:28.484734300Z",
"TimeCreated{@SystemTime}": "2021-11-21T13:02:28.484734300Z",
"UtcTime": "2021-11-21 13:02:28.475",
"Version": 2,
....
}
]
With a simple query we got a lot of useful information. We know the host name, DESKTOP-PC01, that generated the Sysmon event. The Sysmon event of type 11 shows that Microsoft Word created a PowerShell script, www.ps1, under Roaming in the AppData folder, which is hidden by default.
This is the first clue. We also take note of the UtcTime, “2021-11-21 13:02:28.475”, in order to build an event timeline.
After that, we would like to know whether the file was accessed or executed by any other process. How can we do that? We can run a free text search for the file name www.ps1 across all Sysmon events for the host name DESKTOP-PC01. Then, to find out whether other activities were performed besides creating this file, we could search for winword.exe.
# Filter for events from the specific hostname containing "www.ps1"
$matchingEvents = $events | Where-Object {
    # Check hostname matches DESKTOP-PC01
    $_.Computer -eq "DESKTOP-PC01" -and (
        # Check for "www.ps1" in any relevant field that might indicate file access or execution
        $_.TargetFilename -match "www\.ps1" -or
        $_.Image -match "www\.ps1" -or
        $_.CommandLine -match "www\.ps1" -or
        $_.ParentCommandLine -match "www\.ps1" -or
        $_.SourceFilename -match "www\.ps1" -or
        $_.ProcessCommandLine -match "www\.ps1" -or
        # Add a general check to catch any other fields
        ($_ | ConvertTo-Json -Compress) -match "www\.ps1"
    )
}
# Display the results and count (@() keeps .Count reliable even when only one event matches)
Write-Host "Found $(@($matchingEvents).Count) events matching the criteria."
# Output: Found 2 events matching the criteria.
# Group the results by EventID to see what types of events involve this file
$eventTypes = $matchingEvents | Group-Object -Property EventID |
    Select-Object @{Name="EventID"; Expression={$_.Name}}, Count
Write-Host "Event types found:"
$eventTypes | Format-Table -AutoSize
# Display detailed results
$matchingEvents | Format-Table EventID, TimeCreated, Computer, Image, CommandLine, TargetFilename -AutoSize
# Export results to a new CSV
$matchingEvents | Export-Csv -Path "DESKTOP-PC01_www_ps1_events.csv" -NoTypeInformation
We found 2 Sysmon events. Only the second one is shown here, again converted from CSV to JSON:
{
"Channel": "Microsoft-Windows-Sysmon/Operational",
"CommandLine": "\\\"\"C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe\\\"\" -ExecutionPolicy Bypass & C:\\\\Users\\\\pc01-user\\\\AppData\\\\Roaming\\\\www.ps1",
"Computer": "DESKTOP-PC01",
"DestinationIp": "",
"DestinationPort": "",
"EventID": 1,
"EventRecordID": 306,
"Event{@xmlns}": "http://schemas.microsoft.com/win/2004/08/events/event",
"Execution{@ProcessID}": 1872,
"Execution{@ThreadID}": 3648,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe",
"Keywords": "0x8000000000000000",
"Level": 4,
"Opcode": 0,
"ParentCommandLine": "\\\"\"C:\\\\Windows\\\\System32\\\\cscript.exe\\\"\" C:\\\\Users\\\\pc01-user\\\\AppData\\\\Roaming\\\\www.txt //E:VBScript //NoLogo %~f0 %*",
"Protocol": "",
"Provider{@Guid}": "{5770385f-c22a-43e0-bf4c-06f5698ffbd9}",
"Provider{@Name}": "Microsoft-Windows-Sysmon",
"Security{@UserID}": "S-1-5-18",
"TargetFilename": "",
"Task": 1,
"TimeCreated": "2021-11-21T13:02:43.173773000Z",
"TimeCreated{@SystemTime}": "2021-11-21T13:02:43.173773000Z",
"UtcTime": "2021-11-21 13:02:43.152",
"Version": 5,
...
}
This is a Sysmon event of type 1 (process creation) where www.ps1 appears in the CommandLine field: "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe" -ExecutionPolicy Bypass & C:\Users\pc01-user\AppData\Roaming\www.ps1
The event shows powershell.exe executing the script file with -ExecutionPolicy Bypass. Using the Bypass policy means nothing is blocked, and no warnings, prompts, or messages will be displayed.
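Since -ExecutionPolicy Bypass is a useful hunting signal in its own right, here is a rough sketch of how process creation events using it could be flagged. The first sample command line is the one from the event above; the other two rows are hypothetical, and the pattern is a heuristic of my own that only covers a few common spellings of the parameter (-ExecutionPolicy, -ep, -exec, -ex).

```powershell
# First row: command line from the event above. Other rows are hypothetical examples.
$sampleEvents = @(
    [pscustomobject]@{ EventID = '1'; CommandLine = '"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe" -ExecutionPolicy Bypass & C:\Users\pc01-user\AppData\Roaming\www.ps1' }
    [pscustomobject]@{ EventID = '1'; CommandLine = 'powershell.exe -ep bypass -File C:\Temp\example.ps1' }
    [pscustomobject]@{ EventID = '1'; CommandLine = 'powershell.exe -File C:\Temp\example.ps1' }
)

# Heuristic pattern: a few spellings of the parameter followed by "bypass".
$bypassPattern = '-(ExecutionPolicy|ep|exec|ex)\s+bypass'

$bypassRuns = @($sampleEvents | Where-Object {
    $_.EventID -eq 1 -and $_.CommandLine -match $bypassPattern
})

"Process creations using an execution policy bypass: $($bypassRuns.Count)"
```

A match is not malicious by itself, since administrators use this switch too, but combined with a script path under AppData it is worth a closer look.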
The event also reveals more information. The ParentCommandLine is "C:\Windows\System32\cscript.exe" C:\Users\pc01-user\AppData\Roaming\www.txt //E:VBScript //NoLogo %~f0 %*
We see a new file of interest, www.txt, located in \AppData\Roaming, the same location where winword.exe created www.ps1. More on that later.
The next task is to search for www.txt and to continue analyzing the value of the ParentCommandLine field.
The cscript.exe process is run with the following parameters:
- //E:VBScript specifies VBScript as the scripting engine
- //NoLogo suppresses the banner at startup
- %~f0 expands %0, the script being run by cmd, to its fully qualified path
- %* expands to all the arguments passed to the script
Take note of the event’s creation time, "UtcTime": "2021-11-21 13:02:43.152". The event occurred about fifteen seconds after winword.exe created www.ps1.
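The gap between the two events can be computed rather than estimated by eye. This is a small sketch that parses the two UtcTime values quoted above; the format string is my assumption about how Sysmon renders UtcTime in this data set.

```powershell
# UtcTime values copied from the two events discussed above.
$format  = 'yyyy-MM-dd HH:mm:ss.fff'
$culture = [System.Globalization.CultureInfo]::InvariantCulture

$fileCreated   = [datetime]::ParseExact('2021-11-21 13:02:28.475', $format, $culture)
$scriptStarted = [datetime]::ParseExact('2021-11-21 13:02:43.152', $format, $culture)

# Subtracting two DateTime values yields a TimeSpan.
$gap = $scriptStarted - $fileCreated
"Seconds between file creation and execution: $([math]::Round($gap.TotalSeconds, 1))"
```

The difference comes out just under fifteen seconds (14.677 s).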
Here’s the timeline for our DESKTOP-PC01:
- @13:02:28 www.ps1 created by winword.exe
- [NEW!] @13:02:43 powershell.exe executed the PowerShell script www.ps1. The parent process is cscript.exe.
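Instead of maintaining the timeline by hand, it can be kept as a list of objects and rendered in time order whenever a new finding is added. The following is just one possible way to do it; the Note property and the output layout are my own choices, not something from the book.

```powershell
$format  = 'yyyy-MM-dd HH:mm:ss.fff'
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Findings can be appended in any order; UtcTime values come from the events above.
$timeline = @(
    [pscustomobject]@{ UtcTime = '2021-11-21 13:02:43.152'; Note = 'powershell.exe executed www.ps1 (parent: cscript.exe)' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:02:28.475'; Note = 'www.ps1 created by winword.exe' }
)

# Sort chronologically and render each entry as "- @HH:mm:ss note".
$rendered = @($timeline |
    Sort-Object { [datetime]::ParseExact($_.UtcTime, $format, $culture) } |
    ForEach-Object {
        $time = [datetime]::ParseExact($_.UtcTime, $format, $culture)
        '- @{0} {1}' -f $time.ToString('HH:mm:ss', $culture), $_.Note
    })

$rendered
```

New findings are then a matter of adding one more object to $timeline and rerunning the rendering step.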
Let’s answer an interesting question: why did a document create a script? Also, was the document already there? To answer that, we need a query similar to the one we ran before. Let’s search through the Sysmon events for the ones that contain winword.exe with a free text search.
# Filter for events from DESKTOP-PC01 containing "winword.exe" (case insensitive)
# A stricter search in order to restrict the results
$winwordEvents = $events | Where-Object {
    # Check hostname matches DESKTOP-PC01
    $_.Computer -eq "DESKTOP-PC01" -and (
        # Look for exact matches in common fields
        $_.Image -like "*\winword.exe" -or
        $_.ParentImage -like "*\winword.exe" -or
        # Command line containing exactly winword.exe (not part of another word).
        # Note: this only matches a bare winword.exe token; a quoted full path
        # such as "C:\...\WINWORD.EXE" is not matched by this pattern.
        $_.CommandLine -match "(^|\s)winword\.exe(\s|$)" -or
        $_.ParentCommandLine -match "(^|\s)winword\.exe(\s|$)"
    )
}
# Display the results and count
Write-Host "Found $(@($winwordEvents).Count) events matching winword.exe on DESKTOP-PC01."
# 11 events found
# Group the results by EventID to understand the event types
$eventTypes = $winwordEvents | Group-Object -Property EventID |
    Select-Object @{Name="EventID"; Expression={$_.Name}}, Count
Write-Host "Event types found:"
$eventTypes | Format-Table -AutoSize
# Display detailed results
$winwordEvents | Format-Table EventID, TimeCreated, Computer, Image, CommandLine, ParentImage -AutoSize
# Export results to a new CSV
$winwordEvents | Export-Csv -Path "DESKTOP-PC01_winword_events.csv" -NoTypeInformation
The query above returned 9 results. Claude tightened the query in order to trim down the events (my first attempt returned 11). In the book, the author gets 8 matching events. There are probably some differences between the queries, but unfortunately I wasn’t able to find out what they are. For our purposes, we will follow what the book says.
The author's search returns 8 results, and we need to look at the ones before 13:02:28 (the first event in our timeline). At 13:02:17 there is an interesting log entry. The first event is the one we are looking for: another event with EventID 1, a process creation. Its CommandLine points to the Downloads folder, and the document being opened is critical_list.doc. Let’s update the timeline.
- [NEW!] @13:02:17 critical_list.doc opened by winword.exe
- @13:02:28 www.ps1 created by winword.exe
- @13:02:43 powershell.exe executed the PowerShell script www.ps1. The parent process is cscript.exe.
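Picking out the events that precede the first timeline entry can also be done in code rather than by scrolling through the results. The sketch below uses hypothetical rows: the seconds follow the events described in this post, but the milliseconds of the 13:02:17 and 13:02:19 rows are made up, and the Note column exists only for illustration.

```powershell
$format  = 'yyyy-MM-dd HH:mm:ss.fff'
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Cutoff: the moment www.ps1 was created by winword.exe.
$cutoff = [datetime]::ParseExact('2021-11-21 13:02:28.475', $format, $culture)

# Hypothetical rows (milliseconds of the first two winword.exe rows are invented).
$sampleEvents = @(
    [pscustomobject]@{ UtcTime = '2021-11-21 13:02:43.152'; EventID = '1';  Note = 'powershell.exe runs www.ps1' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:02:19.000'; EventID = '1';  Note = 'winword.exe /Embedding' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:02:17.000'; EventID = '1';  Note = 'winword.exe opens critical_list.doc' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:02:28.475'; EventID = '11'; Note = 'www.ps1 created' }
)

# Keep only events strictly before the cutoff, oldest first.
# Sorting the UtcTime strings works because the format is fixed-width.
$earlierEvents = @($sampleEvents |
    Where-Object { [datetime]::ParseExact($_.UtcTime, $format, $culture) -lt $cutoff } |
    Sort-Object UtcTime)

$earlierEvents | Format-Table UtcTime, EventID, Note -AutoSize
```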
So critical_list.doc is our document of interest. Next, we want to discover how the file made it to that folder. The third event in the results is also worth a look. Specifically, its CommandLine field contains the /Embedding switch. This switch is normally passed when Word is started as a COM automation server rather than launched directly by the user, which is worth noting right after a document has been opened. This third event occurred at 13:02:19. Let’s update the timeline.
- @13:02:17 critical_list.doc opened by winword.exe
- [NEW!] @13:02:19 winword.exe /Embedding command issued
- @13:02:28 www.ps1 created by winword.exe
- @13:02:43 powershell.exe executed the PowerShell script www.ps1. The parent process is cscript.exe.
Have a look at the seventh event from the previous results. The EventID is 13, the registry event type that records registry value modifications. The TargetObject field tells us even more: it points to the list of Microsoft Word document file locations for which a user has explicitly enabled editing and macros.
The UtcTime is 13:02:25. So, the updated timeline is:
- @13:02:17 critical_list.doc opened by winword.exe
- @13:02:19 winword.exe /Embedding command issued
- [NEW!] @13:02:25 Registry value updated indicating user enabled editing and macros
- @13:02:28 www.ps1 created by winword.exe
- @13:02:43 powershell.exe executed the PowerShell script www.ps1. The parent process is cscript.exe.
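The registry finding can be turned into a query of its own. Office keeps the list of documents a user has trusted under a "Trusted Documents\TrustRecords" key, so a sketch like the following could surface those changes. The sample rows are hypothetical (the SID, the exact key path and the milliseconds are made up), and I am assuming the CSV exposes a TargetObject column.

```powershell
# Hypothetical rows; only the EventID, the time to the second, and the file name follow the article.
$sampleEvents = @(
    [pscustomobject]@{ EventID = '13'; UtcTime = '2021-11-21 13:02:25.000'; TargetObject = 'HKU\S-1-5-21-0000\Software\Microsoft\Office\14.0\Word\Security\Trusted Documents\TrustRecords\%USERPROFILE%/Downloads/critical_list.doc' }
    [pscustomobject]@{ EventID = '13'; UtcTime = '2021-11-21 13:02:26.000'; TargetObject = 'HKLM\SOFTWARE\Example\Setting' }
    [pscustomobject]@{ EventID = '11'; UtcTime = '2021-11-21 13:02:28.475'; TargetObject = '' }
)

# EventID 13 = registry value set; '\\' in the regex matches one literal backslash.
$trustRecordChanges = @($sampleEvents | Where-Object {
    $_.EventID -eq 13 -and
    $_.TargetObject -match 'Trusted Documents\\TrustRecords'
})

$trustRecordChanges | Format-List UtcTime, TargetObject
```

On a larger data set, dropping any host filter from this kind of query would show every machine where a user enabled content for a document.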
What now? We want to discover when www.txt was created and by what, and how the Word file ended up on the machine (and maybe whether it reached other machines too). So let’s perform a free text search for www.txt in the Sysmon events.
# Filter for events from DESKTOP-PC01 containing "www.txt" (case insensitive)
$wwwTxtEvents = $events | Where-Object {
    # Check hostname matches DESKTOP-PC01
    $_.Computer -eq "DESKTOP-PC01" -and
    # Check for "www.txt" in any field (case insensitive)
    ($_ | ConvertTo-Json -Compress) -match "www\.txt"
}
# Sort events by UtcTime in ascending order
$sortedEvents = $wwwTxtEvents | Sort-Object -Property UtcTime
# Display the results in a table with the specified fields
$sortedEvents | Select-Object UtcTime, Computer, EventID, CommandLine, ParentCommandLine |
    Format-Table -AutoSize
# Export results to a new CSV
$sortedEvents | Select-Object UtcTime, Computer, EventID, CommandLine, ParentCommandLine |
    Export-Csv -Path "DESKTOP-PC01_www_txt_events.csv" -NoTypeInformation
The query returns seven results, but none of them relates to the creation of www.txt. This is not because Sysmon can’t log .txt file creation. It can, but it depends on the configuration. As the author shows later, the particular configuration file used here generates file creation events for files with the ps1 extension, not for txt files. Sometimes we do not have enough information, and that’s OK.
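One way to see what the Sysmon configuration actually captures is to group the FileCreate events by file extension. This is a sketch on hypothetical rows (only the www.ps1 path comes from this post; the second .ps1 path is invented); against the real CSV you would pipe $events instead.

```powershell
# Hypothetical rows; only the www.ps1 path comes from the article.
$sampleEvents = @(
    [pscustomobject]@{ EventID = '11'; TargetFilename = 'C:\Users\pc01-user\AppData\Roaming\www.ps1' }
    [pscustomobject]@{ EventID = '11'; TargetFilename = 'C:\Users\pc01-user\AppData\Local\Temp\example.PS1' }
    [pscustomobject]@{ EventID = '1';  TargetFilename = '' }
)

# Group EventID 11 (FileCreate) events by lower-cased file extension.
$extensions = @($sampleEvents |
    Where-Object { $_.EventID -eq 11 } |
    Group-Object { [System.IO.Path]::GetExtension($_.TargetFilename).ToLowerInvariant() } |
    Select-Object @{ Name = 'Extension'; Expression = { $_.Name } }, Count)

$extensions | Format-Table -AutoSize
```

If an extension such as .txt never shows up in this list, the absence of a file creation event for www.txt says more about the logging configuration than about what happened on the host.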
Then the book goes into detail describing each one of the events. Long story short, these logs can augment our timeline:
- @13:02:17 critical_list.doc opened by winword.exe
- @13:02:19 winword.exe /Embedding command issued
- @13:02:25 Registry value updated indicating user enabled editing and macros
- @13:02:28 www.ps1 created by winword.exe
- [NEW!] @13:02:41 winword.exe executing cscript.exe
- @13:02:43 powershell.exe executed the PowerShell script www.ps1. The parent process is cscript.exe.
- [NEW!] @13:02:54 cscript.exe runs cmd.exe to execute rundll32.exe (4 events)
When threat hunting, it is incredibly easy to get lost in the many activities an analyst could or has to carry out. It is important to note down every detail that could also be useful to the incident response team.
We have finished the chapter, and that’s great. Now let’s get to the exercises in the last pages. Here are the exercises:
- Can you find the sysmon network connection events?
- What processes initiated the outbound network connections uncovered in question 1?
- What were the destination IP addresses and ports for the network connections?
- Update the timeline to reflect the new findings.
The first one can be solved with a query that searches for EventID 3 events containing the string “powershell”, sorted by time.
# Filter for Sysmon network events (EventID=3) from DESKTOP-PC01 containing "powershell" (case insensitive)
$powershellNetworkEvents = $events | Where-Object {
    # Check hostname matches DESKTOP-PC01
    $_.Computer -eq "DESKTOP-PC01" -and
    # Match EventID 3 (network connection)
    $_.EventID -eq 3 -and
    # Check for "powershell" in any field (case insensitive)
    ($_ | ConvertTo-Json -Compress) -match "powershell"
}
# Sort events by UtcTime in ascending order
$sortedEvents = $powershellNetworkEvents | Sort-Object -Property UtcTime
# Display the results in a table with the specified fields
$sortedEvents | Select-Object UtcTime, DestinationIp, Protocol, DestinationPort, Image |
    Format-Table -AutoSize
# Export results to a new CSV
$sortedEvents | Select-Object UtcTime, DestinationIp, Protocol, DestinationPort, Image |
    Export-Csv -Path "DESKTOP-PC01_powershell_network_events.csv" -NoTypeInformation
The results show six connections: two to a local IP address and four to Internet IP addresses. Here are the results of the query:
[
{
"UtcTime": "2021-11-21 13:03:03.705",
"DestinationIp": "192.168.155.134",
"Protocol": "tcp",
"DestinationPort": 80,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe"
},
{
"UtcTime": "2021-11-21 13:03:04.715",
"DestinationIp": "192.168.155.134",
"Protocol": "tcp",
"DestinationPort": 80,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe"
},
{
"UtcTime": "2021-11-21 13:03:06.225",
"DestinationIp": "146.112.61.110",
"Protocol": "tcp",
"DestinationPort": 443,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe"
},
{
"UtcTime": "2021-11-21 13:03:06.792",
"DestinationIp": "2.20.7.24",
"Protocol": "tcp",
"DestinationPort": 80,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe"
},
{
"UtcTime": "2021-11-21 13:03:08.836",
"DestinationIp": "146.112.61.110",
"Protocol": "tcp",
"DestinationPort": 443,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe"
},
{
"UtcTime": "2021-11-21 13:03:10.545",
"DestinationIp": "146.112.61.110",
"Protocol": "tcp",
"DestinationPort": 443,
"Image": "C:\\\\Windows\\\\System32\\\\WindowsPowerShell\\\\v1.0\\\\powershell.exe"
}
]
The process powershell.exe initiated all the connections.
The outbound connections used TCP ports 80 and 443, and the destination IP addresses are:
- 192.168.155.134 (port 80)
- 2.20.7.24 (port 80)
- 146.112.61.110 (port 443)
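The answers to the second and third exercise can be double-checked by grouping the connections by destination. The rows below are the six results shown above; the Test-PrivateIPv4 helper is my own addition and only checks the RFC 1918 private ranges, so treat the internal/external label as a rough classification.

```powershell
# The six network connection results listed above.
$connections = @(
    [pscustomobject]@{ UtcTime = '2021-11-21 13:03:03.705'; DestinationIp = '192.168.155.134'; DestinationPort = '80' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:03:04.715'; DestinationIp = '192.168.155.134'; DestinationPort = '80' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:03:06.225'; DestinationIp = '146.112.61.110';  DestinationPort = '443' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:03:06.792'; DestinationIp = '2.20.7.24';       DestinationPort = '80' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:03:08.836'; DestinationIp = '146.112.61.110';  DestinationPort = '443' }
    [pscustomobject]@{ UtcTime = '2021-11-21 13:03:10.545'; DestinationIp = '146.112.61.110';  DestinationPort = '443' }
)

# Hypothetical helper: true for 10/8, 172.16/12 and 192.168/16 addresses.
function Test-PrivateIPv4 {
    param([string]$Address)
    $octets = $Address.Split('.') | ForEach-Object { [int]$_ }
    return ($octets[0] -eq 10) -or
           ($octets[0] -eq 172 -and $octets[1] -ge 16 -and $octets[1] -le 31) -or
           ($octets[0] -eq 192 -and $octets[1] -eq 168)
}

# One summary row per destination IP and port.
$summary = @($connections |
    Group-Object DestinationIp, DestinationPort |
    ForEach-Object {
        $first = $_.Group[0]
        $scope = if (Test-PrivateIPv4 $first.DestinationIp) { 'internal' } else { 'external' }
        [pscustomobject]@{
            DestinationIp   = $first.DestinationIp
            DestinationPort = $first.DestinationPort
            Connections     = $_.Count
            Scope           = $scope
        }
    })

$summary | Format-Table -AutoSize
```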
The new timeline is:
- @13:02:17 critical_list.doc opened by winword.exe
- @13:02:19 winword.exe /Embedding command issued
- @13:02:25 Registry value updated indicating user enabled editing and macros
- @13:02:28 www.ps1 created by winword.exe
- @13:02:41 winword.exe executing cscript.exe
- @13:02:43 powershell.exe executed the PowerShell script www.ps1. The parent process is cscript.exe.
- @13:02:54 cscript.exe runs cmd.exe to execute rundll32.exe (4 events)
- [NEW!] @13:03:03 - 13:03:10 powershell.exe initiates connections to internal and external IP addresses (6 events)
Conclusion
I learned a lot during this threat hunting walkthrough. To be honest, the learning curve for PowerShell queries is steep, but AI assistants can help when you get stuck.
I really enjoyed writing this, and I hope it will be useful to someone reading the book. See you soon!