No Logs, No Problem: Generating Threat Intel with Active Deception
I’m continuing my series reviewing the book Cyber Threat Hunting by Nadhem AlFardan, published by Manning. So far, I’ve written four posts covering Chapter 3, Chapter 4, Chapter 6 and Chapter 8.
Chapter 9, in my opinion, wasn’t worth an article: I couldn’t add anything to what the author explains in that chapter. This will be my last chapter review for this book. It was really fun, and I hope to do it again with another exciting book!
In this post, we’ll explore Chapter 10: Hunting with deception. You can find all the relevant files for this chapter in my GitHub repo.
Hunting with deception
For this last chapter, we consider the case in which a threat hunter does not have the data needed for the analysis, which is a very common situation. To create an environment where you can gather information about an attacker, you use active defense, or deception.
In a Microsoft environment using Active Directory, if lateral movement is suspected but the relevant data is unavailable, it can be difficult to determine whether an attack has actually occurred. Rather than waiting for the adversary to make a mistake, we deploy deception, and we do it without raising suspicion.
Here are three examples of deception techniques:
- Fake account password hashes on Windows servers and endpoints;
- RDS (Remote Desktop Services) on a new server;
- A network file share that contains (fake) sensitive information.
Decoys should not be too obvious: deploying an FTP decoy, for example, could reveal the presence of decoys to the adversary, who would then know they are being watched.
LSASS stands for Local Security Authority Subsystem Service. It is a critical system process in Microsoft Windows operating systems, responsible for verifying user logins, handling password changes, and creating access tokens.
In the book, the author creates a PowerShell script that injects a hash into LSASS for the user api_admin. This is the decoy we are going to work with.
Since we don’t know whether the adversary will encounter the decoy, the author states:
we’ll configure the data store to generate an alert that the security monitoring team should triage and then direct to the threat hunter.
We’ll fast-forward to the “we-got-lucky-part” and take a look at the data.
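To make the alerting idea more concrete, here is a minimal sketch of how such a rule could be approximated offline with pandas. It is not the author's data store configuration: `DECOY_ACCOUNTS` and `find_decoy_hits` are hypothetical names of mine, and I assume the events sit in a DataFrame with a `TargetUserName` column, as in the chapter's dataset.

```python
import pandas as pd

# Hypothetical set of decoy account names; only api_admin comes from the chapter.
DECOY_ACCOUNTS = {"api_admin"}


def find_decoy_hits(events: pd.DataFrame, decoys=DECOY_ACCOUNTS) -> pd.DataFrame:
    """Return every event whose TargetUserName matches a decoy account."""
    # Compare in lowercase, since Windows account names are not case-sensitive.
    names = events["TargetUserName"].astype("string").str.lower()
    wanted = {name.lower() for name in decoys}
    return events[names.isin(wanted)]


# Tiny made-up sample, only to show the shape of the data.
sample = pd.DataFrame(
    {
        "Computer": ["winhost01", "winhost01", "apidevdc01.example.com"],
        "EventID": [4648, 4688, 4625],
        "TargetUserName": ["API_ADMIN", None, "api_admin"],
    }
)
hits = find_decoy_hits(sample)
print(f"{len(hits)} decoy event(s) to triage")
```

Since nobody has a legitimate reason to use a decoy account, any hit at all is worth triaging, which is why the sketch needs no threshold.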
Data Analysis
I created a Jupyter Notebook and used Python to analyze the CSV file provided. I also tried the JSON file, but I found CSV files easier to manipulate.
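The loading cell itself is not shown in this post, so here is a rough sketch of what it could look like. `load_events` is a hypothetical helper, and the in-memory CSV below is a made-up stand-in: in the notebook you would pass the path of the chapter's CSV file instead.

```python
import io

import pandas as pd


def load_events(source) -> pd.DataFrame:
    """Load exported Windows events from a CSV path or file-like object."""
    # low_memory=False lets pandas infer each column type from the whole file,
    # which avoids mixed-type columns in wide, sparse event exports.
    events = pd.read_csv(source, low_memory=False)
    # Coerce EventID to a nullable integer so comparisons like == 4688 behave.
    events["EventID"] = pd.to_numeric(events["EventID"], errors="coerce").astype("Int64")
    return events


# Stand-in for the real file: a two-row CSV with made-up values.
sample_csv = io.StringIO(
    "@timestamp,Computer,EventID,TargetUserName\n"
    "2023-12-05 02:30:00,winhost01,4648,api_admin\n"
    "2023-12-05 02:30:01,apidevdc01.example.com,4625,api_admin\n"
)
df = load_events(sample_csv)
print(df.dtypes)
```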
So to recap: we are in a situation where we know that someone used api_admin, our decoy, over the network.
After loading the file, here’s the first search that breaks down the events by fields Computer and EventID, knowing that our TargetUserName is api_admin:
if "df" in locals():
    # Filter for the specific user
    api_admin_events = df[df["TargetUserName"] == "api_admin"]
    if not api_admin_events.empty:
        # Group by Computer and EventID and get the count
        grouped_results = (
            api_admin_events.groupby(["Computer", "EventID"])
            .size()
            .reset_index(name="Count")
        )
        print("Grouped results for TargetUserName: 'api_admin'")
        display(grouped_results)
    else:
        print("No records found for TargetUserName: 'api_admin'")
else:
    print("DataFrame 'df' not found. Please run the data loading cell (Cell 1) first.")
Here’s the breakdown of events by fields Computer and EventID from the book:
Computer | EventID | _count |
---|---|---|
apidevdc01.example.com | 4625 | 3 |
apidevdc01.example.com | 4776 | 3 |
winhost01 | 4648 | 3 |
And here’s the result of the script:
Grouped results for TargetUserName: 'api_admin'
index | Computer | EventID | Count |
---|---|---|---|
0 | apidevdc01.example.com | 4625 | 4 |
1 | apidevdc01.example.com | 4776 | 4 |
2 | winhost01 | 4648 | 4 |
It seems that each event count is one higher than in the book.
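I did not track down where the extra event comes from. One possible explanation (an assumption on my side, not something the book confirms) is that a record appears more than once in the CSV export. A quick sketch to test that idea compares the raw counts with the counts after dropping rows that share the same `Computer` and `EventRecordID`; `count_with_and_without_duplicates` is a hypothetical helper name.

```python
import pandas as pd


def count_with_and_without_duplicates(events: pd.DataFrame, user: str) -> pd.DataFrame:
    """Compare raw event counts with counts after dropping repeated records."""
    subset = events[events["TargetUserName"] == user]
    raw = subset.groupby(["Computer", "EventID"]).size().rename("RawCount")
    # Treat rows sharing Computer + EventRecordID as the same Windows event.
    unique = (
        subset.drop_duplicates(subset=["Computer", "EventRecordID"])
        .groupby(["Computer", "EventID"])
        .size()
        .rename("UniqueCount")
    )
    return pd.concat([raw, unique], axis=1).reset_index()


# Made-up rows: record 101 appears twice, as it might after a double export.
sample = pd.DataFrame(
    {
        "Computer": ["winhost01"] * 4,
        "EventID": [4648] * 4,
        "EventRecordID": [100, 101, 101, 102],
        "TargetUserName": ["api_admin"] * 4,
    }
)
comparison = count_with_and_without_duplicates(sample, "api_admin")
print(comparison)
```

If both columns match on the real data, duplicates are not the cause and the difference lies elsewhere, for example in how the two files were exported.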
In the book, it is stated:
This output shows that three types of Windows events were generated by two computers, apidevdc01.example.com and winhost01:
- Event 4625 is generated for a failed login attempt.
- Event 4776 is generated when an AD domain controller attempts to validate credentials for an account login.
- Event 4648 is generated when a login is attempted with explicit credentials.
The book shows a sample log for Event 4625 (failed login) to demonstrate that access was denied because the user api_admin doesn’t exist in AD or locally on any of the systems in the first place.
Also, an important note: Event 4648 is possible evidence of a lateral movement attempt from 10.128.0.25 (the IP address of winhost01) to apidevdc01.example.com (the target server).
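To see that source-to-target relationship in the data rather than in a single log sample, the 4648 events can be grouped by the fields that describe both ends. This is only a sketch: `summarize_explicit_logons` is a hypothetical helper, `TargetServerName` is an assumed column name that may differ in your export, and the sample rows are made up to mirror the description above.

```python
import pandas as pd


def summarize_explicit_logons(events: pd.DataFrame, user: str) -> pd.DataFrame:
    """Count Event 4648 records for one account per source/target combination."""
    mask = (events["EventID"] == 4648) & (events["TargetUserName"] == user)
    # TargetServerName is an assumed column name; keep only columns that exist.
    wanted = ["Computer", "IpAddress", "TargetServerName"]
    cols = [c for c in wanted if c in events.columns]
    # dropna=False keeps combinations where one of the fields is empty.
    return events.loc[mask].groupby(cols, dropna=False).size().reset_index(name="Count")


# Made-up rows that mirror the situation described in the post.
sample = pd.DataFrame(
    {
        "Computer": ["winhost01"] * 3 + ["apidevdc01.example.com"],
        "EventID": [4648, 4648, 4648, 4625],
        "TargetUserName": ["api_admin"] * 4,
        "IpAddress": ["10.128.0.25"] * 4,
        "TargetServerName": ["apidevdc01.example.com"] * 3 + [None],
    }
)
summary = summarize_explicit_logons(sample, "api_admin")
print(summary)
```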
Now, let’s search for Event 4688: A new process has been created.
It’s time to investigate what happened on winhost01 before, during, and after the failed login attempts. We’ll review events with ID 4688 generated on winhost01. The search excludes what we know are benign (normal) processes, which in our case contain SplunkUniversalForwarder in the NewProcessName field. The search groups the results based on the content of the field CommandLine. To limit the number of results, you can tune the search in the following listing to exclude other known normal processes.
Searching events for Computer = winhost01 and EventID=4688
if "df" in locals():
    # Filter for the specific computer and event ID
    winhost_4688 = df[(df["Computer"] == "winhost01") & (df["EventID"] == 4688)]
    # Exclude the Splunk forwarder process
    # The '~' inverts the selection, effectively removing these processes
    # .str.contains is used to find the substring; na=False prevents errors on empty cells
    filtered_events = winhost_4688[
        ~winhost_4688["NewProcessName"].str.contains(
            "SplunkUniversalForwarder", na=False
        )
    ]
    if not filtered_events.empty:
        # Group by the CommandLine and count occurrences
        commandline_counts = (
            filtered_events.groupby(["CommandLine"]).size().reset_index(name="Count")
        )
        # Sort for better readability
        commandline_counts = commandline_counts.sort_values(by="Count", ascending=False)
        print(
            "Command Line execution counts on 'winhost01' for EventID 4688 (Splunk excluded):"
        )
        display(commandline_counts)
    else:
        print("No matching events found after filtering.")
else:
    print("DataFrame 'df' not found. Please run the data loading cell (Cell 1) first.")
We get 79 entries, and looking through them, we notice some really suspicious commands, such as mimikatz.exe. After a bit of data manipulation (you can see the complete code in the notebook), here’s the output containing a summary of the CommandLine field values, which is the same result we have in the book:
index | @timestamp | CommandLine |
---|---|---|
2204 | 2023-12-05-02:20:21 | "C:\Users\user01\Downloads\mimikatz.exe" |
2337 | 2023-12-05-02:22:31 | "C:\Windows\system32\whoami.exe" |
2371 | 2023-12-05-02:22:48 | "C:\Users\user01\Downloads\mimikatz.exe" privilege::debug sekurlsa::logonpasswords exit |
2372 | 2023-12-05-02:22:49 | C:\Windows\system32\wbem\wmiprvse.exe -secured -Embedding |
2480 | 2023-12-05-02:24:13 | consent.exe 1200 468 00000226C6FA6F50 |
2481 | 2023-12-05-02:24:14 | atbroker.exe |
2484 | 2023-12-05-02:24:14 | "C:\Windows\System32\Sethc.exe" /AccessibilitySoundAgent |
2486 | 2023-12-05-02:24:17 | atbroker.exe |
2606 | 2023-12-05-02:26:27 | "C:\Windows\system32\ipconfig.exe" |
645 | 2023-12-05-02:28:52 | "C:\Windows\system32\ARP.EXE" -a |
672 | 2023-12-05-02:29:06 | "C:\Windows\system32\PING.EXE" 10.128.0.11 |
678 | 2023-12-05-02:29:14 | "C:\Windows\system32\PING.EXE" 10.128.0.14 |
700 | 2023-12-05-02:29:37 | "C:\Windows\system32\PING.EXE" 10.128.0.24 |
710 | 2023-12-05-02:29:55 | "C:\Windows\System32\CredentialUIBroker.exe" NonAppContainerFailedMip -Embedding |
714 | 2023-12-05-02:30:00 | "C:\Windows\System32\CredentialUIBroker.exe" NonAppContainerFailedMip -Embedding |
723 | 2023-12-05-02:30:03 | "C:\Windows\System32\CredentialUIBroker.exe" NonAppContainerFailedMip -Embedding |
796 | 2023-12-05-02:31:16 | "C:\Windows\System32\CredentialUIBroker.exe" NonAppContainerFailedMip -Embedding |
From this result we can build a timeline of events and gain a solid understanding of what the adversary has done.
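Building that timeline mostly comes down to sorting by time instead of by row index: in the table above, index 2606 is followed by index 645, so file order and time order clearly differ. Here is a small sketch, with `build_timeline` as a hypothetical helper and three command lines taken from the table (timestamps rewritten in the plain `YYYY-MM-DD HH:MM:SS` form):

```python
import pandas as pd


def build_timeline(events: pd.DataFrame) -> pd.DataFrame:
    """Order process-creation events by time and show the gap between them."""
    timeline = events[["@timestamp", "CommandLine"]].copy()
    timeline["@timestamp"] = pd.to_datetime(timeline["@timestamp"], errors="coerce")
    # Sort by time, not by row index: the index reflects file order only.
    timeline = timeline.dropna(subset=["@timestamp"]).sort_values("@timestamp")
    timeline["SincePrevious"] = timeline["@timestamp"].diff()
    return timeline.reset_index(drop=True)


# Three command lines from the table above, deliberately out of order.
sample = pd.DataFrame(
    {
        "@timestamp": [
            "2023-12-05 02:28:52",
            "2023-12-05 02:20:21",
            "2023-12-05 02:22:31",
        ],
        "CommandLine": [
            '"C:\\Windows\\system32\\ARP.EXE" -a',
            '"C:\\Users\\user01\\Downloads\\mimikatz.exe"',
            '"C:\\Windows\\system32\\whoami.exe"',
        ],
    }
)
timeline = build_timeline(sample)
print(timeline.to_string(index=False))
```

The `SincePrevious` column makes pauses visible, which helps separate phases such as credential dumping and network discovery.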
Here I quote:
The adversary executed Mimikatz, and if they had enough privileges, they would be able to retrieve the credentials for api_admin. They tested the account against an RDP server, apidevdc01.example.com.
I quote the following paragraph exactly, as I couldn’t explain it better:
Mimikatz is a powerful, well-known open source tool used primarily for Windows security research, forensics, and penetration testing. The Mimikatz command sekurlsa::logonpasswords extracts various types of credentials that are stored in memory, including the following:
- Plain-text passwords—These passwords are the actual ones typed by users and may be present in memory under certain configurations.
- New Technology LAN Manager (NTLM) hashes—Windows stores password equivalents in the form of NTLM hashes, which can be used for certain types of attacks, such as Pass-the-Hash.
- Kerberos tickets—In environments that use Kerberos for authentication, ticket-granting tickets and service tickets can be extracted.
- Other authentication data—This data might include personal identification numbers (PINs), encrypted keys, and other forms of authentication credentials.
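Since these Mimikatz commands show up in plain text in the CommandLine field, a simple keyword filter is enough to surface them among the 79 entries. The sketch below is illustrative only: `SUSPICIOUS_PATTERNS` and `flag_suspicious` are hypothetical names, and the pattern list is a starting point that a renamed binary would easily evade.

```python
import re

import pandas as pd

# Illustrative indicators only; extend or replace them for your environment.
SUSPICIOUS_PATTERNS = ["mimikatz", "sekurlsa::", "privilege::debug"]


def flag_suspicious(events: pd.DataFrame, patterns=SUSPICIOUS_PATTERNS) -> pd.DataFrame:
    """Return rows whose CommandLine contains any of the given substrings."""
    # re.escape keeps characters such as ':' or '.' from acting as regex syntax.
    regex = "|".join(re.escape(p) for p in patterns)
    command_lines = events["CommandLine"].astype("string")
    mask = command_lines.str.contains(regex, case=False, regex=True, na=False)
    return events[mask]


# Command lines taken from the table earlier in this post, plus one empty value.
sample = pd.DataFrame(
    {
        "CommandLine": [
            '"C:\\Users\\user01\\Downloads\\mimikatz.exe"',
            '"C:\\Windows\\system32\\whoami.exe"',
            '"C:\\Users\\user01\\Downloads\\mimikatz.exe" privilege::debug sekurlsa::logonpasswords exit',
            '"C:\\Windows\\system32\\ipconfig.exe"',
            None,
        ]
    }
)
flagged = flag_suspicious(sample)
print(flagged["CommandLine"].tolist())
```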
So, finally, we can conclude that we have gathered enough information to confirm that someone has gotten into the network. The confirmed compromised host is winhost01, but it is reasonable to think that others might have been compromised too.
Exercises
- What is the local IP address of the AD domain controller, apidevdc01.example.com?
if "df" in locals():
    # Ensure IpAddress is string so .str methods work safely
    df["IpAddress"] = df["IpAddress"].astype(str)
    # Build the mask:
    mask = (df["Computer"] == "apidevdc01.example.com") & (
        df["IpAddress"].str.startswith("10.", na=False)
    )
    # Apply filter
    ipAddress_df = df.loc[
        mask, ["Computer", "IpAddress", "@timestamp", "CommandLine", "EventID"]
    ]
    if not ipAddress_df.empty:
        # Group by IpAddress and count rows
        grouped_results = (
            ipAddress_df.groupby("IpAddress")
            .size()
            .reset_index(name="Count")
            .sort_values(by="Count", ascending=False)
        )
        print("Counts of events by IP address (10.*.*.*) on 'apidevdc01.example.com':")
        display(grouped_results)
    else:
        print(
            "No records found for Computer = 'apidevdc01.example.com' with IP starting 10.*.*.*"
        )
else:
    print("DataFrame 'df' not found. Please run the data loading cell (Cell 1) first.")
Result:
Counts of events by IP address (10.*.*.*) on 'apidevdc01.example.com':
index | IpAddress | Count |
---|---|---|
0 | 10.128.0.24 | 18 |
1 | 10.128.0.25 | 4 |
Checking events associated with each IP address reveals that IP address 10.128.0.24 is associated with apidevdc01.example.com, whereas IP address 10.128.0.25 is associated with winhost01. The following listing shows a sample Windows event with Event 4624 showing 10.128.0.24 and apidevdc01.example.com. We can confirm that the IP address of the AD domain controller, apidevdc01.example.com, is 10.128.0.24.
- The adversary created an account on the compromised host, winhost01. What is the account name?
if "df" in locals():
    # Filter for account-creation events (Event ID 4720) on the compromised host
    adversary_events = df[(df["Computer"] == "winhost01") & (df["EventID"] == 4720)]
    if not adversary_events.empty:
        # Show the name of each newly created account, then the full events
        display(adversary_events["TargetUserName"])
        display(adversary_events)
    else:
        print("No Event ID 4720 records found on winhost01")
else:
    print("DataFrame 'df' not found. Please run the data loading cell (Cell 1) first.")
The result is a single event (shown here from adversary_events with the all-NaN columns removed and split into two tables):
index | @timestamp | @sourcetype | EventRecordID | ProcessID |
---|---|---|---|---|
1019 | 2023-12-05 02:34:14.163714886 | XmlWinEventLog:Security | 75059 | 624 |
Computer | EventID | TargetUserName | TargetDomainName | TargetSid |
---|---|---|---|---|
winhost01 | 4720 | api_test | WINHOST01 | WINHOST01\api_test |
In the book, the single Event 4720 is presented with the following information:
- Who created the user account (user01) -> Not in the table
- The new account’s name (api_test) -> OK
- Where the account was created (winhost01) -> OK
It seems this result lacks a bit of information compared to the result shown in the book.
The differences are probably due to the author using the JSON file as the dataset, so small differences in the data could show up in the analysis.
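In a Windows Event 4720, the account that performed the creation is recorded in the Subject fields, so the creator would normally be found in a `SubjectUserName` column. Whether that column survives in the CSV export is something I can only assume, so this sketch (with `describe_account_creation` as a hypothetical helper and a made-up sample that mirrors the book's values) uses `reindex` to keep a stable layout and show NaN when the field is missing.

```python
import pandas as pd


def describe_account_creation(events: pd.DataFrame, computer: str) -> pd.DataFrame:
    """Show creator and new account for each Event 4720 on one computer."""
    created = events[(events["Computer"] == computer) & (events["EventID"] == 4720)]
    wanted = ["@timestamp", "SubjectUserName", "TargetUserName", "Computer"]
    # reindex keeps the layout stable: a column missing from the export becomes NaN.
    return created.reindex(columns=wanted)


# Made-up row mirroring the values reported in the book.
sample = pd.DataFrame(
    {
        "@timestamp": ["2023-12-05 02:34:14.163714886"],
        "Computer": ["winhost01"],
        "EventID": [4720],
        "SubjectUserName": ["user01"],
        "TargetUserName": ["api_test"],
    }
)
creation = describe_account_creation(sample, "winhost01")
print(creation.to_string(index=False))

# Same call on an export that lacks the creator column.
without_creator = describe_account_creation(
    sample.drop(columns=["SubjectUserName"]), "winhost01"
)
print(without_creator["SubjectUserName"].isna().all())
```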
- What local user group was the account added to?
if "df" in locals():
    # Filter for Event ID 4732 (member added to security-enabled local group)
    # on the compromised host winhost01
    group_events = df[(df["Computer"] == "winhost01") & (df["EventID"] == 4732)]
    if not group_events.empty:
        print("All Event ID 4732 events on winhost01:")
        print("=" * 50)
        # Display all available columns for these events to understand the structure
        display(group_events)
        print("\nAnalyzing group assignment events:")
        print("=" * 40)
        # Since we know api_test was created around the same time, let's look at the timing
        # and see if we can correlate with when the account was created
        # First, let's find when api_test was created (Event ID 4720)
        account_creation = df[
            (df["Computer"] == "winhost01")
            & (df["EventID"] == 4720)
            & (df["TargetUserName"] == "api_test")
        ]
        if not account_creation.empty:
            creation_time = account_creation["@timestamp"].iloc[0]
            print(f"api_test account was created at: {creation_time}")
        # Look for group assignments shortly after account creation
        print("\nGroup assignment events on winhost01:")
        for idx, row in group_events.iterrows():
            print(f"- Group: {row['TargetUserName']}")
            print(f" Timestamp: {row['@timestamp']}")
            print(f" Event Record ID: {row['EventRecordID']}")
            print()
        # Answer the question based on the context
        print("Based on the threat hunting context:")
        print(
            "The adversary likely added the api_test account to the 'Administrators' group"
        )
        print("for privilege escalation purposes.")
    else:
        print("No Event ID 4732 records found on winhost01")
else:
    print("DataFrame 'df' not found. Please run the data loading cell (Cell 1) first.")
Here’s the output of the last code snippet (the tables are split for readability, and the all-NaN columns are removed):
All Event ID 4732 events on winhost01:
==================================================
index | @timestamp | @sourcetype | EventRecordID | ProcessID |
---|---|---|---|---|
1023 | 2023-12-05 02:34:14.177087069 | XmlWinEventLog:Security | 75063 | 624 |
1045 | 2023-12-05 02:34:33.446849108 | XmlWinEventLog:Security | 75071 | 624 |
Computer | EventID | TargetUserName | TargetDomainName | TargetSid |
---|---|---|---|---|
winhost01 | 4732 | Users | Builtin | BUILTIN\Users |
winhost01 | 4732 | Administrators | Builtin | BUILTIN\Administrators |
Analyzing group assignment events:
========================================
api_test account was created at: 2023-12-05 02:34:14.163714886
Group assignment events on winhost01:
- Group: Users
Timestamp: 2023-12-05 02:34:14.177087069
Event Record ID: 75063
- Group: Administrators
Timestamp: 2023-12-05 02:34:33.446849108
Event Record ID: 75071
Based on the threat hunting context:
The adversary likely added the api_test account to the 'Administrators' group
for privilege escalation purposes.
In this output, we see events with ID 4732 (a member was added to a security-enabled local group). Checking the content of these events shows that the user was added to the Administrators and Users groups.
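My snippet above relies on timing and prints its conclusion as fixed text. A more direct check is to match the member recorded in each 4732 event against the account created by the 4720 event. In Windows, Event 4732 carries the added member in a `MemberSid` field; here I assume it is present in the export and uses the same representation as `TargetSid` does for the 4720 event. `groups_for_new_accounts` is a hypothetical helper, and the `MemberSid` values and the last sample row are made up.

```python
import pandas as pd


def groups_for_new_accounts(events: pd.DataFrame, computer: str) -> pd.DataFrame:
    """Pair each created account (4720) with the groups it was added to (4732)."""
    on_host = events[events["Computer"] == computer]
    created = on_host.loc[on_host["EventID"] == 4720, ["TargetSid", "TargetUserName"]]
    created = created.rename(columns={"TargetSid": "MemberSid", "TargetUserName": "Account"})
    added = on_host.loc[
        on_host["EventID"] == 4732, ["@timestamp", "MemberSid", "TargetUserName"]
    ]
    added = added.rename(columns={"TargetUserName": "Group"})
    # Inner join: keep only group additions whose member is a newly created account.
    merged = added.merge(created, on="MemberSid", how="inner")
    return merged[["@timestamp", "Account", "Group"]]


# Sample based on the output above; MemberSid values and the last row are made up.
sample = pd.DataFrame(
    {
        "@timestamp": [
            "2023-12-05 02:34:14.163",
            "2023-12-05 02:34:14.177",
            "2023-12-05 02:34:33.446",
            "2023-12-05 02:40:00.000",
        ],
        "Computer": ["winhost01"] * 4,
        "EventID": [4720, 4732, 4732, 4732],
        "TargetUserName": ["api_test", "Users", "Administrators", "Backup Operators"],
        "TargetSid": [
            "WINHOST01\\api_test",
            "BUILTIN\\Users",
            "BUILTIN\\Administrators",
            "BUILTIN\\Backup Operators",
        ],
        "MemberSid": [
            None,
            "WINHOST01\\api_test",
            "WINHOST01\\api_test",
            "WINHOST01\\someone_else",
        ],
    }
)
membership = groups_for_new_accounts(sample, "winhost01")
print(membership.to_string(index=False))
```

With this join, a group change that involves some other account drops out of the result instead of being attributed to api_test by proximity in time.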
Conclusion
I finished reading this book a while ago, but I still wanted to share my notes on this last chapter.
Reading technical books is essential for my growth, but reading is not enough. Sharing my notes, my code and my attempts with others to demonstrate my skills is an interesting way to get the most out of a book.
It was a pleasure, see you again soon!