I-SOON Leak Analysis using Python and Generative AI

Author: Thomas Roccia | @fr0gger_

Introduction

Analyzing leaked data can be a tedious task, especially if it's written in a foreign language. Luckily, with Python, it's possible to rapidly automate this process.

In this notebook, we analyze the data leak from I-Soon, which provides sensitive information about potential Chinese espionage capabilities. The leak is an interesting use case because most of the data consists of PNG files, so extracting the text automatically requires OCR. The leak is available here: https://github.com/I-S00N/I-S00N

The leak contains txt, log, md and png files. This notebook focuses on the PNG files, which make up most of the data.

The goal of this notebook is to provide the tools and workflow to let you analyze this kind of data by yourself.

Let's dive deep.

Disclaimer

Please use the data available in this notebook "as is". This document outlines a methodology for analyzing this kind of data and should not be considered an intelligence report.

The output provided may require additional verification due to possible inaccuracies in the translation or limitations inherent to LLM technologies.

Nevertheless, this document provides an initial step for analyzing leaked data, particularly PNG files in a foreign language.

Requirements

In [4]:
#!pip install Pillow pytesseract
#!pip install googletrans # Be careful: the HTTPX version used by this library may conflict with the one used by OpenAI
#!pip install openai
#!pip install bokeh

# You also need an OpenAI API key
api_key = "sk-"
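
A key pasted into a notebook is easy to leak when the notebook is shared. As a minimal sketch, the key could instead be read from an environment variable. The variable name OPENAI_API_KEY and the helper load_api_key are assumptions made for this illustration and are not part of the original notebook.

```python
import os

def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is unset or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the notebook.")
    return key

# Usage (uncomment once the variable is set in your shell):
# api_key = load_api_key()
```

With this approach the notebook can be committed or shared without editing the cell first.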

Analyzing the data

As always, the first thing to do before jumping into the data is to spend time understanding what kind of information we have: the structure, the format, the number of documents, and so on.

This is a crucial step to ensure you analyze your data in the right way. Since we focus on the PNG files, let's count how many we have in the repository.

In [49]:
import os
from collections import Counter

# Passing the directory 
image_directory = '0'
files = os.listdir(image_directory)

# Extracting file extensions and counting occurrences
file_extensions = [os.path.splitext(file)[1] for file in files]
extensions_count = Counter(file_extensions)

# Printing statistics about file types
print("File Type numbers:")
for ext, count in extensions_count.items():
    print(f"{ext if ext else 'No Extension'}: {count}")
File Type numbers:
.md: 70
.png: 489
.log: 6
.txt: 11
No Extension: 1

Let's create a pie chart and a bar chart to visualize the distribution.
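
The chart cells use hard-coded counts. As a hedged sketch, the same dictionary could be derived from the Counter computed earlier, so the charts stay in sync if the folder changes. The helper name counts_for_plot and the 'other' label for files without an extension are assumptions chosen to mirror the values used in this notebook.

```python
from collections import Counter

def counts_for_plot(extensions_count):
    """Turn a Counter keyed by '.ext' into {'ext': count}, largest count first.

    Files without an extension (empty key) are grouped under 'other'.
    """
    result = {}
    for ext, count in extensions_count.most_common():
        label = ext.lstrip(".").lower() if ext else "other"
        result[label] = result.get(label, 0) + count
    return result

# Example using the numbers printed by the previous cell
example = Counter({".md": 70, ".png": 489, ".log": 6, ".txt": 11, "": 1})
derived_counts = counts_for_plot(example)
print(derived_counts)
```

In the notebook itself, extensions_count from the counting cell would be passed in place of the example Counter.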

In [62]:
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.transform import cumsum
from bokeh.models import ColumnDataSource, HoverTool
from math import pi
import pandas as pd

output_notebook()

file_type_counts = {'png': 489, 'md': 70, 'log': 6, 'txt': 11, 'other': 1}

data = pd.Series(file_type_counts).reset_index(name='value').rename(columns={'index': 'file_type'})
data['angle'] = data['value']/data['value'].sum() * 2*pi
data['color'] = ['#0999d3', '#718dbf', '#e84d60', '#ddb7b1', '#ddb777']

# Pie chart
p = figure(height=350, title="File Type Distribution", toolbar_location=None,
           tools="hover", tooltips="@file_type: @value", x_range=(-0.5, 1.0))

p.wedge(x=0, y=1, radius=0.4,
        start_angle=cumsum('angle', include_zero=True), end_angle=cumsum('angle'),
        line_color="white", fill_color='color', legend_field='file_type', source=data)

p.axis.axis_label = None
p.axis.visible = False
p.grid.grid_line_color = None

show(p)
In [65]:
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
import pandas as pd

output_notebook()

data = pd.DataFrame(list(file_type_counts.items()), columns=['file_type', 'count'])

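# Convert the DataFrame to a ColumnDataSource
source = ColumnDataSource(data)

# Create figure (`height` is used instead of `plot_height`, which newer Bokeh releases no longer accept)
p = figure(x_range=list(data['file_type']), height=250, title="File Type Distribution",
           toolbar_location=None, tools="")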

# Add vertical bars 
p.vbar(x='file_type', top='count', width=0.9, source=source, legend_field="file_type")

# Set attributes
p.xgrid.grid_line_color = None
p.y_range.start = 0
p.yaxis.axis_label = "Count"
p.xaxis.axis_label = "File Type"
p.legend.orientation = "horizontal"
p.legend.location = "top_center"

# Display the plot
show(p)
BokehJS 2.4.3 successfully loaded.

To give you an example of the kind of information available, let's take a look at one of the images.

[Image: example screenshot from the leak]

This one is interesting because the screenshot contains text, images and diagrams. Other images are screenshots of discussions or of Windows folders, which are harder to analyze without context and will require additional analysis. For now, let's focus on extracting the text from our images.

Extracting and translating text for one image

Now that we have a clearer view of the distribution and know a little more about the content of the data, we are going to extract the text using OCR.

In [1]:
import os
# Set the TESSDATA_PREFIX 
os.environ['TESSDATA_PREFIX'] = 'tessdata-main'
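
Before running OCR, it can help to confirm that the simplified Chinese language data is actually present. The sketch below assumes that TESSDATA_PREFIX points directly at the folder holding the .traineddata files, which is how recent Tesseract versions interpret it. The helper has_language_data is hypothetical and only checks for the file name.

```python
import os

def has_language_data(tessdata_dir, lang="chi_sim"):
    """Return True if '<lang>.traineddata' exists inside tessdata_dir."""
    return os.path.isfile(os.path.join(tessdata_dir, f"{lang}.traineddata"))

# Fall back to the folder name used in this notebook if the variable is unset
tessdata_dir = os.environ.get("TESSDATA_PREFIX", "tessdata-main")
if not has_language_data(tessdata_dir):
    print(f"chi_sim.traineddata not found in {tessdata_dir!r}; OCR with lang='chi_sim' will likely fail.")
```

A check like this fails fast, which is more convenient than discovering a missing language file in the middle of a long batch run.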

Analyzing one image at a time

In [2]:
from PIL import Image
import pytesseract
from googletrans import Translator, LANGUAGES

# Load the PNG image
image_path = '0/12756724-394c-4576-b373-7c53f1abbd94_15.png'
image = Image.open(image_path)

# Use Tesseract to do OCR on the image
text = pytesseract.image_to_string(image, lang='chi_sim')  # 'chi_sim' for simplified Chinese

print("Extracted Text:", text)

# Translate into English
translator = Translator()
translated_text = translator.translate(text, dest='en')
print("Translated Text:", translated_text.text)
Extracted Text: 专 业 的 数 孙 情 报 解 决 方 案 提 供 商

1. 6.5 “ 产 品 图 片

噱 | 鱼 黑 | 关 黑 心 日 - 心 煜 乔 息 -| 里

小 林 5 ( 怡 织 ; 耶 fL; 电 量 ; 648; 弈 幕 亮 阮
白 pnas Q 晓 士 R 关 Hnars a

E E e J 万
-
一 3 万 = E ES3
5 “ 二

snasriaa ]ngutumsape

E ESE3SH

i E

E H

E 圆

Rs E

E 5

E 王

E 芸

〈Android 远 程 控 制 管 理 系 统 界 面 图 )

1.7 “Linux 远 程 控 制 管 理 系 统

1. 7. 1 “ 产 品 简 介
Linux 远 程 控 制 系 统 是 一 款 针 对 Linux 系 统 , 可 对 其 进 行 远 程 控 制 和 取 证 设
备 信 息 的 系 统 。
系 统 主 要 通 过 将 设 置 好 的 服 务 端 安 装 到 目 标 主 机 上 , 上 线 后 通 过 控 制 端 的 操
作 对 目 标 主 机 进 行 控 制 。 支 持 正 向 连 接 和 反 向 连 接 两 种 上 线 朱 式 。

连 接 模 式

(Linux 远 程 控 制 系 统 运 行 形 态 图 )

安 淘 信 息 技 术 有 限 公 司 16150

Translated Text: Division of the Digital Grandson Reporting Correctional Corporation

1. 6.5 "Product Graphics

Puppet | Fish Black | Guan Heixin Day -Xinyu Qiao Ending -| Li

Kobayashi 5 (Yaori; Ye FL; Electricity; 648; Yi Mu Liang Ruan
White PNAS Q Xiaoshi R Guan HNARS A

E e j Wan
-
One 30,000 = ES3
5 "Two

SNASRIAA] NGutumsape

E eSe3SH

I E

E H

E -circle

RS E

E 5

E king

Ead

"Android remote control management system interface diagram)

1.7 "Linux remote control management system

1. 7. 1 "Product Simplified
Linux remote control system is a native to the linux system, which can be set up for remote control and certification.
The unification of the letter.
The Lord of the Department must pass through the setting of the set of service to the target host, and after online, pass through the control end
Make control of the target host.Two types of pilot and reverse connection of the branch are the same.

Connection

(LINUX remote control system transportation form of unified operation)

Antaoxin Integration Skills There is a limited company 16150

And now we have the image above translated! To process a different image, change the image path in the code above.
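
In the extracted text above, Tesseract separates each Chinese character with a space. One possible cleanup step, sketched below, removes those spaces before the text is sent to the translator. Whether this improves the translation is an assumption that should be checked on a few samples. The helper collapse_cjk_spaces is hypothetical, and the character range covers only the basic CJK Unified Ideographs block.

```python
import re

# Basic CJK Unified Ideographs block (assumed to cover most simplified Chinese text)
_CJK = r"[\u4e00-\u9fff]"
# Spaces or tabs that sit directly between two Chinese characters
_SPACE_BETWEEN_CJK = re.compile(rf"(?<={_CJK})[ \t]+(?={_CJK})")

def collapse_cjk_spaces(text):
    """Remove spaces/tabs between Chinese characters, leaving other spacing intact."""
    return _SPACE_BETWEEN_CJK.sub("", text)

sample = "Linux 远 程 控 制 系 统"
print(collapse_cjk_spaces(sample))  # Linux 远程控制系统
```

Line breaks and the spaces around Latin words are left untouched, so the layout of the OCR output is preserved.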

Processing the images and storing them in a JSON file

So now let's automate the extraction process and create a JSON file that we can query to access the data.
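
One caveat when ordering the files: a plain lexicographic sort places a name ending in _10.png before one ending in _2.png, which can break the page order of a multi-page document. A natural sort key is one way around this. The sketch below is an illustration only, and natural_key is a hypothetical helper that is not part of the original notebook.

```python
import re

def natural_key(file_name):
    """Build a sort key in which digit runs compare as integers.

    re.split with a capturing group alternates text and digit chunks,
    so odd positions always hold the digit runs.
    """
    parts = re.split(r"(\d+)", file_name)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

names = ["a_10.png", "a_2.png", "a_1.png"]
print(sorted(names, key=natural_key))  # ['a_1.png', 'a_2.png', 'a_10.png']
```

Passing key=natural_key to sorted() keeps files that share a prefix together while ordering their numeric suffixes as numbers.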

In [26]:
import os
import json
from PIL import Image
import pytesseract
from googletrans import Translator

# Directory containing PNG files
image_directory = '0'

# Sort the file names so that pages of the same document stay together when possible
files = sorted(os.listdir(image_directory))

translator = Translator()
results = []

for file_name in files:
    # Check if the file is a PNG
    if file_name.endswith('.png'):
        file_path = os.path.join(image_directory, file_name)
        
        # Perform OCR using Tesseract; the context manager closes each image file promptly
        with Image.open(file_path) as image:
            text = pytesseract.image_to_string(image, lang='chi_sim')  # Adjust lang as needed

        if text and not text.isspace():
            print(f"Extracted Text from {file_name}")#, text)
            
            try:
                translated_text = translator.translate(text, dest='en') 
                #print(f"Translated Text from {file_name}:", translated_text.text)
                
                # Store both the original and the translated text for the JSON output
                results.append({
                    'file_name': file_name,
                    'original_text': text,
                    'translated_text': translated_text.text
                })
                
            except Exception as e:
                print(f"Translation failed for {file_name} with error: {e}")
        else:
            print(f"No text found in {file_name} or text is not suitable for translation.")

# Save the json
with open('text_translations2.json', 'w', encoding='utf-8') as f:
    json.dump(results, f, ensure_ascii=False, indent=4)

print("All done! The extracted and translated texts are saved in text_translations2.json, maintaining the original order.")

# The process can take up to 30 min
Extracted Text from 0-08a6bcd3-6477-4252-8f35-4f8f80d114f9.png
Extracted Text from 0-0b54af64-c2cd-4acb-9864-73a584aa6ebc.png
No text found in 0-0baba509-5e81-4b88-b509-843822d09e21.png or text is not suitable for translation.
Extracted Text from 0-0f319bf6-e667-4bac-a974-dfda1142e9ff.png
No text found in 0-129ac70f-8942-4ca7-b1f2-ddeaa3d984b5.png or text is not suitable for translation.
Extracted Text from 0-1a20ded1-50fc-4153-9a95-e158eeb7199e.png
Extracted Text from 0-1afcf93d-50f1-4f1e-896d-87b0da7519f7.png
No text found in 0-1b0dc208-d2bb-43ea-b744-534f3b759394.png or text is not suitable for translation.
No text found in 0-1cc570d8-cddb-401e-8c37-ef10c0e4841f.png or text is not suitable for translation.
Extracted Text from 0-300450bf-221e-4eeb-bdda-dc1115c947ea.png
No text found in 0-32eb7662-f212-4811-a7c1-1cfeb121cd99.png or text is not suitable for translation.
No text found in 0-330f554f-a3e6-4bd3-8b1b-d5949e1f30e8.png or text is not suitable for translation.
Extracted Text from 0-3556e54c-d418-447d-bb2a-43ac0408cc7a.png
Extracted Text from 0-383d824e-7588-4a92-84b7-fd953dd91cba.png
Extracted Text from 0-493542fc-495f-4756-8451-c4ed084d8bf7.png
Extracted Text from 0-4ae9bf34-c16c-4684-aa92-fec65a151275.png
Extracted Text from 0-4c74b697-0681-4223-9982-5ffaf4e98ed0.png
No text found in 0-4ea07c23-a1a6-411b-bcfb-552d095b66c9.png or text is not suitable for translation.
Extracted Text from 0-5a84cde3-7175-4044-8c88-d4c883a8fd38.png
No text found in 0-5ae9bdca-fdf9-4948-8c11-a9e400b331aa.png or text is not suitable for translation.
Extracted Text from 0-5d4e3e02-1dfc-469e-8af9-8dbe2b9f1564.png
Extracted Text from 0-5ef1d666-e19d-4570-b800-6693a4f680ee.png
Extracted Text from 0-62583414-9e32-4d09-8989-b5fa32a98a81.png
Extracted Text from 0-62ff30cf-de5f-4388-82aa-b69b0fd0f07c.png
No text found in 0-645dfc97-3268-4e1d-920d-4138545456fa.png or text is not suitable for translation.
Extracted Text from 0-6848748d-2881-4c26-b153-fcd5373d2f1c.png
Extracted Text from 0-6bcc0131-e4ad-421e-bb1f-d8ebe5eeec7b.png
No text found in 0-6cbb3eeb-17e9-4af6-8da1-36eb6437f7bc.png or text is not suitable for translation.
Extracted Text from 0-6e9aced1-df28-4e57-b7c8-641609ff4450.png
No text found in 0-70c63791-2797-4bf0-a778-ea08819aa9de.png or text is not suitable for translation.
Extracted Text from 0-7150f512-e7a2-4f2c-86bc-58b671b25ba9.png
Extracted Text from 0-785cc8c9-1225-4f93-b633-349bc5113512.png
Extracted Text from 0-79d9b7f2-cfe4-4615-9b75-8fea33fc0c9d.png
Extracted Text from 0-94b16e53-f035-4aa9-a76e-80bc6e936d10.png
Extracted Text from 0-96af60b3-299c-4e26-bca3-d9eb3e113b94.png
Extracted Text from 0-987ba39a-cc1c-4367-8d6d-f5a49a940198.png
Extracted Text from 0-9a8077f5-ac41-491f-b192-6b4609324bda.png
Extracted Text from 0-9c8c9989-2293-4e68-9ffe-6f7a5f14562f.png
Extracted Text from 0-aa99f763-6849-4f6b-adf2-58f0cc2ed545.png
Extracted Text from 0-adaf869e-920a-4a17-91bd-e2ef3125c10e.png
Extracted Text from 0-af93eff8-2973-4746-9041-b2223016b117.png
No text found in 0-b0a4acaa-d768-4f6d-8e54-6d20f271bb7c.png or text is not suitable for translation.
Extracted Text from 0-b3ce4d51-6024-4b43-b0d2-d3faaf3c2879.png
Extracted Text from 0-b6eb1b15-cf99-475c-921f-f06e5c1019d4.png
Extracted Text from 0-b8b76b6d-a50e-4246-82ee-3c8a5dcd523e.png
Extracted Text from 0-b8cea3b1-4dde-4438-9b1a-6faf690bbad0.png
Extracted Text from 0-b9d9c584-5e21-4a49-952b-ffecca4eb91e.png
No text found in 0-bcad4fdf-3771-4873-92fa-23240654118a.png or text is not suitable for translation.
Extracted Text from 0-c5f1d959-39d1-4176-9cb1-1fb6e8baedc3.png
Extracted Text from 0-dd5b6a38-dc17-4122-a242-32006b381b3a.png
Extracted Text from 0-de359f8d-0745-4a93-959a-d1a6c361e326.png
Extracted Text from 0-e07a9457-86f1-4f0f-86d7-8ea816b8d8d3.png
No text found in 0-e705d192-90ee-4fd1-9dcd-061958d1817f.png or text is not suitable for translation.
Extracted Text from 0-ee47dfea-2626-4107-8ab3-4663167e0493.png
No text found in 0-f0ce8a7b-909d-4fc5-ba13-ea66b2dc6448.png or text is not suitable for translation.
Extracted Text from 0-f313f521-80a1-4db5-a8a7-53d29ee09890.png
Extracted Text from 0-f41b7574-57b4-4c9f-907c-2a3c48a56157.png
No text found in 0-fc27ce32-9c96-416c-9c38-84977255e0ba.png or text is not suitable for translation.
Extracted Text from 0-fcf90a92-794c-40c6-aa4f-8ea82f8bed51.png
Extracted Text from 0-fe221e78-67e4-4d88-b73d-e58a9943a036.png
Extracted Text from 01cdc26f-e773-4ad7-8808-d04abf16aae7_1_0.png
Extracted Text from 01cdc26f-e773-4ad7-8808-d04abf16aae7_2_0.png
Extracted Text from 07f179c5-5705-4dbd-94a7-66eed1e066b0_0.png
Extracted Text from 07f179c5-5705-4dbd-94a7-66eed1e066b0_1.png
Extracted Text from 07f179c5-5705-4dbd-94a7-66eed1e066b0_2.png
Extracted Text from 08a6bcd3-6477-4252-8f35-4f8f80d114f9.png
Extracted Text from 0b54af64-c2cd-4acb-9864-73a584aa6ebc.png
Extracted Text from 0baba509-5e81-4b88-b509-843822d09e21.png
Extracted Text from 0f319bf6-e667-4bac-a974-dfda1142e9ff.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_0.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_1.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_10.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_11.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_12.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_13.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_14.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_15.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_16.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_17.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_18.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_19.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_2.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_20.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_21.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_22.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_23.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_24.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_25.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_26.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_27.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_28.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_29.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_3.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_30.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_31.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_32.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_33.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_34.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_35.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_36.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_37.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_38.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_39.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_4.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_40.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_41.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_42.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_43.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_44.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_45.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_46.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_47.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_48.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_49.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_5.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_6.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_7.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_8.png
Extracted Text from 12756724-394c-4576-b373-7c53f1abbd94_9.png
Extracted Text from 129ac70f-8942-4ca7-b1f2-ddeaa3d984b5.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_0.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_1.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_10.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_2.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_3.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_4.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_5.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_6.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_7.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_8.png
Extracted Text from 178e3898-903d-47cf-bfbe-061e7dc18895_9.png
Extracted Text from 1a20ded1-50fc-4153-9a95-e158eeb7199e.png
Extracted Text from 1afcf93d-50f1-4f1e-896d-87b0da7519f7.png
Extracted Text from 1b0dc208-d2bb-43ea-b744-534f3b759394.png
Extracted Text from 1cc570d8-cddb-401e-8c37-ef10c0e4841f.png
Extracted Text from 2db27de1-d5c5-4f89-8572-da697a6329e4_1_0.png
Extracted Text from 2db27de1-d5c5-4f89-8572-da697a6329e4_2_0.png
Extracted Text from 2db27de1-d5c5-4f89-8572-da697a6329e4_2_1.png
Extracted Text from 2db27de1-d5c5-4f89-8572-da697a6329e4_3_0.png
Extracted Text from 2db27de1-d5c5-4f89-8572-da697a6329e4_4_0.png
Extracted Text from 2db27de1-d5c5-4f89-8572-da697a6329e4_5_0.png
Extracted Text from 300450bf-221e-4eeb-bdda-dc1115c947ea.png
Extracted Text from 32eb7662-f212-4811-a7c1-1cfeb121cd99.png
No text found in 330f554f-a3e6-4bd3-8b1b-d5949e1f30e8.png or text is not suitable for translation.
Extracted Text from 3348953d-66e9-4cac-8675-65bb5f2ef929_1_0.png
Extracted Text from 3348953d-66e9-4cac-8675-65bb5f2ef929_2_0.png
Extracted Text from 3348953d-66e9-4cac-8675-65bb5f2ef929_2_1.png
Extracted Text from 3348953d-66e9-4cac-8675-65bb5f2ef929_3_0.png
Extracted Text from 3348953d-66e9-4cac-8675-65bb5f2ef929_4_0.png
Extracted Text from 3348953d-66e9-4cac-8675-65bb5f2ef929_5_0.png
Extracted Text from 3556e54c-d418-447d-bb2a-43ac0408cc7a.png
Extracted Text from 383d824e-7588-4a92-84b7-fd953dd91cba.png
Extracted Text from 3f451a52-d210-48d9-b56e-d28b9570bdc4_0.png
Extracted Text from 48fd4c79-41ca-459e-a5a5-a3738e7a4af3_0.png
Extracted Text from 493542fc-495f-4756-8451-c4ed084d8bf7.png
Extracted Text from 4ae9bf34-c16c-4684-aa92-fec65a151275.png
Extracted Text from 4c74b697-0681-4223-9982-5ffaf4e98ed0.png
Extracted Text from 4ea07c23-a1a6-411b-bcfb-552d095b66c9.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_0.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_1.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_10.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_11.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_2.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_3.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_4.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_5.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_6.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_7.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_8.png
Extracted Text from 5387a301-0af8-4e24-a197-20189f87b9ef_9.png
Extracted Text from 547aba02-6757-49c1-acb5-6df217cebfc7_0.png
Extracted Text from 547aba02-6757-49c1-acb5-6df217cebfc7_1.png
Extracted Text from 547aba02-6757-49c1-acb5-6df217cebfc7_2.png
Extracted Text from 547aba02-6757-49c1-acb5-6df217cebfc7_3.png
Extracted Text from 54990932-71af-48dd-9a7a-2617b1407c54_0.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_0.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_1.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_10.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_11.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_12.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_13.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_14.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_15.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_16.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_17.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_18.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_19.png
No text found in 585875ff-f8c5-4a02-acd7-fef37dc9ff11_2.png or text is not suitable for translation.
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_3.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_4.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_5.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_6.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_7.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_8.png
Extracted Text from 585875ff-f8c5-4a02-acd7-fef37dc9ff11_9.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_0.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_1.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_10.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_11.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_12.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_13.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_14.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_15.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_16.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_17.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_18.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_19.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_2.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_20.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_21.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_22.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_23.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_24.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_25.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_26.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_27.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_28.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_29.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_3.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_30.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_31.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_32.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_33.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_34.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_35.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_36.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_37.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_38.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_39.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_4.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_40.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_41.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_42.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_43.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_5.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_6.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_7.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_8.png
Extracted Text from 5a6b122c-39c1-4581-8c1f-2d6f36a9f8a0_9.png
Extracted Text from 5a84cde3-7175-4044-8c88-d4c883a8fd38.png
No text found in 5ae9bdca-fdf9-4948-8c11-a9e400b331aa.png or text is not suitable for translation.
Extracted Text from 5d4e3e02-1dfc-469e-8af9-8dbe2b9f1564.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_0.png
No text found in 5e5bd90e-60c5-402f-b488-750456a81a13_1.png or text is not suitable for translation.
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_10.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_11.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_12.png
No text found in 5e5bd90e-60c5-402f-b488-750456a81a13_2.png or text is not suitable for translation.
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_3.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_4.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_5.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_6.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_7.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_8.png
Extracted Text from 5e5bd90e-60c5-402f-b488-750456a81a13_9.png
Extracted Text from 5ef1d666-e19d-4570-b800-6693a4f680ee.png
Extracted Text from 62583414-9e32-4d09-8989-b5fa32a98a81.png
No text found in 62ff30cf-de5f-4388-82aa-b69b0fd0f07c.png or text is not suitable for translation.
Extracted Text from 645dfc97-3268-4e1d-920d-4138545456fa.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_0.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_1.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_10.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_11.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_12.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_2.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_3.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_4.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_5.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_6.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_7.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_8.png
Extracted Text from 64bba692-d430-440c-9f1e-2575f45770af_9.png
Extracted Text from 6848748d-2881-4c26-b153-fcd5373d2f1c.png
Extracted Text from 6bcc0131-e4ad-421e-bb1f-d8ebe5eeec7b.png
Extracted Text from 6cbb3eeb-17e9-4af6-8da1-36eb6437f7bc.png
Extracted Text from 6d7fc7b3-c892-4cb5-bd4b-a5713c089d88_0.png
Extracted Text from 6e9aced1-df28-4e57-b7c8-641609ff4450.png
Extracted Text from 70c63791-2797-4bf0-a778-ea08819aa9de.png
No text found in 7150f512-e7a2-4f2c-86bc-58b671b25ba9.png or text is not suitable for translation.
Extracted Text from 785cc8c9-1225-4f93-b633-349bc5113512.png
Extracted Text from 79d9b7f2-cfe4-4615-9b75-8fea33fc0c9d.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_0.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_1.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_10.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_11.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_12.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_2.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_3.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_4.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_5.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_6.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_7.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_8.png
Extracted Text from 912204cb-8ab7-48b8-9abf-d803f3804d08_9.png
Extracted Text from 94b16e53-f035-4aa9-a76e-80bc6e936d10.png
Extracted Text from 96af60b3-299c-4e26-bca3-d9eb3e113b94.png
Extracted Text from 987ba39a-cc1c-4367-8d6d-f5a49a940198.png
Extracted Text from 9a8077f5-ac41-491f-b192-6b4609324bda.png
No text found in 9c8c9989-2293-4e68-9ffe-6f7a5f14562f.png or text is not suitable for translation.
Extracted Text from 9d7bc879-3250-4013-ac04-5ff9bd6dff40_0.png
Extracted Text from 9d7bc879-3250-4013-ac04-5ff9bd6dff40_1.png
Extracted Text from 9fd06037-11f1-4ad5-9a7d-cbfb3fa4193b_0.png
Extracted Text from 9fd06037-11f1-4ad5-9a7d-cbfb3fa4193b_1.png
Extracted Text from 9fd06037-11f1-4ad5-9a7d-cbfb3fa4193b_2.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_0.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_1.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_10.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_11.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_12.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_13.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_14.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_15.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_16.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_17.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_18.png
No text found in 9fe6b262-9944-417d-a0c4-9f2de1de2994_2.png or text is not suitable for translation.
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_3.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_4.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_5.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_6.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_7.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_8.png
Extracted Text from 9fe6b262-9944-417d-a0c4-9f2de1de2994_9.png
Extracted Text from a1ba4d8b-f382-44c4-ac3f-746a44746bb4_0.png
Extracted Text from a1ba4d8b-f382-44c4-ac3f-746a44746bb4_1.png
Extracted Text from aa99f763-6849-4f6b-adf2-58f0cc2ed545.png
Extracted Text from adaf869e-920a-4a17-91bd-e2ef3125c10e.png
Extracted Text from aedc6a39-7862-4bbc-99e7-780ab3980282_1_0.png
Extracted Text from aedc6a39-7862-4bbc-99e7-780ab3980282_2_0.png
Extracted Text from aedc6a39-7862-4bbc-99e7-780ab3980282_2_1.png
Extracted Text from aedc6a39-7862-4bbc-99e7-780ab3980282_3_0.png
Extracted Text from aedc6a39-7862-4bbc-99e7-780ab3980282_4_0.png
Extracted Text from aedc6a39-7862-4bbc-99e7-780ab3980282_4_1.png
Extracted Text from af93eff8-2973-4746-9041-b2223016b117.png
No text found in b0a4acaa-d768-4f6d-8e54-6d20f271bb7c.png or text is not suitable for translation.
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_0.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_1.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_10.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_11.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_12.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_13.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_14.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_15.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_16.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_17.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_18.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_19.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_2.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_20.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_21.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_22.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_23.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_24.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_3.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_4.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_5.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_6.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_7.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_8.png
Extracted Text from b3031e66-40b6-45e8-9bcd-891dc1a280da_9.png
Extracted Text from b3ce4d51-6024-4b43-b0d2-d3faaf3c2879.png
Extracted Text from b6eb1b15-cf99-475c-921f-f06e5c1019d4.png
Extracted Text from b8b76b6d-a50e-4246-82ee-3c8a5dcd523e.png
Extracted Text from b8cea3b1-4dde-4438-9b1a-6faf690bbad0.png
Extracted Text from b9d9c584-5e21-4a49-952b-ffecca4eb91e.png
Extracted Text from bcad4fdf-3771-4873-92fa-23240654118a.png
Extracted Text from c5f1d959-39d1-4176-9cb1-1fb6e8baedc3.png
Extracted Text from d410e4aa-fb52-4ed4-9078-4483267a02b3_0.png
Extracted Text from d410e4aa-fb52-4ed4-9078-4483267a02b3_1.png
Extracted Text from d410e4aa-fb52-4ed4-9078-4483267a02b3_2.png
Extracted Text from d410e4aa-fb52-4ed4-9078-4483267a02b3_3.png
Extracted Text from d410e4aa-fb52-4ed4-9078-4483267a02b3_4.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_0.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_1.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_10.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_11.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_12.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_13.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_14.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_15.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_16.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_17.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_18.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_19.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_2.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_20.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_21.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_3.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_4.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_5.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_6.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_7.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_8.png
Extracted Text from d5ff8b65-db15-418a-b33e-169498d79110_9.png
Extracted Text from dbc9c90e-a3e6-4d71-bb93-5fb8394095ac_0.png
Extracted Text from dd5b6a38-dc17-4122-a242-32006b381b3a.png
No text found in de359f8d-0745-4a93-959a-d1a6c361e326.png or text is not suitable for translation.
No text found in e07a9457-86f1-4f0f-86d7-8ea816b8d8d3.png or text is not suitable for translation.
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_1_0.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_1_1.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_1_2.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_1_3.png
Translation failed for e182d867-dc18-43fd-a418-26dcf784242f_1_3.png with error: The read operation timed out
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_1_4.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_2_0.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_2_1.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_2_2.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_2_3.png
Extracted Text from e182d867-dc18-43fd-a418-26dcf784242f_3_0.png
Extracted Text from e705d192-90ee-4fd1-9dcd-061958d1817f.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_0.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_1.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_10.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_11.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_12.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_13.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_14.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_2.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_3.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_4.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_5.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_6.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_7.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_8.png
Extracted Text from eda5b003-9250-4913-b724-74cca86240af_9.png
Extracted Text from ee47dfea-2626-4107-8ab3-4663167e0493.png
No text found in f0ce8a7b-909d-4fc5-ba13-ea66b2dc6448.png or text is not suitable for translation.
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_0.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_10.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_11.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_12.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_13.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_14.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_15.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_16.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_17.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_18.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_19.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_2.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_20.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_3.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_4.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_5.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_6.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_7.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_8.png
Extracted Text from f179eb06-0c53-44df-a13f-570be23355bb_9.png
Extracted Text from f313f521-80a1-4db5-a8a7-53d29ee09890.png
Extracted Text from f41b7574-57b4-4c9f-907c-2a3c48a56157.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_0.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_1.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_10.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_11.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_12.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_13.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_14.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_15.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_16.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_17.png
No text found in f7205881-3904-42ec-ab2c-04f36fa24785_2.png or text is not suitable for translation.
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_3.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_4.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_5.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_6.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_7.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_8.png
Extracted Text from f7205881-3904-42ec-ab2c-04f36fa24785_9.png
Extracted Text from fc27ce32-9c96-416c-9c38-84977255e0ba.png
Extracted Text from fcf90a92-794c-40c6-aa4f-8ea82f8bed51.png
Extracted Text from fe221e78-67e4-4d88-b73d-e58a9943a036.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_0.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_1.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_10.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_11.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_12.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_13.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_14.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_15.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_16.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_17.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_18.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_19.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_2.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_3.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_4.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_5.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_6.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_7.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_8.png
Extracted Text from fe245192-1f9c-4f28-9b32-046fb7ce7e1e_9.png
All done! The extracted and translated texts are saved in text_translations_ordered.json, maintaining the original order.

Leveraging Generative AI to Analyze the data

Now let's use generative AI to help us analyze the collected data.

In [1]:
import json
import os

# Load the JSON file containing the translations
with open('text_translations2.json', 'r', encoding='utf-8') as file:
    data = json.load(file)
    
all_translated_texts = ""

# Loop through each item in the data list
for item in data:
    # Append the translated text to the all_translated_texts variable
    all_translated_texts += item['translated_text'] + "\n\n"

len(all_translated_texts)
Out[1]:
507799
In [2]:
all_translated_texts[:1000]
Out[2]:
'Euuui\n\nEhehytnr\nHHEH\nBTH\ntuattp\ny\'a]> l.] [[_ _ ′] ′ | y "quantity` _ 芗]\nEhar\nE\n\nE\nriver\n\nE\n\nnnarseran\n\\ @e\n\nFor a moment\n\nE\nCountry -1> Country\n\ng\nR2\nn\nH\nF&HSi\n\nE\n\nE\n\nE\n\nShirayuki 50 孛\n\nE\n\nThe department is simply called (the post -response is slow), and now it can be\nIs it overwhelming?\n\nIt has no physical slow\n\nDifferent from the same file service device, separate after coming out\nFu Baosun is\n\nTake the single one by one\n\nNow I look at the internal capacity\n\nWell, we want to buy Kui Box\'s Maotai Manda National Public Consultation\n\nTo be honest, this point is taken before, but it is\nThe timeliness is not good, many Dongxi reads today\nThere is a value from the beginning, and it will not be available in two days\n\nAccording to the experience of you, you have updated this one\nWow, please\n\nIs it possible to pass back from time to time?\n\n『 E\n\nS B oesxysc\n4\nLice Diao fork K 虬 K Ji stare at DAVE\n\nEUCEECR\n\nrnecoveas 一\nE\n\nrnee i\nbzscs E\n\na 心\n\nEompr\n\nsncne\n\nAn E\nR\nwater\n\nVR country\n\nZ Stupid Z '
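The first 1,000 characters show how noisy the OCR output is: many lines are only one or two stray characters. Below is a minimal sketch of a pre-filter that could be applied before sending text to the model. The helper name `drop_noisy_lines` and the threshold of three alphabetic words per line are assumptions for illustration, not part of the original notebook, and the threshold would need tuning against the real data.

```python
import re

def drop_noisy_lines(text, min_words=3):
    """Keep only lines containing at least `min_words` alphabetic words of 2+ letters.

    Hypothetical helper: the threshold is an assumption, not a tuned value.
    """
    kept = []
    for line in text.splitlines():
        # Count runs of two or more ASCII letters as "words"
        words = re.findall(r"[A-Za-z]{2,}", line)
        if len(words) >= min_words:
            kept.append(line.strip())
    return "\n".join(kept)

sample = "E\n\nHHEH\nNow I look at the internal capacity\nR2\nIs it overwhelming?"
print(drop_noisy_lines(sample))
```

A filter like this also discards short but meaningful lines such as names, IP addresses, or untranslated Chinese text, so it is safer to keep the raw text alongside the filtered version.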
In [9]:
file_name = data[0]['file_name']  # Adjusted to access the first item in the list
translated_text = data[0]['translated_text']

from openai import OpenAI
os.environ["OPENAI_API_KEY"] = api_key

client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-4-0125-preview",
  max_tokens=4096,
  messages=[
    {"role": "system", "content": "You are a Cyber Threat Intelligence analyst specialized in China operation. You are dedicated to analyzing leaked sensitive information in relation to Chinese espionage capabilities. The data contains multiple format documents, chat conversion, screenshot of products."},
    {"role": "user", "content": f"Make me a summary of this information: {all_translated_texts[:10000]}" }
  ]
)

print(completion.choices[0].message.content)
Based on the provided text, it appears to be a mixture of fragmented data, possibly from a larger intelligence gathering operation related to Chinese espionage or surveillance activities. The text contains various elements such as references to espionage capabilities, data breaches, targeted espionage operations against specific countries, technological vulnerabilities, and potential espionage tools and methods. Here's a succinct summary categorizing the key points:

1. **Cyber Espionage Tools and Vulnerabilities**: Mentions of "Mikrotik's 0day" and "gmail acquisition" suggest discussions or reports on exploiting vulnerabilities in network equipment and email services for intelligence gathering. The question "Is there any related to iOS?" could indicate an interest in vulnerabilities within Apple's operating systems.

2. **Data Breaches and Intelligence Gathering Operations**: The text lists extensive data breaches affecting companies and government entities across various countries, including Myanmar, Vietnam, and potentially more. The data compromised includes personal identifiers, contact information, work details, etc. This suggests a coordinated effort to gather intelligence on individuals and organizations across different sectors.

3. **Espionage Targets and Their Information**: Several references to "country" and specific regions, alongside fragmented technical details, imply a focus on geographic and political entities as intelligence targets. There's also mention of recruiting or observing individuals, possibly for espionage or influence operations.

4. **Leaked Documents and Communications**: The presence of file names, potential document titles, and communication snippets hint at leaked or intercepted documents and communications, possibly as part of espionage activities. These could contain sensitive or strategic information targeted in cyber espionage campaigns.

5. **Technological Espionage and Surveillance Products**: Various references to "products," "devices," and specific technologies like "Shirayuki 50 孛" indicate discussions about espionage equipment or surveillance products being used or evaluated for operations.

6. **Operational Security Concerns and Practices**: Dialogues questioning the timeliness of operations, the effectiveness of post-response, and the management of sensitive information suggest an ongoing evaluation of operational security and practices within espionage activities.

7. **Espionage Operations and Their Impact**: The assorted data indicate active espionage operations aimed at collecting intelligence on political, economic, and social aspects across multiple countries. The fragmented nature of the text suggests it's part of a larger dataset or report, likely containing sensitive insights into the capabilities, targets, and outcomes of espionage activities.

Given the sensitive and fragmented nature of the information, it appears to be part of a larger intelligence leak or a compilation of data from surveillance and espionage operations. Further analysis and context would be needed to fully understand the implications and specific details of the mentioned operations and tools.

Let's modify our prompt a little!

In [10]:
file_name = data[0]['file_name']  # Adjusted to access the first item in the list
translated_text = data[0]['translated_text']

from openai import OpenAI
os.environ["OPENAI_API_KEY"] = api_key

client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-4-0125-preview",
  max_tokens=4096,
  messages=[
    {"role": "system", "content": "You are a Cyber Threat Intelligence analyst specialized in China operation. You are dedicated to analyzing leaked sensitive information in relation to Chinese espionage capabilities. The data contains multiple format documents, chat conversion, screenshot of products..."},
    {"role": "user", "content": f"Make me a summary of the following information and include evidences such as names, ip or any other artefacts: {all_translated_texts[:10000]}" }
  ]
)

print(completion.choices[0].message.content)
The provided text appears to be a collection of seemingly random sequences interspersed with mentions of potential espionage-related activities, cyber threats, and intelligence gathering efforts attributed possibly to Chinese operations or interests. Given the fragmented and coded nature of the excerpt, the following summary attempts to identify key points and artifacts of interest within the constraints of the provided data:

### Potential Cyber Espionage Tools and Targets
- **Mikrotik's 0day and Gmail acquisition**: References to vulnerabilities or exploits possibly used for gaining unauthorized access.
- **iOS-related questions**: May imply an interest or ongoing efforts to compromise Apple's iOS devices.

### Data Breaches or Intelligence Gathering Efforts
- Specific details about data collections, including numbers of records and types of data obtained from various organizations and countries, notably from Myanmar, Vietnam, and presumably other regions. The data mentioned includes:
  - **Vietnam Airlines**: Passenger ID, work, purpose, etc.
  - **Vietnam South China Yunshang Business Dot**: Email, password, login letters, etc.
  - **Burma Myanmar Yunshang Business Datoma**: Phone numbers, etc.

### Allegations of Criminal Activity
- Mention of a **Russian spy black customer group "Tura"** involved in data theft, using a tool or codename **"Credit Guardian Star Chain Road"**.

### Potential Chinese Cyber Operations and Individuals
- References to **"Shirayuki 50 孛"**, which might be a codename for a project, operation, or product.
- Names like **Zhang Dedi, Gu Guangyan**, and **Wettin 33849215 @ Milk**, suggesting individuals associated with operations or intelligence gathering efforts.
- Date and time stamps (e.g., **September 15, 2021**, **May 25th 15:16**), possibly relating to operations or communications.

### Cybersecurity Incidents and Tactics
- The list could indicate targeted industries or areas of interest for intelligence gathering or cyber-attacks, including aviation, business databases, and governmental organizations.
- The text also mentions various forms of data (e.g., **"gmail secrets", "net surveys"**), suggesting focuses for cyber espionage activities.

### Miscellaneous
- Several IP addresses, domain names, or other cyber artifacts are not explicitly listed in the provided text but might be encoded within the mentioned datasets (e.g., **"Heart WeChat Electric Edition", "Big RSKTH6SI or CN"**).
- There are also numerous references to "Datoma" and "Table Power Limited Committee", which could be related to the databases or methods of organizing gathered intelligence.

### Conclusion
The leaked information appears to encompass a broad spectrum of cyber espionage activities, potentially tied to Chinese operations, encompassing cyber attack tools, data breaches, targeted surveillance, and possibly cyber-physical operations. Without context or decryption, the exact intentions or operations behind these excerpts remain speculative. However, the text includes technical terms, possible codenames, individual names, and specific details suggesting sophisticated and wide-ranging cyber intelligence efforts.

Interesting! Now we have a broader overview of the kind of information the data might contain.
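Both prompts above only send `all_translated_texts[:10000]`, which is about 2% of the 507,799 characters loaded. One way to cover the rest is to split the text into windows and summarize each one in turn. The sketch below shows only the splitting step; `iter_chunks`, the window size, and the overlap are illustrative assumptions rather than part of the original notebook.

```python
def iter_chunks(text, size=10000, overlap=500):
    """Yield successive windows of `text`; each shares `overlap` characters with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    for start in range(0, len(text), step):
        yield text[start:start + size]
        # Stop once the current window has reached the end of the text
        if start + size >= len(text):
            break

# Toy input so the behaviour is easy to check by eye
chunks = list(iter_chunks("abcdefghij", size=4, overlap=1))
print(chunks)
```

Each chunk could then be passed to the same `client.chat.completions.create` call in place of `all_translated_texts[:10000]`, with the per-chunk summaries combined in a final request. This costs one API call per chunk, so a few dozen requests for a dataset of this size.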

Retrieval Augmented Generation (RAG)

In this part, I want to experiment with RAG to query our data.

In [14]:
#!pip install langchain langchain-community chromadb jq
#!pip install langchain-openai
#!pip install jq
#!pip install --upgrade --quiet  langchain langchain-openai faiss-cpu tiktoken

Using ChromaDB

Let's create a simple RAG pipeline with LangChain so that we can query our data with natural-language questions.

In [77]:
from langchain.prompts import ChatPromptTemplate
from langchain_community.vectorstores import Chroma, FAISS
from langchain_community.document_loaders import JSONLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Create the embedding
embedding_function = OpenAIEmbeddings()

# Loading our JSON file
loader = JSONLoader(file_path="text_translations2.json", jq_schema=".[] | .translated_text", text_content=False)
documents = loader.load()

db = Chroma.from_documents(documents, embedding_function)
retriever = db.as_retriever()

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

model = ChatOpenAI(temperature=0, model_name="gpt-4-0125-preview")

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
In [49]:
query = "Can you give me details about intelligence capabilities from this data leak?"
# Alternative query: "Give me details about the email collection platform"
print(chain.invoke(query))
Based on the provided context from the data leak, the intelligence capabilities described involve the use of professional Advanced Persistent Threat (APT) penetration techniques. The document outlines a service system that specializes in intelligence services, emphasizing a "loose supply" approach. Here are the key points regarding the intelligence capabilities:

1. **Professional APT Penetration Team**: The service boasts a research team with significant expertise in APT penetration. This suggests a high level of skill in conducting sophisticated cyber attacks that remain undetected over long periods.

2. **Rich Experience in APT Penetration**: The document highlights the team's extensive experience in APT operations, indicating they have successfully executed multiple APT campaigns in the past. This experience likely contributes to their ability to navigate complex security environments and achieve their objectives.

3. **Cooked APT Implementation Flow**: The term "cooked" here suggests a well-established, tested, and refined methodology for implementing APT attacks. This implies the service has a systematic approach to conducting cyber espionage or cyber attacks, which could include stages such as reconnaissance, initial compromise, establishment of a backdoor, privilege escalation, lateral movement, persistence, and exfiltration.

4. **Focus on Domestic Public and Peaceful Departments**: The service targets domestic public sectors and departments that are described as "peaceful," which may include government, healthcare, education, or other critical infrastructure sectors. This targeting strategy indicates a focus on entities that might hold valuable information or have vulnerabilities due to their peaceful nature.

5. **Business Needs and Special Objectives for APT**: The service is tailored to meet the specific business needs of its clients, suggesting a customizable approach to cyber espionage. The mention of "Special Objective for APT" implies that operations are conducted with specific goals in mind, such as obtaining key information data from targeted entities.

6. **Obtaining Key Information Data of Specific Targets**: The primary objective of the APT operations is to acquire critical data from specific targets. This could involve stealing sensitive information, intellectual property, personal data, or other valuable assets that can be leveraged by the service's clients.

The document also includes a reference to a website (Wwww.j-soon.net), which might be associated with the service or serve as a point of contact. However, without further context, it's unclear what additional information or services might be offered through this site.

In summary, the intelligence capabilities described in the data leak pertain to sophisticated APT penetration techniques aimed at specific targets to fulfill the business needs of clients, with a focus on domestic public sectors.
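Behind `db.as_retriever()`, the question is embedded and compared against the stored document embeddings, and the closest documents become the `{context}` of the prompt. The toy sketch below illustrates that ranking step with hand-written 3-dimensional vectors in place of real OpenAI embeddings. It assumes cosine similarity as the metric (the metric a vector store actually uses depends on its configuration), and `top_k` and the sample vectors are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]), reverse=True)
    return ranked[:k]

# Made-up embeddings; real ones have hundreds or thousands of dimensions
docs = {
    "apt_services": [0.9, 0.1, 0.0],
    "email_platform": [0.1, 0.9, 0.1],
    "chat_log": [0.0, 0.2, 0.9],
}
print(top_k([0.8, 0.3, 0.0], docs))
```

Only the retrieved documents reach the model, which is why the answer above draws on a single product description rather than on the whole leak.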

Using FAISS for RAG with Memory

In [50]:
# Uncomment the following line if you need to initialize FAISS with no AVX2 optimization
# os.environ['FAISS_NO_AVX2'] = '1'

from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Reuse the JSONLoader defined in the ChromaDB section above
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1433, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(docs, embeddings)
Created a chunk of size 3069, which is longer than the specified 1433
Created a chunk of size 2432, which is longer than the specified 1433
Created a chunk of size 1507, which is longer than the specified 1433
Created a chunk of size 2345, which is longer than the specified 1433
Created a chunk of size 2518, which is longer than the specified 1433
Created a chunk of size 3880, which is longer than the specified 1433
Created a chunk of size 2013, which is longer than the specified 1433
Created a chunk of size 1659, which is longer than the specified 1433
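The warnings above appear because `CharacterTextSplitter` only cuts at its separator (`"\n\n"` by default): a passage with no blank line in it is kept whole even when it exceeds `chunk_size`. The following is a simplified illustration of that behaviour, not LangChain's actual implementation, and `split_on_separator` is a hypothetical name.

```python
def split_on_separator(text, chunk_size, separator="\n\n"):
    """Split at `separator`, then greedily merge neighbouring pieces up to `chunk_size`.

    A piece that is longer than `chunk_size` on its own is never cut,
    so it ends up as an oversized chunk.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

text = "short one\n\nshort two\n\n" + "x" * 50
for chunk in split_on_separator(text, chunk_size=25):
    print(len(chunk), "oversized" if len(chunk) > 25 else "ok")
```

A splitter that falls back to smaller separators, such as LangChain's `RecursiveCharacterTextSplitter`, is one way to avoid oversized chunks if they become a problem for the embedding step.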
In [73]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts.prompt import PromptTemplate

retriever = db.as_retriever(search_kwargs={"k":5})


# Define your custom template
custom_template = """You are a Cyber Threat Intelligence analyst specialized in China operation. You are dedicated to analyzing leaked sensitive information in relation to Chinese espionage capabilities. The data contains multiple format documents, chat conversion, screenshot of products... If you do not know the answer reply with 'I am sorry'.
Chat History:
{chat_history}
Follow Up Input: {question}
Answer: """

CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)

# Initialize memory for chat history
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Initialize the ConversationalRetrievalChain
qa_chain = ConversationalRetrievalChain.from_llm(model, retriever, condense_question_prompt=CUSTOM_QUESTION_PROMPT, memory=memory)

def execute_conversation(question):
    # Load conversational history from file
    try:
        with open('conversational_history.json', 'r') as f:
            conversational_history = json.load(f)
    except FileNotFoundError:
        conversational_history = []
    
    # Update conversational history with the user's question
    conversational_history.append(("User", question))
    
    # Use the ConversationalRetrievalChain to get the answer
    result = qa_chain({"question": question})
    
    # Extract the 'answer' part from the result
    response_text = result.get('answer', 'Sorry, I could not generate a response.')
    
    # Update conversational history with the bot's response
    conversational_history.append(("Bot", response_text))
    
    # Limit the history to the last 10 entries (5 question/answer pairs)
    if len(conversational_history) > 10:
        conversational_history = conversational_history[-10:]
    
    # Save conversational history to file
    with open('conversational_history.json', 'w') as f:
        json.dump(conversational_history, f)
    
    # Print only the last message in the conversational history
    last_message = conversational_history[-1]
    print(f"Discussion:\n{last_message[0]}: {last_message[1]}")
In [56]:
# Call the function with a question
execute_conversation("Which countries might be a target according to the documents?")
Discussion:
Bot: Based on the document, the countries that might be targets include:

1. Afghanistan (referred to as "Afu Khan Guojia")
2. Countries in Southeast Asia (mentioned in the context of anti-terrorism postal mail)
3. Countries in West Asia (mentioned in relation to the Ministry of Communications)
4. Thailand (Thai Ministry of Finance and Ministry of Commerce)
5. Mongolia (Mentioned in relation to Foreign Communications and the Police Bureau)
6. Kazakhstan (referred to in the context of airlines and possibly telecommunications with "Airastanna Airlines" and "Harzakstan Kcell")
7. Malaysia (mentioned in the context of the military, specifically the Malaysian Army Network)
8. Macau (mentioned in the context of airlines, "Macau Airlines")
9. Pakistan (mentioned in the context of cooperation with the Pakistani Public Security Bureau)
10. Syria (mentioned in the context of a specific direction of focus)
11. Uzbekistan (mentioned as "Wu Zabbettan")
12. Iran (mentioned as "Yilang")

These countries are mentioned in various contexts, including counter-terrorism, military involvement, and communications, indicating they might be targets or areas of interest for the activities described in the document.
In [62]:
# Call the function with a question
execute_conversation("Give me more details about any references related to espionage capabilities")
Discussion:
Bot: The document outlines a company's capabilities in network attack, anti-penetration, and security research, emphasizing its experience in network infiltration and data extraction. It mentions the company's service to central government bodies, law enforcement, and departments concerned with public order, providing them with key information data on specific targets or network systems of interest. This includes the extraction and analysis of large-scale data for national web departments, aiming at specific target data excavation services.

The company boasts a professional APT (Advanced Persistent Threat) penetration research team with extensive experience in APT penetration and a well-established implementation process. It caters to domestic public security departments based on their business needs, offering services to obtain crucial information data on specific targets through APT appointments.

Furthermore, the document highlights the company's involvement in counter-terrorism efforts, particularly in collaboration with the Pakistani Public Security Bureau. It suggests a comprehensive approach combining military, civilian, and international cooperation for deep counter-terrorism efforts in the Xinjiang region, aiming to build a comprehensive defense force in the northwestern area. The company provides anti-terrorism data support, including sensitive postal data related to counter-terrorism efforts in Pakistan and other related information.

The services mentioned also extend to penetrating government organization networks, both domestically (in regions like Xinjiang, Hong Kong, and Taiwan) and overseas, targeting specific government organization networks for infiltration based on public security needs.

Overall, the document describes a sophisticated set of espionage capabilities focused on network infiltration, data extraction, and analysis for governmental and law enforcement agencies, with a particular emphasis on counter-terrorism and public security operations.
In [65]:
# Call the function with a question
execute_conversation("Give me more details about the email/social network intelligence platform")
Discussion:
Bot: The email/social network intelligence platform described in the provided context appears to be a sophisticated system designed for comprehensive email analysis and decision-making. Here are the key features and functionalities based on the details provided:

1. **Product Superiority:**
   - **High Accuracy:** Utilizes a large data frame and "Wenben Intelligence" to accurately recognize techniques, enabling fast analysis of high-volume emails.
   - **Powerful Analysis:** Capable of conducting various types of custom analyses on target emails, including but not limited to the mailing list, email content, and sender information.
   - **High Availability:** Offers a stable and reliable system with different interface operations that are convenient and easy to use.

2. **Email Collection:**
   - The platform provides a function for the self-initiated collection of emails based on specified criteria such as target mailbox account, IP address of the server device, and password/security code. This allows for the automatic gathering of email data for analysis.

3. **Mail View:**
   - Supports both single-player and collaborative models for the analysis and judgment of original email data. It enables deep-level analysis and collaboration on various types of email data.

4. **Modules and Analysis:**
   - The system is structured around four major modules: data source, data intelligence, intelligence work, and the user interface. This structure facilitates comprehensive research and analysis of mail data, including search, correlation analysis, and judgment.

5. **System Requirements:**
   - The platform can be deployed on Windows/Linux operating systems, requiring a CPU with 16 cores and a main frequency of 2.2GHz, 8GB of memory, and 2TB of hard disk capacity. It supports cluster functionality for scalable performance.

6. **Learning Algorithms:**
   - Employs learning algorithms to categorize various data within email texts and attachments, extracting both structured and unstructured data for efficient storage and retrieval.

7. **Network Framework Structure:**
   - Utilizes a B/S (Browser/Server) architecture for easy access and use by clients through a web interface. This allows users to log into the system for comprehensive email data analysis.

8. **Product Composition:**
   - The platform supports both private and public cloud deployments. The private cloud version offers an independent platform, while the public cloud version operates on a SaaS model, allowing direct access through an account.

Overall, this email/social network intelligence platform is designed to offer powerful and accurate email analysis capabilities, supporting both individual and collaborative efforts in understanding and making decisions based on email data.
In [67]:
# Call the function with a question
execute_conversation("Extract and summarize any kind of information that might be useful for a defender or a threat analyst, including potential IOCs and TTPs")
Discussion:
Bot: The document outlines a comprehensive set of tools and methodologies used by a professional network attack and anti-penetration team, which could be of significant interest to defenders and threat analysts. Here's a summary of the relevant information, including potential Indicators of Compromise (IOCs) and Tactics, Techniques, and Procedures (TTPs):

### TTPs:

1. **Auxiliary Modules for Initial Reconnaissance:**
   - Scanning and checking various network services.
   - Collecting login credentials through fake services.
   - Port scanning to identify open services and potential vulnerabilities.

2. **Penetration and Post-Penetration Modules:**
   - Utilization of custom and predefined modules for conducting penetration attacks and maintaining access post-penetration.
   - Techniques for lateral movement within the network.
   - Reuse of evidence or high-value information obtained during the initial penetration for further attacks.

3. **Attack Persistence:**
   - Techniques for maintaining access without needing to re-perform complex attack operations. This includes reusing attack parameters and sessions for re-infiltration.

4. **Use of Compromised Hosts as Springboards:**
   - After successful penetration, compromised hosts can be used to attack additional targets.

### Potential IOCs:

1. **Network Signatures:**
   - Unusual port scanning activity.
   - Suspicious login attempts from services mimicking legitimate ones.
   - Traffic patterns indicating lateral movement or data exfiltration.

2. **Host-based Indicators:**
   - Presence of unknown or unauthorized modules and scripts.
   - Unexpected system or network configuration changes.
   - Files or logs indicating the use of specific penetration tools or methods.

3. **Vulnerability Exploitation:**
   - Attempts to exploit known vulnerabilities, which could be identified through unusual system behavior or alerts from intrusion detection systems.

4. **Command and Control (C2) Traffic:**
   - Network communications with known malicious IP addresses or domains.
   - Use of common protocols in uncommon ways, which might indicate C2 activity.

### Defensive Recommendations:

1. **Enhanced Monitoring and Detection:**
   - Implement robust monitoring of network traffic and system logs to detect early signs of reconnaissance and penetration attempts.
   - Use intrusion detection systems (IDS) and intrusion prevention systems (IPS) to identify and block known attack patterns.

2. **Vulnerability Management:**
   - Regularly scan for and patch vulnerabilities to reduce the attack surface.
   - Employ web application firewalls (WAFs) to protect against application-level attacks.

3. **Access Control and Segmentation:**
   - Enforce strict access controls and network segmentation to limit lateral movement.
   - Use multi-factor authentication (MFA) to protect against credential theft.

4. **Incident Response Planning:**
   - Develop and regularly update an incident response plan.
   - Conduct regular security training and simulations to prepare for potential attacks.

By understanding the TTPs and potential IOCs outlined in the document, defenders and threat analysts can better prepare and protect their networks against sophisticated cyber attacks.
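
The answer above stays at the level of indicator categories rather than concrete values. To complement it, the translated text can also be scanned directly for indicator candidates. The following is a minimal sketch, not part of the original workflow: the helper name `extract_ioc_candidates`, the regular expressions, and the sample string are all assumptions made for illustration. The patterns are deliberately loose (a file name such as `report.png` would match the domain pattern), so every hit needs manual review.

```python
import re

# Hypothetical helper (not from the original notebook): collect indicator
# candidates from OCR'd and translated text using plain regular expressions.
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
DOMAIN_RE = re.compile(
    r"\b(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,}\b"
)


def extract_ioc_candidates(text):
    """Return de-duplicated, sorted IPv4, email and domain candidates."""
    return {
        "ipv4": sorted(set(IPV4_RE.findall(text))),
        "email": sorted(set(EMAIL_RE.findall(text))),
        # Domains are lowercased so the same host is not counted twice.
        "domain": sorted({d.lower() for d in DOMAIN_RE.findall(text)}),
    }


# Made-up sample text, only to show the call; replace it with the
# translated text produced earlier in the notebook.
sample = "Server 192.168.1.10 contacted mail.example.org; operator admin@example.com."
candidates = extract_ioc_candidates(sample)
for kind, values in candidates.items():
    print(kind, values)
```

Regular expressions are used here instead of another LLM call because their results are deterministic and easy to check against the source images.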

What next?

In this document, we explored a method for analyzing a data leak, made up mostly of PNG files in Chinese, from a government contractor with offensive capabilities. We showed how to use OCR to extract text from the PNG files and translate it into English to learn more about the leak. Then we used generative AI to summarize some of the available information, and finally built a RAG (retrieval-augmented generation) pipeline that lets you explore the data and ask specific questions without manually digging through the vast amount of information.

There is obviously more to discover, and I hope this process will assist you in further analyzing the data leak.

If you like this notebook, stay tuned for more soon!

Thomas Roccia | @fr0gger_
