开发目的
解决漏洞扫描器的痛点
第一就是扫描量太大,对一个站点扫描了大量的无用 POC,浪费时间
指纹识别后还需要根据对应的指纹去进行 payload 扫描,非常的麻烦
开发思路
我们的思路大体分为指纹+POC+扫描
所以思路大概从这几个方面入手
首先就是 POC,我们得寻找一直在更新的 POC,而且是实时更新的,因为自己写的话有点太费时间了
但是这 POC 的决定是根据我们扫描器来的,因为世面上已经有许多不错的扫描器了,目前打算使用的是 nuclei 扫描器
https://github.com/projectdiscovery/nuclei
Nuclei 是一种现代、高性能的漏洞扫描程序,它利用基于 YAML 的简单模板。它使您能够设计自定义漏洞检测场景,以模拟真实世界的条件,从而实现零误报。
目前也在不断维护更新,当然还有 Xray,Goby 等工具也是不错的选择
然后回到指纹识别技术,这个需要大量的指纹样本,但是世面上的各种工具已经可以做得很好了
指纹识别
这里就的学习一下指纹识别的技术
首先我们需要知道收集指纹目前大概有哪些方法
指纹识别方式
特定文件
比如举一个例子,我们通常是如何判断一个框架是 thinkphp 呢?
我们随便找几个 thinkphp 的网站
特征就是它的图标是非常明显的
[img=720,684.6428571428571]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/7b330944-48ff-46d1-87af-d1dae33ee96f.png[/img]
可以看到图标都是一样的,目前 fofa 和 hunter 已经有这种查找的方法了,一般都是把我们的图标换算为我们的 hash 值
这个就是我们的 favicon.ico 图标
一般网站是通过在路径后加入 favicon.ico 比如http://xxxxxx/favicon.ico
[img=720,904.6706586826348]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/efd0ad4b-d4b2-4e15-9917-536c82b42ec9.png[/img]
然后就能获取这个图标了,而在 fofa 中可以直接拖动查询了,可以直接算出 hash 值
[img=720,381.2142857142857]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/a407609b-8850-4430-8ce0-8217c1901eda.png[/img]
比如 thinkphp 的
[img=720,585.6428571428571]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/d2d6e9a9-bd25-4af3-b404-bfe5cc83c978.png[/img]
然后再次查询
[img=720,446.7857142857143]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/44b217e8-487a-4273-95ab-ece81ee3e74a.png[/img]
全是 tp 的网站参考https://github.com/TideSec/TideFinger/blob/master/Web%E6%8C%87%E7%BA%B9%E8%AF%86%E5%88%AB%E6%8A%80%E6%9C%AF%E7%A0%94%E7%A9%B6%E4%B8%8E%E4%BC%98%E5%8C%96%E5%AE%9E%E7%8E%B0.md
当然除了我们的 ico 文件,还有其他很多的文件
【----帮助网安学习,以下所有学习资料免费领!加vx:YJ-2021-1,备注 “博客园” 获取!】
① 网安学习成长路径思维导图
② 60+网安经典常用工具包
③ 100+SRC漏洞分析报告
④ 150+网安攻防实战技术电子书
⑤ 最权威CISSP 认证考试指南+题库
⑥ 超1800页CTF实战技巧手册
⑦ 最新网安大厂面试题合集(含答案)
⑧ APP客户端安全检测指南(安卓+IOS)
一些网站的特定图片文件、js 文件、CSS 等静态文件,如 favicon.ico、css、logo.ico、js 等文件一般不会修改,通过爬虫对这些文件进行抓取并比对 md5 值,如果和规则库中的 Md5 一致则说明是同一 CMS。这种方式速度比较快,误报率相对低一些,但也不排除有些二次开发的 CMS 会修改这些文件。
[img=720,329.14285714285717]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/bd38f567-12f6-4471-9ce4-0d7e13a26e93.png[/img]
页面关键字
比如 tp 的错误页面大多数都是
[img=720,496.275]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/9dc612ef-e2cf-44ff-8fa6-6b9519834f4d.png[/img]
我们 body 就可以包含这个关键字了
或者可以构造错误页面,根据报错信息来判断使用的 CMS 或者中间件信息,比较常见的如 tomcat 和 spring 的报错页面。根据 response header 一般有以下几种识别方式:
请求头关键字
根据网站 response 返回头信息进行关键字匹配,这个一般是 ningx 这种
[img=720,631.925]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/466830db-0c49-4e80-8561-c9701dcbf45a.png[/img]
能够识别我们的服务器
URL 路径
根据总结wordpress 默认存在 wp-includes 和 wp-admin 目录,织梦默认管理后台为 dede 目录,solr 平台可能使用/solr 目录,weblogic 可能使用 wls-wsat 目录等。
大部分还是根据我们的 body
[img=720,883.8910505836576]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/51ac6bd7-1985-4d53-ba2f-7d328366f3eb.png[/img]
然后点一个进去
[img=720,803.7673130193906]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/8508b08c-d36e-4963-b1fc-1352fd7a95bf.png[/img]
可以看到都是我们的 wordPress 的站点
指纹识别方法
有了我们上面的识别技术,那么我们大概是如何来识别一个指纹的呢
首先使用 python 简单举一个实现
首先就是需要一个配置文件,这个配置文件就需要包含我们的大体指纹和验证方法- - name: ThinkPHP
- matchers:
- - type: header
- rule: X-Powered-By
- keyword: ThinkPHP
-
- - type: body
- keyword: "http://www.thinkphp.cn"
-
- - type: banner
- keyword: thinkphp
-
- - type: path
- path: /thinkphp/library/think/
- keyword: class
-
- - type: favicon_hash
- hash: 1165838194
-
复制代码 然后就是我们的后端处理逻辑了- import yaml
- import requests
- import socket
- import base64
- import mmh3
-
- def get_http_response(url, path=""):
- try:
- full_url = url.rstrip("/") + path
- return requests.get(full_url, timeout=5)
- except:
- return None
-
- def get_tcp_banner(ip, port=80):
- try:
- with socket.create_connection((ip, port), timeout=5) as s:
- banner = s.recv(1024).decode(errors="ignore")
- return banner
- except:
- return ""
-
- def get_favicon_hash(url):
- try:
- res = requests.get(url.rstrip("/") + "/favicon.ico", timeout=5)
- favicon = base64.encodebytes(res.content)
- return mmh3.hash(favicon.decode('utf-8'))
- except:
- return None
-
- def load_fingerprints(path="fingerprints.yaml"):
- with open(path, "r", encoding="utf-8") as f:
- return yaml.safe_load(f)
-
- def match_fingerprint(url, ip=None):
- fingerprints = load_fingerprints()
- results = []
-
- res = get_http_response(url)
- banner = get_tcp_banner(ip or url.replace("http://", "").replace("https://", ""), 80)
- favicon_hash = get_favicon_hash(url)
-
- for fp in fingerprints:
- matched = False
- for matcher in fp["matchers"]:
- if matcher["type"] == "header" and res:
- if matcher["rule"] in res.headers and matcher["keyword"].lower() in res.headers[matcher["rule"]].lower():
- matched = True
- elif matcher["type"] == "body" and res:
- if matcher["keyword"].lower() in res.text.lower():
- matched = True
- elif matcher["type"] == "banner":
- if matcher["keyword"].lower() in banner.lower():
- matched = True
- elif matcher["type"] == "path":
- res2 = get_http_response(url, matcher["path"])
- if res2 and matcher["keyword"].lower() in res2.text.lower():
- matched = True
- elif matcher["type"] == "favicon_hash":
- if favicon_hash == matcher["hash"]:
- matched = True
- if matched:
- results.append(fp["name"])
- return results
-
- # 示例使用
- if __name__ == "__main__":
- target_url = "http://101.200.50.94:8009/"
- result = match_fingerprint(target_url)
- print("识别结果:", result)
-
复制代码 大体逻辑就是这样了
首先就是 yaml 文件为我们的判断依据,对应不同的判断方法我们都有对应的后端处理
一个是对 body 的处理,一个是对 hash 文件的处理
[img=720,795.2142857142857]https://www.yijinglab.com/guide-img/d9634e2f-3b66-42e7-8279-c0877cdd70e5/ef7e6cb6-44e7-40fd-b03d-6591b8d01137.png[/img]
然后再根据规则去匹配
匹配成功输出结果
当然这只是一个简单的逻辑,如果需要实现更高高效快捷第一就是指纹库,第二就是代码运行的速率,提高线程
最终识别代码
首先就是指纹库的获取,这个的话我们就不直接获取了,使用的是 EHole 的指纹库
我们大概看看部分代码- {
- "fingerprint": [{
- "cms": "seeyon",
- "method": "keyword",
- "location": "body",
- "keyword": ["/seeyon/USER-DATA/IMAGES/LOGIN/login.gif"]
- }, {
- "cms": "seeyon",
- "method": "keyword",
- "location": "body",
- "keyword": ["/seeyon/common/"]
- }, {
- "cms": "Spring env",
- "method": "keyword",
- "location": "body",
- "keyword": ["servletContextInitParams"]
- }, {
- "cms": "微三云管理系统",
- "method": "keyword",
- "location": "body",
- "keyword": ["WSY_logo","管理系统 MANAGEMENT SYSTEM"]
- }, {
- "cms": "Spring env",
- "method": "keyword",
- "location": "body",
- "keyword": ["logback"]
- }, {
- "cms": "Weblogic",
- "method": "keyword",
- "location": "body",
- "keyword": ["Error 404--Not Found"]
- }, {
- "cms": "Weblogic",
- "method": "keyword",
- "location": "body",
- "keyword": ["Error 403--"]
- }
-
-
- {
- "cms": "Atlassian – JIRA",
- "method": "faviconhash",
- "location": "body",
- "keyword": ["981867722"]
- }, {
- "cms": "OpenStack",
- "method": "faviconhash",
- "location": "body",
- "keyword": ["-923088984"]
- }, {
- "cms": "Aplikasi",
- "method": "faviconhash",
- "location": "body",
- "keyword": ["494866796"]
- }, {
- "cms": "Ubiquiti Aircube",
- "method": "faviconhash",
- "location": "body",
- "keyword": ["1249285083"]
- }
复制代码 简单看了一下逻辑可以发现和我们的指定方法应该差不多,逻辑就是首先根据 method 去选择方法,一个是 keyword 方法,一个是 faviconhash 方法,是一个大的判断,然乎下面就是根据具体的比如 body,title 等去识别了
代码如下- import json
- import requests
- import hashlib
- import base64
- from bs4 import BeautifulSoup
- from urllib.parse import urljoin
- from concurrent.futures import ThreadPoolExecutor, as_completed
- import argparse
-
- # 加载指纹
- def load_fingerprints(file='finger.json'):
- with open(file, 'r', encoding='utf-8') as f:
- data = json.load(f)
- if "fingerprint" in data:
- return data["fingerprint"]
- raise ValueError("指纹文件格式不正确,应包含 'fingerprint' 字段。")
-
- # 获取 HTTP 响应
- def get_http_response(url):
- try:
- headers = {
- "User-Agent": "Mozilla/5.0"
- }
- return requests.get(url, headers=headers, timeout=8, verify=False)
- except:
- return None
-
- # 计算 favicon hash
- def get_favicon_hash(url):
- try:
- favicon_url = urljoin(url, '/favicon.ico')
- res = requests.get(favicon_url, timeout=5, verify=False)
- if res.status_code == 200:
- m = hashlib.md5()
- b64 = base64.b64encode(res.content)
- m.update(b64)
- return int(m.hexdigest(), 16)
- except:
- return None
-
- # 匹配单条指纹
- def match_one(fpr, res, fav_hash):
- method = fpr["method"]
- loc = fpr.get("location", "body").lower()
- kws = fpr["keyword"]
-
- if method == 'keyword':
- text_body = res.text or ""
- text_head = "\n".join(f"{k}: {v}" for k, v in res.headers.items())
-
- # 处理 title 和 header 等
- if loc == 'header':
- spaces = [text_head, text_body]
- elif loc == 'title':
- soup = BeautifulSoup(text_body, "html.parser")
- title = soup.title.string if soup.title and soup.title.string else ""
- spaces = [title, text_body]
- elif loc == 'body':
- spaces = [text_body]
- else:
- spaces = [text_body]
-
- for space in spaces:
- for kw in kws:
- if kw.lower() in space.lower():
- return True
-
- if method == 'faviconhash' and fav_hash is not None:
- for kw in kws:
- try:
- if fav_hash == int(kw):
- return True
- except:
- continue
-
- return False
-
- # 识别单个 URL 指纹
- def match_fingerprint(url, fps=None):
- fps = fps or load_fingerprints()
- res = get_http_response(url)
- fav_hash = get_favicon_hash(url)
-
- matched = []
- for fpr in fps:
- if 'cms' not in fpr or 'method' not in fpr or 'keyword' not in fpr:
- continue
- if match_one(fpr, res, fav_hash):
- matched.append(fpr['cms'])
-
- print(f"[✓] {url} 指纹识别结果:{list(set(matched))}")
- return {url: list(set(matched))}
-
- # 多线程执行
- def run_multithread(urls, threads):
- fps = load_fingerprints()
- results = []
- with ThreadPoolExecutor(max_workers=threads) as executor:
- future_to_url = {executor.submit(match_fingerprint, url, fps): url for url in urls}
- for future in as_completed(future_to_url):
- results.append(future.result())
- return results
-
- # 主程序入口
- def main():
- parser = argparse.ArgumentParser(description="指纹识别脚本 - 支持多线程")
- group = parser.add_mutually_exclusive_group(required=True)
- group.add_argument("-u", "--url", help="单个目标 URL")
- group.add_argument("-f", "--file", help="包含多个 URL 的文件")
- parser.add_argument("-t", "--threads", type=int, default=10, help="线程数(默认10)")
- args = parser.parse_args()
-
- if args.url:
- match_fingerprint(args.url)
-
- elif args.file:
- with open(args.file, 'r', encoding='utf-8') as f:
- urls = [line.strip() for line in f if line.strip()]
- results = run_multithread(urls, args.threads)
-
- if __name__ == '__main__':
- main()
-
复制代码 加入了支持多线程和支持多目标的思路
结合漏洞扫描
我们光目标识别后,还需要实现精准化打击,正好 Nuclei 引擎支持根据 tag 去寻找我们的目标,完美了
实现思路就是首先寻找我们的 tag,然后在漏洞库里面查询,把有 tag 的和没有 tag 的分别分开放好,然后根据有 tag 的去精准化识别运行,完成最后的精准化 POC 攻击
初步的代码如下- import json
- import os
- import threading
- import time
- import base64
- import hashlib
- import requests
- import argparse
- from bs4 import BeautifulSoup
- from urllib.parse import urljoin
- from concurrent.futures import ThreadPoolExecutor, as_completed
- from queue import Queue
-
- requests.packages.urllib3.disable_warnings()
-
- # ---------- 指纹识别部分 ----------
-
- def load_fingerprints(file='finger.json'):
- with open(file, 'r', encoding='utf-8') as f:
- data = json.load(f)
- return data["fingerprint"]
-
- def get_http_response(url):
- try:
- headers = {"User-Agent": "Mozilla/5.0"}
- return requests.get(url, headers=headers, timeout=8, verify=False)
- except:
- return None
-
- def get_favicon_hash(url):
- try:
- favicon_url = urljoin(url, '/favicon.ico')
- res = requests.get(favicon_url, timeout=5, verify=False)
- if res.status_code == 200:
- m = hashlib.md5()
- b64 = base64.b64encode(res.content)
- m.update(b64)
- return int(m.hexdigest(), 16)
- except:
- return None
-
- def match_one(fpr, res, fav_hash):
- method = fpr["method"]
- loc = fpr.get("location", "body").lower()
- kws = fpr["keyword"]
-
- if method == 'keyword':
- text_body = res.text or ""
- text_head = "\n".join(f"{k}: {v}" for k, v in res.headers.items())
- if loc == 'header':
- spaces = [text_head]
- elif loc == 'title':
- soup = BeautifulSoup(text_body, "html.parser")
- title = soup.title.string if soup.title and soup.title.string else ""
- spaces = [title]
- else:
- spaces = [text_body]
-
- for space in spaces:
- for kw in kws:
- if kw.lower() in space.lower():
- return True
-
- elif method == 'faviconhash' and fav_hash is not None:
- for kw in kws:
- try:
- if fav_hash == int(kw):
- return True
- except:
- continue
-
- return False
-
- def match_fingerprint(url, fps):
- res = get_http_response(url)
- fav_hash = get_favicon_hash(url)
- matched = []
-
- for fpr in fps:
- if 'cms' not in fpr or 'method' not in fpr or 'keyword' not in fpr:
- continue
- if match_one(fpr, res, fav_hash):
- matched.append(fpr['cms'])
-
- print(f"[✓] {url} 指纹识别结果:{list(set(matched))}")
- return {"url": url, "cms": list(set(matched))[0] if matched else ""}
-
- def run_fingerprint_scan(urls, threads, output='res.json'):
- fps = load_fingerprints()
- results = []
-
- with ThreadPoolExecutor(max_workers=threads) as executor:
- future_to_url = {executor.submit(match_fingerprint, url, fps): url for url in urls}
- for future in as_completed(future_to_url):
- results.append(future.result())
-
- with open(output, 'w', encoding='utf-8') as f:
- json.dump(results, f, ensure_ascii=False, indent=2)
-
- # ---------- Nuclei 扫描部分 ----------
-
- class AutoNuclei:
- def __init__(self, res_file='res.json', tag_file='C:\\Users\\86135\\nuclei-templates\\TEMPLATES-STATS.json', thread_count=5):
- self.res_file = res_file
- self.tag_file = tag_file
- self.havetag_file = 'havetag.txt'
- self.notag_file = 'notag.txt'
- self.result_dir = 'result'
- self.thread_count = thread_count
-
- self.cms_targets = {} # {cms: [url1, url2]}
- self.nuclei_tags = set()
- self.tagged_targets = {} # {tag: [url1, url2]}
- self.untagged_targets = []
-
- self.task_queue = Queue()
- self.load_res_json()
- self.load_tags()
- self.classify_targets()
- self.save_targets()
- self.start_scan_threads()
-
- def load_res_json(self):
- with open(self.res_file, 'r', encoding='utf-8') as f:
- data = json.load(f)
- for entry in data:
- cms = entry.get("cms", "").lower()
- url = entry.get("url")
- if cms and url:
- self.cms_targets.setdefault(cms, []).append(url)
-
- def load_tags(self):
- with open(self.tag_file, 'r', encoding='utf-8') as f:
- tags_data = json.load(f)
- for item in tags_data.get("tags", []):
- if item["name"]:
- self.nuclei_tags.add(item["name"].lower())
-
- def classify_targets(self):
- for cms, urls in self.cms_targets.items():
- if cms in self.nuclei_tags:
- self.tagged_targets.setdefault(cms, []).extend(urls)
- else:
- self.untagged_targets.extend(urls)
-
- def save_targets(self):
- with open(self.havetag_file, 'w', encoding='utf-8') as f:
- for tag, urls in self.tagged_targets.items():
- for url in urls:
- f.write(f"{tag}||{url}\n")
-
- with open(self.notag_file, 'w', encoding='utf-8') as f:
- for url in self.untagged_targets:
- f.write(url + '\n')
-
- if not os.path.exists(self.result_dir):
- os.makedirs(self.result_dir)
-
- def scan_worker(self):
- while not self.task_queue.empty():
- try:
- tag, url = self.task_queue.get(timeout=1)
- target_file = f"temp_{int(time.time() * 1000)}.txt"
- with open(target_file, 'w', encoding='utf-8') as f:
- f.write(url)
-
- output_file = os.path.join(self.result_dir, f"{tag}_{int(time.time())}.txt")
- cmd = f"F:\\gj\\Vulnerability_Scanning\\nuclei\\nuclei.exe -l {target_file} -tags {tag} -o {output_file} -stats"
- print(f"[+] 扫描任务启动: {url} -> {tag}")
- os.system(cmd)
- os.remove(target_file)
- except Exception as e:
- print(f"[!] 线程错误: {e}")
-
- def start_scan_threads(self):
- for tag, urls in self.tagged_targets.items():
- for url in urls:
- self.task_queue.put((tag, url))
- threads = []
- for _ in range(self.thread_count):
- t = threading.Thread(target=self.scan_worker)
- t.start()
- threads.append(t)
- for t in threads:
- t.join()
- print("[✓] 所有扫描任务完成!")
-
- # ---------- 主程序入口 ----------
-
- def main():
- parser = argparse.ArgumentParser(description="指纹识别 + Nuclei自动化工具")
- group = parser.add_mutually_exclusive_group(required=True)
- group.add_argument("-u", "--url", help="目标 URL")
- group.add_argument("-f", "--file", help="URL列表文件")
- parser.add_argument("--fp-threads", type=int, default=10, help="指纹识别线程数")
- parser.add_argument("--scan-threads", type=int, default=5, help="Nuclei 扫描线程数")
- args = parser.parse_args()
-
- urls = []
- if args.url:
- urls = [args.url]
- elif args.file:
- with open(args.file, 'r', encoding='utf-8') as f:
- urls = [line.strip() for line in f if line.strip()]
-
- print("[*] 正在执行指纹识别...")
- run_fingerprint_scan(urls, threads=args.fp_threads, output='res.json')
-
- print("[*] 指纹识别完成,开始 Nuclei 扫描...")
- AutoNuclei(
- res_file='res.json',
- tag_file=os.path.join(os.environ['USERPROFILE'], 'nuclei-templates', 'TEMPLATES-STATS.json'),
- thread_count=args.scan_threads
- )
-
- if __name__ == '__main__':
- main()
复制代码 更多网安技能的在线实操练习,请点击这里>>
来源:程序园用户自行投稿发布,如果侵权,请联系站长删除
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作! |