Early Growth of a Novice Red Teamer (1): Passive Information Collection

0. Preface

The articles in this series are a compilation of knowledge I picked up earlier.

1. Basics

1.0. A few suggestions

Some websites such as: https: / /  
https : / /  
https: / / https : / / paper. _ _

In addition, it is recommended to have your own cloud server and proxy (or proxy pool).

Things red teamers care about:
    Permissions (websites, servers, databases)
    Data (sensitive data)

1.1. Basics

In real engagements, honeypots are still quite common.

So when you test, do it from a virtual machine of your own whenever possible.

I usually maintain a virtual machine of my own with a clean snapshot, and keep the virtual machine on a solid-state drive.

When an attacker visits a honeypot website, the server captures the requested links,

other access records, browser cookies, and so on, builds a profile of the visitor,

and then traces it back to a specific person.

So my suggestion:

after taking part in a project (say, Project A) and backing up whatever needs to be kept, restore the virtual machine snapshot.

Common language/database pairings:

PHP     --> MySQL: low development difficulty, common at small and medium companies; MySQL is free

JSP     --> Oracle/MySQL: Oracle is paid and relatively expensive, suited to companies with heavy traffic

ASP     --> SQL Server: both come from Microsoft

Client   -- WAF -- load balancer -- real server

1.2. General idea

Start from the information given by the team lead, for example: company XX.

Collection targets: websites, app clients, WeChat mini programs.

Once collection is done, run a round of vulnerability scans; a 0-day or N-day provides the breakthrough, after which privileges need to be escalated.

Collect intranet information and move laterally to other intranet machines. This process involves tunnel establishment and port forwarding.

After taking some intranet machines, maintain access, and then clean up traces to reduce the probability of being discovered.

Logs should be cleaned up and tools deleted. Finally, write the report and submit it. The most important thing for the report to show is

the two things mentioned above: which permissions were obtained and which data was obtained.

2. Information collection

2.1. Brief introduction

In actual testing, the most important thing is to find a breakthrough point. For the vast majority of clients (Party A),

the external network is protected far better than the internal network, so finding a breakthrough point relies heavily on early-stage information collection.

Collect the domain names and IPs related to the target, and run a full port scan on the first batch of IPs obtained (if conditions permit).

Then run a round of scans against the assets obtained so far; of course, there is a high probability that this alone will not get you in.

At that point, collect the information about the target company's employees that is available online, and build a targeted password dictionary for another round of brute forcing.

The probability of success will be much higher. Besides brute forcing, the collected information can of course also be used for phishing.

2.2. Basic requirements for information collection

Comprehensiveness (this is why a full port scan is performed when conditions allow)

Accuracy (do not miss assets, and do not hit targets outside the scope the customer has authorized; at best you lose points, at worst you go to jail)

Timeliness (for example, some assets obtained through fofa are too old;
        handle them carelessly and you run into the accuracy problem in the second item, so clear out invalid information promptly.)

Clarity (be logically clear about the information collected:
        be able to distinguish the logical relationships between the pieces of information and the relative positions of assets,
        and have a clear understanding of the asset logic and business logic of the overall target.)

Topology (when attacking a.com, you find that it redirects to b.com, and b.com is also an asset of the enterprise, so collect information
        on b.com as well; that is, starting from the first batch of assets,
        do a second round of expansion, a third, and so on)

Low detectability (otherwise your IP gets blocked by the other side before you have even started)

Care and patience (the process is rather tedious, but scanning one port fewer out of laziness may mean one breakthrough point fewer)
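The topology and accuracy requirements above can be sketched together as a breadth-first expansion that only follows assets inside the authorized scope. This is a minimal sketch under assumptions, not a real tool: `in_scope`, `expand_assets`, the `discover` callback, and the `.example` domains are all hypothetical names introduced for illustration; in practice `discover` would wrap whatever lookup you actually use.

```python
from typing import Callable, Iterable


def in_scope(domain: str, authorized_roots: Iterable[str]) -> bool:
    """Return True if domain equals, or is a subdomain of, an authorized root."""
    domain = domain.lower().rstrip(".")
    return any(domain == root or domain.endswith("." + root)
               for root in authorized_roots)


def expand_assets(seeds: Iterable[str],
                  discover: Callable[[str], Iterable[str]],
                  authorized_roots: Iterable[str],
                  max_rounds: int = 3):
    """Expand assets round by round (hypothetical helper).

    Round 1 is the seeds, round 2 is whatever the seeds lead to, and so on.
    Out-of-scope assets are recorded in `skipped` but never followed.
    """
    roots = tuple(r.lower().rstrip(".") for r in authorized_roots)
    seen, skipped, rounds = set(), set(), []
    frontier = list(seeds)
    for _ in range(max_rounds):
        current = []
        for asset in frontier:
            if asset in seen or asset in skipped:
                continue  # already handled in an earlier round
            if not in_scope(asset, roots):
                skipped.add(asset)  # note it, but do not touch it
                continue
            seen.add(asset)
            current.append(asset)
        if not current:
            break  # nothing new was found, so stop early
        rounds.append(current)
        frontier = [nxt for asset in current for nxt in discover(asset)]
    return rounds, skipped


# Usage with a fake, in-memory link graph standing in for real lookups.
links = {
    "a.example": ["b.example", "cdn.thirdparty.test"],
    "b.example": ["a.example", "c.b.example"],
}
rounds, skipped = expand_assets(
    ["a.example"], lambda d: links.get(d, []), ["a.example", "b.example"]
)
```

Keeping the scope check inside the loop means an unauthorized asset can never become the starting point of a later round, which is the failure mode the accuracy requirement warns about.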

2.3. What information is collected

The target's real IP: mainly a matter of getting around the CDN,
                            for example historical DNS resolution records (ip138), subdomains, pinging from abroad (multi-location ping), etc.

The target's open ports: each port has its own characteristic vulnerabilities; for example, 3306 tells you MySQL is in use,
                            so when you attempt injection against the website you need to use MySQL statements

The target's operating system: for example, reverse shell commands differ between systems, and with the wrong one no session ever comes back

The target's middleware: different middleware has its own vulnerabilities

The target's protocol (HTTP/HTTPS): the fingerprint of the HTTPS certificate can be used to collect subdomains that use the same certificate

DNS: some third-party websites aggregate a large amount of DNS data,
                            which you can use to look up subdomains

The target website's fingerprint: for example, RCE vulnerabilities affecting a specific CMS

The website itself: error messages, sensitive information in JS, some sensitive interfaces
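The "3306 means MySQL" reasoning above amounts to a lookup from open ports to the service that conventionally listens there. The following is a minimal sketch: the table lists only a few widely used default ports, the names `DEFAULT_PORT_HINTS` and `guess_services` are hypothetical, and since a service can be moved to any port, a match is only a hint to confirm by other means.

```python
# Conventional default ports. Administrators can move services elsewhere,
# so a match here is a hint, not proof.
DEFAULT_PORT_HINTS = {
    21: "FTP",
    22: "SSH",
    80: "HTTP",
    443: "HTTPS",
    1433: "Microsoft SQL Server",
    1521: "Oracle Database",
    3306: "MySQL",
    3389: "RDP",
    6379: "Redis",
}


def guess_services(open_ports):
    """Map each distinct open port to its conventional service, or 'unknown'."""
    return {
        port: DEFAULT_PORT_HINTS.get(port, "unknown")
        for port in sorted(set(open_ports))
    }


hints = guess_services([3306, 80, 8088, 80])
```

Ports that are not in the table come back as "unknown" rather than being dropped, so nothing collected is silently lost.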

2.4. Classification of information collection

  • Passive information collection:

Use information that others have already collected, for example collecting targets through (cyberspace) search engines

  • Active information collection:

Use tools to collect information yourself, for example port scanning with nmap

3. Passive Information Collection

For example, collecting information through fofa.

3.1. Advantages and disadvantages

Advantages:
     1. Highly concealed, not easy to detect
     2. The amount of information collected and the coverage are relatively large

Disadvantages:
    1. The information collected may be out of date or inaccurate
    2. Sensitive or undisclosed information cannot be collected

Application scenarios of passive information collection:
    Passive information collection is very common in real engagements, especially in network security attack-and-defense exercises.
    The penetration workflows we usually see also rely on it heavily:
    for example, the basic facts about the target are obtained through passive information collection,
    then the target's characteristics are judged and its weaknesses analyzed from those facts, and finally the next attack steps are chosen according to the weaknesses.

3.2. Common means of passive information collection

Search engine syntax, taking Google as an example:

site:       search within the specified site (usually used to pin down the target and avoid too many irrelevant results)
filetype:   search for the specified file type (usually used to look for sensitive files such as dat, log, txt, xls, rar, bak, etc.)

Combined example: search for PDF files under the qq target


intitle: search for web pages whose title contains the specified content

inurl: search for web pages whose URL contains the specified content

Combined example: querying for sensitive information leaked by the ALFA plugin under WordPress

            inurl:ALFA_DATA intitle:"index of"

intext: search for web pages whose body text contains the specified content

site: matches on the domain name

inurl: matches on the URL path
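Combining these operators is just string assembly, which can be sketched as a small helper. This is an illustrative sketch only: `build_dork` is a hypothetical name, `example.com` is a placeholder domain that does not appear in the original, and the helper only builds the query string without sending anything.

```python
# Operators covered in the list above.
KNOWN_OPERATORS = ("site", "filetype", "intitle", "inurl", "intext")


def build_dork(*terms, **operators):
    """Join free-text terms and operator:value pairs into one query string.

    Values containing spaces are wrapped in double quotes so the search
    engine treats them as a single phrase.
    """
    parts = list(terms)
    for name, value in operators.items():
        if name not in KNOWN_OPERATORS:
            raise ValueError(f"unknown operator: {name}")
        value = str(value)
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


pdf_query = build_dork(site="example.com", filetype="pdf")
alfa_query = build_dork(inurl="ALFA_DATA", intitle="index of")
```

With these inputs, `alfa_query` reproduces the `inurl:ALFA_DATA intitle:"index of"` example from the text.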

Syntax and basic usage of cyberspace search engines
  • Basic introduction to cyberspace search engines:

A platform for cyberspace mapping and cyberspace security threat analysis.

It regularly collects information about every publicly exposed network device on the Internet.

  • Differences from ordinary search engines:

Search engine: indexes websites

Cyberspace search engine: indexes anything with an open port

  • Taking fofa as an example:

title="abc" searches for abc in the title.

    Example: [websites with Beijing in the title]

header="abc" searches for abc in the HTTP headers.

    Example: [jboss server]

body="abc" searches for abc in the HTML body.

    Example: [The text contains Hacked by]

domain="" searches for websites with the given root domain name.

    Example: [websites with the given root domain name]

host="" searches the URL for the given value; note that the field name to use is host.

    Example: [government websites], [educational websites]

port="443" finds assets on port 443.

    Example: [find assets on port 443]

  • Actual use:

For example, a while ago a vulnerability in UF (Yonyou) software was disclosed; you can use fofa to quickly locate sites on the Internet that run it:

    "UF U8-OA"

A slightly more advanced usage combines parentheses with operators such as && || !=, for example:

    title="powered by" && title!=discuz
    title!="powered by" && body=discuz

    ( body="content=\"WordPress" || (header="X-Pingback" && header="/xmlrpc.php" && body="/wp-includes/") ) && host=""

The == exact-match operator is also available and speeds up the search.

    For example, to find all hosts of a domain, use domain==""
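Queries like the ones above get hard to type correctly once quotes and parentheses nest, so it can help to assemble them programmatically. The sketch below is an assumption-laden illustration: `fofa_term`, `all_of`, `any_of`, and `encode_query` are hypothetical helper names, and `example.com` is a placeholder. The base64 step reflects the common description of the fofa HTTP API as taking the query base64-encoded; check the current API documentation before relying on it. No request is sent.

```python
import base64


def fofa_term(field: str, value: str, op: str = "=") -> str:
    """Build one condition such as title="powered by"."""
    if op not in ("=", "==", "!="):
        raise ValueError(f"unsupported operator: {op}")
    escaped = value.replace('"', '\\"')  # embedded quotes become \"
    return f'{field}{op}"{escaped}"'


def _join(terms, separator):
    terms = [t for t in terms if t]
    if not terms:
        raise ValueError("at least one term is required")
    if len(terms) == 1:
        return terms[0]
    # Parenthesize so nesting && inside || (or the reverse) stays unambiguous.
    return "(" + separator.join(terms) + ")"


def all_of(*terms: str) -> str:
    """Combine conditions with && (every condition must match)."""
    return _join(terms, " && ")


def any_of(*terms: str) -> str:
    """Combine conditions with || (any condition may match)."""
    return _join(terms, " || ")


def encode_query(query: str) -> str:
    """Base64-encode a query string (assumed to be what the API expects)."""
    return base64.b64encode(query.encode("utf-8")).decode("ascii")


query = all_of(
    fofa_term("title", "powered by"),
    fofa_term("title", "discuz", "!="),
)
encoded = encode_query(query)
```

Always quoting values and always parenthesizing groups is slightly more verbose than the hand-written examples, but it avoids precedence surprises when conditions are combined.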


These cyberspace search engines have also spawned related tools, such as fofa_viewer.

The advantage of these tools is that query results can be exported quickly.

3.3. Processing passively collected information

Categorization of information
    Categorize the collected information by asset class

Association of information
    Roughly identify the relationships between the pieces of collected information

Screening of information
    Check the collected information and rule out anything invalid or inaccurate

Highlights of information (very, very important)
    Pick out the key points of the collected information

Topology of information
    Take the key content as the starting point for another round of information collection

Integration and formatting of information
    Consolidate all information and unify the storage format
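The categorization, screening, and unified-format steps above can be sketched as one small pipeline. Everything here is an assumption made for illustration: the record layout (`value` plus an ISO-format `last_seen` date), the 180-day staleness cutoff, the function names, and the sample values (documentation-reserved addresses and domains).

```python
import ipaddress
import json
from datetime import date, timedelta


def classify(value: str) -> str:
    """Sort a raw value into one of three asset classes."""
    if "://" in value:
        return "url"
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        return "domain"


def process(records, today: date, max_age_days: int = 180):
    """Drop stale and duplicate records, then group the rest by asset class."""
    cutoff = today - timedelta(days=max_age_days)
    grouped = {"ip": [], "domain": [], "url": []}
    seen = set()
    for record in records:
        value = record["value"].strip()
        kind = classify(value)
        if kind != "url":
            value = value.lower()  # hostnames and IPs are case-insensitive
        if date.fromisoformat(record["last_seen"]) < cutoff:
            continue  # screening: too old to trust
        if (kind, value) in seen:
            continue  # screening: duplicate of an earlier record
        seen.add((kind, value))
        grouped[kind].append({"value": value, "last_seen": record["last_seen"]})
    return grouped


records = [
    {"value": "WWW.Example.com", "last_seen": "2024-05-01"},
    {"value": "www.example.com", "last_seen": "2024-05-02"},
    {"value": "192.0.2.10", "last_seen": "2020-01-01"},
    {"value": "https://example.com/Login", "last_seen": "2024-04-30"},
]
grouped = process(records, today=date(2024, 6, 1))
unified = json.dumps(grouped, indent=2)  # one storage format for everything
```

URL paths are left untouched because they can be case-sensitive, while hostnames are lowercased so that the same host written two ways is only stored once.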

Newly launched business of the target enterprise may not have a CDN yet and may not have gone through security testing.

    Judge this from the asset itself and from recent company news

Marginal business (business that is not a main source of profit)

Outsourced business, judged from experience:

    a page or sub-site whose code logic or code style is completely different from the main site

3.4. Others

How passive information collection is done in real engagements:

    It is a comprehensive, large-scale information collection effort that is multi-tool, multi-platform, and distributed,

    and the collected information is centrally integrated and then analyzed and processed automatically.

    For some specific targets, we also write our own information collection tools.
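The central integration step described above can be sketched as a merge that remembers which source reported each asset. This is only an illustration under assumptions: the source names, the `.example.com` hosts, and the functions `merge_sources` and `corroborated` are hypothetical, and real collectors would feed it from their own exports.

```python
from collections import defaultdict


def merge_sources(results_by_source):
    """Merge {source_name: iterable of assets} into {asset: sorted sources}."""
    merged = defaultdict(set)
    for source, assets in results_by_source.items():
        for asset in assets:
            asset = asset.strip().lower()
            if asset:  # ignore blank lines from exports
                merged[asset].add(source)
    return {asset: sorted(sources) for asset, sources in sorted(merged.items())}


def corroborated(merged, min_sources: int = 2):
    """Assets reported by at least min_sources independent sources."""
    return [asset for asset, sources in merged.items()
            if len(sources) >= min_sources]


merged = merge_sources({
    "fofa": ["a.example.com", "B.example.com", ""],
    "search_engine": ["b.example.com"],
})
confirmed = corroborated(merged)
```

Tracking provenance costs almost nothing and makes it easy to prioritize assets that several platforms agree on, while single-source assets can be queued for re-checking.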
