The ‘Source code and HTTP headers analysis’ tool shows exactly how Netpeak Spider analyzes the text when calculating the number of words or characters on a page. It also helps you understand why the data in Netpeak Spider sometimes differs from what you see when you visit the site in your browser. Learn more here → ‘Why do data in Netpeak Spider and browser differ?’.
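To make the idea of counting words and characters in text with no HTML elements more concrete, here is a minimal Python sketch. It is not Netpeak Spider's actual algorithm: the class `TextExtractor`, the function `count_text`, and the choice to skip `script` and `style` contents are assumptions made for illustration, so real counts in the tool may differ.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}  # assumption: treated as non-visible

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join text nodes with a space, then collapse whitespace runs.
        # Simplification: a word split by inline tags becomes two words.
        return " ".join(" ".join(self._chunks).split())


def count_text(html):
    """Return (raw_text, word_count, character_count) for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    raw_text = parser.text()
    return raw_text, len(raw_text.split()), len(raw_text)
```

Calling `count_text("<h1>Hello</h1><p>two  words</p>")` under these assumptions gives the raw text `Hello two words`, 3 words and 15 characters. A browser may show different text if scripts change the page after loading, which is one reason counts can diverge.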
1. Starting the tool
You can open the tool in several ways:
1.1. Using the context menu (or a hotkey). Select the URL you need in the results table and press the ‘Ctrl+U’ hotkey or choose the ‘Source code and HTTP headers analysis’ option in the context menu.
You will see the tool window with detailed data on the HTTP headers of the request and the server response, information about the page, its source code, and the raw text of the page with no HTML elements.
1.2. Starting via the control panel. Go to ‘Run(Tools) → Source code and HTTP headers analysis’ on the control panel to open the tool.
After starting the tool, enter the URL you need into the corresponding field and click the ‘Start’ button.
Netpeak Spider also collects all URLs that you entered previously and shows them as hints.
2. Working with results
The table below lists examples of the field names you can see on the left side, along with their descriptions. The type and number of these fields may vary depending on the checked page, so we cover only the most common ones.
Field name | Description |
General | |
Page Type | A type of requested page (HTML, JSON, Image, etc.) |
Request URL | URL of the requested page. |
Request Method | Request method used to access the selected page (e.g. GET). |
Status Code | Status code returned by the requested page. |
Response Time | Time (in milliseconds) before receiving the first byte from the server. |
Content Download Time | Time (in milliseconds) the server takes to return the HTML code of the page. |
Proxy Server | IP address and port of the proxy from which the request was sent, if a proxy is set in the program settings. Otherwise, this field will contain the ‘(Not Set)’ value. |
Remote Address | IP address and port of the server on which the requested page is located. |
HTTP response headers | |
Date | Response generation date. |
Content-Type | Type of page content. |
Content-Encoding | Content encoding method used on the requested page. |
Connection | Management options for the current connection. |
Vary | Tells caches which request headers to compare when deciding whether a cached response can be used instead of requesting a new response from the origin server. |
Set-Cookie | Cookie data. Used to send cookies from the server to the User Agent. Value format: name=value. |
HTTP request headers | |
User-Agent | The current User Agent that was used when requesting the specified page. You can change the User Agent in the program settings. |
Accept | List of content types (media types) the client accepts. |
Accept-Encoding | List of content encodings (for example, compression methods) the client accepts. |
Accept-Charset | List of character encodings the client accepts. |
Host | The domain name (and, optionally, the port) of the server on which the requested page is located. |
Cache-Control | Directives for managing caching. |
Pragma | A field that is implementation-dependent and may have different values throughout the request-response chain. Used for backward compatibility with HTTP/1.0 caches, where the HTTP/1.1 Cache-Control header is not yet present. |
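To show where fields such as Status Code, Content-Type, or Set-Cookie come from, here is a minimal Python sketch that splits the head of a raw HTTP response into its status line and headers. The function name `parse_response_head` and the sample response are hypothetical; this is only an illustration and not how Netpeak Spider reads responses.

```python
import io
from http.client import parse_headers


def parse_response_head(raw):
    """Split a raw HTTP response head into (version, status_code, reason, headers)."""
    stream = io.BytesIO(raw)
    # The first line is the status line, e.g. "HTTP/1.1 200 OK".
    status_line = stream.readline().decode("iso-8859-1").strip()
    parts = status_line.split(" ", 2)
    version, status_code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    # The remaining lines, up to the first blank line, are "Name: value" headers.
    headers = parse_headers(stream)
    return version, status_code, reason, headers


# Made-up sample response head, used only for illustration.
SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Encoding: gzip\r\n"
    b"Vary: Accept-Encoding\r\n"
    b"Set-Cookie: session=abc123; Path=/\r\n"
    b"\r\n"
)

version, status_code, reason, headers = parse_response_head(SAMPLE)
```

Header lookups on the returned object are case-insensitive, so `headers["content-type"]` and `headers["Content-Type"]` return the same value. Because a response can carry several Set-Cookie headers, `headers.get_all("Set-Cookie")` is the safer way to read them.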
In the left part of the window, you can also see the list of GET parameters if the URL of the page contains any. For example, if the URL of the page is https://www.example.com/products?sort=popularity&os=windows, you will see the following information:
Query string parameters | |
sort | popularity |
os | windows |
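The same name and value pairs can be reproduced outside the tool. Here is a minimal Python sketch using the standard library; the helper name `query_parameters` is hypothetical and introduced only for this example.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameters(url):
    """Return the GET parameters of a URL as a list of (name, value) pairs."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps parameters such as "?debug=" in the result.
    return parse_qsl(query, keep_blank_values=True)


params = query_parameters(
    "https://www.example.com/products?sort=popularity&os=windows"
)
# params == [("sort", "popularity"), ("os", "windows")]
```

A list of pairs is used instead of a dictionary so that repeated parameter names and the original order are preserved.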
Please note that this information is displayed only for pages that return a 2xx status code. The source code can be displayed only for the following types of pages:
- HTML;
- PlainText (e.g. TXT files);
- JavaScript;
- CSS;
- XML;
- GZIP → Netpeak Spider can unpack the archive and show its contents.
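For the GZIP case, unpacking can be sketched in Python as follows. This is an illustration only: the helper names `is_gzip` and `read_gzip_text`, and the assumption that the archive holds UTF-8 text, are not taken from Netpeak Spider.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def is_gzip(data):
    """Check whether the bytes start with the gzip signature."""
    return data[:2] == GZIP_MAGIC


def read_gzip_text(data, encoding="utf-8"):
    """Decompress gzip bytes and decode them as text (UTF-8 is an assumption)."""
    return gzip.decompress(data).decode(encoding)


# Round trip on made-up content, e.g. what a compressed sitemap might hold.
packed = gzip.compress(b"<urlset></urlset>")
unpacked = read_gzip_text(packed) if is_gzip(packed) else None
```

Here `unpacked` ends up holding the original `<urlset></urlset>` string, which is the kind of content that would then be shown as source code.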
3. Results export
Use the ‘Export’ button to export the HTTP header data (left panel) and the ‘Save source code’ button to save the source code of the page (right panel).