Google Analytics, GA4 and Search Console integration

Modified on Mon, 09 Oct 2023 at 07:40 PM

1. Connecting a Google account

1.1. GA4 integration

2. Settings

2.1. Google Analytics settings

2.2. Google Search Console settings

2.3. Export search queries from GSC

2.4. Proxy settings for accessing Google services

3. Extracting URLs

3.1. URL extraction use cases

3.2. Google API limits

4. Google Analytics and Search Console parameters

5. Results of the data analysis

5.1. The ‘Dashboard’ panel

5.2. Data in the ‘Overview’ tab in a sidebar

5.3. Issue Reports


Netpeak Spider can connect to the Google Analytics (GA) and Google Search Console (GSC) APIs and show their data in the program interface. This lets you see sessions, goals, Ecommerce, impressions, and other metrics for landing pages while performing a technical audit of a website.

To start working with GA and GSC in Netpeak Spider, you need to connect a Google account.

1. Connecting a Google account

To add a Google account, open the ‘Google Analytics & Search Console’ tab of the program settings.

Once you click the ‘Add new Google account’ button, a new tab opens automatically in your browser. Select the necessary account and allow Netpeak Spider to view data from Google Analytics and Search Console.

Add new Google account to Spider

Note that you need to grant access to both services. If one of the options is not selected, the account will not be added and you will see the following notification:


access to both services


Once access is granted, the program adds your account and remembers it in the settings. If necessary, you can add several accounts and use them simultaneously. The list of saved accounts is displayed in the corresponding field:


Save Google account

1.1. GA4 Integration

Google Analytics 4 (GA4) is an updated version of GA that replaces Universal Analytics and was officially released in the fall of 2020. Google has announced that the old version of Analytics will stop working on July 1, 2023. Historical data will be available for some time (tentatively until the end of 2023), but the old version will no longer collect new data. Google has published an article describing the main features of Google Analytics 4.


We have added GA4 integration to Netpeak Spider as well. Collecting the data you need takes a few simple steps:

  • select the desired Google account and resource in the settings;
  • specify the period for which you need to download data;
  • select the type of devices from which the data came (if necessary).


In general, setting up GA4 is almost the same as setting up the integration with Universal Analytics. We have kept the old version of Analytics for now so that you can download historical data and supplement your reports with it.

2. Settings

GA and GSC settings are not bound to Netpeak Spider projects, so opening another project will not affect them.

2.1. Google Analytics settings

All Google Analytics accounts connected to Netpeak Spider are displayed in the ‘Google Analytics’ subsection. For each account you can choose:

  • a property
  • a view for the selected property
  • a segment for the selected view


If the chosen view contains goals, all of them will appear:

  • in the ‘Goal names from view’ textbox


Goal names from view


  • inside the ‘Specific goals’ section in the ‘Parameters’ tab in a sidebar

Specific goals

The ‘Date Range’ drop-down menu will allow you to select the time period for exporting data from the service. The menu contains the following ranges:

  • Current month
  • Current year
  • Last 7 days
  • Last 30 days
  • Last 90 days
  • Last 365 days
  • Previous week
  • Previous month
  • Custom – choose this option to set your own date range.


Date range

Each service receives data with a certain delay, which the program takes into account automatically:

  • Google Analytics – 1 day
  • Search Console – 3 days

For instance, if you choose the ‘Last 7 days’ range, the data will be received as follows:
from [current date – 7 days – delay]
to [current date – delay]
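The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula, not the program's actual code: the function name `effective_range` and the `SERVICE_DELAY_DAYS` mapping are hypothetical, and whether the boundary dates are treated as inclusive is not stated in this article.

```python
from datetime import date, timedelta

# Delays stated in this article, in days.
SERVICE_DELAY_DAYS = {"GA": 1, "GSC": 3}


def effective_range(service: str, days: int, today: date) -> tuple[date, date]:
    """Return (start, end) for a 'Last N days' range shifted by the service delay."""
    delay = timedelta(days=SERVICE_DELAY_DAYS[service])
    end = today - delay                           # current date - delay
    start = today - timedelta(days=days) - delay  # current date - N days - delay
    return start, end


start, end = effective_range("GSC", 7, date(2023, 10, 9))
print(start, end)  # 2023-09-29 2023-10-06
```

With the same current date, the ‘Last 7 days’ range for GA would be shifted by one day instead of three.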


The ‘Devices’ filter limits the analytics to a particular device type. For instance, you can view clicks on URLs from mobile or from desktop devices only.


The filter contains the following options:

  • All
  • Desktop
  • Mobile
  • Tablet 
  • Mobile & tablet


If you need to compare analytics for different devices, export the data for each device separately, one after another, because the program can’t show data for several device types at once in a single project. For example, to compare GA (GSC) data for desktop and mobile devices:

  1. Choose the ‘Mobile’ option in the ‘Devices’ filter.
  2. Extract the URLs using the ‘List of URLs’ menu.
  3. Select the necessary GA (GSC) parameters and click the ‘Start’ button to start crawling and fetching data.
  4. Export the ‘All results’ report.
  5. Repeat these steps with the ‘Desktop’ option selected in the ‘Devices’ filter.
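Once both reports are exported, they can be put side by side outside the program. The sketch below assumes that each export has already been loaded into a simple URL-to-clicks mapping; the function name `compare_devices` and the sample values are hypothetical, and the actual column names in the ‘All results’ export may differ.

```python
def compare_devices(mobile: dict[str, int], desktop: dict[str, int]) -> list[tuple[str, int, int]]:
    """Return (url, mobile_value, desktop_value) rows for every URL seen in either export."""
    urls = sorted(set(mobile) | set(desktop))
    # A URL missing from one export is treated as having a value of 0 there.
    return [(url, mobile.get(url, 0), desktop.get(url, 0)) for url in urls]


# Hypothetical values, as if read from two exported reports.
mobile_clicks = {"https://example.com/": 40, "https://example.com/blog": 5}
desktop_clicks = {"https://example.com/": 25, "https://example.com/pricing": 9}

for row in compare_devices(mobile_clicks, desktop_clicks):
    print(row)
```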

2.2. Google Search Console settings

In the Google Search Console settings, you can choose a website from the ‘Site’ drop-down menu. For each site you can select:

  • a country
  • a date range
  • a device (the same way as in the GA settings)


Additionally, you can set the queries for data filtering. All available filtering modes are presented in the table:
 

Mode                      Comment
Equals, Not equals        You can specify only one search query.
Contains, Not contains    You can specify several comma-separated queries (for example, to exclude all branded searches for your site).


Here is an example of using the ‘Not contains’ mode:


example of using the Not contains
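To make the difference between the modes more concrete, here is a minimal Python sketch of how such filters could behave. It is not Netpeak Spider's implementation: the function names are hypothetical, and the sketch assumes case-insensitive matching and that ‘Contains’ matches a query containing any of the listed terms.

```python
def parse_terms(raw_filter: str) -> list[str]:
    """Split a comma-separated filter string into cleaned, lower-cased terms."""
    return [term.strip().lower() for term in raw_filter.split(",") if term.strip()]


def matches(query: str, mode: str, raw_filter: str) -> bool:
    """Check a search query against one of the four filtering modes."""
    q = query.lower()
    if mode == "Equals":        # single query only
        return q == raw_filter.strip().lower()
    if mode == "Not equals":    # single query only
        return q != raw_filter.strip().lower()
    terms = parse_terms(raw_filter)
    if mode == "Contains":      # assumption: any listed term may match
        return any(term in q for term in terms)
    if mode == "Not contains":  # none of the listed terms may appear
        return not any(term in q for term in terms)
    raise ValueError(f"Unknown mode: {mode}")


queries = ["netpeak spider download", "seo crawler", "netpeak pricing", "site audit tool"]
# Exclude branded searches with the 'Not contains' mode.
non_branded = [q for q in queries if matches(q, "Not contains", "netpeak, spider")]
print(non_branded)  # ['seo crawler', 'site audit tool']
```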


2.3. Export search queries from GSC

In Netpeak Spider 3.6, we implemented the ability to export search queries from Google Search Console (available in the Pro plan or during the trial period).

Export search queries from GSC

To export search queries, tick the ‘Enable the export of queries from Search Console’ option. The ‘GSC: queries’ parameter then becomes available on the ‘Parameters’ tab. This parameter shows the number of queries for a particular page; click the number to open a report containing these queries. You can also get this list via the ‘GSC: queries’ option in the ‘Database’ menu.

You can get summary data on queries for pages (queries, clicks, impressions, CTR, average position) and also break the data down by the ‘Device’ and ‘Country’ parameters. To do this, activate the corresponding options in this section of the settings.

Enable the export of queries

A few points to remember here:

  1. To start exporting queries for the URLs from the main table, use the ‘Export queries from Search Console’ option in the ‘Analysis’ menu. Note that queries will be collected only for crawled URLs.

  2. Each time you export queries, Netpeak Spider collects all queries that brought visits to the website during the specified period, but displays only the queries for URLs that are present in the main table. Thus, exporting even a small number of queries may take a long time for websites with a lot of organic traffic.

  3. Search query analytics in Netpeak Spider lets you work with more page data than the Search Console web interface because Netpeak Spider does not have the 1000-URL restriction.


2.4. Proxy settings for accessing Google services

In case you need to use a proxy when accessing Google services, follow these steps:

  1. Tick the ‘Use proxy for Google services’ checkbox on the ‘Google Analytics & Search Console’ tab of the program settings.
  2. Tick the ‘Use list of HTTP proxies’ checkbox on the ‘Proxy’ tab of the program settings and configure proxies. 

If several proxies are added, the program will use the first proxy from the list.

To find out how to work with proxies in Netpeak Spider, read the article ‘’.

3. Extracting URLs

3.1. URL extraction use cases

Extracting URLs from GA and GSC will help:

  1. Find URLs that used to get traffic (export from GA) or impressions (export from GSC) but are now unavailable because they return a status code other than ‘200 OK’.
  2. Find URLs that used to get traffic or impressions but are now orphaned because no internal links on the crawled website point to them.

When extracting URLs from GSC, the program checks whether the ‘All results’ table contains URLs with the domain specified in the Google Search Console settings. If there are no such pages in the table, extraction will not start.


To find orphan URLs, follow these instructions:

  1. Go to ‘Settings → Advanced’ and disable all options for considering crawling and indexing instructions so that the program crawls all pages on the website.
  2. Make sure that no limits are set in the settings (check the ‘General’, ‘Rules’ and ‘Restrictions’ tabs).
  3. Start crawling.
  4. When crawling is finished, extract URLs from GA and GSC using the ‘List of URLs’ menu:

GA and GSC in the List of URLs

If there are URLs in the services that used to get impressions or traffic but were not found during the crawling due to the absence of links to these pages, they will be added to the end of the ‘All Results’ table. 
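Conceptually, this comes down to a set difference between the URLs known to GA and GSC and the URLs found by crawling. The Python sketch below only illustrates that idea; the function name `find_orphan_candidates` and the sample URLs are hypothetical, and the program's own matching rules may be stricter.

```python
def find_orphan_candidates(crawled_urls: list[str], service_urls: list[str]) -> list[str]:
    """Return URLs reported by GA/GSC that were not discovered during crawling."""
    crawled = set(crawled_urls)
    # dict.fromkeys removes duplicates while keeping the original order.
    return [url for url in dict.fromkeys(service_urls) if url not in crawled]


crawled = ["https://example.com/", "https://example.com/blog"]
from_services = [
    "https://example.com/blog",
    "https://example.com/old-landing",
    "https://example.com/old-landing",
]
print(find_orphan_candidates(crawled, from_services))
# ['https://example.com/old-landing']
```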


Once the pages are extracted, enable the necessary parameters for analysis and click the ‘Start’ button to start crawling the list of imported pages. To learn more about GA and GSC parameters, read the section ‘Google Analytics and Search Console parameters’. You can filter these URLs using the ‘Depth’ parameter: each of them has a depth of ‘0’ because it was analyzed as part of a list of pages.

3.2. Google API limits

Netpeak Spider can extract large amounts of data but Google API has some speed limits. You can check the approximate time required to download a particular number of URLs in this table:
 

 

Number of URLs    Downloading time, minutes
                  GA      GSC
100 000           3       3
200 000           5       5
500 000           18      40
2 000 000         30      unknown


In addition to speed limits, keep in mind that:


  1. Google Analytics doesn’t provide information about the page protocol, so if reports contain the same URLs with both HTTP and HTTPS protocols, they will get the same values of Google Analytics parameters.
  2. Google Search Console limits the amount of stored data, so it is impossible to get data about all URLs via the API when you export data for a large site. However, if you create resources in Search Console for particular folders or subdomains of the site, you can narrow the selection and get more data about the URLs in these folders or subdomains.
  3. Because of the GSC API specifications, the number of clicks and impressions for URLs is often smaller than the number displayed in the GSC interface.
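The first point can be illustrated with a short Python sketch: if metrics are stored under a key without the protocol, the HTTP and HTTPS variants of a URL inevitably resolve to the same values. The function name `protocol_free_key` and the sample data are hypothetical; this is not how the program stores data internally.

```python
from urllib.parse import urlsplit


def protocol_free_key(url: str) -> str:
    """Build a lookup key from host, path and query string, ignoring the protocol."""
    parts = urlsplit(url)
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.netloc}{parts.path}{query}"


# Hypothetical GA data keyed without a protocol.
ga_sessions = {"example.com/pricing": 120}

for url in ("http://example.com/pricing", "https://example.com/pricing"):
    print(url, ga_sessions.get(protocol_free_key(url), 0))  # both lines show 120
```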

4. Google Analytics and Search Console parameters

Netpeak Spider has the following Google Analytics and Search Console parameters:


Google Analytics:

1. Sessions:

  • Users
  • Sessions
  • Bounce Rate
  • Avg. Session Duration

2. All Goals:

  • Goal Completions
  • Goal Conversion Rate

3. Specific Goals – includes parameters from Google Analytics that are set for all goals created under a view (profile). The parameters show the efficiency of certain landing pages in terms of specific goal completions on a website.

4. Ecommerce:

  • Transactions
  • Transaction Revenue


Google Search Console:

  1. Clicks
  2. Impressions
  3. CTR
  4. Average position

Learn more about parameters analyzed by Netpeak Spider in the article  .


Google Analytics and Search Console parameters are fetched once crawling is finished or suspended. After crawling is finished, you can also start fetching manually via the ‘Analysis’ module, for example, if the corresponding parameters were disabled before crawling started or if you need to analyze them separately from other parameters.

Get Google Analytics and Search Console data

When fetching data from Google Analytics, the program uses a built-in algorithm for avoiding sampling to get the most detailed information for the selected date range. However, the algorithm might not work if a large date range is set, many parameters are selected, and the site gets a high volume of traffic.
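This article does not describe how the built-in algorithm works. One common way to reduce sampling in general is to split a long date range into shorter sub-ranges and request each of them separately. The Python sketch below shows only that general technique, under the assumption that both boundary dates are inclusive; the function name `split_range` is hypothetical.

```python
from datetime import date, timedelta


def split_range(start: date, end: date, chunk_days: int) -> list[tuple[date, date]]:
    """Split an inclusive date range into consecutive chunks of at most chunk_days days."""
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    chunks = []
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=chunk_days - 1), end)
        chunks.append((current, chunk_end))
        current = chunk_end + timedelta(days=1)
    return chunks


for chunk_start, chunk_end in split_range(date(2023, 9, 1), date(2023, 9, 30), 7):
    print(chunk_start, chunk_end)
```

Keep in mind that only additive metrics, such as sessions, can be summed across sub-ranges; user counts and rates cannot simply be added up.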

5. Results of the data analysis

Once the parameters are fetched from GA and / or GSC, you can view them:

  • in the ‘All results’ report
  • on the ‘Dashboard’ panel
  • in the ‘Overview’ tab in a sidebar
  • in the issue reports from the ‘Reports’ tab in a sidebar
  • in the ‘Express audit of the optimization quality (PDF)’ report

5.1. The ‘Dashboard’ panel

Netpeak Spider displays three pie charts based on data from GA and GSC on the ‘Dashboard’ tab in the main table and in the ‘Express audit of the optimization quality’ report:

1. GA: traffic  – statistics on URL traffic and compliance.

traffic from

2. GSC: clicks – statistics on clicks to URLs and their compliance for a chosen period.

clicks from GSC

3. GSC: impressions – statistics on URL impressions and their compliance for a chosen period.

impressions from GSC

5.2. Data in the ‘Overview’ tab in a sidebar

The ‘Overview’ tab in a sidebar contains three groups of pages:

  1. GA: Get Traffic
  2. GSC: Get Clicks
  3. GSC: Have Impressions

There are two subgroups included in each group:

  • TRUE – URLs with the corresponding parameter > 0
  • FALSE – URLs with the corresponding parameter = 0


You can also see the following information next to each group:

  • The number of pages included in the group.
  • The share of pages in the group relative to all crawled pages, shown as a percentage.
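The grouping can be sketched in Python as follows. This is only an illustration: the function name `overview_groups` and the sample values are hypothetical, and the sketch assumes that the percentage is the share of the group's pages among all crawled pages.

```python
def overview_groups(values: dict[str, int]) -> dict[str, tuple[int, float]]:
    """Split URLs into TRUE (value > 0) and FALSE (value = 0) and return (count, percent)."""
    total = len(values)
    true_count = sum(1 for value in values.values() if value > 0)
    false_count = total - true_count

    def percent(count: int) -> float:
        # Guard against an empty table to avoid division by zero.
        return round(100 * count / total, 1) if total else 0.0

    return {
        "TRUE": (true_count, percent(true_count)),
        "FALSE": (false_count, percent(false_count)),
    }


# Hypothetical clicks per crawled URL for the 'GSC: Get Clicks' group.
clicks = {"/": 12, "/blog": 0, "/pricing": 3, "/about": 0, "/contacts": 7}
print(overview_groups(clicks))  # {'TRUE': (3, 60.0), 'FALSE': (2, 40.0)}
```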

5.3. Issue Reports

Netpeak Spider generates 6 issue reports based on collected data from GA and GSC:

  • GA: Non-Compliant Pages with Traffic
  • GA: Max Bounce Rate
  • GA: Compliant Pages w/o Traffic
  • GSC: Non-Compliant Pages with Impressions
  • GSC: Compliant Pages w/o Impressions
  • GSC: Compliant Pages w/o Clicks


Learn more about issue reports generated by Netpeak Spider in the article ‘’.
