Automating technical reports for SEO

The views of the author are entirely his own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

As the web becomes more complex, with JavaScript frameworks and front-end libraries on websites, progressive web apps, single-page apps, JSON-LD, and so on, we are seeing an ever-larger surface area for things to go wrong. When all you have is HTML, CSS, and links, there is only so much you can mess up. In today's world of dynamically generated websites with universal JS interfaces, however, there is plenty of room for errors to creep in.

The second problem we face is that it's hard to know when something has gone wrong, or when Google has changed how it handles something. This only gets worse when you take into account situations such as site migrations or redesigns, where you might suddenly archive a lot of old content or restructure your URLs. How can these challenges be met?

The old way

Historically, the way to analyze things like this has been to look at your log files using Excel or, if you're hardcore, Log Parser. Those are great, but they require you to know that you have a problem, or at least that you are looking for one, and to grab a section of logs that contains the answers you need. Not impossible, and we've written about doing this fairly extensively both on our blog and in our log file analysis guide.

The problem with this, however, is quite obvious. It requires you to go looking, rather than alerting you that there is something to look for. With that in mind, I thought I would spend some time investigating whether anything could be done to make the whole process take less time and act as an early warning system.

The first thing we need to do is configure our server to send log files somewhere. My standard solution to this has become log rotation. Depending on your server, you will use different methods to achieve this, but on Nginx it looks like this:

    # $time_iso8601 looks like this: 2016-08-10T14:53:00+01:00
    if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})") {
        set $year $1;
        set $month $2;
        set $day $3;
    }

    access_log /var/log/nginx/$year-$month-$day-access.log;

This lets you view logs for any specific date or set of dates by simply pulling data from the files for that period. With log rotation configured, we can then set up a script, run at midnight using Cron, to pull out the log file holding yesterday's data and analyze it. If you want, you can run it several times a day, or once a week, or at whatever interval best suits your volume of data.
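As a rough illustration of that nightly step, here is a minimal Python sketch that works out which file holds yesterday's data. It assumes the file naming from the Nginx config above; the function names `log_path_for` and `yesterdays_log` are made up for this example.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Optional

# Assumption: logs are written to the directory used in the access_log line above.
LOG_DIR = Path("/var/log/nginx")


def log_path_for(day: date, log_dir: Path = LOG_DIR) -> Path:
    """Return the access log path for one day, e.g. 2016-08-10-access.log."""
    return log_dir / f"{day:%Y-%m-%d}-access.log"


def yesterdays_log(today: Optional[date] = None) -> Path:
    """Return the path of the log file covering the day before `today`."""
    today = today or date.today()
    return log_path_for(today - timedelta(days=1))
```

Saved as a script that calls `yesterdays_log()` and passes the result to your analysis, this could be scheduled with a crontab entry along the lines of `0 0 * * * python3 /path/to/daily_report.py`, where the script path is a placeholder.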

The next question is: what would we want to look for? Well, once we have the logs for the day, this is what I get my system to report on:

30* status codes

Generate a list of all pages hit by users that resulted in a redirect. If the page linking to this resource is on your site, repoint the link to the actual end point. Otherwise, get in touch with whoever is linking to you and get them to fix the link so it goes where it should.
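One way this report could be sketched in Python is shown below. It assumes Nginx's default "combined" log format, and `SITE_HOST` and `redirected_pages` are hypothetical names for this example. It treats 301, 302, 303, 307, and 308 as redirects and deliberately skips 304, which is a cache response rather than a redirect.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Assumption: nginx "combined" format. Only the fields needed here are captured.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)

REDIRECT_CODES = {"301", "302", "303", "307", "308"}  # 304 is not a redirect
SITE_HOST = "www.example.com"  # hypothetical: replace with your own hostname


def redirected_pages(lines, site_host=SITE_HOST):
    """Map each redirected path to the internal and external referrers seen."""
    report = defaultdict(lambda: {"internal": set(), "external": set()})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or match["status"] not in REDIRECT_CODES:
            continue
        entry = report[match["path"]]  # record the path even without a referrer
        referrer = match["referrer"]
        if referrer in ("", "-"):
            continue  # no linking page to chase up
        is_internal = urlparse(referrer).hostname == site_host
        entry["internal" if is_internal else "external"].add(referrer)
    return dict(report)
```

Internal referrers are links you can fix yourself. External ones are the people to contact.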

404 status codes

Similar story. Any 404ing resource should be checked to make sure it is supposed to be missing. Anything that should be there can be investigated to find out why it isn't resolving, and links to anything that is genuinely missing can be treated in the same way as a 301/302 code.

50* status codes

Something bad happened and you’re not going to have a good day if you see a lot of 50 * codes. Your server is dying on requests to specific resources, or maybe your entire site, depending on exactly how bad this is.
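To judge how bad the day is, it can help to see both the overall share of 50* responses and which resources they cluster on. A minimal Python sketch follows, again assuming Nginx's "combined" log format; `server_error_summary` is an illustrative name only.

```python
import re
from collections import Counter

# Assumption: nginx "combined" format. Only path and status are needed here.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def server_error_summary(lines):
    """Return (share of parsed requests that were 50*, Counter of 50* hits per path)."""
    total = 0
    errors = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that don't look like access log entries
        total += 1
        if match["status"].startswith("50"):
            errors[match["path"]] += 1
    rate = sum(errors.values()) / total if total else 0.0
    return rate, errors
```

A high rate spread across many paths suggests a site-wide problem. A handful of paths dominating `errors.most_common()` points to specific resources.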

Crawl budget

A list of all the resources that Google has crawled, how many times each was requested, how many bytes were transferred, and the time taken to resolve those requests. Compare this with your sitemap to find pages that Google won't crawl, or that it is hammering, and fix as needed.
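A sketch of how those numbers might be pulled together in Python is below. It rests on two assumptions worth flagging. First, the log format is Nginx's "combined" format with `$request_time` appended as a final field, which is not the default. Second, Googlebot is identified purely by its user-agent string, which can be spoofed. The function names and the `hammer_threshold` parameter are invented for this example.

```python
import re
from collections import defaultdict

# Assumption: "combined" format plus $request_time appended at the end of each line.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)" '
    r'(?P<seconds>\d+(?:\.\d+)?)'
)


def crawl_budget(lines, bot_token="Googlebot"):
    """Total hits, bytes sent, and request time per path for one crawler."""
    stats = defaultdict(lambda: {"hits": 0, "bytes": 0, "seconds": 0.0})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or bot_token not in match["agent"]:
            continue
        entry = stats[match["path"]]
        entry["hits"] += 1
        entry["bytes"] += 0 if match["bytes"] == "-" else int(match["bytes"])
        entry["seconds"] += float(match["seconds"])
    return dict(stats)


def compare_with_sitemap(stats, sitemap_paths, hammer_threshold=100):
    """Return (sitemap paths never crawled, paths crawled at least the threshold)."""
    uncrawled = sorted(set(sitemap_paths) - set(stats))
    hammered = sorted(p for p, s in stats.items() if s["hits"] >= hammer_threshold)
    return uncrawled, hammered
```

Here `sitemap_paths` is expected to be a list of URL paths taken from your sitemap. Loading and parsing the sitemap itself is left out of the sketch.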

Top / least-requested resources

Similar to the above, but detailing the things most and least requested by search engines.
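As a final illustration, a short Python sketch of this ranking is below. It assumes Nginx's "combined" log format and spots search engines by user-agent substring. The `BOT_TOKENS` list is illustrative rather than exhaustive, and `top_and_least_requested` is a hypothetical name.

```python
import re
from collections import Counter

# Assumption: nginx "combined" format. Only path and user agent are captured.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

BOT_TOKENS = ("Googlebot", "bingbot")  # illustrative, not a complete list


def top_and_least_requested(lines, n=10, bot_tokens=BOT_TOKENS):
    """Return (n most requested, n least requested) paths as (path, hits) pairs."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and any(token in match["agent"] for token in bot_tokens):
            hits[match["path"]] += 1
    ranked = hits.most_common()  # highest hit count first
    if n <= 0:
        return [], []
    return ranked[:n], ranked[-n:][::-1]  # second list is lowest first
```

Note that a page never requested at all will not appear in the logs, so the "least requested" list only covers pages crawled at least once.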