Google engineer explains how search results are ranked

xiaoxiao2021-04-09  440

Source: SeoClub  Author: Matt Cutts  Editor: MATOKU

The author of this article, Matt Cutts, is a software engineer in Google's quality group. His work focuses on making sure good websites are assessed well, and he is responsible for developing ways to stop low-quality or spam websites from appearing in Google search results.

One of the questions librarians ask most often is: "Which results should appear at the top of the search list, and how does Google choose?" Here, quality engineer Matt Cutts gives a quick primer that explains how Google crawls and indexes the web and how it ranks search results. Matt also offers suggestions to school librarians on how they can teach students.

Crawling and indexing

A lot has to happen before you see a web page containing Google search results. The first step is to crawl and index the pages of the World Wide Web. This work is done by Googlebot, which connects to web servers around the world to collect documents. The crawler does not really roam the web. Instead, it asks a web server to return a specific page, scans that page for hyperlinks that lead to further pages, and gives each fetched page a number. Crawling collects a huge number of documents, but these documents cannot be searched directly.

Without an index, when you query for something like "Civil War", Google's servers would have to read the full content of every document each time you searched. The second step is therefore to build an index, which requires "inverting" the crawled data. Instead of scanning every word in every document, the data is reorganized so that it lists all the documents that contain a given word. For example, suppose the word "civil" appears in documents 3, 8, 22, 56, 68, and 92, while the word "war" appears in documents 2, 8, 15, 22, 68, and 77.
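The inversion step described above can be sketched in a few lines of Python. This is only a minimal illustration, not Google's implementation: the function name `build_inverted_index`, the whitespace tokenization, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each word to the sorted list of document numbers that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Naive tokenization (an assumption): lowercase and split on whitespace.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(doc_ids) for word, doc_ids in index.items()}

# Hypothetical crawled documents, keyed by the number given to each page.
documents = {
    2: "the war of 1812",
    3: "civil engineering basics",
    8: "the american civil war",
    15: "war and peace",
}

index = build_inverted_index(documents)
print(index["civil"])  # [3, 8]
print(index["war"])    # [2, 8, 15]
```

With the index built once, answering a query becomes a dictionary lookup per word rather than a scan of every document.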

Once the index is built, the documents can be ranked and their relevance determined. Suppose someone goes to Google and types in "Civil War". To present and score the search results, two things need to happen: first, find the pages that contain the user's query; second, order those pages by relevance. Google has developed an interesting technique to speed up the first step: instead of storing the whole index on one computer, it uses hundreds of computers to do the job. Because the task is divided among many computers, the query can be answered more quickly.

To illustrate, imagine a book index that is 30 pages long. If one person has to look things up in that index, every search takes at least a few seconds. But what if each page of the index is given to a different person? Thirty people each searching their own part of the index will be much faster than one person searching all of it. In the same way, Google splits its data across many computers so that matching documents can be found faster.
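The idea of splitting the index can be sketched as follows. This is a toy model under stated assumptions: threads stand in for separate computers, each shard is a small dictionary holding the index for a subset of documents, and the names `search_shard` and `search_all_shards` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def search_shard(shard_index, word):
    """Look up one word in one shard's slice of the index."""
    return shard_index.get(word, [])

def search_all_shards(shards, word):
    """Query every shard concurrently and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = list(
            pool.map(lambda shard: search_shard(shard, word), shards)
        )
    return sorted(doc_id for partial in partial_results for doc_id in partial)

# Three hypothetical shards, each covering a different subset of documents.
shards = [
    {"civil": [3, 56], "war": [2, 15]},
    {"civil": [8, 22], "war": [8, 22]},
    {"civil": [68, 92], "war": [68, 77]},
]

print(search_all_shards(shards, "civil"))  # [3, 8, 22, 56, 68, 92]
```

Each shard only searches its own small slice, which is the same division of labor as handing one page of a book index to each of thirty people.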

How are the pages that contain the user's query found? Let's return to the "Civil War" example above. The word "civil" appears in documents 3, 8, 22, 56, 68, and 92, and the word "war" appears in documents 2, 8, 15, 22, 68, and 77. Writing the document numbers out side by side shows which documents contain both words: as the table below shows, these are documents 8, 22, and 68.

Word         Documents
civil        3, 8, 22, 56, 68, 92
war          2, 8, 15, 22, 68, 77
Both words   8, 22, 68

The list of documents that contain a word is called a "posting list", and looking for the documents that contain both words is called "intersecting the posting lists".
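One common way to find the documents that appear in both lists is to walk down the two sorted lists at the same time. The sketch below uses the document numbers from the example; the function name `intersect_postings` is an assumption ("posting list" is the usual information retrieval term for such a list), and the article does not say this is exactly how Google does it.

```python
def intersect_postings(list_a, list_b):
    """Intersect two sorted lists of document numbers in a single pass."""
    result = []
    i = j = 0
    while i < len(list_a) and j < len(list_b):
        if list_a[i] == list_b[j]:
            # The document contains both words.
            result.append(list_a[i])
            i += 1
            j += 1
        elif list_a[i] < list_b[j]:
            i += 1  # Advance whichever list is behind.
        else:
            j += 1
    return result

civil = [3, 8, 22, 56, 68, 92]
war = [2, 8, 15, 22, 68, 77]
print(intersect_postings(civil, war))  # [8, 22, 68]
```

Because both lists are sorted, each entry is visited at most once, so the cost grows with the combined length of the lists rather than with their product.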

Ranking the search results

Once the pages that contain the user's query have been found, they need to be ranked. Google uses many techniques for this, of which the PageRank algorithm is the best known. PageRank looks at two things: how many links point to a page from other websites, and the quality of the sites that provide those links. Under PageRank, links from websites such as CNN and The New York Times are worth much more than twice as many links from less well-known sites.
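To make the idea concrete, here is a sketch of the textbook formulation of PageRank, computed by repeated iteration. It is not Google's production algorithm: the damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the link graph itself are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank; links maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical graph: page_x has one link from a well-linked site,
# page_y has two links from blogs that nobody links to.
links = {
    "blog_a": ["news_site"], "blog_b": ["news_site"],
    "blog_c": ["news_site"], "blog_d": ["news_site"],
    "news_site": ["page_x"],
    "blog_e": ["page_y"], "blog_f": ["page_y"],
}
ranks = pagerank(links)
print(ranks["page_x"] > ranks["page_y"])  # True
```

In this toy graph, a single link from the well-linked site outweighs twice as many links from obscure pages, which mirrors the CNN and New York Times example.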

Besides PageRank, Google uses many other signals. For example, a document in which "civil" and "war" appear right next to each other is likely to be more relevant than one that only contains "war", such as a page about the Revolutionary War. In addition, a page with "Civil War" in its title is likely to be more relevant than a page titled "19th Century American Clothing". Likewise, if "Civil War" appears several times on a page, that page is more likely to be about the subject.
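The three signals just described (words adjacent to each other, the phrase in the title, and repeated occurrences) can be combined into a toy score. Everything here is an assumption for illustration: the function names, the tokenizer, and especially the weights, which are arbitrary and not taken from Google.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def count_phrase(tokens, phrase_tokens):
    """Count how often the phrase tokens occur next to each other, in order."""
    n = len(phrase_tokens)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase_tokens
    )

def relevance_score(query, title, body):
    """Toy relevance score from three signals; the weights are arbitrary."""
    query_tokens = tokenize(query)
    title_tokens = tokenize(title)
    body_tokens = tokenize(body)

    score = 0.0
    # Signal 1: every query word appears somewhere in the body.
    if all(token in body_tokens for token in query_tokens):
        score += 1.0
    # Signal 2: the words appear side by side, counted once per occurrence.
    score += 2.0 * count_phrase(body_tokens, query_tokens)
    # Signal 3: the phrase appears in the page title.
    if count_phrase(title_tokens, query_tokens) > 0:
        score += 3.0
    return score

page_a = relevance_score(
    "civil war", "The Civil War",
    "The Civil War began in 1861. The Civil War ended in 1865.",
)
page_b = relevance_score(
    "civil war", "19th Century American Clothing",
    "Civil servants dressed plainly during the Revolutionary War.",
)
print(page_a, page_b)  # 8.0 1.0
```

Both pages contain the two words, but only the first has them adjacent, repeated, and in the title, so it scores much higher.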

Google's goal is to find pages that are both popular and relevant. If two pages contain roughly the same amount of information matching a query, Google will usually choose the one that more well-known websites link to. However, if other signals indicate that a page is more relevant, a page with fewer links or a lower PageRank may still be chosen. For example, a page devoted entirely to the Civil War will often be more useful than a page that mentions the Civil War only in passing, even if the latter belongs to a well-known website. Once the list of documents and their scores is ready, the documents with the highest scores are chosen as the best matches.
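The trade-off between popularity and relevance, followed by picking the highest scores, might look like the sketch below. The weighted sum, the `relevance_weight` of 0.7, the score values, and the example URLs are all hypothetical; the article does not say how Google actually combines its signals.

```python
import heapq

def top_results(candidates, k=2, relevance_weight=0.7):
    """Return the k candidates with the best mix of relevance and popularity."""
    def combined(candidate):
        return (relevance_weight * candidate["relevance"]
                + (1.0 - relevance_weight) * candidate["popularity"])
    return heapq.nlargest(k, candidates, key=combined)

# Hypothetical candidates with scores already scaled to the range 0..1.
candidates = [
    {"url": "example.org/civil-war-history", "relevance": 0.9, "popularity": 0.3},
    {"url": "example.com/news/passing-mention", "relevance": 0.3, "popularity": 0.9},
    {"url": "example.net/unrelated", "relevance": 0.1, "popularity": 0.1},
]

for result in top_results(candidates):
    print(result["url"])
# example.org/civil-war-history
# example.com/news/passing-mention
```

With these weights, the dedicated page outranks the passing mention on the more popular site, while two pages with equal relevance would be ordered by popularity.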

Google extracts a few sentences containing the query words from each document to display as a summary, and then shows the ranked URLs and summaries on the results page. As you can imagine, running searches like this takes a lot of computing resources. Each search involves more than 500 computers working together, and it takes less than half a second.
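A very simple way to build such a summary is to keep the first few sentences that mention any query word. This is a naive sketch, not Google's method: the function name `make_snippet`, the sentence-splitting rule, and the sample text are assumptions.

```python
import re

def make_snippet(text, query, max_sentences=2):
    """Return the first few sentences that mention any query word."""
    query_words = set(query.lower().split())
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    matching = [
        sentence for sentence in sentences
        if query_words & set(re.findall(r"[a-z0-9]+", sentence.lower()))
    ]
    return " ".join(matching[:max_sentences])

text = ("Clothing changed a great deal in the 1800s. "
        "The Civil War began in 1861. "
        "Railroads expanded quickly. "
        "The war ended in 1865.")
print(make_snippet(text, "civil war"))
# The Civil War began in 1861. The war ended in 1865.
```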

------------------------------------- Matoku's comments:

Google's power lies in its unique algorithms, and the article above gives an introduction to some of the basic ones. I will look for more articles like this to share with everyone.

When reposting, please cite the original address: https://www.9cbs.com/read-133066.html
