Application of Automated Classification in Search Engine Performance Optimization

xiaoxiao  2021-03-06

Automatic classification: assigning the objects under examination to predefined categories according to a classification scheme.

Automatic clustering: grouping objects that are similar, close, or share characteristics, based on the intrinsic features of the objects themselves.

Information retrieval modes:

- Category browsing: based on a website classification directory; the objects browsed are websites. Drawbacks: high cost, and updating and maintenance are burdensome.

- Keyword retrieval: the objects searched are web pages; the volume of information is large, updates are timely, and no manual intervention is needed. Drawback: with so much information, quality is difficult to guarantee.

=> Provide category browsing over the result set of a keyword search.

Text Categorization:

- Based on knowledge engineering: relies on hand-written inference rules derived from linguistic knowledge. Drawback: the rules are difficult to write.

- Based on statistics: weights the text using word-frequency information (simple and accurate). Vector space model: the similarity of two documents is measured by the angle between their two vectors.
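The vector space model above can be illustrated with a minimal sketch. It assumes whitespace tokenization and raw word counts as weights; the function names and sample sentences are hypothetical and not from the original notes.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Map each lowercase, whitespace-separated token to its count."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


doc1 = term_frequencies("search engine ranks web pages")
doc2 = term_frequencies("search engine indexes web pages")
doc3 = term_frequencies("football match tonight")

print(cosine_similarity(doc1, doc2))  # four of five terms shared
print(cosine_similarity(doc1, doc3))  # no shared terms
```

A smaller angle gives a cosine closer to 1, so documents sharing most of their vocabulary score high and documents with no common terms score 0.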

Automatic classification steps:

Web page feature extraction and weighting: improves classification speed and accuracy (by excluding interference); weights are based on word frequency and position.
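As a rough sketch of weighting by word frequency and position: the stop-word list, the title boost of 3.0, and the function name below are all illustrative assumptions, not values given in the notes.

```python
from collections import Counter

# Assumed, illustrative stop-word list used to exclude interfering terms.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}
# Assumed: one occurrence in the title counts as much as three in the body.
TITLE_BOOST = 3.0


def weighted_features(title, body, top_n=5):
    """Return the top_n terms of a page with frequency/position weights."""
    weights = Counter()
    for token in title.lower().split():
        if token not in STOP_WORDS:
            weights[token] += TITLE_BOOST
    for token in body.lower().split():
        if token not in STOP_WORDS:
            weights[token] += 1.0
    # Keeping only the heaviest terms shortens the vector, which speeds up
    # later similarity computations.
    return dict(weights.most_common(top_n))


features = weighted_features(
    "Python clustering",
    "clustering of web pages in python and the web",
)
print(features)
```

Terms that appear in the title as well as the body end up with the largest weights, while stop words never enter the feature set.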

Machine learning:

SVM: Based on the structural risk minimization principle of statistical learning theory; finds a hyperplane in a high-dimensional space that separates the two classes with the minimum classification error rate (that is, the maximum margin between the classes).

K-nearest neighbors (KNN): For a given new web page, take the K texts in the training set nearest to that page, and determine the new page's category from the categories of those K texts. The value of K is generally tuned during learning.

Bayesian algorithm
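The K-nearest-neighbor procedure described above can be sketched as follows. The sketch assumes cosine similarity over word counts as the nearness measure and a simple majority vote; the training data and names are hypothetical.

```python
import math
from collections import Counter


def tf(text):
    """Word-count vector of a text (whitespace tokenization assumed)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_classify(new_doc, training_set, k=3):
    """Label new_doc by majority vote among its k most similar training texts.

    training_set is a list of (vector, label) pairs.
    """
    ranked = sorted(
        training_set,
        key=lambda item: cosine(new_doc, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


training = [
    (tf("stock market shares price"), "finance"),
    (tf("bank interest rate market"), "finance"),
    (tf("football match goal score"), "sports"),
    (tf("tennis match player score"), "sports"),
]

print(knn_classify(tf("market price rate"), training, k=3))
print(knn_classify(tf("match score goal"), training, k=3))
```

In practice K would be chosen by trying several values on held-out pages, which matches the note that K is tuned during learning.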

Automatic clustering implementation steps:

Web page representation

Similarity calculation

Clustering

Output the clusters

Basic methods for automatic clustering:

Single-pass clustering method: set a similarity threshold; take one document as the first cluster center; for each new text, compute its similarity to the existing cluster centers; if the similarity is within the threshold, add the text to that cluster and adjust the cluster center; otherwise make the text a new cluster center.
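A minimal sketch of this single-pass procedure, assuming dense numeric vectors, cosine similarity, "within the threshold" meaning similarity at or above it, and a running mean as the adjusted center (all of these are assumptions; the notes do not fix them):

```python
import math


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def single_pass(vectors, threshold=0.8):
    """Cluster vectors in one pass; returns lists of member indices."""
    centers = []  # running mean vector of each cluster
    members = []  # indices of the vectors assigned to each cluster
    for i, vec in enumerate(vectors):
        # Find the most similar existing cluster center, if any.
        best, best_sim = None, -1.0
        for c, center in enumerate(centers):
            sim = cosine(vec, center)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            members[best].append(i)
            n = len(members[best])
            # Adjust the center: incremental update of the mean.
            centers[best] = [
                (m * (n - 1) + x) / n for m, x in zip(centers[best], vec)
            ]
        else:
            # No center is similar enough: start a new cluster here.
            centers.append(list(vec))
            members.append([i])
    return members


pages = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(single_pass(pages, threshold=0.8))
```

The result depends on the order in which the texts arrive and on the threshold, which is the usual trade-off of a one-pass method.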

Inverse center clustering method: take an arbitrary vector as the first cluster center; the non-center vector whose minimum distance to the existing centers is largest becomes the next cluster center; once the centers are chosen, determine the clusters.
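The selection rule above reads like what is often called max-min distance (farthest-first) center selection; treating it that way is an assumption. The sketch below also assumes Euclidean distance, the first vector as the arbitrary starting center, and a fixed number of centers k as the stopping rule, none of which the notes specify.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two dense vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def max_min_centers(vectors, k):
    """Pick k center indices by the largest-minimum-distance rule."""
    centers = [0]  # arbitrary first center: the first vector
    while len(centers) < min(k, len(vectors)):
        best_idx, best_dist = None, -1.0
        for i, vec in enumerate(vectors):
            if i in centers:
                continue
            # Distance from this vector to its closest existing center.
            d = min(euclidean(vec, vectors[c]) for c in centers)
            if d > best_dist:
                best_idx, best_dist = i, d
        centers.append(best_idx)
    return centers


def assign(vectors, centers):
    """Assign every vector to its nearest center; returns center -> members."""
    clusters = {c: [] for c in centers}
    for i, vec in enumerate(vectors):
        nearest = min(centers, key=lambda c: euclidean(vec, vectors[c]))
        clusters[nearest].append(i)
    return clusters


points = [[0, 0], [0, 1], [10, 10], [10, 11], [0, 10]]
centers = max_min_centers(points, k=3)
print(centers)
print(assign(points, centers))
```

Because each new center is the point farthest from all current centers, the chosen centers tend to be spread across the data rather than bunched together.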

Density test method: a web page that has many other web pages around it, within a relatively large range, can serve as a cluster center. Web pages are divided into untested pages, clustered pages, and loose pages. Initially, all web pages are untested. Each untested page is tested and, depending on the test conditions, becomes a clustered page or a loose page, until no untested pages remain.

Application examples of automatic classification:

WWLIB automatic classification system

Grouper automatic clustering system

VIVISIMO automatic clustering system

Application-related issues:

Traditional (library) classification vs. web classification => combine the two

Application time

Application object

Outcome

When reposting, please cite the original address: https://www.9cbs.com/read-74320.html
