Search engines: "regimes" built on technology

zhaozj2021-02-16  58

A "fragile" advantage

Since last fall, Google has attracted extraordinary attention around the world. Some observers even call it "the hottest company on the planet." Yet when Google's director of technology, Craig Silverstein, is asked whether Google can stay ahead of its competitors by continuing to focus on advanced search technology, his answer is cautious. "It is very easy to switch from one search engine to another," Silverstein said. Google, the front-runner in this technology battle, has invested heavily in researchers and software developers and holds a large stock of state-of-the-art technology, but that technology cannot guarantee permanent success. "We hope the next technological breakthrough will also come from Google, but who can guarantee it?"

Indeed, in the search engine market, innovation is the wild card in the deck. Gartner analyst Whit Andrews said: "In 1999 you would have thought AltaVista had swallowed the entire search market, but in 1997 the ruler of this market was Inktomi, and in 1995 it was Yahoo." You never know when some kid will come along and turn the search empire you have built into yesterday's news.

From a certain point of view, Google is fragile because, unlike AT&T or Microsoft, it holds no infrastructure advantage in its field. AT&T once controlled most of the telephone network, and Microsoft controls the PC operating system; that kind of market dominance helps ensure that a "regime" is not overthrown by competitors. In fact, reports in January of this year indicated that Yahoo may soon end its partnership with Google and switch to its own search technology. Even a core engineer who develops search tools at Apple Computer believes: "If Google had new technology worth boasting about when it was born, that was because nobody else was thinking about search technology."
The awkward fact for Google is that it is currently the most popular search engine because its search algorithm does the best job of surfacing relevant content, which to some extent shows how thin its technological lead is. Many people believe that a number of other search engines already match Google's technical strength. Fierce competition in the search industry will continue, and it will be especially intense among the owners of cutting-edge technology in fields such as natural language understanding and machine learning. Over the next 5 to 10 years, search engines will improve enormously, giving us more relevant, more valuable, and better organized information, and all of these changes depend on this fierce technical competition.

A closer look at Google

Suppose you want information about 18th-century treatments for scurvy. Without a search engine, you would have no way of knowing that this information sits in a file with an odd name (www.jameslindlibrary.org/trial_records/17th_18th_century/lind/lind_kp.html) on a server at the Royal College of Physicians of Edinburgh in Scotland. Yet even if you type "scurvy" into Google, MSN, or Ask Jeeves, you will not be taken straight to this document.

The reason is that these search engines scan thousands of web pages per second and store keywords, phrases, headings, subheadings, links, and other descriptive information in a database. When you enter search keywords, the engine compares them against the index and presents the entries that contain one or more of the keywords, ordered from most to least relevant, and you are then left to comb through this huge list by hand. How to determine the relevance of an indexed item to the search keywords is the core of every search engine, and it is each search company's "secret formula."

Google's sudden rise in 1999 was due mainly to its PageRank algorithm. The algorithm was invented by Google founders Larry Page and Sergey Brin while they were at Stanford, and it exploits the enormous number of links between web pages. Page and Brin realized that, with a large enough index, they could measure the importance of a web page by counting the links that point to it. Of course, this is not a matter of simple counting: they also consider other factors, such as how closely the topic of the linking page relates to that of the linked page, and how much authority and reputation the linking page has.
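
The link-counting idea can be illustrated with a minimal sketch. This is a simplified, textbook-style power iteration, not Google's production algorithm: the damping factor of 0.85, the function name pagerank, and the five-page toy web are all assumptions made for illustration, and the sketch ignores the additional factors (topic similarity, authority of the linking page) mentioned above.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate page importance from links alone.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's values.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of importance.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its importance on, split across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "c" is linked to by three pages, "d" by two.
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "d": ["c"],
    "e": ["d"],
}
ranks = pagerank(toy_web)
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy web, the page with the most incoming links ends up ranked first, and a link from an important page counts for more than a link from an obscure one, which is why this is more than a simple tally.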

This technology has proved very successful, as users' click-through rates show. AltaVista, formerly the giant of search, had ranked pages by how often the search terms appeared in them. According to data from the research firms Media Metrix and Alexa, between July 2000 and 2004 AltaVista fell from 8th to 61st place in the global web traffic rankings, while Google climbed to 4th place. The American Dialect Society also named "google" the most useful word of 2002.

Of course, the PageRank algorithm has flaws. For example, some site owners who want to reach the top of Google's search results unscrupulously create thousands or even millions of junk pages that link to their site, artificially inflating its rank (Google says it has ways to resist this practice, but so far they remain undisclosed). The same weakness also makes "Google bombing" possible. In this recent phenomenon, bloggers deliberately promote a strongly worded or political view so that many other websites quickly link to the target page, which then rises very high when the related keywords are typed into Google's search box. For example, the creators of one "Google bomb" seized on the Iraq war, pushing sensitive claims such as the US military's failure in Iraq in an attempt to catch the Bush administration's attention and indirectly raise the page's rank.

What troubles some experts more is that PageRank buries pages that are perfectly legitimate and closely match what the user is looking for among thousands of search results, simply because few pages link to them. For a particular user, how relevant a web page is to the query does not depend on whether the page is popular.

"Starburst" thinking

"Whoever controls information holds a power that is slight but the most far-reaching of all," said Liesl Capper, head of the search engine Mooter, who believes search engines should return this power to every individual Internet user. For that reason, Mooter's goal is to make web search easier and more personal. Capper grew up in Zambia and studied psychology in South Africa; she emigrated to Australia in 1997 and chose search technology as her career. She set up shop in Sydney and hired an experienced software designer, Jondarr Gibb, along with John Zakos, a Ph.D. student whose thesis studies how neural network theory can be applied to Internet search. Combining psychology, software theory, and neural networks, the three invented an algorithm for rating the relevance of web pages that can learn and understand the needs of a specific user. Before "dumping" links on the user, the Mooter engine uses this algorithm to analyze the potential meanings of the user's keywords and their synonyms, weighs the relevance of the pages found in each context, and sorts the results into clusters. The user first sees a "starburst" text display showing the names of the clusters. For example, a search for Paul Cezanne (a leading Post-Impressionist painter) displays clusters such as Art, Artists, Cezanne, and France, a design that reflects the psychology behind the engine.

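Mooter's actual algorithm is not described in detail here, so the following is only a rough sketch of what grouping results into named clusters might look like as a data structure. The URLs, the topic labels, and the function names build_clusters and starburst_names are all hypothetical; in particular, the sketch assumes each result already carries topic labels, whereas a real engine would have to derive them from the page content and the user's keywords.

```python
from collections import defaultdict

# Hypothetical results for the query "Paul Cezanne": (url, topic labels).
# The labels are assumed to be given; deriving them is the hard part.
RESULTS = [
    ("example.org/cezanne-biography", {"Cezanne", "Artists"}),
    ("example.org/post-impressionism", {"Art", "Artists"}),
    ("example.org/aix-en-provence", {"France", "Cezanne"}),
    ("example.org/still-life-with-apples", {"Art", "Cezanne"}),
]


def build_clusters(results):
    """Group result URLs under every topic label they carry."""
    clusters = defaultdict(list)
    for url, labels in results:
        for label in labels:
            clusters[label].append(url)
    return dict(clusters)


def starburst_names(clusters):
    """Cluster names, largest cluster first; ties broken alphabetically."""
    return sorted(clusters, key=lambda name: (-len(clusters[name]), name))


clusters = build_clusters(RESULTS)
names = starburst_names(clusters)
```

Here names holds the labels a starburst-style display would show, and clusters maps each label to the links revealed when the user picks it.
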
"When you use traditional search engines to retrieve, face millions of links, you will have a conceptual classification in the minds you want to find," Copper said: "But our brain At a certain time, only three to four information can be handled at the same time. This is a cluster to consider this. "The Mooter engine also accurately understands the user's retrieval requirements. For example, the user enters keyword -dog, in many clusters that are subsequently displayed, there may be a cluster name is "Breeds", click on this cluster, and the user will be mainly related to the puppy. website. If the user selects other clusters, Mooter will change the order in the link list according to the user's interest, and other types of websites may be rated above. On the search interface, there is a "Refine" button, click this button, the engine will further reduce the search range, for example, click the "Breeds" cluster and click the "Refine" button, and Mooter searches for keywords "DOG BREEDs". Display a set of new clusters. "Google's search technology is more in focusing on the web architecture, which is not conducive to mining webpage deep value, and the concept of 'cluster' containing specific themes is very similar to the 'community' of the biological industry. Paul Gardi, Vice President of TEOMA. Before the TEOMA engine gives the search results, it determines a series of "communities" related to keywords and find the authoritative site within this "community", and then determine each of the reference frequencies to the web page based on these authority sites. The relevant level of the page. Ask Jeeves is because the original search technology provider is abandoned to adopt a TEOMA engine, which has increased its retrieval to increase by 30% each year in 2002 and 2003. Similarly, in-depth discussion web value is also a new face-Dipsie's goal. 
Unlike the others, Dipsie believes that Google and Teoma index only 1% of all the documents on the Internet. Dipsie's search site is due to launch publicly this summer, with an index expected to reach 10 billion documents, three times the size of Google's current index. So although Google is still the "king" of the search market, many competitors with fresh ideas are eyeing its throne.

Microsoft search

If any software company excels at hiring creative young people and turning their ideas into products that succeed across the whole market, it is Microsoft. Microsoft never passes up a hot market in computing; once it scents a huge market, it does everything it can to swallow it. Microsoft currently holds 97% of the PC operating system market and 90% of the office software market, and search is one of the few fields into which the "Microsoft empire" has not yet expanded. Microsoft therefore regards search technology as the key to driving its next phase of business growth. Its researchers and product developers are now working to integrate web search into the next generation of the Windows operating system, a test version of which is due later this year. In Microsoft's search software, users need only describe their question in plain English to get a direct answer. Microsoft believes users should not have to rack their brains choosing appropriate keywords, should not have to use Boolean operators such as "and," "or," and "not," and should not need to page through search results one by one. Microsoft researcher Eric Brill says a search engine should be able to understand and answer questions posed in the user's natural language. Let's look at AskMSR, a search tool that Bill Gates and his employees have been testing.

In its search box you can enter a question such as "Who killed Abraham Lincoln?", and the result is no longer a list of links to websites that contain the answer but a very concise answer: "John Wilkes Booth." This impressive software does not rely on any advanced artificial intelligence principles; it uses two surprisingly simple tricks. The first is that the program learns syntax from a large database of simple sentences and then rewrites the user's input in several ways to match web content. For example, "Who killed Abraham Lincoln?" can be rewritten as "_ killed Abraham Lincoln" or "Abraham Lincoln was killed by _." These strings are searched in a set order using standard keyword-based web search, and as soon as one of them is matched, the program can show the answer to the user.

In many cases, however, the program will not find a sentence that matches a string exactly. Take "John Wilkes Booth's violent deed at the Ford Theater ended Lincoln's second term before it had started." This sentence fully answers the user's question, but it matches none of the patterns above. AskMSR can handle this situation too, using its second trick. If "Booth" appears many times in the same sentence as "Lincoln," AskMSR concludes that the two are related, and that relationship is the basis for finding the answer, even though the approach does not guarantee 100% accuracy (see the accompanying table). As the number of web pages grows, however, AskMSR's accuracy will increase.

Microsoft is also trying to integrate search fully into the user's everyday computing experience.
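
Returning to AskMSR, its two tricks can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Microsoft's implementation: the function names, the single hand-written "Who <verb> <object>?" rewrite rule (the real system is said to learn its rewrites from a sentence database), the stopword list, and the sample snippets are all hypothetical.

```python
import re
from collections import Counter

# Illustrative stopword list; a real system would use a much larger one.
STOPWORDS = {"who", "the", "a", "an", "at", "in", "by", "was", "it", "s"}


def rewrite_question(question):
    """Trick 1: rewrite 'Who <verb> <object>?' as (left, right) patterns.

    The answer is expected in the gap between left and right.
    """
    match = re.fullmatch(r"who (\w+) (.+?)\??", question.strip(), flags=re.IGNORECASE)
    if match is None:
        return []
    verb, obj = match.groups()
    return [("", f"{verb} {obj}"), (f"{obj} was {verb} by", "")]


def exact_match(patterns, snippets):
    """Return the capitalized words filling the gap of the first matching pattern."""
    for left, right in patterns:
        regex = re.compile(
            rf"{re.escape(left)}\s*((?:[A-Z]\w+\s?){{1,3}}){re.escape(right)}"
        )
        for snippet in snippets:
            found = regex.search(snippet)
            if found:
                return found.group(1).strip()
    return None


def vote_for_answer(question, snippets):
    """Trick 2: pick the capitalized word that most often co-occurs with the question's terms."""
    key_terms = {w.lower() for w in re.findall(r"[A-Za-z]+", question)} - STOPWORDS
    votes = Counter()
    for snippet in snippets:
        words = re.findall(r"[A-Za-z]+", snippet)
        if not key_terms & {w.lower() for w in words}:
            continue  # snippet shares no term with the question
        for word in set(words):
            lowered = word.lower()
            if word[0].isupper() and lowered not in key_terms and lowered not in STOPWORDS:
                votes[word] += 1
    return votes.most_common(1)[0][0] if votes else None


def answer(question, snippets):
    """Try the exact rewrites first, then fall back to co-occurrence voting."""
    exact = exact_match(rewrite_question(question), snippets)
    return exact if exact is not None else vote_for_answer(question, snippets)


# Hypothetical snippets that answer the question without matching any rewrite.
INDIRECT_SNIPPETS = [
    "John Wilkes Booth's violent deed at the Ford Theater ended Lincoln's "
    "second term before it had started.",
    "Booth shot Lincoln at Ford's Theatre in 1865.",
    "Lincoln was assassinated by the actor John Wilkes Booth.",
]
```

With a snippet that matches a rewrite exactly, answer returns the full name found in the gap; with only the indirect snippets, the vote falls back to the single most frequent co-occurring word, which shows both why the second trick works and why it is less precise.
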
Users' attention should be on when and how to use information, not on how the search engine works. To this end, Microsoft information retrieval expert Susan Dumais has developed a program called "Stuff I've Seen." Its interface appears in the Windows toolbar, and after a query is entered in the search box, Stuff I've Seen displays an organized list in a single standard window, with links to all related emails, calendar entries, address book contacts, office documents, and web pages. Stuff I've Seen also has a feature called "Implicit Query": when you read an email, it shows links in a small window pointing to the email addresses and titles of everyone mentioned in the message, as well as to other emails its author has sent you. To make the software easier to use, Dumais also plans to add a "Find me stuff like this" item to the menu that pops up on a right mouse click.

When reposting, please cite the original address: https://www.9cbs.com/read-19760.html
