Build your own site search engine

xiaoxiao2021-03-06  66

ccterran (original). Author: iwind

A friend built a website with Dreamweaver. It has no dynamic content, just a personal collection of articles, a personal introduction, and so on. Now that there is more content, he asked me to help him make a search engine. To be honest, this is not a hard problem, so I put one together. I have also seen people on other forums who want to do this, so I want to explain it here, with the focus on understanding the method.

Let's think through the approach before writing the program. Below is my idea. You may have a better one, but note that this is only about the method: traverse all files -> read the content -> search for the keyword and, if it matches, put the file into an array -> read the array.

Before implementing these steps, I assume that your web pages are standard, that is, they have a title (<title></title>) and a body (<body *></body>). If you designed them with Dreamweaver or FrontPage, these exist unless you deliberately deleted them. Let's take it step by step and improve the search engine as we go.

I. Design the search form

Create search.htm in the root of the website with the following content:

    <html>
    <head>
    <title>Search form</title>
    <meta http-equiv="content-type" content="text/html; charset=gb2312">
    </head>
    <body bgcolor="#fff" text="#000000">
    <form name="form1" method="post" action="search.php">
    <table width="100%" cellspacing="0" cellpadding="0">
    <tr>
    <td width="36%"><div align="center"><input type="text" name="keyword"></div></td>
    <td width="64%"><input type="submit" name="submit" value="Search"></td>
    </tr>
    </table>
    </form>
    </body>
    </html>

II. The search program

Next, create search.php in the root directory to process the data passed by the search.htm form. The content is as follows:

    <?php
    // Get the search keyword
    $keyword = trim($_POST["keyword"]);
    // Check whether it is empty
    if ($keyword == "") {
        echo "Please enter the keyword you want to search for";
        exit; // End the program
    }
    ?>

This way, if the keyword entered by the visitor is empty, a prompt is shown. Next comes traversing all files. We can traverse them recursively with the functions opendir and readdir, or with PHP's Directory class.

We now use the former.

    <?php
    // Traverse all files
    function listfiles($dir) {
        $handle = opendir($dir);
        while (false !== ($file = readdir($handle))) {
            if ($file != "." && $file != "..") {
                // If it is a directory, keep searching inside it
                if (is_dir("$dir/$file")) {
                    listfiles("$dir/$file");
                } else {
                    // Handle the file here
                }
            }
        }
        closedir($handle);
    }
    ?>

At the comment "Handle the file here" we can read and process each file that is found. Below, we read the file content and check whether it contains the keyword $keyword. If it does, the file address is added to an array.

    <?php
    // $dir is the directory to search, $keyword is the search keyword,
    // $array is the array that stores the results
    function listfiles($dir, $keyword, &$array) {
        $handle = opendir($dir);
        while (false !== ($file = readdir($handle))) {
            if ($file != "." && $file != "..") {
                if (is_dir("$dir/$file")) {
                    listfiles("$dir/$file", $keyword, $array);
                } else {
                    // Read the file content
                    $data = fread(fopen("$dir/$file", "r"), filesize("$dir/$file"));
                    // Do not search the program itself
                    if ($file != "search.php") {
                        // Does it match?
                        if (eregi("$keyword", $data)) {
                            $array[] = "$dir/$file";
                        }
                    }
                }
            }
        }
        closedir($handle);
    }
    // Define the array
    $array = array();
    // Execute (the keyword is hard-coded as "php" here)
    listfiles(".", "php", $array);
    // Print the search results
    foreach ($array as $value) {
        echo "$value" . "<br>\n";
    }
    ?>

Now combine this with the opening piece of the program, enter a keyword, and you will see that the relevant pages of your website are found. Let's now refine it.

1. List the title

Change

    if (eregi("$keyword", $data)) {
        $array[] = "$dir/$file";
    }

to

    if (eregi("$keyword", $data)) {
        if (eregi("<title>(.+)</title>", $data, $m)) {
            $title = $m["1"];
        } else {
            $title = "No title";
        }
        $array[] = "$dir/$file $title";
    }

The principle is that if <title>XXX</title> is found in the file content, XXX is taken as the title. If it is not found, the title is set to "No title".

2. Search only the body of the web page

A web page contains a lot of HTML code, which is not what we want to search, so it has to be removed. I currently combine a regular expression with strip_tags, which still cannot remove everything.

Change

    $data = fread(fopen("$dir/$file", "r"), filesize("$dir/$file"));
    // Do not search the program itself
    if ($file != "search.php") {
        // Does it match?
        if (eregi("$keyword", $data)) {

to

    $data = fread(fopen("$dir/$file", "r"), filesize("$dir/$file"));
    if (eregi("<body([^>]*)>(.+)</body>", $data, $b)) {
        $body = strip_tags($b["2"]);
    } else {
        $body = strip_tags($data);
    }
    if ($file != "search.php") {
        if (eregi("$keyword", $body)) {

3. Add a link to the title

Change

    foreach ($array as $value) {
        echo "$value" . "<br>\n";
    }

to

    foreach ($array as $value) {
        // Split the entry into path and title
        list($filedir, $title) = split("[ ]", $value, "2");
        // Output
        echo "<a href=$filedir>$title</a>" . "<br>\n";
    }

4. Prevent timeouts

If there are many files, it is necessary to keep PHP from exceeding its execution time limit.
You can add set_time_limit("600"); at the top of the file. The value is in seconds, so this allows 10 minutes.

When reposting, please cite the original address: https://www.9cbs.com/read-111759.html
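A note for readers on current PHP: eregi() and split(), used throughout the article, were deprecated in PHP 5.3 and removed in PHP 7.0, so the listings above will not run on PHP 7 or later as written. What follows is a minimal sketch of the same pipeline (traverse, read, match the body text, collect titles) for newer versions, not the author's code. Several things in it are assumptions made for illustration: the function names search_site and render_results are hypothetical, traversal uses RecursiveDirectoryIterator instead of opendir/readdir, the keyword is matched as a plain case-insensitive substring with stripos() rather than as a regular expression, and results are returned as a path => title array instead of a "path title" string.

```php
<?php
/**
 * Search every file under $root for $keyword (case-insensitive substring).
 * Returns an array mapping each matching file path to its page title.
 * Hypothetical helper for illustration; not part of the original article.
 */
function search_site(string $root, string $keyword, string $self = 'search.php'): array
{
    $results = [];
    if ($keyword === '') {
        return $results; // nothing to search for
    }
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        // Skip anything that is not a regular file, and the search script itself.
        if (!$file->isFile() || $file->getFilename() === $self) {
            continue;
        }
        $data = file_get_contents($file->getPathname());
        if ($data === false) {
            continue; // unreadable file
        }
        // Search only the text inside <body>, falling back to the whole file.
        $body = preg_match('~<body[^>]*>(.*)</body>~is', $data, $b)
            ? strip_tags($b[1])
            : strip_tags($data);
        if (stripos($body, $keyword) === false) {
            continue;
        }
        // Use the <title> text when present, otherwise a placeholder.
        $title = preg_match('~<title>(.*?)</title>~is', $data, $m)
            ? trim($m[1])
            : 'No title';
        $results[$file->getPathname()] = $title;
    }
    return $results;
}

/** Render the results as one escaped link per line. */
function render_results(array $results): string
{
    $html = '';
    foreach ($results as $path => $title) {
        $html .= sprintf(
            "<a href=\"%s\">%s</a><br>\n",
            htmlspecialchars((string) $path),
            htmlspecialchars((string) $title)
        );
    }
    return $html;
}
```

In search.php this could be used as echo render_results(search_site(".", $keyword)); after the empty-keyword check. Escaping the path and title with htmlspecialchars() is a choice made in this sketch so that a title containing markup cannot break the results page. Note that stripos() compares bytes, so pages in a multibyte encoding may need mb_stripos() instead.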