Writing a document search page with ASP

xiaoxiao2021-03-06  105

Query language

To search for any word or phrase in the Web site, type the words or phrases into the query form and click the button that runs the query (for example, the "Execute Query" button in the sample query form). This section covers the following topics:

Boolean and proximity operators: how to make queries more precise by inserting Boolean and proximity operators.

Wildcards: how to find pages containing words similar to a given word.

Free-text queries: how to express a query as a phrase or question rather than as exact words.

Vector space queries: how to get results that match a list of words and phrases.

Property value queries: how to query the property values of files.

Query examples: examples of the various query types.

Property name list: the properties that can be used in queries, with descriptions.

A search produces a list of files that contain the word or phrase, no matter where it appears in the file. The following rules apply to queries:

Consecutive words are treated as a phrase; they must appear in the same order in a matching document.

Queries are not case-sensitive, so you can type a query in uppercase or lowercase.

You can search for any word except those in the exception list (for English, this includes a, an, and, as, and other common words). Words in the exception list are ignored in a query.

Words in the exception list are treated as placeholders in phrase and proximity queries. For example, if you search for "Word for Windows", the results include both "Word for Windows" and "Word and Windows", because "for" is a noise word that appears in the exception list.

Punctuation marks such as the period (.), colon (:), semicolon (;), and comma (,) are ignored during a search.

To use specially treated characters such as &, |, ^, #, @, $, (, and ) in a query, enclose the query in quotation marks (").

To search for a word or phrase that contains quotation marks, enclose the entire phrase in quotation marks and double the quotation marks around the quoted word or words. For example, "World-Wide Web or ""Web""" searches for World-Wide Web or "Web".

You can insert Boolean operators (AND, OR, and NOT) and the proximity operator (NEAR) to specify additional search information.

The wildcard character (*) matches words with a given prefix. The query esc* matches "ESC", "escape", and so on.

Free-text queries can be specified without regard to query syntax.

You can specify vector space queries.

You can run property value queries on ActiveX (OLE) and file properties.
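The quoting rule above (enclose the phrase in quotation marks and double any quotation marks inside it) is easy to get wrong when a search page builds the query from user input. The following is a minimal Python sketch of that rule only; the function name quote_phrase is an illustration and is not part of Index Server or ASP.

```python
def quote_phrase(phrase: str) -> str:
    """Wrap a phrase in quotation marks, doubling any embedded ones."""
    return '"' + phrase.replace('"', '""') + '"'


# The example from the rules above: search for World-Wide Web or "Web".
print(quote_phrase('World-Wide Web or "Web"'))
# prints "World-Wide Web or ""Web"""
```

A search page could apply such a helper to the raw form input before handing the string to the query engine, so that operator keywords and special characters typed by the user are treated as plain text.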

Boolean and proximity operators

Boolean and proximity operators let you create more precise queries.

To search for: Both terms on the same page
Example: access and basic -or- access & basic
Result: Pages containing the words "access" and "basic"

To search for: Either term on a page
Example: cgi or isapi -or- cgi | isapi
Result: Pages containing the word "cgi" or the word "isapi"

To search for: The first term without the second term
Example: access and not basic -or- access & ! basic
Result: Pages containing the word "access" but not the word "basic"

To search for: Pages that do not match a property value
Example: not @size = 100 -or- ! @size = 100
Result: Pages that are not 100 bytes in size

To search for: Both terms on the same page, close to each other
Example: excel near project -or- excel ~ project
Result: Pages containing the word "excel" near the word "project"

Tips:

You can add parentheses to a query expression; the parts in parentheses are evaluated before the rest of the query. Use double quotation marks (") to indicate that a Boolean or NEAR operator keyword should be ignored in the query. For example, "Abbott and Costello" matches pages containing the phrase, not pages that match the Boolean expression. Besides being an operator, the word "and" is also a noise word in English.

The NEAR operator is similar to the AND operator in that it also returns pages containing both words. NEAR differs from AND in that it takes into account how close the words are: a page in which the search words are closer together is ranked greater than or equal to a page in which they are farther apart. If the search words are more than 50 words apart, the page is assigned a rank of zero.

In content queries, the NOT operator can be used only after the AND operator; it only excludes pages that match the preceding content restriction. In property value queries, the NOT operator can be used without the AND operator.

The AND operator has higher precedence than the OR operator. For example, the first three queries below are equivalent, but the fourth is not:

a and b or c

c or a and b

c or (a and b)

(c or a) and b
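To make the precedence rule concrete, here is a small Python sketch that adds explicit parentheses to a query according to "AND binds tighter than OR". It is a toy recursive-descent parser written for illustration, assuming only single-word terms and the keywords "and" and "or"; it is not how Index Server parses queries, and the function name parenthesize is hypothetical.

```python
import re


def parenthesize(query: str) -> str:
    """Return the query with parentheses that show the evaluation order."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def peek():
        return tokens[pos].lower() if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        # OR has the lowest precedence, so it combines whole AND groups.
        left = parse_and()
        while peek() == "or":
            take()
            left = f"({left} or {parse_and()})"
        return left

    def parse_and():
        # AND binds tighter, so it combines individual terms first.
        left = parse_term()
        while peek() == "and":
            take()
            left = f"({left} and {parse_term()})"
        return left

    def parse_term():
        token = take()
        if token == "(":
            inner = parse_or()
            take()  # consume the closing parenthesis
            return inner
        return token

    return parse_or()


for q in ["a and b or c", "c or a and b", "c or (a and b)", "(c or a) and b"]:
    print(q, "->", parenthesize(q))
```

In the first three queries the sketch groups "a and b" before applying OR; only the fourth, where the parentheses force "c or a" to be evaluated first, produces a different grouping.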

Note that the symbols (&, |, !, ~) and the English keywords AND, OR, NOT, and NEAR work the same way in all languages supported by Index Server. If the browser is set to one of the following six languages, the localized keywords can also be used.

Language: Keywords

German: UND, ODER, NICHT, NAH

French: ET, OU, SANS, PRES

Spanish: Y, O, NO, CERCA

Dutch: EN, OF, NIET, NABIJ

Swedish: OCH, ELLER, INTE, NÄRA

Italian: E, O, NO, VICINO

Note that the NEAR operator can be applied only to words or phrases.

Wildcards

Wildcard operators help you find pages that contain words similar to a given word.

To search for: Words with the same prefix
Example: comput*
Result: Pages containing words that begin with "comput", such as "computer", "computing", and so on

To search for: Words based on the same stem
Example: fly**
Result: Pages containing words based on the same stem as "fly", such as "flying", "flown", "flew", and so on

Free text query

In a free-text query, the query engine finds the pages that best match the words and phrases in the query. Matching is based on the meaning of the query rather than on its exact wording. Boolean, proximity, and wildcard operators are ignored in free-text queries. Free-text queries are prefixed with $contents.

To search for: Files that match free text
Example: $contents How do I print in Microsoft Excel?
Result: Pages that refer to printing and Microsoft Excel

Vector space query

The query engine supports vector space queries. A vector query returns pages that match a list of words and phrases; the rank of each page indicates how well the page matches the query.

To search for: Pages that contain specific words
Example: light, bulb
Result: Files whose words best match the words searched for

To search for: Pages that contain weighted prefixes, words, and phrases
Example: invent*, light[50], bulb[10], "light bulb"[400]
Result: Files containing words prefixed with "invent", the words "light" and "bulb", and the phrase "light bulb" (the terms are weighted)

The components of a vector query are separated by commas.

Components of a vector query can be weighted with the [weight] syntax.

Pages returned by a vector query do not have to match every term in the query.

Vector queries work best when the results are sorted by rank.
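As an illustration of the comma-separated, optionally weighted syntax described above, the following Python sketch assembles a vector query string from a list of terms. The helper name vector_query and the convention of passing None for an unweighted term are assumptions made for this example only.

```python
def vector_query(components):
    """Join (term, weight) pairs into a comma-separated vector query."""
    parts = []
    for term, weight in components:
        # Phrases (terms containing a space) are wrapped in quotation marks.
        text = f'"{term}"' if " " in term else term
        # A weight of None means the component is left unweighted.
        parts.append(text if weight is None else f"{text}[{weight}]")
    return ", ".join(parts)


query = vector_query([
    ("invent*", None),
    ("light", 50),
    ("bulb", 10),
    ("light bulb", 400),
])
print(query)  # invent*, light[50], bulb[10], "light bulb"[400]
```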

Property value queries

With property value queries, you can find files whose property values match given criteria. The properties that can be queried include basic file information (such as file name and file size) and ActiveX properties (including the document summary), which are created by ActiveX applications and stored in the file. There are two types of property queries:

Relational property queries consist of the "at" character (@), a property name, a relational operator, and a property value. For example, to find all files larger than one megabyte, run the query @size > 1000000.

Regular-expression property queries consist of the number sign (#), a property name, and a regular expression for the property value. For example, to find all video (.avi) files, run the query #filename *.avi. Regular expressions cannot be matched against the special properties contents (#contents) and all (#all). Properties that cannot be retrieved in a query, including HTML META properties that are not stored in the property cache, cannot be used in # queries.

This section contains the following topics:

Property names

Relational operators

Property values

Property names

Property names are preceded by either the "at" character (@) or the number sign (#): use @ for relational queries and # for regular-expression queries.

If no property name is specified, @contents is assumed.

Properties available for all files include:

Property name: Description

All: Matches words, phrases, and any property

Contents: Words and phrases in the file

Filename: Name of the file

Size: File size

Write: Last time the file was modified

ActiveX property values can also be used in queries. Web sites that contain files created by most ActiveX applications can be queried for the following properties:

Property name: Description

DocTitle: Title of the document

DocSubject: Subject of the document

DocAuthor: Author of the document

DocKeywords: Keywords for the document

DocComments: Comments about the document

For a complete list of property names, see the property name list later in this section.

Relational operators

Relational operators are used in relational property queries.

To search for: Property values in relation to a fixed value
Example: @size < 100, @size <= 100, @size = 100, @size != 100, @size >= 100, @size > 100
Result: Files whose size matches the query

To search for: Property values with all of a set of bits on
Example: @attrib ^a 0x820
Result: Compressed files with the archive attribute on

To search for: Property values with some of a set of bits on
Example: @attrib ^s 0x20
Result: Files with the archive attribute on
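The two bitmask operators can be read as "all of these bits are on" (^a) and "at least one of these bits is on" (^s). The Python sketch below shows that reading with ordinary bitwise arithmetic. It assumes the Win32 attribute values FILE_ATTRIBUTE_ARCHIVE (0x20) and FILE_ATTRIBUTE_COMPRESSED (0x800), which together give the 0x820 mask used in the example; the function names are illustrative only.

```python
FILE_ATTRIBUTE_ARCHIVE = 0x20      # Win32 archive attribute
FILE_ATTRIBUTE_COMPRESSED = 0x800  # Win32 compressed attribute


def all_bits_on(value: int, mask: int) -> bool:
    """Reading of '@attrib ^a mask': every bit in the mask is set."""
    return (value & mask) == mask


def some_bits_on(value: int, mask: int) -> bool:
    """Reading of '@attrib ^s mask': at least one bit in the mask is set."""
    return (value & mask) != 0


archived_only = FILE_ATTRIBUTE_ARCHIVE
archived_and_compressed = FILE_ATTRIBUTE_ARCHIVE | FILE_ATTRIBUTE_COMPRESSED

print(all_bits_on(archived_and_compressed, 0x820))  # True
print(all_bits_on(archived_only, 0x820))            # False
print(some_bits_on(archived_only, 0x20))            # True
```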

Property values

To search for: A specific value
Example: @DocAuthor = Bill Barnes
Result: Files authored by "Bill Barnes"

To search for: Values beginning with a prefix
Example: #DocAuthor George*
Result: Files whose author name begins with "George"

To search for: Files with any of a set of extensions
Example: #filename *.|(exe|,dll|,sys|)
Result: Files with an .exe, .dll, or .sys extension

To search for: Files modified after a certain date
Example: @write > 96/2/14 10:00:00
Result: Files modified after 10:00 GMT on February 14, 1996

To search for: Files modified after a relative date
Example: @write > -1d2h
Result: Files modified in the last 26 hours

To search for: Vectors that match a vector
Example: @vectorprop = { 10, 15, 20 }
Result: ActiveX documents whose vectorprop value is { 10, 15, 20 }

To search for: Vectors in which every value matches a criterion
Example: @vectorprop >^a 15
Result: ActiveX documents whose vectorprop value has all values greater than 15

To search for: Vectors in which at least one value matches a criterion
Example: @vectorprop =^s 15
Result: ActiveX documents whose vectorprop value has at least one value equal to 15

When using a regular-expression query, put the number sign (#) before the property name; otherwise use the "at" character (@). The equals (=) relational operator is assumed for regular-expression queries.

The file name (#filename) is the only property that supports regular expressions with wildcards to the left of the text.

Date and time values take the form yyyy/mm/dd hh:mm:ss or yyyy-mm-dd hh:mm:ss. The first two digits of the year and the entire time can be omitted. If the first two digits of the year are omitted, years less than or equal to 29 are interpreted as being in the twenty-first century, and years greater than or equal to 30 as being in the twentieth century. All dates and times are in Greenwich Mean Time (GMT).

Dates and times relative to the current time can be expressed as a minus sign (-) followed by one or more pairs of an integer and a time unit. The time units are (y) year, (m) month, (w) week, (d) day, (h) hour, (n) minute, and (s) second. Optionally, you can specify a three-digit millisecond value after the time, for example 1997/12/8 10:10:03:452.

Currency values take the form x.y, where x is the whole amount and y is the fractional amount. No currency unit is assumed.

Boolean values are t or true for TRUE, and f or false for FALSE.

Vectors (VT_VECTOR) are written as an opening brace ({), followed by a comma-separated list of values, followed by a closing brace (}).

Single-value expressions are compared against vectors by using a relational operator followed by ^a (for all of) or ^s (for some of).

Numeric values can be decimal or hexadecimal (preceded by 0x).

The contents property does not support relational operators. If a relational operator is specified, no results are found. For example, @contents Microsoft finds documents containing Microsoft, but @contents = Microsoft finds nothing.
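The two-digit year rule and the relative time syntax described above can be sketched in a few lines of Python. This is an illustration of the rules as stated, not Index Server code: the function names are hypothetical, and treating a year as 365 days and a month as 30 days is an assumption of the sketch, since the text does not say how those units are measured.

```python
import re
from datetime import datetime, timedelta

# Unit lengths in seconds. The values for "y" and "m" are assumptions
# made for this sketch (365-day year, 30-day month).
UNIT_SECONDS = {
    "y": 365 * 86400, "m": 30 * 86400, "w": 7 * 86400,
    "d": 86400, "h": 3600, "n": 60, "s": 1,
}


def expand_year(year: int) -> int:
    """Expand a two-digit year: 0-29 -> 2000-2029, 30-99 -> 1930-1999."""
    if year >= 100:
        return year  # already a four-digit year
    return 2000 + year if year <= 29 else 1900 + year


def relative_offset(expr: str) -> timedelta:
    """Convert a relative expression such as '-1d2h' into a timedelta."""
    if not re.fullmatch(r"-(\d+[ymwdhns])+", expr):
        raise ValueError(f"not a relative time expression: {expr!r}")
    pairs = re.findall(r"(\d+)([ymwdhns])", expr)
    return timedelta(seconds=sum(int(n) * UNIT_SECONDS[u] for n, u in pairs))


print(expand_year(96))           # 1996
print(relative_offset("-1d2h"))  # 1 day, 2:00:00

# Cut-off for "@write > -1d2h" if the current time were 10:00 on 1996-02-14.
now = datetime(1996, 2, 14, 10, 0, 0)
print(now - relative_offset("-1d2h"))  # 1996-02-13 08:00:00
```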

Regular expressions

Regular expressions in property queries are defined as follows:

Any character other than the asterisk (*), period (.), question mark (?), and vertical bar (|) matches itself by default.

A regular expression can be enclosed in quotation marks ("), and must be enclosed in quotation marks if it contains a space ( ) or a closing parenthesis ()).

The characters *, ., and ? behave as they do in Windows: the asterisk matches any number of characters, the period matches a period (.) or the end of the string, and the question mark matches any single character.

The character | is an escape character. After |, the following characters have a special meaning:

( Opens a group. Must be followed by a matching ).

) Closes a group. Must be preceded by a matching (.

[ Opens a character class. Must be followed by a matching (unescaped) ].

{ Opens a counted match. Must be followed by a matching }.

} Closes a counted match. Must be preceded by a matching {.

, Separates OR clauses.

* Matches zero or more occurrences of the preceding expression.

? Matches zero or one occurrence of the preceding expression.

+ Matches one or more occurrences of the preceding expression.

Any other character, including |, matches itself.

Between square brackets ([]), the following characters have a special meaning:

^ Matches everything except the classes that follow. Must be the first character.

] Matches ]. It can be preceded only by ^; otherwise it closes the class.

- Range operator. It is preceded and followed by normal characters.

Any other character matches itself (or begins or ends a range).

Between braces ({}), the following syntax applies:

|{m|} Matches exactly m occurrences of the preceding expression. (0

|{m,n|} Matches between m and n occurrences of the preceding expression, inclusive. (0

To match *, ., and ?, enclose them in square brackets (for example, |[*]sample matches "*sample").

Query examples

Example: @size > 1000000
Result: Pages larger than one megabyte

Example: @write > 95/12/23
Result: Pages modified after the given date

Example: apple tree
Result: Pages containing the phrase "apple tree"

Example: "apple tree"
Result: Same as above

Example: @contents apple tree
Result: Same as above

Example: Microsoft and @size > 1000000
Result: Pages that contain the word "Microsoft" and are larger than one megabyte

Example: "Microsoft and @size > 1000000"
Result: Pages containing the specified phrase (not the same as above)

Example: #filename *.avi
Result: Video files (the # prefix is used because the query contains a regular expression)

Example: @attrib ^s 32
Result: Pages with the archive attribute on

Example: @docauthor = John Smith
Result: Pages written by the given author

Example: $contents why is the sky blue?
Result: Pages that match the query

Example: @size < 100 & #filename *.gif
Result: GIF files smaller than 100 bytes

Property name list

The following properties are available in queries. Additional properties may be available, depending on the Web server.

Each entry gives the friendly name, the data type in parentheses, and a description of the property.

A_HRef (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML HREF. This property name was created for Microsoft Site Server and corresponds to the Index Server property name HtmlHRef. Can be queried but not retrieved.

Access (VT_FILETIME): Last time the file was accessed.

All (not applicable): Searches every property for a string. Can be queried but not retrieved.

AllocSize (DBTYPE_I8): Size of the disk allocation for the file.

Attrib (DBTYPE_UI4): File attributes, as documented in the Win32 SDK.

ClassId (DBTYPE_GUID): Class ID of the object, for example WordPerfect, Word, and so on.

Characterization (DBTYPE_WSTR | DBTYPE_BYREF): Description or abstract of the document, computed by Index Server.

Contents (not applicable): Main contents of the document. Can be queried but not retrieved.

Create (VT_FILETIME): Time the file was created.

Directory (DBTYPE_WSTR | DBTYPE_BYREF): Physical path to the file, not including the file name.

DocAppName (DBTYPE_WSTR | DBTYPE_BYREF): Name of the application that created the file.

DocAuthor (DBTYPE_WSTR | DBTYPE_BYREF): Author of the document.

DocByteCount (DBTYPE_I4): Number of bytes in the document.

DocCategory (DBTYPE_STR | DBTYPE_BYREF): Category of the document, such as memo, plan, or note.

DocCharCount (DBTYPE_I4): Number of characters in the document.

DocComments (DBTYPE_WSTR | DBTYPE_BYREF): Comments about the document.

DocCompany (DBTYPE_STR | DBTYPE_BYREF): Name of the company for which the document was written.

DocCreatedTm (VT_FILETIME): Time the document was created.

DocEditTime (VT_FILETIME): Total time spent editing the document.

DocHiddenCount (DBTYPE_I4): Number of hidden slides in a Microsoft PowerPoint document.

DocKeywords (DBTYPE_WSTR | DBTYPE_BYREF): Keywords for the document.

DocLastAuthor (DBTYPE_WSTR | DBTYPE_BYREF): User who most recently edited the document.

DocLastPrinted (VT_FILETIME): Time the document was last printed.

DocLastSavedTm (VT_FILETIME): Time the document was last saved.

DocLineCount (DBTYPE_I4): Number of lines in the document.

DocManager (DBTYPE_STR | DBTYPE_BYREF): Name of the manager of the document's author.

DocNoteCount (DBTYPE_I4): Number of pages with notes in a PowerPoint document.

DocPageCount (DBTYPE_I4): Number of pages in the document.

DocParaCount (DBTYPE_I4): Number of paragraphs in the document.

DocPartTitles (DBTYPE_STR | DBTYPE_VECTOR): Names of the document parts. For example, in Excel the part titles are the names of the worksheets; in PowerPoint, the slide titles; in Word for Windows, the names of the subdocuments in the master document.

DocPresentationTarget (DBTYPE_STR | DBTYPE_BYREF): Target format of a PowerPoint presentation (35mm, printer, video, and so on).

DocRevNumber (DBTYPE_WSTR | DBTYPE_BYREF): Current version number of the document.

DocSlideCount (DBTYPE_I4): Number of slides in a PowerPoint document.

DocSubject (DBTYPE_WSTR | DBTYPE_BYREF): Subject of the document.

DocTemplate (DBTYPE_WSTR | DBTYPE_BYREF): Name of the document template.

DocTitle (DBTYPE_WSTR | DBTYPE_BYREF): Title of the document.

DocWordCount (DBTYPE_I4): Number of words in the document.

FileIndex (DBTYPE_I8): Unique ID of the file.

FileName (DBTYPE_WSTR | DBTYPE_BYREF): Name of the file.

HitCount (DBTYPE_I4): Number of hits (words matching the query) in the file.

HtmlHRef (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML HREF. Can be queried but not retrieved.

HtmlHeading1 (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML document in H1 style. Can be queried but not retrieved.

HtmlHeading2 (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML document in H2 style. Can be queried but not retrieved.

HtmlHeading3 (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML document in H3 style. Can be queried but not retrieved.

HtmlHeading4 (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML document in H4 style. Can be queried but not retrieved.

HtmlHeading5 (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML document in H5 style. Can be queried but not retrieved.

HtmlHeading6 (DBTYPE_WSTR | DBTYPE_BYREF): Text of the HTML document in H6 style. Can be queried but not retrieved.

Img_Alt (DBTYPE_WSTR | DBTYPE_BYREF): Alternate text for IMG tags. Can be queried but not retrieved.

Path (DBTYPE_WSTR | DBTYPE_BYREF): Full physical path to the file, including the file name.

Rank (DBTYPE_I4): Rank of the row, from 0 to 1000. Larger numbers indicate better matches.

RankVector (DBTYPE_I4 | DBTYPE_VECTOR): Ranks of the individual components of a vector query.

ShortFileName (DBTYPE_WSTR | DBTYPE_BYREF): Short (8.3) file name.

Size (DBTYPE_I8): Size of the file, in bytes.

USN (DBTYPE_I8): Update sequence number. NTFS drives only.

VPath (DBTYPE_WSTR | DBTYPE_BYREF): Full virtual path to the file, including the file name. If more than one path is possible, the one that best matches the query is chosen.

WorkId (DBTYPE_I4): Internal ID of the file, used by Index Server.

Write (VT_FILETIME): Last time the file was written.

Defining new property names

To define properties that are not in the previous list, you must list them in the [Names] section of the .idq file. To use properties defined in the .idq file in a restriction, in a sort specification, or as returned values, use the following format:

[Names]
# Properties not in the standard list
friendlyname (datatype) = GUID ["name" | propid]

In this syntax, "name" is the property name ("SALES" in the example below), and propid is the hexadecimal property ID. Note that the property name must be enclosed in quotation marks, whereas the property ID is not.

For example, suppose you want to define an HTML META tag as a property name so that people can search on it. The property to define is Sales.

To define the Sales property

Add the following line under [Names] in the .idq file:

Metadescription (DBTYPE_WSTR) = D1B5D3F0-C0B3-11CF-9A92-00A0C908DBF1 "SALES"

The GUID comes from the MetaTagClsid parameter in the registry, which is found under:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\HtmlFilter\MetaTagClsid

Then, in each HTML file that you want to tag, define the META tag.

For example, suppose you want to search for all files that contain sales projections:

In File1.htm:

In File2.htm:

In File3.htm:

Note: Make sure that the META NAME tag is added near the start of the file, next to the HTML tag.

You can now search for all files about sales projections by submitting the following query:

@Metadescription Projections

This query returns all files that contain the word "projections" in the CONTENT field of the META tag. In this example, File1.htm and File2.htm are returned. However, to search for sales in 1997, for example, submit the following query:

@metadescription 1997

File3.htm is returned.
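The effect of the two META queries can be imitated outside Index Server with a short Python sketch that reads the CONTENT of a named META tag and looks for a word in it. Everything specific here is an assumption made for illustration: the page contents are invented (the original markup of File1.htm, File2.htm, and File3.htm is not reproduced in this article), the tag name "Sales" follows the property defined above, and simple word matching stands in for the real indexing and query engine.

```python
from html.parser import HTMLParser


class MetaContent(HTMLParser):
    """Collect the CONTENT values of META tags with a given NAME."""

    def __init__(self, name: str):
        super().__init__()
        self.name = name.lower()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == self.name:
                self.values.append(attributes.get("content") or "")


def files_matching(pages: dict, meta_name: str, word: str) -> list:
    """Return the names of pages whose META content contains the word."""
    hits = []
    for filename, html in pages.items():
        parser = MetaContent(meta_name)
        parser.feed(html)
        parser.close()
        words = " ".join(parser.values).lower().split()
        if word.lower() in words:
            hits.append(filename)
    return hits


# Hypothetical page contents, invented for this sketch.
PAGES = {
    "File1.htm": '<html><head><meta name="Sales" content="Projections for the next fiscal year"></head></html>',
    "File2.htm": '<html><head><meta name="Sales" content="Quarterly projections"></head></html>',
    "File3.htm": '<html><head><meta name="Sales" content="Results for 1997"></head></html>',
}

print(files_matching(PAGES, "Sales", "projections"))  # ['File1.htm', 'File2.htm']
print(files_matching(PAGES, "Sales", "1997"))         # ['File3.htm']
```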
