RDF - Metadata Solution for Web Data Integration

zhaozj, 2021-02-16

1. Introduction

In today's society information is everywhere, but picking out the useful part of it is not easy. There are exceptions, of course. In a library you can look up a book's call number by title, author name, or keyword, so the book you want is easy to find; in an audio-visual store you can easily find the DVD you want by title or by starring actor. The two systems share one feature: both are built on metadata. Metadata is data about data, or information about information. For example, the text of a book is the book's data, while its title, author, and copyright information are its metadata. Metadata is not used only for retrieval; it can also be used for internal management. Using metadata can greatly improve the efficiency of retrieval and management in a system.

The Web is a huge database whose contents are far more complex than those of a library or an audio-visual store, yet it has a problem: it has essentially no metadata. How, then, do search engines work? Apart from a very few such as Yahoo!, search engines basically provide retrieval through full-text search, and the precision of the results is predictably poor. Yahoo! builds indexes and abstracts for the sites it collects and classifies their pages, all by hand, which greatly improves precision and is an important reason for its popularity. For an ocean of information this vast, however, manual work is clearly unrealistic: Yahoo! has lower recall than search engines such as AltaVista and Infoseek, because the number of sites and pages it covers is limited. If resources on the Web described themselves with metadata, would that not save a great deal of trouble?
Yes, but how metadata should be used calls for a standard, and the W3C has proposed one: RDF (Resource Description Framework), a framework for describing Web resources. RDF provides a metadata solution for Web data integration.

2. Introduction to RDF

RDF stands for Resource Description Framework. Let us look at the three words one by one.

Resource: anything on the Web that is named by a URI (Uniform Resource Identifier), such as a Web page or an element in an XML document.
Description: statements about the properties of a resource, indicating its characteristics or its relationships to other resources.
Framework: a general model that is independent of the resources being described and can accommodate and manage their diversity, inconsistency, and duplication.

In short, RDF defines one general framework, the resource-property-value triple, and applies that single unchanging form to all the varied resources on the Web.
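
As a rough illustration of the resource-property-value triple, the sketch below models statements as plain Python tuples. The URIs, the property names, and the `describe` helper are all hypothetical and chosen only for this example; real RDF properties would themselves be URIs.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a resource, one of its properties, and the value."""
    resource: str  # URI of the thing being described
    prop: str      # name of the property (a URI in real RDF)
    value: str     # a literal, or the URI of another resource


# Hypothetical URIs and property names, used only for illustration.
BOOK = "http://example.org/books/1"
WRITER = "http://example.org/people/a-writer"

statements = [
    Triple(BOOK, "title", "An Example Book"),
    Triple(BOOK, "author", WRITER),      # the value is another resource
    Triple(WRITER, "name", "A. Writer"),
]


def describe(resource: str, triples: List[Triple]) -> Dict[str, str]:
    """Collect every property/value pair stated about one resource."""
    return {t.prop: t.value for t in triples if t.resource == resource}


print(describe(BOOK, statements))
```

Because every statement has the same three-part shape, descriptions of very different resources can be stored and queried with the same few lines of code.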

Let us look at a simple example of RDF. It has four parts: an opening element that specifies the URI of the resource being described; an author property whose value is Tim Bray (the resource is described as having an author, and the value of that property is Tim Bray); a property named Home-Page, the home page property, whose value points to another resource; and a closing tag.

3. How RDF Implements Metadata Description and Exchange on the Web

3.1 Two key technologies of RDF

RDF rests on two key technologies: URI and XML. A URI is the unique identifier of a Web resource and is a superset of the more familiar URL (Uniform Resource Locator). Besides Web pages, it can identify elements within a document, books, television programmes, and even a person. In RDF, resources are everywhere: a property of a resource is a resource, the value of a property can be a resource, and even a statement can be a resource. All of these can therefore be identified by URIs and described in turn with RDF. How, then, is RDF carried over the network? XML takes on this role: as a common file format it defines the representation syntax of RDF, so RDF data can be exchanged easily in XML.

3.2 Vocabularies

As we can see, RDF defines only a framework for describing resources; it does not define which metadata is used to describe them. This is deliberate. The metadata suited to different resources clearly differs, and defining a single metadata set that covers every kind of resource is unrealistic. Not only would the workload be huge, but even if such a set were defined, it is doubtful that everyone would adopt it. A library that already describes its resources with an existing system, for example, would have to abandon its original metadata set for the new one; the work involved is easy to imagine, and the resistance met during adoption would probably be considerable.
RDF takes a different approach: it allows anyone to define metadata for describing particular resources. Because the properties of resources are not restricted, a metadata set defined in this way is called a vocabulary in RDF. A vocabulary is itself a resource and can be uniquely identified by a URI, so any number of vocabularies can be used when describing resources with RDF, as long as each is specified by its URI. Vocabularies naturally differ in popularity. Some may be used only by the people who defined them, while others are widely accepted, such as Dublin Core, which defines resource descriptions similar to a library card catalogue; IMS metadata, which describes content; and vCard metadata, which describes personal information. Since a vocabulary is a resource, RDF can of course be used to describe its properties and its relationships to other vocabularies. The W3C proposed RDF Schema specifically to define how RDF is used to describe vocabularies; in other words, RDF Schema is the vocabulary for defining RDF vocabularies. RDF Schema cannot be defined arbitrarily, however: there is only one, the version defined by the W3C.
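
One relation that RDF Schema provides for connecting vocabularies is rdfs:subPropertyOf. The sketch below is a minimal, assumption-laden illustration of how a program might use such links so that a query phrased in one vocabulary also finds statements made in another. The `generalise` and `matches` helpers and the example.org document URIs are hypothetical; the private author property is the article's own illustrative URI, and the Dublin Core creator URI is written in its 1.1 element-set form, which is an assumption here.

```python
# rdfs:subPropertyOf links, written as {specialised property: general property}.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
MY_AUTHOR = "http://mymetadata.vocab.org/author"  # privately defined property
SUB_PROPERTY_OF = {MY_AUTHOR: DC_CREATOR}


def generalise(prop, links=SUB_PROPERTY_OF):
    """Return the property followed by every property it specialises."""
    implied = [prop]
    # Walk upward through the links; the second test guards against cycles.
    while implied[-1] in links and links[implied[-1]] not in implied:
        implied.append(links[implied[-1]])
    return implied


def matches(triple, wanted_prop):
    """True if the triple's property is wanted_prop or a specialisation of it."""
    return wanted_prop in generalise(triple[1])


# Hypothetical descriptions made with two different vocabularies.
triples = [
    ("http://example.org/doc/1", MY_AUTHOR, "A. Writer"),
    ("http://example.org/doc/2", DC_CREATOR, "B. Writer"),
]

# A query for Dublin Core creators also finds the privately described document.
found = [t[0] for t in triples if matches(t, DC_CREATOR)]
print(found)
```

The reverse does not hold: a query for the private author property does not match statements that use only the more general creator property.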

For example:

http://mymetadata.vocab.org/author --- rdfs:subPropertyOf ---> http://purl.org/dc/elements/1.0/creator

means that the metadata item author, defined by someone for their own use, is a special form of the Dublin Core metadata item creator. RDF Schema describes the relationships between metadata from different vocabularies in this way, laying the foundation for metadata exchange.

3.3 Implementation mechanism

We can now see how RDF implements metadata description and exchange on the Web. Using XML syntax, it first specifies the URIs of the vocabularies (there may be several, depending on need) and then uses the specified vocabularies to describe resources. How are different vocabularies related to each other? Through RDF Schema. To understand the mechanism, consider an example of RDF expressed in XML. It declares the URI of vocabulary 1 and the URI of vocabulary 2, names the URI of the resource being described, uses the metadata item creator from vocabulary 1 to describe the author property, and uses metadata from vocabulary 2 to describe the author's name properties, whose values are Elliotte, Rusty, and Harold.

4. Features of RDF

4.1 Easy to control

RDF uses simple resource-property-value triples, so it is easy to control even in large quantities. This matters because Web resources keep multiplying; if the metadata format used to describe them were too complicated, the efficiency of metadata would drop sharply. Functionally, XML alone could also describe resources, but XML structures are more complicated and allow complex nesting, which makes them hard to control. RDF improves the efficiency of resource retrieval and management and so lets metadata truly do its job.

4.2 Easy to extend

When resources are described with RDF, the vocabulary and the resource description are kept separate, so extension is easy. For example, to add a property for describing a resource, you only need to add the corresponding metadata to the vocabulary.
If you were using a relational database, by contrast, adding a new field would not be so easy.

4.3 Inclusive

RDF allows anyone to define their own vocabulary, and several vocabularies can be used together seamlessly to describe resources, each used where it is needed and each playing to its strengths. In the previous example, Dublin Core is used to describe the author property, while another vocabulary, one specialised in describing people, is used to describe the author's name.
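
To make the use of several vocabularies in one description more concrete, the sketch below writes a small RDF/XML document that mixes Dublin Core with a second vocabulary for personal names, then reads it back into triples with Python's standard XML parser. This is only an illustration under stated assumptions: the person vocabulary URI, its property names (givenName, middleName, familyName), and the book URI are hypothetical, and the `parse` function handles only this one document shape rather than RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"        # vocabulary 1: Dublin Core
PERSON_NS = "http://example.org/person-vocab#"    # vocabulary 2: hypothetical

DOCUMENT = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}" xmlns:p="{PERSON_NS}">
  <rdf:Description rdf:about="http://example.org/books/xml-guide">
    <dc:creator rdf:parseType="Resource">
      <p:givenName>Elliotte</p:givenName>
      <p:middleName>Rusty</p:middleName>
      <p:familyName>Harold</p:familyName>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
""".strip()


def to_uri(tag):
    """Turn ElementTree's '{namespace}local' tag form into a full URI."""
    return tag[1:].replace("}", "", 1)


def parse(document):
    """Read rdf:Description elements into (resource, property, value) triples."""
    triples = []
    root = ET.fromstring(document)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            if len(prop) == 0:  # no child elements: a plain literal value
                triples.append((subject, to_uri(prop.tag), prop.text))
                continue
            # Nested properties describe an unnamed resource; give it a label.
            node = f"_:b{len(triples)}"
            triples.append((subject, to_uri(prop.tag), node))
            for inner in prop:
                triples.append((node, to_uri(inner.tag), inner.text))
    return triples


for triple in parse(DOCUMENT):
    print(triple)
```

Each property name expands to a full URI that begins with the URI of its vocabulary, which is what lets a reader of the data tell the two vocabularies apart.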

4.4 Exchangeable

RDF uses XML syntax, so data can easily be exchanged over the network.

4.5 Easy to integrate

In RDF, properties are resources, property values can be resources, and statements about resources can also be resources, all of which can be described with RDF. Descriptions can therefore be integrated easily, which serves the goal of knowledge discovery. For example, when a book is described, the value of its author property is another resource. Following the author's URI, we can obtain information about the author, such as the institution the author graduated from, and so learn that the book was written by a graduate of a certain college. A connection thus emerges between two things that seemed unrelated, and such connections are often the prelude to knowledge discovery.

5. RDF and Several Web Technologies

5.1 RDF and resource discovery

RDF describes resources with simple resource-property-value triples. Imagine that the resources on the Web were all described in RDF. Because RDF uses XML syntax, resources could be searched automatically, without manual work, and with high recall and precision. In addition, RDF descriptions can easily be integrated, revealing connections that are not easy to observe on the surface. All of this will have a revolutionary impact on resource discovery techniques.

5.2 RDF and personalized services

With the development of Web technology, personalized services have been put on the agenda. CC/PP (Composite Capability/Preference Profiles), a recommendation proposed by the W3C, is a collection of the capabilities and preferences of users and of the Internet tools they use (including hardware platforms, system software, and application software), and it is built on RDF. Put simply, the capabilities and preferences of users and tools are properties of the user, that is, the user's metadata, so RDF can describe Web content and the capabilities and preferences of users in the same way.
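
As a loose sketch of describing both sides with the same triple structure and then matching them, the code below picks one language version of a document for one user. Everything here is an assumption made for illustration: the URIs, the property names, the numeric scores, and the scoring rule (fidelity multiplied by proficiency) are hypothetical and are not taken from CC/PP.

```python
DOC = "http://example.org/report"
USER = "http://example.org/users/reader"

# Content description: each language version has a fidelity to the original.
content = [
    (DOC + "?lang=en", "language", "en"), (DOC + "?lang=en", "fidelity", 1.0),
    (DOC + "?lang=fr", "language", "fr"), (DOC + "?lang=fr", "fidelity", 0.8),
    (DOC + "?lang=de", "language", "de"), (DOC + "?lang=de", "fidelity", 0.6),
]

# User profile: how well the user reads each language (0.0 to 1.0).
profile = [
    (USER, "proficiency:en", 0.4),
    (USER, "proficiency:fr", 0.9),
    (USER, "proficiency:de", 0.0),
]


def properties(subject, triples):
    """Gather the property/value pairs stated about one subject."""
    return {p: v for s, p, v in triples if s == subject}


def choose_version(content, profile, user):
    """Pick the version with the best fidelity-times-proficiency score."""
    prefs = properties(user, profile)
    best, best_score = None, 0.0
    for version in sorted({s for s, _, _ in content}):
        props = properties(version, content)
        skill = prefs.get("proficiency:" + props["language"], 0.0)
        score = props["fidelity"] * skill
        if score > best_score:
            best, best_score = version, score
    return best  # None if the user can read none of the versions


print(choose_version(content, profile, USER))
```

With these made-up numbers the French version is chosen: it is less faithful than the English original, but the user reads French far better.
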
When information is retrieved, a compromise can be reached through certain rules so that what is obtained matches the user's capabilities and preferences, giving the user a personalized service. For example, a piece of Web content may be available in several languages. Because of translation issues the fidelity of each language version differs, and the user's command of each language differs too, so some rule is needed to strike a balance and let the user choose the version that he or she can understand and that is most faithful to the original document. Describing Web content and user capabilities and preferences with RDF greatly simplifies this process.

5.3 RDF and Web information filtering

RDF was originally proposed by the W3C to work with the PICS (Platform for Internet Content Selection) specification. PICS is a mechanism for conveying ratings of Web content to the client, for example whether a page contains pornography or violence. Different organizations can rate Web content according to their own value standards, so that users can easily filter out certain pages by configuring their browsers. One requirement in the design of RDF was that it be able to express everything expressible in PICS 1.1, so that PICS 1.1 labels can be translated automatically into RDF without any loss of information. The advantage is that the data can then be exchanged using RDF.
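
The filtering idea can be sketched with ratings written as triples and a set of user-chosen limits. This is an illustration only: the rating categories, the numeric levels, the page URIs, and the `allowed` helper are hypothetical and do not reproduce the PICS label syntax or any real rating service.

```python
# Ratings published by some rating organization: (page, category, level).
ratings = [
    ("http://example.org/news", "violence", 1),
    ("http://example.org/news", "nudity", 0),
    ("http://example.org/game", "violence", 3),
]

# Limits the user has configured in the browser: highest level accepted.
limits = {"violence": 2, "nudity": 0}


def allowed(page, ratings, limits):
    """True unless some rating of the page exceeds the user's limit.

    A category with no configured limit is treated as limit 0, and a page
    with no ratings at all is allowed; both are choices made for this sketch.
    """
    for rated_page, category, level in ratings:
        if rated_page == page and level > limits.get(category, 0):
            return False
    return True


for page in ("http://example.org/news", "http://example.org/game"):
    print(page, allowed(page, ratings, limits))
```

Because the ratings are ordinary triples, ratings from different organizations could be merged into one list and filtered by the same function.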

When reposting, please cite the original address: https://www.9cbs.com/read-16363.html
