RobotDog Product Solution Proposal
1 RobotDog positioning
RobotDog is a product that combines full-text search with a web spider (crawler). It is the first Java-based intelligent search platform with Chinese-language support that automatically identifies unstructured information in different formats, and it focuses on providing intelligent information classification and retrieval to enterprises and individual users. Unlike general-purpose web search engines such as Google and Yahoo, the RobotDog intelligent search engine offers high recall and high precision.
1.1 Background
The information and data held within a company or organization are its most important resource. Industry surveys show that enterprise data currently grows by 200% per year, that 80% of it is stored as documents and email scattered across the enterprise's computer systems, and that business staff spend an average of two and a half hours every day looking for information.
Organizations face various obstacles that keep the efficiency of information use low:
- The volume of information is too large; extracting what is useful takes a great deal of time.
- Data is distributed across different systems. Users forget where the information they need is kept and have to switch between systems several times to find it.
- When handling new business or new problems, staff do not know which internal information can be used.
- Adopting a new information management system often changes employees' work habits and takes time and energy to learn and adapt to.
- Implementing the system itself costs time and money.
1.2 RobotDog
In terms of where information is stored, an individual's own information is generally scattered across his or her own work machine, while company or organization information is concentrated in the organization's databases and file servers; a typical example is the organization's website. Accordingly, in terms of deployment, RobotDog can run as a desktop program that searches the local file system, or as a server product that searches internal information on the Internet or an intranet.
1.3 Competitors
1.3.1 Microsoft
Microsoft's future products are potential competitors to our desktop application. Microsoft plans to add intelligent search capabilities to its future Longhorn operating system. However, because of multi-language issues and because Longhorn itself has not yet been released, we expect RobotDog to be able to establish itself in the desktop market quickly. Microsoft's products also have a serious weakness: they must run on Microsoft's own large operating system and do not support other platforms. RobotDog is built on Java and various open source technologies and is itself an open system.
1.3.2 Massive Technology http://www.hylanda.com
Massive's DESE (Embedded Database Search Engine) is our competitor on the server side. However, the product is based on C technology, depends heavily on Microsoft technology, and requires a SQL Server database; it does not support data formats such as pictures and binary files; and it is relatively expensive.
RobotDog, by contrast, supports full-text search of common text formats such as DOC, Excel, email, RTF, PDF, PS, CHM, HTML, XML and TXT, and also supports searching other kinds of data such as pictures. More importantly, it supports indexing the contents of compressed archives such as ZIP, RAR, TAR, JAR, GZ and CAB. When building an index, the user can specify which file formats to index, so customization is very flexible. For storage we combine a full-text index with a MySQL database, which is independent of any particular operating system or platform.
2 Product solutions
In terms of application, RobotDog provides two user interfaces. The first is a desktop application for ordinary users: RobotDog Personal Edition. The second is a server-side application for enterprise users: RobotDog Enterprise Edition. Both editions are built on RobotDog's core functional components, such as Chinese word segmentation, full-text retrieval, the web spider, and the support modules for multiple file formats and protocols. The typical architecture is as follows:
2.1 Technical approach
RobotDog uses Java, J2EE and XML technology and builds on a range of excellent open source software to construct a powerful intelligent retrieval system. It contains many basic functional components, such as Chinese word segmentation, full-text retrieval, the web spider, multi-format document support, and protocol access interfaces. Besides being integrated in RobotDog, these components can also be used on their own and can easily be embedded in a variety of enterprise information systems (such as CMS, OA, CRM, ERP and enterprise collaboration platforms).
2.1.1 User Interface
RobotDog's users may be enterprises or single individuals, so RobotDog needs two user interfaces. Enterprise users are better served by centralized management, so a web application with a B/S (browser/server) architecture is generally the more reasonable choice. Single-machine users mainly use RobotDog to manage and search the local file system, for which a GUI program is used. The first release will implement the enterprise-user functionality first.
2.1.2 Chinese word segmentation module
The Chinese word segmentation module will use the Chinese lexicon from Peking University's Institute of Computational Linguistics together with an improved forward maximum probability algorithm (MFF). It has a high recognition rate (accuracy above 99%), can automatically recognize personal names, and supports GBK, GB2312, UTF-8, Unicode and other encodings. To remain compatible with other, better segmentation techniques, the module should be programmed against an interface so that a more advanced segmentation technique can be substituted later. Peking University provides a C implementation of the program; this module re-implements the algorithm in Java to avoid JNI calls.
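The interface-based design described above can be sketched as follows. This is a minimal illustration only: the names `WordSegmenter` and `ForwardMaxMatchSegmenter` are hypothetical, and the algorithm shown is plain dictionary-based forward maximum matching, a simpler stand-in for the improved forward maximum probability algorithm (MFF) named in the text, whose details are not given here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical interface: callers depend only on this type, so a more
// advanced segmentation algorithm can be swapped in later.
interface WordSegmenter {
    List<String> segment(String text);
}

// Illustrative stand-in: plain forward maximum matching over a fixed
// dictionary. This is NOT the MFF algorithm referred to in the text.
class ForwardMaxMatchSegmenter implements WordSegmenter {
    private final Set<String> dictionary;
    private final int maxWordLength;

    ForwardMaxMatchSegmenter(Set<String> dictionary) {
        this.dictionary = dictionary;
        int longest = 1;
        for (String word : dictionary) {
            longest = Math.max(longest, word.length());
        }
        this.maxWordLength = longest;
    }

    @Override
    public List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            // Start with the longest candidate that fits in the remaining text.
            int end = Math.min(text.length(), pos + maxWordLength);
            // Shrink the candidate until it is a dictionary word
            // or only a single character is left.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            tokens.add(text.substring(pos, end));
            pos = end;
        }
        return tokens;
    }
}
```

With a dictionary containing 全文, 检索 and 搜索引擎, segmenting the string 全文检索搜索引擎 yields the three tokens 全文, 检索 and 搜索引擎. Characters that start no dictionary word are emitted as single-character tokens.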
2.1.3 Full Text Retrieval Module
Full-text retrieval uses Lucene, an open source project under the Apache Jakarta project. See http://jakarta.apache.org/lucene/.
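Full-text engines such as Lucene are built around an inverted index, which maps each term to the documents that contain it. The toy class below illustrates that idea in plain Java. It is not Lucene's API; the class name `ToyInvertedIndex` and its methods are hypothetical, and terms are split on whitespace only for brevity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: term -> sorted set of document ids.
// Illustration only; this is not how Lucene is called.
class ToyInvertedIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index a document; terms are lower-cased and split on whitespace.
    void add(String docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of documents containing the term (empty set if none).
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }
}
```

Looking up a term is a single map access, which is why searches against a prebuilt index are fast compared with scanning every file at query time.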
2.1.4 Multi-File Format and MIME Protocol Support Interface
This module extracts the text content of DOC, PPT, XLS, PDF, PS, RTF, XML, HTML, TXT and other formats and supplies it to Lucene as an input stream for indexing.
It also handles the multiple protocols used in HTTP communication and provides the interface between the spider and the full-text retrieval module.
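One way such a format-support layer could be organized is a registry that maps file extensions to text extractors. The sketch below is an assumption, not the product's actual design: `TextExtractor`, `PlainTextExtractor`, `HtmlTextExtractor` and `ExtractorRegistry` are hypothetical names, the HTML extractor strips tags with a crude regular expression, and binary formats such as DOC or PDF would need dedicated parsers that are not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical contract: turn a document stream into plain text for indexing.
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

// Plain text needs no conversion (UTF-8 is assumed here).
class PlainTextExtractor implements TextExtractor {
    @Override
    public String extract(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
}

// Crude HTML handling: drop tags and collapse whitespace.
class HtmlTextExtractor implements TextExtractor {
    @Override
    public String extract(InputStream in) throws IOException {
        String html = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }
}

// Chooses an extractor by file extension, ignoring case.
class ExtractorRegistry {
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    // Returns null when the format is not supported.
    TextExtractor forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.get(ext);
    }
}
```

Under this arrangement, supporting a new format means registering one more extractor; the spider and the indexer do not change.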
2.1.5 SPIDER function
The spider works at two levels. First, it automatically analyzes the directory structure of a user-specified website and provides a complete view of the site for full-text indexing. Second, it automatically analyzes a user-specified file system and retrieves the files in it.
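The file-system level of the spider can be sketched with the standard `java.nio.file.Files.walk` call. The class name `FileSystemSpider` and the extension filter are assumptions made for illustration; the website level (fetching pages and following links) is not shown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical file-system spider: collects the files under a root
// directory whose extension is in the configured set.
class FileSystemSpider {
    // Extensions are expected in lower case, without the dot (e.g. "pdf").
    private final Set<String> extensions;

    FileSystemSpider(Set<String> extensions) {
        this.extensions = extensions;
    }

    // Walk the tree below root and return matching regular files, sorted.
    List<Path> collect(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(this::hasWantedExtension)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    private boolean hasWantedExtension(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot >= 0
            && extensions.contains(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

The returned paths would then be handed to the format-support module for text extraction before indexing.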
2.2 RobotDog Personal Edition
Personal Edition provides complete management of the local file system. DOC, Excel, email, RTF, PDF, PS, CHM, HTML, XML and TXT files, including compressed copies of them, can be full-text indexed, and information about binary files such as pictures can be saved in the MySQL database. This information includes the file name, path, size, author, creation time, modification time, version history, and so on. Personal Edition runs a background daemon thread that automatically starts indexing when the CPU is idle. Once an index has been built, the user can search from the Personal Edition front-end interface by keyword, file name, time range and so on, and searches are very fast. Once a user manages the computer's files with RobotDog Personal Edition, finding a file is no longer a worry.
By contrast, when the current Windows search function looks for a file by name or by keyword, it often finds nothing even after a long wait. The market prospects for RobotDog are therefore very promising.
2.3 RobotDog Enterprise Edition
Compared with Personal Edition, RobotDog Enterprise Edition requires more enterprise-class computing features, such as security, distribution, multithreading, data backup and transactions.
RobotDog Enterprise Edition is built on the J2EE platform. In addition to all the features of Personal Edition, it can index in several modes (incremental, batch, and full rebuild), supports indexing and retrieval of searched content, and supports hot standby and scheduled backup of data. It also supports analyzing a website's structure, checking for dead links, checking page completeness, and downloading an entire website.
RobotDog Enterprise Edition supports secondary development (customization), so it can be integrated seamlessly with a company's existing IT systems.
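A background indexing thread of this kind might look like the sketch below. The class name `BackgroundIndexer` is hypothetical. Detecting an idle CPU is platform-specific and is not attempted; as a rough approximation, the worker runs as a daemon thread at minimum priority so that it yields to foreground work.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical background indexer: a low-priority daemon thread that
// takes file paths from a queue and passes them to an indexing action.
class BackgroundIndexer {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Thread worker;

    BackgroundIndexer(Consumer<String> indexAction) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    // Blocks until a path is available, so no CPU is used while waiting.
                    indexAction.accept(pending.take());
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and let the thread end.
                Thread.currentThread().interrupt();
            }
        }, "robotdog-indexer");
        worker.setDaemon(true);                  // does not keep the JVM alive
        worker.setPriority(Thread.MIN_PRIORITY); // scheduling hint: yield to foreground work
    }

    void start() {
        worker.start();
    }

    // Queue a file path for indexing; returns immediately.
    void submit(String path) {
        pending.add(path);
    }

    boolean isDaemon() {
        return worker.isDaemon();
    }
}
```

Because a single worker drains the queue, files are indexed in the order they were submitted, and the user interface never waits on indexing.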
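The difference between incremental indexing and a full rebuild can be illustrated with a small planner that remembers the modification time at which each file was last indexed. This is a sketch under assumptions: the class `IncrementalIndexPlanner` and its methods are hypothetical, and modification time is only one possible change signal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical planner: decides which files need (re)indexing.
class IncrementalIndexPlanner {
    // path -> modification time recorded when the file was last indexed
    private final Map<String, Long> indexedAt = new HashMap<>();

    // Incremental mode: return the paths that are new or modified since
    // they were last indexed (in sorted order) and record them as indexed.
    List<String> plan(Map<String, Long> currentModifiedTimes) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(currentModifiedTimes).entrySet()) {
            Long previous = indexedAt.get(e.getKey());
            if (previous == null || e.getValue() > previous) {
                changed.add(e.getKey());
                indexedAt.put(e.getKey(), e.getValue());
            }
        }
        return changed;
    }

    // Full-rebuild mode: forget everything, so the next plan returns all files.
    void reset() {
        indexedAt.clear();
    }
}
```

Batch indexing would correspond to calling `plan` on a schedule with the current file listing, while a full rebuild calls `reset` first.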