Free download Arch Search Engine

Arch Search Engine

An open source, high precision corporate search engine based on Apache Nutch
User rating: 12 votes
Publisher: CSIRO Astronomy and Space Science
File size: 22.8 MB
Release date: 3 July 2012

Editor's review

Since Google appeared on the horizon, it has redefined the way we search for information on the web. Today it has become the single-point indexer of a large amount of human knowledge, and its results are highly accurate. If you run an intranet website for your company and are looking for a search engine with similar capability, you should consider Arch Search Engine 1.43. Most search engines tailored for intranet sites fail to deliver accurate results, but Arch succeeds in dishing out refined search results owing to its class-leading algorithm.

Once you set up Arch Search Engine 1.43 on your corporate web server, you can simply forget about it. It keeps a continuously updated index, and whenever something is added to your server it goes ahead and discovers it. It can cover an entire network of corporate sites, mini sites, collaboration blogs, etc. It can also be configured to restrict search access to certain documents that you may wish to keep hidden from most people on your corporate network. The interface of the search engine is quite easy to get used to: it comes with a Simple Query option for regular searches and an Advanced search feature for more precise results. You can limit the search results by the presence of certain words in a document, in its URL, or in its title. You can also exclude a specific word or search for a combination of words. Given its origin in Apache Nutch, the application is well suited to scanning a huge corporate collection of files running into billions.
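To make the Advanced search options more concrete, here is a minimal Python sketch of that kind of filtering: require some words, exclude others, and restrict matches to the title or URL. It is not Arch's code or query syntax; the `Doc` class, the `matches` function, its parameter names, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical indexed document; not an Arch or Nutch type."""
    url: str
    title: str
    body: str


def matches(doc, all_words=(), none_words=(), in_title=(), in_url=()):
    """Return True if doc satisfies every advanced-search constraint."""
    body_terms = set(doc.body.lower().split())
    title_terms = set(doc.title.lower().split())
    url = doc.url.lower()
    return (
        all(w.lower() in body_terms for w in all_words)           # must contain
        and not any(w.lower() in body_terms for w in none_words)  # must not contain
        and all(w.lower() in title_terms for w in in_title)       # word in title
        and all(w.lower() in url for w in in_url)                 # substring of URL
    )


docs = [
    Doc("http://intranet.example/hr/leave-policy", "Leave policy",
        "annual leave policy for staff"),
    Doc("http://intranet.example/it/vpn", "VPN setup",
        "staff vpn setup guide"),
]

# Documents mentioning "staff" but not "vpn", with "policy" in the title.
hits = [d.url for d in docs
        if matches(d, all_words=["staff"], none_words=["vpn"], in_title=["policy"])]
print(hits)
```

A real engine would evaluate such constraints against an inverted index rather than scanning every document, but the logic of combining the constraints is the same.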

In simple words, Arch Search Engine 1.43 can be considered the Google of intranet-specific search tools. Owing to its strong performance and highly refined advanced query features, we give this remarkable tool a full score of 5 stars.

Publisher's description

Arch is an open source extension of Apache Nutch (a popular, highly scalable general purpose search engine) for intranet search. Not happy with your corporate search engine? Not surprising; very few people are. To the best of our knowledge, there are no intranet engines that work as well as Google's global Web search does. There is a fundamental reason for this: the algorithms used by Google (or similar engines) on the global Web do not work nearly as well on intranets because of the lack of statistical data. Arch (finally!) solves this problem. It uses a novel method to deliver high precision search results. Don't believe it? Blind test evaluation tools are included. You can deploy Arch and compare its performance to your current search engine and/or Google (on the public part of your site) using a blind test methodology.
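For readers unfamiliar with blind testing, the idea can be sketched in a few lines of Python. This is only an illustration of the general methodology, not the evaluation tool shipped with Arch: the function names, the engine labels "A" and "B", and the stand-in judge are all assumptions.

```python
import random
from collections import Counter


def blind_pair(results_a, results_b, rng):
    """Return the two result lists in random order, plus the hidden key."""
    panels = [("A", results_a), ("B", results_b)]
    rng.shuffle(panels)
    key = [name for name, _ in panels]          # kept hidden from the judge
    shown = [results for _, results in panels]  # what the judge actually sees
    return shown, key


def tally(choices, keys):
    """choices[i] is the index (0 or 1) of the panel preferred in trial i."""
    return Counter(key[choice] for choice, key in zip(choices, keys))


rng = random.Random(0)
choices, keys = [], []
for query in ["leave policy", "vpn setup", "travel form"]:
    # Stand-in results: engine A returns a relevant hit, engine B does not.
    shown, key = blind_pair(["relevant"], ["irrelevant"], rng)
    # Stand-in judge: sees only the panel contents, never the engine labels.
    choices.append(0 if shown[0][0] == "relevant" else 1)
    keys.append(key)

scores = tally(choices, keys)
print(scores["A"], scores["B"])
```

Because the panel order is shuffled per query and the labels are revealed only at tally time, the judge cannot favor an engine by name or position.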
In addition to the excellent search quality, Arch has many features critical for corporate environments:
- Document level security. Users can find only documents that they are authorized to see.
- Inexpensive index updates. Arch is able to keep indexes up to date and avoid regular complete site recrawling.
- 24/7 availability. There is always a working index available, even if a crawl fails.
- Support for simultaneous indexing and search of multiple web sites, with ability to search and administer any site separately, if needed. Dynamic adding and removal of web sites is easy.
- An automatically generated site directory.
- Low cost support once deployed.
- Dual interface (PHP and Java) for easy deployment and customization.
- Faceted search "out of the box".
- An extensive and extensible set of parsers for parsing a variety of file formats: HTML, PHP, PDF, MS Office, Open Office, etc.
- A modular, plugin-based architecture that can be easily customized and extended.
- The source code is included.
- High performance and scalability. Arch can run on computer clusters to index very large data sets.
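As an illustration of the document level security item in the list above, the following Python sketch filters a result list against a per-document access control list. It assumes a simple group-based model with deny-by-default; how Arch itself stores and checks permissions is not described here, and `visible_results`, `acl`, and the example paths are hypothetical.

```python
def visible_results(results, user_groups, acl):
    """Keep only documents whose allowed groups overlap the user's groups.

    Documents with no ACL entry are hidden (deny by default).
    """
    user_groups = set(user_groups)
    return [url for url in results if acl.get(url, set()) & user_groups]


# Hypothetical ACL: document path -> groups allowed to see it.
acl = {
    "/hr/salaries": {"hr"},
    "/news": {"staff", "hr"},
}
results = ["/hr/salaries", "/news", "/unlisted"]

print(visible_results(results, ["staff"], acl))
print(visible_results(results, ["hr"], acl))
```

Filtering at query time like this keeps a single shared index while ensuring each user sees only the documents they are authorized to see.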
Version 1.43