The Lemur Project 4.12


License: Freeware
Category: Developer Tools
Publisher: The-Lemur-Team
Size: 63.2 MB
Last Updated: 2013-12-06
Operating System: Mac OS X
Price: Free
Publisher's description - The Lemur Project 4.12
 

The Lemur Toolkit is a free, open-source toolkit designed to facilitate research in language modeling and information retrieval. It includes technologies such as ad hoc and distributed retrieval, cross-language IR, summarization, filtering, and classification.

Here are some key features of "The Lemur Project":

· Sophisticated structured query languages (using InQuery and Indri)
· Support for XML and structured document retrieval
· Commonly used with a wide range of research test collections (e.g., TREC CDs 1-5, wt10g, RCV1, gov, gov2)
· Index your web pages with an "out-of-the-box" site search capability
· Interactive interfaces for Windows, Linux, and Web
· Distributed information retrieval and document clustering applications
· Cross-platform, fast and modular code written in C++
· C++, Java and C# APIs
· Free and open-source software
· In use for over 6 years by a large and growing user community

Indexing:
· Multiple indexing methods for small, medium and large-scale (terabyte) collections
· Built-in support for English, Chinese and Arabic text
· Porter and Krovetz word stemming
· Incremental indexing
· Out-of-the-box indexing support for TREC Text, TREC Web, plain text, HTML, XML, PDF, MBox, Microsoft Word, and Microsoft PowerPoint
· Indexes inline and offset text annotations (e.g., part-of-speech and named entities)
· Indexes document attributes

Retrieval:
· Supports major language modeling approaches such as Indri and KL-divergence, as well as vector space, tf.idf, Okapi and InQuery
· Relevance- and pseudo-relevance feedback
· Wildcard term expansion (using Indri)
· Passage and XML element retrieval
· Cross-lingual retrieval
· Smoothing via Dirichlet priors and Markov chains
· Supports arbitrary document priors (e.g., PageRank, URL depth)
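
To make the "smoothing via Dirichlet priors" item concrete, here is a minimal sketch of query-likelihood scoring with Dirichlet-prior smoothing, where p(w|d) = (c(w,d) + mu * p(w|C)) / (|d| + mu). It is not Lemur or Indri code: the function name dirichlet_score, the toy documents, and the default mu of 2500 are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def dirichlet_score(query_terms, doc_terms, collection_tf, collection_len, mu=2500.0):
    """Log query likelihood of a document under Dirichlet-prior smoothing.

    Illustrative only: names and the default mu are assumptions, not Lemur's API.
    """
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        # Background (collection) probability of the term.
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # term never seen in the collection: skipped in this sketch
        # Smoothed document model: (c(w,d) + mu * p(w|C)) / (|d| + mu)
        p = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p)
    return score


# Toy collection (hypothetical data).
docs = {
    "d1": "lemur toolkit supports language modeling".split(),
    "d2": "indri supports structured query retrieval".split(),
}
collection_tf = Counter(t for d in docs.values() for t in d)
collection_len = sum(collection_tf.values())

query = ["language", "modeling"]
ranked = sorted(
    docs,
    key=lambda d: dirichlet_score(query, docs[d], collection_tf, collection_len),
    reverse=True,
)
print(ranked)  # prints ['d1', 'd2']
```

A larger mu pulls every document model toward the collection model, so short documents are penalized less for missing query terms.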

What's New in This Release:

Version 4.12:
· BUG# 3014524 -- Update Google parser for query log toolbar server.
· BUG# 3014521 -- Query log toolbar server can now be run with an optional hostname parameter, which will be used instead of localhost if specified.
· BUG# 3013328 -- Fix crash on large queries in the CGI.
· BUG# 3013325 -- Fix CGI snippets.
· BUG# 3013315 -- Fix crash in CGI when fewer than 50 documents are returned.
· BUG# 3013313 -- Fix CGI to get document text when using multiple Indri indexes.
· BUG# 3004284 -- Fix memory leaks in QueryEnvironment::expressionCount and QueryEnvironment::expressionList.
· BUG# 3000138 -- Fix snippet generation with queries that use the #max operator.
· BUG# 2989973 -- Prevent SIGPIPE being raised in IndriDaemon.
· BUG# 2985880 -- Prevent field restricted queries when using the non-LM baseline retrieval.
· BUG# 2982858 -- Modify the query parser to transform hyphenated terms into #1 expressions. This is closest to the result of splitting tokens on hyphens...
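
The hyphenated-term change (BUG# 2982858) can be pictured with a small sketch. In the Indri query language, #1(a b) is an ordered window of width 1, which matches the terms as an exact phrase. The code below is not the Lemur query parser: the function rewrite_hyphenated and its regular expression are hypothetical and only illustrate the kind of rewrite the changelog entry describes.

```python
import re

# Matches a run of word characters joined by one or more hyphens,
# e.g. "cross-language" or "state-of-the-art" (illustrative pattern only).
_HYPHENATED = re.compile(r"\b\w+(?:-\w+)+\b")


def rewrite_hyphenated(query):
    """Rewrite each hyphenated term as an Indri-style #1(...) exact-phrase window."""

    def to_window(match):
        parts = match.group(0).split("-")
        return "#1(" + " ".join(parts) + ")"

    return _HYPHENATED.sub(to_window, query)


print(rewrite_hyphenated("cross-language retrieval"))
# prints: #1(cross language) retrieval
```

Terms without hyphens pass through unchanged, so the rewrite only affects the tokens that would otherwise be split on hyphens.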


 
